* [pve-devel] [PATCH storage] Plugins: en/decode notes as UTF-8
@ 2022-03-09 8:21 Dominik Csapak
2022-03-09 9:51 ` Dominik Csapak
2022-04-26 13:36 ` [pve-devel] applied: " Thomas Lamprecht
From: Dominik Csapak @ 2022-03-09 8:21 UTC
To: pve-devel
When writing into the file, explicitly utf8 encode it, and then try to
utf8 decode it on read.
If the notes are not valid utf8, we assume it was an iso-8859 comment
and return is at is was.
Technically this is a breaking change, since there are iso-8859 comments
that would sucessfully decode as utf8, for example:
the byte sequence "C2 A9" would be "Â©" in iso, but would decode to "©".
From what i can tell though, this is rather unlikely to happen for
"real world" notes, because the first byte would be in the range of
C0-F7 (which are mostly language dependent characters like "Â")
and the following bytes would have to be in the range of
80-BF, which are only special characters like "£" (or undefined)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
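Note (not part of the patch): the behaviour described in the commit message can be sketched as a small standalone script. This is only a minimal illustration under stated assumptions: `decode_notes` is a hypothetical helper name, and the sketch passes `Encode::LEAVE_SRC` explicitly so the input is guaranteed to stay intact for the fallback, whereas the patch passes a plain `1` as the check argument. It shows the write/read round trip, the fallback for bytes that are not valid UTF-8, and the ambiguous "C2 A9" case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Encode qw(decode encode);

# Hypothetical helper (not in the patch): strict UTF-8 decode that falls
# back to the raw bytes when they are not valid UTF-8.
sub decode_notes {
    my ($raw) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps $raw
    # untouched so it can be returned as the fallback value.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return eval { decode('UTF-8', $raw, $check) } // $raw;
}

# Round trip: notes are encoded on write and decoded again on read.
my $text = "caf\x{e9} \x{20ac}";    # character string, not bytes
my $stored = encode('UTF-8', $text);
print "round trip ok\n" if decode_notes($stored) eq $text;

# Legacy ISO-8859-1 bytes that are not valid UTF-8 come back unchanged.
my $latin1 = "caf\xE9";
print "fallback ok\n" if decode_notes($latin1) eq $latin1;

# The ambiguous case: C2 A9 is valid UTF-8, so it decodes to the single
# character U+00A9 instead of staying the two ISO-8859-1 characters.
my $ambiguous = "\xC2\xA9";
printf "C2 A9 decodes to U+%04X\n", ord(decode_notes($ambiguous));
```

The last case is the "breaking change" mentioned above: an old ISO-8859-1 note containing exactly those two bytes would now be shown as one character.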
PVE/Storage/DirPlugin.pm | 9 +++++++--
PVE/Storage/Plugin.pm | 3 ++-
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/PVE/Storage/DirPlugin.pm b/PVE/Storage/DirPlugin.pm
index c60818b..2c58a17 100644
--- a/PVE/Storage/DirPlugin.pm
+++ b/PVE/Storage/DirPlugin.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;

 use Cwd;
+use Encode qw(decode encode);
 use File::Path;
 use IO::File;
 use POSIX;
@@ -103,7 +104,10 @@ sub get_volume_notes {
     my $path = $class->filesystem_path($scfg, $volname);
     $path .= $class->SUPER::NOTES_EXT;

-    return PVE::Tools::file_get_contents($path) if -f $path;
+    if (-f $path) {
+        my $data = PVE::Tools::file_get_contents($path);
+        return eval { decode('UTF-8', $data, 1) } // $data;
+    }

     return '';
 }
@@ -120,7 +124,8 @@ sub update_volume_notes {
     $path .= $class->SUPER::NOTES_EXT;

     if (defined($notes) && $notes ne '') {
-        PVE::Tools::file_set_contents($path, $notes);
+        my $encoded = encode('UTF-8', $notes);
+        PVE::Tools::file_set_contents($path, $encoded);
     } else {
         unlink $path or $! == ENOENT or die "could not delete notes - $!\n";
     }
diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
index a6b0bdd..0c21987 100644
--- a/PVE/Storage/Plugin.pm
+++ b/PVE/Storage/Plugin.pm
@@ -3,6 +3,7 @@ package PVE::Storage::Plugin;
 use strict;
 use warnings;

+use Encode qw(decode);
 use Fcntl ':mode';
 use File::chdir;
 use File::Path;
@@ -1172,7 +1173,7 @@ my $get_subdir_files = sub {
             my $notes_fn = $original.NOTES_EXT;
             if (-f $notes_fn) {
                 my $notes = PVE::Tools::file_read_firstline($notes_fn);
-                $info->{notes} = $notes if defined($notes);
+                $info->{notes} = eval { decode('UTF-8', $notes, 1) } // $notes if defined($notes);
             }

             $info->{protected} = 1 if -e PVE::Storage::protection_file_path($original);
--
2.30.2
* Re: [pve-devel] [PATCH storage] Plugins: en/decode notes as UTF-8
2022-03-09 8:21 [pve-devel] [PATCH storage] Plugins: en/decode notes as UTF-8 Dominik Csapak
@ 2022-03-09 9:51 ` Dominik Csapak
2022-04-25 14:27 ` Dominik Csapak
2022-04-26 13:36 ` [pve-devel] applied: " Thomas Lamprecht
From: Dominik Csapak @ 2022-03-09 9:51 UTC
To: pve-devel
sorry i forgot to fix the s/sucessfully/successfully/ typo...
also:
On 3/9/22 09:21, Dominik Csapak wrote:
> When writing into the file, explicitly utf8 encode it, and then try to
> utf8 decode it on read.
>
> If the notes are not valid utf8, we assume it was an iso-8859 comment
> and return is at is was.
>
and return *it* *as* *it* was
s/t confusion :P
> Technically this is a breaking change, since there are iso-8859 comments
> that would sucessfully decode as utf8, for example:
> the byte sequence "C2 A9" would be "Â©" in iso, but would decode to "©".
>
> From what i can tell though, this is rather unlikely to happen for
> "real world" notes, because the first byte would be in the range of
> C0-F7 (which are mostly language dependent characters like "Â")
> and the following bytes would have to be in the range of
> 80-BF, which are only special characters like "£" (or undefined)
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> PVE/Storage/DirPlugin.pm | 9 +++++++--
> PVE/Storage/Plugin.pm | 3 ++-
> 2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/PVE/Storage/DirPlugin.pm b/PVE/Storage/DirPlugin.pm
> index c60818b..2c58a17 100644
> --- a/PVE/Storage/DirPlugin.pm
> +++ b/PVE/Storage/DirPlugin.pm
> @@ -4,6 +4,7 @@ use strict;
> use warnings;
>
> use Cwd;
> +use Encode qw(decode encode);
> use File::Path;
> use IO::File;
> use POSIX;
> @@ -103,7 +104,10 @@ sub get_volume_notes {
> my $path = $class->filesystem_path($scfg, $volname);
> $path .= $class->SUPER::NOTES_EXT;
>
> - return PVE::Tools::file_get_contents($path) if -f $path;
> + if (-f $path) {
> + my $data = PVE::Tools::file_get_contents($path);
> + return eval { decode('UTF-8', $data, 1) } // $data;
> + }
>
> return '';
> }
> @@ -120,7 +124,8 @@ sub update_volume_notes {
> $path .= $class->SUPER::NOTES_EXT;
>
> if (defined($notes) && $notes ne '') {
> - PVE::Tools::file_set_contents($path, $notes);
> + my $encoded = encode('UTF-8', $notes);
> + PVE::Tools::file_set_contents($path, $encoded);
> } else {
> unlink $path or $! == ENOENT or die "could not delete notes - $!\n";
> }
> diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
> index a6b0bdd..0c21987 100644
> --- a/PVE/Storage/Plugin.pm
> +++ b/PVE/Storage/Plugin.pm
> @@ -3,6 +3,7 @@ package PVE::Storage::Plugin;
> use strict;
> use warnings;
>
> +use Encode qw(decode);
> use Fcntl ':mode';
> use File::chdir;
> use File::Path;
> @@ -1172,7 +1173,7 @@ my $get_subdir_files = sub {
> my $notes_fn = $original.NOTES_EXT;
> if (-f $notes_fn) {
> my $notes = PVE::Tools::file_read_firstline($notes_fn);
> - $info->{notes} = $notes if defined($notes);
> + $info->{notes} = eval { decode('UTF-8', $notes, 1) } // $notes if defined($notes);
> }
>
> $info->{protected} = 1 if -e PVE::Storage::protection_file_path($original);
* Re: [pve-devel] [PATCH storage] Plugins: en/decode notes as UTF-8
2022-03-09 9:51 ` Dominik Csapak
@ 2022-04-25 14:27 ` Dominik Csapak
From: Dominik Csapak @ 2022-04-25 14:27 UTC
To: pve-devel
ping (i can send a v2 with the typos fixed ofc, if wanted please say so)
* [pve-devel] applied: [PATCH storage] Plugins: en/decode notes as UTF-8
2022-03-09 8:21 [pve-devel] [PATCH storage] Plugins: en/decode notes as UTF-8 Dominik Csapak
2022-03-09 9:51 ` Dominik Csapak
@ 2022-04-26 13:36 ` Thomas Lamprecht
From: Thomas Lamprecht @ 2022-04-26 13:36 UTC
To: Proxmox VE development discussion, Dominik Csapak
On 09.03.22 09:21, Dominik Csapak wrote:
> When writing into the file, explicitly utf8 encode it, and then try to
> utf8 decode it on read.
>
> If the notes are not valid utf8, we assume it was an iso-8859 comment
> and return is at is was.
>
> Technically this is a breaking change, since there are iso-8859 comments
> that would sucessfully decode as utf8, for example:
> the byte sequence "C2 A9" would be "Â©" in iso, but would decode to "©".
>
> From what i can tell though, this is rather unlikely to happen for
> "real world" notes, because the first byte would be in the range of
> C0-F7 (which are mostly language dependent characters like "Â")
> and the following bytes would have to be in the range of
> 80-BF, which are only special characters like "£" (or undefined)
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> PVE/Storage/DirPlugin.pm | 9 +++++++--
> PVE/Storage/Plugin.pm | 3 ++-
> 2 files changed, 9 insertions(+), 3 deletions(-)
>
>
applied, with commit message typos (there was at least one other) fixed, thanks!