public inbox for pve-devel@lists.proxmox.com
From: Dominik Csapak <d.csapak@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC PATCH storage] Plugins: en/decode notes as UTF-8
Date: Tue,  8 Mar 2022 15:41:45 +0100
Message-ID: <20220308144145.536734-1-d.csapak@proxmox.com>

When writing the notes into the file, explicitly encode them as UTF-8,
and then try to decode them as UTF-8 on read.

If the notes are not valid UTF-8, we assume it was an iso-8859 comment
and return it as it was.
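
The write/read behavior described above can be sketched as a standalone
script. This is only an illustration, not the patch itself:
`try_decode_utf8_sketch` is a hypothetical stand-in for the helper added
below, and it passes `Encode::FB_CROAK | Encode::LEAVE_SRC` explicitly
instead of the literal `1` used in the patch.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical stand-in for the helper proposed in this patch.
sub try_decode_utf8_sketch {
    my ($data) = @_;

    # FB_CROAK makes decode() die on malformed input,
    # LEAVE_SRC keeps $data untouched so it can be returned as fallback.
    my $decoded = eval {
        decode('UTF-8', $data, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };

    return defined($decoded) ? $decoded : $data;
}

# New-style notes: a character string that gets encoded on write.
my $notes = "gr\x{fc}n \x{2713}";
my $on_disk = encode('UTF-8', $notes);
my $read_back = try_decode_utf8_sketch($on_disk);

# Old-style notes: iso-8859-1 bytes that are not valid UTF-8.
my $legacy = "gr\xfcn";
my $legacy_back = try_decode_utf8_sketch($legacy);

print $read_back eq $notes ? "round trip ok\n" : "round trip failed\n";
print $legacy_back eq $legacy ? "legacy kept as-is\n" : "legacy changed\n";
```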

Technically this is a breaking change, since there are iso-8859 comments
that would successfully decode as UTF-8, for example:
the byte sequence "C2 A9" would be "Â©" in iso-8859-1, but would decode to "©".

From what I can tell though, this is rather unlikely to happen for
"real world" notes, because the first byte would have to be in the range
C2-F4 (which are mostly language-dependent characters like "Â")
and the following bytes would have to be in the range
80-BF, which are only special characters like "£" (or undefined).
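
To make the ambiguity concrete, here is a small sketch (again only an
illustration; `decodes_as_utf8` is a hypothetical helper, and the sample
strings are made up) that decodes the "C2 A9" bytes both ways and checks
that a more typical iso-8859-1 note is rejected by the strict UTF-8
decoder:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if the bytes are well-formed UTF-8.
sub decodes_as_utf8 {
    my ($bytes) = @_;
    my $ok = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

# The ambiguous case: valid in both encodings, different meaning.
my $ambiguous = "\xC2\xA9";
my $as_latin1 = decode('ISO-8859-1', $ambiguous);   # two characters
my $as_utf8   = decode('UTF-8', $ambiguous);        # one character

printf "iso-8859-1 length: %d\n", length($as_latin1);
printf "UTF-8 length: %d\n", length($as_utf8);

# A more typical iso-8859-1 note: E9 is followed by a space, which is
# not a continuation byte (80-BF), so strict decoding fails.
my $typical = "caf\xE9 au lait";
printf "typical note decodes as UTF-8: %s\n",
    decodes_as_utf8($typical) ? "yes" : "no";
```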

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
We may want to have this 'try_decode_utf8' in PVE::Tools, I guess?
I just put it here for the RFC, so it's easier to review.

 PVE/Storage.pm           | 17 +++++++++++++++++
 PVE/Storage/DirPlugin.pm |  9 +++++++--
 PVE/Storage/Plugin.pm    |  2 +-
 3 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/PVE/Storage.pm b/PVE/Storage.pm
index b1d31bb..4335ee9 100755
--- a/PVE/Storage.pm
+++ b/PVE/Storage.pm
@@ -14,6 +14,7 @@ use File::Path;
 use Cwd 'abs_path';
 use Socket;
 use Time::Local qw(timelocal);
+use Encode qw(decode);
 
 use PVE::Tools qw(run_command file_read_firstline dir_glob_foreach $IPV6RE);
 use PVE::Cluster qw(cfs_read_file cfs_write_file cfs_lock_file);
@@ -2077,4 +2078,20 @@ sub normalize_content_filename {
     return $filename;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+
+    my $decoded = eval {
+	decode('UTF-8', $data, 1);
+    };
+
+    if (!defined($decoded)) {
+	# we could not decode, it's probably iso-8859,
+	# so return original value
+	return $data;
+    }
+
+    return $decoded;
+}
+
 1;
diff --git a/PVE/Storage/DirPlugin.pm b/PVE/Storage/DirPlugin.pm
index c60818b..bc559e6 100644
--- a/PVE/Storage/DirPlugin.pm
+++ b/PVE/Storage/DirPlugin.pm
@@ -7,6 +7,7 @@ use Cwd;
 use File::Path;
 use IO::File;
 use POSIX;
+use Encode qw(encode);
 
 use PVE::Storage::Plugin;
 use PVE::JSONSchema qw(get_standard_option);
@@ -103,7 +104,10 @@ sub get_volume_notes {
     my $path = $class->filesystem_path($scfg, $volname);
     $path .= $class->SUPER::NOTES_EXT;
 
-    return PVE::Tools::file_get_contents($path) if -f $path;
+    if (-f $path) {
+	my $data = PVE::Tools::file_get_contents($path);
+	return PVE::Storage::try_decode_utf8($data);
+    }
 
     return '';
 }
@@ -120,7 +124,8 @@ sub update_volume_notes {
     $path .= $class->SUPER::NOTES_EXT;
 
     if (defined($notes) && $notes ne '') {
-	PVE::Tools::file_set_contents($path, $notes);
+	my $encoded = encode('UTF-8', $notes);
+	PVE::Tools::file_set_contents($path, $encoded);
     } else {
 	unlink $path or $! == ENOENT or die "could not delete notes - $!\n";
     }
diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
index a6b0bdd..edec516 100644
--- a/PVE/Storage/Plugin.pm
+++ b/PVE/Storage/Plugin.pm
@@ -1172,7 +1172,7 @@ my $get_subdir_files = sub {
 	    my $notes_fn = $original.NOTES_EXT;
 	    if (-f $notes_fn) {
 		my $notes = PVE::Tools::file_read_firstline($notes_fn);
-		$info->{notes} = $notes if defined($notes);
+		$info->{notes} = PVE::Storage::try_decode_utf8($notes) if defined($notes);
 	    }
 
 	    $info->{protected} = 1 if -e PVE::Storage::protection_file_path($original);
-- 
2.30.2





