From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dominik Csapak
To: pve-devel@lists.proxmox.com
Date: Wed, 9 Mar 2022 09:21:28 +0100
Message-Id: <20220309082128.760917-1-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [PATCH storage] Plugins: en/decode notes as UTF-8
List-Id: Proxmox VE development discussion

When writing into the file, explicitly UTF-8 encode the notes, and try
to UTF-8 decode them on read.

If the notes are not valid UTF-8, we assume they were an ISO-8859
comment and return them as they were.

Technically this is a breaking change, since there are ISO-8859
comments that would successfully decode as UTF-8, for example:
the byte sequence "C2 A9" would be "Â©" in ISO-8859-1, but would
decode to "©" as UTF-8.

From what I can tell though, this is rather unlikely to happen for
"real world" notes, because the first byte would have to be in the
range C0-F7 (which are mostly language-dependent characters like "Â")
and the following bytes would have to be in the range 80-BF, which are
only special characters like "£" (or undefined).

Signed-off-by: Dominik Csapak
---
 PVE/Storage/DirPlugin.pm | 9 +++++++--
 PVE/Storage/Plugin.pm    | 3 ++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/PVE/Storage/DirPlugin.pm b/PVE/Storage/DirPlugin.pm
index c60818b..2c58a17 100644
--- a/PVE/Storage/DirPlugin.pm
+++ b/PVE/Storage/DirPlugin.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 
 use Cwd;
+use Encode qw(decode encode);
 use File::Path;
 use IO::File;
 use POSIX;
@@ -103,7 +104,10 @@ sub get_volume_notes {
     my $path = $class->filesystem_path($scfg, $volname);
     $path .= $class->SUPER::NOTES_EXT;
 
-    return PVE::Tools::file_get_contents($path) if -f $path;
+    if (-f $path) {
+        my $data = PVE::Tools::file_get_contents($path);
+        return eval { decode('UTF-8', $data, 1) } // $data;
+    }
 
     return '';
 }
@@ -120,7 +124,8 @@ sub update_volume_notes {
     $path .= $class->SUPER::NOTES_EXT;
 
     if (defined($notes) && $notes ne '') {
-        PVE::Tools::file_set_contents($path, $notes);
+        my $encoded = encode('UTF-8', $notes);
+        PVE::Tools::file_set_contents($path, $encoded);
     } else {
         unlink $path or $! == ENOENT or die "could not delete notes - $!\n";
     }
diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
index a6b0bdd..0c21987 100644
--- a/PVE/Storage/Plugin.pm
+++ b/PVE/Storage/Plugin.pm
@@ -3,6 +3,7 @@ package PVE::Storage::Plugin;
 use strict;
 use warnings;
 
+use Encode qw(decode);
 use Fcntl ':mode';
 use File::chdir;
 use File::Path;
@@ -1172,7 +1173,7 @@ my $get_subdir_files = sub {
             my $notes_fn = $original.NOTES_EXT;
             if (-f $notes_fn) {
                 my $notes = PVE::Tools::file_read_firstline($notes_fn);
-                $info->{notes} = $notes if defined($notes);
+                $info->{notes} = eval { decode('UTF-8', $notes, 1) } // $notes if defined($notes);
             }
 
             $info->{protected} = 1 if -e PVE::Storage::protection_file_path($original);
-- 
2.30.2
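
For anyone who wants to check the behaviour described in the commit
message without touching a storage, here is a minimal standalone sketch.
It is not part of the patch: decode_notes() is a hypothetical helper name
that only mirrors the "strict decode, else return the raw bytes" pattern
used above, and it decodes a copy of its input on the assumption that
Encode may modify the source scalar in place when a CHECK value is set.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not in the patch): strictly decode UTF-8 and
# fall back to the untouched bytes if they are not valid UTF-8.
sub decode_notes {
    my ($bytes) = @_;
    # Work on a copy, since Encode may alter its input when CHECK is set.
    my $copy = $bytes;
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK) } // $bytes;
}

# Round trip: characters are encoded on write and decoded on read.
my $notes = "caf\x{e9} \x{2603}";    # Perl character string
my $on_disk = encode('UTF-8', $notes);
my $round_trip = decode_notes($on_disk);

# Legacy ISO-8859-1 bytes that are not valid UTF-8 come back unchanged:
# a trailing E9 is an incomplete multi-byte sequence.
my $legacy = "caf\xE9";
my $fallback = decode_notes($legacy);

# The ambiguous case from the commit message: C2 A9 is valid UTF-8,
# so it is decoded to a single character instead of being kept as-is.
my $ambiguous = decode_notes("\xC2\xA9");

printf "round trip ok: %s\n", $round_trip eq $notes ? 'yes' : 'no';
printf "fallback ok:   %s\n", $fallback eq $legacy ? 'yes' : 'no';
printf "C2 A9 -> U+%04X (%d character)\n", ord($ambiguous), length($ambiguous);
```

The last line illustrates the breaking-change caveat: an old ISO-8859-1
note consisting of exactly those two bytes would now be shown as one
character rather than two.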