From: "Daniel Kral" <d.kral@proxmox.com>
To: "Fiona Ebner" <f.ebner@proxmox.com>,
	"Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager] rules: fix utf-8 encoding and decoding for comments field
Date: Thu, 30 Oct 2025 10:56:01 +0100
Message-ID: <DDVKEY2127XY.E5RTWKB5DOW6@proxmox.com>
In-Reply-To: <09e16950-42ca-4570-ba8c-8d0dcf1ac7e1@proxmox.com>

On Wed Oct 22, 2025 at 2:58 PM CEST, Fiona Ebner wrote:
> On 05.09.25 at 12:17 PM, Daniel Kral wrote:
>> As reported by a user in the community forum [0].
>> 
>> [0] https://forum.proxmox.com/threads/169258/page-14#post-792521
>> 
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>> Tested a few strings from a Unicode example page [1], including at
>> least a mix of ASCII, Latin-1, Cyrillic and Bengali characters.
>> 
>> Also checked this applies both on master and on another ha-manager
>> series [2] without any fuzz.
>> 
>> [1] https://www.cogsci.ed.ac.uk/~richard/unicode-sample.html
>> [2] https://lore.proxmox.com/pve-devel/20250821143705.256562-1-d.kral@proxmox.com/
>> 
>>  src/PVE/HA/Rules.pm | 28 ++++++++++++++++++++++++----
>>  1 file changed, 24 insertions(+), 4 deletions(-)
>> 
>> diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
>> index 323ad038..8c60b5ce 100644
>> --- a/src/PVE/HA/Rules.pm
>> +++ b/src/PVE/HA/Rules.pm
>> @@ -163,8 +163,6 @@ sub decode_value {
>>          }
>>  
>>          return $res;
>> -    } elsif ($key eq 'comment') {
>> -        return PVE::Tools::decode_text($value);
>>      }
>>  
>>      my $plugin = $class->lookup($type);
>> @@ -198,8 +196,6 @@ sub encode_value {
>>          PVE::HA::Tools::pve_verify_ha_resource_id($_) for keys %$value;
>>  
>>          return join(',', sort keys %$value);
>> -    } elsif ($key eq 'comment') {
>> -        return PVE::Tools::encode_text($value);
>>      }
>>  
>>      my $plugin = $class->lookup($type);
>
> Why did the original implementation with {de,en}code_value() not work?
>

Sorry for the late reply, I had to look into it a bit. It seems that with
the {de,en}code_text($value) calls in {de,en}code_value(...) from above,
the comment text is decoded twice and then encoded once to store it.

I haven't found where exactly it happens, but it seems that the text of
the API arguments is already unescaped and UTF-8-decoded somewhere else
before that (maybe AnyEvent::Http or some other handler?). Maybe someone
more knowledgeable about pve-http-server could answer that. The
following script tries to capture what happens with the value:

```
#!/usr/bin/env perl

use v5.36;

use Encode;
use URI::Escape;
use Data::Dumper;

sub encode_text {
    my ($text) = @_;

    # all control and hi-bit characters, ':' and '%'
    my $unsafe = "^\x20-\x24\x26-\x39\x3b-\x7e";
    return uri_escape(Encode::encode("utf8", $text), $unsafe);
}

sub decode_text {
    my ($data) = @_;

    return Encode::decode("utf8", uri_unescape($data));
}

my $input = 'à á â ã ä å';

print "Original: $input\n";
print "Decode: " . ($input = decode_text($input)) . "\n";
print "2*Decode: " . ($input = decode_text($input)) . "\n";
print "2*Decode+Encode: " . ($input = encode_text($input)) . "\n";
```

With the following output:

```
# ./decode-encode.pl
Original: à á â ã ä å
Decode:
Wide character in print at ./decode-encode.pl line 27.
2*Decode: � � � � � �
2*Decode+Encode: %EF%BF%BD %EF%BF%BD %EF%BF%BD %EF%BF%BD %EF%BF%BD %EF%BF%BD
```

which is exactly what is stored in the rules.cfg afterwards.
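
For contrast, here is a minimal sketch of what a single encode followed
by a single decode looks like, compared with decoding text that is
already decoded. The two helpers are hypothetical stand-ins for
{de,en}code_text() written with core modules only (no URI::Escape), and
the assumption that the comment arrives as already-decoded characters is
only my reading of the behavior above, not something I have confirmed in
pve-http-server:

```perl
#!/usr/bin/env perl

use v5.36;
use utf8; # the literal below is a character string, not raw UTF-8 bytes

use Encode ();

# Hypothetical stand-in for encode_text(): UTF-8-encode, then
# percent-escape control and hi-bit bytes, ':' and '%'.
sub encode_text ($text) {
    my $bytes = Encode::encode('UTF-8', $text);
    $bytes =~ s/([^\x20-\x24\x26-\x39\x3b-\x7e])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# Hypothetical stand-in for decode_text(): percent-unescape, then
# UTF-8-decode the resulting bytes.
sub decode_text ($data) {
    $data =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return Encode::decode('UTF-8', $data);
}

# Assumption: this is how the comment looks once the API layer has
# already unescaped and decoded it.
my $comment = 'à á â';

my $stored    = encode_text($comment);   # what would be written to disk
my $roundtrip = decode_text($stored);    # decoded exactly once
my $twice     = decode_text($roundtrip); # decoding already-decoded text

say "stored:       $stored";
say "roundtrip:    ", ($roundtrip eq $comment ? 'ok' : 'mismatch');
say "after 2nd decode: ", encode_text($twice);
```

The single round trip gives back the original comment, while the string
that went through a second decode no longer matches it, which is
consistent with the replacement characters in the output above.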


