public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH ha-manager] rules: fix utf-8 encoding and decoding for comments field
From: Daniel Kral @ 2025-09-05 10:13 UTC (permalink / raw)
  To: pve-devel

As reported by a user in the community forum [0].

[0] https://forum.proxmox.com/threads/169258/page-14#post-792521

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
Tested a few strings from a Unicode example page [1], including at
least a mix of ASCII, Latin-1, Cyrillic and Bengali characters.

Also checked that this applies without any fuzz both on master and on
top of another ha-manager series [2].

[1] https://www.cogsci.ed.ac.uk/~richard/unicode-sample.html
[2] https://lore.proxmox.com/pve-devel/20250821143705.256562-1-d.kral@proxmox.com/
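
A minimal sketch of such a round-trip check. The two helpers are local
stand-ins modelled on PVE::Tools::encode_text() and
PVE::Tools::decode_text() (an assumption, they are not imported from
PVE::Tools), and the sample strings are illustrative rather than the
exact strings used for testing:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;    # the string literals below are character strings

use Encode ();
use URI::Escape qw(uri_escape uri_unescape);

# Stand-in for PVE::Tools::encode_text (assumption): UTF-8 encode, then
# percent-escape control and high-bit bytes as well as ':' and '%'.
sub encode_text {
    my ($text) = @_;
    my $unsafe = "^\x20-\x24\x26-\x39\x3b-\x7e";
    return uri_escape(Encode::encode("utf8", $text), $unsafe);
}

# Stand-in for PVE::Tools::decode_text (assumption): undo the
# percent-escaping, then UTF-8 decode the resulting bytes.
sub decode_text {
    my ($data) = @_;
    return Encode::decode("utf8", uri_unescape($data));
}

# Illustrative samples: ASCII, Latin-1, Cyrillic and Bengali.
my @samples = ('plain ASCII', 'à á â', 'Привет', 'বাংলা');

my $failures = 0;
for my $text (@samples) {
    my $stored = encode_text($text);    # ASCII-only, safe to print
    my $ok = decode_text($stored) eq $text;
    $failures++ if !$ok;
    printf "%s: %s\n", $stored, $ok ? 'ok' : 'MISMATCH';
}
```

Encoding exactly once on write and decoding exactly once on read should
give back the original character string for every sample.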

 src/PVE/HA/Rules.pm | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index 323ad038..8c60b5ce 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -163,8 +163,6 @@ sub decode_value {
         }
 
         return $res;
-    } elsif ($key eq 'comment') {
-        return PVE::Tools::decode_text($value);
     }
 
     my $plugin = $class->lookup($type);
@@ -198,8 +196,6 @@ sub encode_value {
         PVE::HA::Tools::pve_verify_ha_resource_id($_) for keys %$value;
 
         return join(',', sort keys %$value);
-    } elsif ($key eq 'comment') {
-        return PVE::Tools::encode_text($value);
     }
 
     my $plugin = $class->lookup($type);
@@ -220,6 +216,30 @@ sub parse_section_header {
     return undef;
 }
 
+sub parse_config {
+    my ($class, $filename, $raw, $allow_unknown) = @_;
+
+    my $cfg = $class->SUPER::parse_config($filename, $raw, $allow_unknown);
+
+    for my $rule (values $cfg->{ids}->%*) {
+        $rule->{comment} = PVE::Tools::decode_text($rule->{comment})
+            if defined($rule->{comment});
+    }
+
+    return $cfg;
+}
+
+sub write_config {
+    my ($class, $filename, $cfg, $allow_unknown) = @_;
+
+    for my $rule (values $cfg->{ids}->%*) {
+        $rule->{comment} = PVE::Tools::encode_text($rule->{comment})
+            if defined($rule->{comment});
+    }
+
+    return $class->SUPER::write_config($filename, $cfg, $allow_unknown);
+}
+
 # General rule helpers
 
 =head3 $class->set_rule_defaults($rule)
-- 
2.47.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* Re: [pve-devel] [PATCH ha-manager] rules: fix utf-8 encoding and decoding for comments field
From: Fiona Ebner @ 2025-10-22 12:58 UTC (permalink / raw)
  To: Proxmox VE development discussion, Daniel Kral

On 05.09.25 at 12:17 PM, Daniel Kral wrote:
> As reported by a user in the community forum [0].
> 
> [0] https://forum.proxmox.com/threads/169258/page-14#post-792521
> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
> Tested a few strings from an unicode example page [1], including at
> least a mix of ASCII, Latin-1, Cyrillic and Bengali characters.
> 
> Also checked this applies both on master and on another ha-manager
> series [2] without any fuzz.
> 
> [1] https://www.cogsci.ed.ac.uk/~richard/unicode-sample.html
> [2] https://lore.proxmox.com/pve-devel/20250821143705.256562-1-d.kral@proxmox.com/
> 
>  src/PVE/HA/Rules.pm | 28 ++++++++++++++++++++++++----
>  1 file changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
> index 323ad038..8c60b5ce 100644
> --- a/src/PVE/HA/Rules.pm
> +++ b/src/PVE/HA/Rules.pm
> @@ -163,8 +163,6 @@ sub decode_value {
>          }
>  
>          return $res;
> -    } elsif ($key eq 'comment') {
> -        return PVE::Tools::decode_text($value);
>      }
>  
>      my $plugin = $class->lookup($type);
> @@ -198,8 +196,6 @@ sub encode_value {
>          PVE::HA::Tools::pve_verify_ha_resource_id($_) for keys %$value;
>  
>          return join(',', sort keys %$value);
> -    } elsif ($key eq 'comment') {
> -        return PVE::Tools::encode_text($value);
>      }
>  
>      my $plugin = $class->lookup($type);

Why did the original implementation with {de,en}code_value() not work?

> @@ -220,6 +216,30 @@ sub parse_section_header {
>      return undef;
>  }
>  
> +sub parse_config {
> +    my ($class, $filename, $raw, $allow_unknown) = @_;
> +
> +    my $cfg = $class->SUPER::parse_config($filename, $raw, $allow_unknown);
> +
> +    for my $rule (values $cfg->{ids}->%*) {
> +        $rule->{comment} = PVE::Tools::decode_text($rule->{comment})
> +            if defined($rule->{comment});
> +    }
> +
> +    return $cfg;
> +}
> +
> +sub write_config {
> +    my ($class, $filename, $cfg, $allow_unknown) = @_;
> +
> +    for my $rule (values $cfg->{ids}->%*) {
> +        $rule->{comment} = PVE::Tools::encode_text($rule->{comment})
> +            if defined($rule->{comment});
> +    }
> +
> +    return $class->SUPER::write_config($filename, $cfg, $allow_unknown);
> +}
> +
>  # General rule helpers
>  
>  =head3 $class->set_rule_defaults($rule)




* Re: [pve-devel] [PATCH ha-manager] rules: fix utf-8 encoding and decoding for comments field
From: Daniel Kral @ 2025-10-30  9:56 UTC (permalink / raw)
  To: Fiona Ebner, Proxmox VE development discussion

On Wed Oct 22, 2025 at 2:58 PM CEST, Fiona Ebner wrote:
> On 05.09.25 at 12:17 PM, Daniel Kral wrote:
>> As reported by a user in the community forum [0].
>> 
>> [0] https://forum.proxmox.com/threads/169258/page-14#post-792521
>> 
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>> Tested a few strings from an unicode example page [1], including at
>> least a mix of ASCII, Latin-1, Cyrillic and Bengali characters.
>> 
>> Also checked this applies both on master and on another ha-manager
>> series [2] without any fuzz.
>> 
>> [1] https://www.cogsci.ed.ac.uk/~richard/unicode-sample.html
>> [2] https://lore.proxmox.com/pve-devel/20250821143705.256562-1-d.kral@proxmox.com/
>> 
>>  src/PVE/HA/Rules.pm | 28 ++++++++++++++++++++++++----
>>  1 file changed, 24 insertions(+), 4 deletions(-)
>> 
>> diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
>> index 323ad038..8c60b5ce 100644
>> --- a/src/PVE/HA/Rules.pm
>> +++ b/src/PVE/HA/Rules.pm
>> @@ -163,8 +163,6 @@ sub decode_value {
>>          }
>>  
>>          return $res;
>> -    } elsif ($key eq 'comment') {
>> -        return PVE::Tools::decode_text($value);
>>      }
>>  
>>      my $plugin = $class->lookup($type);
>> @@ -198,8 +196,6 @@ sub encode_value {
>>          PVE::HA::Tools::pve_verify_ha_resource_id($_) for keys %$value;
>>  
>>          return join(',', sort keys %$value);
>> -    } elsif ($key eq 'comment') {
>> -        return PVE::Tools::encode_text($value);
>>      }
>>  
>>      my $plugin = $class->lookup($type);
>
> Why did the original implementation with {de,en}code_value() not work?
>

Sorry for the late reply, I had to look into it a bit. It seems that
with the {de,en}code_text($value) calls in {de,en}code_value(...) from
above, the comment text is decoded twice before it is encoded once for
storage.

I haven't found out where exactly this happens, but it seems that the
API arguments are already unescaped and UTF-8-decoded somewhere else
before that (maybe in AnyEvent::HTTP or some other handler?). Maybe
someone more knowledgeable about pve-http-server could confirm this.
The following script tries to capture what happens with the value:

```
#!/usr/bin/env perl

use v5.36;

use Encode;
use URI::Escape;
use Data::Dumper;

sub encode_text {
    my ($text) = @_;

    # all control and hi-bit characters, ':' and '%'
    my $unsafe = "^\x20-\x24\x26-\x39\x3b-\x7e";
    return uri_escape(Encode::encode("utf8", $text), $unsafe);
}

sub decode_text {
    my ($data) = @_;

    return Encode::decode("utf8", uri_unescape($data));
}

my $input = 'à á â ã ä å';

print "Original: $input\n";
print "Decode: " . ($input = decode_text($input)) . "\n";
print "2*Decode: " . ($input = decode_text($input)) . "\n";
print "2*Decode+Encode: " . ($input = encode_text($input)) . "\n";
```

It produces the following output:

```
# ./decode-encode.pl
Original: à á â ã ä å
Decode:
Wide character in print at ./decode-encode.pl line 27.
2*Decode: � � � � � �
2*Decode+Encode: %EF%BF%BD %EF%BF%BD %EF%BF%BD %EF%BF%BD %EF%BF%BD %EF%BF%BD
```

This is exactly what is stored in the rules.cfg afterwards.
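
To narrow this down to a single character, here is a minimal sketch
that compares decoding once with decoding twice. The helpers are again
local stand-ins for PVE::Tools::{en,de}code_text, and the assumption
that the value arrives as raw UTF-8 bytes before the first decode is
mine:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Encode ();
use URI::Escape qw(uri_escape uri_unescape);

# Stand-ins for PVE::Tools::{en,de}code_text (assumption).
sub encode_text {
    my ($text) = @_;
    # escape control and high-bit bytes as well as ':' and '%'
    my $unsafe = "^\x20-\x24\x26-\x39\x3b-\x7e";
    return uri_escape(Encode::encode("utf8", $text), $unsafe);
}

sub decode_text {
    my ($data) = @_;
    return Encode::decode("utf8", uri_unescape($data));
}

# The two UTF-8 bytes of "à" (U+00E0), written as explicit byte escapes.
my $bytes = "\xc3\xa0";

# First decode: two bytes become the single character U+00E0.
my $once = decode_text($bytes);

# Second decode: the lone byte 0xE0 is not valid UTF-8, so Encode
# substitutes the replacement character U+FFFD.
my $twice = decode_text($once);

printf "once:  U+%04X -> %s\n", ord($once),  encode_text($once);
printf "twice: U+%04X -> %s\n", ord($twice), encode_text($twice);
```

The value decoded once encodes back to %C3%A0, while the value decoded
twice encodes to %EF%BF%BD, which matches the sequence in the output
above.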

