From: Fiona Ebner
To: Daniel Kral, Proxmox VE development discussion
Date: Tue, 11 Nov 2025 18:10:06 +0100
Message-ID: <6785fc19-8b68-43a2-9dce-b9da688c2779@proxmox.com>
References: <20250905101648.79655-1-d.kral@proxmox.com> <09e16950-42ca-4570-ba8c-8d0dcf1ac7e1@proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager] rules: fix utf-8 encoding and decoding for comments field

On 30.10.25 at 10:55 AM, Daniel Kral wrote:
> On Wed Oct 22, 2025 at 2:58 PM CEST, Fiona Ebner wrote:
>> On 05.09.25 at 12:17 PM, Daniel Kral wrote:
>>> As reported by a user in the community forum [0].
>>>
>>> [0] https://forum.proxmox.com/threads/169258/page-14#post-792521
>>>
>>> Signed-off-by: Daniel Kral
>>> ---
>>> Tested a few strings from an unicode example page [1], including at
>>> least a mix of ASCII, Latin-1, Cyrillic and Bengali characters.
>>>
>>> Also checked this applies both on master and on another ha-manager
>>> series [2] without any fuzz.
>>>
>>> [1] https://www.cogsci.ed.ac.uk/~richard/unicode-sample.html
>>> [2] https://lore.proxmox.com/pve-devel/20250821143705.256562-1-d.kral@proxmox.com/
>>>
>>> src/PVE/HA/Rules.pm | 28 ++++++++++++++++++++++++----
>>> 1 file changed, 24 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
>>> index 323ad038..8c60b5ce 100644
>>> --- a/src/PVE/HA/Rules.pm
>>> +++ b/src/PVE/HA/Rules.pm
>>> @@ -163,8 +163,6 @@ sub decode_value {
>>>          }
>>>
>>>          return $res;
>>> -    } elsif ($key eq 'comment') {
>>> -        return PVE::Tools::decode_text($value);
>>>      }
>>>
>>>      my $plugin = $class->lookup($type);
>>> @@ -198,8 +196,6 @@ sub encode_value {
>>>          PVE::HA::Tools::pve_verify_ha_resource_id($_) for keys %$value;
>>>
>>>          return join(',', sort keys %$value);
>>> -    } elsif ($key eq 'comment') {
>>> -        return PVE::Tools::encode_text($value);
>>>      }
>>>
>>>      my $plugin = $class->lookup($type);
>>
>> Why did the original implementation with {de,en}code_value() not work?
>>
>
> Sorry for the late reply, I had to look into it a bit: It seems with the
> {de,en}code_text($value) in {de,en}code_value(...) from above, we decode
> the comment text twice and then encode it to store it.
>
> I haven't found where it is done exactly/not sure about it, but it seems
> that we already unescape+utf8-decode text somewhere else for the API
> arguments (maybe AnyEvent::Http or some other handler?) before that, but
> maybe someone more knowledgeable about pve-http-server could answer
> here. The following script tries to capture what happens with the value:

Sorry for the late reply too! It's not actually decoded twice in that sense.
It's just the value we get from the API which was not encoded yet ;)

The relevant call is:

PVE::SectionConfig::check_config("PVE::HA::Rules::ResourceAffinity",
"ha-rule-fa5e518c-1f55", HASH(0x57637c23c9e8), 0, 1) called at
/usr/share/perl5/PVE/API2/HA/Rules.pm line 339

The documentation for decode_value() mentions:

    Called during C<check_config(...)> in order to convert values that
    have been read from a C<PVE::SectionConfig> file which have been
    I<encoded> beforehand by C<encode_value(...)>.

Taking that literally, it would mean we cannot use check_config() on
passed-in-via-API values, because we don't fulfill the contract, we
don't have values that have been encoded before.

Existing users of decode_value() do not seem to run into this design
issue, except your recent one here. But of course, existing users
already rely on check_config() to be done for passed-in-via-API values.
So we would need to fix the design and the contract here.

Telling check_config() whether it's dealing with not-previously encoded
values and then not decoding sounds like an obvious approach, however,
other callers do rely on decode_value() to be also called on
passed-in-by-API, not previously encoded values, e.g. for constructing a
nodes hash:

    } elsif ($key eq 'nodes') {
        my $res = {};

        foreach my $node (PVE::Tools::split_list($value)) {
            if (PVE::JSONSchema::pve_verify_node_name($node)) {
                $res->{$node} = 1;
            }
        }

        return $res;

It's just that the encoding results in the same value as
passed-in-via-API ;) And that is actually part of the contract of
decode_value() like it's currently used.

Going with your approach fixes the issue at hand, but it leaves the
design issue for the next person to run into. We should clarify the
documentation if we are happy enough with the current design, so that
people don't try to use decode_value() for a use case like yours. It
does seem like it should be a supported use case though. Maybe somebody
has ideas for a fix even?
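
To make the contract mismatch described above concrete, here is a
minimal Perl sketch. Everything named sketch_* is a hypothetical
stand-in written for illustration, not the PVE::Tools or
PVE::SectionConfig code, and the percent-escaping scheme is an
assumption. It only shows why running a decoder over a value that was
never encoded goes unnoticed for a list like 'nodes' but alters a
free-text comment.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical text codec (an assumption for illustration, not PVE::Tools):
# UTF-8 encode, then percent-escape every byte outside a small safe set.
sub sketch_encode_text {
    my ($text) = @_;
    my $bytes = Encode::encode('UTF-8', $text);
    $bytes =~ s/([^A-Za-z0-9 _.,-])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# Inverse of the above: undo the percent-escapes, then UTF-8 decode.
sub sketch_decode_text {
    my ($data) = @_;
    $data =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return Encode::decode('UTF-8', $data);
}

# Hypothetical 'nodes' codec: comma-separated list <-> hash of names.
sub sketch_decode_nodes {
    my ($value) = @_;
    my %nodes = map { $_ => 1 } split(/,/, $value);
    return \%nodes;
}

sub sketch_encode_nodes {
    my ($nodes) = @_;
    return join(',', sort keys %$nodes);
}

# 'nodes': decoding a never-encoded API value and encoding it again
# yields the input, so the decoder call on API input goes unnoticed.
my $api_nodes   = 'node1,node2';
my $nodes_again = sketch_encode_nodes(sketch_decode_nodes($api_nodes));

# 'comment': the API value is plain text that was never encoded.
my $api_comment = 'see %41 in log';

# Decoder applied to the raw API value first, then store and read back.
my $in_memory    = sketch_decode_text($api_comment);
my $decode_first = sketch_decode_text(sketch_encode_text($in_memory));

# Raw API value kept as-is, encoded only for storage, decoded on read.
my $encode_first = sketch_decode_text(sketch_encode_text($api_comment));

print "nodes unchanged: ", ($nodes_again eq $api_nodes ? "yes" : "no"), "\n";
print "decode-first keeps comment: ",
    ($decode_first eq $api_comment ? "yes" : "no"), "\n";
print "encode-first keeps comment: ",
    ($encode_first eq $api_comment ? "yes" : "no"), "\n";
```

With these stand-ins, the nodes value and the encode-first comment come
back unchanged, while the decode-first comment comes back as
'see A in log'. The real helpers may differ in detail, so treat this as
an illustration of the contract problem, not a reproduction of the
reported bug.
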
Too late for me today to come up with something :P


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel