From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <l.wagner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 5A92CE794
 for <pve-devel@lists.proxmox.com>; Wed, 19 Jul 2023 10:40:11 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 409B83F91
 for <pve-devel@lists.proxmox.com>; Wed, 19 Jul 2023 10:40:11 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Wed, 19 Jul 2023 10:40:10 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id EA9C640E55
 for <pve-devel@lists.proxmox.com>; Wed, 19 Jul 2023 10:40:09 +0200 (CEST)
Message-ID: <ea4eb365-fa1c-c594-0bd7-32b33dce76e3@proxmox.com>
Date: Wed, 19 Jul 2023 10:40:09 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: de-AT, en-US
To: Dominik Csapak <d.csapak@proxmox.com>,
 Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Wolfgang Bumiller <w.bumiller@proxmox.com>,
 Maximiliano Sandoval <m.sandoval@proxmox.com>
References: <20230717150051.710464-1-l.wagner@proxmox.com>
 <f4d88987-6aec-22d5-6635-ceccb3f30d64@proxmox.com>
From: Lukas Wagner <l.wagner@proxmox.com>
In-Reply-To: <f4d88987-6aec-22d5-6635-ceccb3f30d64@proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.102 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [test.pl, notify.rs]
 WEIRD_PORT              0.001 Uses non-standard port number for HTTP
Subject: Re: [pve-devel] [PATCH v3 many 00/66] fix #4156: introduce new
 notification system
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Jul 2023 08:40:11 -0000

Hi again,

On 7/18/23 14:34, Dominik Csapak wrote:
> * i found one bug, but not quite sure yet where it comes from exactly,
>    putting in emojis into a field (e.g. a comment or author) it's accepted,
>    but editing a different entry fails with:
> 
> --->8---
> could not serialize configuration: writing 'notifications.cfg' failed: detected unexpected control character in section 'testgroup' key 'comment' (500)
> ---8<---
> 
> not sure where the utf-8 info gets lost. (or we could limit all fields to ascii?)
> such a notification target still works AFAICT (but if set as e.g. the author it's
> probably the wrong value)
> 
> (i used 😀 as a test)

So I investigated a bit and found a minimal reproducer. It turns out to be an encoding
issue at the FFI boundary (Perl -> Rust).

Let's assume that we have the following exported function in the pve-rs bindings:

   #[export]
   fn test_emoji(name: &str) {
       dbg!(&name);
   }

which is called from a small Perl script (test.pl):

   use PVE::RS::Notify;
  
   my $str = "😊";
   PVE::RS::Notify::test_emoji($str);


   root@pve:~# perl test.pl
   [src/notify.rs:562] &name = "ð\u{9f}\u{98}\u{8a}"

To me it looks a bit like a UTF-16/UTF-8 mixup:

ð = 0x00F0 in UTF-16
😊 = 0xF0 0x9F 0x98 0x8A in UTF-8
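
For illustration, the garbled string from the dbg! output can be reproduced in plain
Rust by treating every UTF-8 byte of the emoji as a separate code point. This is only
a sketch of what the resulting string looks like; it makes no claim about what perlmod
does internally, and both helper names are made up for this example.

```rust
/// Hypothetical helper: reinterpret every UTF-8 byte of `input` as its own
/// code point (U+0000..=U+00FF) and collect the result into a new String.
fn bytes_as_code_points(input: &str) -> String {
    input.bytes().map(|b| b as char).collect()
}

/// Hypothetical inverse: map every code point back to a single byte and
/// decode the bytes as UTF-8. Returns None if a code point does not fit
/// into one byte or if the bytes are not valid UTF-8.
fn code_points_as_bytes(garbled: &str) -> Option<String> {
    let bytes: Option<Vec<u8>> = garbled
        .chars()
        .map(|c| {
            let v = c as u32;
            if v <= 0xFF { Some(v as u8) } else { None }
        })
        .collect();
    String::from_utf8(bytes?).ok()
}
```

Calling `bytes_as_code_points("😊")` yields the four code points U+00F0, U+009F, U+0098
and U+008A, i.e. the same characters as in the dbg! output above, and
`code_points_as_bytes` turns that string back into the original emoji.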

The issue can be fixed by doing a `$str = encode('utf-8', $str);` before calling
`test_emoji`.

However, I think this should probably be handled automatically by the perlmod bindings,
if at all possible.
@Wolfgang, what are your thoughts about this? Maximiliano said he is going to take a look
at the perlmod code, but if you have any idea about where to fix this issue, then
this would probably be helpful to him.

-- 
- Lukas