From: Fiona Ebner <f.ebner@proxmox.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>,
"Laurențiu Leahu-Vlăducu" <l.leahu-vladucu@proxmox.com>
Subject: Re: [pve-devel] [RFC PATCH pve-storage/common] fix #3256: allow special characters in storage-related config files
Date: Mon, 17 Feb 2025 11:15:29 +0100
Message-ID: <082d3fe0-9c6c-494d-9ec3-f64645cd7a53@proxmox.com>
In-Reply-To: <20250214154040.159607-1-l.leahu-vladucu@proxmox.com>
On 14.02.25 at 16:40, Laurențiu Leahu-Vlăducu wrote:
>
> This patch series fixes bug #3256:
>
> 1. It ensures that general config files (e.g. storage.cfg) are decoded
> from UTF-8 when deserialized. Previously, no decoding happened,
> meaning that Perl interpreted the string as single bytes instead of
> Unicode code points. Note: while I would have preferred to decode
> the text right after reading from the file, there are some Perl
> functions like Digest::SHA::sha1_hex that expect raw bytes
> instead of decoded character strings.
What about pre-existing configs that are not UTF-8? Not breaking those
is very important here.
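
For reference, the ordering constraint from point 1 (digest over the raw
bytes, decode only afterwards) could look roughly like the sketch below.
It is written in Python purely for illustration and is not the Perl code
from the series; `read_config` and the sample content are made up.

```python
import hashlib


def read_config(raw: bytes):
    """Return (SHA-1 hex digest of the raw bytes, decoded text)."""
    # Hash the bytes exactly as they are stored on disk ...
    digest = hashlib.sha1(raw).hexdigest()
    # ... and only then turn them into text. Strict decoding raises
    # UnicodeDecodeError for content that is not valid UTF-8.
    text = raw.decode("utf-8")
    return digest, text


# Hypothetical storage.cfg-like content, encoded as UTF-8.
raw = "dir: local\n\tpath /mnt/ö\n".encode("utf-8")
digest, text = read_config(raw)
```

With strict decoding like this, a pre-existing file that is not valid
UTF-8 fails to load at all, which is exactly the breakage to avoid.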
>
> 2. It ensures that general config files are explicitly encoded
> as UTF-8 before serialization to prevent similar issues the other
> way around.
>
> 3. It adds a unit test to prevent similar issues from happening in
> the future.
>
> 4. It fixes the PBS storage plugin for serializing/deserializing the
> password, similar to points 1 and 2, but for the case where the
> password itself contains Unicode characters.
>
> For more information on this topic, please read:
> https://perldoc.perl.org/perlunifaq#When-should-I-decode-or-encode?
>
> I'm sending this patch series to begin a discussion on how to handle
> encodings in our config files, and eventually also other relevant
> files. In my opinion, we should handle them consistently as UTF-8,
> across both Perl and Rust code.
Yes, that is the long-term plan AFAIK, but right now existing config
files might be encoded differently.
>
> Because Linux has used UTF-8 encoding by default for a long time,
> as have browsers* and other software, I doubt that
> we have to worry too much about other encodings
> like Latin-1 (ISO-8859-1). However, according to the
> Perl documentation, Perl could have deserialized such a string
> in the past (since Latin-1 is the default in Perl when not decoding
> explicitly), and it is no longer able to after the fixes included
> in this patch series.
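Unfortunately, we do have to worry about that. For example, a mount
point path containing 'ö' currently ends up in a non-UTF-8 config file: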
> [I] root@pve8a1 ~# pct set 112 --mp1 /root/ö,mp=/o
> [I] root@pve8a1 ~# file /etc/pve/lxc/112.conf
> /etc/pve/lxc/112.conf: ISO-8859 text
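
Presumably the 'ö' is stored there as the single byte 0xF6 rather than
the UTF-8 sequence 0xC3 0xB6, so a strict UTF-8 decode would reject the
file. A quick check in Python (illustration only; the line content is
assumed from the command above, not copied from the actual file, and
`is_valid_utf8` is a made-up helper):

```python
# Assumed shape of the config line, once per encoding.
line_latin1 = "mp1: /root/ö,mp=/o\n".encode("latin-1")
line_utf8 = "mp1: /root/ö,mp=/o\n".encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(line_latin1), is_valid_utf8(line_utf8))
```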
>
> We have to ask ourselves:
>
> a. Do we want to define, in general, that configuration files should
> always be serialized and deserialized as UTF-8? If yes, should we
> consider this a breaking change?
Yes, see above.
>
> b. Do we want to introduce any backward-compatibility for existing
> config files? In other words, assume that older files might have
> used other encodings in the past. To be honest, I didn't test
> Latin-1 encoded files yet, so I'm not sure how (or if) our
> current code would handle it.
Yes, we certainly need to.
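
One possible shape for such a compatibility path is to try strict UTF-8
first and fall back to Latin-1 only if that fails. Again a Python sketch
for illustration, not a proposal for the actual Perl implementation;
`decode_config` is a hypothetical name.

```python
def decode_config(raw: bytes) -> str:
    """Decode config bytes as UTF-8, falling back to Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to exactly one code point, so this
        # fallback cannot fail.
        return raw.decode("latin-1")


new_style = decode_config("path /mnt/ö".encode("utf-8"))
old_style = decode_config("path /mnt/ö".encode("latin-1"))
```

The known weakness of this approach is that Latin-1 content which
happens to form valid UTF-8 byte sequences would be misread as UTF-8.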
>
> There are further parsers and plugins that I still need to modify,
> but I first wanted to get your feedback on this subject.
>
>
> * By browsers I mean the encoding used for HTML, not JavaScript's
> internal UTF-16 string representation.
>
>