From: Fiona Ebner <f.ebner@proxmox.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>,
	"Laurențiu Leahu-Vlăducu" <l.leahu-vladucu@proxmox.com>
Subject: Re: [pve-devel] [RFC PATCH pve-storage/common] fix #3256: allow special characters in storage-related config files
Date: Mon, 17 Feb 2025 11:15:29 +0100
Message-ID: <082d3fe0-9c6c-494d-9ec3-f64645cd7a53@proxmox.com>
In-Reply-To: <20250214154040.159607-1-l.leahu-vladucu@proxmox.com>

On 14.02.25 at 16:40, Laurențiu Leahu-Vlăducu wrote:
> 
> This patch series fixes bug #3256:
> 
> 1. It ensures that general config files (e.g. storage.cfg) are decoded
>    from UTF-8 when deserialized. Previously, no decoding happened,
>    meaning that Perl interpreted the string as single bytes instead of
>    Unicode code points. Note: while I would have preferred to decode
>    the text right after reading from the file, there are some Perl
>    functions like Digest::SHA::sha1_hex that expect bytes
>    instead of UTF-8.


What about pre-existing configs that are not UTF-8? Not breaking those
is very important here.
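
To make the concern concrete, here is a minimal sketch in Python (not
the Perl code from the patch) of the order of operations described in
point 1: hash the raw bytes first, then decode strictly as UTF-8. The
function name and the sample config content are made up for
illustration. With strict decoding, a pre-existing Latin-1 file is
rejected outright:

```python
import hashlib


def read_config_strict(raw: bytes) -> tuple[str, str]:
    """Hash the raw bytes first, then decode them strictly as UTF-8."""
    # The digest is computed over bytes, before any decoding happens.
    digest = hashlib.sha1(raw).hexdigest()
    # Raises UnicodeDecodeError if the input is not valid UTF-8.
    text = raw.decode("utf-8")
    return digest, text


# Hypothetical config content, once as UTF-8 and once as Latin-1.
content = "dir: local\n\tpath /mnt/ö\n"
utf8_cfg = content.encode("utf-8")
legacy_cfg = content.encode("latin-1")

digest, text = read_config_strict(utf8_cfg)

try:
    read_config_strict(legacy_cfg)
    legacy_ok = True
except UnicodeDecodeError:
    # This is the breakage to avoid for existing installations.
    legacy_ok = False
```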

> 
> 2. It ensures that general config files are explicitly encoded
>    as UTF-8 before serialization to prevent similar issues the other
>    way around.
> 
> 3. It adds a unit test to prevent similar issues from happening in
>    the future.
> 
> 4. It fixes the PBS storage plugin for serializing/deserializing the
>    password, similar to points 1 and 2, but for the case where the
>    password itself contains Unicode characters.
> 
> For more information on this topic, please read:
> https://perldoc.perl.org/perlunifaq#When-should-I-decode-or-encode?
> 
> I'm sending this patch series to begin a discussion on how to handle
> encodings in our config files, and eventually also other relevant
> files. In my opinion, we should handle them consistently as UTF-8,
> across both Perl and Rust code.

Yes, that is the long-term plan AFAIK, but right now existing config
files might be encoded differently.

> 
> Since Linux has used UTF-8 by default for a long time, as have
> browsers* and other software, I doubt that we have to worry too much
> about other encodings like Latin-1 (ISO-8859-1). However, according
> to the Perl documentation, Perl could have deserialized such a string
> in the past (since Latin-1 is the default in Perl when not decoding
> explicitly), and it is no longer able to after the fixes included
> in this patch series.

Unfortunately, we do. For example, a non-ASCII mount point path
currently ends up in the container config as ISO-8859 text:

> [I] root@pve8a1 ~# pct set 112 --mp1 /root/ö,mp=/o
> [I] root@pve8a1 ~# file /etc/pve/lxc/112.conf
> /etc/pve/lxc/112.conf: ISO-8859 text
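
The `file` output is consistent with "ö" having been stored as the
single Latin-1 byte 0xF6 rather than the two-byte UTF-8 sequence. That
is an assumption about the file contents, not something verified here.
A short Python sketch of the difference, with a hypothetical helper
`is_valid_utf8`:

```python
path = "/root/ö"

latin1_bytes = path.encode("latin-1")  # "ö" becomes the single byte 0xF6
utf8_bytes = path.encode("utf-8")      # "ö" becomes the two bytes 0xC3 0xB6


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes, is_valid_utf8(latin1_bytes))
print(utf8_bytes, is_valid_utf8(utf8_bytes))
```

A lone 0xF6 byte is not a valid UTF-8 sequence, so a strict UTF-8
decoder cannot read such a file.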

> 
> We have to ask ourselves:
> 
> a. Do we want to define, in general, that configuration files should
>    always be serialized and deserialized as UTF-8? If yes, should we
>    consider this a breaking change?

Yes, see above.

> 
> b. Do we want to introduce any backward-compatibility for existing
>    config files? In other words, assume that older files might have
>    used other encodings in the past. To be honest, I didn't test
>    Latin-1 encoded files yet, so I'm not sure how (or if) our
>    current code would handle it.

Yes, we certainly need to.
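
One possible shape for such backward compatibility, sketched in Python
purely as an illustration (this is not what the patch series does, and
the function names are hypothetical): try UTF-8 first, fall back to
Latin-1 on failure, and always write UTF-8.

```python
def decode_config(raw: bytes) -> tuple[str, str]:
    """Decode config bytes, preferring UTF-8 and falling back to Latin-1.

    Returns the decoded text and the name of the encoding that was used.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never fails.
        return raw.decode("latin-1"), "latin-1"


def encode_config(text: str) -> bytes:
    """Always serialize as UTF-8, so files migrate when rewritten."""
    return text.encode("utf-8")


# A legacy file as it might exist on disk today (assumed content).
legacy = "mp1: /root/ö,mp=/o\n".encode("latin-1")
text, used = decode_config(legacy)
rewritten = encode_config(text)
```

The fallback is a heuristic: a Latin-1 file whose high bytes happen to
form valid UTF-8 sequences would be read as UTF-8, and any input that
is not valid UTF-8 is treated as Latin-1 even if it was written in
some other encoding.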

> 
> There are further parsers and plugins that I still need to modify,
> but I first wanted to get your feedback on this subject.
> 
> 
> * With browsers I mean the encoding in HTML and not the JavaScript
> internals with its UTF-16 encoding.
> 
> 


