public inbox for pve-devel@lists.proxmox.com
From: "Laurențiu Leahu-Vlăducu" <l.leahu-vladucu@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC PATCH pve-storage/common] fix #3256: allow special characters in storage-related config files
Date: Fri, 14 Feb 2025 16:40:37 +0100
Message-ID: <20250214154040.159607-1-l.leahu-vladucu@proxmox.com>


This patch series fixes bug #3256:

1. It ensures that general config files (e.g. storage.cfg) are decoded
   from UTF-8 when deserialized. Previously, no decoding happened,
   meaning that Perl treated each byte of the file as a separate
   character instead of combining multi-byte sequences into Unicode
   code points. Note: I would have preferred to decode the text right
   after reading it from the file, but some Perl functions, such as
   Digest::SHA::sha1_hex, expect bytes rather than decoded character
   strings, so the decoding happens at a later point.

2. It ensures that general config files are explicitly encoded
   as UTF-8 before serialization to prevent similar issues the other
   way around.

3. It adds a unit test to prevent similar issues from happening in
   the future.

4. It fixes the PBS storage plugin for serializing/deserializing the
   password, similar to points 1 and 2, but for the case where the
   password itself contains Unicode characters.
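
To illustrate points 1 and 2, here is a minimal sketch of the
"hash the bytes, then decode; encode before writing" order. It is
written in Python purely for illustration and is not the Perl code
from the patches; load_config, dump_config and the sample storage
entry are hypothetical names made up for this example.

```python
import hashlib


def load_config(raw: bytes) -> tuple[str, str]:
    """Return (digest, text) for the raw bytes of a config file."""
    # Hash the raw bytes first: digest functions operate on bytes,
    # not on decoded character strings.
    digest = hashlib.sha1(raw).hexdigest()
    # Decode explicitly so that each character becomes one code point.
    text = raw.decode("utf-8")
    return digest, text


def dump_config(text: str) -> bytes:
    """Encode explicitly before the bytes are written back to disk."""
    return text.encode("utf-8")


# Hypothetical storage entry containing non-ASCII characters.
raw = "dir: größe\n\tpath /mnt/größe\n".encode("utf-8")
digest, text = load_config(raw)
```

Decoding and encoding at the boundary keeps the round trip lossless:
dump_config(text) yields the same bytes that were read.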

For more information on this topic, please read:
https://perldoc.perl.org/perlunifaq#When-should-I-decode-or-encode?

I'm sending this patch series to start a discussion on how to handle
encodings in our config files, and eventually in other relevant
files as well. In my opinion, we should handle them consistently as
UTF-8, across both Perl and Rust code.

Since Linux has used UTF-8 by default for a long time, as have
browsers* and other software, I doubt that we have to worry too much
about other encodings like Latin-1 (ISO-8859-1). However, according
to the Perl documentation, Perl could have deserialized such a string
in the past (Latin-1 is what Perl assumes when a string is not
decoded explicitly), and it can no longer do so after the fixes
included in this patch series.
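
The difference between the two interpretations can be sketched as
follows. This is a Python analogy, not Perl, and the sample string is
made up: reading bytes as Latin-1 always succeeds (every byte maps to
some character), whereas strict UTF-8 decoding rejects bytes that
were written as Latin-1.

```python
name = "Größe"
latin1_bytes = name.encode("latin-1")  # one byte per character
utf8_bytes = name.encode("utf-8")      # ö and ß take two bytes each

# Interpreting UTF-8 bytes as Latin-1 never fails, but every
# multi-byte sequence turns into several unrelated characters.
mojibake = utf8_bytes.decode("latin-1")

# Strict UTF-8 decoding of Latin-1 bytes fails for non-ASCII input.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```

Here mojibake differs from the original name, and strict_ok ends up
False, which mirrors the two situations described above.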

We have to ask ourselves:

a. Do we want to define, in general, that configuration files should
   always be serialized and deserialized as UTF-8? If yes, should we
   consider this a breaking change?

b. Do we want to introduce any backward compatibility for existing
   config files? In other words, should we assume that older files
   might have used other encodings in the past? To be honest, I
   haven't tested Latin-1 encoded files yet, so I'm not sure how (or
   if) our current code would handle them.
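
If backward compatibility (question b) were wanted, one possible
approach would be to try strict UTF-8 first and fall back to Latin-1.
The sketch below is in Python and only illustrates the idea; it is
not part of this series, and decode_config is a hypothetical helper
name.

```python
def decode_config(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_used) for the raw bytes of a config file."""
    try:
        # Preferred: strict UTF-8, which fails on invalid sequences.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Fallback: Latin-1 maps every byte to a character, so this
        # step cannot fail.
        return raw.decode("latin-1"), "latin-1"
```

One caveat of this heuristic: a Latin-1 file whose bytes happen to
form valid UTF-8 sequences would be decoded as UTF-8, so it can guess
wrong in rare cases.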

There are further parsers and plugins that I still need to modify,
but I first wanted to get your feedback on this subject.


* By browsers I mean the encoding used in HTML, not the JavaScript
internals with their UTF-16 encoding.


pve-common:

Laurențiu Leahu-Vlăducu (2):
  fix #3256: SectionConfig: ensure UTF-8 encoding for general configs
  SectionConfig: add unit test for UTF-8 configs

 src/PVE/SectionConfig.pm    | 10 +++++++---
 test/section_config_test.pl | 25 +++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 3 deletions(-)


pve-storage:

Laurențiu Leahu-Vlăducu (1):
  fix #3256: Storage: PBS: ensure passwords are saved and loaded as
    UTF-8

 src/PVE/Storage/PBSPlugin.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.39.5




