From: "Laurențiu Leahu-Vlăducu" <l.leahu-vladucu@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC PATCH pve-storage/common] fix #3256: allow special characters in storage-related config files
Date: Fri, 14 Feb 2025 16:40:37 +0100
Message-ID: <20250214154040.159607-1-l.leahu-vladucu@proxmox.com>
This patch series fixes bug #3256:
1. It ensures that general config files (e.g. storage.cfg) are decoded
from UTF-8 when deserialized. Previously, no decoding happened, so
Perl treated every byte of the string as a separate character instead
of interpreting the bytes as UTF-8-encoded Unicode code points. Note:
while I would have preferred to decode the text right after reading it
from the file, some Perl functions such as Digest::SHA::sha1_hex
expect bytes rather than decoded character strings, so the decoding
happens later instead.
2. It ensures that general config files are explicitly encoded as
UTF-8 when serialized, to prevent the same kind of issue in the
opposite direction.
3. It adds a unit test to prevent similar issues from happening in
the future.
4. It fixes the PBS storage plugin for serializing/deserializing the
password, similar to points 1 and 2, but for the case where the
password itself contains Unicode characters.
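
As a minimal illustration of points 1 and 2, the sketch below shows the
difference between raw bytes and decoded code points. It is written in
Python purely to demonstrate the concept and is not the code from the
patches; the storage name and path are made up. In Perl, the
corresponding operations would be along the lines of Encode's decode()
and encode().

```python
import hashlib

# Hypothetical storage.cfg content as it sits on disk: raw UTF-8 bytes.
raw = "dir: backup-städte\n\tpath /mnt/städte\n".encode("utf-8")

# Digest functions operate on bytes, so the digest is computed
# before any decoding takes place.
digest = hashlib.sha1(raw).hexdigest()

# Decoding turns the bytes into Unicode code points.
text = raw.decode("utf-8")

# Without UTF-8 decoding, every byte is treated as its own character
# (a Latin-1 view of the data), so 'ä' shows up as two characters.
undecoded = raw.decode("latin-1")

# When writing the file back, encode explicitly as UTF-8.
written = text.encode("utf-8")

print(len(raw), len(text))        # byte count vs. code point count
print("ä" in text, "ä" in undecoded)
```

The round trip (decode on read, encode on write) reproduces the
original bytes, while the undecoded view contains mojibake instead of
the intended character.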
For more information on this topic, please read:
https://perldoc.perl.org/perlunifaq#When-should-I-decode-or-encode?
I'm sending this patch series to start a discussion on how to handle
encodings in our config files, and eventually in other relevant files
as well. In my opinion, we should handle them consistently as UTF-8,
across both Perl and Rust code.
Since Linux, browsers* and most other software have used UTF-8 by
default for a long time, I doubt that we have to worry much about
other encodings such as Latin-1 (ISO-8859-1). However, according to
the Perl documentation, Perl could deserialize Latin-1 strings in the
past (Latin-1 is what Perl assumes when nothing is decoded
explicitly), and it can no longer do so after the fixes included in
this patch series.
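
To make the concern concrete, here is a small sketch (Python, for
illustration only; decode_strict is a hypothetical helper and the path
is made up) of why a Latin-1 encoded file stops being readable once
the content is decoded strictly as UTF-8:

```python
def decode_strict(data):
    """Hypothetical helper: decode as UTF-8, return None on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None

# In Latin-1, 'ä' is the single byte 0xE4, which is not valid UTF-8
# when followed by a plain ASCII byte.
latin1_bytes = "path /mnt/städte\n".encode("latin-1")
utf8_bytes = "path /mnt/städte\n".encode("utf-8")

print(decode_strict(latin1_bytes))  # decoding fails
print(decode_strict(utf8_bytes))    # decoding succeeds
```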
We have to ask ourselves:
a. Do we want to define, in general, that configuration files should
always be serialized and deserialized as UTF-8? If yes, should we
consider this a breaking change?
b. Do we want to introduce any backward compatibility for existing
config files, i.e. assume that older files might have been written
in other encodings? To be honest, I haven't tested Latin-1-encoded
files yet, so I'm not sure how (or if) our current code handles
them.
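
For the sake of discussion, one conceivable form of backward
compatibility is sketched below: try UTF-8 first and fall back to
Latin-1. This is only an assumption about what such a fallback could
look like (in Python, with a hypothetical decode_config helper), not
something the patches implement.

```python
def decode_config(data):
    """Hypothetical helper: return (text, encoding_used)."""
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never fails.
        return data.decode("latin-1"), "latin-1"

print(decode_config("städte".encode("utf-8")))
print(decode_config("städte".encode("latin-1")))

# Caveat: some Latin-1 byte sequences are also valid UTF-8, so the
# fallback is a heuristic. The Latin-1 text 'Ã¤' is read as 'ä'.
print(decode_config("Ã¤".encode("latin-1")))
```

The last call shows the limitation of this approach: the encoding
cannot be detected reliably from the bytes alone.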
There are further parsers and plugins that I still need to modify,
but I first wanted to get your feedback on this subject.
* By browsers, I mean the encoding used for HTML, not JavaScript's
internal UTF-16 string representation.
pve-common:
Laurențiu Leahu-Vlăducu (2):
fix #3256: SectionConfig: ensure UTF-8 encoding for general configs
SectionConfig: add unit test for UTF-8 configs
src/PVE/SectionConfig.pm | 10 +++++++---
test/section_config_test.pl | 25 +++++++++++++++++++++++++
2 files changed, 32 insertions(+), 3 deletions(-)
pve-storage:
Laurențiu Leahu-Vlăducu (1):
fix #3256: Storage: PBS: ensure passwords are saved and loaded as
UTF-8
src/PVE/Storage/PBSPlugin.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Thread overview: 5+ messages
2025-02-14 15:40 Laurențiu Leahu-Vlăducu [this message]
2025-02-14 15:40 ` [pve-devel] [RFC PATCH pve-common 1/2] fix #3256: SectionConfig: ensure UTF-8 encoding for general configs Laurențiu Leahu-Vlăducu
2025-02-14 15:40 ` [pve-devel] [RFC PATCH pve-storage 1/1] fix #3256: Storage: PBS: ensure passwords are saved and loaded as UTF-8 Laurențiu Leahu-Vlăducu
2025-02-14 15:40 ` [pve-devel] [RFC PATCH pve-common 2/2] SectionConfig: add unit test for UTF-8 configs Laurențiu Leahu-Vlăducu
2025-02-17 10:15 ` [pve-devel] [RFC PATCH pve-storage/common] fix #3256: allow special characters in storage-related config files Fiona Ebner