From: Dominik Csapak <d.csapak@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [RFC PATCH proxmox-backup] api2/types ArchiveEntry: fall back to iso 8859-1 if not valid utf8
Date: Wed,  5 May 2021 09:53:02 +0200
Message-ID: <20210505075302.14831-1-d.csapak@proxmox.com>

If a file name is not valid UTF-8, we fall back to ISO 8859-1. This works
neatly, since the bytes of visible (non-control) characters in ISO 8859-1
map 1:1 to the Unicode code points of the same value, which is exactly what
'u8 as char' produces in Rust.

Theoretically the source could also be another ISO 8859 variant (e.g. ISO
8859-15), but at this point that is impossible to detect, because the same
bytes simply have a different meaning there.

If we want, we could make this configurable (e.g. with a parameter),
but I'm not sure it is worth the effort.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
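
The fallback described above can be sketched standalone as follows. This is
only an illustration under stated assumptions, not the code from the diff:
the helper name `decode_name` is hypothetical, and it borrows the input via
`std::str::from_utf8` instead of copying it into a `Vec` first.

```rust
// Hypothetical helper for illustration; not part of this patch.
// Decode a file name as UTF-8 and fall back to ISO 8859-1 if that fails.
fn decode_name(name: &[u8]) -> String {
    match std::str::from_utf8(name) {
        // Valid UTF-8 is passed through unchanged.
        Ok(text) => text.to_owned(),
        // 'u8 as char' yields the code point with the same numeric value
        // (U+0000..=U+00FF), which is the ISO 8859-1 interpretation.
        Err(_) => name.iter().map(|&b| b as char).collect(),
    }
}
```

For example, `decode_name(b"m\xfcller.txt")` gives "müller.txt", since 0xFC on
its own is not valid UTF-8. The lone byte 0xA4 always decodes to '¤' (U+00A4),
even if the source was ISO 8859-15, where that byte means '€'.
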
 src/api2/types/mod.rs | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/api2/types/mod.rs b/src/api2/types/mod.rs
index e829f207..03660531 100644
--- a/src/api2/types/mod.rs
+++ b/src/api2/types/mod.rs
@@ -1379,10 +1379,17 @@ impl ArchiveEntry {
         entry_type: Option<&DirEntryAttribute>,
         size: Option<u64>,
     ) -> Self {
+        let name = filepath.split(|x| *x == b'/').last().unwrap();
+        let text = match String::from_utf8(name.to_vec()) {
+            Ok(text) => text,
+            Err(err) => { // fall back to iso-8859-1
+                err.as_bytes().iter().map(|&c| c as char).collect()
+            }
+        };
+
         Self {
             filepath: base64::encode(filepath),
-            text: String::from_utf8_lossy(filepath.split(|x| *x == b'/').last().unwrap())
-                .to_string(),
+            text,
             entry_type: match entry_type {
                 Some(entry_type) => CatalogEntryType::from(entry_type).to_string(),
                 None => "v".to_owned(),
-- 
2.20.1




