all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Dominik Csapak <d.csapak@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox v2] fix #3618: proxmox-async: zip: add conditional EFS flag to zip files
Date: Mon, 10 Jan 2022 12:23:41 +0100	[thread overview]
Message-ID: <20220110112341.2961733-1-d.csapak@proxmox.com> (raw)

this flag marks the file names as 'UTF-8' encoded if they are valid UTF-8.

By default, encoding of file names in zips are defined as code page 437,
but we save the filenames as bytes (like in linux fs).

For linux systems this would not be a problem since most tools
simply use the filenames as bytes, but for the zip utility under
windows it's important since NTFS uses UTF-16 for file names.

For filenames that are valid UTF-8, they are decoded as UTF-8 everywhere
correctly (Linux as UTF-8 bytes, Windows as correct UTF-16 sequence) and
for other filenames with a high bit set, it depends on the OS/Software
what exactly happens. Some cases below:

* Windows + Built-in/7zip: decoded as CP437
* Debian + zip: Bytes taken as-is
* Debian + 7z: interpreted as Windows1252, decoded as UTF-8

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v1:
* moved to proxmox/proxmox-async from proxmox-backup/pbs-tools
* included bug# in the subject
* removed two spurious newlines

 proxmox-async/src/zip.rs | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/proxmox-async/src/zip.rs b/proxmox-async/src/zip.rs
index 12d9f29..d1df713 100644
--- a/proxmox-async/src/zip.rs
+++ b/proxmox-async/src/zip.rs
@@ -34,6 +34,9 @@ const VERSION_MADE_BY: u16 = 0x032d;
 const ZIP64_EOCD_RECORD: u32 = 0x06064B50;
 const ZIP64_EOCD_LOCATOR: u32 = 0x07064B50;
 
+const LFH_GENERAL_PURPOSE_FLAGS: u16 = 1 << 3; // we place crc32 in the data descriptor
+const LFH_GPF_EFS_BIT: u16 = 1 << 11; // EFS, marks filename & comment as UTF-8
+
 // bits for time:
 // 0-4: day of the month (1-31)
 // 5-8: month: (1 = jan, etc.)
@@ -200,6 +203,7 @@ pub struct ZipEntry {
     compressed_size: u64,
     offset: u64,
     is_file: bool,
+    is_utf8_filename: bool,
 }
 
 impl ZipEntry {
@@ -220,8 +224,11 @@ impl ZipEntry {
             relpath.push(""); // adds trailing slash
         }
 
+        let filename: OsString = relpath.into();
+        let is_utf8_filename = filename.to_str().is_some();
+
         Self {
-            filename: relpath.into(),
+            filename,
             crc32: 0,
             mtime,
             mode,
@@ -229,6 +236,15 @@ impl ZipEntry {
             compressed_size: 0,
             offset: 0,
             is_file,
+            is_utf8_filename,
+        }
+    }
+
+    fn get_general_purpose_flags(&self) -> u16 {
+        if self.is_utf8_filename {
+            LFH_GENERAL_PURPOSE_FLAGS | LFH_GPF_EFS_BIT
+        } else {
+            LFH_GENERAL_PURPOSE_FLAGS
         }
     }
 
@@ -249,7 +265,7 @@ impl ZipEntry {
             LocalFileHeader {
                 signature: LOCAL_FH_SIG,
                 version_needed: 0x2d,
-                flags: 1 << 3,
+                flags: self.get_general_purpose_flags(),
                 compression: 0x8,
                 time,
                 date,
@@ -332,7 +348,7 @@ impl ZipEntry {
                 signature: CENTRAL_DIRECTORY_FH_SIG,
                 version_made_by: VERSION_MADE_BY,
                 version_needed: VERSION_NEEDED,
-                flags: 1 << 3,
+                flags: self.get_general_purpose_flags(),
                 compression: 0x8,
                 time,
                 date,
-- 
2.30.2





             reply	other threads:[~2022-01-10 11:24 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-10 11:23 Dominik Csapak [this message]
2022-01-11  5:49 ` [pbs-devel] applied: " Thomas Lamprecht
2022-01-11  7:40   ` Dominik Csapak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220110112341.2961733-1-d.csapak@proxmox.com \
    --to=d.csapak@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal