all lists on lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension
@ 2021-03-15 11:21 Dominik Csapak
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dominik Csapak @ 2021-03-15 11:21 UTC (permalink / raw)
  To: pbs-devel

it is not optional, even though we give the size explicitely

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/tools/zip.rs | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/tools/zip.rs b/src/tools/zip.rs
index 1f9ee517..cca2e766 100644
--- a/src/tools/zip.rs
+++ b/src/tools/zip.rs
@@ -80,6 +80,7 @@ struct Zip64FieldWithOffset {
     uncompressed_size: u64,
     compressed_size: u64,
     offset: u64,
+    start_disk: u32,
 }
 
 #[derive(Endian)]
@@ -334,10 +335,11 @@ impl ZipEntry {
             &mut buf,
             Zip64FieldWithOffset {
                 field_type: 1,
-                field_size: 3 * 8,
+                field_size: 3 * 8 + 4,
                 uncompressed_size: self.uncompressed_size,
                 compressed_size: self.compressed_size,
                 offset: self.offset,
+                start_disk: 0,
             },
         )
         .await?;
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary
  2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
@ 2021-03-15 11:21 ` Dominik Csapak
  2021-03-16  8:13   ` [pbs-devel] applied: " Dietmar Maurer
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
  2021-03-15 12:02 ` [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Thomas Lamprecht
  2 siblings, 1 reply; 6+ messages in thread
From: Dominik Csapak @ 2021-03-15 11:21 UTC (permalink / raw)
  To: pbs-devel

if neither offset nor size exceeds 32bit, do not add the
zip64 extension field

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
it does not harm normally, but can blow up the size of the zip a bit
if one has many small files

 src/tools/zip.rs | 52 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 17 deletions(-)

diff --git a/src/tools/zip.rs b/src/tools/zip.rs
index cca2e766..55f2a24a 100644
--- a/src/tools/zip.rs
+++ b/src/tools/zip.rs
@@ -301,10 +301,26 @@ impl ZipEntry {
         let filename_len = filename.len();
         let header_size = size_of::<CentralDirectoryFileHeader>();
         let zip_field_size = size_of::<Zip64FieldWithOffset>();
-        let size: usize = header_size + filename_len + zip_field_size;
+        let mut size: usize = header_size + filename_len;
 
         let (date, time) = epoch_to_dos(self.mtime);
 
+        let (compressed_size, uncompressed_size, offset, need_zip64) = if self.compressed_size
+            >= (u32::MAX as u64)
+            || self.uncompressed_size >= (u32::MAX as u64)
+            || self.offset >= (u32::MAX as u64)
+        {
+            size += zip_field_size;
+            (0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, true)
+        } else {
+            (
+                self.compressed_size as u32,
+                self.uncompressed_size as u32,
+                self.offset as u32,
+                false,
+            )
+        };
+
         write_struct(
             &mut buf,
             CentralDirectoryFileHeader {
@@ -316,33 +332,35 @@ impl ZipEntry {
                 time,
                 date,
                 crc32: self.crc32,
-                compressed_size: 0xFFFFFFFF,
-                uncompressed_size: 0xFFFFFFFF,
+                compressed_size,
+                uncompressed_size,
                 filename_len: filename_len as u16,
-                extra_field_len: zip_field_size as u16,
+                extra_field_len: if need_zip64 { zip_field_size as u16 } else { 0 },
                 comment_len: 0,
                 start_disk: 0,
                 internal_flags: 0,
                 external_flags: (self.mode as u32) << 16 | (!self.is_file as u32) << 4,
-                offset: 0xFFFFFFFF,
+                offset,
             },
         )
         .await?;
 
         buf.write_all(filename).await?;
 
-        write_struct(
-            &mut buf,
-            Zip64FieldWithOffset {
-                field_type: 1,
-                field_size: 3 * 8 + 4,
-                uncompressed_size: self.uncompressed_size,
-                compressed_size: self.compressed_size,
-                offset: self.offset,
-                start_disk: 0,
-            },
-        )
-        .await?;
+        if need_zip64 {
+            write_struct(
+                &mut buf,
+                Zip64FieldWithOffset {
+                    field_type: 1,
+                    field_size: 3 * 8 + 4,
+                    uncompressed_size: self.uncompressed_size,
+                    compressed_size: self.compressed_size,
+                    offset: self.offset,
+                    start_disk: 0,
+                },
+            )
+            .await?;
+        }
 
         Ok(size)
     }
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate
  2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
@ 2021-03-15 11:21 ` Dominik Csapak
  2021-03-16  8:14   ` Dietmar Maurer
  2021-03-15 12:02 ` [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Thomas Lamprecht
  2 siblings, 1 reply; 6+ messages in thread
From: Dominik Csapak @ 2021-03-15 11:21 UTC (permalink / raw)
  To: pbs-devel

to get smaller zip files

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
@Wolfgang, could you please look at this? I am not sure about using
the Compress in an async function. It is only in memory, but does it
'block'? i am not sure how we could do this differently in an
async context though...

 Cargo.toml       |  1 +
 src/tools/zip.rs | 75 +++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 63 insertions(+), 13 deletions(-)

diff --git a/Cargo.toml b/Cargo.toml
index 79945312..06967c20 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -31,6 +31,7 @@ crc32fast = "1"
 endian_trait = { version = "0.6", features = ["arrays"] }
 anyhow = "1.0"
 futures = "0.3"
+flate2 = "1.0"
 h2 = { version = "0.3", features = [ "stream" ] }
 handlebars = "3.0"
 http = "0.2"
diff --git a/src/tools/zip.rs b/src/tools/zip.rs
index 55f2a24a..237b8a1f 100644
--- a/src/tools/zip.rs
+++ b/src/tools/zip.rs
@@ -11,9 +11,10 @@ use std::mem::size_of;
 use std::os::unix::ffi::OsStrExt;
 use std::path::{Component, Path, PathBuf};
 
-use anyhow::{Error, Result};
+use anyhow::{bail, Error, Result};
 use endian_trait::Endian;
 use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
+use flate2::{Compress, Compression, FlushCompress};
 
 use crc32fast::Hasher;
 use proxmox::tools::time::gmtime;
@@ -245,7 +246,7 @@ impl ZipEntry {
                 signature: LOCAL_FH_SIG,
                 version_needed: 0x2d,
                 flags: 1 << 3,
-                compression: 0,
+                compression: 0x8,
                 time,
                 date,
                 crc32: 0,
@@ -328,7 +329,7 @@ impl ZipEntry {
                 version_made_by: VERSION_MADE_BY,
                 version_needed: VERSION_NEEDED,
                 flags: 1 << 3,
-                compression: 0,
+                compression: 0x8,
                 time,
                 date,
                 crc32: self.crc32,
@@ -402,6 +403,7 @@ where
     files: Vec<ZipEntry>,
     target: W,
     buf: ByteBuffer,
+    outbuf: ByteBuffer,
 }
 
 impl<W: AsyncWrite + Unpin> ZipEncoder<W> {
@@ -410,7 +412,8 @@ impl<W: AsyncWrite + Unpin> ZipEncoder<W> {
             byte_count: 0,
             files: Vec::new(),
             target,
-            buf: ByteBuffer::with_capacity(1024*1024),
+            buf: ByteBuffer::with_capacity(1024 * 1024),
+            outbuf: ByteBuffer::with_capacity(1024 * 1024),
         }
     }
 
@@ -423,25 +426,71 @@ impl<W: AsyncWrite + Unpin> ZipEncoder<W> {
         self.byte_count += entry.write_local_header(&mut self.target).await?;
         if let Some(mut content) = content {
             let mut hasher = Hasher::new();
-            let mut size = 0;
+            let mut deflate_encoder = Compress::new(Compression::fast(), false);
+
             loop {
 
+                let syncmode = if self.buf.is_full() {
+                    FlushCompress::Sync
+                } else {
+                    FlushCompress::None
+                };
+
+                let old_pos = self.buf.len();
                 let count = self.buf.read_from_async(&mut content).await?;
 
                 // end of file
-                if count == 0 {
+                if count == 0 && syncmode == FlushCompress::None {
                     break;
                 }
 
-                size += count;
-                hasher.update(&self.buf);
-                self.target.write_all(&self.buf).await?;
-                self.buf.consume(count);
+                hasher.update(&self.buf[old_pos..]);
+
+                let old_read = deflate_encoder.total_in();
+                let old_write = deflate_encoder.total_out();
+                deflate_encoder.compress(
+                    &self.buf,
+                    &mut self.outbuf.get_free_mut_slice(),
+                    syncmode,
+                )?;
+                let read = (deflate_encoder.total_in() - old_read) as usize;
+                let write = (deflate_encoder.total_out() - old_write) as usize;
+
+                self.outbuf.add_size(write);
+
+                if read == 0 {
+                    bail!("did not consume any data!");
+                }
+
+                self.target.write_all(&self.outbuf).await?;
+                self.buf.consume(read);
+                self.outbuf.clear();
             }
 
-            self.byte_count += size;
-            entry.compressed_size = size.try_into()?;
-            entry.uncompressed_size = size.try_into()?;
+            let old_read = deflate_encoder.total_in();
+            let old_write = deflate_encoder.total_out();
+            deflate_encoder.compress(
+                &self.buf,
+                &mut self.outbuf.get_free_mut_slice(),
+                FlushCompress::Finish,
+            )?;
+            let read = (deflate_encoder.total_in() - old_read) as usize;
+            let write = (deflate_encoder.total_out() - old_write) as usize;
+
+            self.outbuf.add_size(write);
+
+            if read != self.buf.len() {
+                bail!("deflate did not use all input bytes!");
+            }
+
+            self.target.write_all(&self.outbuf).await?;
+            self.buf.clear();
+            self.outbuf.clear();
+
+            self.byte_count += deflate_encoder.total_out() as usize;
+            entry.compressed_size = deflate_encoder.total_out();
+            entry.uncompressed_size = deflate_encoder.total_in();
+
             entry.crc32 = hasher.finalize();
         }
         self.byte_count += entry.write_data_descriptor(&mut self.target).await?;
-- 
2.20.1





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension
  2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
@ 2021-03-15 12:02 ` Thomas Lamprecht
  2 siblings, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2021-03-15 12:02 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

On 15.03.21 12:21, Dominik Csapak wrote:
> it is not optional, even though we give the size explicitely
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  src/tools/zip.rs | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
>

Fixes the issues when using `unar` for extraction here.

applied, thanks!




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pbs-devel] applied: [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
@ 2021-03-16  8:13   ` Dietmar Maurer
  0 siblings, 0 replies; 6+ messages in thread
From: Dietmar Maurer @ 2021-03-16  8:13 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

applied




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate
  2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
@ 2021-03-16  8:14   ` Dietmar Maurer
  0 siblings, 0 replies; 6+ messages in thread
From: Dietmar Maurer @ 2021-03-16  8:14 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Dominik Csapak

> @Wolfgang, could you please look at this? I am not sure about using
> the Compress in an async function. It is only in memory, but does it
> 'block'? i am not sure how we could do this differently in an
> async context though...

Compression can use 100% CPU, so you should move that to a separate thread ...




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-03-16  8:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
2021-03-16  8:13   ` [pbs-devel] applied: " Dietmar Maurer
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
2021-03-16  8:14   ` Dietmar Maurer
2021-03-15 12:02 ` [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Thomas Lamprecht

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal