* [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension
@ 2021-03-15 11:21 Dominik Csapak
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Dominik Csapak @ 2021-03-15 11:21 UTC (permalink / raw)
To: pbs-devel
it is not optional, even though we give the size explicitely
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
src/tools/zip.rs | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/tools/zip.rs b/src/tools/zip.rs
index 1f9ee517..cca2e766 100644
--- a/src/tools/zip.rs
+++ b/src/tools/zip.rs
@@ -80,6 +80,7 @@ struct Zip64FieldWithOffset {
uncompressed_size: u64,
compressed_size: u64,
offset: u64,
+ start_disk: u32,
}
#[derive(Endian)]
@@ -334,10 +335,11 @@ impl ZipEntry {
&mut buf,
Zip64FieldWithOffset {
field_type: 1,
- field_size: 3 * 8,
+ field_size: 3 * 8 + 4,
uncompressed_size: self.uncompressed_size,
compressed_size: self.compressed_size,
offset: self.offset,
+ start_disk: 0,
},
)
.await?;
--
2.20.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary
2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
@ 2021-03-15 11:21 ` Dominik Csapak
2021-03-16 8:13 ` [pbs-devel] applied: " Dietmar Maurer
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
2021-03-15 12:02 ` [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Thomas Lamprecht
2 siblings, 1 reply; 6+ messages in thread
From: Dominik Csapak @ 2021-03-15 11:21 UTC (permalink / raw)
To: pbs-devel
if neither offset nor size exceeds 32bit, do not add the
zip64 extension field
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
it does not harm normally, but can blow up the size of the zip a bit
if one has many small files
src/tools/zip.rs | 52 ++++++++++++++++++++++++++++++++----------------
1 file changed, 35 insertions(+), 17 deletions(-)
diff --git a/src/tools/zip.rs b/src/tools/zip.rs
index cca2e766..55f2a24a 100644
--- a/src/tools/zip.rs
+++ b/src/tools/zip.rs
@@ -301,10 +301,26 @@ impl ZipEntry {
let filename_len = filename.len();
let header_size = size_of::<CentralDirectoryFileHeader>();
let zip_field_size = size_of::<Zip64FieldWithOffset>();
- let size: usize = header_size + filename_len + zip_field_size;
+ let mut size: usize = header_size + filename_len;
let (date, time) = epoch_to_dos(self.mtime);
+ let (compressed_size, uncompressed_size, offset, need_zip64) = if self.compressed_size
+ >= (u32::MAX as u64)
+ || self.uncompressed_size >= (u32::MAX as u64)
+ || self.offset >= (u32::MAX as u64)
+ {
+ size += zip_field_size;
+ (0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, true)
+ } else {
+ (
+ self.compressed_size as u32,
+ self.uncompressed_size as u32,
+ self.offset as u32,
+ false,
+ )
+ };
+
write_struct(
&mut buf,
CentralDirectoryFileHeader {
@@ -316,33 +332,35 @@ impl ZipEntry {
time,
date,
crc32: self.crc32,
- compressed_size: 0xFFFFFFFF,
- uncompressed_size: 0xFFFFFFFF,
+ compressed_size,
+ uncompressed_size,
filename_len: filename_len as u16,
- extra_field_len: zip_field_size as u16,
+ extra_field_len: if need_zip64 { zip_field_size as u16 } else { 0 },
comment_len: 0,
start_disk: 0,
internal_flags: 0,
external_flags: (self.mode as u32) << 16 | (!self.is_file as u32) << 4,
- offset: 0xFFFFFFFF,
+ offset,
},
)
.await?;
buf.write_all(filename).await?;
- write_struct(
- &mut buf,
- Zip64FieldWithOffset {
- field_type: 1,
- field_size: 3 * 8 + 4,
- uncompressed_size: self.uncompressed_size,
- compressed_size: self.compressed_size,
- offset: self.offset,
- start_disk: 0,
- },
- )
- .await?;
+ if need_zip64 {
+ write_struct(
+ &mut buf,
+ Zip64FieldWithOffset {
+ field_type: 1,
+ field_size: 3 * 8 + 4,
+ uncompressed_size: self.uncompressed_size,
+ compressed_size: self.compressed_size,
+ offset: self.offset,
+ start_disk: 0,
+ },
+ )
+ .await?;
+ }
Ok(size)
}
--
2.20.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate
2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
@ 2021-03-15 11:21 ` Dominik Csapak
2021-03-16 8:14 ` Dietmar Maurer
2021-03-15 12:02 ` [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Thomas Lamprecht
2 siblings, 1 reply; 6+ messages in thread
From: Dominik Csapak @ 2021-03-15 11:21 UTC (permalink / raw)
To: pbs-devel
to get smaller zip files
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
@Wolfgang, could you please look at this? I am not sure about using
the Compress in an async function. It is only in memory, but does it
'block'? i am not sure how we could do this differently in an
async context though...
Cargo.toml | 1 +
src/tools/zip.rs | 75 +++++++++++++++++++++++++++++++++++++++---------
2 files changed, 63 insertions(+), 13 deletions(-)
diff --git a/Cargo.toml b/Cargo.toml
index 79945312..06967c20 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -31,6 +31,7 @@ crc32fast = "1"
endian_trait = { version = "0.6", features = ["arrays"] }
anyhow = "1.0"
futures = "0.3"
+flate2 = "1.0"
h2 = { version = "0.3", features = [ "stream" ] }
handlebars = "3.0"
http = "0.2"
diff --git a/src/tools/zip.rs b/src/tools/zip.rs
index 55f2a24a..237b8a1f 100644
--- a/src/tools/zip.rs
+++ b/src/tools/zip.rs
@@ -11,9 +11,10 @@ use std::mem::size_of;
use std::os::unix::ffi::OsStrExt;
use std::path::{Component, Path, PathBuf};
-use anyhow::{Error, Result};
+use anyhow::{bail, Error, Result};
use endian_trait::Endian;
use tokio::io::{AsyncRead, AsyncWrite, AsyncWriteExt};
+use flate2::{Compress, Compression, FlushCompress};
use crc32fast::Hasher;
use proxmox::tools::time::gmtime;
@@ -245,7 +246,7 @@ impl ZipEntry {
signature: LOCAL_FH_SIG,
version_needed: 0x2d,
flags: 1 << 3,
- compression: 0,
+ compression: 0x8,
time,
date,
crc32: 0,
@@ -328,7 +329,7 @@ impl ZipEntry {
version_made_by: VERSION_MADE_BY,
version_needed: VERSION_NEEDED,
flags: 1 << 3,
- compression: 0,
+ compression: 0x8,
time,
date,
crc32: self.crc32,
@@ -402,6 +403,7 @@ where
files: Vec<ZipEntry>,
target: W,
buf: ByteBuffer,
+ outbuf: ByteBuffer,
}
impl<W: AsyncWrite + Unpin> ZipEncoder<W> {
@@ -410,7 +412,8 @@ impl<W: AsyncWrite + Unpin> ZipEncoder<W> {
byte_count: 0,
files: Vec::new(),
target,
- buf: ByteBuffer::with_capacity(1024*1024),
+ buf: ByteBuffer::with_capacity(1024 * 1024),
+ outbuf: ByteBuffer::with_capacity(1024 * 1024),
}
}
@@ -423,25 +426,71 @@ impl<W: AsyncWrite + Unpin> ZipEncoder<W> {
self.byte_count += entry.write_local_header(&mut self.target).await?;
if let Some(mut content) = content {
let mut hasher = Hasher::new();
- let mut size = 0;
+ let mut deflate_encoder = Compress::new(Compression::fast(), false);
+
loop {
+ let syncmode = if self.buf.is_full() {
+ FlushCompress::Sync
+ } else {
+ FlushCompress::None
+ };
+
+ let old_pos = self.buf.len();
let count = self.buf.read_from_async(&mut content).await?;
// end of file
- if count == 0 {
+ if count == 0 && syncmode == FlushCompress::None {
break;
}
- size += count;
- hasher.update(&self.buf);
- self.target.write_all(&self.buf).await?;
- self.buf.consume(count);
+ hasher.update(&self.buf[old_pos..]);
+
+ let old_read = deflate_encoder.total_in();
+ let old_write = deflate_encoder.total_out();
+ deflate_encoder.compress(
+ &self.buf,
+ &mut self.outbuf.get_free_mut_slice(),
+ syncmode,
+ )?;
+ let read = (deflate_encoder.total_in() - old_read) as usize;
+ let write = (deflate_encoder.total_out() - old_write) as usize;
+
+ self.outbuf.add_size(write);
+
+ if read == 0 {
+ bail!("did not consume any data!");
+ }
+
+ self.target.write_all(&self.outbuf).await?;
+ self.buf.consume(read);
+ self.outbuf.clear();
}
- self.byte_count += size;
- entry.compressed_size = size.try_into()?;
- entry.uncompressed_size = size.try_into()?;
+ let old_read = deflate_encoder.total_in();
+ let old_write = deflate_encoder.total_out();
+ deflate_encoder.compress(
+ &self.buf,
+ &mut self.outbuf.get_free_mut_slice(),
+ FlushCompress::Finish,
+ )?;
+ let read = (deflate_encoder.total_in() - old_read) as usize;
+ let write = (deflate_encoder.total_out() - old_write) as usize;
+
+ self.outbuf.add_size(write);
+
+ if read != self.buf.len() {
+ bail!("deflate did not use all input bytes!");
+ }
+
+ self.target.write_all(&self.outbuf).await?;
+ self.buf.clear();
+ self.outbuf.clear();
+
+ self.byte_count += deflate_encoder.total_out() as usize;
+ entry.compressed_size = deflate_encoder.total_out();
+ entry.uncompressed_size = deflate_encoder.total_in();
+
entry.crc32 = hasher.finalize();
}
self.byte_count += entry.write_data_descriptor(&mut self.target).await?;
--
2.20.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension
2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
@ 2021-03-15 12:02 ` Thomas Lamprecht
2 siblings, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2021-03-15 12:02 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Dominik Csapak
On 15.03.21 12:21, Dominik Csapak wrote:
> it is not optional, even though we give the size explicitely
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> src/tools/zip.rs | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>
Fixes the issues when using `unar` for extraction here.
applied, thanks!
^ permalink raw reply [flat|nested] 6+ messages in thread
* [pbs-devel] applied: [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
@ 2021-03-16 8:13 ` Dietmar Maurer
0 siblings, 0 replies; 6+ messages in thread
From: Dietmar Maurer @ 2021-03-16 8:13 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Dominik Csapak
applied
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
@ 2021-03-16 8:14 ` Dietmar Maurer
0 siblings, 0 replies; 6+ messages in thread
From: Dietmar Maurer @ 2021-03-16 8:14 UTC (permalink / raw)
To: Proxmox Backup Server development discussion, Dominik Csapak
> @Wolfgang, could you please look at this? I am not sure about using
> the Compress in an async function. It is only in memory, but does it
> 'block'? i am not sure how we could do this differently in an
> async context though...
Compression can use 100% CPU, so you should move that to a separate thread ...
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-03-16 8:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-15 11:21 [pbs-devel] [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Dominik Csapak
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 2/3] tools/zip: only add zip64 field when necessary Dominik Csapak
2021-03-16 8:13 ` [pbs-devel] applied: " Dietmar Maurer
2021-03-15 11:21 ` [pbs-devel] [RFC PATCH proxmox-backup 3/3] tools/zip: compress zips with deflate Dominik Csapak
2021-03-16 8:14 ` Dietmar Maurer
2021-03-15 12:02 ` [pbs-devel] applied: [PATCH proxmox-backup 1/3] tools/zip: add missing start_disk field for zip64 extension Thomas Lamprecht
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox