From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pbs-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	by lore.proxmox.com (Postfix) with ESMTPS id B89781FF17A
	for <inbox@lore.proxmox.com>; Mon,  5 Aug 2024 11:24:41 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id D23F91E9C;
	Mon,  5 Aug 2024 11:24:48 +0200 (CEST)
From: Dominik Csapak <d.csapak@proxmox.com>
To: pbs-devel@lists.proxmox.com
Date: Mon,  5 Aug 2024 11:24:13 +0200
Message-Id: <20240805092414.1178930-5-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240805092414.1178930-1-d.csapak@proxmox.com>
References: <20240805092414.1178930-1-d.csapak@proxmox.com>
MIME-Version: 1.0
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.014 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: [pbs-devel] [PATCH proxmox-backup v3 4/5] datastore: data blob:
 increase compression throughput
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox Backup Server development discussion
 <pbs-devel@lists.proxmox.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pbs-devel-bounces@lists.proxmox.com
Sender: "pbs-devel" <pbs-devel-bounces@lists.proxmox.com>

by not using `zstd::stream::copy_encode`, because that has an allocation
pattern that reduces throughput if the target/source storage and the
network are faster than the chunk creation.

instead use `zstd_safe::compress` which shouldn't do any big
allocations, since we provide the target buffer.

To handle the case that the target buffer is too small, we now ignore
all zstd error and continue with the uncompressed data, logging the error
except if the target buffer is too small.

Some benchmarks on my machine from tmpfs to a datastore on tmpfs:

Type                without patches (MiB/s)  with patches (MiB/s)
.img file           ~614                     ~767
pxar one big file   ~657                     ~807
pxar small files    ~576                     ~627

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v2:
* use zstd_safe instead of zstd
 pbs-datastore/src/data_blob.rs | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/pbs-datastore/src/data_blob.rs b/pbs-datastore/src/data_blob.rs
index adf5a932..4e689364 100644
--- a/pbs-datastore/src/data_blob.rs
+++ b/pbs-datastore/src/data_blob.rs
@@ -147,39 +147,40 @@ impl DataBlob {
 
             DataBlob { raw_data }
         } else {
-            let max_data_len = data.len() + std::mem::size_of::<DataBlobHeader>();
+            let header_len = std::mem::size_of::<DataBlobHeader>();
+            let max_data_len = data.len() + header_len;
+            let mut raw_data = vec![0; max_data_len];
             if compress {
-                let mut comp_data = Vec::with_capacity(max_data_len);
-
                 let head = DataBlobHeader {
                     magic: COMPRESSED_BLOB_MAGIC_1_0,
                     crc: [0; 4],
                 };
                 unsafe {
-                    comp_data.write_le_value(head)?;
+                    (&mut raw_data[0..header_len]).write_le_value(head)?;
                 }
 
-                zstd::stream::copy_encode(data, &mut comp_data, 1)?;
-
-                if comp_data.len() < max_data_len {
-                    let mut blob = DataBlob {
-                        raw_data: comp_data,
-                    };
-                    blob.set_crc(blob.compute_crc());
-                    return Ok(blob);
+                match zstd_safe::compress(&mut raw_data[header_len..], data, 1) {
+                    Ok(size) if size <= data.len() => {
+                        raw_data.truncate(header_len + size);
+                        let mut blob = DataBlob { raw_data };
+                        blob.set_crc(blob.compute_crc());
+                        return Ok(blob);
+                    }
+                    Err(err) if !zstd_error_is_target_too_small(err) => {
+                        log::warn!("zstd compression error: {err}");
+                    }
+                    _ => {}
                 }
             }
 
-            let mut raw_data = Vec::with_capacity(max_data_len);
-
             let head = DataBlobHeader {
                 magic: UNCOMPRESSED_BLOB_MAGIC_1_0,
                 crc: [0; 4],
             };
             unsafe {
-                raw_data.write_le_value(head)?;
+                (&mut raw_data[0..header_len]).write_le_value(head)?;
             }
-            raw_data.extend_from_slice(data);
+            (&mut raw_data[header_len..]).write_all(data)?;
 
             DataBlob { raw_data }
         };
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel