From: Fabian Grünbichler
To: Christian Ebner, pbs-devel@lists.proxmox.com
Date: Fri, 05 Apr 2024 11:42:39 +0200
Subject: Re: [pbs-devel] [PATCH v3 proxmox-backup 49/58] client: backup: increase average chunk size for metadata
Message-ID: <171231015999.1926770.4997530295571319940@yuna.proxmox.com>
In-Reply-To: <20240328123707.336951-50-c.ebner@proxmox.com>
References: <20240328123707.336951-1-c.ebner@proxmox.com> <20240328123707.336951-50-c.ebner@proxmox.com>

Quoting Christian Ebner (2024-03-28 13:36:58)
> Use double the average chunk size for the metadata archive as compared
> to the payload stream. This does not only reduce the number of unique
> chunks produced by the metadata archive, not well chunkable because
> mainly many localized small changes, but further has the positive side
> effect of producing well compressable larger chunks. The reduced number
> of chunks further increases the performance for access because of
> reduced number of download requests and increased cachability.
>
> Signed-off-by: Christian Ebner
> ---
> changes since version 2:
> - not present in previous version
>
>  proxmox-backup-client/src/main.rs | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
> index 66dcaa63e..4aad0ff8c 100644
> --- a/proxmox-backup-client/src/main.rs
> +++ b/proxmox-backup-client/src/main.rs
> @@ -78,6 +78,8 @@ pub(crate) use helper::*;
>  pub mod key;
>  pub mod namespace;
>  
> +const AVG_METADATA_CHUNK_SIZE: usize = 8 * 1024 * 1024;
> +
>  fn record_repository(repo: &BackupRepository) {
>      let base = match BaseDirectories::with_prefix("proxmox-backup") {
>          Ok(v) => v,
> @@ -209,7 +211,15 @@ async fn backup_directory>(
>          payload_target.is_some(),
>      )?;
>  
> -    let mut chunk_stream = ChunkStream::new(pxar_stream, chunk_size, None);
> +    let avg_chunk_size = if payload_stream.is_none() {
> +        chunk_size
> +    } else {
> +        chunk_size
> +            .map(|size| 2 * size)

what if the user provided us with a very small chunk size? should we have a
lower bound here?

I still wonder whether getting rid of the sliding window chunker wouldn't be
a net benefit for the split archive case. for the metadata stream it probably
doesn't matter much (it has a lot of churn, is small and compresses well). for
the payload stream simply accumulating 1..N files (or rather, their contents)
in a chunk until a certain size threshold is reached might perform better (as
in, both be faster than the current chunker, and give us more/better
re-usable chunks).

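to make the accumulation idea a bit more concrete, a rough sketch (not tied to
the actual pxar or ChunkStream code; `group_files`, the list of file sizes and
the threshold value are all made up for illustration):

```rust
use std::ops::Range;

/// Group consecutive files into chunks. A chunk is closed as soon as the
/// accumulated size of its files reaches `threshold`. Returns, per chunk,
/// the half-open index range of the files it contains.
/// Hypothetical helper for illustration, not an existing PBS function.
fn group_files(sizes: &[u64], threshold: u64) -> Vec<Range<usize>> {
    let mut chunks = Vec::new();
    let mut start = 0; // index of the first file in the current chunk
    let mut acc = 0u64; // bytes accumulated in the current chunk

    for (idx, size) in sizes.iter().enumerate() {
        acc += size;
        if acc >= threshold {
            // threshold reached: close the chunk after this file
            chunks.push(start..idx + 1);
            start = idx + 1;
            acc = 0;
        }
    }

    // remaining files that never reached the threshold form the last chunk
    if start < sizes.len() {
        chunks.push(start..sizes.len());
    }

    chunks
}

fn main() {
    let sizes = [100, 200, 5000, 50, 60];
    for range in group_files(&sizes, 300) {
        println!("chunk covers files {:?}", range);
    }
}
```

chunk boundaries always coincide with file boundaries in this sketch. a file
bigger than the threshold simply ends up as (the tail of) one oversized chunk,
so a real implementation would presumably still need some way to split those.
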
> +            .or_else(|| Some(AVG_METADATA_CHUNK_SIZE))
> +    };
> +
> +    let mut chunk_stream = ChunkStream::new(pxar_stream, avg_chunk_size, None);
>      let (tx, rx) = mpsc::channel(10); // allow to buffer 10 chunks
>  
>      let stream = ReceiverStream::new(rx).map_err(Error::from);
> -- 
> 2.39.2
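
regarding the lower bound question above, one possible shape, as a sketch
only: `MIN_METADATA_CHUNK_SIZE` and its value are made up, and
`metadata_chunk_size` is a hypothetical stand-in for the inline expression in
the patch. `chunk_size` is assumed to be an `Option<usize>` in bytes, as the
`.map()`/`.or_else()` calls suggest.

```rust
/// Default from the patch: 8 MiB average chunk size for the metadata archive.
const AVG_METADATA_CHUNK_SIZE: usize = 8 * 1024 * 1024;

/// Hypothetical lower bound, value picked arbitrarily for illustration.
const MIN_METADATA_CHUNK_SIZE: usize = 1024 * 1024;

/// Double the user-provided size, but never go below the lower bound.
/// Fall back to the default if the user did not provide a size at all.
fn metadata_chunk_size(chunk_size: Option<usize>) -> Option<usize> {
    chunk_size
        .map(|size| size.saturating_mul(2).max(MIN_METADATA_CHUNK_SIZE))
        .or(Some(AVG_METADATA_CHUNK_SIZE))
}

fn main() {
    // very small user-provided size gets raised to the lower bound
    println!("{:?}", metadata_chunk_size(Some(64 * 1024)));
    // no user-provided size: default is used
    println!("{:?}", metadata_chunk_size(None));
}
```

whether clamping silently is preferable to rejecting too small values with an
error is a separate question, the sketch just shows the clamping variant.
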