From: Max Carrara
To: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH v2 pxar 2/2] decoder: aio: improve performance of async file reads
Date: Mon, 31 Jul 2023 17:14:14 +0200
Message-ID: <68fbd44d-fc6f-5265-b3a6-8720e454da9c@proxmox.com>
In-Reply-To: <1690811866.sqrlbr0udn.astroid@yuna.none>
References: <20230731133404.859756-1-m.carrara@proxmox.com> <20230731133404.859756-2-m.carrara@proxmox.com> <1690811866.sqrlbr0udn.astroid@yuna.none>

On 7/31/23 16:57, Fabian Grünbichler wrote:
> On July 31, 2023 3:34 pm, Max Carrara wrote:
>> In order to bring `aio::Decoder` on par with its `sync` counterpart
>> as well as `sync::Accessor` and `aio::Accessor`, its input is now
>> buffered.
>>
>> As the `tokio` docs mention themselves [0], it can be really
>> inefficient to directly work with an (unbuffered) `AsyncRead`
>> instance.
>>
>> The other aforementioned types already buffer their reads in one way
>> or another, so wrapping the input reader in `tokio::io::BufReader`
>> results in a substantial performance gain. [1]
>>
>> `tokio/io-util` is added as dependency in order to use
>> `tokio::io::BufReader`.
>>
>> [0]: https://docs.rs/tokio/1.29.1/tokio/io/struct.BufReader.html
>> [1]: Tested via examples/compare-read.rs on a large (13GB) pxar archive
>>
>> Before:
>>> PXAR Read Performance Comparison
>>> Running in mode: release
>>>
>>> First pass:
>>> With aio::Decoder:   Ok(()) (elapsed: 20.532270177s)
>>> With sync::Decoder:  Ok(()) (elapsed: 3.498566141s)
>>> With aio::Accessor:  Ok(()) (elapsed: 3.978160609s)
>>> With sync::Accessor: Ok(()) (elapsed: 3.885640895s)
>>>
>>> Second pass:
>>> With aio::Decoder:   Ok(()) (elapsed: 18.648986266s)
>>> With sync::Decoder:  Ok(()) (elapsed: 3.617167922s)
>>> With aio::Accessor:  Ok(()) (elapsed: 4.083678211s)
>>> With sync::Accessor: Ok(()) (elapsed: 4.103763507s)
>>
>> After:
>>> PXAR Read Performance Comparison
>>> Running in mode: release
>>>
>>> First pass:
>>> With aio::Decoder:   Ok(()) (elapsed: 9.546522171s)
>>> With sync::Decoder:  Ok(()) (elapsed: 3.535062119s)
>>> With aio::Accessor:  Ok(()) (elapsed: 3.926439101s)
>>> With sync::Accessor: Ok(()) (elapsed: 3.905232916s)
>>>
>>> Second pass:
>>> With aio::Decoder:   Ok(()) (elapsed: 10.633561678s)
>>> With sync::Decoder:  Ok(()) (elapsed: 3.528989778s)
>>> With aio::Accessor:  Ok(()) (elapsed: 3.831093917s)
>>> With sync::Accessor: Ok(()) (elapsed: 3.848684845s)
>
> this does look good to me in general, do you have more details about
> your test pxar file?
>
> because for me with a big archive with lots of hardlinks (POM-created
> mirror):
>
> buffered:
>   Time (mean ± σ):     17.360 s ±  0.769 s    [User: 2.460 s, System: 14.345 s]
>   Range (min … max):   16.004 s … 18.225 s    10 runs
>
> stock:
>   Time (mean ± σ):     20.512 s ±  1.248 s    [User: 3.158 s, System: 16.510 s]
>   Range (min … max):   19.045 s … 22.176 s    10 runs
>
> Summary
>   buffered ran
>     1.18 ± 0.09 times faster than stock
>
> and for another even bigger (~40G) archive consisting of a PBS .chunks
> dir:
>
> buffered:
>   Time (mean ± σ):     138.329 s ±  3.627 s    [User: 19.407 s, System: 114.824 s]
>   Range (min … max):   134.266 s … 146.754 s    10 runs
>
> stock:
>   Time (mean ± σ):     179.822 s ±  3.679 s    [User: 26.894 s, System: 144.526 s]
>   Range (min … max):   173.166 s … 186.505 s    10 runs
>
> Summary
>   buffered ran
>     1.30 ± 0.04 times faster than stock
>
> which, while an obvious improvement, is far from your almost 2x speedup
> ;)
>

The file I've used contains a couple test files from the sparse copy
bug (essentially files with random data and some holes) and a .tar of
the /var/log/mysql directory of a MariaDB instance I had used in an
attempt to create yet another test file.

Maybe we can exchange files? ;)

>>
>> Signed-off-by: Max Carrara
>> ---
>> Changes v1 --> v2:
>>   * Include addition of `tokio/io-util` as dependency
>>   * Use new examples/compare-read.rs instead of old custom tool to
>>     measure performance impact
>>   * Use default buffer size (8K) instead of 16K
>>     (I wasn't able to reproduce the performance gains, so ...)
>>
>>  Cargo.toml         |  2 +-
>>  src/decoder/aio.rs | 10 +++++++---
>>  2 files changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/Cargo.toml b/Cargo.toml
>> index 8669e30..08c0973 100644
>> --- a/Cargo.toml
>> +++ b/Cargo.toml
>> @@ -63,7 +63,7 @@ libc = "0.2"
>>
>>  [features]
>>  default = [ "tokio-io" ]
>> -tokio-io = [ "tokio" ]
>> +tokio-io = [ "tokio", "tokio/io-util" ]
>>  tokio-fs = [ "tokio-io", "tokio/fs" ]
>>
>>  full = [ "tokio-fs"]
>> diff --git a/src/decoder/aio.rs b/src/decoder/aio.rs
>> index 200dd3d..7cb9c12 100644
>> --- a/src/decoder/aio.rs
>> +++ b/src/decoder/aio.rs
>> @@ -79,14 +79,18 @@ mod tok {
>>      use std::pin::Pin;
>>      use std::task::{Context, Poll};
>>
>> -    /// Read adapter for `futures::io::AsyncRead`
>> +    use tokio::io::AsyncRead;
>> +
>> +    /// Read adapter for `tokio::io::AsyncRead`
>>      pub struct TokioReader {
>> -        inner: T,
>> +        inner: tokio::io::BufReader,
>>      }
>>
>>      impl TokioReader {
>>          pub fn new(inner: T) -> Self {
>> -            Self { inner }
>> +            Self {
>> +                inner: tokio::io::BufReader::new(inner),
>> +            }
>>          }
>>      }
>>
>> --
>> 2.39.2
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
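
To illustrate why the buffering in the patch above can matter, here is a
rough, std-only sketch. It is not taken from pxar: it uses the synchronous
`std::io::BufReader` as a stand-in for `tokio::io::BufReader`, and
`CountingReader` and `count_inner_reads` are hypothetical names made up for
this example. It only counts how many `read` calls reach the underlying
source when many small reads are issued, with and without an 8 KiB buffer.
It says nothing about actual timings.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical helper: wraps a reader and counts the `read` calls
/// that reach it.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Reads `total` bytes in `chunk`-sized pieces from an in-memory source
/// and returns how many `read` calls reached the underlying reader.
/// Assumes `total` is a multiple of `chunk`.
fn count_inner_reads(total: usize, chunk: usize, buffered: bool) -> io::Result<usize> {
    let data = vec![0u8; total];
    let counting = CountingReader { inner: &data[..], calls: 0 };
    let mut buf = vec![0u8; chunk];

    if buffered {
        // 8 KiB, mirroring the buffer size mentioned in the v2 changelog.
        let mut reader = BufReader::with_capacity(8 * 1024, counting);
        for _ in 0..total / chunk {
            reader.read_exact(&mut buf)?;
        }
        Ok(reader.get_ref().calls)
    } else {
        let mut reader = counting;
        for _ in 0..total / chunk {
            reader.read_exact(&mut buf)?;
        }
        Ok(reader.calls)
    }
}

fn main() -> io::Result<()> {
    let total = 1 << 20; // 1 MiB of input
    let chunk = 16; // many small reads, as a decoder parsing headers might do

    println!("unbuffered: {} inner reads", count_inner_reads(total, chunk, false)?);
    println!("buffered:   {} inner reads", count_inner_reads(total, chunk, true)?);
    Ok(())
}
```

With an in-memory slice the saved calls are cheap. With a file or socket
behind the reader, each saved call would typically be a saved syscall,
which is presumably where the difference in the numbers above comes from.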