From: Wolfgang Bumiller
To: Max Carrara
Cc: pbs-devel@lists.proxmox.com
Date: Fri, 4 Aug 2023 13:27:39 +0200
References: <20230720171505.1053912-1-m.carrara@proxmox.com> <20230720171505.1053912-2-m.carrara@proxmox.com>
In-Reply-To: <20230720171505.1053912-2-m.carrara@proxmox.com>
Subject: Re: [pbs-devel] [PATCH pxar 2/2] decoder: aio: improve performance of async file reads

On Thu, Jul 20, 2023 at 07:15:05PM +0200, Max Carrara wrote:
> In order to bring `aio::Decoder` on par with its `sync` counterpart
> as well as `sync::Accessor` and `aio::Accessor`, its input is now
> buffered.
>
> As the `tokio` docs mention themselves [0], it can be really
> inefficient to directly work with an (unbuffered) `AsyncRead`
> instance.

Sure, but the question is *where* it truly makes sense to do the
buffering, more below...

(...)

> ---
>  src/decoder/aio.rs | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/src/decoder/aio.rs b/src/decoder/aio.rs
> index 200dd3d..174551b 100644
> --- a/src/decoder/aio.rs
> +++ b/src/decoder/aio.rs
> @@ -79,14 +79,20 @@ mod tok {
>      use std::pin::Pin;
>      use std::task::{Context, Poll};
>
> -    /// Read adapter for `futures::io::AsyncRead`
> +    use tokio::io::AsyncRead;
> +
> +    /// Read adapter for `tokio::io::AsyncRead`
>      pub struct TokioReader<T> {

^ This is a very generic interface here...

> -        inner: T,
> +        inner: tokio::io::BufReader<T>,
>      }
>
>      impl<T: tokio::io::AsyncRead> TokioReader<T> {
>          pub fn new(inner: T) -> Self {

Note that `tokio`'s `BufReader` itself also implements `AsyncRead`, and
the user may already have a buffered reader here.

A better choice for us here would be to perform this change with the
`tokio-fs` feature and replace the

    impl Decoder<TokioReader<tokio::fs::File>> {
        fn open(...) -> io::Result<Self> { ... }
    }

(which exists only so that `Decoder::open` can be used by the crate
consumer easily, automatically producing a `Decoder` for "some file
type"...)
with:

    impl Decoder<TokioReader<tokio::io::BufReader<tokio::fs::File>>> {
        fn open(...) -> io::Result<Self> { ... }
    }

since this is the place where we *actually* should be creating the
buffered reader.

> -            Self { inner }
> +            // buffer size "sweet spot" - larger sizes don't seem to provide any benefit
> +            const BUF_SIZE: usize = 1024 * 16;

And we also wouldn't have to decide on a sane size here under the
assumption that it is the right size for any possible T we instantiate
the decoder with.

There's a bit of a danger with sprinkling `BufReader`s in generic
`T: Read` APIs, as this may lead to several of them getting chained
together. E.g. a consumer of the crate may instantiate a
`Decoder<SomeNetworkFile<TlsStreamThing>>`, then read that buffering
for such things can improve performance, and turn that into
`Decoder<BufReader<SomeNetworkFile<TlsStreamThing>>>`.
Little do they know that `Decoder` buffers, the creator of
`SomeNetworkFile` also thought the same thing and buffers as well, and
`TlsStreamThing` might also need buffering for a sane implementation,
and suddenly you're just chaining memcpys across 4 buffers before they
end up at the destination ;-)

> +            Self {
> +                inner: tokio::io::BufReader::with_capacity(BUF_SIZE, inner),
> +            }
>          }
>      }
>
> --
> 2.39.2
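
To make the suggestion above a bit more concrete, here is a rough,
synchronous sketch using only `std`, so `std::io::BufReader` stands in
for `tokio::io::BufReader` and `std::fs::File` for `tokio::fs::File`.
The names `PlainReader`, `from_reader` and `read_all` are made up for
illustration and are not part of pxar's API; the point is only that the
generic adapter stays unbuffered while the file-specific `open` is the
one place that adds a buffer.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::path::Path;

/// Hypothetical stand-in for the generic read adapter.
/// It forwards reads and deliberately does NOT add a buffer of its own.
pub struct PlainReader<T> {
    inner: T,
}

impl<T: Read> PlainReader<T> {
    pub fn new(inner: T) -> Self {
        Self { inner }
    }
}

impl<T: Read> Read for PlainReader<T> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.inner.read(buf)
    }
}

/// Hypothetical stand-in for the decoder, generic over its input.
pub struct Decoder<T> {
    input: T,
}

impl<T: Read> Decoder<T> {
    /// Generic constructor: the caller decides whether `T` is buffered.
    pub fn from_reader(input: T) -> Self {
        Self { input }
    }

    /// Placeholder for actual decoding: just drains the input.
    pub fn read_all(&mut self) -> io::Result<Vec<u8>> {
        let mut out = Vec::new();
        self.input.read_to_end(&mut out)?;
        Ok(out)
    }
}

/// Only the file-based convenience constructor wraps the input in a
/// `BufReader`, because here we know the concrete type is a raw file.
impl Decoder<PlainReader<BufReader<File>>> {
    pub fn open<P: AsRef<Path>>(path: P) -> io::Result<Self> {
        let file = File::open(path)?;
        Ok(Self::from_reader(PlainReader::new(BufReader::new(file))))
    }
}
```

With this layout, a consumer who already has a buffered stream can pass
it to `from_reader` without picking up a second buffer, while `open`
still gets buffered file reads by default.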