From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
     key-exchange X25519 server-signature RSA-PSS (2048 bits))
    (No client certificate requested)
    by lists.proxmox.com (Postfix) with ESMTPS id AF3E06ACC3
    for ; Wed, 17 Feb 2021 09:39:34 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
    by firstgate.proxmox.com (Proxmox) with ESMTP id A1FD51F7D0
    for ; Wed, 17 Feb 2021 09:39:04 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [212.186.127.180])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
     key-exchange X25519 server-signature RSA-PSS (2048 bits))
    (No client certificate requested)
    by firstgate.proxmox.com (Proxmox) with ESMTPS id 796091F7C3
    for ; Wed, 17 Feb 2021 09:39:00 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
    by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 3E33542353
    for ; Wed, 17 Feb 2021 09:39:00 +0100 (CET)
To: Wolfgang Bumiller
Cc: pbs-devel@lists.proxmox.com
References: <20210216170710.31767-1-s.reiter@proxmox.com>
    <20210216170710.31767-3-s.reiter@proxmox.com>
    <20210217082000.un46hdah3pbdyzpz@olga.proxmox.com>
From: Stefan Reiter
Message-ID:
Date: Wed, 17 Feb 2021 09:38:58 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
    Thunderbird/78.7.1
MIME-Version: 1.0
In-Reply-To: <20210217082000.un46hdah3pbdyzpz@olga.proxmox.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
    AWL                    -0.025 Adjusted score from AWL reputation of From: address
    KAM_DMARC_STATUS         0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
    NICE_REPLY_A           -0.001 Looks like a legit reply (A)
    RCVD_IN_DNSWL_MED        -2.3 Sender listed at https://www.dnswl.org/, medium trust
    SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
    SPF_PASS               -0.001 SPF: sender matches SPF record
    URIBL_BLOCKED           0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [mod.rs, aio.rs, sync.rs]
Subject: Re: [pbs-devel] [PATCH pxar 02/22] decoder: add peek()
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 17 Feb 2021 08:39:34 -0000

On 17/02/2021 09:20, Wolfgang Bumiller wrote:
> On Tue, Feb 16, 2021 at 06:06:50PM +0100, Stefan Reiter wrote:
>> Allows peeking the current element, but will not advance the state
>> (except for contents() and content_size() functions).
>>
>> Signed-off-by: Stefan Reiter
>> ---
>>  src/accessor/mod.rs |  3 +++
>>  src/decoder/aio.rs  | 10 +++++++++-
>>  src/decoder/mod.rs  | 19 +++++++++++++++++--
>>  src/decoder/sync.rs | 10 +++++++++-
>>  4 files changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/accessor/mod.rs b/src/accessor/mod.rs
>> index d02dc13..aa1b3f6 100644
>> --- a/src/accessor/mod.rs
>> +++ b/src/accessor/mod.rs
>> @@ -293,6 +293,7 @@ impl AccessorImpl {
>>          let entry = decoder
>>              .next()
>>              .await
>> +            .transpose()
>>              .ok_or_else(|| io_format_err!("unexpected EOF while decoding file entry"))??;
>>          Ok(FileEntryImpl {
>>              input: self.input.clone(),
>> @@ -334,6 +335,7 @@ impl AccessorImpl {
>>          let entry = decoder
>>              .next()
>>              .await
>> +            .transpose()
>>              .ok_or_else(|| io_format_err!("unexpected EOF while following a hardlink"))??;
>>
>>          match entry.kind() {
>> @@ -516,6 +518,7 @@ impl DirectoryImpl {
>>          let entry = decoder
>>              .next()
>>              .await
>> +            .transpose()
>>              .ok_or_else(|| io_format_err!("unexpected EOF while decoding directory entry"))??;
>>          Ok((entry, decoder))
>>      }
>> diff --git a/src/decoder/aio.rs b/src/decoder/aio.rs
>> index 5cc6694..c553d45 100644
>> --- a/src/decoder/aio.rs
>> +++ b/src/decoder/aio.rs
>> @@ -53,7 +53,15 @@ impl Decoder {
>>      #[allow(clippy::should_implement_trait)]
>>      /// If this is a directory entry, get the next item inside the directory.
>>      pub async fn next(&mut self) -> Option<io::Result<Entry>> {
>> -        self.inner.next_do().await.transpose()
>> +        self.inner.next().await.transpose()
>> +    }
>> +
>> +    /// If this is a directory entry, get the next item inside the directory.
>> +    /// Do not advance the cursor, so multiple calls to peek() will return the same entry,
>> +    /// and the next call to next() will read the item once again before moving on.
>> +    /// NOTE: This *will* advance the state for contents() and content_size()!
>
> ^ Which is why I'm wondering whether we should maybe leave this up to
> the *user* rather than provide a sort-of broken API here?
>
> I'd rather have this be guarded by a Seek trait, but that too is
> something we won't get from `std` and so we'd have to add one.
>
> Why do we need this exactly?

See patches 8 and 22 (specifically 'fn extract_to_target_seq') of the
series. I didn't want to add more special casing to the sequential
extractors, they are "special-cased" enough as it is IMO, so they work
on the assumption that they can just call "next()" and get the root
entry of what they want to extract. But I also need to check whether
that entry is a file or a dir before calling them, which I do with peek().

>
> And would this be solved by simply *generally* storing a
> "current_entry"? Then we can have a `.current_entry() -> Option<&Entry>`
> which works after at least `next()` call, and `.next()` working as
> usual. And we may just have `next()` also return a reference instead.
> The user can `.clone()` if necessary. Or we return a mutable reference
> and allow `.take()`, then the user is responsible for knowing whether
> calling `.current_entry()` makes sense ;-)
>

current_entry() wouldn't help my use-case, and returning a reference is
somewhat pointless since Entry is small and Clone anyway IIRC?

I believe there might be a way to avoid this patch entirely though if I
give the sequential extractor API some more thought, if not I'll think
about your proposals for a v2.

>> +    pub async fn peek(&mut self) -> Option<io::Result<Entry>> {
>> +        self.inner.peek().await.transpose()
>>      }
>>
>>      /// Get a reader for the contents of the current entry, if the entry has contents.
>> diff --git a/src/decoder/mod.rs b/src/decoder/mod.rs
>> index 2a5e79a..041226d 100644
>> --- a/src/decoder/mod.rs
>> +++ b/src/decoder/mod.rs
>> @@ -155,6 +155,7 @@ pub(crate) struct DecoderImpl {
>>      path_lengths: Vec<usize>,
>>      state: State,
>>      with_goodbye_tables: bool,
>> +    peeked: Option<io::Result<Option<Entry>>>,
>>
>>      /// The random access code uses decoders for sub-ranges which may not end in a `PAYLOAD` for
>>      /// entries like FIFOs or sockets, so there we explicitly allow an item to terminate with EOF.
>> @@ -218,6 +219,7 @@ impl DecoderImpl {
>>              path_lengths: Vec::new(),
>>              state: State::Begin,
>>              with_goodbye_tables: false,
>> +            peeked: None,
>>              eof_after_entry,
>>          };
>>
>> @@ -227,8 +229,21 @@ impl DecoderImpl {
>>      }
>>
>>      /// Get the next file entry, recursing into directories.
>> -    pub async fn next(&mut self) -> Option<io::Result<Entry>> {
>> -        self.next_do().await.transpose()
>> +    pub async fn next(&mut self) -> io::Result<Option<Entry>> {
>> +        if let Some(ent) = self.peeked.take() {
>> +            return ent;
>> +        }
>> +        self.next_do().await
>> +    }
>> +
>> +    pub async fn peek(&mut self) -> io::Result<Option<Entry>> {
>> +        self.peeked = Some(self.next().await);
>> +        match &self.peeked {
>> +            Some(Ok(ent)) => Ok(ent.clone()),
>> +            // io::Error does not implement Clone...
>> +            Some(Err(err)) => Err(io_format_err!("{}", err)),
>> +            None => unreachable!()
>> +        }
>>      }
>>
>>      async fn next_do(&mut self) -> io::Result<Option<Entry>> {
>> diff --git a/src/decoder/sync.rs b/src/decoder/sync.rs
>> index 85b4865..c6a1bc3 100644
>> --- a/src/decoder/sync.rs
>> +++ b/src/decoder/sync.rs
>> @@ -63,7 +63,15 @@ impl Decoder {
>>      #[allow(clippy::should_implement_trait)]
>>      /// If this is a directory entry, get the next item inside the directory.
>>      pub fn next(&mut self) -> Option<io::Result<Entry>> {
>> -        poll_result_once(self.inner.next_do()).transpose()
>> +        poll_result_once(self.inner.next()).transpose()
>> +    }
>> +
>> +    /// If this is a directory entry, get the next item inside the directory.
>> +    /// Do not advance the cursor, so multiple calls to peek() will return the same entry,
>> +    /// and the next call to next() will read the item once again before moving on.
>> +    /// NOTE: This *will* advance the state for contents() and content_size()!
>> +    pub async fn peek(&mut self) -> Option<io::Result<Entry>> {
>> +        poll_result_once(self.inner.peek()).transpose()
>>      }
>>
>>      /// Get a reader for the contents of the current entry, if the entry has contents.
>> --
>> 2.20.1
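
For reference, the peek()/next() buffering from the quoted mod.rs hunk can be tried in isolation. This is only a rough sketch under assumptions: `MiniDecoder`, its Vec-backed input and the two-field `Entry` are hypothetical stand-ins (not pxar types), it is synchronous, and it rebuilds the error with `io::Error::new` instead of `io_format_err!`.

```rust
use std::io;

/// Hypothetical stand-in for pxar's `Entry`; only what the sketch needs.
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub name: String,
    pub is_dir: bool,
}

/// Simplified, synchronous stand-in for `DecoderImpl` (an assumption for
/// illustration, backed by a Vec instead of a real archive reader).
pub struct MiniDecoder {
    items: std::vec::IntoIter<io::Result<Entry>>,
    /// Result of a pending peek(), handed out by the following next().
    peeked: Option<io::Result<Option<Entry>>>,
}

impl MiniDecoder {
    pub fn new(items: Vec<io::Result<Entry>>) -> Self {
        Self {
            items: items.into_iter(),
            peeked: None,
        }
    }

    /// Stand-in for `next_do()`: read one item from the underlying input.
    fn next_do(&mut self) -> io::Result<Option<Entry>> {
        self.items.next().transpose()
    }

    /// Return the buffered result if peek() stored one, else read a new item.
    pub fn next(&mut self) -> io::Result<Option<Entry>> {
        if let Some(ent) = self.peeked.take() {
            return ent;
        }
        self.next_do()
    }

    /// Read (or re-use) the upcoming item, keep it buffered, return a copy.
    pub fn peek(&mut self) -> io::Result<Option<Entry>> {
        let res = self.next();
        let copy = match &res {
            Ok(ent) => Ok(ent.clone()),
            // io::Error does not implement Clone, so rebuild an equivalent one.
            Err(err) => Err(io::Error::new(err.kind(), err.to_string())),
        };
        self.peeked = Some(res);
        copy
    }
}
```

With this, repeated peek() calls return the same entry and the following next() hands it out once more before moving on, which is the property needed to check file vs. dir before calling a sequential extractor. The sketch does not model the contents()/content_size() caveat from the doc comment.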