public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Stefan Reiter <s.reiter@proxmox.com>
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH pxar 02/22] decoder: add peek()
Date: Wed, 17 Feb 2021 09:38:58 +0100	[thread overview]
Message-ID: <af0da89d-3063-b23e-402f-83465363550c@proxmox.com> (raw)
In-Reply-To: <20210217082000.un46hdah3pbdyzpz@olga.proxmox.com>

On 17/02/2021 09:20, Wolfgang Bumiller wrote:
> On Tue, Feb 16, 2021 at 06:06:50PM +0100, Stefan Reiter wrote:
>> Allows peeking the current element, but will not advance the state
>> (except for contents() and content_size() functions).
>>
>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>> ---
>>   src/accessor/mod.rs |  3 +++
>>   src/decoder/aio.rs  | 10 +++++++++-
>>   src/decoder/mod.rs  | 19 +++++++++++++++++--
>>   src/decoder/sync.rs | 10 +++++++++-
>>   4 files changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/accessor/mod.rs b/src/accessor/mod.rs
>> index d02dc13..aa1b3f6 100644
>> --- a/src/accessor/mod.rs
>> +++ b/src/accessor/mod.rs
>> @@ -293,6 +293,7 @@ impl<T: Clone + ReadAt> AccessorImpl<T> {
>>           let entry = decoder
>>               .next()
>>               .await
>> +            .transpose()
>>               .ok_or_else(|| io_format_err!("unexpected EOF while decoding file entry"))??;
>>           Ok(FileEntryImpl {
>>               input: self.input.clone(),
>> @@ -334,6 +335,7 @@ impl<T: Clone + ReadAt> AccessorImpl<T> {
>>           let entry = decoder
>>               .next()
>>               .await
>> +            .transpose()
>>               .ok_or_else(|| io_format_err!("unexpected EOF while following a hardlink"))??;
>>   
>>           match entry.kind() {
>> @@ -516,6 +518,7 @@ impl<T: Clone + ReadAt> DirectoryImpl<T> {
>>           let entry = decoder
>>               .next()
>>               .await
>> +            .transpose()
>>               .ok_or_else(|| io_format_err!("unexpected EOF while decoding directory entry"))??;
>>           Ok((entry, decoder))
>>       }
>> diff --git a/src/decoder/aio.rs b/src/decoder/aio.rs
>> index 5cc6694..c553d45 100644
>> --- a/src/decoder/aio.rs
>> +++ b/src/decoder/aio.rs
>> @@ -53,7 +53,15 @@ impl<T: SeqRead> Decoder<T> {
>>       #[allow(clippy::should_implement_trait)]
>>       /// If this is a directory entry, get the next item inside the directory.
>>       pub async fn next(&mut self) -> Option<io::Result<Entry>> {
>> -        self.inner.next_do().await.transpose()
>> +        self.inner.next().await.transpose()
>> +    }
>> +
>> +    /// If this is a directory entry, get the next item inside the directory.
>> +    /// Do not advance the cursor, so multiple calls to peek() will return the same entry,
>> +    /// and the next call to next() will read the item once again before moving on.
>> +    /// NOTE: This *will* advance the state for contents() and content_size()!
> 
> ^ Which is why I'm wondering whether we should maybe leave this up to
> the *user* rather than provide a sort-of broken API here?
> 
> I'd rather have this be guarded by a Seek trait, but that too is
> something we won't get from `std` and so we'd have to add one.
> 
> Why do we need this exactly?

See patches 8 and 22 (specifically 'fn extract_to_target_seq') of the 
series. I didn't want to add more special casing to the sequential 
extractors, they are "special-cased" enough as it is IMO, so they work 
on the assumption that they can just call "next()" and get the root 
entry of what they want to extract. But I also need to check whether 
that entry is a file or a dir before calling them, which I do with peek().

> 
> And would this be solved by simply *generally* storing a
> "current_entry"? Then we can have a `.current_entry() -> Option<&Entry>`
> which works after at least `next()` call, and `.next()` working as
> usual.  And we may just have `next()` also return a reference instead.
> The user can `.clone()` if necessary. Or we return a mutable reference
> and allow `.take()`, then the user is responsible for knowing whether
> calling `.current_entry()` makes sense ;-)
> 

current_entry() wouldn't help my use-case, and returning a reference is 
somewhat pointless since Entry is small and Clone anyway IIRC?

I believe there might be a way to avoid this patch entirely though if I 
give the sequential extractor API some more thought, if not I'll think 
about your proposals for a v2.

>> +    pub async fn peek(&mut self) -> Option<io::Result<Entry>> {
>> +        self.inner.peek().await.transpose()
>>       }
>>   
>>       /// Get a reader for the contents of the current entry, if the entry has contents.
>> diff --git a/src/decoder/mod.rs b/src/decoder/mod.rs
>> index 2a5e79a..041226d 100644
>> --- a/src/decoder/mod.rs
>> +++ b/src/decoder/mod.rs
>> @@ -155,6 +155,7 @@ pub(crate) struct DecoderImpl<T> {
>>       path_lengths: Vec<usize>,
>>       state: State,
>>       with_goodbye_tables: bool,
>> +    peeked: Option<io::Result<Option<Entry>>>,
>>   
>>       /// The random access code uses decoders for sub-ranges which may not end in a `PAYLOAD` for
>>       /// entries like FIFOs or sockets, so there we explicitly allow an item to terminate with EOF.
>> @@ -218,6 +219,7 @@ impl<I: SeqRead> DecoderImpl<I> {
>>               path_lengths: Vec::new(),
>>               state: State::Begin,
>>               with_goodbye_tables: false,
>> +            peeked: None,
>>               eof_after_entry,
>>           };
>>   
>> @@ -227,8 +229,21 @@ impl<I: SeqRead> DecoderImpl<I> {
>>       }
>>   
>>       /// Get the next file entry, recursing into directories.
>> -    pub async fn next(&mut self) -> Option<io::Result<Entry>> {
>> -        self.next_do().await.transpose()
>> +    pub async fn next(&mut self) -> io::Result<Option<Entry>> {
>> +        if let Some(ent) = self.peeked.take() {
>> +            return ent;
>> +        }
>> +        self.next_do().await
>> +    }
>> +
>> +    pub async fn peek(&mut self) -> io::Result<Option<Entry>> {
>> +        self.peeked = Some(self.next().await);
>> +        match &self.peeked {
>> +            Some(Ok(ent)) => Ok(ent.clone()),
>> +            // io::Error does not implement Clone...
>> +            Some(Err(err)) => Err(io_format_err!("{}", err)),
>> +            None => unreachable!()
>> +        }
>>       }
>>   
>>       async fn next_do(&mut self) -> io::Result<Option<Entry>> {
>> diff --git a/src/decoder/sync.rs b/src/decoder/sync.rs
>> index 85b4865..c6a1bc3 100644
>> --- a/src/decoder/sync.rs
>> +++ b/src/decoder/sync.rs
>> @@ -63,7 +63,15 @@ impl<T: SeqRead> Decoder<T> {
>>       #[allow(clippy::should_implement_trait)]
>>       /// If this is a directory entry, get the next item inside the directory.
>>       pub fn next(&mut self) -> Option<io::Result<Entry>> {
>> -        poll_result_once(self.inner.next_do()).transpose()
>> +        poll_result_once(self.inner.next()).transpose()
>> +    }
>> +
>> +    /// If this is a directory entry, get the next item inside the directory.
>> +    /// Do not advance the cursor, so multiple calls to peek() will return the same entry,
>> +    /// and the next call to next() will read the item once again before moving on.
>> +    /// NOTE: This *will* advance the state for contents() and content_size()!
>> +    pub async fn peek(&mut self) -> Option<io::Result<Entry>> {
>> +        poll_result_once(self.inner.peek()).transpose()
>>       }
>>   
>>       /// Get a reader for the contents of the current entry, if the entry has contents.
>> -- 
>> 2.20.1




  reply	other threads:[~2021-02-17  8:39 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-16 17:06 [pbs-devel] [PATCH 00/22] Single file restore for VM images Stefan Reiter
2021-02-16 17:06 ` [pbs-devel] [PATCH pxar 01/22] decoder/aio: add contents() and content_size() calls Stefan Reiter
2021-02-17  7:56   ` Wolfgang Bumiller
2021-02-16 17:06 ` [pbs-devel] [PATCH pxar 02/22] decoder: add peek() Stefan Reiter
2021-02-17  8:20   ` Wolfgang Bumiller
2021-02-17  8:38     ` Stefan Reiter [this message]
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-restore-vm-data 03/22] initial commit Stefan Reiter
2021-03-15 18:35   ` [pbs-devel] applied: " Thomas Lamprecht
2021-03-16 15:33     ` Stefan Reiter
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 04/22] api2/admin/datastore: refactor list_dir_content in catalog_reader Stefan Reiter
2021-02-17  7:50   ` [pbs-devel] applied: " Thomas Lamprecht
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 05/22] api2/admin/datastore: accept "/" as path for root Stefan Reiter
2021-02-17  7:50   ` [pbs-devel] applied: " Thomas Lamprecht
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 06/22] api2/admin/datastore: refactor create_zip into pxar/extract Stefan Reiter
2021-02-17  7:50   ` [pbs-devel] applied: " Thomas Lamprecht
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 07/22] pxar/extract: add extract_sub_dir Stefan Reiter
2021-02-17  7:51   ` [pbs-devel] applied: " Thomas Lamprecht
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 08/22] pxar/extract: add sequential variants to create_zip, extract_sub_dir Stefan Reiter
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 09/22] client: extract common functions to proxmox_client_tools module Stefan Reiter
2021-02-17  6:49   ` Dietmar Maurer
2021-02-17  7:58     ` Stefan Reiter
2021-02-17  8:50       ` Dietmar Maurer
2021-02-17  9:47         ` Stefan Reiter
2021-02-17 10:12           ` Dietmar Maurer
2021-02-17  9:13   ` [pbs-devel] applied: " Dietmar Maurer
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 10/22] proxmox_client_tools: extract 'key' from client module Stefan Reiter
2021-02-17  9:11   ` Dietmar Maurer
2021-02-16 17:06 ` [pbs-devel] [PATCH proxmox-backup 11/22] file-restore: add binary and basic commands Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 12/22] file-restore: allow specifying output-format Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 13/22] rest: implement tower service for UnixStream Stefan Reiter
2021-02-17  6:52   ` [pbs-devel] applied: " Dietmar Maurer
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 14/22] client: add VsockClient to connect to virtio-vsock VMs Stefan Reiter
2021-02-17  7:24   ` [pbs-devel] applied: " Dietmar Maurer
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 15/22] file-restore-daemon: add binary with virtio-vsock API server Stefan Reiter
2021-02-17 10:17   ` Dietmar Maurer
2021-02-17 10:25   ` Dietmar Maurer
2021-02-17 10:30     ` Stefan Reiter
2021-02-17 11:13       ` Dietmar Maurer
2021-02-17 11:26   ` Dietmar Maurer
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 16/22] file-restore-daemon: add watchdog module Stefan Reiter
2021-02-17 10:52   ` Wolfgang Bumiller
2021-02-17 11:14     ` Stefan Reiter
2021-02-17 11:29       ` Wolfgang Bumiller
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 17/22] file-restore-daemon: add disk module Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 18/22] file-restore: add basic VM/block device support Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 19/22] file-restore: improve logging of VM with logrotate Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 20/22] debian/client: add postinst hook to rebuild file-restore initramfs Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 21/22] file-restore(-daemon): implement list API Stefan Reiter
2021-02-16 17:07 ` [pbs-devel] [PATCH proxmox-backup 22/22] file-restore: add 'extract' command for VM file restore Stefan Reiter
2021-02-16 17:11 ` [pbs-devel] [PATCH 00/22] Single file restore for VM images Stefan Reiter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af0da89d-3063-b23e-402f-83465363550c@proxmox.com \
    --to=s.reiter@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    --cc=w.bumiller@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal