public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives
@ 2024-06-11 13:29 Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 1/5] format: add helper for payload header consistency checks Christian Ebner
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-11 13:29 UTC (permalink / raw)
  To: pbs-devel

Fuse mounts for split pxar archives currently greatly suffer from the
consistency check between metadata and payload data archives, as
these happen already during decoding of the payload reference entry
in the metadata archive. By moving this check to the content reader
instantiation, the performance can be improved significantly, as now
the payload data chunks only need to be fetched and decoded when
actually accessing the file payloads.

Changes since version 1:
- Add previously missing check when accessing contents via the accessor
  instead of the decoder.
- Add missing context and refactor file entry extraction branch in pxar
  extract according to suggestions

pxar:

Christian Ebner (4):
  format: add helper for payload header consistency checks
  format: add helper type ContentRange
  decoder: move payload header check for split input
  accessor: add payload checks for split archives

 src/accessor/aio.rs  | 15 ++++++-----
 src/accessor/mod.rs  | 60 +++++++++++++++++++++++++++++++++-----------
 src/accessor/sync.rs | 16 ++++++------
 src/decoder/aio.rs   |  4 +--
 src/decoder/mod.rs   | 56 +++++++++++++++++++++++------------------
 src/decoder/sync.rs  |  5 ++--
 src/format/mod.rs    | 32 +++++++++++++++++++++++
 7 files changed, 132 insertions(+), 56 deletions(-)

proxmox-backup:

Christian Ebner (1):
  client: pxar: fix fuse mount performance for split archives

 pbs-client/src/pxar/extract.rs | 62 ++++++++++++++++------------------
 pbs-pxar-fuse/src/lib.rs       | 14 ++++----
 src/api2/tape/restore.rs       |  2 +-
 3 files changed, 37 insertions(+), 41 deletions(-)

-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pbs-devel] [PATCH v2 pxar 1/5] format: add helper for payload header consistency checks
  2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
@ 2024-06-11 13:29 ` Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 2/5] format: add helper type ContentRange Christian Ebner
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-11 13:29 UTC (permalink / raw)
  To: pbs-devel

The helper method will be used to check the payload header being
consistent with what was encoded as paylaod reference for split
pxar archives.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 src/format/mod.rs | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/src/format/mod.rs b/src/format/mod.rs
index 59d40e1..746f39d 100644
--- a/src/format/mod.rs
+++ b/src/format/mod.rs
@@ -823,3 +823,24 @@ pub fn check_file_name(path: &Path) -> io::Result<()> {
         Ok(())
     }
 }
+
+/// Check if provided header is of type payload and the headers content size matches given size.
+///
+/// Returns an [`io::Error`](std::io::Error) of type [`Other`](std::io::ErrorKind::Other) if that's
+/// not the case.
+pub fn check_payload_header_and_size(header: &Header, size: u64) -> io::Result<()> {
+    header.check_header_size()?;
+
+    if header.htype != PXAR_PAYLOAD {
+        io_bail!("unexpected header: expected {PXAR_PAYLOAD:x?} , got {header:x?}");
+    }
+
+    if header.content_size() != size {
+        io_bail!(
+            "encountered size mismatch: expected {}, got {size}",
+            header.content_size()
+        );
+    }
+
+    Ok(())
+}
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pbs-devel] [PATCH v2 pxar 2/5] format: add helper type ContentRange
  2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 1/5] format: add helper for payload header consistency checks Christian Ebner
@ 2024-06-11 13:29 ` Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 3/5] decoder: move payload header check for split input Christian Ebner
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-11 13:29 UTC (permalink / raw)
  To: pbs-devel

The new `ContentRange` type will be used to store content ranges to
be used when accessing a pxar archive via the decoder, but with
additional payload reference information in order to be able to
perfom additional consistency checks on these entires.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 src/format/mod.rs | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/format/mod.rs b/src/format/mod.rs
index 746f39d..91ca9ea 100644
--- a/src/format/mod.rs
+++ b/src/format/mod.rs
@@ -43,6 +43,7 @@ use std::fmt;
 use std::fmt::Display;
 use std::io;
 use std::mem::size_of;
+use std::ops::Range;
 use std::os::unix::ffi::OsStrExt;
 use std::path::Path;
 use std::time::{Duration, SystemTime};
@@ -844,3 +845,13 @@ pub fn check_payload_header_and_size(header: &Header, size: u64) -> io::Result<(
 
     Ok(())
 }
+
+/// Stores a content range to be accessed via the `Accessor` as well as the payload reference to
+/// perform consistency checks on payload references for archives accessed via split variant input.
+#[derive(Clone)]
+pub struct ContentRange {
+    // Range of the content
+    pub content: Range<u64>,
+    // Optional payload ref
+    pub payload_ref: Option<PayloadRef>,
+}
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pbs-devel] [PATCH v2 pxar 3/5] decoder: move payload header check for split input
  2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 1/5] format: add helper for payload header consistency checks Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 2/5] format: add helper type ContentRange Christian Ebner
@ 2024-06-11 13:29 ` Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 4/5] accessor: add payload checks for split archives Christian Ebner
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-11 13:29 UTC (permalink / raw)
  To: pbs-devel

The payload entries in the payload output for split pxar archives are
separated by payload headers, which allow to perform consistency
checks for the payload references encoded in the metadata archive.

Currently, this consistency check is performed right after reading the
entry in the metadata archive, which however has the downside that the
payload has to be fetched and decoded just for this consistency check.
This greatly impacts performance when accessing a metadata archive
with attached payload input reader, e.g. in the fuse implementation to
mount pxar archives, being especially severe when accessed over the
network in combination with a remote chunk reader as the Proxmox
Backup Server does.

Therefore, move this check to the contents reader instantiation
instead and add an additional flag to the decoder's `InPayload` state.

Getting the decoder now needs to be async and the method must return
an error when the check fails.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- use helper method for payload consistency check

 src/decoder/aio.rs  |  4 ++--
 src/decoder/mod.rs  | 56 +++++++++++++++++++++++++--------------------
 src/decoder/sync.rs |  5 ++--
 3 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/src/decoder/aio.rs b/src/decoder/aio.rs
index 3f9881d..19e7023 100644
--- a/src/decoder/aio.rs
+++ b/src/decoder/aio.rs
@@ -60,8 +60,8 @@ impl<T: SeqRead> Decoder<T> {
     }
 
     /// Get a reader for the contents of the current entry, if the entry has contents.
-    pub fn contents(&mut self) -> Option<Contents<T>> {
-        self.inner.content_reader()
+    pub async fn contents(&mut self) -> io::Result<Option<Contents<T>>> {
+        self.inner.content_reader().await
     }
 
     /// Get the size of the current contents, if the entry has contents.
diff --git a/src/decoder/mod.rs b/src/decoder/mod.rs
index 46a21b8..6191627 100644
--- a/src/decoder/mod.rs
+++ b/src/decoder/mod.rs
@@ -182,6 +182,7 @@ enum State {
     InPayload {
         offset: u64,
         size: u64,
+        header_checked: bool,
     },
 
     /// file entries with no data (fifo, socket)
@@ -296,8 +297,16 @@ impl<I: SeqRead> DecoderImpl<I> {
                     // hierarchy and parse the next PXAR_FILENAME or the PXAR_GOODBYE:
                     self.read_next_item().await?;
                 }
-                State::InPayload { offset, .. } => {
+                State::InPayload {
+                    offset,
+                    header_checked,
+                    ..
+                } => {
                     if self.input.payload().is_some() {
+                        if !header_checked {
+                            // header is only checked if payload has been accessed
+                            self.payload_consumed += size_of::<Header>() as u64;
+                        }
                         // Update consumed payload as given by the offset referenced by the content reader
                         self.payload_consumed += offset;
                     } else {
@@ -370,19 +379,31 @@ impl<I: SeqRead> DecoderImpl<I> {
         }
     }
 
-    pub fn content_reader(&mut self) -> Option<Contents<I>> {
-        if let State::InPayload { offset, size } = &mut self.state {
-            if self.input.payload().is_some() {
-                Some(Contents::new(
+    pub async fn content_reader(&mut self) -> Result<Option<Contents<I>>, io::Error> {
+        if let State::InPayload {
+            offset,
+            size,
+            header_checked,
+        } = &mut self.state
+        {
+            if let Some(payload_input) = self.input.payload_mut() {
+                if !*header_checked {
+                    let header: Header = seq_read_entry(payload_input).await?;
+                    self.payload_consumed += size_of::<Header>() as u64;
+                    format::check_payload_header_and_size(&header, *size)?;
+                    *header_checked = true;
+                }
+
+                Ok(Some(Contents::new(
                     self.input.payload_mut().unwrap(),
                     offset,
                     *size,
-                ))
+                )))
             } else {
-                Some(Contents::new(self.input.archive_mut(), offset, *size))
+                Ok(Some(Contents::new(self.input.archive_mut(), offset, *size)))
             }
         } else {
-            None
+            Ok(None)
         }
     }
 
@@ -621,6 +642,7 @@ impl<I: SeqRead> DecoderImpl<I> {
                 };
                 self.state = State::InPayload {
                     offset: 0,
+                    header_checked: false,
                     size: self.current_header.content_size(),
                 };
                 return Ok(ItemResult::Entry);
@@ -652,23 +674,6 @@ impl<I: SeqRead> DecoderImpl<I> {
                         let end = start + payload_ref.size + size_of::<Header>() as u64;
                         payload_input.update_range(start..end);
                     }
-
-                    let header: Header = seq_read_entry(payload_input).await?;
-                    if header.htype != format::PXAR_PAYLOAD {
-                        io_bail!(
-                            "unexpected header in payload input: expected {} , got {header}",
-                            format::PXAR_PAYLOAD,
-                        );
-                    }
-                    self.payload_consumed += size_of::<Header>() as u64;
-
-                    if header.content_size() != payload_ref.size {
-                        io_bail!(
-                            "encountered payload size mismatch: got {}, expected {}",
-                            payload_ref.size,
-                            header.content_size(),
-                        );
-                    }
                 }
 
                 self.entry.kind = EntryKind::File {
@@ -678,6 +683,7 @@ impl<I: SeqRead> DecoderImpl<I> {
                 };
                 self.state = State::InPayload {
                     offset: 0,
+                    header_checked: false,
                     size: payload_ref.size,
                 };
                 return Ok(ItemResult::Entry);
diff --git a/src/decoder/sync.rs b/src/decoder/sync.rs
index 8779f87..1116fe8 100644
--- a/src/decoder/sync.rs
+++ b/src/decoder/sync.rs
@@ -77,8 +77,9 @@ impl<T: SeqRead> Decoder<T> {
     }
 
     /// Get a reader for the contents of the current entry, if the entry has contents.
-    pub fn contents(&mut self) -> Option<Contents<T>> {
-        self.inner.content_reader().map(|inner| Contents { inner })
+    pub fn contents(&mut self) -> io::Result<Option<Contents<T>>> {
+        let content_reader = poll_result_once(self.inner.content_reader())?;
+        Ok(content_reader.map(|inner| Contents { inner }))
     }
 
     /// Get the size of the current contents, if the entry has contents.
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pbs-devel] [PATCH v2 pxar 4/5] accessor: add payload checks for split archives
  2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
                   ` (2 preceding siblings ...)
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 3/5] decoder: move payload header check for split input Christian Ebner
@ 2024-06-11 13:29 ` Christian Ebner
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 proxmox-backup 5/5] client: pxar: fix fuse mount performance " Christian Ebner
  2024-06-12  7:53 ` [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] " Christian Ebner
  5 siblings, 0 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-11 13:29 UTC (permalink / raw)
  To: pbs-devel

Add checks for split variant inputs when accessing the payload
contents via the accessor instance. Both cases, accessing via the
safe `contents` method and via the unsafe `contents_at_range` call
are covered.

To be able to perform the checks, the current range is wrapped into a
`ContentRange` type, which also passed the payload offset and size
as expected from the metadata archive. The corresponding interfaces
have been adapted accordingly.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 src/accessor/aio.rs  | 15 ++++++-----
 src/accessor/mod.rs  | 60 +++++++++++++++++++++++++++++++++-----------
 src/accessor/sync.rs | 16 ++++++------
 3 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/src/accessor/aio.rs b/src/accessor/aio.rs
index 73b1025..c86adc5 100644
--- a/src/accessor/aio.rs
+++ b/src/accessor/aio.rs
@@ -7,7 +7,6 @@
 use std::future::Future;
 use std::io;
 use std::mem;
-use std::ops::Range;
 use std::os::unix::fs::FileExt;
 use std::path::Path;
 use std::pin::Pin;
@@ -16,6 +15,7 @@ use std::task::{Context, Poll};
 
 use crate::accessor::{self, cache::Cache, MaybeReady, ReadAt, ReadAtOperation};
 use crate::decoder::aio::Decoder;
+use crate::format::ContentRange;
 use crate::format::GoodbyeItem;
 use crate::util;
 use crate::{Entry, PxarVariant};
@@ -153,13 +153,16 @@ impl<T: Clone + ReadAt> Accessor<T> {
     ///
     /// This will provide a reader over an arbitrary range of the archive file, so unless this
     /// comes from a actual file entry data, the contents might not make much sense.
-    pub unsafe fn open_contents_at_range(&self, range: Range<u64>) -> FileContents<T> {
-        FileContents {
-            inner: unsafe { self.inner.open_contents_at_range(range) },
+    pub async unsafe fn open_contents_at_range(
+        &self,
+        range: &ContentRange,
+    ) -> io::Result<FileContents<T>> {
+        Ok(FileContents {
+            inner: unsafe { self.inner.open_contents_at_range(range).await? },
             at: 0,
             buffer: Vec::new(),
             future: None,
-        }
+        })
     }
 
     /// Following a hardlink.
@@ -235,7 +238,7 @@ impl<T: Clone + ReadAt> FileEntry<T> {
     }
 
     /// For use with unsafe accessor methods.
-    pub fn content_range(&self) -> io::Result<Option<Range<u64>>> {
+    pub fn content_range(&self) -> io::Result<Option<ContentRange>> {
         self.inner.content_range()
     }
 
diff --git a/src/accessor/mod.rs b/src/accessor/mod.rs
index 1c9e30f..9f27dfd 100644
--- a/src/accessor/mod.rs
+++ b/src/accessor/mod.rs
@@ -17,7 +17,7 @@ use endian_trait::Endian;
 
 use crate::binary_tree_array;
 use crate::decoder::{self, DecoderImpl};
-use crate::format::{self, FormatVersion, GoodbyeItem};
+use crate::format::{self, ContentRange, FormatVersion, GoodbyeItem, PayloadRef};
 use crate::util;
 use crate::{Entry, EntryKind, PxarVariant};
 
@@ -194,11 +194,8 @@ impl<T: ReadAt> AccessorImpl<T> {
         header.check_header_size()?;
 
         if header.htype == format::PXAR_FORMAT_VERSION {
-            let version: u64 = read_entry_at(
-                input.archive(),
-                size_of::<format::Header>() as u64,
-            )
-            .await?;
+            let version: u64 =
+                read_entry_at(input.archive(), size_of::<format::Header>() as u64).await?;
             FormatVersion::deserialize(version)?;
         }
 
@@ -339,11 +336,25 @@ impl<T: Clone + ReadAt> AccessorImpl<T> {
     }
 
     /// Allow opening arbitrary contents from a specific range.
-    pub unsafe fn open_contents_at_range(&self, range: Range<u64>) -> FileContentsImpl<T> {
+    pub async unsafe fn open_contents_at_range(
+        &self,
+        range: &format::ContentRange,
+    ) -> io::Result<FileContentsImpl<T>> {
         if let Some((payload_input, _)) = &self.input.payload() {
-            FileContentsImpl::new(payload_input.clone(), range)
+            if let Some(payload_ref) = &range.payload_ref {
+                let header: format::Header =
+                    read_entry_at(payload_input, payload_ref.offset).await?;
+                format::check_payload_header_and_size(&header, payload_ref.size)?;
+            }
+            Ok(FileContentsImpl::new(
+                payload_input.clone(),
+                range.content.clone(),
+            ))
         } else {
-            FileContentsImpl::new(self.input.archive().clone(), range)
+            Ok(FileContentsImpl::new(
+                self.input.archive().clone(),
+                range.content.clone(),
+            ))
         }
     }
 
@@ -761,7 +772,7 @@ impl<T: Clone + ReadAt> FileEntryImpl<T> {
     }
 
     /// For use with unsafe accessor methods.
-    pub fn content_range(&self) -> io::Result<Option<Range<u64>>> {
+    pub fn content_range(&self) -> io::Result<Option<ContentRange>> {
         match self.entry.kind {
             EntryKind::File { offset: None, .. } => {
                 io_bail!("cannot open file, reader provided no offset")
@@ -770,7 +781,10 @@ impl<T: Clone + ReadAt> FileEntryImpl<T> {
                 size,
                 offset: Some(offset),
                 payload_offset: None,
-            } => Ok(Some(offset..(offset + size))),
+            } => Ok(Some(ContentRange {
+                content: offset..(offset + size),
+                payload_ref: None,
+            })),
             // Payload offset beats regular offset if some
             EntryKind::File {
                 size,
@@ -778,7 +792,13 @@ impl<T: Clone + ReadAt> FileEntryImpl<T> {
                 payload_offset: Some(payload_offset),
             } => {
                 let start_offset = payload_offset + size_of::<format::Header>() as u64;
-                Ok(Some(start_offset..start_offset + size))
+                Ok(Some(ContentRange {
+                    content: start_offset..start_offset + size,
+                    payload_ref: Some(PayloadRef {
+                        offset: payload_offset,
+                        size,
+                    }),
+                }))
             }
             _ => Ok(None),
         }
@@ -789,9 +809,21 @@ impl<T: Clone + ReadAt> FileEntryImpl<T> {
             .content_range()?
             .ok_or_else(|| io_format_err!("not a file"))?;
         if let Some((ref payload_input, _)) = self.input.payload() {
-            Ok(FileContentsImpl::new(payload_input.clone(), range))
+            if let Some(payload_ref) = range.payload_ref {
+                let header: format::Header =
+                    read_entry_at(payload_input, payload_ref.offset).await?;
+                format::check_payload_header_and_size(&header, payload_ref.size)?;
+            }
+
+            Ok(FileContentsImpl::new(
+                payload_input.clone(),
+                range.content.clone(),
+            ))
         } else {
-            Ok(FileContentsImpl::new(self.input.archive().clone(), range))
+            Ok(FileContentsImpl::new(
+                self.input.archive().clone(),
+                range.content.clone(),
+            ))
         }
     }
 
diff --git a/src/accessor/sync.rs b/src/accessor/sync.rs
index df2ed23..63f0265 100644
--- a/src/accessor/sync.rs
+++ b/src/accessor/sync.rs
@@ -1,7 +1,6 @@
 //! Blocking `pxar` random access handling.
 
 use std::io;
-use std::ops::Range;
 use std::os::unix::fs::FileExt;
 use std::path::Path;
 use std::pin::Pin;
@@ -10,7 +9,7 @@ use std::task::Context;
 
 use crate::accessor::{self, cache::Cache, MaybeReady, ReadAt, ReadAtOperation};
 use crate::decoder::Decoder;
-use crate::format::GoodbyeItem;
+use crate::format::{ContentRange, GoodbyeItem};
 use crate::util::poll_result_once;
 use crate::{Entry, PxarVariant};
 
@@ -142,11 +141,14 @@ impl<T: Clone + ReadAt> Accessor<T> {
     ///
     /// This will provide a reader over an arbitrary range of the archive file, so unless this
     /// comes from a actual file entry data, the contents might not make much sense.
-    pub unsafe fn open_contents_at_range(&self, range: Range<u64>) -> FileContents<T> {
-        FileContents {
-            inner: unsafe { self.inner.open_contents_at_range(range) },
+    pub unsafe fn open_contents_at_range(
+        &self,
+        range: &ContentRange,
+    ) -> io::Result<FileContents<T>> {
+        Ok(FileContents {
+            inner: poll_result_once(unsafe { self.inner.open_contents_at_range(range) })?,
             at: 0,
-        }
+        })
     }
 
     /// Following a hardlink.
@@ -291,7 +293,7 @@ impl<T: Clone + ReadAt> FileEntry<T> {
     }
 
     /// For use with unsafe accessor methods.
-    pub fn content_range(&self) -> io::Result<Option<Range<u64>>> {
+    pub fn content_range(&self) -> io::Result<Option<ContentRange>> {
         self.inner.content_range()
     }
 
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [pbs-devel] [PATCH v2 proxmox-backup 5/5] client: pxar: fix fuse mount performance for split archives
  2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
                   ` (3 preceding siblings ...)
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 4/5] accessor: add payload checks for split archives Christian Ebner
@ 2024-06-11 13:29 ` Christian Ebner
  2024-06-12  7:53 ` [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] " Christian Ebner
  5 siblings, 0 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-11 13:29 UTC (permalink / raw)
  To: pbs-devel

Adapt to the decoder/accessor method changes introduced in the pxar
library, which were introduced in order to move the consistency check
for metadata and payload data archives.

The new location of the checks allows to access the pxar archive via
a `Split` variant reader instance, without penalization when just
accessing the metadata, not reading any payload data.

This greatly improves performance when accessing fuse mounted
archives.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- adapt to cover missing check for access via accessor in FUSE mount
- include suggested code change to correctly provide error context

 pbs-client/src/pxar/extract.rs | 62 ++++++++++++++++------------------
 pbs-pxar-fuse/src/lib.rs       | 14 ++++----
 src/api2/tape/restore.rs       |  2 +-
 3 files changed, 37 insertions(+), 41 deletions(-)

diff --git a/pbs-client/src/pxar/extract.rs b/pbs-client/src/pxar/extract.rs
index 38d4ca89a..e1192c73f 100644
--- a/pbs-client/src/pxar/extract.rs
+++ b/pbs-client/src/pxar/extract.rs
@@ -373,26 +373,22 @@ where
                     Ok(())
                 }
             }
-            (true, EntryKind::File { size, .. }) => {
-                let contents = self.decoder.contents();
-
-                if let Some(mut contents) = contents {
-                    self.extractor.extract_file(
-                        &file_name,
-                        metadata,
-                        *size,
-                        &mut contents,
-                        self.extractor
-                            .overwrite_flags
-                            .contains(OverwriteFlags::FILE),
-                    )
-                } else {
-                    Err(format_err!(
-                        "found regular file entry without contents in archive"
-                    ))
-                }
-                .context(PxarExtractContext::ExtractFile)
+            (true, EntryKind::File { size, .. }) => match self.decoder.contents() {
+                Ok(Some(mut contents)) => self.extractor.extract_file(
+                    &file_name,
+                    metadata,
+                    *size,
+                    &mut contents,
+                    self.extractor
+                        .overwrite_flags
+                        .contains(OverwriteFlags::FILE),
+                ),
+                Ok(None) => Err(format_err!(
+                    "found regular file entry without contents in archive"
+                )),
+                Err(err) => Err(err.into()),
             }
+            .context(PxarExtractContext::ExtractFile),
             (false, _) => Ok(()), // skip this
         };
 
@@ -872,7 +868,8 @@ where
             match entry.kind() {
                 EntryKind::File { .. } => {
                     let size = decoder.content_size().unwrap_or(0);
-                    tar_add_file(&mut tarencoder, decoder.contents(), size, metadata, path).await?
+                    let contents = decoder.contents().await?;
+                    tar_add_file(&mut tarencoder, contents, size, metadata, path).await?
                 }
                 EntryKind::Hardlink(link) => {
                     if !link.data.is_empty() {
@@ -894,14 +891,9 @@ where
                                     path
                                 } else {
                                     let size = decoder.content_size().unwrap_or(0);
-                                    tar_add_file(
-                                        &mut tarencoder,
-                                        decoder.contents(),
-                                        size,
-                                        metadata,
-                                        path,
-                                    )
-                                    .await?;
+                                    let contents = decoder.contents().await?;
+                                    tar_add_file(&mut tarencoder, contents, size, metadata, path)
+                                        .await?;
                                     hardlinks.insert(realpath.to_owned(), path.to_owned());
                                     continue;
                                 }
@@ -1038,7 +1030,8 @@ where
                         metadata.stat.mode as u16,
                         true,
                     );
-                    zip.add_entry(entry, decoder.contents())
+                    let contents = decoder.contents().await?;
+                    zip.add_entry(entry, contents)
                         .await
                         .context("could not send file entry")?;
                 }
@@ -1056,7 +1049,8 @@ where
                         metadata.stat.mode as u16,
                         true,
                     );
-                    zip.add_entry(entry, decoder.contents())
+                    let contents = decoder.contents().await?;
+                    zip.add_entry(entry, contents)
                         .await
                         .context("could not send file entry")?;
                 }
@@ -1279,14 +1273,16 @@ where
                         .with_context(|| format!("error at entry {file_name_os:?}"))?;
                 }
                 EntryKind::File { size, .. } => {
+                    let mut contents = decoder
+                        .contents()
+                        .await?
+                        .context("found regular file entry without contents in archive")?;
                     extractor
                         .async_extract_file(
                             &file_name,
                             metadata,
                             *size,
-                            &mut decoder
-                                .contents()
-                                .context("found regular file entry without contents in archive")?,
+                            &mut contents,
                             extractor.overwrite_flags.contains(OverwriteFlags::FILE),
                         )
                         .await?
diff --git a/pbs-pxar-fuse/src/lib.rs b/pbs-pxar-fuse/src/lib.rs
index 780a4ddbe..c75fba28c 100644
--- a/pbs-pxar-fuse/src/lib.rs
+++ b/pbs-pxar-fuse/src/lib.rs
@@ -5,7 +5,6 @@ use std::ffi::{OsStr, OsString};
 use std::future::Future;
 use std::io;
 use std::mem;
-use std::ops::Range;
 use std::os::unix::ffi::OsStrExt;
 use std::path::Path;
 use std::pin::Pin;
@@ -21,6 +20,7 @@ use futures::stream::{StreamExt, TryStreamExt};
 
 use proxmox_io::vec;
 use pxar::accessor::{self, EntryRangeInfo, ReadAt};
+use pxar::format::ContentRange;
 
 use proxmox_fuse::requests::{self, FuseRequest};
 use proxmox_fuse::{EntryParam, Fuse, ReplyBufState, Request, ROOT_ID};
@@ -130,7 +130,7 @@ struct Lookup {
     inode: u64,
     parent: u64,
     entry_range_info: EntryRangeInfo,
-    content_range: Option<Range<u64>>,
+    content_range: Option<ContentRange>,
 }
 
 impl Lookup {
@@ -138,7 +138,7 @@ impl Lookup {
         inode: u64,
         parent: u64,
         entry_range_info: EntryRangeInfo,
-        content_range: Option<Range<u64>>,
+        content_range: Option<ContentRange>,
     ) -> Box<Lookup> {
         Box::new(Self {
             refs: AtomicUsize::new(1),
@@ -433,13 +433,13 @@ impl SessionImpl {
         }
     }
 
-    fn open_content(&self, lookup: &LookupRef) -> Result<FileContents, Error> {
+    async fn open_content<'a>(&'a self, lookup: &'a LookupRef<'a>) -> Result<FileContents, Error> {
         if is_dir_inode(lookup.inode) {
             io_return!(libc::EISDIR);
         }
 
-        match lookup.content_range.clone() {
-            Some(range) => Ok(unsafe { self.accessor.open_contents_at_range(range) }),
+        match &lookup.content_range {
+            Some(range) => Ok(unsafe { self.accessor.open_contents_at_range(range).await? }),
             None => io_return!(libc::EBADF),
         }
     }
@@ -581,7 +581,7 @@ impl SessionImpl {
 
     async fn read(&self, inode: u64, len: usize, offset: u64) -> Result<Vec<u8>, Error> {
         let file = self.get_lookup(inode)?;
-        let content = self.open_content(&file)?;
+        let content = self.open_content(&file).await?;
         let mut buf = vec::undefined(len);
         let mut pos = 0;
         // fuse' read is different from normal read - no short reads allowed except for EOF!
diff --git a/src/api2/tape/restore.rs b/src/api2/tape/restore.rs
index 382909647..2b2025f6d 100644
--- a/src/api2/tape/restore.rs
+++ b/src/api2/tape/restore.rs
@@ -1748,7 +1748,7 @@ fn try_restore_snapshot_archive<R: pxar::decoder::SeqRead>(
         }
 
         let filename = entry.file_name();
-        let mut contents = match decoder.contents() {
+        let mut contents = match decoder.contents()? {
             None => bail!("missing file content"),
             Some(contents) => contents,
         };
-- 
2.39.2



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives
  2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
                   ` (4 preceding siblings ...)
  2024-06-11 13:29 ` [pbs-devel] [PATCH v2 proxmox-backup 5/5] client: pxar: fix fuse mount performance " Christian Ebner
@ 2024-06-12  7:53 ` Christian Ebner
  5 siblings, 0 replies; 7+ messages in thread
From: Christian Ebner @ 2024-06-12  7:53 UTC (permalink / raw)
  To: pbs-devel

Please disregard these patches, I will send a version 3 including some 
minor modifications.


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-06-12  7:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-11 13:29 [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] fix fuse mount performance for split archives Christian Ebner
2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 1/5] format: add helper for payload header consistency checks Christian Ebner
2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 2/5] format: add helper type ContentRange Christian Ebner
2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 3/5] decoder: move payload header check for split input Christian Ebner
2024-06-11 13:29 ` [pbs-devel] [PATCH v2 pxar 4/5] accessor: add payload checks for split archives Christian Ebner
2024-06-11 13:29 ` [pbs-devel] [PATCH v2 proxmox-backup 5/5] client: pxar: fix fuse mount performance " Christian Ebner
2024-06-12  7:53 ` [pbs-devel] [PATCH v2 pxar proxmox-backp 0/5] " Christian Ebner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal