all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH v4 pxar 5/26] fix #3174: enc/dec: impl PXAR_APPENDIX entrytype
Date: Thu,  9 Nov 2023 19:45:53 +0100	[thread overview]
Message-ID: <20231109184614.1611127-6-c.ebner@proxmox.com> (raw)
In-Reply-To: <20231109184614.1611127-1-c.ebner@proxmox.com>

Add an additional entry type for marking the start of a pxar archive
appendix section. The appendix is a concatenation of possibly
uncorrelated chunks, therefore not following the pxar archive format
anymore. The appendix is only used to access the file metadata and
payloads when a PXAR_APPENDIX_REF entry is encountered in the archive
before this point.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
Changes since v3:
- specify u64 as AppendixRefOffset for full_size in add_appendix

Changes since v2:
- no changes

Changes since v1:
- Use custom type for appendix start offset instead of raw `u64`

 examples/mk-format-hashes.rs |  1 +
 src/decoder/mod.rs           |  9 +++++++++
 src/encoder/aio.rs           | 12 +++++++++++-
 src/encoder/mod.rs           | 33 +++++++++++++++++++++++++++++++++
 src/encoder/sync.rs          | 12 +++++++++++-
 src/format/mod.rs            |  7 +++++++
 src/lib.rs                   |  4 ++++
 7 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/examples/mk-format-hashes.rs b/examples/mk-format-hashes.rs
index 8b4f5de..f068edd 100644
--- a/examples/mk-format-hashes.rs
+++ b/examples/mk-format-hashes.rs
@@ -12,6 +12,7 @@ const CONSTANTS: &[(&str, &str, &str)] = &[
         "__PROXMOX_FORMAT_ENTRY__",
     ),
     ("", "PXAR_FILENAME", "__PROXMOX_FORMAT_FILENAME__"),
+    ("", "PXAR_APPENDIX", "__PROXMOX_FORMAT_APPENDIX__"),
     ("", "PXAR_SYMLINK", "__PROXMOX_FORMAT_SYMLINK__"),
     ("", "PXAR_DEVICE", "__PROXMOX_FORMAT_DEVICE__"),
     ("", "PXAR_XATTR", "__PROXMOX_FORMAT_XATTR__"),
diff --git a/src/decoder/mod.rs b/src/decoder/mod.rs
index 70a8697..43a1122 100644
--- a/src/decoder/mod.rs
+++ b/src/decoder/mod.rs
@@ -295,6 +295,7 @@ impl<I: SeqRead> DecoderImpl<I> {
                         continue;
                     }
                 }
+                format::PXAR_APPENDIX => return Ok(Some(self.entry.take())),
                 _ => io_bail!(
                     "expected filename or directory-goodbye pxar entry, got: {}",
                     self.current_header,
@@ -546,6 +547,14 @@ impl<I: SeqRead> DecoderImpl<I> {
                 self.entry.kind = EntryKind::Device(self.read_device().await?);
                 return Ok(ItemResult::Entry);
             }
+            format::PXAR_APPENDIX => {
+                let bytes = self.read_entry_as_bytes().await?;
+                let total = u64::from_le_bytes(bytes[0..8].try_into().unwrap());
+                self.entry.kind = EntryKind::Appendix {
+                    total,
+                };
+                return Ok(ItemResult::Entry);
+            }
             format::PXAR_PAYLOAD => {
                 let offset = seq_read_position(&mut self.input).await.transpose()?;
                 self.entry.kind = EntryKind::File {
diff --git a/src/encoder/aio.rs b/src/encoder/aio.rs
index 66ea535..9cc26e0 100644
--- a/src/encoder/aio.rs
+++ b/src/encoder/aio.rs
@@ -5,7 +5,7 @@ use std::path::Path;
 use std::pin::Pin;
 use std::task::{Context, Poll};
 
-use crate::encoder::{self, AppendixRefOffset, LinkOffset, SeqWrite};
+use crate::encoder::{self, AppendixRefOffset, AppendixStartOffset, LinkOffset, SeqWrite};
 use crate::format;
 use crate::Metadata;
 
@@ -124,6 +124,16 @@ impl<'a, T: SeqWrite + 'a> Encoder<'a, T> {
             .await
     }
 
+    /// Add the appendix start entry marker
+    ///
+    /// Returns the LinkOffset pointing after the entry, the appendix start offset
+    pub async fn add_appendix(
+        &mut self,
+        full_size: AppendixRefOffset,
+    ) -> io::Result<AppendixStartOffset> {
+        self.inner.add_appendix(full_size).await
+    }
+
     /// Add a symbolic link to the archive.
     pub async fn add_symlink<PF: AsRef<Path>, PT: AsRef<Path>>(
         &mut self,
diff --git a/src/encoder/mod.rs b/src/encoder/mod.rs
index 78612a6..1d5fe82 100644
--- a/src/encoder/mod.rs
+++ b/src/encoder/mod.rs
@@ -65,6 +65,15 @@ impl AppendixRefOffset {
 /// Offset pointing to the start of the appendix section of the archive.
 #[derive(Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd)]
 pub struct AppendixStartOffset(u64);
+
+impl AppendixStartOffset {
+    /// Get the raw byte start offset for this appenidx section.
+    #[inline]
+    pub fn raw(self) -> u64 {
+        self.0
+    }
+}
+
 /// Sequential write interface used by the encoder's state machine.
 ///
 /// This is our internal writer trait which is available for `std::io::Write` types in the
@@ -510,6 +519,30 @@ impl<'a, T: SeqWrite + 'a> EncoderImpl<'a, T> {
         Ok(())
     }
 
+    /// Add the appendix start entry marker
+    ///
+    /// Returns the AppendixStartOffset pointing after the entry, the start of the appendix
+    /// section of the archive.
+    pub async fn add_appendix(
+        &mut self,
+        full_size: AppendixRefOffset,
+    ) -> io::Result<AppendixStartOffset> {
+        self.check()?;
+
+        let data = &full_size.raw().to_le_bytes().to_vec();
+        seq_write_pxar_entry(
+            self.output.as_mut(),
+            format::PXAR_APPENDIX,
+            &data,
+            &mut self.state.write_position,
+        )
+        .await?;
+
+        let offset = self.position();
+
+        Ok(AppendixStartOffset(offset))
+    }
+
     /// Return a file offset usable with `add_hardlink`.
     pub async fn add_symlink(
         &mut self,
diff --git a/src/encoder/sync.rs b/src/encoder/sync.rs
index 2c9ea2b..c7a7acb 100644
--- a/src/encoder/sync.rs
+++ b/src/encoder/sync.rs
@@ -6,7 +6,7 @@ use std::pin::Pin;
 use std::task::{Context, Poll};
 
 use crate::decoder::sync::StandardReader;
-use crate::encoder::{self, AppendixRefOffset, LinkOffset, SeqSink, SeqWrite};
+use crate::encoder::{self, AppendixRefOffset, AppendixStartOffset, LinkOffset, SeqSink, SeqWrite};
 use crate::format;
 use crate::util::poll_result_once;
 use crate::Metadata;
@@ -124,6 +124,16 @@ impl<'a, T: SeqWrite + 'a> Encoder<'a, T> {
         ))
     }
 
+    /// Add the appendix start entry marker
+    ///
+    /// Returns the LinkOffset pointing after the entry, the appendix start offset
+    pub async fn add_appendix(
+        &mut self,
+        full_size: AppendixRefOffset,
+    ) -> io::Result<AppendixStartOffset> {
+        poll_result_once(self.inner.add_appendix(full_size))
+    }
+
     /// Add a symbolic link to the archive.
     pub fn add_symlink<PF: AsRef<Path>, PT: AsRef<Path>>(
         &mut self,
diff --git a/src/format/mod.rs b/src/format/mod.rs
index 5eb7562..8254df9 100644
--- a/src/format/mod.rs
+++ b/src/format/mod.rs
@@ -35,6 +35,12 @@
 //!   * `<archive>`         -- serialization of the second directory entry
 //!   * ...
 //!   * `GOODBYE`           -- lookup table at the end of a list of directory entries
+//!
+//! For backups referencing previous backups to skip file payloads, the archive is followed by a
+//! appendix maker after which the concatinated pxar archive fragments containing the file payloads
+//! are appended. They are NOT guaranteed to follow the full pxar structure and should only be
+//! used to extract the file payloads by given offset.
+//!   * `APPENDIX`          -- pxar archive fragments containing file payloads
 
 use std::cmp::Ordering;
 use std::ffi::{CStr, OsStr};
@@ -85,6 +91,7 @@ pub const PXAR_ENTRY: u64 = 0xd5956474e588acef;
 /// Previous version of the entry struct
 pub const PXAR_ENTRY_V1: u64 = 0x11da850a1c1cceff;
 pub const PXAR_FILENAME: u64 = 0x16701121063917b3;
+pub const PXAR_APPENDIX: u64 = 0x9ff6c9507864b38d;
 pub const PXAR_SYMLINK: u64 = 0x27f971e7dbf5dc5f;
 pub const PXAR_DEVICE: u64 = 0x9fc9e906586d5ce9;
 pub const PXAR_XATTR: u64 = 0x0dab0229b57dcd03;
diff --git a/src/lib.rs b/src/lib.rs
index fa84e7a..035f995 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -372,6 +372,10 @@ pub enum EntryKind {
         file_size: u64,
     },
 
+    Appendix {
+        total: u64,
+    },
+
     /// Directory entry. When iterating through an archive, the contents follow next.
     Directory,
 
-- 
2.39.2





  parent reply	other threads:[~2023-11-09 18:47 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-09 18:45 [pbs-devel] [PATCH-SERIES v4 pxar proxmox-backup proxmox-widget-toolkit 00/26] fix #3174: improve file-level backup Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 1/26] fix #3174: decoder: factor out skip_bytes from skip_entry Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 2/26] fix #3174: decoder: impl skip_bytes for sync dec Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 3/26] fix #3174: encoder: calc filename + metadata byte size Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 4/26] fix #3174: enc/dec: impl PXAR_APPENDIX_REF entrytype Christian Ebner
2023-11-09 18:45 ` Christian Ebner [this message]
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 6/26] fix #3174: encoder: helper to add to encoder position Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 7/26] fix #3174: enc/dec: impl PXAR_APPENDIX_TAIL entrytype Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 08/26] fix #3174: index: add fn index list from start/end-offsets Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 09/26] fix #3174: index: add fn digest for DynamicEntry Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 10/26] fix #3174: api: double catalog upload size Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 11/26] fix #3174: catalog: introduce extended format v2 Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 12/26] fix #3174: archiver/extractor: impl appendix ref Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 13/26] fix #3174: catalog: add specialized Archive entry Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 14/26] fix #3174: extractor: impl seq restore from appendix Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 15/26] fix #3174: archiver: store ref to previous backup Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 16/26] fix #3174: upload stream: impl reused chunk injector Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 17/26] fix #3174: chunker: add forced boundaries Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 18/26] fix #3174: backup writer: inject queued chunk in upload steam Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 19/26] fix #3174: archiver: reuse files with unchanged metadata Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 20/26] fix #3174: specs: add backup detection mode specification Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 21/26] fix #3174: client: Add detection mode to backup creation Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 22/26] test-suite: add detection mode change benchmark Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 23/26] test-suite: Add bin to deb, add shell completions Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 24/26] catalog: fetch offset and size for files and refs Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 25/26] pxar: add heuristic to reduce reused chunk fragmentation Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-widget-toolkit 26/26] file-browser: support pxar archive and fileref types Christian Ebner
2023-11-13 14:23 ` [pbs-devel] [PATCH-SERIES v4 pxar proxmox-backup proxmox-widget-toolkit 00/26] fix #3174: improve file-level backup Fabian Grünbichler
2023-11-13 15:14   ` Christian Ebner
2023-11-13 15:21     ` Christian Ebner
2023-11-13 15:35     ` Fabian Grünbichler
2023-11-13 15:45       ` Christian Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231109184614.1611127-6-c.ebner@proxmox.com \
    --to=c.ebner@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal