* applied: [PATCH 0/2]: client: pxar: make payload-offset handling more robust
@ 2026-04-28  1:28 Thomas Lamprecht
  2026-04-28  1:28 ` [PATCH 1/2] client: pxar: inject held chunk on cache-range discontinuity Thomas Lamprecht
  2026-04-28  1:28 ` [PATCH 2/2] client: pxar: re-encode files with non-monotonic previous offsets Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2026-04-28  1:28 UTC (permalink / raw)
  To: pbs-devel

The strict payload-offset check added in pxar 1.0.1 rejects any
PXAR_PAYLOAD_REF whose offset is not strictly greater than the
encoder's previously recorded one. A producer-side bug in pre-1.0.1
clients let archives be written that violate that invariant: when a
cache-range discontinuity opens between two flushes and the previous
flush held its last chunk back for continuation, dedup against the
next range's first chunk could collapse the held chunk away without
ever injecting its bytes. payload_write_position then trails the
previous_payload_offset recorded in the same flush, and the next
add_payload_ref produces a backwards offset.
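
The invariant the strict check enforces can be sketched as follows (a
minimal stand-alone sketch with hypothetical names, not the actual pxar
encoder API):

```rust
/// Sketch of the strict payload-offset invariant enforced since pxar
/// 1.0.1: each new PXAR_PAYLOAD_REF offset must be strictly greater
/// than the previously recorded one.
struct OffsetCheck {
    previous_payload_offset: Option<u64>,
}

impl OffsetCheck {
    fn new() -> Self {
        Self { previous_payload_offset: None }
    }

    /// Accepts a new payload offset only if it is strictly monotonic.
    fn add_payload_ref(&mut self, offset: u64) -> Result<(), String> {
        if let Some(prev) = self.previous_payload_offset {
            if offset <= prev {
                return Err(format!("backwards payload offset: {offset} <= {prev}"));
            }
        }
        self.previous_payload_offset = Some(offset);
        Ok(())
    }
}

fn main() {
    let mut check = OffsetCheck::new();
    assert!(check.add_payload_ref(100).is_ok());
    assert!(check.add_payload_ref(200).is_ok());
    // a collapsed held chunk leaves the write position behind the
    // recorded offset, so the next ref goes backwards and is rejected
    assert!(check.add_payload_ref(150).is_err());
}
```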

With such an archive on disk, every later metadata-mode incremental
backup aborts on the strict check until the chain is cycled out via
--change-detection-mode=data.

Patch 1 fixes the producer side: a held chunk is now injected explicitly
when a cache-range discontinuity is detected, before update_range, so
no held chunk can survive across a hole.

Patch 2 lets existing affected archives "self-heal" on next backup: the
Archiver tracks the highest previously-recorded payload offset globally
and treats prev-archive entries with a smaller-or-equal offset as
non-reusable, so they are re-encoded. A per-file warning makes the
fall-back visible. Tracking is intentionally global so inversions that
span cache ranges are caught too.
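
The global tracking boils down to the following helper (a minimal
sketch mirroring patch 2's try_record_strictly_greater):

```rust
/// Accept a prev-archive payload offset as reusable only if it is
/// strictly greater than every offset accepted before. Tracking is
/// global (never reset per cache range), so inversions spanning
/// ranges are caught as well.
fn try_record_strictly_greater(last: &mut Option<u64>, offset: u64) -> bool {
    if let Some(prev) = *last {
        if offset <= prev {
            // duplicate or backwards offset: caller re-encodes the file
            return false;
        }
    }
    *last = Some(offset);
    true
}

fn main() {
    let mut last = None;
    assert!(try_record_strictly_greater(&mut last, 100));
    // duplicate offset: not reusable, state unchanged
    assert!(!try_record_strictly_greater(&mut last, 100));
    // inversion across a cache flush is still caught
    assert!(!try_record_strictly_greater(&mut last, 50));
    assert!(try_record_strictly_greater(&mut last, 200));
}
```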

Healthy archives written on or after pxar 1.0.1 cannot trigger the
fall-back since the encoder already enforces strict monotonicity.

tl;dr: avoids creating new backups that violate the invariant, and does
its best to let already affected setups recover with their next (new)
backup, which should address some reports in the forum [0].

[0]: https://forum.proxmox.com/threads/183043/

Thomas Lamprecht (2):
  client: pxar: inject held chunk on cache-range discontinuity
  client: pxar: re-encode files with non-monotonic previous offsets

 pbs-client/src/pxar/create.rs | 123 ++++++++++++++++++++++++++++++++++
 1 file changed, 123 insertions(+)

-- 
2.47.3


* [PATCH 1/2] client: pxar: inject held chunk on cache-range discontinuity
  2026-04-28  1:28 applied: [PATCH 0/2]: client: pxar: make payload-offset handling more robust Thomas Lamprecht
@ 2026-04-28  1:28 ` Thomas Lamprecht
  2026-04-28  1:28 ` [PATCH 2/2] client: pxar: re-encode files with non-monotonic previous offsets Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2026-04-28  1:28 UTC (permalink / raw)
  To: pbs-devel

When the cached payload range cannot be extended because a non-reusable
file or directory end opens a hole, the previous flush may have held
back its last chunk for continuation. If the new range's first chunk
shares a digest with the held chunk via dedup, but the two correspond
to different physical positions in the previous archive, the next
flush's digest-match fast path collapses both into one without ever
injecting the held chunk's bytes. The encoder's payload_write_position
then trails the previous_payload_offset recorded during this flush,
and the next add_payload_ref produces a backwards PXAR_PAYLOAD_REF
offset, which the strict offset check introduced in pxar 1.0.1
rejects.

Inject any held chunk explicitly when a cache-range discontinuity is
detected, before update_range, so no held chunk can survive across a
hole. Healthy archives are unaffected: the held chunk only exists
when keep_last_chunk = true, and the inject path is the same one used
by every other branch that already advances the payload position.

The added test pins the cache contract this fix relies on:
update_range alone must not drop last_chunk, so the explicit
take_last_chunk + inject pair has to stay on the hole path.
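
The required ordering on the hole path can be sketched with simplified
stand-ins (not the real PxarLookaheadCache or Archiver types from
create.rs):

```rust
use std::ops::Range;

/// Simplified stand-in for the lookahead cache: update_range
/// deliberately does not touch last_chunk, so the hole path must
/// consume it explicitly.
struct Cache {
    range: Range<u64>,
    last_chunk: Option<[u8; 32]>,
}

impl Cache {
    fn take_last_chunk(&mut self) -> Option<[u8; 32]> {
        self.last_chunk.take()
    }
    fn update_range(&mut self, range: Range<u64>) {
        self.range = range;
    }
}

/// Hole path: inject any held chunk *before* switching to the new,
/// non-contiguous range, so it can no longer merge with the next
/// range's first chunk via a dedup digest match.
fn on_cache_hole(cache: &mut Cache, new_range: Range<u64>, injected: &mut Vec<[u8; 32]>) {
    if let Some(held) = cache.take_last_chunk() {
        // stand-in for inject_chunks_at_current_payload_position
        injected.push(held);
    }
    cache.update_range(new_range);
}

fn main() {
    let mut cache = Cache { range: 100..200, last_chunk: Some([0u8; 32]) };
    let mut injected = Vec::new();
    on_cache_hole(&mut cache, 500..600, &mut injected);
    // the held chunk was injected and cannot survive across the hole
    assert_eq!(injected.len(), 1);
    assert!(cache.take_last_chunk().is_none());
    assert_eq!(cache.range, 500..600);
}
```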

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
 pbs-client/src/pxar/create.rs | 37 +++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index e42bcc87f..15e149a23 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -822,6 +822,14 @@ impl Archiver {
                     debug!("Cache range has hole, new range: {payload_range:?}");
                     self.flush_cached_reusing_if_below_threshold(encoder, true)
                         .await?;
+                    // Inject any held-back chunk now: the new range is non-contiguous, so it
+                    // cannot legitimately merge with the next range's first chunk. Without this,
+                    // a digest collision via dedup would let the next flush skip the injection,
+                    // leaving the encoder's payload position behind the recorded offset and
+                    // writing a backwards PXAR_PAYLOAD_REF.
+                    if let Some(held) = self.cache.take_last_chunk() {
+                        self.inject_chunks_at_current_payload_position(encoder, vec![held])?;
+                    }
                     // range has to be set after flushing of cached entries, which resets the range
                     self.cache.update_range(payload_range.clone());
                 }
@@ -2034,4 +2042,33 @@ mod tests {
             Ok::<(), Error>(())
         })
     }
+
+    /// Pins the contract the cache-hole path in [`Archiver`] depends on: `update_range` must not
+    /// clear `last_chunk`. The hole-path must take and inject the held chunk before
+    /// `update_range`, otherwise it would carry across the discontinuity and the next flush's
+    /// digest-match fast-path could collapse two physically separate but content-identical
+    /// (via dedup) chunks, leaving the encoder's payload position behind the recorded offset.
+    #[test]
+    fn cache_update_range_preserves_held_chunk() {
+        let mut cache = super::PxarLookaheadCache::new(None);
+        let chunk = super::ReusableDynamicEntry {
+            size: 1024,
+            padding: 0,
+            digest: [0u8; 32],
+        };
+
+        cache.update_last_chunk(Some(chunk));
+        cache.update_range(100..200);
+        cache.update_range(500..600);
+
+        assert!(
+            cache.take_last_chunk().is_some(),
+            "update_range alone must preserve last_chunk -- if this fails, the explicit \
+             take_last_chunk in the cache-hole path of add_entry_to_archive_impl can be removed"
+        );
+        assert!(
+            cache.take_last_chunk().is_none(),
+            "take_last_chunk must consume the value"
+        );
+    }
 }
-- 
2.47.3


* [PATCH 2/2] client: pxar: re-encode files with non-monotonic previous offsets
  2026-04-28  1:28 applied: [PATCH 0/2]: client: pxar: make payload-offset handling more robust Thomas Lamprecht
  2026-04-28  1:28 ` [PATCH 1/2] client: pxar: inject held chunk on cache-range discontinuity Thomas Lamprecht
@ 2026-04-28  1:28 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2026-04-28  1:28 UTC (permalink / raw)
  To: pbs-devel

The strict payload-offset check added in pxar 1.0.1 rejects any
payload_offset not strictly greater than the encoder's previously
recorded one. Older client versions could write metadata archives
with duplicate or non-monotonic PXAR_PAYLOAD_REF offsets via the
producer-side bug fixed in the previous commit. Without the encode-
time guard such an archive looked fine on disk, but every later
metadata-mode incremental backup now aborts on the strict check until
the chain is cycled out via --change-detection-mode=data.

Track the highest previously-recorded payload offset globally on the
Archiver and treat any prev-archive entry with a smaller-or-equal
offset as non-reusable, so it is re-encoded instead. The chain
self-heals on the next run, and a single warning per affected file
is emitted to make the fall-back visible. Tracking is intentionally
global, not per cache range, so inversions that span ranges (because
the next backup's cache flushes happen between the two inverted
offsets) are caught too.

Healthy archives written on or after pxar 1.0.1 cannot trigger this
fall-back since the encoder already enforces strict monotonicity at
write time.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
 pbs-client/src/pxar/create.rs | 86 +++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 15e149a23..70b858092 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -188,6 +188,11 @@ struct Archiver {
     suggested_boundaries: Option<mpsc::Sender<u64>>,
     previous_payload_index: Option<DynamicIndexReader>,
     cache: PxarLookaheadCache,
+    // Highest payload offset accepted as reusable so far. Mirrors the encoder's accumulated
+    // strict-monotonic-offset invariant on the read side, so that a previous archive written by
+    // an older client without the encode-time check is detected here and the affected file is
+    // re-encoded instead of reused.
+    last_reusable_offset: Option<u64>,
     reuse_stats: ReuseStats,
     split_archive: bool,
 }
@@ -298,6 +303,7 @@ where
         suggested_boundaries,
         previous_payload_index,
         cache: PxarLookaheadCache::new(options.max_cache_size),
+        last_reusable_offset: None,
         reuse_stats: ReuseStats::default(),
         split_archive,
     };
@@ -418,6 +424,16 @@ impl Archiver {
         .boxed()
     }
 
+    /// Record an accepted reusable payload offset from the previous metadata archive.
+    ///
+    /// Returns `false` if `offset` is not strictly greater than the last accepted offset, which
+    /// signals a duplicate or non-monotonic previous archive. The caller must then treat the file
+    /// as non-reusable so the encoder's strict offset check cannot reject the eventual
+    /// `add_payload_ref`. Tracking is global to catch inversions that span cache ranges.
+    fn try_record_reusable_offset(&mut self, offset: u64) -> bool {
+        try_record_strictly_greater(&mut self.last_reusable_offset, offset)
+    }
+
     async fn is_reusable_entry(
         &mut self,
         previous_metadata_accessor: &Option<Directory<MetadataArchiveReader>>,
@@ -437,6 +453,19 @@ impl Archiver {
                         if file_size != *size {
                             return Ok(None);
                         }
+                        // a previous archive written by an older client version may contain
+                        // duplicate or non-monotonic PXAR_PAYLOAD_REF offsets for distinct files.
+                        // Re-encoding the affected file lets the chain self-heal and avoids the
+                        // pxar strict offset check rejecting the whole backup.
+                        if !self.try_record_reusable_offset(*offset) {
+                            let last = self.last_reusable_offset.unwrap_or(0);
+                            warn!(
+                                "re-encode: {file_name:?} previous archive payload offset \
+                                 {offset} not strictly greater than last seen {last}, \
+                                 treating as non-reusable"
+                            );
+                            return Ok(None);
+                        }
                         let range =
                             *offset..*offset + size + size_of::<pxar::format::Header>() as u64;
                         debug!(
@@ -1378,6 +1407,19 @@ impl ReusableDynamicEntry {
     }
 }
 
+/// Updates `last` to `offset` and returns `true` if `offset` is strictly greater than the current
+/// value (or `last` is `None`). Returns `false` without modifying `last` for duplicate or
+/// backwards offsets.
+fn try_record_strictly_greater(last: &mut Option<u64>, offset: u64) -> bool {
+    if let Some(prev) = *last {
+        if offset <= prev {
+            return false;
+        }
+    }
+    *last = Some(offset);
+    true
+}
+
 /// List of dynamic entries containing the data given by an offset range
 fn lookup_dynamic_entries(
     index: &DynamicIndexReader,
@@ -2009,6 +2051,7 @@ mod tests {
                 previous_payload_index,
                 suggested_boundaries: Some(suggested_boundaries),
                 cache: PxarLookaheadCache::new(None),
+                last_reusable_offset: None,
                 reuse_stats: ReuseStats::default(),
                 split_archive: true,
             };
@@ -2071,4 +2114,47 @@ mod tests {
             "take_last_chunk must consume the value"
         );
     }
+
+    /// Strictly-greater bookkeeping for [`Archiver::try_record_reusable_offset`]; mirrors the
+    /// encoder's accumulated strict-monotonic-offset invariant on the read side.
+    #[test]
+    fn try_record_strictly_greater_accepts_increasing() {
+        let mut last = None;
+        assert!(super::try_record_strictly_greater(&mut last, 100));
+        assert_eq!(last, Some(100));
+        assert!(super::try_record_strictly_greater(&mut last, 200));
+        assert_eq!(last, Some(200));
+    }
+
+    #[test]
+    fn try_record_strictly_greater_rejects_equal_and_backwards() {
+        let mut last = Some(200);
+        // duplicate offset: previous archive recorded two distinct files at the same offset
+        assert!(!super::try_record_strictly_greater(&mut last, 200));
+        assert_eq!(last, Some(200), "rejected offset must not update state");
+        // backwards offset: previous archive recorded the next file at a lower offset
+        assert!(!super::try_record_strictly_greater(&mut last, 150));
+        assert_eq!(last, Some(200));
+    }
+
+    #[test]
+    fn try_record_strictly_greater_first_call_accepts_any_value() {
+        let mut last = None;
+        assert!(super::try_record_strictly_greater(&mut last, 0));
+        assert_eq!(last, Some(0));
+    }
+
+    /// Inversions that span cache ranges must be caught by the global tracking; a per-range
+    /// implementation would have reset on flush and accepted the second offset, after which
+    /// the encoder strict check would still abort the backup.
+    #[test]
+    fn try_record_strictly_greater_persists_across_resets() {
+        let mut last = None;
+        assert!(super::try_record_strictly_greater(&mut last, 1000));
+        // a per-range implementation would have cleared `last` here on cache flush
+        assert!(!super::try_record_strictly_greater(&mut last, 500));
+        assert_eq!(last, Some(1000));
+        assert!(super::try_record_strictly_greater(&mut last, 1500));
+        assert_eq!(last, Some(1500));
+    }
 }
-- 
2.47.3

