all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [PATCH 1/2] client: pxar: inject held chunk on cache-range discontinuity
Date: Tue, 28 Apr 2026 03:28:09 +0200	[thread overview]
Message-ID: <20260428014635.2655155-2-t.lamprecht@proxmox.com> (raw)
In-Reply-To: <20260428014635.2655155-1-t.lamprecht@proxmox.com>

When the cached payload range cannot be extended because a non-reusable
file or directory end opens a hole, the previous flush may have held
back its last chunk for continuation. If the new range's first chunk
shares a digest with the held chunk via dedup, but the two correspond
to different physical positions in the previous archive, the next
flush's digest-match fast path collapses both into one without ever
injecting the held chunk's bytes. The encoder's payload_write_position
then trails the previous_payload_offset recorded during this flush,
and the next add_payload_ref produces a backwards PXAR_PAYLOAD_REF
offset, which the strict offset check introduced in pxar 1.0.1
rejects.

Inject any held chunk explicitly when a cache-range discontinuity is
detected, before update_range, so no held chunk can survive across a
hole. Healthy archives are unaffected: the held chunk only exists
when keep_last_chunk = true, and the inject path is the same one used
by every other branch that already advances the payload position.

The added test pins the cache contract this fix relies on:
update_range alone must not drop last_chunk, so the explicit
take_last_chunk + inject pair has to stay on the hole path.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
 pbs-client/src/pxar/create.rs | 37 +++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index e42bcc87f..15e149a23 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -822,6 +822,14 @@ impl Archiver {
                     debug!("Cache range has hole, new range: {payload_range:?}");
                     self.flush_cached_reusing_if_below_threshold(encoder, true)
                         .await?;
+                    // Inject any held-back chunk now: the new range is non-contiguous, so it
+                    // cannot legitimately merge with the next range's first chunk. Without this,
+                    // a digest collision via dedup would let the next flush skip the injection,
+                    // leaving the encoder's payload position behind the recorded offset and
+                    // writing a backwards PXAR_PAYLOAD_REF.
+                    if let Some(held) = self.cache.take_last_chunk() {
+                        self.inject_chunks_at_current_payload_position(encoder, vec![held])?;
+                    }
                     // range has to be set after flushing of cached entries, which resets the range
                     self.cache.update_range(payload_range.clone());
                 }
@@ -2034,4 +2042,33 @@ mod tests {
             Ok::<(), Error>(())
         })
     }
+
+    /// Pins the contract the cache-hole path in [`Archiver`] depends on: `update_range` must not
+    /// clear `last_chunk`. The hole-path must take and inject the held chunk before
+    /// `update_range`, otherwise it would carry across the discontinuity and the next flush's
+    /// digest-match fast-path could collapse two physically separate but content-identical
+    /// (via dedup) chunks, leaving the encoder's payload position behind the recorded offset.
+    #[test]
+    fn cache_update_range_preserves_held_chunk() {
+        let mut cache = super::PxarLookaheadCache::new(None);
+        let chunk = super::ReusableDynamicEntry {
+            size: 1024,
+            padding: 0,
+            digest: [0u8; 32],
+        };
+
+        cache.update_last_chunk(Some(chunk));
+        cache.update_range(100..200);
+        cache.update_range(500..600);
+
+        assert!(
+            cache.take_last_chunk().is_some(),
+            "update_range alone must preserve last_chunk -- if this fails, the explicit \
+             take_last_chunk in the cache-hole path of add_entry_to_archive_impl can be removed"
+        );
+        assert!(
+            cache.take_last_chunk().is_none(),
+            "take_last_chunk must consume the value"
+        );
+    }
 }
-- 
2.47.3





  reply	other threads:[~2026-04-28  1:47 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-28  1:28 applied: [PATCH 0/2]: client: pxar: make payload-offset handling more robust Thomas Lamprecht
2026-04-28  1:28 ` Thomas Lamprecht [this message]
2026-04-28  1:28 ` [PATCH 2/2] client: pxar: re-encode files with non-monotonic previous offsets Thomas Lamprecht

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260428014635.2655155-2-t.lamprecht@proxmox.com \
    --to=t.lamprecht@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal