From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [PATCH 1/2] client: pxar: inject held chunk on cache-range discontinuity
Date: Tue, 28 Apr 2026 03:28:09 +0200 [thread overview]
Message-ID: <20260428014635.2655155-2-t.lamprecht@proxmox.com> (raw)
In-Reply-To: <20260428014635.2655155-1-t.lamprecht@proxmox.com>
When the cached payload range cannot be extended because a non-reusable
file or directory end opens a hole, the previous flush may have held
back its last chunk for continuation. If the new range's first chunk
shares a digest with the held chunk via dedup, but the two correspond
to different physical positions in the previous archive, the next
flush's digest-match fast path collapses both into one without ever
injecting the held chunk's bytes. The encoder's payload_write_position
then trails the previous_payload_offset recorded during this flush,
and the next add_payload_ref produces a backwards PXAR_PAYLOAD_REF
offset, which the strict offset check introduced in pxar 1.0.1
rejects.
Inject any held chunk explicitly when a cache-range discontinuity is
detected, before update_range, so no held chunk can survive across a
hole. Healthy archives are unaffected: the held chunk only exists
when keep_last_chunk = true, and the inject path is the same one used
by every other branch that already advances the payload position.
The added test pins the cache contract this fix relies on:
update_range alone must not drop last_chunk, so the explicit
take_last_chunk + inject pair has to stay on the hole path.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
---
pbs-client/src/pxar/create.rs | 37 +++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index e42bcc87f..15e149a23 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -822,6 +822,14 @@ impl Archiver {
debug!("Cache range has hole, new range: {payload_range:?}");
self.flush_cached_reusing_if_below_threshold(encoder, true)
.await?;
+ // Inject any held-back chunk now: the new range is non-contiguous, so it
+ // cannot legitimately merge with the next range's first chunk. Without this,
+ // a digest collision via dedup would let the next flush skip the injection,
+ // leaving the encoder's payload position behind the recorded offset and
+ // writing a backwards PXAR_PAYLOAD_REF.
+ if let Some(held) = self.cache.take_last_chunk() {
+ self.inject_chunks_at_current_payload_position(encoder, vec![held])?;
+ }
// range has to be set after flushing of cached entries, which resets the range
self.cache.update_range(payload_range.clone());
}
@@ -2034,4 +2042,33 @@ mod tests {
Ok::<(), Error>(())
})
}
+
+ /// Pins the contract the cache-hole path in [`Archiver`] depends on: `update_range` must not
+ /// clear `last_chunk`. The hole-path must take and inject the held chunk before
+ /// `update_range`, otherwise it would carry across the discontinuity and the next flush's
+ /// digest-match fast-path could collapse two physically separate but content-identical
+ /// (via dedup) chunks, leaving the encoder's payload position behind the recorded offset.
+ #[test]
+ fn cache_update_range_preserves_held_chunk() {
+ let mut cache = super::PxarLookaheadCache::new(None);
+ let chunk = super::ReusableDynamicEntry {
+ size: 1024,
+ padding: 0,
+ digest: [0u8; 32],
+ };
+
+ cache.update_last_chunk(Some(chunk));
+ cache.update_range(100..200);
+ cache.update_range(500..600);
+
+ assert!(
+ cache.take_last_chunk().is_some(),
+ "update_range alone must preserve last_chunk -- if this fails, the explicit \
+ take_last_chunk in the cache-hole path of add_entry_to_archive_impl can be removed"
+ );
+ assert!(
+ cache.take_last_chunk().is_none(),
+ "take_last_chunk must consume the value"
+ );
+ }
}
--
2.47.3
next prev parent reply other threads:[~2026-04-28 1:47 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-28 1:28 applied: [PATCH 0/2]: client: pxar: make payload-offset handling more robust Thomas Lamprecht
2026-04-28 1:28 ` Thomas Lamprecht [this message]
2026-04-28 1:28 ` [PATCH 2/2] client: pxar: re-encode files with non-monotonic previous offsets Thomas Lamprecht
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260428014635.2655155-2-t.lamprecht@proxmox.com \
--to=t.lamprecht@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox