all lists on lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection
@ 2025-12-11 15:38 Christian Ebner
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct Christian Ebner
                   ` (8 more replies)
  0 siblings, 9 replies; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

This patch series contains some followup to the recently introduced
changes to phase 2 of garbage collection on s3 backed datastores [0]
and clenups for marker files during phase 3 [1], as well as some
refactoring of the chunk filename extension parsing and bad chunk
filename generation.

The first patch introduces a dedicated struct for tracking objects
to be deleted from the s3 object store during phase 2, bundling the
length and age based deletion threshold logic to the list state.

Subsequent patches rename and adapt the chunk store iterator to
optimize for regular chunk filenames and clearly distinguish chunks
by directory entry filename extension.

Further, chunk extension parsing and bad chunk filename generation are
moved to be associated functions of the newly introduced `ChunkExt`
enum.

Changes since version 1 (thanks @Fabian for comments):
- Rebased onto current master
- Followup patches for refactoring extension parsing and bad chunk
  filename generation.

[0] https://lore.proxmox.com/pbs-devel/2b637884-f6bd-434d-be00-fa4e9d9e2dcb@proxmox.com/T/
[1] https://lore.proxmox.com/pbs-devel/1764145262.ycdoq9dzrx.astroid@yuna.none/T/

proxmox-backup:

Christian Ebner (8):
  GC: Move S3 delete list state and logic to a dedicated struct
  chunk store: rename and limit scope for chunk store iterator
  chunk store: invert chunk filename checks in chunk store iterator
  chunk store: return chunk extension and check for used marker
  chunk store: refactor chunk extension parsing into dedicated helper
  datastore: move bad chunk touching logic to chunk store
  chunk store: move next bad chunk path generator into dedicated helper
  chunk store: move bad chunk filename generation into dedicated helper

 pbs-datastore/src/chunk_store.rs | 100 ++++++++++++++++----
 pbs-datastore/src/datastore.rs   | 154 +++++++++++++++----------------
 2 files changed, 161 insertions(+), 93 deletions(-)


Summary over all repositories:
  2 files changed, 161 insertions(+), 93 deletions(-)

-- 
Generated by git-murpp 0.8.1


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2026-01-13 10:23   ` Fabian Grünbichler
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 2/8] chunk store: rename and limit scope for chunk store iterator Christian Ebner
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

To better keep track of the state and declutter the code on the
callsites, bundle the S3 delete list and it's logic by a dedicated
struct. Since the check for empty lists is now performed as part
of the deletion related methods, the callsites can get rid of that.

Further, avoid the proxmox_time::epoch_i64() and use SystemTime and
Duration with their methods directly.

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 132 +++++++++++++++++++--------------
 1 file changed, 75 insertions(+), 57 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 9c57aaac1..58fd034fc 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -24,7 +24,7 @@ use proxmox_sys::error::SysError;
 use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
 use proxmox_sys::linux::procfs::MountInfo;
 use proxmox_sys::process_locker::{ProcessLockExclusiveGuard, ProcessLockSharedGuard};
-use proxmox_time::{epoch_i64, TimeSpan};
+use proxmox_time::TimeSpan;
 use proxmox_worker_task::WorkerTaskContext;
 
 use pbs_api_types::{
@@ -64,7 +64,7 @@ const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
 // s3 deletion batch size to avoid 1024 open files soft limit
 const S3_DELETE_BATCH_LIMIT: usize = 100;
 // max defer time for s3 batch deletions
-const S3_DELETE_DEFER_LIMIT_SECONDS: i64 = 60 * 5;
+const S3_DELETE_DEFER_LIMIT_SECONDS: Duration = Duration::from_secs(60 * 5);
 
 /// checks if auth_id is owner, or, if owner is a token, if
 /// auth_id is the user of the token
@@ -1689,40 +1689,8 @@ impl DataStore {
                 proxmox_async::runtime::block_on(s3_client.list_objects_v2(&prefix, None))
                     .context("failed to list chunk in s3 object store")?;
 
-            let mut delete_list = Vec::with_capacity(S3_DELETE_BATCH_LIMIT);
-            let mut delete_list_age = epoch_i64();
-
-            let s3_delete_batch = |delete_list: &mut Vec<(S3ObjectKey, BackupLockGuard)>,
-                                   s3_client: &Arc<S3Client>|
-             -> Result<(), Error> {
-                let delete_objects_result = proxmox_async::runtime::block_on(
-                    s3_client.delete_objects(
-                        &delete_list
-                            .iter()
-                            .map(|(key, _)| key.clone())
-                            .collect::<Vec<S3ObjectKey>>(),
-                    ),
-                )?;
-                if let Some(_err) = delete_objects_result.error {
-                    bail!("failed to delete some objects");
-                }
-                // drops all chunk flock guards
-                delete_list.clear();
-                Ok(())
-            };
-
-            let add_to_delete_list =
-                |delete_list: &mut Vec<(S3ObjectKey, BackupLockGuard)>,
-                 delete_list_age: &mut i64,
-                 key: S3ObjectKey,
-                 _chunk_guard: BackupLockGuard| {
-                    // set age based on first insertion
-                    if delete_list.is_empty() {
-                        *delete_list_age = epoch_i64();
-                    }
-                    delete_list.push((key, _chunk_guard));
-                };
-
+            let mut delete_list =
+                S3DeleteList::with_thresholds(S3_DELETE_BATCH_LIMIT, S3_DELETE_DEFER_LIMIT_SECONDS);
             loop {
                 for content in list_bucket_result.contents {
                     worker.check_abort()?;
@@ -1779,12 +1747,7 @@ impl DataStore {
                                             std::fs::remove_file(chunk_path)?;
                                         }
                                     }
-                                    add_to_delete_list(
-                                        &mut delete_list,
-                                        &mut delete_list_age,
-                                        content.key,
-                                        _chunk_guard,
-                                    );
+                                    delete_list.push(content.key, _chunk_guard);
                                     Ok(())
                                 },
                             )?;
@@ -1792,12 +1755,7 @@ impl DataStore {
                     } else {
                         gc_status.removed_chunks += 1;
                         gc_status.removed_bytes += content.size;
-                        add_to_delete_list(
-                            &mut delete_list,
-                            &mut delete_list_age,
-                            content.key,
-                            _chunk_guard,
-                        );
+                        delete_list.push(content.key, _chunk_guard);
                     }
 
                     chunk_count += 1;
@@ -1806,12 +1764,7 @@ impl DataStore {
                     drop(_guard);
 
                     // limit pending deletes to avoid holding too many chunk flocks
-                    if delete_list.len() >= S3_DELETE_BATCH_LIMIT
-                        || (!delete_list.is_empty()
-                            && epoch_i64() - delete_list_age > S3_DELETE_DEFER_LIMIT_SECONDS)
-                    {
-                        s3_delete_batch(&mut delete_list, s3_client)?;
-                    }
+                    delete_list.conditional_delete_and_drop_locks(s3_client)?;
                 }
                 // Process next batch of chunks if there is more
                 if list_bucket_result.is_truncated {
@@ -1827,9 +1780,7 @@ impl DataStore {
             }
 
             // delete the last batch of objects, if there are any remaining
-            if !delete_list.is_empty() {
-                s3_delete_batch(&mut delete_list, s3_client)?;
-            }
+            delete_list.delete_and_drop_locks(s3_client)?;
 
             info!("processed {chunk_count} total chunks");
 
@@ -2768,3 +2719,70 @@ impl DataStore {
         result
     }
 }
+
+/// Track S3 object keys to be deleted by garbage collection while holding their file lock.
+struct S3DeleteList {
+    list: Vec<(S3ObjectKey, BackupLockGuard)>,
+    first_entry_added: SystemTime,
+    age_threshold: Duration,
+    capacity_threshold: usize,
+}
+
+impl S3DeleteList {
+    /// Create a new list instance with given capacity and age thresholds.
+    fn with_thresholds(capacity_threshold: usize, age_threshold: Duration) -> Self {
+        Self {
+            first_entry_added: SystemTime::now(), // init only, updated once added
+            list: Vec::with_capacity(capacity_threshold),
+            age_threshold,
+            capacity_threshold,
+        }
+    }
+
+    /// Pushes the current key and backup lock guard to the list, updating the delete list age if
+    /// the list was empty before the insert.
+    fn push(&mut self, key: S3ObjectKey, guard: BackupLockGuard) {
+        // set age based on first insertion
+        if self.list.is_empty() {
+            self.first_entry_added = SystemTime::now();
+        }
+        self.list.push((key, guard));
+    }
+
+    /// Delete the objects in the list via the provided S3 client instance.
+    /// Clears the list contents and frees the per-chunk file locks.
+    fn delete_and_drop_locks(&mut self, s3_client: &Arc<S3Client>) -> Result<(), Error> {
+        if self.list.is_empty() {
+            return Ok(());
+        }
+        let delete_objects_result = proxmox_async::runtime::block_on(
+            s3_client.delete_objects(
+                &self
+                    .list
+                    .iter()
+                    .map(|(key, _)| key.clone())
+                    .collect::<Vec<S3ObjectKey>>(),
+            ),
+        )?;
+        if delete_objects_result.error.is_some() {
+            bail!("failed to delete some objects");
+        }
+        // drops all chunk flock guards
+        self.list.clear();
+        Ok(())
+    }
+
+    /// Delete the object stored in the list if either the list exceeds the capacity threshold or
+    /// the delete list age threshold.
+    fn conditional_delete_and_drop_locks(
+        &mut self,
+        s3_client: &Arc<S3Client>,
+    ) -> Result<(), Error> {
+        if self.list.len() >= self.capacity_threshold
+            || (!self.list.is_empty() && self.first_entry_added.elapsed()? > self.age_threshold)
+        {
+            self.delete_and_drop_locks(s3_client)?;
+        }
+        Ok(())
+    }
+}
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 2/8] chunk store: rename and limit scope for chunk store iterator
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in " Christian Ebner
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Since the introduction of marker files for datastores backed by s3
object stores, the iterator can also include the <digest>.using file
entries and chunk marker files which have 0 size. Also, the returned
entries for regular datastores are not guaranteed be of type file, as
that is only checked by stating the entry afterwards, so there is no
guarantee to only get chunks by the iterator.

Therefore, rename the method to get_chunk_store_iterator(), which is
more generic and does not imply the returned entries are chunks.

While at it, also limit the scope for the method as this is not used
outside the module.

No functional changes intended.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 7fe09b914..a5e5f6261 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -276,7 +276,7 @@ impl ChunkStore {
         Ok(true)
     }
 
-    pub fn get_chunk_iterator(
+    fn get_chunk_store_iterator(
         &self,
     ) -> Result<
         impl std::iter::FusedIterator<
@@ -397,7 +397,7 @@ impl ChunkStore {
         let mut last_percentage = 0;
         let mut chunk_count = 0;
 
-        for (entry, percentage, bad) in self.get_chunk_iterator()? {
+        for (entry, percentage, bad) in self.get_chunk_store_iterator()? {
             if last_percentage != percentage {
                 last_percentage = percentage;
                 info!("processed {percentage}% ({chunk_count} chunks)");
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in chunk store iterator
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct Christian Ebner
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 2/8] chunk store: rename and limit scope for chunk store iterator Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2026-01-13 10:23   ` Fabian Grünbichler
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker Christian Ebner
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Optimizes the chunk filename check towards regular chunk files by
explicitley checking for the correct length.

While the check for ascii hexdigits needs to be stated twice, this
avoids to check for the `.bad` extension if the chunk filename did
already match the expected length.

This will also help to better distinguish bad chunks and chunks
used markers for s3 datastores in subsequent changes.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index a5e5f6261..7980938ad 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -315,15 +315,20 @@ impl ChunkStore {
                         Some(Ok(entry)) => {
                             // skip files if they're not a hash
                             let bytes = entry.file_name().to_bytes();
-                            if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
-                                continue;
+
+                            if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
+                            {
+                                return Some((Ok(entry), percentage, false));
                             }
-                            if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
-                                continue;
+
+                            if bytes.len() == 64 + ".0.bad".len()
+                                && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
+                            {
+                                let bad = bytes.ends_with(b".bad");
+                                return Some((Ok(entry), percentage, bad));
                             }
 
-                            let bad = bytes.ends_with(b".bad");
-                            return Some((Ok(entry), percentage, bad));
+                            continue;
                         }
                         Some(Err(err)) => {
                             // stop after first error
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
                   ` (2 preceding siblings ...)
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in " Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2026-01-13 10:24   ` Fabian Grünbichler
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper Christian Ebner
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Clearly distinguish the cases for `bad` and `using` extensions for
items returned by the chunk iterator. Directory entries with
filenames which matched the expected length but not the extension
are now skipped over.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 48 ++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 7980938ad..ad517391d 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -280,7 +280,11 @@ impl ChunkStore {
         &self,
     ) -> Result<
         impl std::iter::FusedIterator<
-            Item = (Result<proxmox_sys::fs::ReadDirEntry, Error>, usize, bool),
+            Item = (
+                Result<proxmox_sys::fs::ReadDirEntry, Error>,
+                usize,
+                ChunkExt,
+            ),
         >,
         Error,
     > {
@@ -318,14 +322,20 @@ impl ChunkStore {
 
                             if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
                             {
-                                return Some((Ok(entry), percentage, false));
+                                return Some((Ok(entry), percentage, ChunkExt::None));
                             }
 
                             if bytes.len() == 64 + ".0.bad".len()
                                 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
                             {
-                                let bad = bytes.ends_with(b".bad");
-                                return Some((Ok(entry), percentage, bad));
+                                let chunk_ext = if bytes.ends_with(b".bad") {
+                                    ChunkExt::Bad
+                                } else if bytes.ends_with(USING_MARKER_FILENAME_EXT.as_bytes()) {
+                                    ChunkExt::UsedMarker
+                                } else {
+                                    continue;
+                                };
+                                return Some((Ok(entry), percentage, chunk_ext));
                             }
 
                             continue;
@@ -334,7 +344,7 @@ impl ChunkStore {
                             // stop after first error
                             done = true;
                             // and pass the error through:
-                            return Some((Err(err), percentage, false));
+                            return Some((Err(err), percentage, ChunkExt::None));
                         }
                         None => (), // open next directory
                     }
@@ -367,7 +377,7 @@ impl ChunkStore {
                         return Some((
                             Err(format_err!("unable to read subdir '{subdir}' - {err}")),
                             percentage,
-                            false,
+                            ChunkExt::None,
                         ));
                     }
                 }
@@ -402,7 +412,8 @@ impl ChunkStore {
         let mut last_percentage = 0;
         let mut chunk_count = 0;
 
-        for (entry, percentage, bad) in self.get_chunk_store_iterator()? {
+        for (entry, percentage, chunk_ext) in self.get_chunk_store_iterator()? {
+            let bad = chunk_ext.is_bad();
             if last_percentage != percentage {
                 last_percentage = percentage;
                 info!("processed {percentage}% ({chunk_count} chunks)");
@@ -433,10 +444,7 @@ impl ChunkStore {
                     drop(lock);
                     continue;
                 }
-                if filename
-                    .to_bytes()
-                    .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
-                {
+                if chunk_ext.is_used_marker() {
                     unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
                         format_err!("unlinking chunk using marker {filename:?} failed - {err}")
                     })?;
@@ -908,6 +916,24 @@ impl ChunkStore {
     }
 }
 
+#[derive(PartialEq)]
+/// Chunk iterator directory entry filename extension
+enum ChunkExt {
+    None,
+    Bad,
+    UsedMarker,
+}
+
+impl ChunkExt {
+    fn is_bad(&self) -> bool {
+        *self == Self::Bad
+    }
+
+    fn is_used_marker(&self) -> bool {
+        *self == Self::UsedMarker
+    }
+}
+
 #[test]
 fn test_chunk_store1() {
     let mut path = std::fs::canonicalize(".").unwrap(); // we need absolute path
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
                   ` (3 preceding siblings ...)
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2026-01-13 10:24   ` Fabian Grünbichler
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store Christian Ebner
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Move the chunk filename extension parsing into a dedicated helper.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 36 ++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index ad517391d..e05e67dfe 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -320,22 +320,8 @@ impl ChunkStore {
                             // skip files if they're not a hash
                             let bytes = entry.file_name().to_bytes();
 
-                            if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
-                            {
-                                return Some((Ok(entry), percentage, ChunkExt::None));
-                            }
-
-                            if bytes.len() == 64 + ".0.bad".len()
-                                && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
-                            {
-                                let chunk_ext = if bytes.ends_with(b".bad") {
-                                    ChunkExt::Bad
-                                } else if bytes.ends_with(USING_MARKER_FILENAME_EXT.as_bytes()) {
-                                    ChunkExt::UsedMarker
-                                } else {
-                                    continue;
-                                };
-                                return Some((Ok(entry), percentage, chunk_ext));
+                            if let Some(ext) = ChunkExt::from_raw_filename(bytes) {
+                                return Some((Ok(entry), percentage, ext));
                             }
 
                             continue;
@@ -932,6 +918,24 @@ impl ChunkExt {
     fn is_used_marker(&self) -> bool {
         *self == Self::UsedMarker
     }
+
+    fn from_raw_filename(filename: &[u8]) -> Option<Self> {
+        if filename.len() == 64 && filename.iter().take(64).all(u8::is_ascii_hexdigit)
+        {
+            return Some(Self::None);
+        }
+
+        if filename.len() == 64 + ".0.bad".len()
+            && filename.iter().take(64).all(u8::is_ascii_hexdigit)
+        {
+            if filename.ends_with(b".bad") {
+                return Some(Self::Bad)
+            } else if filename.ends_with(USING_MARKER_FILENAME_EXT.as_bytes()) {
+                return Some(Self::UsedMarker)
+            }
+        }
+        None
+    }
 }
 
 #[test]
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
                   ` (4 preceding siblings ...)
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2026-01-13 10:24   ` Fabian Grünbichler
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 7/8] chunk store: move next bad chunk path generator into dedicated helper Christian Ebner
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Provide an dedicated method and encapsulate the bad chunk filename
generation and access time updates. This is an implementation detail
of the chunks store, the garbage collection should not be concerned
about this.

In preparation for making the bad filename generation a dedicated
helper as well, all contained within the chunk store module.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 15 +++++++++++++++
 pbs-datastore/src/datastore.rs   | 11 +----------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index e05e67dfe..2ad90757a 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -276,6 +276,21 @@ impl ChunkStore {
         Ok(true)
     }
 
+    /// Update access timestamp on all bad chunks for given digest
+    pub(super) fn cond_touch_bad_chunks(&self, digest: &[u8; 32]) -> Result<bool, Error> {
+        let (chunk_path, _digest_str) = self.chunk_path(digest);
+        let mut is_bad = false;
+        for i in 0..=9 {
+            let bad_ext = format!("{i}.bad");
+            let mut bad_path = chunk_path.clone();
+            bad_path.set_extension(bad_ext);
+            if self.cond_touch_path(&bad_path, false)? {
+                is_bad = true;
+            }
+        }
+        Ok(is_bad)
+    }
+
     fn get_chunk_store_iterator(
         &self,
     ) -> Result<
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 58fd034fc..2bab20294 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1355,19 +1355,10 @@ impl DataStore {
             }
 
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                let (chunk_path, _digest_str) = self.chunk_path(digest);
                 // touch any corresponding .bad files to keep them around, meaning if a chunk is
                 // rewritten correctly they will be removed automatically, as well as if no index
                 // file requires the chunk anymore (won't get to this loop then)
-                let mut is_bad = false;
-                for i in 0..=9 {
-                    let bad_ext = format!("{i}.bad");
-                    let mut bad_path = chunk_path.clone();
-                    bad_path.set_extension(bad_ext);
-                    if self.inner.chunk_store.cond_touch_path(&bad_path, false)? {
-                        is_bad = true;
-                    }
-                }
+                let is_bad = self.inner.chunk_store.cond_touch_bad_chunks(digest)?;
 
                 if let Some(ref _s3_client) = s3_client {
                     // Do not retry here, this is very unlikely to happen as chunk markers will
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 7/8] chunk store: move next bad chunk path generator into dedicated helper
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
                   ` (5 preceding siblings ...)
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 8/8] chunk store: move bad chunk filename generation " Christian Ebner
  2026-01-13 10:24 ` [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Fabian Grünbichler
  8 siblings, 0 replies; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Better encapsulate the logic as this concerns the chunk store and
perpare for placing the logic on bad chunk filename generation into
a single place.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 16 ++++++++++++++++
 pbs-datastore/src/datastore.rs   | 11 +----------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 2ad90757a..f5585d2cb 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -915,6 +915,22 @@ impl ChunkStore {
         })?;
         Ok(guard)
     }
+
+    /// Generate the next bad chunk file path for given digest. Returns the path as well as the bad
+    /// chunk counter.
+    pub(crate) fn next_bad_chunk_path(&self, digest: &[u8; 32]) -> (PathBuf, usize) {
+        let (mut chunk_path, digest_str) = self.chunk_path(digest);
+        let mut counter = 0;
+        loop {
+            chunk_path.set_file_name(format!("{digest_str}.{counter}.bad"));
+            if chunk_path.exists() && counter < 9 {
+                counter += 1;
+            } else {
+                break;
+            }
+        }
+        (chunk_path, counter)
+    }
 }
 
 #[derive(PartialEq)]
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 2bab20294..f24a3a056 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -2652,16 +2652,7 @@ impl DataStore {
         }
         let _lock = self.inner.chunk_store.mutex().lock().unwrap();
 
-        let mut counter = 0;
-        let mut new_path = path.clone();
-        loop {
-            new_path.set_file_name(format!("{digest_str}.{counter}.bad"));
-            if new_path.exists() && counter < 9 {
-                counter += 1;
-            } else {
-                break;
-            }
-        }
+        let (new_path, counter) = self.inner.chunk_store.next_bad_chunk_path(digest);
 
         let result = match std::fs::rename(&path, &new_path) {
             Ok(_) => Ok(Some(new_path)),
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v2 8/8] chunk store: move bad chunk filename generation into dedicated helper
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
                   ` (6 preceding siblings ...)
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 7/8] chunk store: move next bad chunk path generator into dedicated helper Christian Ebner
@ 2025-12-11 15:38 ` Christian Ebner
  2026-01-13 10:24 ` [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Fabian Grünbichler
  8 siblings, 0 replies; 15+ messages in thread
From: Christian Ebner @ 2025-12-11 15:38 UTC (permalink / raw)
  To: pbs-devel

Deduplicate the bad chunk filename generation logic by moving it to
be an associated function of `ChunkExt`. Instead of set_extension()
garbage collection now uses set_file_name() as well, which boils down
to a pop and push [0] as compared to an extension replacement [1].

[0] https://doc.rust-lang.org/src/std/path.rs.html#1455
[1] https://doc.rust-lang.org/src/std/path.rs.html#1524

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index f5585d2cb..efd79ec8f 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -278,13 +278,11 @@ impl ChunkStore {
 
     /// Update access timestamp on all bad chunks for given digest
     pub(super) fn cond_touch_bad_chunks(&self, digest: &[u8; 32]) -> Result<bool, Error> {
-        let (chunk_path, _digest_str) = self.chunk_path(digest);
+        let (mut chunk_path, digest_str) = self.chunk_path(digest);
         let mut is_bad = false;
         for i in 0..=9 {
-            let bad_ext = format!("{i}.bad");
-            let mut bad_path = chunk_path.clone();
-            bad_path.set_extension(bad_ext);
-            if self.cond_touch_path(&bad_path, false)? {
+            chunk_path.set_file_name(ChunkExt::bad_chunk_filename(&digest_str, i));
+            if self.cond_touch_path(&chunk_path, false)? {
                 is_bad = true;
             }
         }
@@ -922,7 +920,7 @@ impl ChunkStore {
         let (mut chunk_path, digest_str) = self.chunk_path(digest);
         let mut counter = 0;
         loop {
-            chunk_path.set_file_name(format!("{digest_str}.{counter}.bad"));
+            chunk_path.set_file_name(ChunkExt::bad_chunk_filename(&digest_str, counter));
             if chunk_path.exists() && counter < 9 {
                 counter += 1;
             } else {
@@ -967,6 +965,10 @@ impl ChunkExt {
         }
         None
     }
+
+    fn bad_chunk_filename(digest_str: &str, counter: usize) -> String {
+        format!("{digest_str}.{counter}.bad")
+    }
 }
 
 #[test]
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct Christian Ebner
@ 2026-01-13 10:23   ` Fabian Grünbichler
  0 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2026-01-13 10:23 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On December 11, 2025 4:38 pm, Christian Ebner wrote:
> To better keep track of the state and declutter the code on the
> callsites, bundle the S3 delete list and it's logic by a dedicated
> struct. Since the check for empty lists is now performed as part
> of the deletion related methods, the callsites can get rid of that.
> 
> Further, avoid the proxmox_time::epoch_i64() and use SystemTime and
> Duration with their methods directly.
> 
> Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 132 +++++++++++++++++++--------------
>  1 file changed, 75 insertions(+), 57 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 9c57aaac1..58fd034fc 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -24,7 +24,7 @@ use proxmox_sys::error::SysError;
>  use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
>  use proxmox_sys::linux::procfs::MountInfo;
>  use proxmox_sys::process_locker::{ProcessLockExclusiveGuard, ProcessLockSharedGuard};
> -use proxmox_time::{epoch_i64, TimeSpan};

we already use epoch_i64 for other calculations in this module, why not
keep it?

> +use proxmox_time::TimeSpan;
>  use proxmox_worker_task::WorkerTaskContext;
>  
>  use pbs_api_types::{
> @@ -64,7 +64,7 @@ const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
>  // s3 deletion batch size to avoid 1024 open files soft limit
>  const S3_DELETE_BATCH_LIMIT: usize = 100;
>  // max defer time for s3 batch deletions
> -const S3_DELETE_DEFER_LIMIT_SECONDS: i64 = 60 * 5;
> +const S3_DELETE_DEFER_LIMIT_SECONDS: Duration = Duration::from_secs(60 * 5);
>  
>  /// checks if auth_id is owner, or, if owner is a token, if
>  /// auth_id is the user of the token
> @@ -1689,40 +1689,8 @@ impl DataStore {
>                  proxmox_async::runtime::block_on(s3_client.list_objects_v2(&prefix, None))
>                      .context("failed to list chunk in s3 object store")?;
>  
> -            let mut delete_list = Vec::with_capacity(S3_DELETE_BATCH_LIMIT);
> -            let mut delete_list_age = epoch_i64();
> -
> -            let s3_delete_batch = |delete_list: &mut Vec<(S3ObjectKey, BackupLockGuard)>,
> -                                   s3_client: &Arc<S3Client>|
> -             -> Result<(), Error> {
> -                let delete_objects_result = proxmox_async::runtime::block_on(
> -                    s3_client.delete_objects(
> -                        &delete_list
> -                            .iter()
> -                            .map(|(key, _)| key.clone())
> -                            .collect::<Vec<S3ObjectKey>>(),
> -                    ),
> -                )?;
> -                if let Some(_err) = delete_objects_result.error {
> -                    bail!("failed to delete some objects");
> -                }
> -                // drops all chunk flock guards
> -                delete_list.clear();
> -                Ok(())
> -            };
> -
> -            let add_to_delete_list =
> -                |delete_list: &mut Vec<(S3ObjectKey, BackupLockGuard)>,
> -                 delete_list_age: &mut i64,
> -                 key: S3ObjectKey,
> -                 _chunk_guard: BackupLockGuard| {
> -                    // set age based on first insertion
> -                    if delete_list.is_empty() {
> -                        *delete_list_age = epoch_i64();
> -                    }
> -                    delete_list.push((key, _chunk_guard));
> -                };
> -
> +            let mut delete_list =
> +                S3DeleteList::with_thresholds(S3_DELETE_BATCH_LIMIT, S3_DELETE_DEFER_LIMIT_SECONDS);
>              loop {
>                  for content in list_bucket_result.contents {
>                      worker.check_abort()?;
> @@ -1779,12 +1747,7 @@ impl DataStore {
>                                              std::fs::remove_file(chunk_path)?;
>                                          }
>                                      }
> -                                    add_to_delete_list(
> -                                        &mut delete_list,
> -                                        &mut delete_list_age,
> -                                        content.key,
> -                                        _chunk_guard,
> -                                    );
> +                                    delete_list.push(content.key, _chunk_guard);
>                                      Ok(())
>                                  },
>                              )?;
> @@ -1792,12 +1755,7 @@ impl DataStore {
>                      } else {
>                          gc_status.removed_chunks += 1;
>                          gc_status.removed_bytes += content.size;
> -                        add_to_delete_list(
> -                            &mut delete_list,
> -                            &mut delete_list_age,
> -                            content.key,
> -                            _chunk_guard,
> -                        );
> +                        delete_list.push(content.key, _chunk_guard);
>                      }
>  
>                      chunk_count += 1;
> @@ -1806,12 +1764,7 @@ impl DataStore {
>                      drop(_guard);
>  
>                      // limit pending deletes to avoid holding too many chunk flocks
> -                    if delete_list.len() >= S3_DELETE_BATCH_LIMIT
> -                        || (!delete_list.is_empty()
> -                            && epoch_i64() - delete_list_age > S3_DELETE_DEFER_LIMIT_SECONDS)
> -                    {
> -                        s3_delete_batch(&mut delete_list, s3_client)?;
> -                    }
> +                    delete_list.conditional_delete_and_drop_locks(s3_client)?;
>                  }
>                  // Process next batch of chunks if there is more
>                  if list_bucket_result.is_truncated {
> @@ -1827,9 +1780,7 @@ impl DataStore {
>              }
>  
>              // delete the last batch of objects, if there are any remaining
> -            if !delete_list.is_empty() {
> -                s3_delete_batch(&mut delete_list, s3_client)?;
> -            }
> +            delete_list.delete_and_drop_locks(s3_client)?;
>  
>              info!("processed {chunk_count} total chunks");
>  
> @@ -2768,3 +2719,70 @@ impl DataStore {
>          result
>      }
>  }
> +
> +/// Track S3 object keys to be deleted by garbage collection while holding their file lock.
> +struct S3DeleteList {
> +    list: Vec<(S3ObjectKey, BackupLockGuard)>,
> +    first_entry_added: SystemTime,
> +    age_threshold: Duration,
> +    capacity_threshold: usize,
> +}
> +
> +impl S3DeleteList {
> +    /// Create a new list instance with given capacity and age thresholds.
> +    fn with_thresholds(capacity_threshold: usize, age_threshold: Duration) -> Self {
> +        Self {
> +            first_entry_added: SystemTime::now(), // init only, updated once added
> +            list: Vec::with_capacity(capacity_threshold),
> +            age_threshold,
> +            capacity_threshold,
> +        }
> +    }
> +
> +    /// Pushes the current key and backup lock guard to the list, updating the delete list age if
> +    /// the list was empty before the insert.
> +    fn push(&mut self, key: S3ObjectKey, guard: BackupLockGuard) {
> +        // set age based on first insertion
> +        if self.list.is_empty() {
> +            self.first_entry_added = SystemTime::now();
> +        }
> +        self.list.push((key, guard));
> +    }
> +
> +    /// Delete the objects in the list via the provided S3 client instance.
> +    /// Clears the list contents and frees the per-chunk file locks.
> +    fn delete_and_drop_locks(&mut self, s3_client: &Arc<S3Client>) -> Result<(), Error> {
> +        if self.list.is_empty() {
> +            return Ok(());
> +        }
> +        let delete_objects_result = proxmox_async::runtime::block_on(
> +            s3_client.delete_objects(
> +                &self
> +                    .list
> +                    .iter()
> +                    .map(|(key, _)| key.clone())
> +                    .collect::<Vec<S3ObjectKey>>(),
> +            ),
> +        )?;
> +        if delete_objects_result.error.is_some() {
> +            bail!("failed to delete some objects");
> +        }
> +        // drops all chunk flock guards
> +        self.list.clear();
> +        Ok(())
> +    }
> +
> +    /// Delete the object stored in the list if either the list exceeds the capacity threshold or
> +    /// the delete list age threshold.
> +    fn conditional_delete_and_drop_locks(
> +        &mut self,
> +        s3_client: &Arc<S3Client>,
> +    ) -> Result<(), Error> {
> +        if self.list.len() >= self.capacity_threshold
> +            || (!self.list.is_empty() && self.first_entry_added.elapsed()? > self.age_threshold)

this bails if the clock jumps backwards (further than the
first_entry_added timestamp). maybe we should instead always delete? or
at least add a meaningful error message/context..

> +        {
> +            self.delete_and_drop_locks(s3_client)?;
> +        }
> +        Ok(())
> +    }
> +}
> -- 
> 2.47.3
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in chunk store iterator
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in " Christian Ebner
@ 2026-01-13 10:23   ` Fabian Grünbichler
  0 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2026-01-13 10:23 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On December 11, 2025 4:38 pm, Christian Ebner wrote:
> Optimizes the chunk filename check towards regular chunk files by
> explicitley checking for the correct length.
> 
> While the check for ascii hexdigits needs to be stated twice, this
> avoids to check for the `.bad` extension if the chunk filename did
> already match the expected length.

I don't get this part, we could still check first and only once that the
first 64 bytes are valid hex?

if bytes.len() < 64 {
  continue;
}

if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
  continue;
}

// now start looking at the length + potential extension

> 
> This will also help to better distinguish bad chunks and chunks
> used markers for s3 datastores in subsequent changes.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/chunk_store.rs | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index a5e5f6261..7980938ad 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -315,15 +315,20 @@ impl ChunkStore {
>                          Some(Ok(entry)) => {
>                              // skip files if they're not a hash
>                              let bytes = entry.file_name().to_bytes();
> -                            if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
> -                                continue;
> +
> +                            if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
> +                            {
> +                                return Some((Ok(entry), percentage, false));
>                              }
> -                            if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
> -                                continue;
> +
> +                            if bytes.len() == 64 + ".0.bad".len()
> +                                && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
> +                            {
> +                                let bad = bytes.ends_with(b".bad");
> +                                return Some((Ok(entry), percentage, bad));

while this mimics the old code, it is still broken (a chunk digest +
.fooba or any other 6-byte suffix that is not "??.bad" is returned as
non-bad chunk, since the length matches a bad chunk, but the extension
does not).

>                              }
>  
> -                            let bad = bytes.ends_with(b".bad");
> -                            return Some((Ok(entry), percentage, bad));
> +                            continue;
>                          }
>                          Some(Err(err)) => {
>                              // stop after first error
> -- 
> 2.47.3
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker Christian Ebner
@ 2026-01-13 10:24   ` Fabian Grünbichler
  0 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2026-01-13 10:24 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On December 11, 2025 4:38 pm, Christian Ebner wrote:
> Clearly distinguish the cases for `bad` and `using` extensions for
> items returned by the chunk iterator. Directory entries with
> filenames which matched the expected length but not the extension
> are now skipped over.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/chunk_store.rs | 48 ++++++++++++++++++++++++--------
>  1 file changed, 37 insertions(+), 11 deletions(-)
> 
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 7980938ad..ad517391d 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -280,7 +280,11 @@ impl ChunkStore {
>          &self,
>      ) -> Result<
>          impl std::iter::FusedIterator<
> -            Item = (Result<proxmox_sys::fs::ReadDirEntry, Error>, usize, bool),
> +            Item = (
> +                Result<proxmox_sys::fs::ReadDirEntry, Error>,
> +                usize,
> +                ChunkExt,
> +            ),
>          >,
>          Error,
>      > {
> @@ -318,14 +322,20 @@ impl ChunkStore {
>  
>                              if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>                              {
> -                                return Some((Ok(entry), percentage, false));
> +                                return Some((Ok(entry), percentage, ChunkExt::None));
>                              }
>  
>                              if bytes.len() == 64 + ".0.bad".len()
>                                  && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
>                              {
> -                                let bad = bytes.ends_with(b".bad");
> -                                return Some((Ok(entry), percentage, bad));
> +                                let chunk_ext = if bytes.ends_with(b".bad") {
> +                                    ChunkExt::Bad
> +                                } else if bytes.ends_with(USING_MARKER_FILENAME_EXT.as_bytes()) {
> +                                    ChunkExt::UsedMarker
> +                                } else {
> +                                    continue;
> +                                };

this should be a match (on length and filename) or multiple conditions
checking length and extension *each*, because this code is just working
by chance (USING_MARKER_FILENAME_EXT and "0.bad" having the same length
is not enforced here!).

> +                                return Some((Ok(entry), percentage, chunk_ext));
>                              }
>  
>                              continue;
> @@ -334,7 +344,7 @@ impl ChunkStore {
>                              // stop after first error
>                              done = true;
>                              // and pass the error through:
> -                            return Some((Err(err), percentage, false));
> +                            return Some((Err(err), percentage, ChunkExt::None));
>                          }
>                          None => (), // open next directory
>                      }
> @@ -367,7 +377,7 @@ impl ChunkStore {
>                          return Some((
>                              Err(format_err!("unable to read subdir '{subdir}' - {err}")),
>                              percentage,
> -                            false,
> +                            ChunkExt::None,
>                          ));
>                      }
>                  }
> @@ -402,7 +412,8 @@ impl ChunkStore {
>          let mut last_percentage = 0;
>          let mut chunk_count = 0;
>  
> -        for (entry, percentage, bad) in self.get_chunk_store_iterator()? {
> +        for (entry, percentage, chunk_ext) in self.get_chunk_store_iterator()? {
> +            let bad = chunk_ext.is_bad();
>              if last_percentage != percentage {
>                  last_percentage = percentage;
>                  info!("processed {percentage}% ({chunk_count} chunks)");
> @@ -433,10 +444,7 @@ impl ChunkStore {
>                      drop(lock);
>                      continue;
>                  }
> -                if filename
> -                    .to_bytes()
> -                    .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
> -                {
> +                if chunk_ext.is_used_marker() {
>                      unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>                          format_err!("unlinking chunk using marker {filename:?} failed - {err}")
>                      })?;
> @@ -908,6 +916,24 @@ impl ChunkStore {
>      }
>  }
>  
> +#[derive(PartialEq)]
> +/// Chunk iterator directory entry filename extension
> +enum ChunkExt {
> +    None,
> +    Bad,
> +    UsedMarker,
> +}
> +
> +impl ChunkExt {
> +    fn is_bad(&self) -> bool {
> +        *self == Self::Bad
> +    }
> +
> +    fn is_used_marker(&self) -> bool {
> +        *self == Self::UsedMarker
> +    }
> +}

I am not sure if those two really warrant their own helpers?

> +
>  #[test]
>  fn test_chunk_store1() {
>      let mut path = std::fs::canonicalize(".").unwrap(); // we need absolute path
> -- 
> 2.47.3
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper Christian Ebner
@ 2026-01-13 10:24   ` Fabian Grünbichler
  0 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2026-01-13 10:24 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On December 11, 2025 4:38 pm, Christian Ebner wrote:
> Move the chunk filename extension parsing into a dedicated helper.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/chunk_store.rs | 36 ++++++++++++++++++--------------
>  1 file changed, 20 insertions(+), 16 deletions(-)
> 
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index ad517391d..e05e67dfe 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -320,22 +320,8 @@ impl ChunkStore {
>                              // skip files if they're not a hash
>                              let bytes = entry.file_name().to_bytes();
>  
> -                            if bytes.len() == 64 && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
> -                            {
> -                                return Some((Ok(entry), percentage, ChunkExt::None));
> -                            }
> -
> -                            if bytes.len() == 64 + ".0.bad".len()
> -                                && bytes.iter().take(64).all(u8::is_ascii_hexdigit)
> -                            {
> -                                let chunk_ext = if bytes.ends_with(b".bad") {
> -                                    ChunkExt::Bad
> -                                } else if bytes.ends_with(USING_MARKER_FILENAME_EXT.as_bytes()) {
> -                                    ChunkExt::UsedMarker
> -                                } else {
> -                                    continue;
> -                                };
> -                                return Some((Ok(entry), percentage, chunk_ext));
> +                            if let Some(ext) = ChunkExt::from_raw_filename(bytes) {
> +                                return Some((Ok(entry), percentage, ext));
>                              }
>  
>                              continue;
> @@ -932,6 +918,24 @@ impl ChunkExt {
>      fn is_used_marker(&self) -> bool {
>          *self == Self::UsedMarker
>      }
> +
> +    fn from_raw_filename(filename: &[u8]) -> Option<Self> {
> +        if filename.len() == 64 && filename.iter().take(64).all(u8::is_ascii_hexdigit)

the hexdigit part should still be above, and not in this helper..

> +        {
> +            return Some(Self::None);
> +        }
> +
> +        if filename.len() == 64 + ".0.bad".len()
> +            && filename.iter().take(64).all(u8::is_ascii_hexdigit)

and like I commented on the earlier patch, the hexdigit check doesn't
need to be done twice..

and this part here should check for each suffix separately:

        if filename.len() == 64 {
            return Some(Self::None);
        }

        if filename.len() == 64 + ".0.bad".len() && filename.ends_with(b".bad") {
            return Some(Self::Bad);
        }

        if filename.len() == 64 + USING_MARKER_FILENAME_EXT.len()
            && filename.ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
        {
            return Some(Self::UsedMarker);
        }

        None


> +        {
> +            if filename.ends_with(b".bad") {
> +                return Some(Self::Bad)
> +            } else if filename.ends_with(USING_MARKER_FILENAME_EXT.as_bytes()) {
> +                return Some(Self::UsedMarker)
> +            }
> +        }
> +        None
> +    }
>  }
>  
>  #[test]
> -- 
> 2.47.3
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store Christian Ebner
@ 2026-01-13 10:24   ` Fabian Grünbichler
  0 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2026-01-13 10:24 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On December 11, 2025 4:38 pm, Christian Ebner wrote:
> Provide an dedicated method and encapsulate the bad chunk filename
> generation and access time updates. This is an implementation detail
> of the chunks store, the garbage collection should not be concerned
> about this.
> 
> In preparation for making the bad filename generation a dedicated
> helper as well, all contained within the chunk store module.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
>  pbs-datastore/src/chunk_store.rs | 15 +++++++++++++++
>  pbs-datastore/src/datastore.rs   | 11 +----------
>  2 files changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index e05e67dfe..2ad90757a 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -276,6 +276,21 @@ impl ChunkStore {
>          Ok(true)
>      }
>  
> +    /// Update access timestamp on all bad chunks for given digest
> +    pub(super) fn cond_touch_bad_chunks(&self, digest: &[u8; 32]) -> Result<bool, Error> {

please document the locking semantics for this new interface!

AFAICT, this is only called once next to calls to cond_touch_chunk which
obtain the chunk store Mutex, so should this also do so?

> +        let (chunk_path, _digest_str) = self.chunk_path(digest);
> +        let mut is_bad = false;
> +        for i in 0..=9 {
> +            let bad_ext = format!("{i}.bad");
> +            let mut bad_path = chunk_path.clone();
> +            bad_path.set_extension(bad_ext);
> +            if self.cond_touch_path(&bad_path, false)? {
> +                is_bad = true;
> +            }
> +        }
> +        Ok(is_bad)
> +    }
> +
>      fn get_chunk_store_iterator(
>          &self,
>      ) -> Result<
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 58fd034fc..2bab20294 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1355,19 +1355,10 @@ impl DataStore {
>              }
>  
>              if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
> -                let (chunk_path, _digest_str) = self.chunk_path(digest);
>                  // touch any corresponding .bad files to keep them around, meaning if a chunk is
>                  // rewritten correctly they will be removed automatically, as well as if no index
>                  // file requires the chunk anymore (won't get to this loop then)
> -                let mut is_bad = false;
> -                for i in 0..=9 {
> -                    let bad_ext = format!("{i}.bad");
> -                    let mut bad_path = chunk_path.clone();
> -                    bad_path.set_extension(bad_ext);
> -                    if self.inner.chunk_store.cond_touch_path(&bad_path, false)? {

this was the last call to cond_touch_path outside of chunk_store.rs ,
which means we can make it private now (which was part of the point of this series ;)).

> -                        is_bad = true;
> -                    }
> -                }
> +                let is_bad = self.inner.chunk_store.cond_touch_bad_chunks(digest)?;
>  
>                  if let Some(ref _s3_client) = s3_client {
>                      // Do not retry here, this is very unlikely to happen as chunk markers will
> -- 
> 2.47.3
> 
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection
  2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
                   ` (7 preceding siblings ...)
  2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 8/8] chunk store: move bad chunk filename generation " Christian Ebner
@ 2026-01-13 10:24 ` Fabian Grünbichler
  8 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2026-01-13 10:24 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On December 11, 2025 4:38 pm, Christian Ebner wrote:
> This patch series contains some followup to the recently introduced
> changes to phase 2 of garbage collection on s3 backed datastores [0]
> and clenups for marker files during phase 3 [1], as well as some
> refactoring of the chunk filename extension parsing and bad chunk
> filename generation.
> 
> The first patch introduces a dedicated struct for tracking objects
> to be deleted from the s3 object store during phase 2, bundling the
> length and age based deletion threshold logic to the list state.
> 
> Subsequent patches rename and adapt the chunk store iterator to
> optimize for regular chunk filenames and clearly distinguish chunks
> by directory entry filename extension.
> 
> Further, chunk extension parsing and bad chunk filename generation are
> moved to be associated functions of the newly introduced `ChunkExt`
> enum.
> 
> Changes since version 1 (thanks @Fabian for comments):
> - Rebased onto current master
> - Followup patches for refactoring extension parsing and bad chunk
>   filename generation.

a few more nits inline, but in general this already looks like a nice
cleanup!

> 
> [0] https://lore.proxmox.com/pbs-devel/2b637884-f6bd-434d-be00-fa4e9d9e2dcb@proxmox.com/T/
> [1] https://lore.proxmox.com/pbs-devel/1764145262.ycdoq9dzrx.astroid@yuna.none/T/
> 
> proxmox-backup:
> 
> Christian Ebner (8):
>   GC: Move S3 delete list state and logic to a dedicated struct
>   chunk store: rename and limit scope for chunk store iterator
>   chunk store: invert chunk filename checks in chunk store iterator
>   chunk store: return chunk extension and check for used marker
>   chunk store: refactor chunk extension parsing into dedicated helper
>   datastore: move bad chunk touching logic to chunk store
>   chunk store: move next bad chunk path generator into dedicated helper
>   chunk store: move bad chunk filename generation into dedicated helper
> 
>  pbs-datastore/src/chunk_store.rs | 100 ++++++++++++++++----
>  pbs-datastore/src/datastore.rs   | 154 +++++++++++++++----------------
>  2 files changed, 161 insertions(+), 93 deletions(-)
> 
> 
> Summary over all repositories:
>   2 files changed, 161 insertions(+), 93 deletions(-)
> 
> -- 
> Generated by git-murpp 0.8.1
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 
> 


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2026-01-13 10:24 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-11 15:38 [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 1/8] GC: Move S3 delete list state and logic to a dedicated struct Christian Ebner
2026-01-13 10:23   ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 2/8] chunk store: rename and limit scope for chunk store iterator Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 3/8] chunk store: invert chunk filename checks in " Christian Ebner
2026-01-13 10:23   ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 4/8] chunk store: return chunk extension and check for used marker Christian Ebner
2026-01-13 10:24   ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 5/8] chunk store: refactor chunk extension parsing into dedicated helper Christian Ebner
2026-01-13 10:24   ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 6/8] datastore: move bad chunk touching logic to chunk store Christian Ebner
2026-01-13 10:24   ` Fabian Grünbichler
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 7/8] chunk store: move next bad chunk path generator into dedicated helper Christian Ebner
2025-12-11 15:38 ` [pbs-devel] [PATCH proxmox-backup v2 8/8] chunk store: move bad chunk filename generation " Christian Ebner
2026-01-13 10:24 ` [pbs-devel] [PATCH proxmox-backup v2 0/8] followups for garbage collection Fabian Grünbichler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal