From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup v4 08/14] GC: fix race with chunk upload/insert on s3 backends
Date: Mon, 10 Nov 2025 12:56:21 +0100
Message-ID: <20251110115627.280318-9-c.ebner@proxmox.com>
In-Reply-To: <20251110115627.280318-1-c.ebner@proxmox.com>

The previous approach, relying solely on local marker files, was flawed
as it could not protect against all the possible races between chunk
upload to the s3 object store, insertion in the local datastore cache
and in-memory LRU cache.

Since these operations are now protected by acquiring a per-chunk file
lock, use that to check if it is safe to remove the chunk and do so
in a consistent manner by holding the chunk lock guard until it is
actually removed from the s3 backend and the caches.

Since an error when removing the chunk from cache could lead to
inconsistencies, GC must now fail in that case.

The chunk store mutex is now only required when removing the chunk
from the cache.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
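Note (illustration only, not part of the commit): the guard handling
described above can be sketched in isolation as follows. This is a
minimal model under stated assumptions, not the proxmox-backup code:
ChunkLocks, ChunkGuard, try_lock(), is_locked() and sweep_batch() are
hypothetical names, and an in-memory set stands in for the per-chunk
file locks. It shows a non-blocking lock attempt that skips busy
chunks, and guards stored next to their keys so that they are only
released once the batch delete is done.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

type Digest = [u8; 32];

/// Hypothetical stand-in for the per-chunk file locks; not the PBS API.
#[derive(Clone, Default)]
struct ChunkLocks {
    held: Arc<Mutex<HashSet<Digest>>>,
}

/// RAII guard: the chunk stays locked until this value is dropped.
struct ChunkGuard {
    digest: Digest,
    held: Arc<Mutex<HashSet<Digest>>>,
}

impl ChunkLocks {
    /// Non-blocking acquire, comparable to a lock attempt with zero timeout.
    fn try_lock(&self, digest: Digest) -> Option<ChunkGuard> {
        let mut held = self.held.lock().unwrap();
        if held.insert(digest) {
            Some(ChunkGuard { digest, held: Arc::clone(&self.held) })
        } else {
            None
        }
    }

    fn is_locked(&self, digest: &Digest) -> bool {
        self.held.lock().unwrap().contains(digest)
    }
}

impl Drop for ChunkGuard {
    fn drop(&mut self) {
        self.held.lock().unwrap().remove(&self.digest);
    }
}

/// Sweep one batch: skip chunks locked elsewhere and keep the guards of
/// the remaining ones alive until the (simulated) batch delete is done.
fn sweep_batch(locks: &ChunkLocks, candidates: &[Digest]) -> Vec<Digest> {
    let mut delete_list: Vec<(Digest, ChunkGuard)> = Vec::new();
    for digest in candidates {
        match locks.try_lock(*digest) {
            Some(guard) => delete_list.push((*digest, guard)),
            None => continue, // in use by an upload/insert, leave it alone
        }
    }

    // Only the keys would go to the backend; the guards stay in the list.
    let keys: Vec<Digest> = delete_list.iter().map(|(d, _)| *d).collect();
    // Every chunk of the batch is still locked while the delete would run.
    debug_assert!(keys.iter().all(|d| locks.is_locked(d)));

    delete_list.clear(); // dropping the tuples releases all chunk guards
    keys
}

fn main() {
    let locks = ChunkLocks::default();
    let (a, b) = ([1u8; 32], [2u8; 32]);

    // Simulate a concurrent upload holding the lock for chunk `b`.
    let upload_guard = locks.try_lock(b).expect("chunk b is free");

    let deleted = sweep_batch(&locks, &[a, b]);
    assert_eq!(deleted, vec![a]); // `b` was skipped
    assert!(!locks.is_locked(&a)); // guard released after the batch
    assert!(locks.is_locked(&b)); // still held by the upload
    drop(upload_guard);
}
```

Keeping the guard in the same tuple as the key ties the lock lifetime
to the list entry, so clearing the list is what releases the locks.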
pbs-datastore/src/datastore.rs | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 10acc91a0..71a8b1b60 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1671,14 +1671,18 @@ impl DataStore {
                 let mut delete_list = Vec::with_capacity(1000);
                 loop {
-                    let lock = self.inner.chunk_store.mutex().lock().unwrap();
-
                     for content in list_bucket_result.contents {
                         let (chunk_path, digest) = match self.chunk_path_from_object_key(&content.key) {
                             Some(path) => path,
                             None => continue,
                         };
+                        let timeout = std::time::Duration::from_secs(0);
+                        let _chunk_guard = match self.inner.chunk_store.lock_chunk(&digest, timeout) {
+                            Ok(guard) => guard,
+                            Err(_) => continue,
+                        };
+
                         // Check local markers (created or atime updated during phase1) and
                         // keep or delete chunk based on that.
                         let atime = match std::fs::metadata(&chunk_path) {
@@ -1707,10 +1711,10 @@ impl DataStore {
                             &mut gc_status,
                             || {
                                 if let Some(cache) = self.cache() {
-                                    // ignore errors, phase 3 will retry cleanup anyways
-                                    let _ = cache.remove(&digest);
+                                    let _guard = self.inner.chunk_store.mutex().lock().unwrap();
+                                    cache.remove(&digest)?;
                                 }
-                                delete_list.push(content.key);
+                                delete_list.push((content.key, _chunk_guard));
                                 Ok(())
                             },
                         )?;
@@ -1719,14 +1723,19 @@ impl DataStore {
                         chunk_count += 1;
                     }
-                    drop(lock);
-
                     if !delete_list.is_empty() {
-                        let delete_objects_result =
-                            proxmox_async::runtime::block_on(s3_client.delete_objects(&delete_list))?;
+                        let delete_objects_result = proxmox_async::runtime::block_on(
+                            s3_client.delete_objects(
+                                &delete_list
+                                    .iter()
+                                    .map(|(key, _)| key.clone())
+                                    .collect::<Vec<S3ObjectKey>>(),
+                            ),
+                        )?;
                         if let Some(_err) = delete_objects_result.error {
                             bail!("failed to delete some objects");
                         }
+                        // release all chunk guards
                         delete_list.clear();
                     }
--
2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
Thread overview: 17+ messages
2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 01/14] datastore: limit scope of snapshot/group destroy methods to crate Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 02/14] api/datastore: move s3 index upload helper to datastore backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 03/14] chunk store: implement per-chunk file locking helper for s3 backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 04/14] datastore: acquire chunk store mutex lock when renaming corrupt chunk Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 05/14] datastore: get per-chunk file lock for chunk rename on s3 backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 06/14] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 07/14] datastore: add locking to protect against races on chunk insert for s3 Christian Ebner
2025-11-10 11:56 ` Christian Ebner [this message]
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 09/14] GC: cleanup chunk markers from cache in phase 3 on s3 backends Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 10/14] datastore: GC: drop overly verbose info message during s3 chunk sweep Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 11/14] chunk store: reduce exposure of clear_chunk() to crate only Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 12/14] chunk store: make chunk removal a helper method of the chunk store Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 13/14] GC: fix deadlock for cache eviction and garbage collection Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 14/14] chunk store: never fail when trying to remove missing chunk file Christian Ebner
2025-11-11 11:09 ` [pbs-devel] partially-applied: [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
2025-11-11 14:31 ` [pbs-devel] " Christian Ebner