* [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

These patches fix possible race conditions on datastores with s3
backend for chunk insert, renaming of corrupt chunks during
verification, and cleanup during garbage collection. Further, the
patches ensure consistency between the chunk marker files of the local
datastore cache, the s3 object store and the in-memory LRU cache during
state changes caused by any of the above-mentioned operations.

Consistency is achieved by a per-chunk file locking mechanism. File
locks are stored in the predefined location for datastore file locks,
using the same `.chunks/prefix/digest` folder layout for consistency
and to keep readdir and other filesystem operations performant.
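
For illustration, a minimal sketch of how such a per-chunk lock path
can be derived; the base directory and helper names here are stand-ins,
not the exact implementation (the real helpers are introduced in patch
02/21):

use std::path::{Path, PathBuf};

// Assumed base directory; the real constant is DATASTORE_LOCKS_DIR.
const LOCKS_DIR: &str = "/run/proxmox-backup/locks";

// First two digest bytes as four hex characters, mirroring the
// `.chunks/<prefix>/<digest>` on-disk layout of the chunk store.
fn digest_to_prefix(digest: &[u8]) -> String {
    format!("{:02x}{:02x}", digest[0], digest[1])
}

fn hex_encode(data: &[u8]) -> String {
    data.iter().map(|b| format!("{b:02x}")).collect()
}

// Per-chunk lock file path for a given datastore name and chunk digest.
fn chunk_lock_path(store_name: &str, digest: &[u8; 32]) -> PathBuf {
    let mut path = Path::new(LOCKS_DIR).join(store_name);
    path.push(".chunks");
    path.push(digest_to_prefix(digest));
    path.push(hex_encode(digest));
    path
}

fn main() {
    println!("{}", chunk_lock_path("store1", &[0u8; 32]).display());
}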

As part of the series it is now also ensured that chunks which are
removed from the local datastore cache are also dropped from its
in-memory LRU cache, so a consistent state is reached. Further, bad
chunks are now touched during GC phase 1 for s3 backed datastores as
well, and the creation of missing marker files is performed
conditionally, to avoid consistency issues.

Changes since version 5 (thanks @Fabian for catching 2 more issues):
- Only remove corrupt chunk from cache after renaming it, as otherwise the cache
  remove already deletes the chunk file.
- Correctly distinguish bad chunks from regular ones in chunk_path_from_object_key(),
  and use that information for correctly processing the bad chunks in GC phase 2.

Changes since version 4:
- Incorporated patches by Fabian for better handling of the chunk store mutex locking
- Add patches to fix missing marker file creation and keeping of bad chunks during
  garbage collection for s3 backend
- Document locking order restrictions

Changes since version 3:
- Add patches to limit visibility of BackupDir and BackupGroup destroy
- Refactored s3 upload index helper
- Avoid unneeded double stat for GC phase 3 cleanups

Changes since version 2:
- Incorporate additional race fix as discussed in
  https://lore.proxmox.com/pbs-devel/8ab74557-9592-43e7-8706-10fceaae31b7@proxmox.com/T/
  and suggested offlist.

Changes since version 1 (thanks @Fabian for review):
- Fix lock inversion for renaming corrupt chunks.
- Inline the chunk lock helper, making it explicit and thereby avoiding
  calls to the helper for regular datastores.
- Pass the backend to the add_blob datastore helper, so it can be reused for the
  backup session and pull sync job.
- Move also the s3 index upload helper from the backup env to the datastore, and
  reuse it for the sync job as well.

This patch series obsoletes two previous patch series with unfortunately
incomplete bugfix attempts found at:
- https://lore.proxmox.com/pbs-devel/8d711a20-b193-47a9-8f38-6ce800e6d0e8@proxmox.com/T/
- https://lore.proxmox.com/pbs-devel/20251015164008.975591-1-c.ebner@proxmox.com/T/

proxmox-backup:

Christian Ebner (18):
  datastore: GC: drop overly verbose info message during s3 chunk sweep
  chunk store: implement per-chunk file locking helper for s3 backend
  datastore: acquire chunk store mutex lock when renaming corrupt chunk
  datastore: get per-chunk file lock for chunk rename on s3 backend
  fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU
    cache
  datastore: add locking to protect against races on chunk insert for s3
  GC: fix race with chunk upload/insert on s3 backends
  chunk store: reduce exposure of clear_chunk() to crate only
  chunk store: make chunk removal a helper method of the chunk store
  GC: cleanup chunk markers from cache in phase 3 on s3 backends
  GC: touch bad chunk files independent of backend type
  GC: guard missing marker file insertion for s3 backed stores
  GC: s3: track if a chunk marker file is missing since a bad chunk
  chunk store: add helpers marking missing local chunk markers as
    expected
  GC: assure chunk exists on s3 store when creating missing chunk marker
  datastore: document s3 backend specific locking restrictions
  GC: fix: don't drop bad extension for S3 object to chunk path helper
  GC: clean up bad chunks from the filesystem only

Fabian Grünbichler (3):
  store: split insert_chunk into wrapper + unsafe locked implementation
  store: cache: move Mutex acquire to cache insertion
  chunk store: rename cache-specific helpers

 pbs-datastore/src/backup_info.rs              |   2 +-
 pbs-datastore/src/chunk_store.rs              | 124 +++++++++++-
 pbs-datastore/src/datastore.rs                | 180 ++++++++++++------
 pbs-datastore/src/lib.rs                      |  13 ++
 .../src/local_datastore_lru_cache.rs          |  23 ++-
 5 files changed, 263 insertions(+), 79 deletions(-)


Summary over all repositories:
  5 files changed, 263 insertions(+), 79 deletions(-)

-- 
Generated by git-murpp 0.8.1



* [pbs-devel] [PATCH proxmox-backup v6 01/21] datastore: GC: drop overly verbose info message during s3 chunk sweep
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

This message does not provide much additional information and is
shown during correct operation anyway, which leaves it open to
misinterpretation as an error or warning. Drop it in favor of being
less verbose and not potentially spamming the task log when many
chunks are removed; the removals remain visible in the final garbage
collection stats output.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 1 -
 1 file changed, 1 deletion(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index b9d4ed2d2..7e5c8b0f2 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1674,7 +1674,6 @@ impl DataStore {
                         Ok(stat) => stat.accessed()?,
                         Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
                             // File not found, delete by setting atime to unix epoch
-                            info!("Not found, mark for deletion: {}", content.key);
                             SystemTime::UNIX_EPOCH
                         }
                         Err(err) => return Err(err.into()),
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 02/21] chunk store: implement per-chunk file locking helper for s3 backend
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

Add chunk store helper methods to create per-chunk file locks. These
will be used to guard chunk operations on s3 backends, guaranteeing
exclusive access when performing cache and backend operations.
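
A rough sketch of the intended call pattern, with the types reduced to
stand-ins (the real helper returns a BackupLockGuard backed by
open_backup_lockfile, see the diff below):

use std::time::Duration;

// Stand-in types so the sketch is self-contained; the real ones are
// ChunkStore, BackupLockGuard and anyhow::Error.
struct ChunkStore;
struct BackupLockGuard;

impl ChunkStore {
    // The real helper flocks the per-chunk lock file, waiting up to `timeout`.
    fn lock_chunk(&self, _digest: &[u8; 32], _timeout: Duration) -> Result<BackupLockGuard, String> {
        Ok(BackupLockGuard)
    }
}

// Hold the guard for the whole cache + backend update; the lock is
// released when the guard is dropped at the end of the scope.
fn guarded_chunk_update(store: &ChunkStore, digest: &[u8; 32]) -> Result<(), String> {
    let _guard = store.lock_chunk(digest, Duration::from_secs(10))?;
    // ... insert/rename/remove the chunk while no other task can touch it ...
    Ok(())
}

fn main() {
    guarded_chunk_update(&ChunkStore, &[0u8; 32]).unwrap();
}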

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/backup_info.rs |  2 +-
 pbs-datastore/src/chunk_store.rs | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 29ca6f901..859039cf4 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -939,7 +939,7 @@ fn lock_file_path_helper(ns: &BackupNamespace, path: PathBuf) -> PathBuf {
 /// deletion.
 ///
 /// It also creates the base directory for lock files.
-fn lock_helper<F>(
+pub(crate) fn lock_helper<F>(
     store_name: &str,
     path: &std::path::Path,
     lock_fn: F,
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index b88a0a096..8261080aa 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -8,6 +8,7 @@ use anyhow::{bail, format_err, Context, Error};
 use tracing::{info, warn};
 
 use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
+use pbs_config::BackupLockGuard;
 use proxmox_io::ReadExt;
 use proxmox_s3_client::S3Client;
 use proxmox_sys::fs::{create_dir, create_path, file_type_from_file_stat, CreateOptions};
@@ -16,6 +17,7 @@ use proxmox_sys::process_locker::{
 };
 use proxmox_worker_task::WorkerTaskContext;
 
+use crate::backup_info::DATASTORE_LOCKS_DIR;
 use crate::data_blob::DataChunkBuilder;
 use crate::file_formats::{
     COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
@@ -776,6 +778,30 @@ impl ChunkStore {
         ChunkStore::check_permissions(lockfile_path, 0o644)?;
         Ok(())
     }
+
+    /// Generates the path to the chunks lock file
+    pub(crate) fn chunk_lock_path(&self, digest: &[u8]) -> PathBuf {
+        let mut lock_path = Path::new(DATASTORE_LOCKS_DIR).join(self.name.clone());
+        let digest_str = hex::encode(digest);
+        lock_path.push(".chunks");
+        let prefix = digest_to_prefix(digest);
+        lock_path.push(&prefix);
+        lock_path.push(&digest_str);
+        lock_path
+    }
+
+    /// Get an exclusive lock on the chunks lock file
+    pub(crate) fn lock_chunk(
+        &self,
+        digest: &[u8],
+        timeout: Duration,
+    ) -> Result<BackupLockGuard, Error> {
+        let lock_path = self.chunk_lock_path(digest);
+        let guard = crate::backup_info::lock_helper(self.name(), &lock_path, |path| {
+            pbs_config::open_backup_lockfile(path, Some(timeout), true)
+        })?;
+        Ok(guard)
+    }
 }
 
 #[test]
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 03/21] datastore: acquire chunk store mutex lock when renaming corrupt chunk
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

When renaming a corrupt chunk in the chunk store, the chunk store
mutex lock is currently not held, leading to possible races with
other operations which hold the lock and therefore assume exclusive
access, such as garbage collection or backup chunk inserts. This
affects both filesystem and S3 backends.

To fix the possible race, take the lock and rearrange the code such
that it is never held when entering async code for the s3 backend.
This does not yet solve the race and cache consistency for s3
backends, which is addressed by introducing per-chunk file locking
in subsequent patches.
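
The shape of the rearrangement as a minimal sketch, with placeholder
paths and a placeholder for the async backend work (the real code is
in the diff below):

use std::path::Path;
use std::sync::Mutex;

fn backend_cleanup() -> Result<(), std::io::Error> {
    // placeholder for the s3 backend work done via the async runtime
    Ok(())
}

fn rename_corrupt(mutex: &Mutex<()>, path: &Path, new_path: &Path) -> Result<bool, std::io::Error> {
    let lock = mutex.lock().unwrap();
    // the filesystem state change happens under the chunk store mutex
    let result = match std::fs::rename(path, new_path) {
        Ok(()) => Ok(true),
        Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(false),
        Err(err) => Err(err),
    };
    drop(lock); // never hold the sync mutex across async backend code

    backend_cleanup()?; // backend work runs without the mutex held
    result
}

fn main() {
    let mutex = Mutex::new(());
    let renamed = rename_corrupt(&mutex, Path::new("/tmp/does-not-exist"), Path::new("/tmp/x")).unwrap();
    println!("renamed: {renamed}");
}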

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 7e5c8b0f2..905d709e7 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -2596,6 +2596,8 @@ impl DataStore {
     pub fn rename_corrupt_chunk(&self, digest: &[u8; 32]) -> Result<Option<PathBuf>, Error> {
         let (path, digest_str) = self.chunk_path(digest);
 
+        let _lock = self.inner.chunk_store.mutex().lock().unwrap();
+
         let mut counter = 0;
         let mut new_path = path.clone();
         loop {
@@ -2607,6 +2609,14 @@ impl DataStore {
             }
         }
 
+        let result = match std::fs::rename(&path, &new_path) {
+            Ok(_) => Ok(Some(new_path)),
+            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
+            Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
+        };
+
+        drop(_lock);
+
         let backend = self.backend().map_err(|err| {
             format_err!(
                 "failed to get backend while trying to rename bad chunk: {digest_str} - {err}"
@@ -2637,10 +2647,6 @@ impl DataStore {
             )?;
         }
 
-        match std::fs::rename(&path, &new_path) {
-            Ok(_) => Ok(Some(new_path)),
-            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
-            Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
-        }
+        result
     }
 }
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 04/21] datastore: get per-chunk file lock for chunk rename on s3 backend
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

To guarantee exclusive access during s3 object store and cache
operations, acquire the per-chunk file lock before renaming a chunk
when the datastore is backed by s3.

This does not yet cover locking for GC and chunk insert, which is
part of subsequent changes.
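
The diff below uses deferred initialization to keep the guard alive
until the end of the scope while only acquiring it for s3 backed
stores; a minimal sketch of that pattern, with a stand-in guard type:

struct Guard; // stand-in for the per-chunk BackupLockGuard

impl Drop for Guard {
    fn drop(&mut self) {
        println!("per-chunk lock released");
    }
}

fn rename_chunk(is_s3_backed: bool) {
    let _chunk_guard; // declared first so it lives to the end of the function
    if is_s3_backed {
        _chunk_guard = Guard; // only s3 backed stores take the per-chunk lock
    }
    // ... chunk store mutex + rename logic runs here, guard still held ...
} // guard (if taken) dropped here

fn main() {
    rename_chunk(true);
    rename_chunk(false);
}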

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 905d709e7..a3c0f32aa 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -56,6 +56,8 @@ pub const GROUP_OWNER_FILE_NAME: &str = "owner";
 /// Filename for in-use marker stored on S3 object store backend
 pub const S3_DATASTORE_IN_USE_MARKER: &str = ".in-use";
 const NAMESPACE_MARKER_FILENAME: &str = ".namespace";
+// s3 put request times out after upload_size / 1 KiB/s, so about 2.3 hours for 8 MiB
+const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
 
 /// checks if auth_id is owner, or, if owner is a token, if
 /// auth_id is the user of the token
@@ -2596,6 +2598,13 @@ impl DataStore {
     pub fn rename_corrupt_chunk(&self, digest: &[u8; 32]) -> Result<Option<PathBuf>, Error> {
         let (path, digest_str) = self.chunk_path(digest);
 
+        let _chunk_guard;
+        if let DatastoreBackendType::S3 = self.inner.backend_config.ty.unwrap_or_default() {
+            _chunk_guard = self
+                .inner
+                .chunk_store
+                .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
+        }
         let _lock = self.inner.chunk_store.mutex().lock().unwrap();
 
         let mut counter = 0;
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 05/21] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

Lock the chunk store on rename for filesystem backed datastores, and
take the additional per-chunk file lock on the s3 backend.

Chunks detected as corrupt have been renamed on both the S3 backend
and the local datastore cache, but not evicted from the in-memory
cache containing the LRU chunk digests. This can lead to the chunks
being considered already present if their digest is still cached, and
therefore not being re-inserted into the local store cache and S3
backend on backup upload.

Fix this by not only renaming the local datastore's chunk marker
file, but also removing it from the in-memory cache while holding the
chunk store mutex lock to exclude interference from concurrent chunk
inserts.

Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6961
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index a3c0f32aa..291c542b3 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -2624,6 +2624,12 @@ impl DataStore {
             Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
         };
 
+        if let Some(cache) = self.cache() {
+            // Locks are being held, so it is safe to call the method.
+        // Ignore errors from the chunk marker file removal, it cannot be present since it was renamed.
+            let _ = unsafe { cache.remove(digest) };
+        }
+
         drop(_lock);
 
         let backend = self.backend().map_err(|err| {
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 06/21] datastore: add locking to protect against races on chunk insert for s3
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

Acquire the per-chunk file lock to get exclusive access to the chunk
on insert, making it possible to guarantee that the s3 backend and
the local caches reflect a consistent state, without other operations
such as corrupt chunk renaming by verification or garbage collection
interfering.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 291c542b3..eb4125c4a 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1913,6 +1913,10 @@ impl DataStore {
         match backend {
             DatastoreBackend::Filesystem => self.inner.chunk_store.insert_chunk(chunk, digest),
             DatastoreBackend::S3(s3_client) => {
+                let _chunk_guard = self
+                    .inner
+                    .chunk_store
+                    .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
                 let chunk_data: Bytes = chunk.raw_data().to_vec().into();
                 let chunk_size = chunk_data.len() as u64;
                 let object_key = crate::s3::object_key_from_digest(digest)?;
@@ -1933,6 +1937,10 @@ impl DataStore {
     ) -> Result<(bool, u64), Error> {
         let chunk_data = chunk.raw_data();
         let chunk_size = chunk_data.len() as u64;
+        let _chunk_guard = self
+            .inner
+            .chunk_store
+            .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
 
         // Avoid re-upload to S3 if the chunk is either present in the in-memory LRU cache
         // or the chunk marker file exists on filesystem. The latter means the chunk has
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 07/21] GC: fix race with chunk upload/insert on s3 backends
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

The previous approach, relying solely on local marker files, was
flawed as it could not protect against all possible races between
chunk upload to the s3 object store and insertion into the local
datastore cache and in-memory LRU cache.

Since these operations are now protected by a per-chunk file lock,
use that lock to check whether it is safe to remove the chunk, and do
so in a consistent manner by holding the chunk lock guard until the
chunk is actually removed from the s3 backend and the caches.

Since an error when removing the chunk from the cache could lead to
inconsistencies, GC must now fail in that case.

The chunk store lock is now acquired for each chunk to follow the
required locking order to avoid deadlocks:
- per-chunk file lock
- chunk store mutex lock
- lru cache mutex lock
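
A condensed sketch of that acquisition order inside the GC loop, with
stand-in types (the real code is in the diff below):

use std::sync::Mutex;
use std::time::Duration;

struct Guard;
struct ChunkStore {
    mutex: Mutex<()>,
}

impl ChunkStore {
    // Stand-in for the flock-based helper; zero timeout means try-lock.
    fn lock_chunk(&self, _digest: &[u8; 32], _timeout: Duration) -> Result<Guard, ()> {
        Ok(Guard)
    }
}

fn gc_visit_chunk(store: &ChunkStore, digest: &[u8; 32]) {
    // 1. per-chunk file lock, zero timeout: a concurrent upload holding the
    //    lock means the chunk is in use, so GC skips it instead of blocking
    let _chunk_guard = match store.lock_chunk(digest, Duration::from_secs(0)) {
        Ok(guard) => guard,
        Err(_) => return,
    };
    // 2. chunk store mutex
    let _store_guard = store.mutex.lock().unwrap();
    // 3. (inside cache removal) the LRU cache mutex, innermost
    // ... check atime, evict from cache, queue the s3 deletion; the chunk
    //     guard must stay alive until the object is deleted from the backend
}

fn main() {
    let store = ChunkStore { mutex: Mutex::new(()) };
    gc_visit_chunk(&store, &[0u8; 32]);
}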

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index eb4125c4a..ff8517e45 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1662,14 +1662,19 @@ impl DataStore {
 
             let mut delete_list = Vec::with_capacity(1000);
             loop {
-                let lock = self.inner.chunk_store.mutex().lock().unwrap();
-
                 for content in list_bucket_result.contents {
                     let (chunk_path, digest) = match self.chunk_path_from_object_key(&content.key) {
                         Some(path) => path,
                         None => continue,
                     };
 
+                    let timeout = std::time::Duration::from_secs(0);
+                    let _chunk_guard = match self.inner.chunk_store.lock_chunk(&digest, timeout) {
+                        Ok(guard) => guard,
+                        Err(_) => continue,
+                    };
+                    let _guard = self.inner.chunk_store.mutex().lock().unwrap();
+
                     // Check local markers (created or atime updated during phase1) and
                     // keep or delete chunk based on that.
                     let atime = match std::fs::metadata(&chunk_path) {
@@ -1697,10 +1702,9 @@ impl DataStore {
                             &mut gc_status,
                             || {
                                 if let Some(cache) = self.cache() {
-                                    // ignore errors, phase 3 will retry cleanup anyways
-                                    let _ = cache.remove(&digest);
+                                    cache.remove(&digest)?;
                                 }
-                                delete_list.push(content.key);
+                                delete_list.push((content.key, _chunk_guard));
                                 Ok(())
                             },
                         )?;
@@ -1709,14 +1713,19 @@ impl DataStore {
                     chunk_count += 1;
                 }
 
-                drop(lock);
-
                 if !delete_list.is_empty() {
-                    let delete_objects_result =
-                        proxmox_async::runtime::block_on(s3_client.delete_objects(&delete_list))?;
+                    let delete_objects_result = proxmox_async::runtime::block_on(
+                        s3_client.delete_objects(
+                            &delete_list
+                                .iter()
+                                .map(|(key, _)| key.clone())
+                                .collect::<Vec<S3ObjectKey>>(),
+                        ),
+                    )?;
                     if let Some(_err) = delete_objects_result.error {
                         bail!("failed to delete some objects");
                     }
+                    // release all chunk guards
                     delete_list.clear();
                 }
 
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 08/21] chunk store: reduce exposure of clear_chunk() to crate only
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

This helper method is only used by the local datastore cache to clear
chunks when they are evicted from the cache, freeing up space. It
should not be called from outside the pbs-datastore crate.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 8261080aa..7099b6614 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -686,7 +686,7 @@ impl ChunkStore {
     ///
     /// Used to evict chunks from the local datastore cache, while keeping them as in-use markers
     /// for garbage collection. Returns with success also if chunk file is not pre-existing.
-    pub fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+    pub(crate) fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, digest_str) = self.chunk_path(digest);
         let mut create_options = CreateOptions::new();
         if nix::unistd::Uid::effective().is_root() {
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 09/21] chunk store: make chunk removal a helper method of the chunk store
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

Better encapsulate functionality which touches the chunk store, to
ensure a common interface is used. The local datastore LRU cache will
call this during garbage collection to clean up chunk marker files
for chunks which have been removed from the S3 object store backend.

This is in preparation for fixing a deadlock, no functional changes
intended.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs               | 9 +++++++++
 pbs-datastore/src/local_datastore_lru_cache.rs | 3 +--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 7099b6614..a7ae4fca5 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -702,6 +702,15 @@ impl ChunkStore {
         Ok(())
     }
 
+    /// Removes a chunk marker file from the `LocalDatastoreLruCache`s chunk store.
+    ///
+    /// Callers must hold the per-chunk file lock in order to avoid races with renaming of corrupt
+    /// chunks by verifications and chunk inserts by backups.
+    pub(crate) fn remove_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+        let (chunk_path, _digest_str) = self.chunk_path(digest);
+        std::fs::remove_file(chunk_path).map_err(Error::from)
+    }
+
     pub fn relative_path(&self, path: &Path) -> PathBuf {
         // unwrap: only `None` in unit tests
         assert!(self.locker.is_some());
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index fe3b51a55..7b9d8caae 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -52,8 +52,7 @@ impl LocalDatastoreLruCache {
     /// Fails if the chunk cannot be deleted successfully.
     pub(crate) unsafe fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
         self.cache.remove(*digest);
-        let (path, _digest_str) = self.store.chunk_path(digest);
-        std::fs::remove_file(path).map_err(Error::from)
+        self.store.remove_chunk(digest)
     }
 
     /// Access the locally cached chunk or fetch it from the S3 object store via the provided
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 10/21] store: split insert_chunk into wrapper + unsafe locked implementation
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

From: Fabian Grünbichler <f.gruenbichler@proxmox.com>

to allow calling it with the chunk store Mutex already held, to synchronize the
LRU cache and the chunk store inserts.
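
The split follows the usual wrapper-plus-nolock pattern; a minimal
sketch with the actual chunk writing elided:

use std::sync::Mutex;

struct ChunkStore {
    mutex: Mutex<()>,
}

impl ChunkStore {
    // Safe wrapper: takes the mutex itself, as before.
    fn insert_chunk(&self) -> Result<(), String> {
        let _lock = self.mutex.lock().unwrap();
        // Safety: lock acquired above
        unsafe { self.insert_chunk_nolock() }
    }

    /// Safety: the caller must hold the chunk store mutex.
    unsafe fn insert_chunk_nolock(&self) -> Result<(), String> {
        // ... write the chunk file; exclusivity is the caller's promise ...
        Ok(())
    }
}

fn main() {
    let store = ChunkStore { mutex: Mutex::new(()) };
    store.insert_chunk().unwrap();
}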

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index a7ae4fca5..085816f42 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -587,9 +587,22 @@ impl ChunkStore {
 
         //println!("DIGEST {}", hex::encode(digest));
 
-        let (chunk_path, digest_str) = self.chunk_path(digest);
+        let _lock = self.mutex.lock();
 
-        let lock = self.mutex.lock();
+        // Safety: lock acquired above
+        unsafe { self.insert_chunk_nolock(chunk, digest) }
+    }
+
+    /// Safety: requires holding the chunk store mutex!
+    pub(crate) unsafe fn insert_chunk_nolock(
+        &self,
+        chunk: &DataBlob,
+        digest: &[u8; 32],
+    ) -> Result<(bool, u64), Error> {
+        // unwrap: only `None` in unit tests
+        assert!(self.locker.is_some());
+
+        let (chunk_path, digest_str) = self.chunk_path(digest);
 
         let raw_data = chunk.raw_data();
         let encoded_size = raw_data.len() as u64;
@@ -665,8 +678,6 @@ impl ChunkStore {
                 .map_err(|err| format_err!("fsync failed: {err}"))?;
         }
 
-        drop(lock);
-
         Ok((false, encoded_size))
     }
 
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 11/21] store: cache: move Mutex acquire to cache insertion
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

From: Fabian Grünbichler <f.gruenbichler@proxmox.com>

to avoid the lock ordering issue between the cache implementation's
internal Mutex and the chunk store Mutex, refactor the interface so
that any cache actions that modify the chunk store can acquire the
chunk store Mutex first, before locking the cache.
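
A reduced sketch of the resulting order, with a toy cache standing in
for the LRU cache: the chunk store mutex is taken before the cache's
internal lock, and the eviction callback runs inline while both are
held:

use std::sync::Mutex;

struct Cache {
    inner: Mutex<Vec<[u8; 32]>>, // toy stand-in for the LRU cache + its Mutex
}

impl Cache {
    fn insert(&self, digest: [u8; 32], on_evict: impl Fn(&[u8; 32])) {
        let mut inner = self.inner.lock().unwrap(); // cache lock: innermost
        if inner.len() >= 2 {
            let evicted = inner.remove(0);
            on_evict(&evicted); // runs inline, chunk store mutex already held
        }
        inner.push(digest);
    }
}

fn insert_ordered(store_mutex: &Mutex<()>, cache: &Cache, digest: [u8; 32]) {
    let _store = store_mutex.lock().unwrap(); // chunk store mutex: outermost
    cache.insert(digest, |_evicted| {
        // would clear the chunk file; safe, as the store mutex is held above
    });
}

fn main() {
    let store_mutex = Mutex::new(());
    let cache = Cache { inner: Mutex::new(Vec::new()) };
    for i in 0..4u8 {
        insert_ordered(&store_mutex, &cache, [i; 32]);
    }
}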

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs              |  6 +++---
 .../src/local_datastore_lru_cache.rs          | 20 ++++++++++++++-----
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 085816f42..a17c258a7 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -697,7 +697,9 @@ impl ChunkStore {
     ///
     /// Used to evict chunks from the local datastore cache, while keeping them as in-use markers
     /// for garbage collection. Returns with success also if chunk file is not pre-existing.
-    pub(crate) fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+    ///
+    /// Safety: chunk store mutex must be held!
+    pub(crate) unsafe fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, digest_str) = self.chunk_path(digest);
         let mut create_options = CreateOptions::new();
         if nix::unistd::Uid::effective().is_root() {
@@ -706,8 +708,6 @@ impl ChunkStore {
             create_options = create_options.owner(uid).group(gid);
         }
 
-        let _lock = self.mutex.lock();
-
         proxmox_sys::fs::replace_file(&chunk_path, &[], create_options, false)
             .map_err(|err| format_err!("clear chunk failed for {digest_str} - {err}"))?;
         Ok(())
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index 7b9d8caae..8b2dbedfd 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -34,9 +34,16 @@ impl LocalDatastoreLruCache {
     ///
     /// Fails if the chunk cannot be inserted successfully.
     pub fn insert(&self, digest: &[u8; 32], chunk: &DataBlob) -> Result<(), Error> {
-        self.store.insert_chunk(chunk, digest)?;
-        self.cache
-            .insert(*digest, (), |digest| self.store.clear_chunk(&digest))
+        let _lock = self.store.mutex().lock().unwrap();
+
+        // Safety: lock acquired above
+        unsafe {
+            self.store.insert_chunk_nolock(chunk, digest)?;
+        }
+        self.cache.insert(*digest, (), |digest| {
+            // Safety: lock acquired above, this is executed inline!
+            unsafe { self.store.clear_chunk(&digest) }
+        })
     }
 
     /// Remove a chunk from the local datastore cache.
@@ -70,8 +77,11 @@ impl LocalDatastoreLruCache {
             Ok(mut file) => match DataBlob::load_from_reader(&mut file) {
                 // File was still cached with contents, load response from file
                 Ok(chunk) => {
-                    self.cache
-                        .insert(*digest, (), |digest| self.store.clear_chunk(&digest))?;
+                    let _lock = self.store.mutex().lock().unwrap();
+                    self.cache.insert(*digest, (), |digest| {
+                        // Safety: lock acquired above, this is executed inline
+                        unsafe { self.store.clear_chunk(&digest) }
+                    })?;
                     Ok(Some(chunk))
                 }
                 // File was empty, might have been evicted since
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 12/21] chunk store: rename cache-specific helpers
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

From: Fabian Grünbichler <f.gruenbichler@proxmox.com>

to make it more explicit that these operate in cache context, and are not
generic helpers as the original names kind of imply.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs               | 4 ++--
 pbs-datastore/src/local_datastore_lru_cache.rs | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index a17c258a7..f5a77276d 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -699,7 +699,7 @@ impl ChunkStore {
     /// for garbage collection. Returns with success also if chunk file is not pre-existing.
     ///
     /// Safety: chunk store mutex must be held!
-    pub(crate) unsafe fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+    pub(crate) unsafe fn replace_chunk_with_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, digest_str) = self.chunk_path(digest);
         let mut create_options = CreateOptions::new();
         if nix::unistd::Uid::effective().is_root() {
@@ -717,7 +717,7 @@ impl ChunkStore {
     ///
     /// Callers must hold the per-chunk file lock in order to avoid races with renaming of corrupt
     /// chunks by verifications and chunk inserts by backups.
-    pub(crate) fn remove_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+    pub(crate) fn remove_chunk_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, _digest_str) = self.chunk_path(digest);
         std::fs::remove_file(chunk_path).map_err(Error::from)
     }
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index 8b2dbedfd..ae2215be1 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -42,7 +42,7 @@ impl LocalDatastoreLruCache {
         }
         self.cache.insert(*digest, (), |digest| {
             // Safety: lock acquired above, this is executed inline!
-            unsafe { self.store.clear_chunk(&digest) }
+            unsafe { self.store.replace_chunk_with_marker(&digest) }
         })
     }
 
@@ -59,7 +59,7 @@ impl LocalDatastoreLruCache {
     /// Fails if the chunk cannot be deleted successfully.
     pub(crate) unsafe fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
         self.cache.remove(*digest);
-        self.store.remove_chunk(digest)
+        self.store.remove_chunk_marker(digest)
     }
 
     /// Access the locally cached chunk or fetch it from the S3 object store via the provided
@@ -80,7 +80,7 @@ impl LocalDatastoreLruCache {
                     let _lock = self.store.mutex().lock().unwrap();
                     self.cache.insert(*digest, (), |digest| {
                         // Safety: lock acquired above, this is executed inline
-                        unsafe { self.store.clear_chunk(&digest) }
+                        unsafe { self.store.replace_chunk_with_marker(&digest) }
                     })?;
                     Ok(Some(chunk))
                 }
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 13/21] GC: cleanup chunk markers from cache in phase 3 on s3 backends
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

Pass along the in-memory cache when sweeping unused chunks in phase 3
of garbage collection for datastores with s3 backend.
When a dangling marker file is detected, which will only happen if
the chunk was removed from the object store by some unexpected
interaction (e.g. manually removed from the bucket), this marker must
be removed to get back to a consistent state (snapshots referencing
the chunk remain corrupt, however).

Clear such a chunk from both the in-memory and the local datastore
cache, so it can be re-uploaded by future backup or sync jobs.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 22 +++++++++++++++++++++-
 pbs-datastore/src/datastore.rs   |  2 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index f5a77276d..20f71efef 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
 use std::time::Duration;
 
 use anyhow::{bail, format_err, Context, Error};
+use hex::FromHex;
 use tracing::{info, warn};
 
 use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
@@ -22,7 +23,7 @@ use crate::data_blob::DataChunkBuilder;
 use crate::file_formats::{
     COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
 };
-use crate::DataBlob;
+use crate::{DataBlob, LocalDatastoreLruCache};
 
 /// File system based chunk store
 pub struct ChunkStore {
@@ -383,6 +384,7 @@ impl ChunkStore {
         min_atime: i64,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
+        cache: Option<&LocalDatastoreLruCache>,
     ) -> Result<(), Error> {
         // unwrap: only `None` in unit tests
         assert!(self.locker.is_some());
@@ -436,6 +438,24 @@ impl ChunkStore {
                         bad,
                         status,
                         || {
+                            // non-bad S3 chunks need to be removed via cache
+                            if let Some(cache) = cache {
+                                if !bad {
+                                    let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
+
+                                    // unless there is a concurrent upload pending,
+                                    // must never block due to required locking order
+                                    if let Ok(_guard) =
+                                        self.lock_chunk(&digest, Duration::from_secs(0))
+                                    {
+                                        cache.remove(&digest)?;
+                                    }
+
+                                    return Ok(());
+                                }
+                            }
+
+                            // bad or local chunks
                             unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
                                 |err| {
                                     format_err!(
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index ff8517e45..083a83f7d 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1755,6 +1755,7 @@ impl DataStore {
                 min_atime,
                 &mut tmp_gc_status,
                 worker,
+                self.cache(),
             )?;
         } else {
             self.inner.chunk_store.sweep_unused_chunks(
@@ -1762,6 +1763,7 @@ impl DataStore {
                 min_atime,
                 &mut gc_status,
                 worker,
+                None,
             )?;
         }
 
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 14/21] GC: touch bad chunk files independent of backend type
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

The bad chunk marker files should be kept around for s3 backed
datastores as well, so that GC does not clean them up unless a good
chunk was re-uploaded, and so that renaming of corrupt chunks detects
them and does not lead to conflicting renames.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 64 ++++++++++++++++------------------
 1 file changed, 30 insertions(+), 34 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 083a83f7d..d71106ea8 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1322,41 +1322,37 @@ impl DataStore {
                 }
             }
 
-            match s3_client {
-                None => {
-                    // Filesystem backend
-                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                        let hex = hex::encode(digest);
-                        warn!(
-                            "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
-                        );
-
-                        // touch any corresponding .bad files to keep them around, meaning if a chunk is
-                        // rewritten correctly they will be removed automatically, as well as if no index
-                        // file requires the chunk anymore (won't get to this loop then)
-                        for i in 0..=9 {
-                            let bad_ext = format!("{i}.bad");
-                            let mut bad_path = PathBuf::new();
-                            bad_path.push(self.chunk_path(digest).0);
-                            bad_path.set_extension(bad_ext);
-                            self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
-                        }
-                    }
+            if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+                let hex = hex::encode(digest);
+                warn!(
+                    "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
+                );
+
+                // touch any corresponding .bad files to keep them around, meaning if a chunk is
+                // rewritten correctly they will be removed automatically, as well as if no index
+                // file requires the chunk anymore (won't get to this loop then)
+                for i in 0..=9 {
+                    let bad_ext = format!("{i}.bad");
+                    let mut bad_path = PathBuf::new();
+                    bad_path.push(self.chunk_path(digest).0);
+                    bad_path.set_extension(bad_ext);
+                    self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
                 }
-                Some(ref _s3_client) => {
-                    // Update atime on local cache marker files.
-                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                        let (chunk_path, _digest) = self.chunk_path(digest);
-                        // Insert empty file as marker to tell GC phase2 that this is
-                        // a chunk still in-use, so to keep in the S3 object store.
-                        std::fs::File::options()
-                            .write(true)
-                            .create_new(true)
-                            .open(&chunk_path)
-                            .with_context(|| {
-                                format!("failed to create marker for chunk {}", hex::encode(digest))
-                            })?;
-                    }
+            }
+
+            if let Some(ref _s3_client) = s3_client {
+                // Update atime on local cache marker files.
+                if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+                    let (chunk_path, _digest) = self.chunk_path(digest);
+                    // Insert empty file as marker to tell GC phase2 that this is
+                    // a chunk still in-use, so to keep in the S3 object store.
+                    std::fs::File::options()
+                        .write(true)
+                        .create_new(true)
+                        .open(&chunk_path)
+                        .with_context(|| {
+                            format!("failed to create marker for chunk {}", hex::encode(digest))
+                        })?;
                 }
             }
         }
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 15/21] GC: guard missing marker file insertion for s3 backed stores
From: Christian Ebner @ 2025-11-14 13:18 UTC
  To: pbs-devel

The chunk marker file should only ever be missing if the local
datastore cache has been recreated (e.g. after setup on a new PBS
instance while reusing the s3 bucket contents) or after manual user
interaction. Garbage collection re-creates the marker in these cases.

Guard this marker file creation with the per-chunk file lock to not
run into races with chunk insertion operations. Since this requires
stat'ing the chunk marker file and locking is expensive, first check
whether the marker exists without holding a lock, and only lock and
re-check if it does not (see the sketch below).

Since the first chunk file existence check is already performed
anyway, move the logic into its non-existing branch. By making this
happen after touching potential bad chunks, these can be checked
beforehand.
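
The check-lock-recheck shape as a small sketch; marker_exists and
lock_chunk are stand-ins for cond_touch_chunk and the per-chunk lock
helper from this series:

use std::path::Path;
use std::time::Duration;

struct Guard;

// Stand-in; the real check goes through ChunkStore::cond_touch_chunk.
fn marker_exists(path: &Path) -> bool {
    path.exists()
}

fn lock_chunk(_digest: &[u8; 32], _timeout: Duration) -> Guard {
    Guard
}

fn ensure_marker(digest: &[u8; 32], path: &Path) -> std::io::Result<()> {
    // cheap unlocked check first: covers the common case of an existing marker
    if marker_exists(path) {
        return Ok(());
    }
    let _guard = lock_chunk(digest, Duration::from_secs(10));
    // re-check under the lock: a concurrent insert may have created it
    if !marker_exists(path) {
        std::fs::File::options().write(true).create_new(true).open(path)?;
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("chunk-marker-sketch");
    ensure_marker(&[0u8; 32], &path)?;
    std::fs::remove_file(&path)
}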

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 46 ++++++++++++++++++----------------
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index d71106ea8..9c0ce9859 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1323,36 +1323,40 @@ impl DataStore {
             }
 
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                let hex = hex::encode(digest);
-                warn!(
-                    "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
-                );
-
+                let (chunk_path, _digest_str) = self.chunk_path(digest);
                 // touch any corresponding .bad files to keep them around, meaning if a chunk is
                 // rewritten correctly they will be removed automatically, as well as if no index
                 // file requires the chunk anymore (won't get to this loop then)
                 for i in 0..=9 {
                     let bad_ext = format!("{i}.bad");
-                    let mut bad_path = PathBuf::new();
-                    bad_path.push(self.chunk_path(digest).0);
+                    let mut bad_path = chunk_path.clone();
                     bad_path.set_extension(bad_ext);
                     self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
                 }
-            }
 
-            if let Some(ref _s3_client) = s3_client {
-                // Update atime on local cache marker files.
-                if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                    let (chunk_path, _digest) = self.chunk_path(digest);
-                    // Insert empty file as marker to tell GC phase2 that this is
-                    // a chunk still in-use, so to keep in the S3 object store.
-                    std::fs::File::options()
-                        .write(true)
-                        .create_new(true)
-                        .open(&chunk_path)
-                        .with_context(|| {
-                            format!("failed to create marker for chunk {}", hex::encode(digest))
-                        })?;
+                if let Some(ref _s3_client) = s3_client {
+                    // Do not retry here, this is very unlikely to happen as chunk markers will
+                    // most likely only be missing if the local cache store was recreated.
+                    let _guard = self
+                        .inner
+                        .chunk_store
+                        .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
+                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+                        // Insert empty file as marker to tell GC phase2 that this is
+                        // a chunk still in-use, so to keep in the S3 object store.
+                        std::fs::File::options()
+                            .write(true)
+                            .create_new(true)
+                            .open(&chunk_path)
+                            .with_context(|| {
+                                format!("failed to create marker for chunk {}", hex::encode(digest))
+                            })?;
+                    }
+                } else {
+                    let hex = hex::encode(digest);
+                    warn!(
+                        "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
+                    );
                 }
             }
         }
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 16/21] GC: s3: track if a chunk marker file is missing since a bad chunk
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (14 preceding siblings ...)
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 15/21] GC: guard missing marker file insertion for s3 backed stores Christian Ebner
@ 2025-11-14 13:18 ` Christian Ebner
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 17/21] chunk store: add helpers marking missing local chunk markers as expected Christian Ebner
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Christian Ebner @ 2025-11-14 13:18 UTC (permalink / raw)
  To: pbs-devel

For bad chunks the marker file must never be created, as otherwise
the chunk cannot be re-inserted by a chunk upload.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 9c0ce9859..a726b152b 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1327,11 +1327,14 @@ impl DataStore {
                 // touch any corresponding .bad files to keep them around, meaning if a chunk is
                 // rewritten correctly they will be removed automatically, as well as if no index
                 // file requires the chunk anymore (won't get to this loop then)
+                let mut is_bad = false;
                 for i in 0..=9 {
                     let bad_ext = format!("{i}.bad");
                     let mut bad_path = chunk_path.clone();
                     bad_path.set_extension(bad_ext);
-                    self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
+                    if self.inner.chunk_store.cond_touch_path(&bad_path, false)? {
+                        is_bad = true;
+                    }
                 }
 
                 if let Some(ref _s3_client) = s3_client {
@@ -1341,7 +1344,7 @@ impl DataStore {
                         .inner
                         .chunk_store
                         .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
-                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? && !is_bad {
                         // Insert empty file as marker to tell GC phase2 that this is
                         // a chunk still in-use, so to keep in the S3 object store.
                         std::fs::File::options()
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 17/21] chunk store: add helpers marking missing local chunk markers as expected
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (15 preceding siblings ...)
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 16/21] GC: s3: track if a chunk marker file is missing since a bad chunk Christian Ebner
@ 2025-11-14 13:18 ` Christian Ebner
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 18/21] GC: assure chunk exists on s3 store when creating missing chunk marker Christian Ebner
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Christian Ebner @ 2025-11-14 13:18 UTC (permalink / raw)
  To: pbs-devel

Add helpers which write or remove a `<digest>.using` marker file in
place of an expected but missing local store cache chunk marker file
for s3 backed datastores.

These markers identify chunks which are still in-use and not bad,
but for which the local store does not contain the marker file (e.g.
because the cache is empty after datastore re-creation from the s3
bucket contents).

The markers are set in phase 1 of garbage collection and replaced
with the expected marker file either by a concurrent chunk insert or
when the chunk listing in phase 2 of garbage collection detects the
chunk to be present on the s3 object store.

Phase 3 cleans up the files once they fall out of the cutoff window,
given that the extension is chosen to fit the filename criteria.
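
The intended lifecycle, roughly (helper names as introduced in the
diff below, surrounding control flow simplified):

    // GC phase 1: chunk referenced by an index, local marker file
    // missing and no .bad variant found -> leave a <digest>.using mark
    chunk_store.mark_chunk_as_expected(digest)?;

    // later: concurrent chunk insert, or GC phase 2 finding the
    // chunk in the s3 object listing
    if chunk_store.clear_chunk_expected_mark(digest)? {
        // presence confirmed, the regular empty marker file can now
        // be (re-)created in the local cache store
    }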

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 42 ++++++++++++++++++++++++++++++--
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 20f71efef..f10415d1f 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -721,18 +721,56 @@ impl ChunkStore {
     /// Safety: chunk store mutex must be held!
     pub(crate) unsafe fn replace_chunk_with_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, digest_str) = self.chunk_path(digest);
+        Self::create_marker_file(&chunk_path)
+            .map_err(|err| format_err!("clear chunk failed for {digest_str} - {err}"))?;
+        Ok(())
+    }
+
+    /// Helper to generate new empty marker file
+    fn create_marker_file(path: &Path) -> Result<(), Error> {
         let mut create_options = CreateOptions::new();
         if nix::unistd::Uid::effective().is_root() {
             let uid = pbs_config::backup_user()?.uid;
             let gid = pbs_config::backup_group()?.gid;
             create_options = create_options.owner(uid).group(gid);
         }
+        proxmox_sys::fs::replace_file(&path, &[], create_options, false)
+    }
 
-        proxmox_sys::fs::replace_file(&chunk_path, &[], create_options, false)
-            .map_err(|err| format_err!("clear chunk failed for {digest_str} - {err}"))?;
+    /// Mark chunk as expected to be present by writing a file to the chunk store.
+    ///
+    /// Used to mark chunks which are found in index files during phase 1 of garbage collection
+    /// for s3 datastores, but the marker file is not present and it is seemingly not a bad chunk.
+    /// This might happen if the local store cache is empty after datastore re-creation.
+    pub(crate) fn mark_chunk_as_expected(&self, digest: &[u8; 32]) -> Result<(), Error> {
+        let marker_path = self.chunk_expected_marker_path(digest);
+        Self::create_marker_file(&marker_path)
+            .map_err(|err| format_err!("mark chunk failed for {} - {err}", hex::encode(digest)))?;
         Ok(())
     }
 
+    /// Clear a chunk's expected mark by removing the marker file from the chunk store.
+    ///
+    /// Returns true if the file was present and removed, false if the file did not exist.
+    pub(crate) fn clear_chunk_expected_mark(&self, digest: &[u8; 32]) -> Result<bool, Error> {
+        let marker_path = self.chunk_expected_marker_path(digest);
+        if let Err(err) = std::fs::remove_file(marker_path) {
+            if err.kind() != std::io::ErrorKind::NotFound {
+                return Err(err.into());
+            } else {
+                return Ok(false);
+            }
+        }
+        Ok(true)
+    }
+
+    /// Helper to generate marker file path for expected chunks
+    fn chunk_expected_marker_path(&self, digest: &[u8; 32]) -> PathBuf {
+        let (mut path, _digest_str) = self.chunk_path(digest);
+        path.set_extension("using");
+        path
+    }
+
     /// Removes a chunk marker file from the `LocalDatastoreLruCache`s chunk store.
     ///
     /// Callers must hold the per-chunk file lock in order to avoid races with renaming of corrupt
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 18/21] GC: assure chunk exists on s3 store when creating missing chunk marker
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (16 preceding siblings ...)
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 17/21] chunk store: add helpers marking missing local chunk markers as expected Christian Ebner
@ 2025-11-14 13:18 ` Christian Ebner
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 19/21] datastore: document s3 backend specific locking restrictions Christian Ebner
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Christian Ebner @ 2025-11-14 13:18 UTC (permalink / raw)
  To: pbs-devel

Currently it is not assured that the chunk is still present on the
s3 object store before re-creating the chunk marker file. A stale
marker would however prevent the chunk from being re-inserted if it
is re-uploaded. Since checking the presence right away is expensive,
as it requires additional api requests, mark the chunk as expected
instead and delay the existence check to phase 2, which has to list
the chunks anyway.

Rely on the per-chunk file locks for consistency.
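
Condensed, the phase 2 side of this (see the diff below) resolves a
missing local marker as follows:

    let atime = match std::fs::metadata(&chunk_path) {
        Ok(stat) => stat.accessed()?,
        Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
            if chunk_store.clear_chunk_expected_mark(&digest)? {
                // marked in phase 1 and listed on s3: recreate the
                // local marker and treat the chunk as freshly used
                unsafe { chunk_store.replace_chunk_with_marker(&digest)? };
                SystemTime::now()
            } else {
                // not expected: epoch atime lets the sweep remove it
                SystemTime::UNIX_EPOCH
            }
        }
        Err(err) => return Err(err.into()),
    };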

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index a726b152b..c0dc0f75d 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1347,13 +1347,7 @@ impl DataStore {
                     if !self.inner.chunk_store.cond_touch_chunk(digest, false)? && !is_bad {
                         // Insert empty file as marker to tell GC phase2 that this is
                         // a chunk still in-use, so to keep in the S3 object store.
-                        std::fs::File::options()
-                            .write(true)
-                            .create_new(true)
-                            .open(&chunk_path)
-                            .with_context(|| {
-                                format!("failed to create marker for chunk {}", hex::encode(digest))
-                            })?;
+                        self.inner.chunk_store.mark_chunk_as_expected(digest)?;
                     }
                 } else {
                     let hex = hex::encode(digest);
@@ -1683,8 +1677,16 @@ impl DataStore {
                     let atime = match std::fs::metadata(&chunk_path) {
                         Ok(stat) => stat.accessed()?,
                         Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
-                            // File not found, delete by setting atime to unix epoch
-                            SystemTime::UNIX_EPOCH
+                            if self.inner.chunk_store.clear_chunk_expected_mark(&digest)? {
+                                unsafe {
+                                    // chunk store lock held
+                                    self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
+                                }
+                                SystemTime::now()
+                            } else {
+                                // File not found, delete by setting atime to unix epoch
+                                SystemTime::UNIX_EPOCH
+                            }
                         }
                         Err(err) => return Err(err.into()),
                     };
@@ -1980,7 +1982,8 @@ impl DataStore {
         )
         .map_err(|err| format_err!("failed to upload chunk to s3 backend - {err:#}"))?;
         tracing::info!("Caching of chunk {}", hex::encode(digest));
-        self.cache_insert(&digest, &chunk)?;
+        self.cache_insert(digest, chunk)?;
+        self.inner.chunk_store.clear_chunk_expected_mark(digest)?;
         Ok((is_duplicate, chunk_size))
     }
 
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 19/21] datastore: document s3 backend specific locking restrictions
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (17 preceding siblings ...)
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 18/21] GC: assure chunk exists on s3 store when creating missing chunk marker Christian Ebner
@ 2025-11-14 13:18 ` Christian Ebner
  2025-11-14 13:19 ` [pbs-devel] [PATCH proxmox-backup v6 20/21] GC: fix: don't drop bad extension for S3 object to chunk path helper Christian Ebner
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Christian Ebner @ 2025-11-14 13:18 UTC (permalink / raw)
  To: pbs-devel

The requirements are stricter here: not only must holding
std::sync::Mutex guards across async contexts be avoided, but
consistency must also be maintained between the s3 object store, the
local datastore cache and the in-memory LRU cache.
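
A minimal sketch of that order (the wrapper function and the mutex()
accessor are illustrative, not the exact API):

    fn chunk_state_change(store: &ChunkStore, digest: &[u8; 32]) -> Result<(), Error> {
        // 1. per-chunk file lock: the only lock that may be held
        //    across await points, e.g. while uploading the chunk to s3
        let _chunk_guard = store.lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
        // 2. chunk store mutex: short, synchronous critical section
        {
            let _store_guard = store.mutex().lock().unwrap();
            // ... marker file insert/rename ...
        }
        // 3. LRU cache mutex last, also never held across an .await
        // cache.update(digest, ...)?;
        Ok(())
    }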

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/lib.rs | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/pbs-datastore/src/lib.rs b/pbs-datastore/src/lib.rs
index 849078a8f..1f7c54ae8 100644
--- a/pbs-datastore/src/lib.rs
+++ b/pbs-datastore/src/lib.rs
@@ -81,6 +81,19 @@
 //! because running these operations concurrently is treated as a feature
 //! on its own.
 //!
+//! For datastores with S3 backend there are further restrictions since
+//! there are 3 types of locking mechanisms involved:
+//! - per-chunk file lock
+//! - chunk store mutex lock
+//! - lru cache mutex lock
+//!
+//! Locks must always be acquired in this specific order to avoid deadlocks.
+//! The per-chunk file lock is used to avoid holding a mutex lock during calls
+//! into async contexts, which can deadlock otherwise. It must be held for the
+//! whole time from starting an operation on the chunk until it is persisted
+//! to s3 backend, local datastore cache and in-memory LRU cache where
+//! required.
+//!
 //! ## Inter-process Locking
 //!
 //! We need to be able to restart the proxmox-backup service daemons, so
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 20/21] GC: fix: don't drop bad extension for S3 object to chunk path helper
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (18 preceding siblings ...)
  2025-11-14 13:18 ` [pbs-devel] [PATCH proxmox-backup v6 19/21] datastore: document s3 backend specific locking restrictions Christian Ebner
@ 2025-11-14 13:19 ` Christian Ebner
  2025-11-14 13:19 ` [pbs-devel] [PATCH proxmox-backup v6 21/21] GC: clean up bad chunks from the filesystem only Christian Ebner
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: Christian Ebner @ 2025-11-14 13:19 UTC (permalink / raw)
  To: pbs-devel

The current implementation returns a path and digest also for S3
object keys which refer to bad chunks, but drops the bad extension.

Since this causes issues for phase 2 of garbage collection on S3
backends, include the extension in the returned path and add a flag
signaling whether this is a bad chunk or not.
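
A standalone sketch of the file name validation the helper performs
(simplified; the real helper additionally builds the absolute chunk
path and decodes the digest):

    // returns the 64-char hex digest part and whether the key names
    // a bad chunk (e.g. "<digest>.0.bad"); anything else is no chunk
    fn parse_chunk_file_name(name: &str) -> Option<(&str, bool)> {
        let bad_ext_len = ".0.bad".len();
        let bad_chunk = if name.len() == 64 + bad_ext_len {
            true
        } else if name.len() == 64 {
            false
        } else {
            return None;
        };
        if !name.as_bytes().iter().take(64).all(u8::is_ascii_hexdigit) {
            return None;
        }
        Some((&name[..64], bad_chunk))
    }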

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 37 +++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index c0dc0f75d..b91e90638 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1660,10 +1660,11 @@ impl DataStore {
             let mut delete_list = Vec::with_capacity(1000);
             loop {
                 for content in list_bucket_result.contents {
-                    let (chunk_path, digest) = match self.chunk_path_from_object_key(&content.key) {
-                        Some(path) => path,
-                        None => continue,
-                    };
+                    let (chunk_path, digest, bad) =
+                        match self.chunk_path_from_object_key(&content.key) {
+                            Some(path) => path,
+                            None => continue,
+                        };
 
                     let timeout = std::time::Duration::from_secs(0);
                     let _chunk_guard = match self.inner.chunk_store.lock_chunk(&digest, timeout) {
@@ -1692,11 +1693,6 @@ impl DataStore {
                     };
                     let atime = atime.duration_since(SystemTime::UNIX_EPOCH)?.as_secs() as i64;
 
-                    let bad = chunk_path
-                        .as_path()
-                        .extension()
-                        .is_some_and(|ext| ext == "bad");
-
                     unsafe {
                         self.inner.chunk_store.cond_sweep_chunk(
                             atime,
@@ -1852,14 +1848,23 @@ impl DataStore {
     }
 
     // Check and generate a chunk path from given object key
-    fn chunk_path_from_object_key(&self, object_key: &S3ObjectKey) -> Option<(PathBuf, [u8; 32])> {
+    fn chunk_path_from_object_key(
+        &self,
+        object_key: &S3ObjectKey,
+    ) -> Option<(PathBuf, [u8; 32], bool)> {
         // Check object is actually a chunk
+        let path = Path::new::<str>(object_key);
         // file_name() should always be Some, as objects will have a filename
-        let digest = Path::new::<str>(object_key).file_name()?;
+        let digest = path.file_name()?;
         let bytes = digest.as_bytes();
-        if bytes.len() != 64 && bytes.len() != 64 + ".0.bad".len() {
+        let bad_ext_len = ".0.bad".len();
+        let bad_chunk = if bytes.len() == 64 + bad_ext_len {
+            true
+        } else if bytes.len() == 64 {
+            false
+        } else {
             return None;
-        }
+        };
         if !bytes.iter().take(64).all(u8::is_ascii_hexdigit) {
             return None;
         }
@@ -1871,13 +1876,17 @@ impl DataStore {
         chunk_path.push(".chunks");
         chunk_path.push(hexdigit_prefix);
         chunk_path.push(digest);
+        if bad_chunk {
+            let extension = unsafe { digest_str.get_unchecked(64..64 + bad_ext_len) };
+            chunk_path.push(extension);
+        }
 
         let mut digest_bytes = [0u8; 32];
         let digest = digest.as_bytes();
         // safe to unwrap as already checked above
         hex::decode_to_slice(&digest[..64], &mut digest_bytes).unwrap();
 
-        Some((chunk_path, digest_bytes))
+        Some((chunk_path, digest_bytes, bad_chunk))
     }
 
     pub fn try_shared_chunk_store_lock(&self) -> Result<ProcessLockSharedGuard, Error> {
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v6 21/21] GC: clean up bad chunks from the filesystem only
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (19 preceding siblings ...)
  2025-11-14 13:19 ` [pbs-devel] [PATCH proxmox-backup v6 20/21] GC: fix: don't drop bad extension for S3 object to chunk path helper Christian Ebner
@ 2025-11-14 13:19 ` Christian Ebner
  2025-11-14 13:34 ` [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
  2025-11-14 22:14 ` Thomas Lamprecht
  22 siblings, 0 replies; 24+ messages in thread
From: Christian Ebner @ 2025-11-14 13:19 UTC (permalink / raw)
  To: pbs-devel

Bad chunks are never present in the in-memory LRU cache, so clean up
only the bad chunk file from the filesystem.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index b91e90638..fa8212760 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1703,7 +1703,11 @@ impl DataStore {
                             &mut gc_status,
                             || {
                                 if let Some(cache) = self.cache() {
-                                    cache.remove(&digest)?;
+                                    if !bad {
+                                        cache.remove(&digest)?;
+                                    } else {
+                                        std::fs::remove_file(chunk_path)?;
+                                    }
                                 }
                                 delete_list.push((content.key, _chunk_guard));
                                 Ok(())
-- 
2.47.3




* Re: [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (20 preceding siblings ...)
  2025-11-14 13:19 ` [pbs-devel] [PATCH proxmox-backup v6 21/21] GC: clean up bad chunks from the filesystem only Christian Ebner
@ 2025-11-14 13:34 ` Fabian Grünbichler
  2025-11-14 22:14 ` Thomas Lamprecht
  22 siblings, 0 replies; 24+ messages in thread
From: Fabian Grünbichler @ 2025-11-14 13:34 UTC (permalink / raw)
  To: Christian Ebner, pbs-devel

Quoting Christian Ebner (2025-11-14 14:18:40)
> These patches fix possible race conditions on datastores with s3 backend for
> chunk insert, renaming of corrupt chunks during verification and cleanup during
> garbage collection. Further, the patches assure consistency between the chunk
> marker file of the local datastore cache, the s3 object store and the in-memory
> LRU cache during state changes occurring by one of the above mentioned operations.
> 
> Consistency is achieved by using a per-chunk file locking mechanism. File locks
> are stored on the predefined location for datastore file locks, using the same
> `.chunks/prefix/digest` folder layout for consistency and to keep readdir and
> other fs operations performant.
> 
> As part of the series it is now also assured that chunks which are removed from
> the local datastore cache, are also dropped from it's in-memory LRU cache and
> therefore a consistent state is achieved. Further, bad chunks are touched as
> well during GC phase 1 for s3 backed datastores and the creation of missing
> marker files performed conditionally, to avoid consistency issues.

Consider this series:

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

An additional set of eyes would still be good, given the scope of the changes!

I plan to follow up with some significant refactoring after the PBS release is
out the door, as there are two areas where the code is becoming a bit unwieldy:

- make the chunk store mutex a proper lock guard and implement the operations
  that require the lock to be held on top (see the sketch after this list)
- make more of the backend handling implicit
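
One possible shape of the lock guard idea, purely illustrative and
not code from this series: a guard type encodes "mutex is held" in
the type system, so the unsafe markers on the locked helpers can go
away.

    use std::sync::{Mutex, MutexGuard};

    struct ChunkStore {
        mutex: Mutex<()>,
        // ...
    }

    struct LockedChunkStore<'a> {
        store: &'a ChunkStore,
        _guard: MutexGuard<'a, ()>,
    }

    impl ChunkStore {
        fn lock(&self) -> LockedChunkStore<'_> {
            LockedChunkStore {
                store: self,
                _guard: self.mutex.lock().unwrap(),
            }
        }
    }

    impl LockedChunkStore<'_> {
        // formerly `unsafe fn replace_chunk_with_marker`: callers can
        // only obtain a LockedChunkStore with the mutex held
        fn replace_chunk_with_marker(&self, _digest: &[u8; 32]) -> Result<(), anyhow::Error> {
            // ... write the empty marker file ...
            Ok(())
        }
    }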



* Re: [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
  2025-11-14 13:18 [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (21 preceding siblings ...)
  2025-11-14 13:34 ` [pbs-devel] [PATCH proxmox-backup v6 00/21] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
@ 2025-11-14 22:14 ` Thomas Lamprecht
  22 siblings, 0 replies; 24+ messages in thread
From: Thomas Lamprecht @ 2025-11-14 22:14 UTC (permalink / raw)
  To: pbs-devel, Christian Ebner

On Fri, 14 Nov 2025 14:18:40 +0100, Christian Ebner wrote:
> These patches fix possible race conditions on datastores with s3 backend for
> chunk insert, renaming of corrupt chunks during verification and cleanup during
> garbage collection. Further, the patches assure consistency between the chunk
> marker file of the local datastore cache, the s3 object store and the in-memory
> LRU cache during state changes occurring by one of the above mentioned operations.
> 
> Consistency is achieved by using a per-chunk file locking mechanism. File locks
> are stored on the predefined location for datastore file locks, using the same
> `.chunks/prefix/digest` folder layout for consistency and to keep readdir and
> other fs operations performant.
> 
> [...]

Applied, thanks!

[01/21] datastore: GC: drop overly verbose info message during s3 chunk sweep
        commit: d326fc5c055fd2783cff5817aa9c3d1df3bc4092
[02/21] chunk store: implement per-chunk file locking helper for s3 backend
        commit: adcc39540a3ba070d4627f0785dd113427d8c92c
[03/21] datastore: acquire chunk store mutex lock when renaming corrupt chunk
        commit: e35597690a064be6efe78e5df5c196bc3c9ecd7b
[04/21] datastore: get per-chunk file lock for chunk rename on s3 backend
        commit: 5eabe8b9222a108daf335a6eba7fbd64d9329d6b
[05/21] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache
        commit: 34a395e4f95976f8d96253c6cdb0ccc11ba21604
[06/21] datastore: add locking to protect against races on chunk insert for s3
        commit: 59414b6c9c6e5958d0dc5c0356e754095637f8a3
[07/21] GC: fix race with chunk upload/insert on s3 backends
        commit: 86d5d073037132e6e9aeb01f8e6aaf311dadcd10
[08/21] chunk store: reduce exposure of clear_chunk() to crate only
        commit: 9bd3f5a65110e6bafe5921cb4a7fe24da726cc98
[09/21] chunk store: make chunk removal a helper method of the chunk store
        commit: 9312c9de136f3df8001b5b397a51ed69fa3ee818
[10/21] store: split insert_chunk into wrapper + unsafe locked implementation
        commit: 25db75fb4b23356b28717d2bc88b6826486c851a
[11/21] store: cache: move Mutex acquire to cache insertion
        commit: e4905d1cadfb17d85cd350d9815e4787c888eea8
[12/21] chunk store: rename cache-specific helpers
        commit: 76ab83194af389d954241eb4d10126e32236b714
[13/21] GC: cleanup chunk markers from cache in phase 3 on s3 backends
        commit: 77438efecb8ff3d5e51b04650597ba738d3419b6
[14/21] GC: touch bad chunk files independent of backend type
        commit: f46fd187003eb2da355ff8da19aa71cb8d22878f
[15/21] GC: guard missing marker file insertion for s3 backed stores
        commit: 014f22d51df63d3721fb0a96861bd08978bc8064
[16/21] GC: s3: track if a chunk marker file is missing since a bad chunk
        commit: 7548f8f84ff60e8dcb2787f296db6bc34554f8d7
[17/21] chunk store: add helpers marking missing local chunk markers as expected
        commit: abfad2f38cd3bf94c2df5f6b97f8b192417641c0
[18/21] GC: assure chunk exists on s3 store when creating missing chunk marker
        commit: 9510ef1af395b0e744a938155ff796be9ce67ff6
[19/21] datastore: document s3 backend specific locking restrictions
        commit: c1b003afd932cca0d35e319d413382268e112470
[20/21] GC: fix: don't drop bad extension for S3 object to chunk path helper
        commit: 27e3f5b7bb50cc8b419c9b51b560344469629da5
[21/21] GC: clean up bad chunks from the filesystem only
        commit: 36f62da10d1cb40a19e32b8a0d3d4bcb8ac322bf


