public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
@ 2025-11-10 11:56 Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 01/14] datastore: limit scope of snapshot/group destroy methods to crate Christian Ebner
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

These patches fix possible race conditions on datastores with s3 backend for
chunk insert, renaming of corrupt chunks during verification and cleanup during
garbage collection. Further, the patches ensure consistency between the chunk
marker file of the local datastore cache, the s3 object store and the in-memory
LRU cache during state changes caused by any of the above mentioned operations.

Consistency is achieved by using a per-chunk file locking mechanism. File locks
are stored in the predefined location for datastore file locks, using the same
`.chunks/prefix/digest` folder layout for consistency and to keep readdir and
other fs operations performant.
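
As a rough illustration of the lock path layout described above, the per-chunk
lock path can be sketched as follows. Note that the constant value, helper
names and the prefix length are assumptions for illustration, not the exact
code from the patches:

```rust
use std::path::{Path, PathBuf};

// Hypothetical base directory for datastore file locks.
const DATASTORE_LOCKS_DIR: &str = "/run/proxmox-backup/locks";

fn hex_encode(digest: &[u8]) -> String {
    digest.iter().map(|b| format!("{b:02x}")).collect()
}

// Mirror the `.chunks/prefix/digest` layout of the chunk store so that
// readdir and other fs operations on the lock directory stay performant.
fn chunk_lock_path(store_name: &str, digest: &[u8]) -> PathBuf {
    let digest_str = hex_encode(digest);
    // four hex digit fan-out prefix, assumed to match the chunk store layout
    let prefix = &digest_str[..4];
    Path::new(DATASTORE_LOCKS_DIR)
        .join(store_name)
        .join(".chunks")
        .join(prefix)
        .join(&digest_str)
}

fn main() {
    let digest = [0xab_u8; 32];
    println!("{}", chunk_lock_path("store1", &digest).display());
}
```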

Before introducing the file locking mechanism, the patches refactor pre-existing
code to move most of the backend related logic away from the api code to the
datastore implementation, in order to have a common interface especially for
chunk insert.

As part of the series it is now also ensured that chunks which are removed from
the local datastore cache are also dropped from its in-memory LRU cache,
thereby reaching a consistent state.
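
The consistency rule can be sketched in miniature as follows. The types and
names below are stand-ins, not the actual PBS implementation: eviction removes
the digest from the in-memory set and the marker set within one critical
section, so a concurrent insert can never observe a cached digest whose marker
is already gone:

```rust
use std::collections::HashSet;
use std::sync::Mutex;

struct State {
    markers: HashSet<String>, // stands in for chunk marker files on disk
    lru: HashSet<String>,     // stands in for the in-memory LRU digests
}

struct ChunkCacheSketch {
    state: Mutex<State>,
}

impl ChunkCacheSketch {
    fn insert(&self, digest: &str) {
        let mut state = self.state.lock().unwrap();
        state.markers.insert(digest.to_string());
        state.lru.insert(digest.to_string());
    }

    // Both removals happen under the same lock, keeping marker file and
    // in-memory cache consistent with each other.
    fn evict_corrupt(&self, digest: &str) {
        let mut state = self.state.lock().unwrap();
        state.lru.remove(digest);
        state.markers.remove(digest);
    }

    fn is_cached(&self, digest: &str) -> bool {
        let state = self.state.lock().unwrap();
        state.lru.contains(digest) || state.markers.contains(digest)
    }
}

fn main() {
    let cache = ChunkCacheSketch {
        state: Mutex::new(State { markers: HashSet::new(), lru: HashSet::new() }),
    };
    cache.insert("abcd");
    cache.evict_corrupt("abcd");
    println!("cached after evict: {}", cache.is_cached("abcd"));
}
```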

Changes since version 3:
- Add patches to limit visibility of BackupDir and BackupGroup destroy
- Refactored s3 upload index helper
- Avoid unneeded double stat for GC phase 3 cleanups

Changes since version 2:
- Incorporate additional race fix as discussed in
  https://lore.proxmox.com/pbs-devel/8ab74557-9592-43e7-8706-10fceaae31b7@proxmox.com/T/
  and suggested offlist.

Changes since version 1 (thanks @Fabian for review):
- Fix lock inversion for rename corrupt chunk.
- Inline the chunk lock helper, making it explicit and thereby avoiding calls
  to the helper for regular datastores.
- Pass the backend to the add_blob datastore helper, so it can be reused for the
  backup session and pull sync job.
- Move also the s3 index upload helper from the backup env to the datastore, and
  reuse it for the sync job as well.

This patch series obsoletes two previous patch series with unfortunately
incomplete bugfix attempts found at:
- https://lore.proxmox.com/pbs-devel/8d711a20-b193-47a9-8f38-6ce800e6d0e8@proxmox.com/T/
- https://lore.proxmox.com/pbs-devel/20251015164008.975591-1-c.ebner@proxmox.com/T/

proxmox-backup:

Christian Ebner (14):
  datastore: limit scope of snapshot/group destroy methods to crate
  api/datastore: move s3 index upload helper to datastore backend
  chunk store: implement per-chunk file locking helper for s3 backend
  datastore: acquire chunk store mutex lock when renaming corrupt chunk
  datastore: get per-chunk file lock for chunk rename on s3 backend
  fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU
    cache
  datastore: add locking to protect against races on chunk insert for s3
  GC: fix race with chunk upload/insert on s3 backends
  GC: cleanup chunk markers from cache in phase 3 on s3 backends
  datastore: GC: drop overly verbose info message during s3 chunk sweep
  chunk store: reduce exposure of clear_chunk() to crate only
  chunk store: make chunk removal a helper method of the chunk store
  GC: fix deadlock for cache eviction and garbage collection
  chunk store: never fail when trying to remove missing chunk file

 pbs-datastore/src/backup_info.rs              |  9 +-
 pbs-datastore/src/chunk_store.rs              | 67 ++++++++++++-
 pbs-datastore/src/datastore.rs                | 97 ++++++++++++++++---
 .../src/local_datastore_lru_cache.rs          |  6 +-
 src/api2/backup/environment.rs                | 36 +++----
 src/server/pull.rs                            | 16 +--
 6 files changed, 172 insertions(+), 59 deletions(-)


Summary over all repositories:
  6 files changed, 172 insertions(+), 59 deletions(-)

-- 
Generated by git-murpp 0.8.1


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v4 01/14] datastore: limit scope of snapshot/group destroy methods to crate
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 02/14] api/datastore: move s3 index upload helper to datastore backend Christian Ebner
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Commit 800e960c5 ("api/datastore: move snapshot deletion into
dedicated datastore helper") moved the snapshot destroy into the
datastore helper, and the group destroy was already only used within
the crate. By scoping both methods to the crate they are better
encapsulated, limiting potential future external use.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/backup_info.rs | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 4b10b6435..29ca6f901 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -221,7 +221,10 @@ impl BackupGroup {
     ///
     /// Returns `BackupGroupDeleteStats`, containing the number of deleted snapshots
     /// and number of protected snaphsots, which therefore were not removed.
-    pub fn destroy(&self, backend: &DatastoreBackend) -> Result<BackupGroupDeleteStats, Error> {
+    pub(crate) fn destroy(
+        &self,
+        backend: &DatastoreBackend,
+    ) -> Result<BackupGroupDeleteStats, Error> {
         let _guard = self
             .lock()
             .with_context(|| format!("while destroying group '{self:?}'"))?;
@@ -611,7 +614,7 @@ impl BackupDir {
     /// Destroy the whole snapshot, bails if it's protected
     ///
     /// Setting `force` to true skips locking and thus ignores if the backup is currently in use.
-    pub fn destroy(&self, force: bool, backend: &DatastoreBackend) -> Result<(), Error> {
+    pub(crate) fn destroy(&self, force: bool, backend: &DatastoreBackend) -> Result<(), Error> {
         let (_guard, _manifest_guard);
         if !force {
             _guard = self
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 02/14] api/datastore: move s3 index upload helper to datastore backend
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 01/14] datastore: limit scope of snapshot/group destroy methods to crate Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 03/14] chunk store: implement per-chunk file locking helper for s3 backend Christian Ebner
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Move the s3 index upload helper to the datastore backend in an effort
to decouple the api implementation from the backend implementation
and to deduplicate code. Return a boolean flag to distinguish between
successful uploads and no action required (filesystem backends only).

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 30 ++++++++++++++++++++++++++++
 src/api2/backup/environment.rs | 36 ++++++++++++----------------------
 src/server/pull.rs             | 16 ++++-----------
 3 files changed, 47 insertions(+), 35 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 70af94d8f..d66e68332 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -224,6 +224,36 @@ pub enum DatastoreBackend {
     S3(Arc<S3Client>),
 }
 
+impl DatastoreBackend {
+    /// Reads the index file and uploads it to the backend.
+    ///
+    /// Returns with true if the backend was updated, false if no action needed to be performed
+    pub async fn upload_index_to_backend(
+        &self,
+        backup_dir: &BackupDir,
+        name: &str,
+    ) -> Result<bool, Error> {
+        match self {
+            Self::Filesystem => Ok(false),
+            Self::S3(s3_client) => {
+                let object_key = crate::s3::object_key_from_path(&backup_dir.relative_path(), name)
+                    .context("invalid index file object key")?;
+
+                let mut full_path = backup_dir.full_path();
+                full_path.push(name);
+                let data = tokio::fs::read(&full_path)
+                    .await
+                    .context("failed to read index contents")?;
+                let contents = hyper::body::Bytes::from(data);
+                let _is_duplicate = s3_client
+                    .upload_replace_with_retry(object_key, contents)
+                    .await?;
+                Ok(true)
+            }
+        }
+    }
+}
+
 impl DataStore {
     // This one just panics on everything
     #[doc(hidden)]
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index 0faf6c8e0..1b8e0e1db 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -18,7 +18,6 @@ use pbs_datastore::dynamic_index::DynamicIndexWriter;
 use pbs_datastore::fixed_index::FixedIndexWriter;
 use pbs_datastore::{DataBlob, DataStore, DatastoreBackend};
 use proxmox_rest_server::{formatter::*, WorkerTask};
-use proxmox_s3_client::S3Client;
 
 use crate::backup::VerifyWorker;
 
@@ -560,11 +559,14 @@ impl BackupEnvironment {
         drop(state);
 
         // For S3 backends, upload the index file to the object store after closing
-        if let DatastoreBackend::S3(s3_client) = &self.backend {
-            self.s3_upload_index(s3_client, &writer_name)
-                .context("failed to upload dynamic index to s3 backend")?;
+        if proxmox_async::runtime::block_on(
+            self.backend
+                .upload_index_to_backend(&self.backup_dir, &writer_name),
+        )
+        .context("failed to upload dynamic index to backend")?
+        {
             self.log(format!(
-                "Uploaded dynamic index file to s3 backend: {writer_name}"
+                "Uploaded dynamic index file to backend: {writer_name}"
             ))
         }
 
@@ -659,9 +661,12 @@ impl BackupEnvironment {
         drop(state);
 
         // For S3 backends, upload the index file to the object store after closing
-        if let DatastoreBackend::S3(s3_client) = &self.backend {
-            self.s3_upload_index(s3_client, &writer_name)
-                .context("failed to upload fixed index to s3 backend")?;
+        if proxmox_async::runtime::block_on(
+            self.backend
+                .upload_index_to_backend(&self.backup_dir, &writer_name),
+        )
+        .context("failed to upload fixed index to backend")?
+        {
             self.log(format!(
                 "Uploaded fixed index file to object store: {writer_name}"
             ))
@@ -842,21 +847,6 @@ impl BackupEnvironment {
         let state = self.state.lock().unwrap();
         state.finished == BackupState::Finished
     }
-
-    fn s3_upload_index(&self, s3_client: &S3Client, name: &str) -> Result<(), Error> {
-        let object_key =
-            pbs_datastore::s3::object_key_from_path(&self.backup_dir.relative_path(), name)
-                .context("invalid index file object key")?;
-
-        let mut full_path = self.backup_dir.full_path();
-        full_path.push(name);
-        let data = std::fs::read(&full_path).context("failed to read index contents")?;
-        let contents = hyper::body::Bytes::from(data);
-        proxmox_async::runtime::block_on(
-            s3_client.upload_replace_with_retry(object_key, contents),
-        )?;
-        Ok(())
-    }
 }
 
 impl RpcEnvironment for BackupEnvironment {
diff --git a/src/server/pull.rs b/src/server/pull.rs
index 92513fe70..ba79704cd 100644
--- a/src/server/pull.rs
+++ b/src/server/pull.rs
@@ -342,19 +342,11 @@ async fn pull_single_archive<'a>(
     if let Err(err) = std::fs::rename(&tmp_path, &path) {
         bail!("Atomic rename file {:?} failed - {}", path, err);
     }
-    if let DatastoreBackend::S3(s3_client) = backend {
-        let object_key =
-            pbs_datastore::s3::object_key_from_path(&snapshot.relative_path(), archive_name)
-                .context("invalid archive object key")?;
 
-        let data = tokio::fs::read(&path)
-            .await
-            .context("failed to read archive contents")?;
-        let contents = hyper::body::Bytes::from(data);
-        let _is_duplicate = s3_client
-            .upload_replace_with_retry(object_key, contents)
-            .await?;
-    }
+    backend
+        .upload_index_to_backend(snapshot, archive_name)
+        .await?;
+
     Ok(sync_stats)
 }
 
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 03/14] chunk store: implement per-chunk file locking helper for s3 backend
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 01/14] datastore: limit scope of snapshot/group destroy methods to crate Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 02/14] api/datastore: move s3 index upload helper to datastore backend Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 04/14] datastore: acquire chunk store mutex lock when renaming corrupt chunk Christian Ebner
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Adds chunk store helper methods to create per-chunk file locks. These
will be used to guard chunk operations on s3 backends to guarantee
exclusive access when performing cache and backend operations.
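
The real helper builds on `pbs_config::open_backup_lockfile` (flock with a
timeout); as a rough approximation of the same interface, the following
self-contained sketch models an exclusive, timeout-bounded per-chunk lock with
a create-new lock file and a retry loop (all names are hypothetical):

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

// Guard that releases the lock by removing the lock file on drop.
struct ChunkLockGuard {
    path: PathBuf,
}

impl Drop for ChunkLockGuard {
    fn drop(&mut self) {
        let _ = std::fs::remove_file(&self.path);
    }
}

// Try to take an exclusive per-chunk lock, retrying until the timeout expires.
fn lock_chunk(dir: &Path, digest_hex: &str, timeout: Duration) -> io::Result<ChunkLockGuard> {
    let path = dir.join(format!("{digest_hex}.lock"));
    let start = Instant::now();
    loop {
        match OpenOptions::new().write(true).create_new(true).open(&path) {
            Ok(_) => return Ok(ChunkLockGuard { path }),
            Err(err) if err.kind() == io::ErrorKind::AlreadyExists => {
                if start.elapsed() >= timeout {
                    return Err(io::Error::new(io::ErrorKind::TimedOut, "chunk lock timeout"));
                }
                std::thread::sleep(Duration::from_millis(10));
            }
            Err(err) => return Err(err),
        }
    }
}

fn main() {
    let dir = std::env::temp_dir().join(format!("chunk-lock-sketch-{}", std::process::id()));
    std::fs::create_dir_all(&dir).unwrap();
    let guard = lock_chunk(&dir, "abcd", Duration::from_secs(1)).unwrap();
    // a second exclusive attempt with no timeout budget fails immediately
    println!("relock: {:?}", lock_chunk(&dir, "abcd", Duration::from_secs(0)).is_err());
    drop(guard); // releases the lock file
}
```

Unlike flock, a create-new lock file is not released automatically if the
process dies, which is one reason the actual patches reuse the existing
backup lockfile helper instead.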

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/backup_info.rs |  2 +-
 pbs-datastore/src/chunk_store.rs | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/pbs-datastore/src/backup_info.rs b/pbs-datastore/src/backup_info.rs
index 29ca6f901..859039cf4 100644
--- a/pbs-datastore/src/backup_info.rs
+++ b/pbs-datastore/src/backup_info.rs
@@ -939,7 +939,7 @@ fn lock_file_path_helper(ns: &BackupNamespace, path: PathBuf) -> PathBuf {
 /// deletion.
 ///
 /// It also creates the base directory for lock files.
-fn lock_helper<F>(
+pub(crate) fn lock_helper<F>(
     store_name: &str,
     path: &std::path::Path,
     lock_fn: F,
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index ba7618e40..49687b2fa 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -8,6 +8,7 @@ use anyhow::{bail, format_err, Context, Error};
 use tracing::{info, warn};
 
 use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
+use pbs_config::BackupLockGuard;
 use proxmox_io::ReadExt;
 use proxmox_s3_client::S3Client;
 use proxmox_sys::fs::{create_dir, create_path, file_type_from_file_stat, CreateOptions};
@@ -16,6 +17,7 @@ use proxmox_sys::process_locker::{
 };
 use proxmox_worker_task::WorkerTaskContext;
 
+use crate::backup_info::DATASTORE_LOCKS_DIR;
 use crate::data_blob::DataChunkBuilder;
 use crate::file_formats::{
     COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
@@ -759,6 +761,30 @@ impl ChunkStore {
         ChunkStore::check_permissions(lockfile_path, 0o644)?;
         Ok(())
     }
+
+    /// Generates the path to the chunks lock file
+    pub(crate) fn chunk_lock_path(&self, digest: &[u8]) -> PathBuf {
+        let mut lock_path = Path::new(DATASTORE_LOCKS_DIR).join(self.name.clone());
+        let digest_str = hex::encode(digest);
+        lock_path.push(".chunks");
+        let prefix = digest_to_prefix(digest);
+        lock_path.push(&prefix);
+        lock_path.push(&digest_str);
+        lock_path
+    }
+
+    /// Get an exclusive lock on the chunks lock file
+    pub(crate) fn lock_chunk(
+        &self,
+        digest: &[u8],
+        timeout: Duration,
+    ) -> Result<BackupLockGuard, Error> {
+        let lock_path = self.chunk_lock_path(digest);
+        let guard = crate::backup_info::lock_helper(self.name(), &lock_path, |path| {
+            pbs_config::open_backup_lockfile(path, Some(timeout), true)
+        })?;
+        Ok(guard)
+    }
 }
 
 #[test]
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 04/14] datastore: acquire chunk store mutex lock when renaming corrupt chunk
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (2 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 03/14] chunk store: implement per-chunk file locking helper for s3 backend Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 05/14] datastore: get per-chunk file lock for chunk rename on s3 backend Christian Ebner
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

When renaming a corrupt chunk in the chunk store, currently the chunk
store mutex lock is not held, leading to possible races with other
operations which hold the lock and therefore assume exclusive access
such as garbage collection or backup chunk inserts. This affects
both filesystem and S3 backends.

To fix the possible race, acquire the chunk store mutex and
rearrange the code such that the lock is never held when entering
async code for the s3 backend.
This does not yet solve the race and cache consistency for s3
backends, which is addressed by introducing per-chunk file locking
in subsequent patches.
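
The reordering described above follows a general pattern, sketched here with
stand-in names: do the synchronous work and record its result while holding
the mutex, then drop the guard before any blocking backend I/O, since a
std::sync::Mutex guard held across block_on can stall every other thread
waiting on the same lock:

```rust
use std::sync::Mutex;

// The mutex models the chunk store mutex; the closure models the (async)
// s3 backend work that must run without the mutex held.
fn rename_then_backend_work(store_mutex: &Mutex<()>, backend_work: impl FnOnce()) -> bool {
    let guard = store_mutex.lock().unwrap();
    // synchronous filesystem rename would happen here, under the mutex
    let renamed = true;
    drop(guard); // release before the potentially slow backend call
    backend_work(); // s3 copy/delete, executed without the mutex held
    renamed
}

fn main() {
    let m = Mutex::new(());
    // the backend closure can itself take the mutex without deadlocking,
    // because the guard was dropped before it runs
    let ok = rename_then_backend_work(&m, || {
        let _relock = m.lock().unwrap();
    });
    println!("renamed: {ok}");
}
```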

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index d66e68332..6f823de39 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -2606,6 +2606,8 @@ impl DataStore {
     pub fn rename_corrupt_chunk(&self, digest: &[u8; 32]) -> Result<Option<PathBuf>, Error> {
         let (path, digest_str) = self.chunk_path(digest);
 
+        let _lock = self.inner.chunk_store.mutex().lock().unwrap();
+
         let mut counter = 0;
         let mut new_path = path.clone();
         loop {
@@ -2617,6 +2619,14 @@ impl DataStore {
             }
         }
 
+        let result = match std::fs::rename(&path, &new_path) {
+            Ok(_) => Ok(Some(new_path)),
+            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
+            Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
+        };
+
+        drop(_lock);
+
         let backend = self.backend().map_err(|err| {
             format_err!(
                 "failed to get backend while trying to rename bad chunk: {digest_str} - {err}"
@@ -2647,10 +2657,6 @@ impl DataStore {
             )?;
         }
 
-        match std::fs::rename(&path, &new_path) {
-            Ok(_) => Ok(Some(new_path)),
-            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
-            Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
-        }
+        result
     }
 }
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 05/14] datastore: get per-chunk file lock for chunk rename on s3 backend
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (3 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 04/14] datastore: acquire chunk store mutex lock when renaming corrupt chunk Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 06/14] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache Christian Ebner
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

To guarantee exclusive access during s3 object store and cache
operations, acquire the per-chunk file lock before renaming a chunk
when the datastore is backed by s3.

This does not yet cover locking for GC and chunk insert, which is
part of subsequent changes.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 6f823de39..7e8213136 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -56,6 +56,8 @@ pub const GROUP_OWNER_FILE_NAME: &str = "owner";
 /// Filename for in-use marker stored on S3 object store backend
 pub const S3_DATASTORE_IN_USE_MARKER: &str = ".in-use";
 const NAMESPACE_MARKER_FILENAME: &str = ".namespace";
+// s3 put request times out after upload_size / 1 Kib/s, so about 2.3 hours for 8 MiB
+const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
 
 /// checks if auth_id is owner, or, if owner is a token, if
 /// auth_id is the user of the token
@@ -2606,6 +2608,13 @@ impl DataStore {
     pub fn rename_corrupt_chunk(&self, digest: &[u8; 32]) -> Result<Option<PathBuf>, Error> {
         let (path, digest_str) = self.chunk_path(digest);
 
+        let _chunk_guard;
+        if let DatastoreBackendType::S3 = self.inner.backend_config.ty.unwrap_or_default() {
+            _chunk_guard = self
+                .inner
+                .chunk_store
+                .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
+        }
         let _lock = self.inner.chunk_store.mutex().lock().unwrap();
 
         let mut counter = 0;
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 06/14] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (4 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 05/14] datastore: get per-chunk file lock for chunk rename on s3 backend Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 07/14] datastore: add locking to protect against races on chunk insert for s3 Christian Ebner
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Chunks detected as corrupt have been renamed on both the S3 backend
and the local datastore cache, but not evicted from the in-memory
cache containing the LRU chunk digests. This can lead to the chunks
being considered as already present if their digest is still cached,
and therefore not being re-inserted into the local store cache and S3
backend on backup upload.

Fix this by not only renaming the local datastore's chunk marker
file, but also removing it from the in-memory cache while holding the
chunk store mutex lock for filesystem backed datastores, plus the
additional per-chunk file lock on the s3 backend, to exclude
interference from concurrent chunk inserts.

Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6961
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 7e8213136..3beba940c 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -2628,6 +2628,11 @@ impl DataStore {
             }
         }
 
+        if let Some(cache) = self.cache() {
+            // Locks are being held, so it is safe to call the method
+            let _ = unsafe { cache.remove(digest) };
+        }
+
         let result = match std::fs::rename(&path, &new_path) {
             Ok(_) => Ok(Some(new_path)),
             Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 07/14] datastore: add locking to protect against races on chunk insert for s3
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (5 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 06/14] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 08/14] GC: fix race with chunk upload/insert on s3 backends Christian Ebner
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Acquire the per-chunk file lock to get exclusive access to the chunk
on insert and make it possible to guarantee that s3 backend
operations and the local caches reflect the consistent state without
other operations such as verify chunk renaming or garbage collection
interfering.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 3beba940c..10acc91a0 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1923,6 +1923,10 @@ impl DataStore {
         match backend {
             DatastoreBackend::Filesystem => self.inner.chunk_store.insert_chunk(chunk, digest),
             DatastoreBackend::S3(s3_client) => {
+                let _chunk_guard = self
+                    .inner
+                    .chunk_store
+                    .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
                 let chunk_data: Bytes = chunk.raw_data().to_vec().into();
                 let chunk_size = chunk_data.len() as u64;
                 let object_key = crate::s3::object_key_from_digest(digest)?;
@@ -1943,6 +1947,10 @@ impl DataStore {
     ) -> Result<(bool, u64), Error> {
         let chunk_data = chunk.raw_data();
         let chunk_size = chunk_data.len() as u64;
+        let _chunk_guard = self
+            .inner
+            .chunk_store
+            .lock_chunk(digest, CHUNK_LOCK_TIMEOUT)?;
 
         // Avoid re-upload to S3 if the chunk is either present in the in-memory LRU cache
         // or the chunk marker file exists on filesystem. The latter means the chunk has
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 08/14] GC: fix race with chunk upload/insert on s3 backends
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (6 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 07/14] datastore: add locking to protect against races on chunk insert for s3 Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 09/14] GC: cleanup chunk markers from cache in phase 3 " Christian Ebner
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

The previous approach, relying solely on local marker files, was
flawed as it could not protect against all possible races between
chunk upload to the s3 object store, insertion into the local
datastore cache and the in-memory LRU cache.

Since these operations are now protected by getting a per-chunk file
lock, use that to check if it is safe to remove the chunk and do so
in a consistent manner by holding the chunk lock guard until it is
actually removed from the s3 backend and the caches.

Since an error when removing the chunk from cache could lead to
inconsistencies, GC must now fail in that case.

The chunk store lock is now only required on cache remove.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 10acc91a0..71a8b1b60 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1671,14 +1671,18 @@ impl DataStore {
 
             let mut delete_list = Vec::with_capacity(1000);
             loop {
-                let lock = self.inner.chunk_store.mutex().lock().unwrap();
-
                 for content in list_bucket_result.contents {
                     let (chunk_path, digest) = match self.chunk_path_from_object_key(&content.key) {
                         Some(path) => path,
                         None => continue,
                     };
 
+                    let timeout = std::time::Duration::from_secs(0);
+                    let _chunk_guard = match self.inner.chunk_store.lock_chunk(&digest, timeout) {
+                        Ok(guard) => guard,
+                        Err(_) => continue,
+                    };
+
                     // Check local markers (created or atime updated during phase1) and
                     // keep or delete chunk based on that.
                     let atime = match std::fs::metadata(&chunk_path) {
@@ -1707,10 +1711,10 @@ impl DataStore {
                             &mut gc_status,
                             || {
                                 if let Some(cache) = self.cache() {
-                                    // ignore errors, phase 3 will retry cleanup anyways
-                                    let _ = cache.remove(&digest);
+                                    let _guard = self.inner.chunk_store.mutex().lock().unwrap();
+                                    cache.remove(&digest)?;
                                 }
-                                delete_list.push(content.key);
+                                delete_list.push((content.key, _chunk_guard));
                                 Ok(())
                             },
                         )?;
@@ -1719,14 +1723,19 @@ impl DataStore {
                     chunk_count += 1;
                 }
 
-                drop(lock);
-
                 if !delete_list.is_empty() {
-                    let delete_objects_result =
-                        proxmox_async::runtime::block_on(s3_client.delete_objects(&delete_list))?;
+                    let delete_objects_result = proxmox_async::runtime::block_on(
+                        s3_client.delete_objects(
+                            &delete_list
+                                .iter()
+                                .map(|(key, _)| key.clone())
+                                .collect::<Vec<S3ObjectKey>>(),
+                        ),
+                    )?;
                     if let Some(_err) = delete_objects_result.error {
                         bail!("failed to delete some objects");
                     }
+                    // release all chunk guards
                     delete_list.clear();
                 }
 
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v4 09/14] GC: cleanup chunk markers from cache in phase 3 on s3 backends
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (7 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 08/14] GC: fix race with chunk upload/insert on s3 backends Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 10/14] datastore: GC: drop overly verbose info message during s3 chunk sweep Christian Ebner
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Pass along the in-memory cache when sweeping unused chunks in phase 3
of garbage collection for datastores with s3 backend.
When a dangling marker file is detected, which only happens if the
chunk was removed from the object store by some unexpected
interaction (e.g. manually deleted from the bucket), the marker must
be removed to reach a consistent state (snapshots referencing the
chunk remain corrupt, however).

Clear such a chunk from both the in-memory and the local datastore
cache, so it can be re-uploaded by future backup or sync jobs.
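Phase 3 recovers the 32-byte digest from the chunk file name (64 hex characters) via `<[u8; 32]>::from_hex`. A std-only sketch of that decoding step; `digest_from_name` is a hypothetical helper that avoids the `hex` crate:

```rust
// Turn a chunk file name (64 hex characters) back into the 32-byte
// digest, mirroring what `<[u8; 32]>::from_hex` does in the patch.
fn digest_from_name(name: &str) -> Option<[u8; 32]> {
    if name.len() != 64 {
        return None; // not a chunk digest file name
    }
    let mut digest = [0u8; 32];
    for (i, pair) in name.as_bytes().chunks(2).enumerate() {
        let s = std::str::from_utf8(pair).ok()?;
        digest[i] = u8::from_str_radix(s, 16).ok()?;
    }
    Some(digest)
}

fn main() {
    let name = "00".repeat(31) + "ff";
    let digest = digest_from_name(&name).unwrap();
    assert_eq!(digest[31], 0xff);
    // non-digest directory entries are rejected, not treated as chunks
    assert_eq!(digest_from_name("not-a-digest"), None);
}
```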

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 21 ++++++++++++++++++++-
 pbs-datastore/src/datastore.rs   |  2 ++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 49687b2fa..917c5a877 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
 use std::time::Duration;
 
 use anyhow::{bail, format_err, Context, Error};
+use hex::FromHex;
 use tracing::{info, warn};
 
 use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
@@ -22,7 +23,7 @@ use crate::data_blob::DataChunkBuilder;
 use crate::file_formats::{
     COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
 };
-use crate::DataBlob;
+use crate::{DataBlob, LocalDatastoreLruCache};
 
 /// File system based chunk store
 pub struct ChunkStore {
@@ -366,6 +367,7 @@ impl ChunkStore {
         min_atime: i64,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
+        cache: Option<&LocalDatastoreLruCache>,
     ) -> Result<(), Error> {
         // unwrap: only `None` in unit tests
         assert!(self.locker.is_some());
@@ -419,6 +421,23 @@ impl ChunkStore {
                         bad,
                         status,
                         || {
+                            // non-bad S3 chunks need to be removed via cache
+                            if let Some(cache) = cache {
+                                if !bad {
+                                    let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
+
+                                    // unless there is a concurrent upload pending
+                                    if let Ok(_guard) =
+                                        self.lock_chunk(&digest, Duration::from_secs(0))
+                                    {
+                                        cache.remove(&digest)?;
+                                    }
+
+                                    return Ok(());
+                                }
+                            }
+
+                            // bad or local chunks
                             unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
                                 |err| {
                                     format_err!(
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 71a8b1b60..5022ccc0f 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1765,6 +1765,7 @@ impl DataStore {
                 min_atime,
                 &mut tmp_gc_status,
                 worker,
+                self.cache(),
             )?;
         } else {
             self.inner.chunk_store.sweep_unused_chunks(
@@ -1772,6 +1773,7 @@ impl DataStore {
                 min_atime,
                 &mut gc_status,
                 worker,
+                None,
             )?;
         }
 
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 10/14] datastore: GC: drop overly verbose info message during s3 chunk sweep
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (8 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 09/14] GC: cleanup chunk markers from cache in phase 3 " Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 11/14] chunk store: reduce exposure of clear_chunk() to crate only Christian Ebner
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

This message does not provide much additional information and is
shown during correct operation anyway, yet leaves room for
misinterpretation as an error or warning. Drop it to be less verbose
and to avoid potentially spamming the task log when many chunks are
removed; the removal count remains visible in the final garbage
collection stats output.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/datastore.rs | 1 -
 1 file changed, 1 deletion(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 5022ccc0f..354caeae1 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1689,7 +1689,6 @@ impl DataStore {
                         Ok(stat) => stat.accessed()?,
                         Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
                             // File not found, delete by setting atime to unix epoch
-                            info!("Not found, mark for deletion: {}", content.key);
                             SystemTime::UNIX_EPOCH
                         }
                         Err(err) => return Err(err.into()),
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 11/14] chunk store: reduce exposure of clear_chunk() to crate only
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (9 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 10/14] datastore: GC: drop overly verbose info message during s3 chunk sweep Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 12/14] chunk store: make chunk removal a helper method of the chunk store Christian Ebner
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

This helper method is only used by the local datastore cache to
clear chunks when they are evicted from the cache, freeing up space.
It should not be called from outside the pbs-datastore crate.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 917c5a877..566252eeb 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -688,7 +688,7 @@ impl ChunkStore {
     ///
     /// Used to evict chunks from the local datastore cache, while keeping them as in-use markers
     /// for garbage collection. Returns with success also if chunk file is not pre-existing.
-    pub fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+    pub(crate) fn clear_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, digest_str) = self.chunk_path(digest);
         let mut create_options = CreateOptions::new();
         if nix::unistd::Uid::effective().is_root() {
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 12/14] chunk store: make chunk removal a helper method of the chunk store
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (10 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 11/14] chunk store: reduce exposure of clear_chunk() to crate only Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 13/14] GC: fix deadlock for cache eviction and garbage collection Christian Ebner
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Better encapsulate functionality that touches the chunk store,
ensuring a common interface is used. The local datastore LRU cache
will call this during garbage collection to clean up chunk marker
files for chunks which have been removed from the S3 object store
backend.

This is in preparation for fixing a deadlock, no functional changes
intended.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs               | 9 +++++++++
 pbs-datastore/src/local_datastore_lru_cache.rs | 3 +--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 566252eeb..879ac4923 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -704,6 +704,15 @@ impl ChunkStore {
         Ok(())
     }
 
+    /// Removes a chunk marker file from the `LocalDatastoreLruCache`s chunk store.
+    ///
+    /// Callers must hold the per-chunk file lock in order to avoid races with renaming of corrupt
+    /// chunks by verifications and chunk inserts by backups.
+    pub(crate) fn remove_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
+        let (chunk_path, _digest_str) = self.chunk_path(digest);
+        std::fs::remove_file(chunk_path).map_err(Error::from)
+    }
+
     pub fn relative_path(&self, path: &Path) -> PathBuf {
         // unwrap: only `None` in unit tests
         assert!(self.locker.is_some());
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index fe3b51a55..7b9d8caae 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -52,8 +52,7 @@ impl LocalDatastoreLruCache {
     /// Fails if the chunk cannot be deleted successfully.
     pub(crate) unsafe fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
         self.cache.remove(*digest);
-        let (path, _digest_str) = self.store.chunk_path(digest);
-        std::fs::remove_file(path).map_err(Error::from)
+        self.store.remove_chunk(digest)
     }
 
     /// Access the locally cached chunk or fetch it from the S3 object store via the provided
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 13/14] GC: fix deadlock for cache eviction and garbage collection
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (11 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 12/14] chunk store: make chunk removal a helper method of the chunk store Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 14/14] chunk store: never fail when trying to remove missing chunk file Christian Ebner
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

When inserting a chunk via the local datastore cache, the chunk is
first inserted into the chunk store and then into the in-memory
AsyncLruCache. If the cache capacity is reached, the AsyncLruCache
executes a callback on the evicted cache node, which in the case of
the local datastore cache performs a clear chunk call. On this
codepath, the AsyncLruCache is guarded by locking a mutex to get
exclusive access to the cache, and then the chunk store mutex guard
is acquired to safely clear the chunk.

Garbage collection, however, locks in the opposite order when a
chunk is no longer present and should be cleaned up: it first guards
the chunk store mutex, only to then try to remove the chunk from the
local chunk store and the AsyncLruCache, the latter guaranteeing
exclusive access by guarding its own mutex.

This can therefore result in a deadlock, in turn locking up the
whole chunk store.

Fix this by guarding the chunk store within the remove_chunk()
helper method on the ChunkStore instead of acquiring the lock in the
garbage collection itself for this code path (which remains guarded
by the per-chunk file lock). This way the locking order is the same
as on cache eviction.
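The fix boils down to making both paths take the two locks in the same order. A std-only sketch of that ordering discipline, with plain `Mutex<Vec<u32>>` values as hypothetical stand-ins for the in-memory cache and the chunk store:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// The chunk store lock is taken last, inside the removal helper
// (as remove_chunk() does after the patch).
fn remove_chunk(store: &Mutex<Vec<u32>>, digest: u32) {
    let mut s = store.lock().unwrap();
    s.retain(|d| *d != digest);
}

// Both cache eviction and GC now lock cache first, store second,
// so the two mutexes are always acquired in the same order and the
// circular wait that caused the deadlock cannot occur.
fn cache_remove(cache: &Mutex<Vec<u32>>, store: &Mutex<Vec<u32>>, digest: u32) {
    let mut c = cache.lock().unwrap(); // cache lock taken first
    c.retain(|d| *d != digest);
    remove_chunk(store, digest); // store lock taken second
}

fn main() {
    let cache = Arc::new(Mutex::new(vec![1u32, 2, 3]));
    let store = Arc::new(Mutex::new(vec![1u32, 2, 3]));
    let mut handles = Vec::new();
    for digest in [1u32, 2, 3] {
        let (c, s) = (Arc::clone(&cache), Arc::clone(&store));
        handles.push(thread::spawn(move || cache_remove(&c, &s, digest)));
    }
    for h in handles {
        h.join().unwrap();
    }
    assert!(cache.lock().unwrap().is_empty());
    assert!(store.lock().unwrap().is_empty());
}
```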

Reported-by: https://forum.proxmox.com/threads/174878/
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs               | 4 +++-
 pbs-datastore/src/datastore.rs                 | 1 -
 pbs-datastore/src/local_datastore_lru_cache.rs | 3 ++-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 879ac4923..273b17d5f 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -430,6 +430,8 @@ impl ChunkStore {
                                     if let Ok(_guard) =
                                         self.lock_chunk(&digest, Duration::from_secs(0))
                                     {
+                                        // still protected by per-chunk file lock
+                                        drop(lock);
                                         cache.remove(&digest)?;
                                     }
 
@@ -450,7 +452,6 @@ impl ChunkStore {
                     )?;
                 }
             }
-            drop(lock);
         }
 
         Ok(())
@@ -710,6 +711,7 @@ impl ChunkStore {
     /// chunks by verifications and chunk inserts by backups.
     pub(crate) fn remove_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, _digest_str) = self.chunk_path(digest);
+        let _lock = self.mutex.lock();
         std::fs::remove_file(chunk_path).map_err(Error::from)
     }
 
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 354caeae1..0a86aaede 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1710,7 +1710,6 @@ impl DataStore {
                             &mut gc_status,
                             || {
                                 if let Some(cache) = self.cache() {
-                                    let _guard = self.inner.chunk_store.mutex().lock().unwrap();
                                     cache.remove(&digest)?;
                                 }
                                 delete_list.push((content.key, _chunk_guard));
diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
index 7b9d8caae..74647cdb2 100644
--- a/pbs-datastore/src/local_datastore_lru_cache.rs
+++ b/pbs-datastore/src/local_datastore_lru_cache.rs
@@ -42,7 +42,8 @@ impl LocalDatastoreLruCache {
     /// Remove a chunk from the local datastore cache.
     ///
     /// Callers to this method must assure that:
-    /// - no concurrent insert is being performed, the chunk store's mutex must be held.
+    /// - the chunk store's mutex is not being held.
+    /// - no concurrent insert is being performed, the per-chunk file lock must be held.
     /// - the chunk to be removed is no longer referenced by an index file.
     /// - the chunk to be removed has not been inserted by an active writer (atime newer than
     ///   writer start time).
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v4 14/14] chunk store: never fail when trying to remove missing chunk file
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (12 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 13/14] GC: fix deadlock for cache eviction and garbage collection Christian Ebner
@ 2025-11-10 11:56 ` Christian Ebner
  2025-11-11 11:09 ` [pbs-devel] partially-applied: [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
  2025-11-11 14:31 ` [pbs-devel] " Christian Ebner
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-10 11:56 UTC (permalink / raw)
  To: pbs-devel

Although this is an unexpected case, never fail if the chunk store
marker is not there during removal from cache in phase 2 of garbage
collection for datastores with s3 backend (that codepath needs no
further adaptation). The chunk file might be missing for other
reasons, e.g. after manual intervention in offline maintenance mode.
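The idempotent-removal pattern the hunk introduces can be sketched with std alone; the path and helper name below are illustrative, not the actual implementation:

```rust
use std::io::ErrorKind;

// Remove a file, treating a missing file as success, as patch 14
// does for the chunk marker in remove_chunk(): only errors other
// than NotFound are propagated.
fn remove_ignore_missing(path: &std::path::Path) -> std::io::Result<()> {
    match std::fs::remove_file(path) {
        Err(err) if err.kind() != ErrorKind::NotFound => Err(err),
        _ => Ok(()),
    }
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("pbs-remove-demo");
    std::fs::create_dir_all(&dir)?;
    let file = dir.join("chunk.marker");
    std::fs::write(&file, b"marker")?;
    remove_ignore_missing(&file)?; // removes the file
    remove_ignore_missing(&file)?; // second call succeeds although it is gone
    assert!(!file.exists());
    Ok(())
}
```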

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-datastore/src/chunk_store.rs | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index 273b17d5f..efc950fcc 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -712,7 +712,12 @@ impl ChunkStore {
     pub(crate) fn remove_chunk(&self, digest: &[u8; 32]) -> Result<(), Error> {
         let (chunk_path, _digest_str) = self.chunk_path(digest);
         let _lock = self.mutex.lock();
-        std::fs::remove_file(chunk_path).map_err(Error::from)
+        if let Err(err) = std::fs::remove_file(chunk_path) {
+            if err.kind() != std::io::ErrorKind::NotFound {
+                return Err(err.into());
+            }
+        }
+        Ok(())
     }
 
     pub fn relative_path(&self, path: &Path) -> PathBuf {
-- 
2.47.3




* [pbs-devel] partially-applied: [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (13 preceding siblings ...)
  2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 14/14] chunk store: never fail when trying to remove missing chunk file Christian Ebner
@ 2025-11-11 11:09 ` Fabian Grünbichler
  2025-11-11 14:31 ` [pbs-devel] " Christian Ebner
  15 siblings, 0 replies; 17+ messages in thread
From: Fabian Grünbichler @ 2025-11-11 11:09 UTC (permalink / raw)
  To: pbs-devel, Christian Ebner


On Mon, 10 Nov 2025 12:56:13 +0100, Christian Ebner wrote:
> These patches fix possible race conditions on datastores with s3 backend for
> chunk insert, renaming of corrupt chunks during verification and cleanup during
> garbage collection. Further, the patches assure consistency between the chunk
> marker file of the local datastore cache, the s3 object store and the in-memory
> LRU cache during state changes occurring by one of the above mentioned operations.
> 
> Consistency is achieved by using a per-chunk file locking mechanism. File locks
> are stored on the predefined location for datastore file locks, using the same
> `.chunks/prefix/digest` folder layout for consistency and to keep readdir and
> other fs operations performant.
> 
> [...]

Applied, thanks!

[01/14] datastore: limit scope of snapshot/group destroy methods to crate
        commit: 6a4b1b9b1887d8b388b1e169a40588bb2c1c1482
[02/14] api/datastore: move s3 index upload helper to datastore backend
        commit: 832c085ed6018f64f6172a48ded679bb3001344d

applied these two, the rest will require some slight rework as discussed
off-list!

Best regards,
-- 
Fabian Grünbichler <f.gruenbichler@proxmox.com>



* Re: [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
  2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
                   ` (14 preceding siblings ...)
  2025-11-11 11:09 ` [pbs-devel] partially-applied: [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
@ 2025-11-11 14:31 ` Christian Ebner
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Ebner @ 2025-11-11 14:31 UTC (permalink / raw)
  To: pbs-devel

superseded-by version 5:
https://lore.proxmox.com/pbs-devel/20251111143002.759901-1-c.ebner@proxmox.com/T/



end of thread, other threads:[~2025-11-11 14:31 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-10 11:56 [pbs-devel] [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 01/14] datastore: limit scope of snapshot/group destroy methods to crate Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 02/14] api/datastore: move s3 index upload helper to datastore backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 03/14] chunk store: implement per-chunk file locking helper for s3 backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 04/14] datastore: acquire chunk store mutex lock when renaming corrupt chunk Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 05/14] datastore: get per-chunk file lock for chunk rename on s3 backend Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 06/14] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 07/14] datastore: add locking to protect against races on chunk insert for s3 Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 08/14] GC: fix race with chunk upload/insert on s3 backends Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 09/14] GC: cleanup chunk markers from cache in phase 3 " Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 10/14] datastore: GC: drop overly verbose info message during s3 chunk sweep Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 11/14] chunk store: reduce exposure of clear_chunk() to crate only Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 12/14] chunk store: make chunk removal a helper method of the chunk store Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 13/14] GC: fix deadlock for cache eviction and garbage collection Christian Ebner
2025-11-10 11:56 ` [pbs-devel] [PATCH proxmox-backup v4 14/14] chunk store: never fail when trying to remove missing chunk file Christian Ebner
2025-11-11 11:09 ` [pbs-devel] partially-applied: [PATCH proxmox-backup v4 00/14] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
2025-11-11 14:31 ` [pbs-devel] " Christian Ebner
