From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup v3 13/23] datastore: acquire chunk store mutex lock when renaming corrupt chunk
Date: Wed,  5 Nov 2025 13:22:23 +0100
Message-ID: <20251105122233.439382-14-c.ebner@proxmox.com>
In-Reply-To: <20251105122233.439382-1-c.ebner@proxmox.com>

When a corrupt chunk is renamed in the chunk store, the chunk store
mutex lock is currently not held. This can race with other operations
that hold the lock and therefore assume exclusive access, such as
garbage collection or backup chunk inserts. Both the filesystem and
S3 backends are affected.

To fix the possible race, acquire the lock and rearrange the code so
that the lock is never held when entering async code for the S3
backend. This does not yet solve the race and cache consistency for
S3 backends, which is addressed by introducing per-chunk file locking
in subsequent patches.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- no changes
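
For readers skimming the thread, the locking pattern described in the
commit message can be sketched in isolation as below. This is only an
illustration under assumptions, not the code from datastore.rs: the
`ChunkStore` struct, the `rename_corrupt` function and the
`<path>.<n>.bad` naming scheme are hypothetical stand-ins. The point it
shows is that the rename happens while the mutex guard is alive, and
that the guard is dropped before any follow-up backend work starts.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};
use std::sync::Mutex;

/// Hypothetical stand-in for the chunk store; only the mutex matters here.
struct ChunkStore {
    mutex: Mutex<()>,
}

/// Rename `path` to the first unused `<path>.<n>.bad` name (assumed naming
/// scheme) while holding the store mutex, then release the lock before any
/// further work is done.
fn rename_corrupt(store: &ChunkStore, path: &Path) -> io::Result<Option<PathBuf>> {
    let result = {
        // Exclusive access for picking the target name and renaming.
        let _guard = store.mutex.lock().unwrap();

        let mut counter = 0;
        let new_path = loop {
            let candidate = PathBuf::from(format!("{}.{counter}.bad", path.display()));
            if !candidate.exists() {
                break candidate;
            }
            counter += 1;
        };

        match fs::rename(path, &new_path) {
            Ok(()) => Ok(Some(new_path)),
            // A chunk that is already gone is not treated as an error.
            Err(err) if err.kind() == ErrorKind::NotFound => Ok(None),
            Err(err) => Err(err),
        }
    }; // `_guard` goes out of scope here, so the mutex is released.

    // Backend-specific work (async in the S3 case) would run at this
    // point, without the mutex being held.

    result
}
```

Using a block scope instead of an explicit drop() is just a stylistic
choice in this sketch; both release the guard at a well-defined point
before the backend is touched.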

 pbs-datastore/src/datastore.rs | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 555674e7c..0aff95cdd 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -2600,6 +2600,8 @@ impl DataStore {
     pub fn rename_corrupt_chunk(&self, digest: &[u8; 32]) -> Result<Option<PathBuf>, Error> {
         let (path, digest_str) = self.chunk_path(digest);
 
+        let _lock = self.inner.chunk_store.mutex().lock().unwrap();
+
         let mut counter = 0;
         let mut new_path = path.clone();
         loop {
@@ -2611,6 +2613,14 @@ impl DataStore {
             }
         }
 
+        let result = match std::fs::rename(&path, &new_path) {
+            Ok(_) => Ok(Some(new_path)),
+            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
+            Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
+        };
+
+        drop(_lock);
+
         let backend = self.backend().map_err(|err| {
             format_err!(
                 "failed to get backend while trying to rename bad chunk: {digest_str} - {err}"
@@ -2641,10 +2651,6 @@ impl DataStore {
             )?;
         }
 
-        match std::fs::rename(&path, &new_path) {
-            Ok(_) => Ok(Some(new_path)),
-            Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(None),
-            Err(err) => bail!("could not rename corrupt chunk {path:?} - {err}"),
-        }
+        result
     }
 }
-- 
2.47.3


