public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup] gc: only create locally missing chunk markers if present on S3 store
Date: Wed, 17 Sep 2025 15:41:31 +0200	[thread overview]
Message-ID: <20250917134131.444585-1-c.ebner@proxmox.com> (raw)

This fixes a logic issue with the local cache marker files for all
indexed chunks being created during phase 1 of the garbage collection,
independent on whether the chunks might have been moved to be bad by
a verification job.

Therefore, check if the chunk exists in the S3 object store before
creating the marker, if no bad chunk marker is present in the local
store cache (this can also happen after reusing a fresh local store
cache after a new datastore setup).

Make sure to also keep the bad chunk marker files in the local store
cache. This is achieved by marking these unconditionally if the chunk
marker file is not found.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
Encountered while investigating an unrelated issue.

 pbs-datastore/src/datastore.rs | 73 +++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 32 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 7cf020fc0..4562b0fca 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -1308,40 +1308,49 @@ impl DataStore {
                 }
             }
 
-            match s3_client {
-                None => {
-                    // Filesystem backend
-                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                        let hex = hex::encode(digest);
-                        warn!(
-                            "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
-                        );
-
-                        // touch any corresponding .bad files to keep them around, meaning if a chunk is
-                        // rewritten correctly they will be removed automatically, as well as if no index
-                        // file requires the chunk anymore (won't get to this loop then)
-                        for i in 0..=9 {
-                            let bad_ext = format!("{}.bad", i);
-                            let mut bad_path = PathBuf::new();
-                            bad_path.push(self.chunk_path(digest).0);
-                            bad_path.set_extension(bad_ext);
-                            self.inner.chunk_store.cond_touch_path(&bad_path, false)?;
-                        }
+            if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
+                let hex = hex::encode(digest);
+                let mut is_bad_chunk = false;
+
+                // touch any corresponding .bad files to keep them around, meaning if a chunk is
+                // rewritten correctly they will be removed automatically, as well as if no index
+                // file requires the chunk anymore (won't get to this loop then)
+                for i in 0..=9 {
+                    let bad_ext = format!("{}.bad", i);
+                    let mut bad_path = PathBuf::new();
+                    bad_path.push(self.chunk_path(digest).0);
+                    bad_path.set_extension(bad_ext);
+                    if self.inner.chunk_store.cond_touch_path(&bad_path, false)? {
+                        is_bad_chunk = true;
                     }
                 }
-                Some(ref _s3_client) => {
-                    // Update atime on local cache marker files.
-                    if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
-                        let (chunk_path, _digest) = self.chunk_path(digest);
-                        // Insert empty file as marker to tell GC phase2 that this is
-                        // a chunk still in-use, so to keep in the S3 object store.
-                        std::fs::File::options()
-                            .write(true)
-                            .create_new(true)
-                            .open(&chunk_path)
-                            .with_context(|| {
-                                format!("failed to create marker for chunk {}", hex::encode(digest))
-                            })?;
+
+                match s3_client {
+                    None => warn!(
+                        "warning: unable to access non-existent chunk {hex}, required by {file_name:?}"
+                    ),
+                    Some(ref s3_client) => {
+                        if !is_bad_chunk {
+                            // Might be a new empty cache with no markers present yet, so fallback to
+                            // check the s3 object store in that case.
+                            let object_key = crate::s3::object_key_from_digest(digest)?;
+                            let head_object_response =
+                                proxmox_async::runtime::block_on(s3_client.head_object(object_key))
+                                .context("failed to check chunk presence in s3 object store")?;
+                            if head_object_response.is_some() {
+                                info!("Create missing marker for existing chunk {hex}");
+                                let (chunk_path, _digest) = self.chunk_path(digest);
+                                // Insert empty file as marker to tell GC phase2 that this is
+                                // a chunk still in-use, so to keep in the S3 object store.
+                                std::fs::File::options()
+                                    .write(true)
+                                    .create_new(true)
+                                    .open(&chunk_path)
+                                    .with_context(|| {
+                                        format!("failed to create marker for chunk {hex}")
+                                    })?;
+                            }
+                        }
                     }
                 }
             }
-- 
2.47.3



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


                 reply	other threads:[~2025-09-17 13:41 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250917134131.444585-1-c.ebner@proxmox.com \
    --to=c.ebner@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal