public inbox for pbs-devel@lists.proxmox.com
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates
Date: Fri, 21 Mar 2025 10:32:02 +0100
Message-ID: <20250321093202.155899-6-c.ebner@proxmox.com>
In-Reply-To: <20250321093202.155899-1-c.ebner@proxmox.com>

To reduce the number of atime updates, keep track of the recently
marked chunks in phase 1 of garbage collection to avoid multiple atime
updates via expensive utimensat() calls.

Recently touched chunks are tracked by storing the chunk digests in
an LRU cache of fixed capacity. Inserting a digest makes the chunk the
most recently touched one; if the digest was already present in the
cache before the insert, the atime update can be skipped.

Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 3:
- no changes
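
For readers unfamiliar with the approach, here is a minimal,
self-contained sketch of the skip logic described in the commit message.
It is not the pbs_tools::lru_cache::LruCache implementation. The names
RecentlyTouched, mark_used_chunks and CACHE_CAPACITY are hypothetical,
the standard library maps stand in for the real cache, and a counter
stands in for the utimensat() call. The only behaviour taken from the
patch is that insert() reports whether the digest was already cached.

```rust
use std::collections::{BTreeMap, HashMap};

/// A chunk digest, 32 bytes as in the patch.
type Digest = [u8; 32];

/// Same entry count as the patch: 1024 * 1024 digests of 32 bytes each,
/// i.e. 32 MiB for the keys alone (bookkeeping overhead not included).
const CACHE_CAPACITY: usize = 1024 * 1024;

/// Hypothetical stand-in for the LRU cache used by the patch.
struct RecentlyTouched {
    capacity: usize,
    tick: u64,
    by_digest: HashMap<Digest, u64>, // digest -> tick of last use
    by_age: BTreeMap<u64, Digest>,   // tick of last use -> digest
}

impl RecentlyTouched {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            tick: 0,
            by_digest: HashMap::new(),
            by_age: BTreeMap::new(),
        }
    }

    /// Marks `digest` as most recently used.
    /// Returns true if it was already present before the insert.
    fn insert(&mut self, digest: Digest) -> bool {
        self.tick += 1;
        let tick = self.tick;
        let already_present = match self.by_digest.insert(digest, tick) {
            Some(old_tick) => {
                // Drop the stale age entry, it is re-added below.
                self.by_age.remove(&old_tick);
                true
            }
            None => false,
        };
        self.by_age.insert(tick, digest);
        if self.by_digest.len() > self.capacity {
            // Evict the least recently used digest.
            if let Some((_, oldest)) = self.by_age.pop_first() {
                self.by_digest.remove(&oldest);
            }
        }
        already_present
    }
}

/// Walks the digests of one index and returns how many (simulated)
/// atime updates were required.
fn mark_used_chunks(digests: &[Digest], cache: &mut RecentlyTouched) -> usize {
    let mut touched = 0;
    for digest in digests {
        if cache.insert(*digest) {
            continue; // recently touched, skip the expensive update
        }
        touched += 1; // stand-in for the utimensat() call
    }
    touched
}
```

With a cache of capacity 2, marking the sequence a, b, a, b needs only
two simulated updates. A chunk evicted from the cache is simply touched
again on its next occurrence, so eviction costs an extra update but
never skips a required one.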

 pbs-datastore/src/datastore.rs | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index ea7e5e9f3..4445944c0 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
 
 use anyhow::{bail, format_err, Context, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
+use pbs_tools::lru_cache::LruCache;
 use tracing::{info, warn};
 
 use proxmox_human_byte::HumanByte;
@@ -1078,6 +1079,7 @@ impl DataStore {
         &self,
         index: Box<dyn IndexFile>,
         file_name: &Path, // only used for error reporting
+        recently_touched_chunks: &mut LruCache<[u8; 32], ()>,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
     ) -> Result<(), Error> {
@@ -1088,6 +1090,12 @@ impl DataStore {
             worker.check_abort()?;
             worker.fail_on_shutdown()?;
             let digest = index.index_digest(pos).unwrap();
+
+            // Avoid multiple expensive atime updates by utimensat
+            if recently_touched_chunks.insert(*digest, ()) {
+                continue;
+            }
+
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
                 let hex = hex::encode(digest);
                 warn!(
@@ -1128,6 +1136,8 @@ impl DataStore {
         let mut unprocessed_index_list = self.list_index_files()?;
         let index_count = unprocessed_index_list.len();
 
+        // Allow up to 32 MiB, as only storing the 32 byte digest as key
+        let mut recently_touched_chunks = LruCache::new(1024 * 1024);
         let mut processed_index_files = 0;
         let mut last_percentage: usize = 0;
 
@@ -1154,7 +1164,13 @@ impl DataStore {
                             Some(index) => index,
                             None => continue,
                         };
-                        self.index_mark_used_chunks(index, &path, status, worker)?;
+                        self.index_mark_used_chunks(
+                            index,
+                            &path,
+                            &mut recently_touched_chunks,
+                            status,
+                            worker,
+                        )?;
 
                         unprocessed_index_list.remove(&path);
 
@@ -1181,7 +1197,13 @@ impl DataStore {
                 Some(index) => index,
                 None => continue,
             };
-            self.index_mark_used_chunks(index, &path, status, worker)?;
+            self.index_mark_used_chunks(
+                index,
+                &path,
+                &mut recently_touched_chunks,
+                status,
+                worker,
+            )?;
             warn!("Marked chunks for unexpected index file at '{path:?}'");
         }
 
-- 
2.39.5




