From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH v3 proxmox-backup 6/6] fix #5331: garbage collection: avoid multiple chunk atime updates
Date: Thu, 20 Mar 2025 13:30:10 +0100
Message-ID: <20250320123010.250234-7-c.ebner@proxmox.com>
In-Reply-To: <20250320123010.250234-1-c.ebner@proxmox.com>
To reduce the number of atime updates, keep track of the recently
marked chunks in phase 1 of garbage collection, so that repeated atime
updates via expensive utimensat() calls are avoided.
Recently touched chunks are tracked by storing the chunk digests in
an LRU cache of fixed capacity. Inserting a digest makes the chunk the
most recently touched one; if the digest was already present in the
cache before the insert, the atime update can be skipped.
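The skip logic described above can be illustrated with a minimal,
std-only sketch. It does not use pbs_tools::lru_cache::LruCache;
RecentDigests and mark_chunks are hypothetical names made up for this
illustration, and the closure stands in for the real atime update. The
only behaviour carried over from the patch is that insert() reports
whether the digest was already cached.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for an LRU cache keyed by chunk digest.
struct RecentDigests {
    capacity: usize,
    tick: u64,
    by_digest: HashMap<[u8; 32], u64>, // digest -> tick of last use
    by_tick: BTreeMap<u64, [u8; 32]>,  // tick of last use -> digest, oldest first
}

impl RecentDigests {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        Self {
            capacity,
            tick: 0,
            by_digest: HashMap::new(),
            by_tick: BTreeMap::new(),
        }
    }

    /// Marks `digest` as most recently used.
    /// Returns true if it was already cached before this call.
    fn insert(&mut self, digest: [u8; 32]) -> bool {
        self.tick += 1;
        let previous = self.by_digest.insert(digest, self.tick);
        if let Some(old_tick) = previous {
            // Known digest: drop its old position in the ordering.
            self.by_tick.remove(&old_tick);
        } else if self.by_digest.len() > self.capacity {
            // New digest and cache full: evict the least recently used one.
            if let Some((_, oldest)) = self.by_tick.pop_first() {
                self.by_digest.remove(&oldest);
            }
        }
        self.by_tick.insert(self.tick, digest);
        previous.is_some()
    }
}

/// Walks the digests of an index and calls `touch` only for digests
/// that are not in the cache. Returns the number of `touch` calls.
fn mark_chunks(
    digests: &[[u8; 32]],
    cache: &mut RecentDigests,
    mut touch: impl FnMut(&[u8; 32]),
) -> usize {
    let mut touched = 0;
    for digest in digests {
        if cache.insert(*digest) {
            continue; // recently touched, skip the atime update
        }
        touch(digest);
        touched += 1;
    }
    touched
}

fn main() {
    let (a, b, c) = ([1u8; 32], [2u8; 32], [3u8; 32]);
    let mut cache = RecentDigests::new(2);
    let touched = mark_chunks(&[a, b, a, a, c, b], &mut cache, |_| {});
    println!("{touched} atime updates for 6 chunk references");
}
```

With a capacity of 2, the last reference to b is touched again because
b was evicted when c was inserted. A skipped update is therefore only
an optimization: a digest that fell out of the cache is simply touched
once more.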
Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- Switch to LRU cache instead of keeping track of chunks from previous
snapshot of the same group.
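As a quick check of the capacity chosen in the diff below
(LruCache::new(1024 * 1024) with [u8; 32] keys), the following sketch
computes the memory taken by the digest keys alone. The constant and
function names are made up for illustration, and any per-entry
bookkeeping overhead of the cache itself is not included, as the patch
does not state it.

```rust
use std::mem::size_of;

// Number of cache entries, as passed to LruCache::new() in the patch.
const CACHE_CAPACITY: usize = 1024 * 1024;

/// Bytes needed to store `capacity` digests of type [u8; 32],
/// ignoring any overhead of the cache data structure.
fn key_bytes(capacity: usize) -> usize {
    capacity * size_of::<[u8; 32]>()
}

fn main() {
    let bytes = key_bytes(CACHE_CAPACITY);
    println!("{} MiB of digest keys", bytes / (1024 * 1024));
}
```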
pbs-datastore/src/datastore.rs | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index c4123f2b7..1f1c3b396 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
use anyhow::{bail, format_err, Context, Error};
use nix::unistd::{unlinkat, UnlinkatFlags};
+use pbs_tools::lru_cache::LruCache;
use tracing::{info, warn};
use proxmox_human_byte::HumanByte;
@@ -1081,6 +1082,7 @@ impl DataStore {
&self,
index: Box<dyn IndexFile>,
file_name: &Path, // only used for error reporting
+ recently_touched_chunks: &mut LruCache<[u8; 32], ()>,
status: &mut GarbageCollectionStatus,
worker: &dyn WorkerTaskContext,
) -> Result<(), Error> {
@@ -1091,6 +1093,12 @@ impl DataStore {
worker.check_abort()?;
worker.fail_on_shutdown()?;
let digest = index.index_digest(pos).unwrap();
+
+ // Avoid multiple expensive atime updates by utimensat
+ if recently_touched_chunks.insert(*digest, ()) {
+ continue;
+ }
+
if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
let hex = hex::encode(digest);
warn!(
@@ -1131,6 +1139,8 @@ impl DataStore {
let mut unprocessed_image_list = self.list_images()?;
let image_count = unprocessed_image_list.len();
+ // Allow up to 32 MiB, as only the 32 byte digest is stored as key
+ let mut recently_touched_chunks = LruCache::new(1024 * 1024);
let mut processed_images = 0;
let mut last_percentage: usize = 0;
@@ -1157,7 +1167,13 @@ impl DataStore {
Some(index) => index,
None => continue,
};
- self.index_mark_used_chunks(index, &path, status, worker)?;
+ self.index_mark_used_chunks(
+ index,
+ &path,
+ &mut recently_touched_chunks,
+ status,
+ worker,
+ )?;
unprocessed_image_list.remove(&path);
@@ -1184,7 +1200,13 @@ impl DataStore {
Some(index) => index,
None => continue,
};
- self.index_mark_used_chunks(index, &path, status, worker)?;
+ self.index_mark_used_chunks(
+ index,
+ &path,
+ &mut recently_touched_chunks,
+ status,
+ worker,
+ )?;
warn!(
"Marked chunks for unexpected index file at '{}'",
path.to_string_lossy()
--
2.39.5