public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates
@ 2025-03-21  9:31 Christian Ebner
  2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:31 UTC (permalink / raw)
  To: pbs-devel

These patches implement the logic to greatly improve the performance
of phase 1 garbage collection by avoiding multiple atime updates on
the same chunk.

Currently, phase 1 GC iterates over all folders in the datastore,
collecting all image index files without making any logical
assumptions (e.g. about namespaces, groups, snapshots, ...). This
avoids accidentally missing image index files located in unexpected
paths, which would leave their chunks unmarked and lead to potential
data loss.

These patches improve phase 1 by:
- Iterating index images using the datastore's iterators to detect
  regular index files. Paths outside of the iterator logic are still taken
  into account and processed as well: a list of all found images is
  generated first, index files encountered while iterating are removed
  from it, finally leaving a list of indexes with unexpected paths. These
  unexpected paths are now also logged, so the user can potentially take action.
- Keeping track of recently touched chunks by storing their digests in a
  LRU cache, skipping over expensive atime updates for chunks already
  present in the cache.
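The atime-skip idea in the second point can be sketched as follows. This is a minimal stand-in with simple FIFO eviction, not the LRU cache from pbs-tools that the patches actually use; all names here are illustrative:

```rust
use std::collections::{HashSet, VecDeque};

// Fixed-capacity "recently touched" set over chunk digests.
// A true LRU would move re-inserted entries to the front; this FIFO
// approximation only illustrates why repeated touches can be skipped.
struct RecentlySeen {
    capacity: usize,
    set: HashSet<[u8; 32]>,
    order: VecDeque<[u8; 32]>,
}

impl RecentlySeen {
    fn new(capacity: usize) -> Self {
        Self { capacity, set: HashSet::new(), order: VecDeque::new() }
    }

    /// Returns true if the digest was already tracked,
    /// i.e. the expensive atime update can be skipped.
    fn insert(&mut self, digest: [u8; 32]) -> bool {
        if self.set.contains(&digest) {
            return true;
        }
        self.set.insert(digest);
        self.order.push_back(digest);
        if self.set.len() > self.capacity {
            // Evict the oldest tracked digest once over capacity.
            if let Some(oldest) = self.order.pop_front() {
                self.set.remove(&oldest);
            }
        }
        false
    }
}

fn main() {
    let mut seen = RecentlySeen::new(2);
    let (a, b, c) = ([0u8; 32], [1u8; 32], [2u8; 32]);
    assert!(!seen.insert(a)); // first touch: utimensat must be called
    assert!(seen.insert(a));  // recently touched: skip the atime update
    assert!(!seen.insert(b));
    assert!(!seen.insert(c)); // capacity exceeded: a is evicted
    assert!(!seen.insert(a)); // a fell out of the window, touch it again
    println!("ok");
}
```

With a cache capacity large enough to cover the working set of consecutive snapshots, most duplicate digests hit the cache and never reach utimensat.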

Most notable changes since version 3 (thanks Wolfgang for feedback):
- Use `with_context` over `context` to avoid possibly unnecessary allocation
- Align terminology with docs and rest of the codebase by using index
  file instead of image in method and variable names.

Most notable changes since version 2 (thanks Fabian for feedback):
- Use LRU cache instead of keeping track of chunks from the previous
  snapshot in the group.
- Split patches to logically separate iteration from caching logic
- Adapt for better anyhow context error propagation and formatting

Most notable changes since version 1 (thanks Fabian for feedback):
- Logically iterate using pre-existing iterators instead of constructing
  data structure for iteration when listing images.
- Tested that double listing does not affect runtime.
- Chunks are now remembered for all archives per snapshot, not just a
  single archive per snapshot as previously. This mimics the backup
  behaviour more closely and gives some additional gains in some cases.

The data below is still from version 3, as version 4 of the patches
contains no logical changes affecting runtime.

Statistics generated by averaging 3 GC runtimes, measured after an initial
run each to warm up caches. Datastores A (1515 index files) and B (192
index files) are unrelated, containing "real" backups.
The syscall counts were generated using
`strace -f -e utimensat -p $(pidof proxmox-backup-proxy) 2&> dump.trace` and
(after small cleanup) `wc -l dump.trace`.

datastore A on spinning disk:
unpatched: 115.3 ± 0.5 s,  utimensat calls: 6864667
version 2:  52.0 ± 0.0 s,  utimensat calls: 1151734
version 3:  24.6 ± 0.5 s,  utimensat calls: 1079149

datastore B on SSD:
unpatched: 44.6 ± 3.2 s, utimensat calls: 2034949
version 2: 15.0 ± 1.0 s, utimensat calls:  562929
version 3: 14.6 ± 0.5 s, utimensat calls:  559053

Christian Ebner (5):
  tools: lru cache: tell if node was already present or newly inserted
  garbage collection: format error including anyhow error context
  datastore: add helper method to open index reader from path
  garbage collection: generate index file list via datastore iterators
  fix #5331: garbage collection: avoid multiple chunk atime updates

 pbs-datastore/src/datastore.rs  | 188 +++++++++++++++++++++++---------
 pbs-tools/src/lru_cache.rs      |   4 +-
 src/api2/admin/datastore.rs     |   6 +-
 src/bin/proxmox-backup-proxy.rs |   2 +-
 4 files changed, 140 insertions(+), 60 deletions(-)

-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v4 proxmox-backup 1/5] tools: lru cache: tell if node was already present or newly inserted
  2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
@ 2025-03-21  9:31 ` Christian Ebner
  2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 2/5] garbage collection: format error including anyhow error context Christian Ebner
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:31 UTC (permalink / raw)
  To: pbs-devel

Add a boolean return type to LruCache::insert(), telling whether the
node was already present in the cache or newly inserted.

This will allow using the LRU cache for garbage collection, where it
is required to skip atime updates for chunks already marked as in
use.

That improves phase 1 garbage collection performance by avoiding
multiple atime updates.
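A minimal sketch of the changed contract, using a plain HashMap as a stand-in (no LRU ordering or eviction as in the real cache; the type name is illustrative):

```rust
use std::collections::HashMap;

// Stand-in cache illustrating the changed return contract of insert():
// `true` if the key was already present, `false` if newly inserted.
// (The real LruCache additionally maintains LRU order and evicts on overflow.)
struct CacheStandIn<K, V> {
    map: HashMap<K, V>,
}

impl<K: std::cmp::Eq + std::hash::Hash, V> CacheStandIn<K, V> {
    fn insert(&mut self, key: K, value: V) -> bool {
        // HashMap::insert returns the previous value, if any
        self.map.insert(key, value).is_some()
    }
}

fn main() {
    let mut cache = CacheStandIn { map: HashMap::new() };
    assert!(!cache.insert([0u8; 32], ())); // newly inserted
    assert!(cache.insert([0u8; 32], ()));  // already present: caller may skip work
    println!("ok");
}
```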

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 3:
- no changes

 pbs-tools/src/lru_cache.rs | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/pbs-tools/src/lru_cache.rs b/pbs-tools/src/lru_cache.rs
index 1befb1a0d..9e0112647 100644
--- a/pbs-tools/src/lru_cache.rs
+++ b/pbs-tools/src/lru_cache.rs
@@ -133,7 +133,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
 
     /// Insert or update an entry identified by `key` with the given `value`.
     /// This entry is placed as the most recently used node at the head.
-    pub fn insert(&mut self, key: K, value: V) {
+    pub fn insert(&mut self, key: K, value: V) -> bool {
         match self.map.entry(key) {
             Entry::Occupied(mut o) => {
                 // Node present, update value
@@ -142,6 +142,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
                 let mut node = unsafe { Box::from_raw(node_ptr) };
                 node.value = value;
                 let _node_ptr = Box::into_raw(node);
+                true
             }
             Entry::Vacant(v) => {
                 // Node not present, insert a new one
@@ -159,6 +160,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
                 if self.map.len() > self.capacity {
                     self.pop_tail();
                 }
+                false
             }
         }
     }
-- 
2.39.5




* [pbs-devel] [PATCH v4 proxmox-backup 2/5] garbage collection: format error including anyhow error context
  2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
  2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
@ 2025-03-21  9:31 ` Christian Ebner
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 3/5] datastore: add helper method to open index reader from path Christian Ebner
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:31 UTC (permalink / raw)
  To: pbs-devel

Until now, errors were shown ignoring the anyhow error context. To
allow the garbage collection to return additional error context,
format the error including the context as a single line.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 3:
- cargo fmt fix

 src/api2/admin/datastore.rs     | 6 +-----
 src/bin/proxmox-backup-proxy.rs | 2 +-
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index 483e595c1..4009f7b80 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1218,11 +1218,7 @@ pub fn start_garbage_collection(
     let upid_str =
         crate::server::do_garbage_collection_job(job, datastore, &auth_id, None, to_stdout)
             .map_err(|err| {
-                format_err!(
-                    "unable to start garbage collection job on datastore {} - {}",
-                    store,
-                    err
-                )
+                format_err!("unable to start garbage collection job on datastore {store} - {err:#}")
             })?;
 
     Ok(json!(upid_str))
diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
index c9a6032e6..1d4cf37c5 100644
--- a/src/bin/proxmox-backup-proxy.rs
+++ b/src/bin/proxmox-backup-proxy.rs
@@ -543,7 +543,7 @@ async fn schedule_datastore_garbage_collection() {
             Some(event_str),
             false,
         ) {
-            eprintln!("unable to start garbage collection job on datastore {store} - {err}");
+            eprintln!("unable to start garbage collection job on datastore {store} - {err:#}");
         }
     }
 }
-- 
2.39.5




* [pbs-devel] [PATCH v4 proxmox-backup 3/5] datastore: add helper method to open index reader from path
  2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
  2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
  2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 2/5] garbage collection: format error including anyhow error context Christian Ebner
@ 2025-03-21  9:32 ` Christian Ebner
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] garbage collection: generate index file list via datastore iterators Christian Ebner
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:32 UTC (permalink / raw)
  To: pbs-devel

Refactor the archive type detection and index file reader opening,
including its error handling, into a helper method for better
reusability.

This allows using the same logic for both expected and unexpected
image paths when iterating through the datastore in a hierarchical
manner.

Improve error handling by switching to anyhow's error context.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 3:
- use `with_context` over `context`
- inline path in error messages

 pbs-datastore/src/datastore.rs | 68 +++++++++++++++++++++++-----------
 1 file changed, 46 insertions(+), 22 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index a6a91ca79..8c62d4b2b 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -5,7 +5,7 @@ use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, LazyLock, Mutex};
 
-use anyhow::{bail, format_err, Error};
+use anyhow::{bail, format_err, Context, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
 use tracing::{info, warn};
 
@@ -1029,10 +1029,49 @@ impl DataStore {
         Ok(list)
     }
 
+    // Similar to open index, but ignore index files with blob or unknown archive type.
+    // Further, do not fail if file vanished.
+    fn open_index_reader(&self, absolute_path: &Path) -> Result<Option<Box<dyn IndexFile>>, Error> {
+        let archive_type = match ArchiveType::from_path(absolute_path) {
+            Ok(archive_type) => archive_type,
+            // ignore archives with unknown archive type
+            Err(_) => return Ok(None),
+        };
+
+        if absolute_path.is_relative() {
+            bail!("expected absolute path, got '{absolute_path:?}'");
+        }
+
+        let file = match std::fs::File::open(absolute_path) {
+            Ok(file) => file,
+            // ignore vanished files
+            Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(None),
+            Err(err) => {
+                return Err(
+                    Error::from(err).context(format!("can't open file '{absolute_path:?}'"))
+                )
+            }
+        };
+
+        match archive_type {
+            ArchiveType::FixedIndex => {
+                let reader = FixedIndexReader::new(file)
+                    .with_context(|| format!("can't open fixed index '{absolute_path:?}'"))?;
+                Ok(Some(Box::new(reader)))
+            }
+            ArchiveType::DynamicIndex => {
+                let reader = DynamicIndexReader::new(file)
+                    .with_context(|| format!("can't open dynamic index '{absolute_path:?}'"))?;
+                Ok(Some(Box::new(reader)))
+            }
+            ArchiveType::Blob => Ok(None),
+        }
+    }
+
     // mark chunks  used by ``index`` as used
-    fn index_mark_used_chunks<I: IndexFile>(
+    fn index_mark_used_chunks(
         &self,
-        index: I,
+        index: Box<dyn IndexFile>,
         file_name: &Path, // only used for error reporting
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
@@ -1090,24 +1129,8 @@ impl DataStore {
                 }
             }
 
-            match std::fs::File::open(&img) {
-                Ok(file) => {
-                    if let Ok(archive_type) = ArchiveType::from_path(&img) {
-                        if archive_type == ArchiveType::FixedIndex {
-                            let index = FixedIndexReader::new(file).map_err(|e| {
-                                format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
-                            })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
-                        } else if archive_type == ArchiveType::DynamicIndex {
-                            let index = DynamicIndexReader::new(file).map_err(|e| {
-                                format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
-                            })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
-                        }
-                    }
-                }
-                Err(err) if err.kind() == io::ErrorKind::NotFound => (), // ignore vanished files
-                Err(err) => bail!("can't open index {} - {}", img.to_string_lossy(), err),
+            if let Some(index) = self.open_index_reader(&img)? {
+                self.index_mark_used_chunks(index, &img, status, worker)?;
             }
 
             let percentage = (i + 1) * 100 / image_count;
@@ -1173,7 +1196,8 @@ impl DataStore {
 
             info!("Start GC phase1 (mark used chunks)");
 
-            self.mark_used_chunks(&mut gc_status, worker)?;
+            self.mark_used_chunks(&mut gc_status, worker)
+                .context("marking used chunks failed")?;
 
             info!("Start GC phase2 (sweep unused chunks)");
             self.inner.chunk_store.sweep_unused_chunks(
-- 
2.39.5




* [pbs-devel] [PATCH v4 proxmox-backup 4/5] garbage collection: generate index file list via datastore iterators
  2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
                   ` (2 preceding siblings ...)
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 3/5] datastore: add helper method to open index reader from path Christian Ebner
@ 2025-03-21  9:32 ` Christian Ebner
  2025-03-25 12:09   ` Thomas Lamprecht
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
  2025-03-26 10:05 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple " Christian Ebner
  5 siblings, 1 reply; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:32 UTC (permalink / raw)
  To: pbs-devel

Instead of iterating over all index files found in the datastore in
an unstructured manner, use the datastore iterators to logically
iterate over them as other datastore operations will.

This allows better distinguishing index files in unexpected locations
from ones in their expected location, warning the user about
unexpected ones so they can act on possible misconfigurations.
Further, this will make it easier to integrate marking of snapshots
with missing chunks as incomplete/corrupt.

This now iterates twice over the index files, as indices in
unexpected locations are still considered by generating the list of
all index files to be found in the datastore and removing regular
index files from that list, leaving unexpected ones behind.

Further, align terminology by renaming the `list_images` method to
the more fitting `list_index_files`, adapting variable names accordingly.

This reduces possible confusion, since throughout the codebase and
the documentation, files referencing the data chunks are called index
files. The term image, on the other hand, is associated with virtual
machine images and other large binary data stored as fixed-size
chunks.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 3:
- align method and variable names to docs by using variants of
  `index_file` over `image`.
- inline path in error messages

 pbs-datastore/src/datastore.rs | 104 ++++++++++++++++++++++-----------
 1 file changed, 70 insertions(+), 34 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 8c62d4b2b..ea7e5e9f3 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -25,7 +25,7 @@ use pbs_api_types::{
     MaintenanceMode, MaintenanceType, Operation, UPID,
 };
 
-use crate::backup_info::{BackupDir, BackupGroup};
+use crate::backup_info::{BackupDir, BackupGroup, BackupInfo};
 use crate::chunk_store::ChunkStore;
 use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
 use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
@@ -970,10 +970,15 @@ impl DataStore {
         ListGroups::new(Arc::clone(self), ns)?.collect()
     }
 
-    fn list_images(&self) -> Result<Vec<PathBuf>, Error> {
+    /// Lookup all index files to be found in the datastore without taking any logical iteration
+    /// into account.
+    /// The filesystem is walked recursively to detect index files based on their archive type,
+    /// as derived from the filename. This however excludes the chunks folder and hidden files,
+    /// and does not follow symlinks.
+    fn list_index_files(&self) -> Result<HashSet<PathBuf>, Error> {
         let base = self.base_path();
 
-        let mut list = vec![];
+        let mut list = HashSet::new();
 
         use walkdir::WalkDir;
 
@@ -1021,7 +1026,7 @@ impl DataStore {
                 if archive_type == ArchiveType::FixedIndex
                     || archive_type == ArchiveType::DynamicIndex
                 {
-                    list.push(path);
+                    list.insert(path);
                 }
             }
         }
@@ -1109,44 +1114,75 @@ impl DataStore {
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
     ) -> Result<(), Error> {
-        let image_list = self.list_images()?;
-        let image_count = image_list.len();
-
+        // Iterate twice over the datastore to fetch index files, even if this comes with an
+        // additional runtime cost:
+        // - First iteration to find all index files, no matter if they are in a location expected
+        //   by the datastore's hierarchy
+        // - Iterate using the datastore's helpers, so the namespaces, groups and snapshots are
+        //   looked up given the expected hierarchy and iterator logic
+        //
+        // This ensures that all index files are taken into account, even ones that would not
+        // have been seen by the regular logic, and the garbage collection run informs the user
+        // about any detected index files not following the iterator logic.
+
+        let mut unprocessed_index_list = self.list_index_files()?;
+        let index_count = unprocessed_index_list.len();
+
+        let mut processed_index_files = 0;
         let mut last_percentage: usize = 0;
 
-        let mut strange_paths_count: u64 = 0;
-
-        for (i, img) in image_list.into_iter().enumerate() {
-            worker.check_abort()?;
-            worker.fail_on_shutdown()?;
-
-            if let Some(backup_dir_path) = img.parent() {
-                let backup_dir_path = backup_dir_path.strip_prefix(self.base_path())?;
-                if let Some(backup_dir_str) = backup_dir_path.to_str() {
-                    if pbs_api_types::parse_ns_and_snapshot(backup_dir_str).is_err() {
-                        strange_paths_count += 1;
+        let arc_self = Arc::new(self.clone());
+        for namespace in arc_self
+            .recursive_iter_backup_ns(BackupNamespace::root())
+            .context("creating namespace iterator failed")?
+        {
+            let namespace = namespace.context("iterating namespaces failed")?;
+            for group in arc_self.iter_backup_groups(namespace)? {
+                let group = group.context("iterating backup groups failed")?;
+                let mut snapshots = group.list_backups().context("listing snapshots failed")?;
+                // Sort by snapshot timestamp to iterate over consecutive snapshots for each image.
+                BackupInfo::sort_list(&mut snapshots, true);
+                for snapshot in snapshots {
+                    for file in snapshot.files {
+                        worker.check_abort()?;
+                        worker.fail_on_shutdown()?;
+
+                        let mut path = snapshot.backup_dir.full_path();
+                        path.push(file);
+
+                        let index = match self.open_index_reader(&path)? {
+                            Some(index) => index,
+                            None => continue,
+                        };
+                        self.index_mark_used_chunks(index, &path, status, worker)?;
+
+                        unprocessed_index_list.remove(&path);
+
+                        let percentage = (processed_index_files + 1) * 100 / index_count;
+                        if percentage > last_percentage {
+                            info!(
+                                "marked {percentage}% ({} of {index_count} index files)",
+                                processed_index_files + 1,
+                            );
+                            last_percentage = percentage;
+                        }
+                        processed_index_files += 1;
                     }
                 }
             }
-
-            if let Some(index) = self.open_index_reader(&img)? {
-                self.index_mark_used_chunks(index, &img, status, worker)?;
-            }
-
-            let percentage = (i + 1) * 100 / image_count;
-            if percentage > last_percentage {
-                info!(
-                    "marked {percentage}% ({} of {image_count} index files)",
-                    i + 1,
-                );
-                last_percentage = percentage;
-            }
         }
 
+        let strange_paths_count = unprocessed_index_list.len();
         if strange_paths_count > 0 {
-            info!(
-                "found (and marked) {strange_paths_count} index files outside of expected directory scheme"
-            );
+            warn!("found {strange_paths_count} index files outside of expected directory scheme");
+        }
+        for path in unprocessed_index_list {
+            let index = match self.open_index_reader(&path)? {
+                Some(index) => index,
+                None => continue,
+            };
+            self.index_mark_used_chunks(index, &path, status, worker)?;
+            warn!("Marked chunks for unexpected index file at '{path:?}'");
         }
 
         Ok(())
-- 
2.39.5




* [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates
  2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
                   ` (3 preceding siblings ...)
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] garbage collection: generate index file list via datastore iterators Christian Ebner
@ 2025-03-21  9:32 ` Christian Ebner
  2025-03-25 11:56   ` Thomas Lamprecht
  2025-03-26 10:05 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple " Christian Ebner
  5 siblings, 1 reply; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:32 UTC (permalink / raw)
  To: pbs-devel

To reduce the number of atime updates, keep track of the recently
marked chunks in phase 1 of garbage collection to avoid multiple
atime updates via expensive utimensat() calls.

Recently touched chunks are tracked by storing the chunk digests in
an LRU cache of fixed capacity. By inserting a digest, the chunk
becomes the most recently touched one, and if it was already present
in the cache before the insert, the atime update can be skipped.

Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 3:
- no changes

 pbs-datastore/src/datastore.rs | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index ea7e5e9f3..4445944c0 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
 
 use anyhow::{bail, format_err, Context, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
+use pbs_tools::lru_cache::LruCache;
 use tracing::{info, warn};
 
 use proxmox_human_byte::HumanByte;
@@ -1078,6 +1079,7 @@ impl DataStore {
         &self,
         index: Box<dyn IndexFile>,
         file_name: &Path, // only used for error reporting
+        recently_touched_chunks: &mut LruCache<[u8; 32], ()>,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
     ) -> Result<(), Error> {
@@ -1088,6 +1090,12 @@ impl DataStore {
             worker.check_abort()?;
             worker.fail_on_shutdown()?;
             let digest = index.index_digest(pos).unwrap();
+
+            // Avoid multiple expensive atime updates by utimensat
+            if recently_touched_chunks.insert(*digest, ()) {
+                continue;
+            }
+
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
                 let hex = hex::encode(digest);
                 warn!(
@@ -1128,6 +1136,8 @@ impl DataStore {
         let mut unprocessed_index_list = self.list_index_files()?;
         let index_count = unprocessed_index_list.len();
 
+        // Allow up to 32 MiB, as only storing the 32 digest as key
+        let mut recently_touched_chunks = LruCache::new(1024 * 1024);
         let mut processed_index_files = 0;
         let mut last_percentage: usize = 0;
 
@@ -1154,7 +1164,13 @@ impl DataStore {
                             Some(index) => index,
                             None => continue,
                         };
-                        self.index_mark_used_chunks(index, &path, status, worker)?;
+                        self.index_mark_used_chunks(
+                            index,
+                            &path,
+                            &mut recently_touched_chunks,
+                            status,
+                            worker,
+                        )?;
 
                         unprocessed_index_list.remove(&path);
 
@@ -1181,7 +1197,13 @@ impl DataStore {
                 Some(index) => index,
                 None => continue,
             };
-            self.index_mark_used_chunks(index, &path, status, worker)?;
+            self.index_mark_used_chunks(
+                index,
+                &path,
+                &mut recently_touched_chunks,
+                status,
+                worker,
+            )?;
             warn!("Marked chunks for unexpected index file at '{path:?}'");
         }
 
-- 
2.39.5




* Re: [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
@ 2025-03-25 11:56   ` Thomas Lamprecht
  2025-03-25 12:07     ` Thomas Lamprecht
  2025-03-25 13:05     ` Christian Ebner
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Lamprecht @ 2025-03-25 11:56 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Christian Ebner

Am 21.03.25 um 10:32 schrieb Christian Ebner:
> To reduce the number of atime updates, keep track of the recently
> marked chunks in phase 1 of garbage collection to avoid multiple
> atime updates via expensive utimensat() calls.
> 
> Recently touched chunks are tracked by storing the chunk digests in
> an LRU cache of fixed capacity. By inserting a digest, the chunk will
> be the most recently touched one and if already present in the cache
> before insert, the atime update can be skipped.

Code-wise this looks alright to me, albeit I did not look at it in-depth,
but what I'd be interested in is documenting some more thoughts about how
the size of the cache was chosen; even if it was mostly random, stating
so can help a lot when rethinking this in the future, as then one doesn't
have to guess whether there was some more reasoning behind that.

Also some basic benchmarks might be great, even if from some randomly grown
setup, as long as one describes it, like the overall pool data usage,
deduplication factor, amount of backup groups, amount of snapshots and
their rough age (distribution) and basic system characteristics like the
cpu and basic parameters of the underlying storage, like filesystem type
and (block) device type that backs it, as with that one can classify the
change well enough.


> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 3:
> - no changes
> 
>  pbs-datastore/src/datastore.rs | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index ea7e5e9f3..4445944c0 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs

...

> @@ -1128,6 +1136,8 @@ impl DataStore {
>          let mut unprocessed_index_list = self.list_index_files()?;
>          let index_count = unprocessed_index_list.len();
>  
> +        // Allow up to 32 MiB, as only storing the 32 digest as key

Above comment is IMO a bit hard to parse and does not really provide any
reasoning about the chosen size FWICT.

> +        let mut recently_touched_chunks = LruCache::new(1024 * 1024);

It's quite a descriptive and good name, but something slightly shorter
like `chunk_lru_cache` would IMO be fine here too; really no hard
feelings, though.
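For reference, the arithmetic the quoted comment presumably refers to: with
32-byte (SHA-256) digests as keys, a capacity of 1024 * 1024 entries bounds
the raw key data at 32 MiB, not counting the cache's per-node bookkeeping
overhead:

```rust
fn main() {
    const DIGEST_LEN: usize = 32; // SHA-256 digest size in bytes
    const CACHE_CAPACITY: usize = 1024 * 1024; // entries in the LRU cache
    let raw_key_bytes = DIGEST_LEN * CACHE_CAPACITY;
    // 32 B per key * 1 Mi keys = 32 MiB of digest data
    assert_eq!(raw_key_bytes / (1024 * 1024), 32);
    println!("{} MiB", raw_key_bytes / (1024 * 1024)); // prints "32 MiB"
}
```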




_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel



* Re: [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates
  2025-03-25 11:56   ` Thomas Lamprecht
@ 2025-03-25 12:07     ` Thomas Lamprecht
  2025-03-25 13:05     ` Christian Ebner
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Lamprecht @ 2025-03-25 12:07 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Christian Ebner

Am 25.03.25 um 12:56 schrieb Thomas Lamprecht:
> Am 21.03.25 um 10:32 schrieb Christian Ebner:
>> To reduce the number of atime updates, keep track of the recently
>> marked chunks in phase 1 of garbage collection to avoid multiple
>> atime updates via expensive utimensat() calls.
>>
>> Recently touched chunks are tracked by storing the chunk digests in
>> an LRU cache of fixed capacity. Inserting a digest makes the chunk
>> the most recently touched one; if the digest was already present in
>> the cache before the insert, the atime update can be skipped.
> 
> Code-wise this looks alright to me, albeit I did not look at it in depth.
> What I'd be interested in is documenting some more thoughts about how
> the size of the cache was chosen; even if it was mostly arbitrary, stating
> so helps a lot when rethinking this in the future, as one then doesn't
> have to guess whether there was more reasoning behind it.
> 
> Also some basic benchmarks would be great, even from some organically
> grown setup, as long as it is described: the overall pool data usage,
> deduplication factor, number of backup groups, number of snapshots and
> their rough age (distribution), and basic system characteristics like
> the CPU and the underlying storage, e.g. filesystem type and the
> (block) device type backing it. With that one can classify the change
> well enough.

Oh, and it would naturally be nice to repeat that for a few different LRU
cache sizes to see how that changes things – here the actual datastore
used matters a bit more; maybe we could use one of our more
production-like PBS instances in the testlab for that and pass the LRU
size through some environment variable or the like in a test build.

Mentioning that the LRU cache should profit from the recent change to
process snapshots in a more logical order would also be good IMO.
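Such a test hook could look roughly like this; note that the environment
variable name and helper functions are invented for this sketch, not part
of the proxmox-backup codebase:

```rust
use std::env;

/// Parse an LRU capacity override, falling back to the default of
/// 1024 * 1024 digests when unset or unparsable.
fn parse_capacity(raw: Option<&str>) -> usize {
    raw.and_then(|v| v.parse().ok()).unwrap_or(1024 * 1024)
}

/// Hypothetical test-build hook: read the capacity from the
/// environment so benchmark runs can vary it without rebuilding.
fn gc_cache_capacity() -> usize {
    parse_capacity(env::var("PBS_GC_CHUNK_LRU_SIZE").ok().as_deref())
}

fn main() {
    println!("GC chunk LRU capacity: {}", gc_cache_capacity());
}
```

A benchmark run could then be repeated with e.g.
`PBS_GC_CHUNK_LRU_SIZE=65536` to compare hit rates and GC wall time across
cache sizes.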



* Re: [pbs-devel] [PATCH v4 proxmox-backup 4/5] garbage collection: generate index file list via datastore iterators
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] garbage collection: generate index file list via datastore iterators Christian Ebner
@ 2025-03-25 12:09   ` Thomas Lamprecht
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Lamprecht @ 2025-03-25 12:09 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion, Christian Ebner

Am 21.03.25 um 10:32 schrieb Christian Ebner:
> Instead of iterating over all index files found in the datastore in
> an unstructured manner, use the datastore iterators to logically
> iterate over them, as other datastore operations do.
> 
> This allows better distinguishing of index files in unexpected
> locations from ones in their expected location, warning the user
> about unexpected ones so they can act on possible misconfigurations.
> Further, this will make it easier to integrate marking snapshots
> with missing chunks as incomplete/corrupt.
> 
> This now iterates twice over the index files, as indices in
> unexpected locations are still considered: the list of all index
> files found in the datastore is generated first, regular index files
> are removed from it while iterating, leaving the unexpected ones
> behind.
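The two-pass scheme described above is essentially a set difference; a
rough sketch (function, type, and path names here are illustrative, not
the actual datastore API):

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// `all_index_files` comes from an unstructured directory walk,
/// `expected` from the structured namespace/group/snapshot iterators.
/// Whatever remains afterwards lies at an unexpected path and can be
/// logged for the user, while still being processed.
fn split_unexpected(
    all_index_files: Vec<PathBuf>,
    expected: &[PathBuf],
) -> Vec<PathBuf> {
    let mut unprocessed: HashSet<PathBuf> =
        all_index_files.into_iter().collect();
    for path in expected {
        // chunks of `path` get marked in use here, then the file is
        // dropped from the unprocessed list
        unprocessed.remove(path);
    }
    // everything left was not reachable via the iterators
    unprocessed.into_iter().collect()
}

fn main() {
    let all = vec![
        PathBuf::from("vm/100/2025-01-01T00:00:00Z/drive-scsi0.img.fidx"),
        PathBuf::from("stray/leftover.fidx"),
    ];
    let expected = vec![all[0].clone()];
    for path in split_unexpected(all, &expected) {
        eprintln!("found index file at unexpected path: {path:?}");
    }
}
```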

Can you also describe the rough changes in terms of time and (peak)
memory used? As replied to patch 5/5, it's probably not too important
to get the perfect reference setup for evaluating this; what matters
is having one that isn't an obvious extreme (no snapshots, the same
snapshot a hundred times, ...) and describing that setup.

It might also be nice to mention that this is in preparation for
better cache reuse due to processing snapshots from the same group
closely together.



* Re: [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates
  2025-03-25 11:56   ` Thomas Lamprecht
  2025-03-25 12:07     ` Thomas Lamprecht
@ 2025-03-25 13:05     ` Christian Ebner
  1 sibling, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-25 13:05 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox Backup Server development discussion

On 3/25/25 12:56, Thomas Lamprecht wrote:
> Am 21.03.25 um 10:32 schrieb Christian Ebner:
>> To reduce the number of atime updates, keep track of the recently
>> marked chunks in phase 1 of garbage collection to avoid multiple
>> atime updates via expensive utimensat() calls.
>>
>> Recently touched chunks are tracked by storing the chunk digests in
>> an LRU cache of fixed capacity. Inserting a digest makes the chunk
>> the most recently touched one; if the digest was already present in
>> the cache before the insert, the atime update can be skipped.
> 
> Code-wise this looks alright to me, albeit I did not look at it in depth.
> What I'd be interested in is documenting some more thoughts about how
> the size of the cache was chosen; even if it was mostly arbitrary, stating
> so helps a lot when rethinking this in the future, as one then doesn't
> have to guess whether there was more reasoning behind it.
> 
> Also some basic benchmarks would be great, even from some organically
> grown setup, as long as it is described: the overall pool data usage,
> deduplication factor, number of backup groups, number of snapshots and
> their rough age (distribution), and basic system characteristics like
> the CPU and the underlying storage, e.g. filesystem type and the
> (block) device type backing it. With that one can classify the change
> well enough.
> 
> 
>> Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 3:
>> - no changes
>>
>>   pbs-datastore/src/datastore.rs | 26 ++++++++++++++++++++++++--
>>   1 file changed, 24 insertions(+), 2 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index ea7e5e9f3..4445944c0 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
> 
> ...
> 
>> @@ -1128,6 +1136,8 @@ impl DataStore {
>>           let mut unprocessed_index_list = self.list_index_files()?;
>>           let index_count = unprocessed_index_list.len();
>>   
>> +        // Allow up to 32 MiB, as only storing the 32 digest as key
> 
> Above comment is IMO a bit hard to parse and does not really provide any
> reasoning about the chosen size FWICT.
> 
>> +        let mut recently_touched_chunks = LruCache::new(1024 * 1024);
> 
> It's quite a descriptive and good name, but something slightly shorter
> like `chunk_lru_cache` would IMO be fine here too; really no hard
> feelings, though.

Okay, will adapt this and the other suggestions and use the testlab 
datastore to generate the benchmarks as requested. Thanks!



* Re: [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates
  2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
                   ` (4 preceding siblings ...)
  2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
@ 2025-03-26 10:05 ` Christian Ebner
  5 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-26 10:05 UTC (permalink / raw)
  To: pbs-devel

superseded-by version 5:
https://lore.proxmox.com/pbs-devel/20250326100333.116722-1-c.ebner@proxmox.com/T/



Thread overview: 11+ messages
2025-03-21  9:31 [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates Christian Ebner
2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 1/5] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
2025-03-21  9:31 ` [pbs-devel] [PATCH v4 proxmox-backup 2/5] garbage collection: format error including anyhow error context Christian Ebner
2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 3/5] datastore: add helper method to open index reader from path Christian Ebner
2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 4/5] garbage collection: generate index file list via datastore iterators Christian Ebner
2025-03-25 12:09   ` Thomas Lamprecht
2025-03-21  9:32 ` [pbs-devel] [PATCH v4 proxmox-backup 5/5] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
2025-03-25 11:56   ` Thomas Lamprecht
2025-03-25 12:07     ` Thomas Lamprecht
2025-03-25 13:05     ` Christian Ebner
2025-03-26 10:05 ` [pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple " Christian Ebner
