public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates
@ 2025-03-20 12:30 Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message Christian Ebner
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

This patches implement the logic to greatly improve the performance
of phase 1 garbage collection by avoiding multiple atime updates on
the same chunk.

Currently, phase 1 GC iterates over all folders in the datastore
looking and collecting all image index files without taking any
logical assumptions (e.g. namespaces, groups, snapshots, ...). This
is to avoid accidentally missing image index files located in
unexpected paths and therefore not marking their chunks as in use,
leading to potential data losses.

This patches improve phase 1 by:
- iterating index images using the datatstore's iterators for detecting
  regular index files. Paths outside of the iterator logic are still taken
  into account and processed as well by generating a list of all the found
  images first, removing index files encountered while iterating, finally
  leaving a list of indexes with unexpected paths. These unexpected paths
  are now also logged, for the user to potentially take action.
- Keeping track of recently touched chunks by storing their digests in a
  LRU cache, skipping over expensive atime updates for chunks already
  present in the cache.

Most notably changes since version 2 (thanks Fabian for feedback):
- Use LRU cache instead of keeping track of chunks from the previous
  snapshot in the group.
- Split patches to logically separate iteration from caching logic
- Adapt for better anyhow context error propagation and formatting

Most notably changes since version 1 (thanks Fabian for feedback):
- Logically iterate using pre-existing iterators instead of constructing
  data structure for iteration when listing images.
- Tested that double listing does not affect runtime.
- Chunks are now remembered for all archives per snapshot, not just a
  single archive per snapshot as previously, this mimics more closely
  the backup behaviour, this give some additional gains in some cases.

Statistics generated by averaging 3 GC runtimes, measured after an initial
run each to warm up caches. Datastores A (1515 index files) and B (192
index files) are unrelated, containing "real" backups.
The syscall counts were generated using
`strace -f -e utimensat -p $(pidof proxmox-backup-proxy) 2&> dump.trace` and
(after small cleanup) `wc -l dump.trace`.

datastore A on spinning disk:
unpatched: 115.3s ± 0.5 s,  utimensat calls: 6864667
version 2:  52.0s ± 0.0 s,  utimensat calls: 1151734
version 3:  24.6s ± 0.5 s,  utimensat calls: 1079149

datastore B on SSD:
unpatched: 44.6 ± 3.2 s, utimensat calls: 2034949
version 2: 15.0 ± 1.0 s, utimensat calls:  562929
version 3: 14.6 ± 0.5 s, utimensat calls:  559053

proxmox:

Christian Ebner (1):
  worker task: include anyhow error context in state error message

 proxmox-rest-server/src/worker_task.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

proxmox-backup:

Christian Ebner (5):
  tools: lru cache: tell if node was already present or newly inserted
  garbage collection: format error including anyhow error context
  datastore: add helper method to open index reader from path
  garbage collection: generate image list via datastore iterators
  fix #5331: garbage collection: avoid multiple chunk atime updates

 pbs-datastore/src/datastore.rs  | 192 +++++++++++++++++++++++---------
 pbs-tools/src/lru_cache.rs      |   4 +-
 src/api2/admin/datastore.rs     |   4 +-
 src/bin/proxmox-backup-proxy.rs |   2 +-
 4 files changed, 145 insertions(+), 57 deletions(-)

-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
@ 2025-03-20 12:30 ` Christian Ebner
  2025-03-20 13:47   ` [pbs-devel] applied: " Wolfgang Bumiller
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 2/6] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

Currently the anyhow error context of a given error is not included
in the error message, as `to_string` does use the default formatting
[0].

Include the error context, formatting it as single line as the
message is also shown to the users in e.g. the Proxmox Backup Severs
task state in the UI.

[0] https://docs.rs/anyhow/latest/anyhow/struct.Error.html#display-representations

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 proxmox-rest-server/src/worker_task.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proxmox-rest-server/src/worker_task.rs b/proxmox-rest-server/src/worker_task.rs
index 2968a68c..a3a65add 100644
--- a/proxmox-rest-server/src/worker_task.rs
+++ b/proxmox-rest-server/src/worker_task.rs
@@ -1003,7 +1003,7 @@ impl WorkerTask {
 
         if let Err(err) = result {
             TaskState::Error {
-                message: err.to_string(),
+                message: format!("{err:#}"),
                 endtime,
             }
         } else if warn_count > 0 {
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 2/6] tools: lru cache: tell if node was already present or newly inserted
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message Christian Ebner
@ 2025-03-20 12:30 ` Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 3/6] garbage collection: format error including anyhow error context Christian Ebner
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

Add a boolean return type to LruCache::insert(), telling if the node
was already present in the cache or if it was newly inserted.

This will allow to use the LRU cache for garbage collection, where
it is required to skip atime updates for chunks already marked in
use.

That improves phase 1 garbage collection performance by avoiding,
multiple atime updates.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 pbs-tools/src/lru_cache.rs | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/pbs-tools/src/lru_cache.rs b/pbs-tools/src/lru_cache.rs
index 1befb1a0d..9e0112647 100644
--- a/pbs-tools/src/lru_cache.rs
+++ b/pbs-tools/src/lru_cache.rs
@@ -133,7 +133,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
 
     /// Insert or update an entry identified by `key` with the given `value`.
     /// This entry is placed as the most recently used node at the head.
-    pub fn insert(&mut self, key: K, value: V) {
+    pub fn insert(&mut self, key: K, value: V) -> bool {
         match self.map.entry(key) {
             Entry::Occupied(mut o) => {
                 // Node present, update value
@@ -142,6 +142,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
                 let mut node = unsafe { Box::from_raw(node_ptr) };
                 node.value = value;
                 let _node_ptr = Box::into_raw(node);
+                true
             }
             Entry::Vacant(v) => {
                 // Node not present, insert a new one
@@ -159,6 +160,7 @@ impl<K: std::cmp::Eq + std::hash::Hash + Copy, V> LruCache<K, V> {
                 if self.map.len() > self.capacity {
                     self.pop_tail();
                 }
+                false
             }
         }
     }
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 3/6] garbage collection: format error including anyhow error context
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 2/6] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
@ 2025-03-20 12:30 ` Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path Christian Ebner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

Until now errors are shown ignoring the anyhow error context. In
order to allow the garbage collection to return additional error
context, format the error including the context as single line.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 src/api2/admin/datastore.rs     | 4 +---
 src/bin/proxmox-backup-proxy.rs | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
index cfe319c4e..54dad81cb 100644
--- a/src/api2/admin/datastore.rs
+++ b/src/api2/admin/datastore.rs
@@ -1219,9 +1219,7 @@ pub fn start_garbage_collection(
         crate::server::do_garbage_collection_job(job, datastore, &auth_id, None, to_stdout)
             .map_err(|err| {
                 format_err!(
-                    "unable to start garbage collection job on datastore {} - {}",
-                    store,
-                    err
+                    "unable to start garbage collection job on datastore {store} - {err:#}"
                 )
             })?;
 
diff --git a/src/bin/proxmox-backup-proxy.rs b/src/bin/proxmox-backup-proxy.rs
index fdfa7463a..dd735767f 100644
--- a/src/bin/proxmox-backup-proxy.rs
+++ b/src/bin/proxmox-backup-proxy.rs
@@ -543,7 +543,7 @@ async fn schedule_datastore_garbage_collection() {
             Some(event_str),
             false,
         ) {
-            eprintln!("unable to start garbage collection job on datastore {store} - {err}");
+            eprintln!("unable to start garbage collection job on datastore {store} - {err:#}");
         }
     }
 }
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
                   ` (2 preceding siblings ...)
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 3/6] garbage collection: format error including anyhow error context Christian Ebner
@ 2025-03-20 12:30 ` Christian Ebner
  2025-03-20 14:22   ` Wolfgang Bumiller
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 5/6] garbage collection: generate image list via datastore iterators Christian Ebner
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

Refactor the archive type and index file reader opening with its
error handling into a helper method for better reusability.

This allows to use the same logic for both, expected image paths
and unexpected image paths when iterating trough the datastore
in a hierarchical manner.

Improve error handling by switching to anyhow's error context.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- add check for passed path being absolute

 pbs-datastore/src/datastore.rs | 76 ++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 22 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index a6a91ca79..7b5ea4272 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -5,7 +5,7 @@ use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, LazyLock, Mutex};
 
-use anyhow::{bail, format_err, Error};
+use anyhow::{bail, format_err, Context, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
 use tracing::{info, warn};
 
@@ -1029,10 +1029,57 @@ impl DataStore {
         Ok(list)
     }
 
+    // Similar to open index, but ignore index files with blob or unknown archive type.
+    // Further, do not fail if file vanished.
+    fn open_index_reader(&self, absolute_path: &Path) -> Result<Option<Box<dyn IndexFile>>, Error> {
+        let archive_type = match ArchiveType::from_path(absolute_path) {
+            Ok(archive_type) => archive_type,
+            // ignore archives with unknown archive type
+            Err(_) => return Ok(None),
+        };
+
+        if absolute_path.is_relative() {
+            bail!(
+                "expected absolute path, got '{}'",
+                absolute_path.to_string_lossy()
+            );
+        }
+
+        let file = match std::fs::File::open(absolute_path) {
+            Ok(file) => file,
+            // ignore vanished files
+            Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(None),
+            Err(err) => {
+                return Err(Error::from(err).context(format!(
+                    "can't open file {}",
+                    absolute_path.to_string_lossy()
+                )))
+            }
+        };
+
+        match archive_type {
+            ArchiveType::FixedIndex => {
+                let reader = FixedIndexReader::new(file).context(format!(
+                    "can't open fixed index {}",
+                    absolute_path.to_string_lossy()
+                ))?;
+                Ok(Some(Box::new(reader)))
+            }
+            ArchiveType::DynamicIndex => {
+                let reader = DynamicIndexReader::new(file).context(format!(
+                    "can't open dynamic index {}",
+                    absolute_path.to_string_lossy()
+                ))?;
+                Ok(Some(Box::new(reader)))
+            }
+            ArchiveType::Blob => Ok(None),
+        }
+    }
+
     // mark chunks  used by ``index`` as used
-    fn index_mark_used_chunks<I: IndexFile>(
+    fn index_mark_used_chunks(
         &self,
-        index: I,
+        index: Box<dyn IndexFile>,
         file_name: &Path, // only used for error reporting
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
@@ -1090,24 +1137,8 @@ impl DataStore {
                 }
             }
 
-            match std::fs::File::open(&img) {
-                Ok(file) => {
-                    if let Ok(archive_type) = ArchiveType::from_path(&img) {
-                        if archive_type == ArchiveType::FixedIndex {
-                            let index = FixedIndexReader::new(file).map_err(|e| {
-                                format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
-                            })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
-                        } else if archive_type == ArchiveType::DynamicIndex {
-                            let index = DynamicIndexReader::new(file).map_err(|e| {
-                                format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
-                            })?;
-                            self.index_mark_used_chunks(index, &img, status, worker)?;
-                        }
-                    }
-                }
-                Err(err) if err.kind() == io::ErrorKind::NotFound => (), // ignore vanished files
-                Err(err) => bail!("can't open index {} - {}", img.to_string_lossy(), err),
+            if let Some(index) = self.open_index_reader(&img)? {
+                self.index_mark_used_chunks(index, &img, status, worker)?;
             }
 
             let percentage = (i + 1) * 100 / image_count;
@@ -1173,7 +1204,8 @@ impl DataStore {
 
             info!("Start GC phase1 (mark used chunks)");
 
-            self.mark_used_chunks(&mut gc_status, worker)?;
+            self.mark_used_chunks(&mut gc_status, worker)
+                .context("marking used chunks failed")?;
 
             info!("Start GC phase2 (sweep unused chunks)");
             self.inner.chunk_store.sweep_unused_chunks(
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 5/6] garbage collection: generate image list via datastore iterators
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
                   ` (3 preceding siblings ...)
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path Christian Ebner
@ 2025-03-20 12:30 ` Christian Ebner
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 6/6] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
  2025-03-21  9:32 ` [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple " Christian Ebner
  6 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

Instead of iterating over all images found in the datastore in an
unstructured manner, use the datastore iterators to logically iterate
over them as other datastore operations will.

This allows to better distinguish images in unexpected locations from
ones in their expected location, warning the user of unexpected ones
to allow to act on possible missconfigurations. Further, this will
allow to integrate marking of snapshots with missing chunks as
incomplete/corrupt more easily.

This now iterates twice over the images, as index files in unexpected
locations are still considered by generating a list of all images to
be found in the datastore and removing regular index files from that
list, leaving unexpected ones behind.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- split iterating from caching logic

 pbs-datastore/src/datastore.rs | 100 ++++++++++++++++++++++-----------
 1 file changed, 67 insertions(+), 33 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 7b5ea4272..c4123f2b7 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -25,7 +25,7 @@ use pbs_api_types::{
     MaintenanceMode, MaintenanceType, Operation, UPID,
 };
 
-use crate::backup_info::{BackupDir, BackupGroup};
+use crate::backup_info::{BackupDir, BackupGroup, BackupInfo};
 use crate::chunk_store::ChunkStore;
 use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
 use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
@@ -970,10 +970,10 @@ impl DataStore {
         ListGroups::new(Arc::clone(self), ns)?.collect()
     }
 
-    fn list_images(&self) -> Result<Vec<PathBuf>, Error> {
+    fn list_images(&self) -> Result<HashSet<PathBuf>, Error> {
         let base = self.base_path();
 
-        let mut list = vec![];
+        let mut list = HashSet::new();
 
         use walkdir::WalkDir;
 
@@ -1021,7 +1021,7 @@ impl DataStore {
                 if archive_type == ArchiveType::FixedIndex
                     || archive_type == ArchiveType::DynamicIndex
                 {
-                    list.push(path);
+                    list.insert(path);
                 }
             }
         }
@@ -1117,43 +1117,77 @@ impl DataStore {
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
     ) -> Result<(), Error> {
-        let image_list = self.list_images()?;
-        let image_count = image_list.len();
-
+        // Iterate twice over the datastore to fetch images, even if this comes with an additional
+        // runtime cost:
+        // - First iteration to find all index files, no matter if they are in a location expected
+        //   by the datastore's hierarchy
+        // - Iterate using the datastore's helpers, so the namespaces, groups and snapshots are
+        //   looked up given the expected hierarchy and iterator logic
+        //
+        // By this it is assured that all index files are used, even if they would not have been
+        // seen by the regular logic and the user is informed by the garbage collection run about
+        // the detected index files not following the iterators logic.
+
+        let mut unprocessed_image_list = self.list_images()?;
+        let image_count = unprocessed_image_list.len();
+
+        let mut processed_images = 0;
         let mut last_percentage: usize = 0;
 
-        let mut strange_paths_count: u64 = 0;
-
-        for (i, img) in image_list.into_iter().enumerate() {
-            worker.check_abort()?;
-            worker.fail_on_shutdown()?;
-
-            if let Some(backup_dir_path) = img.parent() {
-                let backup_dir_path = backup_dir_path.strip_prefix(self.base_path())?;
-                if let Some(backup_dir_str) = backup_dir_path.to_str() {
-                    if pbs_api_types::parse_ns_and_snapshot(backup_dir_str).is_err() {
-                        strange_paths_count += 1;
+        let arc_self = Arc::new(self.clone());
+        for namespace in arc_self
+            .recursive_iter_backup_ns(BackupNamespace::root())
+            .context("creating namespace iterator failed")?
+        {
+            let namespace = namespace.context("iterating namespaces failed")?;
+            for group in arc_self.iter_backup_groups(namespace)? {
+                let group = group.context("iterating backup groups failed")?;
+                let mut snapshots = group.list_backups().context("listing snapshots failed")?;
+                // Sort by snapshot timestamp to iterate over consecutive snapshots for each image.
+                BackupInfo::sort_list(&mut snapshots, true);
+                for snapshot in snapshots {
+                    for file in snapshot.files {
+                        worker.check_abort()?;
+                        worker.fail_on_shutdown()?;
+
+                        let mut path = snapshot.backup_dir.full_path();
+                        path.push(file);
+
+                        let index = match self.open_index_reader(&path)? {
+                            Some(index) => index,
+                            None => continue,
+                        };
+                        self.index_mark_used_chunks(index, &path, status, worker)?;
+
+                        unprocessed_image_list.remove(&path);
+
+                        let percentage = (processed_images + 1) * 100 / image_count;
+                        if percentage > last_percentage {
+                            info!(
+                                "marked {percentage}% ({} of {image_count} index files)",
+                                processed_images + 1,
+                            );
+                            last_percentage = percentage;
+                        }
+                        processed_images += 1;
                     }
                 }
             }
-
-            if let Some(index) = self.open_index_reader(&img)? {
-                self.index_mark_used_chunks(index, &img, status, worker)?;
-            }
-
-            let percentage = (i + 1) * 100 / image_count;
-            if percentage > last_percentage {
-                info!(
-                    "marked {percentage}% ({} of {image_count} index files)",
-                    i + 1,
-                );
-                last_percentage = percentage;
-            }
         }
 
+        let strange_paths_count = unprocessed_image_list.len();
         if strange_paths_count > 0 {
-            info!(
-                "found (and marked) {strange_paths_count} index files outside of expected directory scheme"
+            warn!("found {strange_paths_count} index files outside of expected directory scheme");
+        }
+        for path in unprocessed_image_list {
+            let index = match self.open_index_reader(&path)? {
+                Some(index) => index,
+                None => continue,
+            };
+            self.index_mark_used_chunks(index, &path, status, worker)?;
+            warn!(
+                "Marked chunks for unexpected index file at '{}'",
+                path.to_string_lossy()
             );
         }
 
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] [PATCH v3 proxmox-backup 6/6] fix #5331: garbage collection: avoid multiple chunk atime updates
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
                   ` (4 preceding siblings ...)
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 5/6] garbage collection: generate image list via datastore iterators Christian Ebner
@ 2025-03-20 12:30 ` Christian Ebner
  2025-03-21  9:32 ` [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple " Christian Ebner
  6 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 12:30 UTC (permalink / raw)
  To: pbs-devel

To reduce the number of atimes updates, keep track of the recently
marked chunks in phase 1 of garbage to avoid multiple atime updates
via expensive utimensat() calls.

Recently touched chunks are tracked by storing the chunk digests in
an LRU cache of fixed capacity. By inserting a digest, the chunk will
be the most recently touched one and if already present in the cache
before insert, the atime update can be skipped.

Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=5331
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 2:
- Switch to LRU cache instead of keeping track of chunks from previous
  snapshot of the same group.

 pbs-datastore/src/datastore.rs | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index c4123f2b7..1f1c3b396 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -7,6 +7,7 @@ use std::sync::{Arc, LazyLock, Mutex};
 
 use anyhow::{bail, format_err, Context, Error};
 use nix::unistd::{unlinkat, UnlinkatFlags};
+use pbs_tools::lru_cache::LruCache;
 use tracing::{info, warn};
 
 use proxmox_human_byte::HumanByte;
@@ -1081,6 +1082,7 @@ impl DataStore {
         &self,
         index: Box<dyn IndexFile>,
         file_name: &Path, // only used for error reporting
+        recently_touched_chunks: &mut LruCache<[u8; 32], ()>,
         status: &mut GarbageCollectionStatus,
         worker: &dyn WorkerTaskContext,
     ) -> Result<(), Error> {
@@ -1091,6 +1093,12 @@ impl DataStore {
             worker.check_abort()?;
             worker.fail_on_shutdown()?;
             let digest = index.index_digest(pos).unwrap();
+
+            // Avoid multiple expensive atime updates by utimensat
+            if recently_touched_chunks.insert(*digest, ()) {
+                continue;
+            }
+
             if !self.inner.chunk_store.cond_touch_chunk(digest, false)? {
                 let hex = hex::encode(digest);
                 warn!(
@@ -1131,6 +1139,8 @@ impl DataStore {
         let mut unprocessed_image_list = self.list_images()?;
         let image_count = unprocessed_image_list.len();
 
+        // Allow up to 32 MiB, as only storing the 32 digest as key
+        let mut recently_touched_chunks = LruCache::new(1024 * 1024);
         let mut processed_images = 0;
         let mut last_percentage: usize = 0;
 
@@ -1157,7 +1167,13 @@ impl DataStore {
                             Some(index) => index,
                             None => continue,
                         };
-                        self.index_mark_used_chunks(index, &path, status, worker)?;
+                        self.index_mark_used_chunks(
+                            index,
+                            &path,
+                            &mut recently_touched_chunks,
+                            status,
+                            worker,
+                        )?;
 
                         unprocessed_image_list.remove(&path);
 
@@ -1184,7 +1200,13 @@ impl DataStore {
                 Some(index) => index,
                 None => continue,
             };
-            self.index_mark_used_chunks(index, &path, status, worker)?;
+            self.index_mark_used_chunks(
+                index,
+                &path,
+                &mut recently_touched_chunks,
+                status,
+                worker,
+            )?;
             warn!(
                 "Marked chunks for unexpected index file at '{}'",
                 path.to_string_lossy()
-- 
2.39.5



_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pbs-devel] applied: [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message Christian Ebner
@ 2025-03-20 13:47   ` Wolfgang Bumiller
  0 siblings, 0 replies; 11+ messages in thread
From: Wolfgang Bumiller @ 2025-03-20 13:47 UTC (permalink / raw)
  To: Christian Ebner; +Cc: pbs-devel

(this is a duplicate - it's applied now)

On Thu, Mar 20, 2025 at 01:30:05PM +0100, Christian Ebner wrote:
> Currently the anyhow error context of a given error is not included
> in the error message, as `to_string` does use the default formatting
> [0].
> 
> Include the error context, formatting it as single line as the
> message is also shown to the users in e.g. the Proxmox Backup Severs
> task state in the UI.
> 
> [0] https://docs.rs/anyhow/latest/anyhow/struct.Error.html#display-representations
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 1:
> - not present in previous version
> 
>  proxmox-rest-server/src/worker_task.rs | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/proxmox-rest-server/src/worker_task.rs b/proxmox-rest-server/src/worker_task.rs
> index 2968a68c..a3a65add 100644
> --- a/proxmox-rest-server/src/worker_task.rs
> +++ b/proxmox-rest-server/src/worker_task.rs
> @@ -1003,7 +1003,7 @@ impl WorkerTask {
>  
>          if let Err(err) = result {
>              TaskState::Error {
> -                message: err.to_string(),
> +                message: format!("{err:#}"),
>                  endtime,
>              }
>          } else if warn_count > 0 {
> -- 
> 2.39.5


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path Christian Ebner
@ 2025-03-20 14:22   ` Wolfgang Bumiller
  2025-03-20 14:40     ` Christian Ebner
  0 siblings, 1 reply; 11+ messages in thread
From: Wolfgang Bumiller @ 2025-03-20 14:22 UTC (permalink / raw)
  To: Christian Ebner; +Cc: pbs-devel

minor error handling nits:

On Thu, Mar 20, 2025 at 01:30:08PM +0100, Christian Ebner wrote:
> Refactor the archive type and index file reader opening with its
> error handling into a helper method for better reusability.
> 
> This allows to use the same logic for both, expected image paths
> and unexpected image paths when iterating trough the datastore
> in a hierarchical manner.
> 
> Improve error handling by switching to anyhow's error context.
> 
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 1:
> - add check for passed path being absolute
> 
>  pbs-datastore/src/datastore.rs | 76 ++++++++++++++++++++++++----------
>  1 file changed, 54 insertions(+), 22 deletions(-)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index a6a91ca79..7b5ea4272 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -5,7 +5,7 @@ use std::os::unix::io::AsRawFd;
>  use std::path::{Path, PathBuf};
>  use std::sync::{Arc, LazyLock, Mutex};
>  
> -use anyhow::{bail, format_err, Error};
> +use anyhow::{bail, format_err, Context, Error};
>  use nix::unistd::{unlinkat, UnlinkatFlags};
>  use tracing::{info, warn};
>  
> @@ -1029,10 +1029,57 @@ impl DataStore {
>          Ok(list)
>      }
>  
> +    // Similar to open index, but ignore index files with blob or unknown archive type.
> +    // Further, do not fail if file vanished.
> +    fn open_index_reader(&self, absolute_path: &Path) -> Result<Option<Box<dyn IndexFile>>, Error> {
> +        let archive_type = match ArchiveType::from_path(absolute_path) {
> +            Ok(archive_type) => archive_type,
> +            // ignore archives with unknown archive type
> +            Err(_) => return Ok(None),
> +        };
> +
> +        if absolute_path.is_relative() {
> +            bail!(
> +                "expected absolute path, got '{}'",
> +                absolute_path.to_string_lossy()

Since this get touched anyway - I think it makes sense (and it's much
less code) to just format paths with `{:?}` (and inline them into the
format string ->
    bail!("expected absolute path, got {absolute_path:?}");
is a single line after all ;-) )

> +            );
> +        }
> +
> +        let file = match std::fs::File::open(absolute_path) {
> +            Ok(file) => file,
> +            // ignore vanished files
> +            Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(None),
> +            Err(err) => {
> +                return Err(Error::from(err).context(format!(
> +                    "can't open file {}",
> +                    absolute_path.to_string_lossy()
> +                )))
> +            }
> +        };
> +
> +        match archive_type {
> +            ArchiveType::FixedIndex => {
> +                let reader = FixedIndexReader::new(file).context(format!(

`format!()` allocates, so this should use `.with_context()` instead of
`.context()` to only do that when an actual error happens.


> +                    "can't open fixed index {}",
> +                    absolute_path.to_string_lossy()
> +                ))?;
> +                Ok(Some(Box::new(reader)))
> +            }
> +            ArchiveType::DynamicIndex => {
> +                let reader = DynamicIndexReader::new(file).context(format!(

^ same here

> +                    "can't open dynamic index {}",
> +                    absolute_path.to_string_lossy()
> +                ))?;
> +                Ok(Some(Box::new(reader)))
> +            }
> +            ArchiveType::Blob => Ok(None),
> +        }
> +    }
> +
>      // mark chunks  used by ``index`` as used
> -    fn index_mark_used_chunks<I: IndexFile>(
> +    fn index_mark_used_chunks(
>          &self,
> -        index: I,
> +        index: Box<dyn IndexFile>,
>          file_name: &Path, // only used for error reporting
>          status: &mut GarbageCollectionStatus,
>          worker: &dyn WorkerTaskContext,
> @@ -1090,24 +1137,8 @@ impl DataStore {
>                  }
>              }
>  
> -            match std::fs::File::open(&img) {
> -                Ok(file) => {
> -                    if let Ok(archive_type) = ArchiveType::from_path(&img) {
> -                        if archive_type == ArchiveType::FixedIndex {
> -                            let index = FixedIndexReader::new(file).map_err(|e| {
> -                                format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
> -                            })?;
> -                            self.index_mark_used_chunks(index, &img, status, worker)?;
> -                        } else if archive_type == ArchiveType::DynamicIndex {
> -                            let index = DynamicIndexReader::new(file).map_err(|e| {
> -                                format_err!("can't read index '{}' - {}", img.to_string_lossy(), e)
> -                            })?;
> -                            self.index_mark_used_chunks(index, &img, status, worker)?;
> -                        }
> -                    }
> -                }
> -                Err(err) if err.kind() == io::ErrorKind::NotFound => (), // ignore vanished files
> -                Err(err) => bail!("can't open index {} - {}", img.to_string_lossy(), err),
> +            if let Some(index) = self.open_index_reader(&img)? {
> +                self.index_mark_used_chunks(index, &img, status, worker)?;
>              }
>  
>              let percentage = (i + 1) * 100 / image_count;
> @@ -1173,7 +1204,8 @@ impl DataStore {
>  
>              info!("Start GC phase1 (mark used chunks)");
>  
> -            self.mark_used_chunks(&mut gc_status, worker)?;
> +            self.mark_used_chunks(&mut gc_status, worker)
> +                .context("marking used chunks failed")?;
>  
>              info!("Start GC phase2 (sweep unused chunks)");
>              self.inner.chunk_store.sweep_unused_chunks(
> -- 
> 2.39.5


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path
  2025-03-20 14:22   ` Wolfgang Bumiller
@ 2025-03-20 14:40     ` Christian Ebner
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-20 14:40 UTC (permalink / raw)
  To: Wolfgang Bumiller; +Cc: pbs-devel

On 3/20/25 15:22, Wolfgang Bumiller wrote:
> minor error handling nits:

Okay, will incorporate these comments in a version 4.

Will however wait a bit for further feedback if there is any, not to 
spam the mailing list with to may iterations of the patches.


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates
  2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
                   ` (5 preceding siblings ...)
  2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 6/6] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
@ 2025-03-21  9:32 ` Christian Ebner
  6 siblings, 0 replies; 11+ messages in thread
From: Christian Ebner @ 2025-03-21  9:32 UTC (permalink / raw)
  To: pbs-devel

superseded-by version 4:
https://lore.proxmox.com/pbs-devel/20250321093202.155899-1-c.ebner@proxmox.com/T/


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-03-21  9:33 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-20 12:30 [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple atime updates Christian Ebner
2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox 1/6] worker task: include anyhow error context in state error message Christian Ebner
2025-03-20 13:47   ` [pbs-devel] applied: " Wolfgang Bumiller
2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 2/6] tools: lru cache: tell if node was already present or newly inserted Christian Ebner
2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 3/6] garbage collection: format error including anyhow error context Christian Ebner
2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 4/6] datastore: add helper method to open index reader from path Christian Ebner
2025-03-20 14:22   ` Wolfgang Bumiller
2025-03-20 14:40     ` Christian Ebner
2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 5/6] garbage collection: generate image list via datastore iterators Christian Ebner
2025-03-20 12:30 ` [pbs-devel] [PATCH v3 proxmox-backup 6/6] fix #5331: garbage collection: avoid multiple chunk atime updates Christian Ebner
2025-03-21  9:32 ` [pbs-devel] [PATCH v3 proxmox proxmox-backup 0/6] GC: avoid multiple " Christian Ebner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal