public inbox for pbs-devel@lists.proxmox.com
* [pbs-devel] [PATCH proxmox{, -backup} v2 0/2] GC: log progress output for phase 2 S3 datastores
@ 2026-01-21 10:45 Christian Ebner
  2026-01-21 10:45 ` [pbs-devel] [PATCH proxmox v2 1/1] s3-client: add request statistics gathering capabilities Christian Ebner
  2026-01-21 10:45 ` [pbs-devel] [PATCH proxmox-backup v2 1/1] GC: log progress output for phase 2 on datastores backed by s3 Christian Ebner
  0 siblings, 2 replies; 3+ messages in thread
From: Christian Ebner @ 2026-01-21 10:45 UTC (permalink / raw)
  To: pbs-devel

This patch series adds accounting for basic request statistics to
the S3 client and uses it to report the request count and average
request time during phase 2 of garbage collection on S3-backed
datastores.

Logging for phase 2 uses a rate limited logger to bound the number
of output lines and avoid log spamming, since the total number of
objects cannot be determined a priori.
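
To make the idea concrete, here is a minimal, self-contained sketch of
the rate limited logging pattern; this is illustration only, the actual
RateLimitedLogger added in the proxmox-backup patch also adapts the
interval to the elapsed runtime:

```
use std::thread::sleep;
use std::time::{Duration, Instant};

fn main() {
    // Emit at most one progress line per interval, regardless of how
    // fast the items are processed.
    let interval = Duration::from_secs(30);
    let mut last_log = Instant::now();
    let mut processed = 0u64;

    for _object in 0..100_000u64 {
        sleep(Duration::from_millis(1)); // stand-in for per-object work
        processed += 1;
        if last_log.elapsed() >= interval {
            println!("{processed} objects processed");
            last_log = Instant::now();
        }
    }
    println!("processed {processed} objects in total");
}
```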

The output includes the number of processed chunks, the count and
size of objects to be removed, and the number of objects still
pending a delete API call (as tracked in the delete list).

Changes since v1 (thanks to @Fabian for the feedback):
- Switch from a dedicated logging thread to the rate limited logger
  implementation.
- Extend the logging output to include delete object accounting and
  request statistics.

proxmox:

Christian Ebner (1):
  s3-client: add request statistics gathering capabilities

 proxmox-s3-client/src/client.rs | 52 ++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)


proxmox-backup:

Christian Ebner (1):
  GC: log progress output for phase 2 on datastores backed by s3

 pbs-datastore/src/datastore.rs | 98 +++++++++++++++++++++++++++++++++-
 1 file changed, 96 insertions(+), 2 deletions(-)


Summary over all repositories:
  2 files changed, 147 insertions(+), 3 deletions(-)

-- 
Generated by git-murpp 0.8.1


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel



* [pbs-devel] [PATCH proxmox v2 1/1] s3-client: add request statistics gathering capabilities
  2026-01-21 10:45 [pbs-devel] [PATCH proxmox{, -backup} v2 0/2] GC: log progress output for phase 2 S3 datastores Christian Ebner
@ 2026-01-21 10:45 ` Christian Ebner
  2026-01-21 10:45 ` [pbs-devel] [PATCH proxmox-backup v2 1/1] GC: log progress output for phase 2 on datastores backed by s3 Christian Ebner
  1 sibling, 0 replies; 3+ messages in thread
From: Christian Ebner @ 2026-01-21 10:45 UTC (permalink / raw)
  To: pbs-devel

Track the per-method request counts and the duration of the last
successful request sent by a client instance.

The intent is to utilize these request statistics in PBS, starting
with showing them for garbage collection as a first step.
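
For illustration, a short usage sketch of the new accessors; this
assumes that `S3Client` is re-exported at the crate root and that
`Method` is taken from hyper's http types, as elsewhere in the code:

```
use hyper::http::Method;
use proxmox_s3_client::S3Client;

// Illustration only: read the gathered statistics from a client instance.
fn report_request_stats(client: &S3Client) {
    let gets = client.request_count(Method::GET);
    let puts = client.request_count(Method::PUT);
    if let Some(duration) = client.last_request_duration() {
        println!(
            "{gets} GET / {puts} PUT requests, last request took {} ms",
            duration.as_millis()
        );
    }
}
```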

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- not present in previous version

 proxmox-s3-client/src/client.rs | 52 ++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/proxmox-s3-client/src/client.rs b/proxmox-s3-client/src/client.rs
index 83176b39..2183fd18 100644
--- a/proxmox-s3-client/src/client.rs
+++ b/proxmox-s3-client/src/client.rs
@@ -1,5 +1,6 @@
 use std::path::{Path, PathBuf};
 use std::str::FromStr;
+use std::sync::atomic::{AtomicU64, Ordering};
 use std::sync::{Arc, Mutex};
 use std::time::{Duration, Instant};
 
@@ -135,12 +136,26 @@ impl S3ClientOptions {
     }
 }
 
+#[derive(Default)]
+/// Request statistics for s3 client instance
+struct RequestStats {
+    /// Number of requests performed by method
+    delete: AtomicU64,
+    get: AtomicU64,
+    head: AtomicU64,
+    post: AtomicU64,
+    put: AtomicU64,
+    /// Duration of the last successfully completed request
+    last_request_duration: Mutex<Option<Duration>>,
+}
+
 /// S3 client for object stores compatible with the AWS S3 API
 pub struct S3Client {
     client: Client<HttpsConnector, Body>,
     options: S3ClientOptions,
     authority: Authority,
     put_rate_limiter: Option<Arc<Mutex<RateLimiter>>>,
+    request_stats: Arc<RequestStats>,
 }
 
 impl S3Client {
@@ -241,6 +256,7 @@ impl S3Client {
             options,
             authority,
             put_rate_limiter,
+            request_stats: Arc::new(RequestStats::default()),
         })
     }
 
@@ -385,6 +401,16 @@ impl S3Client {
                 tokio::time::sleep(backoff_secs).await;
             }
 
+            match parts.method {
+                Method::DELETE => self.request_stats.delete.fetch_add(1, Ordering::SeqCst),
+                Method::GET => self.request_stats.get.fetch_add(1, Ordering::SeqCst),
+                Method::HEAD => self.request_stats.head.fetch_add(1, Ordering::SeqCst),
+                Method::POST => self.request_stats.post.fetch_add(1, Ordering::SeqCst),
+                Method::PUT => self.request_stats.put.fetch_add(1, Ordering::SeqCst),
+                _ => 0,
+            };
+
+            let now = Instant::now();
             let response = if let Some(deadline) = deadline {
                 tokio::time::timeout_at(deadline, self.client.request(request)).await
             } else {
@@ -392,7 +418,10 @@ impl S3Client {
             };
 
             match response {
-                Ok(Ok(response)) => return Ok(response),
+                Ok(Ok(response)) => {
+                    *self.request_stats.last_request_duration.lock().unwrap() = Some(now.elapsed());
+                    return Ok(response);
+                }
                 Ok(Err(err)) => {
                     if retry >= MAX_S3_HTTP_REQUEST_RETRY - 1 {
                         return Err(err.into());
@@ -409,6 +438,27 @@ impl S3Client {
         bail!("failed to send request exceeding retries");
     }
 
+    /// Duration of the last successfully completed request.
+    pub fn last_request_duration(&self) -> Option<Duration> {
+        self.request_stats
+            .last_request_duration
+            .lock()
+            .unwrap()
+            .clone()
+    }
+
+    /// Request counts by method for the client since instantiation.
+    pub fn request_count(&self, method: Method) -> u64 {
+        match method {
+            Method::DELETE => self.request_stats.delete.load(Ordering::SeqCst),
+            Method::GET => self.request_stats.get.load(Ordering::SeqCst),
+            Method::HEAD => self.request_stats.head.load(Ordering::SeqCst),
+            Method::POST => self.request_stats.post.load(Ordering::SeqCst),
+            Method::PUT => self.request_stats.put.load(Ordering::SeqCst),
+            _ => 0,
+        }
+    }
+
     /// Check if bucket exists and got permissions to access it.
     /// See reference docs: https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadBucket.html
     pub async fn head_bucket(&self) -> Result<(), Error> {
-- 
2.47.3




* [pbs-devel] [PATCH proxmox-backup v2 1/1] GC: log progress output for phase 2 on datastores backed by s3
  2026-01-21 10:45 [pbs-devel] [PATCH proxmox{, -backup} v2 0/2] GC: log progress output for phase 2 S3 datastores Christian Ebner
  2026-01-21 10:45 ` [pbs-devel] [PATCH proxmox v2 1/1] s3-client: add request statistics gathering capabilities Christian Ebner
@ 2026-01-21 10:45 ` Christian Ebner
  1 sibling, 0 replies; 3+ messages in thread
From: Christian Ebner @ 2026-01-21 10:45 UTC (permalink / raw)
  To: pbs-devel

Currently there is no progress output for phase 2 of garbage
collection on datastores with an S3 backend.

To avoid log spamming, introduce a rate limited logger and adapt the
log interval, and thereby the maximum number of output lines, to the
elapsed runtime (a condensed sketch of the schedule follows the list):
- 30 seconds within the first hour, max 120 lines
- 1 minute for the second hour, max 60 lines
- 5 minutes for the [3..24] hour range, max 12 lines per hour
- 15 minutes after 24 hours, max 96 lines per day
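
As a condensed sketch, the schedule above corresponds to the following
mapping from elapsed phase 2 runtime to log interval (illustrative
helper only; the patch implements the same thresholds inline):

```
use std::time::Duration;

// Illustration only: map the elapsed runtime to the current log interval.
fn log_interval_for(elapsed: Duration) -> Duration {
    match elapsed {
        e if e >= Duration::from_secs(24 * 60 * 60) => Duration::from_secs(15 * 60),
        e if e >= Duration::from_secs(3 * 60 * 60) => Duration::from_secs(5 * 60),
        e if e >= Duration::from_secs(60 * 60) => Duration::from_secs(60),
        _ => Duration::from_secs(30),
    }
}
```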

List request statistics are gathered and displayed as well, giving
basic insight into S3 API performance.

Example output:
```
...
Start GC phase2 (sweep unused chunks)
2961 chunks processed, removing 0 B in 0 chunks, 0 pending (3 total list requests, 4 ms average)
5961 chunks processed, removing 0 B in 0 chunks, 0 pending (6 total list requests, 1 ms average)
8949 chunks processed, removing 0 B in 0 chunks, 0 pending (9 total list requests, 1 ms average)
processed 9125 chunks, 0 B in 0 chunks deleted
...
```

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- switch from dedicated thread for logging to inline rate limited logger
- adapt log rate intervals as suggested
- extend statistics

 pbs-datastore/src/datastore.rs | 98 +++++++++++++++++++++++++++++++++-
 1 file changed, 96 insertions(+), 2 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 7ad3d917d..51dcde3d8 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -4,11 +4,12 @@ use std::os::unix::ffi::OsStrExt;
 use std::os::unix::io::AsRawFd;
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, LazyLock, Mutex};
-use std::time::{Duration, SystemTime};
+use std::time::{Duration, Instant, SystemTime};
 
 use anyhow::{bail, format_err, Context, Error};
 use http_body_util::BodyExt;
 use hyper::body::Bytes;
+use hyper::http::method::Method;
 use nix::unistd::{unlinkat, UnlinkatFlags};
 use pbs_tools::lru_cache::LruCache;
 use tokio::io::AsyncWriteExt;
@@ -1778,14 +1779,20 @@ impl DataStore {
 
         if let Some(ref s3_client) = s3_client {
             let mut chunk_count = 0;
+            let mut list_request_count = 0;
             let prefix = S3PathPrefix::Some(".chunks/".to_string());
             // Operates in batches of 1000 objects max per request
             let mut list_bucket_result =
                 proxmox_async::runtime::block_on(s3_client.list_objects_v2(&prefix, None))
                     .context("failed to list chunk in s3 object store")?;
+            let mut average_request_duration =
+                s3_client.last_request_duration().unwrap_or_default();
+            list_request_count += 1;
 
             let mut delete_list =
                 S3DeleteList::with_thresholds(S3_DELETE_BATCH_LIMIT, S3_DELETE_DEFER_LIMIT_SECONDS);
+            let mut rate_limited_logger = RateLimitedLogger::with_interval(Duration::from_secs(30));
+
             loop {
                 for content in list_bucket_result.contents {
                     worker.check_abort()?;
@@ -1861,6 +1868,35 @@ impl DataStore {
                     // limit pending deletes to avoid holding too many chunk flocks
                     delete_list.conditional_delete_and_drop_locks(s3_client)?;
                 }
+
+                match rate_limited_logger.elapsed_since_creation() {
+                    e if e >= Duration::from_secs(24 * 60 * 60) => {
+                        // max 96 lines per additional day
+                        rate_limited_logger.update_interval(Duration::from_secs(15 * 60))
+                    }
+                    e if e >= Duration::from_secs(3 * 60 * 60) => {
+                        // max 12 lines per hour for [3..24] hour interval
+                        rate_limited_logger.update_interval(Duration::from_secs(5 * 60))
+                    }
+                    e if e >= Duration::from_secs(60 * 60) => {
+                        // max 60 lines for [1..2] hour interval
+                        rate_limited_logger.update_interval(Duration::from_secs(60))
+                    }
+                    // max 120 lines for first hour (with initial 30s log interval)
+                    _ => (),
+                }
+
+                rate_limited_logger.log_with(|| {
+                    format!(
+                        "{chunk_count} chunks processed, removing {} in {} chunks, {} pending ({} total list requests, {} ms average)",
+                        HumanByte::from(gc_status.removed_bytes),
+                        gc_status.removed_chunks,
+                        delete_list.len(),
+                        s3_client.request_count(Method::GET),
+                        average_request_duration.as_millis(),
+                    )
+                });
+
                 // Process next batch of chunks if there is more
                 if list_bucket_result.is_truncated {
                     list_bucket_result =
@@ -1868,6 +1904,14 @@ impl DataStore {
                             &prefix,
                             list_bucket_result.next_continuation_token.as_deref(),
                         ))?;
+                    list_request_count += 1;
+                    if let Some(duration) = s3_client.last_request_duration() {
+                        average_request_duration += duration;
+                        average_request_duration = average_request_duration
+                            .checked_div(list_request_count)
+                            .unwrap_or_default();
+                    }
+
                     continue;
                 }
 
@@ -1877,7 +1921,11 @@ impl DataStore {
             // delete the last batch of objects, if there are any remaining
             delete_list.delete_and_drop_locks(s3_client)?;
 
-            info!("processed {chunk_count} total chunks");
+            info!(
+                "processed {chunk_count} chunks, {} in {} chunks deleted",
+                HumanByte::from(gc_status.removed_bytes),
+                gc_status.removed_chunks,
+            );
 
             // Phase 2 GC of Filesystem backed storage is phase 3 for S3 backed GC
             info!("Start GC phase3 (sweep unused chunk markers)");
@@ -2876,4 +2924,50 @@ impl S3DeleteList {
         }
         Ok(())
     }
+
+    /// Current number of object keys pending deletion
+    fn len(&self) -> usize {
+        self.list.len()
+    }
+}
+
+struct RateLimitedLogger {
+    interval: Duration,
+    start_time: Instant,
+    last_log_time: Instant,
+}
+
+impl RateLimitedLogger {
+    /// Create a rate limited logger with given maximum log frequency by given interval.
+    fn with_interval(interval: Duration) -> Self {
+        let now = Instant::now();
+        Self {
+            interval,
+            start_time: now,
+            last_log_time: now,
+        }
+    }
+
+    /// Set the maximum log frequency to given interval.
+    fn update_interval(&mut self, interval: Duration) {
+        self.interval = interval;
+    }
+
+    /// Returns the age of the logger instance as time elapsed since creation.
+    fn elapsed_since_creation(&self) -> Duration {
+        self.start_time.elapsed()
+    }
+
+    /// Executes the provided callback if the last invocation exceeds the
+    /// configured interval, logging the returned value.
+    fn log_with<F>(&mut self, callback: F)
+    where
+        F: Fn() -> String,
+    {
+        let now = Instant::now();
+        if now.duration_since(self.last_log_time) >= self.interval {
+            info!("{}", callback());
+            self.last_log_time = now;
+        }
+    }
 }
-- 
2.47.3



