public inbox for pbs-devel@lists.proxmox.com
* [PATCH v1 proxmox-backup] api: do not block tokio worker threads during chunk inserts
@ 2026-05-27 12:37 Robert Obkircher
  2026-05-27 16:12 ` Christian Ebner
  2026-05-29 10:22 ` applied: " Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Robert Obkircher @ 2026-05-27 12:37 UTC (permalink / raw)
  To: pbs-devel

Move synchronous operations off the worker threads to prevent blocking
the I/O and timer drivers of the entire runtime. This is especially
important for S3 uploads, which wait up to 3 hours for the chunk lock.

Also prevent worker starvation, which could happen because S3 uploads
are wrapped in proxmox_async::runtime::block_on, which prevents other
futures from running in the current thread.

In the backtrace from the linked forum post, two workers were waiting
for chunk locks (presumably due to duplicates) while the remaining 19
were stuck because block_on called std::thread::park.

Fixes: https://forum.proxmox.com/threads/183705
Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
---
 src/api2/backup/upload_chunk.rs | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
index 59e4caee2..eec24add2 100644
--- a/src/api2/backup/upload_chunk.rs
+++ b/src/api2/backup/upload_chunk.rs
@@ -1,4 +1,5 @@
 use std::pin::Pin;
+use std::sync::Arc;
 use std::task::{Context, Poll};
 
 use anyhow::{Error, bail, format_err};
@@ -8,6 +9,7 @@ use http_body_util::{BodyDataStream, BodyExt};
 use hyper::body::Incoming;
 use hyper::http::request::Parts;
 use serde_json::{Value, json};
+use tokio::task::spawn_blocking;
 
 use proxmox_router::{ApiHandler, ApiMethod, ApiResponseFuture, RpcEnvironment};
 use proxmox_schema::*;
@@ -232,14 +234,18 @@ async fn upload_to_backend(
     let (digest, size, chunk) =
         UploadChunk::new(BodyDataStream::new(req_body), digest, size, encoded_size).await?;
 
+    let datastore = Arc::clone(&env.datastore);
+    let backend = env.backend.clone();
+
     if env.no_cache {
         let (is_duplicate, chunk_size) =
-            env.datastore
-                .insert_chunk_no_cache(&chunk, &digest, &env.backend)?;
+            spawn_blocking(move || datastore.insert_chunk_no_cache(&chunk, &digest, &backend))
+                .await??;
         return Ok((digest, size, chunk_size as u32, is_duplicate));
     }
 
-    let (is_duplicate, chunk_size) = env.datastore.insert_chunk(&chunk, &digest, &env.backend)?;
+    let (is_duplicate, chunk_size) =
+        spawn_blocking(move || datastore.insert_chunk(&chunk, &digest, &backend)).await??;
     Ok((digest, size, chunk_size as u32, is_duplicate))
 }
 
-- 
2.47.3
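
For readers unfamiliar with the `.await??` in the hunk above: tokio's
`spawn_blocking(...).await` yields a `Result<T, JoinError>`, and here `T` is
itself the `Result` returned by the insert, so the first `?` handles a failed
(e.g. panicked) blocking task and the second handles the insert error. The
following is only a dependency-free analogy of that shape, not the code from
the patch: `std::thread::spawn` plus `join()` stands in for the blocking pool,
and `Store`, `insert_chunk` and `insert_off_thread` are hypothetical names that
do not exist in proxmox-backup.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a datastore; not the real PBS type.
struct Store {
    digests: Mutex<Vec<[u8; 32]>>,
}

impl Store {
    /// Synchronous insert: returns Ok(true) if the digest was already known.
    fn insert_chunk(&self, digest: [u8; 32], data: &[u8]) -> Result<bool, String> {
        if data.is_empty() {
            return Err("empty chunk".to_string());
        }
        let mut digests = self.digests.lock().map_err(|e| e.to_string())?;
        let is_duplicate = digests.contains(&digest);
        if !is_duplicate {
            digests.push(digest);
        }
        Ok(is_duplicate)
    }
}

/// Run the synchronous insert on another thread and flatten both error layers.
fn insert_off_thread(
    store: &Arc<Store>,
    digest: [u8; 32],
    data: Vec<u8>,
) -> Result<bool, String> {
    // Clone the Arc so the `move` closure owns its own handle; the closure
    // must be 'static, so it cannot borrow from the caller.
    let store = Arc::clone(store);
    let handle = thread::spawn(move || store.insert_chunk(digest, &data));
    // Outer layer: the thread panicked. Inner layer: the insert itself failed.
    handle
        .join()
        .map_err(|_| "insert thread panicked".to_string())?
}
```

The `Arc::clone` before the closure mirrors why the patch clones
`env.datastore` and `env.backend` first: the closure has to own everything it
touches. Unlike this sketch, tokio's `spawn_blocking` reuses a dedicated pool
of blocking threads and is awaited instead of joined, so the async worker is
free to poll other futures in the meantime.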






* Re: [PATCH v1 proxmox-backup] api: do not block tokio worker threads during chunk inserts
  2026-05-27 12:37 [PATCH v1 proxmox-backup] api: do not block tokio worker threads during chunk inserts Robert Obkircher
@ 2026-05-27 16:12 ` Christian Ebner
  2026-05-29 10:22 ` applied: " Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Christian Ebner @ 2026-05-27 16:12 UTC (permalink / raw)
  To: Robert Obkircher, pbs-devel

On 5/27/26 2:37 PM, Robert Obkircher wrote:
> Move synchronous operations off the worker threads to prevent blocking
> the I/O and timer drivers of the entire runtime. This is especially
> important for S3 uploads, which wait up to 3 hours for the chunk lock.
> 
> Also prevent worker starvation, which could happen because S3 uploads
> are wrapped in proxmox_async::runtime::block_on, which prevents other
> futures from running in the current thread.
> 
> In the backtrace from the linked forum post, two workers were waiting
> for chunk locks (presumably due to duplicates) while the remaining 19
> were stuck because block_on called std::thread::park.
> 
> Fixes: https://forum.proxmox.com/threads/183705
> Signed-off-by: Robert Obkircher <r.obkircher@proxmox.com>
> ---

Gave this patch a spin; the changes look good to me, although a second
pair of eyes would be highly appreciated given the scope.

To check for performance regressions, I also ran some basic benchmarks:

Performed backups of a Linux kernel git repo with a size of 11.488 GiB
(compressed 9.96 GiB).
For each patch state and target datastore, performed 5 runs, removing
the backup snapshot after each run to force a re-upload of the chunks,
but leaving the chunks in the store to avoid the overhead of
insert/upload. The in-memory state was cleared after each run by
restarting proxmox-backup-proxy.service.
Delta RSS was obtained by taking the difference between the max and min
values of the output as reported by:
`watch -n 1 "ps -p $(pidof proxmox-backup-proxy) -o rss | tail -n 1 | 
tee -a ps-rss.out"`
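
The max-minus-min step described above could be reproduced roughly as follows.
This is a minimal sketch, assuming `ps-rss.out` holds one RSS sample in KiB per
line; the function name `delta_rss_kib` is made up for illustration, and the
sample values in the usage note are invented, not measured data.

```rust
/// Parse one RSS sample (KiB) per line and return max - min.
/// Lines that do not parse as an unsigned integer (e.g. a stray header
/// or an empty line) are skipped. Returns None if no sample was found.
fn delta_rss_kib(samples: &str) -> Option<u64> {
    let values: Vec<u64> = samples
        .lines()
        .filter_map(|line| line.trim().parse::<u64>().ok())
        .collect();
    let max = values.iter().max()?;
    let min = values.iter().min()?;
    Some(max - min)
}
```

For example, `delta_rss_kib("100\n250\n180\n")` returns `Some(150)`. The
contents of the file can be passed in via
`std::fs::read_to_string("ps-rss.out")`.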

Regular datastore and S3 store backed by Ceph RGW using a local 
datastore cache on virtualized

regular datastore - unpatched
-----------------------------
runtime (s)  | delta RSS (KiB)
25.24 ± 0.09 | 31772
-----------------------------

S3 datastore - unpatched
-----------------------------
runtime (s)  | delta RSS (KiB)
25.19 ± 0.14 | 28072
-----------------------------

regular datastore - patched
-----------------------------
runtime (s)  | delta RSS (KiB)
25.28 ± 0.08 | 26184
-----------------------------

S3 datastore - patched
-----------------------------
runtime (s)  | delta RSS (KiB)
25.42 ± 0.15 | 35024
-----------------------------
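
One rough way to read the runtime columns above is to check whether the patched
and unpatched means differ by more than their reported uncertainties combined.
The sketch below is only a crude overlap check, not a statistical test; the
message does not say whether ± denotes a standard deviation or something else,
and `within_noise` is a hypothetical helper name.

```rust
/// Crude overlap check: true if the two means differ by no more than the
/// sum of their reported uncertainties. Assumes both uncertainties are
/// expressed in the same unit as the means.
fn within_noise(a_mean: f64, a_err: f64, b_mean: f64, b_err: f64) -> bool {
    (a_mean - b_mean).abs() <= a_err + b_err
}
```

With the values from the tables, `within_noise(25.24, 0.09, 25.28, 0.08)`
(regular datastore) and `within_noise(25.19, 0.14, 25.42, 0.15)` (S3 datastore)
both evaluate to `true`, since the differences of 0.04 s and 0.23 s are below
the combined uncertainties of 0.17 s and 0.29 s.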

Consider:

Reviewed-by: Christian Ebner <c.ebner@proxmox.com>
Tested-by: Christian Ebner <c.ebner@proxmox.com>





* applied: [PATCH v1 proxmox-backup] api: do not block tokio worker threads during chunk inserts
  2026-05-27 12:37 [PATCH v1 proxmox-backup] api: do not block tokio worker threads during chunk inserts Robert Obkircher
  2026-05-27 16:12 ` Christian Ebner
@ 2026-05-29 10:22 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2026-05-29 10:22 UTC (permalink / raw)
  To: pbs-devel, Robert Obkircher

On Wed, 27 May 2026 14:37:51 +0200, Robert Obkircher wrote:
> Move synchronous operations off the worker threads to prevent blocking
> the I/O and timer drivers of the entire runtime. This is especially
> important for S3 uploads, which wait up to 3 hours for the chunk lock.
> 
> Also prevent worker starvation, which could happen because S3 uploads
> are wrapped in proxmox_async::runtime::block_on, which prevents other
> futures from running in the current thread.
> 
> [...]

Applied, thanks!

[1/1] api: do not block tokio worker threads during chunk inserts
      commit: 23400016322c7a6981f111558e8d22666e32ee8c



