* [PATCH proxmox v2 0/3] fix #6858: implement retry logic for transient API errors
@ 2026-02-24 13:49 Christian Ebner
2026-02-24 13:49 ` [PATCH proxmox v2 1/3] s3-client: early return when request timeout deadline reached Christian Ebner
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Christian Ebner @ 2026-02-24 13:49 UTC (permalink / raw)
To: pbs-devel
These patches implement the best practice [0] for handling S3 API
response status codes 500 and 503 by retrying the requests after an
exponential backoff time. Do the same for status code 504, as this
is returned by some storage providers when overwhelmed [1].
The first 2 patches contain a small fix to avoid additional response
latency when the request timeout is reached and reorganize the code
for better logical flow. The final patch then adds the additional
response status code checks for retries.
Link to the issue in bugzilla:
https://bugzilla.proxmox.com/show_bug.cgi?id=6858
[0] https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorBestPractices.html
[1] https://forum.proxmox.com/threads/180956/
Changes since version 1 (thanks @Fabian for review):
- return the last error if retries are exhausted
- consider also 504 gateway timeout as retryable
Christian Ebner (3):
s3-client: early return when request timeout deadline reached
s3-client: move exponential backoff to after the response state check
fix #6858: s3-client: retry request on 500, 503 and 504 status codes
proxmox-s3-client/src/client.rs | 38 ++++++++++++++++-----------------
1 file changed, 19 insertions(+), 19 deletions(-)
--
2.47.3
* [PATCH proxmox v2 1/3] s3-client: early return when request timeout deadline reached
From: Christian Ebner @ 2026-02-24 13:49 UTC (permalink / raw)
To: pbs-devel
The optional timeout value generates a deadline, after which the
request times out and fails, independent of retries.
The current implementation, however, needlessly continues to loop over
the remaining retries, including the potential put rate limit delay
and exponential backoff time, creating unjustified additional latency.
Fix this by returning early with an error once the deadline is reached.
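The early-return behaviour described above can be illustrated with a
small, synchronous sketch. It is not the client code itself: the real
implementation wraps the request future in tokio::time::timeout_at and
uses anyhow for errors, whereas the names SendError and
send_with_deadline, the u16 status stand-in and the String error are
assumptions made only for this example.

```rust
use std::time::Instant;

/// Hypothetical error type for this sketch; the patch itself uses anyhow.
#[derive(Debug, PartialEq)]
pub enum SendError {
    /// The overall deadline passed; further retries cannot help.
    DeadlineReached,
    /// Every attempt failed; holds the last error seen.
    RetriesExhausted(String),
}

/// Run `attempt` up to `max_retries` times, but stop immediately once the
/// optional deadline has passed instead of looping over remaining retries.
pub fn send_with_deadline<F>(
    mut attempt: F,
    max_retries: usize,
    deadline: Option<Instant>,
) -> Result<u16, SendError>
where
    F: FnMut() -> Result<u16, String>,
{
    let mut last_err = String::from("no attempt made");
    for _retry in 0..max_retries {
        // Early return: a reached deadline is final, independent of retries.
        if let Some(deadline) = deadline {
            if Instant::now() >= deadline {
                return Err(SendError::DeadlineReached);
            }
        }
        match attempt() {
            Ok(status) => return Ok(status),
            // Remember the error and fall through to the next attempt.
            Err(err) => last_err = err,
        }
    }
    Err(SendError::RetriesExhausted(last_err))
}
```

With a deadline that has already passed, the closure is never invoked
and no rate limit delay or backoff would be paid for the remaining
retries.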
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- no changes
proxmox-s3-client/src/client.rs | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/proxmox-s3-client/src/client.rs b/proxmox-s3-client/src/client.rs
index 83176b39..5e30aa12 100644
--- a/proxmox-s3-client/src/client.rs
+++ b/proxmox-s3-client/src/client.rs
@@ -386,23 +386,20 @@ impl S3Client {
}
let response = if let Some(deadline) = deadline {
- tokio::time::timeout_at(deadline, self.client.request(request)).await
+ tokio::time::timeout_at(deadline, self.client.request(request))
+ .await
+ .context("request timeout reached")?
} else {
- Ok(self.client.request(request).await)
+ self.client.request(request).await
};
match response {
- Ok(Ok(response)) => return Ok(response),
- Ok(Err(err)) => {
+ Ok(response) => return Ok(response),
+ Err(err) => {
if retry >= MAX_S3_HTTP_REQUEST_RETRY - 1 {
return Err(err.into());
}
}
- Err(_elapsed) => {
- if retry >= MAX_S3_HTTP_REQUEST_RETRY - 1 {
- bail!("request timed out exceeding retries");
- }
- }
}
}
--
2.47.3
* [PATCH proxmox v2 2/3] s3-client: move exponential backoff to after the response state check
From: Christian Ebner @ 2026-02-24 13:49 UTC (permalink / raw)
To: pbs-devel
The exponential backoff must only be performed after transient error
states anyway, so move it to the end of the loop, which also avoids
an unneeded retry counter check.
Since the put rate limiter remains in place, it now also correctly
accounts for the additional exponential backoff time, which already
covers part of the potential delay.
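To make the resulting delays concrete, here is a minimal sketch of the
`base * 3^retry` formula used in the diff. The base delay of one second
is an assumption, since the value of
S3_HTTP_REQUEST_RETRY_BACKOFF_DEFAULT is not shown in this patch, and
the helper names backoff_after and backoff_schedule are hypothetical.

```rust
use std::time::Duration;

/// Assumed base delay; the real S3_HTTP_REQUEST_RETRY_BACKOFF_DEFAULT
/// value is not shown in this patch.
const BACKOFF_BASE: Duration = Duration::from_secs(1);

/// Delay slept after failed attempt number `retry` (0-based), using the
/// same `base * 3^retry` formula as the diff.
pub fn backoff_after(retry: u32) -> Duration {
    BACKOFF_BASE * 3_u32.pow(retry)
}

/// Delays slept between attempts if every attempt fails transiently.
/// No delay follows the final attempt, because the loop returns there.
pub fn backoff_schedule(max_retries: u32) -> Vec<Duration> {
    (0..max_retries.saturating_sub(1)).map(backoff_after).collect()
}
```

Under these assumptions, three attempts sleep for 1 and then 3 seconds.
As far as the diff shows, the sleep now follows the failed attempt with
index `retry`, so the first delay equals the base value rather than
three times the base value.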
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- no changes
proxmox-s3-client/src/client.rs | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/proxmox-s3-client/src/client.rs b/proxmox-s3-client/src/client.rs
index 5e30aa12..f3e5eb45 100644
--- a/proxmox-s3-client/src/client.rs
+++ b/proxmox-s3-client/src/client.rs
@@ -380,11 +380,6 @@ impl S3Client {
}
}
- if retry > 0 {
- let backoff_secs = S3_HTTP_REQUEST_RETRY_BACKOFF_DEFAULT * 3_u32.pow(retry as u32);
- tokio::time::sleep(backoff_secs).await;
- }
-
let response = if let Some(deadline) = deadline {
tokio::time::timeout_at(deadline, self.client.request(request))
.await
@@ -401,6 +396,9 @@ impl S3Client {
}
}
}
+
+ let backoff_secs = S3_HTTP_REQUEST_RETRY_BACKOFF_DEFAULT * 3_u32.pow(retry as u32);
+ tokio::time::sleep(backoff_secs).await;
}
bail!("failed to send request exceeding retries");
--
2.47.3
* [PATCH proxmox v2 3/3] fix #6858: s3-client: retry request on 500, 503 and 504 status codes
From: Christian Ebner @ 2026-02-24 13:49 UTC (permalink / raw)
To: pbs-devel
Follow the best practices for AWS S3 error handling [0] and perform
retries on requests with HTTP status code 500 or 503 in the response.
Further, do the same for 504 gateway timeout errors encountered by
some users in the community forum [1] in combination with Hetzner's
S3 storage offerings.
This is done for all requests unconditionally, with the maximum number
of retries and the optional request timeout being honored.
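The retry decision can be sketched without any HTTP dependency as
follows. This is a simplified stand-in rather than the client code:
plain u16 values replace StatusCode, String replaces the error type,
and is_retryable_status, should_retry and request_with_retry are
hypothetical names. The backoff sleep between attempts is left out.

```rust
/// Status codes treated as transient, matching the list in this patch.
pub fn is_retryable_status(status: u16) -> bool {
    matches!(status, 500 | 503 | 504)
}

/// Another attempt follows only if the outcome is transient (a retryable
/// status or a transport error) and attempts remain.
pub fn should_retry(
    response: &Result<u16, String>,
    retry: usize,
    max_retries: usize,
) -> bool {
    let transient = match response {
        Ok(status) => is_retryable_status(*status),
        Err(_) => true,
    };
    transient && retry + 1 < max_retries
}

/// Call `attempt` until it yields a non-transient outcome or the retries
/// are exhausted, then hand back the last outcome unchanged.
pub fn request_with_retry<F>(mut attempt: F, max_retries: usize) -> Result<u16, String>
where
    F: FnMut() -> Result<u16, String>,
{
    let mut retry = 0;
    loop {
        let response = attempt();
        if !should_retry(&response, retry, max_retries) {
            // Final answer or retries exhausted: return the last result.
            return response;
        }
        retry += 1;
        // A real client would sleep for the exponential backoff here.
    }
}
```

Note that in this sketch, as in the diff, a retryable status on the
last attempt is returned as a regular response, so interpreting the
status code is left to the caller.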
[0] https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorBestPractices.html
[1] https://forum.proxmox.com/threads/180956/
Fixes: https://bugzilla.proxmox.com/show_bug.cgi?id=6858
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 1:
- return the last error if retries are exhausted
- consider also 504 gateway timeout as retryable
proxmox-s3-client/src/client.rs | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/proxmox-s3-client/src/client.rs b/proxmox-s3-client/src/client.rs
index f3e5eb45..35c80948 100644
--- a/proxmox-s3-client/src/client.rs
+++ b/proxmox-s3-client/src/client.rs
@@ -388,13 +388,18 @@ impl S3Client {
self.client.request(request).await
};
- match response {
- Ok(response) => return Ok(response),
- Err(err) => {
- if retry >= MAX_S3_HTTP_REQUEST_RETRY - 1 {
- return Err(err.into());
- }
- }
+ let do_retry = match &response {
+ Ok(response) => matches!(
+ response.status(),
+ StatusCode::INTERNAL_SERVER_ERROR
+ | StatusCode::SERVICE_UNAVAILABLE
+ | StatusCode::GATEWAY_TIMEOUT
+ ),
+ Err(_) => true,
+ };
+
+ if !do_retry || retry >= MAX_S3_HTTP_REQUEST_RETRY - 1 {
+ return Ok(response?);
}
let backoff_secs = S3_HTTP_REQUEST_RETRY_BACKOFF_DEFAULT * 3_u32.pow(retry as u32);
--
2.47.3