From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 6/7] api: chunk upload: fix race with garbage collection for no-cache on s3
Date: Mon, 06 Oct 2025 15:18:13 +0200
Message-ID: <1759754434.n5p51h47f1.astroid@yuna.none>
In-Reply-To: <20251006104151.487202-7-c.ebner@proxmox.com>
On October 6, 2025 12:41 pm, Christian Ebner wrote:
> With the no-cache flag set, chunks uploaded to the s3 backend are
> never inserted into the local datastore cache. The presence of the
> chunk marker file is, however, required so that garbage collection
> does not clean up the chunks. While the marker files are created
> during phase 1 of garbage collection for indexed chunks, this does
> not cover backups that are still in progress with the no-cache flag
> set.
>
> Therefore, mark chunks as in-progress while they are being uploaded,
> just like in the regular cached mode, but replace the in-progress
> marker with the zero-sized chunk marker file once the upload has
> finished, to avoid incorrect cleanup by garbage collection.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> pbs-datastore/src/chunk_store.rs | 13 +++++++++++++
> pbs-datastore/src/datastore.rs   |  7 +++++++
> src/api2/backup/upload_chunk.rs  | 12 ++++++++++--
> 3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 22efe4a32..7fd92b626 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -594,6 +594,19 @@ impl ChunkStore {
>          Ok(())
>      }
>
> +    pub(crate) fn persist_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
> +        if self.datastore_backend_type == DatastoreBackendType::Filesystem {
> +            bail!("cannot create backend upload marker, not a cache store");
> +        }
> +        let (marker_path, _digest_str) = self.chunk_backed_upload_marker_path(digest);
> +        let (chunk_path, digest_str) = self.chunk_path(digest);
> +        let _lock = self.mutex.lock();
> +
> +        std::fs::rename(marker_path, chunk_path).map_err(|err| {
> +            format_err!("persisting backup upload marker failed for {digest_str} - {err}")
> +        })
> +    }
> +
>      pub(crate) fn cleanup_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
>          if self.datastore_backend_type == DatastoreBackendType::Filesystem {
>              bail!("cannot cleanup backend upload marker, not a cache store");
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 58fb863ec..8b0d4ab5c 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1894,6 +1894,13 @@ impl DataStore {
>          self.inner.chunk_store.insert_backend_upload_marker(digest)
>      }
>
> +    /// Persist the backend upload marker to be a zero size chunk marker.
> +    ///
> +    /// Marks the chunk as present in the local store cache without inserting its payload.
> +    pub fn persist_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
> +        self.inner.chunk_store.persist_backend_upload_marker(digest)
> +    }
> +
>      /// Remove the marker file signaling an in-progress upload to the backend
>      pub fn cleanup_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
>          self.inner.chunk_store.cleanup_backend_upload_marker(digest)
> diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
> index d4b1850eb..35d873ebf 100644
> --- a/src/api2/backup/upload_chunk.rs
> +++ b/src/api2/backup/upload_chunk.rs
> @@ -263,10 +263,18 @@ async fn upload_to_backend(
>
>              if env.no_cache {
>                  let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
> -                let is_duplicate = s3_client
> +                env.datastore.insert_backend_upload_marker(&digest)?;
this has the same issue as patch #5 - if two clients attempt to upload
the same digest concurrently, then one of them will fail and abort the
backup..
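
one way around that might be to let the marker creation signal "already
in progress" instead of erroring out. rough, untested sketch - the
helper name `try_insert_upload_marker` is made up and not part of this
series:

    use std::io::ErrorKind;
    use std::path::Path;

    // illustrative only: atomically create the in-progress marker, but treat a
    // marker created by a concurrent uploader as "someone else is already
    // uploading this digest" instead of failing the whole backup
    fn try_insert_upload_marker(marker_path: &Path) -> Result<bool, std::io::Error> {
        match std::fs::OpenOptions::new()
            .write(true)
            .create_new(true) // fails with AlreadyExists if the marker is present
            .open(marker_path)
        {
            Ok(_file) => Ok(true), // we own the marker and should do the upload
            Err(err) if err.kind() == ErrorKind::AlreadyExists => Ok(false),
            Err(err) => Err(err),
        }
    }

the caller would then still have to decide whether to treat the `false`
case like a duplicate or wait for the other writer to finish, but at
least it wouldn't abort the backup.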
> +                let is_duplicate = match s3_client
>                      .upload_no_replace_with_retry(object_key, data)
>                      .await
> -                    .map_err(|err| format_err!("failed to upload chunk to s3 backend - {err:#}"))?;
> +                {
> +                    Ok(is_duplicate) => is_duplicate,
> +                    Err(err) => {
> +                        datastore.cleanup_backend_upload_marker(&digest)?;
> +                        bail!("failed to upload chunk to s3 backend - {err:#}");
> +                    }
> +                };
> +                env.datastore.persist_backend_upload_marker(&digest)?;
and if this fails, the corresponding chunk can never be uploaded again..
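
if we keep this approach, the error path here should probably at least
try to drop the marker again so the digest isn't blocked for all future
backups. untested sketch, reusing the helpers from this patch (warn! is
just a stand-in for whatever logging is used there):

    // illustrative only: on persist failure, try to remove the in-progress
    // marker again so a later backup can retry uploading this digest
    if let Err(err) = env.datastore.persist_backend_upload_marker(&digest) {
        if let Err(cleanup_err) = env.datastore.cleanup_backend_upload_marker(&digest) {
            // best effort - if this fails too, the stale marker remains behind
            log::warn!("failed to clean up backend upload marker: {cleanup_err:#}");
        }
        bail!("failed to persist backend upload marker - {err:#}");
    }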
>                  return Ok((digest, size, encoded_size, is_duplicate));
>              }
>
> --
> 2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel