From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 6/7] api: chunk upload: fix race with garbage collection for no-cache on s3
Date: Mon, 06 Oct 2025 15:18:13 +0200 [thread overview]
Message-ID: <1759754434.n5p51h47f1.astroid@yuna.none> (raw)
In-Reply-To: <20251006104151.487202-7-c.ebner@proxmox.com>
On October 6, 2025 12:41 pm, Christian Ebner wrote:
> Chunks uploaded to the s3 backend are never inserted into the local
> datastore cache. The presence of the chunk marker file is however
> required for garbage collection to not cleanup the chunks. While the
> marker files are created during phase 1 of the garbage collection for
> indexed chunks, this is not the case for in progress backups with the
> no-cache flag set.
>
> Therefore, mark chunks as in-progress while being uploaded just like
> for the regular mode with cache, but replace this with the zero-sized
> chunk marker file after upload finished to avoid incorrect garbage
> collection cleanup.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> pbs-datastore/src/chunk_store.rs | 13 +++++++++++++
> pbs-datastore/src/datastore.rs | 7 +++++++
> src/api2/backup/upload_chunk.rs | 12 ++++++++++--
> 3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 22efe4a32..7fd92b626 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -594,6 +594,19 @@ impl ChunkStore {
> Ok(())
> }
>
> + pub(crate) fn persist_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
> + if self.datastore_backend_type == DatastoreBackendType::Filesystem {
> + bail!("cannot create backend upload marker, not a cache store");
> + }
> + let (marker_path, _digest_str) = self.chunk_backed_upload_marker_path(digest);
> + let (chunk_path, digest_str) = self.chunk_path(digest);
> + let _lock = self.mutex.lock();
> +
> + std::fs::rename(marker_path, chunk_path).map_err(|err| {
> + format_err!("persisting backup upload marker failed for {digest_str} - {err}")
> + })
> + }
> +
> pub(crate) fn cleanup_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
> if self.datastore_backend_type == DatastoreBackendType::Filesystem {
> bail!("cannot cleanup backend upload marker, not a cache store");
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 58fb863ec..8b0d4ab5c 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1894,6 +1894,13 @@ impl DataStore {
> self.inner.chunk_store.insert_backend_upload_marker(digest)
> }
>
> + /// Persist the backend upload marker to be a zero size chunk marker.
> + ///
> + /// Marks the chunk as present in the local store cache without inserting its payload.
> + pub fn persist_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
> + self.inner.chunk_store.persist_backend_upload_marker(digest)
> + }
> +
> /// Remove the marker file signaling an in-progress upload to the backend
> pub fn cleanup_backend_upload_marker(&self, digest: &[u8; 32]) -> Result<(), Error> {
> self.inner.chunk_store.cleanup_backend_upload_marker(digest)
> diff --git a/src/api2/backup/upload_chunk.rs b/src/api2/backup/upload_chunk.rs
> index d4b1850eb..35d873ebf 100644
> --- a/src/api2/backup/upload_chunk.rs
> +++ b/src/api2/backup/upload_chunk.rs
> @@ -263,10 +263,18 @@ async fn upload_to_backend(
>
> if env.no_cache {
> let object_key = pbs_datastore::s3::object_key_from_digest(&digest)?;
> - let is_duplicate = s3_client
> + env.datastore.insert_backend_upload_marker(&digest)?;
this has the same issue as patch #5 - if two clients attempt to upload
the same digest concurrently, then one of them will fail and abort the
backup..
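
A rough, self-contained sketch of what I mean (not PBS code, the helper
name and marker path are made up): if the marker insert is strict in the
create_new() sense, the second concurrent uploader errors out and aborts
its backup, whereas treating an already existing marker as "someone else
is already uploading this chunk" would let both backups proceed:

    use std::fs::OpenOptions;
    use std::io::ErrorKind;
    use std::path::Path;

    fn insert_upload_marker(path: &Path) -> std::io::Result<bool> {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            // we created the marker, so this request owns the upload
            Ok(_) => Ok(true),
            // a concurrent uploader beat us to it - not an error
            Err(e) if e.kind() == ErrorKind::AlreadyExists => Ok(false),
            Err(e) => Err(e),
        }
    }

    fn main() -> std::io::Result<()> {
        let marker = Path::new("/tmp/chunk-digest.upload");
        let first = insert_upload_marker(marker)?;
        let second = insert_upload_marker(marker)?; // does not fail, just yields false
        println!("first owns upload: {first}, second owns upload: {second}");
        std::fs::remove_file(marker)?;
        Ok(())
    }
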
> + let is_duplicate = match s3_client
> .upload_no_replace_with_retry(object_key, data)
> .await
> - .map_err(|err| format_err!("failed to upload chunk to s3 backend - {err:#}"))?;
> + {
> + Ok(is_duplicate) => is_duplicate,
> + Err(err) => {
> + datastore.cleanup_backend_upload_marker(&digest)?;
> + bail!("failed to upload chunk to s3 backend - {err:#}");
> + }
> + };
> + env.datastore.persist_backend_upload_marker(&digest)?;
and if this fails, the corresponding chunk can never be uploaded again..
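
Something along these lines might be less fragile here (again just a
sketch with made-up paths, not the actual datastore API): if promoting
the in-progress marker to the zero-sized chunk marker fails, drop the
marker again so a later backup can retry the upload instead of being
blocked for good:

    use std::fs;
    use std::io;
    use std::path::Path;

    fn persist_upload_marker(marker: &Path, chunk_marker: &Path) -> io::Result<()> {
        if let Err(err) = fs::rename(marker, chunk_marker) {
            // best-effort cleanup so the chunk is not permanently stuck
            let _ = fs::remove_file(marker);
            return Err(err);
        }
        Ok(())
    }

    fn main() -> io::Result<()> {
        let marker = Path::new("/tmp/chunk-digest.upload");
        let chunk_marker = Path::new("/tmp/chunk-digest.marker");
        fs::File::create(marker)?;                    // pretend the upload just finished
        persist_upload_marker(marker, chunk_marker)?; // promote to zero-sized chunk marker
        fs::remove_file(chunk_marker)?;
        Ok(())
    }
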
> return Ok((digest, size, encoded_size, is_duplicate));
> }
>
> --
> 2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel