public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox{, -backup} v3 00/10] s3 store: fix chunk upload/insert and GC race condition for s3 backend
Date: Wed, 15 Oct 2025 18:39:58 +0200	[thread overview]
Message-ID: <20251015164008.975591-1-c.ebner@proxmox.com> (raw)

Patches 1 to 3 extend the current s3-client implementation by an additional
helper method, allowing to unconditionally upload chunks on final retry of
otherwise conditional uploads. This is to avoid failing backups in case of
concurrent uploads to the same chunk digest.

The remaining patches fix a possible race condition between s3 backend upload
and garbage collection, which can result in chunk loss. If the chunk upload
finished, garbage collection listed and checked the chunk's in-use marker, just
before it being written by the cache insert after the upload, garbage collection
can incorrectly delete the chunk. This is circumvented by setting and checking
an additional chunk marker file, which is touched before starting the upload
and removed after cache insert, assuring that these chunks are not removed.

Chunk upload marker files will not get cleaned up if the upload fails, excluding
issues in case of concurrent uploads of the same chunk. Garbage collection has
to clean up these markers as well, based on their atime.

Changes since version 2 (thanks @Fabian):
- Also fix issues arising from the concurrent conditional s3 put requests.
- Only cleanup upload marker file on success, add GC logic to cleanup lingering
  ones from failed uploads.
- Refactor chunk upload marker file related helpers.
- Split patches into smaller portions.

Changes since version 1 (thanks @Fabian):
- Refactor the garbage collection rework patches, using a callback to perform the
  chunk removal, so both filesystem and s3 backend can use the same logic without
  the need to readapt the gc status.
- Completely reworked the local datastore cache access method, so it not only
  serves the contents from s3 backend if that needs to be fetched, but also
  closes the download/insert race and drops quite some duplicate code,
  completely getting rid of the now obsolete S3Cacher
- Rework the chunk insert for s3 to also cover cases were concurrent uploads of
  the same object/key occurs, making sure that the upload marker creation will
  not lead to failure and that the upload marker cleanup is handled correctly as
  well. The only race still open is which of the two concurrent uploads inserts
  to the local cache, but since both versions must encode for the same data (
  as they have the same digest), this is not an issue. If one of the upload
  fails however, both must be considered as failed, since then there is no
  guarantee anymore that garbage collection did not cleanup the chunks from the
  s3 backend in the meantime.

proxmox:

Christian Ebner (2):
  s3-client: add exponential backoff time for upload retries
  s3-client: add helper method to force final unconditional upload on

 proxmox-s3-client/src/client.rs | 36 +++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)


proxmox-backup:

Christian Ebner (8):
  api/pull: avoid failing on concurrent conditional chunk uploads
  datastore: GC: drop overly verbose info message during s3 chunk sweep
  GC: refactor atime gathering for local chunk markers with s3 backend
  chunk store: refactor method for chunk insertion
  chunk store: add backend upload marker helpers for s3 backed stores
  api: chunk upload: fix race between chunk backend upload and insert
  api: chunk upload: fix race with garbage collection for no-cache on s3
  pull: guard chunk upload and only insert into cache after upload

 pbs-datastore/src/chunk_store.rs              | 200 +++++++++++++++++-
 pbs-datastore/src/datastore.rs                |  41 +++-
 .../src/local_datastore_lru_cache.rs          |   3 +-
 src/api2/backup/upload_chunk.rs               |  17 +-
 src/api2/config/datastore.rs                  |   2 +
 src/server/pull.rs                            |   7 +-
 6 files changed, 242 insertions(+), 28 deletions(-)


Summary over all repositories:
  7 files changed, 274 insertions(+), 32 deletions(-)

-- 
Generated by git-murpp 0.8.1


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


             reply	other threads:[~2025-10-15 16:40 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-15 16:39 Christian Ebner [this message]
2025-10-15 16:39 ` [pbs-devel] [PATCH proxmox v3 1/2] s3-client: add exponential backoff time for upload retries Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox v3 2/2] s3-client: add helper method to force final unconditional upload on Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 1/8] api/pull: avoid failing on concurrent conditional chunk uploads Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 2/8] datastore: GC: drop overly verbose info message during s3 chunk sweep Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 3/8] GC: refactor atime gathering for local chunk markers with s3 backend Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 4/8] chunk store: refactor method for chunk insertion Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 5/8] chunk store: add backend upload marker helpers for s3 backed stores Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 6/8] api: chunk upload: fix race between chunk backend upload and insert Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 7/8] api: chunk upload: fix race with garbage collection for no-cache on s3 Christian Ebner
2025-10-15 16:40 ` [pbs-devel] [PATCH proxmox-backup v3 8/8] pull: guard chunk upload and only insert into cache after upload Christian Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251015164008.975591-1-c.ebner@proxmox.com \
    --to=c.ebner@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal