From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH proxmox-backup v2 00/19] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend
Date: Tue, 4 Nov 2025 14:06:40 +0100
Message-ID: <20251104130659.435139-1-c.ebner@proxmox.com>
These patches fix possible race conditions on datastores with s3 backend during
chunk insert, renaming of corrupt chunks during verification, and cleanup during
garbage collection. Further, the patches ensure consistency between the chunk
marker file of the local datastore cache, the s3 object store and the in-memory
LRU cache during state changes caused by any of the above operations.
Consistency is achieved by using a per-chunk file locking mechanism. File locks
are stored in the predefined location for datastore file locks, using the same
`.chunks/prefix/digest` folder layout for consistency and to keep readdir and
other fs operations performant.
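As a rough illustration of the layout described above, the following sketch
derives a lock file path from a chunk digest. The function names, the lock base
directory and the four-hex-digit prefix are assumptions made for this example
and are not taken from the patches.

```rust
use std::path::{Path, PathBuf};

/// Hex-encode a 32-byte chunk digest (lowercase).
fn digest_to_hex(digest: &[u8; 32]) -> String {
    digest.iter().map(|b| format!("{b:02x}")).collect()
}

/// Build a hypothetical lock file path mirroring `.chunks/prefix/digest`.
/// `lock_base` stands in for the predefined datastore lock location.
fn chunk_lock_path(lock_base: &Path, datastore: &str, digest: &[u8; 32]) -> PathBuf {
    let hex = digest_to_hex(digest);
    // Assumption: the prefix is the first four hex digits of the digest.
    let prefix = &hex[..4];
    lock_base
        .join(datastore)
        .join(".chunks")
        .join(prefix)
        .join(&hex)
}

fn main() {
    let digest = [0xabu8; 32];
    // The base directory below is a placeholder, not the real lock location.
    let path = chunk_lock_path(Path::new("/run/lock-example"), "store1", &digest);
    println!("{}", path.display());
}
```

Spreading the lock files over prefix directories keeps each directory small,
which is the readdir concern mentioned above.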
Before introducing the file locking mechanism, the patches refactor pre-existing
code to move most of the backend-related logic out of the api code and into the
datastore implementation, in order to have a common interface, especially for
chunk insert.
As part of the series, chunks which are removed from the local datastore cache
are now also dropped from its in-memory LRU cache, so that both stay in a
consistent state.
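The idea can be sketched with a toy model in which both views of the cache are
updated under a single lock. All types and names below are hypothetical; the
sets merely stand in for the on-disk chunk marker files and the in-memory LRU
cache, and this is not the implementation used by the series.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

type Digest = [u8; 32];

/// Toy model: `markers` stands in for chunk marker files in the local cache,
/// `in_memory` for the digests tracked by the in-memory LRU cache.
#[derive(Default)]
struct CacheState {
    markers: HashSet<Digest>,
    in_memory: HashSet<Digest>,
}

#[derive(Default)]
struct LocalCache {
    state: Mutex<CacheState>,
}

impl LocalCache {
    /// Record a chunk in both views while holding the lock.
    fn insert(&self, digest: Digest) {
        let mut state = self.state.lock().unwrap();
        state.markers.insert(digest);
        state.in_memory.insert(digest);
    }

    /// Drop a chunk from both views under the same lock, so no caller can
    /// observe the marker gone while the in-memory entry still exists.
    /// Returns whether a marker was present.
    fn remove(&self, digest: &Digest) -> bool {
        let mut state = self.state.lock().unwrap();
        let had_marker = state.markers.remove(digest);
        state.in_memory.remove(digest);
        had_marker
    }

    /// Check that both views track exactly the same digests.
    fn is_consistent(&self) -> bool {
        let state = self.state.lock().unwrap();
        state.markers == state.in_memory
    }
}

fn main() {
    let cache = LocalCache::default();
    cache.insert([1u8; 32]);
    cache.remove(&[1u8; 32]);
    assert!(cache.is_consistent());
}
```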
Changes since version 1 (thanks @Fabian for review):
- Fix lock inversion for rename corrupt chunk.
- Inline the chunk lock helper, making it explicit and thereby avoiding calling the
helper for regular datastores.
- Pass the backend to the add_blob datastore helper, so it can be reused for the
backup session and pull sync job.
- Also move the s3 index upload helper from the backup env to the datastore, and
reuse it for the sync job as well.
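For readers unfamiliar with the term, a lock inversion occurs when two code
paths take the same pair of locks in opposite order, which can deadlock. The
sketch below shows the usual remedy of one fixed acquisition order. The two
mutexes only stand in for the per-chunk file lock and the chunk store mutex,
and the order chosen here is an assumption for illustration, not necessarily
the order used in the patches.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct Locks {
    /// Stands in for the per-chunk file lock.
    chunk_lock: Mutex<()>,
    /// Stands in for the chunk store mutex; counts completed operations.
    store_mutex: Mutex<u32>,
}

/// Hypothetical insert path: chunk lock first, then store mutex.
fn insert_chunk(locks: &Locks) {
    let _chunk_guard = locks.chunk_lock.lock().unwrap();
    let mut ops = locks.store_mutex.lock().unwrap();
    *ops += 1;
}

/// Hypothetical rename path: same order as the insert path, so the two
/// paths can never wait on each other in a cycle.
fn rename_corrupt_chunk(locks: &Locks) {
    let _chunk_guard = locks.chunk_lock.lock().unwrap();
    let mut ops = locks.store_mutex.lock().unwrap();
    *ops += 1;
}

/// Run both paths concurrently and return the number of completed operations.
fn run_workers(threads: usize, iterations: u32) -> u32 {
    let locks = Arc::new(Locks {
        chunk_lock: Mutex::new(()),
        store_mutex: Mutex::new(0),
    });
    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let locks = Arc::clone(&locks);
            thread::spawn(move || {
                for _ in 0..iterations {
                    if i % 2 == 0 {
                        insert_chunk(&locks);
                    } else {
                        rename_corrupt_chunk(&locks);
                    }
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *locks.store_mutex.lock().unwrap();
    total
}

fn main() {
    println!("completed operations: {}", run_workers(4, 100));
}
```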
This patch series obsoletes two previous patch series with unfortunately
incomplete bugfix attempts found at:
- https://lore.proxmox.com/pbs-devel/8d711a20-b193-47a9-8f38-6ce800e6d0e8@proxmox.com/T/
- https://lore.proxmox.com/pbs-devel/20251015164008.975591-1-c.ebner@proxmox.com/T/
proxmox-backup:
Christian Ebner (19):
sync: pull: instantiate backend only once per sync job
api/datastore: move group notes setting to the datastore
api/datastore: move snapshot deletion into dedicated datastore helper
api/datastore: move backup log upload by implementing datastore helper
api: backup: use datastore add_blob helper for backup session
api/datastore: add dedicated datastore helper to set snapshot notes
api/datastore: move s3 index upload helper to datastore backend
datastore: refactor chunk insert based on backend
verify: rename corrupted to corrupt in log output and function names
verify/datastore: make rename corrupt chunk a datastore helper method
datastore: refactor rename_corrupt_chunk error handling
chunk store: implement per-chunk file locking helper for s3 backend
datastore: acquire chunk store mutex lock when renaming corrupt chunk
datastore: get per-chunk file lock for chunk rename on s3 backend
fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU
cache
datastore: add locking to protect against races on chunk insert for s3
GC: fix race with chunk upload/insert on s3 backends
GC: lock chunk marker before cleanup in phase 3 on s3 backends
datastore: GC: drop overly verbose info message during s3 chunk sweep
pbs-datastore/src/backup_info.rs | 2 +-
pbs-datastore/src/chunk_store.rs | 54 +++++-
pbs-datastore/src/datastore.rs | 291 +++++++++++++++++++++++++++++--
src/api2/admin/datastore.rs | 77 +++-----
src/api2/backup/environment.rs | 53 ++----
src/api2/backup/upload_chunk.rs | 64 ++-----
src/api2/tape/restore.rs | 6 +-
src/backup/verify.rs | 83 ++-------
src/server/pull.rs | 61 +++----
9 files changed, 415 insertions(+), 276 deletions(-)
Summary over all repositories:
9 files changed, 415 insertions(+), 276 deletions(-)
--
Generated by git-murpp 0.8.1