From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 18/23] GC: lock chunk marker before cleanup in phase 3 on s3 backends
Date: Thu, 06 Nov 2025 10:37:42 +0100
Message-ID: <1762418497.lnn72w5td0.astroid@yuna.none>
In-Reply-To: <20251105122233.439382-19-c.ebner@proxmox.com>
On November 5, 2025 1:22 pm, Christian Ebner wrote:
> To make sure there is no race between atime check and deletion with
> possible re-insertion.
could you expand on this?
sweep_unused_chunks holds the chunk store mutex for the full processing
of each chunk it iterates over..
so we have for each chunk:
- obtain mutex
- fstat
- call cond_sweep_chunk
- lock chunk
- fstat again
- drop mutex
- remove from cache (which re-obtains mutex)
shouldn't we protect against this race by obtaining the mutex when
conditionally touching the chunk in `insert_chunk_cached()`, or even in
`datastore.cond_touch_chunk()`? then there is no way for a chunk
insertion to update the atime here.. AFAICT the latter would also close
a race with pull syncing, which touches chunks while they are referenced
by tmp indices - those touches are currently not protected by the mutex
or the rest of the GC logic?
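to illustrate, something along these lines (completely untested, and the
mutex accessor / touch helper names are guesses):

    // take the chunk store mutex around the conditional touch, so it
    // cannot interleave with GC's fstat + unlink of the same chunk
    // (mutex() / cond_touch_path() names are guesses, untested sketch)
    pub fn cond_touch_chunk(&self, digest: &[u8; 32], assert_exists: bool) -> Result<bool, Error> {
        let (chunk_path, _digest_str) = self.chunk_path(digest);
        // same mutex that sweep_unused_chunks holds per chunk
        let _guard = self.mutex().lock().unwrap();
        self.cond_touch_path(&chunk_path, assert_exists)
    }

that way every atime update is ordered against GC's per-chunk critical
section.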
>
> By only acquiring the file lock if the chunk marker would be removed
> and double stating, the file locking penalty is avoided for the other
> cases.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 2:
> - no changes
>
> pbs-datastore/src/chunk_store.rs | 28 +++++++++++++++++++++++++++-
> pbs-datastore/src/datastore.rs | 2 ++
> 2 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 49687b2fa..08519fe2b 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
> use std::time::Duration;
>
> use anyhow::{bail, format_err, Context, Error};
> +use hex::FromHex;
> use tracing::{info, warn};
>
> use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
> @@ -22,7 +23,7 @@ use crate::data_blob::DataChunkBuilder;
> use crate::file_formats::{
> COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
> };
> -use crate::DataBlob;
> +use crate::{DataBlob, LocalDatastoreLruCache};
>
> /// File system based chunk store
> pub struct ChunkStore {
> @@ -366,6 +367,7 @@ impl ChunkStore {
> min_atime: i64,
> status: &mut GarbageCollectionStatus,
> worker: &dyn WorkerTaskContext,
> + cache: Option<&LocalDatastoreLruCache>,
> ) -> Result<(), Error> {
> // unwrap: only `None` in unit tests
> assert!(self.locker.is_some());
> @@ -419,6 +421,30 @@ impl ChunkStore {
> bad,
> status,
> || {
basically this callback has two paths:
- remove the chunk/chunk marker directly (all bad chunks, and the non-S3 case)
- remove the chunk via the cache (non-bad S3 chunks)
> + if let Some(cache) = cache {
> + // never lock bad chunks
> + if filename.to_bytes().len() == 64 {
this is already passed in as bool ;)
> + let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
> + match self.lock_chunk(&digest, Duration::from_secs(0)) {
so this is basically a "skip chunk if currently being uploaded" check,
right? might warrant a comment, else this reads like a deadlock waiting
to happen..
in phase 2 we obtain the chunk lock for all chunks on the S3 side outside
of the remove callback, here we do it inside - is there a reason for that?
chunks ending up in this state here should be rather rare (compared to
chunks existing on S3 being garbage collected), right?
but even if we keep this, the whole callback could become (AFAICT):
    || {
        // non-bad S3 chunks need to be removed via cache
        if let Some(cache) = cache {
            if !bad {
                let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
                // unless there is a concurrent upload pending
                if let Ok(_guard) =
                    self.lock_chunk(&digest, Duration::from_secs(0))
                {
                    drop(lock);
                    cache.remove(&digest)?;
                }
                return Ok(());
            }
        }
        // bad or local chunks
        unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
            |err| {
                format_err!(
                    "unlinking chunk {filename:?} failed on store '{}' - {err}",
                    self.name,
                )
            },
        )
    },
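side note: with a zero timeout this is effectively a try-lock. roughly
what I'd expect lock_chunk to boil down to, assuming it wraps a
non-blocking flock on a per-chunk lock file (helper name and layout
below are guesses):

    // try to take an exclusive flock without blocking - a concurrent
    // upload holding the lock makes this fail immediately, so GC can
    // just skip the chunk (chunk_lock_path() is a hypothetical helper)
    fn try_lock_chunk(&self, digest: &[u8; 32]) -> Result<std::fs::File, Error> {
        let path = self.chunk_lock_path(digest);
        let file = std::fs::OpenOptions::new()
            .create(true)
            .write(true)
            .open(&path)?;
        nix::fcntl::flock(
            std::os::fd::AsRawFd::as_raw_fd(&file),
            nix::fcntl::FlockArg::LockExclusiveNonblock,
        )?;
        Ok(file)
    }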
> + Ok(_guard) => {
> + // don't remove if changed since locking
> + match fstatat(
> + Some(dirfd),
> + filename,
> + nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
> + ) {
> + Ok(stat) if stat.st_atime < min_atime => {
> + let _ = cache.remove(&digest);
> + return Ok(());
> + }
> + _ => return Ok(()),
> + }
> + }
> + Err(_) => return Ok(()),
> + }
> + }
> + }
> +
> unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
> |err| {
> format_err!(
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 8b7333d80..df344974a 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1761,6 +1761,7 @@ impl DataStore {
> min_atime,
> &mut tmp_gc_status,
> worker,
> + self.cache(),
> )?;
> } else {
> self.inner.chunk_store.sweep_unused_chunks(
> @@ -1768,6 +1769,7 @@ impl DataStore {
> min_atime,
> &mut gc_status,
> worker,
> + None,
> )?;
> }
>
> --
> 2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel