From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] chunk store: fix race window between chunk stat and gc cleanup
Date: Thu, 06 Nov 2025 14:56:48 +0100
Message-ID: <1762436861.ipo8b4a3lk.astroid@yuna.none>
In-Reply-To: <20251106125458.479328-1-c.ebner@proxmox.com>
On November 6, 2025 1:54 pm, Christian Ebner wrote:
> Sweeping of unused chunks during garbage collection checks their
> atime to distinguish between chunks being in-use and chunks no
> longer being used. While garbage collection does lock the chunk
> store by guarding its mutex before reading file stats and deleting
> unused chunks, the conditional touch did not do this before updating
> the chunks atime (thereby also checking the presence).
>
> Therefore there is a race window between the chunks metadata being
> read and the chunk being removed, but the chunk being touched
> in-between.
>
> The race is however rare, as for this to happen the chunk must be
> older than the cutoff time and not be referenced by any index file,
> otherwise the atime would be updated during phase 1 already.
>
> Fix by guarding the chunk store mutex before touching a chunk.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> pbs-datastore/src/chunk_store.rs | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index ba7618e40..d21db4a71 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -217,6 +217,7 @@ impl ChunkStore {
>          assert!(self.locker.is_some());
>
>          let (chunk_path, _digest_str) = self.chunk_path(digest);
> +        let _lock = self.mutex.lock();
>          self.cond_touch_path(&chunk_path, assert_exists)

Alas, it's not as simple as that: this helper is also called while
already holding the mutex, so we need to split it up further, else we
deadlock immediately on chunk insertion.

1. make the existing cond_touch_chunk private and give it a _no_lock
   suffix
2. make touch_chunk private and make it call the _no_lock variant
3. add a new cond_touch_chunk helper that obtains the lock and calls
   the _no_lock variant internally
4. analyze other callers to ensure nobody else calls us with the mutex
   held already
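
To illustrate the split in steps 1-3, here is a minimal, self-contained
sketch. It is not the actual pbs-datastore code: ToyChunkStore, the
in-memory map standing in for the chunk directory, the u64 standing in
for atime, and the simplified signatures are all assumptions made for
the example. Only the locked/_no_lock layering mirrors the proposal.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

type Digest = [u8; 32];

/// Hypothetical stand-in for the chunk store: the map plays the role of
/// the chunk directory, the u64 the role of a chunk's atime.
struct ToyChunkStore {
    mutex: Mutex<HashMap<Digest, u64>>,
}

impl ToyChunkStore {
    fn new() -> Self {
        Self { mutex: Mutex::new(HashMap::new()) }
    }

    /// Touch the chunk if it exists. The caller must already hold the
    /// lock; here that is enforced by requiring the guarded map itself.
    fn cond_touch_chunk_no_lock(
        chunks: &mut HashMap<Digest, u64>,
        digest: &Digest,
        now: u64,
    ) -> bool {
        match chunks.get_mut(digest) {
            Some(atime) => {
                *atime = now;
                true
            }
            None => false,
        }
    }

    /// Entry point for callers that do not hold the lock yet.
    fn cond_touch_chunk(&self, digest: &Digest, now: u64) -> bool {
        let mut chunks = self.mutex.lock().unwrap();
        Self::cond_touch_chunk_no_lock(&mut chunks, digest, now)
    }

    /// Insertion already holds the lock, so it has to use the _no_lock
    /// variant; calling cond_touch_chunk here would lock a second time.
    /// Returns true if the chunk already existed.
    fn insert_chunk(&self, digest: Digest, now: u64) -> bool {
        let mut chunks = self.mutex.lock().unwrap();
        if Self::cond_touch_chunk_no_lock(&mut chunks, &digest, now) {
            return true;
        }
        chunks.insert(digest, now);
        false
    }

    /// Sweep under the same lock: drop chunks whose atime is older than
    /// the cutoff and return how many were removed.
    fn sweep(&self, cutoff: u64) -> usize {
        let mut chunks = self.mutex.lock().unwrap();
        let before = chunks.len();
        chunks.retain(|_, atime| *atime >= cutoff);
        before - chunks.len()
    }
}
```

With this layout, a touch and a sweep can no longer interleave, and
insert_chunk does not try to take the mutex twice. std::sync::Mutex is
not reentrant, which is why the insertion path cannot simply call the
locking helper. Passing the guarded map into the _no_lock function is a
choice of this sketch only; the real helper would keep taking &self and
rely on step 4 to make sure all callers are correct.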

And while looking at that, I realized that index_mark_used_chunks
creates a chunk marker without holding a lock. That could (would) then
be solved with your chunk-flock series, though, since it only happens
in the S3 case.

>      }
>
> --
> 2.47.3
>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
>
>