* [pbs-devel] [PATCH proxmox-backup] chunk store: fix race window between chunk stat and gc cleanup
@ 2025-11-06 12:54 Christian Ebner
2025-11-06 13:56 ` Fabian Grünbichler
2025-11-06 17:15 ` Christian Ebner
0 siblings, 2 replies; 3+ messages in thread
From: Christian Ebner @ 2025-11-06 12:54 UTC (permalink / raw)
To: pbs-devel
Sweeping of unused chunks during garbage collection checks their
atime to distinguish between chunks being in-use and chunks no
longer being used. While garbage collection does lock the chunk
store by guarding its mutex before reading file stats and deleting
unused chunks, the conditional touch did not do this before updating
the chunk's atime (thereby also checking its presence).

Therefore there is a race window between the chunk's metadata being
read and the chunk being removed, during which the chunk can be
touched.

The race is rare, however, since for this to happen the chunk must be
older than the cutoff time and not be referenced by any index file,
otherwise the atime would have been updated during phase 1 already.

Fix by guarding the chunk store mutex before touching a chunk.
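
To illustrate the intended invariant, here is a minimal sketch, assuming a toy in-memory model rather than the actual pbs-datastore code. `ToyChunkStore`, `insert`, `cond_touch` and `sweep` are hypothetical names, and atimes are plain integers. The point it shows is that the sweep's "read atime, then remove" step and the conditional touch both run under the same mutex, so a touch can no longer land between the stat and the removal.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the on-disk chunk store: maps a chunk
/// digest to its last access time (arbitrary integer timestamps).
struct ToyChunkStore {
    chunks: Mutex<HashMap<String, u64>>,
}

impl ToyChunkStore {
    fn new() -> Self {
        Self { chunks: Mutex::new(HashMap::new()) }
    }

    fn insert(&self, digest: &str, atime: u64) {
        self.chunks.lock().unwrap().insert(digest.to_string(), atime);
    }

    /// Conditional touch: bump the atime if the chunk exists and report
    /// its presence. Taking the mutex here means the touch cannot
    /// interleave with a running sweep.
    fn cond_touch(&self, digest: &str, now: u64) -> bool {
        let mut chunks = self.chunks.lock().unwrap();
        match chunks.get_mut(digest) {
            Some(atime) => {
                *atime = now;
                true
            }
            None => false,
        }
    }

    /// Sweep: the atime check and the removal happen under one guard.
    /// Returns the number of chunks removed.
    fn sweep(&self, cutoff: u64) -> usize {
        let mut chunks = self.chunks.lock().unwrap();
        let before = chunks.len();
        chunks.retain(|_, atime| *atime >= cutoff);
        before - chunks.len()
    }
}
```

In this model a chunk touched before the sweep survives it, while an untouched chunk older than the cutoff is removed and a later conditional touch reports it as missing.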
Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
pbs-datastore/src/chunk_store.rs | 1 +
1 file changed, 1 insertion(+)
diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
index ba7618e40..d21db4a71 100644
--- a/pbs-datastore/src/chunk_store.rs
+++ b/pbs-datastore/src/chunk_store.rs
@@ -217,6 +217,7 @@ impl ChunkStore {
         assert!(self.locker.is_some());
 
         let (chunk_path, _digest_str) = self.chunk_path(digest);
+        let _lock = self.mutex.lock();
         self.cond_touch_path(&chunk_path, assert_exists)
     }
--
2.47.3
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
* Re: [pbs-devel] [PATCH proxmox-backup] chunk store: fix race window between chunk stat and gc cleanup
2025-11-06 12:54 [pbs-devel] [PATCH proxmox-backup] chunk store: fix race window between chunk stat and gc cleanup Christian Ebner
@ 2025-11-06 13:56 ` Fabian Grünbichler
2025-11-06 17:15 ` Christian Ebner
1 sibling, 0 replies; 3+ messages in thread
From: Fabian Grünbichler @ 2025-11-06 13:56 UTC (permalink / raw)
To: Proxmox Backup Server development discussion
On November 6, 2025 1:54 pm, Christian Ebner wrote:
> Sweeping of unused chunks during garbage collection checks their
> atime to distinguish between chunks being in-use and chunks no
> longer being used. While garbage collection does lock the chunk
> store by guarding its mutex before reading file stats and deleting
> unused chunks, the conditional touch did not do this before updating
> the chunks atime (thereby also checking the presence).
>
> Therefore there is a race window between the chunks metadata being
> read and the chunk being removed, but the chunk being touched
> in-between.
>
> The race is however rare, as for this to happen the chunk must be
> older than the cutoff time and not be referenced by any index file,
> otherwise the atime would be updated during phase 1 already.
>
> Fix by guarding the chunk store mutex before touching a chunk.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> pbs-datastore/src/chunk_store.rs | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index ba7618e40..d21db4a71 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -217,6 +217,7 @@ impl ChunkStore {
> assert!(self.locker.is_some());
>
> let (chunk_path, _digest_str) = self.chunk_path(digest);
> + let _lock = self.mutex.lock();
> self.cond_touch_path(&chunk_path, assert_exists)

Alas, it's not as simple as that: this helper is also called while
already holding the mutex, so we need to split it up further, else we
deadlock immediately on chunk insertion.

1. make the existing cond_touch_chunk private and give it a _no_lock
   suffix
2. make touch_chunk private and make it call the _no_lock variant
3. add a new cond_touch_chunk helper that obtains the lock and calls
   the _no_lock variant internally
4. analyze other callers to ensure nobody else calls us with the mutex
   held already
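
The split described in the steps above could be sketched roughly as follows. This is a minimal toy model under stated assumptions, not the actual pbs-datastore code: `ToySplitStore`, `insert_chunk` and `atime` are hypothetical, the store is an in-memory map, and the `_no_lock` variant takes the guarded data as a `&mut` borrow so that it can only be called by code that already holds the lock.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical in-memory model: digest -> atime, guarded by one mutex.
struct ToySplitStore {
    chunks: Mutex<HashMap<String, u64>>,
}

impl ToySplitStore {
    fn new() -> Self {
        Self { chunks: Mutex::new(HashMap::new()) }
    }

    /// Private variant: the caller must already hold the lock, which the
    /// `&mut` borrow of the guarded map demonstrates.
    fn cond_touch_chunk_no_lock(
        chunks: &mut HashMap<String, u64>,
        digest: &str,
        now: u64,
    ) -> bool {
        match chunks.get_mut(digest) {
            Some(atime) => {
                *atime = now;
                true
            }
            None => false,
        }
    }

    /// Public helper: obtains the lock, then delegates to the
    /// `_no_lock` variant.
    pub fn cond_touch_chunk(&self, digest: &str, now: u64) -> bool {
        let mut chunks = self.chunks.lock().unwrap();
        Self::cond_touch_chunk_no_lock(&mut chunks, digest, now)
    }

    /// Insertion already holds the lock, so it must use the `_no_lock`
    /// variant; calling `cond_touch_chunk` here would try to lock the
    /// same mutex a second time on the same thread.
    /// Returns true if the chunk already existed.
    pub fn insert_chunk(&self, digest: &str, now: u64) -> bool {
        let mut chunks = self.chunks.lock().unwrap();
        if Self::cond_touch_chunk_no_lock(&mut chunks, digest, now) {
            return true;
        }
        chunks.insert(digest.to_string(), now);
        false
    }

    /// Read back the recorded atime, for inspection.
    pub fn atime(&self, digest: &str) -> Option<u64> {
        self.chunks.lock().unwrap().get(digest).copied()
    }
}
```

Rust's `std::sync::Mutex` is not reentrant, which is why the insertion path has to go through the variant that does not lock. Passing the guarded map into the `_no_lock` function is one way to let the compiler enforce step 4; with a `Mutex<()>` that guards external state, that check has to be done by reviewing the callers instead.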

While looking at that, I realized that index_mark_used_chunks creates
a chunk marker without holding a lock. But that could (would) then be
solved by your chunk-flock series, since it only happens in the S3
case.

> }
>
* Re: [pbs-devel] [PATCH proxmox-backup] chunk store: fix race window between chunk stat and gc cleanup
2025-11-06 12:54 [pbs-devel] [PATCH proxmox-backup] chunk store: fix race window between chunk stat and gc cleanup Christian Ebner
2025-11-06 13:56 ` Fabian Grünbichler
@ 2025-11-06 17:15 ` Christian Ebner
1 sibling, 0 replies; 3+ messages in thread
From: Christian Ebner @ 2025-11-06 17:15 UTC (permalink / raw)
To: pbs-devel
Superseded by version 2:
https://lore.proxmox.com/pbs-devel/20251106171358.865503-1-c.ebner@proxmox.com/T/