From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 3/7] chunk store: add and use method to remove chunks
Date: Mon, 06 Oct 2025 15:17:15 +0200
Message-ID: <1759750889.o3xg1a8w89.astroid@yuna.none>
In-Reply-To: <20251006104151.487202-4-c.ebner@proxmox.com>
On October 6, 2025 12:41 pm, Christian Ebner wrote:
> Reworks the removing of cached chunks during phase 2 of garbage
> collection for datastores backed by s3.
>
> Move the actual chunk removal logic to be a method of the chunk store
> and require the mutex guard to be passed as a shared reference,
> signaling that the caller locked the store as required to avoid races
> with chunk insert.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> pbs-datastore/src/chunk_store.rs | 15 ++++++++++++++-
> pbs-datastore/src/datastore.rs | 2 +-
> pbs-datastore/src/local_datastore_lru_cache.rs | 7 +++----
> 3 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
> index 0725ca3a7..010785fbc 100644
> --- a/pbs-datastore/src/chunk_store.rs
> +++ b/pbs-datastore/src/chunk_store.rs
> @@ -1,7 +1,7 @@
> use std::os::unix::fs::MetadataExt;
> use std::os::unix::io::AsRawFd;
> use std::path::{Path, PathBuf};
> -use std::sync::{Arc, Mutex};
> +use std::sync::{Arc, Mutex, MutexGuard};
> use std::time::Duration;
>
> use anyhow::{bail, format_err, Context, Error};
> @@ -254,6 +254,19 @@ impl ChunkStore {
> Ok(true)
> }
>
> + /// Remove a chunk from the chunk store
> + ///
> + /// Used to remove chunks from the local datastore cache. Caller must signal to hold the chunk
> + /// store mutex lock.
> + pub fn remove_chunk(
> + &self,
> + digest: &[u8; 32],
> + _guard: &MutexGuard<'_, ()>,
if we do this, then the guard should be a proper type that is used
across the board, not a bare `MutexGuard<'_, ()>`.
but it is also a bit wrong interface-wise: just obtaining the chunk
store lock doesn't make it safe to remove chunks. only GC is "allowed"
to do that, since it handles all the logic and additional locking.
while that is the case in the call path here and now, it should at least
be mentioned in the doc comments, and the visibility restricted
accordingly.
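
to illustrate the "proper type" idea, here is a minimal standalone
sketch. all names in it (`ChunkStoreGuard`, `lock`, and the simplified
`ChunkStore` with a flat, hex-named chunk layout) are made up for the
example and are not the actual pbs-datastore code:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::{Mutex, MutexGuard};

/// Hypothetical proof-of-lock token. The field is private, so code outside
/// this module can only obtain one by calling `ChunkStore::lock`.
pub struct ChunkStoreGuard<'a> {
    _inner: MutexGuard<'a, ()>,
}

/// Simplified stand-in for the real chunk store (assumption: flat layout,
/// one file per chunk, named after the hex digest).
pub struct ChunkStore {
    base: PathBuf,
    mutex: Mutex<()>,
}

impl ChunkStore {
    pub fn new(base: PathBuf) -> Self {
        Self {
            base,
            mutex: Mutex::new(()),
        }
    }

    /// Take the chunk store lock and hand out the typed guard.
    pub fn lock(&self) -> ChunkStoreGuard<'_> {
        // A poisoned mutex still protects nothing but `()`, so recover it.
        let inner = self
            .mutex
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        ChunkStoreGuard { _inner: inner }
    }

    fn chunk_path(&self, digest: &[u8; 32]) -> PathBuf {
        let hex: String = digest.iter().map(|b| format!("{:02x}", b)).collect();
        self.base.join(hex)
    }

    /// Remove a chunk file.
    ///
    /// Holding the guard is necessary but NOT sufficient: only garbage
    /// collection may call this, after its own checks and locking.
    /// Visibility is restricted to the crate for that reason.
    pub(crate) fn remove_chunk(
        &self,
        digest: &[u8; 32],
        _guard: &ChunkStoreGuard<'_>,
    ) -> io::Result<()> {
        fs::remove_file(self.chunk_path(digest))
    }
}
```

note that such a token only proves that the caller holds a chunk store
lock, not that the caller is GC and has done the additional checks,
which is why the sketch also carries the doc comment and `pub(crate)`.
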
> + ) -> Result<(), Error> {
> + let (path, _digest) = self.chunk_path(digest);
> + std::fs::remove_file(path).map_err(Error::from)
> + }
> +
> pub fn get_chunk_iterator(
> &self,
> ) -> Result<
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index e36af68fc..4f55eb9db 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1686,7 +1686,7 @@ impl DataStore {
> ) {
> if let Some(cache) = self.cache() {
> // ignore errors, phase 3 will retry cleanup anyways
> - let _ = cache.remove(&digest);
> + let _ = cache.remove(&digest, &lock);
so this call site is okay, because it happens in GC after all the
checks and additional locking have been done to ensure that:
- chunks which are not yet referenced by visible indices are not removed
- GC is not running in a pre-reload process that doesn't "see" new
  backup writers
- GC is not running in a post-reload process while the old process still
  has writers
- ..
> }
> delete_list.push(content.key);
> }
> diff --git a/pbs-datastore/src/local_datastore_lru_cache.rs b/pbs-datastore/src/local_datastore_lru_cache.rs
> index c0edd3619..1d2e87cb9 100644
> --- a/pbs-datastore/src/local_datastore_lru_cache.rs
> +++ b/pbs-datastore/src/local_datastore_lru_cache.rs
> @@ -2,7 +2,7 @@
> //! a network layer (e.g. via the S3 backend).
>
> use std::future::Future;
> -use std::sync::Arc;
> +use std::sync::{Arc, MutexGuard};
>
> use anyhow::{bail, Error};
> use http_body_util::BodyExt;
> @@ -87,10 +87,9 @@ impl LocalDatastoreLruCache {
> /// Remove a chunk from the local datastore cache.
> ///
> /// Fails if the chunk cannot be deleted successfully.
> - pub fn remove(&self, digest: &[u8; 32]) -> Result<(), Error> {
> + pub fn remove(&self, digest: &[u8; 32], guard: &MutexGuard<'_, ()>) -> Result<(), Error> {
> self.cache.remove(*digest);
> - let (path, _digest_str) = self.store.chunk_path(digest);
> - std::fs::remove_file(path).map_err(Error::from)
> + self.store.remove_chunk(digest, guard)
and this is the only call site "forwarding" the removal from the cache
(called above by GC) to the underlying chunk store, but it should
probably not be `pub` either?
> }
>
> /// Access the locally cached chunk or fetch it from the S3 object store via the provided
> --
> 2.47.3
>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
>
>