From: Christian Ebner <c.ebner@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 18/23] GC: lock chunk marker before cleanup in phase 3 on s3 backends
Date: Thu, 6 Nov 2025 12:49:08 +0100
Message-ID: <8e2f4f4d-08e1-406e-a1f8-b2cd55bd8554@proxmox.com>
In-Reply-To: <1762418497.lnn72w5td0.astroid@yuna.none>

On 11/6/25 10:37 AM, Fabian Grünbichler wrote:
> On November 5, 2025 1:22 pm, Christian Ebner wrote:
>> To make sure there is no race between atime check and deletion with
>> possible re-insertion.
> 
> could you expand on this?

I might have mixed up phase 2 (which in the newest version no longer
holds the mutex lock) with phase 3 (which still does).

> 
> sweep_unused_chunks holds the chunk store mutex for the full processing
> of each chunk it iterates over..
> 
> so we have for each chunk:
> 
> - obtain mutex
> - fstat
> - call cond_sweep_chunk
> - lock chunk
> - fstat again
> - drop mutex
> - remove from cache (which re-obtains mutex)
> 
> shouldn't we protect this race by obtaining the mutex when conditionally
> touching the chunk in `insert_chunk_cached()`, or even in
> `datastore.cond_touch_chunk()`? then there is no way for a chunk
> insertion to update the atime here.. AFAICT the latter would also close
> a race with pull syncing, which touches chunks while they are referenced
> by tmp indices, which means those touches are not protected by the mutex
> or the rest of the GC logic?

Right, that case is not protected... this warrants a really close look
though, as this will have a significant performance impact if not done
right.
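Roughly what I would look into, as an untested sketch only (the
`mutex()` accessor and the exact helper signatures are placeholders
here, not necessarily the current code):

    // sketch: guard the conditional atime update with the chunk store
    // mutex, so it cannot interleave with the fstat + unlink sequence
    // that sweep_unused_chunks() performs under the same mutex
    pub fn cond_touch_chunk(&self, digest: &[u8; 32], assert_exists: bool) -> Result<bool, Error> {
        let (chunk_path, _digest_str) = self.chunk_path(digest);
        let _guard = self.mutex().lock().unwrap();
        self.cond_touch_path(&chunk_path, assert_exists)
    }

The open question is what the additional lock contention costs on the
hot insert and sync paths, so this needs benchmarking first.
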
>> By only acquiring the file lock if the chunk marker would be removed
>> and double stat()ing, the file locking penalty is avoided for the other
>> cases.
>>
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> changes since version 2:
>> - no changes
>>
>>   pbs-datastore/src/chunk_store.rs | 28 +++++++++++++++++++++++++++-
>>   pbs-datastore/src/datastore.rs   |  2 ++
>>   2 files changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index 49687b2fa..08519fe2b 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -5,6 +5,7 @@ use std::sync::{Arc, Mutex};
>>   use std::time::Duration;
>>   
>>   use anyhow::{bail, format_err, Context, Error};
>> +use hex::FromHex;
>>   use tracing::{info, warn};
>>   
>>   use pbs_api_types::{DatastoreFSyncLevel, GarbageCollectionStatus};
>> @@ -22,7 +23,7 @@ use crate::data_blob::DataChunkBuilder;
>>   use crate::file_formats::{
>>       COMPRESSED_BLOB_MAGIC_1_0, ENCRYPTED_BLOB_MAGIC_1_0, UNCOMPRESSED_BLOB_MAGIC_1_0,
>>   };
>> -use crate::DataBlob;
>> +use crate::{DataBlob, LocalDatastoreLruCache};
>>   
>>   /// File system based chunk store
>>   pub struct ChunkStore {
>> @@ -366,6 +367,7 @@ impl ChunkStore {
>>           min_atime: i64,
>>           status: &mut GarbageCollectionStatus,
>>           worker: &dyn WorkerTaskContext,
>> +        cache: Option<&LocalDatastoreLruCache>,
>>       ) -> Result<(), Error> {
>>           // unwrap: only `None` in unit tests
>>           assert!(self.locker.is_some());
>> @@ -419,6 +421,30 @@ impl ChunkStore {
>>                           bad,
>>                           status,
>>                           || {
> 
> basically we have two callbacks here:
> - remove chunk/chunk marker (called for all bad chunks or non S3 case)
> - remove chunk via cache (called for non-bad S3 chunks)
> 
>> +                            if let Some(cache) = cache {
>> +                                // never lock bad chunks
>> +                                if filename.to_bytes().len() == 64 {
> 
> this is already passed in as bool ;)

Right, can be checked by that directly.

> 
>> +                                    let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
>> +                                    match self.lock_chunk(&digest, Duration::from_secs(0)) {
> 
> so this is basically a "skip chunk if currently being uploaded" check,
> right? might warrant a comment, else this reads like a deadlock waiting
> to happen..

Yes, will add a comment here. You are right, this *must* never be blocking.
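Something along these lines (wording not final):

    // Never block here: failing the try-lock means a concurrent upload
    // or (re-)insert currently holds the chunk file lock, so skip the
    // cleanup for this chunk and leave it to the next GC run.
    match self.lock_chunk(&digest, Duration::from_secs(0)) {
        // lock acquired: safe to re-stat and remove the marker
        Ok(_guard) => { /* ... as in the patch below ... */ }
        // lock busy: chunk is being re-inserted right now, skip it
        Err(_) => return Ok(()),
    }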

> in phase2 we obtain the chunk lock for all chunks on the S3 side outside
> of the remove callback, here we do it inside - is there a reason for that?

That will actually be adapted to only acquire the per-chunk file lock,
but the ordering of the patches no longer reflects this; I might
reorder this one to make that clearer.

> chunks ending up in this state here should be rather rare (compared to
> chunks existing on S3 being garbage collected), right?

Yes, this code path *should* never be taken, unless the chunk vanished 
from the s3 object store because of some other interaction. Basically it 
just cleans up the dangling chunk marker files.

> but even if we keep this, the whole callback could become (AFAICT):
> 
> 
>                          || {
>                              // non-bad S3 chunks need to be removed via cache
>                              if let Some(cache) = cache {
>                                  if !bad {
>                                      let digest = <[u8; 32]>::from_hex(filename.to_bytes())?;
> 
>                                      // unless there is a concurrent upload pending
>                                      if let Ok(_guard) =
>                                          self.lock_chunk(&digest, Duration::from_secs(0))
>                                      {
>                                          drop(lock);
>                                          cache.remove(&digest)?;
>                                      }
> 
>                                      return Ok(());
>                                  }
>                              }
> 
>                              // bad or local chunks
>                              unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
>                                  |err| {
>                                      format_err!(
>                                          "unlinking chunk {filename:?} failed on store '{}' - {err}",
>                                          self.name,
>                                      )
>                                  },
>                              )
>                          },
> 
>> +                                        Ok(_guard) => {
>> +                                            // don't remove if changed since locking
>> +                                            match fstatat(
>> +                                                Some(dirfd),
>> +                                                filename,
>> +                                                nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
>> +                                            ) {
>> +                                                Ok(stat) if stat.st_atime < min_atime => {
>> +                                                    let _ = cache.remove(&digest);
>> +                                                    return Ok(());
>> +                                                }
>> +                                                _ => return Ok(()),
>> +                                            }
>> +                                        }
>> +                                        Err(_) => return Ok(()),
>> +                                    }
>> +                                }
>> +                            }
>> +
>>                               unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(
>>                                   |err| {
>>                                       format_err!(
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 8b7333d80..df344974a 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -1761,6 +1761,7 @@ impl DataStore {
>>                   min_atime,
>>                   &mut tmp_gc_status,
>>                   worker,
>> +                self.cache(),
>>               )?;
>>           } else {
>>               self.inner.chunk_store.sweep_unused_chunks(
>> @@ -1768,6 +1769,7 @@ impl DataStore {
>>                   min_atime,
>>                   &mut gc_status,
>>                   worker,
>> +                None,
>>               )?;
>>           }
>>   
>> -- 
>> 2.47.3
>>

_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


Thread overview: 31+ messages
2025-11-05 12:22 [pbs-devel] [PATCH proxmox-backup v3 00/23] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 01/23] sync: pull: instantiate backend only once per sync job Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 02/23] api/datastore: move group notes setting to the datastore Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 03/23] api/datastore: move snapshot deletion into dedicated datastore helper Christian Ebner
2025-11-06  9:37   ` Fabian Grünbichler
2025-11-06 10:19     ` Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 04/23] api/datastore: move backup log upload by implementing " Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 05/23] api: backup: use datastore add_blob helper for backup session Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 06/23] api/datastore: add dedicated datastore helper to set snapshot notes Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 07/23] api/datastore: move s3 index upload helper to datastore backend Christian Ebner
2025-11-06  9:37   ` Fabian Grünbichler
2025-11-06 10:24     ` Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 08/23] datastore: refactor chunk insert based on backend Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 09/23] verify: rename corrupted to corrupt in log output and function names Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 10/23] verify/datastore: make rename corrupt chunk a datastore helper method Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 11/23] datastore: refactor rename_corrupt_chunk error handling Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 12/23] chunk store: implement per-chunk file locking helper for s3 backend Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 13/23] datastore: acquire chunk store mutex lock when renaming corrupt chunk Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 14/23] datastore: get per-chunk file lock for chunk rename on s3 backend Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 15/23] fix #6961: datastore: verify: evict corrupt chunks from in-memory LRU cache Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 16/23] datastore: add locking to protect against races on chunk insert for s3 Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 17/23] GC: fix race with chunk upload/insert on s3 backends Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 18/23] GC: lock chunk marker before cleanup in phase 3 " Christian Ebner
2025-11-06  9:37   ` Fabian Grünbichler
2025-11-06 11:49     ` Christian Ebner [this message]
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 19/23] datastore: GC: drop overly verbose info message during s3 chunk sweep Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 20/23] chunk store: reduce exposure of clear_chunk() to crate only Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 21/23] chunk store: make chunk removal a helper method of the chunk store Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 22/23] GC: fix deadlock for cache eviction and garbage collection Christian Ebner
2025-11-05 12:22 ` [pbs-devel] [PATCH proxmox-backup v3 23/23] chunk store: never fail when trying to remove missing chunk file Christian Ebner
2025-11-06 14:01 ` [pbs-devel] partially applied: [PATCH proxmox-backup v3 00/23] fix chunk upload/insert, rename corrupt chunks and GC race conditions for s3 backend Fabian Grünbichler
