From: Christian Ebner <c.ebner@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup v2] GC: chunk store: fix chunk using markers cleanup
Date: Tue, 25 Nov 2025 15:27:14 +0100
Message-ID: <2a0d9f19-1a45-4552-8bf4-56d21c37231b@proxmox.com>
In-Reply-To: <1764080186.d9oxoqi5dh.astroid@yuna.none>

On 11/25/25 3:19 PM, Fabian Grünbichler wrote:
> On November 25, 2025 3:00 pm, Christian Ebner wrote:
>> Since commit 9510ef1a ("GC: assure chunk exists on s3 store when
>> creating missing chunk marker") chunks which are referenced by
>> an index file but do not have a local marker file are marked by a
>> file with the `using` extension, so they are not cleaned up during
>> phase 2 if the chunk is still present on the backend.
>>
>> If the chunk is however not encountered, phase 3 will see the marker
>> and tries to clean it up, which currently however fails because
>> it is first tried to be cleaned up from the LRU cache, the filename
>> being converted to the chunk digest.
>>
>> Therefore, clean up any using marker file encountered during phase 3
>> before any regular or bad chunk, independent from the atime.
>>
>> Fixes: https://forum.proxmox.com/threads/176567/post-819437
>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>> ---
>> Changes since version 1 (thanks a lot for offlist discussion Thomas):
>> - Cleanup using marker chunks independent from atime cutoff
>>
>>   pbs-datastore/src/chunk_store.rs | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index f53460664..7fe09b914 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -25,6 +25,8 @@ use crate::file_formats::{
>>   };
>>   use crate::{DataBlob, LocalDatastoreLruCache};
>>   
>> +const USING_MARKER_FILENAME_EXT: &str = "using";
>> +
>>   /// File system based chunk store
>>   pub struct ChunkStore {
>>       name: String, // used for error reporting
>> @@ -426,6 +428,16 @@ impl ChunkStore {
>>                       drop(lock);
>>                       continue;
>>                   }
>> +                if filename
>> +                    .to_bytes()
>> +                    .ends_with(USING_MARKER_FILENAME_EXT.as_bytes())
>> +                {
>> +                    unlinkat(Some(dirfd), filename, UnlinkatFlags::NoRemoveDir).map_err(|err| {
>> +                        format_err!("unlinking chunk using marker {filename:?} failed - {err}")
>> +                    })?;
>> +                    drop(lock);
>> +                    continue;
>> +                }
> 
> this looks okay as a stop-gap, but isn't the actual problem that
> 
> .using
> 
> and
> 
> .0.bad
> 
> have the same length, so we end up taking a codepath using a weird "bad
> but not bad" filename instead of skipping those markers in phase3?

But we need to clean them up at some point, otherwise the following
might happen:
- chunk is in use by an index file, phase 1 sets the using marker
- chunk is not present on the s3 object store (bad chunk), so it is
  not seen in phase 2 and the using marker is not replaced by a
  regular marker file
- chunk is uploaded
- both index files are pruned
- chunk is never cleaned up because the using marker file persists.

> in get_chunk_iterator, we skip all files that are not 64 bytes or
> 64+len(.0.bad) bytes long, but then set the "bad" flag based on the
> extension..

The iterator could report whether an entry is a using marker through
an enum variant instead of the boolean bad flag, so that the cases can
be clearly distinguished.
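
For illustration only, a minimal sketch of such a classification. The
names `ChunkFileKind` and `classify` are hypothetical and not taken from
chunk_store.rs, and the assumption that bad chunks use a single-digit
counter (".0.bad" to ".9.bad") is derived only from the length check
mentioned above. It also shows why a pure length check cannot tell
".using" and ".0.bad" apart:

```rust
/// Hypothetical classification of a chunk store directory entry,
/// replacing a plain `bad: bool` flag.
#[derive(Debug, PartialEq, Eq)]
enum ChunkFileKind {
    /// Regular chunk: 64 hex characters, no extension.
    Chunk,
    /// Chunk renamed to `<digest>.<n>.bad`.
    Bad,
    /// Marker file `<digest>.using`.
    UsingMarker,
}

/// Length of a SHA-256 digest in hex notation.
const DIGEST_HEX_LEN: usize = 64;
/// Extension of using markers, including the leading dot.
const USING_EXT: &[u8] = b".using";

/// Classify a filename by its extension instead of only by its length.
/// Returns `None` for entries that should be skipped.
fn classify(filename: &[u8]) -> Option<ChunkFileKind> {
    if filename.len() < DIGEST_HEX_LEN {
        return None;
    }
    let (stem, ext) = filename.split_at(DIGEST_HEX_LEN);
    if !stem.iter().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    if ext.is_empty() {
        Some(ChunkFileKind::Chunk)
    } else if ext == USING_EXT {
        Some(ChunkFileKind::UsingMarker)
    } else if ext.len() == 6
        && ext[0] == b'.'
        && ext[1].is_ascii_digit()
        && ext.ends_with(b".bad")
    {
        // Assumption: bad chunk extensions look like ".0.bad" .. ".9.bad".
        Some(ChunkFileKind::Bad)
    } else {
        None
    }
}
```

With something along these lines, phase 3 could match on the variant and
handle using markers explicitly instead of relying on the filename
length.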

> 
> and then in cond_sweep_chunk in sweep_unused_chunk, we convert non-bad
> chunk filenames to digests which then fails for the "using" filenames,
> because they are too long (per the error from the forum thread).
> 
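
The failure described above can be reproduced with a small standalone
sketch. `digest_from_filename` and `hex_val` are hypothetical helpers
written for this example, not the conversion actually used in
cond_sweep_chunk; the only assumption is that a chunk filename must be
exactly 64 hex characters to map to a 32 byte digest:

```rust
/// Hypothetical helper: decode a single hex character.
fn hex_val(b: u8) -> Result<u8, String> {
    match b {
        b'0'..=b'9' => Ok(b - b'0'),
        b'a'..=b'f' => Ok(b - b'a' + 10),
        b'A'..=b'F' => Ok(b - b'A' + 10),
        _ => Err(format!("invalid hex byte {b:#04x}")),
    }
}

/// Hypothetical helper: convert a chunk filename into its digest.
/// Fails for anything that is not exactly 64 hex characters, which
/// includes `<digest>.using` (70 bytes).
fn digest_from_filename(filename: &[u8]) -> Result<[u8; 32], String> {
    if filename.len() != 64 {
        return Err(format!(
            "unexpected filename length {} (expected 64)",
            filename.len()
        ));
    }
    let mut digest = [0u8; 32];
    for (i, pair) in filename.chunks_exact(2).enumerate() {
        digest[i] = (hex_val(pair[0])? << 4) | hex_val(pair[1])?;
    }
    Ok(digest)
}
```

A plain 64 character name decodes fine, while the same name with
".using" appended is rejected because of its length.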
>>   
>>                   chunk_count += 1;
>>   
>> @@ -776,7 +788,7 @@ impl ChunkStore {
>>       /// Helper to generate marker file path for expected chunks
>>       fn chunk_expected_marker_path(&self, digest: &[u8; 32]) -> PathBuf {
>>           let (mut path, _digest_str) = self.chunk_path(digest);
>> -        path.set_extension("using");
>> +        path.set_extension(USING_MARKER_FILENAME_EXT);
>>           path
>>       }
>>   
>> -- 
>> 2.47.3
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
Thread overview: 6+ messages
2025-11-25 14:00 Christian Ebner
2025-11-25 14:18 ` Fabian Grünbichler
2025-11-25 14:23   ` Thomas Lamprecht
2025-11-25 14:27   ` Christian Ebner [this message]
2025-11-26  8:23     ` Fabian Grünbichler
2025-11-25 15:02 ` [pbs-devel] applied: " Thomas Lamprecht
