From: Christian Ebner <c.ebner@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
"Proxmox Backup Server development discussion"
<pbs-devel@lists.proxmox.com>,
"Thomas Lamprecht" <t.lamprecht@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] GC: s3: fix local marker cleanup for unreferenced, s3 only chunks
Date: Mon, 24 Nov 2025 09:34:46 +0100 [thread overview]
Message-ID: <4ccbdde9-8e2e-4880-a245-4d15480f2d15@proxmox.com> (raw)
In-Reply-To: <1763972888.fq1bwvzgnc.astroid@yuna.none>
On 11/24/25 9:29 AM, Fabian Grünbichler wrote:
> On November 24, 2025 9:22 am, Christian Ebner wrote:
>> On 11/24/25 9:13 AM, Fabian Grünbichler wrote:
>>> On November 24, 2025 8:35 am, Christian Ebner wrote:
>>>> Thanks for taking a look at this already!
>>>>
>>>> On 11/22/25 3:55 PM, Thomas Lamprecht wrote:
>>>>> Am 22.11.25 um 11:41 schrieb Christian Ebner:
>>>>>> If a chunk object is located on the s3 object store only, not being
>>>>>> referenced by any index file and having no local marker file it is
>>>>>> marked for cleanup by pretending an atime equal to the unix epoch.
>>>>>>
>>>>>> While this will mark the chunk for deletion from the backend and
>>>>>> include it in the delete list for the next s3 delete objects call, it
>>>>>> also will lead to the chunk marker and LRU cache entry being tried to
>>>>>> clean up locally, which however fails since there is no marker to be
>>>>>> cleaned up.
>>>>>>
>>>>>> In order to treat this edge case with the same cleanup logic, simply
>>>>>> insert the marker file if not present, for it to get correctly
>>>>>> cleaned up as expected afterwards. This should not happen under
>>>>>> normal datastore operation anyways, most likely to appear after
>>>>>> re-creation of the datastore from existing bucket contents containing
>>>>>> such unreferenced chunks.
>>>>>>
>>>>>> Fixes: https://forum.proxmox.com/threads/176567/
>>>>>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
>>>>>> ---
>>>>>> pbs-datastore/src/datastore.rs | 9 +++++----
>>>>>> 1 file changed, 5 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>>>>>> index 65299cca9..a24392d9f 100644
>>>>>> --- a/pbs-datastore/src/datastore.rs
>>>>>> +++ b/pbs-datastore/src/datastore.rs
>>>>>> @@ -1711,11 +1711,12 @@ impl DataStore {
>>>>>> let atime = match std::fs::metadata(&chunk_path) {
>>>>>> Ok(stat) => stat.accessed()?,
>>>>>> Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
>>>>>> + unsafe {
>>>>>> + // chunk store lock held
>>>>>> + // insert marke unconditionally, cleaned up again below if required
>>>>>> + self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
>>>>>> + }
>>>>>> if self.inner.chunk_store.clear_chunk_expected_mark(&digest)? {
>>>>>> - unsafe {
>>>>>> - // chunk store lock held
>>>>>> - self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
>>>>>> - }
>>>>>> SystemTime::now()
>>>>>
>>>>> Why not drop that whole branch instead, it does not really makes sense IIUC.
>>>>
>>>> No, this branch is needed. This is required for avoiding API calls to
>>>> the s3 backend in case the chunk is referenced by an index file as
>>>> detected during phase 1, but the local marker file is not present. In
>>>> that case we do not want to directly check the existence on the backend
>>>> (which we need to make sure to not mark a chunk which is however not
>>>> present on the backend), but defer that check to phase 2, where we do
>>>> the listing of all chunks anyways. This is done by flagging that chunk
>>>> via the <digest>.using file.
>>>>
>>>> Here, if the chunk is encountered during s3 object store listing, but
>>>> the local file is missing, we check and clear the chunk expected marker,
>>>> which if present tells us the chunk still needs to be used. If not it is
>>>> safe to clear it from the backend.
>>>>
>>>>>
>>>>> And `replace_chunk_with_marker` replaces the chunk file directly (no extension) whereas
>>>>> `clear_chunk_expected_mark` checks the chunk.using file, so does your reordering even
>>>>> change anything, or is there a bug in `replace_chunk_with_marker`?
>>>>
>>>> `replace_chunk_with_marker` replaces a full chunk file with an empty
>>>> marker, but also creates the empty marker if the original file is not
>>>> present, so in this particular case it is actually used to create the
>>>> marker, not to evict chunks from local datastore cache as under normal
>>>> operation. I can send a patch to rename that method to make that clear.
>>>>
>>>>>
>>>>> And independent of that, would it be better (more performant and less confusing) if
>>>>> we ignore the "not present in LRU or no marker" in that edge case rather than creating
>>>>> a file (doing more IO) just to delete that then again?
>>>>
>>>> I can do that as well of course by checking a flag in the remove
>>>> callback. I opted for not doing that however since above is a very
>>>> unlikely case to happen, as the s3 backend and local datastore cache
>>>> should be in sync most of the time.
>>>> Adding that check would be performed for each chunk being removed, this
>>>> only once if the chunk is still present on the backend, but not on the
>>>> local datastore cache.
>>>>
>>>> The additional IO is therefore justfied IMO.
>>>>
>>>> I could of course also go the route of just setting a boolean flag and
>>>> checking that in the callback?
>>>
>>> we basically already have such a boolean marker - we set `atime` to 0 in
>>> this case (and only this case), and we could just ignore the removal
>>> errors then? possibly limited to just ignoring ENOTFOUND?
>>
>> That's a good idea! So I will perform the additional checks based on that.
>
> alternatively, skipping cond_sweep_chunk entirely would also work (this
> is with `-w`):
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 65299cca9..b9debd2b1 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -1709,21 +1709,22 @@ impl DataStore {
> // Check local markers (created or atime updated during phase1) and
> // keep or delete chunk based on that.
> let atime = match std::fs::metadata(&chunk_path) {
> - Ok(stat) => stat.accessed()?,
> + Ok(stat) => Some(stat.accessed()?),
> Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
> if self.inner.chunk_store.clear_chunk_expected_mark(&digest)? {
> unsafe {
> // chunk store lock held
> self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
> }
> - SystemTime::now()
> + Some(SystemTime::now())
> } else {
> - // File not found, delete by setting atime to unix epoch
> - SystemTime::UNIX_EPOCH
> + // File not found, only delete from S3
> + None
> }
> }
> Err(err) => return Err(err.into()),
> };
> + if let Some(atime) = atime {
> let atime = atime.duration_since(SystemTime::UNIX_EPOCH)?.as_secs() as i64;
>
> unsafe {
> @@ -1752,6 +1753,13 @@ impl DataStore {
> },
> )?;
> }
> + } else {
> + // set age based on first insertion
> + if delete_list.is_empty() {
> + delete_list_age = epoch_i64();
> + }
> + delete_list.push((content.key, _chunk_guard));
> + }
>
> chunk_count += 1;
>
>
Nice, even better! By this we can also get rid of the hacky setting
atime to unix epoch. Can I send this with:
Originally-by Fabian Grünbichler <f.gruenbichler@proxmox.com>?
Together with some followups to docstring and method renaming?
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
next prev parent reply other threads:[~2025-11-24 8:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-22 10:41 Christian Ebner
2025-11-22 14:56 ` Thomas Lamprecht
2025-11-24 7:35 ` Christian Ebner
2025-11-24 8:13 ` Fabian Grünbichler
2025-11-24 8:22 ` Christian Ebner
2025-11-24 8:29 ` Fabian Grünbichler
2025-11-24 8:34 ` Christian Ebner [this message]
2025-11-24 9:00 ` Thomas Lamprecht
2025-11-24 9:41 ` Christian Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ccbdde9-8e2e-4880-a245-4d15480f2d15@proxmox.com \
--to=c.ebner@proxmox.com \
--cc=f.gruenbichler@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
--cc=t.lamprecht@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox