From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 24 Nov 2025 09:13:49 +0100
From: Fabian Grünbichler
To: Proxmox Backup Server development discussion, Thomas Lamprecht
References: <20251122104118.205994-1-c.ebner@proxmox.com> <7393d75a-04be-4940-80c9-1f8e9781227a@proxmox.com>
In-Reply-To: <7393d75a-04be-4940-80c9-1f8e9781227a@proxmox.com>
Message-Id: <1763971841.tk1259krov.astroid@yuna.none>
User-Agent: astroid/0.17.0 (https://github.com/astroidmail/astroid)
Subject: Re: [pbs-devel] [PATCH proxmox-backup] GC: s3: fix local marker cleanup for unreferenced, s3 only chunks
List-Id: Proxmox Backup Server development discussion
Sender: "pbs-devel" <pbs-devel-bounces@lists.proxmox.com>

On November 24, 2025 8:35 am, Christian Ebner wrote:
> Thanks for taking a look at this already!
> 
> On 11/22/25 3:55 PM, Thomas Lamprecht wrote:
>> On 22.11.25 at 11:41, Christian Ebner wrote:
>>> If a chunk object is located only on the s3 object store, i.e. it is
>>> neither referenced by any index file nor has a local marker file, it
>>> is marked for cleanup by pretending an atime equal to the unix epoch.
>>>
>>> While this marks the chunk for deletion from the backend and includes
>>> it in the delete list for the next s3 delete-objects call, it also
>>> leads to an attempt to clean up the chunk marker and LRU cache entry
>>> locally, which fails since there is no marker to clean up.
>>>
>>> In order to handle this edge case with the same cleanup logic, simply
>>> insert the marker file if not present, so that it gets cleaned up
>>> correctly afterwards. This should not happen under normal datastore
>>> operation anyway; it is most likely to appear after re-creation of
>>> the datastore from existing bucket contents containing such
>>> unreferenced chunks.
>>>
>>> Fixes: https://forum.proxmox.com/threads/176567/
>>> Signed-off-by: Christian Ebner
>>> ---
>>>  pbs-datastore/src/datastore.rs | 9 +++++----
>>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>>> index 65299cca9..a24392d9f 100644
>>> --- a/pbs-datastore/src/datastore.rs
>>> +++ b/pbs-datastore/src/datastore.rs
>>> @@ -1711,11 +1711,12 @@ impl DataStore {
>>>          let atime = match std::fs::metadata(&chunk_path) {
>>>              Ok(stat) => stat.accessed()?,
>>>              Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
>>> +                unsafe {
>>> +                    // chunk store lock held
>>> +                    // insert marker unconditionally, cleaned up again below if required
>>> +                    self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
>>> +                }
>>>                  if self.inner.chunk_store.clear_chunk_expected_mark(&digest)? {
>>> -                    unsafe {
>>> -                        // chunk store lock held
>>> -                        self.inner.chunk_store.replace_chunk_with_marker(&digest)?;
>>> -                    }
>>>                      SystemTime::now()
>>
>> Why not drop that whole branch instead, it does not really make sense IIUC.
>
> No, this branch is needed. It is required to avoid API calls to the s3
> backend in case the chunk is referenced by an index file as detected
> during phase 1, but the local marker file is not present. In that case
> we do not want to check existence on the backend directly (which we
> need in order to not mark a chunk that is in fact not present on the
> backend), but defer that check to phase 2, where we list all chunks
> anyway. This is done by flagging that chunk via the .using file.
>
> Here, if the chunk is encountered during the s3 object store listing
> but the local file is missing, we check and clear the chunk-expected
> marker, which, if present, tells us the chunk is still in use. If not,
> it is safe to clear it from the backend.
>
>>
>> And `replace_chunk_with_marker` replaces the chunk file directly (no extension) whereas
>> `clear_chunk_expected_mark` checks the chunk.using file, so does your reordering even
>> change anything, or is there a bug in `replace_chunk_with_marker`?
>
> `replace_chunk_with_marker` replaces a full chunk file with an empty
> marker, but it also creates the empty marker if the original file is
> not present. So in this particular case it is actually used to create
> the marker, not to evict chunks from the local datastore cache as
> under normal operation. I can send a patch to rename that method to
> make that clear.
>
>>
>> And independent of that, would it be better (more performant and less confusing) if
>> we ignore the "not present in LRU or no marker" in that edge case rather than creating
>> a file (doing more IO) just to delete it again?
>
> I can do that as well of course by checking a flag in the remove
> callback.
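[For context: the `replace_chunk_with_marker` behavior described above — truncate an existing chunk file to an empty marker, or create the empty marker when no chunk file exists — can be sketched with plain standard-library calls. The function name and path handling here are illustrative stand-ins, not the actual PBS implementation:]

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

// Hypothetical stand-in for ChunkStore::replace_chunk_with_marker.
// `create(true)` plus `truncate(true)` covers both cases in one open:
// an existing full chunk is emptied to a zero-byte marker, and a
// missing chunk file gets a fresh zero-byte marker created for it.
fn replace_chunk_with_marker(chunk_path: &Path) -> io::Result<()> {
    OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(chunk_path)?;
    Ok(())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("gc-marker-demo");
    std::fs::create_dir_all(&dir)?;
    let chunk = dir.join("0000-chunk");

    // The edge case from the patch: no local chunk file is present.
    let _ = std::fs::remove_file(&chunk);
    replace_chunk_with_marker(&chunk)?;
    assert!(chunk.exists());
    assert_eq!(std::fs::metadata(&chunk)?.len(), 0);
    Ok(())
}
```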
> I opted for not doing that, however, since the above is a very
> unlikely case, as the s3 backend and local datastore cache should be
> in sync most of the time. That check would be performed for each chunk
> being removed; this one runs only once, when the chunk is still
> present on the backend but not in the local datastore cache.
>
> The additional IO is therefore justified IMO.
>
> I could of course also go the route of just setting a boolean flag and
> checking that in the callback?

we basically already have such a boolean marker - we set `atime` to 0
in this case (and only this case), and we could just ignore the
removal errors then? possibly limited to just ignoring ENOENT?

> 
> What do you think?
>
>>>                  } else {
>>>                      // File not found, delete by setting atime to unix epoch
>>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
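[For context: the alternative floated above — treat the epoch atime as the boolean flag and simply tolerate a missing marker file during local cleanup — might look roughly like the sketch below. All names are hypothetical; this is not the PBS code, just an illustration of swallowing only ENOENT (`ErrorKind::NotFound`) for epoch-flagged chunks:]

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

// Hypothetical removal callback: a chunk flagged with the epoch-atime
// sentinel may legitimately have no local marker file, so a NotFound
// error is swallowed for it. Any other error, or a NotFound for a
// chunk without the sentinel, still propagates to the caller.
fn remove_chunk_marker(chunk_path: &Path, atime: SystemTime) -> std::io::Result<()> {
    match fs::remove_file(chunk_path) {
        Ok(()) => Ok(()),
        Err(err) if err.kind() == ErrorKind::NotFound && atime == UNIX_EPOCH => Ok(()),
        Err(err) => Err(err),
    }
}

fn main() {
    let path = std::env::temp_dir().join("gc-enoent-demo-chunk");
    let _ = fs::remove_file(&path);

    // Missing file flagged via the epoch sentinel: error is swallowed.
    assert!(remove_chunk_marker(&path, UNIX_EPOCH).is_ok());
    // Missing file without the sentinel: NotFound still propagates.
    assert!(remove_chunk_marker(&path, SystemTime::now()).is_err());
}
```

This keeps the extra branch off the common removal path: no flag check or marker creation happens for chunks that exist locally, and the tolerance is scoped to exactly the one case the sentinel encodes.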