From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 3/3] GC: S3: phase2: delete last partial batch of objects at the very end
Date: Fri, 21 Nov 2025 10:46:15 +0100
Message-ID: <1763717700.9j817e6qly.astroid@yuna.none>
In-Reply-To: <e5e652d2-02f2-4a0d-b349-5e4fcd7d76f8@proxmox.com>
On November 21, 2025 10:31 am, Christian Ebner wrote:
> While going through the rest of the series in detail now, one idea right
> away.
>
> On 11/21/25 10:06 AM, Fabian Grünbichler wrote:
>> instead of after processing every batch of 1000 listed objects. this
>> reduces the number of delete calls made to the backend, making regular garbage
>> collections that do not delete most objects cheaper, but means holding the
>> flocks for garbage chunks/objects longer.
>
> We could avoid holding the flock for too long (e.g. GC over several days
> because of super slow local datastore cache, S3 backend, ...) by setting
> (or resetting) a timer on each last delete list insert, and not only
> using the batch size to decide whether to perform the deleteObjects()
> call, but also checking whether a timeout has elapsed.
>
> This would safeguard us from locking some chunks way too long, causing
> potential issues with concurrent backups, but not throw out all the
> benefits this patch brings.
>
> What do you think? I could send that as followup if you like.

considerations like this were why I split this out as a separate patch ;)

the loop here basically does:
- one S3 call to (continue to) list objects in the bucket
  (potentially expensive), then for each object:
-- map the object back to a chunk (free)
-- do some local operations
   (stat, empty marker handling - these should not take too long?)
-- remove it from the cache if it is garbage
   (might take a bit if local storage is very slow?)

so yeah, we should probably cap the max. number of list calls before we
trigger the deletion, to avoid locking a garbage chunk found in the first
1000 objects until the end of GC if there is no further garbage to fill
up the batch of 100. either by number of iterations since the first
not-yet-processed delete insertion, or via timestamp?
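
a rough sketch of how such a cap could look, combining both variants
(iterations and timestamp). all names here (DeferredDeleteList,
list_call_done, the limits passed to new) are made up for illustration
and are not taken from the actual proxmox-backup code; the age is
measured from the first not-yet-processed insert, not reset on every
insert:

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not the real proxmox-backup type): collects object
/// keys scheduled for deletion and decides when the pending batch is due.
pub struct DeferredDeleteList {
    keys: Vec<String>,
    /// Time of the first insert into the currently pending batch.
    first_insert: Option<Instant>,
    /// List calls completed since that first insert, counting the call
    /// during which the insert happened.
    list_calls_since_first: usize,
    batch_limit: usize,
    max_list_calls: usize,
    max_age: Duration,
}

impl DeferredDeleteList {
    pub fn new(batch_limit: usize, max_list_calls: usize, max_age: Duration) -> Self {
        Self {
            keys: Vec::new(),
            first_insert: None,
            list_calls_since_first: 0,
            batch_limit,
            max_list_calls,
            max_age,
        }
    }

    /// Queue a garbage object; the caller is assumed to hold its lock.
    pub fn push(&mut self, key: String, now: Instant) {
        if self.keys.is_empty() {
            // First pending entry: start counting age and list calls.
            self.first_insert = Some(now);
            self.list_calls_since_first = 0;
        }
        self.keys.push(key);
    }

    /// Call once after each list call has been fully processed.
    pub fn list_call_done(&mut self) {
        if !self.keys.is_empty() {
            self.list_calls_since_first += 1;
        }
    }

    /// True if the batch is full, or the oldest pending entry has been
    /// waiting for too many list calls or for too long.
    pub fn should_flush(&self, now: Instant) -> bool {
        if self.keys.is_empty() {
            return false;
        }
        let too_old = self
            .first_insert
            .map_or(false, |t| now.saturating_duration_since(t) >= self.max_age);
        self.keys.len() >= self.batch_limit
            || self.list_calls_since_first >= self.max_list_calls
            || too_old
    }

    /// Hand out the pending keys for the delete call and reset the counters.
    pub fn take(&mut self) -> Vec<String> {
        self.first_insert = None;
        self.list_calls_since_first = 0;
        std::mem::take(&mut self.keys)
    }
}
```

the GC loop would then check should_flush() after each insert and after
each list call, issue the delete call for whatever take() returns, and
only then drop the corresponding locks. whatever is still pending after
the last list call gets flushed unconditionally at the very end, as in
this patch.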