From: Christian Ebner <c.ebner@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
"Proxmox Backup Server development discussion"
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 3/3] GC: S3: phase2: delete last partial batch of objects at the very end
Date: Fri, 21 Nov 2025 10:53:30 +0100
Message-ID: <1555dfe2-0507-4cc3-b2bc-861d1f3e699c@proxmox.com>
In-Reply-To: <1763717700.9j817e6qly.astroid@yuna.none>
On 11/21/25 10:45 AM, Fabian Grünbichler wrote:
> On November 21, 2025 10:31 am, Christian Ebner wrote:
>> While going through the rest of the series in detail now, one idea right
>> away.
>>
>> On 11/21/25 10:06 AM, Fabian Grünbichler wrote:
>>> instead of after processing every batch of 1000 listed objects. this
>>> reduces the number of delete calls made to the backend, making regular garbage
>>> collections that do not delete most objects cheaper, but means holding the
>>> flocks for garbage chunks/objects longer.
>>
>> We could avoid holding the flock for too long (e.g. GC over several days
>> because of super slow local datastore cache, S3 backend, ...) by setting
>> (or resetting) a timer on each last delete list insert, and not only
>> using the batch size to decide whether to perform the deleteObjects() call,
>> but also checking whether a timeout has elapsed.
>>
>> This would safeguard us from locking some chunks way too long, causing
>> potential issues with concurrent backups, but not throw out all the
>> benefits this patch brings.
>>
>> What do you think? I could send that as followup if you like.
>
> considerations like this were why I split this out as separate patch ;)
>
> the loop here basically does:
> - one S3 call to (continue to) list objects in the bucket
> (potentially expensive), then for each object:
> -- maps each object back to a chunk (free)
> -- does some local operations
> (stat, empty marker handling - these should not take too long?)
> -- remove from cache if garbage
> (might take a bit if local storage is very slow?)
>
> so yeah, we should probably cap the max. number of list calls before we
> trigger the deletion, to avoid locking a garbage chunk in the first 1000
> objects until the end of GC, if there is no further garbage to fill up
> the batch of 100.. either by number of iterations since the first
> not-yet-processed delete insertion, or via timestamp?

Maybe best to reuse a fraction of the already defined
CHUNK_LOCK_TIMEOUT, so that even if there are a lot of list objects
iterations (many chunks), we don't lose out on the delete list collection?
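
A rough sketch of what I have in mind, just to illustrate the idea. The
`DeleteList` type, its methods, the divisor of 4 and both constant values
below are assumptions made up for this example, not the actual code or
values in proxmox-backup. The list is flushed either when the batch is full
or when the oldest not-yet-deleted entry has been pending for longer than a
fraction of the lock timeout:

```rust
use std::time::{Duration, Instant};

// Placeholder values for illustration only, not taken from the code base.
const CHUNK_LOCK_TIMEOUT: Duration = Duration::from_secs(3 * 60 * 60);
const DELETE_BATCH_LIMIT: usize = 100;

/// Hypothetical helper collecting object keys pending deletion.
struct DeleteList {
    keys: Vec<String>,
    /// Time of the oldest insert that has not been flushed yet.
    first_pending: Option<Instant>,
    /// Upper bound on how long an entry may stay pending (and locked).
    max_age: Duration,
}

impl DeleteList {
    fn new() -> Self {
        Self {
            keys: Vec::new(),
            first_pending: None,
            // Assumed fraction of the lock timeout, to be tuned.
            max_age: CHUNK_LOCK_TIMEOUT / 4,
        }
    }

    /// Add a key and remember when the first pending entry was added.
    fn push(&mut self, key: String, now: Instant) {
        if self.first_pending.is_none() {
            self.first_pending = Some(now);
        }
        self.keys.push(key);
    }

    /// True if the batch is full or the oldest pending entry is too old.
    fn should_flush(&self, now: Instant) -> bool {
        if self.keys.len() >= DELETE_BATCH_LIMIT {
            return true;
        }
        match self.first_pending {
            Some(first) => now.duration_since(first) >= self.max_age,
            None => false,
        }
    }

    /// Hand out the pending keys for deletion and reset the timer.
    fn take(&mut self) -> Vec<String> {
        self.first_pending = None;
        std::mem::take(&mut self.keys)
    }
}
```

The list-objects loop would then call `should_flush()` after each
iteration and perform the batched delete with the result of `take()`,
releasing the corresponding chunk locks afterwards. Passing `now`
explicitly keeps the helper easy to test without sleeping.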