From: Roland <devzero@web.de>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>,
Christian Ebner <c.ebner@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 0/5] GC: avoid multiple atime updates
Date: Fri, 21 Feb 2025 16:35:46 +0100
Message-ID: <dd4f458f-e64b-4061-a94f-4cd565f1f385@web.de>
In-Reply-To: <20250221140110.377328-1-c.ebner@proxmox.com>
hello,
this looks like it relates to, or addresses, what's being mentioned here:
https://forum.proxmox.com/threads/better-understand-proxmox-garbage-collection-lots-of-avoidable-utimensat-calls.101417/
it came to my mind when reading this. not sure if i did an RFE for
this, guess i forgot...
> for my curiosity, i see there is one chunk in both of my homeserver
> repos which is accessed orders of magnitude more often than all
> the other chunks (for one repo roundabout 1000x more often) during gc
> run. anyhow, many chunks are being accessed repeatedly, but at a much
> lower rate.
ah, now i understand why that file is touched 34742 times - it's that
"all zeroes" chunk, which does occur in the vm image much more often.
34742 proxmox-backup-(1428): R /backup/pve-t620_backup/.chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8
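A quick way to check the "all zeroes" guess, as a minimal sketch: it assumes the chunk is an unencrypted fixed-size chunk of 4 MiB and that the chunk file name is the SHA-256 digest of the uncompressed chunk data. The helper name `zero_chunk_digest` and the 4 MiB size are assumptions made for illustration, they are not taken from the patch series.

```python
import hashlib

# Assumption: fixed-size chunks of VM images are 4 MiB.
CHUNK_SIZE = 4 * 1024 * 1024


def zero_chunk_digest(size: int = CHUNK_SIZE) -> str:
    """Return the hex SHA-256 digest of `size` zero bytes."""
    return hashlib.sha256(bytes(size)).hexdigest()


# File name of the chunk from the trace line above.
observed = "bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8"

# Report whether the observed chunk name matches the all-zero chunk digest.
print("zero chunk" if zero_chunk_digest() == observed else "some other chunk")
```

If the two digests match, the file is the chunk that stands for every all-zero block of the image, which would explain the high touch count.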
any clue what this one is, which is being touched an order of magnitude
more often than the zeroes chunk?
940079 proxmox-backup-(1428): R /backup/pve-maker-bonn_backup/.chunks/7b27/7b27b1eb4febae3273321255d8304e9b3e7938d9e254564bef859a4307a88638
regards
roland
On 21.02.25 at 15:01, Christian Ebner wrote:
> These patches implement the logic to greatly improve the performance
> of phase 1 garbage collection by avoiding multiple atime updates on
> the same chunk.
>
> Currently, phase 1 GC iterates over all folders in the datastore,
> looking for and collecting all image index files without making any
> logical assumptions (e.g. namespaces, groups, snapshots, ...). This
> is to avoid accidentally missing image index files located in
> unexpected paths and therefore not marking their chunks as in use,
> which could lead to data loss.
>
> These patches improve phase 1 by inserting the encountered image index
> paths into a data structure that allows iterating over the index files
> in a more logical manner, following the same principle as for
> incremental backup snapshots. The index files for the same namespace
> and group as well as image filename can therefore be inspected
> consecutively.
>
> Further, by keeping track of chunks that were already seen, and whose
> atime was therefore already updated, the atime is no longer updated
> over and over again on chunks shared by consecutive backup snapshots.
>
> To give some ballpark figures, this reduced phase 1 garbage collection
> on a real world datastore containing some of my backups from around
> 2 minutes to about 16 seconds.
>
> Christian Ebner (5):
>   datastore: restrict datastores list_images method scope to module
>   garbage collection: refactor archive type based chunk marking logic
>   garbage collection: add structure for optimized image iteration
>   garbage collection: allow to keep track of already touched chunks
>   fix #5331: garbage collection: avoid multiple chunk atime updates
>
> pbs-datastore/src/datastore.rs | 204 ++++++++++++++++++++++++++-------
> 1 file changed, 160 insertions(+), 44 deletions(-)
>
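The idea described in the cover letter can be sketched roughly as follows. This is not the code from the series (which changes pbs-datastore/src/datastore.rs); `mark_used_chunks`, `touch` and the list-of-digests input are hypothetical names used only for illustration, and the ordering of index files by namespace and group is not modeled. The point is only that a set of already seen digests turns a repeated atime update on a shared chunk into a cheap lookup.

```python
from typing import Callable, Iterable, Set


def mark_used_chunks(
    indexes: Iterable[Iterable[bytes]],
    touch: Callable[[bytes], None],
) -> int:
    """Call `touch` once per distinct chunk digest; return the number of calls."""
    seen: Set[bytes] = set()
    touched = 0
    for digests in indexes:      # one entry per index file
        for digest in digests:   # chunk digests referenced by that index
            if digest in seen:
                continue         # atime was already updated during this run
            seen.add(digest)
            touch(digest)        # stands in for the real atime update
            touched += 1
    return touched


# Two consecutive snapshots of the same image that share most chunks.
zero, a, b, c = (bytes([i]) * 32 for i in range(4))
snapshot_1 = [zero, a, zero, b, zero]
snapshot_2 = [zero, a, zero, c, zero]

calls = []
print(mark_used_chunks([snapshot_1, snapshot_2], calls.append))  # prints 4
```

Without the set, the ten chunk references in this toy example would cause ten atime updates; with it, only the four distinct chunks are touched.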
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel