From: Roland <devzero@web.de>
To: Mark Schouten <mark@tuxis.nl>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>,
Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [pbs-devel] Slow overview of existing backups
Date: Fri, 10 Mar 2023 11:16:05 +0100
Message-ID: <205dd8f8-a0f6-8146-27c1-f53d9b98b838@web.de>
In-Reply-To: <emb66595c1-d8df-4996-903f-e39ebd7039c2@eb9993ff.com>
Hi,
>> Requesting the available backups from a PBS takes quite a long time.
>> Are there any plans to start implementing caching or an overall
>> index file for a datastore?
> There's already the host system's page cache, which helps a lot, as
> long as there's enough memory to avoid displacing its content
> frequently.
> - 2 three-way mirrors of 960GB Samsung PM9A3 NVMe drives as special devices
Ah ok, I see, you should have fast metadata access thanks to the special device.
What about freshly booting your backup server and issuing

zpool iostat -rv

after listing the backups and observing the slowness?
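For example (a rough sketch; "tank" is a placeholder for your pool
name, and note that dropping caches only empties the Linux page
cache, not the ZFS ARC, so a real reboot or a zpool export/import is
the more thorough reset):

  # approximate a cold cache without rebooting
  echo 3 > /proc/sys/vm/drop_caches

  # trigger the slow listing, then inspect the per-vdev
  # request-size histograms
  zpool iostat -rv tank

If the read requests land on the HDD mirrors rather than on the
special device mirrors, the metadata is not being served from the
fast vdevs as intended.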
With this we can get more insight into where the time is spent,
whether it's really all about metadata access, and whether things are
working well from a filesystem/performance/metadata point of view.
I don't expect issues there anymore, as the special vdev has matured
in the meantime, but you never know; remember
https://github.com/openzfs/zfs/issues/8130 for example.
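To check whether metadata really ended up on the special vdevs, the
per-vdev allocation is a quick indicator (again "tank" is a
placeholder):

  # special vdevs are listed in their own section with alloc/free
  zpool list -v tank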
If that looks sane from a performance perspective, taking a closer
look at the PBS/indexer level would be good.
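For example (an untested sketch; "yourstore" and the credentials are
placeholders, adjust to your setup):

  # time a plain group listing through the PBS client, once after a
  # fresh boot (cold) and once repeated (warm)
  time proxmox-backup-client list --repository root@pam@localhost:yourstore

Comparing the cold and warm runs should show how much of the wall
time is metadata I/O versus time spent in the PBS code itself.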
regards
roland
On 10.03.23 at 10:09, Mark Schouten wrote:
> Hi all,
>
> any thought on this?
>
> —
> Mark Schouten, CTO
> Tuxis B.V.
> mark@tuxis.nl / +31 318 200208
>
>
> ------ Original Message ------
> From "Mark Schouten" <mark@tuxis.nl>
> To "Thomas Lamprecht" <t.lamprecht@proxmox.com>; "Proxmox Backup
> Server development discussion" <pbs-devel@lists.proxmox.com>
> Date 1/26/2023 9:03:24 AM
> Subject Re[2]: [pbs-devel] Slow overview of existing backups
>
>> Hi,
>>
>>>> PBS knows when something changed in terms of backups, and thus
>>>> when it’s time to update that index.
>>>>
>>>
>>> PBS is built such that the file system is the source of truth; one
>>> can, e.g., remove stuff there or use the manager CLI, and multiple
>>> PBS instances can also run in parallel, e.g., during an upgrade.
>>>
>>> So having a guaranteed in-sync cache is not as trivial as it might
>>> sound.
>>>
>>
>> You can also remove stuff from /var/lib/mysql/, but then you break
>> it. There is nothing wrong with requiring your users not to touch
>> any files except via the tooling you provide. And the tooling you
>> provide can hint the service to rebuild the index. The same goes for
>> upgrades; you are in charge of them.
>>
>> We also need to regularly run garbage collection, which is a nice
>> moment to update my desired index and check whether it's actually
>> correct. On every backup run, delete, and verify, you can update and
>> check the index. Those are all moments when a user is not actually
>> waiting for it and suffering timeouts, refreshing screens, and other
>> annoyances.
>>
>>>
>>>> I have the feeling that when you request an overview now, all
>>>> individual backups are checked, which seems suboptimal.
>>>
>>> We mostly walk the directory structure and read the (quite small)
>>> manifest files for some info like the last verification, but we do
>>> not check the backup (data) itself.
>>>
>>> Note that using namespaces to separate many backups into multiple
>>> folders can help, as a listing then only needs to check the indices
>>> from that namespace.
>>>
>>> But what backup counts and data sizes are we talking about here?
>>
>> Server:
>> 2x Intel Silver 4114 (10 cores, 20 threads each)
>> 256GB RAM
>> A zpool consisting of:
>> - 17 three-way mirrors of 18TB Western Digital HC550s (SAS)
>> - 2 three-way mirrors of 960GB Samsung PM9A3 NVMe drives as special devices
>>
>> Datastores:
>> - 73 datastores
>> - 240TB of allocated data in total
>>
>> Datastore that triggered my question:
>> - 263 groups
>> - 2325 snapshots
>> - 60TB in use
>> - Dedup factor of 19.3
>>
>>> How many groups, how many snapshots (per group), and how many disks per backup?
>>>
>>> And what hardware is hosting that data (CPU, disk, memory)?
>>>
>>> How's PSI looking during listing? head /proc/pressure/*
>>
>> root@pbs003:/proc/pressure# head *
>> ==> cpu <==
>> some avg10=0.74 avg60=0.58 avg300=0.21 total=8570917611
>> full avg10=0.00 avg60=0.00 avg300=0.00 total=0
>>
>> ==> io <==
>> some avg10=20.45 avg60=23.93 avg300=27.69 total=176562636690
>> full avg10=19.25 avg60=22.69 avg300=26.82 total=165397148422
>>
>> ==> memory <==
>> some avg10=0.00 avg60=0.00 avg300=0.00 total=67894436
>> full avg10=0.00 avg60=0.00 avg300=0.00 total=66761631
>>
>> Currently running 9 tasks:
>> - 3 verify jobs
>> - 1 backup
>> - 2 sync jobs
>> - 2 GC runs
>> - 1 reader
>>
>> —
>> Mark Schouten, CTO
>> Tuxis B.V.
>> mark@tuxis.nl / +31 318 200208
>
>