public inbox for pbs-devel@lists.proxmox.com
From: Roland <devzero@web.de>
To: Mark Schouten <mark@tuxis.nl>,
	Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [pbs-devel] Slow overview of existing backups
Date: Fri, 10 Mar 2023 11:16:05 +0100
Message-ID: <205dd8f8-a0f6-8146-27c1-f53d9b98b838@web.de>
In-Reply-To: <emb66595c1-d8df-4996-903f-e39ebd7039c2@eb9993ff.com>

Hi,

>> Requesting the available backups from a PBS takes quite a long time.
>> Are there any plans to start implementing caching or an overall
>> index-file for a datastore?

> There's already the host system's page cache that helps a lot, as
> long as there's enough memory to avoid displacing its content
> frequently.
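
(Side note: on ZFS it's the ARC rather than the page cache that does
this job for datastore metadata. Whether that cache is effective can
be checked quickly; a rough sketch, assuming a standard ZFS-on-Linux
install where /proc/spl/kstat/zfs/arcstats exists:

 free -h                     # overall memory headroom
 awk '$1 == "hits" || $1 == "misses" {print $1, $3}' \
     /proc/spl/kstat/zfs/arcstats

A high miss share right after listing backups would point at a cold or
frequently displaced cache.)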

> - 2 three-way mirrors of 960GB Samsung PM9A3 NVMe's as special devices

Ah ok, I see, you should have fast metadata access because of the
special devices.

What about freshly booting your backup server and issuing

zpool iostat -rv

after listing the backups and observing the slowness?

With this we can get more insight into where the time is spent,
whether it's really all about metadata access, and whether things are
working well from a filesystem/performance/metadata point of view.
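
A minimal sketch of that sequence; the pool name "tank" and the
datastore name "store1" are placeholders, and I'm assuming a client
recent enough to have the "snapshot list" subcommand:

 # after a fresh reboot, so all caches are cold:
 zpool iostat -rv tank > iostat-cold.txt   # baseline request-size histograms

 # trigger the slow listing:
 proxmox-backup-client snapshot list \
     --repository root@pam@localhost:store1 > /dev/null

 zpool iostat -rv tank > iostat-warm.txt   # histograms after the listing
 diff iostat-cold.txt iostat-warm.txt      # the delta is what the listing cost

If nearly all of the delta shows up as small reads on the special
device mirrors, metadata is being served from the right place.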

I don't expect issues there anymore, as the special vdev feature has
matured in the meantime, but you never know. Remember
https://github.com/openzfs/zfs/issues/8130 for example...

If that looks sane from a performance perspective, taking a closer
look at the PBS/indexer level would be good.
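
To separate filesystem time from indexer/API time, something like the
following could help. A sketch again: the datastore path and the
repository are placeholders; index.json.blob is the per-snapshot
manifest file name in a PBS datastore:

 # raw filesystem cost of reading every snapshot manifest:
 time find /mnt/datastore/store1 -name index.json.blob \
     -exec cat {} + > /dev/null

 # the same listing through the PBS API, for comparison:
 time proxmox-backup-client snapshot list \
     --repository root@pam@localhost:store1 > /dev/null

If the second number is much larger than the first, the time is spent
in the API/indexer layer rather than on disk access.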

Regards,
Roland

On 10.03.23 at 10:09, Mark Schouten wrote:
> Hi all,
>
> any thought on this?
>
> —
> Mark Schouten, CTO
> Tuxis B.V.
> mark@tuxis.nl / +31 318 200208
>
>
> ------ Original Message ------
> From "Mark Schouten" <mark@tuxis.nl>
> To "Thomas Lamprecht" <t.lamprecht@proxmox.com>; "Proxmox Backup
> Server development discussion" <pbs-devel@lists.proxmox.com>
> Date 1/26/2023 9:03:24 AM
> Subject Re[2]: [pbs-devel] Slow overview of existing backups
>
>> Hi,
>>
>>>>  PBS knows when something changed in terms of backups, and thus
>>>> when it’s time to update that index.
>>>>
>>>
>>> PBS is built such that the file system is the source of truth; one
>>> can, e.g., remove stuff there or use the manager CLI, and multiple
>>> PBS instances can also run in parallel, e.g., during an upgrade.
>>>
>>> So having a guaranteed in-sync cache is not as trivial as it might
>>> sound.
>>>
>>
>> You can also remove stuff from /var/lib/mysql/, but then you break
>> it. There is nothing wrong with demanding that your users don't
>> touch any files except via the tooling you provide. And the tooling
>> you provide can hint the service to rebuild the index. The same goes
>> for upgrades: you are in charge of them.
>>
>> We also need to regularly run garbage collection, which is a nice
>> moment to update the desired index and check whether it's actually
>> correct. On every backup run, delete, or verify, you can update and
>> check the index. Those are all moments when a user is not actually
>> waiting for it, getting timeouts, refreshing screens, and hitting
>> other annoyances.
>>
>>>
>>>>  I have the feeling that when you request an overview now, all
>>>> individual backups are checked, which seems suboptimal.
>>>
>>> We mostly walk the directory structure and read the (quite small)
>>> manifest files for some info like last verification, but we do not
>>> check the backup (data) itself.
>>>
>>> Note that using namespaces to separate many backups into multiple
>>> folders can help, as a listing then only needs to check the indices
>>> from that namespace.
>>>
>>> But what amounts of data and backups (counts/sizes) are we talking
>>> about here?
>>
>> Server:
>> 2x Intel Silver 4114 (10 cores, 20 threads each)
>> 256GB RAM
>> A zpool consisting of:
>> - 17 three-way mirrors of 18TB Western Digital HC550's (SAS)
>> - 2 three-way mirrors of 960GB Samsung PM9A3 NVMe's as special devices
>>
>> Datastores:
>> - 73 datastores
>> - Total of 240TB of allocated data
>>
>> Datastore that triggered my question:
>> - 263 Groups
>> - 2325 Snapshots
>> - 60TB In use
>> - Dedup factor of 19.3
>>
>>> How many groups, how many snapshots (per group), and how many disks
>>> per backup?
>>>
>>> And what hardware is hosting that data (CPU, disk, memory)?
>>>
>>> How's PSI looking during the listing? head /proc/pressure/*
>>
>> root@pbs003:/proc/pressure# head *
>> ==> cpu <==
>> some avg10=0.74 avg60=0.58 avg300=0.21 total=8570917611
>> full avg10=0.00 avg60=0.00 avg300=0.00 total=0
>>
>> ==> io <==
>> some avg10=20.45 avg60=23.93 avg300=27.69 total=176562636690
>> full avg10=19.25 avg60=22.69 avg300=26.82 total=165397148422
>>
>> ==> memory <==
>> some avg10=0.00 avg60=0.00 avg300=0.00 total=67894436
>> full avg10=0.00 avg60=0.00 avg300=0.00 total=66761631
>>
>> Currently running 9 tasks:
>> - 3 Verifies
>> - 1 Backup
>> - 2 Sync jobs
>> - 2 GC runs
>> - 1 Reader
>>
>> —
>> Mark Schouten, CTO
>> Tuxis B.V.
>> mark@tuxis.nl / +31 318 200208
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel



Thread overview: 8+ messages
2023-01-25 10:26 Mark Schouten
2023-01-25 16:08 ` Thomas Lamprecht
2023-01-26  8:03   ` Mark Schouten
2023-03-10  9:09     ` Mark Schouten
2023-03-10 10:16       ` Roland [this message]
2023-03-10 10:52         ` Mark Schouten
2023-03-13 12:48           ` Mark Schouten
2023-03-10 10:02 ` Roland
