From: Roland <devzero@web.de>
Date: Fri, 10 Mar 2023 11:16:05 +0100
To: Mark Schouten, Proxmox Backup Server development discussion
 <pbs-devel@lists.proxmox.com>, Thomas Lamprecht
Subject: Re: [pbs-devel] Slow overview of existing backups

Hi,

>> Requesting the available backups from a PBS takes quite a long time.
>> Are there any plans to start implementing caching or an overall
>> index file for a datastore?
>
> There's already the host system's page cache, which helps a lot, as
> long as there's enough memory to avoid displacing its contents
> frequently.

> - 2 three-way mirrors of 960GB Samsung PM9A3 NVMes as special devices

Ah ok, I see - you should have fast metadata access because of the
special devices.

What about freshly booting your backup server and issuing

zpool iostat -rv

after listing the backups and observing the slowness?
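For example, something like this (a rough sketch - "store1" is just a
placeholder for one of your datastores, and dropping the page cache is
a lighter-weight stand-in for the fresh boot):

# drop the page cache so the listing has to read metadata from disk again
echo 3 > /proc/sys/vm/drop_caches

# time a listing of the backup groups in that datastore
time proxmox-backup-client list --repository localhost:store1

# per-vdev request-size histograms: show whether those reads were
# served by the special vdevs (metadata) or by the spinning mirrors
zpool iostat -rv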
With this we can get more insight into where the time is spent, whether
it's really all about metadata access, and whether things are working
well from a filesystem/performance/metadata point of view.

I don't expect issues there anymore, as special vdevs have matured in
the meantime, but you never know - remembering
https://github.com/openzfs/zfs/issues/8130 for example...

If that looks sane from a performance perspective, taking a closer look
at the PBS/indexer level would be good.

regards,
Roland

Am 10.03.23 um 10:09 schrieb Mark Schouten:
> Hi all,
>
> any thoughts on this?
>
> --
> Mark Schouten, CTO
> Tuxis B.V.
> mark@tuxis.nl / +31 318 200208
>
>
> ------ Original Message ------
> From "Mark Schouten"
> To "Thomas Lamprecht"; "Proxmox Backup Server development discussion"
> Date 1/26/2023 9:03:24 AM
> Subject Re[2]: [pbs-devel] Slow overview of existing backups
>
>> Hi,
>>
>>>> PBS knows when something changed in terms of backups, and thus
>>>> when it's time to update that index.
>>>
>>> PBS is built such that the file system is the source of truth; one
>>> can, e.g., remove stuff there or use the manager CLI, and multiple
>>> PBS instances can also run in parallel, e.g., during an upgrade.
>>>
>>> So having a guaranteed in-sync cache is not as trivial as it might
>>> sound.
>>
>> You can also remove stuff from /var/lib/mysql/, but then you break
>> it. There is nothing wrong with demanding that users don't touch any
>> files except via the tooling you provide. And the tooling you provide
>> can hint the service to rebuild the index. The same goes for
>> upgrades; you are in charge of them.
>>
>> We also need to regularly run garbage collection, which is a nice
>> moment to update my desired index and check whether it's actually
>> correct. On every backup run, delete, or verify, you can update and
>> check the index. Those are all moments when a user is not actually
>> waiting for it, getting timeouts, refreshing screens, and other
>> annoyances.
>>
>>>> I have the feeling that when you request an overview now, all
>>>> individual backups are checked, which seems suboptimal.
>>>
>>> We mostly walk the directory structure and read the (quite small)
>>> manifest files for some info like last verification, but we do not
>>> check the backup (data) itself.
>>>
>>> Note that using namespaces to separate many backups into multiple
>>> folders can help, as a listing then only needs to check the indices
>>> from that namespace.
>>>
>>> But what data and backup amounts (counts/sizes) are we talking
>>> about here?
>>
>> Server:
>> 2x Intel Silver 4114 (10 cores, 20 threads each)
>> 256GB RAM
>> A zpool consisting of:
>> - 17 three-way mirrors of 18TB Western Digital HC550s, SAS
>> - 2 three-way mirrors of 960GB Samsung PM9A3 NVMes as special devices
>>
>> Datastores:
>> - 73 datastores
>> - 240T of allocated data in total
>>
>> Datastore that triggered my question:
>> - 263 groups
>> - 2325 snapshots
>> - 60TB in use
>> - Dedup factor of 19.3
>>
>>> How many groups, how many snapshots (per group), how many disks in
>>> the backups?
>>>
>>> And what hardware is hosting that data (CPU, disks, memory)?
>>>
>>> How's PSI looking during a listing? head /proc/pressure/*
>>
>> root@pbs003:/proc/pressure# head *
>> ==> cpu <==
>> some avg10=0.74 avg60=0.58 avg300=0.21 total=8570917611
>> full avg10=0.00 avg60=0.00 avg300=0.00 total=0
>>
>> ==> io <==
>> some avg10=20.45 avg60=23.93 avg300=27.69 total=176562636690
>> full avg10=19.25 avg60=22.69 avg300=26.82 total=165397148422
>>
>> ==> memory <==
>> some avg10=0.00 avg60=0.00 avg300=0.00 total=67894436
>> full avg10=0.00 avg60=0.00 avg300=0.00 total=66761631
>>
>> Currently running 9 tasks:
>> - 3 Verifies
>> - 1 Backup
>> - 2 Sync jobs
>> - 2 GC runs
>> - 1 Reader
>>
>> --
>> Mark Schouten, CTO
>> Tuxis B.V.
>> mark@tuxis.nl / +31 318 200208
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel