public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: "Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>,
	"Dominik Csapak" <d.csapak@proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
	"Gabriel Goller" <g.goller@proxmox.com>
Subject: Re: [pbs-devel] [PATCH widget-toolkit] fix #4961: disks: increase timeout to 1min
Date: Tue, 26 Sep 2023 15:42:08 +0200	[thread overview]
Message-ID: <7b0ac740-096b-4850-a81d-5fa6bc9c347a@proxmox.com> (raw)
In-Reply-To: <f31588c3-5dba-47f8-a140-d015d0a0456c@proxmox.com>

Am 26/09/2023 um 10:15 schrieb Dominik Csapak:
> On 9/26/23 09:55, Fabian Grünbichler wrote:
>> On September 21, 2023 10:21 am, Gabriel Goller wrote:
>>> Increase the timeout of the api call to `../nodes/localhost/disks/list`
>>> to 1min. When having a lot of disks the default timeout of 30sec is
>>> not enough.
>>>
>>
>> that alone won't be enough, since we also have a 30s timeout for
>> proxying requests in a PVE cluster (and this component is used there).
> 
> it would actually be enough to fix that exact bug, which is for pbs
> where we don't have the 30s proxy limit, but you're right it's not
> a generic fix across both products
> 
> imho we could still apply this here (maybe with a short test to
> see if it doesn't make the situation worse for pve) so it's fixed
> for pbs at least and can properly fix it for pve later still

I'd rather like to avoid that, 30s is already long and might make some
users just close the dialogue or whatever.

And general "lightweight-task" mechanism that Fabian proposed could be
nice for some things but IMO the wrong approach here too.

Here, with the end result being to show a simple disk list, I just
refuse to believe that this has to take that long; here the API call has
some serious architectural problems, among other things that can mean a
mix of:

1. too much stuff read that is not relevant most of the time, we could
   avoid querying the data that has the worst expensive/useful ratio
   and make that only available on a per-disk info dialogue.

2. getting some info in a convoluted way (e.g, we have already lsblk
   info, which knows if a disk is a "zfs_member", but still do a zpool
   list, and all of those do separate stats on the device nodes to get
   the rdevs, multiple times...)

3. missing a cache for the more expensive stuff, e.g., one that the rrd
   stat querying code could probably fill every X'd run or so, as it
   checks for IO metrics anyway

I'd start with the 2. point above, as I saw quite a bit optimization
potential with only a quick skimming, some of that should hopefully
produce some actual improvements.

Then I'd check where most of the time is spent, either by asking the
reporter nicely for timings of all relevant commands and potentially
also by just checking ourselves, we have slow HW here too, and while
not 90 disks, even a tenth should show slowish enough behavior to
be able to find the major culprits with some tracing or even just
printf-timestamp-debugging


The _last_ thing I'd do here for this specific endpoint is extending
timeouts or inventing some new fancy light-weight task mechanisms.
As optimizations, omitting expensive stuff, or moving it out from
"for-all-disk" to a "per-single-disk" view, and caching for the
"expensive but just really interesting to always see" stuff goes a
long way..

btw. PBS still misses the Wipe-Disk feature from PVE..




  reply	other threads:[~2023-09-26 13:42 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-21  8:21 Gabriel Goller
2023-09-26  7:55 ` Fabian Grünbichler
2023-09-26  8:15   ` Dominik Csapak
2023-09-26 13:42     ` Thomas Lamprecht [this message]
2023-09-28 12:57       ` Gabriel Goller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7b0ac740-096b-4850-a81d-5fa6bc9c347a@proxmox.com \
    --to=t.lamprecht@proxmox.com \
    --cc=d.csapak@proxmox.com \
    --cc=f.gruenbichler@proxmox.com \
    --cc=g.goller@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal