Re: [pve-devel] [PATCH qemu-server] fix #6207: vm status: cache last disk read/write values

From: Fiona Ebner <f.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH qemu-server] fix #6207: vm status: cache last disk read/write values
Date: Thu, 25 Sep 2025 11:28:14 +0200
Message-ID: <841c9758-1478-431f-8cd4-e8fd8c0ac9cd@proxmox.com>
In-Reply-To: <0681c879-433c-49d6-ae56-582111efb5ee@proxmox.com>

On 25.09.25 at 10:52 AM, Thomas Lamprecht wrote:
> On 25.09.25 at 10:27, Fiona Ebner wrote:
>> On 22.09.25 at 7:26 PM, Thomas Lamprecht wrote:
>>> On 22.09.25 at 12:18, Fiona Ebner wrote:
>>> Or maybe we could make this caching opt-in through some module flag
>>> that only pvestatd sets? But not really thought that through, so
>>> please take this with a grain of salt.
>>>
>>> btw. what about QMP being "stuck" for a prolonged time, should we
>>> stop using the previous value after a few times (or duration)? 
>>
>> What other value could we use? Since the graph looks at the differences
>> of reported values, the only reasonable value we can use if we cannot
>> get a new one is the previous one, no matter how long it takes to get a
>> new one; otherwise there will be that completely wrong spike again. Or is there
>> an N/A kind of value that we could use, where RRD/graph would be smart
>> enough to know "I cannot calculate a difference now, will have to wait
>> for multiple good values"? Then I'd go for that instead of the current
>> approach.
> 
> That should never be the problem of the metric collecting entity, but of
> the one interpreting or displaying the data, as otherwise this creates a
> false impression of reality.
> 
> So the more I think of this, the more I'm sure that we won't do anybody
> a favor in the mid/long term here with "faking it" in the backend.

Very good point! I'll look into what happens when reporting an undef
value, because right now the interpreting entity cannot distinguish
between "0 because of no data" and "0 yes I really mean this is the
actual value".
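
To illustrate the distinction, here is a minimal sketch, assuming the
values end up in an rrdtool-style update string, where 'U' marks an unknown
sample. The helper name format_rrd_values is hypothetical, and whether our
current pvestatd/RRD path passes such a value through unchanged is exactly
what still needs checking:

```perl
use strict;
use warnings;

# Hypothetical helper (not existing qemu-server code): build the value part
# of an rrdtool-style update string. A missing sample (undef) is emitted as
# 'U' (unknown), while a genuine zero stays '0', so the consumer can tell
# the two cases apart.
sub format_rrd_values {
    my (@values) = @_;
    return join(':', map { defined($_) ? $_ : 'U' } @values);
}

# diskread is known, diskwrite could not be queried, netin really is zero
print format_rrd_values(1024, undef, 0), "\n"; # prints 1024:U:0
```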

> I'd need to look into RRD, but even if there wasn't a way there to
> submit null-ish values, I'd rather see that as a further argument for
> switching out RRD for the Rust-based proxmox-rrd crate, where we have
> control over these things, compared to recording measurements that did
> not happen.
> 
> That does not mean that doing this correctly in proxmox-rrd will be
> trivial once we have migrated (which is non-trivial on its own), though.
> There are also some ideas about switching to a rather different way to
> encode metrics, using a more flexible format and stuff like delta
> encoding, i.e. closer to how modern time series DBs like InfluxDB do it;
> Lukas signaled some interest in this work.
> But that is vaporware as of now, so no need to wait for that to happen,
> just wanted to mention it so that those ideas are not too isolated.
> 
> 
> But taking a step back, why is QMP even timing out here? Is this not
> just reading some in-memory counters that QEMU has ready to go?

There can be another QMP operation going on that blocks the request (e.g.
backup), or the QEMU main thread might be busy, or the system in general
might be under too much load to handle all of the QMP commands to all
the VMs in time. In the report of this issue in enterprise support, VMs
that are not being backed up show the spike during the backup of other
VMs.

But it seems like there is potential for improvement in how we do things.
Currently we have:

>     my $statuscb = sub {
>         my ($vmid, $resp) = @_;
> 
>         $qmpclient->queue_cmd($vmid, $blockstatscb, 'query-blockstats');
>         $qmpclient->queue_cmd($vmid, $machinecb, 'query-machines');
>         $qmpclient->queue_cmd($vmid, $versioncb, 'query-version');
>         # this fails if balloon driver is not loaded, so this must be
>         # the last command (following commands are aborted if this fails).
>         $qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon');
> 
>         my $status = 'unknown';
>         if (!defined($status = $resp->{'return'}->{status})) {
>             warn "unable to get VM status\n";
>             return;
>         }
> 
>         $res->{$vmid}->{qmpstatus} = $resp->{'return'}->{status};
>     };
> 
>     foreach my $vmid (keys %$list) {
>         next if $opt_vmid && ($vmid ne $opt_vmid);
>         next if !$res->{$vmid}->{pid}; # not running
>         $qmpclient->queue_cmd($vmid, $statuscb, 'query-status');
>     }

Okay, good! We collect all commands so we can issue them in parallel.

>     $qmpclient->queue_execute(undef, 2);

Here we only have the default timeout of 3 seconds (i.e. the undef
argument), maybe we should bump that to something like 5 seconds? Right
now, without pvestatd parallelizing update_qemu_status() with the other
update_xyz() operations, that might already be quite costly :/ But
considering it's for all VMs, it might be fair?

>     foreach my $vmid (keys %$list) {
>         next if $opt_vmid && ($vmid ne $opt_vmid);
>         next if !$res->{$vmid}->{pid}; #not running
> 
>         # we can't use the $qmpclient since it might have already aborted on
>         # 'query-balloon', but this might also fail for older versions...
>         my $qemu_support = eval { mon_cmd($vmid, "query-proxmox-support") };
>         $res->{$vmid}->{'proxmox-support'} = $qemu_support // {};
>     }

This, OTOH, seems just bad: it queries the info one by one, each with its
own timeout. I'll look into whether this can be reworked to be part of
the queue (before 'query-balloon'). And/or we should even be able to
disable this for the status daemon; I think it doesn't use that info at all.
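
As a rough sketch of the queueing idea, assuming a client that runs each
VM's queued commands in order and drops the rest after the first failure
(as the comment on 'query-balloon' describes): MockQMPClient below is a
hypothetical stand-in written only for this illustration, not the real
client, and its queue_execute() takes a made-up "which commands fail"
argument instead of a timeout:

```perl
use strict;
use warnings;

package MockQMPClient;

sub new {
    my ($class) = @_;
    return bless({ queue => {} }, $class);
}

# Remember the command and its callback for the given VM.
sub queue_cmd {
    my ($self, $vmid, $cb, $cmd) = @_;
    push @{ $self->{queue}->{$vmid} }, [$cb, $cmd];
}

# Run each VM's commands in order; a failing command aborts the
# remaining commands of that VM only.
sub queue_execute {
    my ($self, $fails) = @_;
    for my $vmid (sort keys %{ $self->{queue} }) {
        for my $entry (@{ $self->{queue}->{$vmid} }) {
            my ($cb, $cmd) = @$entry;
            last if $fails->{$cmd};
            $cb->($vmid, { 'return' => "$cmd-ok" });
        }
    }
}

package main;

my $res = {};
my $client = MockQMPClient->new();

for my $vmid (100, 101) {
    # queued before 'query-balloon', so a balloon failure cannot drop it
    $client->queue_cmd(
        $vmid,
        sub { $res->{ $_[0] }->{'proxmox-support'} = $_[1]->{'return'} },
        'query-proxmox-support',
    );
    $client->queue_cmd(
        $vmid,
        sub { $res->{ $_[0] }->{balloon} = $_[1]->{'return'} },
        'query-balloon',
    );
}

# simulate a guest without the balloon driver
$client->queue_execute({ 'query-balloon' => 1 });

for my $vmid (sort keys %$res) {
    my $support = $res->{$vmid}->{'proxmox-support'} // 'missing';
    my $balloon = $res->{$vmid}->{balloon} // 'missing';
    print "$vmid: proxmox-support=$support balloon=$balloon\n";
}
```

One caveat with this ordering: the quoted comment notes that
'query-proxmox-support' might itself fail for older versions, and in that
case it would abort the commands queued after it, including
'query-balloon'. So a real rework would need to handle that failure case.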
