public inbox for pve-devel@lists.proxmox.com
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Philipp Hufnagl <p.hufnagl@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Maximiliano Sandoval <m.sandoval@proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-container] pct: fix cpu load calculation on command line
Date: Tue, 18 Jul 2023 15:02:29 +0200	[thread overview]
Message-ID: <2d4ca785-09c3-0fc3-0658-e7f29f8579b9@proxmox.com> (raw)
In-Reply-To: <edc36f4f-f830-fba6-696e-0db56c81d318@proxmox.com>

On 18/07/2023 at 14:00, Philipp Hufnagl wrote:
> Sorry, forgot to tag as v2

and also forgot to document the patch changelog, as asked yesterday..

> 
> On 7/18/23 13:58, Philipp Hufnagl wrote:
>>   When called from the command line, it was not possible to calculate
>>   cpu load because there was no 2nd data point available for the
>>   calculation. Now (when called) from the command line, cpu stats will
>>   be fetched twice with a minimum delta of 20ms. That way the load can
>>   be calculated

@Maximiliano, didn't we decide to just drop it instead? This isn't really
useful; one can get much better data from the pressure stall information
(PSI), which is tracked per cgroup and tells a user much more than a 20 ms
sampling interval..

https://docs.kernel.org/accounting/psi.html#cgroup2-interface
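For illustration, a cgroup2 cpu.pressure file contains lines of the shape
`some avg10=0.00 avg60=0.00 avg300=0.00 total=0` (per the kernel PSI docs
linked above). A minimal Python sketch for parsing one such line — the helper
name is hypothetical, not an existing PVE function:

```python
def parse_pressure_line(line):
    """Parse one line of a cgroup2 *.pressure file into (kind, values).

    Expected shape, per the kernel PSI documentation:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    kind, *fields = line.split()
    pairs = dict(field.split("=", 1) for field in fields)
    return kind, {key: float(val) for key, val in pairs.items()}
```

One would read e.g. /sys/fs/cgroup/lxc/<vmid>/cpu.pressure and look at the
"some" line's avg10 value for a 10-second load average.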

Still a few comments inline.

>>
>> fixes #4765

Add this to the start of the commit subject: fix #4765: 

>>
>> Signed-off-by: Philipp Hufnagl <p.hufnagl@proxmox.com>
>> ---
>>   src/PVE/CLI/pct.pm |  4 ++--
>>   src/PVE/LXC.pm     | 32 +++++++++++++++++++++++++++++---
>>   2 files changed, 31 insertions(+), 5 deletions(-)
>>

> diff --git a/src/PVE/CLI/pct.pm b/src/PVE/CLI/pct.pm
> index ff75d33..e531b27 100755
> --- a/src/PVE/CLI/pct.pm
> +++ b/src/PVE/CLI/pct.pm
> @@ -60,8 +60,8 @@ __PACKAGE__->register_method ({
>  
>  	# test if CT exists
>  	my $conf = PVE::LXC::Config->load_config ($param->{vmid});
> -
> -	my $vmstatus = PVE::LXC::vmstatus($param->{vmid});
> +	# workaround to get cpu usage is to fetch cpu stats twice with a delay
> +	my $vmstatus = PVE::LXC::vmstatus($param->{vmid}, 20);
>  	my $stat = $vmstatus->{$param->{vmid}};
>  	if ($param->{verbose}) {
>  	    foreach my $k (sort (keys %$stat)) {
> diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
> index a531ea5..9fc171f 100644
> --- a/src/PVE/LXC.pm
> +++ b/src/PVE/LXC.pm
> @@ -12,7 +12,7 @@ use IO::Poll qw(POLLIN POLLHUP);
>  use IO::Socket::UNIX;
>  use POSIX qw(EINTR);
>  use Socket;
> -use Time::HiRes qw (gettimeofday);
> +use Time::HiRes qw (gettimeofday usleep);
>  
>  use PVE::AccessControl;
>  use PVE::CGroup;
> @@ -171,11 +171,37 @@ our $vmstatus_return_properties = {
>      }
>  };
>  
> +sub get_first_cpu {

I'd expect that a method with this name actually returns something, i.e., the
ID of the first CPU or the like, so the method name should tell more about what
this does, e.g.:

prime_vmstatus_cpu_sampling

> +    my ($list, $measure_timespan_ms) = @_;
> +    my $cdtime = gettimeofday;
> +
> +    foreach my $vmid (keys %$list) {
> +	my $cgroups = PVE::LXC::CGroup->new($vmid);
> +	if (defined(my $cpu = $cgroups->get_cpu_stat())) {
> +	    # Total time (in milliseconds) used up by the cpu.
> +	    my $used_ms = $cpu->{utime} + $cpu->{stime};
> +	    $last_proc_vmid_stat->{$vmid} = {
> +		time => $cdtime,
> +		used => $used_ms,
> +		cpu => 0,
> +	    };
> +	}
> +    }
> +    usleep($measure_timespan_ms * 1000);

This is rather ugly; reading the call site, one definitely does not expect
that an innocently named get_first_cpu unconditionally sleeps, even though only
this caller requires that for its sampling..

If we don't just drop this CPU load stuff in pct status I'd rather do one of four
options:

1) rename this to prime_vmstatus_cpu_sampling and just do it for a single vmid,
then call this new method in PVE::CLI::pct->status and do the sleep there, as
that's actually the one call site that cares about it; the existing vmstatus
method then just needs one change:

-           if ($delta_ms > 1000.0) {
+           if ($delta_ms > 1000.0 || $old->{cpu} == 0) {

2) The same as 1) but instead of adding the prime_vmstatus_cpu_sampling helper
just call vmstatus twice with sleeping in-between (and the same change to the if
condition as for 1).

3) get the data where it's already available, i.e., pvestatd, might need more
rework though

4) switch over to reporting the PSI from /sys/fs/cgroup/lxc/VMID/cpu.pressure;
this is pretty simple to interpret: PSI ~ 0 -> no overload, 0 << PSI < 1 ->
some overload, and PSI >> 1 -> a lot of overload.
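That rough interpretation could be sketched as follows (a Python illustration;
the function name and the 0.1 cutoff for "no overload" are illustrative
choices, not kernel-defined thresholds):

```python
def classify_psi(avg10):
    # Buckets follow the rough interpretation above: ~0 means no overload,
    # somewhere between 0 and 1 means some overload, well above 1 means a
    # lot of overload. The 0.1 boundary is an arbitrary illustrative pick.
    if avg10 < 0.1:
        return "no overload"
    if avg10 <= 1.0:
        return "some overload"
    return "a lot of overload"
```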

Option 4 sounds nice-ish, but needs more work and doesn't have that high a
benefit (users can already query this easily themselves). Option 1 or 2 would
be OK-ish, but IMO not ideal, as we'd use a 20 ms average here compared to a
>> 1 s average elsewhere, which can be confusing as it can be quite, well,
spiky. Option 3 would be better here, but as mentioned it's also more rework
and possibly more intrusive, so IMO just dropping it sounds almost the nicest
and definitely the simplest option.

> +}
> +
>  sub vmstatus {
> -    my ($opt_vmid) = @_;
> +    my ($opt_vmid, $measure_timespan_ms) = @_;

nit: in control theory, signal processing, and acquiring stats in general,
"sampling period" or "sampling interval" is a bit more common for describing
what you do here with "$measure_timespan_ms".

>  
>      my $list = $opt_vmid ? { $opt_vmid => { type => 'lxc', vmid => int($opt_vmid) }} : config_list();
>  
> +    if (defined($measure_timespan_ms))
> +    {

Doesn't follow our coding style guide:
https://pve.proxmox.com/wiki/Perl_Style_Guide#Spacing_and_syntax_usage

> +	get_first_cpu($list, $measure_timespan_ms);
> +    }
> +
> +    $measure_timespan_ms //= 1000;

could just put that in the else block.

> +
>      my $active_hash = list_active_containers();
>  
>      my $cpucount = $cpuinfo->{cpus} || 1;
> @@ -285,7 +311,7 @@ sub vmstatus {
>  	    }
>  
>  	    my $delta_ms = ($cdtime - $old->{time}) * $cpucount * 1000.0;
> -	    if ($delta_ms > 1000.0) {
> +	    if ($delta_ms > $measure_timespan_ms) {
>  		my $delta_used_ms = $used_ms - $old->{used};
>  		$d->{cpu} = (($delta_used_ms / $delta_ms) * $cpucount) / $d->{cpus};
>  		$last_proc_vmid_stat->{$vmid} = {




Thread overview: 4 messages
2023-07-18 11:58 Philipp Hufnagl
2023-07-18 12:00 ` Philipp Hufnagl
2023-07-18 13:02   ` Thomas Lamprecht [this message]
2023-07-18 13:48     ` Philipp Hufnagl
