From: Daniel Kral
To: Proxmox VE development discussion
Cc: Sascha Westermann
Date: Tue, 24 Sep 2024 14:25:44 +0200
Subject: Re: [pve-devel] [PATCH qemu-server 3/3] Fix #5708: Add CPU raw counters
Message-ID: <7c8fa551-1682-4556-9322-15fd280fcfad@proxmox.com>
References: <20240917055020.10507-1-sascha.westermann@hl-services.de>

On 9/17/24 07:50, Sascha Westermann via pve-devel wrote:
> Add a map containing raw values from /proc//stat (utime, stime and
> guest_time), "uptime_ticks" and "user_hz" (from cpuinfo) to calcuate
> physical CPU usage from two samples. In addition, virtual CPU statistics
> based on /proc//task//schedstat ( for virtual cores) are
> added - based on this data, the CPU usage can be calculated from the
> perspective of the virtual machine.
>
> The total usage corresponds to "cpu_ns + runqueue_ns", "cpu_ns" should
> roughly reflect the physical CPU usage (without I/O-threads and
> emulators) and "runqueue_ns" corresponds to the value of %steal, i.e.
> the same as "CPU ready" for VMware or "Wait for dispatch" for Hyper-V.
>
> To calculate the difference value, uptime_ticks and user_hz would be
> converted to nanoseconds - the value was determined immediately after
> utime, stime and guest_time were determined from /proc//stat, i.e.
> before /proc//task//schedstat was determined. The time value
> is therefore not exact, but should be sufficiently close to the time of
> determination so that the values determined should be relatively
> accurate.
>
> Signed-off-by: Sascha Westermann
> ---
>  PVE/QemuServer.pm | 55 +++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 53 insertions(+), 2 deletions(-)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index b26da505..39830709 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -2814,6 +2814,40 @@ our $vmstatus_return_properties = {
>
>  my $last_proc_pid_stat;
>
> +sub get_vcpu_to_thread_id {
> +    my ($pid) = @_;
> +    my @cpu_to_thread_id;
> +    my $task_dir = "/proc/$pid/task";
> +
> +    if (! -d $task_dir) {
> +        return @cpu_to_thread_id;
> +    }
> +
> +    opendir(my $dh, $task_dir);
> +    if (!$dh) {
> +        return @cpu_to_thread_id;
> +    }
> +    while (my $tid = readdir($dh)) {
> +        next if $tid =~ /^\./;
> +        my $comm_file = "$task_dir/$tid/comm";
> +        next unless -f $comm_file;
> +
> +        open(my $fh, '<', $comm_file) or next;
> +        my $comm = <$fh>;
> +        close($fh);
> +
> +        chomp $comm;
> +
> +        if ($comm =~ /^CPU\s+(\d+)\/KVM$/) {
> +            my $vcpu = $1;
> +            push @cpu_to_thread_id, { tid => $tid, vcpu => $vcpu };
> +        }
> +    }
> +    closedir($dh);
> +
> +    return @cpu_to_thread_id;
> +}

nit: since they are not part of the initial bug's intent, this probably
could be split into its own commit (adding vCPU counters).

> +
>  # get VM status information
>  # This must be fast and should not block ($full == false)
>  # We only query KVM using QMP if $full == true (this can be slow)
> @@ -2827,8 +2861,6 @@ sub vmstatus {
>      my $list = vzlist();
>      my $defaults = load_defaults();
>
> -    my ($uptime) = PVE::ProcFSTools::read_proc_uptime(1);
> -
>      my $cpucount = $cpuinfo->{cpus} || 1;
>
>      foreach my $vmid (keys %$list) {
> @@ -2911,6 +2943,25 @@ sub vmstatus {
>
>          my $pstat = PVE::ProcFSTools::read_proc_pid_stat($pid);
>          next if !$pstat; # not running
> +        my ($uptime) = PVE::ProcFSTools::read_proc_uptime(1);
> +        my $process_uptime_ticks = $uptime - $pstat->{starttime};
> +
> +        $d->{cpustat}->{guest_time} = int($pstat->{guest_time});
> +        $d->{cpustat}->{process_uptime_ticks} = $process_uptime_ticks;
> +        $d->{cpustat}->{stime} = int($pstat->{stime});
> +        $d->{cpustat}->{user_hz} = $cpuinfo->{user_hz};
> +        $d->{cpustat}->{utime} = int($pstat->{utime});
> +
> +        my @vcpu_to_thread_id = get_vcpu_to_thread_id($pid);
> +        if (@vcpu_to_thread_id) {
> +            foreach my $entry (@vcpu_to_thread_id) {
> +                my $statstr = PVE::Tools::file_read_firstline("/proc/$pid/task/$entry->{tid}/schedstat") or next;
> +                if ($statstr && $statstr =~ m/^(\d+) (\d+) \d/) {
> +                    $d->{cpustat}->{"vcpu" . $entry->{vcpu}}->{cpu_ns} = int($1);
> +                    $d->{cpustat}->{"vcpu" . $entry->{vcpu}}->{runqueue_ns} = int($2);
> +                };
> +            }
> +        }

note: This might be useful information for patch #2 (if we decide to
make the added information available to metric servers as well), as this
data is actually sent to the external metric servers (at
`PVE::Service::pvestatd::update_qemu_status`), and it seems fine to me as
the vCPUs get separated via an "instance=vcpuX" field. I haven't tested
this with Grafana though.

e.g. for one of my VMs this will add the following to the InfluxDB API
write call:

```
cpustat,object=qemu,vmid=107,nodename=node1,host=test,instance=vcpu0 cpu_ns=10916152530,runqueue_ns=29127241 1727171085000000000
cpustat,object=qemu,vmid=107,nodename=node1,host=test,instance=vcpu1 cpu_ns=1341783516,runqueue_ns=6114069 1727171085000000000
cpustat,object=qemu,vmid=107,nodename=node1,host=test guest_time=846,process_uptime_ticks=5234,stime=333,user_hz=100,utime=1004 1727171085000000000
```

>
>          my $used = $pstat->{utime} + $pstat->{stime};
>
> --
> 2.46.0

As for patch #2, it would also be beneficial to the user that your added
data properties are documented in the JSONSchema for the function call
(`$vmstatus_return_properties`), so that they can be easily understood by
other users as well (especially in which unit those raw values are, so
that it's easier to know how they would need to get converted).

---

Otherwise, this works just as intended for me for:

- `/nodes/{node}/qemu/{vmid}/status/current` (pvesh, curl, WebGUI)
- `qm status ` (cli)
- InfluxDB API write calls

Reviewed-by: Daniel Kral
Tested-by: Daniel Kral

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5708#c3
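
For anyone consuming these raw counters, here is a rough client-side sketch of the two-sample calculation the commit message describes. It is not part of the patch and is in Python only for brevity; `CpuSample`, `physical_usage` and `vcpu_usage` are hypothetical names, and the second sample in the usage part is made up. It assumes both samples come from the same QEMU process, so that `process_uptime_ticks` only grows.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

NS_PER_SECOND = 1_000_000_000


@dataclass(frozen=True)
class CpuSample:
    """One reading of the 'cpustat' map (hypothetical client-side container)."""
    utime: int                  # ticks, from /proc/$pid/stat
    stime: int                  # ticks, from /proc/$pid/stat
    guest_time: int             # ticks, from /proc/$pid/stat
    process_uptime_ticks: int   # ticks since the process started
    user_hz: int                # ticks per second
    vcpus: Dict[str, Tuple[int, int]]  # "vcpuN" -> (cpu_ns, runqueue_ns)


def _elapsed_ticks(prev: CpuSample, cur: CpuSample) -> int:
    elapsed = cur.process_uptime_ticks - prev.process_uptime_ticks
    if elapsed <= 0:
        raise ValueError("samples must be taken in order from the same process")
    return elapsed


def physical_usage(prev: CpuSample, cur: CpuSample) -> float:
    """CPU time used by the whole process, as a fraction of one host core.

    utime, stime and the elapsed time are all in ticks, so user_hz cancels.
    The result can exceed 1.0 when several threads run in parallel.
    """
    busy = (cur.utime - prev.utime) + (cur.stime - prev.stime)
    return busy / _elapsed_ticks(prev, cur)


def vcpu_usage(prev: CpuSample, cur: CpuSample) -> Dict[str, Dict[str, float]]:
    """Per-vCPU run and steal fractions from the schedstat-based counters."""
    # Convert the tick-based interval to nanoseconds, as the commit message suggests.
    elapsed_ns = _elapsed_ticks(prev, cur) * NS_PER_SECOND // cur.user_hz
    result = {}
    for name, (cpu_ns, runqueue_ns) in cur.vcpus.items():
        if name not in prev.vcpus:
            continue  # vCPU thread appeared between samples (e.g. hotplug)
        prev_cpu_ns, prev_runqueue_ns = prev.vcpus[name]
        result[name] = {
            "run": (cpu_ns - prev_cpu_ns) / elapsed_ns,
            "steal": (runqueue_ns - prev_runqueue_ns) / elapsed_ns,
        }
    return result


if __name__ == "__main__":
    # First sample: the values from the InfluxDB lines quoted above.
    first = CpuSample(1004, 333, 846, 5234, 100,
                      {"vcpu0": (10916152530, 29127241),
                       "vcpu1": (1341783516, 6114069)})
    # Second sample: invented values, ten seconds (1000 ticks) later.
    second = CpuSample(1404, 433, 1200, 6234, 100,
                       {"vcpu0": (14916152530, 529127241),
                        "vcpu1": (1341783516, 6114069)})
    print(physical_usage(first, second))
    print(vcpu_usage(first, second))
```

Since the uptime is read before the schedstat files, the per-vCPU fractions are slightly off, as the commit message already points out. Adding "run" and "steal" gives the total usage as seen from the guest.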