Message-ID: <753fc51d-a262-c3a2-33da-2018d62ab312@proxmox.com>
Date: Tue, 18 Jul 2023 15:48:17 +0200
From: Philipp Hufnagl
To: Thomas Lamprecht
Cc: Proxmox VE development discussion, Maximiliano Sandoval
Subject: Re: [pve-devel] [PATCH pve-container] pct: fix cpu load calculation on command line
References: <20230718115828.170254-1-p.hufnagl@proxmox.com> <2d4ca785-09c3-0fc3-0658-e7f29f8579b9@proxmox.com>
In-Reply-To: <2d4ca785-09c3-0fc3-0658-e7f29f8579b9@proxmox.com>

Hello

On 7/18/23 15:02, Thomas Lamprecht wrote:
> On 18/07/2023 at 14:00, Philipp Hufnagl wrote:
>> Sorry, forgot to tag as v2
> and also forgot to document the patch changelog like asked yesterday..

Sorry, I did not know that. I will add a changelog.

>
>> On 7/18/23 13:58, Philipp Hufnagl wrote:
>>>   When called from the command line, it was not possible to calculate
>>>   cpu load because there was no 2nd data point available for the
>>>   calculation. Now (when called) from the command line, cpu stats will
>>>   be fetched twice with a minimum delta of 20ms. That way the load can
>>>   be calculated
> @Maximiliano, didn't we decide to just drop it instead? This isn't really
> useful, one can get much better data from the pressure stall information
> (PSI) which is tracked per cgroup and tells a user much more than a 20 ms
> sample interval..
>
> https://docs.kernel.org/accounting/psi.html#cgroup2-interface
>
> Still a few comments inline.

Shall I wait with a v3 until a decision is made?

> ust drop this CPU load stuff in pct status I'd rather do one of four
> options:
>
> 1) rename this to prime_vmstatus_cpu_sampling and just do it for a single vmid,
> then call this new method in PVE::CLI::pct->status and do the sleep there, as
> that's actually the one call site that cares about it, the existing vmstatus
> method then just needs one change:
>
> - if ($delta_ms > 1000.0) {
> + if ($delta_ms > 1000.0 || $old->{cpu} == 0) {
>
> 2) The same as 1) but instead of adding the prime_vmstatus_cpu_sampling helper
> just call vmstatus twice with sleeping in-between (and the same change to the if
> condition as for 1).
>
> 3) get the data where it's already available, i.e., pvestatd, might need more
> rework though
>
> 4) switch over to reporting the PSI from /sys/fs/cgroup/lxc/VMID/cpu.pressure
> this is pretty simple as in PSI ~ 0 -> no overload 0 >> PSI > 1 -> some overload
> and PSI >> 1 a lot of overload.
>
> Option 4 sounds niceish, but needs more work and has not that high of a benefit
> (users can already query this easily themselves), option 1 or 2 would be OK-ish,
> but IMO not ideal, as we'd use a 20ms avg here compared to a >> 1s average elsewhere,
> which can be confusing as it can be quite, well, spiky. Option 3 would be better here
> but as mentioned also more rework and possibly a more intrusive one, so IMO just
> dropping it sounds almost the nicest and def. most simple one.
>

I think we should do Idea 1 as a solution until we finish a deeper rework.
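
To make the discussion around Idea 1 concrete, here is a minimal sketch of
the two-sample approach: take a priming sample, sleep briefly, take a second
sample, and derive the load from the difference. It is written in Python
purely for illustration (pve-container itself is Perl). The names CpuSample,
cpu_load and sample_twice are hypothetical, the 20 ms default pause is the
minimum delta from the patch description, and read_used_ms is a stand-in for
however the cumulative CPU time of the container is actually obtained.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CpuSample:
    """Cumulative CPU time of one container at one point in time (hypothetical type)."""
    wall_ms: float  # wall-clock timestamp in milliseconds
    used_ms: float  # cumulative CPU time consumed so far, in milliseconds


def cpu_load(old: CpuSample, new: CpuSample, cpus: int = 1) -> float:
    """Fraction of the available CPU time that was used between two samples.

    Returns 0.0 if no time has passed between the samples, so the
    division below can never fail.
    """
    delta_ms = new.wall_ms - old.wall_ms
    if delta_ms <= 0 or cpus <= 0:
        return 0.0
    used_ms = max(new.used_ms - old.used_ms, 0.0)
    return used_ms / (delta_ms * cpus)


def sample_twice(
    read_used_ms: Callable[[], float],
    pause_s: float = 0.02,  # 20 ms, the minimum delta mentioned in the patch
    cpus: int = 1,
) -> float:
    """Prime with a first sample, sleep, sample again, and compute the load."""
    first = CpuSample(time.monotonic() * 1000.0, read_used_ms())
    time.sleep(pause_s)
    second = CpuSample(time.monotonic() * 1000.0, read_used_ms())
    return cpu_load(first, second, cpus)
```

With such a short window a single burst of activity dominates the result,
which is the spikiness concern raised above when comparing against an average
taken over more than one second.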