From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <p.hufnagl@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id D0FEFE5AA
 for <pve-devel@lists.proxmox.com>; Tue, 18 Jul 2023 15:48:50 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id BA8291BD35
 for <pve-devel@lists.proxmox.com>; Tue, 18 Jul 2023 15:48:20 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Tue, 18 Jul 2023 15:48:19 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id B5CB242D02
 for <pve-devel@lists.proxmox.com>; Tue, 18 Jul 2023 15:48:18 +0200 (CEST)
Message-ID: <753fc51d-a262-c3a2-33da-2018d62ab312@proxmox.com>
Date: Tue, 18 Jul 2023 15:48:17 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.12.0
Content-Language: en-US
To: Thomas Lamprecht <t.lamprecht@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Maximiliano Sandoval <m.sandoval@proxmox.com>
References: <20230718115828.170254-1-p.hufnagl@proxmox.com>
 <edc36f4f-f830-fba6-696e-0db56c81d318@proxmox.com>
 <2d4ca785-09c3-0fc3-0658-e7f29f8579b9@proxmox.com>
From: Philipp Hufnagl <p.hufnagl@proxmox.com>
In-Reply-To: <2d4ca785-09c3-0fc3-0658-e7f29f8579b9@proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.043 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.097 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: Re: [pve-devel] [PATCH pve-container] pct: fix cpu load calculation
 on command line
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 18 Jul 2023 13:48:50 -0000

Hello

On 7/18/23 15:02, Thomas Lamprecht wrote:
> Am 18/07/2023 um 14:00 schrieb Philipp Hufnagl:
>> Sorry, forgot to tag as v2.
> and also forgot to document the patch changelog, as asked yesterday..
Sorry, I did not know that. I will add a changelog.
>
>> On 7/18/23 13:58, Philipp Hufnagl wrote:
>>>    When called from the command line, it was not possible to calculate
>>>    the CPU load because no second data point was available for the
>>>    calculation. Now, when called from the command line, CPU stats will
>>>    be fetched twice with a minimum delta of 20 ms. That way the load
>>>    can be calculated.
> @Maximiliano, didn't we decide to just drop it instead? This isn't really
> useful; one can get much better data from the pressure stall information
> (PSI), which is tracked per cgroup and tells a user much more than a 20 ms
> sample interval does..
>
> https://docs.kernel.org/accounting/psi.html#cgroup2-interface
>
> Still a few comments inline.

Shall I wait with a v3 until a decision is made?

> If we don't just drop this CPU load stuff in pct status, I'd rather do one of
> four options:
>
> 1) rename this to prime_vmstatus_cpu_sampling and just do it for a single vmid,
> then call this new method in PVE::CLI::pct->status and do the sleep there, as
> that's actually the one call site that cares about it; the existing vmstatus
> method then just needs one change:
>
> -           if ($delta_ms > 1000.0) {
> +           if ($delta_ms > 1000.0 || $old->{cpu} == 0) {
>
> 2) The same as 1), but instead of adding the prime_vmstatus_cpu_sampling helper,
> just call vmstatus twice with sleeping in-between (and the same change to the if
> condition as for 1).
>
> 3) get the data where it's already available, i.e., from pvestatd; this might
> need more rework though
>
> 4) switch over to reporting the PSI from /sys/fs/cgroup/lxc/VMID/cpu.pressure.
> This is pretty simple, as PSI ~ 0 means no overload, 0 << PSI < 1 means some
> overload, and PSI >> 1 means a lot of overload.
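
As an aside, the cgroup-v2 cpu.pressure format mentioned in option 4 is easy to
parse. A minimal Python sketch (the parse_pressure helper and the sample string
are illustrative, not actual PVE code):

```python
def parse_pressure(text):
    """Parse cgroup-v2 PSI output such as:

        some avg10=0.12 avg60=0.05 avg300=0.01 total=123456
        full avg10=0.00 avg60=0.00 avg300=0.00 total=0

    Returns a dict like {"some": {"avg10": 0.12, ...}, "full": {...}}.
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result

# Example input mimicking a cpu.pressure read:
sample = ("some avg10=0.12 avg60=0.05 avg300=0.01 total=123456\n"
          "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n")
print(parse_pressure(sample)["some"]["avg10"])  # -> 0.12
```

In a real consumer one would read the text from the cgroup file and look at the
avg10/avg60/avg300 fields to judge overload per the thresholds above.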
>
> Option 4 sounds nice-ish, but needs more work and has not that high of a benefit
> (users can already query this easily themselves). Option 1 or 2 would be OK-ish,
> but IMO not ideal, as we'd use a 20 ms average here compared to a >1 s average
> elsewhere, which can be confusing as it can be quite, well, spiky. Option 3 would
> be better here, but as mentioned it also means more rework and is possibly more
> intrusive, so IMO just dropping it sounds almost the nicest and definitely the
> simplest option.
>
I think we should do option 1 as a solution until we finish a deeper rework.
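
For reference, the two-sample approach behind options 1 and 2 boils down to the
following. A hypothetical Python sketch, independent of the actual Perl vmstatus
code (read_usage_ns and the fake counter are illustrative):

```python
import time

def cpu_load(read_usage_ns, min_delta_ms=20):
    """Estimate CPU load from two cumulative-usage samples taken at least
    min_delta_ms apart.

    read_usage_ns: callable returning cumulative CPU time in nanoseconds
    (e.g. derived from the cgroup's cpu.stat usage counters).
    Returns the fraction of one CPU used over the sampled interval.
    """
    t1, u1 = time.monotonic(), read_usage_ns()
    time.sleep(min_delta_ms / 1000.0)  # the minimum delta discussed above
    t2, u2 = time.monotonic(), read_usage_ns()
    wall_ns = (t2 - t1) * 1e9
    return (u2 - u1) / wall_ns

# Example with a fake counter that reports 10 ms of CPU time used
# between the two samples:
samples = iter([0, 10_000_000])
load = cpu_load(lambda: next(samples))
print(f"estimated load: {load:.2f}")
```

This also illustrates the caveat raised above: a single 20 ms window is a very
spiky estimator compared to a >1 s average.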