From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <s.lendl@proxmox.com>
From: Stefan Lendl <s.lendl@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>, Roland <devzero@web.de>,
 Proxmox VE development discussion <pve-devel@lists.proxmox.com>
In-Reply-To: <a96fab26-8358-41ce-a142-740ba575b7a5@proxmox.com>
References: <20240125105658.1541023-2-s.lendl@proxmox.com>
 <c3d7bba9-73b0-46b0-ad72-94139afc0559@web.de>
 <a96fab26-8358-41ce-a142-740ba575b7a5@proxmox.com>
Date: Mon, 08 Apr 2024 14:04:40 +0200
Message-ID: <878r1o2h7r.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large
 number processing
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>

Thomas Lamprecht <t.lamprecht@proxmox.com> writes:

> Hi,
>
> Am 28/02/2024 um 23:47 schrieb Roland:
>> any reason why this did not get a response? (i do not see rejection of
>> this, nor did it appear in
>> https://git.proxmox.com/?p=ksm-control-daemon.git;a=summary )
>
> No reason, but even if this looks pretty straight forward, positive
> feedback would still help to speed this up – did you test this
> successfully? Then I could apply it with a Tested-by: name <email>
> trailer.
>
>>
>> and, while we are at ksmtuned, i think it is broken, especially when
>> run on ZFS based installations, as it's totally mis-calculating ram
>> resources.
>>
>> https://forum.proxmox.com/threads/ksm-is-needlessly-burning-cpu-because-of-using-vzs-and-ignoring-arcsize.142397/
>
> Yeah, KSM could definitely do with some more love, let's see if we can
> allocate some dev time for this.
>
> The RSS (which rsz is an alias for) vs. VSS (vsz aliased) looks
> interesting, and VSS really seems to be the wrong thing to look at to me
> (albeit without deeper inspection of the matter).
>
> FWIW, depending on how the sum is used it might actually make even more
> sense to use PSS, i.e., the proportional set size, which better accounts
> for shared memory by dividing that part between all its users. If, e.g.,
> 10 QEMU processes have 100 MB of shared code and whatnot in their RSS,
> using RSS one would sum up 900 MB too much compared to using PSS. Which
> one is correct here then depends on how the result is used.
>
> @Stefan, as you checked this out, would you care checking out the VSS
> vs. RSS vs. PSS matter too? I.e. checking what should make more sense to
> use and actually testing that out in a somewhat defined workload.
>
> The ZFS ARC thing is something else and might be a bit more complicated,
> so I'd focus first on the above, as that seems to provide better
> improvements for less work, or at least with less potential to build an
> unstable control system.
>
> thanks,
>  Thomas

I agree that when summing over processes it would make sense to use PSS.
Unfortunately, ps does not report the PSS.
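
For reference, here is a rough sketch of how PSS could be summed without ps.
It assumes a kernel that provides /proc/<pid>/smaps_rollup and a caller with
permission to read it for the target processes. The helper names
(parse_rollup, sum_memory_kb, rss_overcount_mb) are made up for illustration
and are not part of ksmtuned.

```python
import os
import re
from pathlib import Path

# Matches the "Rss:" and "Pss:" lines only, not "Pss_Anon:" and friends.
_FIELD = re.compile(r"^(Rss|Pss):\s+(\d+)\s+kB", re.MULTILINE)


def parse_rollup(text):
    """Return {'Rss': kB, 'Pss': kB} parsed from smaps_rollup content."""
    return {name: int(value) for name, value in _FIELD.findall(text)}


def sum_memory_kb(pids, proc_root="/proc"):
    """Sum Rss and Pss (in kB) over the given PIDs.

    Processes that vanished or cannot be read are skipped.
    """
    totals = {"Rss": 0, "Pss": 0}
    for pid in pids:
        try:
            text = Path(proc_root, str(pid), "smaps_rollup").read_text()
        except OSError:
            continue
        for name, value in parse_rollup(text).items():
            totals[name] += value
    return totals


def rss_overcount_mb(n_procs, shared_mb):
    """How much a plain RSS sum overcounts memory shared by n_procs.

    RSS counts the shared part once per process, PSS counts it once in total.
    """
    return n_procs * shared_mb - shared_mb


if __name__ == "__main__":
    # Thomas' example: 10 QEMU processes sharing 100 MB.
    print(rss_overcount_mb(10, 100))
    # Totals for the current process only, as a smoke test.
    print(sum_memory_kb([os.getpid()]))
```

Whether reading smaps_rollup for every QEMU process on each ksmtuned
iteration is cheap enough would still need to be measured.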

Using VSZ was introduced in cd5cf20 without further explanation.
Upstream uses RSS, so I thought it would be safe to use RSS here as
well, as it gives a more accurate sum than VSZ.

I will send a new version that includes reverting to RSS.
As for the awk printf change, I think this should work as is.
We applied the patch at an enterprise customer and no problems have been
reported since.

Best regards,
Stefan