public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
@ 2024-01-25 10:56 Stefan Lendl
       [not found] ` <c3d7bba9-73b0-46b0-ad72-94139afc0559@web.de>
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Stefan Lendl @ 2024-01-25 10:56 UTC (permalink / raw)
  To: pve-devel

awk internally uses floating point for every calculation. Printing a large
float with awk's print results in exponent notation such as 1.233e+09, which
causes the script to fail afterwards.
Instead, I am printing the float with printf "%.0f", i.e. without decimals.

Signed-off-by: Stefan Lendl <s.lendl@proxmox.com>
---
 debian/patches/awk-printf.diff | 16 ++++++++++++++++
 debian/patches/series          |  1 +
 2 files changed, 17 insertions(+)
 create mode 100644 debian/patches/awk-printf.diff

diff --git a/debian/patches/awk-printf.diff b/debian/patches/awk-printf.diff
new file mode 100644
index 0000000..11a957f
--- /dev/null
+++ b/debian/patches/awk-printf.diff
@@ -0,0 +1,16 @@
+--- ksm-control-scripts/ksmtuned	2024-01-25 11:33:03.485039813 +0100
++++ ksm-control-scripts.new/ksmtuned	2024-01-25 11:37:40.544751316 +0100
+@@ -72,11 +72,11 @@
+     # calculate how much memory is committed to running qemu processes
+     local progname
+     progname=${1:-kvm}
+-    ps -C "$progname" -o vsz= | awk '{ sum += $1 }; END { print sum }'
++    ps -C "$progname" -o vsz= | awk '{ sum += $1 }; END { printf ("%.0f", sum) }'
+ }
+ 
+ free_memory () {
+-    awk '/^(MemFree|Buffers|Cached):/ {free += $2}; END {print free}' \
++    awk '/^(MemFree|Buffers|Cached):/ {free += $2}; END { printf ("%.0f", free) }' \
+                 /proc/meminfo
+ }
+ 
diff --git a/debian/patches/series b/debian/patches/series
index 7aaec2c..63aba40 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -2,3 +2,4 @@ init-script.diff
 ksmtuned.diff
 adjust-ksm-slepp.diff
 use-vsz-instead-of-rsz.diff
+awk-printf.diff
-- 
2.43.0






* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
       [not found] ` <c3d7bba9-73b0-46b0-ad72-94139afc0559@web.de>
@ 2024-02-29  7:52   ` Thomas Lamprecht
  2024-04-08 12:04     ` Stefan Lendl
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Lamprecht @ 2024-02-29  7:52 UTC (permalink / raw)
  To: Roland, Proxmox VE development discussion, Stefan Lendl

Hi,

Am 28/02/2024 um 23:47 schrieb Roland:
> any reason why this did not get a response? (I do not see a rejection
> of this, nor did it appear in
> https://git.proxmox.com/?p=ksm-control-daemon.git;a=summary )

No reason, but even if this looks pretty straightforward, positive
feedback would still help to speed this up – did you test this
successfully? Then I could apply it with a Tested-by: name <email>
trailer.

> 
> and, while we are at ksmtuned, I think it is broken, especially when
> run on ZFS-based installations, as it totally miscalculates RAM
> resources.
> 
> https://forum.proxmox.com/threads/ksm-is-needlessly-burning-cpu-because-of-using-vzs-and-ignoring-arcsize.142397/

Yeah, KSM could definitely do with some more love; let's see if we can
allocate some dev time for this.

The RSS (which rsz is an alias for) vs. VSS (which vsz is an alias for)
question looks interesting, and VSS really seems to be the wrong thing to
look at to me (albeit without deeper inspection of the matter).

FWIW, depending on how the sum is used it might actually make even more
sense to use PSS, i.e., the proportional set size, which better accounts
for shared memory by dividing that part between all its users. If, e.g.,
10 QEMU processes have 100 MB of shared code and whatnot in their RSS,
summing RSS would count 900 MB too much compared to using PSS. Which one
is correct here then depends on how the result is used.
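
The arithmetic behind this example can be sketched as follows. The numbers
are the illustrative ones from the paragraph above, not measurements, and
the variable names are made up for this sketch:

```shell
# Illustrative numbers only: 10 processes sharing 100 MB.
procs=10
shared_mb=100

# RSS counts the shared pages in full for every process.
rss_sum=$((procs * shared_mb))

# PSS splits the shared pages evenly between the processes mapping them.
pss_sum=$((procs * (shared_mb / procs)))

echo "RSS sum: ${rss_sum} MB, PSS sum: ${pss_sum} MB, difference: $((rss_sum - pss_sum)) MB"
```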

@Stefan, as you checked this out, would you mind checking out the VSS
vs. RSS vs. PSS matter too? I.e., checking which one makes more sense to
use and actually testing that with a somewhat defined workload.

The ZFS ARC thing is something else and might be a bit more complicated,
so I'd focus on the above first, as that seems to provide better
improvements for less work, or at least with less potential to build an
unstable control system.

thanks,
 Thomas





* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
       [not found] ` <6104c230-7e2e-43ae-8598-6c458b979ae1@web.de>
@ 2024-02-29 15:12   ` Stefan Lendl
  2024-02-29 15:16   ` Stefan Lendl
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Lendl @ 2024-02-29 15:12 UTC (permalink / raw)
  To: Roland; +Cc: Proxmox VE development discussion

Roland <devzero@web.de> writes:

> oh, and shouldn't we also add that to the total and free_memory
> calculations, even if the chances of hitting the limit there are lower?
>
> total=`awk '/^MemTotal:/ {print $2}' /proc/meminfo`
>
> free_memory () {
>      awk '/^(MemFree|Buffers|Cached):/ {free += $2}; END {print free}' \
>                  /proc/meminfo
> }
>
> Am 25.01.24 um 11:56 schrieb Stefan Lendl:
>> diff --git a/debian/patches/awk-printf.diff b/debian/patches/awk-printf.diff
>> new file mode 100644
>> index 0000000..11a957f
>> --- /dev/null
>> +++ b/debian/patches/awk-printf.diff
>> @@ -0,0 +1,16 @@
>> +--- ksm-control-scripts/ksmtuned	2024-01-25 11:33:03.485039813 +0100
>> ++++ ksm-control-scripts.new/ksmtuned	2024-01-25 11:37:40.544751316 +0100
>> +@@ -72,11 +72,11 @@
>> +     # calculate how much memory is committed to running qemu processes
>> +     local progname
>> +     progname=${1:-kvm}
>> +-    ps -C "$progname" -o vsz= | awk '{ sum += $1 }; END { print sum }'
>> ++    ps -C "$progname" -o vsz= | awk '{ sum += $1 }; END { printf ("%.0f", sum) }'
>> + }
>> +
>> + free_memory () {
>> +-    awk '/^(MemFree|Buffers|Cached):/ {free += $2}; END {print free}' \
>> ++    awk '/^(MemFree|Buffers|Cached):/ {free += $2}; END { printf ("%.0f", free) }' \
>> +                 /proc/meminfo
>> + }
>> +

Hi Roland, as you can see in the patch, I am also adding this to the
free_memory function.

The patches are applied during the build process, hence the actual
source file still looks unchanged if you're looking at it in the repo.
If you install the package, the updated files will be placed at
/usr/sbin/ksmtuned where you can inspect the result.





* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
       [not found] ` <6104c230-7e2e-43ae-8598-6c458b979ae1@web.de>
  2024-02-29 15:12   ` Stefan Lendl
@ 2024-02-29 15:16   ` Stefan Lendl
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Lendl @ 2024-02-29 15:16 UTC (permalink / raw)
  To: Roland; +Cc: Proxmox VE development discussion

Roland <devzero@web.de> writes:

> oh, and shouldn't we also add that to the total and free_memory
> calculations, even if the chances of hitting the limit there are lower?
>
> total=`awk '/^MemTotal:/ {print $2}' /proc/meminfo`

total does not require the printf fix because it does not do any
calculation.
The "print $2" operates at the string level and prints the 2nd ($2) field
of the line after splitting it on whitespace.
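
A small sketch of that difference, using a made-up sample line instead of
real /proc/meminfo content (the variable names are only for illustration):
the field is passed through as text, while a computed sum is a number and
is printed in full by the "%.0f" format from the patch.

```shell
# Hypothetical sample line; real values come from /proc/meminfo.
line='MemTotal:       5000050000 kB'

# No arithmetic: $2 is printed as the text that was read.
total=$(printf '%s\n' "$line" | awk '/^MemTotal:/ {print $2}')

# Arithmetic turns the value into a number; %.0f prints it without
# exponent or decimals.
summed=$(printf '%s\n' "$line" | awk '{ sum += $2 }; END { printf ("%.0f", sum) }')

echo "total=$total summed=$summed"
```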





* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
       [not found] ` <813667cc-81b6-46db-b144-54ee4cc578f6@web.de>
@ 2024-02-29 15:23   ` Stefan Lendl
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Lendl @ 2024-02-29 15:23 UTC (permalink / raw)
  To: Roland; +Cc: Proxmox VE development discussion, Thomas Lamprecht

Roland <devzero@web.de> writes:

> Hi Stefan,
>
> looks good to me so far, and indeed, on very large systems where VMs
> eat up >2TB this could hit the limit very soon.
>
> but shouldn't we add a newline, as the original "print sum" prints one?
>
> root@s740:/usr/sbin# seq 1 100000 | awk '{ sum += $1 }; END { print sum }'
> 5.00005e+09
> root@s740:/usr/sbin# seq 1 100000 | awk '{ sum += $1 }; END { printf ("%.0f", sum) }'
> 5000050000root@s740:/usr/sbin#
>
> # seq 1 100000 | awk '{ sum += $1 }; END { print sum }'|xxd
> 00000000: 352e 3030 3030 3565 2b30 390a            5.00005e+09.
>
> # seq 1 100000 | awk '{ sum += $1 }; END { printf ("%.0f", sum) }' |xxd
> 00000000: 3530 3030 3035 3030 3030                 5000050000
>
> # seq 1 100000 | awk '{ sum += $1 }; END { printf ("%.0f\n", sum) }' |xxd
> 00000000: 3530 3030 3035 3030 3030 0a              5000050000.
>

I see that the output differs in that respect.

In the script, the result is always assigned to a variable, which should
not care about a newline; in my opinion it is even better without the
newline.
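
As a sketch of why the assignment is unaffected: command substitution
removes trailing newlines, so both printf variants leave the same value
in the variable. The variable names here are only for illustration.

```shell
# Same sum as in the quoted test, captured with and without a trailing newline.
with_nl=$(seq 1 100000 | awk '{ sum += $1 }; END { printf ("%.0f\n", sum) }')
without_nl=$(seq 1 100000 | awk '{ sum += $1 }; END { printf ("%.0f", sum) }')

# $(...) strips trailing newlines, so the two values compare equal.
[ "$with_nl" = "$without_nl" ] && echo "identical: $with_nl"
```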





* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
  2024-02-29  7:52   ` Thomas Lamprecht
@ 2024-04-08 12:04     ` Stefan Lendl
  2024-04-08 12:22       ` Thomas Lamprecht
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Lendl @ 2024-04-08 12:04 UTC (permalink / raw)
  To: Thomas Lamprecht, Roland, Proxmox VE development discussion

Thomas Lamprecht <t.lamprecht@proxmox.com> writes:

> Hi,
>
> Am 28/02/2024 um 23:47 schrieb Roland:
>> any reason why this did not get a response? (I do not see a rejection
>> of this, nor did it appear in
>> https://git.proxmox.com/?p=ksm-control-daemon.git;a=summary )
>
> No reason, but even if this looks pretty straightforward, positive
> feedback would still help to speed this up – did you test this
> successfully? Then I could apply it with a Tested-by: name <email>
> trailer.
>
>> 
>> and, while we are at ksmtuned, I think it is broken, especially when
>> run on ZFS-based installations, as it totally miscalculates RAM
>> resources.
>> 
>> https://forum.proxmox.com/threads/ksm-is-needlessly-burning-cpu-because-of-using-vzs-and-ignoring-arcsize.142397/
>
> Yeah, KSM could definitely do with some more love; let's see if we can
> allocate some dev time for this.
>
> The RSS (which rsz is an alias for) vs. VSS (which vsz is an alias for)
> question looks interesting, and VSS really seems to be the wrong thing to
> look at to me (albeit without deeper inspection of the matter).
>
> FWIW, depending on how the sum is used it might actually make even more
> sense to use PSS, i.e., the proportional set size, which better accounts
> for shared memory by dividing that part between all its users. If, e.g.,
> 10 QEMU processes have 100 MB of shared code and whatnot in their RSS,
> summing RSS would count 900 MB too much compared to using PSS. Which one
> is correct here then depends on how the result is used.
>
> @Stefan, as you checked this out, would you mind checking out the VSS
> vs. RSS vs. PSS matter too? I.e., checking which one makes more sense to
> use and actually testing that with a somewhat defined workload.
>
> The ZFS ARC thing is something else and might be a bit more complicated,
> so I'd focus on the above first, as that seems to provide better
> improvements for less work, or at least with less potential to build an
> unstable control system.
>
> thanks,
>  Thomas

I agree that when summing over processes it would make sense to use PSS.
Unfortunately, ps does not report the PSS.

Using VSZ was introduced in cd5cf20 without further explanation.
Upstream uses RSS, so I thought it would be safe to use RSS as
well, as it gives a more accurate sum than VSZ.

I will send a new version that includes reverting to RSS.
From the awk printf perspective, I think this should work as is.
We applied the patch for an enterprise customer and no problems were
reported since.

Best regards,
Stefan






* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
  2024-04-08 12:04     ` Stefan Lendl
@ 2024-04-08 12:22       ` Thomas Lamprecht
  2024-04-08 13:02         ` Stefan Lendl
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Lamprecht @ 2024-04-08 12:22 UTC (permalink / raw)
  To: Stefan Lendl, Roland, Proxmox VE development discussion

Am 08/04/2024 um 14:04 schrieb Stefan Lendl:
> I agree that when summing over processes it would make sense to use PSS.
> Unfortunately, ps does not report the PSS.

The `ps` from the Debian Bookworm version of the `procps` package does report
it here if I use something like `ps -C kvm -o pss` though, FWICT this should
be available here?

Can you please re-check this?







* Re: [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing
  2024-04-08 12:22       ` Thomas Lamprecht
@ 2024-04-08 13:02         ` Stefan Lendl
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Lendl @ 2024-04-08 13:02 UTC (permalink / raw)
  To: Thomas Lamprecht, Roland, Proxmox VE development discussion

Thomas Lamprecht <t.lamprecht@proxmox.com> writes:

> Am 08/04/2024 um 14:04 schrieb Stefan Lendl:
>> I agree that when summing over processes it would make sense to use PSS.
>> Unfortunately, ps does not report the PSS.
>
> The `ps` from the Debian Bookworm version of the `procps` package does report
> it here if I use something like `ps -C kvm -o pss` though, FWICT this should
> be available here?
>
> Can you please re-check this?

Ok, yes. ps always reported 0 for me when running ps as a regular user,
while other values are reported.
It works as root. I will send a new series then.
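
A hedged sketch of what summing PSS could look like, reusing the printf
form from the patch. The helper names sum_column and committed_pss are
hypothetical and not part of ksmtuned. The "-o pss=" column assumes a
procps version that knows pss, such as the Debian Bookworm one mentioned
above, and going by the observation above it has to run as root to report
non-zero values.

```shell
# Sum the first column of stdin and print the result without exponent
# notation or decimals (same awk program as in the patch).
sum_column () {
    awk '{ sum += $1 }; END { printf ("%.0f", sum) }'
}

# Hypothetical counterpart to the vsz-based function in ksmtuned:
# sum the PSS of all processes with the given name (default: kvm).
committed_pss () {
    local progname
    progname=${1:-kvm}
    ps -C "$progname" -o pss= | sum_column
}
```

Calling committed_pss as root would then be the PSS-based replacement for
the vsz sum; whether that is the right metric for ksmtuned's thresholds
is still the open question discussed above.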





end of thread, other threads:[~2024-04-08 13:02 UTC | newest]

Thread overview: 8+ messages
2024-01-25 10:56 [pve-devel] [PATCH ksm-control-daemon] ksmtuned: fix large number processing Stefan Lendl
     [not found] ` <c3d7bba9-73b0-46b0-ad72-94139afc0559@web.de>
2024-02-29  7:52   ` Thomas Lamprecht
2024-04-08 12:04     ` Stefan Lendl
2024-04-08 12:22       ` Thomas Lamprecht
2024-04-08 13:02         ` Stefan Lendl
     [not found] ` <6104c230-7e2e-43ae-8598-6c458b979ae1@web.de>
2024-02-29 15:12   ` Stefan Lendl
2024-02-29 15:16   ` Stefan Lendl
     [not found] ` <813667cc-81b6-46db-b144-54ee4cc578f6@web.de>
2024-02-29 15:23   ` Stefan Lendl

Service provided by Proxmox Server Solutions GmbH