From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>,
"Aaron Lauterer" <a.lauterer@proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pve-devel] [RFC qemu-server] fix #6935: vmstatus: fallback to RSS in case of KSM usage
Date: Tue, 25 Nov 2025 19:17:08 +0100
Message-ID: <703c1a72-7c08-4a80-90b0-01087cd0b946@proxmox.com>
In-Reply-To: <cd27bb7b-0bfd-4503-af65-aafa6434af8b@proxmox.com>
On 25.11.25 at 18:21, Aaron Lauterer wrote:
> On 2025-11-25 16:20, Thomas Lamprecht wrote:
>> On 25.11.25 at 15:20, Fabian Grünbichler wrote:
>>> On November 25, 2025 3:08 pm, Thomas Lamprecht wrote:
>>>> Just to be sure: The stats from memory.current or memory.stat inside the
>>>> /sys/fs/cgroup/qemu.slice/${vmid}.scope/ directory is definitively not
>>>> enough for our usecases?
>>>
>>> well, if we go for RSS they might be, for PSS they are not, since that
>>> doesn't exist there?
>>
>> Would need to take a closer look to tell for sure, but from a quick check
>> it indeed seems to not be there.
>>
>>> having the live view and the metrics use different semantics seems kinda
>>> confusing tbh..
>>
>> more than jumping between metrics over time silently? ;-) The live view can
>> be easily annotated with a different label or the like if the source is
>> another, not so easy for metrics.
>>
>> The more I think about this the more I'm in favor of just deprecating this
>> again completely, this page table walking can even cause some latency spikes
>> in the target process, IMO just not worth it. If the kernel can give us this
>> free, or at least much cheaper, in the future, then great, but until then it's
>> not really an option. If anything, we can make this opt-in. The best granularity here
>> probably would be through guest config, but for starters a cluster-wide
>> datacenter option could be already enough for the setups that are fine with
>> this performance trade-off in general.
>
>
> If I may add my 2 cents here. How much do we lose by switching completely to fetching RSS (or the cgroupv2 equivalent)? For the metrics and live view.
> AFAIU the resulting memory accounting will be a bit higher, as shared libraries will be fully accounted for for each cgroup and not proportionally as with PSS.
With RSS, shared memory is accounted for more than once if a user checks each
VM and sums them up, i.e., the total can even come out higher than the
installed memory. IME this confuses people more than effects over time do, but
no hard feelings here as long as it's made clear that the value now ignores
whether any memory is shared between multiple processes/VMs.
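To make the double counting concrete, here is a toy sketch with invented
numbers (nothing here is read from a real host, and the helper names are
hypothetical): ten VM processes, each with some private memory plus one
mapping shared by all of them. RSS counts the shared part in full for every
process, while PSS charges each process only its proportional share.

```python
# Toy model, sizes in MiB. All numbers are made up for illustration.

def rss(private_mib, shared_mib):
    # RSS counts every resident page of the process, shared or not.
    return private_mib + shared_mib

def pss(private_mib, shared_mib, sharers):
    # PSS charges each process its proportional share of shared pages.
    return private_mib + shared_mib / sharers

N_VMS = 10
PRIVATE_MIB = 4096   # assumed private memory per VM process
SHARED_MIB = 500     # assumed memory shared by all VM processes

rss_sum = sum(rss(PRIVATE_MIB, SHARED_MIB) for _ in range(N_VMS))
pss_sum = sum(pss(PRIVATE_MIB, SHARED_MIB, N_VMS) for _ in range(N_VMS))

# What is physically in use: private memory of each VM plus ONE shared copy.
actual = N_VMS * PRIVATE_MIB + SHARED_MIB

print(f"sum of RSS: {rss_sum} MiB")
print(f"sum of PSS: {pss_sum} MiB")
print(f"actual:     {actual} MiB")
```

In this model the PSS values add up to what is actually in use, while the RSS
sum overshoots by the shared size times the number of additional sharers.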
> I am not sure if we want to introduce additional config options (global per DC, or per guest) to change the behavior. As that is probably even more confusing for not that much gain.
I have no problem with ripping this out; it was just a proposal for a cheap
way to keep this behavior for those who really want it.
> And its not like PSS doesn't come with its own set of weirdness. E.g., if we run 10 VMs, and stop all but one, the last will see an increase in memory consumption as it is the sole user of shared libraries.
Yes, that's the basic underlying principle and how accounting works in
reality. At any given time the PSS value is correct though, while with RSS
it's wrong at any given time.
For KSM you already have similar effects: from the POV of a VM the memory
usage can stay the same, but because the memory contents change, the KSM
sharing rate goes down and thus memory usage on the host goes up, even if all
VMs kept exactly the same amount of memory in use. Once you start to share
things, the usage stats will always stop being trivial.
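The KSM effect can be sketched with a similarly simplified model. This is an
assumption-laden toy, not how the kernel implements KSM: each page is
represented only by a label for its content, and pages with identical content
are assumed to be merged into a single host page. The function names are
hypothetical.

```python
# Toy model of KSM-style merging. A VM is a list of page contents;
# identical contents are assumed to be backed by one physical host page.

def guest_usage(vm_pages):
    # From the VM's point of view, every page it holds counts.
    return len(vm_pages)

def host_usage(vms):
    # On the host, identical pages across all VMs collapse into one.
    return len({page for vm in vms for page in vm})

# Two VMs with mostly identical memory contents.
before = [
    ["zero", "kernel", "libc", "app-a"],
    ["zero", "kernel", "libc", "app-b"],
]

# The second VM rewrites two pages: same amount of memory, new contents.
after = [
    ["zero", "kernel", "libc", "app-a"],
    ["zero", "kernel-v2", "libc-v2", "app-b"],
]

for name, vms in (("before", before), ("after", after)):
    per_vm = [guest_usage(vm) for vm in vms]
    print(f"{name}: per-VM pages {per_vm}, host pages {host_usage(vms)}")
```

Each VM holds four pages in both states, but the host goes from five backing
pages to seven because fewer pages can be merged.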