* [pve-devel] [PATCH pve-manager v2] api: ceph: improve reporting of ceph OSD memory usage
From: Stefan Hanreich @ 2023-08-16 15:21 UTC
To: pve-devel
Currently we are using the MemoryCurrent property of the OSD service
to determine the used memory of a Ceph OSD. This includes, among other
things, the memory used by buffers [1]. Since BlueFS uses buffered
I/O, this can lead to extremely high values shown in the UI.
Instead we are now reading the PSS value from the proc filesystem,
which should more accurately reflect the amount of memory currently
used by the Ceph OSD.
We decided on PSS over RSS, since this should give a better idea of
used memory - particularly when using a large number of OSDs on one
host, since the OSDs share some of the pages.
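
For illustration, the two values can be compared by hand for a given
OSD (a hypothetical OSD id 0 here):

    # cgroup-based value that systemd reports; includes buffered/cached pages
    systemctl show ceph-osd@0.service --property MemoryCurrent

    # PSS of the same process as reported by the kernel
    grep '^Pss:' /proc/$(systemctl show --property MainPID --value ceph-osd@0.service)/smaps_rollup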
[1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
---
Changes from v1:
* Now returns 0 instead of null in case of stopped OSDs in order to
preserve backwards compatibility
PVE/API2/Ceph/OSD.pm | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index ded359904..63a4a4146 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -687,13 +687,10 @@ __PACKAGE__->register_method ({
     my $raw = '';
 
     my $pid;
-    my $memory;
     my $parser = sub {
         my $line = shift;
         if ($line =~ m/^MainPID=([0-9]*)$/) {
             $pid = $1;
-        } elsif ($line =~ m/^MemoryCurrent=([0-9]*|\[not set\])$/) {
-            $memory = $1 eq "[not set]" ? 0 : $1;
         }
     };
 
@@ -702,12 +699,24 @@ __PACKAGE__->register_method ({
         'show',
         "ceph-osd\@${osdid}.service",
         '--property',
-        'MainPID,MemoryCurrent',
+        'MainPID',
     ];
     run_command($cmd, errmsg => 'fetching OSD PID and memory usage failed', outfunc => $parser);
 
     $pid = defined($pid) ? int($pid) : undef;
-    $memory = defined($memory) ? int($memory) : undef;
+
+    my $memory = 0;
+    if ($pid && $pid > 0) {
+        open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
+            or die 'Could not open smaps_rollup for Ceph OSD';
+
+        while (my $line = <$SMAPS>) {
+            if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
+                $memory = $1 * 1024;
+                last;
+            }
+        }
+    }
 
     my $data = {
         osd => {
--
2.39.2
* Re: [pve-devel] [PATCH pve-manager v2] api: ceph: improve reporting of ceph OSD memory usage
From: Thomas Lamprecht @ 2023-09-01 14:00 UTC
To: Proxmox VE development discussion, Stefan Hanreich
On 16/08/2023 at 17:21, Stefan Hanreich wrote:
> Currently we are using the MemoryCurrent property of the OSD service
> to determine the used memory of a Ceph OSD. This includes, among other
> things, the memory used by buffers [1]. Since BlueFS uses buffered
> I/O, this can lead to extremely high values shown in the UI.
>
> Instead we are now reading the PSS value from the proc filesystem,
> which should more accurately reflect the amount of memory currently
> used by the Ceph OSD.
>
> We decided on PSS over RSS, since this should give a better idea of
Who's "we"?
> used memory - particularly when using a large number of OSDs on one
> host, since the OSDs share some of the pages.
fine for me, I'd hint that in the UI too though, e.g., using
"Memory (PSS)" as label.
>
> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
>
> Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
> ---
>
> Changes from v1:
> * Now returns 0 instead of null in case of stopped OSDs in order to
> preserve backwards compatibility
>
>
> PVE/API2/Ceph/OSD.pm | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> + open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
> + or die 'Could not open smaps_rollup for Ceph OSD';
The die is missing a trailing newline, which will then result in
showing the user a rather ugly "died at line ... in .." suffix.
Please also include the actual error message in the die by using $! –
otherwise such failures, especially as they are probably rather rare,
are unnecessarily hard to debug.
I'd maybe reword it a bit too, "smaps_rollup" is probably a bit
odd to read for some users ^^
nit: we normally start error messages lower case – while we have no
hard style rule for that, so no hard feelings, I'm just mentioning it
as it stuck out to me.
So maybe something like:
or die "failed to read PSS memory-stat from process - $!\n";
Oh, and I would move that open + parse stuff to a private local
method, as it only crowds the API endpoint's code and might be better
off if moved to PVE::ProcFSTools or the like in the future (but we
don't use PSS anywhere else, so it can live in this module for now),
something like
my sub get_proc_pss_from_pid {
    my ($pid) = @_;
    # ...
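
Filled in along those lines, a minimal sketch of such a helper could
look like this (the error wording and variable names are only
illustrative, not a final implementation):

    # returns the PSS of the given process in bytes, or 0 if no Pss line was found
    my sub get_proc_pss_from_pid {
        my ($pid) = @_;

        open (my $smaps_fh, '<', "/proc/$pid/smaps_rollup")
            or die "failed to read PSS memory-stat from process - $!\n";

        my $pss = 0;
        while (my $line = <$smaps_fh>) {
            if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
                $pss = $1 * 1024; # kernel reports kB, the API returns bytes
                last;
            }
        }
        close($smaps_fh);

        return $pss;
    }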
^ permalink raw reply [flat|nested] 2+ messages in thread