* [pve-devel] [PATCH pve-manager] api: ceph: improve reporting of ceph OSD memory usage
From: Stefan Hanreich @ 2023-08-16 14:34 UTC
To: pve-devel
Currently we are using the MemoryCurrent property of the OSD service
to determine the used memory of a Ceph OSD. This includes, among other
things, the memory used by buffers [1]. Since BlueFS uses buffered
I/O, this can lead to extremely high values shown in the UI.

Instead we are now reading the PSS value from the proc filesystem,
which should more accurately reflect the amount of memory currently
used by the Ceph OSD.

We decided on PSS over RSS because it should give a better idea of the
memory actually used, particularly when running a large number of OSDs
on one host: the OSDs share some pages, which RSS would count in full
for every OSD.

[1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt

Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
---
PVE/API2/Ceph/OSD.pm | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index ded359904..5f7718b0a 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -687,13 +687,10 @@ __PACKAGE__->register_method ({
 
     my $raw = '';
     my $pid;
-    my $memory;
     my $parser = sub {
         my $line = shift;
         if ($line =~ m/^MainPID=([0-9]*)$/) {
             $pid = $1;
-        } elsif ($line =~ m/^MemoryCurrent=([0-9]*|\[not set\])$/) {
-            $memory = $1 eq "[not set]" ? 0 : $1;
         }
     };
 
@@ -702,12 +699,24 @@ __PACKAGE__->register_method ({
         'show',
         "ceph-osd\@${osdid}.service",
         '--property',
-        'MainPID,MemoryCurrent',
+        'MainPID',
     ];
     run_command($cmd, errmsg => 'fetching OSD PID and memory usage failed', outfunc => $parser);
 
     $pid = defined($pid) ? int($pid) : undef;
-    $memory = defined($memory) ? int($memory) : undef;
+
+    my $memory;
+    if ($pid && $pid > 0) {
+        open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
+            or die 'Could not open smaps_rollup for Ceph OSD';
+
+        while (my $line = <$SMAPS>) {
+            if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
+                $memory = $1 * 1024;
+                last;
+            }
+        }
+    }
 
     my $data = {
         osd => {
--
2.39.2
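
For readers who want to try the parsing step outside of pve-manager, the
following is a minimal standalone Perl sketch of the Pss extraction used in
the patch. The helper name parse_pss_bytes and the sample input are
hypothetical and only serve as illustration; the regular expression and the
kB-to-bytes conversion are taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): extract the Pss value from the
# text of /proc/<pid>/smaps_rollup and return it in bytes. Returns undef if
# the text contains no Pss line.
sub parse_pss_bytes {
    my ($smaps_text) = @_;

    for my $line (split /\n/, $smaps_text) {
        # The kernel reports the value in kB, e.g. "Pss:    1536 kB".
        # Fields such as "Pss_Anon:" do not match because of the colon.
        if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
            return $1 * 1024;
        }
    }
    return;
}

# Illustrative sample input; the numbers are made up.
my $sample = <<'EOF';
55d0c0a00000-7ffd6b5f2000 ---p 00000000 00:00 0                          [rollup]
Rss:                2048 kB
Pss:                1536 kB
Pss_Anon:           1024 kB
EOF

my $bytes = parse_pss_bytes($sample);
print defined($bytes) ? "$bytes\n" : "no Pss line\n";
```

Passing the file content as a string keeps the sketch usable without a
running OSD; the patch itself reads /proc/$pid/smaps_rollup line by line.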
* Re: [pve-devel] [PATCH pve-manager] api: ceph: improve reporting of ceph OSD memory usage
From: Aaron Lauterer @ 2023-08-16 14:59 UTC
To: Proxmox VE development discussion, Stefan Hanreich
one small nit: if the OSD is powered off, we now send null for the meminfo.
Maybe set it explicitly to 0 to not change the behavior?

but overall:

Tested-By: Aaron Lauterer <a.lauterer@proxmox.com>

On 8/16/23 16:34, Stefan Hanreich wrote:
> Currently we are using the MemoryCurrent property of the OSD service
> to determine the used memory of a Ceph OSD. This includes, among other
> things, the memory used by buffers [1]. Since BlueFS uses buffered
> I/O, this can lead to extremely high values shown in the UI.
>
> Instead we are now reading the PSS value from the proc filesystem,
> which should more accurately reflect the amount of memory currently
> used by the Ceph OSD.
>
> We decided on PSS over RSS because it should give a better idea of the
> memory actually used, particularly when running a large number of OSDs
> on one host: the OSDs share some pages, which RSS would count in full
> for every OSD.
>
> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
>
> Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
> ---
> PVE/API2/Ceph/OSD.pm | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
> index ded359904..5f7718b0a 100644
> --- a/PVE/API2/Ceph/OSD.pm
> +++ b/PVE/API2/Ceph/OSD.pm
> @@ -687,13 +687,10 @@ __PACKAGE__->register_method ({
>  
>      my $raw = '';
>      my $pid;
> -    my $memory;
>      my $parser = sub {
>          my $line = shift;
>          if ($line =~ m/^MainPID=([0-9]*)$/) {
>              $pid = $1;
> -        } elsif ($line =~ m/^MemoryCurrent=([0-9]*|\[not set\])$/) {
> -            $memory = $1 eq "[not set]" ? 0 : $1;

here we lose the explicit fallback to 0 when the value is not available

>          }
>      };
>  
> @@ -702,12 +699,24 @@ __PACKAGE__->register_method ({
>          'show',
>          "ceph-osd\@${osdid}.service",
>          '--property',
> -        'MainPID,MemoryCurrent',
> +        'MainPID',
>      ];
>      run_command($cmd, errmsg => 'fetching OSD PID and memory usage failed', outfunc => $parser);
>  
>      $pid = defined($pid) ? int($pid) : undef;
> -    $memory = defined($memory) ? int($memory) : undef;
> +
> +    my $memory;
> +    if ($pid && $pid > 0) {
> +        open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
> +            or die 'Could not open smaps_rollup for Ceph OSD';
> +
> +        while (my $line = <$SMAPS>) {
> +            if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
> +                $memory = $1 * 1024;
> +                last;
> +            }
> +        }
> +    }
>  
>      my $data = {
>          osd => {
* Re: [pve-devel] [PATCH pve-manager] api: ceph: improve reporting of ceph OSD memory usage
From: Stefan Hanreich @ 2023-08-16 15:11 UTC
To: Aaron Lauterer, Proxmox VE development discussion
On 8/16/23 16:59, Aaron Lauterer wrote:
> one small nit: if the OSD is powered off, we now send null for the
> meminfo. Maybe set it explicitly to 0 to not change the behavior?

ah yes, I even had it that way initially, but it seems it got removed
again when I tried to be smart and culled the line (which I had changed
earlier):

    $memory = defined($memory) ? int($memory) : 0;

I'll send a v2 shortly
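
The fallback discussed in this reply could look roughly like the following
standalone sketch. The v2 patch is not part of this thread, so this is an
assumption about its shape and not the actual follow-up; normalize_memory is
a hypothetical helper name, and only the ternary expression is taken from the
message.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from the v2 patch: turn a possibly undefined
# memory reading into an integer, reporting 0 when no value is available
# (for example because the OSD is stopped and has no PID).
sub normalize_memory {
    my ($memory) = @_;
    return defined($memory) ? int($memory) : 0;
}

# An OSD that is not running yields no reading, so 0 is reported.
print normalize_memory(undef), "\n";

# A reading in bytes is passed through as an integer.
print normalize_memory(1536 * 1024), "\n";
```

This keeps the API response numeric for stopped OSDs, which matches the
behavior of the removed MemoryCurrent handling for "[not set]".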