From: Aaron Lauterer
To: Proxmox VE development discussion, Stefan Hanreich
Subject: Re: [pve-devel] [PATCH pve-manager] api: ceph: improve reporting of ceph OSD memory usage
Date: Wed, 16 Aug 2023 16:59:04 +0200
In-Reply-To: <20230816143454.2225673-1-s.hanreich@proxmox.com>
References: <20230816143454.2225673-1-s.hanreich@proxmox.com>

one small nit: if the OSD is powered off, we now send null for the meminfo.
Maybe set it explicitly to 0 to not change the behavior?

but overall:

Tested-By: Aaron Lauterer

On 8/16/23 16:34, Stefan Hanreich wrote:
> Currently we are using the MemoryCurrent property of the OSD service
> to determine the used memory of a Ceph OSD. This includes, among other
> things, the memory used by buffers [1]. Since BlueFS uses buffered
> I/O, this can lead to extremely high values shown in the UI.
>
> Instead we are now reading the PSS value from the proc filesystem,
> which should more accurately reflect the amount of memory currently
> used by the Ceph OSD.
>
> We decided on PSS over RSS, since this should give a better idea of
> used memory - particularly when using a large amount of OSDs on one
> host, since the OSDs share some of the pages.
>
> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
>
> Signed-off-by: Stefan Hanreich
> ---
>  PVE/API2/Ceph/OSD.pm | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
> index ded359904..5f7718b0a 100644
> --- a/PVE/API2/Ceph/OSD.pm
> +++ b/PVE/API2/Ceph/OSD.pm
> @@ -687,13 +687,10 @@ __PACKAGE__->register_method ({
>
>      my $raw = '';
>      my $pid;
> -    my $memory;
>      my $parser = sub {
>          my $line = shift;
>          if ($line =~ m/^MainPID=([0-9]*)$/) {
>              $pid = $1;
> -        } elsif ($line =~ m/^MemoryCurrent=([0-9]*|\[not set\])$/) {
> -            $memory = $1 eq "[not set]" ? 0 : $1;

here we lose setting the value explicitly to 0 if not available

>          }
>      };
>
> @@ -702,12 +699,24 @@ __PACKAGE__->register_method ({
>          'show',
>          "ceph-osd\@${osdid}.service",
>          '--property',
> -        'MainPID,MemoryCurrent',
> +        'MainPID',
>      ];
>      run_command($cmd, errmsg => 'fetching OSD PID and memory usage failed', outfunc => $parser);
>
>      $pid = defined($pid) ? int($pid) : undef;
> -    $memory = defined($memory) ? int($memory) : undef;
> +
> +    my $memory;
> +    if ($pid && $pid > 0) {
> +        open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
> +            or die 'Could not open smaps_rollup for Ceph OSD';
> +
> +        while (my $line = <$SMAPS>) {
> +            if ($line =~ m/^Pss:\s+([0-9]+) kB$/) {
> +                $memory = $1 * 1024;
> +                last;
> +            }
> +        }
> +    }
>
>      my $data = {
>          osd => {