From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 9CB92EBAA
 for ; Mon, 12 Dec 2022 13:14:55 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 7A046474E
 for ; Mon, 12 Dec 2022 13:14:55 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for ; Mon, 12 Dec 2022 13:14:54 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id D8D5A44821
 for ; Mon, 12 Dec 2022 13:14:53 +0100 (CET)
From: Aaron Lauterer
To: pve-devel@lists.proxmox.com
Date: Mon, 12 Dec 2022 13:14:49 +0100
Message-Id: <20221212121451.877054-2-a.lauterer@proxmox.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221212121451.877054-1-a.lauterer@proxmox.com>
References: <20221212121451.877054-1-a.lauterer@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results: 0
 AWL -0.042 Adjusted score from AWL reputation of From: address
 BAYES_00 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS -0.001 SPF: sender matches SPF record
Subject: [pve-devel] [PATCH manager v5 1/3] api ceph osd: add OSD index, metadata and lv-info
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion
X-List-Received-Date: Mon, 12 Dec 2022 12:14:55 -0000

To get more details for a single OSD, we add two new endpoints:
* nodes/{node}/ceph/osd/{osdid}/metadata
* nodes/{node}/ceph/osd/{osdid}/lv-info

The {osdid} endpoint itself gets a new GET handler to return the index.

The metadata endpoint provides various details about the OSD, such as:
* process id
* memory usage
* info about devices used (bdev/block, db, wal)
    * size
    * disks used (sdX)
    ...
* network addresses and ports used
...

Memory usage and PID are retrieved from systemd while the rest can be
retrieved from the metadata provided by Ceph.

The second one (lv-info) returns the following information for a logical
volume:
* creation time
* lv name
* lv path
* lv size
* lv uuid
* vg name

Possible volumes are:
* block (default value if not provided)
* db
* wal

'ceph-volume' is used to gather this information, except for the
creation time of the LV, which is retrieved via 'lvs'.

Signed-off-by: Aaron Lauterer
---
changes since
v4:
- call `ceph-volume` with the OSD ID as parameter so that it only
  fetches data for that OSD instead of all of them

v3:
- verify definedness of $pid and $memory after run_command. Also handle
  the cast to int there and not in the assignment of the return values.
  This way we get a `null` value returned in case we never got any
  value.

v2:
- rephrased error messages on run_command
- reworked systemctl show call and parsing of the results
- expanded the error message shown if no LV info is found to mention
  that this could be because the OSD is a bit older. This will
  hopefully reduce potential concerns if users encounter it
- return array of devices instead of optional bdev, db, and wal
- add 'device' to the devices metadata (block, db, wal), used to be
  done in the UI

v1:
- squashed all API commits into one
- moved all new API endpoints into sub endpoints to {osdid}
- {osdid} itself returns the necessary index
- incorporated other code improvements

 PVE/API2/Ceph/OSD.pm | 324 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 324 insertions(+)

diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index 93433b3a..431a9e1e 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -5,6 +5,7 @@ use warnings;
 
 use Cwd qw(abs_path);
 use IO::File;
+use JSON;
 use UUID;
 
 use PVE::Ceph::Tools;
@@ -516,6 +517,329 @@ __PACKAGE__->register_method ({
         return $rpcenv->fork_worker('cephcreateosd', $devs->{dev}->{name}, $authuser, $worker);
     }});
 
+my $OSD_DEV_RETURN_PROPS = {
+    device => {
+        type => 'string',
+        enum => ['block', 'db', 'wal'],
+        description => 'Kind of OSD device',
+    },
+    dev_node => {
+        type => 'string',
+        description => 'Device node',
+    },
+    devices => {
+        type => 'string',
+        description => 'Physical disks used',
+    },
+    size => {
+        type => 'integer',
+        description => 'Size in bytes',
+    },
+    support_discard => {
+        type => 'boolean',
+        description => 'Discard support of the physical device',
+    },
+    type => {
+        type => 'string',
+        description => 'Type of device. For example, hdd or ssd',
+    },
+};
+
+__PACKAGE__->register_method ({
+    name => 'osdindex',
+    path => '{osdid}',
+    method => 'GET',
+    permissions => { user => 'all' },
+    description => "OSD index.",
+    parameters => {
+        additionalProperties => 0,
+        properties => {
+            node => get_standard_option('pve-node'),
+            osdid => {
+                description => 'OSD ID',
+                type => 'integer',
+            },
+        },
+    },
+    returns => {
+        type => 'array',
+        items => {
+            type => "object",
+            properties => {},
+        },
+        links => [ { rel => 'child', href => "{name}" } ],
+    },
+    code => sub {
+        my ($param) = @_;
+
+        my $result = [
+            { name => 'metadata' },
+            { name => 'lv-info' },
+        ];
+
+        return $result;
+    }});
+
+__PACKAGE__->register_method ({
+    name => 'osddetails',
+    path => '{osdid}/metadata',
+    method => 'GET',
+    description => "Get OSD details",
+    proxyto => 'node',
+    protected => 1,
+    permissions => {
+        check => ['perm', '/', [ 'Sys.Audit' ], any => 1],
+    },
+    parameters => {
+        additionalProperties => 0,
+        properties => {
+            node => get_standard_option('pve-node'),
+            osdid => {
+                description => 'OSD ID',
+                type => 'integer',
+            },
+        },
+    },
+    returns => {
+        type => 'object',
+        properties => {
+            osd => {
+                type => 'object',
+                description => 'General information about the OSD',
+                properties => {
+                    hostname => {
+                        type => 'string',
+                        description => 'Name of the host containing the OSD.',
+                    },
+                    id => {
+                        type => 'integer',
+                        description => 'ID of the OSD.',
+                    },
+                    mem_usage => {
+                        type => 'integer',
+                        description => 'Memory usage of the OSD service.',
+                    },
+                    osd_data => {
+                        type => 'string',
+                        description => "Path to the OSD's data directory.",
+                    },
+                    osd_objectstore => {
+                        type => 'string',
+                        description => 'The type of object store used.',
+                    },
+                    pid => {
+                        type => 'integer',
+                        description => 'OSD process ID.',
+                    },
+                    version => {
+                        type => 'string',
+                        description => 'Ceph version of the OSD service.',
+                    },
+                    front_addr => {
+                        type => 'string',
+                        description => 'Address and port used to talk to clients and monitors.',
+                    },
+                    back_addr => {
+                        type => 'string',
+                        description => 'Address and port used to talk to other OSDs.',
+                    },
+                    hb_front_addr => {
+                        type => 'string',
+                        description => 'Heartbeat address and port for clients and monitors.',
+                    },
+                    hb_back_addr => {
+                        type => 'string',
+                        description => 'Heartbeat address and port for other OSDs.',
+                    },
+                },
+            },
+            devices => {
+                type => 'array',
+                description => 'Array containing data about devices',
+                items => {
+                    type => "object",
+                    properties => $OSD_DEV_RETURN_PROPS,
+                },
+            }
+        }
+    },
+    code => sub {
+        my ($param) = @_;
+
+        PVE::Ceph::Tools::check_ceph_inited();
+
+        my $osdid = $param->{osdid};
+        my $rados = PVE::RADOS->new();
+        my $metadata = $rados->mon_command({ prefix => 'osd metadata', id => int($osdid) });
+
+        die "OSD '${osdid}' does not exists on host '${nodename}'\n"
+            if $nodename ne $metadata->{hostname};
+
+        my $raw = '';
+        my $pid;
+        my $memory;
+        my $parser = sub {
+            my $line = shift;
+            if ($line =~ m/^MainPID=([0-9]*)$/) {
+                $pid = $1;
+            } elsif ($line =~ m/^MemoryCurrent=([0-9]*|\[not set\])$/) {
+                $memory = $1 eq "[not set]" ? 0 : $1;
+            }
+        };
+
+        my $cmd = [
+            '/bin/systemctl',
+            'show',
+            "ceph-osd\@${osdid}.service",
+            '--property',
+            'MainPID,MemoryCurrent',
+        ];
+        run_command($cmd, errmsg => 'fetching OSD PID and memory usage failed', outfunc => $parser);
+
+        $pid = defined($pid) ? int($pid) : undef;
+        $memory = defined($memory) ? int($memory) : undef;
+
+        my $data = {
+            osd => {
+                hostname => $metadata->{hostname},
+                id => $metadata->{id},
+                mem_usage => $memory,
+                osd_data => $metadata->{osd_data},
+                osd_objectstore => $metadata->{osd_objectstore},
+                pid => $pid,
+                version => "$metadata->{ceph_version_short} ($metadata->{ceph_release})",
+                front_addr => $metadata->{front_addr},
+                back_addr => $metadata->{back_addr},
+                hb_front_addr => $metadata->{hb_front_addr},
+                hb_back_addr => $metadata->{hb_back_addr},
+            },
+        };
+
+        $data->{devices} = [];
+
+        my $get_data = sub {
+            my ($dev, $prefix, $device) = @_;
+            push (
+                @{$data->{devices}},
+                {
+                    dev_node => $metadata->{"${prefix}_${dev}_dev_node"},
+                    physical_device => $metadata->{"${prefix}_${dev}_devices"},
+                    size => int($metadata->{"${prefix}_${dev}_size"}),
+                    support_discard => int($metadata->{"${prefix}_${dev}_support_discard"}),
+                    type => $metadata->{"${prefix}_${dev}_type"},
+                    device => $device,
+                }
+            );
+        };
+
+        $get_data->("bdev", "bluestore", "block");
+        $get_data->("db", "bluefs", "db") if $metadata->{bluefs_dedicated_db};
+        $get_data->("wal", "bluefs", "wal") if $metadata->{bluefs_dedicated_wal};
+
+        return $data;
+    }});
+
+__PACKAGE__->register_method ({
+    name => 'osdvolume',
+    path => '{osdid}/lv-info',
+    method => 'GET',
+    description => "Get OSD volume details",
+    proxyto => 'node',
+    protected => 1,
+    permissions => {
+        check => ['perm', '/', [ 'Sys.Audit' ], any => 1],
+    },
+    parameters => {
+        additionalProperties => 0,
+        properties => {
+            node => get_standard_option('pve-node'),
+            osdid => {
+                description => 'OSD ID',
+                type => 'integer',
+            },
+            type => {
+                description => 'OSD device type',
+                type => 'string',
+                enum => ['block', 'db', 'wal'],
+                default => 'block',
+                optional => 1,
+            },
+        },
+    },
+    returns => {
+        type => 'object',
+        properties => {
+            creation_time => {
+                type => 'string',
+                description => "Creation time as reported by `lvs`.",
+            },
+            lv_name => {
+                type => 'string',
+                description => 'Name of the logical volume (LV).',
+            },
+            lv_path => {
+                type => 'string',
+                description => 'Path to the logical volume (LV).',
+            },
+            lv_size => {
+                type => 'integer',
+                description => 'Size of the logical volume (LV).',
+            },
+            lv_uuid => {
+                type => 'string',
+                description => 'UUID of the logical volume (LV).',
+            },
+            vg_name => {
+                type => 'string',
+                description => 'Name of the volume group (VG).',
+            },
+        },
+    },
+    code => sub {
+        my ($param) = @_;
+
+        PVE::Ceph::Tools::check_ceph_inited();
+
+        my $osdid = $param->{osdid};
+        my $type = $param->{type} // 'block';
+
+        my $raw = '';
+        my $parser = sub { $raw .= shift };
+        my $cmd = ['/usr/sbin/ceph-volume', 'lvm', 'list', $osdid, '--format', 'json'];
+        run_command($cmd, errmsg => 'listing Ceph LVM volumes failed', outfunc => $parser);
+
+        my $result;
+        if ($raw =~ m/^(\{.*\})$/s) { #untaint
+            $result = JSON::decode_json($1);
+        } else {
+            die "got unexpected data from ceph-volume: '${raw}'\n";
+        }
+        if (!$result->{$osdid}) {
+            die "OSD '${osdid}' not found in 'ceph-volume lvm list' on node '${nodename}'.\n"
+                ."Maybe it was created before LVM became the default?\n";
+        }
+
+        my $lv_data = { map { $_->{type} => $_ } @{$result->{$osdid}} };
+        my $volume = $lv_data->{$type} || die "volume type '${type}' not found for OSD ${osdid}\n";
+
+        $raw = '';
+        $cmd = ['/sbin/lvs', $volume->{lv_path}, '--reportformat', 'json', '-o', 'lv_time'];
+        run_command($cmd, errmsg => 'listing logical volumes failed', outfunc => $parser);
+
+        if ($raw =~ m/(\{.*\})$/s) { #untaint, lvs has whitespace at beginning
+            $result = JSON::decode_json($1);
+        } else {
+            die "got unexpected data from lvs: '${raw}'\n";
+        }
+
+        my $data = { map { $_ => $volume->{$_} } qw(lv_name lv_path lv_uuid vg_name) };
+        $data->{lv_size} = int($volume->{lv_size});
+
+        $data->{creation_time} = @{$result->{report}}[0]->{lv}[0]->{lv_time};
+
+        return $data;
+    }});
+
 # Check if $osdid belongs to $nodename
 # $tree ... rados osd tree (passing the tree makes it easy to test)
 sub osd_belongs_to_node {
-- 
2.30.2