public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks
@ 2022-03-25 10:55 Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 1/7] mon_command: refactor to pass all data to perl Aaron Lauterer
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

The main motivation behind this series is to leverage several safety
checks that Ceph has to make sure it is ok to stop or destroy a service.

For this to work, the librados2-perl needs to be adapted because if it
is unsafe to perform one of the actions, the Ceph API will return an
error with an explanation in the status message.

The approach chosen is to simplify the bindings and handle errors on the
Perl side. the RADOS.pm::mon_command now always returns a hash ref
containing

* return code
* status message
* data

This makes it necessary to adapt packages that depend on librados2-perl
to handle the changed return values: storage and manager.

Therefore we need to coordinate breaking dependencies and releases of
these packages!

A new cmd-safety endpoint is added which is called from the GUI wherever
possible to show a warning.


changes:

* simplify librados2-perl changes and always return hash ref instead of
  wantarray
* one central cmd-safety API endpoint instead of multiple ones per
  service type

librados2-perl: Aaron Lauterer (2):
  mon_command: refactor to pass all data to perl
  mon_command: optionally ignore errors

 PVE/RADOS.pm | 24 +++++++++++++++++++-----
 RADOS.xs     | 18 ++++++------------
 2 files changed, 25 insertions(+), 17 deletions(-)


storage:Aaron Lauterer (1):
  rbd: adapt to changed rados mon_command return values

 PVE/Storage/RBDPlugin.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


manager: Aaron Lauterer (4):
  ceph: adapt to changed rados mon_command return values
  api: ceph: add cmd-safety endpoint
  ui: osd: warn if removal could be problematic
  ui: osd: mon: mds: warn if stop/destroy actions are problematic

 PVE/API2/Ceph.pm                 | 102 +++++++++++++++++++++-
 PVE/API2/Ceph/MGR.pm             |   2 +-
 PVE/API2/Ceph/MON.pm             |  19 +++--
 PVE/API2/Ceph/OSD.pm             |  14 ++--
 PVE/API2/Ceph/Pools.pm           |  18 ++--
 PVE/API2/Cluster/Ceph.pm         |   6 +-
 PVE/CLI/pveceph.pm               |   4 +-
 PVE/Ceph/Services.pm             |  10 +--
 PVE/Ceph/Tools.pm                |  16 ++--
 www/manager6/ceph/OSD.js         | 140 ++++++++++++++++++++++++++-----
 www/manager6/ceph/ServiceList.js | 104 ++++++++++++++++++++---
 11 files changed, 360 insertions(+), 75 deletions(-)


-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 librados2-perl 1/7] mon_command: refactor to pass all data to perl
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 2/7] mon_command: optionally ignore errors Aaron Lauterer
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

By passing back all return values from the Ceph API (RADOS.xs) to Perl
we are more flexible to make more than just the data available further
up the stack. These values are:

* return code
* status message (can contain useful information)
* data

The Ceph API interaction happens in a child process. We need to en- and
decode the returned hash in JSON to pass it between the child and parent
process.

RADOS.pm::mon_command now returns not just the data, but all information
as a hash ref.  Therefore dependent packages (pve-manager, pve-storage)
need to adapt.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
Make sure to coordinate breaking changes and releases of affected
packages: storager, manager
Patches 3 and 4 in this series contain the needed changes for these
packages

changes:
* simplify changes by always returning has href from RADOS.xs and handle
  errors in the Perl side (RADOS.pm).
* always return hash ref from RADOS.pm::mon_command()


 PVE/RADOS.pm | 20 ++++++++++++++++----
 RADOS.xs     | 18 ++++++------------
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/PVE/RADOS.pm b/PVE/RADOS.pm
index 463abc7..bec5028 100644
--- a/PVE/RADOS.pm
+++ b/PVE/RADOS.pm
@@ -201,7 +201,7 @@ sub new {
 	    my $res;
 	    eval {
 		if ($cmd eq 'M') { # rados monitor commands
-		    $res = pve_rados_mon_command($self->{conn}, [ $data ]);
+		    $res = encode_json(pve_rados_mon_command($self->{conn}, [ $data ]));
 		} elsif ($cmd eq 'C') { # class methods
 		    my $aref = decode_json($data);
 		    my $method = shift @$aref;
@@ -265,13 +265,25 @@ sub mon_command {
 
     my $json = encode_json($cmd);
 
-    my $raw = eval { $sendcmd->($self, 'M', $json) };
+    my $ret = $sendcmd->($self, 'M', $json);
     die "error with '$cmd->{prefix}': $@" if $@;
 
+    my $raw = decode_json($ret);
+
+    die "error with '$cmd->{prefix}': mon_command failed - $raw->{status_message}\n"
+	if $raw->{return_code} < 0;
+
+    my $data = '';
     if ($cmd->{format} && $cmd->{format} eq 'json') {
-	return length($raw) ? decode_json($raw) : undef;
+	$data = length($raw->{data}) ? decode_json($raw->{data}) : undef;
+    } else {
+	$data = $raw->{data};
     }
-    return $raw;
+    return {
+	return_code => $raw->{return_code},
+	status_message => $raw->{status_message},
+	data => $data,
+    };
 }
 
 
diff --git a/RADOS.xs b/RADOS.xs
index 9afba1c..45f634c 100644
--- a/RADOS.xs
+++ b/RADOS.xs
@@ -98,7 +98,7 @@ CODE:
     rados_shutdown(cluster);
 }
 
-SV *
+HV *
 pve_rados_mon_command(cluster, cmds)
 rados_t cluster
 AV *cmds
@@ -136,18 +136,12 @@ CODE:
         &outslen
     );
 
-    if (ret < 0) {
-        char msg[4096];
-        if (outslen > sizeof(msg)) {
-            outslen = sizeof(msg);
-        }
-        snprintf(msg, sizeof(msg), "mon_command failed - %.*s\n", (int)outslen, outs);
-        rados_buffer_free(outs);
-        rados_buffer_free(outbuf);
-        die(msg);
-    }
+    HV * rh = (HV *)sv_2mortal((SV *)newHV());
 
-    RETVAL = newSVpv(outbuf, outbuflen);
+    (void)hv_store(rh, "return_code", 11, newSViv(ret), 0);
+    (void)hv_store(rh, "status_message", 14, newSVpv(outs, outslen), 0);
+    (void)hv_store(rh, "data", 4, newSVpv(outbuf, outbuflen), 0);
+    RETVAL = rh;
 
     rados_buffer_free(outbuf);
     rados_buffer_free(outs);
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 librados2-perl 2/7] mon_command: optionally ignore errors
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 1/7] mon_command: refactor to pass all data to perl Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 storage 3/7] rbd: adapt to changed rados mon_command return values Aaron Lauterer
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

In some situations we do not want to abort if the Ceph API returns an
error. For example if we run the 'osd ok-to-stop' or similar calls, we
are interested in the status message in the error case.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
I split the adding of "noerr" into a separate patch for a clearly
separated history

 PVE/RADOS.pm | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/PVE/RADOS.pm b/PVE/RADOS.pm
index bec5028..36b42ed 100644
--- a/PVE/RADOS.pm
+++ b/PVE/RADOS.pm
@@ -259,7 +259,9 @@ sub cluster_stat {
 # example1: { prefix => 'get_command_descriptions'})
 # example2: { prefix => 'mon dump', format => 'json' }
 sub mon_command {
-    my ($self, $cmd) = @_;
+    my ($self, $cmd, $noerr) = @_;
+
+    $noerr = 0 if !$noerr;
 
     $cmd->{format} = 'json' if !$cmd->{format};
 
@@ -271,7 +273,7 @@ sub mon_command {
     my $raw = decode_json($ret);
 
     die "error with '$cmd->{prefix}': mon_command failed - $raw->{status_message}\n"
-	if $raw->{return_code} < 0;
+	if !$noerr && $raw->{return_code} < 0;
 
     my $data = '';
     if ($cmd->{format} && $cmd->{format} eq 'json') {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 storage 3/7] rbd: adapt to changed rados mon_command return values
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 1/7] mon_command: refactor to pass all data to perl Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 2/7] mon_command: optionally ignore errors Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 4/7] ceph: " Aaron Lauterer
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

mon_command now returns a hash ref. Only the data is of interest.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
Needs to be coordinated with librados2-perl changes from patch 1

 PVE/Storage/RBDPlugin.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/Storage/RBDPlugin.pm b/PVE/Storage/RBDPlugin.pm
index e287e28..f8fcab7 100644
--- a/PVE/Storage/RBDPlugin.pm
+++ b/PVE/Storage/RBDPlugin.pm
@@ -594,7 +594,7 @@ sub status {
     my ($class, $storeid, $scfg, $cache) = @_;
 
     my $rados = $librados_connect->($scfg, $storeid);
-    my $df = $rados->mon_command({ prefix => 'df', format => 'json' });
+    my $df = $rados->mon_command({ prefix => 'df', format => 'json' })->{data};
 
     my ($d) = grep { $_->{name} eq $scfg->{pool} } @{$df->{pools}};
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 manager 4/7] ceph: adapt to changed rados mon_command return values
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
                   ` (2 preceding siblings ...)
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 storage 3/7] rbd: adapt to changed rados mon_command return values Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 5/7] api: ceph: add cmd-safety endpoint Aaron Lauterer
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

The mon_command now returns a hash ref but current calls are only
interested in the 'data'.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
Needs to be coordinated with librados2-perl changes from patch 1

 PVE/API2/Ceph.pm         |  6 +++---
 PVE/API2/Ceph/MGR.pm     |  2 +-
 PVE/API2/Ceph/MON.pm     | 19 +++++++++++++------
 PVE/API2/Ceph/OSD.pm     | 14 +++++++-------
 PVE/API2/Ceph/Pools.pm   | 18 ++++++++++--------
 PVE/API2/Cluster/Ceph.pm |  6 +++---
 PVE/CLI/pveceph.pm       |  4 ++--
 PVE/Ceph/Services.pm     | 10 +++++-----
 PVE/Ceph/Tools.pm        | 16 ++++++++--------
 9 files changed, 52 insertions(+), 43 deletions(-)

diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
index 3bbcfe4c..1e1b1edd 100644
--- a/PVE/API2/Ceph.pm
+++ b/PVE/API2/Ceph.pm
@@ -168,7 +168,7 @@ __PACKAGE__->register_method ({
 	PVE::Ceph::Tools::check_ceph_inited();
 
 	my $rados = PVE::RADOS->new();
-	my $res = $rados->mon_command( { prefix => 'config dump', format => 'json' });
+	my $res = $rados->mon_command( { prefix => 'config dump', format => 'json' })->{data};
 	foreach my $entry (@$res) {
 	    $entry->{can_update_at_runtime} = $entry->{can_update_at_runtime}? 1 : 0; # JSON::true/false -> 1/0
 	}
@@ -525,7 +525,7 @@ __PACKAGE__->register_method ({
 	my $rados = PVE::RADOS->new();
 
 	eval {
-	    my $bindata = $rados->mon_command({ prefix => 'osd getcrushmap', format => 'plain' });
+	    my $bindata = $rados->mon_command({ prefix => 'osd getcrushmap', format => 'plain' })->{data};
 	    file_set_contents($mapfile, $bindata);
 	    run_command(['crushtool', '-d', $mapfile, '-o', $mapdata]);
 	    $txt = file_get_contents($mapdata);
@@ -630,7 +630,7 @@ __PACKAGE__->register_method ({
 
 	my $rados = PVE::RADOS->new();
 
-	my $rules = $rados->mon_command({ prefix => 'osd crush rule ls' });
+	my $rules = $rados->mon_command({ prefix => 'osd crush rule ls' })->{data};
 
 	my $res = [];
 
diff --git a/PVE/API2/Ceph/MGR.pm b/PVE/API2/Ceph/MGR.pm
index 2dc679ef..5b9087c9 100644
--- a/PVE/API2/Ceph/MGR.pm
+++ b/PVE/API2/Ceph/MGR.pm
@@ -67,7 +67,7 @@ __PACKAGE__->register_method ({
 
 	my $mgr_hash = PVE::Ceph::Services::get_services_info("mgr", $cfg, $rados);
 
-	my $mgr_dump = $rados->mon_command({ prefix => 'mgr dump' });
+	my $mgr_dump = $rados->mon_command({ prefix => 'mgr dump' })->{data};
 
 	my $active_name = $mgr_dump->{active_name};
 	$mgr_hash->{$active_name}->{state} = 'active' if $active_name;
diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm
index 12c9caf0..943ba0a0 100644
--- a/PVE/API2/Ceph/MON.pm
+++ b/PVE/API2/Ceph/MON.pm
@@ -28,8 +28,12 @@ my $find_mon_ips = sub {
 
     my $pubnet;
     if ($rados) {
-	$pubnet = $rados->mon_command({ prefix => "config get" , who => "mon.",
-		key => "public_network", format => 'plain' });
+	$pubnet = $rados->mon_command({
+		prefix => "config get",
+		who => "mon.",
+		key => "public_network",
+		format => 'plain',
+	    })->{data};
 	# if not defined in the db, the result is empty, it is also always
 	# followed by a newline
 	($pubnet) = $pubnet =~ m/^(\S+)$/;
@@ -231,7 +235,7 @@ __PACKAGE__->register_method ({
 	my $monhash = PVE::Ceph::Services::get_services_info("mon", $cfg, $rados);
 
 	if ($rados) {
-	    my $monstat = $rados->mon_command({ prefix => 'quorum_status' });
+	    my $monstat = $rados->mon_command({ prefix => 'quorum_status' })->{data};
 
 	    my $mons = $monstat->{monmap}->{mons};
 	    foreach my $d (@$mons) {
@@ -391,7 +395,10 @@ __PACKAGE__->register_method ({
 		    ];
 
 		    if (defined($rados)) { # we can only have a RADOS object if we have a monitor
-			my $mapdata = $rados->mon_command({ prefix => 'mon getmap', format => 'plain' });
+			my $mapdata = $rados->mon_command({
+				prefix => 'mon getmap',
+				format => 'plain',
+			    })->{data};
 			file_set_contents($monmap, $mapdata);
 			run_command($monmaptool_cmd);
 		    } else { # we need to create a monmap for the first monitor
@@ -502,7 +509,7 @@ __PACKAGE__->register_method ({
 	my $monsection = "mon.$monid";
 
 	my $rados = PVE::RADOS->new();
-	my $monstat = $rados->mon_command({ prefix => 'quorum_status' });
+	my $monstat = $rados->mon_command({ prefix => 'quorum_status' })->{data};
 	my $monlist = $monstat->{monmap}->{mons};
 	my $monhash = PVE::Ceph::Services::get_services_info('mon', $cfg, $rados);
 
@@ -520,7 +527,7 @@ __PACKAGE__->register_method ({
 		# reopen with longer timeout
 		$rados = PVE::RADOS->new(timeout => PVE::Ceph::Tools::get_config('long_rados_timeout'));
 		$monhash = PVE::Ceph::Services::get_services_info('mon', $cfg, $rados);
-		$monstat = $rados->mon_command({ prefix => 'quorum_status' });
+		$monstat = $rados->mon_command({ prefix => 'quorum_status' })->{data};
 		$monlist = $monstat->{monmap}->{mons};
 
 		my $addrs = [];
diff --git a/PVE/API2/Ceph/OSD.pm b/PVE/API2/Ceph/OSD.pm
index 93433b3a..d38c4ced 100644
--- a/PVE/API2/Ceph/OSD.pm
+++ b/PVE/API2/Ceph/OSD.pm
@@ -30,7 +30,7 @@ my $nodename = PVE::INotify::nodename();
 my $get_osd_status = sub {
     my ($rados, $osdid) = @_;
 
-    my $stat = $rados->mon_command({ prefix => 'osd dump' });
+    my $stat = $rados->mon_command({ prefix => 'osd dump' })->{data};
 
     my $osdlist = $stat->{osds} || [];
 
@@ -51,7 +51,7 @@ my $get_osd_status = sub {
 my $get_osd_usage = sub {
     my ($rados) = @_;
 
-    my $osdlist = $rados->mon_command({ prefix => 'pg dump', dumpcontents => [ 'osds' ]});
+    my $osdlist = $rados->mon_command({ prefix => 'pg dump', dumpcontents => [ 'osds' ]})->{data};
     if (!($osdlist && ref($osdlist))) {
 	warn "got unknown result format for 'pg dump osds' command\n";
 	return [];
@@ -95,7 +95,7 @@ __PACKAGE__->register_method ({
 	PVE::Ceph::Tools::check_ceph_inited();
 
 	my $rados = PVE::RADOS->new();
-	my $res = $rados->mon_command({ prefix => 'osd tree' });
+	my $res = $rados->mon_command({ prefix => 'osd tree' })->{data};
 
         die "no tree nodes found\n" if !($res && $res->{nodes});
 
@@ -103,7 +103,7 @@ __PACKAGE__->register_method ({
 
 	my $osd_usage = $get_osd_usage->($rados);
 
-	my $osdmetadata_res = $rados->mon_command({ prefix => 'osd metadata' });
+	my $osdmetadata_res = $rados->mon_command({ prefix => 'osd metadata' })->{data};
 	my $osdmetadata = { map { $_->{id} => $_ } @$osdmetadata_res };
 
 	my $hostversions = PVE::Ceph::Services::get_ceph_versions();
@@ -350,7 +350,7 @@ __PACKAGE__->register_method ({
 
 	# get necessary ceph infos
 	my $rados = PVE::RADOS->new();
-	my $monstat = $rados->mon_command({ prefix => 'quorum_status' });
+	my $monstat = $rados->mon_command({ prefix => 'quorum_status' })->{data};
 
 	die "unable to get fsid\n" if !$monstat->{monmap} || !$monstat->{monmap}->{fsid};
 	my $fsid = $monstat->{monmap}->{fsid};
@@ -366,7 +366,7 @@ __PACKAGE__->register_method ({
 			'mon' => 'allow profile bootstrap-osd'
 		    ],
 		    format => 'plain',
-		});
+		})->{data};
 	    file_set_contents($ceph_bootstrap_osd_keyring, $bindata);
 	};
 
@@ -574,7 +574,7 @@ __PACKAGE__->register_method ({
 	my $rados = PVE::RADOS->new();
 
 	my $osd_belongs_to_node = osd_belongs_to_node(
-	    $rados->mon_command({ prefix => 'osd tree' }),
+	    $rados->mon_command({ prefix => 'osd tree' })->{data},
 	    $param->{node},
 	    $osdid,
 	);
diff --git a/PVE/API2/Ceph/Pools.pm b/PVE/API2/Ceph/Pools.pm
index 002f7893..ed384d6f 100644
--- a/PVE/API2/Ceph/Pools.pm
+++ b/PVE/API2/Ceph/Pools.pm
@@ -21,8 +21,7 @@ my $get_autoscale_status = sub {
 
     $rados = PVE::RADOS->new() if !defined($rados);
 
-    my $autoscale = $rados->mon_command({
-	    prefix => 'osd pool autoscale-status'});
+    my $autoscale = $rados->mon_command({ prefix => 'osd pool autoscale-status' })->{data};
 
     my $data;
     foreach my $p (@$autoscale) {
@@ -132,7 +131,7 @@ __PACKAGE__->register_method ({
 	my $rados = PVE::RADOS->new();
 
 	my $stats = {};
-	my $res = $rados->mon_command({ prefix => 'df' });
+	my $res = $rados->mon_command({ prefix => 'df' })->{data};
 
 	foreach my $d (@{$res->{pools}}) {
 	    next if !$d->{stats};
@@ -140,8 +139,8 @@ __PACKAGE__->register_method ({
 	    $stats->{$d->{id}} = $d->{stats};
 	}
 
-	$res = $rados->mon_command({ prefix => 'osd dump' });
-	my $rulestmp = $rados->mon_command({ prefix => 'osd crush rule dump'});
+	$res = $rados->mon_command({ prefix => 'osd dump' })->{data};
+	my $rulestmp = $rados->mon_command({ prefix => 'osd crush rule dump'})->{data};
 
 	my $rules = {};
 	for my $rule (@$rulestmp) {
@@ -566,7 +565,7 @@ __PACKAGE__->register_method ({
 		prefix => 'osd pool get',
 		pool   => "$pool",
 		var    => 'all',
-	    });
+	    })->{data};
 
 	my $data = {
 	    id                     => $res->{pool_id},
@@ -593,7 +592,7 @@ __PACKAGE__->register_method ({
 
 	if ($verbose) {
 	    my $stats;
-	    my $res = $rados->mon_command({ prefix => 'df' });
+	    my $res = $rados->mon_command({ prefix => 'df' })->{data};
 
 	    # pg_autoscaler module is not enabled in Nautilus
 	    # avoid partial read further down, use new rados instance
@@ -606,7 +605,10 @@ __PACKAGE__->register_method ({
 		$data->{statistics} = $d->{stats};
 	    }
 
-	    my $apps = $rados->mon_command({ prefix => "osd pool application get", pool => "$pool", });
+	    my $apps = $rados->mon_command({
+		    prefix => "osd pool application get",
+		    pool => "$pool",
+		})->{data};
 	    $data->{application_list} = [ keys %$apps ];
 	}
 
diff --git a/PVE/API2/Cluster/Ceph.pm b/PVE/API2/Cluster/Ceph.pm
index 7f825003..e48626eb 100644
--- a/PVE/API2/Cluster/Ceph.pm
+++ b/PVE/API2/Cluster/Ceph.pm
@@ -96,7 +96,7 @@ __PACKAGE__->register_method ({
 	    }
 
 	    # get data from metadata call and merge 'our' data
-	    my $services = $rados->mon_command({ prefix => "$type metadata" });
+	    my $services = $rados->mon_command({ prefix => "$type metadata" })->{data};
 	    for my $service ( @$services ) {
 		my $hostname = $service->{hostname};
 		next if !defined($hostname); # can happen if node is dead
@@ -115,7 +115,7 @@ __PACKAGE__->register_method ({
 	    $res->{$type} = $data;
 	}
 
-	$res->{osd} = $rados->mon_command({ prefix => "osd metadata" });
+	$res->{osd} = $rados->mon_command({ prefix => "osd metadata" })->{data};
 
 	return $res;
     }
@@ -152,7 +152,7 @@ my $get_current_set_flags = sub {
 
     $rados //= PVE::RADOS->new();
 
-    my $stat = $rados->mon_command({ prefix => 'osd dump' });
+    my $stat = $rados->mon_command({ prefix => 'osd dump' })->{data};
     my $setflags = $stat->{flags} // '';
     return { map { $_ => 1 } PVE::Tools::split_list($setflags) };
 };
diff --git a/PVE/CLI/pveceph.pm b/PVE/CLI/pveceph.pm
index 995cfcd5..0aa390ef 100755
--- a/PVE/CLI/pveceph.pm
+++ b/PVE/CLI/pveceph.pm
@@ -77,7 +77,7 @@ __PACKAGE__->register_method ({
 	    $pools = PVE::Ceph::Tools::ls_pools(undef, $rados);
 	    $monstat = PVE::Ceph::Services::get_services_info('mon', undef, $rados);
 	    $mdsstat = PVE::Ceph::Services::get_services_info('mds', undef, $rados);
-	    $osdstat = $rados->mon_command({ prefix => 'osd metadata' });
+	    $osdstat = $rados->mon_command({ prefix => 'osd metadata' })->{data};
 	};
 	warn "Error gathering ceph info, already purged? Message: $@" if $@;
 
@@ -291,7 +291,7 @@ __PACKAGE__->register_method ({
 
 	    if ($param->{'remove-storages'}) {
 		my $defaultfs;
-		my $fs_dump = $rados->mon_command({ prefix => "fs dump" });
+		my $fs_dump = $rados->mon_command({ prefix => "fs dump" })->{data};
 		for my $fs ($fs_dump->{filesystems}->@*) {
 		    next if $fs->{id} != $fs_dump->{default_fscid};
 		    $defaultfs = $fs->{mdsmap}->{fs_name};
diff --git a/PVE/Ceph/Services.pm b/PVE/Ceph/Services.pm
index cda13c6a..8fe62cb1 100644
--- a/PVE/Ceph/Services.pm
+++ b/PVE/Ceph/Services.pm
@@ -147,7 +147,7 @@ sub get_services_info {
 	return $result;
     }
 
-    my $metadata = $rados->mon_command({ prefix => "$type metadata" });
+    my $metadata = $rados->mon_command({ prefix => "$type metadata" })->{data};
     foreach my $info (@$metadata) {
 	my $id = $info->{name} // $info->{id};
 	my $service = $result->{$id};
@@ -197,7 +197,7 @@ sub get_cluster_mds_state {
 	$mds_state->{$mds->{name}} = $state;
     };
 
-    my $mds_dump = $rados->mon_command({ prefix => 'mds stat' });
+    my $mds_dump = $rados->mon_command({ prefix => 'mds stat' })->{data};
     my $fsmap = $mds_dump->{fsmap};
 
 
@@ -225,7 +225,7 @@ sub is_mds_active {
 	$rados = PVE::RADOS->new();
     }
 
-    my $mds_dump = $rados->mon_command({ prefix => 'mds stat' });
+    my $mds_dump = $rados->mon_command({ prefix => 'mds stat' })->{data};
     my $fsmap = $mds_dump->{fsmap}->{filesystems};
 
     if (!($fsmap && scalar(@$fsmap) > 0)) {
@@ -280,7 +280,7 @@ sub create_mds {
 	entity => $service_name,
 	caps => $priv,
 	format => 'plain',
-    });
+    })->{data};
 
     PVE::Tools::file_set_contents($service_keyring, $output);
 
@@ -357,7 +357,7 @@ sub create_mgr {
 	    mds => 'allow *',
 	],
 	format => 'plain'
-    });
+    })->{data};
     PVE::Tools::file_set_contents($mgrkeyring, $output);
 
     print "setting owner for directory\n";
diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm
index 36d7788a..85306328 100644
--- a/PVE/Ceph/Tools.pm
+++ b/PVE/Ceph/Tools.pm
@@ -78,7 +78,7 @@ sub get_cluster_versions {
 
     my $rados = PVE::RADOS->new();
     my $cmd = $service ? "$service versions" : 'versions';
-    return $rados->mon_command({ prefix => $cmd });
+    return $rados->mon_command({ prefix => $cmd })->{data};
 }
 
 sub get_config {
@@ -282,7 +282,7 @@ sub ls_pools {
 	$rados = PVE::RADOS->new();
     }
 
-    my $res = $rados->mon_command({ prefix => "osd lspools" });
+    my $res = $rados->mon_command({ prefix => "osd lspools" })->{data};
 
     return $res;
 }
@@ -319,7 +319,7 @@ sub ls_fs {
 	$rados = PVE::RADOS->new();
     }
 
-    my $res = $rados->mon_command({ prefix => "fs ls" });
+    my $res = $rados->mon_command({ prefix => "fs ls" })->{data};
 
     return $res;
 }
@@ -420,7 +420,7 @@ sub get_db_wal_sizes {
     my $res = {};
 
     my $rados = PVE::RADOS->new();
-    my $db_config = $rados->mon_command({ prefix => 'config-key dump', key => 'config/' });
+    my $db_config = $rados->mon_command({ prefix => 'config-key dump', key => 'config/' })->{data};
 
     $res->{db} = $db_config->{"config/osd/bluestore_block_db_size"} //
 		 $db_config->{"config/global/bluestore_block_db_size"};
@@ -520,12 +520,12 @@ sub ceph_cluster_status {
     my ($rados) = @_;
     $rados = PVE::RADOS->new() if !$rados;
 
-    my $status = $rados->mon_command({ prefix => 'status' });
-    $status->{health} = $rados->mon_command({ prefix => 'health', detail => 'detail' });
+    my $status = $rados->mon_command({ prefix => 'status' })->{data};
+    $status->{health} = $rados->mon_command({ prefix => 'health', detail => 'detail' })->{data};
 
     if (!exists $status->{monmap}->{mons}) { # octopus moved most info out of status, re-add
-	$status->{monmap} = $rados->mon_command({ prefix => 'mon dump' });
-	$status->{mgrmap} = $rados->mon_command({ prefix => 'mgr dump' });
+	$status->{monmap} = $rados->mon_command({ prefix => 'mon dump' })->{data};
+	$status->{mgrmap} = $rados->mon_command({ prefix => 'mgr dump' })->{data};
     }
 
     return $status;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 manager 5/7] api: ceph: add cmd-safety endpoint
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
                   ` (3 preceding siblings ...)
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 4/7] ceph: " Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 6/7] ui: osd: warn if removal could be problematic Aaron Lauterer
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

Ceph provides us with several safety checks to verify that an action is
safe to perform. This endpoint provides means to acces them.
The actual mon commands are not exposed directly. Instead the two
actions "stop" and "destroy" are offered.

In case it is not okay to perform an action, Ceph provides a status
message explaining why. This message is part of the returned values.

For now there are the following checks for these services:

MON:
  - ok-to-stop
  - ok-to-rm
OSD:
  - ok-to-stop
  - safe-to-destroy
MDS:
  - ok-to-stop

Even though OSDs have a check if it is okay to destroy them, it is for
now not really usable in our workflow because it needs the OSD to be up
and running to return useful information. Our workflow in the GUI
currently is that the OSD needs to be stopped in order to destroy it.

There are no checks if the service actually exists. Ceph will report
back that it is safe to stop/destroy if the service does not exist.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
changes:
* remove repetitive endpoints for each service type in favor for a
  central one

 PVE/API2/Ceph.pm | 96 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)

diff --git a/PVE/API2/Ceph.pm b/PVE/API2/Ceph.pm
index 1e1b1edd..69ae746a 100644
--- a/PVE/API2/Ceph.pm
+++ b/PVE/API2/Ceph.pm
@@ -641,4 +641,100 @@ __PACKAGE__->register_method ({
 	return $res;
     }});
 
+__PACKAGE__->register_method ({
+    name => 'cmd_safety',
+    path => 'cmd-safety',
+    method => 'GET',
+    description => "Heuristical check if it is safe to perform an action.",
+    proxyto => 'node',
+    protected => 1,
+    permissions => {
+	check => ['perm', '/', [ 'Sys.audit' ]],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    service => {
+		description => 'Service type',
+		type => 'string',
+		enum => ['osd', 'mon', 'mds'],
+	    },
+	    id => {
+		description => 'ID of the service',
+		type => 'string',
+	    },
+	    action => {
+		description => 'Action to check',
+		type => 'string',
+		enum => ['stop', 'destroy'],
+	    },
+	},
+    },
+    returns => {
+	type => 'object',
+	properties => {
+	   safe  => {
+		type => 'boolean',
+		description => 'If it is safe to run the command.',
+	    },
+	    status => {
+		type => 'string',
+		optional => 1,
+		description => 'Status message given by Ceph.'
+	    },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	PVE::Ceph::Tools::check_ceph_inited();
+
+	my $id = $param->{id};
+	my $service = $param->{service};
+	my $action = $param->{action};
+
+	my $rados = PVE::RADOS->new();
+
+	my $supported_actions = {
+	    osd => {
+		stop => 'ok-to-stop',
+		destroy => 'safe-to-destroy',
+	    },
+	    mon => {
+		stop => 'ok-to-stop',
+		destroy => 'ok-to-rm',
+	    },
+	    mds => {
+		stop => 'ok-to-stop',
+	    },
+	};
+
+	die "Service does not support this action: ${service}: ${action}\n"
+	    if !$supported_actions->{$service}->{$action};
+
+	my $result = {
+	    safe => 0,
+	    status => '',
+	};
+
+	my $params = {
+	    prefix => "${service} $supported_actions->{$service}->{$action}",
+	    format => 'plain',
+	};
+	if ($service eq 'mon' && $action eq 'destroy') {
+	    $params->{id} = $id;
+	} else {
+	    $params->{ids} = [ $id ];
+	}
+
+	$result = $rados->mon_command($params, 1);
+	die $@ if $@;
+
+	$result->{safe} = $result->{return_code} == 0 ? 1 : 0;
+	$result->{status} = $result->{status_message};
+
+	return $result;
+    }});
+
 1;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 manager 6/7] ui: osd: warn if removal could be problematic
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
                   ` (4 preceding siblings ...)
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 5/7] api: ceph: add cmd-safety endpoint Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 7/7] ui: osd: mon: mds: warn if stop/destroy actions are problematic Aaron Lauterer
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

If an OSD is removed during the wrong conditions, it could lead to
blocked IO or worst case data loss.

Check against global flags that limit the capabilities of Ceph to heal
itself (norebalance, norecover, noout) and if there are degraded
objects.

Unfortunately, the 'safe-to-destroy' Ceph API endpoint will not help
here as it only works as long as the OSD is still running. By the time
the destroy button is enabled, the OSD will already be stopped.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
changes:
* incorporated most nits
* did not "improve" deconstruction of API call results as it would still
  be quite noisy but a little less readable IMHO.

 www/manager6/ceph/OSD.js | 69 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/www/manager6/ceph/OSD.js b/www/manager6/ceph/OSD.js
index 78f226ff..b36787f5 100644
--- a/www/manager6/ceph/OSD.js
+++ b/www/manager6/ceph/OSD.js
@@ -178,6 +178,20 @@ Ext.define('PVE.CephRemoveOsd', {
 	    labelWidth: 130,
 	    fieldLabel: gettext('Cleanup Disks'),
 	},
+	{
+	    xtype: 'displayfield',
+	    name: 'osd-flag-hint',
+	    userCls: 'pmx-hint',
+	    value: gettext('Global flags limiting the self healing of Ceph are enabled.'),
+	    hidden: true,
+	},
+	{
+	    xtype: 'displayfield',
+	    name: 'degraded-objects-hint',
+	    userCls: 'pmx-hint',
+	    value: gettext('Objects are degraded. Consider waiting until the cluster is healthy.'),
+	    hidden: true,
+	},
     ],
     initComponent: function() {
         let me = this;
@@ -198,6 +212,13 @@ Ext.define('PVE.CephRemoveOsd', {
         });
 
         me.callParent();
+
+	if (me.warnings.flags) {
+	    me.down('field[name=osd-flag-hint]').setHidden(false);
+	}
+	if (me.warnings.degraded) {
+	    me.down('field[name=degraded-objects-hint]').setHidden(false);
+	}
     },
 });
 
@@ -439,14 +460,58 @@ Ext.define('PVE.node.CephOsdTree', {
 	    }).show();
 	},
 
-	destroy_osd: function() {
+	destroy_osd: async function() {
 	    let me = this;
 	    let vm = this.getViewModel();
+
+	    let warnings = {
+		flags: false,
+		degraded: false,
+	    };
+
+	    let flagsPromise = Proxmox.Async.api2({
+		url: `/cluster/ceph/flags`,
+		method: 'GET',
+	    });
+
+	    let statusPromise = Proxmox.Async.api2({
+		url: `/cluster/ceph/status`,
+		method: 'GET',
+	    });
+
+	    me.getView().mask(gettext('Loading...'));
+
+	    try {
+		let result = await Promise.all([flagsPromise, statusPromise]);
+
+		let flagsData = result[0].result.data;
+		let statusData = result[1].result.data;
+
+		let flags = Array.from(
+		    flagsData.filter(v => v.value),
+		    v => v.name,
+		).filter(v => ['norebalance', 'norecover', 'noout'].includes(v));
+
+		if (flags.length) {
+		    warnings.flags = true;
+		}
+		if (Object.keys(statusData.pgmap).includes('degraded_objects')) {
+		    warnings.degraded = true;
+		}
+	    } catch (error) {
+		Ext.Msg.alert(gettext('Error'), error.htmlStatus);
+		me.getView().unmask();
+		return;
+	    }
+
+	    me.getView().unmask();
 	    Ext.create('PVE.CephRemoveOsd', {
 		nodename: vm.get('osdhost'),
 		osdid: vm.get('osdid'),
+		warnings: warnings,
 		taskDone: () => { me.reload(); },
-	    }).show();
+		autoShow: true,
+	    });
 	},
 
 	set_flags: function() {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [PATCH v2 manager 7/7] ui: osd: mon: mds: warn if stop/destroy actions are problematic
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
                   ` (5 preceding siblings ...)
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 6/7] ui: osd: warn if removal could be problematic Aaron Lauterer
@ 2022-03-25 10:55 ` Aaron Lauterer
  2022-11-07 13:18 ` [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
  2022-11-17 17:46 ` [pve-devel] partially-applied: " Thomas Lamprecht
  8 siblings, 0 replies; 11+ messages in thread
From: Aaron Lauterer @ 2022-03-25 10:55 UTC (permalink / raw)
  To: pve-devel

Check if stopping of a service (OSD, MON, MDS) will be problematic for
Ceph. The warning still allows the user to proceed.

Ceph also has a check if the destruction of a MON is okay, so let's use
it.

Instead of the common OK button, label it with `Stop OSD` and so forth
to hopefully reduce the "click OK by habit" incidents.

This will not catch it every time as Ceph can need a few moments after a
change to establish its current status. For example, stopping one of 3
MONs and then right away destroying one of the last two running MONs
will most likely not trigger the warning. Doing so after a few seconds
should show the warning though.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
changes:
* adapted to changed API endpoint

 www/manager6/ceph/OSD.js         |  71 ++++++++++++++++-----
 www/manager6/ceph/ServiceList.js | 104 +++++++++++++++++++++++++++----
 2 files changed, 145 insertions(+), 30 deletions(-)

diff --git a/www/manager6/ceph/OSD.js b/www/manager6/ceph/OSD.js
index b36787f5..6f7e2159 100644
--- a/www/manager6/ceph/OSD.js
+++ b/www/manager6/ceph/OSD.js
@@ -527,23 +527,60 @@ Ext.define('PVE.node.CephOsdTree', {
 	    let me = this;
 	    let vm = this.getViewModel();
 	    let cmd = comp.cmd || comp;
-	    Proxmox.Utils.API2Request({
-                url: "/nodes/" + vm.get('osdhost') + "/ceph/" + cmd,
-		params: { service: "osd." + vm.get('osdid') },
-		waitMsgTarget: me.getView(),
-		method: 'POST',
-		success: function(response, options) {
-		    let upid = response.result.data;
-		    let win = Ext.create('Proxmox.window.TaskProgress', {
-			upid: upid,
-			taskDone: () => { me.reload(); },
-		    });
-		    win.show();
-		},
-		failure: function(response, opts) {
-		    Ext.Msg.alert(gettext('Error'), response.htmlStatus);
-		},
-	    });
+
+	    let doRequest = function() {
+		Proxmox.Utils.API2Request({
+		    url: `/nodes/${vm.get('osdhost')}/ceph/${cmd}`,
+		    params: { service: "osd." + vm.get('osdid') },
+		    waitMsgTarget: me.getView(),
+		    method: 'POST',
+		    success: function(response, options) {
+			let upid = response.result.data;
+			let win = Ext.create('Proxmox.window.TaskProgress', {
+			    upid: upid,
+			    taskDone: () => { me.reload(); },
+			});
+			win.show();
+		    },
+		    failure: function(response, opts) {
+			Ext.Msg.alert(gettext('Error'), response.htmlStatus);
+		    },
+		});
+	    };
+
+	    if (cmd === "stop") {
+		Proxmox.Utils.API2Request({
+		    url: `/nodes/${vm.get('osdhost')}/ceph/cmd-safety`,
+		    params: {
+			service: 'osd',
+			id: vm.get('osdid'),
+			action: 'stop',
+		    },
+		    waitMsgTarget: me.getView(),
+		    method: 'GET',
+		    success: function({ result: { data } }) {
+			if (!data.safe) {
+			    Ext.Msg.show({
+				title: gettext('Warning'),
+				message: data.status,
+				icon: Ext.Msg.WARNING,
+				buttons: Ext.Msg.OKCANCEL,
+				buttonText: { ok: gettext('Stop OSD') },
+				fn: function(selection) {
+				    if (selection === 'ok') {
+					doRequest();
+				    }
+				},
+			    });
+			} else {
+			    doRequest();
+			}
+		    },
+		    failure: response => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
+		});
+	    } else {
+		doRequest();
+	    }
 	},
 
 	set_selection_status: function(tp, selection) {
diff --git a/www/manager6/ceph/ServiceList.js b/www/manager6/ceph/ServiceList.js
index 9298974e..8b51fe6a 100644
--- a/www/manager6/ceph/ServiceList.js
+++ b/www/manager6/ceph/ServiceList.js
@@ -148,19 +148,57 @@ Ext.define('PVE.node.CephServiceController', {
 	    Ext.Msg.alert(gettext('Error'), "entry has no host");
 	    return;
 	}
-	Proxmox.Utils.API2Request({
-				  url: `/nodes/${rec.data.host}/ceph/${cmd}`,
-				  method: 'POST',
-				  params: { service: view.type + '.' + rec.data.name },
-				  success: function(response, options) {
-				      Ext.create('Proxmox.window.TaskProgress', {
-					  autoShow: true,
-					  upid: response.result.data,
-					  taskDone: () => view.rstore.load(),
-				      });
-				  },
-				  failure: (response, _opts) => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
-	});
+	let doRequest = function() {
+	    Proxmox.Utils.API2Request({
+		url: `/nodes/${rec.data.host}/ceph/${cmd}`,
+		method: 'POST',
+		params: { service: view.type + '.' + rec.data.name },
+		success: function(response, options) {
+		    Ext.create('Proxmox.window.TaskProgress', {
+			autoShow: true,
+			upid: response.result.data,
+			taskDone: () => view.rstore.load(),
+		    });
+		},
+		failure: (response, _opts) => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
+	    });
+	};
+	if (cmd === "stop" && ['mon', 'mds'].includes(view.type)) {
+	    Proxmox.Utils.API2Request({
+		url: `/nodes/${rec.data.host}/ceph/cmd-safety`,
+		params: {
+		    service: view.type,
+		    id: rec.data.name,
+		    action: 'stop',
+		},
+		method: 'GET',
+		success: function({ result: { data } }) {
+		    let stopText = {
+			mon: gettext('Stop MON'),
+			mds: gettext('Stop MDS'),
+		    };
+		    if (!data.safe) {
+			Ext.Msg.show({
+			    title: gettext('Warning'),
+			    message: data.status,
+			    icon: Ext.Msg.WARNING,
+			    buttons: Ext.Msg.OKCANCEL,
+			    buttonText: { ok: stopText[view.type] },
+			    fn: function(selection) {
+				if (selection === 'ok') {
+				    doRequest();
+				}
+			    },
+			});
+		    } else {
+			doRequest();
+		    }
+		},
+		failure: (response, _opts) => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
+	    });
+	} else {
+	    doRequest();
+	}
     },
     onChangeService: function(button) {
 	let me = this;
@@ -273,6 +311,46 @@ Ext.define('PVE.node.CephServiceList', {
 		    taskDone: () => view.rstore.load(),
 		});
 	    },
+	    handler: function(btn, event, rec) {
+		let me = this;
+		let view = me.up('grid');
+		let doRequest = function() {
+		    Proxmox.button.StdRemoveButton.prototype.handler.call(me, btn, event, rec);
+		};
+		if (view.type === 'mon') {
+		    Proxmox.Utils.API2Request({
+			url: `/nodes/${rec.data.host}/ceph/cmd-safety`,
+			params: {
+			    service: view.type,
+			    id: rec.data.name,
+			    action: 'destroy',
+			},
+			method: 'GET',
+			success: function({ result: { data } }) {
+			    if (!data.safe) {
+				Ext.Msg.show({
+				    title: gettext('Warning'),
+				    message: data.status,
+				    icon: Ext.Msg.WARNING,
+				    buttons: Ext.Msg.OKCANCEL,
+				    buttonText: { ok: gettext('Destroy MON') },
+				    fn: function(selection) {
+					if (selection === 'ok') {
+					    doRequest();
+					}
+				    },
+				});
+			    } else {
+				doRequest();
+			    }
+			},
+			failure: (response, _opts) => Ext.Msg.alert(gettext('Error'), response.htmlStatus),
+		    });
+		} else {
+		    doRequest();
+		}
+	    },
+
 	},
 	'-',
 	{
-- 
2.30.2





^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
                   ` (6 preceding siblings ...)
  2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 7/7] ui: osd: mon: mds: warn if stop/destroy actions are problematic Aaron Lauterer
@ 2022-11-07 13:18 ` Aaron Lauterer
  2022-11-17 10:35   ` Thomas Lamprecht
  2022-11-17 17:46 ` [pve-devel] partially-applied: " Thomas Lamprecht
  8 siblings, 1 reply; 11+ messages in thread
From: Aaron Lauterer @ 2022-11-07 13:18 UTC (permalink / raw)
  To: pve-devel

ping?

If we are okay with the way this would change RADOS.xs and RADOS.pm, I can send 
a follow up for patch 3 and 4 as a few other places started to issue mon 
commands in the meantime.

There is also the currently pending "Ceph OSD: add detail infos" patch series 
which either could be included in the follow up patch, or will need its own, 
depending on what gets applied first.


[0] https://lists.proxmox.com/pipermail/pve-devel/2022-October/054375.html

On 3/25/22 11:55, Aaron Lauterer wrote:
> The main motivation behind this series is to leverage several safety
> checks that Ceph has to make sure it is ok to stop or destroy a service.
> 
> For this to work, the librados2-perl needs to be adapted because if it
> is unsafe to perform one of the actions, the Ceph API will return an
> error with an explanation in the status message.
> 
> The approach chosen is to simplify the bindings and handle errors on the
> Perl side. the RADOS.pm::mon_command now always returns a hash ref
> containing
> 
> * return code
> * status message
> * data
> 
> This makes it necessary to adapt packages that depend on librados2-perl
> to handle the changed return values: storage and manager.
> 
> Therefore we need to coordinate breaking dependencies and releases of
> these packages!
> 
> A new cmd-safety endpoint is added which is called from the GUI wherever
> possible to show a warning.
> 
> 
> changes:
> 
> * simplify librados2-perl changes and always return hash ref instead of
>    wantarray
> * one central cmd-safety API endpoint instead of multiple ones per
>    service type
> 
> librados2-perl: Aaron Lauterer (2):
>    mon_command: refactor to pass all data to perl
>    mon_command: optionally ignore errors
> 
>   PVE/RADOS.pm | 24 +++++++++++++++++++-----
>   RADOS.xs     | 18 ++++++------------
>   2 files changed, 25 insertions(+), 17 deletions(-)
> 
> 
> storage:Aaron Lauterer (1):
>    rbd: adapt to changed rados mon_command return values
> 
>   PVE/Storage/RBDPlugin.pm | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> manager: Aaron Lauterer (4):
>    ceph: adapt to changed rados mon_command return values
>    api: ceph: add cmd-safety endpoint
>    ui: osd: warn if removal could be problematic
>    ui: osd: mon: mds: warn if stop/destroy actions are problematic
> 
>   PVE/API2/Ceph.pm                 | 102 +++++++++++++++++++++-
>   PVE/API2/Ceph/MGR.pm             |   2 +-
>   PVE/API2/Ceph/MON.pm             |  19 +++--
>   PVE/API2/Ceph/OSD.pm             |  14 ++--
>   PVE/API2/Ceph/Pools.pm           |  18 ++--
>   PVE/API2/Cluster/Ceph.pm         |   6 +-
>   PVE/CLI/pveceph.pm               |   4 +-
>   PVE/Ceph/Services.pm             |  10 +--
>   PVE/Ceph/Tools.pm                |  16 ++--
>   www/manager6/ceph/OSD.js         | 140 ++++++++++++++++++++++++++-----
>   www/manager6/ceph/ServiceList.js | 104 ++++++++++++++++++++---
>   11 files changed, 360 insertions(+), 75 deletions(-)
> 
> 




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks
  2022-11-07 13:18 ` [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
@ 2022-11-17 10:35   ` Thomas Lamprecht
  0 siblings, 0 replies; 11+ messages in thread
From: Thomas Lamprecht @ 2022-11-17 10:35 UTC (permalink / raw)
  To: Proxmox VE development discussion, Aaron Lauterer

Am 07/11/2022 um 14:18 schrieb Aaron Lauterer:
> If we are okay with the way this would change RADOS.xs and RADOS.pm, I can send a follow up for patch 3 and 4 as a few other places started to issue mon commands in the meantime.

how about renaming mon_command to mon_cmd and re-introduce mon_command as
compat wrapper around mon_cmd?

Would at least avoid the need to break older manager and qemu-server, and
we could keep using the old method call for sites that don't care?




^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] partially-applied: [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks
  2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
                   ` (7 preceding siblings ...)
  2022-11-07 13:18 ` [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
@ 2022-11-17 17:46 ` Thomas Lamprecht
  8 siblings, 0 replies; 11+ messages in thread
From: Thomas Lamprecht @ 2022-11-17 17:46 UTC (permalink / raw)
  To: Proxmox VE development discussion, Aaron Lauterer

Am 25/03/2022 um 11:55 schrieb Aaron Lauterer:
> librados2-perl: Aaron Lauterer (2):
>   mon_command: refactor to pass all data to perl
>   mon_command: optionally ignore errors
> 
>  PVE/RADOS.pm | 24 +++++++++++++++++++-----
>  RADOS.xs     | 18 ++++++------------
>  2 files changed, 25 insertions(+), 17 deletions(-)

applied these two, with compat wrapper as discussed offlist, thanks!




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-11-17 17:46 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-25 10:55 [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 1/7] mon_command: refactor to pass all data to perl Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 librados2-perl 2/7] mon_command: optionally ignore errors Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 storage 3/7] rbd: adapt to changed rados mon_command return values Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 4/7] ceph: " Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 5/7] api: ceph: add cmd-safety endpoint Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 6/7] ui: osd: warn if removal could be problematic Aaron Lauterer
2022-03-25 10:55 ` [pve-devel] [PATCH v2 manager 7/7] ui: osd: mon: mds: warn if stop/destroy actions are problematic Aaron Lauterer
2022-11-07 13:18 ` [pve-devel] [PATCH v2 librados2-perl storage manager 0/7] Add Ceph safety checks Aaron Lauterer
2022-11-17 10:35   ` Thomas Lamprecht
2022-11-17 17:46 ` [pve-devel] partially-applied: " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal