From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 074DDBA5C0 for ; Tue, 19 Mar 2024 18:41:58 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id D31ED51DF for ; Tue, 19 Mar 2024 18:41:27 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Tue, 19 Mar 2024 18:41:26 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 0ABB548AEE for ; Tue, 19 Mar 2024 18:41:26 +0100 (CET) Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=UTF-8 Date: Tue, 19 Mar 2024 18:41:23 +0100 Message-Id: From: "Max Carrara" To: "Proxmox VE development discussion" X-Mailer: aerc 0.17.0-72-g6a84f1331f1c References: <20240305150758.252669-1-m.carrara@proxmox.com> <20240305150758.252669-16-m.carrara@proxmox.com> <1710841288.jsqryni90l.astroid@yuna.none> In-Reply-To: <1710841288.jsqryni90l.astroid@yuna.none> X-SPAM-LEVEL: Spam detection results: 0 AWL -0.375 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_ASCII_DIVIDERS 0.8 Email that uses ascii formatting dividers and possible spam tricks KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [mon.pm, tools.pm, proxmox.com] Subject: Re: [pve-devel] [PATCH v4 pve-manager 15/16] fix #4759: ceph: configure ceph-crash.service and its key X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Mar 2024 17:41:58 -0000 On Tue Mar 19, 2024 at 11:04 AM CET, Fabian Gr=C3=BCnbichler wrote: > On March 5, 2024 4:07 pm, Max Carrara wrote: > > Due to Ceph dropping privileges when running the 'ceph-crash' daemon > > [0], it is necessary to allow the daemon to authenticate with its > > cluster in a safe manner. > >=20 > > In order to avoid exposing sensitive keyrings or somehow escalating > > its privileges again, 'ceph-crash' is therefore provided with its own > > keyring in the '/etc/pve/ceph' directory. This directory, due to being > > on 'pmxcfs', may be read by members of the 'www-data' group, which > > 'ceph-crash' is made part of [1]. > >=20 > >=20 > > Expected Configuration > > ---------------------- > >=20 > > 1. A keyring file named '/etc/pve/ceph/ceph.client.crash.keyring' > > exists > > 2. A section named 'client.crash' exists in '/etc/pve/ceph.conf' > > 3. The 'client.crash' section has a key named 'keyring' which > > references the keyring file as '/etc/pve/ceph/$cluster.$name.keyrin= g' > > 4. The 'client.crash' section has *no* key named 'key' > >=20 > >=20 > > New Clusters > > ------------ > >=20 > > The keyring file is created and the conf file is updated after the firs= t > > monitor has been created (when calling `pveceph mon create`). > >=20 > >=20 > > Existing Clusters > > ----------------- > >=20 > > A new helper script creates and configures the 'client.crash' keyring i= n > > `postinst`, if: > > * Ceph is installed > > * Ceph is initialized ('/etc/pve/ceph.conf' and '/etc/pve/ceph' exist) > > * Connection to RADOS is successful > >=20 > > If the above conditions are met, the helper script ensures that the > > existing configuration matches the expected configuration mentioned > > above. > >=20 > > The configuration is not changed if it is already as expected. > >=20 > > The helper script may be called again manually if the `postinst` hook > > fails. It is installed to '/usr/share/pve-manager/helpers/pve-init-ceph= -crash'. > >=20 > >=20 > > Existing `client.crash` Key > > --------------------------- > >=20 > > If a key named 'client.crash' already exists within the cluster, it is > > reused and not regenerated.=20 > >=20 > >=20 > > [0]: https://github.com/ceph/ceph/pull/48713 > > [1]: https://git.proxmox.com/?p=3Dceph.git;a=3Dcommitdiff;h=3Df72c698a5= 5905d93e9a0b7b95674616547deba8a > >=20 > > Signed-off-by: Max Carrara > > --- > > Changes v1 --> v2: > > * fix 'keyring' key being appended to 'client.crash' section even > > if it already exists and configured correctly > > Changes v2 --> v3: > > * avoid needless second 'pmxcfs' lock on 'ceph.conf' when configuring > > 'client.crash' keyring after first monitor was created > > * change name of helper sub that creates the ceph.crash keyring > > * the helper sub now implicitly expects the /etc/pve/ceph directory t= o > > exist (as the previous commit enforces its existence) > > * remove BASH part of postinst hook related to `ceph-crash` > > * add Perl helper script and call that instead in postinst hook > > * reword commit message > > Changes v3 --> v4: > > * ensure key named 'key' does not exist for 'client.crash' section > > in helper script (thanks Friedrich!) > > * restart ceph-crash.service if it wasn't disabled (thanks Friedrich!= ) > > * add empty line before and after making changes in `update_ceph_conf= ` > > in 'postinst' hook > > * clean up the helper script; logic is easier to follow now > > * increase verbosity of output of helper script, showing what's > > exactly done when invoked > >=20 > > PVE/API2/Ceph/MON.pm | 8 +++ > > PVE/Ceph/Tools.pm | 35 +++++++++++ > > bin/Makefile | 1 + > > bin/pve-init-ceph-crash | 129 ++++++++++++++++++++++++++++++++++++++++ > > debian/postinst | 14 +++++ > > 5 files changed, 187 insertions(+) > > create mode 100755 bin/pve-init-ceph-crash > >=20 > > diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm > > index 1e959ef3..8b7ce376 100644 > > --- a/PVE/API2/Ceph/MON.pm > > +++ b/PVE/API2/Ceph/MON.pm > > @@ -450,6 +450,14 @@ __PACKAGE__->register_method ({ > > ) > > }; > > warn "$@" if $@; > > + > > + print "Configuring keyring for ceph-crash.service\n"; > > + eval { > > + PVE::Ceph::Tools::create_or_update_crash_keyring_file(); > > + $cfg->{'client.crash'}->{keyring} =3D '/etc/pve/ceph/$cluster.$name= .keyring'; > > + cfs_write_file('ceph.conf', $cfg); > > + }; > > + warn "Unable to configure keyring for ceph-crash.service: $@" if= $@; > > } > > =20 > > eval { PVE::Ceph::Services::ceph_service_cmd('enable', $monsection) = }; > > diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm > > index 735bb116..9b97171e 100644 > > --- a/PVE/Ceph/Tools.pm > > +++ b/PVE/Ceph/Tools.pm > > @@ -20,6 +20,7 @@ my $pve_ceph_cfgpath =3D "/etc/pve/$ccname.conf"; > > my $ceph_cfgpath =3D "$ceph_cfgdir/$ccname.conf"; > > my $pve_ceph_cfgdir =3D "/etc/pve/ceph"; > > =20 > > +my $pve_ceph_crash_key_path =3D "$pve_ceph_cfgdir/$ccname.client.crash= .keyring"; > > my $pve_mon_key_path =3D "/etc/pve/priv/$ccname.mon.keyring"; > > my $pve_ckeyring_path =3D "/etc/pve/priv/$ccname.client.admin.keyring"= ; > > my $ckeyring_path =3D "/etc/ceph/ceph.client.admin.keyring"; > > @@ -45,6 +46,7 @@ my $config_values =3D { > > =20 > > my $config_files =3D { > > pve_ceph_cfgpath =3D> $pve_ceph_cfgpath, > > + pve_ceph_crash_key_path =3D> $pve_ceph_crash_key_path, > > pve_mon_key_path =3D> $pve_mon_key_path, > > pve_ckeyring_path =3D> $pve_ckeyring_path, > > ceph_bootstrap_osd_keyring =3D> $ceph_bootstrap_osd_keyring, > > @@ -420,6 +422,39 @@ sub get_or_create_admin_keyring { > > return $pve_ckeyring_path; > > } > > =20 > > +# is also used in `pve-init-ceph-crash` helper > > +sub create_or_update_crash_keyring_file { > > + my ($rados) =3D @_; > > + > > + if (!defined($rados)) { > > + $rados =3D PVE::RADOS->new(); > > + } > > + > > + my $output =3D $rados->mon_command({ > > + prefix =3D> 'auth get-or-create', > > + entity =3D> 'client.crash', > > + caps =3D> [ > > + mon =3D> 'profile crash', > > + mgr =3D> 'profile crash', > > + ], > > + format =3D> 'plain', > > + }); > > + > > + if (-f $pve_ceph_crash_key_path) { > > + my $contents =3D PVE::Tools::file_get_contents($pve_ceph_crash_key_pa= th); > > + > > + if ($contents ne $output) { > > + PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output); > > + return 1; > > + } > > + } else { > > + PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output); > > + return 1; > > + } > > + > > + return 0; > > +} > > + > > # get ceph-volume managed osds > > sub ceph_volume_list { > > my $result =3D {}; > > diff --git a/bin/Makefile b/bin/Makefile > > index 06d8148e..b221e4b1 100644 > > --- a/bin/Makefile > > +++ b/bin/Makefile > > @@ -83,6 +83,7 @@ install: $(SCRIPTS) $(CLI_MANS) $(SERVICE_MANS) $(BAS= H_COMPLETIONS) $(ZSH_COMPLE > > install -m 0755 $(SCRIPTS) $(BINDIR) > > install -d $(USRSHARE)/helpers > > install -m 0755 pve-startall-delay $(USRSHARE)/helpers > > + install -m 0755 pve-init-ceph-crash $(USRSHARE)/helpers > > install -d $(MAN1DIR) > > install -m 0644 $(CLI_MANS) $(MAN1DIR) > > install -d $(MAN8DIR) > > diff --git a/bin/pve-init-ceph-crash b/bin/pve-init-ceph-crash > > new file mode 100755 > > index 00000000..c8e9217d > > --- /dev/null > > +++ b/bin/pve-init-ceph-crash > > @@ -0,0 +1,129 @@ > > +#!/usr/bin/perl > > + > > +use strict; > > +use warnings; > > + > > +use List::Util qw(first); > > + > > +use PVE::Ceph::Tools; > > +use PVE::Cluster; > > +use PVE::RADOS; > > +use PVE::RPCEnvironment; > > + > > +my $ceph_cfg_file =3D 'ceph.conf'; > > +my $keyring_value =3D '/etc/pve/ceph/$cluster.$name.keyring'; > > + > > +my $entity =3D 'client.crash'; > > nit: this could be inlined? Inlined as in use 'client.crash' as the key directly? Eh, I'm more a fan of using a variable here, as constantly spelling it out while changing a bunch of things gets a little painful ... If you meant that I should put it in `try_adapt_cfg` - sure, I missed that that's the only `sub` in which it's being used, woops! > > > + > > + > > +sub try_adapt_cfg { > > + my ($cfg) =3D @_; > > + > > + my $changed =3D 0; > > + print("Checking whether the configuration for '$entity' needs to b= e updated.\n"); > > + > > + my $add_keyring =3D sub { > > + print("Setting keyring path to '$keyring_value'.\n"); > > + $cfg->{$entity}->{keyring} =3D $keyring_value; > > + $changed =3D 1; > > this can be removed, you always return right after calling this sub.. > > > + }; > > + > > + if (!exists($cfg->{$entity})) { > > + print("Adding missing section for '$entity'.\n"); > > + $add_keyring->(); > > + return $changed; > > and all of these can be changed to uncondtionally return 1 Fair! > > > + } > > + > > + if (exists($cfg->{$entity}->{key})) { > > + print("Removing existing usage of key.\n"); > > + delete($cfg->{$entity}->{key}); > > + $changed =3D 1; > > + } > > + > > + if (!exists($cfg->{$entity}->{keyring})) { > > + print("Keyring path is missing from configuration.\n"); > > + $add_keyring->(); > > + return $changed; > > + } > > + > > + my $current_keyring_value =3D $cfg->{$entity}->{keyring}; > > + if ($current_keyring_value ne $keyring_value) { > > + print("Current keyring path differs from expected path.\n"); > > + $add_keyring->(); > > + return $changed; > > + } > > + > > + return $changed; > > $changed only serves the purpose of returning 1 iff only the 'key' > value/entry was removed. so it could be renamed to reflect that ;) Good point. > > > +} > > + > > + > > +sub main { > > + my $rpcenv =3D PVE::RPCEnvironment->init('cli'); > > + > > + $rpcenv->init_request(); > > + $rpcenv->set_language($ENV{LANG}); > > + $rpcenv->set_user('root@pam'); > > why do we need an rpcenv here? Double-checked, just to be sure: `librados-perl` requires an `RPCEnvironment` to do some handling regarding forks - `RPCEnvironment::get()` will die if no env was initialized. > > > + > > + die "Not running as root. Exiting.\n" if $> !=3D 0; > > + > > + if (!PVE::Ceph::Tools::check_ceph_installed('ceph_bin', 1)) { > > + print("Ceph is not installed. No action required.\n"); > > + exit 0; > > + } > > + > > + eval { > > + PVE::Ceph::Tools::check_ceph_inited(); > > + }; > > + if ($@) { > > + print("Ceph is not initialized. No action required.\n"); > > + exit 0; > > + } > > + > > + my $rados =3D eval { PVE::RADOS->new() }; > > + $@ =3D ''; # warnings are displayed regardless > > what's this line for? Now that you mention it, I think this was a remnant from the previous series, where the error handling was different. Will most likely remove this in v5. > > > + my $ceph_crash_key_path =3D PVE::Ceph::Tools::get_config('pve_ceph= _crash_key_path'); > > + > > + PVE::Cluster::cfs_lock_file($ceph_cfg_file, undef, sub { > > if you let this sub return a boolean value (worked/failed) > > > + my $cfg =3D PVE::Cluster::cfs_read_file($ceph_cfg_file); > > + > > + if (!defined($rados)) { > > + my $has_mon_config =3D defined(first { m/mon\..*/ } keys $cfg->%*= ); > > + if ($has_mon_config) { > > + die "Connection to RADOS failed even though a monitor is configured.= \n" . > > + "Please verify whether your configuration is correct.\n" > > + } > > + > > + print( > > + "Connection to RADOS failed and no monitor is configured. ". > > + "Assume that things are fine. No action required.\n" > > + ); > > not sure if it wouldn't be better to check mon_host here? that's the > thing actually needed to contact the mon, everything else is optional.. > and even that could be moved to DNS, but we don't support that for > PVE-managed clusters.. Hmm... I'll see what I can do / how it behaves. > > > + return; > > + } > > + > > + my $updated_keyring =3D PVE::Ceph::Tools::create_or_update_crash_keyr= ing_file($rados); > > + > > + if ($updated_keyring) { > > + print("Keyring file '$ceph_crash_key_path' was updated.\n"); > > + } > > + > > + my $changed =3D try_adapt_cfg($cfg); > > + > > + if ($changed) { > > + print("Committing updated configuration.\n"); > > + PVE::Cluster::cfs_write_file($ceph_cfg_file, $cfg); > > + print("Successfully updated configuration for 'ceph-crash.service= '.\n"); > > + } else { > > + print("Configuration does not need to be updated.\n"); > > + } > > + }); > > then you can just check the return value (undef means cfs_lock set an > error, true means it worked, false means it failed without dieing, or > whatever you want to return) Thanks for the suggestion, will incorporate this in v5! > > > + if ($@) { > > + warn("Failed to configure keyring for 'ceph-crash.service'. Error:\n"= ); > > nit: I think I'd prefer the "Error: to be on the next line with the error= message.." ACK ;) > > > + warn("$@"); > > + exit 1; > > + } > > +} > > + > > +main(); > > + > > +0; > > why? Not sure anymore... my bad! I think this was part of some module testing that I was doing (where it's a convention to put a `1;` at the end of the file to confirm that the module loaded successfully). Regardless, this shouldn't be here, so will remove it in v5. > > > diff --git a/debian/postinst b/debian/postinst > > index 61b50f97..c36afa13 100755 > > --- a/debian/postinst > > +++ b/debian/postinst > > @@ -82,10 +82,24 @@ EOF > > =20 > > update_ceph_conf() { > > CEPH_CONF_DIR=3D'/etc/pve/ceph' > > + UNIT=3D'ceph-crash.service' > > + > > + echo "" > > =20 > > if test ! -d "${CEPH_CONF_DIR}"; then > > mkdir -p "${CEPH_CONF_DIR}" > > fi > > + > > + # Don't fail in case user has "exotic" configuration where RADOS > > + # isn't available on all nodes for some reason > > + /usr/share/pve-manager/helpers/pve-init-ceph-crash || true > > + > > + if systemctl -q is-enabled "$UNIT"; then > > like Friedrich said, please silence this (`2>/dev/null`) ACK! > > > + echo "Restarting '$UNIT'" > > + deb-systemd-invoke restart "$UNIT" > > || true please! ACK! .. and thanks for the thorough review! > > > + fi > > + > > + echo "" > > } > > =20 > > migrate_apt_auth_conf() { > > --=20 > > 2.39.2 > >=20 > >=20 > >=20 > > _______________________________________________ > > pve-devel mailing list > > pve-devel@lists.proxmox.com > > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > >=20 > >=20 > >=20 > > > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel