From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id F1B809074C for ; Tue, 2 Apr 2024 16:57:06 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id D4DC884BC for ; Tue, 2 Apr 2024 16:57:06 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Tue, 2 Apr 2024 16:57:05 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 50553446D3 for ; Tue, 2 Apr 2024 16:57:05 +0200 (CEST) From: Max Carrara To: pve-devel@lists.proxmox.com Date: Tue, 2 Apr 2024 16:55:22 +0200 Message-Id: <20240402145523.683008-11-m.carrara@proxmox.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240402145523.683008-1-m.carrara@proxmox.com> References: <20240402145523.683008-1-m.carrara@proxmox.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL 0.022 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: [pve-devel] [PATCH v5 pve-manager 10/11] fix #4759: ceph: configure ceph-crash.service and its key X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Apr 2024 14:57:07 -0000 Due to Ceph dropping privileges when running the 'ceph-crash' daemon [0], it is necessary to allow the daemon to authenticate with its cluster in a safe manner. In order to avoid exposing sensitive keyrings or somehow escalating its privileges again, 'ceph-crash' is therefore provided with its own keyring in the '/etc/pve/ceph' directory. This directory, due to being on 'pmxcfs', may be read by members of the 'www-data' group, which 'ceph-crash' is made part of [1]. Expected Configuration ---------------------- 1. A keyring file named '/etc/pve/ceph/ceph.client.crash.keyring' exists 2. A section named 'client.crash' exists in '/etc/pve/ceph.conf' 3. The 'client.crash' section has a key named 'keyring' which references the keyring file as '/etc/pve/ceph/$cluster.$name.keyring' 4. The 'client.crash' section has *no* key named 'key' New Clusters ------------ The keyring file is created and the conf file is updated after the first monitor has been created (when calling `pveceph mon create`). Existing Clusters ----------------- A new helper script creates and configures the 'client.crash' keyring in `postinst`, if: * Ceph is installed * Ceph is initialized ('/etc/pve/ceph.conf' and '/etc/pve/ceph' exist) * Connection to RADOS is successful If the above conditions are met, the helper script ensures that the existing configuration matches the expected configuration mentioned above. The configuration is not changed if it is already as expected. The helper script may be called again manually if the `postinst` hook fails. It is installed to '/usr/share/pve-manager/helpers/pve-init-ceph-crash'. Existing `client.crash` Key --------------------------- If a key named 'client.crash' already exists within the cluster, it is reused and not regenerated. [0]: https://github.com/ceph/ceph/pull/48713 [1]: https://git.proxmox.com/?p=ceph.git;a=commitdiff;h=f72c698a55905d93e9a0b7b95674616547deba8a Signed-off-by: Max Carrara --- Changes v1 --> v2: * fix 'keyring' key being appended to 'client.crash' section even if it already exists and configured correctly Changes v2 --> v3: * avoid needless second 'pmxcfs' lock on 'ceph.conf' when configuring 'client.crash' keyring after first monitor was created * change name of helper sub that creates the ceph.crash keyring * the helper sub now implicitly expects the /etc/pve/ceph directory to exist (as the previous commit enforces its existence) * remove BASH part of postinst hook related to `ceph-crash` * add Perl helper script and call that instead in postinst hook * reword commit message Changes v3 --> v4: * ensure key named 'key' does not exist for 'client.crash' section in helper script (thanks Friedrich!) * restart ceph-crash.service if it wasn't disabled (thanks Friedrich!) * add empty line before and after making changes in `update_ceph_conf` in 'postinst' hook * clean up the helper script; logic is easier to follow now * increase verbosity of output of helper script, showing what's exactly done when invoked Changes v4 --> v5: * helper script: improve error handling (thanks Fabian!) * helper script: improve overall control flow (thanks Fabian!) * helper script: add comment regarding RPC env and RADOS object * helper script: improve overall command output and make it more obvious when and what something is modified * helper script: improve error output and differ between errors regarding pmxcfs and altering the configuration * helper script: remove unnecessary code * postinst: suppress output to stderr when checking whether 'ceph-crash.service' is enabled (thanks Friedrich!) * postinst: ensure `deb-systemd-invoke` never fails (thanks Fabian!) PVE/API2/Ceph/MON.pm | 8 +++ PVE/Ceph/Tools.pm | 35 ++++++++++ bin/Makefile | 1 + bin/pve-init-ceph-crash | 142 ++++++++++++++++++++++++++++++++++++++++ debian/postinst | 13 ++++ 5 files changed, 199 insertions(+) create mode 100755 bin/pve-init-ceph-crash diff --git a/PVE/API2/Ceph/MON.pm b/PVE/API2/Ceph/MON.pm index 00b8f503..60c8d8dd 100644 --- a/PVE/API2/Ceph/MON.pm +++ b/PVE/API2/Ceph/MON.pm @@ -451,6 +451,14 @@ __PACKAGE__->register_method ({ ) }; warn "$@" if $@; + + print "Configuring keyring for ceph-crash.service\n"; + eval { + PVE::Ceph::Tools::create_or_update_crash_keyring_file(); + $cfg->{'client.crash'}->{keyring} = '/etc/pve/ceph/$cluster.$name.keyring'; + cfs_write_file('ceph.conf', $cfg); + }; + warn "Unable to configure keyring for ceph-crash.service: $@" if $@; } eval { PVE::Ceph::Services::ceph_service_cmd('enable', $monsection) }; diff --git a/PVE/Ceph/Tools.pm b/PVE/Ceph/Tools.pm index 735bb116..9b97171e 100644 --- a/PVE/Ceph/Tools.pm +++ b/PVE/Ceph/Tools.pm @@ -20,6 +20,7 @@ my $pve_ceph_cfgpath = "/etc/pve/$ccname.conf"; my $ceph_cfgpath = "$ceph_cfgdir/$ccname.conf"; my $pve_ceph_cfgdir = "/etc/pve/ceph"; +my $pve_ceph_crash_key_path = "$pve_ceph_cfgdir/$ccname.client.crash.keyring"; my $pve_mon_key_path = "/etc/pve/priv/$ccname.mon.keyring"; my $pve_ckeyring_path = "/etc/pve/priv/$ccname.client.admin.keyring"; my $ckeyring_path = "/etc/ceph/ceph.client.admin.keyring"; @@ -45,6 +46,7 @@ my $config_values = { my $config_files = { pve_ceph_cfgpath => $pve_ceph_cfgpath, + pve_ceph_crash_key_path => $pve_ceph_crash_key_path, pve_mon_key_path => $pve_mon_key_path, pve_ckeyring_path => $pve_ckeyring_path, ceph_bootstrap_osd_keyring => $ceph_bootstrap_osd_keyring, @@ -420,6 +422,39 @@ sub get_or_create_admin_keyring { return $pve_ckeyring_path; } +# is also used in `pve-init-ceph-crash` helper +sub create_or_update_crash_keyring_file { + my ($rados) = @_; + + if (!defined($rados)) { + $rados = PVE::RADOS->new(); + } + + my $output = $rados->mon_command({ + prefix => 'auth get-or-create', + entity => 'client.crash', + caps => [ + mon => 'profile crash', + mgr => 'profile crash', + ], + format => 'plain', + }); + + if (-f $pve_ceph_crash_key_path) { + my $contents = PVE::Tools::file_get_contents($pve_ceph_crash_key_path); + + if ($contents ne $output) { + PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output); + return 1; + } + } else { + PVE::Tools::file_set_contents($pve_ceph_crash_key_path, $output); + return 1; + } + + return 0; +} + # get ceph-volume managed osds sub ceph_volume_list { my $result = {}; diff --git a/bin/Makefile b/bin/Makefile index 06d8148e..b221e4b1 100644 --- a/bin/Makefile +++ b/bin/Makefile @@ -83,6 +83,7 @@ install: $(SCRIPTS) $(CLI_MANS) $(SERVICE_MANS) $(BASH_COMPLETIONS) $(ZSH_COMPLE install -m 0755 $(SCRIPTS) $(BINDIR) install -d $(USRSHARE)/helpers install -m 0755 pve-startall-delay $(USRSHARE)/helpers + install -m 0755 pve-init-ceph-crash $(USRSHARE)/helpers install -d $(MAN1DIR) install -m 0644 $(CLI_MANS) $(MAN1DIR) install -d $(MAN8DIR) diff --git a/bin/pve-init-ceph-crash b/bin/pve-init-ceph-crash new file mode 100755 index 00000000..89170b2e --- /dev/null +++ b/bin/pve-init-ceph-crash @@ -0,0 +1,142 @@ +#!/usr/bin/perl + +use strict; +use warnings; + +use List::Util qw(first); + +use PVE::Ceph::Tools; +use PVE::Cluster; +use PVE::RADOS; +use PVE::RPCEnvironment; + +my $ceph_cfg_file = 'ceph.conf'; +my $keyring_value = '/etc/pve/ceph/$cluster.$name.keyring'; + + +sub try_adapt_cfg { + my ($cfg) = @_; + + my $entity = 'client.crash'; + my $removed_key = 0; + + print("Checking whether the configuration for '$entity' needs to be updated.\n"); + + my $add_keyring = sub { + print("Setting keyring path to '$keyring_value'.\n"); + $cfg->{$entity}->{keyring} = $keyring_value; + }; + + if (!exists($cfg->{$entity})) { + print("Adding missing section for '$entity'.\n"); + $add_keyring->(); + return 1; + } + + if (exists($cfg->{$entity}->{key})) { + print("Removing existing usage of key.\n"); + delete($cfg->{$entity}->{key}); + $removed_key = 1; + } + + if (!exists($cfg->{$entity}->{keyring})) { + print("Keyring path is missing from configuration.\n"); + $add_keyring->(); + return 1; + } + + my $current_keyring_value = $cfg->{$entity}->{keyring}; + if ($current_keyring_value ne $keyring_value) { + print("Current keyring path differs from expected path.\n"); + $add_keyring->(); + return 1; + } + + return $removed_key; +} + + +sub main { + # PVE::RADOS expects an active RPC Environment because it forks itself + # and may want to clean up after + my $rpcenv = PVE::RPCEnvironment->setup_default_cli_env(); + + if (!PVE::Ceph::Tools::check_ceph_installed('ceph_bin', 1)) { + print("Ceph is not installed. No action required.\n"); + exit 0; + } + + eval { + PVE::Ceph::Tools::check_ceph_inited(); + }; + if ($@) { + print("Ceph is not initialized. No action required.\n"); + exit 0; + } + + my $rados = eval { PVE::RADOS->new() }; + my $ceph_crash_key_path = PVE::Ceph::Tools::get_config('pve_ceph_crash_key_path'); + + my $inner_err = ''; + + my $rval = PVE::Cluster::cfs_lock_file($ceph_cfg_file, undef, sub { + eval { + my $cfg = PVE::Cluster::cfs_read_file($ceph_cfg_file); + + if (!defined($rados)) { + my $has_mon_host = defined($cfg->{global}) && defined($cfg->{global}->{mon_host}); + if ($has_mon_host && $cfg->{global}->{mon_host} ne '') { + die "Connection to RADOS failed even though a monitor is configured.\n" . + "Please verify whether your configuration in '$ceph_cfg_file' is correct.\n" + } + + print( + "Connection to RADOS failed and no monitor is configured in '$ceph_cfg_file'.\n". + "Assuming that things are fine. No action required.\n" + ); + return; + } + + my $updated_keyring = PVE::Ceph::Tools::create_or_update_crash_keyring_file($rados); + + if ($updated_keyring) { + print("Keyring file '$ceph_crash_key_path' was updated.\n"); + } + + my $changed = try_adapt_cfg($cfg); + + if ($changed) { + print("Committing updated configuration to '$ceph_cfg_file'.\n"); + PVE::Cluster::cfs_write_file($ceph_cfg_file, $cfg); + print("Successfully updated configuration for 'ceph-crash.service'.\n"); + } else { + print("Configuration in '$ceph_cfg_file' does not need to be updated.\n"); + } + }; + $inner_err = $@; + + return 1; + }); + + # cfs_lock_file sets $@ explicitly to undef + my $err = $@ // ''; + + my $has_err = !defined($rval) || $inner_err || $err; + + if ($has_err) { + $err =~ s/\n*$//; + $inner_err =~ s/\n*$//; + + if (!defined($rval)) { + warn("Error while acquiring or releasing lock for '$ceph_cfg_file'.\n"); + warn("Error: $err\n") if $err ne ''; + } + + warn("Failed to configure keyring for 'ceph-crash.service'.\nError: $inner_err\n") + if $inner_err ne ''; + + exit 1; + } +} + +main(); diff --git a/debian/postinst b/debian/postinst index 58f2713f..2bd71459 100755 --- a/debian/postinst +++ b/debian/postinst @@ -82,10 +82,23 @@ EOF update_ceph_conf() { CEPH_CONF_DIR='/etc/pve/ceph' + UNIT='ceph-crash.service' + + echo "" if test ! -d "${CEPH_CONF_DIR}"; then mkdir -p "${CEPH_CONF_DIR}" fi + + # Don't fail in case user has "exotic" configuration where RADOS + # isn't available on all nodes for some reason + /usr/share/pve-manager/helpers/pve-init-ceph-crash || true + + if systemctl -q is-enabled "$UNIT" 2> /dev/null; then + deb-systemd-invoke restart "$UNIT" || true + fi + + echo "" } migrate_apt_auth_conf() { -- 2.39.2