From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Date: Thu, 29 Jun 2023 15:59:33 +0200
Message-Id: <20230629135935.62588-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [RFC cluster 1/2] pvecm: updatecerts: allow specifying time to wait for quorum via CLI argument
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion
X-List-Received-Date: Thu, 29 Jun 2023 13:59:40 -0000

Useful for the updatecerts call triggered via the ExecStartPre hook for
pveproxy.service. When starting a node that's part of a cluster, there
is a time window between the start of pve-cluster.service and when
quorum is reached (from the node's perspective). pveproxy.service is
ordered after pve-cluster.service, but that does not prevent the
ExecStartPre hook from being executed before the node is part of the
quorate partition.

The pvecm updatecerts command won't do anything without quorum. In
particular, it might happen that the base directories for observed
files will not get created during/after the upgrade from Proxmox VE 7
to 8 (reported in the community forum [0] and reproduced right away in
a virtual test cluster).

This parameter makes it possible to increase the chances that the hook
executes successfully.

[0]: https://forum.proxmox.com/threads/129644/

Signed-off-by: Fiona Ebner
---
 src/PVE/CLI/pvecm.pm | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/PVE/CLI/pvecm.pm b/src/PVE/CLI/pvecm.pm
index 564dc99..94f1e83 100755
--- a/src/PVE/CLI/pvecm.pm
+++ b/src/PVE/CLI/pvecm.pm
@@ -6,7 +6,7 @@ use warnings;
 use Cwd qw(getcwd);
 use File::Path;
 use File::Basename;
-use PVE::Tools qw(run_command);
+use PVE::Tools qw(extract_param run_command);
 use PVE::Cluster;
 use PVE::INotify;
 use PVE::JSONSchema qw(get_standard_option);
@@ -566,12 +566,33 @@ __PACKAGE__->register_method ({
                 type => 'boolean',
                 optional => 1,
             },
+            'quorum-wait-seconds' => {
+                description => "Wait for quorum for this many seconds.",
+                type => 'integer',
+                minimum => 0,
+                optional => 1,
+            },
         },
     },
     returns => { type => 'null' },
     code => sub {
         my ($param) = @_;
 
+        my $quorum_wait = extract_param($param, 'quorum-wait-seconds');
+
+        if ($quorum_wait && !PVE::Cluster::check_cfs_quorum(1)) {
+            print "waiting for quorum...";
+            STDOUT->flush();
+            for (my $i = 0; $i < $quorum_wait; $i++) {
+                if (PVE::Cluster::check_cfs_quorum(1)) {
+                    print "OK";
+                    last;
+                }
+                sleep(1);
+            }
+            print "\n";
+        }
+
         # we get called by the pveproxy.service ExecStartPre and as we do
         # IO (on /etc/pve) which can hang (uninterruptedly D state). That'd be
         # no-good for ExecStartPre as it fails the whole service in this case
-- 
2.39.2
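
For readers who want to try the waiting logic outside of a cluster, here is a minimal standalone sketch of the same bounded polling pattern. It is not part of the patch: wait_for, $check and $sleep are hypothetical names used only for illustration. $check stands in for PVE::Cluster::check_cfs_quorum(1), and $sleep is injectable so the loop can be exercised without real delays. Unlike the patch, which continues regardless of the outcome, the sketch returns whether the condition was met.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: poll $check about once per
# second for at most $timeout iterations. $check is a stand-in for
# PVE::Cluster::check_cfs_quorum(1); $sleep defaults to a real sleep(1).
sub wait_for {
    my ($check, $timeout, $sleep) = @_;
    $sleep //= sub { sleep(1) };

    return 1 if $check->();    # condition already holds, nothing to wait for
    return 0 if !$timeout;     # 0 or undef: do not wait at all

    for (my $i = 0; $i < $timeout; $i++) {
        return 1 if $check->();
        $sleep->();
    }
    return 0;                  # gave up after $timeout iterations
}

# Usage: a fake check that only succeeds on its fourth call, with a no-op
# sleep so the example finishes immediately.
my $calls = 0;
my $ok = wait_for(sub { return ++$calls > 3 }, 10, sub { });
print $ok ? "OK\n" : "timed out\n";
```

As in the patch, the condition is tested before each sleep, so there is no final check after the last sleep and a timeout of N gives at most N polls inside the loop.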