From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 087B3B4B8
 for <pve-devel@lists.proxmox.com>; Fri, 30 Jun 2023 13:49:58 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id DE6F923392
 for <pve-devel@lists.proxmox.com>; Fri, 30 Jun 2023 13:49:27 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Fri, 30 Jun 2023 13:49:27 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id D976742A0B
 for <pve-devel@lists.proxmox.com>; Fri, 30 Jun 2023 13:49:26 +0200 (CEST)
From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Fri, 30 Jun 2023 13:49:21 +0200
Message-Id: <20230630114923.65506-2-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230630114923.65506-1-f.ebner@proxmox.com>
References: <20230630114923.65506-1-f.ebner@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.045 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [pvecm.pm, proxmox.com]
Subject: [pve-devel] [PATCH v2 cluster 2/4] pvecm: updatecerts: wait for
 quorum within the existing timeout
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Fri, 30 Jun 2023 11:49:58 -0000

Waiting for quorum is useful for the updatecerts call triggered via the
ExecStartPre hook of pveproxy.service.

When starting a node that's part of a cluster, there is a time window
between the start of pve-cluster.service and when quorum is reached
(from the node's perspective). pveproxy.service is ordered after
pve-cluster.service, but that does not prevent the ExecStartPre hook
from being executed before the node is part of the quorate partition.

Without quorum, the pvecm updatecerts command cannot do much, because
pmxcfs is read-only. Local (non-pmxcfs) files are still generated
before waiting for quorum.

In particular, the base directories for observed files might not get
created during/after the upgrade from Proxmox VE 7 to 8 (reported in
the community forum [0] and reproduced right away in a virtual test
cluster).

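Waiting for quorum should greatly increase the chances of the
ExecStartPre hook executing successfully. The wait is bounded by the
existing 30 second timeout of the surrounding run_fork_with_timeout()
call.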

[0]: https://forum.proxmox.com/threads/129644/

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v2:
    * Different approach: always wait for quorum until timeout.
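
For illustration only, the polling pattern described in the commit
message can be sketched as a standalone script. Unlike the patch, which
relies on the enclosing run_fork_with_timeout(30, ...) to end the wait,
this sketch assumes an explicit deadline. The wait_until() helper and
the counter-based check are hypothetical stand-ins, not part of
pve-cluster. In the patch, the condition is
PVE::Cluster::check_cfs_quorum(1).

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep time);

# Hypothetical helper (not part of the patch): call $check every
# $interval_us microseconds until it returns true or $timeout seconds
# have elapsed. Returns 1 on success and 0 on timeout.
sub wait_until {
    my ($check, $timeout, $interval_us) = @_;

    my $deadline = time() + $timeout;
    for (my $i = 0; !$check->(); $i++) {
        return 0 if time() >= $deadline;
        # Log only every 50th iteration to avoid flooding the journal.
        print "waiting for quorum...\n" if $i % 50 == 0;
        usleep($interval_us);
    }
    return 1;
}

# Stand-in for the real quorum check: becomes true on the third call.
my $calls = 0;
my $ok = wait_until(sub { return ++$calls >= 3 }, 5, 100 * 1000);
```

With a 100 ms interval and a message every 50 iterations, a waiting
node logs roughly once every five seconds, which matches the cadence
used in the patch.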

 src/PVE/CLI/pvecm.pm | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/PVE/CLI/pvecm.pm b/src/PVE/CLI/pvecm.pm
index ebc15bd..a0550c2 100755
--- a/src/PVE/CLI/pvecm.pm
+++ b/src/PVE/CLI/pvecm.pm
@@ -6,6 +6,8 @@ use warnings;
 use Cwd qw(getcwd);
 use File::Path;
 use File::Basename;
+use Time::HiRes qw(usleep);
+
 use PVE::Tools qw(run_command);
 use PVE::Cluster;
 use PVE::INotify;
@@ -577,6 +579,12 @@ __PACKAGE__->register_method ({
 	# no-good for ExecStartPre as it fails the whole service in this case
 	PVE::Tools::run_fork_with_timeout(30, sub {
 	    PVE::Cluster::Setup::generate_local_files();
+
+	    for (my $i = 0; !PVE::Cluster::check_cfs_quorum(1); $i++) {
+		print "waiting for pmxcfs mount to appear and get quorate...\n" if $i % 50 == 0;
+		usleep(100 * 1000);
+	    }
+
 	    PVE::Cluster::Setup::updatecerts_and_ssh($param->@{qw(force silent)});
 	    PVE::Cluster::prepare_observed_file_basedirs();
 	});
-- 
2.39.2