From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Lamprecht
To: pve-devel@lists.proxmox.com
Subject: [PATCH ha-manager 1/3] api: status: add fencing status entry with armed/standby state
Date: Mon, 9 Mar 2026 22:57:08 +0100
Message-ID: <20260309220128.973793-2-t.lamprecht@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260309220128.973793-1-t.lamprecht@proxmox.com>
References: <20260309220128.973793-1-t.lamprecht@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

Add a fencing entry to the HA status output that shows whether the
fencing mechanism is active or idle. The CRM only opens the watchdog
when actively running as master, so distinguish between:

- armed: CRM is active master, watchdog connected
- standby: no active CRM master (e.g. no services configured, cluster
  just started), watchdog not open

Each LRM entry additionally shows its per-node watchdog state. The LRM
holds its watchdog while it has the agent lock (active or maintenance
state).

Previously there was no indication of the fencing state at all, which
made it hard to tell whether the watchdog was actually protecting the
cluster.

Signed-off-by: Thomas Lamprecht
---

Some details here might be questioned, and tbh. I'm not *that* happy
with the status endpoint's return schema structure, but that's
pre-existing and probably needs a breaking change on a major release
if we want to clean this up for real.

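For readers who want to try the armed/standby rules outside of the API handler, here is a minimal sketch in Python (not the Perl of the patch) so it runs standalone. It mirrors the two decisions described above: the cluster-wide fencing entry and the per-LRM watchdog text. The function names and the freshness helper with its 30 second window are assumptions made for illustration. The patch itself relies on the pre-existing $timestamp_to_status helper, whose exact thresholds are not shown here.

```python
from typing import Optional


def timestamp_to_status(ctime: float, timestamp: float, max_age: float = 30.0) -> str:
    """Hypothetical stand-in for the existing $timestamp_to_status helper.

    Assumption: a timestamp counts as 'active' when it is at most
    max_age seconds old; the real helper may use different rules.
    """
    return "active" if 0 <= ctime - timestamp <= max_age else "stale"


def fencing_entry(status: dict, ctime: float, nodename: str) -> dict:
    """Build the cluster-wide 'fencing' status entry."""
    master = status.get("master_node")
    ts = status.get("timestamp")

    # The CRM only opens the watchdog when actively running as master.
    crm_active = (
        master is not None
        and ts is not None
        and timestamp_to_status(ctime, ts) == "active"
    )

    armed_state = "armed" if crm_active else "standby"
    crm_wd = "CRM watchdog active" if crm_active else "CRM watchdog standby"
    return {
        "id": "fencing",
        "type": "fencing",
        # Fall back to the local node name when no master is known.
        "node": master if master is not None else nodename,
        "status": f"{armed_state} ({crm_wd})",
        "armed_state": armed_state,
    }


def lrm_watchdog(status_str: str, lrm_state: Optional[str]) -> str:
    """Per-node text: the LRM holds its watchdog while it has the agent lock."""
    state = lrm_state or "unknown"
    held = status_str == "active" and state in ("active", "maintenance")
    return "watchdog active" if held else "watchdog standby"
```

With a fresh manager timestamp and a known master node, fencing_entry() reports "armed". With an empty manager status it falls back to "standby" and the local node name. An LRM that is still waiting for its agent lock reports "watchdog standby" even if its timestamp is fresh.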
 src/PVE/API2/HA/Status.pm | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/src/PVE/API2/HA/Status.pm b/src/PVE/API2/HA/Status.pm
index a1e5787..a6b00b9 100644
--- a/src/PVE/API2/HA/Status.pm
+++ b/src/PVE/API2/HA/Status.pm
@@ -91,7 +91,7 @@ __PACKAGE__->register_method({
                 },
                 type => {
                     description => "Type of status entry.",
-                    enum => ["quorum", "master", "lrm", "service"],
+                    enum => ["quorum", "master", "lrm", "service", "fencing"],
                 },
                 quorate => {
                     description => "For type 'quorum'. Whether the cluster is quorate or not.",
@@ -143,6 +143,13 @@ __PACKAGE__->register_method({
                     type => "string",
                     optional => 1,
                 },
+                armed_state => {
+                    description => "For type 'fencing'. Whether HA fencing is armed"
+                        . " or on standby.",
+                    type => "string",
+                    enum => ['armed', 'standby'],
+                    optional => 1,
+                },
             },
         },
     },
@@ -193,6 +200,23 @@ __PACKAGE__->register_method({
                 };
         }
 
+        # the CRM only opens the watchdog when actively running as master
+        my $crm_active =
+            defined($status->{master_node})
+            && defined($status->{timestamp})
+            && $timestamp_to_status->($ctime, $status->{timestamp}) eq 'active';
+
+        my $armed_state = $crm_active ? 'armed' : 'standby';
+        my $crm_wd = $crm_active ? "CRM watchdog active" : "CRM watchdog standby";
+        push @$res,
+            {
+                id => 'fencing',
+                type => 'fencing',
+                node => $status->{master_node} // $nodename,
+                status => "$armed_state ($crm_wd)",
+                armed_state => $armed_state,
+            };
+
         foreach my $node (sort keys %{ $status->{node_status} }) {
             my $active_count =
                 PVE::HA::Tools::count_active_services($status->{service_status}, $node);
@@ -209,10 +233,17 @@ __PACKAGE__->register_method({
             } else {
                 my $status_str = &$timestamp_to_status($ctime, $lrm_status->{timestamp});
                 my $lrm_mode = $lrm_status->{mode};
+                my $lrm_state = $lrm_status->{state} || 'unknown';
+
+                # LRM holds its watchdog while it has the agent lock
+                my $lrm_wd =
+                    ($status_str eq 'active'
+                        && ($lrm_state eq 'active' || $lrm_state eq 'maintenance'))
+                    ? 'watchdog active'
+                    : 'watchdog standby';
 
                 if ($status_str eq 'active') {
                     $lrm_mode ||= 'active';
-                    my $lrm_state = $lrm_status->{state} || 'unknown';
                     if ($lrm_mode ne 'active') {
                         $status_str = "$lrm_mode mode";
                     } else {
@@ -227,7 +258,7 @@ __PACKAGE__->register_method({
                 }
 
                 my $time_str = localtime($lrm_status->{timestamp});
-                my $status_text = "$node ($status_str, $time_str)";
+                my $status_text = "$node ($status_str, $lrm_wd, $time_str)";
                 push @$res,
                     {
                         id => $id,
-- 
2.47.3