From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH ha-manager 4/7] manager: make HA resources without failback move back to maintenance node
Date: Wed, 22 Apr 2026 12:00:22 +0200
Message-ID: <20260422100035.232716-5-d.kral@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260422100035.232716-1-d.kral@proxmox.com>
References: <20260422100035.232716-1-d.kral@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

If an HA resource has failback disabled and its current node is put into
maintenance mode, the HA resource correctly moves to a replacement node.
However, when the previous node is taken out of maintenance mode again, the
HA resource stays on the new node.

As HA resources should move back to their previous maintenance node, do not
keep the HA resource on its current node if it is not yet back on the
maintenance node.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Manager.pm                                      | 1 +
 src/test/test-node-affinity-maintenance-nonstrict2/README  | 3 ++-
 .../test-node-affinity-maintenance-nonstrict2/log.expect   | 8 ++++++++
 src/test/test-node-affinity-maintenance-strict4/README     | 3 ++-
 .../test-node-affinity-maintenance-strict4/log.expect      | 8 ++++++++
 5 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 684244e1..795b98c1 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -336,6 +336,7 @@ sub select_service_node {
         $node_preference eq 'none'
         && !$service_conf->{failback}
         && $allowed_nodes->{$current_node}
+        && (!defined($maintenance_fallback) || $maintenance_fallback eq $current_node)
         && PVE::HA::Rules::ResourceAffinity::is_allowed_on_node(
             $together, $separate, $current_node,
         )
diff --git a/src/test/test-node-affinity-maintenance-nonstrict2/README b/src/test/test-node-affinity-maintenance-nonstrict2/README
index 9af43c11..056a882d 100644
--- a/src/test/test-node-affinity-maintenance-nonstrict2/README
+++ b/src/test/test-node-affinity-maintenance-nonstrict2/README
@@ -1,3 +1,4 @@
 Test whether an HA resource with failback disabled in a non-strict node
 affinity rule with a single node member will move to a replacement node if its
-current node is in maintenance mode.
+current node is in maintenance mode and moves back to the previous maintenance
+node as soon as it's available again.
diff --git a/src/test/test-node-affinity-maintenance-nonstrict2/log.expect b/src/test/test-node-affinity-maintenance-nonstrict2/log.expect
index 05a77a24..339ce3ab 100644
--- a/src/test/test-node-affinity-maintenance-nonstrict2/log.expect
+++ b/src/test/test-node-affinity-maintenance-nonstrict2/log.expect
@@ -37,4 +37,12 @@ info 220 cmdlist: execute crm node3 disable-node-maintenance
 info 225 node3/lrm: got lock 'ha_agent_node3_lock'
 info 225 node3/lrm: status change maintenance => active
 info 240 node1/crm: node 'node3': state changed from 'maintenance' => 'online'
+info 240 node1/crm: moving service 'vm:101' back to 'node3', node came back from maintenance.
+info 240 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 240 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 241 node1/lrm: service vm:101 - start migrate to node 'node3'
+info 241 node1/lrm: service vm:101 - end migrate to node 'node3'
+info 260 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 265 node3/lrm: starting service vm:101
+info 265 node3/lrm: service status vm:101 started
 info 820 hardware: exit simulation - done
diff --git a/src/test/test-node-affinity-maintenance-strict4/README b/src/test/test-node-affinity-maintenance-strict4/README
index 43c68463..e6ad5c7e 100644
--- a/src/test/test-node-affinity-maintenance-strict4/README
+++ b/src/test/test-node-affinity-maintenance-strict4/README
@@ -1,3 +1,4 @@
 Test whether an HA resource with failback disabled in a strict node affinity
 rule with two differently prioritized node members will move to the
-lower-priority node if its current node is in maintenance mode.
+lower-priority node if its current node is in maintenance mode and moves back
+to the previous maintenance node as soon as it's available again.
diff --git a/src/test/test-node-affinity-maintenance-strict4/log.expect b/src/test/test-node-affinity-maintenance-strict4/log.expect
index 6f19258c..0bdf4fa0 100644
--- a/src/test/test-node-affinity-maintenance-strict4/log.expect
+++ b/src/test/test-node-affinity-maintenance-strict4/log.expect
@@ -37,4 +37,12 @@ info 220 cmdlist: execute crm node3 disable-node-maintenance
 info 225 node3/lrm: got lock 'ha_agent_node3_lock'
 info 225 node3/lrm: status change maintenance => active
 info 240 node1/crm: node 'node3': state changed from 'maintenance' => 'online'
+info 240 node1/crm: moving service 'vm:101' back to 'node3', node came back from maintenance.
+info 240 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 240 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 243 node2/lrm: service vm:101 - start migrate to node 'node3'
+info 243 node2/lrm: service vm:101 - end migrate to node 'node3'
+info 260 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 265 node3/lrm: starting service vm:101
+info 265 node3/lrm: service status vm:101 started
 info 820 hardware: exit simulation - done
-- 
2.47.3