From: Daniel Kral
To: pve-devel@lists.proxmox.com
Date: Fri, 19 Sep 2025 16:08:10 +0200
Message-ID: <20250919140856.1361124-3-d.kral@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250919140856.1361124-1-d.kral@proxmox.com>
References: <20250919140856.1361124-1-d.kral@proxmox.com>
Subject: [pve-devel] [PATCH ha-manager 2/3] manager: fix precedence in mixed resource affinity rules usage

Strict positive resource affinity rules narrow the possible nodes for an
HA resource A down to a single candidate: the node where most of the HA
resources in the positive affinity rule are already running; in case of a
tie, the alphabetically first node is chosen.

If that chosen node already runs an HA resource B, which is in negative
affinity with resource A, then $pri_nodes becomes empty and no migration
is scheduled at all.

Therefore, apply the negative resource affinity rules before the positive
resource affinity rules to prevent this premature pruning of the candidate
nodes.

Signed-off-by: Daniel Kral
---
 src/PVE/HA/Manager.pm |  2 +-
 .../README            | 13 ++++++-
 .../log.expect        | 36 ++++++++++++++-----
 3 files changed, 41 insertions(+), 10 deletions(-)
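For illustration only, here is a minimal, standalone Perl sketch of the
pruning order described above. It is not the actual Manager.pm code: the two
subs are simplified stand-ins for the real apply_*_resource_affinity helpers,
with assumed names and reduced data structures (a per-node peer count for the
positive rule, a set of forbidden nodes for the negative rule). With the
positive rule applied first the candidate set ends up empty; with the negative
rule applied first the remaining candidate is node3, matching the expected
test outcome below.

#!/usr/bin/perl
# Simplified illustration of the ordering problem; not the real PVE::HA code.
use strict;
use warnings;

# Keep only the single node where most positive-affinity peers already run;
# ties are broken by the alphabetically first node name (as described above).
sub apply_positive_affinity {
    my ($together, $pri_nodes) = @_;    # $together: node name => peer count
    my ($best) = sort {
        ($together->{$b} // 0) <=> ($together->{$a} // 0) || $a cmp $b
    } keys %$pri_nodes;
    %$pri_nodes = ($best => 1) if defined($best);
}

# Drop every node that already runs a resource in negative affinity.
sub apply_negative_affinity {
    my ($separate, $pri_nodes) = @_;    # $separate: node name => 1
    delete $pri_nodes->{$_} for keys %$separate;
}

my $together = { node1 => 2, node3 => 1 };  # peers of resource A per node
my $separate = { node1 => 1 };              # node1 runs a negatively-affine resource

# Old order: the positive rule prunes to node1, the negative rule then
# empties the candidate set, so no migration would be scheduled.
my %old = (node1 => 1, node2 => 1, node3 => 1);
apply_positive_affinity($together, \%old);
apply_negative_affinity($separate, \%old);
printf "positive first: %s\n", join(',', sort keys %old) || '(empty)';

# New order: the negative rule removes node1 first, the positive rule then
# settles on node3, where another peer (think vm:102) is already running.
my %new = (node1 => 1, node2 => 1, node3 => 1);
apply_negative_affinity($separate, \%new);
apply_positive_affinity($together, \%new);
printf "negative first: %s\n", join(',', sort keys %new) || '(empty)';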
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index ba59f642..3d74288d 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -176,8 +176,8 @@ sub select_service_node {
         }
     }
 
-    apply_positive_resource_affinity($together, $pri_nodes);
     apply_negative_resource_affinity($separate, $pri_nodes);
+    apply_positive_resource_affinity($together, $pri_nodes);
 
     return $maintenance_fallback
         if defined($maintenance_fallback) && $pri_nodes->{$maintenance_fallback};
diff --git a/src/test/test-resource-affinity-strict-mixed2/README b/src/test/test-resource-affinity-strict-mixed2/README
index c56d1a2d..a4d89ff3 100644
--- a/src/test/test-resource-affinity-strict-mixed2/README
+++ b/src/test/test-resource-affinity-strict-mixed2/README
@@ -7,4 +7,15 @@ The test scenario is:
 - vm:101, vm:103, vm:201, and vm:203 are currently running on node1
 - vm:102 and vm:202 are running on node3 and node2 respectively
 
-The current outcome is incorrect.
+The expected outcome is:
+- The resource-node placements do not adhere to the defined resource affinity
+  rules, therefore the HA resources must be moved accordingly: As vm:101 and
+  vm:103 must be kept separate from vm:201 and vm:203, which are all
+  currently running on node1, these must be migrated to separate nodes:
+  - As the negative resource affinity rule is strict, resources must neither
+    share the current nor the migration target node, so both positive
+    affinity groups must be put on "spare" nodes, which in that case is
+    node3 (for vm:101 and vm:103) and node2 (for vm:201 and vm:203)
+    respectively. These node selections are because there are already other
+    positive resource affinity rule members running on these nodes (vm:102
+    on node3 and vm:202 on node2).
diff --git a/src/test/test-resource-affinity-strict-mixed2/log.expect b/src/test/test-resource-affinity-strict-mixed2/log.expect
index 9cdc8b14..e7081e4b 100644
--- a/src/test/test-resource-affinity-strict-mixed2/log.expect
+++ b/src/test/test-resource-affinity-strict-mixed2/log.expect
@@ -25,16 +25,24 @@ info     20    node1/crm: service 'vm:103': state changed from 'request_start' t
 info     20    node1/crm: service 'vm:201': state changed from 'request_start' to 'started' (node = node1)
 info     20    node1/crm: service 'vm:202': state changed from 'request_start' to 'started' (node = node2)
 info     20    node1/crm: service 'vm:203': state changed from 'request_start' to 'started' (node = node1)
+info     20    node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info     20    node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info     20    node1/crm: migrate service 'vm:103' to node 'node3' (running)
+info     20    node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info     20    node1/crm: migrate service 'vm:201' to node 'node2' (running)
+info     20    node1/crm: service 'vm:201': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info     20    node1/crm: migrate service 'vm:203' to node 'node2' (running)
+info     20    node1/crm: service 'vm:203': state changed from 'started' to 'migrate' (node = node1, target = node2)
 info     21    node1/lrm: got lock 'ha_agent_node1_lock'
 info     21    node1/lrm: status change wait_for_agent_lock => active
-info     21    node1/lrm: starting service vm:101
-info     21    node1/lrm: service status vm:101 started
-info     21    node1/lrm: starting service vm:103
-info     21    node1/lrm: service status vm:103 started
-info     21    node1/lrm: starting service vm:201
-info     21    node1/lrm: service status vm:201 started
-info     21    node1/lrm: starting service vm:203
-info     21    node1/lrm: service status vm:203 started
+info     21    node1/lrm: service vm:101 - start migrate to node 'node3'
+info     21    node1/lrm: service vm:101 - end migrate to node 'node3'
+info     21    node1/lrm: service vm:103 - start migrate to node 'node3'
+info     21    node1/lrm: service vm:103 - end migrate to node 'node3'
+info     21    node1/lrm: service vm:201 - start migrate to node 'node2'
+info     21    node1/lrm: service vm:201 - end migrate to node 'node2'
+info     21    node1/lrm: service vm:203 - start migrate to node 'node2'
+info     21    node1/lrm: service vm:203 - end migrate to node 'node2'
 info     22    node2/crm: status change wait_for_quorum => slave
 info     23    node2/lrm: got lock 'ha_agent_node2_lock'
 info     23    node2/lrm: status change wait_for_agent_lock => active
@@ -45,4 +53,16 @@ info     25    node3/lrm: got lock 'ha_agent_node3_lock'
 info     25    node3/lrm: status change wait_for_agent_lock => active
 info     25    node3/lrm: starting service vm:102
 info     25    node3/lrm: service status vm:102 started
+info     40    node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info     40    node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info     40    node1/crm: service 'vm:201': state changed from 'migrate' to 'started' (node = node2)
+info     40    node1/crm: service 'vm:203': state changed from 'migrate' to 'started' (node = node2)
+info     43    node2/lrm: starting service vm:201
+info     43    node2/lrm: service status vm:201 started
+info     43    node2/lrm: starting service vm:203
+info     43    node2/lrm: service status vm:203 started
+info     45    node3/lrm: starting service vm:101
+info     45    node3/lrm: service status vm:101 started
+info     45    node3/lrm: starting service vm:103
+info     45    node3/lrm: service status vm:103 started
 info    620     hardware: exit simulation - done
-- 
2.47.3