From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Fri, 19 Sep 2025 16:08:11 +0200
Subject: [pve-devel] [PATCH ha-manager 3/3] test: add additional mixed resource affinity rule test cases
Message-ID: <20250919140856.1361124-4-d.kral@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250919140856.1361124-1-d.kral@proxmox.com>
References: <20250919140856.1361124-1-d.kral@proxmox.com>

The first test case shows and documents the current behavior for a
scenario where there are not enough nodes to uphold the guarantees of the
strict resource affinity rules at all times, and where the placements are
not resolved efficiently yet. The second test case shows that the former
scenario would have needed four nodes to uphold all guarantees and to
resolve the wrong node placements in a single step.
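For reference, both test cases use the same set of resource affinity rules
(as added in the rules_config files below); only the node count differs:

    resource-affinity: together-100s
        resources vm:101,vm:102,vm:103
        affinity positive

    resource-affinity: together-200s
        resources vm:201,vm:202,vm:203
        affinity positive

    resource-affinity: lonely-must-vms-be
        resources vm:101,vm:201
        affinity negative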
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .../README | 20 ++++
 .../cmdlist | 3 +
 .../hardware_status | 5 +
 .../log.expect | 92 +++++++++++++++++++
 .../manager_status | 1 +
 .../rules_config | 11 +++
 .../service_config | 8 ++
 .../README | 14 +++
 .../cmdlist | 3 +
 .../hardware_status | 6 ++
 .../log.expect | 85 +++++++++++++++++
 .../manager_status | 1 +
 .../rules_config | 11 +++
 .../service_config | 8 ++
 14 files changed, 268 insertions(+)
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/README
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/cmdlist
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/hardware_status
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/log.expect
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/manager_status
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/rules_config
 create mode 100644 src/test/test-resource-affinity-strict-mixed3/service_config
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/README
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/cmdlist
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/hardware_status
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/log.expect
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/manager_status
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/rules_config
 create mode 100644 src/test/test-resource-affinity-strict-mixed4/service_config

diff --git a/src/test/test-resource-affinity-strict-mixed3/README b/src/test/test-resource-affinity-strict-mixed3/README
new file mode 100644
index 00000000..dc8ec152
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/README
@@ -0,0 +1,20 @@
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:201, vm:202, and vm:203 must be kept together
+- vm:101 and vm:201 must be kept separate
+- Therefore, vm:101, vm:102, and vm:103 must all be kept separate from vm:201,
+  vm:202, and vm:203 and vice versa
+- vm:101, vm:103, vm:201, and vm:203 are currently running on node1
+- vm:102 and vm:202 are both running on node2
+
+The expected outcome is:
+- The resource-node placements do not adhere to the defined resource affinity
+  rules, therefore the HA resources must be moved accordingly. In the end,
+  vm:101, vm:102, and vm:103 should be on a separate node from vm:201, vm:202,
+  and vm:203.
+
+The current final outcome is correct, but it is inefficient and does not uphold
+all guarantees at all times (i.e., resources in a strict negative affinity rule
+must not be placed on the same node nor share a migration target). As shown by
+test-resource-affinity-strict-mixed4, at least four nodes are needed to uphold
+all guarantees while rebalancing the resources to their correct placements.
diff --git a/src/test/test-resource-affinity-strict-mixed3/cmdlist b/src/test/test-resource-affinity-strict-mixed3/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-strict-mixed3/hardware_status b/src/test/test-resource-affinity-strict-mixed3/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-mixed3/log.expect b/src/test/test-resource-affinity-strict-mixed3/log.expect
new file mode 100644
index 00000000..b3de104f
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/log.expect
@@ -0,0 +1,92 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:201' on node 'node1'
+info 20 node1/crm: adding new service 'vm:202' on node 'node2'
+info 20 node1/crm: adding new service 'vm:203' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:201': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:202': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:203': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 20 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 20 node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 20 node1/crm: migrate service 'vm:103' to node 'node3' (running)
+info 20 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 20 node1/crm: migrate service 'vm:203' to node 'node2' (running)
+info 20 node1/crm: service 'vm:203': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: service vm:101 - start migrate to node 'node3'
+info 21 node1/lrm: service vm:101 - end migrate to node 'node3'
+info 21 node1/lrm: service vm:103 - start migrate to node 'node3'
+info 21 node1/lrm: service vm:103 - end migrate to node 'node3'
+info 21 node1/lrm: starting service vm:201
+info 21 node1/lrm: service status vm:201 started
+info 21 node1/lrm: service vm:203 - start migrate to node 'node2'
+info 21 node1/lrm: service vm:203 - end migrate to node 'node2'
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: service vm:102 - start migrate to node 'node3'
+info 23 node2/lrm: service vm:102 - end migrate to node 'node3'
+info 23 node2/lrm: starting service vm:202
+info 23 node2/lrm: service status vm:202 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: migrate service 'vm:201' to node 'node2' (running)
+info 40 node1/crm: service 'vm:201': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 40 node1/crm: migrate service 'vm:202' to node 'node1' (running)
+info 40 node1/crm: service 'vm:202': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 40 node1/crm: service 'vm:203': state changed from 'migrate' to 'started' (node = node2)
+info 40 node1/crm: migrate service 'vm:203' to node 'node1' (running)
+info 40 node1/crm: service 'vm:203': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 41 node1/lrm: service vm:201 - start migrate to node 'node2'
+info 41 node1/lrm: service vm:201 - end migrate to node 'node2'
+info 43 node2/lrm: service vm:202 - start migrate to node 'node1'
+info 43 node2/lrm: service vm:202 - end migrate to node 'node1'
+info 43 node2/lrm: service vm:203 - start migrate to node 'node1'
+info 43 node2/lrm: service vm:203 - end migrate to node 'node1'
+info 45 node3/lrm: starting service vm:101
+info 45 node3/lrm: service status vm:101 started
+info 45 node3/lrm: starting service vm:102
+info 45 node3/lrm: service status vm:102 started
+info 45 node3/lrm: starting service vm:103
+info 45 node3/lrm: service status vm:103 started
+info 60 node1/crm: service 'vm:201': state changed from 'migrate' to 'started' (node = node2)
+info 60 node1/crm: service 'vm:202': state changed from 'migrate' to 'started' (node = node1)
+info 60 node1/crm: service 'vm:203': state changed from 'migrate' to 'started' (node = node1)
+info 60 node1/crm: migrate service 'vm:201' to node 'node1' (running)
+info 60 node1/crm: service 'vm:201': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 61 node1/lrm: starting service vm:202
+info 61 node1/lrm: service status vm:202 started
+info 61 node1/lrm: starting service vm:203
+info 61 node1/lrm: service status vm:203 started
+info 63 node2/lrm: service vm:201 - start migrate to node 'node1'
+info 63 node2/lrm: service vm:201 - end migrate to node 'node1'
+info 80 node1/crm: service 'vm:201': state changed from 'migrate' to 'started' (node = node1)
+info 81 node1/lrm: starting service vm:201
+info 81 node1/lrm: service status vm:201 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-mixed3/manager_status b/src/test/test-resource-affinity-strict-mixed3/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-strict-mixed3/rules_config b/src/test/test-resource-affinity-strict-mixed3/rules_config
new file mode 100644
index 00000000..851ed590
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/rules_config
@@ -0,0 +1,11 @@
+resource-affinity: together-100s
+    resources vm:101,vm:102,vm:103
+    affinity positive
+
+resource-affinity: together-200s
+    resources vm:201,vm:202,vm:203
+    affinity positive
+
+resource-affinity: lonely-must-vms-be
+    resources vm:101,vm:201
+    affinity negative
diff --git a/src/test/test-resource-affinity-strict-mixed3/service_config b/src/test/test-resource-affinity-strict-mixed3/service_config
new file mode 100644
index 00000000..3028810b
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed3/service_config
@@ -0,0 +1,8 @@
+{
+  "vm:101": { "node": "node1", "state": "started" },
+  "vm:102": { "node": "node2", "state": "started" },
+  "vm:103": { "node": "node1", "state": "started" },
+  "vm:201": { "node": "node1", "state": "started" },
+  "vm:202": { "node": "node2", "state": "started" },
+  "vm:203": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-mixed4/README b/src/test/test-resource-affinity-strict-mixed4/README
new file mode 100644
index 00000000..25e5abc7
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/README
@@ -0,0 +1,14 @@
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:201, vm:202, and vm:203 must be kept together
+- vm:101 and vm:201 must be kept separate
+- Therefore, vm:101, vm:102, and vm:103 must all be kept separate from vm:201,
+  vm:202, and vm:203 and vice versa
+- vm:101, vm:103, vm:201, and vm:203 are currently running on node1
+- vm:102 and vm:202 are both running on node2
+
+The expected outcome is:
+- The resource-node placements do not adhere to the defined resource affinity
+  rules, therefore the HA resources must be moved accordingly. In the end,
+  vm:101, vm:102, and vm:103 should be on a separate node from vm:201, vm:202,
+  and vm:203.
diff --git a/src/test/test-resource-affinity-strict-mixed4/cmdlist b/src/test/test-resource-affinity-strict-mixed4/cmdlist
new file mode 100644
index 00000000..043a94a6
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on", "power node4 on" ]
+]
diff --git a/src/test/test-resource-affinity-strict-mixed4/hardware_status b/src/test/test-resource-affinity-strict-mixed4/hardware_status
new file mode 100644
index 00000000..4aed08a1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/hardware_status
@@ -0,0 +1,6 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" },
+  "node4": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-mixed4/log.expect b/src/test/test-resource-affinity-strict-mixed4/log.expect
new file mode 100644
index 00000000..903af623
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/log.expect
@@ -0,0 +1,85 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node4 on
+info 20 node4/crm: status change startup => wait_for_quorum
+info 20 node4/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:201' on node 'node1'
+info 20 node1/crm: adding new service 'vm:202' on node 'node2'
+info 20 node1/crm: adding new service 'vm:203' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:201': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:202': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:203': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 20 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 20 node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 20 node1/crm: migrate service 'vm:103' to node 'node3' (running)
+info 20 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 20 node1/crm: migrate service 'vm:201' to node 'node4' (running)
+info 20 node1/crm: service 'vm:201': state changed from 'started' to 'migrate' (node = node1, target = node4)
+info 20 node1/crm: migrate service 'vm:202' to node 'node4' (running)
+info 20 node1/crm: service 'vm:202': state changed from 'started' to 'migrate' (node = node2, target = node4)
+info 20 node1/crm: migrate service 'vm:203' to node 'node4' (running)
+info 20 node1/crm: service 'vm:203': state changed from 'started' to 'migrate' (node = node1, target = node4)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: service vm:101 - start migrate to node 'node3'
+info 21 node1/lrm: service vm:101 - end migrate to node 'node3'
+info 21 node1/lrm: service vm:103 - start migrate to node 'node3'
+info 21 node1/lrm: service vm:103 - end migrate to node 'node3'
+info 21 node1/lrm: service vm:201 - start migrate to node 'node4'
+info 21 node1/lrm: service vm:201 - end migrate to node 'node4'
+info 21 node1/lrm: service vm:203 - start migrate to node 'node4'
+info 21 node1/lrm: service vm:203 - end migrate to node 'node4'
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: service vm:102 - start migrate to node 'node3'
+info 23 node2/lrm: service vm:102 - end migrate to node 'node3'
+info 23 node2/lrm: service vm:202 - start migrate to node 'node4'
+info 23 node2/lrm: service vm:202 - end migrate to node 'node4'
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 26 node4/crm: status change wait_for_quorum => slave
+info 27 node4/lrm: got lock 'ha_agent_node4_lock'
+info 27 node4/lrm: status change wait_for_agent_lock => active
+info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:201': state changed from 'migrate' to 'started' (node = node4)
+info 40 node1/crm: service 'vm:202': state changed from 'migrate' to 'started' (node = node4)
+info 40 node1/crm: service 'vm:203': state changed from 'migrate' to 'started' (node = node4)
+info 45 node3/lrm: starting service vm:101
+info 45 node3/lrm: service status vm:101 started
+info 45 node3/lrm: starting service vm:102
+info 45 node3/lrm: service status vm:102 started
+info 45 node3/lrm: starting service vm:103
+info 45 node3/lrm: service status vm:103 started
+info 47 node4/lrm: starting service vm:201
+info 47 node4/lrm: service status vm:201 started
+info 47 node4/lrm: starting service vm:202
+info 47 node4/lrm: service status vm:202 started
+info 47 node4/lrm: starting service vm:203
+info 47 node4/lrm: service status vm:203 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-mixed4/manager_status b/src/test/test-resource-affinity-strict-mixed4/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-strict-mixed4/rules_config b/src/test/test-resource-affinity-strict-mixed4/rules_config
new file mode 100644
index 00000000..851ed590
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/rules_config
@@ -0,0 +1,11 @@
+resource-affinity: together-100s
+    resources vm:101,vm:102,vm:103
+    affinity positive
+
+resource-affinity: together-200s
+    resources vm:201,vm:202,vm:203
+    affinity positive
+
+resource-affinity: lonely-must-vms-be
+    resources vm:101,vm:201
+    affinity negative
diff --git a/src/test/test-resource-affinity-strict-mixed4/service_config b/src/test/test-resource-affinity-strict-mixed4/service_config
new file mode 100644
index 00000000..3028810b
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-mixed4/service_config
@@ -0,0 +1,8 @@
+{
+  "vm:101": { "node": "node1", "state": "started" },
+  "vm:102": { "node": "node2", "state": "started" },
+  "vm:103": { "node": "node1", "state": "started" },
+  "vm:201": { "node": "node1", "state": "started" },
+  "vm:202": { "node": "node2", "state": "started" },
+  "vm:203": { "node": "node1", "state": "started" }
+}
-- 
2.47.3