From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH ha-manager 2/2] fix #6801: only consider target node during positive resource affinity migration
Date: Mon,  3 Nov 2025 16:17:12 +0100
Message-ID: <20251103151823.387984-3-d.kral@proxmox.com>
In-Reply-To: <20251103151823.387984-1-d.kral@proxmox.com>

When an HA resource with positive affinity to other HA resources is
moved to another node, the other HA resources in the positive affinity
rule are automatically moved to the same target node as well.

If the migration times of these HA resources differ significantly (by
more than an average HA Manager round of ~10 seconds), the HA resources
that have already migrated and are in the 'started' state will check for
better node placements while the others are still migrating.

This search checks whether the positive resource affinity rules hold
and queries where the other HA resources are. For HA resources that are
still migrating, this reports both the source and the target node, which
is correct from an accounting standpoint, but adds equal weights to both
nodes and might result in the already started HA resource being migrated
back to the source node.

Therefore, only consider the target node for positive affinity during
migration or relocation to prevent this from happening.

As a side effect, two test cases for positive resource affinity rules
now converge to a steady state slightly quicker, as these get the
information about the common target node sooner.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
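
The double counting described in the commit message can be illustrated
with a minimal, self-contained sketch. This is not the code from
ResourceAffinity.pm: the `$services` hash and the helpers
`count_together_old`, `count_together_new` and `format_counts` are
hypothetical stand-ins, and only the counting logic mirrors the hunk
below.

```perl
use strict;
use warnings;

# Hypothetical service status, not the HA manager's real data structure:
# 'node' is where the resource currently is, 'target' is set only while it
# is migrating or relocating.
my $services = {
    'vm:101' => { node => 'node3' },                       # already arrived
    'vm:102' => { node => 'node1', target => 'node3' },    # still migrating
};

# Old counting: a migrating resource votes for its source and target node.
sub count_together_old {
    my ($services, $sids) = @_;
    my $together = {};
    for my $sid (@$sids) {
        my ($current_node, $target_node) = @{ $services->{$sid} }{qw(node target)};
        $together->{$current_node}++ if defined($current_node);
        $together->{$target_node}++ if defined($target_node);
    }
    return $together;
}

# New counting: a migrating resource votes for its target node only.
sub count_together_new {
    my ($services, $sids) = @_;
    my $together = {};
    for my $sid (@$sids) {
        my ($current_node, $target_node) = @{ $services->{$sid} }{qw(node target)};
        my $node = $target_node // $current_node;
        $together->{$node}++ if defined($node);
    }
    return $together;
}

# Render the per-node counts in a stable (sorted) order.
sub format_counts {
    my ($together) = @_;
    return join(' ', map { sprintf('%s=%d', $_, $together->{$_}) } sort keys %$together);
}

# From the point of view of vm:101, the only other resource is vm:102.
print "old: ", format_counts(count_together_old($services, ['vm:102'])), "\n";
print "new: ", format_counts(count_together_new($services, ['vm:102'])), "\n";
# old: node1=1 node3=1
# new: node3=1
```

With the old counting, node1 and node3 tie, so the already started
vm:101 may be sent back to the source node. With the new counting, only
node3 receives a weight.
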
 src/PVE/HA/Rules/ResourceAffinity.pm          |  6 ++--
 .../log.expect                                | 25 +++--------------
 .../log.expect                                | 28 +++++++++----------
 .../README                                    |  3 --
 .../log.expect                                | 28 +++----------------
 5 files changed, 26 insertions(+), 64 deletions(-)
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 4f5ffca5..9303bafd 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -517,8 +517,10 @@ sub get_resource_affinity {
     for my $csid (keys $positive->%*) {
         my ($current_node, $target_node) = $get_used_service_nodes->($csid);
 
-        $together->{$current_node}++ if defined($current_node);
-        $together->{$target_node}++ if defined($target_node);
+        # consider only the target node for positive affinity to prevent already
+        # moved HA resources from moving back to the source node (see #6801)
+        my $node = $target_node // $current_node;
+        $together->{$node}++ if defined($node);
     }
 
     for my $csid (keys $negative->%*) {
diff --git a/src/test/test-resource-affinity-strict-mixed3/log.expect b/src/test/test-resource-affinity-strict-mixed3/log.expect
index b3de104f..ee6412a1 100644
--- a/src/test/test-resource-affinity-strict-mixed3/log.expect
+++ b/src/test/test-resource-affinity-strict-mixed3/log.expect
@@ -58,17 +58,11 @@ info     40    node1/crm: service 'vm:102': state changed from 'migrate' to 'sta
 info     40    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node3)
 info     40    node1/crm: migrate service 'vm:201' to node 'node2' (running)
 info     40    node1/crm: service 'vm:201': state changed from 'started' to 'migrate'  (node = node1, target = node2)
-info     40    node1/crm: migrate service 'vm:202' to node 'node1' (running)
-info     40    node1/crm: service 'vm:202': state changed from 'started' to 'migrate'  (node = node2, target = node1)
 info     40    node1/crm: service 'vm:203': state changed from 'migrate' to 'started'  (node = node2)
-info     40    node1/crm: migrate service 'vm:203' to node 'node1' (running)
-info     40    node1/crm: service 'vm:203': state changed from 'started' to 'migrate'  (node = node2, target = node1)
 info     41    node1/lrm: service vm:201 - start migrate to node 'node2'
 info     41    node1/lrm: service vm:201 - end migrate to node 'node2'
-info     43    node2/lrm: service vm:202 - start migrate to node 'node1'
-info     43    node2/lrm: service vm:202 - end migrate to node 'node1'
-info     43    node2/lrm: service vm:203 - start migrate to node 'node1'
-info     43    node2/lrm: service vm:203 - end migrate to node 'node1'
+info     43    node2/lrm: starting service vm:203
+info     43    node2/lrm: service status vm:203 started
 info     45    node3/lrm: starting service vm:101
 info     45    node3/lrm: service status vm:101 started
 info     45    node3/lrm: starting service vm:102
@@ -76,17 +70,6 @@ info     45    node3/lrm: service status vm:102 started
 info     45    node3/lrm: starting service vm:103
 info     45    node3/lrm: service status vm:103 started
 info     60    node1/crm: service 'vm:201': state changed from 'migrate' to 'started'  (node = node2)
-info     60    node1/crm: service 'vm:202': state changed from 'migrate' to 'started'  (node = node1)
-info     60    node1/crm: service 'vm:203': state changed from 'migrate' to 'started'  (node = node1)
-info     60    node1/crm: migrate service 'vm:201' to node 'node1' (running)
-info     60    node1/crm: service 'vm:201': state changed from 'started' to 'migrate'  (node = node2, target = node1)
-info     61    node1/lrm: starting service vm:202
-info     61    node1/lrm: service status vm:202 started
-info     61    node1/lrm: starting service vm:203
-info     61    node1/lrm: service status vm:203 started
-info     63    node2/lrm: service vm:201 - start migrate to node 'node1'
-info     63    node2/lrm: service vm:201 - end migrate to node 'node1'
-info     80    node1/crm: service 'vm:201': state changed from 'migrate' to 'started'  (node = node1)
-info     81    node1/lrm: starting service vm:201
-info     81    node1/lrm: service status vm:201 started
+info     63    node2/lrm: starting service vm:201
+info     63    node2/lrm: service status vm:201 started
 info    620     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive3/log.expect b/src/test/test-resource-affinity-strict-positive3/log.expect
index b5d7018f..5f4e6531 100644
--- a/src/test/test-resource-affinity-strict-positive3/log.expect
+++ b/src/test/test-resource-affinity-strict-positive3/log.expect
@@ -84,24 +84,24 @@ err     263    node2/lrm: unable to start service fa:120002 on local node after
 warn    280    node1/crm: starting service fa:120002 on node 'node2' failed, relocating service.
 info    280    node1/crm: relocate service 'fa:120002' to node 'node1'
 info    280    node1/crm: service 'fa:120002': state changed from 'started' to 'relocate'  (node = node2, target = node1)
+info    280    node1/crm: migrate service 'vm:101' to node 'node1' (running)
+info    280    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node2, target = node1)
+info    280    node1/crm: migrate service 'vm:102' to node 'node1' (running)
+info    280    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node2, target = node1)
 info    283    node2/lrm: service fa:120002 - start relocate to node 'node1'
 info    283    node2/lrm: service fa:120002 - end relocate to node 'node1'
+info    283    node2/lrm: service vm:101 - start migrate to node 'node1'
+info    283    node2/lrm: service vm:101 - end migrate to node 'node1'
+info    283    node2/lrm: service vm:102 - start migrate to node 'node1'
+info    283    node2/lrm: service vm:102 - end migrate to node 'node1'
 info    300    node1/crm: service 'fa:120002': state changed from 'relocate' to 'started'  (node = node1)
-info    300    node1/crm: migrate service 'vm:101' to node 'node1' (running)
-info    300    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node2, target = node1)
-info    300    node1/crm: migrate service 'vm:102' to node 'node1' (running)
-info    300    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node2, target = node1)
+info    300    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node1)
+info    300    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node1)
 info    301    node1/lrm: starting service fa:120002
 info    301    node1/lrm: service status fa:120002 started
-info    303    node2/lrm: service vm:101 - start migrate to node 'node1'
-info    303    node2/lrm: service vm:101 - end migrate to node 'node1'
-info    303    node2/lrm: service vm:102 - start migrate to node 'node1'
-info    303    node2/lrm: service vm:102 - end migrate to node 'node1'
+info    301    node1/lrm: starting service vm:101
+info    301    node1/lrm: service status vm:101 started
+info    301    node1/lrm: starting service vm:102
+info    301    node1/lrm: service status vm:102 started
 info    320    node1/crm: relocation policy successful for 'fa:120002' on node 'node1', failed nodes: node2
-info    320    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node1)
-info    320    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node1)
-info    321    node1/lrm: starting service vm:101
-info    321    node1/lrm: service status vm:101 started
-info    321    node1/lrm: starting service vm:102
-info    321    node1/lrm: service status vm:102 started
 info    720     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive6/README b/src/test/test-resource-affinity-strict-positive6/README
index a6affda3..e174e458 100644
--- a/src/test/test-resource-affinity-strict-positive6/README
+++ b/src/test/test-resource-affinity-strict-positive6/README
@@ -1,5 +1,2 @@
 Test whether two HA resources in positive resource affinity will migrate to the
 same target node when one of them finishes earlier than the other.
-
-The current behavior is not correct, because the already migrated HA resource
-will be migrated back to the source node.
diff --git a/src/test/test-resource-affinity-strict-positive6/log.expect b/src/test/test-resource-affinity-strict-positive6/log.expect
index 69f8d867..cbc63a1e 100644
--- a/src/test/test-resource-affinity-strict-positive6/log.expect
+++ b/src/test/test-resource-affinity-strict-positive6/log.expect
@@ -10,8 +10,6 @@ info     20    node3/crm: status change startup => wait_for_quorum
 info     20    node3/lrm: status change startup => wait_for_agent_lock
 info     20    node1/crm: got lock 'ha_manager_lock'
 info     20    node1/crm: status change wait_for_quorum => master
-info     20    node1/crm: migrate service 'vm:101' to node 'node1' (running)
-info     20    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node3, target = node1)
 info     21    node1/lrm: got lock 'ha_agent_node1_lock'
 info     21    node1/lrm: status change wait_for_agent_lock => active
 info     21    node1/lrm: service vm:102 - start migrate to node 'node3'
@@ -20,27 +18,9 @@ info     22    node2/crm: status change wait_for_quorum => slave
 info     24    node3/crm: status change wait_for_quorum => slave
 info     25    node3/lrm: got lock 'ha_agent_node3_lock'
 info     25    node3/lrm: status change wait_for_agent_lock => active
-info     25    node3/lrm: service vm:101 - start migrate to node 'node1'
-info     25    node3/lrm: service vm:101 - end migrate to node 'node1'
-info     40    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node1)
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
 info     40    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node3)
-info     40    node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info     40    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node1, target = node3)
-info     40    node1/crm: migrate service 'vm:102' to node 'node1' (running)
-info     40    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node3, target = node1)
-info     41    node1/lrm: service vm:101 - start migrate to node 'node3'
-info     41    node1/lrm: service vm:101 - end migrate to node 'node3'
-info     45    node3/lrm: service vm:102 - start migrate to node 'node1'
-info     45    node3/lrm: service vm:102 - end migrate to node 'node1'
-info     60    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node3)
-info     60    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node1)
-info     60    node1/crm: migrate service 'vm:101' to node 'node1' (running)
-info     60    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node3, target = node1)
-info     61    node1/lrm: starting service vm:102
-info     61    node1/lrm: service status vm:102 started
-info     65    node3/lrm: service vm:101 - start migrate to node 'node1'
-info     65    node3/lrm: service vm:101 - end migrate to node 'node1'
-info     80    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node1)
-info     81    node1/lrm: starting service vm:101
-info     81    node1/lrm: service status vm:101 started
+info     45    node3/lrm: starting service vm:102
+info     45    node3/lrm: service status vm:102 started
 info    620     hardware: exit simulation - done
-- 
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel