public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH ha-manager 0/2] fix #6801
@ 2025-11-03 15:17 Daniel Kral
  2025-11-03 15:17 ` [pve-devel] [PATCH ha-manager 1/2] test: add delayed positive resource affinity migration test case Daniel Kral
  2025-11-03 15:17 ` [pve-devel] [PATCH ha-manager 2/2] fix #6801: only consider target node during positive resource affinity migration Daniel Kral
  0 siblings, 2 replies; 3+ messages in thread
From: Daniel Kral @ 2025-11-03 15:17 UTC
  To: pve-devel


NOTE: This fix is based on top of [1], which itself is based on [0].


This fixes an accounting bug where HA resources in a positive resource
affinity rule are migrated/relocated back to the (alphabetically first)
source node, because both the source and the target node are considered
when `select_service_node` evaluates where an HA resource should be
placed.



Example: vm:100 and vm:101 are in a positive resource affinity rule.

1. vm:100 is migrated from node1 to node3
2. vm:101 will also be migrated from node1 to node3 at the same time
3. vm:100 finishes migration at least 10 seconds before vm:101
4. vm:100 checks for a better node placement
4a. vm:100 checks whether the positive resource affinity is held and
    gets the information that the other HA resource (just vm:101) is
    on both node1 and node3
4b. In case of equal weights on both nodes, the alphabetically first
    node is chosen [0]
5. vm:100 is migrated back to node1
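
As a rough illustration of the accounting in steps 4a and 4b, here is a
simplified Python model. It is not the actual Perl code in ha-manager:
`affinity_votes`, `pick_node`, the `only_target` flag and the dict layout
are all made up for this sketch, and the real weighting in
`select_service_node` is more involved.

```python
from collections import Counter


def affinity_votes(others, only_target):
    """Count, per node, how many resources in the same positive rule pull towards it.

    `others` is a list of dicts with a "node" key (current node) and an
    optional "target" key (set while the resource is migrating).
    These names are hypothetical and only serve this sketch.
    """
    votes = Counter()
    for res in others:
        target = res.get("target")
        if target is None:
            # Not migrating: only the current node counts.
            votes[res["node"]] += 1
        elif only_target:
            # Assumed fixed behaviour: a migrating resource counts only
            # for its target node.
            votes[target] += 1
        else:
            # Assumed old behaviour: source and target are both counted.
            votes[res["node"]] += 1
            votes[target] += 1
    return votes


def pick_node(votes):
    """Pick the node with the most votes; ties go to the alphabetically first."""
    best = max(votes.values())
    return min(node for node, count in votes.items() if count == best)


# vm:101 is still migrating from node1 to node3 when vm:100 re-evaluates.
vm101 = {"node": "node1", "target": "node3"}

before = pick_node(affinity_votes([vm101], only_target=False))
after = pick_node(affinity_votes([vm101], only_target=True))
print(before, after)
```

With both nodes counted, node1 and node3 tie and the alphabetical
tie-break selects node1, which matches step 5. Counting only the target
node leaves node3 as the sole candidate in this model.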



This fix needs the changes from [0], as that patch series implements a
way to differentiate between $current_node and $target_node in
get_resource_affinity(...). Since [1] changes that subroutine too, I
rebased on top of [1], even though this fix can also be applied on top
of [0] with some adaptation.

I tried to make the test case a bit more straightforward by adding a
parameter to set a 'migration duration', but that would require quite a
few modifications to the current single-threaded pve-ha-tester, e.g. a
waitqueue which handles "delayed" migration finishes. We could still do
that if we need it for some other test case, but for now setting up the
environment worked fine.



[0] https://lore.proxmox.com/pve-devel/20251027164513.542678-1-d.kral@proxmox.com/
[1] https://lore.proxmox.com/pve-devel/20251103102118.153666-1-d.kral@proxmox.com/


Daniel Kral (2):
  test: add delayed positive resource affinity migration test case
  fix #6801: only consider target node during positive resource affinity
    migration

 src/PVE/HA/Rules/ResourceAffinity.pm          |  6 ++--
 .../log.expect                                | 25 +++--------------
 .../log.expect                                | 28 +++++++++----------
 .../README                                    |  2 ++
 .../cmdlist                                   |  3 ++
 .../hardware_status                           |  5 ++++
 .../log.expect                                | 26 +++++++++++++++++
 .../manager_status                            | 21 ++++++++++++++
 .../rules_config                              |  3 ++
 .../service_config                            |  4 +++
 10 files changed, 86 insertions(+), 37 deletions(-)
 create mode 100644 src/test/test-resource-affinity-strict-positive6/README
 create mode 100644 src/test/test-resource-affinity-strict-positive6/cmdlist
 create mode 100644 src/test/test-resource-affinity-strict-positive6/hardware_status
 create mode 100644 src/test/test-resource-affinity-strict-positive6/log.expect
 create mode 100644 src/test/test-resource-affinity-strict-positive6/manager_status
 create mode 100644 src/test/test-resource-affinity-strict-positive6/rules_config
 create mode 100644 src/test/test-resource-affinity-strict-positive6/service_config
-- 
2.47.3




