all lists on lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH ha-manager 0/2] fix #6801
@ 2025-11-03 15:17 Daniel Kral
  2025-11-03 15:17 ` [pve-devel] [PATCH ha-manager 1/2] test: add delayed positive resource affinity migration test case Daniel Kral
  2025-11-03 15:17 ` [pve-devel] [PATCH ha-manager 2/2] fix #6801: only consider target node during positive resource affinity migration Daniel Kral
  0 siblings, 2 replies; 3+ messages in thread
From: Daniel Kral @ 2025-11-03 15:17 UTC (permalink / raw)
  To: pve-devel


NOTE: This fix is based on top of [1], which itself is based on [0].


This fixes an accounting bug, where HA resources in positive affinity
are migrated/relocated back to the (alphabetically-first) source node,
because both the source and target node are considered when evaluating
where a HA resource should be in `select_service_node`.



Example: vm:100 and vm:101 are in a positive resource affinity rule.

1. vm:100 is migrated from node1 to node3
2. vm:101 will also be migrated from node1 to node3 at the same time
3. vm:100 finishes migration at least 10 seconds before vm:101
4. vm:100 checks for a better node placement
4a. vm:100 checks whether the positive resource affinity is held and
    will get the information that the other HA resources (just vm:101)
    is on both node1 and node3
4b. In case of equal weights on both nodes, the alphabetically first is
    chosen [0]
5. vm:100 is migrated to node1



This fix needs changes from [0] as this patch series implements a way to
differentiate between $current_node and $target_node in
get_resource_affinity(...). Since [1] makes changes to that subroutine
too, I rebased on top of [1], even though this fix can also be applied
on top of [0] with some adaption.

I tried to write the test case a little bit more straight forward by
having a parameter to set a 'migration duration', but that would require
quite a few modifications to the current single-threaded pve-ha-tester,
e.g. a waitqueue which handles "delayed" migration finishes. We could
still do that if we need it for some other test case, but for now
setting up the environment worked fine.



[0] https://lore.proxmox.com/pve-devel/20251027164513.542678-1-d.kral@proxmox.com/
[1] https://lore.proxmox.com/pve-devel/20251103102118.153666-1-d.kral@proxmox.com/


Daniel Kral (2):
  test: add delayed positive resource affinity migration test case
  fix #6801: only consider target node during positive resource affinity
    migration

 src/PVE/HA/Rules/ResourceAffinity.pm          |  6 ++--
 .../log.expect                                | 25 +++--------------
 .../log.expect                                | 28 +++++++++----------
 .../README                                    |  2 ++
 .../cmdlist                                   |  3 ++
 .../hardware_status                           |  5 ++++
 .../log.expect                                | 26 +++++++++++++++++
 .../manager_status                            | 21 ++++++++++++++
 .../rules_config                              |  3 ++
 .../service_config                            |  4 +++
 10 files changed, 86 insertions(+), 37 deletions(-)
 create mode 100644 src/test/test-resource-affinity-strict-positive6/README
 create mode 100644 src/test/test-resource-affinity-strict-positive6/cmdlist
 create mode 100644 src/test/test-resource-affinity-strict-positive6/hardware_status
 create mode 100644 src/test/test-resource-affinity-strict-positive6/log.expect
 create mode 100644 src/test/test-resource-affinity-strict-positive6/manager_status
 create mode 100644 src/test/test-resource-affinity-strict-positive6/rules_config
 create mode 100644 src/test/test-resource-affinity-strict-positive6/service_config
-- 
2.47.3



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-11-03 15:18 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-03 15:17 [pve-devel] [PATCH ha-manager 0/2] fix #6801 Daniel Kral
2025-11-03 15:17 ` [pve-devel] [PATCH ha-manager 1/2] test: add delayed positive resource affinity migration test case Daniel Kral
2025-11-03 15:17 ` [pve-devel] [PATCH ha-manager 2/2] fix #6801: only consider target node during positive resource affinity migration Daniel Kral

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal