public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1)
@ 2025-08-01 16:22 Daniel Kral
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Here's a follow-up on the HA rules, especially the HA resource
affinity rules.

The first three patches haven't changed, as they were lower priority for
me than the last part about loosening the restrictions on mixed resource
references.

Patches #1 - #8 are rather independent, but still have some ordering
dependencies among them (e.g. patches #1 - #3 depend on each other). The
larger part is patches #9 - #12, which allow mixed resource references
between node and resource affinity rules.

Also did a

    git rebase HEAD~12 --exec 'make clean && make deb'

on the series.

Changelog to v1
---------------

- add missing return schema to read_rule api

- ignore the disable parameter if it is set to a falsy value, as
  otherwise the rule feasibility tests would be skipped

- do not check ignored resources, as otherwise migrations are not
  possible, or co-migrations are done even if the resource is not
  HA-managed (this could be a use case in the future, but not for now)

- change the behavior for single-resource failures of resources in
  positive affinity rules: try to recover the resource by migrating all
  resources to another available node

- loosen the restrictions on mixed resource references, i.e., when a
  resource is used in a node affinity rule and a resource affinity rule
  at the same time (see the patch for more information)

TODO (more important)
---------------------

- more test cases and dev/user testing

- do not include ignored resources in feasibility check

- updates to pve-docs about the new behavior and missing docs:

  - new: mixing node and resource affinity rules is now allowed

  - new: interactions between node and resource affinity rules

  - new: ha rule conflicts for mixed usage

  - missing: for positive resource affinity rules, if resources are
    separated across nodes, the most populated node will be chosen as the
    one all of them are migrated to (see the sketch after this list)

  - maybe missing: the strictness behavior of negative resource affinity
    rules, i.e., that a resource can neither be migrated to a node where
    another negatively affine resource is located, nor to a node such a
    resource is currently being migrated to

- using node affinity rules and negative resource affinity rules together
  should still be improved, as there are quite a few cases where manual
  intervention is needed to follow the resources' node affinity rules
  (for now, resource affinity rules often override node affinity rules;
  the former should get more information about the latter in the
  scheduler; but this might not be a blocker as it's rather an edge case)

  - some of these cases are also represented in the added test cases, so
    that changes to them are documented in the future if those cases are
    accommodated as well
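
To make the pve-docs point about positive resource affinity above a bit
more concrete, here is an illustrative Perl sketch of picking the most
populated node (hypothetical names, not the actual Manager code):

    # $resource_nodes: hashref of resource id => current node, e.g.
    # { 'vm:101' => 'node1', 'vm:102' => 'node2', 'vm:103' => 'node2' }
    sub most_populated_node {
        my ($resource_nodes) = @_;

        my $count = {};
        $count->{$_}++ for values %$resource_nodes;

        # choose the node holding the most resources; the others migrate there
        my ($best) = sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count;

        return $best;
    }

    # most_populated_node({ 'vm:101' => 'node1', 'vm:102' => 'node2',
    #                       'vm:103' => 'node2' }) returns 'node2'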

TODO (less important)
---------------------

- adding node affinity rule blockers as well (but not really important)

- maybe log groups that were not migrated because they had no group
  members?

- address other suggestions made on the original patch series

- cleaning up


Daniel Kral (12):
  manager: fix ~revision version check for ha groups migration
  test: ha tester: add ha groups migration tests with runtime upgrades
  tree-wide: pass optional parameters as hash values for for_each_rule
    helper
  api: rules: add missing return schema for the read_rule api endpoint
  api: rules: ignore disable parameter if it is set to a falsy value
  rules: resource affinity: make message in inter-consistency check
    clearer
  config, manager: do not check ignored resources with affinity when
    migrating
  rules: make positive affinity resources migrate on single resource
    fail
  rules: allow same resources in node and resource affinity rules
  rules: restrict inter-plugin resource references to simple cases
  test: rules: add test cases for inter-plugin checks allowing simple
    use cases
  test: ha tester: add resource affinity test cases mixed with node
    affinity rules

 src/PVE/API2/HA/Rules.pm                      |  12 +-
 src/PVE/HA/Config.pm                          |   2 +
 src/PVE/HA/Manager.pm                         |  11 +-
 src/PVE/HA/Rules.pm                           | 289 +++++++++++++--
 src/PVE/HA/Rules/NodeAffinity.pm              |  14 +-
 src/PVE/HA/Rules/ResourceAffinity.pm          |  30 +-
 src/PVE/HA/Sim/Hardware.pm                    |   8 +
 ...onsistent-node-resource-affinity-rules.cfg |  54 +++
 ...nt-node-resource-affinity-rules.cfg.expect |  73 ++++
 ...sistent-resource-affinity-rules.cfg.expect |   8 +-
 ...egative-resource-affinity-rules.cfg.expect |   4 +-
 ...y-for-positive-resource-affinity-rules.cfg |  37 ++
 ...ositive-resource-affinity-rules.cfg.expect | 111 ++++++
 ...-affinity-with-resource-affinity-rules.cfg |  35 ++
 ...ty-with-resource-affinity-rules.cfg.expect |  48 +++
 .../multiple-resource-refs-in-rules.cfg       |  52 ---
 ...multiple-resource-refs-in-rules.cfg.expect | 111 ------
 src/test/test-group-migrate3/README           |   7 +
 src/test/test-group-migrate3/cmdlist          |  17 +
 src/test/test-group-migrate3/groups           |   7 +
 src/test/test-group-migrate3/hardware_status  |   5 +
 src/test/test-group-migrate3/log.expect       | 344 ++++++++++++++++++
 src/test/test-group-migrate3/manager_status   |   1 +
 src/test/test-group-migrate3/service_config   |   5 +
 src/test/test-group-migrate4/README           |   8 +
 src/test/test-group-migrate4/cmdlist          |  15 +
 src/test/test-group-migrate4/groups           |   7 +
 src/test/test-group-migrate4/hardware_status  |   5 +
 src/test/test-group-migrate4/log.expect       | 277 ++++++++++++++
 src/test/test-group-migrate4/manager_status   |   1 +
 src/test/test-group-migrate4/service_config   |   5 +
 .../README                                    |   9 +-
 .../log.expect                                |  26 +-
 .../README                                    |  14 +
 .../cmdlist                                   |   3 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  41 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   7 +
 .../service_config                            |   5 +
 .../README                                    |  13 +
 .../cmdlist                                   |   3 +
 .../hardware_status                           |   7 +
 .../log.expect                                |  63 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |   7 +
 .../service_config                            |   5 +
 .../README                                    |  17 +
 .../cmdlist                                   |   6 +
 .../hardware_status                           |   6 +
 .../log.expect                                |  67 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |  15 +
 .../service_config                            |   5 +
 .../README                                    |  15 +
 .../cmdlist                                   |   3 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  49 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   7 +
 .../service_config                            |   5 +
 .../README                                    |  15 +
 .../cmdlist                                   |   3 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  49 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   7 +
 .../service_config                            |   5 +
 68 files changed, 1852 insertions(+), 248 deletions(-)
 create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
 create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
 create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
 delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
 delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
 create mode 100644 src/test/test-group-migrate3/README
 create mode 100644 src/test/test-group-migrate3/cmdlist
 create mode 100644 src/test/test-group-migrate3/groups
 create mode 100644 src/test/test-group-migrate3/hardware_status
 create mode 100644 src/test/test-group-migrate3/log.expect
 create mode 100644 src/test/test-group-migrate3/manager_status
 create mode 100644 src/test/test-group-migrate3/service_config
 create mode 100644 src/test/test-group-migrate4/README
 create mode 100644 src/test/test-group-migrate4/cmdlist
 create mode 100644 src/test/test-group-migrate4/groups
 create mode 100644 src/test/test-group-migrate4/hardware_status
 create mode 100644 src/test/test-group-migrate4/log.expect
 create mode 100644 src/test/test-group-migrate4/manager_status
 create mode 100644 src/test/test-group-migrate4/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config

-- 
2.47.2




* [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration
@ 2025-08-01 16:22 ` Daniel Kral
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

With the minimum version of 9.0.0~16 required to migrate HA groups, the
version 9.0.0 would wrongly fail the check, as 0 < 16 is true. If the
~revision is not set for $version, it must be ordered after any minimum
~revision.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
nothing changed since v1
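
For illustration only (not part of the patch): a minimal standalone sketch
of the intended ordering, assuming Debian-style semantics where a missing
~revision sorts after any present one; the helper name is hypothetical:

    # returns 1 if ($patch, $rev) satisfies the minimum ($min_patch, $min_rev);
    # an undefined $rev means a release build, which sorts after any ~revision
    sub patch_rev_is_min {
        my ($patch, $rev, $min_patch, $min_rev) = @_;

        return 0 if $patch < $min_patch;
        return 1 if $patch > $min_patch;

        $min_rev //= 0;
        return 0 if defined($rev) && $rev < $min_rev; # e.g. 9.0.0~15 < 9.0.0~16
        return 1; # e.g. 9.0.0 (no ~revision) satisfies 9.0.0~16
    }

    # patch_rev_is_min(0, undef, 0, 16) == 1    # 9.0.0    >= 9.0.0~16
    # patch_rev_is_min(0, 15,    0, 16) == 0    # 9.0.0~15 <  9.0.0~16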

 src/PVE/HA/Manager.pm | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 9d7cb73f..0be12061 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -548,10 +548,13 @@ my $has_node_min_version = sub {
     return 0 if $major == $min_major && $minor < $min_minor;
     return 0 if $major == $min_major && $minor == $min_minor && $patch < $min_patch;
 
-    $rev //= 0;
     $min_rev //= 0;
     return 0
-        if $major == $min_major && $minor == $min_minor && $patch == $min_patch && $rev < $min_rev;
+        if $major == $min_major
+        && $minor == $min_minor
+        && $patch == $min_patch
+        && defined($rev)
+        && $rev < $min_rev;
 
     return 1;
 };
-- 
2.47.2




* [pve-devel] [PATCH ha-manager v2 02/12] test: ha tester: add ha groups migration tests with runtime upgrades
@ 2025-08-01 16:22 ` Daniel Kral
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

These test cases cover slightly more realistic upgrade paths of the
cluster nodes, where the nodes are upgraded and rebooted one by one and
some actions might fail in between.

The new sim_hardware_cmd 'version' is introduced to allow simulating
runtime upgrades of each node; it should be removed as soon as the HA
groups migration support code is not needed anymore (e.g. with PVE 10).

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
nothing changed since v1
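
As a quick illustration of what the new command does (a simplified
standalone sketch; the real handler below lives in sim_hardware_cmd and
persists the result via write_hardware_status_nolock):

    # apply e.g. "version node3 set 9.0.0~17" to a hardware status hash
    sub apply_version_cmd {
        my ($cstatus, $cmdstr) = @_;

        my ($cmd, $node, $action, $param) = split(/\s+/, $cmdstr);
        die "unknown version action '$action'\n"
            if $cmd ne 'version' || $action ne 'set';

        $cstatus->{$node}->{version} = $param; # later read by the version check
        return $cstatus;
    }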

 src/PVE/HA/Sim/Hardware.pm                   |   8 +
 src/test/test-group-migrate3/README          |   7 +
 src/test/test-group-migrate3/cmdlist         |  17 +
 src/test/test-group-migrate3/groups          |   7 +
 src/test/test-group-migrate3/hardware_status |   5 +
 src/test/test-group-migrate3/log.expect      | 344 +++++++++++++++++++
 src/test/test-group-migrate3/manager_status  |   1 +
 src/test/test-group-migrate3/service_config  |   5 +
 src/test/test-group-migrate4/README          |   8 +
 src/test/test-group-migrate4/cmdlist         |  15 +
 src/test/test-group-migrate4/groups          |   7 +
 src/test/test-group-migrate4/hardware_status |   5 +
 src/test/test-group-migrate4/log.expect      | 277 +++++++++++++++
 src/test/test-group-migrate4/manager_status  |   1 +
 src/test/test-group-migrate4/service_config  |   5 +
 15 files changed, 712 insertions(+)
 create mode 100644 src/test/test-group-migrate3/README
 create mode 100644 src/test/test-group-migrate3/cmdlist
 create mode 100644 src/test/test-group-migrate3/groups
 create mode 100644 src/test/test-group-migrate3/hardware_status
 create mode 100644 src/test/test-group-migrate3/log.expect
 create mode 100644 src/test/test-group-migrate3/manager_status
 create mode 100644 src/test/test-group-migrate3/service_config
 create mode 100644 src/test/test-group-migrate4/README
 create mode 100644 src/test/test-group-migrate4/cmdlist
 create mode 100644 src/test/test-group-migrate4/groups
 create mode 100644 src/test/test-group-migrate4/hardware_status
 create mode 100644 src/test/test-group-migrate4/log.expect
 create mode 100644 src/test/test-group-migrate4/manager_status
 create mode 100644 src/test/test-group-migrate4/service_config

diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 4207ce31..63eb89ff 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -596,6 +596,7 @@ sub get_cfs_state {
 # simulate hardware commands, the following commands are available:
 #   power <node> <on|off>
 #   network <node> <on|off>
+#   version <node> set <version>
 #   delay <seconds>
 #   skip-round <crm|lrm> [<rounds=1>]
 #   cfs <node> <rw|update> <work|fail>
@@ -683,6 +684,13 @@ sub sim_hardware_cmd {
 
             $self->write_hardware_status_nolock($cstatus);
 
+        } elsif ($cmd eq 'version') {
+            die "sim_hardware_cmd: unknown version action '$action'"
+                if $action ne "set";
+            $cstatus->{$node}->{version} = $param;
+
+            $self->write_hardware_status_nolock($cstatus);
+
         } elsif ($cmd eq 'cfs') {
             die "sim_hardware_cmd: unknown cfs action '$action' for node '$node'"
                 if $action !~ m/^(rw|update)$/;
diff --git a/src/test/test-group-migrate3/README b/src/test/test-group-migrate3/README
new file mode 100644
index 00000000..0ee45f7a
--- /dev/null
+++ b/src/test/test-group-migrate3/README
@@ -0,0 +1,7 @@
+Test whether an initial (unsupported) mixed version cluster can be properly
+upgraded per major version and then the CRM correctly migrates the HA group
+config only after all nodes have at least the proper pre-release version.
+
+By rebooting every node after each version change, this tests whether the
+switching of the CRM node and a few instances of LRM restarts are properly
+prohibiting the HA groups config migration.
diff --git a/src/test/test-group-migrate3/cmdlist b/src/test/test-group-migrate3/cmdlist
new file mode 100644
index 00000000..d507acad
--- /dev/null
+++ b/src/test/test-group-migrate3/cmdlist
@@ -0,0 +1,17 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "version node1 set 8.4.1" ],
+    [ "reboot node1" ],
+    [ "version node2 set 8.4.1" ],
+    [ "reboot node2" ],
+    [ "version node3 set 8.4.1" ],
+    [ "reboot node3" ],
+    [ "version node1 set 9.0.0~16" ],
+    [ "reboot node1" ],
+    [ "version node2 set 9.0.0~16" ],
+    [ "reboot node2" ],
+    [ "version node3 set 9.0.0~15" ],
+    [ "reboot node3" ],
+    [ "version node3 set 9.0.0~17" ],
+    [ "reboot node3" ]
+]
diff --git a/src/test/test-group-migrate3/groups b/src/test/test-group-migrate3/groups
new file mode 100644
index 00000000..bad746ca
--- /dev/null
+++ b/src/test/test-group-migrate3/groups
@@ -0,0 +1,7 @@
+group: group1
+	nodes node1
+	restricted 1
+
+group: group2
+	nodes node2:2,node3
+	nofailback 1
diff --git a/src/test/test-group-migrate3/hardware_status b/src/test/test-group-migrate3/hardware_status
new file mode 100644
index 00000000..e8f9d73f
--- /dev/null
+++ b/src/test/test-group-migrate3/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "version": "7.4-4" },
+  "node2": { "power": "off", "network": "off", "version": "8.3.7" },
+  "node3": { "power": "off", "network": "off", "version": "8.3.0" }
+}
diff --git a/src/test/test-group-migrate3/log.expect b/src/test/test-group-migrate3/log.expect
new file mode 100644
index 00000000..63be1218
--- /dev/null
+++ b/src/test/test-group-migrate3/log.expect
@@ -0,0 +1,344 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:102
+info     23    node2/lrm: service status vm:102 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+noti     60    node1/crm: start ha group migration...
+noti     60    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti     60    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti     60    node1/crm: ha groups migration: node 'node1' has version '7.4-4'
+err      60    node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err      60    node1/crm: ha groups migration failed
+noti     60    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    120      cmdlist: execute version node1 set 8.4.1
+noti    180    node1/crm: start ha group migration...
+noti    180    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti    180    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    180    node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     180    node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     180    node1/crm: ha groups migration failed
+noti    180    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    220      cmdlist: execute reboot node1
+info    220    node1/lrm: got shutdown request with shutdown policy 'conditional'
+info    220    node1/lrm: reboot LRM, stop and freeze all services
+info    220    node1/crm: service 'vm:101': state changed from 'started' to 'freeze'
+info    221    node1/lrm: stopping service vm:101
+info    221    node1/lrm: service status vm:101 stopped
+info    222    node1/lrm: exit (loop end)
+info    222       reboot: execute crm node1 stop
+info    221    node1/crm: server received shutdown request
+info    240    node1/crm: voluntary release CRM lock
+info    241    node1/crm: exit (loop end)
+info    241       reboot: execute power node1 off
+info    241       reboot: execute power node1 on
+info    241    node1/crm: status change startup => wait_for_quorum
+info    240    node1/lrm: status change startup => wait_for_agent_lock
+info    242    node2/crm: got lock 'ha_manager_lock'
+info    242    node2/crm: status change slave => master
+info    242    node2/crm: service 'vm:101': state changed from 'freeze' to 'started'
+info    260    node1/crm: status change wait_for_quorum => slave
+info    261    node1/lrm: got lock 'ha_agent_node1_lock'
+info    261    node1/lrm: status change wait_for_agent_lock => active
+info    261    node1/lrm: starting service vm:101
+info    261    node1/lrm: service status vm:101 started
+noti    282    node2/crm: start ha group migration...
+noti    282    node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti    282    node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    282    node2/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     282    node2/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     282    node2/crm: ha groups migration failed
+noti    282    node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    320      cmdlist: execute version node2 set 8.4.1
+noti    402    node2/crm: start ha group migration...
+noti    402    node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti    402    node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    402    node2/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     402    node2/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     402    node2/crm: ha groups migration failed
+noti    402    node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    420      cmdlist: execute reboot node2
+info    420    node2/lrm: got shutdown request with shutdown policy 'conditional'
+info    420    node2/lrm: reboot LRM, stop and freeze all services
+info    422    node2/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info    423    node2/lrm: stopping service vm:102
+info    423    node2/lrm: service status vm:102 stopped
+info    424    node2/lrm: exit (loop end)
+info    424       reboot: execute crm node2 stop
+info    423    node2/crm: server received shutdown request
+info    442    node2/crm: voluntary release CRM lock
+info    443    node2/crm: exit (loop end)
+info    443       reboot: execute power node2 off
+info    443       reboot: execute power node2 on
+info    443    node2/crm: status change startup => wait_for_quorum
+info    440    node2/lrm: status change startup => wait_for_agent_lock
+info    444    node3/crm: got lock 'ha_manager_lock'
+info    444    node3/crm: status change slave => master
+info    444    node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info    462    node2/crm: status change wait_for_quorum => slave
+info    463    node2/lrm: got lock 'ha_agent_node2_lock'
+info    463    node2/lrm: status change wait_for_agent_lock => active
+info    463    node2/lrm: starting service vm:102
+info    463    node2/lrm: service status vm:102 started
+noti    484    node3/crm: start ha group migration...
+noti    484    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti    484    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    484    node3/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     484    node3/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     484    node3/crm: ha groups migration failed
+noti    484    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    520      cmdlist: execute version node3 set 8.4.1
+noti    604    node3/crm: start ha group migration...
+noti    604    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti    604    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    604    node3/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     604    node3/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     604    node3/crm: ha groups migration failed
+noti    604    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    620      cmdlist: execute reboot node3
+info    620    node3/lrm: got shutdown request with shutdown policy 'conditional'
+info    620    node3/lrm: reboot LRM, stop and freeze all services
+info    624    node3/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info    625    node3/lrm: stopping service vm:103
+info    625    node3/lrm: service status vm:103 stopped
+info    626    node3/lrm: exit (loop end)
+info    626       reboot: execute crm node3 stop
+info    625    node3/crm: server received shutdown request
+info    644    node3/crm: voluntary release CRM lock
+info    645    node3/crm: exit (loop end)
+info    645       reboot: execute power node3 off
+info    645       reboot: execute power node3 on
+info    645    node3/crm: status change startup => wait_for_quorum
+info    640    node3/lrm: status change startup => wait_for_agent_lock
+info    660    node1/crm: got lock 'ha_manager_lock'
+info    660    node1/crm: status change slave => master
+info    660    node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info    664    node3/crm: status change wait_for_quorum => slave
+info    665    node3/lrm: got lock 'ha_agent_node3_lock'
+info    665    node3/lrm: status change wait_for_agent_lock => active
+info    665    node3/lrm: starting service vm:103
+info    665    node3/lrm: service status vm:103 started
+noti    700    node1/crm: start ha group migration...
+noti    700    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti    700    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    700    node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     700    node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     700    node1/crm: ha groups migration failed
+noti    700    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    720      cmdlist: execute version node1 set 9.0.0~16
+info    820      cmdlist: execute reboot node1
+info    820    node1/lrm: got shutdown request with shutdown policy 'conditional'
+info    820    node1/lrm: reboot LRM, stop and freeze all services
+noti    820    node1/crm: start ha group migration...
+noti    820    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti    820    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'restart'
+err     820    node1/crm: abort ha groups migration: lrm 'node1' is not in mode 'active'
+err     820    node1/crm: ha groups migration failed
+noti    820    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    820    node1/crm: service 'vm:101': state changed from 'started' to 'freeze'
+info    821    node1/lrm: stopping service vm:101
+info    821    node1/lrm: service status vm:101 stopped
+info    822    node1/lrm: exit (loop end)
+info    822       reboot: execute crm node1 stop
+info    821    node1/crm: server received shutdown request
+info    840    node1/crm: voluntary release CRM lock
+info    841    node1/crm: exit (loop end)
+info    841       reboot: execute power node1 off
+info    841       reboot: execute power node1 on
+info    841    node1/crm: status change startup => wait_for_quorum
+info    840    node1/lrm: status change startup => wait_for_agent_lock
+info    842    node2/crm: got lock 'ha_manager_lock'
+info    842    node2/crm: status change slave => master
+info    842    node2/crm: service 'vm:101': state changed from 'freeze' to 'started'
+info    860    node1/crm: status change wait_for_quorum => slave
+info    861    node1/lrm: got lock 'ha_agent_node1_lock'
+info    861    node1/lrm: status change wait_for_agent_lock => active
+info    861    node1/lrm: starting service vm:101
+info    861    node1/lrm: service status vm:101 started
+noti    882    node2/crm: start ha group migration...
+noti    882    node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti    882    node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    882    node2/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti    882    node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti    882    node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti    882    node2/crm: ha groups migration: node 'node2' has version '8.4.1'
+err     882    node2/crm: abort ha groups migration: node 'node2' needs at least pve-manager version '9.0.0~16'
+err     882    node2/crm: ha groups migration failed
+noti    882    node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    920      cmdlist: execute version node2 set 9.0.0~16
+noti   1002    node2/crm: start ha group migration...
+noti   1002    node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1002    node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1002    node2/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti   1002    node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1002    node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1002    node2/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti   1002    node2/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1002    node2/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1002    node2/crm: ha groups migration: node 'node3' has version '8.4.1'
+err    1002    node2/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err    1002    node2/crm: ha groups migration failed
+noti   1002    node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1020      cmdlist: execute reboot node2
+info   1020    node2/lrm: got shutdown request with shutdown policy 'conditional'
+info   1020    node2/lrm: reboot LRM, stop and freeze all services
+info   1022    node2/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info   1023    node2/lrm: stopping service vm:102
+info   1023    node2/lrm: service status vm:102 stopped
+info   1024    node2/lrm: exit (loop end)
+info   1024       reboot: execute crm node2 stop
+info   1023    node2/crm: server received shutdown request
+info   1042    node2/crm: voluntary release CRM lock
+info   1043    node2/crm: exit (loop end)
+info   1043       reboot: execute power node2 off
+info   1043       reboot: execute power node2 on
+info   1043    node2/crm: status change startup => wait_for_quorum
+info   1040    node2/lrm: status change startup => wait_for_agent_lock
+info   1044    node3/crm: got lock 'ha_manager_lock'
+info   1044    node3/crm: status change slave => master
+info   1044    node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info   1062    node2/crm: status change wait_for_quorum => slave
+info   1063    node2/lrm: got lock 'ha_agent_node2_lock'
+info   1063    node2/lrm: status change wait_for_agent_lock => active
+info   1063    node2/lrm: starting service vm:102
+info   1063    node2/lrm: service status vm:102 started
+noti   1084    node3/crm: start ha group migration...
+noti   1084    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1084    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1084    node3/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti   1084    node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1084    node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1084    node3/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti   1084    node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1084    node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1084    node3/crm: ha groups migration: node 'node3' has version '8.4.1'
+err    1084    node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err    1084    node3/crm: ha groups migration failed
+noti   1084    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1120      cmdlist: execute version node3 set 9.0.0~15
+noti   1204    node3/crm: start ha group migration...
+noti   1204    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1204    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1204    node3/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti   1204    node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1204    node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1204    node3/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti   1204    node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1204    node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1204    node3/crm: ha groups migration: node 'node3' has version '9.0.0~15'
+err    1204    node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err    1204    node3/crm: ha groups migration failed
+noti   1204    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1220      cmdlist: execute reboot node3
+info   1220    node3/lrm: got shutdown request with shutdown policy 'conditional'
+info   1220    node3/lrm: reboot LRM, stop and freeze all services
+info   1224    node3/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info   1225    node3/lrm: stopping service vm:103
+info   1225    node3/lrm: service status vm:103 stopped
+info   1226    node3/lrm: exit (loop end)
+info   1226       reboot: execute crm node3 stop
+info   1225    node3/crm: server received shutdown request
+info   1244    node3/crm: voluntary release CRM lock
+info   1245    node3/crm: exit (loop end)
+info   1245       reboot: execute power node3 off
+info   1245       reboot: execute power node3 on
+info   1245    node3/crm: status change startup => wait_for_quorum
+info   1240    node3/lrm: status change startup => wait_for_agent_lock
+info   1260    node1/crm: got lock 'ha_manager_lock'
+info   1260    node1/crm: status change slave => master
+info   1260    node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info   1264    node3/crm: status change wait_for_quorum => slave
+info   1265    node3/lrm: got lock 'ha_agent_node3_lock'
+info   1265    node3/lrm: status change wait_for_agent_lock => active
+info   1265    node3/lrm: starting service vm:103
+info   1265    node3/lrm: service status vm:103 started
+noti   1300    node1/crm: start ha group migration...
+noti   1300    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1300    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1300    node1/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti   1300    node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1300    node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1300    node1/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti   1300    node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1300    node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1300    node1/crm: ha groups migration: node 'node3' has version '9.0.0~15'
+err    1300    node1/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err    1300    node1/crm: ha groups migration failed
+noti   1300    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1320      cmdlist: execute version node3 set 9.0.0~17
+info   1420      cmdlist: execute reboot node3
+info   1420    node3/lrm: got shutdown request with shutdown policy 'conditional'
+info   1420    node3/lrm: reboot LRM, stop and freeze all services
+noti   1420    node1/crm: start ha group migration...
+noti   1420    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1420    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1420    node1/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti   1420    node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1420    node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1420    node1/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti   1420    node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1420    node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'restart'
+err    1420    node1/crm: abort ha groups migration: lrm 'node3' is not in mode 'active'
+err    1420    node1/crm: ha groups migration failed
+noti   1420    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1420    node1/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info   1425    node3/lrm: stopping service vm:103
+info   1425    node3/lrm: service status vm:103 stopped
+info   1426    node3/lrm: exit (loop end)
+info   1426       reboot: execute crm node3 stop
+info   1425    node3/crm: server received shutdown request
+info   1445    node3/crm: exit (loop end)
+info   1445       reboot: execute power node3 off
+info   1445       reboot: execute power node3 on
+info   1445    node3/crm: status change startup => wait_for_quorum
+info   1440    node3/lrm: status change startup => wait_for_agent_lock
+info   1460    node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info   1464    node3/crm: status change wait_for_quorum => slave
+info   1465    node3/lrm: got lock 'ha_agent_node3_lock'
+info   1465    node3/lrm: status change wait_for_agent_lock => active
+info   1465    node3/lrm: starting service vm:103
+info   1465    node3/lrm: service status vm:103 started
+noti   1540    node1/crm: start ha group migration...
+noti   1540    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1540    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1540    node1/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti   1540    node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1540    node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1540    node1/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti   1540    node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1540    node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1540    node1/crm: ha groups migration: node 'node3' has version '9.0.0~17'
+noti   1540    node1/crm: ha groups migration: migration to rules config successful
+noti   1540    node1/crm: ha groups migration: migration to resources config successful
+noti   1540    node1/crm: ha groups migration: group config deletion successful
+noti   1540    node1/crm: ha groups migration successful
+info   2020     hardware: exit simulation - done
diff --git a/src/test/test-group-migrate3/manager_status b/src/test/test-group-migrate3/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-group-migrate3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-group-migrate3/service_config b/src/test/test-group-migrate3/service_config
new file mode 100644
index 00000000..a27551e5
--- /dev/null
+++ b/src/test/test-group-migrate3/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node1", "state": "started", "group": "group1" },
+    "vm:102": { "node": "node2", "state": "started", "group": "group2" },
+    "vm:103": { "node": "node3", "state": "started", "group": "group2" }
+}
diff --git a/src/test/test-group-migrate4/README b/src/test/test-group-migrate4/README
new file mode 100644
index 00000000..37e60c7d
--- /dev/null
+++ b/src/test/test-group-migrate4/README
@@ -0,0 +1,8 @@
+Test whether a cluster, where all nodes have the same version from the previous
+major release, can be properly upgraded to the needed major release version and
+then the CRM correctly migrates the HA group config only after all nodes have
+the minimum major release version.
+
+Additionally, the nodes are rebooted with every version upgrade and in-between
+the CFS sporadically fails to read/write, fails to update cluster state and an
+LRM is restarted, which all prohibit the HA groups config migration.
diff --git a/src/test/test-group-migrate4/cmdlist b/src/test/test-group-migrate4/cmdlist
new file mode 100644
index 00000000..fdd3bfdd
--- /dev/null
+++ b/src/test/test-group-migrate4/cmdlist
@@ -0,0 +1,15 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "delay 10" ],
+    [ "version node1 set 9.0.0" ],
+    [ "reboot node1" ],
+    [ "cfs node2 rw fail" ],
+    [ "version node2 set 9.0.0" ],
+    [ "cfs node2 rw work" ],
+    [ "reboot node2" ],
+    [ "cfs node3 update fail" ],
+    [ "cfs node3 update work" ],
+    [ "version node3 set 9.0.1" ],
+    [ "restart-lrm node2" ],
+    [ "reboot node3" ]
+]
diff --git a/src/test/test-group-migrate4/groups b/src/test/test-group-migrate4/groups
new file mode 100644
index 00000000..bad746ca
--- /dev/null
+++ b/src/test/test-group-migrate4/groups
@@ -0,0 +1,7 @@
+group: group1
+	nodes node1
+	restricted 1
+
+group: group2
+	nodes node2:2,node3
+	nofailback 1
diff --git a/src/test/test-group-migrate4/hardware_status b/src/test/test-group-migrate4/hardware_status
new file mode 100644
index 00000000..7ad46416
--- /dev/null
+++ b/src/test/test-group-migrate4/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "version": "8.4.1" },
+  "node2": { "power": "off", "network": "off", "version": "8.4.1" },
+  "node3": { "power": "off", "network": "off", "version": "8.4.1" }
+}
diff --git a/src/test/test-group-migrate4/log.expect b/src/test/test-group-migrate4/log.expect
new file mode 100644
index 00000000..7ffe33e3
--- /dev/null
+++ b/src/test/test-group-migrate4/log.expect
@@ -0,0 +1,277 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:102
+info     23    node2/lrm: service status vm:102 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+noti     60    node1/crm: start ha group migration...
+noti     60    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti     60    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti     60    node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err      60    node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err      60    node1/crm: ha groups migration failed
+noti     60    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    120      cmdlist: execute delay 10
+noti    180    node1/crm: start ha group migration...
+noti    180    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti    180    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    180    node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err     180    node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err     180    node1/crm: ha groups migration failed
+noti    180    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    220      cmdlist: execute version node1 set 9.0.0
+noti    300    node1/crm: start ha group migration...
+noti    300    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti    300    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    300    node1/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti    300    node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti    300    node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti    300    node1/crm: ha groups migration: node 'node2' has version '8.4.1'
+err     300    node1/crm: abort ha groups migration: node 'node2' needs at least pve-manager version '9.0.0~16'
+err     300    node1/crm: ha groups migration failed
+noti    300    node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    320      cmdlist: execute reboot node1
+info    320    node1/lrm: got shutdown request with shutdown policy 'conditional'
+info    320    node1/lrm: reboot LRM, stop and freeze all services
+info    320    node1/crm: service 'vm:101': state changed from 'started' to 'freeze'
+info    321    node1/lrm: stopping service vm:101
+info    321    node1/lrm: service status vm:101 stopped
+info    322    node1/lrm: exit (loop end)
+info    322       reboot: execute crm node1 stop
+info    321    node1/crm: server received shutdown request
+info    340    node1/crm: voluntary release CRM lock
+info    341    node1/crm: exit (loop end)
+info    341       reboot: execute power node1 off
+info    341       reboot: execute power node1 on
+info    341    node1/crm: status change startup => wait_for_quorum
+info    340    node1/lrm: status change startup => wait_for_agent_lock
+info    342    node2/crm: got lock 'ha_manager_lock'
+info    342    node2/crm: status change slave => master
+info    342    node2/crm: service 'vm:101': state changed from 'freeze' to 'started'
+info    360    node1/crm: status change wait_for_quorum => slave
+info    361    node1/lrm: got lock 'ha_agent_node1_lock'
+info    361    node1/lrm: status change wait_for_agent_lock => active
+info    361    node1/lrm: starting service vm:101
+info    361    node1/lrm: service status vm:101 started
+noti    382    node2/crm: start ha group migration...
+noti    382    node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti    382    node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    382    node2/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti    382    node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti    382    node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti    382    node2/crm: ha groups migration: node 'node2' has version '8.4.1'
+err     382    node2/crm: abort ha groups migration: node 'node2' needs at least pve-manager version '9.0.0~16'
+err     382    node2/crm: ha groups migration failed
+noti    382    node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    420      cmdlist: execute cfs node2 rw fail
+err     422    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     422    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     423    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     423    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     423    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     442    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     442    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     443    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     443    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     443    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     462    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     462    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     463    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     463    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     463    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     482    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     482    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     483    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     483    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     483    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     502    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     502    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     503    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     503    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     503    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+info    520      cmdlist: execute version node2 set 9.0.0
+err     522    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     522    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     523    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     523    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     523    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     542    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     542    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     543    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     543    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     543    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     562    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     562    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     563    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     563    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     563    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     582    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     582    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     583    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     583    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     583    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err     602    node2/crm: could not read manager status: cfs connection refused - not mounted?
+err     602    node2/crm: got unexpected error - cfs connection refused - not mounted?
+err     603    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     603    node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err     603    node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+info    620      cmdlist: execute cfs node2 rw work
+noti    702    node2/crm: start ha group migration...
+noti    702    node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti    702    node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    702    node2/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti    702    node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti    702    node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti    702    node2/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti    702    node2/crm: ha groups migration: node 'node3' is in state 'online'
+noti    702    node2/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti    702    node2/crm: ha groups migration: node 'node3' has version '8.4.1'
+err     702    node2/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err     702    node2/crm: ha groups migration failed
+noti    702    node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    720      cmdlist: execute reboot node2
+info    720    node2/lrm: got shutdown request with shutdown policy 'conditional'
+info    720    node2/lrm: reboot LRM, stop and freeze all services
+info    722    node2/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info    723    node2/lrm: stopping service vm:102
+info    723    node2/lrm: service status vm:102 stopped
+info    724    node2/lrm: exit (loop end)
+info    724       reboot: execute crm node2 stop
+info    723    node2/crm: server received shutdown request
+info    742    node2/crm: voluntary release CRM lock
+info    743    node2/crm: exit (loop end)
+info    743       reboot: execute power node2 off
+info    743       reboot: execute power node2 on
+info    743    node2/crm: status change startup => wait_for_quorum
+info    740    node2/lrm: status change startup => wait_for_agent_lock
+info    744    node3/crm: got lock 'ha_manager_lock'
+info    744    node3/crm: status change slave => master
+info    744    node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info    762    node2/crm: status change wait_for_quorum => slave
+info    763    node2/lrm: got lock 'ha_agent_node2_lock'
+info    763    node2/lrm: status change wait_for_agent_lock => active
+info    763    node2/lrm: starting service vm:102
+info    763    node2/lrm: service status vm:102 started
+noti    784    node3/crm: start ha group migration...
+noti    784    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti    784    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti    784    node3/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti    784    node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti    784    node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti    784    node3/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti    784    node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti    784    node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti    784    node3/crm: ha groups migration: node 'node3' has version '8.4.1'
+err     784    node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err     784    node3/crm: ha groups migration failed
+noti    784    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info    820      cmdlist: execute cfs node3 update fail
+noti    824    node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    825    node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    844    node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    845    node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    864    node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    865    node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    884    node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    885    node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    904    node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti    905    node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+info    920      cmdlist: execute cfs node3 update work
+noti   1004    node3/crm: start ha group migration...
+noti   1004    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1004    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1004    node3/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti   1004    node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1004    node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1004    node3/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti   1004    node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1004    node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1004    node3/crm: ha groups migration: node 'node3' has version '8.4.1'
+err    1004    node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err    1004    node3/crm: ha groups migration failed
+noti   1004    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1020      cmdlist: execute version node3 set 9.0.1
+info   1120      cmdlist: execute restart-lrm node2
+info   1120    node2/lrm: restart LRM, freeze all services
+noti   1124    node3/crm: start ha group migration...
+noti   1124    node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1124    node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1124    node3/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti   1124    node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1124    node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'restart'
+err    1124    node3/crm: abort ha groups migration: lrm 'node2' is not in mode 'active'
+err    1124    node3/crm: ha groups migration failed
+noti   1124    node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info   1124    node3/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info   1144    node2/lrm: exit (loop end)
+info   1144    node2/lrm: status change startup => wait_for_agent_lock
+info   1164    node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info   1183    node2/lrm: got lock 'ha_agent_node2_lock'
+info   1183    node2/lrm: status change wait_for_agent_lock => active
+info   1220      cmdlist: execute reboot node3
+info   1220    node3/lrm: got shutdown request with shutdown policy 'conditional'
+info   1220    node3/lrm: reboot LRM, stop and freeze all services
+info   1224    node3/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info   1225    node3/lrm: stopping service vm:103
+info   1225    node3/lrm: service status vm:103 stopped
+info   1226    node3/lrm: exit (loop end)
+info   1226       reboot: execute crm node3 stop
+info   1225    node3/crm: server received shutdown request
+info   1244    node3/crm: voluntary release CRM lock
+info   1245    node3/crm: exit (loop end)
+info   1245       reboot: execute power node3 off
+info   1245       reboot: execute power node3 on
+info   1245    node3/crm: status change startup => wait_for_quorum
+info   1240    node3/lrm: status change startup => wait_for_agent_lock
+info   1260    node1/crm: got lock 'ha_manager_lock'
+info   1260    node1/crm: status change slave => master
+info   1260    node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info   1264    node3/crm: status change wait_for_quorum => slave
+info   1265    node3/lrm: got lock 'ha_agent_node3_lock'
+info   1265    node3/lrm: status change wait_for_agent_lock => active
+info   1265    node3/lrm: starting service vm:103
+info   1265    node3/lrm: service status vm:103 started
+noti   1300    node1/crm: start ha group migration...
+noti   1300    node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti   1300    node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti   1300    node1/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti   1300    node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti   1300    node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti   1300    node1/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti   1300    node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti   1300    node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti   1300    node1/crm: ha groups migration: node 'node3' has version '9.0.1'
+noti   1300    node1/crm: ha groups migration: migration to rules config successful
+noti   1300    node1/crm: ha groups migration: migration to resources config successful
+noti   1300    node1/crm: ha groups migration: group config deletion successful
+noti   1300    node1/crm: ha groups migration successful
+info   1820     hardware: exit simulation - done
diff --git a/src/test/test-group-migrate4/manager_status b/src/test/test-group-migrate4/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-group-migrate4/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-group-migrate4/service_config b/src/test/test-group-migrate4/service_config
new file mode 100644
index 00000000..a27551e5
--- /dev/null
+++ b/src/test/test-group-migrate4/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node1", "state": "started", "group": "group1" },
+    "vm:102": { "node": "node2", "state": "started", "group": "group2" },
+    "vm:103": { "node": "node3", "state": "started", "group": "group2" }
+}
-- 
2.47.2





^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 02/12] test: ha tester: add ha groups migration tests with runtime upgrades Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint Daniel Kral
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Make call sites of the for_each_rule helper more readable and, while at it,
remove unnecessary variables from the helper body as well.

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
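Illustration (not part of the change): a minimal, self-contained sketch of the
flattened-options calling convention, using a made-up foreach_item() helper and
rule data instead of the real PVE::HA code. The trailing key-value pairs are
collected into %opts and read like named parameters at the call site:

    use strict;
    use warnings;

    # sketch: optional filters are trailing key-value pairs, not a hashref
    sub foreach_item {
        my ($items, $func, %opts) = @_;

        for my $id (sort keys %$items) {
            my $item = $items->{$id};
            next if defined($opts{type}) && $item->{type} ne $opts{type};
            next if $opts{exclude_disabled} && exists($item->{disable});
            $func->($item, $id);
        }
    }

    my $items = {
        'rule-a' => { type => 'node-affinity' },
        'rule-b' => { type => 'resource-affinity', disable => 1 },
    };

    # the call site no longer needs an extra { ... } wrapper
    foreach_item($items, sub { print "$_[1]\n"; }, type => 'node-affinity');
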
 src/PVE/API2/HA/Rules.pm             |  6 ++----
 src/PVE/HA/Rules.pm                  | 16 +++++++---------
 src/PVE/HA/Rules/NodeAffinity.pm     | 14 +++++---------
 src/PVE/HA/Rules/ResourceAffinity.pm | 22 ++++++++--------------
 4 files changed, 22 insertions(+), 36 deletions(-)

diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index 1591df28..b180d2ed 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -192,10 +192,8 @@ __PACKAGE__->register_method({
 
                 push @$res, $cfg;
             },
-            {
-                type => $type,
-                sid => $resource,
-            },
+            type => $type,
+            sid => $resource,
         );
 
         return $res;
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index e5d12571..e2b77215 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -419,13 +419,13 @@ sub canonicalize : prototype($$$) {
 
 =head3 foreach_rule(...)
 
-=head3 foreach_rule($rules, $func [, $opts])
+=head3 foreach_rule($rules, $func [, %opts])
 
 Filters the given C<$rules> according to the C<$opts> and loops over the
 resulting rules in the order as defined in the section config and executes
 C<$func> with the parameters C<L<< ($rule, $ruleid) >>>.
 
-The filter properties for C<$opts> are:
+The following key-value pairs are supported as filter properties in C<%opts>:
 
 =over
 
@@ -439,12 +439,10 @@ The filter properties for C<$opts> are:
 
 =cut
 
-sub foreach_rule : prototype($$;$) {
-    my ($rules, $func, $opts) = @_;
+sub foreach_rule : prototype($$;%) {
+    my ($rules, $func, %opts) = @_;
 
-    my $sid = $opts->{sid};
-    my $type = $opts->{type};
-    my $exclude_disabled_rules = $opts->{exclude_disabled_rules};
+    my $sid = $opts{sid};
 
     my @ruleids = sort {
         $rules->{order}->{$a} <=> $rules->{order}->{$b}
@@ -455,8 +453,8 @@ sub foreach_rule : prototype($$;$) {
 
         next if !$rule; # skip invalid rules
         next if defined($sid) && !defined($rule->{resources}->{$sid});
-        next if defined($type) && $rule->{type} ne $type;
-        next if $exclude_disabled_rules && exists($rule->{disable});
+        next if defined($opts{type}) && $rule->{type} ne $opts{type};
+        next if $opts{exclude_disabled_rules} && exists($rule->{disable});
 
         $func->($rule, $ruleid);
     }
diff --git a/src/PVE/HA/Rules/NodeAffinity.pm b/src/PVE/HA/Rules/NodeAffinity.pm
index ee3ef985..09a8e67c 100644
--- a/src/PVE/HA/Rules/NodeAffinity.pm
+++ b/src/PVE/HA/Rules/NodeAffinity.pm
@@ -148,10 +148,8 @@ sub get_plugin_check_arguments {
 
             $result->{node_affinity_rules}->{$ruleid} = $rule;
         },
-        {
-            type => 'node-affinity',
-            exclude_disabled_rules => 1,
-        },
+        type => 'node-affinity',
+        exclude_disabled_rules => 1,
     );
 
     return $result;
@@ -231,11 +229,9 @@ my $get_resource_node_affinity_rule = sub {
 
             $node_affinity_rule = dclone($rule) if !$node_affinity_rule;
         },
-        {
-            sid => $sid,
-            type => 'node-affinity',
-            exclude_disabled_rules => 1,
-        },
+        sid => $sid,
+        type => 'node-affinity',
+        exclude_disabled_rules => 1,
     );
 
     return $node_affinity_rule;
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 6b5670ac..1d2ed1ed 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -92,10 +92,8 @@ sub get_plugin_check_arguments {
             $result->{positive_rules}->{$ruleid} = $rule if $rule->{affinity} eq 'positive';
             $result->{negative_rules}->{$ruleid} = $rule if $rule->{affinity} eq 'negative';
         },
-        {
-            type => 'resource-affinity',
-            exclude_disabled_rules => 1,
-        },
+        type => 'resource-affinity',
+        exclude_disabled_rules => 1,
     );
 
     return $result;
@@ -490,11 +488,9 @@ sub get_affinitive_resources : prototype($$) {
                 $affinity_set->{$csid} = 1 if $csid ne $sid;
             }
         },
-        {
-            sid => $sid,
-            type => 'resource-affinity',
-            exclude_disabled_rules => 1,
-        },
+        sid => $sid,
+        type => 'resource-affinity',
+        exclude_disabled_rules => 1,
     );
 
     return ($together, $separate);
@@ -560,11 +556,9 @@ sub get_resource_affinity : prototype($$$) {
                 }
             }
         },
-        {
-            sid => $sid,
-            type => 'resource-affinity',
-            exclude_disabled_rules => 1,
-        },
+        sid => $sid,
+        type => 'resource-affinity',
+        exclude_disabled_rules => 1,
     );
 
     return ($together, $separate);
-- 
2.47.2





^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (2 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value Daniel Kral
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/API2/HA/Rules.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index b180d2ed..d797f621 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -223,6 +223,8 @@ __PACKAGE__->register_method({
             rule => get_standard_option('pve-ha-rule-id'),
             type => {
                 type => 'string',
+                description => "HA rule type.",
+                enum => PVE::HA::Rules->lookup_types(),
             },
         },
     },
-- 
2.47.2





^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (3 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer Daniel Kral
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Otherwise, such rules will be ignored by the feasibility check, which allows
users to create or update rules that are infeasible and will make other HA
rules invalid.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
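Note on the reasoning: the rule filter skips disabled rules based on key
existence, not truthiness. A minimal standalone sketch (with made-up rule
hashes, not the real config code) of why a falsy 'disable' must be dropped
before storing:

    use strict;
    use warnings;

    # sketch: a stored 'disable => 0' still counts as "disabled" for filters
    # testing exists($rule->{disable}), so its feasibility is never checked
    my %rules = (
        'rule-a' => { disable => 0 }, # falsy, but the key exists
        'rule-b' => {},               # key absent
    );

    for my $ruleid (sort keys %rules) {
        if (exists($rules{$ruleid}->{disable})) {
            print "$ruleid: skipped by exclude_disabled_rules\n";
        } else {
            print "$ruleid: included in feasibility checks\n";
        }
    }
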
 src/PVE/API2/HA/Rules.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index d797f621..ab431019 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -268,6 +268,8 @@ __PACKAGE__->register_method({
         my $type = extract_param($param, 'type');
         my $ruleid = extract_param($param, 'rule');
 
+        delete $param->{disable} if !$param->{disable};
+
         my $plugin = PVE::HA::Rules->lookup($type);
 
         my $opts = $plugin->check_config($ruleid, $param, 1, 1);
@@ -318,6 +320,8 @@ __PACKAGE__->register_method({
         my $digest = extract_param($param, 'digest');
         my $delete = extract_param($param, 'delete');
 
+        delete $param->{disable} if !$param->{disable};
+
         if ($delete) {
             $delete = [PVE::Tools::split_list($delete)];
         }
-- 
2.47.2





^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (4 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating Daniel Kral
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Most users will likely interact with the HA rules through the web
interface, where the HA rule ids are not shown in the rules view.

Error messages with direct references to these rule ids will seem
confusing to users, so replace them with a more generic name.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Rules/ResourceAffinity.pm                      | 4 ++--
 .../inconsistent-resource-affinity-rules.cfg.expect       | 8 ++++----
 ...r-implicit-negative-resource-affinity-rules.cfg.expect | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 1d2ed1ed..7327ee08 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -236,9 +236,9 @@ __PACKAGE__->register_check(
             my ($positiveid, $negativeid) = @$conflict;
 
             push $errors->{$positiveid}->{resources}->@*,
-                "rule shares two or more resources with '$negativeid'";
+                "rule shares two or more resources with a negative resource affinity rule";
             push $errors->{$negativeid}->{resources}->@*,
-                "rule shares two or more resources with '$positiveid'";
+                "rule shares two or more resources with a positive resource affinity rule";
         }
     },
 );
diff --git a/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
index b0cde0f8..d4a2d7b2 100644
--- a/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
+++ b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
@@ -1,8 +1,8 @@
 --- Log ---
-Drop rule 'keep-apart1', because rule shares two or more resources with 'stick-together1'.
-Drop rule 'keep-apart2', because rule shares two or more resources with 'stick-together1'.
-Drop rule 'stick-together1', because rule shares two or more resources with 'keep-apart1'.
-Drop rule 'stick-together1', because rule shares two or more resources with 'keep-apart2'.
+Drop rule 'keep-apart1', because rule shares two or more resources with a positive resource affinity rule.
+Drop rule 'keep-apart2', because rule shares two or more resources with a positive resource affinity rule.
+Drop rule 'stick-together1', because rule shares two or more resources with a negative resource affinity rule.
+Drop rule 'stick-together1', because rule shares two or more resources with a negative resource affinity rule.
 --- Config ---
 $VAR1 = {
           'digest' => '50875b320034d8ac7dded185e590f5f87c4e2bb6',
diff --git a/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
index bcd368ab..09364d41 100644
--- a/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
+++ b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
@@ -1,6 +1,6 @@
 --- Log ---
-Drop rule 'do-not-infer-inconsistent-negative2', because rule shares two or more resources with 'do-not-infer-inconsistent-positive1'.
-Drop rule 'do-not-infer-inconsistent-positive1', because rule shares two or more resources with 'do-not-infer-inconsistent-negative2'.
+Drop rule 'do-not-infer-inconsistent-negative2', because rule shares two or more resources with a positive resource affinity rule.
+Drop rule 'do-not-infer-inconsistent-positive1', because rule shares two or more resources with a negative resource affinity rule.
 --- Config ---
 $VAR1 = {
           'digest' => 'd8724dfe2130bb642b98e021da973aa0ec0695f0',
-- 
2.47.2





^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (5 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail Daniel Kral
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Ignored resources should not be accounted for, as they are treated as if the
HA Manager does not manage them at all.

Reported-by: Michael Köppl <m.koeppl@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
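Rough standalone sketch of the added guards (the shape of the service status
hash $ss is assumed here): siblings that are unknown to the manager or in
state 'ignored' are dropped before any co-migration decision is made.

    use strict;
    use warnings;

    my $ss = {
        'vm:101' => { node => 'node1', state => 'started' },
        'vm:102' => { node => 'node1', state => 'ignored' },
    };

    my @siblings = ('vm:101', 'vm:102', 'vm:103'); # vm:103 is not HA-managed

    for my $csid (@siblings) {
        next if !defined($ss->{$csid});             # not in the service status
        next if $ss->{$csid}->{state} eq 'ignored'; # managed, but ignored
        print "consider $csid for co-migration\n";  # only vm:101 remains
    }
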
 src/PVE/HA/Config.pm  | 2 ++
 src/PVE/HA/Manager.pm | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 6de08650..53b55d0e 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -400,6 +400,8 @@ sub get_resource_motion_info {
             next if $ns->{$node} ne 'online';
 
             for my $csid (sort keys %$separate) {
+                next if !defined($ss->{$csid});
+                next if $ss->{$csid}->{state} eq 'ignored';
                 next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $node;
                 next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $node;
 
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 0be12061..ba59f642 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -420,6 +420,8 @@ sub execute_migration {
     my ($together, $separate) = get_affinitive_resources($self->{rules}, $sid);
 
     for my $csid (sort keys %$separate) {
+        next if !defined($ss->{$csid});
+        next if $ss->{$csid}->{state} eq 'ignored';
         next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $target;
         next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $target;
 
@@ -437,6 +439,8 @@ sub execute_migration {
 
     my $resources_to_migrate = [];
     for my $csid (sort keys %$together) {
+        next if !defined($ss->{$csid});
+        next if $ss->{$csid}->{state} eq 'ignored';
         next if $ss->{$csid}->{node} && $ss->{$csid}->{node} eq $target;
         next if $ss->{$csid}->{target} && $ss->{$csid}->{target} eq $target;
 
-- 
2.47.2




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (6 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules Daniel Kral
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

In the context of the HA Manager, resources' downtime is expected to be
minimized as much as possible. Therefore, it is more reasonable to try
other possible node placements if one or more of the HA resources of a
positive affinity rule fail, instead of putting the failed HA resources
in recovery.

This can be improved later to allow temporarily separated positive affinity
"groups", where the failed HA resource first tries to find a node where it can
start and the other HA resources are migrated there afterwards, but this
simpler heuristic is enough for the current feature.

Reported-by: Hannes Dürr <h.duerr@proxmox.com>
Reported-by: Michael Köppl <m.koeppl@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
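The behavioral change boils down to intersecting the positive-affinity node
set with the nodes the failed resource is still allowed on before a target is
picked. A small standalone sketch (node sets are modeled as hashes keyed by
node name; the real values/weights are not modeled):

    use strict;
    use warnings;

    # sketch: drop together-nodes the failed resource may not run on, so the
    # whole positive affinity group can move instead of staying in recovery
    my $together      = { node2 => 1, node3 => 1 }; # where the peers are
    my $allowed_nodes = { node1 => 1, node3 => 1 }; # node2 failed already

    for my $node (keys %$together) {
        delete $together->{$node} if !$allowed_nodes->{$node};
    }

    my @possible_nodes = sort keys %$together;
    print "candidate nodes: @possible_nodes\n"; # node3 remains
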
 src/PVE/HA/Rules/ResourceAffinity.pm          |  4 +++
 .../README                                    |  9 ++++---
 .../log.expect                                | 26 ++++++++++++++++---
 3 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 7327ee08..9bc039ba 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -596,6 +596,10 @@ resource has not failed running there yet.
 sub apply_positive_resource_affinity : prototype($$) {
     my ($together, $allowed_nodes) = @_;
 
+    for my $node (keys %$together) {
+        delete $together->{$node} if !$allowed_nodes->{$node};
+    }
+
     my @possible_nodes = sort keys $together->%*
         or return; # nothing to do if there is no positive resource affinity
 
diff --git a/src/test/test-resource-affinity-strict-positive3/README b/src/test/test-resource-affinity-strict-positive3/README
index a270277b..804d1312 100644
--- a/src/test/test-resource-affinity-strict-positive3/README
+++ b/src/test/test-resource-affinity-strict-positive3/README
@@ -1,7 +1,8 @@
 Test whether a strict positive resource affinity rule makes three resources
 migrate to the same recovery node in case of a failover of their previously
 assigned node. If one of those fail to start on the recovery node (e.g.
-insufficient resources), the failing resource will be kept on the recovery node.
+insufficient resources), all resources in the positive resource affinity rule
+will be migrated to another available recovery node.
 
 The test scenario is:
 - vm:101, vm:102, and fa:120002 must be kept together
@@ -12,6 +13,6 @@ The test scenario is:
 
 The expected outcome is:
 - As node3 fails, all resources are migrated to node2
-- Two of those resources will start successfully, but fa:120002 will stay in
-  recovery, since it cannot be started on this node, but cannot be relocated to
-  another one either due to the strict resource affinity rule
+- Two of those resources will start successfully, but fa:120002 will fail; as
+  there are other available nodes left where it can run, all resources in the
+  positive resource affinity rule are migrated to the next-best fitting node
diff --git a/src/test/test-resource-affinity-strict-positive3/log.expect b/src/test/test-resource-affinity-strict-positive3/log.expect
index 4a54cb3b..b5d7018f 100644
--- a/src/test/test-resource-affinity-strict-positive3/log.expect
+++ b/src/test/test-resource-affinity-strict-positive3/log.expect
@@ -82,8 +82,26 @@ info    263    node2/lrm: starting service fa:120002
 warn    263    node2/lrm: unable to start service fa:120002
 err     263    node2/lrm: unable to start service fa:120002 on local node after 1 retries
 warn    280    node1/crm: starting service fa:120002 on node 'node2' failed, relocating service.
-warn    280    node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:120002', retry start on current node. Tried nodes: node2
-info    283    node2/lrm: starting service fa:120002
-info    283    node2/lrm: service status fa:120002 started
-info    300    node1/crm: relocation policy successful for 'fa:120002' on node 'node2', failed nodes: node2
+info    280    node1/crm: relocate service 'fa:120002' to node 'node1'
+info    280    node1/crm: service 'fa:120002': state changed from 'started' to 'relocate'  (node = node2, target = node1)
+info    283    node2/lrm: service fa:120002 - start relocate to node 'node1'
+info    283    node2/lrm: service fa:120002 - end relocate to node 'node1'
+info    300    node1/crm: service 'fa:120002': state changed from 'relocate' to 'started'  (node = node1)
+info    300    node1/crm: migrate service 'vm:101' to node 'node1' (running)
+info    300    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node2, target = node1)
+info    300    node1/crm: migrate service 'vm:102' to node 'node1' (running)
+info    300    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node2, target = node1)
+info    301    node1/lrm: starting service fa:120002
+info    301    node1/lrm: service status fa:120002 started
+info    303    node2/lrm: service vm:101 - start migrate to node 'node1'
+info    303    node2/lrm: service vm:101 - end migrate to node 'node1'
+info    303    node2/lrm: service vm:102 - start migrate to node 'node1'
+info    303    node2/lrm: service vm:102 - end migrate to node 'node1'
+info    320    node1/crm: relocation policy successful for 'fa:120002' on node 'node1', failed nodes: node2
+info    320    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node1)
+info    320    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node1)
+info    321    node1/lrm: starting service vm:101
+info    321    node1/lrm: service status vm:101 started
+info    321    node1/lrm: starting service vm:102
+info    321    node1/lrm: service status vm:102 started
 info    720     hardware: exit simulation - done
-- 
2.47.2




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (7 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases Daniel Kral
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

In preparation for the next patch, remove the overly restrictive checker,
which disallows resources from being used in node affinity rules and resource
affinity rules at the same time.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
This could be squashed into the next patch, but I figured it is good for
documentation purposes to keep it as its own patch.

 src/PVE/HA/Rules.pm                           |  70 -----------
 .../multiple-resource-refs-in-rules.cfg       |  52 --------
 ...multiple-resource-refs-in-rules.cfg.expect | 111 ------------------
 3 files changed, 233 deletions(-)
 delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
 delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect

diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index e2b77215..64dae1e4 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -475,74 +475,4 @@ sub get_next_ordinal : prototype($) {
     return $current_order + 1;
 }
 
-=head1 INTER-PLUGIN RULE CHECKERS
-
-=cut
-
-=head3 check_single_global_resource_reference($node_affinity_rules, $resource_affinity_rules)
-
-Returns all rules in C<$node_affinity_rules> and C<$resource_affinity_rules> as
-a list of lists, each consisting of the rule id and the resource id, where one
-of the resources is used in both a node affinity rule and resource affinity rule
-at the same time.
-
-If there are none, the returned list is empty.
-
-=cut
-
-sub check_single_global_resource_reference {
-    my ($node_affinity_rules, $resource_affinity_rules) = @_;
-
-    my @conflicts = ();
-    my $resource_ruleids = {};
-
-    while (my ($ruleid, $rule) = each %$node_affinity_rules) {
-        for my $sid (keys $rule->{resources}->%*) {
-            push $resource_ruleids->{$sid}->{node_affinity}->@*, $ruleid;
-        }
-    }
-    while (my ($ruleid, $rule) = each %$resource_affinity_rules) {
-        for my $sid (keys $rule->{resources}->%*) {
-            push $resource_ruleids->{$sid}->{resource_affinity}->@*, $ruleid;
-        }
-    }
-
-    for my $sid (keys %$resource_ruleids) {
-        my $node_affinity_ruleids = $resource_ruleids->{$sid}->{node_affinity} // [];
-        my $resource_affinity_ruleids = $resource_ruleids->{$sid}->{resource_affinity} // [];
-
-        next if @$node_affinity_ruleids > 0 && !@$resource_affinity_ruleids;
-        next if @$resource_affinity_ruleids > 0 && !@$node_affinity_ruleids;
-
-        for my $ruleid (@$node_affinity_ruleids, @$resource_affinity_ruleids) {
-            push @conflicts, [$ruleid, $sid];
-        }
-    }
-
-    @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
-    return \@conflicts;
-}
-
-__PACKAGE__->register_check(
-    sub {
-        my ($args) = @_;
-
-        return check_single_global_resource_reference(
-            $args->{node_affinity_rules},
-            $args->{resource_affinity_rules},
-        );
-    },
-    sub {
-        my ($conflicts, $errors) = @_;
-
-        for my $conflict (@$conflicts) {
-            my ($ruleid, $sid) = @$conflict;
-
-            push $errors->{$ruleid}->{resources}->@*,
-                "resource '$sid' cannot be used in both a node affinity rule"
-                . " and a resource affinity rule at the same time";
-        }
-    },
-);
-
 1;
diff --git a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg b/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
deleted file mode 100644
index 6608a5c3..00000000
--- a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
+++ /dev/null
@@ -1,52 +0,0 @@
-# Case 1: Do not remove node/resource affinity rules, which do not share resources between these types.
-node-affinity: different-resource1
-	resources vm:101,vm:102,vm:103
-	nodes node1,node2:2
-	strict 0
-
-resource-affinity: different-resource2
-	resources vm:104,vm:105
-	affinity positive
-
-node-affinity: different-resource3
-	resources vm:106
-	nodes node1,node2:2
-	strict 1
-
-resource-affinity: different-resource4
-	resources vm:107,vm:109
-	affinity negative
-
-# Case 2: Remove rules, which share the same resource(s) between different rule types.
-node-affinity: same-resource1
-	resources vm:201
-	nodes node1,node2:2
-	strict 0
-
-resource-affinity: same-resource2
-	resources vm:201,vm:205
-	affinity negative
-
-resource-affinity: same-resource3
-	resources vm:201,vm:203,vm:204
-	affinity negative
-
-node-affinity: same-resource4
-	resources vm:205,vm:206,vm:207
-	nodes node1:2,node3:3
-	strict 1
-
-# Case 3: Do not remove rules, which do not share resources between them.
-node-affinity: other-different-resource1
-	resources vm:301,vm:308
-	nodes node1,node2:2
-	strict 0
-
-resource-affinity: other-different-resource2
-	resources vm:302,vm:304,vm:305
-	affinity positive
-
-node-affinity: other-different-resource3
-	resources vm:303,vm:306,vm:309
-	nodes node1,node2:2
-	strict 1
diff --git a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect b/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
deleted file mode 100644
index 972c042d..00000000
--- a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
+++ /dev/null
@@ -1,111 +0,0 @@
---- Log ---
-Drop rule 'same-resource1', because resource 'vm:201' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource2', because resource 'vm:201' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource2', because resource 'vm:205' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource3', because resource 'vm:201' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource4', because resource 'vm:205' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
---- Config ---
-$VAR1 = {
-          'digest' => 'fcbdf84d442d38b4c901d989c211fb62024c5515',
-          'ids' => {
-                     'different-resource1' => {
-                                                'nodes' => {
-                                                             'node1' => {
-                                                                          'priority' => 0
-                                                                        },
-                                                             'node2' => {
-                                                                          'priority' => 2
-                                                                        }
-                                                           },
-                                                'resources' => {
-                                                                 'vm:101' => 1,
-                                                                 'vm:102' => 1,
-                                                                 'vm:103' => 1
-                                                               },
-                                                'strict' => 0,
-                                                'type' => 'node-affinity'
-                                              },
-                     'different-resource2' => {
-                                                'affinity' => 'positive',
-                                                'resources' => {
-                                                                 'vm:104' => 1,
-                                                                 'vm:105' => 1
-                                                               },
-                                                'type' => 'resource-affinity'
-                                              },
-                     'different-resource3' => {
-                                                'nodes' => {
-                                                             'node1' => {
-                                                                          'priority' => 0
-                                                                        },
-                                                             'node2' => {
-                                                                          'priority' => 2
-                                                                        }
-                                                           },
-                                                'resources' => {
-                                                                 'vm:106' => 1
-                                                               },
-                                                'strict' => 1,
-                                                'type' => 'node-affinity'
-                                              },
-                     'different-resource4' => {
-                                                'affinity' => 'negative',
-                                                'resources' => {
-                                                                 'vm:107' => 1,
-                                                                 'vm:109' => 1
-                                                               },
-                                                'type' => 'resource-affinity'
-                                              },
-                     'other-different-resource1' => {
-                                                      'nodes' => {
-                                                                   'node1' => {
-                                                                                'priority' => 0
-                                                                              },
-                                                                   'node2' => {
-                                                                                'priority' => 2
-                                                                              }
-                                                                 },
-                                                      'resources' => {
-                                                                       'vm:301' => 1,
-                                                                       'vm:308' => 1
-                                                                     },
-                                                      'strict' => 0,
-                                                      'type' => 'node-affinity'
-                                                    },
-                     'other-different-resource2' => {
-                                                      'affinity' => 'positive',
-                                                      'resources' => {
-                                                                       'vm:302' => 1,
-                                                                       'vm:304' => 1,
-                                                                       'vm:305' => 1
-                                                                     },
-                                                      'type' => 'resource-affinity'
-                                                    },
-                     'other-different-resource3' => {
-                                                      'nodes' => {
-                                                                   'node1' => {
-                                                                                'priority' => 0
-                                                                              },
-                                                                   'node2' => {
-                                                                                'priority' => 2
-                                                                              }
-                                                                 },
-                                                      'resources' => {
-                                                                       'vm:303' => 1,
-                                                                       'vm:306' => 1,
-                                                                       'vm:309' => 1
-                                                                     },
-                                                      'strict' => 1,
-                                                      'type' => 'node-affinity'
-                                                    }
-                   },
-          'order' => {
-                       'different-resource1' => 1,
-                       'different-resource2' => 2,
-                       'different-resource3' => 3,
-                       'different-resource4' => 4,
-                       'other-different-resource1' => 9,
-                       'other-different-resource2' => 10,
-                       'other-different-resource3' => 11
-                     }
-        };
-- 
2.47.2





^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (8 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases Daniel Kral
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Add inter-plugin checks and helpers, which allow resources to be used in
node affinity rules and resource affinity rules at the same time, if the
following conditions are met:

- the resources of a resource affinity rule are not part of any node
  affinity rule which has multiple priority groups, because the selected
  priority group depends on which nodes are currently online.

- the resources of a positive resource affinity rule are part of at most one
  node affinity rule. Otherwise, it is not easily decidable (yet) what the
  common node restrictions are.

- the positive resource affinity rules, which have at least one resource that
  is part of a node affinity rule, make all of their resources part of that
  node affinity rule.

- the resources of a negative resource affinity rule are not restricted by
  their node affinity rules in such a way that there are not enough nodes
  left to keep them separated.

Additionally, if the resources of a positive resource affinity rule are part
of at most a single node affinity rule, they are all added to that node
affinity rule.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
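To make the first condition more concrete, here is a standalone sketch of the
"multiple priority groups" test, mirroring the private helper added below; the
node affinity rule structure (nodes mapped to { priority => ... }) is assumed:

    use strict;
    use warnings;

    # sketch: a node affinity rule blocks mixing with resource affinity rules
    # as soon as its nodes do not all share a single priority
    sub has_multiple_priorities {
        my ($rule) = @_;

        my $priority;
        for my $node (values %{ $rule->{nodes} }) {
            $priority = $node->{priority} if !defined($priority);
            return 1 if $priority != $node->{priority};
        }
        return 0;
    }

    my $one_group = {
        nodes => { node1 => { priority => 0 }, node2 => { priority => 0 } },
    };
    my $two_groups = {
        nodes => { node1 => { priority => 0 }, node2 => { priority => 2 } },
    };

    print has_multiple_priorities($one_group), "\n";  # 0 -> may be combined
    print has_multiple_priorities($two_groups), "\n"; # 1 -> conflict
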
 src/PVE/HA/Rules.pm | 281 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 281 insertions(+)

diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index 64dae1e4..323ad038 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -410,6 +410,8 @@ sub canonicalize : prototype($$$) {
         next if $@; # plugin doesn't implement plugin_canonicalize(...)
     }
 
+    $class->global_canonicalize($rules);
+
     return $messages;
 }
 
@@ -475,4 +477,283 @@ sub get_next_ordinal : prototype($) {
     return $current_order + 1;
 }
 
+=head1 INTER-PLUGIN RULE CHECKERS
+
+=cut
+
+my $has_multiple_priorities = sub {
+    my ($node_affinity_rule) = @_;
+
+    my $priority;
+    for my $node (values $node_affinity_rule->{nodes}->%*) {
+        $priority = $node->{priority} if !defined($priority);
+
+        return 1 if $priority != $node->{priority};
+    }
+};
+
+=head3 check_single_priority_node_affinity_in_resource_affinity_rules(...)
+
+Returns all rules in C<$resource_affinity_rules> and C<$node_affinity_rules> as
+a list of lists, each consisting of the rule type and rule id, where at
+least one resource in a resource affinity rule is in node affinity rules
+which have multiple priority groups defined.
+
+That is, the resource affinity rule cannot be statically checked to be feasible
+as the selection of the priority group is dependent on the currently online
+nodes.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_single_priority_node_affinity_in_resource_affinity_rules {
+    my ($resource_affinity_rules, $node_affinity_rules) = @_;
+
+    my @conflicts = ();
+
+    while (my ($resource_affinity_id, $resource_affinity_rule) = each %$resource_affinity_rules) {
+        my $has_conflicts;
+        my $resources = $resource_affinity_rule->{resources};
+        my @paired_node_affinity_rules = ();
+
+        for my $node_affinity_id (keys %$node_affinity_rules) {
+            my $node_affinity_rule = $node_affinity_rules->{$node_affinity_id};
+
+            next if sets_are_disjoint($resources, $node_affinity_rule->{resources});
+
+            $has_conflicts = $has_multiple_priorities->($node_affinity_rule)
+                if !$has_conflicts;
+
+            push @paired_node_affinity_rules, $node_affinity_id;
+        }
+        if ($has_conflicts) {
+            push @conflicts, ['resource-affinity', $resource_affinity_id];
+            push @conflicts, ['node-affinity', $_] for @paired_node_affinity_rules;
+        }
+    }
+
+    @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+    return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+    sub {
+        my ($args) = @_;
+
+        return check_single_priority_node_affinity_in_resource_affinity_rules(
+            $args->{resource_affinity_rules},
+            $args->{node_affinity_rules},
+        );
+    },
+    sub {
+        my ($conflicts, $errors) = @_;
+
+        for my $conflict (@$conflicts) {
+            my ($type, $ruleid) = @$conflict;
+
+            if ($type eq 'node-affinity') {
+                push $errors->{$ruleid}->{resources}->@*,
+                    "resources are in a resource affinity rule and cannot be in"
+                    . " a node affinity rule with multiple priorities";
+            } elsif ($type eq 'resource-affinity') {
+                push $errors->{$ruleid}->{resources}->@*,
+                    "resources are in node affinity rules with multiple priorities";
+            }
+        }
+    },
+);
+
+=head3 check_single_node_affinity_per_positive_resource_affinity_rule(...)
+
+Returns all rules in C<$positive_rules> and C<$node_affinity_rules> as a list
+of lists, each consisting of the rule type and rule id, where the resources of
+a positive resource affinity rule are used in more than one node affinity
+rule.
+
+If there are none, the returned list is empty.
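+
+For example, a positive resource affinity rule C<pos1> whose resources are
+spread over two node affinity rules C<na1> and C<na2> would result in:
+
+    [['node-affinity', 'na1'], ['node-affinity', 'na2'], ['positive', 'pos1']]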
+
+=cut
+
+sub check_single_node_affinity_per_positive_resource_affinity_rule {
+    my ($positive_rules, $node_affinity_rules) = @_;
+
+    my @conflicts = ();
+
+    while (my ($positiveid, $positive_rule) = each %$positive_rules) {
+        my $positive_resources = $positive_rule->{resources};
+        my @paired_node_affinity_rules = ();
+
+        while (my ($node_affinity_id, $node_affinity_rule) = each %$node_affinity_rules) {
+            next if sets_are_disjoint($positive_resources, $node_affinity_rule->{resources});
+
+            push @paired_node_affinity_rules, $node_affinity_id;
+        }
+        if (@paired_node_affinity_rules > 1) {
+            push @conflicts, ['positive', $positiveid];
+            push @conflicts, ['node-affinity', $_] for @paired_node_affinity_rules;
+        }
+    }
+
+    @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+    return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+    sub {
+        my ($args) = @_;
+
+        return check_single_node_affinity_per_positive_resource_affinity_rule(
+            $args->{positive_rules},
+            $args->{node_affinity_rules},
+        );
+    },
+    sub {
+        my ($conflicts, $errors) = @_;
+
+        for my $conflict (@$conflicts) {
+            my ($type, $ruleid) = @$conflict;
+
+            if ($type eq 'positive') {
+                push $errors->{$ruleid}->{resources}->@*,
+                    "resources are in multiple node affinity rules";
+            } elsif ($type eq 'node-affinity') {
+                push $errors->{$ruleid}->{resources}->@*,
+                    "at least one resource is in a positive resource affinity"
+                    . " rule and there are other resources in at least one"
+                    . " other node affinity rule already";
+            }
+        }
+    },
+);
+
+=head3 check_negative_resource_affinity_node_affinity_consistency(...)
+
+Returns all rules in C<$negative_rules> and C<$node_affinity_rules> as a list
+of lists, each consisting of the rule type and rule id, where the resources in
+the negative resource affinity rule are restricted by their node affinity rules
+to fewer nodes than needed to keep them separate.
+
+That is, the negative resource affinity rule cannot be fulfilled as there are
+not enough nodes to spread the resources on.
+
+If there are none, the returned list is empty.
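+
+For example, a negative resource affinity rule C<neg1> with two resources that
+are both pinned to the single node C<node1> by the node affinity rules C<na1>
+and C<na2> (one allowed node for two resources to separate) would result in:
+
+    [['negative', 'neg1'], ['node-affinity', 'na1'], ['node-affinity', 'na2']]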
+
+=cut
+
+sub check_negative_resource_affinity_node_affinity_consistency {
+    my ($negative_rules, $node_affinity_rules) = @_;
+
+    my @conflicts = ();
+
+    while (my ($negativeid, $negative_rule) = each %$negative_rules) {
+        my $allowed_nodes = {};
+        my $located_resources;
+        my $resources = $negative_rule->{resources};
+        my @paired_node_affinity_rules = ();
+
+        for my $node_affinity_id (keys %$node_affinity_rules) {
+            my ($node_affinity_resources, $node_affinity_nodes) =
+                $node_affinity_rules->{$node_affinity_id}->@{qw(resources nodes)};
+            my $common_resources = set_intersect($resources, $node_affinity_resources);
+
+            next if keys %$common_resources < 1;
+
+            $located_resources = set_union($located_resources, $common_resources);
+            $allowed_nodes = set_union($allowed_nodes, $node_affinity_nodes);
+
+            push @paired_node_affinity_rules, $node_affinity_id;
+        }
+        if (keys %$allowed_nodes < keys %$located_resources) {
+            push @conflicts, ['negative', $negativeid];
+            push @conflicts, ['node-affinity', $_] for @paired_node_affinity_rules;
+        }
+    }
+
+    @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+    return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+    sub {
+        my ($args) = @_;
+
+        return check_negative_resource_affinity_node_affinity_consistency(
+            $args->{negative_rules},
+            $args->{node_affinity_rules},
+        );
+    },
+    sub {
+        my ($conflicts, $errors) = @_;
+
+        for my $conflict (@$conflicts) {
+            my ($type, $ruleid) = @$conflict;
+
+            if ($type eq 'negative') {
+                push $errors->{$ruleid}->{resources}->@*,
+                    "two or more resources are restricted to less nodes than"
+                    . " available to the resources";
+            } elsif ($type eq 'node-affinity') {
+                push $errors->{$ruleid}->{resources}->@*,
+                    "at least one resource is in a negative resource affinity"
+                    . " rule and this rule would restrict these to less nodes"
+                    . " than available to the resources";
+            }
+        }
+    },
+);
+
+=head1 INTER-PLUGIN RULE CANONICALIZATION HELPERS
+
+=cut
+
+=head3 create_implicit_positive_resource_affinity_node_affinity_rules(...)
+
+Modifies C<$rules> such that for each positive resource affinity rule, defined
+in C<$positive_rules>, where at least one of its resources is also in a node
+affinity rule, defined in C<$node_affinity_rules>, all of the positive resource
+affinity rule's resources become part of that node affinity rule.
+
+This helper assumes that there can only be a single node affinity rule per
+positive resource affinity rule, as there is no heuristic yet for what should
+be done in the case of multiple node affinity rules.
+
+This also makes it cheaper to infer these implicit constraints later instead of
+propagating that information in each scheduler invocation.
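+
+For example, a node affinity rule with C<resources vm:402,vm:404,vm:405> and a
+positive resource affinity rule with C<resources vm:401,vm:402,vm:403,vm:404>
+result in the node affinity rule containing all of C<vm:401> through C<vm:405>.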
+
+=cut
+
+sub create_implicit_positive_resource_affinity_node_affinity_rules {
+    my ($rules, $positive_rules, $node_affinity_rules) = @_;
+
+    my @conflicts = ();
+
+    while (my ($positiveid, $positive_rule) = each %$positive_rules) {
+        my $found_node_affinity_id;
+        my $positive_resources = $positive_rule->{resources};
+
+        for my $node_affinity_id (keys %$node_affinity_rules) {
+            my $node_affinity_rule = $rules->{ids}->{$node_affinity_id};
+            next if sets_are_disjoint($positive_resources, $node_affinity_rule->{resources});
+
+            # assuming that the positive rule's resources are part of at most
+            # one node affinity rule, take the first node affinity rule found.
+            $node_affinity_rule->{resources}->{$_} = 1 for keys %$positive_resources;
+            last;
+        }
+    }
+}
+
+sub global_canonicalize {
+    my ($class, $rules) = @_;
+
+    my $args = $class->get_check_arguments($rules);
+
+    create_implicit_positive_resource_affinity_node_affinity_rules(
+        $rules,
+        $args->{positive_rules},
+        $args->{node_affinity_rules},
+    );
+}
+
 1;
-- 
2.47.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (9 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules Daniel Kral
  2025-08-01 17:36 ` [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Thomas Lamprecht
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Add test cases to verify that the rule checkers correctly identify and
remove conflicting HA node and resource affinity rules to make the rule
set feasible. The added test cases verify that:

- the resources of a resource affinity rule are not part of any node
  affinity rule that has multiple priority groups, because the effective
  priority group depends on which nodes are currently online.

- the resources of a positive resource affinity rule are part of at most
  one node affinity rule; otherwise, it is not easily decidable (yet) what
  the common node restrictions are.

- positive resource affinity rules that have at least one resource in a
  node affinity rule make all of their resources part of that node
  affinity rule.

- the resources of a negative resource affinity rule are not restricted
  by their node affinity rules to fewer nodes than are needed to keep
  them separate.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 ...onsistent-node-resource-affinity-rules.cfg |  54 +++++++++
 ...nt-node-resource-affinity-rules.cfg.expect |  73 ++++++++++++
 ...y-for-positive-resource-affinity-rules.cfg |  37 ++++++
 ...ositive-resource-affinity-rules.cfg.expect | 111 ++++++++++++++++++
 ...-affinity-with-resource-affinity-rules.cfg |  35 ++++++
 ...ty-with-resource-affinity-rules.cfg.expect |  48 ++++++++
 6 files changed, 358 insertions(+)
 create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
 create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
 create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect

diff --git a/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
new file mode 100644
index 00000000..88e6dd0e
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
@@ -0,0 +1,54 @@
+# Case 1: Do not remove a positive resource affinity rule, where there is exactly one node to keep them together.
+node-affinity: vm101-vm102-must-be-on-node1
+	resources vm:101,vm:102
+	nodes node1
+	strict 1
+
+resource-affinity: vm101-vm102-must-be-kept-together
+	resources vm:101,vm:102
+	affinity positive
+
+# Case 2: Do not remove a negative resource affinity rule, where there are exactly enough nodes available to keep them apart.
+node-affinity: vm201-must-be-on-node1
+	resources vm:201
+	nodes node1
+	strict 1
+
+node-affinity: vm202-must-be-on-node2
+	resources vm:202
+	nodes node2
+	strict 1
+
+resource-affinity: vm201-vm202-must-be-kept-separate
+	resources vm:201,vm:202
+	affinity negative
+
+# Case 3: Remove positive resource affinity rules, where two resources are restricted to different nodes.
+node-affinity: vm301-must-be-on-node1
+	resources vm:301
+	nodes node1
+	strict 1
+
+node-affinity: vm301-must-be-on-node2
+	resources vm:302
+	nodes node2
+	strict 1
+
+resource-affinity: vm301-vm302-must-be-kept-together
+	resources vm:301,vm:302
+	affinity positive
+
+# Case 4: Remove negative resource affinity rules, where two resources are restricted to less nodes than needed to keep them apart.
+node-affinity: vm401-must-be-on-node1
+	resources vm:401
+	nodes node1
+	strict 1
+
+node-affinity: vm402-must-be-on-node1
+	resources vm:402
+	nodes node1
+	strict 1
+
+resource-affinity: vm401-vm402-must-be-kept-separate
+	resources vm:401,vm:402
+	affinity negative
diff --git a/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
new file mode 100644
index 00000000..e12242ab
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
@@ -0,0 +1,73 @@
+--- Log ---
+Drop rule 'vm301-must-be-on-node1', because at least one resource is in a positive resource affinity rule and there are other resources in at least one other node affinity rule already.
+Drop rule 'vm301-must-be-on-node2', because at least one resource is in a positive resource affinity rule and there are other resources in at least one other node affinity rule already.
+Drop rule 'vm301-vm302-must-be-kept-together', because resources are in multiple node affinity rules.
+Drop rule 'vm401-must-be-on-node1', because at least one resource is in a negative resource affinity rule and this rule would restrict these to less nodes than available to the resources.
+Drop rule 'vm401-vm402-must-be-kept-separate', because two or more resources are restricted to less nodes than available to the resources.
+Drop rule 'vm402-must-be-on-node1', because at least one resource is in a negative resource affinity rule and this rule would restrict these to less nodes than available to the resources.
+--- Config ---
+$VAR1 = {
+          'digest' => 'a5d782a442bbe3bf3a4d088db82a575b382a53fe',
+          'ids' => {
+                     'vm101-vm102-must-be-kept-together' => {
+                                                              'affinity' => 'positive',
+                                                              'resources' => {
+                                                                               'vm:101' => 1,
+                                                                               'vm:102' => 1
+                                                                             },
+                                                              'type' => 'resource-affinity'
+                                                            },
+                     'vm101-vm102-must-be-on-node1' => {
+                                                         'nodes' => {
+                                                                      'node1' => {
+                                                                                   'priority' => 0
+                                                                                 }
+                                                                    },
+                                                         'resources' => {
+                                                                          'vm:101' => 1,
+                                                                          'vm:102' => 1
+                                                                        },
+                                                         'strict' => 1,
+                                                         'type' => 'node-affinity'
+                                                       },
+                     'vm201-must-be-on-node1' => {
+                                                   'nodes' => {
+                                                                'node1' => {
+                                                                             'priority' => 0
+                                                                           }
+                                                              },
+                                                   'resources' => {
+                                                                    'vm:201' => 1
+                                                                  },
+                                                   'strict' => 1,
+                                                   'type' => 'node-affinity'
+                                                 },
+                     'vm201-vm202-must-be-kept-separate' => {
+                                                              'affinity' => 'negative',
+                                                              'resources' => {
+                                                                               'vm:201' => 1,
+                                                                               'vm:202' => 1
+                                                                             },
+                                                              'type' => 'resource-affinity'
+                                                            },
+                     'vm202-must-be-on-node2' => {
+                                                   'nodes' => {
+                                                                'node2' => {
+                                                                             'priority' => 0
+                                                                           }
+                                                              },
+                                                   'resources' => {
+                                                                    'vm:202' => 1
+                                                                  },
+                                                   'strict' => 1,
+                                                   'type' => 'node-affinity'
+                                                 }
+                   },
+          'order' => {
+                       'vm101-vm102-must-be-kept-together' => 2,
+                       'vm101-vm102-must-be-on-node1' => 1,
+                       'vm201-must-be-on-node1' => 3,
+                       'vm201-vm202-must-be-kept-separate' => 5,
+                       'vm202-must-be-on-node2' => 4
+                     }
+        };
diff --git a/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
new file mode 100644
index 00000000..ebbf5e63
--- /dev/null
+++ b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
@@ -0,0 +1,37 @@
+# Case 1: Do not change any node affinity rules, if there are no resources of a positive resource affinity rule in any node affinity rules.
+resource-affinity: do-not-infer-positive1
+	resources vm:101,vm:102,vm:103
+	affinity positive
+
+# Case 2: Do not change the node affinity rules of resources which are in a negative resource affinity rule.
+node-affinity: do-not-infer-negative1
+	resources vm:203
+	nodes node1,node2
+	strict 1
+
+node-affinity: do-not-infer-negative2
+	resources vm:201
+	nodes node3
+
+resource-affinity: do-not-infer-negative3
+	resources vm:201,vm:203
+	affinity negative
+
+# Case 3: Add two resources, which are not part of the node affinity rule of another resource in a positive resource affinity rule, to the node affinity rule.
+node-affinity: infer-single-resource1
+	resources vm:302
+	nodes node3
+
+resource-affinity: infer-single-resource2
+	resources vm:301,vm:302,vm:303
+	affinity positive
+
+# Case 4: Add the resources, which are not part of the node affinity rule of the other resources in a positive resource affinity rule, to the node affinity rule.
+node-affinity: infer-multi-resources1
+	resources vm:402,vm:404,vm:405
+	nodes node1,node3
+	strict 1
+
+resource-affinity: infer-multi-resources2
+	resources vm:401,vm:402,vm:403,vm:404
+	affinity positive
diff --git a/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
new file mode 100644
index 00000000..33c56c62
--- /dev/null
+++ b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
@@ -0,0 +1,111 @@
+--- Log ---
+--- Config ---
+$VAR1 = {
+          'digest' => '32ae135ef2f8bd84cd12c18af6910dce9d6bc9fa',
+          'ids' => {
+                     'do-not-infer-negative1' => {
+                                                   'nodes' => {
+                                                                'node1' => {
+                                                                             'priority' => 0
+                                                                           },
+                                                                'node2' => {
+                                                                             'priority' => 0
+                                                                           }
+                                                              },
+                                                   'resources' => {
+                                                                    'vm:203' => 1
+                                                                  },
+                                                   'strict' => 1,
+                                                   'type' => 'node-affinity'
+                                                 },
+                     'do-not-infer-negative2' => {
+                                                   'nodes' => {
+                                                                'node3' => {
+                                                                             'priority' => 0
+                                                                           }
+                                                              },
+                                                   'resources' => {
+                                                                    'vm:201' => 1
+                                                                  },
+                                                   'type' => 'node-affinity'
+                                                 },
+                     'do-not-infer-negative3' => {
+                                                   'affinity' => 'negative',
+                                                   'resources' => {
+                                                                    'vm:201' => 1,
+                                                                    'vm:203' => 1
+                                                                  },
+                                                   'type' => 'resource-affinity'
+                                                 },
+                     'do-not-infer-positive1' => {
+                                                   'affinity' => 'positive',
+                                                   'resources' => {
+                                                                    'vm:101' => 1,
+                                                                    'vm:102' => 1,
+                                                                    'vm:103' => 1
+                                                                  },
+                                                   'type' => 'resource-affinity'
+                                                 },
+                     'infer-multi-resources1' => {
+                                                   'nodes' => {
+                                                                'node1' => {
+                                                                             'priority' => 0
+                                                                           },
+                                                                'node3' => {
+                                                                             'priority' => 0
+                                                                           }
+                                                              },
+                                                   'resources' => {
+                                                                    'vm:401' => 1,
+                                                                    'vm:402' => 1,
+                                                                    'vm:403' => 1,
+                                                                    'vm:404' => 1,
+                                                                    'vm:405' => 1
+                                                                  },
+                                                   'strict' => 1,
+                                                   'type' => 'node-affinity'
+                                                 },
+                     'infer-multi-resources2' => {
+                                                   'affinity' => 'positive',
+                                                   'resources' => {
+                                                                    'vm:401' => 1,
+                                                                    'vm:402' => 1,
+                                                                    'vm:403' => 1,
+                                                                    'vm:404' => 1
+                                                                  },
+                                                   'type' => 'resource-affinity'
+                                                 },
+                     'infer-single-resource1' => {
+                                                   'nodes' => {
+                                                                'node3' => {
+                                                                             'priority' => 0
+                                                                           }
+                                                              },
+                                                   'resources' => {
+                                                                    'vm:301' => 1,
+                                                                    'vm:302' => 1,
+                                                                    'vm:303' => 1
+                                                                  },
+                                                   'type' => 'node-affinity'
+                                                 },
+                     'infer-single-resource2' => {
+                                                   'affinity' => 'positive',
+                                                   'resources' => {
+                                                                    'vm:301' => 1,
+                                                                    'vm:302' => 1,
+                                                                    'vm:303' => 1
+                                                                  },
+                                                   'type' => 'resource-affinity'
+                                                 }
+                   },
+          'order' => {
+                       'do-not-infer-negative1' => 2,
+                       'do-not-infer-negative2' => 3,
+                       'do-not-infer-negative3' => 4,
+                       'do-not-infer-positive1' => 1,
+                       'infer-multi-resources1' => 7,
+                       'infer-multi-resources2' => 8,
+                       'infer-single-resource1' => 5,
+                       'infer-single-resource2' => 6
+                     }
+        };
diff --git a/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
new file mode 100644
index 00000000..7fb9cdd3
--- /dev/null
+++ b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
@@ -0,0 +1,35 @@
+# Case 1: Remove resource affinity rules, where there is a loose Node Affinity rule with multiple priority groups set for the nodes.
+node-affinity: vm101-vm102-should-be-on-node1-or-node2
+	resources vm:101,vm:102
+	nodes node1:1,node2:2
+	strict 0
+
+resource-affinity: vm101-vm102-must-be-kept-separate
+	resources vm:101,vm:102
+	affinity negative
+
+# Case 2: Remove resource affinity rules, where there is a strict Node Affinity rule with multiple priority groups set for the nodes.
+node-affinity: vm201-vm202-must-be-on-node1-or-node2
+	resources vm:201,vm:202
+	nodes node1:1,node2:2
+	strict 1
+
+resource-affinity: vm201-vm202-must-be-kept-together
+	resources vm:201,vm:202
+	affinity positive
+
+# Case 3: Do not remove the resource affinity rule, if there is only one priority group in each node affinity rule for the ha
+# resource affinity rule's resources.
+node-affinity: vm301-must-be-on-node1-with-prio-1
+        resources vm:301
+        nodes node1:1
+        strict 1
+
+node-affinity: vm302-must-be-on-node2-with-prio-2
+        resources vm:302
+        nodes node2:2
+        strict 1
+
+resource-affinity: vm301-vm302-must-be-kept-together
+        resources vm:301,vm:302
+        affinity negative
diff --git a/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
new file mode 100644
index 00000000..92d12929
--- /dev/null
+++ b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
@@ -0,0 +1,48 @@
+--- Log ---
+Drop rule 'vm101-vm102-must-be-kept-separate', because resources are in node affinity rules with multiple priorities.
+Drop rule 'vm101-vm102-should-be-on-node1-or-node2', because resources are in a resource affinity rule and cannot be in a node affinity rule with multiple priorities.
+Drop rule 'vm201-vm202-must-be-kept-together', because resources are in node affinity rules with multiple priorities.
+Drop rule 'vm201-vm202-must-be-on-node1-or-node2', because resources are in a resource affinity rule and cannot be in a node affinity rule with multiple priorities.
+--- Config ---
+$VAR1 = {
+          'digest' => '722a98914555f296af0916c980a9d6c780f5f072',
+          'ids' => {
+                     'vm301-must-be-on-node1-with-prio-1' => {
+                                                               'nodes' => {
+                                                                            'node1' => {
+                                                                                         'priority' => 1
+                                                                                       }
+                                                                          },
+                                                               'resources' => {
+                                                                                'vm:301' => 1
+                                                                              },
+                                                               'strict' => 1,
+                                                               'type' => 'node-affinity'
+                                                             },
+                     'vm301-vm302-must-be-kept-together' => {
+                                                              'affinity' => 'negative',
+                                                              'resources' => {
+                                                                               'vm:301' => 1,
+                                                                               'vm:302' => 1
+                                                                             },
+                                                              'type' => 'resource-affinity'
+                                                            },
+                     'vm302-must-be-on-node2-with-prio-2' => {
+                                                               'nodes' => {
+                                                                            'node2' => {
+                                                                                         'priority' => 2
+                                                                                       }
+                                                                          },
+                                                               'resources' => {
+                                                                                'vm:302' => 1
+                                                                              },
+                                                               'strict' => 1,
+                                                               'type' => 'node-affinity'
+                                                             }
+                   },
+          'order' => {
+                       'vm301-must-be-on-node1-with-prio-1' => 5,
+                       'vm301-vm302-must-be-kept-together' => 7,
+                       'vm302-must-be-on-node2-with-prio-2' => 6
+                     }
+        };
-- 
2.47.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (10 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
  2025-08-01 17:36 ` [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Thomas Lamprecht
  12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
  To: pve-devel

Add test cases for some scenarios where node and positive/negative
resource affinity rules are applied together.

For positive resource affinity rules, node affinity rules always take
precedence, even if all or the majority of the resources in the resource
affinity rule are already on another node contradicting the node affinity
rule.

For negative resource affinity rules, node affinity rules take precedence
if it is possible to do so. Currently, there are still cases which need
manual intervention. These should be handled automatically in the future
by providing more information to the scheduler.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .../README                                    | 14 ++++
 .../cmdlist                                   |  3 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 41 ++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  7 ++
 .../service_config                            |  5 ++
 .../README                                    | 13 ++++
 .../cmdlist                                   |  3 +
 .../hardware_status                           |  7 ++
 .../log.expect                                | 63 +++++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  7 ++
 .../service_config                            |  5 ++
 .../README                                    | 17 +++++
 .../cmdlist                                   |  6 ++
 .../hardware_status                           |  6 ++
 .../log.expect                                | 67 +++++++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              | 15 +++++
 .../service_config                            |  5 ++
 .../README                                    | 15 +++++
 .../cmdlist                                   |  3 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 49 ++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  7 ++
 .../service_config                            |  5 ++
 .../README                                    | 15 +++++
 .../cmdlist                                   |  3 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 49 ++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  7 ++
 .../service_config                            |  5 ++
 35 files changed, 462 insertions(+)
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
 create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config

diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/README b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
new file mode 100644
index 00000000..5f68bbb8
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
@@ -0,0 +1,14 @@
+Test whether a strict negative resource affinity rule among three resources,
+where the current placement contradicts the negative resource affinity rule
+and one of the resources should be on a specific node because of its node
+affinity rule, makes that resource migrate to its preferred node, if possible.
+
+The test scenario is:
+- vm:102 should be on node2
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101 is currently on node1
+- vm:102 and vm:103 are currently both on node1, even though they must be
+  kept separate
+
+The expected outcome is:
+- As vm:102 and vm:103 are still on the same node, make vm:102 migrate to node2
+  to fulfill its node affinity rule
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
new file mode 100644
index 00000000..216aeb66
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
@@ -0,0 +1,41 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: migrate service 'vm:102' to node 'node2' (running)
+info     20    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node1, target = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: service vm:102 - start migrate to node 'node2'
+info     21    node1/lrm: service vm:102 - end migrate to node 'node2'
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     40    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node2)
+info     43    node2/lrm: starting service vm:102
+info     43    node2/lrm: service status vm:102 started
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
new file mode 100644
index 00000000..be874144
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm102-must-be-on-node2
+	resources vm:102
+	nodes node2,node3
+
+resource-affinity: lonely-must-vms-be
+	resources vm:101,vm:102,vm:103
+	affinity negative
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
new file mode 100644
index 00000000..b98edc85
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/README b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
new file mode 100644
index 00000000..e2de70fb
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
@@ -0,0 +1,13 @@
+Test whether the three resources of a strict negative resource affinity rule,
+which are all restricted by a node affinity rule to the same three nodes, are
+migrated to these nodes, as all three resources are currently on a common node
+outside of the allowed node set.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 should be on node2, node3 or node4
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are all currently on node1
+
+The expected outcome is:
+- As vm:101, vm:102, and vm:103 are still on the same node and should be on
+  node2, node3 or node4, migrate them to node2, node3, and node4 respectively
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
new file mode 100644
index 00000000..8cdc6092
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
new file mode 100644
index 00000000..7b8e961e
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
@@ -0,0 +1,7 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" },
+  "node4": { "power": "off", "network": "off" },
+  "node5": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
new file mode 100644
index 00000000..3e75c6bf
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
@@ -0,0 +1,63 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node4 on
+info     20    node4/crm: status change startup => wait_for_quorum
+info     20    node4/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node5 on
+info     20    node5/crm: status change startup => wait_for_quorum
+info     20    node5/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info     20    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node1, target = node2)
+info     20    node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info     20    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node1, target = node3)
+info     20    node1/crm: migrate service 'vm:103' to node 'node4' (running)
+info     20    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node1, target = node4)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: service vm:101 - start migrate to node 'node2'
+info     21    node1/lrm: service vm:101 - end migrate to node 'node2'
+info     21    node1/lrm: service vm:102 - start migrate to node 'node3'
+info     21    node1/lrm: service vm:102 - end migrate to node 'node3'
+info     21    node1/lrm: service vm:103 - start migrate to node 'node4'
+info     21    node1/lrm: service vm:103 - end migrate to node 'node4'
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     26    node4/crm: status change wait_for_quorum => slave
+info     27    node4/lrm: got lock 'ha_agent_node4_lock'
+info     27    node4/lrm: status change wait_for_agent_lock => active
+info     28    node5/crm: status change wait_for_quorum => slave
+info     40    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node3)
+info     40    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node4)
+info     43    node2/lrm: starting service vm:101
+info     43    node2/lrm: service status vm:101 started
+info     45    node3/lrm: starting service vm:102
+info     45    node3/lrm: service status vm:102 started
+info     47    node4/lrm: starting service vm:103
+info     47    node4/lrm: service status vm:103 started
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
new file mode 100644
index 00000000..b84d7702
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vms-must-be-on-subcluster
+	resources vm:101,vm:102,vm:103
+	nodes node2,node3,node4
+
+resource-affinity: lonely-must-vms-be
+	resources vm:101,vm:102,vm:103
+	affinity negative
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
new file mode 100644
index 00000000..57e3579d
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/README b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
new file mode 100644
index 00000000..588f9020
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
@@ -0,0 +1,17 @@
+Test whether the resources of a strict negative resource affinity rule among
+three resources, where two of the resources are each restricted to nodes they
+are not yet on, can be exchanged to the nodes described by their node affinity
+rules, if one of the resources is stopped.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 should be on node2, node3, and node1, respectively
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are currently on node1, node2, node3 respectively
+
+The expected outcome is:
+- the resources can neither be manually migrated nor automatically exchange
+  their nodes to match their node affinity rules, because of the strict
+  condition that none of them may be placed on a node where another resource
+  with negative affinity currently runs or is being migrated to
+- therefore, one of the resources must be stopped manually to allow the
+  rearrangement to fulfill the node affinity rules
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
new file mode 100644
index 00000000..2f2c80f5
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
@@ -0,0 +1,6 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "service vm:103 migrate node1" ],
+    [ "service vm:101 stopped" ],
+    [ "service vm:101 started" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
new file mode 100644
index 00000000..4aed08a1
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
@@ -0,0 +1,6 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" },
+  "node4": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
new file mode 100644
index 00000000..1ed34c36
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
@@ -0,0 +1,67 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:102
+info     23    node2/lrm: service status vm:102 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+info    120      cmdlist: execute service vm:103 migrate node1
+err     120    node1/crm: crm command 'migrate vm:103 node1' error - service 'vm:101' on node 'node1' in negative affinity with service 'vm:103'
+info    220      cmdlist: execute service vm:101 stopped
+info    220    node1/crm: service 'vm:101': state changed from 'started' to 'request_stop'
+info    221    node1/lrm: stopping service vm:101
+info    221    node1/lrm: service status vm:101 stopped
+info    240    node1/crm: service 'vm:101': state changed from 'request_stop' to 'stopped'
+info    240    node1/crm: migrate service 'vm:103' to node 'node1' (running)
+info    240    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node3, target = node1)
+info    245    node3/lrm: service vm:103 - start migrate to node 'node1'
+info    245    node3/lrm: service vm:103 - end migrate to node 'node1'
+info    260    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node1)
+info    260    node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info    260    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node2, target = node3)
+info    261    node1/lrm: starting service vm:103
+info    261    node1/lrm: service status vm:103 started
+info    263    node2/lrm: service vm:102 - start migrate to node 'node3'
+info    263    node2/lrm: service vm:102 - end migrate to node 'node3'
+info    280    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node3)
+info    285    node3/lrm: starting service vm:102
+info    285    node3/lrm: service status vm:102 started
+info    320      cmdlist: execute service vm:101 started
+info    320    node1/crm: service 'vm:101': state changed from 'stopped' to 'request_start'  (node = node1)
+info    320    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info    320    node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info    320    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node1, target = node2)
+info    321    node1/lrm: service vm:101 - start migrate to node 'node2'
+info    321    node1/lrm: service vm:101 - end migrate to node 'node2'
+info    340    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node2)
+info    343    node2/lrm: starting service vm:101
+info    343    node2/lrm: service status vm:101 started
+info    920     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
new file mode 100644
index 00000000..2362d220
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
@@ -0,0 +1,15 @@
+node-affinity: vm101-must-be-on-node2
+	resources vm:101
+	nodes node2
+
+node-affinity: vm102-must-be-on-node3
+	resources vm:102
+	nodes node3
+
+node-affinity: vm103-must-be-on-node1
+	resources vm:103
+	nodes node1
+
+resource-affinity: lonely-must-vms-be
+	resources vm:101,vm:102,vm:103
+	affinity negative
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
new file mode 100644
index 00000000..4b26f6b4
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node2", "state": "started" },
+    "vm:103": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/README b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
new file mode 100644
index 00000000..2202f5a3
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
@@ -0,0 +1,15 @@
+Test whether a strict positive resource affinity rule among three resources,
+where one of these resources is restricted by a node affinity rule to a node
+other than the one they are currently on, makes all resources migrate to
+that node.
+
+The test scenario is:
+- vm:102 should be kept on node2
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are currently running on node3
+
+The expected outcome is:
+- As vm:102 is on node3, which contradicts its node affinity rule, vm:102 is
+  migrated to node2 to fulfill its node affinity rule
+- As vm:102 is in a positive resource affinity rule with vm:101 and vm:103, all
+  of them are migrated to node2, as the node affinity is inferred for all of them
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
new file mode 100644
index 00000000..d84b2228
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
@@ -0,0 +1,49 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info     20    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node3, target = node2)
+info     20    node1/crm: migrate service 'vm:102' to node 'node2' (running)
+info     20    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node3, target = node2)
+info     20    node1/crm: migrate service 'vm:103' to node 'node2' (running)
+info     20    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node3, target = node2)
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: service vm:101 - start migrate to node 'node2'
+info     25    node3/lrm: service vm:101 - end migrate to node 'node2'
+info     25    node3/lrm: service vm:102 - start migrate to node 'node2'
+info     25    node3/lrm: service vm:102 - end migrate to node 'node2'
+info     25    node3/lrm: service vm:103 - start migrate to node 'node2'
+info     25    node3/lrm: service vm:103 - end migrate to node 'node2'
+info     40    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node2)
+info     43    node2/lrm: starting service vm:101
+info     43    node2/lrm: service status vm:101 started
+info     43    node2/lrm: starting service vm:102
+info     43    node2/lrm: service status vm:102 started
+info     43    node2/lrm: starting service vm:103
+info     43    node2/lrm: service status vm:103 started
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
new file mode 100644
index 00000000..655f5161
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm102-must-be-on-node2
+	resources vm:102
+	nodes node2
+
+resource-affinity: vms-must-stick-together
+	resources vm:101,vm:102,vm:103
+	affinity positive
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
new file mode 100644
index 00000000..299a58c9
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/README b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
new file mode 100644
index 00000000..c5f4b469
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
@@ -0,0 +1,15 @@
+Test whether a strict positive resource affinity rule among three resources,
+where one of these resources is restricted by a node affinity rule to a node
+other than the one they are currently on, makes all resources migrate to
+that node.
+
+The test scenario is:
+- vm:102 must be kept on node1 or node2
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are currently running on node3
+
+The expected outcome is:
+- As vm:102 is on node3, which contradicts its node affinity rule, vm:102 is
+  migrated to node1 to fulfill its node affinity rule
+- As vm:102 is in a positive resource affinity rule with vm:101 and vm:103, all
+  of them are migrated to node1, as the node affinity is inferred for all of them
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
new file mode 100644
index 00000000..22fb5ced
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
@@ -0,0 +1,49 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: migrate service 'vm:101' to node 'node1' (running)
+info     20    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node3, target = node1)
+info     20    node1/crm: migrate service 'vm:102' to node 'node1' (running)
+info     20    node1/crm: service 'vm:102': state changed from 'started' to 'migrate'  (node = node3, target = node1)
+info     20    node1/crm: migrate service 'vm:103' to node 'node1' (running)
+info     20    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node3, target = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: service vm:101 - start migrate to node 'node1'
+info     25    node3/lrm: service vm:101 - end migrate to node 'node1'
+info     25    node3/lrm: service vm:102 - start migrate to node 'node1'
+info     25    node3/lrm: service vm:102 - end migrate to node 'node1'
+info     25    node3/lrm: service vm:103 - start migrate to node 'node1'
+info     25    node3/lrm: service vm:103 - end migrate to node 'node1'
+info     40    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node1)
+info     40    node1/crm: service 'vm:102': state changed from 'migrate' to 'started'  (node = node1)
+info     40    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node1)
+info     41    node1/lrm: starting service vm:101
+info     41    node1/lrm: service status vm:101 started
+info     41    node1/lrm: starting service vm:102
+info     41    node1/lrm: service status vm:102 started
+info     41    node1/lrm: starting service vm:103
+info     41    node1/lrm: service status vm:103 started
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
new file mode 100644
index 00000000..6db94930
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm102-must-be-on-node1-or-node2
+	resources vm:102
+	nodes node1,node2
+
+resource-affinity: vms-must-stick-together
+	resources vm:101,vm:102,vm:103
+	affinity positive
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config
new file mode 100644
index 00000000..299a58c9
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node3", "state": "started" }
+}
-- 
2.47.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1)
  2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
                   ` (11 preceding siblings ...)
  2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules Daniel Kral
@ 2025-08-01 17:36 ` Thomas Lamprecht
  12 siblings, 0 replies; 14+ messages in thread
From: Thomas Lamprecht @ 2025-08-01 17:36 UTC (permalink / raw)
  To: pve-devel, Daniel Kral

On Fri, 01 Aug 2025 18:22:15 +0200, Daniel Kral wrote:
> Here's a follow up on the HA rules and especially the HA resource
> affinity rules.
> 
> The first three patches haven't changed as they were lower priority for
> me than the last part about loosening restrictions on mixed resource
> references.
> 
> [...]

Really not much new code for what it is, nice! And the HA test system makes it
much easier to confidently do (and review!) such changes.

Applied, thanks!

[01/12] manager: fix ~revision version check for ha groups migration
        commit: ea2c1f0201084a3c17472e3fc100a206a221f522
[02/12] test: ha tester: add ha groups migration tests with runtime upgrades
        commit: da810d20bf3d06e345c71dcc38ef9a7004df341a
[03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper
        commit: 8363f7f24731b95efcd0db9b6f9252be94a10e3b
[04/12] api: rules: add missing return schema for the read_rule api endpoint
        commit: ac927a5f5f30e57455cfed0c87a5b43e843db913
[05/12] api: rules: ignore disable parameter if it is set to a falsy value
        commit: c060bc4f3695c199085afd3479aac2ccd2f97c82
[06/12] rules: resource affinity: make message in inter-consistency check clearer
        commit: 5808ede019ec92cb76b8b2a2605fa6e33b566f20
[07/12] config, manager: do not check ignored resources with affinity when migrating
        commit: 1c9c35d4e35e40da79322c62fe484b904c60d471
[08/12] rules: make positive affinity resources migrate on single resource fail
        commit: ad89487d39e1f8ace72f244f51e96d68c393286c
[09/12] rules: allow same resources in node and resource affinity rules
        commit: 4edad9d1fed3b24eee51c0e27d8e1a7cec40f425
[10/12] rules: restrict inter-plugin resource references to simple cases
        commit: c48d9e66b8edf0851028040ab3117b4f01757e14
[11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases
        commit: ed530cc40be19f19f8b30b1bc5bd5a29898ab815
[12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules
        commit: 100859a77c36ef2eaf638f233ac1c82b050ee9cf


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-08-01 17:36 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 02/12] test: ha tester: add ha groups migration tests with runtime upgrades Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules Daniel Kral
2025-08-01 17:36 ` [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Thomas Lamprecht
