* [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1)
@ 2025-08-01 16:22 Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration Daniel Kral
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Here's a follow-up on the HA rules and especially the HA resource
affinity rules.
The first three patches are unchanged from v1, as addressing them had
lower priority for me than the last part about loosening the
restrictions on mixed resource references.
Patches #1 - #8 are rather independent, but still have some ordering
dependencies between them (e.g. patches #1 - #3 depend on each other).
The larger part is patches #9 - #12, which allow mixed resource
references between node and resource affinity rules.
I also ran a
git rebase HEAD~12 --exec 'make clean && make deb'
on the series.
Changelog to v1
---------------
- add missing return schema to read_rule api
- ignore the disable parameter if it is set to a falsy value, as
otherwise the rule feasibility tests would be skipped
- do not check ignored resources, as otherwise migrations are not
possible or co-migrations are done even if the resource is not
ha-managed (could be a use case in the future, but not for now)
- change behavior on single resource failures for resources in positive
affinity rules: try to recover the resource by migrating all resources
to another available node
- loosen restrictions on mixed resource references, i.e., if a resource
is used in a node affinity rule and a resource affinity rule at the
same time (see more information in the patch)
TODO (more important)
---------------------
- more test cases and dev/user testing
- do not include ignored resources in feasibility check
- updates to pve-docs about the new behavior and missing docs:
- new: mixing node and resource affinity rules is now allowed
- new: interactions between node and resource affinity rules
- new: ha rule conflicts for mixed usage
- missing: for positive resource affinity rules, if resources are
separated across nodes, the most populated node will be chosen as the
one that all of them are migrated to
- maybe missing: strictness behavior of negative resource affinity
rules, i.e., that a resource can neither be migrated to a node where
another negative affinity resource is located, nor to a node that such
a resource is currently being migrated to
- using node affinity rules and negative resource affinity rules should
still be improved, as there are quite a few cases where manual
intervention is needed to follow the resources' node affinity rules
(for now, resource affinity rules often override node affinity rules;
the former should get more information about the latter in the
scheduler; but this might not be a blocker as it is rather an edge case)
- some of these cases are also represented in the added test cases, so
that changes to them are documented in the future if those cases are
accommodated for as well
TODO (less important)
---------------------
- adding node affinity rule blockers as well (but not really important)
- maybe log groups that were not migrated because they had no group
members?
- address other suggestions to the original patch series
- cleaning up
Daniel Kral (12):
manager: fix ~revision version check for ha groups migration
test: ha tester: add ha groups migration tests with runtime upgrades
tree-wide: pass optional parameters as hash values for for_each_rule
helper
api: rules: add missing return schema for the read_rule api endpoint
api: rules: ignore disable parameter if it is set to a falsy value
rules: resource affinity: make message in inter-consistency check
clearer
config, manager: do not check ignored resources with affinity when
migrating
rules: make positive affinity resources migrate on single resource
fail
rules: allow same resources in node and resource affinity rules
rules: restrict inter-plugin resource references to simple cases
test: rules: add test cases for inter-plugin checks allowing simple
use cases
test: ha tester: add resource affinity test cases mixed with node
affinity rules
src/PVE/API2/HA/Rules.pm | 12 +-
src/PVE/HA/Config.pm | 2 +
src/PVE/HA/Manager.pm | 11 +-
src/PVE/HA/Rules.pm | 289 +++++++++++++--
src/PVE/HA/Rules/NodeAffinity.pm | 14 +-
src/PVE/HA/Rules/ResourceAffinity.pm | 30 +-
src/PVE/HA/Sim/Hardware.pm | 8 +
...onsistent-node-resource-affinity-rules.cfg | 54 +++
...nt-node-resource-affinity-rules.cfg.expect | 73 ++++
...sistent-resource-affinity-rules.cfg.expect | 8 +-
...egative-resource-affinity-rules.cfg.expect | 4 +-
...y-for-positive-resource-affinity-rules.cfg | 37 ++
...ositive-resource-affinity-rules.cfg.expect | 111 ++++++
...-affinity-with-resource-affinity-rules.cfg | 35 ++
...ty-with-resource-affinity-rules.cfg.expect | 48 +++
.../multiple-resource-refs-in-rules.cfg | 52 ---
...multiple-resource-refs-in-rules.cfg.expect | 111 ------
src/test/test-group-migrate3/README | 7 +
src/test/test-group-migrate3/cmdlist | 17 +
src/test/test-group-migrate3/groups | 7 +
src/test/test-group-migrate3/hardware_status | 5 +
src/test/test-group-migrate3/log.expect | 344 ++++++++++++++++++
src/test/test-group-migrate3/manager_status | 1 +
src/test/test-group-migrate3/service_config | 5 +
src/test/test-group-migrate4/README | 8 +
src/test/test-group-migrate4/cmdlist | 15 +
src/test/test-group-migrate4/groups | 7 +
src/test/test-group-migrate4/hardware_status | 5 +
src/test/test-group-migrate4/log.expect | 277 ++++++++++++++
src/test/test-group-migrate4/manager_status | 1 +
src/test/test-group-migrate4/service_config | 5 +
.../README | 9 +-
.../log.expect | 26 +-
.../README | 14 +
.../cmdlist | 3 +
.../hardware_status | 5 +
.../log.expect | 41 +++
.../manager_status | 1 +
.../rules_config | 7 +
.../service_config | 5 +
.../README | 13 +
.../cmdlist | 3 +
.../hardware_status | 7 +
.../log.expect | 63 ++++
.../manager_status | 1 +
.../rules_config | 7 +
.../service_config | 5 +
.../README | 17 +
.../cmdlist | 6 +
.../hardware_status | 6 +
.../log.expect | 67 ++++
.../manager_status | 1 +
.../rules_config | 15 +
.../service_config | 5 +
.../README | 15 +
.../cmdlist | 3 +
.../hardware_status | 5 +
.../log.expect | 49 +++
.../manager_status | 1 +
.../rules_config | 7 +
.../service_config | 5 +
.../README | 15 +
.../cmdlist | 3 +
.../hardware_status | 5 +
.../log.expect | 49 +++
.../manager_status | 1 +
.../rules_config | 7 +
.../service_config | 5 +
68 files changed, 1852 insertions(+), 248 deletions(-)
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
create mode 100644 src/test/test-group-migrate3/README
create mode 100644 src/test/test-group-migrate3/cmdlist
create mode 100644 src/test/test-group-migrate3/groups
create mode 100644 src/test/test-group-migrate3/hardware_status
create mode 100644 src/test/test-group-migrate3/log.expect
create mode 100644 src/test/test-group-migrate3/manager_status
create mode 100644 src/test/test-group-migrate3/service_config
create mode 100644 src/test/test-group-migrate4/README
create mode 100644 src/test/test-group-migrate4/cmdlist
create mode 100644 src/test/test-group-migrate4/groups
create mode 100644 src/test/test-group-migrate4/hardware_status
create mode 100644 src/test/test-group-migrate4/log.expect
create mode 100644 src/test/test-group-migrate4/manager_status
create mode 100644 src/test/test-group-migrate4/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 02/12] test: ha tester: add ha groups migration tests with runtime upgrades Daniel Kral
` (11 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
For the minimum version of 9.0.0~16 to migrate ha groups, the version
9.0.0 would previously fail the check, because the missing ~revision
defaulted to 0 and 0 < 16 is true. If the ~revision is not set for
$version, it must instead be ordered after any minimum ~revision.
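To make the intended ordering concrete, here is a minimal standalone
sketch (not the actual $has_node_min_version helper; version strings
are just examples): a version without a ~revision, e.g. 9.0.0, sorts
after any pre-release ~revision of the same base version, e.g.
9.0.0~16, matching Debian version semantics.

use strict;
use warnings;

# sketch: returns 1 if $version satisfies $min_version, 0 otherwise
sub version_at_least {
    my ($version, $min_version) = @_;

    my ($maj, $min, $patch, $rev) =
        $version =~ m/^(\d+)\.(\d+)\.(\d+)(?:~(\d+))?/;
    my ($min_maj, $min_min, $min_patch, $min_rev) =
        $min_version =~ m/^(\d+)\.(\d+)\.(\d+)(?:~(\d+))?/;

    return 0 if $maj < $min_maj;
    return 0 if $maj == $min_maj && $min < $min_min;
    return 0 if $maj == $min_maj && $min == $min_min && $patch < $min_patch;

    $min_rev //= 0;
    # only fail on the ~revision if $version actually has one; a version
    # without a ~revision sorts after every ~revision of the same base
    return 0
        if $maj == $min_maj
        && $min == $min_min
        && $patch == $min_patch
        && defined($rev)
        && $rev < $min_rev;

    return 1;
}

print version_at_least('9.0.0',    '9.0.0~16'), "\n"; # 1 - release >= pre-release
print version_at_least('9.0.0~15', '9.0.0~16'), "\n"; # 0 - older pre-release
print version_at_least('8.4.1',    '9.0.0~16'), "\n"; # 0 - older major version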
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
nothing changed since v1
src/PVE/HA/Manager.pm | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 9d7cb73f..0be12061 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -548,10 +548,13 @@ my $has_node_min_version = sub {
return 0 if $major == $min_major && $minor < $min_minor;
return 0 if $major == $min_major && $minor == $min_minor && $patch < $min_patch;
- $rev //= 0;
$min_rev //= 0;
return 0
- if $major == $min_major && $minor == $min_minor && $patch == $min_patch && $rev < $min_rev;
+ if $major == $min_major
+ && $minor == $min_minor
+ && $patch == $min_patch
+ && defined($rev)
+ && $rev < $min_rev;
return 1;
};
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 02/12] test: ha tester: add ha groups migration tests with runtime upgrades
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper Daniel Kral
` (10 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
These test cases cover slightly more realistic upgrade paths of the
cluster nodes, where nodes are upgraded and rebooted one by one and some
actions might fail in between.
The new sim_hardware_cmd 'version' is introduced to allow simulating the
runtime upgrades of each node and should be removed as soon as the HA
groups migration support code is not needed anymore (e.g. PVE 10).
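For reference, a minimal cmdlist sketch using the new command (the node
name and version string below are only placeholders; the cmdlists
actually used by the new tests are in the diff below):

[
    [ "power node1 on" ],
    [ "version node1 set 9.0.0~16" ],
    [ "reboot node1" ]
]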
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
nothing changed since v1
src/PVE/HA/Sim/Hardware.pm | 8 +
src/test/test-group-migrate3/README | 7 +
src/test/test-group-migrate3/cmdlist | 17 +
src/test/test-group-migrate3/groups | 7 +
src/test/test-group-migrate3/hardware_status | 5 +
src/test/test-group-migrate3/log.expect | 344 +++++++++++++++++++
src/test/test-group-migrate3/manager_status | 1 +
src/test/test-group-migrate3/service_config | 5 +
src/test/test-group-migrate4/README | 8 +
src/test/test-group-migrate4/cmdlist | 15 +
src/test/test-group-migrate4/groups | 7 +
src/test/test-group-migrate4/hardware_status | 5 +
src/test/test-group-migrate4/log.expect | 277 +++++++++++++++
src/test/test-group-migrate4/manager_status | 1 +
src/test/test-group-migrate4/service_config | 5 +
15 files changed, 712 insertions(+)
create mode 100644 src/test/test-group-migrate3/README
create mode 100644 src/test/test-group-migrate3/cmdlist
create mode 100644 src/test/test-group-migrate3/groups
create mode 100644 src/test/test-group-migrate3/hardware_status
create mode 100644 src/test/test-group-migrate3/log.expect
create mode 100644 src/test/test-group-migrate3/manager_status
create mode 100644 src/test/test-group-migrate3/service_config
create mode 100644 src/test/test-group-migrate4/README
create mode 100644 src/test/test-group-migrate4/cmdlist
create mode 100644 src/test/test-group-migrate4/groups
create mode 100644 src/test/test-group-migrate4/hardware_status
create mode 100644 src/test/test-group-migrate4/log.expect
create mode 100644 src/test/test-group-migrate4/manager_status
create mode 100644 src/test/test-group-migrate4/service_config
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 4207ce31..63eb89ff 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -596,6 +596,7 @@ sub get_cfs_state {
# simulate hardware commands, the following commands are available:
# power <node> <on|off>
# network <node> <on|off>
+# version <node> set <version>
# delay <seconds>
# skip-round <crm|lrm> [<rounds=1>]
# cfs <node> <rw|update> <work|fail>
@@ -683,6 +684,13 @@ sub sim_hardware_cmd {
$self->write_hardware_status_nolock($cstatus);
+ } elsif ($cmd eq 'version') {
+ die "sim_hardware_cmd: unknown version action '$action'"
+ if $action ne "set";
+ $cstatus->{$node}->{version} = $param;
+
+ $self->write_hardware_status_nolock($cstatus);
+
} elsif ($cmd eq 'cfs') {
die "sim_hardware_cmd: unknown cfs action '$action' for node '$node'"
if $action !~ m/^(rw|update)$/;
diff --git a/src/test/test-group-migrate3/README b/src/test/test-group-migrate3/README
new file mode 100644
index 00000000..0ee45f7a
--- /dev/null
+++ b/src/test/test-group-migrate3/README
@@ -0,0 +1,7 @@
+Test whether an initial (unsupported) mixed version cluster can be properly
+upgraded per major version and then the CRM correctly migrates the HA group
+config only after all nodes have at least the proper pre-release version.
+
+By rebooting every node after each version change, this tests whether the
+switching of the CRM node and a few instances of LRM restarts are properly
+prohibiting the HA groups config migration.
diff --git a/src/test/test-group-migrate3/cmdlist b/src/test/test-group-migrate3/cmdlist
new file mode 100644
index 00000000..d507acad
--- /dev/null
+++ b/src/test/test-group-migrate3/cmdlist
@@ -0,0 +1,17 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "version node1 set 8.4.1" ],
+ [ "reboot node1" ],
+ [ "version node2 set 8.4.1" ],
+ [ "reboot node2" ],
+ [ "version node3 set 8.4.1" ],
+ [ "reboot node3" ],
+ [ "version node1 set 9.0.0~16" ],
+ [ "reboot node1" ],
+ [ "version node2 set 9.0.0~16" ],
+ [ "reboot node2" ],
+ [ "version node3 set 9.0.0~15" ],
+ [ "reboot node3" ],
+ [ "version node3 set 9.0.0~17" ],
+ [ "reboot node3" ]
+]
diff --git a/src/test/test-group-migrate3/groups b/src/test/test-group-migrate3/groups
new file mode 100644
index 00000000..bad746ca
--- /dev/null
+++ b/src/test/test-group-migrate3/groups
@@ -0,0 +1,7 @@
+group: group1
+ nodes node1
+ restricted 1
+
+group: group2
+ nodes node2:2,node3
+ nofailback 1
diff --git a/src/test/test-group-migrate3/hardware_status b/src/test/test-group-migrate3/hardware_status
new file mode 100644
index 00000000..e8f9d73f
--- /dev/null
+++ b/src/test/test-group-migrate3/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "version": "7.4-4" },
+ "node2": { "power": "off", "network": "off", "version": "8.3.7" },
+ "node3": { "power": "off", "network": "off", "version": "8.3.0" }
+}
diff --git a/src/test/test-group-migrate3/log.expect b/src/test/test-group-migrate3/log.expect
new file mode 100644
index 00000000..63be1218
--- /dev/null
+++ b/src/test/test-group-migrate3/log.expect
@@ -0,0 +1,344 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+noti 60 node1/crm: start ha group migration...
+noti 60 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 60 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 60 node1/crm: ha groups migration: node 'node1' has version '7.4-4'
+err 60 node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 60 node1/crm: ha groups migration failed
+noti 60 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 120 cmdlist: execute version node1 set 8.4.1
+noti 180 node1/crm: start ha group migration...
+noti 180 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 180 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 180 node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 180 node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 180 node1/crm: ha groups migration failed
+noti 180 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 220 cmdlist: execute reboot node1
+info 220 node1/lrm: got shutdown request with shutdown policy 'conditional'
+info 220 node1/lrm: reboot LRM, stop and freeze all services
+info 220 node1/crm: service 'vm:101': state changed from 'started' to 'freeze'
+info 221 node1/lrm: stopping service vm:101
+info 221 node1/lrm: service status vm:101 stopped
+info 222 node1/lrm: exit (loop end)
+info 222 reboot: execute crm node1 stop
+info 221 node1/crm: server received shutdown request
+info 240 node1/crm: voluntary release CRM lock
+info 241 node1/crm: exit (loop end)
+info 241 reboot: execute power node1 off
+info 241 reboot: execute power node1 on
+info 241 node1/crm: status change startup => wait_for_quorum
+info 240 node1/lrm: status change startup => wait_for_agent_lock
+info 242 node2/crm: got lock 'ha_manager_lock'
+info 242 node2/crm: status change slave => master
+info 242 node2/crm: service 'vm:101': state changed from 'freeze' to 'started'
+info 260 node1/crm: status change wait_for_quorum => slave
+info 261 node1/lrm: got lock 'ha_agent_node1_lock'
+info 261 node1/lrm: status change wait_for_agent_lock => active
+info 261 node1/lrm: starting service vm:101
+info 261 node1/lrm: service status vm:101 started
+noti 282 node2/crm: start ha group migration...
+noti 282 node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti 282 node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 282 node2/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 282 node2/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 282 node2/crm: ha groups migration failed
+noti 282 node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 320 cmdlist: execute version node2 set 8.4.1
+noti 402 node2/crm: start ha group migration...
+noti 402 node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti 402 node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 402 node2/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 402 node2/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 402 node2/crm: ha groups migration failed
+noti 402 node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 420 cmdlist: execute reboot node2
+info 420 node2/lrm: got shutdown request with shutdown policy 'conditional'
+info 420 node2/lrm: reboot LRM, stop and freeze all services
+info 422 node2/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info 423 node2/lrm: stopping service vm:102
+info 423 node2/lrm: service status vm:102 stopped
+info 424 node2/lrm: exit (loop end)
+info 424 reboot: execute crm node2 stop
+info 423 node2/crm: server received shutdown request
+info 442 node2/crm: voluntary release CRM lock
+info 443 node2/crm: exit (loop end)
+info 443 reboot: execute power node2 off
+info 443 reboot: execute power node2 on
+info 443 node2/crm: status change startup => wait_for_quorum
+info 440 node2/lrm: status change startup => wait_for_agent_lock
+info 444 node3/crm: got lock 'ha_manager_lock'
+info 444 node3/crm: status change slave => master
+info 444 node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info 462 node2/crm: status change wait_for_quorum => slave
+info 463 node2/lrm: got lock 'ha_agent_node2_lock'
+info 463 node2/lrm: status change wait_for_agent_lock => active
+info 463 node2/lrm: starting service vm:102
+info 463 node2/lrm: service status vm:102 started
+noti 484 node3/crm: start ha group migration...
+noti 484 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 484 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 484 node3/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 484 node3/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 484 node3/crm: ha groups migration failed
+noti 484 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 520 cmdlist: execute version node3 set 8.4.1
+noti 604 node3/crm: start ha group migration...
+noti 604 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 604 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 604 node3/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 604 node3/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 604 node3/crm: ha groups migration failed
+noti 604 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 620 cmdlist: execute reboot node3
+info 620 node3/lrm: got shutdown request with shutdown policy 'conditional'
+info 620 node3/lrm: reboot LRM, stop and freeze all services
+info 624 node3/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info 625 node3/lrm: stopping service vm:103
+info 625 node3/lrm: service status vm:103 stopped
+info 626 node3/lrm: exit (loop end)
+info 626 reboot: execute crm node3 stop
+info 625 node3/crm: server received shutdown request
+info 644 node3/crm: voluntary release CRM lock
+info 645 node3/crm: exit (loop end)
+info 645 reboot: execute power node3 off
+info 645 reboot: execute power node3 on
+info 645 node3/crm: status change startup => wait_for_quorum
+info 640 node3/lrm: status change startup => wait_for_agent_lock
+info 660 node1/crm: got lock 'ha_manager_lock'
+info 660 node1/crm: status change slave => master
+info 660 node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info 664 node3/crm: status change wait_for_quorum => slave
+info 665 node3/lrm: got lock 'ha_agent_node3_lock'
+info 665 node3/lrm: status change wait_for_agent_lock => active
+info 665 node3/lrm: starting service vm:103
+info 665 node3/lrm: service status vm:103 started
+noti 700 node1/crm: start ha group migration...
+noti 700 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 700 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 700 node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 700 node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 700 node1/crm: ha groups migration failed
+noti 700 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 720 cmdlist: execute version node1 set 9.0.0~16
+info 820 cmdlist: execute reboot node1
+info 820 node1/lrm: got shutdown request with shutdown policy 'conditional'
+info 820 node1/lrm: reboot LRM, stop and freeze all services
+noti 820 node1/crm: start ha group migration...
+noti 820 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 820 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'restart'
+err 820 node1/crm: abort ha groups migration: lrm 'node1' is not in mode 'active'
+err 820 node1/crm: ha groups migration failed
+noti 820 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 820 node1/crm: service 'vm:101': state changed from 'started' to 'freeze'
+info 821 node1/lrm: stopping service vm:101
+info 821 node1/lrm: service status vm:101 stopped
+info 822 node1/lrm: exit (loop end)
+info 822 reboot: execute crm node1 stop
+info 821 node1/crm: server received shutdown request
+info 840 node1/crm: voluntary release CRM lock
+info 841 node1/crm: exit (loop end)
+info 841 reboot: execute power node1 off
+info 841 reboot: execute power node1 on
+info 841 node1/crm: status change startup => wait_for_quorum
+info 840 node1/lrm: status change startup => wait_for_agent_lock
+info 842 node2/crm: got lock 'ha_manager_lock'
+info 842 node2/crm: status change slave => master
+info 842 node2/crm: service 'vm:101': state changed from 'freeze' to 'started'
+info 860 node1/crm: status change wait_for_quorum => slave
+info 861 node1/lrm: got lock 'ha_agent_node1_lock'
+info 861 node1/lrm: status change wait_for_agent_lock => active
+info 861 node1/lrm: starting service vm:101
+info 861 node1/lrm: service status vm:101 started
+noti 882 node2/crm: start ha group migration...
+noti 882 node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti 882 node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 882 node2/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 882 node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti 882 node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 882 node2/crm: ha groups migration: node 'node2' has version '8.4.1'
+err 882 node2/crm: abort ha groups migration: node 'node2' needs at least pve-manager version '9.0.0~16'
+err 882 node2/crm: ha groups migration failed
+noti 882 node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 920 cmdlist: execute version node2 set 9.0.0~16
+noti 1002 node2/crm: start ha group migration...
+noti 1002 node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1002 node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1002 node2/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 1002 node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1002 node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1002 node2/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti 1002 node2/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1002 node2/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1002 node2/crm: ha groups migration: node 'node3' has version '8.4.1'
+err 1002 node2/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 1002 node2/crm: ha groups migration failed
+noti 1002 node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1020 cmdlist: execute reboot node2
+info 1020 node2/lrm: got shutdown request with shutdown policy 'conditional'
+info 1020 node2/lrm: reboot LRM, stop and freeze all services
+info 1022 node2/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info 1023 node2/lrm: stopping service vm:102
+info 1023 node2/lrm: service status vm:102 stopped
+info 1024 node2/lrm: exit (loop end)
+info 1024 reboot: execute crm node2 stop
+info 1023 node2/crm: server received shutdown request
+info 1042 node2/crm: voluntary release CRM lock
+info 1043 node2/crm: exit (loop end)
+info 1043 reboot: execute power node2 off
+info 1043 reboot: execute power node2 on
+info 1043 node2/crm: status change startup => wait_for_quorum
+info 1040 node2/lrm: status change startup => wait_for_agent_lock
+info 1044 node3/crm: got lock 'ha_manager_lock'
+info 1044 node3/crm: status change slave => master
+info 1044 node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info 1062 node2/crm: status change wait_for_quorum => slave
+info 1063 node2/lrm: got lock 'ha_agent_node2_lock'
+info 1063 node2/lrm: status change wait_for_agent_lock => active
+info 1063 node2/lrm: starting service vm:102
+info 1063 node2/lrm: service status vm:102 started
+noti 1084 node3/crm: start ha group migration...
+noti 1084 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1084 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1084 node3/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 1084 node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1084 node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1084 node3/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti 1084 node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1084 node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1084 node3/crm: ha groups migration: node 'node3' has version '8.4.1'
+err 1084 node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 1084 node3/crm: ha groups migration failed
+noti 1084 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1120 cmdlist: execute version node3 set 9.0.0~15
+noti 1204 node3/crm: start ha group migration...
+noti 1204 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1204 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1204 node3/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 1204 node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1204 node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1204 node3/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti 1204 node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1204 node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1204 node3/crm: ha groups migration: node 'node3' has version '9.0.0~15'
+err 1204 node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 1204 node3/crm: ha groups migration failed
+noti 1204 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1220 cmdlist: execute reboot node3
+info 1220 node3/lrm: got shutdown request with shutdown policy 'conditional'
+info 1220 node3/lrm: reboot LRM, stop and freeze all services
+info 1224 node3/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info 1225 node3/lrm: stopping service vm:103
+info 1225 node3/lrm: service status vm:103 stopped
+info 1226 node3/lrm: exit (loop end)
+info 1226 reboot: execute crm node3 stop
+info 1225 node3/crm: server received shutdown request
+info 1244 node3/crm: voluntary release CRM lock
+info 1245 node3/crm: exit (loop end)
+info 1245 reboot: execute power node3 off
+info 1245 reboot: execute power node3 on
+info 1245 node3/crm: status change startup => wait_for_quorum
+info 1240 node3/lrm: status change startup => wait_for_agent_lock
+info 1260 node1/crm: got lock 'ha_manager_lock'
+info 1260 node1/crm: status change slave => master
+info 1260 node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info 1264 node3/crm: status change wait_for_quorum => slave
+info 1265 node3/lrm: got lock 'ha_agent_node3_lock'
+info 1265 node3/lrm: status change wait_for_agent_lock => active
+info 1265 node3/lrm: starting service vm:103
+info 1265 node3/lrm: service status vm:103 started
+noti 1300 node1/crm: start ha group migration...
+noti 1300 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1300 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1300 node1/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 1300 node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1300 node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1300 node1/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti 1300 node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1300 node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1300 node1/crm: ha groups migration: node 'node3' has version '9.0.0~15'
+err 1300 node1/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 1300 node1/crm: ha groups migration failed
+noti 1300 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1320 cmdlist: execute version node3 set 9.0.0~17
+info 1420 cmdlist: execute reboot node3
+info 1420 node3/lrm: got shutdown request with shutdown policy 'conditional'
+info 1420 node3/lrm: reboot LRM, stop and freeze all services
+noti 1420 node1/crm: start ha group migration...
+noti 1420 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1420 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1420 node1/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 1420 node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1420 node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1420 node1/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti 1420 node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1420 node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'restart'
+err 1420 node1/crm: abort ha groups migration: lrm 'node3' is not in mode 'active'
+err 1420 node1/crm: ha groups migration failed
+noti 1420 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1420 node1/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info 1425 node3/lrm: stopping service vm:103
+info 1425 node3/lrm: service status vm:103 stopped
+info 1426 node3/lrm: exit (loop end)
+info 1426 reboot: execute crm node3 stop
+info 1425 node3/crm: server received shutdown request
+info 1445 node3/crm: exit (loop end)
+info 1445 reboot: execute power node3 off
+info 1445 reboot: execute power node3 on
+info 1445 node3/crm: status change startup => wait_for_quorum
+info 1440 node3/lrm: status change startup => wait_for_agent_lock
+info 1460 node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info 1464 node3/crm: status change wait_for_quorum => slave
+info 1465 node3/lrm: got lock 'ha_agent_node3_lock'
+info 1465 node3/lrm: status change wait_for_agent_lock => active
+info 1465 node3/lrm: starting service vm:103
+info 1465 node3/lrm: service status vm:103 started
+noti 1540 node1/crm: start ha group migration...
+noti 1540 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1540 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1540 node1/crm: ha groups migration: node 'node1' has version '9.0.0~16'
+noti 1540 node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1540 node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1540 node1/crm: ha groups migration: node 'node2' has version '9.0.0~16'
+noti 1540 node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1540 node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1540 node1/crm: ha groups migration: node 'node3' has version '9.0.0~17'
+noti 1540 node1/crm: ha groups migration: migration to rules config successful
+noti 1540 node1/crm: ha groups migration: migration to resources config successful
+noti 1540 node1/crm: ha groups migration: group config deletion successful
+noti 1540 node1/crm: ha groups migration successful
+info 2020 hardware: exit simulation - done
diff --git a/src/test/test-group-migrate3/manager_status b/src/test/test-group-migrate3/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-group-migrate3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-group-migrate3/service_config b/src/test/test-group-migrate3/service_config
new file mode 100644
index 00000000..a27551e5
--- /dev/null
+++ b/src/test/test-group-migrate3/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started", "group": "group1" },
+ "vm:102": { "node": "node2", "state": "started", "group": "group2" },
+ "vm:103": { "node": "node3", "state": "started", "group": "group2" }
+}
diff --git a/src/test/test-group-migrate4/README b/src/test/test-group-migrate4/README
new file mode 100644
index 00000000..37e60c7d
--- /dev/null
+++ b/src/test/test-group-migrate4/README
@@ -0,0 +1,8 @@
+Test whether a cluster, where all nodes have the same version from the previous
+major release, can be properly upgraded to the needed major release version and
+then the CRM correctly migrates the HA group config only after all nodes have
+the minimum major release version.
+
+Additionally, the nodes are rebooted with every version upgrade and in-between
+the CFS sporadically fails to read/write, fails to update cluster state and an
+LRM is restarted, which all prohibit the HA groups config migration.
diff --git a/src/test/test-group-migrate4/cmdlist b/src/test/test-group-migrate4/cmdlist
new file mode 100644
index 00000000..fdd3bfdd
--- /dev/null
+++ b/src/test/test-group-migrate4/cmdlist
@@ -0,0 +1,15 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "delay 10" ],
+ [ "version node1 set 9.0.0" ],
+ [ "reboot node1" ],
+ [ "cfs node2 rw fail" ],
+ [ "version node2 set 9.0.0" ],
+ [ "cfs node2 rw work" ],
+ [ "reboot node2" ],
+ [ "cfs node3 update fail" ],
+ [ "cfs node3 update work" ],
+ [ "version node3 set 9.0.1" ],
+ [ "restart-lrm node2" ],
+ [ "reboot node3" ]
+]
diff --git a/src/test/test-group-migrate4/groups b/src/test/test-group-migrate4/groups
new file mode 100644
index 00000000..bad746ca
--- /dev/null
+++ b/src/test/test-group-migrate4/groups
@@ -0,0 +1,7 @@
+group: group1
+ nodes node1
+ restricted 1
+
+group: group2
+ nodes node2:2,node3
+ nofailback 1
diff --git a/src/test/test-group-migrate4/hardware_status b/src/test/test-group-migrate4/hardware_status
new file mode 100644
index 00000000..7ad46416
--- /dev/null
+++ b/src/test/test-group-migrate4/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "version": "8.4.1" },
+ "node2": { "power": "off", "network": "off", "version": "8.4.1" },
+ "node3": { "power": "off", "network": "off", "version": "8.4.1" }
+}
diff --git a/src/test/test-group-migrate4/log.expect b/src/test/test-group-migrate4/log.expect
new file mode 100644
index 00000000..7ffe33e3
--- /dev/null
+++ b/src/test/test-group-migrate4/log.expect
@@ -0,0 +1,277 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+noti 60 node1/crm: start ha group migration...
+noti 60 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 60 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 60 node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 60 node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 60 node1/crm: ha groups migration failed
+noti 60 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 120 cmdlist: execute delay 10
+noti 180 node1/crm: start ha group migration...
+noti 180 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 180 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 180 node1/crm: ha groups migration: node 'node1' has version '8.4.1'
+err 180 node1/crm: abort ha groups migration: node 'node1' needs at least pve-manager version '9.0.0~16'
+err 180 node1/crm: ha groups migration failed
+noti 180 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 220 cmdlist: execute version node1 set 9.0.0
+noti 300 node1/crm: start ha group migration...
+noti 300 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 300 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 300 node1/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 300 node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti 300 node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 300 node1/crm: ha groups migration: node 'node2' has version '8.4.1'
+err 300 node1/crm: abort ha groups migration: node 'node2' needs at least pve-manager version '9.0.0~16'
+err 300 node1/crm: ha groups migration failed
+noti 300 node1/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 320 cmdlist: execute reboot node1
+info 320 node1/lrm: got shutdown request with shutdown policy 'conditional'
+info 320 node1/lrm: reboot LRM, stop and freeze all services
+info 320 node1/crm: service 'vm:101': state changed from 'started' to 'freeze'
+info 321 node1/lrm: stopping service vm:101
+info 321 node1/lrm: service status vm:101 stopped
+info 322 node1/lrm: exit (loop end)
+info 322 reboot: execute crm node1 stop
+info 321 node1/crm: server received shutdown request
+info 340 node1/crm: voluntary release CRM lock
+info 341 node1/crm: exit (loop end)
+info 341 reboot: execute power node1 off
+info 341 reboot: execute power node1 on
+info 341 node1/crm: status change startup => wait_for_quorum
+info 340 node1/lrm: status change startup => wait_for_agent_lock
+info 342 node2/crm: got lock 'ha_manager_lock'
+info 342 node2/crm: status change slave => master
+info 342 node2/crm: service 'vm:101': state changed from 'freeze' to 'started'
+info 360 node1/crm: status change wait_for_quorum => slave
+info 361 node1/lrm: got lock 'ha_agent_node1_lock'
+info 361 node1/lrm: status change wait_for_agent_lock => active
+info 361 node1/lrm: starting service vm:101
+info 361 node1/lrm: service status vm:101 started
+noti 382 node2/crm: start ha group migration...
+noti 382 node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti 382 node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 382 node2/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 382 node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti 382 node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 382 node2/crm: ha groups migration: node 'node2' has version '8.4.1'
+err 382 node2/crm: abort ha groups migration: node 'node2' needs at least pve-manager version '9.0.0~16'
+err 382 node2/crm: ha groups migration failed
+noti 382 node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 420 cmdlist: execute cfs node2 rw fail
+err 422 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 422 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 423 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 423 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 423 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 442 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 442 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 443 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 443 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 443 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 462 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 462 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 463 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 463 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 463 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 482 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 482 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 483 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 483 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 483 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 502 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 502 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 503 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 503 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 503 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+info 520 cmdlist: execute version node2 set 9.0.0
+err 522 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 522 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 523 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 523 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 523 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 542 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 542 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 543 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 543 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 543 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 562 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 562 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 563 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 563 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 563 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 582 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 582 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 583 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 583 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 583 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+err 602 node2/crm: could not read manager status: cfs connection refused - not mounted?
+err 602 node2/crm: got unexpected error - cfs connection refused - not mounted?
+err 603 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 603 node2/lrm: updating service status from manager failed: cfs connection refused - not mounted?
+err 603 node2/lrm: unable to write lrm status file - cfs connection refused - not mounted?
+info 620 cmdlist: execute cfs node2 rw work
+noti 702 node2/crm: start ha group migration...
+noti 702 node2/crm: ha groups migration: node 'node1' is in state 'online'
+noti 702 node2/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 702 node2/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 702 node2/crm: ha groups migration: node 'node2' is in state 'online'
+noti 702 node2/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 702 node2/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti 702 node2/crm: ha groups migration: node 'node3' is in state 'online'
+noti 702 node2/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 702 node2/crm: ha groups migration: node 'node3' has version '8.4.1'
+err 702 node2/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 702 node2/crm: ha groups migration failed
+noti 702 node2/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 720 cmdlist: execute reboot node2
+info 720 node2/lrm: got shutdown request with shutdown policy 'conditional'
+info 720 node2/lrm: reboot LRM, stop and freeze all services
+info 722 node2/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info 723 node2/lrm: stopping service vm:102
+info 723 node2/lrm: service status vm:102 stopped
+info 724 node2/lrm: exit (loop end)
+info 724 reboot: execute crm node2 stop
+info 723 node2/crm: server received shutdown request
+info 742 node2/crm: voluntary release CRM lock
+info 743 node2/crm: exit (loop end)
+info 743 reboot: execute power node2 off
+info 743 reboot: execute power node2 on
+info 743 node2/crm: status change startup => wait_for_quorum
+info 740 node2/lrm: status change startup => wait_for_agent_lock
+info 744 node3/crm: got lock 'ha_manager_lock'
+info 744 node3/crm: status change slave => master
+info 744 node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info 762 node2/crm: status change wait_for_quorum => slave
+info 763 node2/lrm: got lock 'ha_agent_node2_lock'
+info 763 node2/lrm: status change wait_for_agent_lock => active
+info 763 node2/lrm: starting service vm:102
+info 763 node2/lrm: service status vm:102 started
+noti 784 node3/crm: start ha group migration...
+noti 784 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 784 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 784 node3/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 784 node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti 784 node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 784 node3/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti 784 node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti 784 node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 784 node3/crm: ha groups migration: node 'node3' has version '8.4.1'
+err 784 node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 784 node3/crm: ha groups migration failed
+noti 784 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 820 cmdlist: execute cfs node3 update fail
+noti 824 node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 825 node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 844 node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 845 node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 864 node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 865 node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 884 node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 885 node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 904 node3/crm: temporary inconsistent cluster state (cfs restart?), skip round
+noti 905 node3/lrm: temporary inconsistent cluster state (cfs restart?), skip round
+info 920 cmdlist: execute cfs node3 update work
+noti 1004 node3/crm: start ha group migration...
+noti 1004 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1004 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1004 node3/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 1004 node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1004 node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1004 node3/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti 1004 node3/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1004 node3/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1004 node3/crm: ha groups migration: node 'node3' has version '8.4.1'
+err 1004 node3/crm: abort ha groups migration: node 'node3' needs at least pve-manager version '9.0.0~16'
+err 1004 node3/crm: ha groups migration failed
+noti 1004 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1020 cmdlist: execute version node3 set 9.0.1
+info 1120 cmdlist: execute restart-lrm node2
+info 1120 node2/lrm: restart LRM, freeze all services
+noti 1124 node3/crm: start ha group migration...
+noti 1124 node3/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1124 node3/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1124 node3/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 1124 node3/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1124 node3/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'restart'
+err 1124 node3/crm: abort ha groups migration: lrm 'node2' is not in mode 'active'
+err 1124 node3/crm: ha groups migration failed
+noti 1124 node3/crm: retry ha groups migration in 6 rounds (~ 60 seconds)
+info 1124 node3/crm: service 'vm:102': state changed from 'started' to 'freeze'
+info 1144 node2/lrm: exit (loop end)
+info 1144 node2/lrm: status change startup => wait_for_agent_lock
+info 1164 node3/crm: service 'vm:102': state changed from 'freeze' to 'started'
+info 1183 node2/lrm: got lock 'ha_agent_node2_lock'
+info 1183 node2/lrm: status change wait_for_agent_lock => active
+info 1220 cmdlist: execute reboot node3
+info 1220 node3/lrm: got shutdown request with shutdown policy 'conditional'
+info 1220 node3/lrm: reboot LRM, stop and freeze all services
+info 1224 node3/crm: service 'vm:103': state changed from 'started' to 'freeze'
+info 1225 node3/lrm: stopping service vm:103
+info 1225 node3/lrm: service status vm:103 stopped
+info 1226 node3/lrm: exit (loop end)
+info 1226 reboot: execute crm node3 stop
+info 1225 node3/crm: server received shutdown request
+info 1244 node3/crm: voluntary release CRM lock
+info 1245 node3/crm: exit (loop end)
+info 1245 reboot: execute power node3 off
+info 1245 reboot: execute power node3 on
+info 1245 node3/crm: status change startup => wait_for_quorum
+info 1240 node3/lrm: status change startup => wait_for_agent_lock
+info 1260 node1/crm: got lock 'ha_manager_lock'
+info 1260 node1/crm: status change slave => master
+info 1260 node1/crm: service 'vm:103': state changed from 'freeze' to 'started'
+info 1264 node3/crm: status change wait_for_quorum => slave
+info 1265 node3/lrm: got lock 'ha_agent_node3_lock'
+info 1265 node3/lrm: status change wait_for_agent_lock => active
+info 1265 node3/lrm: starting service vm:103
+info 1265 node3/lrm: service status vm:103 started
+noti 1300 node1/crm: start ha group migration...
+noti 1300 node1/crm: ha groups migration: node 'node1' is in state 'online'
+noti 1300 node1/crm: ha groups migration: lrm 'node1' is in state 'active' and mode 'active'
+noti 1300 node1/crm: ha groups migration: node 'node1' has version '9.0.0'
+noti 1300 node1/crm: ha groups migration: node 'node2' is in state 'online'
+noti 1300 node1/crm: ha groups migration: lrm 'node2' is in state 'active' and mode 'active'
+noti 1300 node1/crm: ha groups migration: node 'node2' has version '9.0.0'
+noti 1300 node1/crm: ha groups migration: node 'node3' is in state 'online'
+noti 1300 node1/crm: ha groups migration: lrm 'node3' is in state 'active' and mode 'active'
+noti 1300 node1/crm: ha groups migration: node 'node3' has version '9.0.1'
+noti 1300 node1/crm: ha groups migration: migration to rules config successful
+noti 1300 node1/crm: ha groups migration: migration to resources config successful
+noti 1300 node1/crm: ha groups migration: group config deletion successful
+noti 1300 node1/crm: ha groups migration successful
+info 1820 hardware: exit simulation - done
diff --git a/src/test/test-group-migrate4/manager_status b/src/test/test-group-migrate4/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-group-migrate4/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-group-migrate4/service_config b/src/test/test-group-migrate4/service_config
new file mode 100644
index 00000000..a27551e5
--- /dev/null
+++ b/src/test/test-group-migrate4/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started", "group": "group1" },
+ "vm:102": { "node": "node2", "state": "started", "group": "group2" },
+ "vm:103": { "node": "node3", "state": "started", "group": "group2" }
+}
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 01/12] manager: fix ~revision version check for ha groups migration Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 02/12] test: ha tester: add ha groups migration tests with runtime upgrades Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint Daniel Kral
` (9 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Make call sites of the for_each_rule helper more readable and, while at
it, remove unnecessary variables in the helper body as well.
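For illustration, a call site then changes roughly as follows (a minimal
sketch based on the diff below; the callback body and the fully-qualified
call are abbreviated/assumed for the sake of a self-contained example):

    # before: optional filters were passed as a hash reference
    PVE::HA::Rules::foreach_rule(
        $rules,
        sub {
            my ($rule, $ruleid) = @_;
            # ... act on the matching rule ...
        },
        { type => 'node-affinity', exclude_disabled_rules => 1 },
    );

    # after: optional filters are passed as plain key-value pairs
    PVE::HA::Rules::foreach_rule(
        $rules,
        sub {
            my ($rule, $ruleid) = @_;
            # ... act on the matching rule ...
        },
        type => 'node-affinity',
        exclude_disabled_rules => 1,
    );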
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Rules.pm | 6 ++----
src/PVE/HA/Rules.pm | 16 +++++++---------
src/PVE/HA/Rules/NodeAffinity.pm | 14 +++++---------
src/PVE/HA/Rules/ResourceAffinity.pm | 22 ++++++++--------------
4 files changed, 22 insertions(+), 36 deletions(-)
diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index 1591df28..b180d2ed 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -192,10 +192,8 @@ __PACKAGE__->register_method({
push @$res, $cfg;
},
- {
- type => $type,
- sid => $resource,
- },
+ type => $type,
+ sid => $resource,
);
return $res;
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index e5d12571..e2b77215 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -419,13 +419,13 @@ sub canonicalize : prototype($$$) {
=head3 foreach_rule(...)
-=head3 foreach_rule($rules, $func [, $opts])
+=head3 foreach_rule($rules, $func [, %opts])
Filters the given C<$rules> according to the C<$opts> and loops over the
resulting rules in the order as defined in the section config and executes
C<$func> with the parameters C<L<< ($rule, $ruleid) >>>.
-The filter properties for C<$opts> are:
+The following key-value pairs are supported as filter properties for C<$opts>:
=over
@@ -439,12 +439,10 @@ The filter properties for C<$opts> are:
=cut
-sub foreach_rule : prototype($$;$) {
- my ($rules, $func, $opts) = @_;
+sub foreach_rule : prototype($$;%) {
+ my ($rules, $func, %opts) = @_;
- my $sid = $opts->{sid};
- my $type = $opts->{type};
- my $exclude_disabled_rules = $opts->{exclude_disabled_rules};
+ my $sid = $opts{sid};
my @ruleids = sort {
$rules->{order}->{$a} <=> $rules->{order}->{$b}
@@ -455,8 +453,8 @@ sub foreach_rule : prototype($$;$) {
next if !$rule; # skip invalid rules
next if defined($sid) && !defined($rule->{resources}->{$sid});
- next if defined($type) && $rule->{type} ne $type;
- next if $exclude_disabled_rules && exists($rule->{disable});
+ next if defined($opts{type}) && $rule->{type} ne $opts{type};
+ next if $opts{exclude_disabled_rules} && exists($rule->{disable});
$func->($rule, $ruleid);
}
diff --git a/src/PVE/HA/Rules/NodeAffinity.pm b/src/PVE/HA/Rules/NodeAffinity.pm
index ee3ef985..09a8e67c 100644
--- a/src/PVE/HA/Rules/NodeAffinity.pm
+++ b/src/PVE/HA/Rules/NodeAffinity.pm
@@ -148,10 +148,8 @@ sub get_plugin_check_arguments {
$result->{node_affinity_rules}->{$ruleid} = $rule;
},
- {
- type => 'node-affinity',
- exclude_disabled_rules => 1,
- },
+ type => 'node-affinity',
+ exclude_disabled_rules => 1,
);
return $result;
@@ -231,11 +229,9 @@ my $get_resource_node_affinity_rule = sub {
$node_affinity_rule = dclone($rule) if !$node_affinity_rule;
},
- {
- sid => $sid,
- type => 'node-affinity',
- exclude_disabled_rules => 1,
- },
+ sid => $sid,
+ type => 'node-affinity',
+ exclude_disabled_rules => 1,
);
return $node_affinity_rule;
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 6b5670ac..1d2ed1ed 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -92,10 +92,8 @@ sub get_plugin_check_arguments {
$result->{positive_rules}->{$ruleid} = $rule if $rule->{affinity} eq 'positive';
$result->{negative_rules}->{$ruleid} = $rule if $rule->{affinity} eq 'negative';
},
- {
- type => 'resource-affinity',
- exclude_disabled_rules => 1,
- },
+ type => 'resource-affinity',
+ exclude_disabled_rules => 1,
);
return $result;
@@ -490,11 +488,9 @@ sub get_affinitive_resources : prototype($$) {
$affinity_set->{$csid} = 1 if $csid ne $sid;
}
},
- {
- sid => $sid,
- type => 'resource-affinity',
- exclude_disabled_rules => 1,
- },
+ sid => $sid,
+ type => 'resource-affinity',
+ exclude_disabled_rules => 1,
);
return ($together, $separate);
@@ -560,11 +556,9 @@ sub get_resource_affinity : prototype($$$) {
}
}
},
- {
- sid => $sid,
- type => 'resource-affinity',
- exclude_disabled_rules => 1,
- },
+ sid => $sid,
+ type => 'resource-affinity',
+ exclude_disabled_rules => 1,
);
return ($together, $separate);
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (2 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value Daniel Kral
` (8 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Rules.pm | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index b180d2ed..d797f621 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -223,6 +223,8 @@ __PACKAGE__->register_method({
rule => get_standard_option('pve-ha-rule-id'),
type => {
type => 'string',
+ description => "HA rule type.",
+ enum => PVE::HA::Rules->lookup_types(),
},
},
},
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (3 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 04/12] api: rules: add missing return schema for the read_rule api endpoint Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer Daniel Kral
` (7 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Otherwise, such rules will be ignored by the feasibility check, which
allows users to create or update rules that are infeasible and would
make other HA rules invalid.
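A rough sketch of why the guard is needed (both code lines below are
taken from this series; the connection drawn in the comments is this
patch's reasoning):

    # the feasibility filtering skips any rule that merely has a 'disable'
    # key, regardless of its value (see foreach_rule() in PVE/HA/Rules.pm):
    next if $opts{exclude_disabled_rules} && exists($rule->{disable});

    # so a create/update request with e.g. 'disable=0' must not leave a
    # falsy key behind, which is what the added guard ensures:
    delete $param->{disable} if !$param->{disable};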
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Rules.pm | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index d797f621..ab431019 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -268,6 +268,8 @@ __PACKAGE__->register_method({
my $type = extract_param($param, 'type');
my $ruleid = extract_param($param, 'rule');
+ delete $param->{disable} if !$param->{disable};
+
my $plugin = PVE::HA::Rules->lookup($type);
my $opts = $plugin->check_config($ruleid, $param, 1, 1);
@@ -318,6 +320,8 @@ __PACKAGE__->register_method({
my $digest = extract_param($param, 'digest');
my $delete = extract_param($param, 'delete');
+ delete $param->{disable} if !$param->{disable};
+
if ($delete) {
$delete = [PVE::Tools::split_list($delete)];
}
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (4 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 05/12] api: rules: ignore disable parameter if it is set to a falsy value Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating Daniel Kral
` (6 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Most users will likely interact with the HA rules through the web
interface, where the HA rule ids are not shown in the rules view.
Error messages with direct references to these rule ids can therefore be
confusing to users, so replace the rule id references with a more generic
description of the conflicting rule.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Rules/ResourceAffinity.pm | 4 ++--
.../inconsistent-resource-affinity-rules.cfg.expect | 8 ++++----
...r-implicit-negative-resource-affinity-rules.cfg.expect | 4 ++--
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 1d2ed1ed..7327ee08 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -236,9 +236,9 @@ __PACKAGE__->register_check(
my ($positiveid, $negativeid) = @$conflict;
push $errors->{$positiveid}->{resources}->@*,
- "rule shares two or more resources with '$negativeid'";
+ "rule shares two or more resources with a negative resource affinity rule";
push $errors->{$negativeid}->{resources}->@*,
- "rule shares two or more resources with '$positiveid'";
+ "rule shares two or more resources with a positive resource affinity rule";
}
},
);
diff --git a/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
index b0cde0f8..d4a2d7b2 100644
--- a/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
+++ b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
@@ -1,8 +1,8 @@
--- Log ---
-Drop rule 'keep-apart1', because rule shares two or more resources with 'stick-together1'.
-Drop rule 'keep-apart2', because rule shares two or more resources with 'stick-together1'.
-Drop rule 'stick-together1', because rule shares two or more resources with 'keep-apart1'.
-Drop rule 'stick-together1', because rule shares two or more resources with 'keep-apart2'.
+Drop rule 'keep-apart1', because rule shares two or more resources with a positive resource affinity rule.
+Drop rule 'keep-apart2', because rule shares two or more resources with a positive resource affinity rule.
+Drop rule 'stick-together1', because rule shares two or more resources with a negative resource affinity rule.
+Drop rule 'stick-together1', because rule shares two or more resources with a negative resource affinity rule.
--- Config ---
$VAR1 = {
'digest' => '50875b320034d8ac7dded185e590f5f87c4e2bb6',
diff --git a/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
index bcd368ab..09364d41 100644
--- a/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
+++ b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
@@ -1,6 +1,6 @@
--- Log ---
-Drop rule 'do-not-infer-inconsistent-negative2', because rule shares two or more resources with 'do-not-infer-inconsistent-positive1'.
-Drop rule 'do-not-infer-inconsistent-positive1', because rule shares two or more resources with 'do-not-infer-inconsistent-negative2'.
+Drop rule 'do-not-infer-inconsistent-negative2', because rule shares two or more resources with a positive resource affinity rule.
+Drop rule 'do-not-infer-inconsistent-positive1', because rule shares two or more resources with a negative resource affinity rule.
--- Config ---
$VAR1 = {
'digest' => 'd8724dfe2130bb642b98e021da973aa0ec0695f0',
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (5 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 06/12] rules: resource affinity: make message in inter-consistency check clearer Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail Daniel Kral
` (5 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Ignored resources should not be accounted for here, as they are treated
as if the HA Manager doesn't manage them at all.
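A minimal sketch of the added guards (lines taken from the diff below),
assuming `$ss` holds the per-resource status records as in the
surrounding code and `$csid` iterates over the co-affine resources:

    for my $csid (sort keys %$separate) {
        next if !defined($ss->{$csid});              # not (or no longer) an HA-managed resource
        next if $ss->{$csid}->{state} eq 'ignored';  # the HA Manager does not act on ignored resources
        # ... the existing node/target checks follow here ...
    }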
Reported-by: Michael Köppl <m.koeppl@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Config.pm | 2 ++
src/PVE/HA/Manager.pm | 4 ++++
2 files changed, 6 insertions(+)
diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 6de08650..53b55d0e 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -400,6 +400,8 @@ sub get_resource_motion_info {
next if $ns->{$node} ne 'online';
for my $csid (sort keys %$separate) {
+ next if !defined($ss->{$csid});
+ next if $ss->{$csid}->{state} eq 'ignored';
next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $node;
next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $node;
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 0be12061..ba59f642 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -420,6 +420,8 @@ sub execute_migration {
my ($together, $separate) = get_affinitive_resources($self->{rules}, $sid);
for my $csid (sort keys %$separate) {
+ next if !defined($ss->{$csid});
+ next if $ss->{$csid}->{state} eq 'ignored';
next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $target;
next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $target;
@@ -437,6 +439,8 @@ sub execute_migration {
my $resources_to_migrate = [];
for my $csid (sort keys %$together) {
+ next if !defined($ss->{$csid});
+ next if $ss->{$csid}->{state} eq 'ignored';
next if $ss->{$csid}->{node} && $ss->{$csid}->{node} eq $target;
next if $ss->{$csid}->{target} && $ss->{$csid}->{target} eq $target;
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (6 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 07/12] config, manager: do not check ignored resources with affinity when migrating Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules Daniel Kral
` (4 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
In the context of the HA Manager, resources' downtime is expected to be
minimized as much as possible. Therefore, it is more reasonable to try
other possible node placements if one or more of the HA resources of a
positive affinity rule fail, instead of putting the failed HA resources
in recovery.
This can be improved later to allow temporarily separated positive
affinity "groups", where the failed HA resource first tries to find a
node where it can start and the other HA resources are migrated there
afterwards, but this simpler heuristic is enough for the current feature.
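Roughly, the added filtering step can be pictured like this (a sketch
with made-up node sets; the interpretation of `$together` and
`$allowed_nodes` is an assumption based on their use in
apply_positive_resource_affinity):

    # $together: node(s) where the other resources of the positive rule
    # currently run (made-up example values)
    my $together      = { node2 => 1 };
    # $allowed_nodes: nodes the failed resource may still be started on;
    # node2 is missing because the resource already failed to start there
    my $allowed_nodes = { node1 => 1, node3 => 1 };

    # the loop added by this patch drops preferred nodes the failed
    # resource can no longer run on
    for my $node (keys %$together) {
        delete $together->{$node} if !$allowed_nodes->{$node};
    }

    # $together is now empty, so the positive affinity no longer pins the
    # failed resource to node2; it can relocate (e.g. to node1) and the
    # remaining resources of the rule are migrated after it, as the
    # updated test expectations below show.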
Reported-by: Hannes Dürr <h.duerr@proxmox.com>
Reported-by: Michael Köppl <m.koeppl@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Rules/ResourceAffinity.pm | 4 +++
.../README | 9 ++++---
.../log.expect | 26 ++++++++++++++++---
3 files changed, 31 insertions(+), 8 deletions(-)
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 7327ee08..9bc039ba 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -596,6 +596,10 @@ resource has not failed running there yet.
sub apply_positive_resource_affinity : prototype($$) {
my ($together, $allowed_nodes) = @_;
+ for my $node (keys %$together) {
+ delete $together->{$node} if !$allowed_nodes->{$node};
+ }
+
my @possible_nodes = sort keys $together->%*
or return; # nothing to do if there is no positive resource affinity
diff --git a/src/test/test-resource-affinity-strict-positive3/README b/src/test/test-resource-affinity-strict-positive3/README
index a270277b..804d1312 100644
--- a/src/test/test-resource-affinity-strict-positive3/README
+++ b/src/test/test-resource-affinity-strict-positive3/README
@@ -1,7 +1,8 @@
Test whether a strict positive resource affinity rule makes three resources
migrate to the same recovery node in case of a failover of their previously
assigned node. If one of those fail to start on the recovery node (e.g.
-insufficient resources), the failing resource will be kept on the recovery node.
+insufficient resources), all resources in the positive resource affinity rule
+will be migrated to another available recovery node.
The test scenario is:
- vm:101, vm:102, and fa:120002 must be kept together
@@ -12,6 +13,6 @@ The test scenario is:
The expected outcome is:
- As node3 fails, all resources are migrated to node2
-- Two of those resources will start successfully, but fa:120002 will stay in
- recovery, since it cannot be started on this node, but cannot be relocated to
- another one either due to the strict resource affinity rule
+- Two of those resources will start successfully, but fa:120002 will fail; as
+ there are other available nodes left where it can run, all resources in the
+ positive resource affinity rule are migrated to the next-best fitting node
diff --git a/src/test/test-resource-affinity-strict-positive3/log.expect b/src/test/test-resource-affinity-strict-positive3/log.expect
index 4a54cb3b..b5d7018f 100644
--- a/src/test/test-resource-affinity-strict-positive3/log.expect
+++ b/src/test/test-resource-affinity-strict-positive3/log.expect
@@ -82,8 +82,26 @@ info 263 node2/lrm: starting service fa:120002
warn 263 node2/lrm: unable to start service fa:120002
err 263 node2/lrm: unable to start service fa:120002 on local node after 1 retries
warn 280 node1/crm: starting service fa:120002 on node 'node2' failed, relocating service.
-warn 280 node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:120002', retry start on current node. Tried nodes: node2
-info 283 node2/lrm: starting service fa:120002
-info 283 node2/lrm: service status fa:120002 started
-info 300 node1/crm: relocation policy successful for 'fa:120002' on node 'node2', failed nodes: node2
+info 280 node1/crm: relocate service 'fa:120002' to node 'node1'
+info 280 node1/crm: service 'fa:120002': state changed from 'started' to 'relocate' (node = node2, target = node1)
+info 283 node2/lrm: service fa:120002 - start relocate to node 'node1'
+info 283 node2/lrm: service fa:120002 - end relocate to node 'node1'
+info 300 node1/crm: service 'fa:120002': state changed from 'relocate' to 'started' (node = node1)
+info 300 node1/crm: migrate service 'vm:101' to node 'node1' (running)
+info 300 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 300 node1/crm: migrate service 'vm:102' to node 'node1' (running)
+info 300 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 301 node1/lrm: starting service fa:120002
+info 301 node1/lrm: service status fa:120002 started
+info 303 node2/lrm: service vm:101 - start migrate to node 'node1'
+info 303 node2/lrm: service vm:101 - end migrate to node 'node1'
+info 303 node2/lrm: service vm:102 - start migrate to node 'node1'
+info 303 node2/lrm: service vm:102 - end migrate to node 'node1'
+info 320 node1/crm: relocation policy successful for 'fa:120002' on node 'node1', failed nodes: node2
+info 320 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
+info 320 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
+info 321 node1/lrm: starting service vm:101
+info 321 node1/lrm: service status vm:101 started
+info 321 node1/lrm: starting service vm:102
+info 321 node1/lrm: service status vm:102 started
info 720 hardware: exit simulation - done
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (7 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 08/12] rules: make positive affinity resources migrate on single resource fail Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases Daniel Kral
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
In preparation for the next patch, remove the overly restrictive
checker, which disallows using a resource in a node affinity rule and a
resource affinity rule at the same time.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
This could be squashed into the next patch, but I figured it makes for
better documentation to keep it as a separate patch.
src/PVE/HA/Rules.pm | 70 -----------
.../multiple-resource-refs-in-rules.cfg | 52 --------
...multiple-resource-refs-in-rules.cfg.expect | 111 ------------------
3 files changed, 233 deletions(-)
delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
delete mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index e2b77215..64dae1e4 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -475,74 +475,4 @@ sub get_next_ordinal : prototype($) {
return $current_order + 1;
}
-=head1 INTER-PLUGIN RULE CHECKERS
-
-=cut
-
-=head3 check_single_global_resource_reference($node_affinity_rules, $resource_affinity_rules)
-
-Returns all rules in C<$node_affinity_rules> and C<$resource_affinity_rules> as
-a list of lists, each consisting of the rule id and the resource id, where one
-of the resources is used in both a node affinity rule and resource affinity rule
-at the same time.
-
-If there are none, the returned list is empty.
-
-=cut
-
-sub check_single_global_resource_reference {
- my ($node_affinity_rules, $resource_affinity_rules) = @_;
-
- my @conflicts = ();
- my $resource_ruleids = {};
-
- while (my ($ruleid, $rule) = each %$node_affinity_rules) {
- for my $sid (keys $rule->{resources}->%*) {
- push $resource_ruleids->{$sid}->{node_affinity}->@*, $ruleid;
- }
- }
- while (my ($ruleid, $rule) = each %$resource_affinity_rules) {
- for my $sid (keys $rule->{resources}->%*) {
- push $resource_ruleids->{$sid}->{resource_affinity}->@*, $ruleid;
- }
- }
-
- for my $sid (keys %$resource_ruleids) {
- my $node_affinity_ruleids = $resource_ruleids->{$sid}->{node_affinity} // [];
- my $resource_affinity_ruleids = $resource_ruleids->{$sid}->{resource_affinity} // [];
-
- next if @$node_affinity_ruleids > 0 && !@$resource_affinity_ruleids;
- next if @$resource_affinity_ruleids > 0 && !@$node_affinity_ruleids;
-
- for my $ruleid (@$node_affinity_ruleids, @$resource_affinity_ruleids) {
- push @conflicts, [$ruleid, $sid];
- }
- }
-
- @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
- return \@conflicts;
-}
-
-__PACKAGE__->register_check(
- sub {
- my ($args) = @_;
-
- return check_single_global_resource_reference(
- $args->{node_affinity_rules},
- $args->{resource_affinity_rules},
- );
- },
- sub {
- my ($conflicts, $errors) = @_;
-
- for my $conflict (@$conflicts) {
- my ($ruleid, $sid) = @$conflict;
-
- push $errors->{$ruleid}->{resources}->@*,
- "resource '$sid' cannot be used in both a node affinity rule"
- . " and a resource affinity rule at the same time";
- }
- },
-);
-
1;
diff --git a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg b/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
deleted file mode 100644
index 6608a5c3..00000000
--- a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg
+++ /dev/null
@@ -1,52 +0,0 @@
-# Case 1: Do not remove node/resource affinity rules, which do not share resources between these types.
-node-affinity: different-resource1
- resources vm:101,vm:102,vm:103
- nodes node1,node2:2
- strict 0
-
-resource-affinity: different-resource2
- resources vm:104,vm:105
- affinity positive
-
-node-affinity: different-resource3
- resources vm:106
- nodes node1,node2:2
- strict 1
-
-resource-affinity: different-resource4
- resources vm:107,vm:109
- affinity negative
-
-# Case 2: Remove rules, which share the same resource(s) between different rule types.
-node-affinity: same-resource1
- resources vm:201
- nodes node1,node2:2
- strict 0
-
-resource-affinity: same-resource2
- resources vm:201,vm:205
- affinity negative
-
-resource-affinity: same-resource3
- resources vm:201,vm:203,vm:204
- affinity negative
-
-node-affinity: same-resource4
- resources vm:205,vm:206,vm:207
- nodes node1:2,node3:3
- strict 1
-
-# Case 3: Do not remove rules, which do not share resources between them.
-node-affinity: other-different-resource1
- resources vm:301,vm:308
- nodes node1,node2:2
- strict 0
-
-resource-affinity: other-different-resource2
- resources vm:302,vm:304,vm:305
- affinity positive
-
-node-affinity: other-different-resource3
- resources vm:303,vm:306,vm:309
- nodes node1,node2:2
- strict 1
diff --git a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect b/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
deleted file mode 100644
index 972c042d..00000000
--- a/src/test/rules_cfgs/multiple-resource-refs-in-rules.cfg.expect
+++ /dev/null
@@ -1,111 +0,0 @@
---- Log ---
-Drop rule 'same-resource1', because resource 'vm:201' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource2', because resource 'vm:201' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource2', because resource 'vm:205' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource3', because resource 'vm:201' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
-Drop rule 'same-resource4', because resource 'vm:205' cannot be used in both a node affinity rule and a resource affinity rule at the same time.
---- Config ---
-$VAR1 = {
- 'digest' => 'fcbdf84d442d38b4c901d989c211fb62024c5515',
- 'ids' => {
- 'different-resource1' => {
- 'nodes' => {
- 'node1' => {
- 'priority' => 0
- },
- 'node2' => {
- 'priority' => 2
- }
- },
- 'resources' => {
- 'vm:101' => 1,
- 'vm:102' => 1,
- 'vm:103' => 1
- },
- 'strict' => 0,
- 'type' => 'node-affinity'
- },
- 'different-resource2' => {
- 'affinity' => 'positive',
- 'resources' => {
- 'vm:104' => 1,
- 'vm:105' => 1
- },
- 'type' => 'resource-affinity'
- },
- 'different-resource3' => {
- 'nodes' => {
- 'node1' => {
- 'priority' => 0
- },
- 'node2' => {
- 'priority' => 2
- }
- },
- 'resources' => {
- 'vm:106' => 1
- },
- 'strict' => 1,
- 'type' => 'node-affinity'
- },
- 'different-resource4' => {
- 'affinity' => 'negative',
- 'resources' => {
- 'vm:107' => 1,
- 'vm:109' => 1
- },
- 'type' => 'resource-affinity'
- },
- 'other-different-resource1' => {
- 'nodes' => {
- 'node1' => {
- 'priority' => 0
- },
- 'node2' => {
- 'priority' => 2
- }
- },
- 'resources' => {
- 'vm:301' => 1,
- 'vm:308' => 1
- },
- 'strict' => 0,
- 'type' => 'node-affinity'
- },
- 'other-different-resource2' => {
- 'affinity' => 'positive',
- 'resources' => {
- 'vm:302' => 1,
- 'vm:304' => 1,
- 'vm:305' => 1
- },
- 'type' => 'resource-affinity'
- },
- 'other-different-resource3' => {
- 'nodes' => {
- 'node1' => {
- 'priority' => 0
- },
- 'node2' => {
- 'priority' => 2
- }
- },
- 'resources' => {
- 'vm:303' => 1,
- 'vm:306' => 1,
- 'vm:309' => 1
- },
- 'strict' => 1,
- 'type' => 'node-affinity'
- }
- },
- 'order' => {
- 'different-resource1' => 1,
- 'different-resource2' => 2,
- 'different-resource3' => 3,
- 'different-resource4' => 4,
- 'other-different-resource1' => 9,
- 'other-different-resource2' => 10,
- 'other-different-resource3' => 11
- }
- };
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (8 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 09/12] rules: allow same resources in node and resource affinity rules Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases Daniel Kral
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Add inter-plugin checks and helpers, which allow resources to be used in
node affinity rules and resource affinity rules at the same time, as
long as the following conditions are met:
- the resources of a resource affinity rule are not part of any node
affinity rule that has multiple priority groups; this is because of the
dynamic nature of priority groups.
- the resources of a positive resource affinity rule are part of at most
one node affinity rule; otherwise, it is not easily decidable (yet) what
the common node restrictions are.
- positive resource affinity rules, which have at least one resource
that is part of a node affinity rule, make all of their resources part
of that node affinity rule.
- the resources of a negative resource affinity rule are not restricted
by their node affinity rules in such a way that there are not enough
nodes left to keep them separate.
Additionally, the resources of a positive resource affinity rule, which
are also part of at most a single node affinity rule, are added to that
node affinity rule (see the illustrative config sketch below).
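As a quick illustration of an accepted and a rejected combination
(adapted from the test cases added later in this series):

    # accepted: the two negatively affine resources are pinned to two
    # different nodes, so they can still be kept apart
    node-affinity: vm201-must-be-on-node1
        resources vm:201
        nodes node1
        strict 1

    node-affinity: vm202-must-be-on-node2
        resources vm:202
        nodes node2
        strict 1

    resource-affinity: vm201-vm202-must-be-kept-separate
        resources vm:201,vm:202
        affinity negative

    # rejected: both negatively affine resources are pinned to the same
    # single node, so the rules cannot be fulfilled and all three
    # involved rules are dropped
    node-affinity: vm401-must-be-on-node1
        resources vm:401
        nodes node1
        strict 1

    node-affinity: vm402-must-be-on-node1
        resources vm:402
        nodes node1
        strict 1

    resource-affinity: vm401-vm402-must-be-kept-separate
        resources vm:401,vm:402
        affinity negative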
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Rules.pm | 281 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 281 insertions(+)
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index 64dae1e4..323ad038 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -410,6 +410,8 @@ sub canonicalize : prototype($$$) {
next if $@; # plugin doesn't implement plugin_canonicalize(...)
}
+ $class->global_canonicalize($rules);
+
return $messages;
}
@@ -475,4 +477,283 @@ sub get_next_ordinal : prototype($) {
return $current_order + 1;
}
+=head1 INTER-PLUGIN RULE CHECKERS
+
+=cut
+
+my $has_multiple_priorities = sub {
+ my ($node_affinity_rule) = @_;
+
+ my $priority;
+ for my $node (values $node_affinity_rule->{nodes}->%*) {
+ $priority = $node->{priority} if !defined($priority);
+
+ return 1 if $priority != $node->{priority};
+ }
+};
+
+=head3 check_single_priority_node_affinity_in_resource_affinity_rules(...)
+
+Returns all rules in C<$resource_affinity_rules> and C<$node_affinity_rules> as
+a list of lists, each consisting of the rule type and rule id, where at least
+one resource in a resource affinity rule is in a node affinity rule which has
+multiple priority groups defined.
+
+That is, the resource affinity rule cannot be statically checked to be feasible
+as the selection of the priority group is dependent on the currently online
+nodes.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_single_priority_node_affinity_in_resource_affinity_rules {
+ my ($resource_affinity_rules, $node_affinity_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($resource_affinity_id, $resource_affinity_rule) = each %$resource_affinity_rules) {
+ my $has_conflicts;
+ my $resources = $resource_affinity_rule->{resources};
+ my @paired_node_affinity_rules = ();
+
+ for my $node_affinity_id (keys %$node_affinity_rules) {
+ my $node_affinity_rule = $node_affinity_rules->{$node_affinity_id};
+
+ next if sets_are_disjoint($resources, $node_affinity_rule->{resources});
+
+ $has_conflicts = $has_multiple_priorities->($node_affinity_rule)
+ if !$has_conflicts;
+
+ push @paired_node_affinity_rules, $node_affinity_id;
+ }
+ if ($has_conflicts) {
+ push @conflicts, ['resource-affinity', $resource_affinity_id];
+ push @conflicts, ['node-affinity', $_] for @paired_node_affinity_rules;
+ }
+ }
+
+ @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+ return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_single_priority_node_affinity_in_resource_affinity_rules(
+ $args->{resource_affinity_rules},
+ $args->{node_affinity_rules},
+ );
+ },
+ sub {
+ my ($conflicts, $errors) = @_;
+
+ for my $conflict (@$conflicts) {
+ my ($type, $ruleid) = @$conflict;
+
+ if ($type eq 'node-affinity') {
+ push $errors->{$ruleid}->{resources}->@*,
+ "resources are in a resource affinity rule and cannot be in"
+ . " a node affinity rule with multiple priorities";
+ } elsif ($type eq 'resource-affinity') {
+ push $errors->{$ruleid}->{resources}->@*,
+ "resources are in node affinity rules with multiple priorities";
+ }
+ }
+ },
+);
+
+=head3 check_single_node_affinity_per_positive_resource_affinity_rule(...)
+
+Returns all rules in C<$positive_rules> and C<$node_affinity_rules> as a list of
+lists, each consisting of the rule type and rule id, where one of the
+resources is used in a positive resource affinity rule and more than one node
+affinity rule.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_single_node_affinity_per_positive_resource_affinity_rule {
+ my ($positive_rules, $node_affinity_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($positiveid, $positive_rule) = each %$positive_rules) {
+ my $positive_resources = $positive_rule->{resources};
+ my @paired_node_affinity_rules = ();
+
+ while (my ($node_affinity_id, $node_affinity_rule) = each %$node_affinity_rules) {
+ next if sets_are_disjoint($positive_resources, $node_affinity_rule->{resources});
+
+ push @paired_node_affinity_rules, $node_affinity_id;
+ }
+ if (@paired_node_affinity_rules > 1) {
+ push @conflicts, ['positive', $positiveid];
+ push @conflicts, ['node-affinity', $_] for @paired_node_affinity_rules;
+ }
+ }
+
+ @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+ return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_single_node_affinity_per_positive_resource_affinity_rule(
+ $args->{positive_rules},
+ $args->{node_affinity_rules},
+ );
+ },
+ sub {
+ my ($conflicts, $errors) = @_;
+
+ for my $conflict (@$conflicts) {
+ my ($type, $ruleid) = @$conflict;
+
+ if ($type eq 'positive') {
+ push $errors->{$ruleid}->{resources}->@*,
+ "resources are in multiple node affinity rules";
+ } elsif ($type eq 'node-affinity') {
+ push $errors->{$ruleid}->{resources}->@*,
+ "at least one resource is in a positive resource affinity"
+ . " rule and there are other resources in at least one"
+ . " other node affinity rule already";
+ }
+ }
+ },
+);
+
+=head3 check_negative_resource_affinity_node_affinity_consistency(...)
+
+Returns all rules in C<$negative_rules> and C<$node_affinity_rules> as a list
+of lists, each consisting of the rule type and rule id, where the resources in
+the negative resource affinity rule are restricted to fewer nodes than needed
+to keep them separate by their node affinity rules.
+
+That is, the negative resource affinity rule cannot be fulfilled as there are
+not enough nodes to spread the resources on.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_negative_resource_affinity_node_affinity_consistency {
+ my ($negative_rules, $node_affinity_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($negativeid, $negative_rule) = each %$negative_rules) {
+ my $allowed_nodes = {};
+ my $located_resources;
+ my $resources = $negative_rule->{resources};
+ my @paired_node_affinity_rules = ();
+
+ for my $node_affinity_id (keys %$node_affinity_rules) {
+ my ($node_affinity_resources, $node_affinity_nodes) =
+ $node_affinity_rules->{$node_affinity_id}->@{qw(resources nodes)};
+ my $common_resources = set_intersect($resources, $node_affinity_resources);
+
+ next if keys %$common_resources < 1;
+
+ $located_resources = set_union($located_resources, $common_resources);
+ $allowed_nodes = set_union($allowed_nodes, $node_affinity_nodes);
+
+ push @paired_node_affinity_rules, $node_affinity_id;
+ }
+ if (keys %$allowed_nodes < keys %$located_resources) {
+ push @conflicts, ['negative', $negativeid];
+ push @conflicts, ['node-affinity', $_] for @paired_node_affinity_rules;
+ }
+ }
+
+ @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+ return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_negative_resource_affinity_node_affinity_consistency(
+ $args->{negative_rules},
+ $args->{node_affinity_rules},
+ );
+ },
+ sub {
+ my ($conflicts, $errors) = @_;
+
+ for my $conflict (@$conflicts) {
+ my ($type, $ruleid) = @$conflict;
+
+ if ($type eq 'negative') {
+ push $errors->{$ruleid}->{resources}->@*,
+ "two or more resources are restricted to less nodes than"
+ . " available to the resources";
+ } elsif ($type eq 'node-affinity') {
+ push $errors->{$ruleid}->{resources}->@*,
+ "at least one resource is in a negative resource affinity"
+ . " rule and this rule would restrict these to less nodes"
+ . " than available to the resources";
+ }
+ }
+ },
+);
+
+=head1 INTER-PLUGIN RULE CANONICALIZATION HELPERS
+
+=cut
+
+=head3 create_implicit_positive_resource_affinity_node_affinity_rules(...)
+
+Modifies C<$rules> such that for every positive resource affinity rule, defined
+in C<$positive_rules>, where at least one of its resources is also in a node
+affinity rule, defined in C<$node_affinity_rules>, all of the positive resource
+affinity rule's other resources become part of that node affinity rule as well.
+
+This helper assumes that there can only be a single node affinity rule per
+positive resource affinity rule, as there is no heuristic yet for what should
+be done in the case of multiple node affinity rules.
+
+This also makes it cheaper to infer these implicit constraints later instead of
+propagating that information in each scheduler invocation.
+
+=cut
+
+sub create_implicit_positive_resource_affinity_node_affinity_rules {
+ my ($rules, $positive_rules, $node_affinity_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($positiveid, $positive_rule) = each %$positive_rules) {
+ my $found_node_affinity_id;
+ my $positive_resources = $positive_rule->{resources};
+
+ for my $node_affinity_id (keys %$node_affinity_rules) {
+ my $node_affinity_rule = $rules->{ids}->{$node_affinity_id};
+ next if sets_are_disjoint($positive_resources, $node_affinity_rule->{resources});
+
+ # assuming that all $resources have at most one node affinity rule,
+ # take the first found node affinity rule.
+ $node_affinity_rule->{resources}->{$_} = 1 for keys %$positive_resources;
+ last;
+ }
+ }
+}
+
+sub global_canonicalize {
+ my ($class, $rules) = @_;
+
+ my $args = $class->get_check_arguments($rules);
+
+ create_implicit_positive_resource_affinity_node_affinity_rules(
+ $rules,
+ $args->{positive_rules},
+ $args->{node_affinity_rules},
+ );
+}
+
1;
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (9 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 10/12] rules: restrict inter-plugin resource references to simple cases Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules Daniel Kral
2025-08-01 17:36 ` [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Thomas Lamprecht
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Add test cases to verify that the rule checkers correctly identify and
remove conflicting HA node and resource affinity rules to make the rule
set feasible. The added test cases verify that:
- the resources of a resource affinity rule are not part of any node
affinity rule that has multiple priority groups; this is because of the
dynamic nature of priority groups.
- the resources of a positive resource affinity rule are part of at most
one node affinity rule; otherwise, it is not easily decidable (yet) what
the common node restrictions are.
- positive resource affinity rules, which have at least one resource
that is part of a node affinity rule, make all of their resources part
of that node affinity rule.
- the resources of a negative resource affinity rule are not restricted
by their node affinity rules in such a way that there are not enough
nodes left to keep them separate.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
...onsistent-node-resource-affinity-rules.cfg | 54 +++++++++
...nt-node-resource-affinity-rules.cfg.expect | 73 ++++++++++++
...y-for-positive-resource-affinity-rules.cfg | 37 ++++++
...ositive-resource-affinity-rules.cfg.expect | 111 ++++++++++++++++++
...-affinity-with-resource-affinity-rules.cfg | 35 ++++++
...ty-with-resource-affinity-rules.cfg.expect | 48 ++++++++
6 files changed, 358 insertions(+)
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
diff --git a/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
new file mode 100644
index 00000000..88e6dd0e
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
@@ -0,0 +1,54 @@
+# Case 1: Do not remove a positive resource affinity rule, where there is exactly one node to keep them together.
+node-affinity: vm101-vm102-must-be-on-node1
+ resources vm:101,vm:102
+ nodes node1
+ strict 1
+
+resource-affinity: vm101-vm102-must-be-kept-together
+ resources vm:101,vm:102
+ affinity positive
+
+# Case 2: Do not remove a negative resource affinity rule, where there are exactly enough nodes available to keep them apart.
+node-affinity: vm201-must-be-on-node1
+ resources vm:201
+ nodes node1
+ strict 1
+
+node-affinity: vm202-must-be-on-node2
+ resources vm:202
+ nodes node2
+ strict 1
+
+resource-affinity: vm201-vm202-must-be-kept-separate
+ resources vm:201,vm:202
+ affinity negative
+
+# Case 3: Remove positive resource affinity rules, where two resources are restricted to different nodes.
+node-affinity: vm301-must-be-on-node1
+ resources vm:301
+ nodes node1
+ strict 1
+
+node-affinity: vm301-must-be-on-node2
+ resources vm:302
+ nodes node2
+ strict 1
+
+resource-affinity: vm301-vm302-must-be-kept-together
+ resources vm:301,vm:302
+ affinity positive
+
+# Case 4: Remove negative resource affinity rules, where two resources are restricted to less nodes than needed to keep them apart.
+node-affinity: vm401-must-be-on-node1
+ resources vm:401
+ nodes node1
+ strict 1
+
+node-affinity: vm402-must-be-on-node1
+ resources vm:402
+ nodes node1
+ strict 1
+
+resource-affinity: vm401-vm402-must-be-kept-separate
+ resources vm:401,vm:402
+ affinity negative
diff --git a/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
new file mode 100644
index 00000000..e12242ab
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
@@ -0,0 +1,73 @@
+--- Log ---
+Drop rule 'vm301-must-be-on-node1', because at least one resource is in a positive resource affinity rule and there are other resources in at least one other node affinity rule already.
+Drop rule 'vm301-must-be-on-node2', because at least one resource is in a positive resource affinity rule and there are other resources in at least one other node affinity rule already.
+Drop rule 'vm301-vm302-must-be-kept-together', because resources are in multiple node affinity rules.
+Drop rule 'vm401-must-be-on-node1', because at least one resource is in a negative resource affinity rule and this rule would restrict these to less nodes than available to the resources.
+Drop rule 'vm401-vm402-must-be-kept-separate', because two or more resources are restricted to less nodes than available to the resources.
+Drop rule 'vm402-must-be-on-node1', because at least one resource is in a negative resource affinity rule and this rule would restrict these to less nodes than available to the resources.
+--- Config ---
+$VAR1 = {
+ 'digest' => 'a5d782a442bbe3bf3a4d088db82a575b382a53fe',
+ 'ids' => {
+ 'vm101-vm102-must-be-kept-together' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'vm101-vm102-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm201-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:201' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm201-vm202-must-be-kept-separate' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:202' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'vm202-must-be-on-node2' => {
+ 'nodes' => {
+ 'node2' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:202' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ }
+ },
+ 'order' => {
+ 'vm101-vm102-must-be-kept-together' => 2,
+ 'vm101-vm102-must-be-on-node1' => 1,
+ 'vm201-must-be-on-node1' => 3,
+ 'vm201-vm202-must-be-kept-separate' => 5,
+ 'vm202-must-be-on-node2' => 4
+ }
+ };
diff --git a/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
new file mode 100644
index 00000000..ebbf5e63
--- /dev/null
+++ b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg
@@ -0,0 +1,37 @@
+# Case 1: Do not change any node affinity rules, if there are no resources of a positive resource affinity rule in any node affinity rules.
+resource-affinity: do-not-infer-positive1
+ resources vm:101,vm:102,vm:103
+ affinity positive
+
+# Case 2: Do not change any node affinity rules for node affinity rules of resources in a negative resource affinity rule.
+node-affinity: do-not-infer-negative1
+ resources vm:203
+ nodes node1,node2
+ strict 1
+
+node-affinity: do-not-infer-negative2
+ resources vm:201
+ nodes node3
+
+resource-affinity: do-not-infer-negative3
+ resources vm:201,vm:203
+ affinity negative
+
+# Case 2: Add two resources, which are not part of the node affinity rule of another resource in a positive resource affinity rule, to the node affinity rule.
+node-affinity: infer-single-resource1
+ resources vm:302
+ nodes node3
+
+resource-affinity: infer-single-resource2
+ resources vm:301,vm:302,vm:303
+ affinity positive
+
+# Case 3: Add one resource, which is not part of the node affinity rule of the other resources in a positive resource affinity rule, to the node affinity rule.
+node-affinity: infer-multi-resources1
+ resources vm:402,vm:404,vm:405
+ nodes node1,node3
+ strict 1
+
+resource-affinity: infer-multi-resources2
+ resources vm:401,vm:402,vm:403,vm:404
+ affinity positive
diff --git a/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
new file mode 100644
index 00000000..33c56c62
--- /dev/null
+++ b/src/test/rules_cfgs/infer-node-affinity-for-positive-resource-affinity-rules.cfg.expect
@@ -0,0 +1,111 @@
+--- Log ---
+--- Config ---
+$VAR1 = {
+ 'digest' => '32ae135ef2f8bd84cd12c18af6910dce9d6bc9fa',
+ 'ids' => {
+ 'do-not-infer-negative1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ },
+ 'node2' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:203' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'do-not-infer-negative2' => {
+ 'nodes' => {
+ 'node3' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:201' => 1
+ },
+ 'type' => 'node-affinity'
+ },
+ 'do-not-infer-negative3' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:203' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-infer-positive1' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1,
+ 'vm:103' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-multi-resources1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ },
+ 'node3' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:401' => 1,
+ 'vm:402' => 1,
+ 'vm:403' => 1,
+ 'vm:404' => 1,
+ 'vm:405' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'infer-multi-resources2' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:401' => 1,
+ 'vm:402' => 1,
+ 'vm:403' => 1,
+ 'vm:404' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-single-resource1' => {
+ 'nodes' => {
+ 'node3' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:302' => 1,
+ 'vm:303' => 1
+ },
+ 'type' => 'node-affinity'
+ },
+ 'infer-single-resource2' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:302' => 1,
+ 'vm:303' => 1
+ },
+ 'type' => 'resource-affinity'
+ }
+ },
+ 'order' => {
+ 'do-not-infer-negative1' => 2,
+ 'do-not-infer-negative2' => 3,
+ 'do-not-infer-negative3' => 4,
+ 'do-not-infer-positive1' => 1,
+ 'infer-multi-resources1' => 7,
+ 'infer-multi-resources2' => 8,
+ 'infer-single-resource1' => 5,
+ 'infer-single-resource2' => 6
+ }
+ };
diff --git a/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
new file mode 100644
index 00000000..7fb9cdd3
--- /dev/null
+++ b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
@@ -0,0 +1,35 @@
+# Case 1: Remove resource affinity rules, where there is a loose Node Affinity rule with multiple priority groups set for the nodes.
+node-affinity: vm101-vm102-should-be-on-node1-or-node2
+ resources vm:101,vm:102
+ nodes node1:1,node2:2
+ strict 0
+
+resource-affinity: vm101-vm102-must-be-kept-separate
+ resources vm:101,vm:102
+ affinity negative
+
+# Case 2: Remove resource affinity rules, where there is a strict Node Affinity rule with multiple priority groups set for the nodes.
+node-affinity: vm201-vm202-must-be-on-node1-or-node2
+ resources vm:201,vm:202
+ nodes node1:1,node2:2
+ strict 1
+
+resource-affinity: vm201-vm202-must-be-kept-together
+ resources vm:201,vm:202
+ affinity positive
+
+# Case 3: Do not remove the resource affinity rule, if there is only one priority group in each node affinity rule for the ha
+# resource affinity rule's resources.
+node-affinity: vm301-must-be-on-node1-with-prio-1
+ resources vm:301
+ nodes node1:1
+ strict 1
+
+node-affinity: vm302-must-be-on-node2-with-prio-2
+ resources vm:302
+ nodes node2:2
+ strict 1
+
+resource-affinity: vm301-vm302-must-be-kept-together
+ resources vm:301,vm:302
+ affinity negative
diff --git a/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
new file mode 100644
index 00000000..92d12929
--- /dev/null
+++ b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
@@ -0,0 +1,48 @@
+--- Log ---
+Drop rule 'vm101-vm102-must-be-kept-separate', because resources are in node affinity rules with multiple priorities.
+Drop rule 'vm101-vm102-should-be-on-node1-or-node2', because resources are in a resource affinity rule and cannot be in a node affinity rule with multiple priorities.
+Drop rule 'vm201-vm202-must-be-kept-together', because resources are in node affinity rules with multiple priorities.
+Drop rule 'vm201-vm202-must-be-on-node1-or-node2', because resources are in a resource affinity rule and cannot be in a node affinity rule with multiple priorities.
+--- Config ---
+$VAR1 = {
+ 'digest' => '722a98914555f296af0916c980a9d6c780f5f072',
+ 'ids' => {
+ 'vm301-must-be-on-node1-with-prio-1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 1
+ }
+ },
+ 'resources' => {
+ 'vm:301' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm301-vm302-must-be-kept-together' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:302' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'vm302-must-be-on-node2-with-prio-2' => {
+ 'nodes' => {
+ 'node2' => {
+ 'priority' => 2
+ }
+ },
+ 'resources' => {
+ 'vm:302' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ }
+ },
+ 'order' => {
+ 'vm301-must-be-on-node1-with-prio-1' => 5,
+ 'vm301-vm302-must-be-kept-together' => 7,
+ 'vm302-must-be-on-node2-with-prio-2' => 6
+ }
+ };
--
2.47.2
* [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (10 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases Daniel Kral
@ 2025-08-01 16:22 ` Daniel Kral
2025-08-01 17:36 ` [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Thomas Lamprecht
12 siblings, 0 replies; 14+ messages in thread
From: Daniel Kral @ 2025-08-01 16:22 UTC (permalink / raw)
To: pve-devel
Add test cases for scenarios where node affinity and positive/negative
resource affinity rules are applied together.
For positive resource affinity rules, node affinity rules always take
precedence, even if all or most of the resources in the resource
affinity rule are already on another node that contradicts the node
affinity rule.
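For example, with a configuration of the following shape (mirroring the
test-resource-affinity-with-node-affinity-strict-positive1 case added
below), all three resources end up on node2, even though all of them are
currently running on node3:

    node-affinity: vm102-must-be-on-node2
        resources vm:102
        nodes node2

    resource-affinity: vms-must-stick-together
        resources vm:101,vm:102,vm:103
        affinity positive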
For negative resource affinity rules, node affinity rules take
precedence where it is possible to do so. Currently, there are still
cases which need manual intervention. These should be handled
automatically in the future by providing more information to the
scheduler.
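One such case is exercised by the
test-resource-affinity-with-node-affinity-strict-negative3 case added
below, roughly:

    node-affinity: vm101-must-be-on-node2
        resources vm:101
        nodes node2

    node-affinity: vm102-must-be-on-node3
        resources vm:102
        nodes node3

    node-affinity: vm103-must-be-on-node1
        resources vm:103
        nodes node1

    resource-affinity: lonely-must-vms-be
        resources vm:101,vm:102,vm:103
        affinity negative

With vm:101, vm:102, and vm:103 running on node1, node2, and node3
respectively, each preferred node is already occupied by another
resource of the same negative affinity rule, so the cycle can only be
broken by manually stopping one of the resources.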
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
.../README | 14 ++++
.../cmdlist | 3 +
.../hardware_status | 5 ++
.../log.expect | 41 ++++++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 5 ++
.../README | 13 ++++
.../cmdlist | 3 +
.../hardware_status | 7 ++
.../log.expect | 63 +++++++++++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 5 ++
.../README | 17 +++++
.../cmdlist | 6 ++
.../hardware_status | 6 ++
.../log.expect | 67 +++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 15 +++++
.../service_config | 5 ++
.../README | 15 +++++
.../cmdlist | 3 +
.../hardware_status | 5 ++
.../log.expect | 49 ++++++++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 5 ++
.../README | 15 +++++
.../cmdlist | 3 +
.../hardware_status | 5 ++
.../log.expect | 49 ++++++++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 5 ++
35 files changed, 462 insertions(+)
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
create mode 100644 src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/README b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
new file mode 100644
index 00000000..5f68bbb8
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/README
@@ -0,0 +1,14 @@
+Test whether a strict negative resource affinity rule among three resources,
+where two of the resources currently contradict the rule and one of them
+should be on a specific node according to its node affinity rule, makes the
+resource in the node affinity rule migrate to its preferred node, if possible.
+
+The test scenario is:
+- vm:102 should be on node2
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101 is currently on node3
+- vm:102 and vm:103 are both on node1, even though they must be kept separate
+
+The expected outcome is:
+- As vm:102 and vm:103 are still on the same node, make vm:102 migrate to node2
+ to fulfill its node affinity rule
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
new file mode 100644
index 00000000..216aeb66
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/log.expect
@@ -0,0 +1,41 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: migrate service 'vm:102' to node 'node2' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: service vm:102 - start migrate to node 'node2'
+info 21 node1/lrm: service vm:102 - end migrate to node 'node2'
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
+info 43 node2/lrm: starting service vm:102
+info 43 node2/lrm: service status vm:102 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
new file mode 100644
index 00000000..be874144
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm102-must-be-on-node2
+ resources vm:102
+ nodes node2,node3
+
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
new file mode 100644
index 00000000..b98edc85
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative1/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/README b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
new file mode 100644
index 00000000..e2de70fb
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/README
@@ -0,0 +1,13 @@
+Test whether the resources of a strict negative resource affinity rule among
+three resources, which are all in a node affinity rule restricting them to
+three nodes, are migrated to these nodes while all three resources are still
+on a common node outside of that node set.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 should be on node2, node3 or node4
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are currently on node1
+
+The expected outcome is:
+- As vm:101, vm:102, and vm:103 are still on the same node and should be on
+ node2, node3 or node4, migrate them to node2, node3, and node4 respectively
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
new file mode 100644
index 00000000..8cdc6092
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
new file mode 100644
index 00000000..7b8e961e
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/hardware_status
@@ -0,0 +1,7 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" },
+ "node4": { "power": "off", "network": "off" },
+ "node5": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
new file mode 100644
index 00000000..3e75c6bf
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/log.expect
@@ -0,0 +1,63 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node4 on
+info 20 node4/crm: status change startup => wait_for_quorum
+info 20 node4/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node5 on
+info 20 node5/crm: status change startup => wait_for_quorum
+info 20 node5/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info 20 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 20 node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 20 node1/crm: migrate service 'vm:103' to node 'node4' (running)
+info 20 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node4)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 21 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 21 node1/lrm: service vm:102 - start migrate to node 'node3'
+info 21 node1/lrm: service vm:102 - end migrate to node 'node3'
+info 21 node1/lrm: service vm:103 - start migrate to node 'node4'
+info 21 node1/lrm: service vm:103 - end migrate to node 'node4'
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 26 node4/crm: status change wait_for_quorum => slave
+info 27 node4/lrm: got lock 'ha_agent_node4_lock'
+info 27 node4/lrm: status change wait_for_agent_lock => active
+info 28 node5/crm: status change wait_for_quorum => slave
+info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node4)
+info 43 node2/lrm: starting service vm:101
+info 43 node2/lrm: service status vm:101 started
+info 45 node3/lrm: starting service vm:102
+info 45 node3/lrm: service status vm:102 started
+info 47 node4/lrm: starting service vm:103
+info 47 node4/lrm: service status vm:103 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
new file mode 100644
index 00000000..b84d7702
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vms-must-be-on-subcluster
+ resources vm:101,vm:102,vm:103
+ nodes node2,node3,node4
+
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
new file mode 100644
index 00000000..57e3579d
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative2/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/README b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
new file mode 100644
index 00000000..588f9020
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/README
@@ -0,0 +1,17 @@
+Test whether the resources of a strict negative resource affinity rule among
+three resources, which are each restricted to a node they are not yet on, can
+be exchanged to the nodes described by their node affinity rules once one of
+the resources is stopped.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 should be on node2, node3, and node1, respectively
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are currently on node1, node2, and node3, respectively
+
+The expected outcome is:
+- the resources can neither be manually migrated nor automatically exchange
+  their nodes to match their node affinity rules, because of the strict
+  condition that a resource cannot be migrated to a node where another
+  resource with negative affinity is located or is being migrated to
+- therefore, one of the resources must be stopped manually to allow the
+  rearrangement to fulfill the node affinity rules
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
new file mode 100644
index 00000000..2f2c80f5
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/cmdlist
@@ -0,0 +1,6 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "service vm:103 migrate node1" ],
+ [ "service vm:101 stopped" ],
+ [ "service vm:101 started" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
new file mode 100644
index 00000000..4aed08a1
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/hardware_status
@@ -0,0 +1,6 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" },
+ "node4": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
new file mode 100644
index 00000000..1ed34c36
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/log.expect
@@ -0,0 +1,67 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+info 120 cmdlist: execute service vm:103 migrate node1
+err 120 node1/crm: crm command 'migrate vm:103 node1' error - service 'vm:101' on node 'node1' in negative affinity with service 'vm:103'
+info 220 cmdlist: execute service vm:101 stopped
+info 220 node1/crm: service 'vm:101': state changed from 'started' to 'request_stop'
+info 221 node1/lrm: stopping service vm:101
+info 221 node1/lrm: service status vm:101 stopped
+info 240 node1/crm: service 'vm:101': state changed from 'request_stop' to 'stopped'
+info 240 node1/crm: migrate service 'vm:103' to node 'node1' (running)
+info 240 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 245 node3/lrm: service vm:103 - start migrate to node 'node1'
+info 245 node3/lrm: service vm:103 - end migrate to node 'node1'
+info 260 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
+info 260 node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info 260 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 261 node1/lrm: starting service vm:103
+info 261 node1/lrm: service status vm:103 started
+info 263 node2/lrm: service vm:102 - start migrate to node 'node3'
+info 263 node2/lrm: service vm:102 - end migrate to node 'node3'
+info 280 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 285 node3/lrm: starting service vm:102
+info 285 node3/lrm: service status vm:102 started
+info 320 cmdlist: execute service vm:101 started
+info 320 node1/crm: service 'vm:101': state changed from 'stopped' to 'request_start' (node = node1)
+info 320 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 320 node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info 320 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 321 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 321 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 340 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 343 node2/lrm: starting service vm:101
+info 343 node2/lrm: service status vm:101 started
+info 920 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
new file mode 100644
index 00000000..2362d220
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/rules_config
@@ -0,0 +1,15 @@
+node-affinity: vm101-must-be-on-node2
+ resources vm:101
+ nodes node2
+
+node-affinity: vm102-must-be-on-node3
+ resources vm:102
+ nodes node3
+
+node-affinity: vm103-must-be-on-node1
+ resources vm:103
+ nodes node1
+
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
new file mode 100644
index 00000000..4b26f6b4
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-negative3/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node2", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/README b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
new file mode 100644
index 00000000..2202f5a3
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/README
@@ -0,0 +1,15 @@
+Test whether a strict positive resource affinity rule among three resources,
+where one of these resources is restricted by a node affinity rule to a node
+other than the one they are currently on, makes all resources migrate to that
+node.
+
+The test scenario is:
+- vm:102 should be kept on node2
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are currently running on node3
+
+The expected outcome is:
+- As vm:102 is on node3, which contradicts its node affinity rule, vm:102 is
+  migrated to node2 to fulfill its node affinity rule
+- As vm:102 is in a positive resource affinity rule with vm:101 and vm:103, all
+  of them are migrated to node2, as the node affinity is inferred for all of them
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
new file mode 100644
index 00000000..d84b2228
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/log.expect
@@ -0,0 +1,49 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node3'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info 20 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 20 node1/crm: migrate service 'vm:102' to node 'node2' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 20 node1/crm: migrate service 'vm:103' to node 'node2' (running)
+info 20 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: service vm:101 - start migrate to node 'node2'
+info 25 node3/lrm: service vm:101 - end migrate to node 'node2'
+info 25 node3/lrm: service vm:102 - start migrate to node 'node2'
+info 25 node3/lrm: service vm:102 - end migrate to node 'node2'
+info 25 node3/lrm: service vm:103 - start migrate to node 'node2'
+info 25 node3/lrm: service vm:103 - end migrate to node 'node2'
+info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node2)
+info 43 node2/lrm: starting service vm:101
+info 43 node2/lrm: service status vm:101 started
+info 43 node2/lrm: starting service vm:102
+info 43 node2/lrm: service status vm:102 started
+info 43 node2/lrm: starting service vm:103
+info 43 node2/lrm: service status vm:103 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
new file mode 100644
index 00000000..655f5161
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm102-must-be-on-node2
+ resources vm:102
+ nodes node2
+
+resource-affinity: vms-must-stick-together
+ resources vm:101,vm:102,vm:103
+ affinity positive
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
new file mode 100644
index 00000000..299a58c9
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive1/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node3", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/README b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
new file mode 100644
index 00000000..c5f4b469
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/README
@@ -0,0 +1,15 @@
+Test whether a strict positive resource affinity rule among three resources,
+where one of these resources is restricted by a node affinity rule to a node
+other than the one they are currently on, makes all resources migrate to that
+node.
+
+The test scenario is:
+- vm:102 must be kept on node1 or node2
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are currently running on node3
+
+The expected outcome is:
+- As vm:102 is on node3, which contradicts its node affinity rule, vm:102 is
+  migrated to node1 to fulfill its node affinity rule
+- As vm:102 is in a positive resource affinity rule with vm:101 and vm:103, all
+  of them are migrated to node1, as the node affinity is inferred for all of them
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
new file mode 100644
index 00000000..22fb5ced
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/log.expect
@@ -0,0 +1,49 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node3'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: migrate service 'vm:101' to node 'node1' (running)
+info 20 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 20 node1/crm: migrate service 'vm:102' to node 'node1' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 20 node1/crm: migrate service 'vm:103' to node 'node1' (running)
+info 20 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: service vm:101 - start migrate to node 'node1'
+info 25 node3/lrm: service vm:101 - end migrate to node 'node1'
+info 25 node3/lrm: service vm:102 - start migrate to node 'node1'
+info 25 node3/lrm: service vm:102 - end migrate to node 'node1'
+info 25 node3/lrm: service vm:103 - start migrate to node 'node1'
+info 25 node3/lrm: service vm:103 - end migrate to node 'node1'
+info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
+info 40 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
+info 41 node1/lrm: starting service vm:101
+info 41 node1/lrm: service status vm:101 started
+info 41 node1/lrm: starting service vm:102
+info 41 node1/lrm: service status vm:102 started
+info 41 node1/lrm: starting service vm:103
+info 41 node1/lrm: service status vm:103 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
new file mode 100644
index 00000000..6db94930
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm102-must-be-on-node1-or-node2
+ resources vm:102
+ nodes node1,node2
+
+resource-affinity: vms-must-stick-together
+ resources vm:101,vm:102,vm:103
+ affinity positive
diff --git a/src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config
new file mode 100644
index 00000000..299a58c9
--- /dev/null
+++ b/src/test/test-resource-affinity-with-node-affinity-strict-positive2/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node3", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" }
+}
--
2.47.2
* [pve-devel] applied: [PATCH ha-manager v2 00/12] HA rules follow up (part 1)
2025-08-01 16:22 [pve-devel] [PATCH ha-manager v2 00/12] HA rules follow up (part 1) Daniel Kral
` (11 preceding siblings ...)
2025-08-01 16:22 ` [pve-devel] [PATCH ha-manager v2 12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules Daniel Kral
@ 2025-08-01 17:36 ` Thomas Lamprecht
12 siblings, 0 replies; 14+ messages in thread
From: Thomas Lamprecht @ 2025-08-01 17:36 UTC (permalink / raw)
To: pve-devel, Daniel Kral
On Fri, 01 Aug 2025 18:22:15 +0200, Daniel Kral wrote:
> Here's a follow up on the HA rules and especially the HA resource
> affinity rules.
>
> The first three patches haven't changed as they were lower priority for
> me than the last part about loosening restrictions on mixed resource
> references.
>
> [...]
Really not much new code for what it is, nice! And the HA test system makes it
much easier to confidently do (and review!) such changes.
Applied, thanks!
[01/12] manager: fix ~revision version check for ha groups migration
commit: ea2c1f0201084a3c17472e3fc100a206a221f522
[02/12] test: ha tester: add ha groups migration tests with runtime upgrades
commit: da810d20bf3d06e345c71dcc38ef9a7004df341a
[03/12] tree-wide: pass optional parameters as hash values for for_each_rule helper
commit: 8363f7f24731b95efcd0db9b6f9252be94a10e3b
[04/12] api: rules: add missing return schema for the read_rule api endpoint
commit: ac927a5f5f30e57455cfed0c87a5b43e843db913
[05/12] api: rules: ignore disable parameter if it is set to a falsy value
commit: c060bc4f3695c199085afd3479aac2ccd2f97c82
[06/12] rules: resource affinity: make message in inter-consistency check clearer
commit: 5808ede019ec92cb76b8b2a2605fa6e33b566f20
[07/12] config, manager: do not check ignored resources with affinity when migrating
commit: 1c9c35d4e35e40da79322c62fe484b904c60d471
[08/12] rules: make positive affinity resources migrate on single resource fail
commit: ad89487d39e1f8ace72f244f51e96d68c393286c
[09/12] rules: allow same resources in node and resource affinity rules
commit: 4edad9d1fed3b24eee51c0e27d8e1a7cec40f425
[10/12] rules: restrict inter-plugin resource references to simple cases
commit: c48d9e66b8edf0851028040ab3117b4f01757e14
[11/12] test: rules: add test cases for inter-plugin checks allowing simple use cases
commit: ed530cc40be19f19f8b30b1bc5bd5a29898ab815
[12/12] test: ha tester: add resource affinity test cases mixed with node affinity rules
commit: 100859a77c36ef2eaf638f233ac1c82b050ee9cf