* [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules
From: Daniel Kral @ 2025-07-04 18:20 UTC
To: pve-devel
RFC v1: https://lore.proxmox.com/pve-devel/20250325151254.193177-1-d.kral@proxmox.com/
RFC v2: https://lore.proxmox.com/pve-devel/20250620143148.218469-1-d.kral@proxmox.com/
HA rules: https://lore.proxmox.com/pve-devel/20250704181659.465441-1-d.kral@proxmox.com/
This is the other part, where the HA rules are extended with the HA
resource affinity rules. This depends on the HA rules series linked
above.
This is yet another follow-up to the previous RFC patch series for the
HA resource affinity rules feature (formerly known as HA colocation
rules), which allows users to specify affinity or anti-affinity rules
for the HA Manager that keep two or more HA resources either together
on the same node or apart on separate nodes.
Changelog since v2
------------------
- split up the patch series (ofc)
- rebased on newest available master
- renamed "HA Colocation Rule" to "HA Resource Affinity Rule"
- renamed "together" and "separate" to "positive" and "negative"
respectively for resource affinity rules
- renamed any reference of a 'HA service' to 'HA resource' (e.g. the rules
  property 'services' is now 'resources')
- converted the tri-state property 'state' to a binary 'disable' flag on HA
  rules and exposed the 'contradictory' state with an 'errors' hash
- removed the "use-location-rules" feature flag and implemented a more
  straightforward HA groups migration (not directly relevant to this
  feature, but wanted to note it either way)
- added more rules config test cases
- moved PVE::HashTools back to PVE::HA::HashTools because it lacked any
other obvious use cases for now
- added the inference that every service in a positive affinity rule must
  have negative affinity with any service that is in negative affinity with
  one of the services in that positive affinity rule (I hope someone finds
  a better wording for this ;))
- added a rule checker which makes resource affinity rules with more
services than available nodes invalid
- dropped the patch which handled too many resources in a resource
  affinity rule, as it caused more chaos than necessary (replaced by the
  check mentioned above)
- removed the strictness requirement of node affinity rules in the
inter-plugin type checks for node/resource affinity rules
- refactored the handling of manual migrations of services in resource
  affinity relationships and made it external, so that these checks can be
  shown in the qemu-server and pve-container migrate preconditions in
  the web interface
TODO for v3
-----------
There are still a few things that I am currently aware of that should be
fixed as follow-ups or in a next revision.
- Mixed usage of Node Affinity rules and Resource Affinity rules still
  behaves rather awkwardly; the current implementation is still lacking the
  inference that when some services in a resource affinity rule have a node
  affinity rule, then the other services must be put in a node affinity rule
  as well; with the current checks that should be reasonable to implement
  in a similar way as the inference I've written for the other case above
- Otherwise, if we don't want the magical inference from above, one
  should add a checker which disallows a resource affinity rule to have
  services which are not all in the same node affinity rule OR which do not
  have at least one common node among them (single-priority groups ofc).
- Testing, testing, testing
- Cases that were discovered while @Michael reviewed my series (thank
  you very much, Michael!)
As in the previous revisions, I've run a
git rebase master --exec 'make clean && make deb'
on the series, so the tests should work for every patch.
Changelog from v1 to v2
-----------------------
I've added per-patch changelogs for patches that have been changed, but
here's a better overview of the overall changes since the RFC:
- implemented the API/CLI endpoints and web interface integration for HA
rules
- added user-facing documentation about HA rules
- implemented HA location rules as semantically equivalent replacements
to HA groups (with the addition that the 'nofailback' flag was moved
to HA services as an inverted 'failback' to remove the double negation)
- implemented a "use-location-rules" feature flag in the datacenter
config to allow users to upgrade to the feature on their own
- dropped the 'loose' colocation rules for now, as these can be a
  separate feature and it's unclear how these should act without user
  feedback; I have them in a separate tree with not a lot of changes in
  between these patches, so they are easy to rebase as a follow-up patch
  series
- moved global rule checkers to the base rules plugin and made them
more modular and aware of their own plugin-specific rule set
- fixed a race condition where positively colocated services are split
  and stay on multiple nodes (e.g. when the rule has been newly added
  and the services are on different nodes) -> now selects the node where
  most of the positively colocated services currently are
- made the HA manager aware of the positive and negative colocations
  when migrating, i.e., migrating other positively colocated services with
  the to-be-migrated service and blocking the migration if a service is
  migrated to a node where a negatively colocated service already is
- refactored the select_service_node(...) subroutine a bit to have fewer
  arguments
------
Below is the updated initial cover letter of the first RFC.
------
I chose the name "colocation" in favor of affinity/anti-affinity, since
it conveys more concisely that it is about co-locating services with
each other, in contrast to locating services on nodes, but I have no hard
feelings about changing it (same for any other names in this series).
Many thanks to @Thomas, @Fiona, @Friedrich, @Fabian, @Lukas, @Michael
and @Hannes Duerr for the discussions about this feature off-list!
Recap: HA groups
----------------
The HA Manager currently allows a service to be assigned to one HA
group, which essentially implements an affinity to a set of nodes. This
affinity can either be unrestricted or restricted, where the former
allows recovery to nodes outside of the HA group's nodes, if those are
currently unavailable.
This allows users to constrain the set of nodes that can be selected
as the starting and/or recovery node. Furthermore, each node in a
HA group can have an individual priority. This further constrains the
set of possible recovery nodes to the subset of online nodes in the
highest priority group.
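For reference, a hypothetical HA group with node priorities would look
roughly like the following in the groups configuration (group name, nodes
and priorities are made up for illustration):

    group: prefer-fast-nodes
            nodes node1:2,node2:2,node3:1
            restricted 0

Here node1 and node2 form the highest priority group, so node3 would only
be considered for recovery while neither of the two is available.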
Introduction
------------
Colocation is the concept of an inter-service affinity relationship,
which can either be positive (keep services together) or negative (keep
services apart). This is in contrast with the service-nodes affinity
relationship implemented by HA groups.
Motivation
----------
There are many different use cases to support colocation, but two simple
examples that come to mind are:
- Two or more services need to communicate with each other very
frequently. To reduce the communication path length and therefore
hopefully the latency, keep them together on one node.
- Two or more services need a lot of computational resources and will
therefore consume much of the assigned node's resource capacity. To
reduce starvation and memory stalls, keep them separate on multiple
nodes, so that they have enough resources for themselves.
And some more concrete use cases from current HA Manager users:
- "For example: let's say we have three DB VMs (DB nodes in a cluster)
which we want to run on ANY PVE host, but we don't want them to be on
the same host." [0]
- "An example is: When Server from the DMZ zone start on the same host
like the firewall or another example the application servers start on
the same host like the sql server. Services they depend on each other
have short latency to each other." [1]
HA Rules
--------
To implement colocation, this patch series introduces HA rules, which
allow users to specify the colocation requirements on services. These
are implemented with the widely used section config, where each type of
rule is an individual plugin (for now 'location' and 'colocation').
This introduces some small initial complexity for testing the
satisfiability of the rules, but allows the constraint interface to be
extensible, and hopefully allows easier reasoning about the node selection
process with the added constraint rules in the future.
Colocation Rules
----------------
The two properties of colocation rules, as described in the
introduction, are rather straightforward. A typical colocation rule
inside of the config would look like the following:
colocation: some-lonely-services
services vm:101,vm:103,ct:909
affinity separate
This means that the three services vm:101, vm:103 and ct:909 must be
kept separate on different nodes. I'm very keen on naming suggestions,
since I think there could be a better word than 'affinity' here. I
played around with 'keep-services', since then it would always read
something like 'keep-services separate', which is very declarative, but
this might suggest to too many users that this is a binary option (I
mean it is, but not with the values 0 and 1).
Feasibility and Inference
-------------------------
Since rules allow more complexity, it is necessary to check whether
rules are (1) feasible and (2) can be simplified, so that as many HA
rules as are feasible can still be applied.
| Feasibility
----------
The feasibility checks are implemented in PVE::HA::Rules::Location,
PVE::HA::Rules::Colocation, and PVE::HA::Rules, where the latter handles
global checks between the different rule types.
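Checks are registered as a pair of callbacks: one returning the ids of the
rules that violate the check, and one collecting error messages for those
rules. A minimal sketch of that shape, adapted from the first checker of the
resource affinity patch further below (the exact $args keys come from the
plugin's get_plugin_check_arguments(), see patch 03):

    __PACKAGE__->register_check(
        # first callback: return the ids of rules violating this check
        sub {
            my ($args) = @_;

            my @conflicts = ();
            for my $ruleid (sort keys %{ $args->{resource_affinity_rules} }) {
                my $rule = $args->{resource_affinity_rules}->{$ruleid};
                push @conflicts, $ruleid if keys %{ $rule->{resources} } < 2;
            }
            return \@conflicts;
        },
        # second callback: collect error messages for the violating rule ids
        sub {
            my ($ruleids, $errors) = @_;

            for my $ruleid (@$ruleids) {
                push @{ $errors->{$ruleid}->{resources} },
                    "rule is ineffective as there are less than two resources";
            }
        },
    );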
| Canonicalization
----------
Additionally, colocation rules are currently simplified as follows:
- If there are multiple positive colocation rules with common services
  and the same strictness, these are merged into a single positive
  colocation rule (so that it is easier to check which services are
  positively colocated with a given service).
This is implemented in PVE::HA::Rules::Colocation::plugin_canonicalize.
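For illustration, two hypothetical rules such as

    colocation: db-together
            services vm:101,vm:102
            affinity together

    colocation: web-with-db
            services vm:102,vm:104
            affinity together

share vm:102 and would therefore be merged into a single positive colocation
rule keeping vm:101, vm:102 and vm:104 together.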
Special negative colocation scenarios
-------------------------------------
Just to be aware of these, there's a distinction between the following
two sets of negative colocation rules:
colocation: separate-vms
services vm:101,vm:102,vm:103
affinity separate
and
colocation: separate-vms1
services vm:101,vm:102
affinity separate
colocation: separate-vms2
services vm:102,vm:103
affinity separate
The first keeps all three services separate from each other, while the
second only keeps the services separate pair-wise, i.e. vm:101 and
vm:103 might still be migrated to the same node.
Additional and/or future ideas
------------------------------
- Make recomputing the online node usage more granular.
- Add information about overall free node resources to improve the decision
  heuristic when recovering services to nodes.
- Implementing non-strict colocation rules, e.g., which won't fail but
ignore the rule (for a timeout?, until migrated by the user?), only
considering the $target node while migrating, etc.
- When deciding the recovery node for positively colocated services,
account for the needed resources of all to-be-migrated services rather
than just the first one. This is a non-trivial problem as we currently
solve it as an online bin covering problem, i.e. selecting a node for each
service alone instead of selecting for all services together.
- Ignore migrations to nodes where the service may not run according to
  its location rules / HA group nodes.
- Dynamic colocation rule health statistics (e.g. warn on the
satisfiability of a colocation rule), e.g. in the WebGUI and/or API.
- Property for mandatory colocation rules to specify whether all
services should be stopped if the rule cannot be satisfied.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5260
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=5332
ha-manager:
Daniel Kral (13):
rules: introduce plugin-specific canonicalize routines
rules: add haenv node list to the rules' canonicalization stage
rules: introduce resource affinity rule plugin
rules: add global checks between node and resource affinity rules
usage: add information about a service's assigned nodes
manager: apply resource affinity rules when selecting service nodes
manager: handle resource affinity rules in manual migrations
sim: resources: add option to limit start and migrate tries to node
test: ha tester: add test cases for negative resource affinity rules
test: ha tester: add test cases for positive resource affinity rules
test: ha tester: add test cases for static scheduler resource affinity
test: rules: add test cases for resource affinity rules
api: resources: add check for resource affinity in resource migrations
debian/pve-ha-manager.install | 1 +
src/PVE/API2/HA/Resources.pm | 131 +++-
src/PVE/API2/HA/Rules.pm | 5 +-
src/PVE/API2/HA/Status.pm | 4 +-
src/PVE/CLI/ha_manager.pm | 52 +-
src/PVE/HA/Config.pm | 56 ++
src/PVE/HA/Env/PVE2.pm | 2 +
src/PVE/HA/Manager.pm | 73 +-
src/PVE/HA/Resources.pm | 3 +-
src/PVE/HA/Rules.pm | 232 ++++++-
src/PVE/HA/Rules/Makefile | 2 +-
src/PVE/HA/Rules/ResourceAffinity.pm | 642 ++++++++++++++++++
src/PVE/HA/Sim/Env.pm | 2 +
src/PVE/HA/Sim/Resources/VirtFail.pm | 29 +-
src/PVE/HA/Usage.pm | 18 +
src/PVE/HA/Usage/Basic.pm | 19 +
src/PVE/HA/Usage/Static.pm | 19 +
.../defaults-for-resource-affinity-rules.cfg | 16 +
...lts-for-resource-affinity-rules.cfg.expect | 38 ++
...onsistent-node-resource-affinity-rules.cfg | 54 ++
...nt-node-resource-affinity-rules.cfg.expect | 121 ++++
.../inconsistent-resource-affinity-rules.cfg | 11 +
...sistent-resource-affinity-rules.cfg.expect | 11 +
...ctive-negative-resource-affinity-rules.cfg | 17 +
...egative-resource-affinity-rules.cfg.expect | 30 +
.../ineffective-resource-affinity-rules.cfg | 8 +
...fective-resource-affinity-rules.cfg.expect | 9 +
...licit-negative-resource-affinity-rules.cfg | 40 ++
...egative-resource-affinity-rules.cfg.expect | 131 ++++
...licit-negative-resource-affinity-rules.cfg | 16 +
...egative-resource-affinity-rules.cfg.expect | 73 ++
...ected-positive-resource-affinity-rules.cfg | 42 ++
...ositive-resource-affinity-rules.cfg.expect | 70 ++
...-affinity-with-resource-affinity-rules.cfg | 19 +
...ty-with-resource-affinity-rules.cfg.expect | 45 ++
.../README | 26 +
.../cmdlist | 4 +
.../datacenter.cfg | 6 +
.../hardware_status | 5 +
.../log.expect | 120 ++++
.../manager_status | 1 +
.../rules_config | 19 +
.../service_config | 10 +
.../static_service_stats | 10 +
.../README | 20 +
.../cmdlist | 4 +
.../datacenter.cfg | 6 +
.../hardware_status | 5 +
.../log.expect | 174 +++++
.../manager_status | 1 +
.../rules_config | 11 +
.../service_config | 14 +
.../static_service_stats | 14 +
.../README | 22 +
.../cmdlist | 22 +
.../datacenter.cfg | 6 +
.../hardware_status | 7 +
.../log.expect | 272 ++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 9 +
.../static_service_stats | 9 +
.../README | 13 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 60 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 15 +
.../cmdlist | 4 +
.../hardware_status | 7 +
.../log.expect | 90 +++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 10 +
.../README | 16 +
.../cmdlist | 4 +
.../hardware_status | 7 +
.../log.expect | 110 +++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 10 +
.../README | 18 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 69 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 11 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 56 ++
.../manager_status | 1 +
.../rules_config | 7 +
.../service_config | 5 +
.../README | 18 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 69 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 15 +
.../cmdlist | 5 +
.../hardware_status | 5 +
.../log.expect | 52 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 4 +
.../README | 12 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 38 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 5 +
.../README | 12 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 66 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 11 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 80 +++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 8 +
.../README | 17 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 89 +++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 8 +
.../README | 11 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 59 ++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 5 +
.../README | 19 +
.../cmdlist | 8 +
.../hardware_status | 5 +
.../log.expect | 281 ++++++++
.../manager_status | 1 +
.../rules_config | 15 +
.../service_config | 11 +
src/test/test_rules_config.pl | 6 +-
154 files changed, 4400 insertions(+), 39 deletions(-)
create mode 100644 src/PVE/HA/Rules/ResourceAffinity.pm
create mode 100644 src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/README
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/log.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/manager_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/rules_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/service_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/README
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/cmdlist
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/log.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/manager_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/rules_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/service_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/README
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/cmdlist
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/log.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/manager_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/rules_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/service_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats
create mode 100644 src/test/test-resource-affinity-strict-negative1/README
create mode 100644 src/test/test-resource-affinity-strict-negative1/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative1/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative1/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative1/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative1/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative1/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative2/README
create mode 100644 src/test/test-resource-affinity-strict-negative2/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative2/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative2/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative2/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative2/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative2/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative3/README
create mode 100644 src/test/test-resource-affinity-strict-negative3/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative3/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative3/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative3/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative3/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative3/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative4/README
create mode 100644 src/test/test-resource-affinity-strict-negative4/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative4/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative4/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative4/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative4/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative4/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative5/README
create mode 100644 src/test/test-resource-affinity-strict-negative5/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative5/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative5/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative5/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative5/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative5/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative6/README
create mode 100644 src/test/test-resource-affinity-strict-negative6/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative6/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative6/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative6/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative6/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative6/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative7/README
create mode 100644 src/test/test-resource-affinity-strict-negative7/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative7/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative7/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative7/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative7/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative7/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative8/README
create mode 100644 src/test/test-resource-affinity-strict-negative8/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative8/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative8/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative8/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative8/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative8/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive1/README
create mode 100644 src/test/test-resource-affinity-strict-positive1/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive1/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive1/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive1/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive1/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive1/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive2/README
create mode 100644 src/test/test-resource-affinity-strict-positive2/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive2/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive2/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive2/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive2/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive2/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive3/README
create mode 100644 src/test/test-resource-affinity-strict-positive3/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive3/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive3/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive3/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive3/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive3/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive4/README
create mode 100644 src/test/test-resource-affinity-strict-positive4/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive4/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive4/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive4/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive4/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive4/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive5/README
create mode 100644 src/test/test-resource-affinity-strict-positive5/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive5/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive5/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive5/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive5/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive5/service_config
base-commit: 264dc2c58d145394219f82f25d41f4fc438c4dc4
prerequisite-patch-id: 530b875c25a6bded1cc2294960cf465d5c2bcbca
prerequisite-patch-id: be76b977780d57e5fbf352bd978bdae5c940550d
prerequisite-patch-id: f7e9aa60a2062358ce66bc7ff1b1a9040e5326c6
prerequisite-patch-id: 0b58a4d7f2e46025edbe3570f75c205cacce7420
prerequisite-patch-id: 4b19363e458e614a6df1956ac5a217bfc62610d7
prerequisite-patch-id: 9b6ebaa0969b63f30f33c761eff0f8df7fd5f8d0
prerequisite-patch-id: 878a1f4702c9783218c5d8b0187a3862b85ee44b
prerequisite-patch-id: d81d430bb9a5ae9cd30067f2f4afa4dec5c085fc
prerequisite-patch-id: dad7bbb8de320efda08f7e660af2fce04490adb3
prerequisite-patch-id: f3f25c27f6a165617011ae641d581dda2c05b82e
prerequisite-patch-id: ea6202f21814509cf877d68506f37fe80059371d
prerequisite-patch-id: ec46e7ad626365020fdd6a07b99335c56cb024d0
prerequisite-patch-id: 8a19f490ae3dadeeb71da8888ac3ad1e0036407f
prerequisite-patch-id: afd04a8513a3bbfd5943a4bc2975b723c92348ad
prerequisite-patch-id: 9eec8be1085114a9acb33b90ca73616c611ccf65
prerequisite-patch-id: d1e039fd3f200201641a43f7e1cb423e526a27c9
prerequisite-patch-id: e86fb011c1574c112a8e9a30ab4401eb6fa25eb9
docs:
Daniel Kral (1):
ha: add documentation about ha resource affinity rules
Makefile | 1 +
gen-ha-rules-resource-affinity-opts.pl | 20 ++++
ha-manager.adoc | 133 +++++++++++++++++++++++++
ha-rules-resource-affinity-opts.adoc | 8 ++
4 files changed, 162 insertions(+)
create mode 100755 gen-ha-rules-resource-affinity-opts.pl
create mode 100644 ha-rules-resource-affinity-opts.adoc
base-commit: 7cc17ee5950a53bbd5b5ad81270352ccdb1c541c
prerequisite-patch-id: 92556cd6c1edfb88b397ae244d7dcd56876cd8fb
prerequisite-patch-id: f4f3b5d3ab4765a96b473a24446cf81964c12042
prerequisite-patch-id: 7ac868e0d7f8b1c08e54143c37dda9475bf14d96
manager:
Daniel Kral (3):
ui: ha: rules: add ha resource affinity rules
ui: migrate: lxc: display precondition messages for ha resource
affinity
ui: migrate: vm: display precondition messages for ha resource
affinity
www/manager6/Makefile | 2 +
www/manager6/ha/Rules.js | 12 ++
.../ha/rules/ResourceAffinityRuleEdit.js | 24 ++++
.../ha/rules/ResourceAffinityRules.js | 31 +++++
www/manager6/window/Migrate.js | 131 +++++++++++++++++-
5 files changed, 197 insertions(+), 3 deletions(-)
create mode 100644 www/manager6/ha/rules/ResourceAffinityRuleEdit.js
create mode 100644 www/manager6/ha/rules/ResourceAffinityRules.js
base-commit: c0cbe76ee90e7110934c50414bc22371cf13c01a
prerequisite-patch-id: ec6a39936719cfe38787fccb1a80af6378980723
prerequisite-patch-id: 9415da9186d58d8b31377c1f25ff18f8c2ffc5a2
prerequisite-patch-id: e22720f6d06927514b80cc496331c13fd080fd8d
prerequisite-patch-id: d1f267d8039d9bb04b1a0f9375970230a00755cb
prerequisite-patch-id: 5752652afa1754cb13a18b469137e7a04446d764
pve-container:
Daniel Kral (1):
api: introduce migration preconditions api endpoint
src/PVE/API2/LXC.pm | 141 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 141 insertions(+)
base-commit: 7b8077037b310f881e0422f3aabb7e9cf057cb72
prerequisite-patch-id: 6e1b48c8279bba02a04aecb550b19a7f5b5a86d0
prerequisite-patch-id: 13bd7463605c2fb86dea8ce2b4d11d3b57e726ad
prerequisite-patch-id: 66f15a96f8cdd9a21f57d1ee4b71dbb75b183716
prerequisite-patch-id: 0d73ae35bfd4edfd33dd09f9be3f23839df7d798
qemu-server:
Daniel Kral (1):
api: migration preconditions: add checks for ha resource affinity
rules
src/PVE/API2/Qemu.pm | 49 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
base-commit: 4bd4a6378d83be5c5d6494bea953067b762fa0bb
prerequisite-patch-id: 92fa07c7b5efc5c61274a3fcbef3fc50403c4395
prerequisite-patch-id: 45412364886b697957234a9907ff0780a03c8fe0
prerequisite-patch-id: dbd1bed03695935811133e4b47fee7503085d79c
prerequisite-patch-id: 75b3ae68437fc1a1c35ca65f6c5d7bfb7d0ae761
prerequisite-patch-id: 638838e65f6737f7b542e5e9862cf123fdaaa7c8
prerequisite-patch-id: 20f9cc9362238af5edd124cc664b83e4d254047e
prerequisite-patch-id: a192d04047d2cafe4e5acc93cc3353ae5e3a0ca9
prerequisite-patch-id: ac31a90492b4deab7aa9261e7b3297ed006f8aab
prerequisite-patch-id: b5593d700fc9395dacd9728cf321da2dbb43c953
prerequisite-patch-id: c4ecf88d9c7dbbdf3d45a4f47d8409daf711aa3a
prerequisite-patch-id: 56f032b886c0a630e55f6c7e93f054d6a413de39
prerequisite-patch-id: cb87f6db69df0462c2f75bea160427509c242f2a
prerequisite-patch-id: c798d97cd2de42ce5f5f1a8992eb63a8ff56ba3b
prerequisite-patch-id: 2342211b84632fa9c058f8f0cb30fa414413dbb2
prerequisite-patch-id: 06803397314bbb318be3136fbe378018e29cb5f9
prerequisite-patch-id: 591f2129b044240f4f73841b0c8c23fe5ecd1e25
prerequisite-patch-id: f7fc7bbcc43ae266b5a64ba749eae1462d0e8809
prerequisite-patch-id: eecdec45a3457706cf6b07a648f24e4d0f2fd463
prerequisite-patch-id: 735c77be6142fcb4509257523be5f893f982b8da
prerequisite-patch-id: 7458a9e7d30d92ed13738cde39845838901ed96e
prerequisite-patch-id: 32abe240d401f3fb55035d07cbf84d7aa51d0909
prerequisite-patch-id: c89d3c5bf26057d0b5536f1b90b2a53b4f7e4fd7
prerequisite-patch-id: c71d2f276776353aa80977d4fd2bb8ee67f0996f
prerequisite-patch-id: 7469ff6963df33ed50fc42a215a4fe26f6624e85
prerequisite-patch-id: b10f50d61f9f57d6316c93da7b4e14544f32e37f
prerequisite-patch-id: e3dd34aa01b5c4b01a80f3557ca8ee170ba951a2
prerequisite-patch-id: fb01eef662838ee474801670f65fe2450e539db8
prerequisite-patch-id: 637c768e2217f7e230dd74898ad8d5521906079c
prerequisite-patch-id: fb3ddaa3b3a1719e5288235e897b03c8a2b6e0d1
prerequisite-patch-id: cdf7cc5a3731cf28ded49d17fbc937f0529d7c4f
prerequisite-patch-id: 33ccd4f5266722525d47b3a36e82d7ad8e81ed5c
prerequisite-patch-id: 30cfcfcc835d71a7fe53b075d2ce43af20bc53b4
prerequisite-patch-id: ce5b02664f3715b95d71101d2c107ffcd910b8e3
prerequisite-patch-id: 56ac3292381ebf1cb28d13039ff8dc59eadcdec2
prerequisite-patch-id: 88c19bd062a4c446bd25d07dc3d79f8c393557f4
prerequisite-patch-id: 1f94299f0e3d203d7d6ad539b85ff166915b8102
prerequisite-patch-id: a7595017f9383a37d616bf08d7eb6e79f3f02684
prerequisite-patch-id: d32966a023905395a1904d3ecf4fb979e4a12c50
prerequisite-patch-id: ea83303358b0c89d7d44ff779333957bbd7bb6e3
prerequisite-patch-id: 4f2dab83befa91ad6ae7d3de3bffee8f633e26a4
prerequisite-patch-id: 3e581da8a18525e7c00099d4423dfd23a6aa28aa
prerequisite-patch-id: 59fb4bace486be94096bcd2291850428f6fd4281
prerequisite-patch-id: 8ab490bfe9d826dd6951e12138e844d44594918c
prerequisite-patch-id: efe99ebbf56d0ee6f65c0d4d0ef81bb1c45653e2
prerequisite-patch-id: c7189bcd0a12d489f55fceb6a15169b6d7c4c6ab
prerequisite-patch-id: c2202da4c4668425f5305528b2815ca6e18b3f2d
prerequisite-patch-id: 972023350c88368f7f8ea5c38a0db4203269b74e
prerequisite-patch-id: 979f735b7277fa8c053e346c8c5cbb7e9cca175f
prerequisite-patch-id: 397d98543aea0795a538b5f2134cf3d536864d1e
prerequisite-patch-id: b48cf1dbee9bbfaca0975a542d6c4eeb9c3a73fe
prerequisite-patch-id: 0545d14a7dade8ef5576ad29c03dd25f4e44f29c
prerequisite-patch-id: d148ba8f442aba20cb52a41c8f6282c6da7432c5
prerequisite-patch-id: 93bb7d94d66dc9f21856aa1d36e6611969637f7e
prerequisite-patch-id: 43ca3a9f6d56fd4a430a2ef206c733598321e00a
prerequisite-patch-id: 11f58168f21c7d6f63e1981660caeb1c55a67b7e
prerequisite-patch-id: 60d22700d35dc8db4e36eee1398627f8ef81ec90
prerequisite-patch-id: 53a3bef801b7dc854d547300b242ecc2086a9649
prerequisite-patch-id: 8bed06668bc4547cc6ebf6bd38684c7cfdaa2999
prerequisite-patch-id: 637814dceb301beaa41dbf8f1ab87532238c66e6
prerequisite-patch-id: 1b5c5f3c3158debb889127e26b5f695f17b56d16
prerequisite-patch-id: 1a7d8eda8e08b4017a9aff6428ad9a4fb9f3894b
Summary over all repositories:
165 files changed, 4949 insertions(+), 42 deletions(-)
--
Generated by git-murpp 0.8.0
* [pve-devel] [PATCH ha-manager v3 01/13] rules: introduce plugin-specific canonicalize routines
From: Daniel Kral @ 2025-07-04 18:20 UTC
To: pve-devel
These are needed by the resource affinity rule type in an upcoming
patch, which needs to make changes to the existing rule set to properly
synthesize inferred rules after the rule set has already been made feasible.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Rules.pm | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index bda0b5d..39c349d 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -354,6 +354,18 @@ sub check_feasibility : prototype($$) {
return $global_errors;
}
+=head3 $class->plugin_canonicalize($rules)
+
+B<OPTIONAL:> Can be implemented in the I<rule plugin>.
+
+Modifies the C<$rules> to a plugin-specific canonical form.
+
+=cut
+
+sub plugin_canonicalize : prototype($$) {
+ my ($class, $rules) = @_;
+}
+
=head3 $class->canonicalize($rules)
Modifies C<$rules> to contain only feasible rules.
@@ -385,6 +397,12 @@ sub canonicalize : prototype($$) {
}
}
+ for my $type (@$types) {
+ my $plugin = $class->lookup($type);
+ eval { $plugin->plugin_canonicalize($rules) };
+ next if $@; # plugin doesn't implement plugin_canonicalize(...)
+ }
+
return $messages;
}
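For context, a rule plugin can override the new hook roughly as sketched
below; this is only an illustration with a made-up plugin name, the actual
override for resource affinity rules follows in a later patch of this series.

    package PVE::HA::Rules::MyRuleType;

    use strict;
    use warnings;

    use base qw(PVE::HA::Rules);

    sub plugin_canonicalize {
        my ($class, $rules) = @_;

        # bring $rules into this plugin's canonical form after the
        # feasibility checks, e.g. by merging rules of this plugin's type
        # or synthesizing inferred rules in $rules->{ids} and $rules->{order}
    }

    1;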
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 02/13] rules: add haenv node list to the rules' canonicalization stage
From: Daniel Kral @ 2025-07-04 18:20 UTC
To: pve-devel
Add the HA environment's node list information to the feasibility
check/canonicalization stage. This is needed for at least one rule
check for negative resource affinity rules in an upcoming patch, which
verifies that there are enough available nodes to separate the HA
resources on.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Rules.pm | 5 ++++-
src/PVE/HA/Manager.pm | 3 ++-
src/PVE/HA/Rules.pm | 20 +++++++++++++-------
src/test/test_rules_config.pl | 4 +++-
4 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/src/PVE/API2/HA/Rules.pm b/src/PVE/API2/HA/Rules.pm
index 2e5e382..51e264f 100644
--- a/src/PVE/API2/HA/Rules.pm
+++ b/src/PVE/API2/HA/Rules.pm
@@ -101,7 +101,10 @@ my $check_feasibility = sub {
$rules = $get_full_rules_config->($rules);
- return PVE::HA::Rules->check_feasibility($rules);
+ my $manager_status = PVE::HA::Config::read_manager_status();
+ my $nodes = [keys $manager_status->{node_status}->%*];
+
+ return PVE::HA::Rules->check_feasibility($rules, $nodes);
};
my $assert_feasibility = sub {
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index b2fd896..4bf74d2 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -621,7 +621,8 @@ sub manage {
) {
PVE::HA::Groups::migrate_groups_to_rules($new_rules, $self->{groups}, $sc);
- my $messages = PVE::HA::Rules->canonicalize($new_rules);
+ my $nodes = $self->{ns}->list_nodes();
+ my $messages = PVE::HA::Rules->canonicalize($new_rules, $nodes);
$haenv->log('info', $_) for @$messages;
$self->{rules} = $new_rules;
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index 39c349d..3121424 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -322,26 +322,30 @@ sub get_check_arguments : prototype($$) {
return $global_args;
}
-=head3 $class->check_feasibility($rules)
+=head3 $class->check_feasibility($rules, $nodes)
Checks whether the given C<$rules> are feasible by running all checks, which
were registered with C<L<< register_check()|/$class->register_check(...) >>>,
and returns a hash map of errorneous rules.
+C<$nodes> is a list of the configured cluster nodes.
+
The checks are run in the order in which the rule plugins were registered,
while global checks, i.e. checks between different rule types, are run at the
very last.
=cut
-sub check_feasibility : prototype($$) {
- my ($class, $rules) = @_;
+sub check_feasibility : prototype($$$) {
+ my ($class, $rules, $nodes) = @_;
my $global_errors = {};
my $removable_ruleids = [];
my $global_args = $class->get_check_arguments($rules);
+ $global_args->{nodes} = $nodes;
+
for my $type (@$types, 'global') {
for my $entry (@{ $checkdef->{$type} }) {
my ($check, $collect_errors) = @$entry;
@@ -366,10 +370,12 @@ sub plugin_canonicalize : prototype($$) {
my ($class, $rules) = @_;
}
-=head3 $class->canonicalize($rules)
+=head3 $class->canonicalize($rules, $nodes)
Modifies C<$rules> to contain only feasible rules.
+C<$nodes> is a list of the configured cluster nodes.
+
This is done by running all checks, which were registered with
C<L<< register_check()|/$class->register_check(...) >>> and removing any
rule, which makes the rule set infeasible.
@@ -378,11 +384,11 @@ Returns a list of messages with the reasons why rules were removed.
=cut
-sub canonicalize : prototype($$) {
- my ($class, $rules) = @_;
+sub canonicalize : prototype($$$) {
+ my ($class, $rules, $nodes) = @_;
my $messages = [];
- my $global_errors = $class->check_feasibility($rules);
+ my $global_errors = $class->check_feasibility($rules, $nodes);
for my $ruleid (keys %$global_errors) {
delete $rules->{ids}->{$ruleid};
diff --git a/src/test/test_rules_config.pl b/src/test/test_rules_config.pl
index 824afed..d49d14f 100755
--- a/src/test/test_rules_config.pl
+++ b/src/test/test_rules_config.pl
@@ -42,6 +42,8 @@ sub check_cfg {
my $raw = PVE::Tools::file_get_contents($cfg_fn);
+ my $nodes = ['node1', 'node2', 'node3'];
+
open(my $LOG, '>', "$outfile");
select($LOG);
$| = 1;
@@ -49,7 +51,7 @@ sub check_cfg {
print "--- Log ---\n";
my $cfg = PVE::HA::Rules->parse_config($cfg_fn, $raw);
PVE::HA::Rules->set_rule_defaults($_) for values %{ $cfg->{ids} };
- my $messages = PVE::HA::Rules->canonicalize($cfg);
+ my $messages = PVE::HA::Rules->canonicalize($cfg, $nodes);
print $_ for @$messages;
print "--- Config ---\n";
{
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 03/13] rules: introduce resource affinity rule plugin
From: Daniel Kral @ 2025-07-04 18:20 UTC
To: pve-devel
Add the resource affinity rule plugin to allow users to specify
inter-resource affinity constraints. Resource affinity rules must
specify two or more resources and one of the affinity types:
* positive: keeping HA resources together, or
* negative: keeping HA resources separate;
The initial implementation requires resource affinity rules to specify at
least two resources, restricts negative resource affinity rules to at most
as many resources as there are available nodes, and disallows specifying
the same two or more resources in both a positive and a negative resource
affinity rule, as that is an infeasible rule set.
Positive resource affinity rules whose resource sets overlap are
handled as a single positive resource affinity rule, to make it easier to
retrieve the resources which are to be kept together in later patches.
If resources of a positive resource affinity rule are also in negative
resource affinity rules, all of the positive rule's resources are put in
those negative resource affinity relationships as well.
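For illustration, a hypothetical rule set roughly like

    resource-affinity: keep-together
            resources vm:101,vm:102
            affinity positive

    resource-affinity: keep-apart
            resources vm:101,vm:103
            affinity negative

additionally results in an implicit negative resource affinity relationship
between vm:102 and vm:103, since vm:102 must follow vm:101 and vm:101 must
be kept away from vm:103.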
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
debian/pve-ha-manager.install | 1 +
src/PVE/HA/Env/PVE2.pm | 2 +
src/PVE/HA/Manager.pm | 1 +
src/PVE/HA/Rules/Makefile | 2 +-
src/PVE/HA/Rules/ResourceAffinity.pm | 438 +++++++++++++++++++++++++++
src/PVE/HA/Sim/Env.pm | 2 +
6 files changed, 445 insertions(+), 1 deletion(-)
create mode 100644 src/PVE/HA/Rules/ResourceAffinity.pm
diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index 79f86d2..2e6b7d5 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -36,6 +36,7 @@
/usr/share/perl5/PVE/HA/Resources/PVEVM.pm
/usr/share/perl5/PVE/HA/Rules.pm
/usr/share/perl5/PVE/HA/Rules/NodeAffinity.pm
+/usr/share/perl5/PVE/HA/Rules/ResourceAffinity.pm
/usr/share/perl5/PVE/HA/Tools.pm
/usr/share/perl5/PVE/HA/Usage.pm
/usr/share/perl5/PVE/HA/Usage/Basic.pm
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index aecffc0..c595e4d 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -24,6 +24,7 @@ use PVE::HA::Resources::PVEVM;
use PVE::HA::Resources::PVECT;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity;
+use PVE::HA::Rules::ResourceAffinity;
PVE::HA::Resources::PVEVM->register();
PVE::HA::Resources::PVECT->register();
@@ -31,6 +32,7 @@ PVE::HA::Resources::PVECT->register();
PVE::HA::Resources->init();
PVE::HA::Rules::NodeAffinity->register();
+PVE::HA::Rules::ResourceAffinity->register();
PVE::HA::Rules->init(property_isolation => 1);
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 4bf74d2..52097cf 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -11,6 +11,7 @@ use PVE::HA::Tools ':exit_codes';
use PVE::HA::NodeStatus;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
+use PVE::HA::Rules::ResourceAffinity;
use PVE::HA::Usage::Basic;
use PVE::HA::Usage::Static;
diff --git a/src/PVE/HA/Rules/Makefile b/src/PVE/HA/Rules/Makefile
index dfef257..6411925 100644
--- a/src/PVE/HA/Rules/Makefile
+++ b/src/PVE/HA/Rules/Makefile
@@ -1,4 +1,4 @@
-SOURCES=NodeAffinity.pm
+SOURCES=NodeAffinity.pm ResourceAffinity.pm
.PHONY: install
install:
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
new file mode 100644
index 0000000..57ccc09
--- /dev/null
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -0,0 +1,438 @@
+package PVE::HA::Rules::ResourceAffinity;
+
+use strict;
+use warnings;
+
+use PVE::HA::HashTools qw(set_intersect sets_are_disjoint);
+use PVE::HA::Rules;
+
+use base qw(PVE::HA::Rules);
+
+=head1 NAME
+
+PVE::HA::Rules::ResourceAffinity - Resource Affinity Plugin for HA Rules
+
+=head1 DESCRIPTION
+
+This package provides the capability to specify and apply rules, which put
+affinity constraints between the HA resources.
+
+HA resource affinity rules have one of the two types:
+
+=over
+
+=item C<'positive'>
+
+Positive resource affinity rules specify that HA resources need to be be kept
+together.
+
+=item C<'negative'>
+
+Negative resource affinity rules (or resource anti-affinity rules) specify that
+HA resources need to be kept separate.
+
+=back
+
+HA resource affinity rules MUST be applied. That is, if a HA resource cannot
+comply with the resource affinity rule, it is put in recovery or other
+error-like states, if there is no other way to recover them.
+
+=cut
+
+sub type {
+ return 'resource-affinity';
+}
+
+sub properties {
+ return {
+ affinity => {
+ description => "Describes whether the HA resources are supposed to"
+ . " be kept on the same node ('positive'), or are supposed to"
+ . " be kept on separate nodes ('negative').",
+ type => 'string',
+ enum => ['positive', 'negative'],
+ optional => 0,
+ },
+ };
+}
+
+sub options {
+ return {
+ resources => { optional => 0 },
+ affinity => { optional => 0 },
+ disable => { optional => 1 },
+ comment => { optional => 1 },
+ };
+}
+
+sub get_plugin_check_arguments {
+ my ($self, $rules) = @_;
+
+ my $result = {
+ resource_affinity_rules => {},
+ positive_rules => {},
+ negative_rules => {},
+ };
+
+ PVE::HA::Rules::foreach_rule(
+ $rules,
+ sub {
+ my ($rule, $ruleid) = @_;
+
+ $result->{resource_affinity_rules}->{$ruleid} = $rule;
+
+ $result->{positive_rules}->{$ruleid} = $rule if $rule->{affinity} eq 'positive';
+ $result->{negative_rules}->{$ruleid} = $rule if $rule->{affinity} eq 'negative';
+ },
+ {
+ type => 'resource-affinity',
+ exclude_disabled_rules => 1,
+ },
+ );
+
+ return $result;
+}
+
+=head1 RESOURCE AFFINITY RULE CHECKERS
+
+=cut
+
+=head3 check_resource_affinity_resources_count($resource_affinity_rules)
+
+Returns a list of resource affinity rule ids, defined in
+C<$resource_affinity_rules>, which do not have enough resources defined to be
+effective resource affinity rules.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_resource_affinity_resources_count {
+ my ($resource_affinity_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($ruleid, $rule) = each %$resource_affinity_rules) {
+ push @conflicts, $ruleid if keys %{ $rule->{resources} } < 2;
+ }
+
+ @conflicts = sort @conflicts;
+ return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_resource_affinity_resources_count($args->{resource_affinity_rules});
+ },
+ sub {
+ my ($ruleids, $errors) = @_;
+
+ for my $ruleid (@$ruleids) {
+ push @{ $errors->{$ruleid}->{resources} },
+ "rule is ineffective as there are less than two resources";
+ }
+ },
+);
+
+=head3 check_negative_resource_affinity_resources_count($negative_rules, $nodes)
+
+Returns a list of negative resource affinity rule ids, defined in
+C<$negative_rules>, which do have more resources defined than available according
+to the node list C<$nodes>, i.e., there are not enough nodes to separate the
+resources on, even if all nodes are available.
+
+If there are none, the returned list ist empty.
+
+=cut
+
+sub check_negative_resource_affinity_resources_count {
+ my ($negative_rules, $nodes) = @_;
+
+ my @conflicts = ();
+
+ my $total_node_count = @$nodes;
+
+ while (my ($negativeid, $negative_rule) = each %$negative_rules) {
+ push @conflicts, $negativeid if keys $negative_rule->{resources}->%* > $total_node_count;
+ }
+
+ @conflicts = sort @conflicts;
+ return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_negative_resource_affinity_resources_count(
+ $args->{negative_rules}, $args->{nodes},
+ );
+ },
+ sub {
+ my ($ruleids, $errors) = @_;
+
+ for my $ruleid (@$ruleids) {
+ push @{ $errors->{$ruleid}->{resources} },
+ "rule defines more resources than available nodes";
+ }
+ },
+);
+
+=head3 check_inter_resource_affinity_rules_consistency($positive_rules, $negative_rules)
+
+Returns a list of lists consisting of a positive resource affinity rule, defined
+in C<$positive_rules> and a negative resource affinity rule id, defined in
+C<$negative_rules>, which share at least the same two resources among them.
+
+This is an impossible constraint as the same resources cannot be kept together on
+the same node and kept separate on different nodes at the same time.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_inter_resource_affinity_rules_consistency {
+ my ($positive_rules, $negative_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($positiveid, $positive) = each %$positive_rules) {
+ my $positive_resources = $positive->{resources};
+
+ while (my ($negativeid, $negative) = each %$negative_rules) {
+ my $common_resources = set_intersect($positive_resources, $negative->{resources});
+ next if %$common_resources < 2;
+
+ push @conflicts, [$positiveid, $negativeid];
+ }
+ }
+
+ @conflicts = sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] } @conflicts;
+ return \@conflicts;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_inter_resource_affinity_rules_consistency(
+ $args->{positive_rules},
+ $args->{negative_rules},
+ );
+ },
+ sub {
+ my ($conflicts, $errors) = @_;
+
+ for my $conflict (@$conflicts) {
+ my ($positiveid, $negativeid) = @$conflict;
+
+ push @{ $errors->{$positiveid}->{resources} },
+ "rule shares two or more resources with '$negativeid'";
+ push @{ $errors->{$negativeid}->{resources} },
+ "rule shares two or more resources with '$positiveid'";
+ }
+ },
+);
+
+=head1 RESOURCE AFFINITY RULE CANONICALIZATION HELPERS
+
+=cut
+
+my $sort_by_lowest_resource_id = sub {
+ my ($rules) = @_;
+
+ my $lowest_rule_resource_id = {};
+ for my $ruleid (keys %$rules) {
+ my @rule_resources = sort keys $rules->{$ruleid}->{resources}->%*;
+ $lowest_rule_resource_id->{$ruleid} = $rule_resources[0];
+ }
+
+ # sort rules such that rules with the lowest numbered resource come first
+ my @sorted_ruleids = sort {
+ $lowest_rule_resource_id->{$a} cmp $lowest_rule_resource_id->{$b}
+ } sort keys %$rules;
+
+ return @sorted_ruleids;
+};
+
+# returns a list of hashes, which contain disjoint resource affinity rules, i.e.,
+# put resource affinity constraints on disjoint sets of resources
+my $find_disjoint_resource_affinity_rules = sub {
+ my ($rules) = @_;
+
+ my @disjoint_rules = ();
+
+ # order needed so that it is easier to check whether there is an overlap
+ my @sorted_ruleids = $sort_by_lowest_resource_id->($rules);
+
+ for my $ruleid (@sorted_ruleids) {
+ my $rule = $rules->{$ruleid};
+
+ my $found = 0;
+ for my $entry (@disjoint_rules) {
+ next if sets_are_disjoint($rule->{resources}, $entry->{resources});
+
+ $found = 1;
+ push @{ $entry->{ruleids} }, $ruleid;
+ $entry->{resources}->{$_} = 1 for keys $rule->{resources}->%*;
+
+ last;
+ }
+ if (!$found) {
+ push @disjoint_rules,
+ {
+ ruleids => [$ruleid],
+ resources => { $rule->{resources}->%* },
+ };
+ }
+ }
+
+ return @disjoint_rules;
+};
+
+=head3 merge_connected_positive_resource_affinity_rules($rules, $positive_rules)
+
+Modifies C<$rules> to contain only disjoint positive resource affinity rules
+among the ones defined in C<$positive_rules>, i.e., all positive resource
+affinity rules put positive resource affinity constraints on disjoint sets of
+resources.
+
+If two or more positive resource affinity rules have overlapping resource sets,
+then these will be removed from C<$rules> and a new positive resource affinity
+rule, where the rule id is the dashed concatenation of the rule ids
+(e.g. C<'$rule1-$rule2'>), is inserted in C<$rules>.
+
+This makes it cheaper to find the resources, which are in positive affinity with
+a resource, in C<$rules> at a later point in time.
+
+=cut
+
+sub merge_connected_positive_resource_affinity_rules {
+ my ($rules, $positive_rules) = @_;
+
+ my @disjoint_positive_rules = $find_disjoint_resource_affinity_rules->($positive_rules);
+
+ for my $entry (@disjoint_positive_rules) {
+ next if @{ $entry->{ruleids} } < 2;
+
+ my $new_ruleid = '_merged-' . join('-', @{ $entry->{ruleids} });
+ my $first_ruleid = @{ $entry->{ruleids} }[0];
+
+ $rules->{ids}->{$new_ruleid} = {
+ type => 'resource-affinity',
+ affinity => 'positive',
+ resources => $entry->{resources},
+ };
+ $rules->{order}->{$new_ruleid} = $rules->{order}->{$first_ruleid};
+
+ for my $ruleid (@{ $entry->{ruleids} }) {
+ delete $rules->{ids}->{$ruleid};
+ delete $rules->{order}->{$ruleid};
+ }
+ }
+}
+
+# retrieve the existing negative resource affinity relationships for any of the
+# $resources in the $negative_rules; returns a hash map, where the keys are the
+# resources to be separated from and the values are subsets of the $resources
+my $get_negative_resource_affinity_for_resources = sub {
+ my ($negative_rules, $resources) = @_;
+
+ my $separated_from = {};
+
+ while (my ($negativeid, $negative_rule) = each %$negative_rules) {
+ # assuming that there is at most one $sid in a $negative_rule, because
+ # these are removed by the inter-resource-affinity checker before
+ for my $sid (keys %$resources) {
+ next if !$negative_rule->{resources}->{$sid};
+
+ for my $csid (keys $negative_rule->{resources}->%*) {
+ $separated_from->{$csid}->{$sid} = 1 if $csid ne $sid;
+ }
+ }
+ }
+
+ return $separated_from;
+};
+
+=head3 create_implicit_negative_resource_affinity_rules($rules, $positive_rules, $negative_rules)
+
+Modifies C<$rules> to contain the negative resource affinity rules that are
+implied by the negative resource affinity relationships in C<$negative_rules>,
+which the resources of the positive resource affinity rules in
+C<$positive_rules> are part of.
+
+If one or more resources in a positive resource affinity rule are also in a
+negative resource affinity rule, then each of the remaining resources in the
+positive resource affinity rule is put in the same relationship by its own
+negative resource affinity rule in C<$rules>.
+
+This helper assumes that 1) the resource sets of the positive resource affinity
+rules are disjoint from each other (i.e. connected ones have already been
+merged), and 2) a positive and a negative resource affinity rule cannot share
+two or more resources (i.e. such rules have been removed beforehand).
+
+For example, if two resources A and B must be kept together, but resource A must
+be kept apart from resource C and resource B must be kept apart from resource D,
+then the inferred rules will be a negative resource affinity between A and D
+and a negative resource affinity between B and C.
+
+This makes it cheaper to infer these implicit constraints later instead of
+propagating that information in each scheduler invocation.
+
+=cut
+
+sub create_implicit_negative_resource_affinity_rules {
+ my ($rules, $positive_rules, $negative_rules) = @_;
+
+ my @conflicts = ();
+
+ while (my ($positiveid, $positive_rule) = each %$positive_rules) {
+ my $positive_resources = $positive_rule->{resources};
+
+ # assuming that every positive rule's resource set is disjoint from the others
+ my $separated_from =
+ $get_negative_resource_affinity_for_resources->($negative_rules, $positive_resources);
+
+ for my $csid (keys %$separated_from) {
+ for my $sid (keys %$positive_resources) {
+ next if $separated_from->{$csid}->{$sid};
+
+ my $new_ruleid = "_implicit-negative-$positiveid-$sid-$csid";
+ my $new_negative_resources = {
+ $sid => 1,
+ $csid => 1,
+ };
+
+ $rules->{ids}->{$new_ruleid} = {
+ type => 'resource-affinity',
+ affinity => 'negative',
+ resources => $new_negative_resources,
+ };
+ $rules->{order}->{$new_ruleid} = PVE::HA::Rules::get_next_ordinal($rules);
+ }
+ }
+ }
+}
+
+sub plugin_canonicalize {
+ my ($class, $rules) = @_;
+
+ my $args = $class->get_plugin_check_arguments($rules);
+
+ merge_connected_positive_resource_affinity_rules($rules, $args->{positive_rules});
+
+ $args = $class->get_plugin_check_arguments($rules);
+
+ # must come after merging connected positive rules, because of this helper's
+ # assumptions about resource sets and inter-resource affinity consistency
+ create_implicit_negative_resource_affinity_rules(
+ $rules,
+ $args->{positive_rules},
+ $args->{negative_rules},
+ );
+}
+
+1;
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index 446071d..19d5bc0 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -12,6 +12,7 @@ use PVE::HA::Env;
use PVE::HA::Resources;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity;
+use PVE::HA::Rules::ResourceAffinity;
use PVE::HA::Sim::Resources::VirtVM;
use PVE::HA::Sim::Resources::VirtCT;
use PVE::HA::Sim::Resources::VirtFail;
@@ -23,6 +24,7 @@ PVE::HA::Sim::Resources::VirtFail->register();
PVE::HA::Resources->init();
PVE::HA::Rules::NodeAffinity->register();
+PVE::HA::Rules::ResourceAffinity->register();
PVE::HA::Rules->init(property_isolation => 1);
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 04/13] rules: add global checks between node and resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (2 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 03/13] rules: introduce resource affinity rule plugin Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 05/13] usage: add information about a service's assigned nodes Daniel Kral
` (14 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add checks which determine infeasible resource affinity rules, i.e.
rules whose resources are already restricted by their node affinity
rules in such a way that they cannot be satisfied, or cannot reasonably
be proven to be satisfiable.
Node affinity rules by their nature restrict resources to certain nodes.
For a positive resource affinity rule to be feasible, its resources need
at least one common node, and for a negative resource affinity rule, the
nodes its resources are restricted to must in total be at least as many
as the restricted resources.
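For illustration, a minimal sketch with hypothetical rule ids, resources
and nodes, exercising the positive-affinity consistency check added
below:
    use PVE::HA::Rules;

    my $positive_rules = {
        'keep-together' => { resources => { 'vm:101' => 1, 'vm:102' => 1 } },
    };
    my $node_affinity_rules = {
        'pin-101' => {
            resources => { 'vm:101' => 1 },
            nodes => { node1 => { priority => 0 } },
        },
        'pin-102' => {
            resources => { 'vm:102' => 1 },
            nodes => { node2 => { priority => 0 } },
        },
    };

    # vm:101 and vm:102 can never share a node, so the positive rule is infeasible
    my $errors = PVE::HA::Rules::check_positive_resource_affinity_node_affinity_consistency(
        $positive_rules, $node_affinity_rules,
    );
    # $errors is now ['keep-together']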
Since node affinity rules allow nodes to be put in priority groups, and
which priority group is relevant depends on the currently online nodes,
these checks currently prohibit resource affinity rules whose resources
make use of such node affinity rules.
Even though node affinity rules are currently restricted to allow a
resource in only a single node affinity rule, the checks here still go
over all node affinity rules, as this restriction is bound to be changed
in the future.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Rules.pm | 194 +++++++++++++++++++++++++++
src/PVE/HA/Rules/ResourceAffinity.pm | 3 +-
2 files changed, 196 insertions(+), 1 deletion(-)
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
index 3121424..892e7aa 100644
--- a/src/PVE/HA/Rules.pm
+++ b/src/PVE/HA/Rules.pm
@@ -6,6 +6,7 @@ use warnings;
use PVE::JSONSchema qw(get_standard_option);
use PVE::Tools;
+use PVE::HA::HashTools qw(set_intersect set_union sets_are_disjoint);
use PVE::HA::Tools;
use base qw(PVE::SectionConfig);
@@ -476,4 +477,197 @@ sub get_next_ordinal : prototype($) {
return $current_order + 1;
}
+=head1 INTER-PLUGIN RULE CHECKERS
+
+=cut
+
+=head3 check_single_priority_node_affinity_in_resource_affinity_rules(...)
+
+Returns a list of resource affinity rule ids, defined in
+C<$resource_affinity_rules>, where the resources in the resource affinity rule
+are in node affinity rules, defined in C<$node_affinity_rules>, which have
+multiple priority groups defined.
+
+That is, the resource affinity rule cannot be statically checked to be feasible
+as the selection of the priority group is dependent on the currently online
+nodes.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_single_priority_node_affinity_in_resource_affinity_rules {
+ my ($resource_affinity_rules, $node_affinity_rules) = @_;
+
+ my @errors = ();
+
+ while (my ($resource_affinity_id, $resource_affinity_rule) = each %$resource_affinity_rules) {
+ my $priority;
+ my $resources = $resource_affinity_rule->{resources};
+
+ for my $node_affinity_id (keys %$node_affinity_rules) {
+ my $node_affinity_rule = $node_affinity_rules->{$node_affinity_id};
+
+ next if sets_are_disjoint($resources, $node_affinity_rule->{resources});
+
+ for my $node (values %{ $node_affinity_rule->{nodes} }) {
+ $priority = $node->{priority} if !defined($priority);
+
+ if ($priority != $node->{priority}) {
+ push @errors, $resource_affinity_id;
+ last; # early return to check next resource affinity rule
+ }
+ }
+ }
+ }
+
+ @errors = sort @errors;
+ return \@errors;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_single_priority_node_affinity_in_resource_affinity_rules(
+ $args->{resource_affinity_rules},
+ $args->{node_affinity_rules},
+ );
+ },
+ sub {
+ my ($ruleids, $errors) = @_;
+
+ for my $ruleid (@$ruleids) {
+ push @{ $errors->{$ruleid}->{resources} },
+ "resources are in node affinity rules with multiple priorities";
+ }
+ },
+);
+
+=head3 check_positive_resource_affinity_node_affinity_consistency(...)
+
+Returns a list of positive resource affinity rule ids, defined in
+C<$positive_rules>, where the resources in the positive resource affinity rule
+are restricted to a disjoint set of nodes by their node affinity rules, defined
+in C<$node_affinity_rules>.
+
+That is, the positive resource affinity rule cannot be fulfilled as the
+resources cannot be placed on the same node.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_positive_resource_affinity_node_affinity_consistency {
+ my ($positive_rules, $node_affinity_rules) = @_;
+
+ my @errors = ();
+
+ while (my ($positiveid, $positive_rule) = each %$positive_rules) {
+ my $allowed_nodes;
+ my $resources = $positive_rule->{resources};
+
+ for my $node_affinity_id (keys %$node_affinity_rules) {
+ my ($node_affinity_resources, $node_affinity_nodes) =
+ $node_affinity_rules->{$node_affinity_id}->@{qw(resources nodes)};
+
+ next if sets_are_disjoint($resources, $node_affinity_resources);
+
+ $allowed_nodes = { $node_affinity_nodes->%* } if !defined($allowed_nodes);
+ $allowed_nodes = set_intersect($allowed_nodes, $node_affinity_nodes);
+
+ if (keys %$allowed_nodes < 1) {
+ push @errors, $positiveid;
+ last; # early return to check next positive resource affinity rule
+ }
+ }
+ }
+
+ @errors = sort @errors;
+ return \@errors;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_positive_resource_affinity_node_affinity_consistency(
+ $args->{positive_rules},
+ $args->{node_affinity_rules},
+ );
+ },
+ sub {
+ my ($ruleids, $errors) = @_;
+
+ for my $ruleid (@$ruleids) {
+ push @{ $errors->{$ruleid}->{resources} },
+ "two or more resources are restricted to different nodes";
+ }
+ },
+);
+
+=head3 check_negative_resource_affinity_node_affinity_consistency(...)
+
+Returns a list of negative resource affinity rule ids, defined in
+C<$negative_rules>, where the resources in the negative resource affinity rule
+are restricted to fewer nodes than needed to keep them separate by their node
+affinity rules, defined in C<$node_affinity_rules>.
+
+That is, the negative resource affinity rule cannot be fulfilled as there are
+not enough nodes to spread the resources on.
+
+If there are none, the returned list is empty.
+
+=cut
+
+sub check_negative_resource_affinity_node_affinity_consistency {
+ my ($negative_rules, $node_affinity_rules) = @_;
+
+ my @errors = ();
+
+ while (my ($negativeid, $negative_rule) = each %$negative_rules) {
+ my $allowed_nodes = {};
+ my $located_resources;
+ my $resources = $negative_rule->{resources};
+
+ for my $node_affinity_id (keys %$node_affinity_rules) {
+ my ($node_affinity_resources, $node_affinity_nodes) =
+ $node_affinity_rules->{$node_affinity_id}->@{qw(resources nodes)};
+ my $common_resources = set_intersect($resources, $node_affinity_resources);
+
+ next if keys %$common_resources < 1;
+
+ $located_resources = set_union($located_resources, $common_resources);
+ $allowed_nodes = set_union($allowed_nodes, $node_affinity_nodes);
+
+ if (keys %$allowed_nodes < keys %$located_resources) {
+ push @errors, $negativeid;
+ last; # early return to check next negative resource affinity rule
+ }
+ }
+ }
+
+ @errors = sort @errors;
+ return \@errors;
+}
+
+__PACKAGE__->register_check(
+ sub {
+ my ($args) = @_;
+
+ return check_negative_resource_affinity_node_affinity_consistency(
+ $args->{negative_rules},
+ $args->{node_affinity_rules},
+ );
+ },
+ sub {
+ my ($ruleids, $errors) = @_;
+
+ for my $ruleid (@$ruleids) {
+ push @{ $errors->{$ruleid}->{resources} },
+ "two or more resources are restricted to less nodes than available to the resources";
+ }
+ },
+);
+
1;
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 57ccc09..b024c93 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -167,7 +167,8 @@ __PACKAGE__->register_check(
my ($args) = @_;
return check_negative_resource_affinity_resources_count(
- $args->{negative_rules}, $args->{nodes},
+ $args->{negative_rules},
+ $args->{nodes},
);
},
sub {
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 05/13] usage: add information about a service's assigned nodes
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (3 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 04/13] rules: add global checks between node and resource affinity rules Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 06/13] manager: apply resource affinity rules when selecting service nodes Daniel Kral
` (13 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
This will be used to retrieve the nodes which a service currently puts
load on and uses resources of, when dealing with HA resource affinity
rules in select_service_node(...).
For example, a migrating service A in a negative resource affinity with
services B and C needs to block services B and C from migrating to both
its source and its target node.
This is implemented here because the service's usage of the nodes is
currently best encoded in recompute_online_node_usage(...) and the other
call sites of add_service_usage_to_node(...).
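For illustration, a minimal sketch of the new per-service node tracking,
assuming a hypothetical service id and an already constructed $haenv:
    my $usage = PVE::HA::Usage::Basic->new($haenv);

    $usage->set_service_node('vm:101', 'node1');     # service is running on node1
    $usage->add_service_node('vm:101', 'node2');     # migration also puts load on node2

    my $nodes = $usage->get_service_nodes('vm:101'); # ['node1', 'node2']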
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Manager.pm | 16 ++++++++++++----
src/PVE/HA/Usage.pm | 18 ++++++++++++++++++
src/PVE/HA/Usage/Basic.pm | 19 +++++++++++++++++++
src/PVE/HA/Usage/Static.pm | 19 +++++++++++++++++++
4 files changed, 68 insertions(+), 4 deletions(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 52097cf..b536c0f 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -268,6 +268,7 @@ sub recompute_online_node_usage {
|| $state eq 'recovery'
) {
$online_node_usage->add_service_usage_to_node($sd->{node}, $sid, $sd->{node});
+ $online_node_usage->set_service_node($sid, $sd->{node});
} elsif (
$state eq 'migrate'
|| $state eq 'relocate'
@@ -275,10 +276,14 @@ sub recompute_online_node_usage {
) {
my $source = $sd->{node};
# count it for both, source and target as load is put on both
- $online_node_usage->add_service_usage_to_node($source, $sid, $source, $target)
- if $state ne 'request_start_balance';
- $online_node_usage->add_service_usage_to_node($target, $sid, $source, $target)
- if $online_node_usage->contains_node($target);
+ if ($state ne 'request_start_balance') {
+ $online_node_usage->add_service_usage_to_node($source, $sid, $source, $target);
+ $online_node_usage->add_service_node($sid, $source);
+ }
+ if ($online_node_usage->contains_node($target)) {
+ $online_node_usage->add_service_usage_to_node($target, $sid, $source, $target);
+ $online_node_usage->add_service_node($sid, $target);
+ }
} elsif ($state eq 'stopped' || $state eq 'request_start') {
# do nothing
} else {
@@ -290,6 +295,7 @@ sub recompute_online_node_usage {
# case a node dies, as we cannot really know if the to-be-aborted incoming migration
# has already cleaned up all used resources
$online_node_usage->add_service_usage_to_node($target, $sid, $sd->{node}, $target);
+ $online_node_usage->set_service_node($sid, $target);
}
}
}
@@ -1065,6 +1071,7 @@ sub next_state_started {
if ($node && ($sd->{node} ne $node)) {
$self->{online_node_usage}->add_service_usage_to_node($node, $sid, $sd->{node});
+ $self->{online_node_usage}->add_service_node($sid, $node);
if (defined(my $fallback = $sd->{maintenance_node})) {
if ($node eq $fallback) {
@@ -1193,6 +1200,7 @@ sub next_state_recovery {
$haenv->steal_service($sid, $sd->{node}, $recovery_node);
$self->{online_node_usage}->add_service_usage_to_node($recovery_node, $sid, $recovery_node);
+ $self->{online_node_usage}->add_service_node($sid, $recovery_node);
# NOTE: $sd *is normally read-only*, fencing is the exception
$cd->{node} = $sd->{node} = $recovery_node;
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 66d9572..7f4d9ca 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -27,6 +27,24 @@ sub list_nodes {
die "implement in subclass";
}
+sub get_service_nodes {
+ my ($self, $sid) = @_;
+
+ die "implement in subclass";
+}
+
+sub set_service_node {
+ my ($self, $sid, $nodename) = @_;
+
+ die "implement in subclass";
+}
+
+sub add_service_node {
+ my ($self, $sid, $nodename) = @_;
+
+ die "implement in subclass";
+}
+
sub contains_node {
my ($self, $nodename) = @_;
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index ead08c5..afe3733 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -11,6 +11,7 @@ sub new {
return bless {
nodes => {},
haenv => $haenv,
+ 'service-nodes' => {},
}, $class;
}
@@ -38,6 +39,24 @@ sub contains_node {
return defined($self->{nodes}->{$nodename});
}
+sub get_service_nodes {
+ my ($self, $sid) = @_;
+
+ return $self->{'service-nodes'}->{$sid};
+}
+
+sub set_service_node {
+ my ($self, $sid, $nodename) = @_;
+
+ $self->{'service-nodes'}->{$sid} = [$nodename];
+}
+
+sub add_service_node {
+ my ($self, $sid, $nodename) = @_;
+
+ push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
+}
+
sub add_service_usage_to_node {
my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 061e74a..6707a54 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -22,6 +22,7 @@ sub new {
'service-stats' => {},
haenv => $haenv,
scheduler => $scheduler,
+ 'service-nodes' => {},
'service-counts' => {}, # Service count on each node. Fallback if scoring calculation fails.
}, $class;
}
@@ -86,6 +87,24 @@ my sub get_service_usage {
return $service_stats;
}
+sub get_service_nodes {
+ my ($self, $sid) = @_;
+
+ return $self->{'service-nodes'}->{$sid};
+}
+
+sub set_service_node {
+ my ($self, $sid, $nodename) = @_;
+
+ $self->{'service-nodes'}->{$sid} = [$nodename];
+}
+
+sub add_service_node {
+ my ($self, $sid, $nodename) = @_;
+
+ push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
+}
+
sub add_service_usage_to_node {
my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 06/13] manager: apply resource affinity rules when selecting service nodes
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (4 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 05/13] usage: add information about a service's assigned nodes Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 07/13] manager: handle resource affinity rules in manual migrations Daniel Kral
` (12 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add a mechanism to the node selection subroutine which enforces the
resource affinity rules defined in the rules config.
The algorithm makes in-place changes to the set of candidate nodes so
that the final set contains only the nodes which the resource affinity
rules allow the HA resource to run on, depending on the affinity type of
the resource affinity rules.
The HA resource's failback property also slightly changes meaning,
because it now also controls how the HA Manager chooses nodes for a HA
resource with resource affinity rules, not only node affinity rules.
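For illustration, a minimal sketch with hypothetical nodes of how the
exported helpers prune the candidate node set, as select_service_node(...)
now does:
    use PVE::HA::Rules::ResourceAffinity
        qw(apply_positive_resource_affinity apply_negative_resource_affinity);

    my $pri_nodes = { node1 => 1, node2 => 1, node3 => 1 };
    my $together = { node2 => 2 };  # two positively affine resources already run on node2
    my $separate = { node3 => 1 };  # a negatively affine resource runs on node3

    apply_positive_resource_affinity($together, $pri_nodes); # keeps only node2
    apply_negative_resource_affinity($separate, $pri_nodes); # removes node3 if still present
    # $pri_nodes is now { node2 => 1 }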
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Resources.pm | 3 +-
src/PVE/API2/HA/Status.pm | 4 +-
src/PVE/HA/Manager.pm | 11 +-
src/PVE/HA/Resources.pm | 3 +-
src/PVE/HA/Rules/ResourceAffinity.pm | 150 +++++++++++++++++++++++++++
5 files changed, 167 insertions(+), 4 deletions(-)
diff --git a/src/PVE/API2/HA/Resources.pm b/src/PVE/API2/HA/Resources.pm
index e06d202..6ead5f0 100644
--- a/src/PVE/API2/HA/Resources.pm
+++ b/src/PVE/API2/HA/Resources.pm
@@ -131,7 +131,8 @@ __PACKAGE__->register_method({
description => "HA resource is automatically migrated to the"
. " node with the highest priority according to their node"
. " affinity rule, if a node with a higher priority than"
- . " the current node comes online.",
+ . " the current node comes online, or migrated to the node,"
+ . " which doesn\'t violate any resource affinity rule.",
type => 'boolean',
optional => 1,
default => 1,
diff --git a/src/PVE/API2/HA/Status.pm b/src/PVE/API2/HA/Status.pm
index 4038766..d831650 100644
--- a/src/PVE/API2/HA/Status.pm
+++ b/src/PVE/API2/HA/Status.pm
@@ -113,7 +113,9 @@ __PACKAGE__->register_method({
description => "HA resource is automatically migrated to"
. " the node with the highest priority according to their"
. " node affinity rule, if a node with a higher priority"
- . " than the current node comes online.",
+ . " than the current node comes online, or migrate to"
+ . " the node, which doesn\'t violate any resource"
+ . " affinity rule.",
type => "boolean",
optional => 1,
default => 1,
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index b536c0f..06d83cd 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -11,7 +11,8 @@ use PVE::HA::Tools ':exit_codes';
use PVE::HA::NodeStatus;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
-use PVE::HA::Rules::ResourceAffinity;
+use PVE::HA::Rules::ResourceAffinity
+ qw(get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
use PVE::HA::Usage::Basic;
use PVE::HA::Usage::Static;
@@ -151,11 +152,16 @@ sub select_service_node {
return undef if !%$pri_nodes;
+ my ($together, $separate) = get_resource_affinity($rules, $sid, $online_node_usage);
+
# stay on current node if possible (avoids random migrations)
if (
$node_preference eq 'none'
&& !$service_conf->{failback}
&& $allowed_nodes->{$current_node}
+ && PVE::HA::Rules::ResourceAffinity::is_allowed_on_node(
+ $together, $separate, $current_node,
+ )
) {
return $current_node;
}
@@ -167,6 +173,9 @@ sub select_service_node {
}
}
+ apply_positive_resource_affinity($together, $pri_nodes);
+ apply_negative_resource_affinity($separate, $pri_nodes);
+
return $maintenance_fallback
if defined($maintenance_fallback) && $pri_nodes->{$maintenance_fallback};
diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index b6d4a73..fbb0685 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -66,7 +66,8 @@ EODESC
description => "Automatically migrate HA resource to the node with"
. " the highest priority according to their node affinity "
. " rules, if a node with a higher priority than the current"
- . " node comes online.",
+ . " node comes online, or migrate to the node, which doesn\'t"
+ . " violate any resource affinity rule.",
type => 'boolean',
optional => 1,
default => 1,
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index b024c93..965b9a1 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -6,8 +6,15 @@ use warnings;
use PVE::HA::HashTools qw(set_intersect sets_are_disjoint);
use PVE::HA::Rules;
+use base qw(Exporter);
use base qw(PVE::HA::Rules);
+our @EXPORT_OK = qw(
+ get_resource_affinity
+ apply_positive_resource_affinity
+ apply_negative_resource_affinity
+);
+
=head1 NAME
PVE::HA::Rules::ResourceAffinity - Resource Affinity Plugin for HA Rules
@@ -436,4 +443,147 @@ sub plugin_canonicalize {
);
}
+=head1 RESOURCE AFFINITY RULE HELPERS
+
+=cut
+
+=head3 get_resource_affinity($rules, $sid, $online_node_usage)
+
+Returns a list of two hashes, where the first describes the positive resource
+affinity and the second hash describes the negative resource affinity for
+resource C<$sid> according to the resource affinity rules in C<$rules> and the
+resource locations in C<$online_node_usage>.
+
+For the positive resource affinity of a resource C<$sid>, each element in the
+hash represents an online node, where other resources, which C<$sid> is in
+positive affinity with, are already running, and how many of them. That is,
+each element represents a node, where the resource must be.
+
+For the negative resource affinity of a resource C<$sid>, each element in the
+hash represents an online node, where other resources, which C<$sid> is in
+negative affinity with, are already running. That is, each element represents
+a node, where the resource must not be.
+
+For example, if there are already three resources running, which the resource
+C<$sid> is in a positive affinity with, and two running resources, which the
+resource C<$sid> is in a negative affinity with, the returned value will be:
+
+ {
+ together => {
+ node2 => 3
+ },
+ separate => {
+ node1 => 1,
+ node3 => 1
+ }
+ }
+
+=cut
+
+sub get_resource_affinity : prototype($$$) {
+ my ($rules, $sid, $online_node_usage) = @_;
+
+ my $together = {};
+ my $separate = {};
+
+ PVE::HA::Rules::foreach_rule(
+ $rules,
+ sub {
+ my ($rule) = @_;
+
+ for my $csid (keys %{ $rule->{resources} }) {
+ next if $csid eq $sid;
+
+ my $nodes = $online_node_usage->get_service_nodes($csid);
+
+ next if !$nodes || !@$nodes; # skip resources not yet assigned to a node
+
+ if ($rule->{affinity} eq 'positive') {
+ $together->{$_}++ for @$nodes;
+ } elsif ($rule->{affinity} eq 'negative') {
+ $separate->{$_} = 1 for @$nodes;
+ } else {
+ die "unimplemented resource affinity type $rule->{affinity}\n";
+ }
+ }
+ },
+ {
+ sid => $sid,
+ type => 'resource-affinity',
+ exclude_disabled_rules => 1,
+ },
+ );
+
+ return ($together, $separate);
+}
+
+=head3 is_allowed_on_node($together, $separate, $node)
+
+Checks whether the resource is allowed to run on C<$node> according to the
+resource affinity hashes C<$together> and C<$separate>, i.e. whether C<$node>
+is a node the resource must be kept on, or at least not a node the resource
+must be kept away from.
+
+=cut
+
+sub is_allowed_on_node : prototype($$$) {
+ my ($together, $separate, $node) = @_;
+
+ return $together->{$node} || !$separate->{$node};
+}
+
+=head3 apply_positive_resource_affinity($together, $allowed_nodes)
+
+Applies the positive resource affinity C<$together> on the allowed node hash set
+C<$allowed_nodes> by modifying it directly.
+
+Positive resource affinity means keeping resources together on a single node and
+therefore minimizing the separation of resources.
+
+The allowed node hash set C<$allowed_nodes> is expected to contain all nodes,
+which are available to the resource this helper is called for, i.e. each node
+is currently online, available according to other location constraints, and the
+resource has not failed running there yet.
+
+=cut
+
+sub apply_positive_resource_affinity : prototype($$) {
+ my ($together, $allowed_nodes) = @_;
+
+ my @possible_nodes = sort keys $together->%*
+ or return; # nothing to do if there is no positive resource affinity
+
+ # select the most populated node from a positive resource affinity
+ @possible_nodes = sort { $together->{$b} <=> $together->{$a} } @possible_nodes;
+ my $majority_node = $possible_nodes[0];
+
+ for my $node (keys %$allowed_nodes) {
+ delete $allowed_nodes->{$node} if $node ne $majority_node;
+ }
+}
+
+=head3 apply_negative_resource_affinity($separate, $allowed_nodes)
+
+Applies the negative resource affinity C<$separate> on the allowed node hash set
+C<$allowed_nodes> by modifying it directly.
+
+Negative resource affinity means keeping resources separate on multiple nodes
+and therefore maximizing the separation of resources.
+
+The allowed node hash set C<$allowed_nodes> is expected to contain all nodes,
+which are available to the resource this helper is called for, i.e. each node
+is currently online, available according to other location constraints, and the
+resource has not failed running there yet.
+
+=cut
+
+sub apply_negative_resource_affinity : prototype($$) {
+ my ($separate, $allowed_nodes) = @_;
+
+ my $forbidden_nodes = { $separate->%* };
+
+ for my $node (keys %$forbidden_nodes) {
+ delete $allowed_nodes->{$node};
+ }
+}
+
1;
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 07/13] manager: handle resource affinity rules in manual migrations
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (5 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 06/13] manager: apply resource affinity rules when selecting service nodes Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 08/13] sim: resources: add option to limit start and migrate tries to node Daniel Kral
` (11 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Make any manual user migration of a resource follow the resource
affinity rules it is part of. That is:
- prevent a resource from being manually migrated to a node which
contains a resource that it must be kept separate from (negative
resource affinity), and
- make resources which must be kept together (positive resource
affinity) migrate to the same target node.
The log information here only goes to the HA Manager node's syslog, so
user-facing endpoints need to implement this logic as well to give users
adequate feedback about these actions.
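For illustration, a minimal sketch with hypothetical resource ids of the
relationships consulted before carrying out a manual migration, assuming
the parsed rules config in $rules:
    use PVE::HA::Rules::ResourceAffinity qw(get_affinitive_resources);

    my ($together, $separate) = get_affinitive_resources($rules, 'vm:101');
    # e.g. $together = { 'vm:102' => 1 } -> vm:102 gets the same migrate command
    # e.g. $separate = { 'vm:103' => 1 } -> a target node running vm:103 is rejected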
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Manager.pm | 46 ++++++++++++++++++++++--
src/PVE/HA/Rules/ResourceAffinity.pm | 53 ++++++++++++++++++++++++++++
2 files changed, 96 insertions(+), 3 deletions(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 06d83cd..fc0c116 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -12,7 +12,7 @@ use PVE::HA::NodeStatus;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
use PVE::HA::Rules::ResourceAffinity
- qw(get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
+ qw(get_affinitive_resources get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
use PVE::HA::Usage::Basic;
use PVE::HA::Usage::Static;
@@ -409,6 +409,47 @@ sub read_lrm_status {
return ($results, $modes);
}
+sub execute_migration {
+ my ($self, $cmd, $task, $sid, $target) = @_;
+
+ my ($haenv, $ss) = $self->@{qw(haenv ss)};
+
+ my ($together, $separate) = get_affinitive_resources($self->{rules}, $sid);
+
+ for my $csid (sort keys %$separate) {
+ next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $target;
+ next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $target;
+
+ $haenv->log(
+ 'err',
+ "crm command '$cmd' error - service '$csid' on node '$target' in"
+ . " negative affinity with service '$sid'",
+ );
+
+ return; # one negative resource affinity is enough to not execute migration
+ }
+
+ $haenv->log('info', "got crm command: $cmd");
+ $ss->{$sid}->{cmd} = [$task, $target];
+
+ my $resources_to_migrate = [];
+ for my $csid (sort keys %$together) {
+ next if $ss->{$csid}->{node} && $ss->{$csid}->{node} eq $target;
+ next if $ss->{$csid}->{target} && $ss->{$csid}->{target} eq $target;
+
+ push @$resources_to_migrate, $csid;
+ }
+
+ for my $csid (@$resources_to_migrate) {
+ $haenv->log(
+ 'info',
+ "crm command '$cmd' - $task service '$csid' to node '$target'"
+ . " (service '$csid' in positive affinity with service '$sid')",
+ );
+ $ss->{$csid}->{cmd} = [$task, $target];
+ }
+}
+
# read new crm commands and save them into crm master status
sub update_crm_commands {
my ($self) = @_;
@@ -432,8 +473,7 @@ sub update_crm_commands {
"ignore crm command - service already on target node: $cmd",
);
} else {
- $haenv->log('info', "got crm command: $cmd");
- $ss->{$sid}->{cmd} = [$task, $node];
+ $self->execute_migration($cmd, $task, $sid, $node);
}
}
} else {
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 965b9a1..e5a858e 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -10,6 +10,7 @@ use base qw(Exporter);
use base qw(PVE::HA::Rules);
our @EXPORT_OK = qw(
+ get_affinitive_resources
get_resource_affinity
apply_positive_resource_affinity
apply_negative_resource_affinity
@@ -447,6 +448,58 @@ sub plugin_canonicalize {
=cut
+=head3 get_affinitive_resources($rules, $sid)
+
+Returns a list of two hash sets, where the first hash set contains the
+resources, which C<$sid> is positively affinitive to, and the second hash
+contains the resources, which C<$sid> is negatively affinitive to, according to
+the resource affinity rules in C<$rules>.
+
+Note that a resource C<$sid> becomes part of any negative affinity relation
+of its positively affinitive resources.
+
+For example, if a resource is negatively affinitive to C<'vm:101'> and positively
+affinitive to C<'ct:200'> and C<'ct:201'>, the returned value will be:
+
+    {
+        together => {
+            'ct:200' => 1,
+            'ct:201' => 1
+        },
+        separate => {
+            'vm:101' => 1
+        }
+    }
+
+=cut
+
+sub get_affinitive_resources : prototype($$) {
+ my ($rules, $sid) = @_;
+
+ my $together = {};
+ my $separate = {};
+
+ PVE::HA::Rules::foreach_rule(
+ $rules,
+ sub {
+ my ($rule, $ruleid) = @_;
+
+ my $affinity_set = $rule->{affinity} eq 'positive' ? $together : $separate;
+
+ for my $csid (sort keys %{ $rule->{resources} }) {
+ $affinity_set->{$csid} = 1 if $csid ne $sid;
+ }
+ },
+ {
+ sid => $sid,
+ type => 'resource-affinity',
+ exclude_disabled_rules => 1,
+ },
+ );
+
+ return ($together, $separate);
+}
+
=head3 get_resource_affinity($rules, $sid, $online_node_usage)
Returns a list of two hashes, where the first describes the positive resource
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 08/13] sim: resources: add option to limit start and migrate tries to node
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (6 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 07/13] manager: handle resource affinity rules in manual migrations Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 09/13] test: ha tester: add test cases for negative resource affinity rules Daniel Kral
` (10 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add an option to the VirtFail resource's name to allow the start and
migrate fail counts to only apply on a node with a certain number in its
name, following a specific naming scheme.
This allows slightly more elaborate tests, e.g. where a service can
start on one node (or any other, for that matter), but fails to start on
a specific node, which it is expected to start on after a migration.
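For illustration, a minimal sketch decoding a hypothetical service id
with the same regex as VirtFail's $decode_id, showing the new optional
sixth digit:
    my $id = '120013'; # i.e. service 'fa:120013'
    my ($start, $migrate, $stop, $exists, $limit_to_node) =
        $id =~ /^\d(\d)(\d)(\d)?(\d)?(\d)?/g;
    # $start = 2, $migrate = 0, $stop = 0, $exists = 1, $limit_to_node = 3,
    # i.e. the configured start failures only apply when starting on node3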
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Sim/Resources/VirtFail.pm | 29 +++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/src/PVE/HA/Sim/Resources/VirtFail.pm b/src/PVE/HA/Sim/Resources/VirtFail.pm
index 3b476e1..13b72dc 100644
--- a/src/PVE/HA/Sim/Resources/VirtFail.pm
+++ b/src/PVE/HA/Sim/Resources/VirtFail.pm
@@ -10,25 +10,28 @@ use base qw(PVE::HA::Sim::Resources);
# To make it more interesting we can encode some behavior in the VMID
# with the following format, where fa: is the type and a, b, c, ...
# are digits in base 10, i.e. the full service ID would be:
-# fa:abcde
+# fa:abcdef
# And the digits after the fa: type prefix would mean:
# - a: no meaning but can be used for differentiating similar resources
# - b: how many tries are needed to start correctly (0 is normal behavior) (should be set)
# - c: how many tries are needed to migrate correctly (0 is normal behavior) (should be set)
# - d: should shutdown be successful (0 = yes, anything else no) (optional)
# - e: return value of $plugin->exists() defaults to 1 if not set (optional)
+# - f: limits the constraints of b and c to the nodeX (0 = apply to all nodes) (optional)
my $decode_id = sub {
my $id = shift;
- my ($start, $migrate, $stop, $exists) = $id =~ /^\d(\d)(\d)(\d)?(\d)?/g;
+ my ($start, $migrate, $stop, $exists, $limit_to_node) =
+ $id =~ /^\d(\d)(\d)(\d)?(\d)?(\d)?/g;
$start = 0 if !defined($start);
$migrate = 0 if !defined($migrate);
$stop = 0 if !defined($stop);
$exists = 1 if !defined($exists);
+ $limit_to_node = 0 if !defined($limit_to_node);
- return ($start, $migrate, $stop, $exists);
+ return ($start, $migrate, $stop, $exists, $limit_to_node);
};
my $tries = {
@@ -52,12 +55,14 @@ sub exists {
sub start {
my ($class, $haenv, $id) = @_;
- my ($start_failure_count) = &$decode_id($id);
+ my ($start_failure_count, $limit_to_node) = ($decode_id->($id))[0, 4];
- $tries->{start}->{$id} = 0 if !$tries->{start}->{$id};
- $tries->{start}->{$id}++;
+ if ($limit_to_node == 0 || $haenv->nodename() eq "node$limit_to_node") {
+ $tries->{start}->{$id} = 0 if !$tries->{start}->{$id};
+ $tries->{start}->{$id}++;
- return if $start_failure_count >= $tries->{start}->{$id};
+ return if $start_failure_count >= $tries->{start}->{$id};
+ }
$tries->{start}->{$id} = 0; # reset counts
@@ -78,12 +83,14 @@ sub shutdown {
sub migrate {
my ($class, $haenv, $id, $target, $online) = @_;
- my (undef, $migrate_failure_count) = &$decode_id($id);
+ my ($migrate_failure_count, $limit_to_node) = ($decode_id->($id))[1, 4];
- $tries->{migrate}->{$id} = 0 if !$tries->{migrate}->{$id};
- $tries->{migrate}->{$id}++;
+ if ($limit_to_node == 0 || $haenv->nodename() eq "node$limit_to_node") {
+ $tries->{migrate}->{$id} = 0 if !$tries->{migrate}->{$id};
+ $tries->{migrate}->{$id}++;
- return if $migrate_failure_count >= $tries->{migrate}->{$id};
+ return if $migrate_failure_count >= $tries->{migrate}->{$id};
+ }
$tries->{migrate}->{$id} = 0; # reset counts
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 09/13] test: ha tester: add test cases for negative resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (7 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 08/13] sim: resources: add option to limit start and migrate tries to node Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 10/13] test: ha tester: add test cases for positive " Daniel Kral
` (9 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add test cases for strict negative resource affinity rules, i.e. where
resources must be kept on separate nodes. These verify the behavior of
resources in strict negative resource affinity rules when the node of
one or more of these resources fails, in the following scenarios:
1. 2 resources in neg. affinity and a 3 node cluster; 1 node failing
2. 3 resources in neg. affinity and a 5 node cluster; 1 node failing
3. 3 resources in neg. affinity and a 5 node cluster; 2 nodes failing
4. 2 resources in neg. affinity and a 3 node cluster; 1 node failing,
but the recovery node cannot start the resource
5. Pair of 2 neg. resource affinity rules (with one common resource in
both) in a 3 node cluster; 1 node failing
6. 2 resources in neg. affinity and a 3 node cluster; 1 node failing,
but both resources cannot start on the recovery node
7. 2 resources in neg. affinity and a 3 node cluster; 1 resource
manually migrated to another free node; other resources in neg.
affinity with migrated resource cannot be migrated to that resource's
source node during migration
8. 3 resources in neg. affinity and a 3 node cluster; 1 resource
manually migrated to another resource's node fails
The word "strict" describes the current policy of resource affinity
rules and is added in anticipation of a "non-strict" variant in the
future.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
.../README | 13 +++
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 60 ++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 15 +++
.../cmdlist | 4 +
.../hardware_status | 7 ++
.../log.expect | 90 ++++++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 10 ++
.../README | 16 +++
.../cmdlist | 4 +
.../hardware_status | 7 ++
.../log.expect | 110 ++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 10 ++
.../README | 18 +++
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 69 +++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 11 ++
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 56 +++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 5 +
.../README | 18 +++
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 69 +++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 15 +++
.../cmdlist | 5 +
.../hardware_status | 5 +
.../log.expect | 52 +++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 4 +
.../README | 12 ++
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 38 ++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 5 +
56 files changed, 827 insertions(+)
create mode 100644 src/test/test-resource-affinity-strict-negative1/README
create mode 100644 src/test/test-resource-affinity-strict-negative1/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative1/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative1/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative1/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative1/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative1/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative2/README
create mode 100644 src/test/test-resource-affinity-strict-negative2/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative2/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative2/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative2/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative2/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative2/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative3/README
create mode 100644 src/test/test-resource-affinity-strict-negative3/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative3/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative3/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative3/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative3/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative3/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative4/README
create mode 100644 src/test/test-resource-affinity-strict-negative4/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative4/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative4/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative4/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative4/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative4/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative5/README
create mode 100644 src/test/test-resource-affinity-strict-negative5/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative5/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative5/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative5/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative5/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative5/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative6/README
create mode 100644 src/test/test-resource-affinity-strict-negative6/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative6/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative6/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative6/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative6/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative6/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative7/README
create mode 100644 src/test/test-resource-affinity-strict-negative7/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative7/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative7/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative7/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative7/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative7/service_config
create mode 100644 src/test/test-resource-affinity-strict-negative8/README
create mode 100644 src/test/test-resource-affinity-strict-negative8/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-negative8/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-negative8/log.expect
create mode 100644 src/test/test-resource-affinity-strict-negative8/manager_status
create mode 100644 src/test/test-resource-affinity-strict-negative8/rules_config
create mode 100644 src/test/test-resource-affinity-strict-negative8/service_config
diff --git a/src/test/test-resource-affinity-strict-negative1/README b/src/test/test-resource-affinity-strict-negative1/README
new file mode 100644
index 0000000..0f01197
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/README
@@ -0,0 +1,13 @@
+Test whether a strict negative resource affinity rule among two resources makes
+one of the resources migrate to a different recovery node than the other in case
+of a failover of their previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102 must be kept separate
+- vm:101 and vm:102 are currently running on node2 and node3 respectively
+- node1 has a higher resource count than node2 to test the resource affinity rule
+ is applied even though the scheduler would prefer the less utilized node
+
+The expected outcome is:
+- As node3 fails, vm:102 is migrated to node1; even though the utilization of
+ node1 is high already, the resources must be kept separate
diff --git a/src/test/test-resource-affinity-strict-negative1/cmdlist b/src/test/test-resource-affinity-strict-negative1/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative1/hardware_status b/src/test/test-resource-affinity-strict-negative1/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative1/log.expect b/src/test/test-resource-affinity-strict-negative1/log.expect
new file mode 100644
index 0000000..475db39
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/log.expect
@@ -0,0 +1,60 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node2'
+info 20 node1/crm: adding new service 'vm:102' on node 'node3'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:101
+info 23 node2/lrm: service status vm:101 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:102
+info 25 node3/lrm: service status vm:102 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node1)
+info 241 node1/lrm: starting service vm:102
+info 241 node1/lrm: service status vm:102 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative1/manager_status b/src/test/test-resource-affinity-strict-negative1/manager_status
new file mode 100644
index 0000000..0967ef4
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-strict-negative1/rules_config b/src/test/test-resource-affinity-strict-negative1/rules_config
new file mode 100644
index 0000000..2074776
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative1/service_config b/src/test/test-resource-affinity-strict-negative1/service_config
new file mode 100644
index 0000000..6582e8c
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative1/service_config
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "node": "node2", "state": "started" },
+ "vm:102": { "node": "node3", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative2/README b/src/test/test-resource-affinity-strict-negative2/README
new file mode 100644
index 0000000..613be64
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/README
@@ -0,0 +1,15 @@
+Test whether a strict negative resource affinity rule among three resources makes
+one of the resources migrate to a different node than the other resources in case
+of a failover of the resource's previously assigned node.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are on node3, node4, and node5 respectively
+- node1 and node2 have each both higher resource counts than node3, node4 and
+ node5 to test the rule is applied even though the scheduler would prefer the
+ less utilized nodes node3 and node4
+
+The expected outcome is:
+- As node5 fails, vm:103 is migrated to node2; even though the utilization of
+ node2 is high already, the resources must be kept separate; node2 is chosen
+ since node1 has one more resource running on it
diff --git a/src/test/test-resource-affinity-strict-negative2/cmdlist b/src/test/test-resource-affinity-strict-negative2/cmdlist
new file mode 100644
index 0000000..89d09c9
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ],
+ [ "network node5 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative2/hardware_status b/src/test/test-resource-affinity-strict-negative2/hardware_status
new file mode 100644
index 0000000..7b8e961
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/hardware_status
@@ -0,0 +1,7 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" },
+ "node4": { "power": "off", "network": "off" },
+ "node5": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative2/log.expect b/src/test/test-resource-affinity-strict-negative2/log.expect
new file mode 100644
index 0000000..858d3c9
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/log.expect
@@ -0,0 +1,90 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node4 on
+info 20 node4/crm: status change startup => wait_for_quorum
+info 20 node4/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node5 on
+info 20 node5/crm: status change startup => wait_for_quorum
+info 20 node5/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node4'
+info 20 node1/crm: adding new service 'vm:103' on node 'node5'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: adding new service 'vm:105' on node 'node1'
+info 20 node1/crm: adding new service 'vm:106' on node 'node1'
+info 20 node1/crm: adding new service 'vm:107' on node 'node2'
+info 20 node1/crm: adding new service 'vm:108' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node4)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node5)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:108': state changed from 'request_start' to 'started' (node = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 21 node1/lrm: starting service vm:105
+info 21 node1/lrm: service status vm:105 started
+info 21 node1/lrm: starting service vm:106
+info 21 node1/lrm: service status vm:106 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:107
+info 23 node2/lrm: service status vm:107 started
+info 23 node2/lrm: starting service vm:108
+info 23 node2/lrm: service status vm:108 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 26 node4/crm: status change wait_for_quorum => slave
+info 27 node4/lrm: got lock 'ha_agent_node4_lock'
+info 27 node4/lrm: status change wait_for_agent_lock => active
+info 27 node4/lrm: starting service vm:102
+info 27 node4/lrm: service status vm:102 started
+info 28 node5/crm: status change wait_for_quorum => slave
+info 29 node5/lrm: got lock 'ha_agent_node5_lock'
+info 29 node5/lrm: status change wait_for_agent_lock => active
+info 29 node5/lrm: starting service vm:103
+info 29 node5/lrm: service status vm:103 started
+info 120 cmdlist: execute network node5 off
+info 120 node1/crm: node 'node5': state changed from 'online' => 'unknown'
+info 128 node5/crm: status change slave => wait_for_quorum
+info 129 node5/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node5': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node5'
+info 170 watchdog: execute power node5 off
+info 169 node5/crm: killed by poweroff
+info 170 node5/lrm: killed by poweroff
+info 170 hardware: server 'node5' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node5_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node5'
+info 240 node1/crm: node 'node5': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5'
+info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:103' from fenced node 'node5' to node 'node2'
+info 240 node1/crm: service 'vm:103': state changed from 'recovery' to 'started' (node = node2)
+info 243 node2/lrm: starting service vm:103
+info 243 node2/lrm: service status vm:103 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative2/manager_status b/src/test/test-resource-affinity-strict-negative2/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-negative2/rules_config b/src/test/test-resource-affinity-strict-negative2/rules_config
new file mode 100644
index 0000000..44e6a02
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative2/service_config b/src/test/test-resource-affinity-strict-negative2/service_config
new file mode 100644
index 0000000..2c27816
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative2/service_config
@@ -0,0 +1,10 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node4", "state": "started" },
+ "vm:103": { "node": "node5", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" },
+ "vm:105": { "node": "node1", "state": "started" },
+ "vm:106": { "node": "node1", "state": "started" },
+ "vm:107": { "node": "node2", "state": "started" },
+ "vm:108": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative3/README b/src/test/test-resource-affinity-strict-negative3/README
new file mode 100644
index 0000000..a26301a
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/README
@@ -0,0 +1,16 @@
+Test whether a strict negative resource affinity rule among three resources makes
+two of the resources migrate to two different recovery nodes, other than the node
+of the third resource, in case of a failover of their two previously assigned nodes.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are respectively on node3, node4, and node5
+- node1 and node2 both have higher resource counts than node3, node4, and node5
+  to test that the resource affinity rule is enforced even though the
+  utilization would prefer the less utilized node3
+
+The expected outcome is:
+- As node4 and node5 fail, vm:102 and vm:103 are migrated to node2 and node1
+  respectively; even though the utilization of node1 and node2 is high
+ already, the resources must be kept separate; node2 is chosen first since
+ node1 has one more resource running on it
diff --git a/src/test/test-resource-affinity-strict-negative3/cmdlist b/src/test/test-resource-affinity-strict-negative3/cmdlist
new file mode 100644
index 0000000..1934596
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ],
+ [ "network node4 off", "network node5 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative3/hardware_status b/src/test/test-resource-affinity-strict-negative3/hardware_status
new file mode 100644
index 0000000..7b8e961
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/hardware_status
@@ -0,0 +1,7 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" },
+ "node4": { "power": "off", "network": "off" },
+ "node5": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative3/log.expect b/src/test/test-resource-affinity-strict-negative3/log.expect
new file mode 100644
index 0000000..4acdcec
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/log.expect
@@ -0,0 +1,110 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node4 on
+info 20 node4/crm: status change startup => wait_for_quorum
+info 20 node4/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node5 on
+info 20 node5/crm: status change startup => wait_for_quorum
+info 20 node5/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node4'
+info 20 node1/crm: adding new service 'vm:103' on node 'node5'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: adding new service 'vm:105' on node 'node1'
+info 20 node1/crm: adding new service 'vm:106' on node 'node1'
+info 20 node1/crm: adding new service 'vm:107' on node 'node2'
+info 20 node1/crm: adding new service 'vm:108' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node4)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node5)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:108': state changed from 'request_start' to 'started' (node = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 21 node1/lrm: starting service vm:105
+info 21 node1/lrm: service status vm:105 started
+info 21 node1/lrm: starting service vm:106
+info 21 node1/lrm: service status vm:106 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:107
+info 23 node2/lrm: service status vm:107 started
+info 23 node2/lrm: starting service vm:108
+info 23 node2/lrm: service status vm:108 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 26 node4/crm: status change wait_for_quorum => slave
+info 27 node4/lrm: got lock 'ha_agent_node4_lock'
+info 27 node4/lrm: status change wait_for_agent_lock => active
+info 27 node4/lrm: starting service vm:102
+info 27 node4/lrm: service status vm:102 started
+info 28 node5/crm: status change wait_for_quorum => slave
+info 29 node5/lrm: got lock 'ha_agent_node5_lock'
+info 29 node5/lrm: status change wait_for_agent_lock => active
+info 29 node5/lrm: starting service vm:103
+info 29 node5/lrm: service status vm:103 started
+info 120 cmdlist: execute network node4 off
+info 120 cmdlist: execute network node5 off
+info 120 node1/crm: node 'node4': state changed from 'online' => 'unknown'
+info 120 node1/crm: node 'node5': state changed from 'online' => 'unknown'
+info 126 node4/crm: status change slave => wait_for_quorum
+info 127 node4/lrm: status change active => lost_agent_lock
+info 128 node5/crm: status change slave => wait_for_quorum
+info 129 node5/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node4': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node4'
+info 160 node1/crm: node 'node5': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node5'
+info 168 watchdog: execute power node4 off
+info 167 node4/crm: killed by poweroff
+info 168 node4/lrm: killed by poweroff
+info 168 hardware: server 'node4' stopped by poweroff (watchdog)
+info 170 watchdog: execute power node5 off
+info 169 node5/crm: killed by poweroff
+info 170 node5/lrm: killed by poweroff
+info 170 hardware: server 'node5' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node4_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node4'
+info 240 node1/crm: node 'node4': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node4'
+info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info 240 node1/crm: got lock 'ha_agent_node5_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node5'
+info 240 node1/crm: node 'node5': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5'
+info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:102' from fenced node 'node4' to node 'node2'
+info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:103' from fenced node 'node5' to node 'node1'
+info 240 node1/crm: service 'vm:103': state changed from 'recovery' to 'started' (node = node1)
+info 241 node1/lrm: starting service vm:103
+info 241 node1/lrm: service status vm:103 started
+info 243 node2/lrm: starting service vm:102
+info 243 node2/lrm: service status vm:102 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative3/manager_status b/src/test/test-resource-affinity-strict-negative3/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-negative3/rules_config b/src/test/test-resource-affinity-strict-negative3/rules_config
new file mode 100644
index 0000000..44e6a02
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative3/service_config b/src/test/test-resource-affinity-strict-negative3/service_config
new file mode 100644
index 0000000..2c27816
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative3/service_config
@@ -0,0 +1,10 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node4", "state": "started" },
+ "vm:103": { "node": "node5", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" },
+ "vm:105": { "node": "node1", "state": "started" },
+ "vm:106": { "node": "node1", "state": "started" },
+ "vm:107": { "node": "node2", "state": "started" },
+ "vm:108": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative4/README b/src/test/test-resource-affinity-strict-negative4/README
new file mode 100644
index 0000000..16895a4
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/README
@@ -0,0 +1,18 @@
+Test whether a strict negative resource affinity rule among two resources makes
+one of the resources migrate to a different recovery node than the other resource
+in case of a failover of the resource's previously assigned node. As the resource
+fails to start on the recovery node (e.g. insufficient resources), the failing
+resource is kept on the recovery node.
+
+The test scenario is:
+- vm:101 and fa:120001 must be kept separate
+- vm:101 and fa:120001 are on node2 and node3 respectively
+- fa:120001 will fail to start on node1
+- node1 has a higher resource count than node2 to test that the resource affinity
+  rule is applied even though the scheduler would prefer the less utilized node
+
+The expected outcome is:
+- As node3 fails, fa:120001 is migrated to node1
+- fa:120001 will stay on the node (potentially in recovery), since it cannot be
+ started on node1, but cannot be relocated to another one either due to the
+ strict resource affinity rule
diff --git a/src/test/test-resource-affinity-strict-negative4/cmdlist b/src/test/test-resource-affinity-strict-negative4/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative4/hardware_status b/src/test/test-resource-affinity-strict-negative4/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative4/log.expect b/src/test/test-resource-affinity-strict-negative4/log.expect
new file mode 100644
index 0000000..f772ea8
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/log.expect
@@ -0,0 +1,69 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'fa:120001' on node 'node3'
+info 20 node1/crm: adding new service 'vm:101' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: service 'fa:120001': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:101
+info 23 node2/lrm: service status vm:101 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service fa:120001
+info 25 node3/lrm: service status fa:120001 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'fa:120001': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'fa:120001': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'fa:120001' from fenced node 'node3' to node 'node1'
+info 240 node1/crm: service 'fa:120001': state changed from 'recovery' to 'started' (node = node1)
+info 241 node1/lrm: starting service fa:120001
+warn 241 node1/lrm: unable to start service fa:120001
+warn 241 node1/lrm: restart policy: retry number 1 for service 'fa:120001'
+info 261 node1/lrm: starting service fa:120001
+warn 261 node1/lrm: unable to start service fa:120001
+err 261 node1/lrm: unable to start service fa:120001 on local node after 1 retries
+warn 280 node1/crm: starting service fa:120001 on node 'node1' failed, relocating service.
+warn 280 node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:120001', retry start on current node. Tried nodes: node1
+info 281 node1/lrm: starting service fa:120001
+info 281 node1/lrm: service status fa:120001 started
+info 300 node1/crm: relocation policy successful for 'fa:120001' on node 'node1', failed nodes: node1
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative4/manager_status b/src/test/test-resource-affinity-strict-negative4/manager_status
new file mode 100644
index 0000000..0967ef4
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-strict-negative4/rules_config b/src/test/test-resource-affinity-strict-negative4/rules_config
new file mode 100644
index 0000000..227ec31
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources vm:101,fa:120001
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative4/service_config b/src/test/test-resource-affinity-strict-negative4/service_config
new file mode 100644
index 0000000..f53c2bc
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative4/service_config
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "node": "node2", "state": "started" },
+ "fa:120001": { "node": "node3", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative5/README b/src/test/test-resource-affinity-strict-negative5/README
new file mode 100644
index 0000000..35276fb
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/README
@@ -0,0 +1,11 @@
+Test whether two pair-wise strict negative resource affinity rules, i.e. where
+one resource is in two separate negative resource affinity rules with two other
+resources, make one of the outer resources migrate to the same node as the other
+outer resource in case of a failover of its previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102, and vm:101 and vm:103 must each be kept separate
+- vm:101, vm:102, and vm:103 are respectively on node1, node2, and node3
+
+The expected outcome is:
+- As node3 fails, vm:103 is migrated to node2 - the same node as vm:102
diff --git a/src/test/test-resource-affinity-strict-negative5/cmdlist b/src/test/test-resource-affinity-strict-negative5/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative5/hardware_status b/src/test/test-resource-affinity-strict-negative5/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative5/log.expect b/src/test/test-resource-affinity-strict-negative5/log.expect
new file mode 100644
index 0000000..16156ad
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/log.expect
@@ -0,0 +1,56 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:103': state changed from 'recovery' to 'started' (node = node2)
+info 243 node2/lrm: starting service vm:103
+info 243 node2/lrm: service status vm:103 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative5/manager_status b/src/test/test-resource-affinity-strict-negative5/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-negative5/rules_config b/src/test/test-resource-affinity-strict-negative5/rules_config
new file mode 100644
index 0000000..6a13333
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/rules_config
@@ -0,0 +1,7 @@
+resource-affinity: lonely-must-some-vms-be1
+ resources vm:101,vm:102
+ affinity negative
+
+resource-affinity: lonely-must-some-vms-be2
+ resources vm:101,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative5/service_config b/src/test/test-resource-affinity-strict-negative5/service_config
new file mode 100644
index 0000000..4b26f6b
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative5/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node2", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative6/README b/src/test/test-resource-affinity-strict-negative6/README
new file mode 100644
index 0000000..2c8e7c1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/README
@@ -0,0 +1,18 @@
+Test whether a strict negative resource affinity rule among two resources makes
+one of the resources migrate to a different recovery node than the other resource
+in case of a failover of the resource's previously assigned node. As the other
+resource fails to start on the recovery node (e.g. insufficient resources), the
+failing resource is kept on the recovery node.
+
+The test scenario is:
+- fa:120001 and fa:220001 must be kept separate
+- fa:120001 and fa:220001 are on node2 and node3 respectively
+- fa:120001 and fa:220001 will fail to start on node1
+- node1 has a higher resource count than node2 to test that the resource affinity
+  rule is applied even though the scheduler would prefer the less utilized node
+
+The expected outcome is:
+- As node3 fails, fa:220001 is migrated to node1
+- fa:220001 will stay on the node (potentially in recovery), since it cannot be
+ started on node1, but cannot be relocated to another one either due to the
+ strict resource affinity rule
diff --git a/src/test/test-resource-affinity-strict-negative6/cmdlist b/src/test/test-resource-affinity-strict-negative6/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative6/hardware_status b/src/test/test-resource-affinity-strict-negative6/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative6/log.expect b/src/test/test-resource-affinity-strict-negative6/log.expect
new file mode 100644
index 0000000..0d9854a
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/log.expect
@@ -0,0 +1,69 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'fa:120001' on node 'node2'
+info 20 node1/crm: adding new service 'fa:220001' on node 'node3'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: service 'fa:120001': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'fa:220001': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service fa:120001
+info 23 node2/lrm: service status fa:120001 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service fa:220001
+info 25 node3/lrm: service status fa:220001 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'fa:220001': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'fa:220001': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'fa:220001' from fenced node 'node3' to node 'node1'
+info 240 node1/crm: service 'fa:220001': state changed from 'recovery' to 'started' (node = node1)
+info 241 node1/lrm: starting service fa:220001
+warn 241 node1/lrm: unable to start service fa:220001
+warn 241 node1/lrm: restart policy: retry number 1 for service 'fa:220001'
+info 261 node1/lrm: starting service fa:220001
+warn 261 node1/lrm: unable to start service fa:220001
+err 261 node1/lrm: unable to start service fa:220001 on local node after 1 retries
+warn 280 node1/crm: starting service fa:220001 on node 'node1' failed, relocating service.
+warn 280 node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:220001', retry start on current node. Tried nodes: node1
+info 281 node1/lrm: starting service fa:220001
+info 281 node1/lrm: service status fa:220001 started
+info 300 node1/crm: relocation policy successful for 'fa:220001' on node 'node1', failed nodes: node1
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative6/manager_status b/src/test/test-resource-affinity-strict-negative6/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-negative6/rules_config b/src/test/test-resource-affinity-strict-negative6/rules_config
new file mode 100644
index 0000000..95a24f5
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources fa:120001,fa:220001
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative6/service_config b/src/test/test-resource-affinity-strict-negative6/service_config
new file mode 100644
index 0000000..1f9480c
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative6/service_config
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "fa:120001": { "node": "node2", "state": "started" },
+ "fa:220001": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative7/README b/src/test/test-resource-affinity-strict-negative7/README
new file mode 100644
index 0000000..818abba
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/README
@@ -0,0 +1,15 @@
+Test whether a strict negative resource affinity rule among two resources allows
+one of the resources to be manually migrated to another node, while disallowing
+other resources, which are in negative affinity with the migrated resource, from
+being migrated to the migrated resource's source node during the migration.
+
+The test scenario is:
+- vm:101 and vm:102 must be kept separate
+- vm:101 and vm:102 are running on node1 and node2 respectively
+
+The expected outcome is:
+- vm:101 is migrated to node3
+- While vm:101 is being migrated, vm:102 cannot be migrated to node1, as vm:101 is
+ still putting load on node1 as its source node
+- After vm:101 is successfully migrated to node3, vm:102 can be migrated to
+ node1
diff --git a/src/test/test-resource-affinity-strict-negative7/cmdlist b/src/test/test-resource-affinity-strict-negative7/cmdlist
new file mode 100644
index 0000000..468ba56
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/cmdlist
@@ -0,0 +1,5 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"],
+ [ "service vm:101 migrate node3", "service vm:102 migrate node1" ],
+ [ "service vm:102 migrate node1" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative7/hardware_status b/src/test/test-resource-affinity-strict-negative7/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative7/log.expect b/src/test/test-resource-affinity-strict-negative7/log.expect
new file mode 100644
index 0000000..6060f5e
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/log.expect
@@ -0,0 +1,52 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 120 cmdlist: execute service vm:101 migrate node3
+info 120 cmdlist: execute service vm:102 migrate node1
+info 120 node1/crm: got crm command: migrate vm:101 node3
+err 120 node1/crm: crm command 'migrate vm:102 node1' error - service 'vm:101' on node 'node1' in negative affinity with service 'vm:102'
+info 120 node1/crm: migrate service 'vm:101' to node 'node3'
+info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 121 node1/lrm: service vm:101 - start migrate to node 'node3'
+info 121 node1/lrm: service vm:101 - end migrate to node 'node3'
+info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 145 node3/lrm: got lock 'ha_agent_node3_lock'
+info 145 node3/lrm: status change wait_for_agent_lock => active
+info 145 node3/lrm: starting service vm:101
+info 145 node3/lrm: service status vm:101 started
+info 220 cmdlist: execute service vm:102 migrate node1
+info 220 node1/crm: got crm command: migrate vm:102 node1
+info 220 node1/crm: migrate service 'vm:102' to node 'node1'
+info 220 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 223 node2/lrm: service vm:102 - start migrate to node 'node1'
+info 223 node2/lrm: service vm:102 - end migrate to node 'node1'
+info 240 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
+info 241 node1/lrm: starting service vm:102
+info 241 node1/lrm: service status vm:102 started
+info 820 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative7/manager_status b/src/test/test-resource-affinity-strict-negative7/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-negative7/rules_config b/src/test/test-resource-affinity-strict-negative7/rules_config
new file mode 100644
index 0000000..2074776
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative7/service_config b/src/test/test-resource-affinity-strict-negative7/service_config
new file mode 100644
index 0000000..0336d09
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative7/service_config
@@ -0,0 +1,4 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative8/README b/src/test/test-resource-affinity-strict-negative8/README
new file mode 100644
index 0000000..f338aad
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/README
@@ -0,0 +1,12 @@
+Test whether a strict negative resource affinity rule among three resources
+prevents one of the resources from being manually migrated to another resource's
+node, where that resource is in negative affinity with the to-be-migrated
+resource, so that the to-be-migrated resource stays on its current node.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are running on node1, node2, and node3 respectively
+
+The expected outcome is:
+- vm:101 cannot be migrated to node3 as it would conflict with the negative
+  resource affinity rule between vm:101, vm:102, and vm:103.
diff --git a/src/test/test-resource-affinity-strict-negative8/cmdlist b/src/test/test-resource-affinity-strict-negative8/cmdlist
new file mode 100644
index 0000000..13cab7b
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"],
+ [ "service vm:101 migrate node3" ]
+]
diff --git a/src/test/test-resource-affinity-strict-negative8/hardware_status b/src/test/test-resource-affinity-strict-negative8/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-negative8/log.expect b/src/test/test-resource-affinity-strict-negative8/log.expect
new file mode 100644
index 0000000..96f55d5
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/log.expect
@@ -0,0 +1,38 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+info 120 cmdlist: execute service vm:101 migrate node3
+err 120 node1/crm: crm command 'migrate vm:101 node3' error - service 'vm:103' on node 'node3' in negative affinity with service 'vm:101'
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-negative8/manager_status b/src/test/test-resource-affinity-strict-negative8/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-negative8/rules_config b/src/test/test-resource-affinity-strict-negative8/rules_config
new file mode 100644
index 0000000..44e6a02
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: lonely-must-vms-be
+ resources vm:101,vm:102,vm:103
+ affinity negative
diff --git a/src/test/test-resource-affinity-strict-negative8/service_config b/src/test/test-resource-affinity-strict-negative8/service_config
new file mode 100644
index 0000000..4b26f6b
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-negative8/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node2", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" }
+}
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* [pve-devel] [PATCH ha-manager v3 10/13] test: ha tester: add test cases for positive resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (8 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 09/13] test: ha tester: add test cases for negative resource affinity rules Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 11/13] test: ha tester: add test cases for static scheduler resource affinity Daniel Kral
` (8 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add test cases for strict positive resource affinity rules, i.e. where
resources must be kept together on the same node. These verify the
behavior of resources in strict positive resource affinity rules in case
of a failover of their assigned nodes or a manual migration in the
following scenarios:
1. 2 resources in pos. affinity and a 3 node cluster; 1 node failing
2. 3 resources in pos. affinity and a 3 node cluster; 1 node failing
3. 3 resources in pos. affinity and a 3 node cluster; 1 node failing,
but the recovery node cannot start one of the resources
4. 3 resources in pos. affinity and a 3 node cluster; manually migrating
1 resource to another node also migrates the other resources in pos.
affinity with the migrated resource to that node
5. 9 resources in pos. affinity and a 3 node cluster; manually migrating
1 resource to another node also migrates the other resources in pos.
affinity with the migrated resource to that node
The word "strict" describes the current policy of resource affinity
rules and is added in anticipation of a "non-strict" variant in the
future.
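For reference, the positive rules in these test cases use the same
rules_config format as the negative ones shown earlier in this series,
only with the affinity flipped to "positive". A minimal sketch (the rule
id below is just an illustrative placeholder, not necessarily the one
used in the test fixtures):

    resource-affinity: vms-must-stick-together
            resources vm:101,vm:102
            affinity positive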
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
.../README | 12 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 66 ++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 6 +
.../README | 11 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 80 +++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 8 +
.../README | 17 ++
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 89 ++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 8 +
.../README | 11 +
.../cmdlist | 4 +
.../hardware_status | 5 +
.../log.expect | 59 ++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 5 +
.../README | 19 ++
.../cmdlist | 8 +
.../hardware_status | 5 +
.../log.expect | 281 ++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 15 +
.../service_config | 11 +
35 files changed, 764 insertions(+)
create mode 100644 src/test/test-resource-affinity-strict-positive1/README
create mode 100644 src/test/test-resource-affinity-strict-positive1/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive1/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive1/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive1/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive1/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive1/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive2/README
create mode 100644 src/test/test-resource-affinity-strict-positive2/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive2/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive2/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive2/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive2/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive2/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive3/README
create mode 100644 src/test/test-resource-affinity-strict-positive3/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive3/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive3/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive3/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive3/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive3/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive4/README
create mode 100644 src/test/test-resource-affinity-strict-positive4/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive4/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive4/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive4/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive4/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive4/service_config
create mode 100644 src/test/test-resource-affinity-strict-positive5/README
create mode 100644 src/test/test-resource-affinity-strict-positive5/cmdlist
create mode 100644 src/test/test-resource-affinity-strict-positive5/hardware_status
create mode 100644 src/test/test-resource-affinity-strict-positive5/log.expect
create mode 100644 src/test/test-resource-affinity-strict-positive5/manager_status
create mode 100644 src/test/test-resource-affinity-strict-positive5/rules_config
create mode 100644 src/test/test-resource-affinity-strict-positive5/service_config
diff --git a/src/test/test-resource-affinity-strict-positive1/README b/src/test/test-resource-affinity-strict-positive1/README
new file mode 100644
index 0000000..3b20474
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/README
@@ -0,0 +1,12 @@
+Test whether a strict positive resource affinity rule makes two resources migrate
+to the same recovery node in case of a failover of their previously assigned
+node.
+
+The test scenario is:
+- vm:101 and vm:102 must be kept together
+- vm:101 and vm:102 are both currently running on node3
+- node1 and node2 have the same resource count to test that the rule is applied
+  even though the load would usually be balanced between both remaining nodes
+
+The expected outcome is:
+- As node3 fails, both resources are migrated to node1
diff --git a/src/test/test-resource-affinity-strict-positive1/cmdlist b/src/test/test-resource-affinity-strict-positive1/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-positive1/hardware_status b/src/test/test-resource-affinity-strict-positive1/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive1/log.expect b/src/test/test-resource-affinity-strict-positive1/log.expect
new file mode 100644
index 0000000..7d43314
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/log.expect
@@ -0,0 +1,66 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node3'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:104
+info 23 node2/lrm: service status vm:104 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 25 node3/lrm: starting service vm:102
+info 25 node3/lrm: service status vm:102 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node1'
+info 240 node1/crm: service 'vm:101': state changed from 'recovery' to 'started' (node = node1)
+info 240 node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node1)
+info 241 node1/lrm: starting service vm:101
+info 241 node1/lrm: service status vm:101 started
+info 241 node1/lrm: starting service vm:102
+info 241 node1/lrm: service status vm:102 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive1/manager_status b/src/test/test-resource-affinity-strict-positive1/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-positive1/rules_config b/src/test/test-resource-affinity-strict-positive1/rules_config
new file mode 100644
index 0000000..9789d7c
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: vms-must-stick-together
+ resources vm:101,vm:102
+ affinity positive
diff --git a/src/test/test-resource-affinity-strict-positive1/service_config b/src/test/test-resource-affinity-strict-positive1/service_config
new file mode 100644
index 0000000..9fb091d
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive1/service_config
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node3", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive2/README b/src/test/test-resource-affinity-strict-positive2/README
new file mode 100644
index 0000000..533625c
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/README
@@ -0,0 +1,11 @@
+Test whether a strict positive resource affinity rule makes three resources
+migrate to the same recovery node in case of a failover of their previously
+assigned node.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are all currently running on node3
+
+The expected outcome is:
+- As node3 fails, all resources are migrated to node2, as node2 is less utilized
+  than the other available node, node1
diff --git a/src/test/test-resource-affinity-strict-positive2/cmdlist b/src/test/test-resource-affinity-strict-positive2/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-positive2/hardware_status b/src/test/test-resource-affinity-strict-positive2/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive2/log.expect b/src/test/test-resource-affinity-strict-positive2/log.expect
new file mode 100644
index 0000000..78f4d66
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/log.expect
@@ -0,0 +1,80 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node3'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: adding new service 'vm:105' on node 'node1'
+info 20 node1/crm: adding new service 'vm:106' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 21 node1/lrm: starting service vm:105
+info 21 node1/lrm: service status vm:105 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:106
+info 23 node2/lrm: service status vm:106 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 25 node3/lrm: starting service vm:102
+info 25 node3/lrm: service status vm:102 started
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:101': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:103': state changed from 'recovery' to 'started' (node = node2)
+info 243 node2/lrm: starting service vm:101
+info 243 node2/lrm: service status vm:101 started
+info 243 node2/lrm: starting service vm:102
+info 243 node2/lrm: service status vm:102 started
+info 243 node2/lrm: starting service vm:103
+info 243 node2/lrm: service status vm:103 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive2/manager_status b/src/test/test-resource-affinity-strict-positive2/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-positive2/rules_config b/src/test/test-resource-affinity-strict-positive2/rules_config
new file mode 100644
index 0000000..12da6e6
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: vms-must-stick-together
+ resources vm:101,vm:102,vm:103
+ affinity positive
diff --git a/src/test/test-resource-affinity-strict-positive2/service_config b/src/test/test-resource-affinity-strict-positive2/service_config
new file mode 100644
index 0000000..fd4a87e
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive2/service_config
@@ -0,0 +1,8 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node3", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" },
+ "vm:105": { "node": "node1", "state": "started" },
+ "vm:106": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive3/README b/src/test/test-resource-affinity-strict-positive3/README
new file mode 100644
index 0000000..a270277
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/README
@@ -0,0 +1,17 @@
+Test whether a strict positive resource affinity rule makes three resources
+migrate to the same recovery node in case of a failover of their previously
+assigned node. If one of them fails to start on the recovery node (e.g. due to
+insufficient resources), the failing resource is kept on the recovery node.
+
+The test scenario is:
+- vm:101, vm:102, and fa:120002 must be kept together
+- vm:101, vm:102, and fa:120002 are all currently running on node3
+- fa:120002 will fail to start on node2
+- node1 has a higher resource count than node2, so node2 is selected as the
+  recovery node, where fa:120002 is guaranteed to fail
+
+The expected outcome is:
+- As node3 fails, all resources are migrated to node2
+- Two of those resources will start successfully, but fa:120002 will fail to
+  start there; since it cannot be relocated to another node due to the strict
+  resource affinity rule, it is retried on the recovery node until it starts
diff --git a/src/test/test-resource-affinity-strict-positive3/cmdlist b/src/test/test-resource-affinity-strict-positive3/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-resource-affinity-strict-positive3/hardware_status b/src/test/test-resource-affinity-strict-positive3/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive3/log.expect b/src/test/test-resource-affinity-strict-positive3/log.expect
new file mode 100644
index 0000000..4a54cb3
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/log.expect
@@ -0,0 +1,89 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'fa:120002' on node 'node3'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node3'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: adding new service 'vm:105' on node 'node1'
+info 20 node1/crm: adding new service 'vm:106' on node 'node2'
+info 20 node1/crm: service 'fa:120002': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node2)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 21 node1/lrm: starting service vm:105
+info 21 node1/lrm: service status vm:105 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:106
+info 23 node2/lrm: service status vm:106 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service fa:120002
+info 25 node3/lrm: service status fa:120002 started
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 25 node3/lrm: starting service vm:102
+info 25 node3/lrm: service status vm:102 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'fa:120002': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'fa:120002': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'fa:120002' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'fa:120002': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:101': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node2)
+info 243 node2/lrm: starting service fa:120002
+warn 243 node2/lrm: unable to start service fa:120002
+warn 243 node2/lrm: restart policy: retry number 1 for service 'fa:120002'
+info 243 node2/lrm: starting service vm:101
+info 243 node2/lrm: service status vm:101 started
+info 243 node2/lrm: starting service vm:102
+info 243 node2/lrm: service status vm:102 started
+info 263 node2/lrm: starting service fa:120002
+warn 263 node2/lrm: unable to start service fa:120002
+err 263 node2/lrm: unable to start service fa:120002 on local node after 1 retries
+warn 280 node1/crm: starting service fa:120002 on node 'node2' failed, relocating service.
+warn 280 node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:120002', retry start on current node. Tried nodes: node2
+info 283 node2/lrm: starting service fa:120002
+info 283 node2/lrm: service status fa:120002 started
+info 300 node1/crm: relocation policy successful for 'fa:120002' on node 'node2', failed nodes: node2
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive3/manager_status b/src/test/test-resource-affinity-strict-positive3/manager_status
new file mode 100644
index 0000000..0967ef4
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-resource-affinity-strict-positive3/rules_config b/src/test/test-resource-affinity-strict-positive3/rules_config
new file mode 100644
index 0000000..077fccd
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: vms-must-stick-together
+ resources vm:101,vm:102,fa:120002
+ affinity positive
diff --git a/src/test/test-resource-affinity-strict-positive3/service_config b/src/test/test-resource-affinity-strict-positive3/service_config
new file mode 100644
index 0000000..3ce5f27
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive3/service_config
@@ -0,0 +1,8 @@
+{
+ "vm:101": { "node": "node3", "state": "started" },
+ "vm:102": { "node": "node3", "state": "started" },
+ "fa:120002": { "node": "node3", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" },
+ "vm:105": { "node": "node1", "state": "started" },
+ "vm:106": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive4/README b/src/test/test-resource-affinity-strict-positive4/README
new file mode 100644
index 0000000..6e16b30
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/README
@@ -0,0 +1,11 @@
+Test whether a strict positive resource affinity rule of three resources makes
+the resources stay together when one of them is manually migrated to another
+node, i.e., whether the other resources migrate to the same node as well.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are all currently running on node1
+
+The expected outcome is:
+- As vm:101 is migrated to node2, vm:102 and vm:103 are migrated to node2 as
+  well, as a side effect of following the positive resource affinity rule.
diff --git a/src/test/test-resource-affinity-strict-positive4/cmdlist b/src/test/test-resource-affinity-strict-positive4/cmdlist
new file mode 100644
index 0000000..2e420cc
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "service vm:101 migrate node2" ]
+]
diff --git a/src/test/test-resource-affinity-strict-positive4/hardware_status b/src/test/test-resource-affinity-strict-positive4/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive4/log.expect b/src/test/test-resource-affinity-strict-positive4/log.expect
new file mode 100644
index 0000000..0d9854d
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/log.expect
@@ -0,0 +1,59 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 120 cmdlist: execute service vm:101 migrate node2
+info 120 node1/crm: got crm command: migrate vm:101 node2
+info 120 node1/crm: crm command 'migrate vm:101 node2' - migrate service 'vm:102' to node 'node2' (service 'vm:102' in positive affinity with service 'vm:101')
+info 120 node1/crm: crm command 'migrate vm:101 node2' - migrate service 'vm:103' to node 'node2' (service 'vm:103' in positive affinity with service 'vm:101')
+info 120 node1/crm: migrate service 'vm:101' to node 'node2'
+info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:102' to node 'node2'
+info 120 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:103' to node 'node2'
+info 120 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 121 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:102 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:102 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:103 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:103 - end migrate to node 'node2'
+info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node2)
+info 143 node2/lrm: got lock 'ha_agent_node2_lock'
+info 143 node2/lrm: status change wait_for_agent_lock => active
+info 143 node2/lrm: starting service vm:101
+info 143 node2/lrm: service status vm:101 started
+info 143 node2/lrm: starting service vm:102
+info 143 node2/lrm: service status vm:102 started
+info 143 node2/lrm: starting service vm:103
+info 143 node2/lrm: service status vm:103 started
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive4/manager_status b/src/test/test-resource-affinity-strict-positive4/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-positive4/rules_config b/src/test/test-resource-affinity-strict-positive4/rules_config
new file mode 100644
index 0000000..12da6e6
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: vms-must-stick-together
+ resources vm:101,vm:102,vm:103
+ affinity positive
diff --git a/src/test/test-resource-affinity-strict-positive4/service_config b/src/test/test-resource-affinity-strict-positive4/service_config
new file mode 100644
index 0000000..57e3579
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive4/service_config
@@ -0,0 +1,5 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive5/README b/src/test/test-resource-affinity-strict-positive5/README
new file mode 100644
index 0000000..3a9909e
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/README
@@ -0,0 +1,19 @@
+Test whether multiple connected positive resource affinity rules make the
+resources stay together when one of them is manually migrated to another node,
+i.e., whether all of them migrate to the same node.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:103, vm:104, and vm:105 must be kept together
+- vm:105, vm:106, and vm:107 must be kept together
+- vm:105, vm:108, and vm:109 must be kept together
+- So essentially, vm:101 through vm:109 must be kept together
+- vm:101 through vm:109 are all on node1
+
+The expected outcome is:
+- As vm:103 is migrated to node2, all of vm:101 through vm:109 are migrated to
+ node2 as well, as these all must be kept together
+- As vm:101 is migrated to node3, all of vm:101 through vm:109 are migrated to
+ node3 as well, as these all must be kept together
+- As vm:109 is migrated to node1, all of vm:101 through vm:109 are migrated to
+ node1 as well, as these all must be kept together
diff --git a/src/test/test-resource-affinity-strict-positive5/cmdlist b/src/test/test-resource-affinity-strict-positive5/cmdlist
new file mode 100644
index 0000000..85c33d0
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/cmdlist
@@ -0,0 +1,8 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [ "service vm:103 migrate node2" ],
+ [ "delay 100" ],
+ [ "service vm:101 migrate node3" ],
+ [ "delay 100" ],
+ [ "service vm:109 migrate node1" ]
+]
diff --git a/src/test/test-resource-affinity-strict-positive5/hardware_status b/src/test/test-resource-affinity-strict-positive5/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-resource-affinity-strict-positive5/log.expect b/src/test/test-resource-affinity-strict-positive5/log.expect
new file mode 100644
index 0000000..4e91890
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/log.expect
@@ -0,0 +1,281 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: adding new service 'vm:105' on node 'node1'
+info 20 node1/crm: adding new service 'vm:106' on node 'node1'
+info 20 node1/crm: adding new service 'vm:107' on node 'node1'
+info 20 node1/crm: adding new service 'vm:108' on node 'node1'
+info 20 node1/crm: adding new service 'vm:109' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:108': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:109': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 21 node1/lrm: starting service vm:105
+info 21 node1/lrm: service status vm:105 started
+info 21 node1/lrm: starting service vm:106
+info 21 node1/lrm: service status vm:106 started
+info 21 node1/lrm: starting service vm:107
+info 21 node1/lrm: service status vm:107 started
+info 21 node1/lrm: starting service vm:108
+info 21 node1/lrm: service status vm:108 started
+info 21 node1/lrm: starting service vm:109
+info 21 node1/lrm: service status vm:109 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 120 cmdlist: execute service vm:103 migrate node2
+info 120 node1/crm: got crm command: migrate vm:103 node2
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:101' to node 'node2' (service 'vm:101' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:102' to node 'node2' (service 'vm:102' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:104' to node 'node2' (service 'vm:104' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:105' to node 'node2' (service 'vm:105' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:106' to node 'node2' (service 'vm:106' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:107' to node 'node2' (service 'vm:107' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:108' to node 'node2' (service 'vm:108' in positive affinity with service 'vm:103')
+info 120 node1/crm: crm command 'migrate vm:103 node2' - migrate service 'vm:109' to node 'node2' (service 'vm:109' in positive affinity with service 'vm:103')
+info 120 node1/crm: migrate service 'vm:101' to node 'node2'
+info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:102' to node 'node2'
+info 120 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:103' to node 'node2'
+info 120 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:104' to node 'node2'
+info 120 node1/crm: service 'vm:104': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:105' to node 'node2'
+info 120 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:106' to node 'node2'
+info 120 node1/crm: service 'vm:106': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:107' to node 'node2'
+info 120 node1/crm: service 'vm:107': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:108' to node 'node2'
+info 120 node1/crm: service 'vm:108': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 120 node1/crm: migrate service 'vm:109' to node 'node2'
+info 120 node1/crm: service 'vm:109': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 121 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:102 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:102 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:103 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:103 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:104 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:104 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:105 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:105 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:106 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:106 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:107 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:107 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:108 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:108 - end migrate to node 'node2'
+info 121 node1/lrm: service vm:109 - start migrate to node 'node2'
+info 121 node1/lrm: service vm:109 - end migrate to node 'node2'
+info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:104': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:106': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:107': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:108': state changed from 'migrate' to 'started' (node = node2)
+info 140 node1/crm: service 'vm:109': state changed from 'migrate' to 'started' (node = node2)
+info 143 node2/lrm: got lock 'ha_agent_node2_lock'
+info 143 node2/lrm: status change wait_for_agent_lock => active
+info 143 node2/lrm: starting service vm:101
+info 143 node2/lrm: service status vm:101 started
+info 143 node2/lrm: starting service vm:102
+info 143 node2/lrm: service status vm:102 started
+info 143 node2/lrm: starting service vm:103
+info 143 node2/lrm: service status vm:103 started
+info 143 node2/lrm: starting service vm:104
+info 143 node2/lrm: service status vm:104 started
+info 143 node2/lrm: starting service vm:105
+info 143 node2/lrm: service status vm:105 started
+info 143 node2/lrm: starting service vm:106
+info 143 node2/lrm: service status vm:106 started
+info 143 node2/lrm: starting service vm:107
+info 143 node2/lrm: service status vm:107 started
+info 143 node2/lrm: starting service vm:108
+info 143 node2/lrm: service status vm:108 started
+info 143 node2/lrm: starting service vm:109
+info 143 node2/lrm: service status vm:109 started
+info 220 cmdlist: execute delay 100
+info 400 cmdlist: execute service vm:101 migrate node3
+info 400 node1/crm: got crm command: migrate vm:101 node3
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:102' to node 'node3' (service 'vm:102' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:103' to node 'node3' (service 'vm:103' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:104' to node 'node3' (service 'vm:104' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:105' to node 'node3' (service 'vm:105' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:106' to node 'node3' (service 'vm:106' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:107' to node 'node3' (service 'vm:107' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:108' to node 'node3' (service 'vm:108' in positive affinity with service 'vm:101')
+info 400 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:109' to node 'node3' (service 'vm:109' in positive affinity with service 'vm:101')
+info 400 node1/crm: migrate service 'vm:101' to node 'node3'
+info 400 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:102' to node 'node3'
+info 400 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:103' to node 'node3'
+info 400 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:104' to node 'node3'
+info 400 node1/crm: service 'vm:104': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:105' to node 'node3'
+info 400 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:106' to node 'node3'
+info 400 node1/crm: service 'vm:106': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:107' to node 'node3'
+info 400 node1/crm: service 'vm:107': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:108' to node 'node3'
+info 400 node1/crm: service 'vm:108': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 400 node1/crm: migrate service 'vm:109' to node 'node3'
+info 400 node1/crm: service 'vm:109': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 403 node2/lrm: service vm:101 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:101 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:102 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:102 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:103 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:103 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:104 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:104 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:105 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:105 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:106 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:106 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:107 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:107 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:108 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:108 - end migrate to node 'node3'
+info 403 node2/lrm: service vm:109 - start migrate to node 'node3'
+info 403 node2/lrm: service vm:109 - end migrate to node 'node3'
+info 420 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:104': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:106': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:107': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:108': state changed from 'migrate' to 'started' (node = node3)
+info 420 node1/crm: service 'vm:109': state changed from 'migrate' to 'started' (node = node3)
+info 425 node3/lrm: got lock 'ha_agent_node3_lock'
+info 425 node3/lrm: status change wait_for_agent_lock => active
+info 425 node3/lrm: starting service vm:101
+info 425 node3/lrm: service status vm:101 started
+info 425 node3/lrm: starting service vm:102
+info 425 node3/lrm: service status vm:102 started
+info 425 node3/lrm: starting service vm:103
+info 425 node3/lrm: service status vm:103 started
+info 425 node3/lrm: starting service vm:104
+info 425 node3/lrm: service status vm:104 started
+info 425 node3/lrm: starting service vm:105
+info 425 node3/lrm: service status vm:105 started
+info 425 node3/lrm: starting service vm:106
+info 425 node3/lrm: service status vm:106 started
+info 425 node3/lrm: starting service vm:107
+info 425 node3/lrm: service status vm:107 started
+info 425 node3/lrm: starting service vm:108
+info 425 node3/lrm: service status vm:108 started
+info 425 node3/lrm: starting service vm:109
+info 425 node3/lrm: service status vm:109 started
+info 500 cmdlist: execute delay 100
+info 680 cmdlist: execute service vm:109 migrate node1
+info 680 node1/crm: got crm command: migrate vm:109 node1
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:101' to node 'node1' (service 'vm:101' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:102' to node 'node1' (service 'vm:102' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:103' to node 'node1' (service 'vm:103' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:104' to node 'node1' (service 'vm:104' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:105' to node 'node1' (service 'vm:105' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:106' to node 'node1' (service 'vm:106' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:107' to node 'node1' (service 'vm:107' in positive affinity with service 'vm:109')
+info 680 node1/crm: crm command 'migrate vm:109 node1' - migrate service 'vm:108' to node 'node1' (service 'vm:108' in positive affinity with service 'vm:109')
+info 680 node1/crm: migrate service 'vm:101' to node 'node1'
+info 680 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:102' to node 'node1'
+info 680 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:103' to node 'node1'
+info 680 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:104' to node 'node1'
+info 680 node1/crm: service 'vm:104': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:105' to node 'node1'
+info 680 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:106' to node 'node1'
+info 680 node1/crm: service 'vm:106': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:107' to node 'node1'
+info 680 node1/crm: service 'vm:107': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:108' to node 'node1'
+info 680 node1/crm: service 'vm:108': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 680 node1/crm: migrate service 'vm:109' to node 'node1'
+info 680 node1/crm: service 'vm:109': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 685 node3/lrm: service vm:101 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:101 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:102 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:102 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:103 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:103 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:104 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:104 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:105 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:105 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:106 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:106 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:107 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:107 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:108 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:108 - end migrate to node 'node1'
+info 685 node3/lrm: service vm:109 - start migrate to node 'node1'
+info 685 node3/lrm: service vm:109 - end migrate to node 'node1'
+info 700 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:104': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:106': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:107': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:108': state changed from 'migrate' to 'started' (node = node1)
+info 700 node1/crm: service 'vm:109': state changed from 'migrate' to 'started' (node = node1)
+info 701 node1/lrm: starting service vm:101
+info 701 node1/lrm: service status vm:101 started
+info 701 node1/lrm: starting service vm:102
+info 701 node1/lrm: service status vm:102 started
+info 701 node1/lrm: starting service vm:103
+info 701 node1/lrm: service status vm:103 started
+info 701 node1/lrm: starting service vm:104
+info 701 node1/lrm: service status vm:104 started
+info 701 node1/lrm: starting service vm:105
+info 701 node1/lrm: service status vm:105 started
+info 701 node1/lrm: starting service vm:106
+info 701 node1/lrm: service status vm:106 started
+info 701 node1/lrm: starting service vm:107
+info 701 node1/lrm: service status vm:107 started
+info 701 node1/lrm: starting service vm:108
+info 701 node1/lrm: service status vm:108 started
+info 701 node1/lrm: starting service vm:109
+info 701 node1/lrm: service status vm:109 started
+info 1280 hardware: exit simulation - done
diff --git a/src/test/test-resource-affinity-strict-positive5/manager_status b/src/test/test-resource-affinity-strict-positive5/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-resource-affinity-strict-positive5/rules_config b/src/test/test-resource-affinity-strict-positive5/rules_config
new file mode 100644
index 0000000..b070af3
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/rules_config
@@ -0,0 +1,15 @@
+resource-affinity: vms-must-stick-together1
+ resources vm:101,vm:102,vm:103
+ affinity positive
+
+resource-affinity: vms-must-stick-together2
+ resources vm:103,vm:104,vm:105
+ affinity positive
+
+resource-affinity: vms-must-stick-together3
+ resources vm:105,vm:106,vm:107
+ affinity positive
+
+resource-affinity: vms-must-stick-together4
+ resources vm:105,vm:108,vm:109
+ affinity positive
diff --git a/src/test/test-resource-affinity-strict-positive5/service_config b/src/test/test-resource-affinity-strict-positive5/service_config
new file mode 100644
index 0000000..48db7b1
--- /dev/null
+++ b/src/test/test-resource-affinity-strict-positive5/service_config
@@ -0,0 +1,11 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" },
+ "vm:105": { "node": "node1", "state": "started" },
+ "vm:106": { "node": "node1", "state": "started" },
+ "vm:107": { "node": "node1", "state": "started" },
+ "vm:108": { "node": "node1", "state": "started" },
+ "vm:109": { "node": "node1", "state": "started" }
+}
--
2.39.5
* [pve-devel] [PATCH ha-manager v3 11/13] test: ha tester: add test cases for static scheduler resource affinity
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (9 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 10/13] test: ha tester: add test cases for positive " Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 12/13] test: rules: add test cases for resource affinity rules Daniel Kral
` (7 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add test cases where resource affinity rules are used with the static
utilization scheduler and the rebalance-on-start option enabled. These
verify the behavior in the following scenarios:
- 7 resources with intertwined resource affinity rules in a 3-node
cluster; 1 node failing
- 3 resources in negative affinity in a 3-node cluster, where the rules
are stated in pairwise form; 1 node failing
- 5 resources in negative affinity in a 5-node cluster; the nodes
failing consecutively, one after another
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
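Note: all three scenarios rely on the same CRS settings in their respective
datacenter.cfg fixtures (see the full files further below), which enable the
static scheduler and rebalancing on resource start:

    {
        "crs": {
            "ha": "static",
            "ha-rebalance-on-start": 1
        }
    }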
---
.../README | 26 ++
.../cmdlist | 4 +
.../datacenter.cfg | 6 +
.../hardware_status | 5 +
.../log.expect | 120 ++++++++
.../manager_status | 1 +
.../rules_config | 19 ++
.../service_config | 10 +
.../static_service_stats | 10 +
.../README | 20 ++
.../cmdlist | 4 +
.../datacenter.cfg | 6 +
.../hardware_status | 5 +
.../log.expect | 174 +++++++++++
.../manager_status | 1 +
.../rules_config | 11 +
.../service_config | 14 +
.../static_service_stats | 14 +
.../README | 22 ++
.../cmdlist | 22 ++
.../datacenter.cfg | 6 +
.../hardware_status | 7 +
.../log.expect | 272 ++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 9 +
.../static_service_stats | 9 +
27 files changed, 801 insertions(+)
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/README
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/log.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/manager_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/rules_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/service_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/README
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/cmdlist
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/log.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/manager_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/rules_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/service_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/README
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/cmdlist
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/log.expect
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/manager_status
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/rules_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/service_config
create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/README b/src/test/test-crs-static-rebalance-resource-affinity1/README
new file mode 100644
index 0000000..9b36bf6
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/README
@@ -0,0 +1,26 @@
+Test whether a mixed set of strict resource affinity rules, in conjunction
+with the static load scheduler with auto-rebalancing enabled, is applied
+correctly on resource start and in case of a subsequent failover.
+
+The test scenario is:
+- vm:101 and vm:102 do not have any resource affinity
+- Services that must be kept together:
+ - vm:102 and vm:107
+ - vm:104, vm:106, and vm:108
+- Services that must be kept separate:
+ - vm:103, vm:104, and vm:105
+ - vm:103, vm:106, and vm:107
+ - vm:107 and vm:108
+- Therefore, there are consistent interdependencies between the positive and
+ negative resource affinity rules' resource members
+- vm:101 and vm:102 are currently assigned to node1 and node2 respectively
+- vm:103 through vm:108 are currently assigned to node3
+
+The expected outcome is:
+- vm:101, vm:102, and vm:103 should be started on node1, node2, and node3
+ respectively, as there is nothing running on these nodes yet
+- vm:104, vm:106, and vm:108 should all be assigned to the same node, which
+ will be node1, since it has the most resources left for vm:104
+- vm:105 and vm:107 should both be assigned to the same node, which will be
+ node2, since neither can be assigned to the other nodes because of the
+ resource affinity constraints
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/cmdlist b/src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
new file mode 100644
index 0000000..eee0e40
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg b/src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+ "crs": {
+ "ha": "static",
+ "ha-rebalance-on-start": 1
+ }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
new file mode 100644
index 0000000..84484af
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+ "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+ "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/log.expect b/src/test/test-crs-static-rebalance-resource-affinity1/log.expect
new file mode 100644
index 0000000..cdd2497
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/log.expect
@@ -0,0 +1,120 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'static'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 20 node1/crm: adding new service 'vm:104' on node 'node3'
+info 20 node1/crm: adding new service 'vm:105' on node 'node3'
+info 20 node1/crm: adding new service 'vm:106' on node 'node3'
+info 20 node1/crm: adding new service 'vm:107' on node 'node3'
+info 20 node1/crm: adding new service 'vm:108' on node 'node3'
+info 20 node1/crm: service vm:101: re-balance selected current node node1 for startup
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service vm:102: re-balance selected current node node2 for startup
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service vm:103: re-balance selected current node node3 for startup
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service vm:104: re-balance selected new node node1 for startup
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node1)
+info 20 node1/crm: service vm:105: re-balance selected new node node2 for startup
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node2)
+info 20 node1/crm: service vm:106: re-balance selected new node node1 for startup
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node1)
+info 20 node1/crm: service vm:107: re-balance selected new node node2 for startup
+info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node2)
+info 20 node1/crm: service vm:108: re-balance selected new node node1 for startup
+info 20 node1/crm: service 'vm:108': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+info 25 node3/lrm: service vm:104 - start relocate to node 'node1'
+info 25 node3/lrm: service vm:104 - end relocate to node 'node1'
+info 25 node3/lrm: service vm:105 - start relocate to node 'node2'
+info 25 node3/lrm: service vm:105 - end relocate to node 'node2'
+info 25 node3/lrm: service vm:106 - start relocate to node 'node1'
+info 25 node3/lrm: service vm:106 - end relocate to node 'node1'
+info 25 node3/lrm: service vm:107 - start relocate to node 'node2'
+info 25 node3/lrm: service vm:107 - end relocate to node 'node2'
+info 25 node3/lrm: service vm:108 - start relocate to node 'node1'
+info 25 node3/lrm: service vm:108 - end relocate to node 'node1'
+info 40 node1/crm: service 'vm:104': state changed from 'request_start_balance' to 'started' (node = node1)
+info 40 node1/crm: service 'vm:105': state changed from 'request_start_balance' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:106': state changed from 'request_start_balance' to 'started' (node = node1)
+info 40 node1/crm: service 'vm:107': state changed from 'request_start_balance' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:108': state changed from 'request_start_balance' to 'started' (node = node1)
+info 41 node1/lrm: starting service vm:104
+info 41 node1/lrm: service status vm:104 started
+info 41 node1/lrm: starting service vm:106
+info 41 node1/lrm: service status vm:106 started
+info 41 node1/lrm: starting service vm:108
+info 41 node1/lrm: service status vm:108 started
+info 43 node2/lrm: starting service vm:105
+info 43 node2/lrm: service status vm:105 started
+info 43 node2/lrm: starting service vm:107
+info 43 node2/lrm: service status vm:107 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+err 240 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 260 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 280 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 300 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 320 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 340 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 360 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 380 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 400 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 420 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 440 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 460 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 480 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 500 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 520 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 540 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 560 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 580 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 600 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 620 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 640 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 660 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 680 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err 700 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/manager_status b/src/test/test-crs-static-rebalance-resource-affinity1/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/rules_config b/src/test/test-crs-static-rebalance-resource-affinity1/rules_config
new file mode 100644
index 0000000..734a055
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/rules_config
@@ -0,0 +1,19 @@
+resource-affinity: vms-must-stick-together1
+ resources vm:102,vm:107
+ affinity positive
+
+resource-affinity: vms-must-stick-together2
+ resources vm:104,vm:106,vm:108
+ affinity positive
+
+resource-affinity: vms-must-stay-apart1
+ resources vm:103,vm:104,vm:105
+ affinity negative
+
+resource-affinity: vms-must-stay-apart2
+ resources vm:103,vm:106,vm:107
+ affinity negative
+
+resource-affinity: vms-must-stay-apart3
+ resources vm:107,vm:108
+ affinity negative
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/service_config b/src/test/test-crs-static-rebalance-resource-affinity1/service_config
new file mode 100644
index 0000000..02e4a07
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/service_config
@@ -0,0 +1,10 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node2", "state": "started" },
+ "vm:103": { "node": "node3", "state": "started" },
+ "vm:104": { "node": "node3", "state": "started" },
+ "vm:105": { "node": "node3", "state": "started" },
+ "vm:106": { "node": "node3", "state": "started" },
+ "vm:107": { "node": "node3", "state": "started" },
+ "vm:108": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats b/src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats
new file mode 100644
index 0000000..c6472ca
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats
@@ -0,0 +1,10 @@
+{
+ "vm:101": { "maxcpu": 8, "maxmem": 16000000000 },
+ "vm:102": { "maxcpu": 4, "maxmem": 24000000000 },
+ "vm:103": { "maxcpu": 2, "maxmem": 32000000000 },
+ "vm:104": { "maxcpu": 4, "maxmem": 48000000000 },
+ "vm:105": { "maxcpu": 8, "maxmem": 16000000000 },
+ "vm:106": { "maxcpu": 4, "maxmem": 32000000000 },
+ "vm:107": { "maxcpu": 2, "maxmem": 64000000000 },
+ "vm:108": { "maxcpu": 8, "maxmem": 48000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/README b/src/test/test-crs-static-rebalance-resource-affinity2/README
new file mode 100644
index 0000000..6354299
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/README
@@ -0,0 +1,20 @@
+Test whether pairwise strict negative resource affinity rules, i.e. negative
+resource affinity relations a<->b, b<->c and a<->c, in conjunction with the
+static load scheduler with auto-rebalancing, are applied correctly on resource
+start and in case of a subsequent failover.
+
+The test scenario is:
+- vm:100 and vm:200 must be kept separate
+- vm:200 and vm:300 must be kept separate
+- vm:100 and vm:300 must be kept separate
+- Therefore, vm:100, vm:200, and vm:300 must be kept separate
+- The resources' static usage stats are chosen so that during rebalancing
+ vm:300 will need to select a less-than-ideal node according to the static
+ usage scheduler (node1 would be the ideal one), to test whether the
+ resource affinity rules still apply correctly
+
+The expected outcome is:
+- vm:100, vm:200, and vm:300 should be started on node1, node2, and node3
+ respectively, just as if the three negative resource affinity rules had
+ been stated as a single negative resource affinity rule
+- When node3 fails, vm:300 cannot be recovered
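
Note: the three pairwise rules in this scenario's rules_config (below) are
together equivalent to stating one negative affinity rule over all three
resources; in the same rules_config syntax, that single-rule form would look
roughly like this (rule name chosen here purely for illustration):

    resource-affinity: very-lonely-services
        resources vm:100,vm:200,vm:300
        affinity negative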
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/cmdlist b/src/test/test-crs-static-rebalance-resource-affinity2/cmdlist
new file mode 100644
index 0000000..eee0e40
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/cmdlist
@@ -0,0 +1,4 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"],
+ [ "network node3 off" ]
+]
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg b/src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+ "crs": {
+ "ha": "static",
+ "ha-rebalance-on-start": 1
+ }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
new file mode 100644
index 0000000..84484af
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+ "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+ "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/log.expect b/src/test/test-crs-static-rebalance-resource-affinity2/log.expect
new file mode 100644
index 0000000..a7e5c8e
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/log.expect
@@ -0,0 +1,174 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'static'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:100' on node 'node1'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:200' on node 'node1'
+info 20 node1/crm: adding new service 'vm:201' on node 'node1'
+info 20 node1/crm: adding new service 'vm:202' on node 'node1'
+info 20 node1/crm: adding new service 'vm:203' on node 'node1'
+info 20 node1/crm: adding new service 'vm:300' on node 'node1'
+info 20 node1/crm: adding new service 'vm:301' on node 'node1'
+info 20 node1/crm: adding new service 'vm:302' on node 'node1'
+info 20 node1/crm: adding new service 'vm:303' on node 'node1'
+info 20 node1/crm: service vm:100: re-balance selected current node node1 for startup
+info 20 node1/crm: service 'vm:100': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service vm:101: re-balance selected new node node2 for startup
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2)
+info 20 node1/crm: service vm:102: re-balance selected new node node3 for startup
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3)
+info 20 node1/crm: service vm:103: re-balance selected new node node3 for startup
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3)
+info 20 node1/crm: service vm:200: re-balance selected new node node2 for startup
+info 20 node1/crm: service 'vm:200': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2)
+info 20 node1/crm: service vm:201: re-balance selected new node node3 for startup
+info 20 node1/crm: service 'vm:201': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3)
+info 20 node1/crm: service vm:202: re-balance selected new node node3 for startup
+info 20 node1/crm: service 'vm:202': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3)
+info 20 node1/crm: service vm:203: re-balance selected current node node1 for startup
+info 20 node1/crm: service 'vm:203': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service vm:300: re-balance selected new node node3 for startup
+info 20 node1/crm: service 'vm:300': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3)
+info 20 node1/crm: service vm:301: re-balance selected current node node1 for startup
+info 20 node1/crm: service 'vm:301': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service vm:302: re-balance selected new node node2 for startup
+info 20 node1/crm: service 'vm:302': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2)
+info 20 node1/crm: service vm:303: re-balance selected current node node1 for startup
+info 20 node1/crm: service 'vm:303': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:100
+info 21 node1/lrm: service status vm:100 started
+info 21 node1/lrm: service vm:101 - start relocate to node 'node2'
+info 21 node1/lrm: service vm:101 - end relocate to node 'node2'
+info 21 node1/lrm: service vm:102 - start relocate to node 'node3'
+info 21 node1/lrm: service vm:102 - end relocate to node 'node3'
+info 21 node1/lrm: service vm:103 - start relocate to node 'node3'
+info 21 node1/lrm: service vm:103 - end relocate to node 'node3'
+info 21 node1/lrm: service vm:200 - start relocate to node 'node2'
+info 21 node1/lrm: service vm:200 - end relocate to node 'node2'
+info 21 node1/lrm: service vm:201 - start relocate to node 'node3'
+info 21 node1/lrm: service vm:201 - end relocate to node 'node3'
+info 21 node1/lrm: service vm:202 - start relocate to node 'node3'
+info 21 node1/lrm: service vm:202 - end relocate to node 'node3'
+info 21 node1/lrm: starting service vm:203
+info 21 node1/lrm: service status vm:203 started
+info 21 node1/lrm: service vm:300 - start relocate to node 'node3'
+info 21 node1/lrm: service vm:300 - end relocate to node 'node3'
+info 21 node1/lrm: starting service vm:301
+info 21 node1/lrm: service status vm:301 started
+info 21 node1/lrm: service vm:302 - start relocate to node 'node2'
+info 21 node1/lrm: service vm:302 - end relocate to node 'node2'
+info 21 node1/lrm: starting service vm:303
+info 21 node1/lrm: service status vm:303 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 40 node1/crm: service 'vm:101': state changed from 'request_start_balance' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:102': state changed from 'request_start_balance' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:103': state changed from 'request_start_balance' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:200': state changed from 'request_start_balance' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:201': state changed from 'request_start_balance' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:202': state changed from 'request_start_balance' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:300': state changed from 'request_start_balance' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:302': state changed from 'request_start_balance' to 'started' (node = node2)
+info 43 node2/lrm: got lock 'ha_agent_node2_lock'
+info 43 node2/lrm: status change wait_for_agent_lock => active
+info 43 node2/lrm: starting service vm:101
+info 43 node2/lrm: service status vm:101 started
+info 43 node2/lrm: starting service vm:200
+info 43 node2/lrm: service status vm:200 started
+info 43 node2/lrm: starting service vm:302
+info 43 node2/lrm: service status vm:302 started
+info 45 node3/lrm: got lock 'ha_agent_node3_lock'
+info 45 node3/lrm: status change wait_for_agent_lock => active
+info 45 node3/lrm: starting service vm:102
+info 45 node3/lrm: service status vm:102 started
+info 45 node3/lrm: starting service vm:103
+info 45 node3/lrm: service status vm:103 started
+info 45 node3/lrm: starting service vm:201
+info 45 node3/lrm: service status vm:201 started
+info 45 node3/lrm: starting service vm:202
+info 45 node3/lrm: service status vm:202 started
+info 45 node3/lrm: starting service vm:300
+info 45 node3/lrm: service status vm:300 started
+info 120 cmdlist: execute network node3 off
+info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 124 node3/crm: status change slave => wait_for_quorum
+info 125 node3/lrm: status change active => lost_agent_lock
+info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:201': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:202': state changed from 'started' to 'fence'
+info 160 node1/crm: service 'vm:300': state changed from 'started' to 'fence'
+info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 160 node1/crm: FENCE: Try to fence node 'node3'
+info 166 watchdog: execute power node3 off
+info 165 node3/crm: killed by poweroff
+info 166 node3/lrm: killed by poweroff
+info 166 hardware: server 'node3' stopped by poweroff (watchdog)
+info 240 node1/crm: got lock 'ha_agent_node3_lock'
+info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:201': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:202': state changed from 'fence' to 'recovery'
+info 240 node1/crm: service 'vm:300': state changed from 'fence' to 'recovery'
+info 240 node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node1)
+info 240 node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:103': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:201' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:201': state changed from 'recovery' to 'started' (node = node2)
+info 240 node1/crm: recover service 'vm:202' from fenced node 'node3' to node 'node2'
+info 240 node1/crm: service 'vm:202': state changed from 'recovery' to 'started' (node = node2)
+err 240 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 240 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+info 241 node1/lrm: starting service vm:102
+info 241 node1/lrm: service status vm:102 started
+info 243 node2/lrm: starting service vm:103
+info 243 node2/lrm: service status vm:103 started
+info 243 node2/lrm: starting service vm:201
+info 243 node2/lrm: service status vm:201 started
+info 243 node2/lrm: starting service vm:202
+info 243 node2/lrm: service status vm:202 started
+err 260 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 280 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 300 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 320 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 340 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 360 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 380 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 400 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 420 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 440 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 460 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 480 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 500 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 520 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 540 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 560 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 580 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 600 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 620 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 640 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 660 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 680 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+err 700 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found
+info 720 hardware: exit simulation - done
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/manager_status b/src/test/test-crs-static-rebalance-resource-affinity2/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/rules_config b/src/test/test-crs-static-rebalance-resource-affinity2/rules_config
new file mode 100644
index 0000000..bfe8787
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/rules_config
@@ -0,0 +1,11 @@
+resource-affinity: very-lonely-services1
+ resources vm:100,vm:200
+ affinity negative
+
+resource-affinity: very-lonely-services2
+ resources vm:200,vm:300
+ affinity negative
+
+resource-affinity: very-lonely-services3
+ resources vm:100,vm:300
+ affinity negative
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/service_config b/src/test/test-crs-static-rebalance-resource-affinity2/service_config
new file mode 100644
index 0000000..0de367e
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/service_config
@@ -0,0 +1,14 @@
+{
+ "vm:100": { "node": "node1", "state": "started" },
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:200": { "node": "node1", "state": "started" },
+ "vm:201": { "node": "node1", "state": "started" },
+ "vm:202": { "node": "node1", "state": "started" },
+ "vm:203": { "node": "node1", "state": "started" },
+ "vm:300": { "node": "node1", "state": "started" },
+ "vm:301": { "node": "node1", "state": "started" },
+ "vm:302": { "node": "node1", "state": "started" },
+ "vm:303": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats b/src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats
new file mode 100644
index 0000000..3c7502e
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats
@@ -0,0 +1,14 @@
+{
+ "vm:100": { "maxcpu": 8, "maxmem": 16000000000 },
+ "vm:101": { "maxcpu": 4, "maxmem": 8000000000 },
+ "vm:102": { "maxcpu": 2, "maxmem": 8000000000 },
+ "vm:103": { "maxcpu": 2, "maxmem": 4000000000 },
+ "vm:200": { "maxcpu": 4, "maxmem": 24000000000 },
+ "vm:201": { "maxcpu": 2, "maxmem": 8000000000 },
+ "vm:202": { "maxcpu": 4, "maxmem": 4000000000 },
+ "vm:203": { "maxcpu": 2, "maxmem": 8000000000 },
+ "vm:300": { "maxcpu": 6, "maxmem": 32000000000 },
+ "vm:301": { "maxcpu": 2, "maxmem": 4000000000 },
+ "vm:302": { "maxcpu": 2, "maxmem": 8000000000 },
+ "vm:303": { "maxcpu": 4, "maxmem": 8000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/README b/src/test/test-crs-static-rebalance-resource-affinity3/README
new file mode 100644
index 0000000..9e57662
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/README
@@ -0,0 +1,22 @@
+Test whether a more complex set of pairwise strict negative resource affinity
+rules, i.e. negative resource affinity relations a<->b, b<->c and a<->c,
+among 5 resources, in conjunction with the static load scheduler with
+auto-rebalancing, is applied correctly on resource start and in case of a
+consecutive failover of all nodes, one after another.
+
+The test scenario is:
+- vm:100, vm:200, vm:300, vm:400, and vm:500 must be kept separate
+- The resources' static usage stats are chosen so that during rebalancing
+ vm:300 and vm:500 will need to select a less-than-ideal node according to
+ the static usage scheduler (node2 and node3 would be their ideal ones), to
+ test whether the resource affinity rules still apply correctly
+
+The expected outcome is:
+- vm:100, vm:200, vm:300, vm:400, and vm:500 should be started on node2, node1,
+ node4, node3, and node5 respectively
+- vm:400 and vm:500 are started on node3 and node5, instead of node2 and
+ node3 as they would have been without the resource affinity rules
+- As node1, node2, node3, node4, and node5 fail consecutively, each coming
+ back online afterwards, vm:200, vm:100, vm:400, vm:300, and vm:500 will
+ respectively be put in recovery during the failover, as there is no other
+ node left to accommodate them without violating the resource affinity rules.
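
Note: expressed in the rules_config syntax used throughout this series,
keeping all five resources apart amounts to a strict negative affinity
constraint of the following shape (sketch only, rule name chosen for
illustration; the scenario's actual rules_config follows further below):

    resource-affinity: keep-all-apart
        resources vm:100,vm:200,vm:300,vm:400,vm:500
        affinity negative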
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/cmdlist b/src/test/test-crs-static-rebalance-resource-affinity3/cmdlist
new file mode 100644
index 0000000..6665419
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/cmdlist
@@ -0,0 +1,22 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ],
+ [ "power node1 off" ],
+ [ "delay 100" ],
+ [ "power node1 on" ],
+ [ "delay 100" ],
+ [ "power node2 off" ],
+ [ "delay 100" ],
+ [ "power node2 on" ],
+ [ "delay 100" ],
+ [ "power node3 off" ],
+ [ "delay 100" ],
+ [ "power node3 on" ],
+ [ "delay 100" ],
+ [ "power node4 off" ],
+ [ "delay 100" ],
+ [ "power node4 on" ],
+ [ "delay 100" ],
+ [ "power node5 off" ],
+ [ "delay 100" ],
+ [ "power node5 on" ]
+]
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg b/src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+ "crs": {
+ "ha": "static",
+ "ha-rebalance-on-start": 1
+ }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
new file mode 100644
index 0000000..b6dcb1a
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
@@ -0,0 +1,7 @@
+{
+ "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 48000000000 },
+ "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 36000000000 },
+ "node3": { "power": "off", "network": "off", "cpus": 16, "memory": 24000000000 },
+ "node4": { "power": "off", "network": "off", "cpus": 32, "memory": 36000000000 },
+ "node5": { "power": "off", "network": "off", "cpus": 8, "memory": 48000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/log.expect b/src/test/test-crs-static-rebalance-resource-affinity3/log.expect
new file mode 100644
index 0000000..4e87f03
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/log.expect
@@ -0,0 +1,272 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node4 on
+info 20 node4/crm: status change startup => wait_for_quorum
+info 20 node4/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node5 on
+info 20 node5/crm: status change startup => wait_for_quorum
+info 20 node5/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'static'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:100' on node 'node1'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:200' on node 'node1'
+info 20 node1/crm: adding new service 'vm:201' on node 'node1'
+info 20 node1/crm: adding new service 'vm:300' on node 'node1'
+info 20 node1/crm: adding new service 'vm:400' on node 'node1'
+info 20 node1/crm: adding new service 'vm:500' on node 'node1'
+info 20 node1/crm: service vm:100: re-balance selected new node node2 for startup
+info 20 node1/crm: service 'vm:100': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2)
+info 20 node1/crm: service vm:101: re-balance selected new node node4 for startup
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node4)
+info 20 node1/crm: service vm:200: re-balance selected current node node1 for startup
+info 20 node1/crm: service 'vm:200': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service vm:201: re-balance selected new node node5 for startup
+info 20 node1/crm: service 'vm:201': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node5)
+info 20 node1/crm: service vm:300: re-balance selected new node node4 for startup
+info 20 node1/crm: service 'vm:300': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node4)
+info 20 node1/crm: service vm:400: re-balance selected new node node3 for startup
+info 20 node1/crm: service 'vm:400': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3)
+info 20 node1/crm: service vm:500: re-balance selected new node node5 for startup
+info 20 node1/crm: service 'vm:500': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node5)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: service vm:100 - start relocate to node 'node2'
+info 21 node1/lrm: service vm:100 - end relocate to node 'node2'
+info 21 node1/lrm: service vm:101 - start relocate to node 'node4'
+info 21 node1/lrm: service vm:101 - end relocate to node 'node4'
+info 21 node1/lrm: starting service vm:200
+info 21 node1/lrm: service status vm:200 started
+info 21 node1/lrm: service vm:201 - start relocate to node 'node5'
+info 21 node1/lrm: service vm:201 - end relocate to node 'node5'
+info 21 node1/lrm: service vm:300 - start relocate to node 'node4'
+info 21 node1/lrm: service vm:300 - end relocate to node 'node4'
+info 21 node1/lrm: service vm:400 - start relocate to node 'node3'
+info 21 node1/lrm: service vm:400 - end relocate to node 'node3'
+info 21 node1/lrm: service vm:500 - start relocate to node 'node5'
+info 21 node1/lrm: service vm:500 - end relocate to node 'node5'
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 26 node4/crm: status change wait_for_quorum => slave
+info 28 node5/crm: status change wait_for_quorum => slave
+info 40 node1/crm: service 'vm:100': state changed from 'request_start_balance' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:101': state changed from 'request_start_balance' to 'started' (node = node4)
+info 40 node1/crm: service 'vm:201': state changed from 'request_start_balance' to 'started' (node = node5)
+info 40 node1/crm: service 'vm:300': state changed from 'request_start_balance' to 'started' (node = node4)
+info 40 node1/crm: service 'vm:400': state changed from 'request_start_balance' to 'started' (node = node3)
+info 40 node1/crm: service 'vm:500': state changed from 'request_start_balance' to 'started' (node = node5)
+info 43 node2/lrm: got lock 'ha_agent_node2_lock'
+info 43 node2/lrm: status change wait_for_agent_lock => active
+info 43 node2/lrm: starting service vm:100
+info 43 node2/lrm: service status vm:100 started
+info 45 node3/lrm: got lock 'ha_agent_node3_lock'
+info 45 node3/lrm: status change wait_for_agent_lock => active
+info 45 node3/lrm: starting service vm:400
+info 45 node3/lrm: service status vm:400 started
+info 47 node4/lrm: got lock 'ha_agent_node4_lock'
+info 47 node4/lrm: status change wait_for_agent_lock => active
+info 47 node4/lrm: starting service vm:101
+info 47 node4/lrm: service status vm:101 started
+info 47 node4/lrm: starting service vm:300
+info 47 node4/lrm: service status vm:300 started
+info 49 node5/lrm: got lock 'ha_agent_node5_lock'
+info 49 node5/lrm: status change wait_for_agent_lock => active
+info 49 node5/lrm: starting service vm:201
+info 49 node5/lrm: service status vm:201 started
+info 49 node5/lrm: starting service vm:500
+info 49 node5/lrm: service status vm:500 started
+info 120 cmdlist: execute power node1 off
+info 120 node1/crm: killed by poweroff
+info 120 node1/lrm: killed by poweroff
+info 220 cmdlist: execute delay 100
+info 222 node3/crm: got lock 'ha_manager_lock'
+info 222 node3/crm: status change slave => master
+info 222 node3/crm: using scheduler mode 'static'
+info 222 node3/crm: node 'node1': state changed from 'online' => 'unknown'
+info 282 node3/crm: service 'vm:200': state changed from 'started' to 'fence'
+info 282 node3/crm: node 'node1': state changed from 'unknown' => 'fence'
+emai 282 node3/crm: FENCE: Try to fence node 'node1'
+info 282 node3/crm: got lock 'ha_agent_node1_lock'
+info 282 node3/crm: fencing: acknowledged - got agent lock for node 'node1'
+info 282 node3/crm: node 'node1': state changed from 'fence' => 'unknown'
+emai 282 node3/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node1'
+info 282 node3/crm: service 'vm:200': state changed from 'fence' to 'recovery'
+err 282 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found
+err 302 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found
+err 322 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found
+err 342 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found
+err 362 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found
+err 382 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found
+info 400 cmdlist: execute power node1 on
+info 400 node1/crm: status change startup => wait_for_quorum
+info 400 node1/lrm: status change startup => wait_for_agent_lock
+info 400 node1/crm: status change wait_for_quorum => slave
+info 404 node3/crm: node 'node1': state changed from 'unknown' => 'online'
+info 404 node3/crm: recover service 'vm:200' to previous failed and fenced node 'node1' again
+info 404 node3/crm: service 'vm:200': state changed from 'recovery' to 'started' (node = node1)
+info 421 node1/lrm: got lock 'ha_agent_node1_lock'
+info 421 node1/lrm: status change wait_for_agent_lock => active
+info 421 node1/lrm: starting service vm:200
+info 421 node1/lrm: service status vm:200 started
+info 500 cmdlist: execute delay 100
+info 680 cmdlist: execute power node2 off
+info 680 node2/crm: killed by poweroff
+info 680 node2/lrm: killed by poweroff
+info 682 node3/crm: node 'node2': state changed from 'online' => 'unknown'
+info 742 node3/crm: service 'vm:100': state changed from 'started' to 'fence'
+info 742 node3/crm: node 'node2': state changed from 'unknown' => 'fence'
+emai 742 node3/crm: FENCE: Try to fence node 'node2'
+info 780 cmdlist: execute delay 100
+info 802 node3/crm: got lock 'ha_agent_node2_lock'
+info 802 node3/crm: fencing: acknowledged - got agent lock for node 'node2'
+info 802 node3/crm: node 'node2': state changed from 'fence' => 'unknown'
+emai 802 node3/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node2'
+info 802 node3/crm: service 'vm:100': state changed from 'fence' to 'recovery'
+err 802 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 822 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 842 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 862 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 882 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 902 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 922 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+err 942 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found
+info 960 cmdlist: execute power node2 on
+info 960 node2/crm: status change startup => wait_for_quorum
+info 960 node2/lrm: status change startup => wait_for_agent_lock
+info 962 node2/crm: status change wait_for_quorum => slave
+info 963 node2/lrm: got lock 'ha_agent_node2_lock'
+info 963 node2/lrm: status change wait_for_agent_lock => active
+info 964 node3/crm: node 'node2': state changed from 'unknown' => 'online'
+info 964 node3/crm: recover service 'vm:100' to previous failed and fenced node 'node2' again
+info 964 node3/crm: service 'vm:100': state changed from 'recovery' to 'started' (node = node2)
+info 983 node2/lrm: starting service vm:100
+info 983 node2/lrm: service status vm:100 started
+info 1060 cmdlist: execute delay 100
+info 1240 cmdlist: execute power node3 off
+info 1240 node3/crm: killed by poweroff
+info 1240 node3/lrm: killed by poweroff
+info 1340 cmdlist: execute delay 100
+info 1346 node5/crm: got lock 'ha_manager_lock'
+info 1346 node5/crm: status change slave => master
+info 1346 node5/crm: using scheduler mode 'static'
+info 1346 node5/crm: node 'node3': state changed from 'online' => 'unknown'
+info 1406 node5/crm: service 'vm:400': state changed from 'started' to 'fence'
+info 1406 node5/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai 1406 node5/crm: FENCE: Try to fence node 'node3'
+info 1406 node5/crm: got lock 'ha_agent_node3_lock'
+info 1406 node5/crm: fencing: acknowledged - got agent lock for node 'node3'
+info 1406 node5/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai 1406 node5/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info 1406 node5/crm: service 'vm:400': state changed from 'fence' to 'recovery'
+err 1406 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found
+err 1426 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found
+err 1446 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found
+err 1466 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found
+err 1486 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found
+err 1506 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found
+info 1520 cmdlist: execute power node3 on
+info 1520 node3/crm: status change startup => wait_for_quorum
+info 1520 node3/lrm: status change startup => wait_for_agent_lock
+info 1524 node3/crm: status change wait_for_quorum => slave
+info 1528 node5/crm: node 'node3': state changed from 'unknown' => 'online'
+info 1528 node5/crm: recover service 'vm:400' to previous failed and fenced node 'node3' again
+info 1528 node5/crm: service 'vm:400': state changed from 'recovery' to 'started' (node = node3)
+info 1545 node3/lrm: got lock 'ha_agent_node3_lock'
+info 1545 node3/lrm: status change wait_for_agent_lock => active
+info 1545 node3/lrm: starting service vm:400
+info 1545 node3/lrm: service status vm:400 started
+info 1620 cmdlist: execute delay 100
+info 1800 cmdlist: execute power node4 off
+info 1800 node4/crm: killed by poweroff
+info 1800 node4/lrm: killed by poweroff
+info 1806 node5/crm: node 'node4': state changed from 'online' => 'unknown'
+info 1866 node5/crm: service 'vm:101': state changed from 'started' to 'fence'
+info 1866 node5/crm: service 'vm:300': state changed from 'started' to 'fence'
+info 1866 node5/crm: node 'node4': state changed from 'unknown' => 'fence'
+emai 1866 node5/crm: FENCE: Try to fence node 'node4'
+info 1900 cmdlist: execute delay 100
+info 1926 node5/crm: got lock 'ha_agent_node4_lock'
+info 1926 node5/crm: fencing: acknowledged - got agent lock for node 'node4'
+info 1926 node5/crm: node 'node4': state changed from 'fence' => 'unknown'
+emai 1926 node5/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node4'
+info 1926 node5/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info 1926 node5/crm: service 'vm:300': state changed from 'fence' to 'recovery'
+info 1926 node5/crm: recover service 'vm:101' from fenced node 'node4' to node 'node2'
+info 1926 node5/crm: service 'vm:101': state changed from 'recovery' to 'started' (node = node2)
+err 1926 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 1926 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+info 1943 node2/lrm: starting service vm:101
+info 1943 node2/lrm: service status vm:101 started
+err 1946 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 1966 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 1986 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 2006 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 2026 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 2046 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+err 2066 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found
+info 2080 cmdlist: execute power node4 on
+info 2080 node4/crm: status change startup => wait_for_quorum
+info 2080 node4/lrm: status change startup => wait_for_agent_lock
+info 2086 node4/crm: status change wait_for_quorum => slave
+info 2087 node4/lrm: got lock 'ha_agent_node4_lock'
+info 2087 node4/lrm: status change wait_for_agent_lock => active
+info 2088 node5/crm: node 'node4': state changed from 'unknown' => 'online'
+info 2088 node5/crm: recover service 'vm:300' to previous failed and fenced node 'node4' again
+info 2088 node5/crm: service 'vm:300': state changed from 'recovery' to 'started' (node = node4)
+info 2107 node4/lrm: starting service vm:300
+info 2107 node4/lrm: service status vm:300 started
+info 2180 cmdlist: execute delay 100
+info 2360 cmdlist: execute power node5 off
+info 2360 node5/crm: killed by poweroff
+info 2360 node5/lrm: killed by poweroff
+info 2460 cmdlist: execute delay 100
+info 2480 node1/crm: got lock 'ha_manager_lock'
+info 2480 node1/crm: status change slave => master
+info 2480 node1/crm: using scheduler mode 'static'
+info 2480 node1/crm: node 'node5': state changed from 'online' => 'unknown'
+info 2540 node1/crm: service 'vm:201': state changed from 'started' to 'fence'
+info 2540 node1/crm: service 'vm:500': state changed from 'started' to 'fence'
+info 2540 node1/crm: node 'node5': state changed from 'unknown' => 'fence'
+emai 2540 node1/crm: FENCE: Try to fence node 'node5'
+info 2540 node1/crm: got lock 'ha_agent_node5_lock'
+info 2540 node1/crm: fencing: acknowledged - got agent lock for node 'node5'
+info 2540 node1/crm: node 'node5': state changed from 'fence' => 'unknown'
+emai 2540 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5'
+info 2540 node1/crm: service 'vm:201': state changed from 'fence' to 'recovery'
+info 2540 node1/crm: service 'vm:500': state changed from 'fence' to 'recovery'
+info 2540 node1/crm: recover service 'vm:201' from fenced node 'node5' to node 'node2'
+info 2540 node1/crm: service 'vm:201': state changed from 'recovery' to 'started' (node = node2)
+err 2540 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found
+err 2540 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found
+info 2543 node2/lrm: starting service vm:201
+info 2543 node2/lrm: service status vm:201 started
+err 2560 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found
+err 2580 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found
+err 2600 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found
+err 2620 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found
+info 2640 cmdlist: execute power node5 on
+info 2640 node5/crm: status change startup => wait_for_quorum
+info 2640 node5/lrm: status change startup => wait_for_agent_lock
+info 2640 node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info 2640 node1/crm: recover service 'vm:500' to previous failed and fenced node 'node5' again
+info 2640 node1/crm: service 'vm:500': state changed from 'recovery' to 'started' (node = node5)
+info 2648 node5/crm: status change wait_for_quorum => slave
+info 2669 node5/lrm: got lock 'ha_agent_node5_lock'
+info 2669 node5/lrm: status change wait_for_agent_lock => active
+info 2669 node5/lrm: starting service vm:500
+info 2669 node5/lrm: service status vm:500 started
+info 3240 hardware: exit simulation - done
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/manager_status b/src/test/test-crs-static-rebalance-resource-affinity3/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/rules_config b/src/test/test-crs-static-rebalance-resource-affinity3/rules_config
new file mode 100644
index 0000000..442cd58
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/rules_config
@@ -0,0 +1,3 @@
+resource-affinity: keep-them-apart
+ resources vm:100,vm:200,vm:300,vm:400,vm:500
+ affinity negative
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/service_config b/src/test/test-crs-static-rebalance-resource-affinity3/service_config
new file mode 100644
index 0000000..86dc27d
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/service_config
@@ -0,0 +1,9 @@
+{
+ "vm:100": { "node": "node1", "state": "started" },
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:200": { "node": "node1", "state": "started" },
+ "vm:201": { "node": "node1", "state": "started" },
+ "vm:300": { "node": "node1", "state": "started" },
+ "vm:400": { "node": "node1", "state": "started" },
+ "vm:500": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats b/src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats
new file mode 100644
index 0000000..755282b
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats
@@ -0,0 +1,9 @@
+{
+ "vm:100": { "maxcpu": 16, "maxmem": 16000000000 },
+ "vm:101": { "maxcpu": 4, "maxmem": 8000000000 },
+ "vm:200": { "maxcpu": 2, "maxmem": 48000000000 },
+ "vm:201": { "maxcpu": 4, "maxmem": 8000000000 },
+ "vm:300": { "maxcpu": 8, "maxmem": 32000000000 },
+ "vm:400": { "maxcpu": 32, "maxmem": 32000000000 },
+ "vm:500": { "maxcpu": 16, "maxmem": 8000000000 }
+}
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* [pve-devel] [PATCH ha-manager v3 12/13] test: rules: add test cases for resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (10 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 11/13] test: ha tester: add test cases for static scheduler resource affinity Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 13/13] api: resources: add check for resource affinity in resource migrations Daniel Kral
` (6 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add test cases to verify that the rule checkers correctly identify and
remove infeasible HA Resource Affinity rules from the rule set to make
it feasible. The added test cases verify that:
- Resource Affinity rules retrieve the correct optional default values
- Resource Affinity rules, which state that two or more resources are to
be kept together and separate at the same time, are dropped from the
rule set
- Resource Affinity rules, which cannot be fulfilled because of the
constraints imposed by the Node Affinity rules of their resources, are
dropped from the rule set
- Resource Affinity rules, which specify fewer than two resources, are
dropped from the rule set
- Negative resource affinity rules, which specify more resources than
there are available nodes, are dropped from the rule set
- Positive resource affinity rule resources, which overlap with other
positive resource affinity rules' resources, are merged into a single
positive resource affinity rule to make them disjoint from each other
- Positive resource affinity rule resources, which are also in negative
resource affinity rules, implicitly create negative resource affinity
rules for the other resources as well (see the sketch after this list)
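As an illustration of the last point, the inference roughly follows the
pattern sketched below (illustrative Perl only, using the rule hash layout
of the .expect files below; this is not the actual
PVE::HA::Rules::ResourceAffinity code):

use strict;
use warnings;

# Derive implicit negative resource affinity rules from positive rules and
# the negative rules they share a resource with. Assumes that conflicting
# rules (sharing two or more resources) have already been dropped by the
# checks above; $rules uses the same layout as the $VAR1 dumps below.
sub infer_implicit_negatives {
    my ($rules) = @_;

    my $ids = $rules->{ids};
    my @rule_ids = grep { $ids->{$_}->{type} eq 'resource-affinity' } keys %$ids;
    my @positive = grep { $ids->{$_}->{affinity} eq 'positive' } @rule_ids;
    my @negative = grep { $ids->{$_}->{affinity} eq 'negative' } @rule_ids;

    for my $pid (@positive) {
        my $presources = $ids->{$pid}->{resources};

        for my $nid (@negative) {
            my $nresources = $ids->{$nid}->{resources};

            my @shared = grep { $nresources->{$_} } keys %$presources;
            next if @shared != 1; # nothing shared, or conflict handled earlier

            my @outside = grep { !$presources->{$_} } keys %$nresources;

            for my $other (grep { $_ ne $shared[0] } keys %$presources) {
                for my $out (@outside) {
                    # hypothetical id scheme, for illustration only
                    $ids->{"_implicit-negative-$pid-$other-$out"} = {
                        type => 'resource-affinity',
                        affinity => 'negative',
                        resources => { $other => 1, $out => 1 },
                    };
                }
            }
        }
    }
}

Applied after the conflicting rules have been dropped, this should yield the
same resource pairs as the '_implicit-negative-*' entries in the .expect
files below.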
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
.../defaults-for-resource-affinity-rules.cfg | 16 +++
...lts-for-resource-affinity-rules.cfg.expect | 38 +++++
...onsistent-node-resource-affinity-rules.cfg | 54 ++++++++
...nt-node-resource-affinity-rules.cfg.expect | 121 ++++++++++++++++
.../inconsistent-resource-affinity-rules.cfg | 11 ++
...sistent-resource-affinity-rules.cfg.expect | 11 ++
...ctive-negative-resource-affinity-rules.cfg | 17 +++
...egative-resource-affinity-rules.cfg.expect | 30 ++++
.../ineffective-resource-affinity-rules.cfg | 8 ++
...fective-resource-affinity-rules.cfg.expect | 9 ++
...licit-negative-resource-affinity-rules.cfg | 40 ++++++
...egative-resource-affinity-rules.cfg.expect | 131 ++++++++++++++++++
...licit-negative-resource-affinity-rules.cfg | 16 +++
...egative-resource-affinity-rules.cfg.expect | 73 ++++++++++
...ected-positive-resource-affinity-rules.cfg | 42 ++++++
...ositive-resource-affinity-rules.cfg.expect | 70 ++++++++++
...-affinity-with-resource-affinity-rules.cfg | 19 +++
...ty-with-resource-affinity-rules.cfg.expect | 45 ++++++
src/test/test_rules_config.pl | 2 +
19 files changed, 753 insertions(+)
create mode 100644 src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
diff --git a/src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg b/src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg
new file mode 100644
index 0000000..a0fb4e0
--- /dev/null
+++ b/src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg
@@ -0,0 +1,16 @@
+# Case 1: Resource Affinity rules are enabled by default, so set it so if it isn't yet.
+resource-affinity: resource-affinity-defaults
+ resources vm:101,vm:102
+ affinity negative
+
+# Case 2: Resource Affinity rule is disabled, it shouldn't be enabled afterwards.
+resource-affinity: resource-affinity-disabled
+ resources vm:201,vm:202
+ affinity negative
+ disable
+
+# Case 3: Resource Affinity rule is disabled with explicit 1 set, it shouldn't be enabled afterwards.
+resource-affinity: resource-affinity-disabled-explicit
+ resources vm:301,vm:302
+ affinity negative
+ disable 1
diff --git a/src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..7384b0b
--- /dev/null
+++ b/src/test/rules_cfgs/defaults-for-resource-affinity-rules.cfg.expect
@@ -0,0 +1,38 @@
+--- Log ---
+--- Config ---
+$VAR1 = {
+ 'digest' => '9ac7cc517f02c41e3403085ec02f6a9259f2ac94',
+ 'ids' => {
+ 'resource-affinity-defaults' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'resource-affinity-disabled' => {
+ 'affinity' => 'negative',
+ 'disable' => 1,
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:202' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'resource-affinity-disabled-explicit' => {
+ 'affinity' => 'negative',
+ 'disable' => 1,
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:302' => 1
+ },
+ 'type' => 'resource-affinity'
+ }
+ },
+ 'order' => {
+ 'resource-affinity-defaults' => 1,
+ 'resource-affinity-disabled' => 2,
+ 'resource-affinity-disabled-explicit' => 3
+ }
+ };
diff --git a/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
new file mode 100644
index 0000000..9c93193
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg
@@ -0,0 +1,54 @@
+# Case 1: Remove no positive resource affinity rule, where there is exactly one node to keep them together.
+node-affinity: vm101-vm102-must-be-on-node1
+ resources vm:101,vm:102
+ nodes node1
+ strict 1
+
+resource-affinity: vm101-vm102-must-be-kept-together
+ resources vm:101,vm:102
+ affinity positive
+
+# Case 2: Remove no negative resource affinity rule, where there are exactly enough nodes available to keep them apart.
+node-affinity: vm201-must-be-on-node1
+ resources vm:201
+ nodes node1
+ strict 1
+
+node-affinity: vm202-must-be-on-node2
+ resources vm:202
+ nodes node2
+ strict 1
+
+resource-affinity: vm201-vm202-must-be-kept-separate
+ resources vm:201,vm:202
+ affinity negative
+
+# Case 1: Remove the positive resource affinity rules, where two resources are restricted to a different node.
+node-affinity: vm301-must-be-on-node1
+ resources vm:301
+ nodes node1
+ strict 1
+
+node-affinity: vm301-must-be-on-node2
+ resources vm:302
+ nodes node2
+ strict 1
+
+resource-affinity: vm301-vm302-must-be-kept-together
+ resources vm:301,vm:302
+ affinity positive
+
+# Case 2: Remove the negative resource affinity rule, where two resources are restricted to less nodes than needed to keep them apart.
+node-affinity: vm401-must-be-on-node1
+ resources vm:401
+ nodes node1
+ strict 1
+
+node-affinity: vm402-must-be-on-node1
+ resources vm:402
+ nodes node1
+ strict 1
+
+resource-affinity: vm401-vm402-must-be-kept-separate
+ resources vm:401,vm:402
+ affinity negative
diff --git a/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..a2b898d
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-node-resource-affinity-rules.cfg.expect
@@ -0,0 +1,121 @@
+--- Log ---
+Drop rule 'vm301-vm302-must-be-kept-together', because two or more resources are restricted to different nodes.
+Drop rule 'vm401-vm402-must-be-kept-separate', because two or more resources are restricted to less nodes than available to the resources.
+--- Config ---
+$VAR1 = {
+ 'digest' => '2125f6ec9743b24d6eb9ac8273ea90525cdd0d5a',
+ 'ids' => {
+ 'vm101-vm102-must-be-kept-together' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'vm101-vm102-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm201-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:201' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm201-vm202-must-be-kept-separate' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:202' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'vm202-must-be-on-node2' => {
+ 'nodes' => {
+ 'node2' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:202' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm301-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:301' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm301-must-be-on-node2' => {
+ 'nodes' => {
+ 'node2' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:302' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm401-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:401' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ },
+ 'vm402-must-be-on-node1' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 0
+ }
+ },
+ 'resources' => {
+ 'vm:402' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ }
+ },
+ 'order' => {
+ 'vm101-vm102-must-be-kept-together' => 2,
+ 'vm101-vm102-must-be-on-node1' => 1,
+ 'vm201-must-be-on-node1' => 3,
+ 'vm201-vm202-must-be-kept-separate' => 5,
+ 'vm202-must-be-on-node2' => 4,
+ 'vm301-must-be-on-node1' => 6,
+ 'vm301-must-be-on-node2' => 7,
+ 'vm401-must-be-on-node1' => 9,
+ 'vm402-must-be-on-node1' => 10
+ }
+ };
diff --git a/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg
new file mode 100644
index 0000000..a620e29
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg
@@ -0,0 +1,11 @@
+resource-affinity: keep-apart1
+ resources vm:102,vm:103
+ affinity negative
+
+resource-affinity: keep-apart2
+ resources vm:102,vm:104,vm:106
+ affinity negative
+
+resource-affinity: stick-together1
+ resources vm:101,vm:102,vm:103,vm:104,vm:106
+ affinity positive
diff --git a/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..b0cde0f
--- /dev/null
+++ b/src/test/rules_cfgs/inconsistent-resource-affinity-rules.cfg.expect
@@ -0,0 +1,11 @@
+--- Log ---
+Drop rule 'keep-apart1', because rule shares two or more resources with 'stick-together1'.
+Drop rule 'keep-apart2', because rule shares two or more resources with 'stick-together1'.
+Drop rule 'stick-together1', because rule shares two or more resources with 'keep-apart1'.
+Drop rule 'stick-together1', because rule shares two or more resources with 'keep-apart2'.
+--- Config ---
+$VAR1 = {
+ 'digest' => '50875b320034d8ac7dded185e590f5f87c4e2bb6',
+ 'ids' => {},
+ 'order' => {}
+ };
diff --git a/src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg b/src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg
new file mode 100644
index 0000000..c0f18d2
--- /dev/null
+++ b/src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg
@@ -0,0 +1,17 @@
+# Case 1: Do not remove negative resource affinity rules, which do define less resources than available nodes (3).
+resource-affinity: do-not-remove-me1
+ resources vm:101,vm:102
+ affinity negative
+
+resource-affinity: do-not-remove-me2
+ resources vm:101,vm:102,vm:103
+ affinity negative
+
+# Case 1: Remove negative resource affinity rules, which do define more resources than available nodes (3).
+resource-affinity: remove-me1
+ resources vm:101,vm:102,vm:103,vm:104
+ affinity negative
+
+resource-affinity: remove-me2
+ resources vm:101,vm:102,vm:103,vm:104,vm:105
+ affinity negative
diff --git a/src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..8a2b879
--- /dev/null
+++ b/src/test/rules_cfgs/ineffective-negative-resource-affinity-rules.cfg.expect
@@ -0,0 +1,30 @@
+--- Log ---
+Drop rule 'remove-me1', because rule defines more resources than available nodes.
+Drop rule 'remove-me2', because rule defines more resources than available nodes.
+--- Config ---
+$VAR1 = {
+ 'digest' => '68633cedeeb355ef78fe28221ef3f16537b3e788',
+ 'ids' => {
+ 'do-not-remove-me1' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-remove-me2' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1,
+ 'vm:103' => 1
+ },
+ 'type' => 'resource-affinity'
+ }
+ },
+ 'order' => {
+ 'do-not-remove-me1' => 1,
+ 'do-not-remove-me2' => 2
+ }
+ };
diff --git a/src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg b/src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg
new file mode 100644
index 0000000..32f977b
--- /dev/null
+++ b/src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg
@@ -0,0 +1,8 @@
+# Case 1: Remove resource affinity rules, which do not have enough resources to be effective.
+resource-affinity: lonely-resource1
+ resources vm:101
+ affinity positive
+
+resource-affinity: lonely-resource2
+ resources vm:101
+ affinity negative
diff --git a/src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..b2d468b
--- /dev/null
+++ b/src/test/rules_cfgs/ineffective-resource-affinity-rules.cfg.expect
@@ -0,0 +1,9 @@
+--- Log ---
+Drop rule 'lonely-resource1', because rule is ineffective as there are less than two resources.
+Drop rule 'lonely-resource2', because rule is ineffective as there are less than two resources.
+--- Config ---
+$VAR1 = {
+ 'digest' => 'fe89f8c8f5acc29f807eaa0cec5974b6e957a596',
+ 'ids' => {},
+ 'order' => {}
+ };
diff --git a/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg
new file mode 100644
index 0000000..db26286
--- /dev/null
+++ b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg
@@ -0,0 +1,40 @@
+# Case 1: Do not infer any negative resource affinity rules, if there are no resources of a positive
+# resource affinity rule in any negative resource affinity rules.
+resource-affinity: do-not-infer-positive1
+ resources vm:101,vm:102,vm:103
+ affinity positive
+
+# Case 2: Infer negative resource affinity rules with one resource in a negative resource affinity rule.
+resource-affinity: infer-simple-positive1
+ resources vm:201,vm:202,vm:203
+ affinity positive
+
+resource-affinity: infer-simple-negative1
+ resources vm:201,vm:204
+ affinity negative
+
+# Case 3: Infer negative resource affinity rules with two resources in different negative resource affinity rules.
+resource-affinity: infer-two-positive1
+ resources vm:301,vm:302,vm:303
+ affinity positive
+
+resource-affinity: infer-two-negative1
+ resources vm:303,vm:304
+ affinity negative
+
+resource-affinity: infer-two-negative2
+ resources vm:302,vm:305
+ affinity negative
+
+# Case 4: Do not infer negative resource affinity rules from inconsistent resource affinity rules.
+resource-affinity: do-not-infer-inconsistent-positive1
+ resources vm:401,vm:402,vm:403
+ affinity positive
+
+resource-affinity: do-not-infer-inconsistent-negative1
+ resources vm:401,vm:404
+ affinity negative
+
+resource-affinity: do-not-infer-inconsistent-negative2
+ resources vm:402,vm:403,vm:405
+ affinity negative
diff --git a/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..bcd368a
--- /dev/null
+++ b/src/test/rules_cfgs/infer-implicit-negative-resource-affinity-rules.cfg.expect
@@ -0,0 +1,131 @@
+--- Log ---
+Drop rule 'do-not-infer-inconsistent-negative2', because rule shares two or more resources with 'do-not-infer-inconsistent-positive1'.
+Drop rule 'do-not-infer-inconsistent-positive1', because rule shares two or more resources with 'do-not-infer-inconsistent-negative2'.
+--- Config ---
+$VAR1 = {
+ 'digest' => 'd8724dfe2130bb642b98e021da973aa0ec0695f0',
+ 'ids' => {
+ '_implicit-negative-infer-simple-positive1-vm:202-vm:204' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:202' => 1,
+ 'vm:204' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-infer-simple-positive1-vm:203-vm:204' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:203' => 1,
+ 'vm:204' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-infer-two-positive1-vm:301-vm:304' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:304' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-infer-two-positive1-vm:301-vm:305' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:305' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-infer-two-positive1-vm:302-vm:304' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:302' => 1,
+ 'vm:304' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-infer-two-positive1-vm:303-vm:305' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:303' => 1,
+ 'vm:305' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-infer-inconsistent-negative1' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:401' => 1,
+ 'vm:404' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-infer-positive1' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1,
+ 'vm:103' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-simple-negative1' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:204' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-simple-positive1' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:202' => 1,
+ 'vm:203' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-two-negative1' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:303' => 1,
+ 'vm:304' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-two-negative2' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:302' => 1,
+ 'vm:305' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-two-positive1' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:302' => 1,
+ 'vm:303' => 1
+ },
+ 'type' => 'resource-affinity'
+ }
+ },
+ 'order' => {
+ '_implicit-negative-infer-simple-positive1-vm:202-vm:204' => 2,
+ '_implicit-negative-infer-simple-positive1-vm:203-vm:204' => 2,
+ '_implicit-negative-infer-two-positive1-vm:301-vm:304' => 2,
+ '_implicit-negative-infer-two-positive1-vm:301-vm:305' => 2,
+ '_implicit-negative-infer-two-positive1-vm:302-vm:304' => 2,
+ '_implicit-negative-infer-two-positive1-vm:303-vm:305' => 2,
+ 'do-not-infer-inconsistent-negative1' => 8,
+ 'do-not-infer-positive1' => 1,
+ 'infer-simple-negative1' => 3,
+ 'infer-simple-positive1' => 2,
+ 'infer-two-negative1' => 5,
+ 'infer-two-negative2' => 6,
+ 'infer-two-positive1' => 4
+ }
+ };
diff --git a/src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg b/src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg
new file mode 100644
index 0000000..5694bde
--- /dev/null
+++ b/src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg
@@ -0,0 +1,16 @@
+# Case 1: Infer negative resource affinity rules from connected positive resource affinity rules.
+resource-affinity: infer-connected-positive1
+ resources vm:101,vm:102
+ affinity positive
+
+resource-affinity: infer-connected-positive2
+ resources vm:102,vm:103
+ affinity positive
+
+resource-affinity: infer-connected-negative1
+ resources vm:101,vm:104
+ affinity negative
+
+resource-affinity: infer-connected-negative2
+ resources vm:102,vm:105
+ affinity negative
diff --git a/src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..876c203
--- /dev/null
+++ b/src/test/rules_cfgs/merge-and-infer-implicit-negative-resource-affinity-rules.cfg.expect
@@ -0,0 +1,73 @@
+--- Log ---
+--- Config ---
+$VAR1 = {
+ 'digest' => '5695bd62a65966a275a62a01d2d8fbc370d91668',
+ 'ids' => {
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:101-vm:105' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:105' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:102-vm:104' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:102' => 1,
+ 'vm:104' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:103-vm:104' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:103' => 1,
+ 'vm:104' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:103-vm:105' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:103' => 1,
+ 'vm:105' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ '_merged-infer-connected-positive1-infer-connected-positive2' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1,
+ 'vm:103' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-connected-negative1' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:104' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'infer-connected-negative2' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:102' => 1,
+ 'vm:105' => 1
+ },
+ 'type' => 'resource-affinity'
+ }
+ },
+ 'order' => {
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:101-vm:105' => 2,
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:102-vm:104' => 2,
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:103-vm:104' => 2,
+ '_implicit-negative-_merged-infer-connected-positive1-infer-connected-positive2-vm:103-vm:105' => 2,
+ '_merged-infer-connected-positive1-infer-connected-positive2' => 1,
+ 'infer-connected-negative1' => 3,
+ 'infer-connected-negative2' => 4
+ }
+ };
diff --git a/src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg b/src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg
new file mode 100644
index 0000000..8954a27
--- /dev/null
+++ b/src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg
@@ -0,0 +1,42 @@
+# Case 1: Do not merge any negative resource affinity rules.
+resource-affinity: do-not-merge-negative1
+ resources vm:101,vm:102
+ affinity negative
+
+resource-affinity: do-not-merge-negative2
+ resources vm:102,vm:103
+ affinity negative
+
+resource-affinity: do-not-merge-negative3
+ resources vm:104,vm:105
+ affinity negative
+
+# Case 2: Do not merge unconnected positive resource affinity rules.
+resource-affinity: do-not-merge-positive1
+ resources vm:201,vm:202
+ affinity positive
+
+resource-affinity: do-not-merge-positive2
+ resources vm:203,vm:204
+ affinity positive
+
+# Case 3: Merge connected positive resource affinity rules.
+resource-affinity: merge-positive1
+ resources vm:301,vm:302
+ affinity positive
+
+resource-affinity: merge-positive2
+ resources vm:303,vm:305,vm:307
+ affinity positive
+
+resource-affinity: merge-positive3
+ resources vm:302,vm:303
+ affinity positive
+
+resource-affinity: merge-positive4
+ resources vm:302,vm:304,vm:306
+ affinity positive
+
+resource-affinity: merge-positive5
+ resources vm:307,vm:308,vm:309
+ affinity positive
diff --git a/src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..e57a792
--- /dev/null
+++ b/src/test/rules_cfgs/merge-connected-positive-resource-affinity-rules.cfg.expect
@@ -0,0 +1,70 @@
+--- Log ---
+--- Config ---
+$VAR1 = {
+ 'digest' => '920d9caac206fc0dd893753bfb2cab3e6d6a9b9b',
+ 'ids' => {
+ '_merged-merge-positive1-merge-positive3-merge-positive4-merge-positive2-merge-positive5' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:301' => 1,
+ 'vm:302' => 1,
+ 'vm:303' => 1,
+ 'vm:304' => 1,
+ 'vm:305' => 1,
+ 'vm:306' => 1,
+ 'vm:307' => 1,
+ 'vm:308' => 1,
+ 'vm:309' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-merge-negative1' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-merge-negative2' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:102' => 1,
+ 'vm:103' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-merge-negative3' => {
+ 'affinity' => 'negative',
+ 'resources' => {
+ 'vm:104' => 1,
+ 'vm:105' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-merge-positive1' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:202' => 1
+ },
+ 'type' => 'resource-affinity'
+ },
+ 'do-not-merge-positive2' => {
+ 'affinity' => 'positive',
+ 'resources' => {
+ 'vm:203' => 1,
+ 'vm:204' => 1
+ },
+ 'type' => 'resource-affinity'
+ }
+ },
+ 'order' => {
+ '_merged-merge-positive1-merge-positive3-merge-positive4-merge-positive2-merge-positive5' => 6,
+ 'do-not-merge-negative1' => 1,
+ 'do-not-merge-negative2' => 2,
+ 'do-not-merge-negative3' => 3,
+ 'do-not-merge-positive1' => 4,
+ 'do-not-merge-positive2' => 5
+ }
+ };
diff --git a/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
new file mode 100644
index 0000000..28504e3
--- /dev/null
+++ b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg
@@ -0,0 +1,19 @@
+# Case 1: Remove resource affinity rules, where there is a loose Node Affinity rule with multiple priority groups set for the nodes.
+node-affinity: vm101-vm102-should-be-on-node1-or-node2
+ resources vm:101,vm:102
+ nodes node1:1,node2:2
+ strict 0
+
+resource-affinity: vm101-vm102-must-be-kept-separate
+ resources vm:101,vm:102
+ affinity negative
+
+# Case 2: Remove resource affinity rules, where there is a strict Node Affinity rule with multiple priority groups set for the nodes.
+node-affinity: vm201-vm202-must-be-on-node1-or-node2
+ resources vm:201,vm:202
+ nodes node1:1,node2:2
+ strict 1
+
+resource-affinity: vm201-vm202-must-be-kept-together
+ resources vm:201,vm:202
+ affinity positive
diff --git a/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
new file mode 100644
index 0000000..41517f5
--- /dev/null
+++ b/src/test/rules_cfgs/multi-priority-node-affinity-with-resource-affinity-rules.cfg.expect
@@ -0,0 +1,45 @@
+--- Log ---
+Drop rule 'vm101-vm102-must-be-kept-separate', because resources are in node affinity rules with multiple priorities.
+Drop rule 'vm201-vm202-must-be-kept-together', because resources are in node affinity rules with multiple priorities.
+--- Config ---
+$VAR1 = {
+ 'digest' => 'b9dab8eba68f60c1a6e75138b5c129de8ad284ee',
+ 'ids' => {
+ 'vm101-vm102-should-be-on-node1-or-node2' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 1
+ },
+ 'node2' => {
+ 'priority' => 2
+ }
+ },
+ 'resources' => {
+ 'vm:101' => 1,
+ 'vm:102' => 1
+ },
+ 'strict' => 0,
+ 'type' => 'node-affinity'
+ },
+ 'vm201-vm202-must-be-on-node1-or-node2' => {
+ 'nodes' => {
+ 'node1' => {
+ 'priority' => 1
+ },
+ 'node2' => {
+ 'priority' => 2
+ }
+ },
+ 'resources' => {
+ 'vm:201' => 1,
+ 'vm:202' => 1
+ },
+ 'strict' => 1,
+ 'type' => 'node-affinity'
+ }
+ },
+ 'order' => {
+ 'vm101-vm102-should-be-on-node1-or-node2' => 1,
+ 'vm201-vm202-must-be-on-node1-or-node2' => 3
+ }
+ };
diff --git a/src/test/test_rules_config.pl b/src/test/test_rules_config.pl
index d49d14f..c2a7af4 100755
--- a/src/test/test_rules_config.pl
+++ b/src/test/test_rules_config.pl
@@ -13,8 +13,10 @@ use Data::Dumper;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity;
+use PVE::HA::Rules::ResourceAffinity;
PVE::HA::Rules::NodeAffinity->register();
+PVE::HA::Rules::ResourceAffinity->register();
PVE::HA::Rules->init(property_isolation => 1);
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* [pve-devel] [PATCH ha-manager v3 13/13] api: resources: add check for resource affinity in resource migrations
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (11 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 12/13] test: rules: add test cases for resource affinity rules Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules Daniel Kral
` (5 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
The HA Manager already handles positive and negative resource affinity
rules for individual resource migrations, but the information about
these is only sent to the HA environment's logger, i.e., in production
these messages end up in the HA Manager node's syslog.
Therefore, add checks when migrating/relocating resources through their
respective API endpoints to give users information about comigrated
resources, i.e., resources which are migrated together to the requested
target node because of positive resource affinity rules, and blocking
resources, i.e., resources which block a resource from being migrated
to a requested target node because of a negative resource affinity
rule.
get_resource_motion_info(...) is also callable from other packages to
get a listing of all allowed and disallowed nodes with respect to the
Resource Affinity rules, e.g., for a migration precondition check.
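For illustration (hypothetical resource IDs and node names), with the CLI
output handler added below, migrating a resource that is in a positive
affinity relationship would print roughly:

  # ha-manager crm-command migrate vm:100 node2
  also migrate resource 'vm:200' in positive affinity with resource 'vm:100' to target node 'node2'

while a migration that is blocked by a negative affinity rule would fail
with roughly:

  # ha-manager crm-command migrate vm:100 node2
  cannot migrate resource 'vm:100' to node 'node2':

  - resource 'vm:300' on target node 'node2' in negative affinity with resource 'vm:100'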
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Resources.pm | 128 +++++++++++++++++++++++++++++++++--
src/PVE/CLI/ha_manager.pm | 52 +++++++++++++-
src/PVE/HA/Config.pm | 56 +++++++++++++++
3 files changed, 228 insertions(+), 8 deletions(-)
diff --git a/src/PVE/API2/HA/Resources.pm b/src/PVE/API2/HA/Resources.pm
index 6ead5f0..d0b8d7e 100644
--- a/src/PVE/API2/HA/Resources.pm
+++ b/src/PVE/API2/HA/Resources.pm
@@ -319,19 +319,75 @@ __PACKAGE__->register_method({
),
},
},
- returns => { type => 'null' },
+ returns => {
+ type => 'object',
+ properties => {
+ sid => {
+ description => "HA resource, which is requested to be migrated.",
+ type => 'string',
+ optional => 0,
+ },
+ 'requested-node' => {
+ description => "Node, which was requested to be migrated to.",
+ type => 'string',
+ optional => 0,
+ },
+ 'comigrated-resources' => {
+ description => "HA resources, which are migrated to the same"
+ . " requested target node as the given HA resource, because"
+ . " these are in positive affinity with the HA resource.",
+ type => 'array',
+ optional => 1,
+ },
+ 'blocking-resources' => {
+ description => "HA resources, which are blocking the given"
+ . " HA resource from being migrated to the requested"
+ . " target node.",
+ type => 'array',
+ optional => 1,
+ items => {
+ description => "A blocking HA resource",
+ type => 'object',
+ properties => {
+ sid => {
+ type => 'string',
+ description => "The blocking HA resource id",
+ },
+ cause => {
+ type => 'string',
+ description => "The reason why the HA resource is"
+ . " blocking the migration.",
+ enum => ['resource-affinity'],
+ },
+ },
+ },
+ },
+ },
+ },
code => sub {
my ($param) = @_;
+ my $result = {};
+
my ($sid, $type, $name) = PVE::HA::Config::parse_sid(extract_param($param, 'sid'));
+ my $req_node = extract_param($param, 'node');
PVE::HA::Config::service_is_ha_managed($sid);
check_service_state($sid);
- PVE::HA::Config::queue_crm_commands("migrate $sid $param->{node}");
+ PVE::HA::Config::queue_crm_commands("migrate $sid $req_node");
+ $result->{sid} = $sid;
+ $result->{'requested-node'} = $req_node;
- return undef;
+ my ($comigrated_resources, $blocking_resources_by_node) =
+ PVE::HA::Config::get_resource_motion_info($sid);
+ my $blocking_resources = $blocking_resources_by_node->{$req_node};
+
+ $result->{'comigrated-resources'} = $comigrated_resources if @$comigrated_resources;
+ $result->{'blocking-resources'} = $blocking_resources if $blocking_resources;
+
+ return $result;
},
});
@@ -361,19 +417,79 @@ __PACKAGE__->register_method({
),
},
},
- returns => { type => 'null' },
+ returns => {
+ type => 'object',
+ properties => {
+ sid => {
+ description => "HA resource, which is requested to be relocated.",
+ type => 'string',
+ optional => 0,
+ },
+ 'requested-node' => {
+ description => "Node, which was requested to be relocated to.",
+ type => 'string',
+ optional => 0,
+ },
+ 'comigrated-resources' => {
+ description => "HA resources, which are relocated to the same"
+ . " requested target node as the given HA resource, because"
+ . " these are in positive affinity with the HA resource.",
+ type => 'array',
+ optional => 1,
+ items => {
+ type => 'string',
+ description => "A comigrated HA resource",
+ },
+ },
+ 'blocking-resources' => {
+ description => "HA resources, which are blocking the given"
+ . " HA resource from being relocated to the requested"
+ . " target node.",
+ type => 'array',
+ optional => 1,
+ items => {
+ description => "A blocking HA resource",
+ type => 'object',
+ properties => {
+ sid => {
+ type => 'string',
+ description => "The blocking HA resource id",
+ },
+ cause => {
+ type => 'string',
+ description => "The reason why the HA resource is"
+ . " blocking the relocation.",
+ enum => ['resource-affinity'],
+ },
+ },
+ },
+ },
+ },
+ },
code => sub {
my ($param) = @_;
+ my $result = {};
+
my ($sid, $type, $name) = PVE::HA::Config::parse_sid(extract_param($param, 'sid'));
+ my $req_node = extract_param($param, 'node');
PVE::HA::Config::service_is_ha_managed($sid);
check_service_state($sid);
- PVE::HA::Config::queue_crm_commands("relocate $sid $param->{node}");
+ PVE::HA::Config::queue_crm_commands("relocate $sid $req_node");
+ $result->{sid} = $sid;
+ $result->{'requested-node'} = $req_node;
- return undef;
+ my ($comigrated_resources, $blocking_resources_by_node) =
+ PVE::HA::Config::get_resource_motion_info($sid);
+ my $blocking_resources = $blocking_resources_by_node->{$req_node};
+
+ $result->{'comigrated-resources'} = $comigrated_resources if @$comigrated_resources;
+ $result->{'blocking-resources'} = $blocking_resources if $blocking_resources;
+
+ return $result;
},
});
diff --git a/src/PVE/CLI/ha_manager.pm b/src/PVE/CLI/ha_manager.pm
index ef936cd..d1f8393 100644
--- a/src/PVE/CLI/ha_manager.pm
+++ b/src/PVE/CLI/ha_manager.pm
@@ -147,6 +147,42 @@ __PACKAGE__->register_method({
},
});
+my $print_resource_motion_output = sub {
+ my ($cmd) = @_;
+
+ return sub {
+ my ($data) = @_;
+
+ my $sid = $data->{sid};
+ my $req_node = $data->{'requested-node'};
+
+ if (my $blocking_resources = $data->{'blocking-resources'}) {
+ my $err_msg = "cannot $cmd resource '$sid' to node '$req_node':\n\n";
+
+ for my $blocking_resource (@$blocking_resources) {
+ my ($csid, $cause) = $blocking_resource->@{qw(sid cause)};
+
+ $err_msg .= "- resource '$csid' on target node '$req_node'";
+
+ if ($cause eq 'resource-affinity') {
+ $err_msg .= " in negative affinity with resource '$sid'";
+ }
+
+ $err_msg .= "\n";
+ }
+
+ die $err_msg;
+ }
+
+ if ($data->{'comigrated-resources'}) {
+ for my $csid ($data->{'comigrated-resources'}->@*) {
+ print "also $cmd resource '$csid' in positive affinity with"
+ . " resource '$sid' to target node '$req_node'\n";
+ }
+ }
+ };
+};
+
our $cmddef = {
status => [__PACKAGE__, 'status'],
config => [
@@ -239,8 +275,20 @@ our $cmddef = {
relocate => { alias => 'crm-command relocate' },
'crm-command' => {
- migrate => ["PVE::API2::HA::Resources", 'migrate', ['sid', 'node']],
- relocate => ["PVE::API2::HA::Resources", 'relocate', ['sid', 'node']],
+ migrate => [
+ "PVE::API2::HA::Resources",
+ 'migrate',
+ ['sid', 'node'],
+ {},
+ $print_resource_motion_output->('migrate'),
+ ],
+ relocate => [
+ "PVE::API2::HA::Resources",
+ 'relocate',
+ ['sid', 'node'],
+ {},
+ $print_resource_motion_output->('relocate'),
+ ],
stop => [__PACKAGE__, 'stop', ['sid', 'timeout']],
'node-maintenance' => {
enable => [__PACKAGE__, 'node-maintenance-set', ['node'], { disable => 0 }],
diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 59bafd7..003909e 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -8,6 +8,7 @@ use JSON;
use PVE::HA::Tools;
use PVE::HA::Groups;
use PVE::HA::Rules;
+use PVE::HA::Rules::ResourceAffinity qw(get_affinitive_resources);
use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_write_file cfs_lock_file);
use PVE::HA::Resources;
@@ -223,6 +224,24 @@ sub read_and_check_rules_config {
return $rules;
}
+sub read_and_check_effective_rules_config {
+
+ my $rules = read_and_check_rules_config();
+
+ my $manager_status = read_manager_status();
+ my $nodes = [keys $manager_status->{node_status}->%*];
+
+ # TODO PVE 10: Remove group migration when HA groups have been fully migrated to location rules
+ my $groups = read_group_config();
+ my $resources = read_and_check_resources_config();
+
+ PVE::HA::Groups::migrate_groups_to_rules($rules, $groups, $resources);
+
+ PVE::HA::Rules->canonicalize($rules, $nodes);
+
+ return $rules;
+}
+
sub write_rules_config {
my ($cfg) = @_;
@@ -350,6 +369,43 @@ sub service_is_configured {
return 0;
}
+sub get_resource_motion_info {
+ my ($sid) = @_;
+
+ my $resources = read_resources_config();
+
+ my $comigrated_resources = [];
+ my $blocking_resources_by_node = {};
+
+ if (&$service_check_ha_state($resources, $sid)) {
+ my $manager_status = read_manager_status();
+ my $ss = $manager_status->{service_status};
+ my $ns = $manager_status->{node_status};
+
+ my $rules = read_and_check_effective_rules_config();
+ my ($together, $separate) = get_affinitive_resources($rules, $sid);
+
+ push @$comigrated_resources, $_ for sort keys %$together;
+
+ for my $node (keys %$ns) {
+ next if $ns->{$node} ne 'online';
+
+ for my $csid (sort keys %$separate) {
+ next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $node;
+ next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $node;
+
+ push $blocking_resources_by_node->{$node}->@*,
+ {
+ sid => $csid,
+ cause => 'resource-affinity',
+ };
+ }
+ }
+ }
+
+ return ($comigrated_resources, $blocking_resources_by_node);
+}
+
# graceful, as long as locking + cfs_write works
sub delete_service_from_config {
my ($sid) = @_;
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 22+ messages in thread
* [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (12 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 13/13] api: resources: add check for resource affinity in resource migrations Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-08 16:08 ` Shannon Sterz
2025-07-04 18:20 ` [pve-devel] [PATCH manager v3 1/3] ui: ha: rules: add " Daniel Kral
` (4 subsequent siblings)
18 siblings, 1 reply; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add documentation about HA Resource Affinity rules, what effects those
have on the CRS scheduler, and what users can expect when those are
changed.
There are also a few points on the rule conflicts/errors list which
describe some conflicts that can arise from a mixed usage of HA Node
Affinity rules and HA Resource Affinity rules.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
Makefile | 1 +
gen-ha-rules-resource-affinity-opts.pl | 20 ++++
ha-manager.adoc | 133 +++++++++++++++++++++++++
ha-rules-resource-affinity-opts.adoc | 8 ++
4 files changed, 162 insertions(+)
create mode 100755 gen-ha-rules-resource-affinity-opts.pl
create mode 100644 ha-rules-resource-affinity-opts.adoc
diff --git a/Makefile b/Makefile
index c5e506e..4d9e2f0 100644
--- a/Makefile
+++ b/Makefile
@@ -51,6 +51,7 @@ GEN_SCRIPTS= \
gen-ha-resources-opts.pl \
gen-ha-rules-node-affinity-opts.pl \
gen-ha-rules-opts.pl \
+ gen-ha-rules-resource-affinity-opts.pl \
gen-datacenter.cfg.5-opts.pl \
gen-pct.conf.5-opts.pl \
gen-pct-network-opts.pl \
diff --git a/gen-ha-rules-resource-affinity-opts.pl b/gen-ha-rules-resource-affinity-opts.pl
new file mode 100755
index 0000000..5abed50
--- /dev/null
+++ b/gen-ha-rules-resource-affinity-opts.pl
@@ -0,0 +1,20 @@
+#!/usr/bin/perl
+
+use lib '.';
+use strict;
+use warnings;
+use PVE::RESTHandler;
+
+use Data::Dumper;
+
+use PVE::HA::Rules;
+use PVE::HA::Rules::ResourceAffinity;
+
+my $private = PVE::HA::Rules::private();
+my $resource_affinity_rule_props = PVE::HA::Rules::ResourceAffinity::properties();
+my $properties = {
+ resources => $private->{propertyList}->{resources},
+ $resource_affinity_rule_props->%*,
+};
+
+print PVE::RESTHandler::dump_properties($properties);
diff --git a/ha-manager.adoc b/ha-manager.adoc
index ec26c22..8d06885 100644
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -692,6 +692,10 @@ include::ha-rules-opts.adoc[]
| HA Rule Type | Description
| `node-affinity` | Places affinity from one or more HA resources to one or
more nodes.
+| `resource-affinity` | Places affinity between two or more HA resources. The
+affinity `negative` specifies that HA resources are to be kept on separate
+nodes, while the affinity `positive` specifies that HA resources are to be kept
+on the same node.
|===========================================================
[[ha_manager_node_affinity_rules]]
@@ -758,6 +762,88 @@ Node Affinity Rule Properties
include::ha-rules-node-affinity-opts.adoc[]
+[[ha_manager_resource_affinity_rules]]
+Resource Affinity Rules
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Another common requirement is that two or more HA resources should either run
+on the same node or be distributed across separate nodes. These are also
+commonly called "Affinity/Anti-Affinity constraints".
+
+For example, suppose there is a lot of communication traffic between the HA
+resources `vm:100` and `vm:200`, e.g., a web server communicating with a
+database server. If those HA resources are on separate nodes, this could
+potentially result in a higher latency and unnecessary network load. Resource
+affinity rules with the affinity `positive` implement the constraint to keep
+the HA resources on the same node:
+
+----
+# ha-manager rules add resource-affinity keep-together \
+ --affinity positive --resources vm:100,vm:200
+----
+
+NOTE: If there are two or more positive resource affinity rules, which have
+common HA resources, then these are treated as a single positive resource
+affinity rule. For example, if the HA resources `vm:100` and `vm:101` and the
+HA resources `vm:101` and `vm:102` are each in a positive resource affinity
+rule, then it is the same as if `vm:100`, `vm:101` and `vm:102` had been
+in a single positive resource affinity rule.
+
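+In the rules configuration, such a pair of rules could look like this and is
+effectively treated as a single rule keeping `vm:100`, `vm:101` and `vm:102`
+on the same node:
+
+----
+resource-affinity: keep-together1
+	resources vm:100,vm:101
+	affinity positive
+
+resource-affinity: keep-together2
+	resources vm:101,vm:102
+	affinity positive
+----
+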
+However, suppose there are computationally expensive and/or distributed
+programs running on the HA resources `vm:200` and `ct:300`, e.g., sharded
+database instances. In that case, running them on the same node could put
+pressure on the node's hardware resources and slow down the operations of
+these HA resources. Resource affinity rules with
+the affinity `negative` implement the constraint to spread the HA resources on
+separate nodes:
+
+----
+# ha-manager rules add resource-affinity keep-separate \
+ --affinity negative --resources vm:200,ct:300
+----
+
+In contrast to node affinity rules, resource affinity rules are strict by
+default, i.e., if the constraints imposed by the resource affinity rules cannot
+be met for a HA resource, the HA Manager will put the HA resource in recovery
+state in case of a failover, or in error state otherwise.
+
+The above commands created the following rules in the rules configuration file:
+
+.Resource Affinity Rules Configuration Example (`/etc/pve/ha/rules.cfg`)
+----
+resource-affinity: keep-together
+ resources vm:100,vm:200
+ affinity positive
+
+resource-affinity: keep-separate
+ resources vm:200,ct:300
+ affinity negative
+----
+
+Interactions between Positive and Negative Resource Affinity Rules
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+If there are HA resources in a positive resource affinity rule, which are also
+part of a negative resource affinity rule, then all the other HA resources in
+the positive resource affinity rule are in negative affinity with the HA
+resources of these negative resource affinity rules as well.
+
+For example, if the HA resources `vm:100`, `vm:101`, and `vm:102` are in a
+positive resource affinity rule, and `vm:100` is in a negative resource affinity
+rule with the HA resource `ct:200`, then `vm:101` and `vm:102` are each in
+negative resource affinity with `ct:200` as well.
+
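+Expressed in the rules configuration, the example above corresponds to the
+following two rules:
+
+----
+resource-affinity: vms-must-stick-together
+	resources vm:100,vm:101,vm:102
+	affinity positive
+
+resource-affinity: vm100-ct200-must-be-separate
+	resources vm:100,ct:200
+	affinity negative
+----
+
+These implicitly keep `vm:101` and `vm:102` away from `ct:200` as well, as if
+the corresponding negative resource affinity rules had been configured
+explicitly.
+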
+Note that if there are two or more HA resources in both a positive and a
+negative resource affinity rule, then both rules will be disabled, as they
+cause a conflict:
+Two or more HA resources cannot be kept on the same node and separated on
+different nodes at the same time. For more information on these cases, see the
+section about xref:ha_manager_rule_conflicts[rule conflicts and errors] below.
+
+Resource Affinity Rule Properties
++++++++++++++++++++++++++++++++++
+
+include::ha-rules-resource-affinity-opts.adoc[]
+
[[ha_manager_rule_conflicts]]
Rule Conflicts and Errors
~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -774,6 +860,43 @@ Currently, HA rules are checked for the following feasibility tests:
total. If two or more HA node affinity rules specify the same HA resource,
these HA node affinity rules will be disabled.
+* A HA resource affinity rule must specify at least two HA resources to be
+ feasible. If a HA resource affinity rule specifies only one HA resource, the
+ HA resource affinity rule will be disabled.
+
+* A HA resource affinity rule must specify no more HA resources than there are
+ nodes in the cluster. If a HA resource affinity rule specifies more HA
+ resources than there are nodes in the cluster, the HA resource affinity rule
+ will be disabled.
+
+* A positive HA resource affinity rule cannot specify the same two or more HA
+ resources as a negative HA resource affinity rule. That is, two or more HA
+ resources cannot be kept together and separate at the same time. If any pair
+ of positive and negative HA resource affinity rules specifies the same two or
+ more HA resources, both HA resource affinity rules will be disabled (see the
+ example after this list).
+
+* A HA resource, which is already constrained by a HA node affinity rule, can
+ only be referenced by a HA resource affinity rule if the HA node affinity
+ rule uses only a single priority group. That is, the specified nodes in
+ the HA node affinity rule all have the same priority. If one of the HA
+ resources in a HA resource affinity rule is constrained by a HA node affinity
+ rule with multiple priority groups, the HA resource affinity rule will be
+ disabled.
+
+* The HA resources of a positive HA resource affinity rule, which are
+ constrained by HA node affinity rules, must have at least one common node
+ that they are allowed to run on. Otherwise, the HA resources could only run
+ on separate nodes. In other words, if two or more HA resources of a positive
+ HA resource affinity rule are constrained to different nodes, the positive HA
+ resource affinity rule will be disabled.
+
+* The HA resources of a negative HA resource affinity rule, which are
+ constrained by HA node affinity rules, must have at least enough nodes to
+ separate these constrained HA resources on. Otherwise, the HA resources do not
+ have enough nodes to be separated on. In other words, if two or more HA
+ resources of a negative HA resource affinity rule are constrained to less
+ nodes than needed to separate them on, the negative HA resource affinity rule
+ will be disabled.
+
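+For example, the following pair of rules would both be disabled, because
+`vm:100` and `vm:101` cannot be kept on the same node and on separate nodes at
+the same time (the rule names are arbitrary):
+
+----
+resource-affinity: conflicting-positive
+ resources vm:100,vm:101
+ affinity positive
+
+resource-affinity: conflicting-negative
+ resources vm:100,vm:101
+ affinity negative
+----
+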
[[ha_manager_fencing]]
Fencing
-------
@@ -1205,6 +1328,16 @@ The CRS is currently used at the following scheduling points:
algorithm to ensure that these HA resources are assigned according to their
node and priority constraints.
+** Positive resource affinity rules: If a positive resource affinity rule is
+ created or HA resources are added to an existing positive resource affinity
+ rule, the HA stack will use the CRS algorithm to ensure that these HA
+ resources are moved to a common node.
+
+** Negative resource affinity rules: If a negative resource affinity rule is
+ created or HA resources are added to an existing negative resource affinity
+ rule, the HA stack will use the CRS algorithm to ensure that these HA
+ resources are moved to separate nodes.
+
- HA service stopped -> start transition (opt-in). Requesting that a stopped
 service should be started is a good opportunity to check for the best suited
node as per the CRS algorithm, as moving stopped services is cheaper to do
diff --git a/ha-rules-resource-affinity-opts.adoc b/ha-rules-resource-affinity-opts.adoc
new file mode 100644
index 0000000..596ec3c
--- /dev/null
+++ b/ha-rules-resource-affinity-opts.adoc
@@ -0,0 +1,8 @@
+`affinity`: `<negative | positive>` ::
+
+Describes whether the HA resources are supposed to be kept on the same node ('positive'), or are supposed to be kept on separate nodes ('negative').
+
+`resources`: `<type>:<name>{,<type>:<name>}*` ::
+
+List of HA resource IDs. Each entry consists of a resource type followed by a resource-specific name, separated by a colon (example: vm:100,ct:101).
+
--
2.39.5
* [pve-devel] [PATCH manager v3 1/3] ui: ha: rules: add ha resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (13 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH manager v3 2/3] ui: migrate: lxc: display precondition messages for ha resource affinity Daniel Kral
` (3 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Add HA resource affinity rules as a second rule type to the HA Rules tab
page, rendered in a separate grid so that the columns better match the
content of these rules.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
www/manager6/Makefile | 2 ++
www/manager6/ha/Rules.js | 12 +++++++
.../ha/rules/ResourceAffinityRuleEdit.js | 24 ++++++++++++++
.../ha/rules/ResourceAffinityRules.js | 31 +++++++++++++++++++
4 files changed, 69 insertions(+)
create mode 100644 www/manager6/ha/rules/ResourceAffinityRuleEdit.js
create mode 100644 www/manager6/ha/rules/ResourceAffinityRules.js
diff --git a/www/manager6/Makefile b/www/manager6/Makefile
index 3d3450c7..8c9c8610 100644
--- a/www/manager6/Makefile
+++ b/www/manager6/Makefile
@@ -151,6 +151,8 @@ JSSRC= \
ha/StatusView.js \
ha/rules/NodeAffinityRuleEdit.js \
ha/rules/NodeAffinityRules.js \
+ ha/rules/ResourceAffinityRuleEdit.js \
+ ha/rules/ResourceAffinityRules.js \
dc/ACLView.js \
dc/ACMEClusterView.js \
dc/AuthEditBase.js \
diff --git a/www/manager6/ha/Rules.js b/www/manager6/ha/Rules.js
index 8f487465..1799d25f 100644
--- a/www/manager6/ha/Rules.js
+++ b/www/manager6/ha/Rules.js
@@ -167,6 +167,17 @@ Ext.define(
flex: 1,
border: 0,
},
+ {
+ xtype: 'splitter',
+ collapsible: false,
+ performCollapse: false,
+ },
+ {
+ title: gettext('HA Resource Affinity Rules'),
+ xtype: 'pveHAResourceAffinityRulesView',
+ flex: 1,
+ border: 0,
+ },
],
},
function () {
@@ -180,6 +191,7 @@ Ext.define(
'errors',
'disable',
'comment',
+ 'affinity',
'resources',
{
name: 'strict',
diff --git a/www/manager6/ha/rules/ResourceAffinityRuleEdit.js b/www/manager6/ha/rules/ResourceAffinityRuleEdit.js
new file mode 100644
index 00000000..3bfb2c49
--- /dev/null
+++ b/www/manager6/ha/rules/ResourceAffinityRuleEdit.js
@@ -0,0 +1,24 @@
+Ext.define('PVE.ha.rules.ResourceAffinityInputPanel', {
+ extend: 'PVE.ha.RuleInputPanel',
+
+ initComponent: function () {
+ let me = this;
+
+ me.column1 = [];
+
+ me.column2 = [
+ {
+ xtype: 'proxmoxKVComboBox',
+ name: 'affinity',
+ fieldLabel: gettext('Affinity'),
+ allowBlank: false,
+ comboItems: [
+ ['positive', gettext('Keep together')],
+ ['negative', gettext('Keep separate')],
+ ],
+ },
+ ];
+
+ me.callParent();
+ },
+});
diff --git a/www/manager6/ha/rules/ResourceAffinityRules.js b/www/manager6/ha/rules/ResourceAffinityRules.js
new file mode 100644
index 00000000..6205881e
--- /dev/null
+++ b/www/manager6/ha/rules/ResourceAffinityRules.js
@@ -0,0 +1,31 @@
+Ext.define('PVE.ha.ResourceAffinityRulesView', {
+ extend: 'PVE.ha.RulesBaseView',
+ alias: 'widget.pveHAResourceAffinityRulesView',
+
+ ruleType: 'resource-affinity',
+ ruleTitle: gettext('HA Resource Affinity'),
+ inputPanel: 'ResourceAffinityInputPanel',
+ faIcon: 'link',
+
+ stateful: true,
+ stateId: 'grid-ha-resource-affinity-rules',
+
+ initComponent: function () {
+ let me = this;
+
+ me.columns = [
+ {
+ header: gettext('Affinity'),
+ width: 100,
+ dataIndex: 'affinity',
+ },
+ {
+ header: gettext('HA Resources'),
+ flex: 1,
+ dataIndex: 'resources',
+ },
+ ];
+
+ me.callParent();
+ },
+});
--
2.39.5
* [pve-devel] [PATCH manager v3 2/3] ui: migrate: lxc: display precondition messages for ha resource affinity
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (14 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH manager v3 1/3] ui: ha: rules: add " Daniel Kral
@ 2025-07-04 18:20 ` Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH manager v3 3/3] ui: migrate: vm: " Daniel Kral
` (2 subsequent siblings)
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:20 UTC (permalink / raw)
To: pve-devel
Extend the container precondition check to show whether a migration of a
container results in any additional migrations because of positive HA
resource affinity rules, or whether the migration cannot be completed
because of negative resource affinity rules.
In the latter case, these migrations would be blocked by the HA Manager's
CLI and its state machine when actually executing them, but this gives
users an earlier heads-up. Additional migrations, on the other hand, are
not reported in advance by the CLI yet, so these warnings are crucial to
inform users about the comigrated HA resources.
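As an illustration (assuming the container has a negative affinity with the
HA resource vm:102 and a positive affinity with vm:101), the migrate dialog
would show precondition messages along the lines of:

  error:   Cannot migrate container, because HA resource vm:102 with
           negative affinity to container on selected target node.
  warning: HA resource vm:101 with positive affinity to container is also
           migrated to selected target node.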
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
www/manager6/window/Migrate.js | 87 ++++++++++++++++++++++++++++++++--
1 file changed, 84 insertions(+), 3 deletions(-)
diff --git a/www/manager6/window/Migrate.js b/www/manager6/window/Migrate.js
index 4103ad13..dff6af08 100644
--- a/www/manager6/window/Migrate.js
+++ b/www/manager6/window/Migrate.js
@@ -195,7 +195,7 @@ Ext.define('PVE.window.Migrate', {
if (vm.get('vmtype') === 'qemu') {
await me.checkQemuPreconditions(resetMigrationPossible);
} else {
- me.checkLxcPreconditions(resetMigrationPossible);
+ await me.checkLxcPreconditions(resetMigrationPossible);
}
// Only allow nodes where the local storage is available in case of offline migration
@@ -363,11 +363,92 @@ Ext.define('PVE.window.Migrate', {
vm.set('migration', migration);
},
- checkLxcPreconditions: function (resetMigrationPossible) {
- let vm = this.getViewModel();
+ checkLxcPreconditions: async function (resetMigrationPossible) {
+ let me = this;
+ let vm = me.getViewModel();
+ let migrateStats;
+
if (vm.get('running')) {
vm.set('migration.mode', 'restart');
}
+
+ try {
+ if (
+ me.fetchingNodeMigrateInfo &&
+ me.fetchingNodeMigrateInfo === vm.get('nodename')
+ ) {
+ return;
+ }
+ me.fetchingNodeMigrateInfo = vm.get('nodename');
+ let { result } = await Proxmox.Async.api2({
+ url: `/nodes/${vm.get('nodename')}/${vm.get('vmtype')}/${vm.get('vmid')}/migrate`,
+ method: 'GET',
+ });
+ migrateStats = result.data;
+ me.fetchingNodeMigrateInfo = false;
+ } catch (error) {
+ Ext.Msg.alert(Proxmox.Utils.errorText, error.htmlStatus);
+ return;
+ }
+
+ if (migrateStats.running) {
+ vm.set('running', true);
+ }
+
+ // Get migration object from viewmodel to prevent too many bind callbacks
+ let migration = vm.get('migration');
+ if (resetMigrationPossible) {
+ migration.possible = true;
+ }
+ migration.preconditions = [];
+ let targetNode = me.lookup('pveNodeSelector').value;
+ let disallowed = migrateStats['not-allowed-nodes']?.[targetNode] ?? {};
+
+ let blockingHAResources = disallowed['blocking-ha-resources'] ?? [];
+ if (blockingHAResources.length) {
+ migration.possible = false;
+
+ for (const { sid, cause } of blockingHAResources) {
+ let reasonText;
+ if (cause === 'resource-affinity') {
+ reasonText = Ext.String.format(
+ gettext(
+ 'HA resource {0} with negative affinity to container on selected target node',
+ ),
+ sid,
+ );
+ } else {
+ reasonText = Ext.String.format(
+ gettext('blocking HA resource {0} on selected target node'),
+ sid,
+ );
+ }
+
+ migration.preconditions.push({
+ severity: 'error',
+ text: Ext.String.format(
+ gettext('Cannot migrate container, because {0}.'),
+ reasonText,
+ ),
+ });
+ }
+ }
+
+ let comigratedHAResources = migrateStats['comigrated-ha-resources'];
+ if (comigratedHAResources !== undefined) {
+ for (const sid of comigratedHAResources) {
+ const text = Ext.String.format(
+ gettext(
+ 'HA resource {0} with positive affinity to container is also migrated to selected target node.',
+ ),
+ sid,
+ );
+
+ migration.preconditions.push({ text, severity: 'warning' });
+ }
+ }
+
+ vm.set('migration', migration);
},
},
--
2.39.5
* [pve-devel] [PATCH manager v3 3/3] ui: migrate: vm: display precondition messages for ha resource affinity
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (15 preceding siblings ...)
2025-07-04 18:20 ` [pve-devel] [PATCH manager v3 2/3] ui: migrate: lxc: display precondition messages for ha resource affinity Daniel Kral
@ 2025-07-04 18:21 ` Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH container v3 1/1] api: introduce migration preconditions api endpoint Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH qemu-server v3 1/1] api: migration preconditions: add checks for ha resource affinity rules Daniel Kral
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:21 UTC (permalink / raw)
To: pve-devel
Extend the VM precondition check to show whether a migration of a VM
results in any additional migrations because of positive HA resource
affinity rules, or whether the migration cannot be completed because of
negative resource affinity rules.
In the latter case, these migrations would be blocked by the HA Manager's
CLI and its state machine when actually executing them, but this gives
users an earlier heads-up. Additional migrations, on the other hand, are
not reported in advance by the CLI yet, so these warnings are crucial to
inform users about the comigrated HA resources.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
www/manager6/window/Migrate.js | 44 ++++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/www/manager6/window/Migrate.js b/www/manager6/window/Migrate.js
index dff6af08..53349f8c 100644
--- a/www/manager6/window/Migrate.js
+++ b/www/manager6/window/Migrate.js
@@ -361,6 +361,50 @@ Ext.define('PVE.window.Migrate', {
});
}
+ let blockingHAResources = disallowed['blocking-ha-resources'] ?? [];
+ if (blockingHAResources.length) {
+ migration.possible = false;
+
+ for (const { sid, cause } of blockingHAResources) {
+ let reasonText;
+ if (cause === 'resource-affinity') {
+ reasonText = Ext.String.format(
+ gettext(
+ 'HA resource {0} with negative affinity to VM on selected target node',
+ ),
+ sid,
+ );
+ } else {
+ reasonText = Ext.String.format(
+ gettext('blocking HA resource {0} on selected target node'),
+ sid,
+ );
+ }
+
+ migration.preconditions.push({
+ severity: 'error',
+ text: Ext.String.format(
+ gettext('Cannot migrate VM, because {0}.'),
+ reasonText,
+ ),
+ });
+ }
+ }
+
+ let comigratedHAResources = migrateStats['comigrated-ha-resources'];
+ if (comigratedHAResources !== undefined) {
+ for (const sid of comigratedHAResources) {
+ const text = Ext.String.format(
+ gettext(
+ 'HA resource {0} with positive affinity to VM is also migrated to selected target node.',
+ ),
+ sid,
+ );
+
+ migration.preconditions.push({ text, severity: 'warning' });
+ }
+ }
+
vm.set('migration', migration);
},
checkLxcPreconditions: async function (resetMigrationPossible) {
--
2.39.5
* [pve-devel] [PATCH container v3 1/1] api: introduce migration preconditions api endpoint
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (16 preceding siblings ...)
2025-07-04 18:21 ` [pve-devel] [PATCH manager v3 3/3] ui: migrate: vm: " Daniel Kral
@ 2025-07-04 18:21 ` Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH qemu-server v3 1/1] api: migration preconditions: add checks for ha resource affinity rules Daniel Kral
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:21 UTC (permalink / raw)
To: pve-devel
Add a migration preconditions API endpoint for containers, in a similar
vein to the one that is already present for virtual machines.
This is needed to inform callers about the positive and negative HA
resource affinity rules that the container is part of, that is, about any
comigrated resources or blocking resources caused by these rules.
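For illustration, a response for a HA-managed container that is in both a
positive and a negative affinity relationship could look roughly like this
(node names and resource IDs are made up):

    {
        "running": 1,
        "allowed-nodes": [ "node2" ],
        "not-allowed-nodes": {
            "node3": {
                "blocking-ha-resources": [
                    { "sid": "vm:102", "cause": "resource-affinity" }
                ]
            }
        },
        "comigrated-ha-resources": [ "vm:101" ]
    }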
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/LXC.pm | 141 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 141 insertions(+)
diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
index 28f7fdd..dd20604 100644
--- a/src/PVE/API2/LXC.pm
+++ b/src/PVE/API2/LXC.pm
@@ -1395,6 +1395,147 @@ __PACKAGE__->register_method({
},
});
+__PACKAGE__->register_method({
+ name => 'migrate_vm_precondition',
+ path => '{vmid}/migrate',
+ method => 'GET',
+ protected => 1,
+ proxyto => 'node',
+ description => "Get preconditions for migration.",
+ permissions => {
+ check => ['perm', '/vms/{vmid}', ['VM.Migrate']],
+ },
+ parameters => {
+ additionalProperties => 0,
+ properties => {
+ node => get_standard_option('pve-node'),
+ vmid =>
+ get_standard_option('pve-vmid', { completion => \&PVE::LXC::complete_ctid }),
+ target => get_standard_option(
+ 'pve-node',
+ {
+ description => "Target node.",
+ completion => \&PVE::Cluster::complete_migration_target,
+ optional => 1,
+ },
+ ),
+ },
+ },
+ returns => {
+ type => "object",
+ properties => {
+ running => {
+ type => 'boolean',
+ description => "Determines if the container is running.",
+ },
+ 'allowed-nodes' => {
+ type => 'array',
+ items => {
+ type => 'string',
+ description => "An allowed node",
+ },
+ optional => 1,
+ description => "List of nodes allowed for migration.",
+ },
+ 'not-allowed-nodes' => {
+ type => 'object',
+ optional => 1,
+ properties => {
+ 'blocking-ha-resources' => {
+ description => "HA resources, which are blocking the"
+ . " container from being migrated to the node.",
+ type => 'array',
+ optional => 1,
+ items => {
+ description => "A blocking HA resource",
+ type => 'object',
+ properties => {
+ sid => {
+ type => 'string',
+ description => "The blocking HA resource id",
+ },
+ cause => {
+ type => 'string',
+ description => "The reason why the HA"
+ . " resource is blocking the migration.",
+ enum => ['resource-affinity'],
+ },
+ },
+ },
+ },
+ },
+ description => "List of not allowed nodes with additional information.",
+ },
+ 'comigrated-ha-resources' => {
+ description => "HA resources, which will be migrated to the"
+ . " same target node as the container, because these are in"
+ . " positive affinity with the container.",
+ type => 'array',
+ optional => 1,
+ items => {
+ type => 'string',
+ description => "A comigrated HA resource",
+ },
+ },
+ },
+ },
+ code => sub {
+ my ($param) = @_;
+
+ my $rpcenv = PVE::RPCEnvironment::get();
+
+ my $authuser = $rpcenv->get_user();
+
+ PVE::Cluster::check_cfs_quorum();
+
+ my $res = {};
+
+ my $vmid = extract_param($param, 'vmid');
+ my $target = extract_param($param, 'target');
+ my $localnode = PVE::INotify::nodename();
+
+ # test if VM exists
+ my $vmconf = PVE::LXC::Config->load_config($vmid);
+ my $storecfg = PVE::Storage::config();
+
+ # try to detect errors early
+ PVE::LXC::Config->check_lock($vmconf);
+
+ $res->{running} = PVE::LXC::check_running($vmid) ? 1 : 0;
+
+ $res->{'allowed-nodes'} = [];
+ $res->{'not-allowed-nodes'} = {};
+
+ my $comigrated_ha_resources = {};
+ my $blocking_ha_resources_by_node = {};
+
+ if (PVE::HA::Config::vm_is_ha_managed($vmid)) {
+ ($comigrated_ha_resources, $blocking_ha_resources_by_node) =
+ PVE::HA::Config::get_resource_motion_info("ct:$vmid");
+ }
+
+ my $nodelist = PVE::Cluster::get_nodelist();
+ for my $node ($nodelist->@*) {
+ next if $node eq $localnode;
+
+ # extracting blocking resources for current node
+ if (my $blocking_ha_resources = $blocking_ha_resources_by_node->{$node}) {
+ $res->{'not-allowed-nodes'}->{$node}->{'blocking-ha-resources'} =
+ $blocking_ha_resources;
+ }
+
+ # if nothing came up, add it to the allowed nodes
+ if (!defined($res->{'not-allowed-nodes'}->{$node})) {
+ push $res->{'allowed-nodes'}->@*, $node;
+ }
+ }
+
+ $res->{'comigrated-ha-resources'} = $comigrated_ha_resources;
+
+ return $res;
+ },
+});
+
__PACKAGE__->register_method({
name => 'migrate_vm',
path => '{vmid}/migrate',
--
2.39.5
* [pve-devel] [PATCH qemu-server v3 1/1] api: migration preconditions: add checks for ha resource affinity rules
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
` (17 preceding siblings ...)
2025-07-04 18:21 ` [pve-devel] [PATCH container v3 1/1] api: introduce migration preconditions api endpoint Daniel Kral
@ 2025-07-04 18:21 ` Daniel Kral
18 siblings, 0 replies; 22+ messages in thread
From: Daniel Kral @ 2025-07-04 18:21 UTC (permalink / raw)
To: pve-devel
Add information about the positive and negative HA resource affinity
rules that the VM is part of to the migration precondition API endpoint.
This informs callers about any comigrated resources or blocking resources
caused by the resource affinity rules.
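For example, assuming a negative affinity between vm:100 and vm:102 (with
vm:102 running on node3) and a positive affinity between vm:100 and vm:101,
querying the endpoint with something like

    # pvesh get /nodes/node1/qemu/100/migrate --target node3 --output-format json

would list a 'blocking-ha-resources' entry with sid 'vm:102' and cause
'resource-affinity' under 'not_allowed_nodes'/'node3', and 'vm:101' under
'comigrated-ha-resources' (node names and IDs are made up).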
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/Qemu.pm | 49 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
index 2e6358e4..8a122cfb 100644
--- a/src/PVE/API2/Qemu.pm
+++ b/src/PVE/API2/Qemu.pm
@@ -5094,6 +5094,28 @@ __PACKAGE__->register_method({
description => 'A storage',
},
},
+ 'blocking-ha-resources' => {
+ description => "HA resources, which are blocking the"
+ . " VM from being migrated to the node.",
+ type => 'array',
+ optional => 1,
+ items => {
+ description => "A blocking HA resource",
+ type => 'object',
+ properties => {
+ sid => {
+ type => 'string',
+ description => "The blocking HA resource id",
+ },
+ cause => {
+ type => 'string',
+ description => "The reason why the HA"
+ . " resource is blocking the migration.",
+ enum => ['resource-affinity'],
+ },
+ },
+ },
+ },
},
description => "List of not allowed nodes with additional information.",
},
@@ -5146,6 +5168,17 @@ __PACKAGE__->register_method({
description =>
"Object of mapped resources with additional information such if they're live migratable.",
},
+ 'comigrated-ha-resources' => {
+ description => "HA resources, which will be migrated to the"
+ . " same target node as the VM, because these are in"
+ . " positive affinity with the VM.",
+ type => 'array',
+ optional => 1,
+ items => {
+ type => 'string',
+ description => "A comigrated HA resource",
+ },
+ },
},
},
code => sub {
@@ -5186,6 +5219,14 @@ __PACKAGE__->register_method({
my $storage_nodehash =
PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
+ my $comigrated_ha_resources = {};
+ my $blocking_ha_resources_by_node = {};
+
+ if (PVE::HA::Config::vm_is_ha_managed($vmid)) {
+ ($comigrated_ha_resources, $blocking_ha_resources_by_node) =
+ PVE::HA::Config::get_resource_motion_info("vm:$vmid");
+ }
+
my $nodelist = PVE::Cluster::get_nodelist();
for my $node ($nodelist->@*) {
next if $node eq $localnode;
@@ -5202,6 +5243,12 @@ __PACKAGE__->register_method({
$missing_mappings;
}
+ # extracting blocking resources for current node
+ if (my $blocking_ha_resources = $blocking_ha_resources_by_node->{$node}) {
+ $res->{not_allowed_nodes}->{$node}->{'blocking-ha-resources'} =
+ $blocking_ha_resources;
+ }
+
# if nothing came up, add it to the allowed nodes
if (scalar($res->{not_allowed_nodes}->{$node}->%*) == 0) {
push $res->{allowed_nodes}->@*, $node;
@@ -5215,6 +5262,8 @@ __PACKAGE__->register_method({
$res->{'mapped-resources'} = [sort keys $mapped_resources->%*];
$res->{'mapped-resource-info'} = $mapped_resources;
+ $res->{'comigrated-ha-resources'} = $comigrated_ha_resources;
+
return $res;
},
--
2.39.5
* Re: [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules
2025-07-04 18:20 ` [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules Daniel Kral
@ 2025-07-08 16:08 ` Shannon Sterz
2025-07-09 6:19 ` Friedrich Weber
0 siblings, 1 reply; 22+ messages in thread
From: Shannon Sterz @ 2025-07-08 16:08 UTC (permalink / raw)
To: Proxmox VE development discussion, Daniel Kral
On Fri Jul 4, 2025 at 8:20 PM CEST, Daniel Kral wrote:
> Add documentation about HA Resource Affinity rules, what effects those
> have on the CRS scheduler, and what users can expect when those are
> changed.
>
> There are also a few points on the rule conflicts/errors list which
> describe some conflicts that can arise from a mixed usage of HA Node
> Affinity rules and HA Resource Affinity rules.
>
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
> Makefile | 1 +
> gen-ha-rules-resource-affinity-opts.pl | 20 ++++
> ha-manager.adoc | 133 +++++++++++++++++++++++++
> ha-rules-resource-affinity-opts.adoc | 8 ++
> 4 files changed, 162 insertions(+)
> create mode 100755 gen-ha-rules-resource-affinity-opts.pl
> create mode 100644 ha-rules-resource-affinity-opts.adoc
>
> diff --git a/Makefile b/Makefile
> index c5e506e..4d9e2f0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -51,6 +51,7 @@ GEN_SCRIPTS= \
> gen-ha-resources-opts.pl \
> gen-ha-rules-node-affinity-opts.pl \
> gen-ha-rules-opts.pl \
> + gen-ha-rules-resource-affinity-opts.pl \
> gen-datacenter.cfg.5-opts.pl \
> gen-pct.conf.5-opts.pl \
> gen-pct-network-opts.pl \
> diff --git a/gen-ha-rules-resource-affinity-opts.pl b/gen-ha-rules-resource-affinity-opts.pl
> new file mode 100755
> index 0000000..5abed50
> --- /dev/null
> +++ b/gen-ha-rules-resource-affinity-opts.pl
> @@ -0,0 +1,20 @@
> +#!/usr/bin/perl
> +
> +use lib '.';
> +use strict;
> +use warnings;
> +use PVE::RESTHandler;
> +
> +use Data::Dumper;
> +
> +use PVE::HA::Rules;
> +use PVE::HA::Rules::ResourceAffinity;
> +
> +my $private = PVE::HA::Rules::private();
> +my $resource_affinity_rule_props = PVE::HA::Rules::ResourceAffinity::properties();
> +my $properties = {
> + resources => $private->{propertyList}->{resources},
> + $resource_affinity_rule_props->%*,
> +};
> +
> +print PVE::RESTHandler::dump_properties($properties);
> diff --git a/ha-manager.adoc b/ha-manager.adoc
> index ec26c22..8d06885 100644
> --- a/ha-manager.adoc
> +++ b/ha-manager.adoc
> @@ -692,6 +692,10 @@ include::ha-rules-opts.adoc[]
> | HA Rule Type | Description
> | `node-affinity` | Places affinity from one or more HA resources to one or
> more nodes.
> +| `resource-affinity` | Places affinity between two or more HA resources. The
> +affinity `separate` specifies that HA resources are to be kept on separate
> +nodes, while the affinity `together` specifies that HA resources are to be kept
> +on the same node.
here it's called "together" (or "separate")...
> |===========================================================
>
> [[ha_manager_node_affinity_rules]]
> @@ -758,6 +762,88 @@ Node Affinity Rule Properties
>
> include::ha-rules-node-affinity-opts.adoc[]
>
> +[[ha_manager_resource_affinity_rules]]
> +Resource Affinity Rules
> +^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Another common requirement is that two or more HA resources should run on
> +either the same node, or should be distributed on separate nodes. These are
> +also commonly called "Affinity/Anti-Affinity constraints".
> +
> +For example, suppose there is a lot of communication traffic between the HA
> +resources `vm:100` and `vm:200`, e.g., a web server communicating with a
nit: just a small heads up, we recommend avoiding "e.g." as it often gets
confused with "i.e." [1]. you could use `for example` instead to make
this a bit clearer (same below)
[1]: https://pve.proxmox.com/wiki/Technical_Writing_Style_Guide#Abbreviations
> +database server. If those HA resources are on separate nodes, this could
> +potentially result in a higher latency and unnecessary network load. Resource
> +affinity rules with the affinity `positive` implement the constraint to keep
> +the HA resources on the same node:
> +
> +----
> +# ha-manager rules add resource-affinity keep-together \
> + --affinity positive --resources vm:100,vm:200
> +----
... here it is specified as "positive"? did i miss something or is that
incorrect?
> +
> +NOTE: If there are two or more positive resource affinity rules, which have
> +common HA resources, then these are treated as a single positive resource
> +affinity rule. For example, if the HA resources `vm:100` and `vm:101` and the
> +HA resources `vm:101` and `vm:102` are each in a positive resource affinity
> +rule, then it is the same as if `vm:100`, `vm:101` and `vm:102` would have been
> +in a single positive resource affinity rule.
> +
> +However, suppose there are computationally expensive, and/or distributed
> +programs running on the HA resources `vm:200` and `ct:300`, e.g., sharded
> +database instances. In that case, running them on the same node could
> +potentially result in pressure on the hardware resources of the node and will
> +slow down the operations of these HA resources. Resource affinity rules with
> +the affinity `negative` implement the constraint to spread the HA resources on
> +separate nodes:
> +
> +----
> +# ha-manager rules add resource-affinity keep-separate \
> + --affinity negative --resources vm:200,ct:300
> +----
... same here with "separate" or "negative"
> +
> +Other than node affinity rules, resource affinity rules are strict by default,
> +i.e., if the constraints imposed by the resource affinity rules cannot be met
> +for a HA resource, the HA Manager will put the HA resource in recovery state in
> +case of a failover or in error state elsewhere.
> +
> +The above commands created the following rules in the rules configuration file:
> +
> +.Resource Affinity Rules Configuration Example (`/etc/pve/ha/rules.cfg`)
> +----
> +resource-affinity: keep-together
> + resources vm:100,vm:200
> + affinity positive
> +
> +resource-affinity: keep-separate
> + resources vm:200,ct:300
> + affinity negative
> +----
> +
> +Interactions between Positive and Negative Resource Affinity Rules
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> +
> +If there are HA resources in a positive resource affinity rule, which are also
> +part of a negative resource affinity rule, then all the other HA resources in
> +the positive resource affinity rule are in negative affinity with the HA
> +resources of these negative resource affinity rules as well.
> +
> +For example, if the HA resources `vm:100`, `vm:101`, and `vm:102` are in a
> +positive resource affinity rule, and `vm:100` is in a negative resource affinity
> +rule with the HA resource `ct:200`, then `vm:101` and `vm:102` are each in
> +negative resource affinity with `ct:200` as well.
> +
> +Note that if there are two or more HA resources in both a positive and negative
> +resource affinity rule, then those will be disabled as they cause a conflict:
> +Two or more HA resources cannot be kept on the same node and separated on
> +different nodes at the same time. For more information on these cases, see the
> +section about xref:ha_manager_rule_conflicts[rule conflicts and errors] below.
> +
> +Resource Affinity Rule Properties
> ++++++++++++++++++++++++++++++++++
> +
> +include::ha-rules-resource-affinity-opts.adoc[]
> +
> [[ha_manager_rule_conflicts]]
> Rule Conflicts and Errors
> ~~~~~~~~~~~~~~~~~~~~~~~~~
> @@ -774,6 +860,43 @@ Currently, HA rules are checked for the following feasibility tests:
> total. If two or more HA node affinity rules specify the same HA resource,
> these HA node affinity rules will be disabled.
>
> +* A HA resource affinity rule must specify at least two HA resources to be
> + feasible. If a HA resource affinity rule does specify only one HA resource,
nit: get rid of the "does", it makes this already very long and hard to
parse sentence even harder to read.
> + the HA resource affinity rule will be disabled.
> +
> +* A HA resource affinity rule must specify no more HA resources than there are
> + nodes in the cluster. If a HA resource affinity rule does specify more HA
same here
> + resources than there are in the cluster, the HA resource affinity rule will be
> + disabled.
> +
> +* A positive HA resource affinity rule cannot specify the same two or more HA
> + resources as a negative HA resources affinity rule. That is, two or more HA
> + resources cannot be kept together and separate at the same time. If any pair
> + of positive and negative HA resource affinity rules do specify the same two or
> + more HA resources, both HA resource affinity rules will be disabled.
> +
> +* A HA resource, which is already constrained by a HA node affinity rule, can
> + only be referenced by a HA resource affinity rule, if the HA node affinity
> + rule does only use a single priority group. That is, the specified nodes in
and here
> + the HA node affinity rule have the same priority. If one of the HA resources
> + in a HA resource affinity rule is constrainted by a HA node affinity rule with
typo: constrainted -> constrained
> + multiple priority groups, the HA resource affinity rule will be disabled.
> +
> +* The HA resources of a positive HA resource affinity rule, which are
> + constrained by HA node affinity rules, must have at least one common node,
> + where the HA resources are allowed to run on. Otherwise, the HA resources
> + could only run on separate nodes. In other words, if two or more HA resources
> + of a positive HA resource affinity rule are constrained to different nodes,
> + the positive HA resource affinity rule will be disabled.
> +
> +* The HA resources of a negative HA resource affinity rule, which are
> + constrained by HA node affinity rules, must have at least enough nodes to
> + separate these constrained HA resources on. Otherwise, the HA resources do not
nit: the "on" here is not necessary.
> + have enough nodes to be separated on. In other words, if two or more HA
same here.
> + resources of a negative HA resource affinity rule are constrained to less
> + nodes than needed to separate them on, the negative HA resource affinity rule
and here
> + will be disabled.
> +
> [[ha_manager_fencing]]
> Fencing
> -------
> @@ -1205,6 +1328,16 @@ The CRS is currently used at the following scheduling points:
> algorithm to ensure that these HA resources are assigned according to their
> node and priority constraints.
>
> +** Positive resource affinity rules: If a positive resource affinity rule is
> + created or HA resources are added to an existing positive resource affinity
> + rule, the HA stack will use the CRS algorithm to ensure that these HA
> + resources are moved to a common node.
> +
> +** Negative resource affinity rules: If a negative resource affinity rule is
> + created or HA resources are added to an existing negative resource affinity
> + rule, the HA stack will use the CRS algorithm to ensure that these HA
> + resources are moved to separate nodes.
> +
> - HA service stopped -> start transition (opt-in). Requesting that a stopped
> service should be started is an good opportunity to check for the best suited
> node as per the CRS algorithm, as moving stopped services is cheaper to do
> diff --git a/ha-rules-resource-affinity-opts.adoc b/ha-rules-resource-affinity-opts.adoc
> new file mode 100644
> index 0000000..596ec3c
> --- /dev/null
> +++ b/ha-rules-resource-affinity-opts.adoc
> @@ -0,0 +1,8 @@
> +`affinity`: `<negative | positive>` ::
> +
> +Describes whether the HA resources are supposed to be kept on the same node ('positive'), or are supposed to be kept on separate nodes ('negative').
> +
> +`resources`: `<type>:<name>{,<type>:<name>}*` ::
> +
> +List of HA resource IDs. This consists of a list of resource types followed by a resource specific name separated with a colon (example: vm:100,ct:101).
> +
* Re: [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules
2025-07-08 16:08 ` Shannon Sterz
@ 2025-07-09 6:19 ` Friedrich Weber
0 siblings, 0 replies; 22+ messages in thread
From: Friedrich Weber @ 2025-07-09 6:19 UTC (permalink / raw)
To: Proxmox VE development discussion, Shannon Sterz, Daniel Kral
On 08/07/2025 18:08, Shannon Sterz wrote:
> On Fri Jul 4, 2025 at 8:20 PM CEST, Daniel Kral wrote:
>> Add documentation about HA Resource Affinity rules, what effects those
>> have on the CRS scheduler, and what users can expect when those are
>> changed.
>>
>> There are also a few points on the rule conflicts/errors list which
>> describe some conflicts that can arise from a mixed usage of HA Node
>> Affinity rules and HA Resource Affinity rules.
>>
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>> Makefile | 1 +
>> gen-ha-rules-resource-affinity-opts.pl | 20 ++++
>> ha-manager.adoc | 133 +++++++++++++++++++++++++
>> ha-rules-resource-affinity-opts.adoc | 8 ++
>> 4 files changed, 162 insertions(+)
>> create mode 100755 gen-ha-rules-resource-affinity-opts.pl
>> create mode 100644 ha-rules-resource-affinity-opts.adoc
>>
>> diff --git a/Makefile b/Makefile
>> index c5e506e..4d9e2f0 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -51,6 +51,7 @@ GEN_SCRIPTS= \
>> gen-ha-resources-opts.pl \
>> gen-ha-rules-node-affinity-opts.pl \
>> gen-ha-rules-opts.pl \
>> + gen-ha-rules-resource-affinity-opts.pl \
>> gen-datacenter.cfg.5-opts.pl \
>> gen-pct.conf.5-opts.pl \
>> gen-pct-network-opts.pl \
>> diff --git a/gen-ha-rules-resource-affinity-opts.pl b/gen-ha-rules-resource-affinity-opts.pl
>> new file mode 100755
>> index 0000000..5abed50
>> --- /dev/null
>> +++ b/gen-ha-rules-resource-affinity-opts.pl
>> @@ -0,0 +1,20 @@
>> +#!/usr/bin/perl
>> +
>> +use lib '.';
>> +use strict;
>> +use warnings;
>> +use PVE::RESTHandler;
>> +
>> +use Data::Dumper;
>> +
>> +use PVE::HA::Rules;
>> +use PVE::HA::Rules::ResourceAffinity;
>> +
>> +my $private = PVE::HA::Rules::private();
>> +my $resource_affinity_rule_props = PVE::HA::Rules::ResourceAffinity::properties();
>> +my $properties = {
>> + resources => $private->{propertyList}->{resources},
>> + $resource_affinity_rule_props->%*,
>> +};
>> +
>> +print PVE::RESTHandler::dump_properties($properties);
>> diff --git a/ha-manager.adoc b/ha-manager.adoc
>> index ec26c22..8d06885 100644
>> --- a/ha-manager.adoc
>> +++ b/ha-manager.adoc
>> @@ -692,6 +692,10 @@ include::ha-rules-opts.adoc[]
>> | HA Rule Type | Description
>> | `node-affinity` | Places affinity from one or more HA resources to one or
>> more nodes.
>> +| `resource-affinity` | Places affinity between two or more HA resources. The
>> +affinity `separate` specifies that HA resources are to be kept on separate
>> +nodes, while the affinity `together` specifies that HA resources are to be kept
>> +on the same node.
>
> here it's calleged "together" (or "separate")...
>
>> |===========================================================
>>
>> [[ha_manager_node_affinity_rules]]
>> @@ -758,6 +762,88 @@ Node Affinity Rule Properties
>>
>> include::ha-rules-node-affinity-opts.adoc[]
>>
>> +[[ha_manager_resource_affinity_rules]]
>> +Resource Affinity Rules
>> +^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +Another common requirement is that two or more HA resources should run on
>> +either the same node, or should be distributed on separate nodes. These are
>> +also commonly called "Affinity/Anti-Affinity constraints".
>> +
>> +For example, suppose there is a lot of communication traffic between the HA
>> +resources `vm:100` and `vm:200`, e.g., a web server communicating with a
>
> nit: just a small heads up, we recommend avoid "e.g." as it often gets
> confused with "i.e." [1]. you could use `for example` instead to make
> this a bit clearer (same below)
>
> [1]: https://pve.proxmox.com/wiki/Technical_Writing_Style_Guide#Abbreviations
>
>> +database server. If those HA resources are on separate nodes, this could
>> +potentially result in a higher latency and unnecessary network load. Resource
>> +affinity rules with the affinity `positive` implement the constraint to keep
>> +the HA resources on the same node:
>> +
>> +----
>> +# ha-manager rules add resource-affinity keep-together \
>> + --affinity positive --resources vm:100,vm:200
>> +----
>
> ... here it is specified as "positive"? did i miss something or is that
> incorrect?
Good catch, but I think it should be "positive"/"negative", so it's
"together" and "separate" that are outdated. A lot of the naming was
changed between v2 and v3, including "together"->"positive", and
"separate"->"negative" [1] so they're probably leftovers from before the
rename.
[1]
https://lore.proxmox.com/pve-devel/7fb94369-d8b6-47c6-b36c-428db5bb85de@proxmox.com/
>
>> +
>> +NOTE: If there are two or more positive resource affinity rules, which have
>> +common HA resources, then these are treated as a single positive resource
>> +affinity rule. For example, if the HA resources `vm:100` and `vm:101` and the
>> +HA resources `vm:101` and `vm:102` are each in a positive resource affinity
>> +rule, then it is the same as if `vm:100`, `vm:101` and `vm:102` would have been
>> +in a single positive resource affinity rule.
>> +
>> +However, suppose there are computationally expensive, and/or distributed
>> +programs running on the HA resources `vm:200` and `ct:300`, e.g., sharded
>> +database instances. In that case, running them on the same node could
>> +potentially result in pressure on the hardware resources of the node and will
>> +slow down the operations of these HA resources. Resource affinity rules with
>> +the affinity `negative` implement the constraint to spread the HA resources on
>> +separate nodes:
>> +
>> +----
>> +# ha-manager rules add resource-affinity keep-separate \
>> + --affinity negative --resources vm:200,ct:300
>> +----
>
> ... same here with "separate" or "negative"
>
>> +
>> +Other than node affinity rules, resource affinity rules are strict by default,
>> +i.e., if the constraints imposed by the resource affinity rules cannot be met
>> +for a HA resource, the HA Manager will put the HA resource in recovery state in
>> +case of a failover or in error state elsewhere.
>> +
>> +The above commands created the following rules in the rules configuration file:
>> +
>> +.Resource Affinity Rules Configuration Example (`/etc/pve/ha/rules.cfg`)
>> +----
>> +resource-affinity: keep-together
>> + resources vm:100,vm:200
>> + affinity positive
>> +
>> +resource-affinity: keep-separate
>> + resources vm:200,ct:300
>> + affinity negative
>> +----
>> +
>> +Interactions between Positive and Negative Resource Affinity Rules
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> +
>> +If there are HA resources in a positive resource affinity rule, which are also
>> +part of a negative resource affinity rule, then all the other HA resources in
>> +the positive resource affinity rule are in negative affinity with the HA
>> +resources of these negative resource affinity rules as well.
>> +
>> +For example, if the HA resources `vm:100`, `vm:101`, and `vm:102` are in a
>> +positive resource affinity rule, and `vm:100` is in a negative resource affinity
>> +rule with the HA resource `ct:200`, then `vm:101` and `vm:102` are each in
>> +negative resource affinity with `ct:200` as well.
>> +
>> +Note that if there are two or more HA resources in both a positive and negative
>> +resource affinity rule, then those will be disabled as they cause a conflict:
>> +Two or more HA resources cannot be kept on the same node and separated on
>> +different nodes at the same time. For more information on these cases, see the
>> +section about xref:ha_manager_rule_conflicts[rule conflicts and errors] below.
>> +
>> +Resource Affinity Rule Properties
>> ++++++++++++++++++++++++++++++++++
>> +
>> +include::ha-rules-resource-affinity-opts.adoc[]
>> +
>> [[ha_manager_rule_conflicts]]
>> Rule Conflicts and Errors
>> ~~~~~~~~~~~~~~~~~~~~~~~~~
>> @@ -774,6 +860,43 @@ Currently, HA rules are checked for the following feasibility tests:
>> total. If two or more HA node affinity rules specify the same HA resource,
>> these HA node affinity rules will be disabled.
>>
>> +* A HA resource affinity rule must specify at least two HA resources to be
>> + feasible. If a HA resource affinity rule does specify only one HA resource,
>
> nit: get rid of the "does" it makes this already very long and hard to
> parse sentence ven hader to read.
>
>> + the HA resource affinity rule will be disabled.
>> +
>> +* A HA resource affinity rule must specify no more HA resources than there are
>> + nodes in the cluster. If a HA resource affinity rule does specify more HA
>
> same here
>
>> + resources than there are in the cluster, the HA resource affinity rule will be
>> + disabled.
>> +
>> +* A positive HA resource affinity rule cannot specify the same two or more HA
>> + resources as a negative HA resources affinity rule. That is, two or more HA
>> + resources cannot be kept together and separate at the same time. If any pair
>> + of positive and negative HA resource affinity rules do specify the same two or
>> + more HA resources, both HA resource affinity rules will be disabled.
>> +
>> +* A HA resource, which is already constrained by a HA node affinity rule, can
>> + only be referenced by a HA resource affinity rule, if the HA node affinity
>> + rule does only use a single priority group. That is, the specified nodes in
>
> and here
>
>> + the HA node affinity rule have the same priority. If one of the HA resources
>> + in a HA resource affinity rule is constrainted by a HA node affinity rule with
>
> typo: constrainted -> constrained
>
>> + multiple priority groups, the HA resource affinity rule will be disabled.
>> +
>> +* The HA resources of a positive HA resource affinity rule, which are
>> + constrained by HA node affinity rules, must have at least one common node,
>> + where the HA resources are allowed to run on. Otherwise, the HA resources
>> + could only run on separate nodes. In other words, if two or more HA resources
>> + of a positive HA resource affinity rule are constrained to different nodes,
>> + the positive HA resource affinity rule will be disabled.
>> +
>> +* The HA resources of a negative HA resource affinity rule, which are
>> + constrained by HA node affinity rules, must have at least enough nodes to
>> + separate these constrained HA resources on. Otherwise, the HA resources do not
>
> nit: the "on" here is not necessary.
>
>> + have enough nodes to be separated on. In other words, if two or more HA
>
> same here.
>
>> + resources of a negative HA resource affinity rule are constrained to less
>> + nodes than needed to separate them on, the negative HA resource affinity rule
>
> and here
>
>> + will be disabled.
>> +
>> [[ha_manager_fencing]]
>> Fencing
>> -------
>> @@ -1205,6 +1328,16 @@ The CRS is currently used at the following scheduling points:
>> algorithm to ensure that these HA resources are assigned according to their
>> node and priority constraints.
>>
>> +** Positive resource affinity rules: If a positive resource affinity rule is
>> + created or HA resources are added to an existing positive resource affinity
>> + rule, the HA stack will use the CRS algorithm to ensure that these HA
>> + resources are moved to a common node.
>> +
>> +** Negative resource affinity rules: If a negative resource affinity rule is
>> + created or HA resources are added to an existing negative resource affinity
>> + rule, the HA stack will use the CRS algorithm to ensure that these HA
>> + resources are moved to separate nodes.
>> +
>> - HA service stopped -> start transition (opt-in). Requesting that a stopped
>> service should be started is an good opportunity to check for the best suited
>> node as per the CRS algorithm, as moving stopped services is cheaper to do
>> diff --git a/ha-rules-resource-affinity-opts.adoc b/ha-rules-resource-affinity-opts.adoc
>> new file mode 100644
>> index 0000000..596ec3c
>> --- /dev/null
>> +++ b/ha-rules-resource-affinity-opts.adoc
>> @@ -0,0 +1,8 @@
>> +`affinity`: `<negative | positive>` ::
>> +
>> +Describes whether the HA resources are supposed to be kept on the same node ('positive'), or are supposed to be kept on separate nodes ('negative').
>> +
>> +`resources`: `<type>:<name>{,<type>:<name>}*` ::
>> +
>> +List of HA resource IDs. This consists of a list of resource types followed by a resource specific name separated with a colon (example: vm:100,ct:101).
>> +
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
Thread overview: 22+ messages
2025-07-04 18:20 [pve-devel] [PATCH container/docs/ha-manager/manager/qemu-server v3 00/19] HA resource affinity rules Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 01/13] rules: introduce plugin-specific canonicalize routines Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 02/13] rules: add haenv node list to the rules' canonicalization stage Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 03/13] rules: introduce resource affinity rule plugin Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 04/13] rules: add global checks between node and resource affinity rules Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 05/13] usage: add information about a service's assigned nodes Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 06/13] manager: apply resource affinity rules when selecting service nodes Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 07/13] manager: handle resource affinity rules in manual migrations Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 08/13] sim: resources: add option to limit start and migrate tries to node Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 09/13] test: ha tester: add test cases for negative resource affinity rules Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 10/13] test: ha tester: add test cases for positive " Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 11/13] test: ha tester: add test cases for static scheduler resource affinity Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 12/13] test: rules: add test cases for resource affinity rules Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH ha-manager v3 13/13] api: resources: add check for resource affinity in resource migrations Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules Daniel Kral
2025-07-08 16:08 ` Shannon Sterz
2025-07-09 6:19 ` Friedrich Weber
2025-07-04 18:20 ` [pve-devel] [PATCH manager v3 1/3] ui: ha: rules: add " Daniel Kral
2025-07-04 18:20 ` [pve-devel] [PATCH manager v3 2/3] ui: migrate: lxc: display precondition messages for ha resource affinity Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH manager v3 3/3] ui: migrate: vm: " Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH container v3 1/1] api: introduce migration preconditions api endpoint Daniel Kral
2025-07-04 18:21 ` [pve-devel] [PATCH qemu-server v3 1/1] api: migration preconditions: add checks for ha resource affinity rules Daniel Kral