public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
@ 2025-03-25 15:12 Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH cluster 1/1] cfs: add 'ha/rules.cfg' to observed files Daniel Kral
                   ` (17 more replies)
  0 siblings, 18 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

This RFC patch series is a draft implementation that allows users to
specify colocation rules (or affinity/anti-affinity) for the HA Manager,
so that two or more services are kept either together or apart with
respect to each other on service recovery, or when auto-rebalancing on
service start is enabled.

I chose the name "colocation" over affinity/anti-affinity, since it
states a bit more concisely that this is about co-locating services with
each other, in contrast to locating services on nodes, but I have no
hard feelings about changing it (same goes for any other names in this
series).

Many thanks to @Thomas, @Fiona, @Friedrich, and @Hannes Duerr for the
discussions about this feature off-list!


Recap: HA groups
----------------

The HA Manager currently allows a service to be assigned to one HA
group, which essentially implements an affinity to a set of nodes. This
affinity can either be unrestricted or restricted, where the former
allows recovery to nodes outside of the HA group's nodes, if those are
currently unavailable.

This allows users to constrain the set of nodes that can be selected as
the starting and/or recovery node. Furthermore, each node in an HA group
can have an individual priority. This further constrains the set of
possible recovery nodes to the subset of online nodes in the highest
priority group.
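
For reference, such a group is defined in `ha/groups.cfg` roughly like
below (the group name, nodes, and priorities are made up for
illustration):

group: prefer-node12
	nodes node1:2,node2:1
	restricted 1
	nofailback 0

Here, recovery is restricted to node1 and node2, with node1 preferred as
long as it is online.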


Introduction
------------

Colocation is the concept of an inter-service affinity relationship,
which can either be positive (keep services together) or negative (keep
services apart). This is in contrast with the service-nodes affinity
relationship implemented by HA groups.

In addition to the positive-negative dimension, there's also the
mandatory-optional axis. Currently, this is a binary setting: a service
that fails to meet the colocation relationship is either

- (1) kept in recovery for a mandatory colocation rule, or
- (2) migrated anyway, ignoring the optional colocation rule.


Motivation
----------

There are many different use cases to support colocation, but two simple
examples that come to mind are:

- Two or more services need to communicate with each other very
  frequently. To reduce the communication path length and therefore
  hopefully the latency, keep them together on one node.

- Two or more services need a lot of computational resources and will
  therefore consume much of the assigned node's resource capacity. To
  reduce resource starvation and memory stalls, keep them on separate
  nodes, so that each has enough resources for itself.

And some more concrete use cases from current HA Manager users:

- "For example: let's say we have three DB VMs (DB nodes in a cluster)
  which we want to run on ANY PVE host, but we don't want them to be on
  the same host." [0]

- "An example is: When Server from the DMZ zone start on the same host
  like the firewall or another example the application servers start on
  the same host like the sql server. Services they depend on each other
  have short latency to each other." [1]


HA Rules
--------

To implement colocation, this patch series introduces HA rules, which
allow users to specify colocation requirements on services. These are
implemented with the widely used section config, where each type of
rule is an individual plugin (for now only 'colocation').

This introduces some small initial complexity for testing the
satisfiability of the rules, but it keeps the constraint interface
extensible and hopefully allows easier reasoning about the node
selection process with the added constraint rules in the future.
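
As a rough usage sketch (the service ID is made up and $rules is assumed
to have been read and parsed already via the config helpers added later
in this series), a consumer can iterate over all rules of a given type
that involve a specific service like this:

# visit all colocation rules that involve vm:101
PVE::HA::Rules::foreach_service_rule($rules, sub {
    my ($rule, $ruleid) = @_;

    print "rule '$ruleid' applies to vm:101 (affinity: $rule->{affinity})\n";
}, { sid => 'vm:101', type => 'colocation' });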

Colocation Rules
----------------

The two properties of colocation rules, as described in the
introduction, are rather straightforward. A typical colocation rule
inside of the config would look like the following:

colocation: some-lonely-services
	services vm:101,vm:103,ct:909
	affinity separate
	strict 1

This means that the three services vm:101, vm:103 and ct:909 must be
kept on separate nodes. I'm very keen on naming suggestions, since I
think there could be a better word than 'affinity' here. I played around
with 'keep-services', since then it would always read something like
'keep-services separate', which is very declarative, but this might
suggest to too many users that this is a binary option (I mean it is,
but not with the values 0 and 1).
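
For comparison, a positive rule that keeps services together only
differs in the affinity value (service IDs again made up):

colocation: keep-db-and-app-together
	services vm:201,ct:202
	affinity together
	strict 1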

Satisfiability and Inference
----------------------------

Since rules allow for more complexity, it is necessary to check whether
rules can be (1) satisfied, (2) simplified, and (3) used to infer other
constraints. There's a static part (i.e. the configuration file) and a
dynamic part (i.e. deciding the next node) to this.

| Satisfiability
----------

Statically, colocation rules currently must satisfy:

- Two or more services must not be in both a positive and negative
  colocation rule.

- Two or more services in a positive colocation rule must not be in
  restricted HA groups with disjoint node sets.

- Two or more services in a negative colocation rule, which are in
  restricted HA groups, must have at least as many statically available
  nodes as node-restricted services.

The first is obvious. The second one asserts that there is at least one
common node that can be recovered to. The third one asserts that there
are enough nodes to select from for the recovery of the services which
are restricted to a set of nodes.

Of course, three services in a negative colocation relation cannot all
be recovered if there are only three cluster nodes and one of them
fails, but the static part is only a best effort to catch obvious
misconfigurations.
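
As a rough sketch of how the second check can be done with the hash set
helpers added in this series (simplified from the actual implementation
in the colocation plugin; iteration over all rules and error handling
are omitted):

# is there still a common node for a single positive colocation rule,
# given the restricted HA groups of its services?
my $allowed_nodes;

for my $sid (keys %{$rule->{services}}) {
    my $groupid = $services->{$sid}->{group};
    next if !$groupid;

    my $group = $groups->{ids}->{$groupid};
    next if !$group || !$group->{restricted}; # unrestricted services don't constrain

    my $group_nodes = { map { $_ => 1 } keys %{$group->{nodes}} };
    $allowed_nodes = defined($allowed_nodes)
	? PVE::HA::Tools::intersect($allowed_nodes, $group_nodes)
	: $group_nodes;
}

# satisfiable if no service was node-restricted or a common node remains
my $satisfiable = !defined($allowed_nodes) || scalar(keys %$allowed_nodes) >= 1;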

| Canonicalization
----------

Additionally, colocation rules are currently simplified as follows:

- If there are multiple positive colocation rules with common services
  and the same strictness, these are merged into a single positive
  colocation rule.
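
For example (service IDs made up), the two strict positive rules

colocation: keep-together1
	services vm:101,vm:102
	affinity together
	strict 1

colocation: keep-together2
	services vm:102,vm:103
	affinity together
	strict 1

are effectively treated as a single rule that keeps vm:101, vm:102 and
vm:103 together.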

| Inference rules
----------

There are currently no inference rules implemented for the RFC, but
there could be potential to further simplify some code paths in the
future, e.g. a positive colocation rule where one service is part of a
restricted HA group effectively makes the other services in that rule
part of this HA group as well.

I leave this open for discussion here.


Special negative colocation scenarios
-------------------------------------

Just to be aware of them: there's a distinction between the following
two sets of negative colocation rules:

colocation: separate-vms
	services vm:101,vm:102,vm:103
	affinity separate
	strict 1

and

colocation: separate-vms1
	services vm:101,vm:102
	affinity separate
	strict 1

colocation: separate-vms2
	services vm:102,vm:103
	affinity separate
	strict 1

The first keeps all three services separate from each other, while the
second only keeps the services of each pair separate from each other,
so vm:101 and vm:103 might be migrated to the same node.


Test cases
----------

The test cases are quite straightforward and I designed them so they
would fail without the colocation rules applied. This can be verified
by removing the `apply_colocation_rules(...)` call from the
`select_service_node()` body.
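
For illustration, the rules_config of such a test case contains just the
rule under test, e.g. something like the following for one of the strict
separate tests (rule name and service IDs made up):

colocation: vms-must-be-kept-separate
	services vm:101,vm:102
	affinity separate
	strict 1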

They are not completely exhaustive and I didn't implement test cases
with HA groups yet (both for the ha-tester and the rules config tests),
but these would be implemented post-RFC.

Also, the loose tests are complete copies of their strict counterparts,
where only the rules are changed from 'strict 1' to 'strict 0' and the
expected log is adapted accordingly.


TODO
----

- WebGUI Integration

- User Documentation

- Add test cases with HA groups and more complex scenarios

- CLI / API endpoints for CRUD and maybe verification

- Clean up the `select_service_node` signature into two structs, as
  suggested by @Thomas in [3]

Additional and/or future ideas
------------------------------

- Transforming HA groups to location rules (see comment below).

- Make recomputing the online node usage more granular.

- Add information about overall free node resources to improve the
  decision heuristic when recovering services to nodes.

- Improve recovery node selection for optional positive colocation.
  This correlates with the idea about free node resources above.

- When deciding the recovery node for positively colocated services,
  account for the needed resources of all to-be-migrated services rather
  than just the first one. This is a non-trivial problem, as we
  currently solve it as an online bin covering problem, i.e. selecting a
  node for each service alone instead of for all services together.

- When migrating a service manually, migrate the colocated services too.
  But this would also mean that we need to check whether a migration is
  legal according to the colocation rules, which we do not do yet for HA
  groups.

- Dynamic colocation rule health statistics (e.g. warn if a colocation
  rule is currently not satisfiable), e.g. in the WebGUI and/or API.

- Property for mandatory colocation rules to specify whether all
  services should be stopped if the rule cannot be satisfied.


Comment about HA groups -> Location Rules
-----------------------------------------

This part is not really part of the patch series, but it is still worth
an on-list discussion.

I'd like to suggest also transforming the existing HA groups into
location rules, if the rule concept turns out to be a good fit for the
colocation feature in the HA Manager, as HA groups seem to integrate
quite easily into this concept.

This would make service-node relationships a little more flexible for
users, and we'd be able to have both configurable / visible in the same
WebUI view, API endpoint, and configuration file. Also, some code paths
could be a little more concise, e.g. checking changes to constraints and
canonicalizing the rules config.

The how should be rather straightforward for the obvious use cases:

- Services in unrestricted HA groups -> Location rules with the nodes of
  the HA group; We could either split each node priority group into
  separate location rules (with each having their score / weight) or
  keep the input format of HA groups with a list of
  `<node>(:<priority>)` in each rule

- Services in restricted HA groups -> Same as above, but also using
  either `+inf` for a mandatory location rule or a `strict` property,
  depending on how we decide on the colocation rule properties

This would allow most of the use cases of HA groups to be easily
migrated to location rules. We could also keep the inference of the
'default group' for unrestricted HA groups (any node that is available
is added as a group member with priority -1).
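
Purely as an illustration of that mapping (none of this syntax is
decided yet; the rule type, property names, and IDs below are made up),
a restricted HA group with nodes 'node1:2,node2:1' containing vm:101
could end up as something like:

location: prefer-node12
	services vm:101
	nodes node1:2,node2:1
	strict 1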

The only thing I'm unsure about is how we would migrate the
`nofailback` option, since this operates on the group level. If we keep
the `<node>(:<priority>)` syntax and restrict each service to be part of
only one location rule, it'd be easy to have the same flag. If we go
with multiple location rules per service, each having a score or weight
(for the priority), then we wouldn't be able to have this flag anymore.
I think we could keep the semantics if we move this flag to the service
config, but I'm thankful for any comments on this.

[0] https://clusterlabs.org/projects/pacemaker/doc/3.0/Pacemaker_Explained/html/constraints.html#colocation-properties
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=5260
[2] https://bugzilla.proxmox.com/show_bug.cgi?id=5332
[3] https://lore.proxmox.com/pve-devel/c8fa7b8c-fb37-5389-1302-2002780d4ee2@proxmox.com/

Diffstat
--------

pve-cluster:

Daniel Kral (1):
  cfs: add 'ha/rules.cfg' to observed files

 src/PVE/Cluster.pm  | 1 +
 src/pmxcfs/status.c | 1 +
 2 files changed, 2 insertions(+)


pve-ha-manager:

Daniel Kral (15):
  ignore output of fence config tests in tree
  tools: add hash set helper subroutines
  usage: add get_service_node and pin_service_node methods
  add rules section config base plugin
  rules: add colocation rule plugin
  config, env, hw: add rules read and parse methods
  manager: read and update rules config
  manager: factor out prioritized nodes in select_service_node
  manager: apply colocation rules when selecting service nodes
  sim: resources: add option to limit start and migrate tries to node
  test: ha tester: add test cases for strict negative colocation rules
  test: ha tester: add test cases for strict positive colocation rules
  test: ha tester: add test cases for loose colocation rules
  test: ha tester: add test cases in more complex scenarios
  test: add test cases for rules config

 .gitignore                                    |   3 +
 debian/pve-ha-manager.install                 |   2 +
 src/PVE/HA/Config.pm                          |  12 +
 src/PVE/HA/Env.pm                             |   6 +
 src/PVE/HA/Env/PVE2.pm                        |  13 +
 src/PVE/HA/Makefile                           |   3 +-
 src/PVE/HA/Manager.pm                         | 235 ++++++++++-
 src/PVE/HA/Rules.pm                           | 118 ++++++
 src/PVE/HA/Rules/Colocation.pm                | 391 ++++++++++++++++++
 src/PVE/HA/Rules/Makefile                     |   6 +
 src/PVE/HA/Sim/Env.pm                         |  15 +
 src/PVE/HA/Sim/Hardware.pm                    |  15 +
 src/PVE/HA/Sim/Resources/VirtFail.pm          |  37 +-
 src/PVE/HA/Tools.pm                           |  53 +++
 src/PVE/HA/Usage.pm                           |  12 +
 src/PVE/HA/Usage/Basic.pm                     |  15 +
 src/PVE/HA/Usage/Static.pm                    |  14 +
 src/test/Makefile                             |   4 +-
 .../connected-positive-colocations.cfg        |  34 ++
 .../connected-positive-colocations.cfg.expect |  54 +++
 .../rules_cfgs/illdefined-colocations.cfg     |   9 +
 .../illdefined-colocations.cfg.expect         |  12 +
 .../inner-inconsistent-colocations.cfg        |  14 +
 .../inner-inconsistent-colocations.cfg.expect |  13 +
 .../test-colocation-loose-separate1/README    |  13 +
 .../test-colocation-loose-separate1/cmdlist   |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  60 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-loose-separate4/README    |  17 +
 .../test-colocation-loose-separate4/cmdlist   |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  73 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-loose-together1/README    |  11 +
 .../test-colocation-loose-together1/cmdlist   |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  66 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-loose-together3/README    |  16 +
 .../test-colocation-loose-together3/cmdlist   |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  93 +++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   8 +
 .../test-colocation-strict-separate1/README   |  13 +
 .../test-colocation-strict-separate1/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  60 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-strict-separate2/README   |  15 +
 .../test-colocation-strict-separate2/cmdlist  |   4 +
 .../hardware_status                           |   7 +
 .../log.expect                                |  90 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |  10 +
 .../test-colocation-strict-separate3/README   |  16 +
 .../test-colocation-strict-separate3/cmdlist  |   4 +
 .../hardware_status                           |   7 +
 .../log.expect                                | 110 +++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |  10 +
 .../test-colocation-strict-separate4/README   |  17 +
 .../test-colocation-strict-separate4/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  69 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-strict-separate5/README   |  11 +
 .../test-colocation-strict-separate5/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  56 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   9 +
 .../service_config                            |   5 +
 .../test-colocation-strict-together1/README   |  11 +
 .../test-colocation-strict-together1/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  66 +++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-strict-together2/README   |  11 +
 .../test-colocation-strict-together2/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  80 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   8 +
 .../test-colocation-strict-together3/README   |  17 +
 .../test-colocation-strict-together3/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  89 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   8 +
 .../test-crs-static-rebalance-coloc1/README   |  26 ++
 .../test-crs-static-rebalance-coloc1/cmdlist  |   4 +
 .../datacenter.cfg                            |   6 +
 .../hardware_status                           |   5 +
 .../log.expect                                | 120 ++++++
 .../manager_status                            |   1 +
 .../rules_config                              |  24 ++
 .../service_config                            |  10 +
 .../static_service_stats                      |  10 +
 .../test-crs-static-rebalance-coloc2/README   |  16 +
 .../test-crs-static-rebalance-coloc2/cmdlist  |   4 +
 .../datacenter.cfg                            |   6 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  86 ++++
 .../manager_status                            |   1 +
 .../rules_config                              |  14 +
 .../service_config                            |   5 +
 .../static_service_stats                      |   5 +
 .../test-crs-static-rebalance-coloc3/README   |  14 +
 .../test-crs-static-rebalance-coloc3/cmdlist  |   4 +
 .../datacenter.cfg                            |   6 +
 .../hardware_status                           |   7 +
 .../log.expect                                | 156 +++++++
 .../manager_status                            |   1 +
 .../rules_config                              |  49 +++
 .../service_config                            |   7 +
 .../static_service_stats                      |   5 +
 src/test/test_failover1.pl                    |   4 +-
 src/test/test_rules_config.pl                 | 100 +++++
 137 files changed, 3113 insertions(+), 20 deletions(-)
 create mode 100644 src/PVE/HA/Rules.pm
 create mode 100644 src/PVE/HA/Rules/Colocation.pm
 create mode 100644 src/PVE/HA/Rules/Makefile
 create mode 100644 src/test/rules_cfgs/connected-positive-colocations.cfg
 create mode 100644 src/test/rules_cfgs/connected-positive-colocations.cfg.expect
 create mode 100644 src/test/rules_cfgs/illdefined-colocations.cfg
 create mode 100644 src/test/rules_cfgs/illdefined-colocations.cfg.expect
 create mode 100644 src/test/rules_cfgs/inner-inconsistent-colocations.cfg
 create mode 100644 src/test/rules_cfgs/inner-inconsistent-colocations.cfg.expect
 create mode 100644 src/test/test-colocation-loose-separate1/README
 create mode 100644 src/test/test-colocation-loose-separate1/cmdlist
 create mode 100644 src/test/test-colocation-loose-separate1/hardware_status
 create mode 100644 src/test/test-colocation-loose-separate1/log.expect
 create mode 100644 src/test/test-colocation-loose-separate1/manager_status
 create mode 100644 src/test/test-colocation-loose-separate1/rules_config
 create mode 100644 src/test/test-colocation-loose-separate1/service_config
 create mode 100644 src/test/test-colocation-loose-separate4/README
 create mode 100644 src/test/test-colocation-loose-separate4/cmdlist
 create mode 100644 src/test/test-colocation-loose-separate4/hardware_status
 create mode 100644 src/test/test-colocation-loose-separate4/log.expect
 create mode 100644 src/test/test-colocation-loose-separate4/manager_status
 create mode 100644 src/test/test-colocation-loose-separate4/rules_config
 create mode 100644 src/test/test-colocation-loose-separate4/service_config
 create mode 100644 src/test/test-colocation-loose-together1/README
 create mode 100644 src/test/test-colocation-loose-together1/cmdlist
 create mode 100644 src/test/test-colocation-loose-together1/hardware_status
 create mode 100644 src/test/test-colocation-loose-together1/log.expect
 create mode 100644 src/test/test-colocation-loose-together1/manager_status
 create mode 100644 src/test/test-colocation-loose-together1/rules_config
 create mode 100644 src/test/test-colocation-loose-together1/service_config
 create mode 100644 src/test/test-colocation-loose-together3/README
 create mode 100644 src/test/test-colocation-loose-together3/cmdlist
 create mode 100644 src/test/test-colocation-loose-together3/hardware_status
 create mode 100644 src/test/test-colocation-loose-together3/log.expect
 create mode 100644 src/test/test-colocation-loose-together3/manager_status
 create mode 100644 src/test/test-colocation-loose-together3/rules_config
 create mode 100644 src/test/test-colocation-loose-together3/service_config
 create mode 100644 src/test/test-colocation-strict-separate1/README
 create mode 100644 src/test/test-colocation-strict-separate1/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate1/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate1/log.expect
 create mode 100644 src/test/test-colocation-strict-separate1/manager_status
 create mode 100644 src/test/test-colocation-strict-separate1/rules_config
 create mode 100644 src/test/test-colocation-strict-separate1/service_config
 create mode 100644 src/test/test-colocation-strict-separate2/README
 create mode 100644 src/test/test-colocation-strict-separate2/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate2/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate2/log.expect
 create mode 100644 src/test/test-colocation-strict-separate2/manager_status
 create mode 100644 src/test/test-colocation-strict-separate2/rules_config
 create mode 100644 src/test/test-colocation-strict-separate2/service_config
 create mode 100644 src/test/test-colocation-strict-separate3/README
 create mode 100644 src/test/test-colocation-strict-separate3/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate3/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate3/log.expect
 create mode 100644 src/test/test-colocation-strict-separate3/manager_status
 create mode 100644 src/test/test-colocation-strict-separate3/rules_config
 create mode 100644 src/test/test-colocation-strict-separate3/service_config
 create mode 100644 src/test/test-colocation-strict-separate4/README
 create mode 100644 src/test/test-colocation-strict-separate4/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate4/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate4/log.expect
 create mode 100644 src/test/test-colocation-strict-separate4/manager_status
 create mode 100644 src/test/test-colocation-strict-separate4/rules_config
 create mode 100644 src/test/test-colocation-strict-separate4/service_config
 create mode 100644 src/test/test-colocation-strict-separate5/README
 create mode 100644 src/test/test-colocation-strict-separate5/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate5/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate5/log.expect
 create mode 100644 src/test/test-colocation-strict-separate5/manager_status
 create mode 100644 src/test/test-colocation-strict-separate5/rules_config
 create mode 100644 src/test/test-colocation-strict-separate5/service_config
 create mode 100644 src/test/test-colocation-strict-together1/README
 create mode 100644 src/test/test-colocation-strict-together1/cmdlist
 create mode 100644 src/test/test-colocation-strict-together1/hardware_status
 create mode 100644 src/test/test-colocation-strict-together1/log.expect
 create mode 100644 src/test/test-colocation-strict-together1/manager_status
 create mode 100644 src/test/test-colocation-strict-together1/rules_config
 create mode 100644 src/test/test-colocation-strict-together1/service_config
 create mode 100644 src/test/test-colocation-strict-together2/README
 create mode 100644 src/test/test-colocation-strict-together2/cmdlist
 create mode 100644 src/test/test-colocation-strict-together2/hardware_status
 create mode 100644 src/test/test-colocation-strict-together2/log.expect
 create mode 100644 src/test/test-colocation-strict-together2/manager_status
 create mode 100644 src/test/test-colocation-strict-together2/rules_config
 create mode 100644 src/test/test-colocation-strict-together2/service_config
 create mode 100644 src/test/test-colocation-strict-together3/README
 create mode 100644 src/test/test-colocation-strict-together3/cmdlist
 create mode 100644 src/test/test-colocation-strict-together3/hardware_status
 create mode 100644 src/test/test-colocation-strict-together3/log.expect
 create mode 100644 src/test/test-colocation-strict-together3/manager_status
 create mode 100644 src/test/test-colocation-strict-together3/rules_config
 create mode 100644 src/test/test-colocation-strict-together3/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/README
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/static_service_stats
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/README
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/static_service_stats
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/README
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/static_service_stats
 create mode 100755 src/test/test_rules_config.pl


Summary over all repositories:
  139 files changed, 3115 insertions(+), 20 deletions(-)

-- 
Generated by git-murpp 0.8.0


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH cluster 1/1] cfs: add 'ha/rules.cfg' to observed files
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 01/15] ignore output of fence config tests in tree Daniel Kral
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/Cluster.pm  | 1 +
 src/pmxcfs/status.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/PVE/Cluster.pm b/src/PVE/Cluster.pm
index e0e3ee9..afbb36f 100644
--- a/src/PVE/Cluster.pm
+++ b/src/PVE/Cluster.pm
@@ -69,6 +69,7 @@ my $observed = {
     'ha/crm_commands' => 1,
     'ha/manager_status' => 1,
     'ha/resources.cfg' => 1,
+    'ha/rules.cfg' => 1,
     'ha/groups.cfg' => 1,
     'ha/fence.cfg' => 1,
     'status.cfg' => 1,
diff --git a/src/pmxcfs/status.c b/src/pmxcfs/status.c
index ff5fcc4..cee0c57 100644
--- a/src/pmxcfs/status.c
+++ b/src/pmxcfs/status.c
@@ -97,6 +97,7 @@ static memdb_change_t memdb_change_array[] = {
 	{ .path = "ha/crm_commands" },
 	{ .path = "ha/manager_status" },
 	{ .path = "ha/resources.cfg" },
+	{ .path = "ha/rules.cfg" },
 	{ .path = "ha/groups.cfg" },
 	{ .path = "ha/fence.cfg" },
 	{ .path = "status.cfg" },
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 01/15] ignore output of fence config tests in tree
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH cluster 1/1] cfs: add 'ha/rules.cfg' to observed files Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 17:49   ` [pve-devel] applied: " Thomas Lamprecht
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines Daniel Kral
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .gitignore | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitignore b/.gitignore
index 5b748c4..c35280e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,3 +4,5 @@
 *.buildinfo
 *.tar.gz
 /src/test/test-*/status/*
+/src/test/fence_cfgs/*.cfg.commands
+/src/test/fence_cfgs/*.cfg.write
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH cluster 1/1] cfs: add 'ha/rules.cfg' to observed files Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 01/15] ignore output of fence config tests in tree Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 17:53   ` Thomas Lamprecht
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 03/15] usage: add get_service_node and pin_service_node methods Daniel Kral
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add helper subroutines that implement basic set operations on hash
sets, i.e. hashes whose elements are set to a true value, e.g. 1.

These will be used for various tasks in the HA Manager colocation rules,
e.g. for verifying the satisfiability of the rules or applying the
colocation rules on the allowed set of nodes.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
If they're useful somewhere else, I can move them to PVE::Tools
post-RFC, but it'd probably be useful to prefix them with `hash_` there.
AFAICS, from a quick grep over all projects, there aren't any other
helpers for this, and `PVE::Tools::array_intersect()` wasn't what I
needed.

 src/PVE/HA/Tools.pm | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
index 0f9e9a5..fc3282c 100644
--- a/src/PVE/HA/Tools.pm
+++ b/src/PVE/HA/Tools.pm
@@ -115,6 +115,48 @@ sub write_json_to_file {
     PVE::Tools::file_set_contents($filename, $raw);
 }
 
+sub is_disjoint {
+    my ($hash1, $hash2) = @_;
+
+    for my $key (keys %$hash1) {
+	return 0 if exists($hash2->{$key});
+    }
+
+    return 1;
+};
+
+sub intersect {
+    my ($hash1, $hash2) = @_;
+
+    my $result = { map { $_ => $hash2->{$_} } keys %$hash1 };
+
+    for my $key (keys %$result) {
+	delete $result->{$key} if !defined($result->{$key});
+    }
+
+    return $result;
+};
+
+sub set_difference {
+    my ($hash1, $hash2) = @_;
+
+    my $result = { map { $_ => 1 } keys %$hash1 };
+
+    for my $key (keys %$result) {
+	delete $result->{$key} if defined($hash2->{$key});
+    }
+
+    return $result;
+};
+
+sub union {
+    my ($hash1, $hash2) = @_;
+
+    my $result = { map { $_ => 1 } keys %$hash1, keys %$hash2 };
+
+    return $result;
+};
+
 sub count_fenced_services {
     my ($ss, $node) = @_;
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 03/15] usage: add get_service_node and pin_service_node methods
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (2 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 04/15] add rules section config base plugin Daniel Kral
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add methods get_service_node() and pin_service_node() to the Usage class
to retrieve and pin the current node of a specific service.

This is used to retrieve the current node of a service for colocation
rules inside of select_service_node(), where there is currently no
access to the global services state.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
For me this is more of a temporary change, since I don't think putting
this information here is very useful in the future. It was more of a
workaround for the moment, since `select_service_node()` doesn't have
access to the global service configuration data, which is needed here.

I would like to give `select_service_node()` the information from e.g.
$sc directly post-RFC.

 src/PVE/HA/Usage.pm        | 12 ++++++++++++
 src/PVE/HA/Usage/Basic.pm  | 15 +++++++++++++++
 src/PVE/HA/Usage/Static.pm | 14 ++++++++++++++
 3 files changed, 41 insertions(+)

diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 66d9572..e4f86d7 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -27,6 +27,18 @@ sub list_nodes {
     die "implement in subclass";
 }
 
+sub get_service_node {
+    my ($self, $sid) = @_;
+
+    die "implement in subclass";
+}
+
+sub pin_service_node {
+    my ($self, $sid, $node) = @_;
+
+    die "implement in subclass";
+}
+
 sub contains_node {
     my ($self, $nodename) = @_;
 
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index d6b3d6c..50d687b 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -10,6 +10,7 @@ sub new {
 
     return bless {
 	nodes => {},
+	services => {},
 	haenv => $haenv,
     }, $class;
 }
@@ -38,11 +39,25 @@ sub contains_node {
     return defined($self->{nodes}->{$nodename});
 }
 
+sub get_service_node {
+    my ($self, $sid) = @_;
+
+    return $self->{services}->{$sid};
+}
+
+sub pin_service_node {
+    my ($self, $sid, $node) = @_;
+
+    $self->{services}->{$sid} = $node;
+}
+
 sub add_service_usage_to_node {
     my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
 
     if ($self->contains_node($nodename)) {
+	$self->{total}++;
 	$self->{nodes}->{$nodename}++;
+	$self->{services}->{$sid} = $nodename;
     } else {
 	$self->{haenv}->log(
 	    'warning',
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 3d0af3a..8db9202 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -22,6 +22,7 @@ sub new {
 	'service-stats' => {},
 	haenv => $haenv,
 	scheduler => $scheduler,
+	'service-nodes' => {},
 	'service-counts' => {}, # Service count on each node. Fallback if scoring calculation fails.
     }, $class;
 }
@@ -85,9 +86,22 @@ my sub get_service_usage {
     return $service_stats;
 }
 
+sub get_service_node {
+    my ($self, $sid) = @_;
+
+    return $self->{'service-nodes'}->{$sid};
+}
+
+sub pin_service_node {
+    my ($self, $sid, $node) = @_;
+
+    $self->{'service-nodes'}->{$sid} = $node;
+}
+
 sub add_service_usage_to_node {
     my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
 
+    $self->{'service-nodes'}->{$sid} = $nodename;
     $self->{'service-counts'}->{$nodename}++;
 
     eval {
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 04/15] add rules section config base plugin
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (3 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 03/15] usage: add get_service_node and pin_service_node methods Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin Daniel Kral
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add a rules section config base plugin to allow users to specify
different kinds of rules in a single configuration file.

The interface is designed to allow sub plugins to implement their own
{decode,encode}_value() methods and to offer a canonicalized version of
their rules with canonicalize(), i.e. with any inconsistencies removed
and ambiguities resolved. There is also an are_satisfiable() method in
anticipation of verifying additions or changes to the rules config via
the API.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 debian/pve-ha-manager.install |   1 +
 src/PVE/HA/Makefile           |   2 +-
 src/PVE/HA/Rules.pm           | 118 ++++++++++++++++++++++++++++++++++
 src/PVE/HA/Tools.pm           |   5 ++
 4 files changed, 125 insertions(+), 1 deletion(-)
 create mode 100644 src/PVE/HA/Rules.pm

diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index 0ffbd8d..9bbd375 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -32,6 +32,7 @@
 /usr/share/perl5/PVE/HA/Resources.pm
 /usr/share/perl5/PVE/HA/Resources/PVECT.pm
 /usr/share/perl5/PVE/HA/Resources/PVEVM.pm
+/usr/share/perl5/PVE/HA/Rules.pm
 /usr/share/perl5/PVE/HA/Tools.pm
 /usr/share/perl5/PVE/HA/Usage.pm
 /usr/share/perl5/PVE/HA/Usage/Basic.pm
diff --git a/src/PVE/HA/Makefile b/src/PVE/HA/Makefile
index 8c91b97..489cbc0 100644
--- a/src/PVE/HA/Makefile
+++ b/src/PVE/HA/Makefile
@@ -1,4 +1,4 @@
-SIM_SOURCES=CRM.pm Env.pm Groups.pm Resources.pm LRM.pm Manager.pm \
+SIM_SOURCES=CRM.pm Env.pm Groups.pm Rules.pm Resources.pm LRM.pm Manager.pm \
 	NodeStatus.pm Tools.pm FenceConfig.pm Fence.pm Usage.pm
 
 SOURCES=${SIM_SOURCES} Config.pm
diff --git a/src/PVE/HA/Rules.pm b/src/PVE/HA/Rules.pm
new file mode 100644
index 0000000..bff3375
--- /dev/null
+++ b/src/PVE/HA/Rules.pm
@@ -0,0 +1,118 @@
+package PVE::HA::Rules;
+
+use strict;
+use warnings;
+
+use PVE::JSONSchema qw(get_standard_option);
+use PVE::SectionConfig;
+use PVE::HA::Tools;
+
+use base qw(PVE::SectionConfig);
+
+# TODO Add descriptions, completions, etc.
+my $defaultData = {
+    propertyList => {
+	type => { description => "Rule type." },
+	ruleid => get_standard_option('pve-ha-rule-id'),
+	comment => {
+	    type => 'string',
+	    maxLength => 4096,
+	    description => "Rule description.",
+	},
+    },
+};
+
+sub private {
+    return $defaultData;
+}
+
+sub options {
+    return {
+	type => { optional => 0 },
+	ruleid => { optional => 0 },
+	comment => { optional => 1 },
+    };
+};
+
+sub decode_value {
+    my ($class, $type, $key, $value) = @_;
+
+    if ($key eq 'comment') {
+	return PVE::Tools::decode_text($value);
+    }
+
+    my $plugin = __PACKAGE__->lookup($type);
+    return $plugin->decode_value($type, $key, $value);
+}
+
+sub encode_value {
+    my ($class, $type, $key, $value) = @_;
+
+    if ($key eq 'comment') {
+	return PVE::Tools::encode_text($value);
+    }
+
+    my $plugin = __PACKAGE__->lookup($type);
+    return $plugin->encode_value($type, $key, $value);
+}
+
+sub parse_section_header {
+    my ($class, $line) = @_;
+
+    if ($line =~ m/^(\S+):\s*(\S+)\s*$/) {
+	my ($type, $ruleid) = (lc($1), $2);
+	my $errmsg = undef; # set if you want to skip whole section
+	eval { PVE::JSONSchema::pve_verify_configid($ruleid); };
+	$errmsg = $@ if $@;
+	my $config = {}; # to return additional attributes
+	return ($type, $ruleid, $errmsg, $config);
+    }
+    return undef;
+}
+
+sub foreach_service_rule {
+    my ($rules, $func, $opts) = @_;
+
+    my $sid = $opts->{sid};
+    my $type = $opts->{type};
+
+    my @ruleids = sort {
+	$rules->{order}->{$a} <=> $rules->{order}->{$b}
+    } keys %{$rules->{ids}};
+
+    for my $ruleid (@ruleids) {
+	my $rule = $rules->{ids}->{$ruleid};
+
+	next if !$rule; # invalid rules are kept undef in section config, delete them
+	next if $type && $rule->{type} ne $type;
+	next if $sid && !defined($rule->{services}->{$sid});
+
+	$func->($rule, $ruleid);
+    }
+}
+
+sub canonicalize {
+    my ($class, $rules, $groups, $services) = @_;
+
+    die "implement in subclass";
+}
+
+sub are_satisfiable {
+    my ($class, $rules, $groups, $services) = @_;
+
+    die "implement in subclass";
+}
+
+sub checked_config {
+    my ($rules, $groups, $services) = @_;
+
+    my $types = __PACKAGE__->lookup_types();
+
+    for my $type (@$types) {
+	my $plugin = __PACKAGE__->lookup($type);
+
+	$plugin->canonicalize($rules, $groups, $services);
+    }
+}
+
+1;
diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
index fc3282c..35107c9 100644
--- a/src/PVE/HA/Tools.pm
+++ b/src/PVE/HA/Tools.pm
@@ -92,6 +92,11 @@ PVE::JSONSchema::register_standard_option('pve-ha-group-id', {
     type => 'string', format => 'pve-configid',
 });
 
+PVE::JSONSchema::register_standard_option('pve-ha-rule-id', {
+    description => "The HA rule identifier.",
+    type => 'string', format => 'pve-configid',
+});
+
 sub read_json_from_file {
     my ($filename, $default) = @_;
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (4 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 04/15] add rules section config base plugin Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-04-03 12:16   ` Fabian Grünbichler
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 06/15] config, env, hw: add rules read and parse methods Daniel Kral
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add the colocation rule plugin to allow users to specify inter-service
affinity constraints.

These colocation rules can either be positive (keeping services
together) or negative (keeping services separate). Their strictness can
also be specified as either a MUST or a SHOULD, where the former means
that any service for which the constraint cannot be applied stays in
recovery, while the latter means that any such service is lifted from
the constraint.

The initial implementation also implements four basic transformations:
colocation rules with not enough services are dropped, transitive
positive colocation rules are merged, and both inter-colocation rule
inconsistencies and colocation rule inconsistencies with respect to the
location constraints specified in HA groups are dropped.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 debian/pve-ha-manager.install  |   1 +
 src/PVE/HA/Makefile            |   1 +
 src/PVE/HA/Rules/Colocation.pm | 391 +++++++++++++++++++++++++++++++++
 src/PVE/HA/Rules/Makefile      |   6 +
 src/PVE/HA/Tools.pm            |   6 +
 5 files changed, 405 insertions(+)
 create mode 100644 src/PVE/HA/Rules/Colocation.pm
 create mode 100644 src/PVE/HA/Rules/Makefile

diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index 9bbd375..89f9144 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -33,6 +33,7 @@
 /usr/share/perl5/PVE/HA/Resources/PVECT.pm
 /usr/share/perl5/PVE/HA/Resources/PVEVM.pm
 /usr/share/perl5/PVE/HA/Rules.pm
+/usr/share/perl5/PVE/HA/Rules/Colocation.pm
 /usr/share/perl5/PVE/HA/Tools.pm
 /usr/share/perl5/PVE/HA/Usage.pm
 /usr/share/perl5/PVE/HA/Usage/Basic.pm
diff --git a/src/PVE/HA/Makefile b/src/PVE/HA/Makefile
index 489cbc0..e386cbf 100644
--- a/src/PVE/HA/Makefile
+++ b/src/PVE/HA/Makefile
@@ -8,6 +8,7 @@ install:
 	install -d -m 0755 ${DESTDIR}${PERLDIR}/PVE/HA
 	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/HA/$$i; done
 	make -C Resources install
+	make -C Rules install
 	make -C Usage install
 	make -C Env install
 
diff --git a/src/PVE/HA/Rules/Colocation.pm b/src/PVE/HA/Rules/Colocation.pm
new file mode 100644
index 0000000..808d48e
--- /dev/null
+++ b/src/PVE/HA/Rules/Colocation.pm
@@ -0,0 +1,391 @@
+package PVE::HA::Rules::Colocation;
+
+use strict;
+use warnings;
+
+use Data::Dumper;
+
+use PVE::JSONSchema qw(get_standard_option);
+use PVE::HA::Tools;
+
+use base qw(PVE::HA::Rules);
+
+sub type {
+    return 'colocation';
+}
+
+sub properties {
+    return {
+	services => get_standard_option('pve-ha-resource-id-list'),
+	affinity => {
+	    description => "Describes whether the services are supposed to be kept on separate"
+		. " nodes, or are supposed to be kept together on the same node.",
+	    type => 'string',
+	    enum => ['separate', 'together'],
+	    optional => 0,
+	},
+	strict => {
+	    description => "Describes whether the colocation rule is mandatory or optional.",
+	    type => 'boolean',
+	    optional => 0,
+	},
+    }
+}
+
+sub options {
+    return {
+	services => { optional => 0 },
+	strict => { optional => 0 },
+	affinity => { optional => 0 },
+	comment => { optional => 1 },
+    };
+};
+
+sub decode_value {
+    my ($class, $type, $key, $value) = @_;
+
+    if ($key eq 'services') {
+	my $res = {};
+
+	for my $service (PVE::Tools::split_list($value)) {
+	    if (PVE::HA::Tools::pve_verify_ha_resource_id($service)) {
+		$res->{$service} = 1;
+	    }
+	}
+
+	return $res;
+    }
+
+    return $value;
+}
+
+sub encode_value {
+    my ($class, $type, $key, $value) = @_;
+
+    if ($key eq 'services') {
+	PVE::HA::Tools::pve_verify_ha_resource_id($_) for (keys %$value);
+
+	return join(',', keys %$value);
+    }
+
+    return $value;
+}
+
+sub foreach_colocation_rule {
+    my ($rules, $func, $opts) = @_;
+
+    my $my_opts = { map { $_ => $opts->{$_} } keys %$opts };
+    $my_opts->{type} = 'colocation';
+
+    PVE::HA::Rules::foreach_service_rule($rules, $func, $my_opts);
+}
+
+sub split_colocation_rules {
+    my ($rules) = @_;
+
+    my $positive_ruleids = [];
+    my $negative_ruleids = [];
+
+    foreach_colocation_rule($rules, sub {
+	my ($rule, $ruleid) = @_;
+
+	my $ruleid_set = $rule->{affinity} eq 'together' ? $positive_ruleids : $negative_ruleids;
+	push @$ruleid_set, $ruleid;
+    });
+
+    return ($positive_ruleids, $negative_ruleids);
+}
+
+=head3 check_services_count($rules)
+
+Returns a list of conflicts caused by colocation rules, which do not have
+enough services in them, defined in C<$rules>.
+
+If there are no conflicts, the returned list is empty.
+
+=cut
+
+sub check_services_count {
+    my ($rules) = @_;
+
+    my $conflicts = [];
+
+    foreach_colocation_rule($rules, sub {
+	my ($rule, $ruleid) = @_;
+
+	push @$conflicts, $ruleid if (scalar(keys %{$rule->{services}}) < 2);
+    });
+
+    return $conflicts;
+}
+
+=head3 check_positive_intransitivity($rules)
+
+Returns a list of conflicts caused by transitive positive colocation rules
+defined in C<$rules>.
+
+Transitive positive colocation rules exist, if there are at least two positive
+colocation rules with the same strictness, which put at least the same two
+services in relation. This means, that these rules can be merged together.
+
+If there are no conflicts, the returned list is empty.
+
+=cut
+
+sub check_positive_intransitivity {
+    my ($rules) = @_;
+
+    my $conflicts = {};
+    my ($positive_ruleids) = split_colocation_rules($rules);
+
+    while (my $outerid = shift(@$positive_ruleids)) {
+	my $outer = $rules->{ids}->{$outerid};
+
+	for my $innerid (@$positive_ruleids) {
+	    my $inner = $rules->{ids}->{$innerid};
+
+	    next if $outerid eq $innerid;
+	    next if $outer->{strict} != $inner->{strict};
+	    next if PVE::HA::Tools::is_disjoint($outer->{services}, $inner->{services});
+
+	    push @{$conflicts->{$outerid}}, $innerid;
+	}
+    }
+
+    return $conflicts;
+}
+
+=head3 check_inner_consistency($rules)
+
+Returns a list of conflicts caused by inconsistencies between positive and
+negative colocation rules defined in C<$rules>.
+
+Inner inconsistent colocation rules exist, if there are at least the same two
+services in a positive and a negative colocation relation, which is an
+impossible constraint as they are opposites of each other.
+
+If there are no conflicts, the returned list is empty.
+
+=cut
+
+sub check_inner_consistency {
+    my ($rules) = @_;
+
+    my $conflicts = [];
+    my ($positive_ruleids, $negative_ruleids) = split_colocation_rules($rules);
+
+    for my $outerid (@$positive_ruleids) {
+	my $outer = $rules->{ids}->{$outerid}->{services};
+
+	for my $innerid (@$negative_ruleids) {
+	    my $inner = $rules->{ids}->{$innerid}->{services};
+
+	    my $intersection = PVE::HA::Tools::intersect($outer, $inner);
+	    next if scalar(keys %$intersection < 2);
+
+	    push @$conflicts, [$outerid, $innerid];
+	}
+    }
+
+    return $conflicts;
+}
+
+=head3 check_positive_group_consistency(...)
+
+Returns a list of conflicts caused by inconsistencies between positive
+colocation rules defined in C<$rules> and node restrictions defined in
+C<$groups> and C<$service>.
+
+A positive colocation rule inconsistency with groups exists, if at least two
+services in a positive colocation rule are restricted to disjoint sets of
+nodes, i.e. they are in restricted HA groups, which have a disjoint set of
+nodes.
+
+If there are no conflicts, the returned list is empty.
+
+=cut
+
+sub check_positive_group_consistency {
+    my ($rules, $groups, $services, $positive_ruleids, $conflicts) = @_;
+
+    for my $ruleid (@$positive_ruleids) {
+	my $rule_services = $rules->{ids}->{$ruleid}->{services};
+	my $nodes;
+
+	for my $sid (keys %$rule_services) {
+	    my $groupid = $services->{$sid}->{group};
+	    return if !$groupid;
+
+	    my $group = $groups->{ids}->{$groupid};
+	    return if !$group;
+	    return if !$group->{restricted};
+
+	    $nodes = { map { $_ => 1 } keys %{$group->{nodes}} } if !defined($nodes);
+	    $nodes = PVE::HA::Tools::intersect($nodes, $group->{nodes});
+	}
+
+	if (defined($nodes) && scalar keys %$nodes < 1) {
+	    push @$conflicts, ['positive', $ruleid];
+	}
+    }
+}
+
+=head3 check_negative_group_consistency(...)
+
+Returns a list of conflicts caused by inconsistencies between negative
+colocation rules defined in C<$rules> and node restrictions defined in
+C<$groups> and C<$service>.
+
+A negative colocation rule inconsistency with groups exists, if at least two
+services in a negative colocation rule are restricted to less nodes in total
+than services in the rule, i.e. they are in restricted HA groups, where the
+union of all restricted node sets have less elements than restricted services.
+
+If there are no conflicts, the returned list is empty.
+
+=cut
+
+sub check_negative_group_consistency {
+    my ($rules, $groups, $services, $negative_ruleids, $conflicts) = @_;
+
+    for my $ruleid (@$negative_ruleids) {
+	my $rule_services = $rules->{ids}->{$ruleid}->{services};
+	my $restricted_services = 0;
+	my $restricted_nodes;
+
+	for my $sid (keys %$rule_services) {
+	    my $groupid = $services->{$sid}->{group};
+	    return if !$groupid;
+
+	    my $group = $groups->{ids}->{$groupid};
+	    return if !$group;
+	    return if !$group->{restricted};
+
+	    $restricted_services++;
+
+	    $restricted_nodes = {} if !defined($restricted_nodes);
+	    $restricted_nodes = PVE::HA::Tools::union($restricted_nodes, $group->{nodes});
+	}
+
+	if (defined($restricted_nodes)
+	    && scalar keys %$restricted_nodes < $restricted_services) {
+	    push @$conflicts, ['negative', $ruleid];
+	}
+    }
+}
+
+sub check_consistency_with_groups {
+    my ($rules, $groups, $services) = @_;
+
+    my $conflicts = [];
+    my ($positive_ruleids, $negative_ruleids) = split_colocation_rules($rules);
+
+    check_positive_group_consistency($rules, $groups, $services, $positive_ruleids, $conflicts);
+    check_negative_group_consistency($rules, $groups, $services, $negative_ruleids, $conflicts);
+
+    return $conflicts;
+}
+
+sub canonicalize {
+    my ($class, $rules, $groups, $services) = @_;
+
+    my $illdefined_ruleids = check_services_count($rules);
+
+    for my $ruleid (@$illdefined_ruleids) {
+	print "Drop colocation rule '$ruleid', because it does not have enough services defined.\n";
+
+	delete $rules->{ids}->{$ruleid};
+    }
+
+    my $mergeable_positive_ruleids = check_positive_intransitivity($rules);
+
+    for my $outerid (sort keys %$mergeable_positive_ruleids) {
+	my $outer = $rules->{ids}->{$outerid};
+	my $innerids = $mergeable_positive_ruleids->{$outerid};
+
+	for my $innerid (@$innerids) {
+	    my $inner = $rules->{ids}->{$innerid};
+
+	    $outer->{services}->{$_} = 1 for (keys %{$inner->{services}});
+
+	    print "Merge services of positive colocation rule '$innerid' into positive colocation"
+		. " rule '$outerid', because they share at least one service.\n";
+
+	    delete $rules->{ids}->{$innerid};
+	}
+    }
+
+    my $inner_conflicts = check_inner_consistency($rules);
+
+    for my $conflict (@$inner_conflicts) {
+	my ($positiveid, $negativeid) = @$conflict;
+
+	print "Drop positive colocation rule '$positiveid' and negative colocation rule"
+	    . " '$negativeid', because they share two or more services.\n";
+
+	delete $rules->{ids}->{$positiveid};
+	delete $rules->{ids}->{$negativeid};
+    }
+
+    my $group_conflicts = check_consistency_with_groups($rules, $groups, $services);
+
+    for my $conflict (@$group_conflicts) {
+	my ($type, $ruleid) = @$conflict;
+
+	if ($type eq 'positive') {
+	    print "Drop positive colocation rule '$ruleid', because two or more services are"
+		. " restricted to different nodes.\n";
+	} elsif ($type eq 'negative') {
+	    print "Drop negative colocation rule '$ruleid', because two or more services are"
+		. " restricted to fewer nodes than services.\n";
+	} else {
+	    die "Invalid group conflict type $type\n";
+	}
+
+	delete $rules->{ids}->{$ruleid};
+    }
+}
+
+# TODO This will be used to verify modifications to the rules config over the API
+sub are_satisfiable {
+    my ($class, $rules, $groups, $services) = @_;
+
+    my $illdefined_ruleids = check_services_count($rules);
+
+    for my $ruleid (@$illdefined_ruleids) {
+	print "Colocation rule '$ruleid' does not have enough services defined.\n";
+    }
+
+    my $inner_conflicts = check_inner_consistency($rules);
+
+    for my $conflict (@$inner_conflicts) {
+	my ($positiveid, $negativeid) = @$conflict;
+
+	print "Positive colocation rule '$positiveid' is inconsistent with negative colocation rule"
+	    . " '$negativeid', because they share two or more services between them.\n";
+    }
+
+    my $group_conflicts = check_consistency_with_groups($rules, $groups, $services);
+
+    for my $conflict (@$group_conflicts) {
+	my ($type, $ruleid) = @$conflict;
+
+	if ($type eq 'positive') {
+	    print "Positive colocation rule '$ruleid' cannot be applied, because two or more services"
+		. " are restricted to different nodes.\n";
+	} elsif ($type eq 'negative') {
+	    print "Negative colocation rule '$ruleid' cannot be applied, because two or more services"
+		. " are restricted to fewer nodes than services.\n";
+	} else {
+	    die "Invalid group conflict type $type\n";
+	}
+    }
+
+    if (scalar(@$inner_conflicts) || scalar(@$group_conflicts)) {
+	return 0;
+    }
+
+    return 1;
+}
+
+1;
diff --git a/src/PVE/HA/Rules/Makefile b/src/PVE/HA/Rules/Makefile
new file mode 100644
index 0000000..8cb91ac
--- /dev/null
+++ b/src/PVE/HA/Rules/Makefile
@@ -0,0 +1,6 @@
+SOURCES=Colocation.pm
+
+.PHONY: install
+install:
+	install -d -m 0755 ${DESTDIR}${PERLDIR}/PVE/HA/Rules
+	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/HA/Rules/$$i; done
diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
index 35107c9..52251d7 100644
--- a/src/PVE/HA/Tools.pm
+++ b/src/PVE/HA/Tools.pm
@@ -46,6 +46,12 @@ PVE::JSONSchema::register_standard_option('pve-ha-resource-id', {
     type => 'string', format => 'pve-ha-resource-id',
 });
 
+PVE::JSONSchema::register_standard_option('pve-ha-resource-id-list', {
+    description => "List of HA resource IDs.",
+    typetext => "<type>:<name>{,<type>:<name>}*",
+    type => 'string', format => 'pve-ha-resource-id-list',
+});
+
 PVE::JSONSchema::register_format('pve-ha-resource-or-vm-id', \&pve_verify_ha_resource_or_vm_id);
 sub pve_verify_ha_resource_or_vm_id {
     my ($sid, $noerr) = @_;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 06/15] config, env, hw: add rules read and parse methods
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (5 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 07/15] manager: read and update rules config Daniel Kral
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Adds methods to the HA environment to read and parse the rules
configuration file for the specific environment implementation.
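
For illustration, a caller (the manager in a later patch of this series) can
then simply do the following; the rule id and the sketched layout below are
only illustrative of the usual section config structure:

    my $rules = $haenv->read_rules_config();
    # roughly: { ids => { 'keep-apart' => { type => 'colocation', ... } },
    #            order => { ... }, digest => '...' }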

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Config.pm       | 12 ++++++++++++
 src/PVE/HA/Env.pm          |  6 ++++++
 src/PVE/HA/Env/PVE2.pm     | 13 +++++++++++++
 src/PVE/HA/Sim/Env.pm      | 15 +++++++++++++++
 src/PVE/HA/Sim/Hardware.pm | 15 +++++++++++++++
 5 files changed, 61 insertions(+)

diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 129236d..99ae33a 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -7,12 +7,14 @@ use JSON;
 
 use PVE::HA::Tools;
 use PVE::HA::Groups;
+use PVE::HA::Rules;
 use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_write_file cfs_lock_file);
 use PVE::HA::Resources;
 
 my $manager_status_filename = "ha/manager_status";
 my $ha_groups_config = "ha/groups.cfg";
 my $ha_resources_config = "ha/resources.cfg";
+my $ha_rules_config = "ha/rules.cfg";
 my $crm_commands_filename = "ha/crm_commands";
 my $ha_fence_config = "ha/fence.cfg";
 
@@ -31,6 +33,11 @@ cfs_register_file(
     sub { PVE::HA::Resources->parse_config(@_); },
     sub { PVE::HA::Resources->write_config(@_); },
 );
+cfs_register_file(
+    $ha_rules_config,
+    sub { PVE::HA::Rules->parse_config(@_); },
+    sub { PVE::HA::Rules->write_config(@_); },
+);
 cfs_register_file($manager_status_filename, \&json_reader, \&json_writer);
 cfs_register_file(
     $ha_fence_config, \&PVE::HA::FenceConfig::parse_config, \&PVE::HA::FenceConfig::write_config);
@@ -193,6 +200,11 @@ sub parse_sid {
     return wantarray ? ($sid, $type, $name) : $sid;
 }
 
+sub read_rules_config {
+
+    return cfs_read_file($ha_rules_config);
+}
+
 sub read_group_config {
 
     return cfs_read_file($ha_groups_config);
diff --git a/src/PVE/HA/Env.pm b/src/PVE/HA/Env.pm
index bb28a75..bdcbed8 100644
--- a/src/PVE/HA/Env.pm
+++ b/src/PVE/HA/Env.pm
@@ -131,6 +131,12 @@ sub steal_service {
     return $self->{plug}->steal_service($sid, $current_node, $new_node);
 }
 
+sub read_rules_config {
+    my ($self) = @_;
+
+    return $self->{plug}->read_rules_config();
+}
+
 sub read_group_config {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 1de4b69..3157e56 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -28,6 +28,13 @@ PVE::HA::Resources::PVECT->register();
 
 PVE::HA::Resources->init();
 
+use PVE::HA::Rules;
+use PVE::HA::Rules::Colocation;
+
+PVE::HA::Rules::Colocation->register();
+
+PVE::HA::Rules->init();
+
 my $lockdir = "/etc/pve/priv/lock";
 
 sub new {
@@ -188,6 +195,12 @@ sub steal_service {
     $self->cluster_state_update();
 }
 
+sub read_rules_config {
+    my ($self) = @_;
+
+    return PVE::HA::Config::read_rules_config();
+}
+
 sub read_group_config {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index b2ab231..2f73859 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -20,6 +20,13 @@ PVE::HA::Sim::Resources::VirtFail->register();
 
 PVE::HA::Resources->init();
 
+use PVE::HA::Rules;
+use PVE::HA::Rules::Colocation;
+
+PVE::HA::Rules::Colocation->register();
+
+PVE::HA::Rules->init();
+
 sub new {
     my ($this, $nodename, $hardware, $log_id) = @_;
 
@@ -245,6 +252,14 @@ sub exec_fence_agent {
     return $self->{hardware}->exec_fence_agent($agent, $node, @param);
 }
 
+sub read_rules_config {
+    my ($self) = @_;
+
+    $assert_cfs_can_rw->($self);
+
+    return $self->{hardware}->read_rules_config();
+}
+
 sub read_group_config {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 859e0a3..24bc8b9 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -28,6 +28,7 @@ my $watchdog_timeout = 60;
 # $testdir/cmdlist                    Command list for simulation
 # $testdir/hardware_status            Hardware description (number of nodes, ...)
 # $testdir/manager_status             CRM status (start with {})
+# $testdir/rules_config               Constraints / Rules configuration
 # $testdir/service_config             Service configuration
 # $testdir/static_service_stats       Static service usage information (cpu, memory)
 # $testdir/groups                     HA groups configuration
@@ -319,6 +320,16 @@ sub read_crm_commands {
     return $self->global_lock($code);
 }
 
+sub read_rules_config {
+    my ($self) = @_;
+
+    my $filename = "$self->{statusdir}/rules_config";
+    my $raw = '';
+    $raw = PVE::Tools::file_get_contents($filename) if -f $filename;
+
+    return PVE::HA::Rules->parse_config($filename, $raw);
+}
+
 sub read_group_config {
     my ($self) = @_;
 
@@ -391,6 +402,10 @@ sub new {
     # copy initial configuartion
     copy("$testdir/manager_status", "$statusdir/manager_status"); # optional
 
+    if (-f "$testdir/rules_config") {
+	copy("$testdir/rules_config", "$statusdir/rules_config");
+    }
+
     if (-f "$testdir/groups") {
 	copy("$testdir/groups", "$statusdir/groups");
     } else {
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 07/15] manager: read and update rules config
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (6 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 06/15] config, env, hw: add rules read and parse methods Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 08/15] manager: factor out prioritized nodes in select_service_node Daniel Kral
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Read the rules configuration in each round and only update the canonicalized
rules configuration if there were any changes since the last round, to avoid
verifying the rule set more often than necessary.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
As noted inline already, a check whether the service configuration changed is
still missing. The service configuration includes the HA group assignment
(which is the only part relevant here), but unlike the groups and rules
configs it does not expose a digest.

I was hesitant to change the structure of `%sc` or the return value of
`read_service_config()`, as it is used quite often, and I didn't want to
create a sha1 digest here just for this check. This is another argument for
having all of these constraints in a single configuration file.
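
Just to sketch what such a check could look like if we went the digest route
anyway (purely illustrative, not part of this patch; the field name is made
up), one could hash the canonicalized service config on the fly:

    use JSON ();
    use Digest::SHA ();

    my $services_digest = Digest::SHA::sha1_hex(JSON->new->canonical(1)->encode($sc));
    # ... and compare it to $self->{last_services_digest}, analogous to the
    # rules and groups digests handled below.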

 src/PVE/HA/Manager.pm | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index d983672..7a8e7dc 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -11,6 +11,9 @@ use PVE::HA::NodeStatus;
 use PVE::HA::Usage::Basic;
 use PVE::HA::Usage::Static;
 
+use PVE::HA::Rules;
+use PVE::HA::Rules::Colocation;
+
 ## Variable Name & Abbreviations Convention
 #
 # The HA stack has some variables it uses frequently and thus abbreviates it such that it may be
@@ -41,7 +44,12 @@ sub new {
 
     my $class = ref($this) || $this;
 
-    my $self = bless { haenv => $haenv, crs => {} }, $class;
+    my $self = bless {
+	haenv => $haenv,
+	crs => {},
+	last_rules_digest => '',
+	last_groups_digest => '',
+    }, $class;
 
     my $old_ms = $haenv->read_manager_status();
 
@@ -497,6 +505,19 @@ sub manage {
 	delete $ss->{$sid};
     }
 
+    my $new_rules = $haenv->read_rules_config();
+
+    # TODO We should also check for a service digest here, but we would've to
+    #      calculate it here independently or also expose it through read_service_config()
+    if ($new_rules->{digest} ne $self->{last_rules_digest}
+	|| $self->{groups}->{digest} ne $self->{last_groups_digest}) {
+	$self->{rules} = $new_rules;
+	PVE::HA::Rules::checked_config($self->{rules}, $self->{groups}, $sc);
+    }
+
+    $self->{last_rules_digest} = $self->{rules}->{digest};
+    $self->{last_groups_digest} = $self->{groups}->{digest};
+
     $self->update_crm_commands();
 
     for (;;) {
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 08/15] manager: factor out prioritized nodes in select_service_node
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (7 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 07/15] manager: read and update rules config Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes Daniel Kral
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Factor out the prioritized node hash set in select_service_node, as it is
used multiple times and a named variable makes the intent a little clearer.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Manager.pm | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 7a8e7dc..8f2ab3d 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -175,23 +175,24 @@ sub select_service_node {
     # select node from top priority node list
 
     my $top_pri = $pri_list[0];
+    my $pri_nodes = $pri_groups->{$top_pri};
 
     # try to avoid nodes where the service failed already if we want to relocate
     if ($try_next) {
 	foreach my $node (@$tried_nodes) {
-	    delete $pri_groups->{$top_pri}->{$node};
+	    delete $pri_nodes->{$node};
 	}
     }
 
     return $maintenance_fallback
-	if defined($maintenance_fallback) && $pri_groups->{$top_pri}->{$maintenance_fallback};
+	if defined($maintenance_fallback) && $pri_nodes->{$maintenance_fallback};
 
-    return $current_node if (!$try_next && !$best_scored) && $pri_groups->{$top_pri}->{$current_node};
+    return $current_node if (!$try_next && !$best_scored) && $pri_nodes->{$current_node};
 
     my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);
     my @nodes = sort {
 	$scores->{$a} <=> $scores->{$b} || $a cmp $b
-    } keys %{$pri_groups->{$top_pri}};
+    } keys %$pri_nodes;
 
     my $found;
     for (my $i = scalar(@nodes) - 1; $i >= 0; $i--) {
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (8 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 08/15] manager: factor out prioritized nodes in select_service_node Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-04-03 12:17   ` Fabian Grünbichler
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 10/15] sim: resources: add option to limit start and migrate tries to node Daniel Kral
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add a mechanism to the node selection subroutine that enforces the
colocation rules defined in the rules config.

The algorithm directly manipulates the set of nodes the service is allowed
to run on, depending on the type and strictness of the colocation rules, if
there are any.

To function correctly, it relies on nodes that are unavailable (i.e. offline,
unreachable, or where the service failed to start in previous tries) or that
are otherwise not allowed (i.e. HA group node restrictions) having already
been removed from that set.
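
A minimal illustration of that manipulation (the values are made up): for a
strict negative colocation rule between vm:101 and vm:102, with vm:102
currently placed on node2, the computed preference and the resulting pruning
of the allowed node set would roughly look like this:

    $separate      = { node2 => { strict => 1 } };
    $allowed_nodes = { node1 => 1, node2 => 1, node3 => 1 };
    apply_negative_colocation_rules($separate, $allowed_nodes);
    # $allowed_nodes is now { node1 => 1, node3 => 1 }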

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Manager.pm      | 203 ++++++++++++++++++++++++++++++++++++-
 src/test/test_failover1.pl |   4 +-
 2 files changed, 205 insertions(+), 2 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 8f2ab3d..79b6555 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -157,8 +157,201 @@ sub get_node_priority_groups {
     return ($pri_groups, $group_members);
 }
 
+=head3 get_colocated_services($rules, $sid, $online_node_usage)
+
+Returns a hash map of all services that are specified as being in a positive
+or negative colocation in C<$rules> with the given service with id C<$sid>.
+
+Each service entry consists of the type of colocation, strictness of colocation
+and the node the service is currently assigned to, if any, according to
+C<$online_node_usage>.
+
+For example, a service C<'vm:101'> being strictly colocated together (positive)
+with two other services C<'vm:102'> and C<'vm:103'>, and loosely colocated
+apart (negative) from another service C<'vm:104'>, results in the hash map:
+
+    {
+	'vm:102' => {
+	    affinity => 'together',
+	    strict => 1,
+	    node => 'node2'
+	},
+	'vm:103' => {
+	    affinity => 'together',
+	    strict => 1,
+	    node => 'node2'
+	},
+	'vm:104' => {
+	    affinity => 'separate',
+	    strict => 0,
+	    node => undef
+	}
+    }
+
+=cut
+
+sub get_colocated_services {
+    my ($rules, $sid, $online_node_usage) = @_;
+
+    my $services = {};
+
+    PVE::HA::Rules::Colocation::foreach_colocation_rule($rules, sub {
+	my ($rule) = @_;
+
+	for my $csid (sort keys %{$rule->{services}}) {
+	    next if $csid eq $sid;
+
+	    $services->{$csid} = {
+		node => $online_node_usage->get_service_node($csid),
+		affinity => $rule->{affinity},
+		strict => $rule->{strict},
+	    };
+        }
+    }, {
+	sid => $sid,
+    });
+
+    return $services;
+}
+
+=head3 get_colocation_preference($rules, $sid, $online_node_usage)
+
+Returns a list of two hashes, where each is a hash map of the colocation
+preference of C<$sid>, according to the colocation rules in C<$rules> and the
+service locations in C<$online_node_usage>.
+
+The first hash is the positive colocation preference, where each element
+represents properties for how much C<$sid> prefers to be on the node.
+Currently, this is a binary C<$strict> field, which means either it should be
+there (C<0>) or must be there (C<1>).
+
+The second hash is the negative colocation preference, where each element
+represents properties for how much C<$sid> prefers not to be on the node.
+Currently, this is a binary C<$strict> field, which means either it should not
+be there (C<0>) or must not be there (C<1>).
+
+=cut
+
+sub get_colocation_preference {
+    my ($rules, $sid, $online_node_usage) = @_;
+
+    my $services = get_colocated_services($rules, $sid, $online_node_usage);
+
+    my $together = {};
+    my $separate = {};
+
+    for my $service (values %$services) {
+	my $node = $service->{node};
+
+	next if !$node;
+
+	my $node_set = $service->{affinity} eq 'together' ? $together : $separate;
+	$node_set->{$node}->{strict} = $node_set->{$node}->{strict} || $service->{strict};
+    }
+
+    return ($together, $separate);
+}
+
+=head3 apply_positive_colocation_rules($together, $allowed_nodes)
+
+Applies the positive colocation preference C<$together> on the allowed node
+hash set C<$allowed_nodes> directly.
+
+Positive colocation means keeping services together on a single node, and
+therefore minimizing the separation of services.
+
+The allowed node hash set C<$allowed_nodes> is expected to contain all nodes
+that are available to the service, i.e. each node is currently online, is
+available according to other location constraints, and the service has not
+failed running there yet.
+
+=cut
+
+sub apply_positive_colocation_rules {
+    my ($together, $allowed_nodes) = @_;
+
+    return if scalar(keys %$together) < 1;
+
+    my $mandatory_nodes = {};
+    my $possible_nodes = PVE::HA::Tools::intersect($allowed_nodes, $together);
+
+    for my $node (sort keys %$together) {
+	$mandatory_nodes->{$node} = 1 if $together->{$node}->{strict};
+    }
+
+    if (scalar keys %$mandatory_nodes) {
+	# limit to only the nodes the service must be on.
+	for my $node (keys %$allowed_nodes) {
+	    next if exists($mandatory_nodes->{$node});
+
+	    delete $allowed_nodes->{$node};
+	}
+    } elsif (scalar keys %$possible_nodes) {
+	# limit to the possible nodes the service should be on, if there are any.
+	for my $node (keys %$allowed_nodes) {
+	    next if exists($possible_nodes->{$node});
+
+	    delete $allowed_nodes->{$node};
+	}
+    }
+}
+
+=head3 apply_negative_colocation_rules($separate, $allowed_nodes)
+
+Applies the negative colocation preference C<$separate> on the allowed node
+hash set C<$allowed_nodes> directly.
+
+Negative colocation means keeping services separate on multiple nodes, and
+therefore maximizing the separation of services.
+
+The allowed node hash set C<$allowed_nodes> is expected to contain all nodes
+that are available to the service, i.e. each node is currently online, is
+available according to other location constraints, and the service has not
+failed running there yet.
+
+=cut
+
+sub apply_negative_colocation_rules {
+    my ($separate, $allowed_nodes) = @_;
+
+    return if scalar(keys %$separate) < 1;
+
+    my $mandatory_nodes = {};
+    my $possible_nodes = PVE::HA::Tools::set_difference($allowed_nodes, $separate);
+
+    for my $node (sort keys %$separate) {
+	$mandatory_nodes->{$node} = 1 if $separate->{$node}->{strict};
+    }
+
+    if (scalar keys %$mandatory_nodes) {
+	# remove the nodes the service must not be on.
+	for my $node (keys %$allowed_nodes) {
+	    next if !exists($mandatory_nodes->{$node});
+
+	    delete $allowed_nodes->{$node};
+	}
+    } elsif (scalar keys %$possible_nodes) {
+	# remove the nodes the service should not be on, if other nodes remain.
+	for my $node (keys %$allowed_nodes) {
+	    next if exists($possible_nodes->{$node});
+
+	    delete $allowed_nodes->{$node};
+	}
+    }
+}
+
+sub apply_colocation_rules {
+    my ($rules, $sid, $allowed_nodes, $online_node_usage) = @_;
+
+    my ($together, $separate) = get_colocation_preference($rules, $sid, $online_node_usage);
+
+    apply_positive_colocation_rules($together, $allowed_nodes);
+    apply_negative_colocation_rules($separate, $allowed_nodes);
+}
+
 sub select_service_node {
-    my ($groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
+    # TODO Cleanup this signature post-RFC
+    my ($rules, $groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
 
     my $group = get_service_group($groups, $online_node_usage, $service_conf);
 
@@ -189,6 +382,8 @@ sub select_service_node {
 
     return $current_node if (!$try_next && !$best_scored) && $pri_nodes->{$current_node};
 
+    apply_colocation_rules($rules, $sid, $pri_nodes, $online_node_usage);
+
     my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);
     my @nodes = sort {
 	$scores->{$a} <=> $scores->{$b} || $a cmp $b
@@ -758,6 +953,7 @@ sub next_state_request_start {
 
     if ($self->{crs}->{rebalance_on_request_start}) {
 	my $selected_node = select_service_node(
+	    $self->{rules},
 	    $self->{groups},
 	    $self->{online_node_usage},
 	    $sid,
@@ -771,6 +967,9 @@ sub next_state_request_start {
 	my $select_text = $selected_node ne $current_node ? 'new' : 'current';
 	$haenv->log('info', "service $sid: re-balance selected $select_text node $selected_node for startup");
 
+	# TODO It would be better if this information would be retrieved from $ss/$sd post-RFC
+	$self->{online_node_usage}->pin_service_node($sid, $selected_node);
+
 	if ($selected_node ne $current_node) {
 	    $change_service_state->($self, $sid, 'request_start_balance', node => $current_node, target => $selected_node);
 	    return;
@@ -898,6 +1097,7 @@ sub next_state_started {
 	    }
 
 	    my $node = select_service_node(
+		$self->{rules},
 	        $self->{groups},
 		$self->{online_node_usage},
 		$sid,
@@ -1004,6 +1204,7 @@ sub next_state_recovery {
     $self->recompute_online_node_usage(); # we want the most current node state
 
     my $recovery_node = select_service_node(
+	$self->{rules},
 	$self->{groups},
 	$self->{online_node_usage},
 	$sid,
diff --git a/src/test/test_failover1.pl b/src/test/test_failover1.pl
index 308eab3..4c84fbd 100755
--- a/src/test/test_failover1.pl
+++ b/src/test/test_failover1.pl
@@ -8,6 +8,8 @@ use PVE::HA::Groups;
 use PVE::HA::Manager;
 use PVE::HA::Usage::Basic;
 
+my $rules = {};
+
 my $groups = PVE::HA::Groups->parse_config("groups.tmp", <<EOD);
 group: prefer_node1
 	nodes node1
@@ -31,7 +33,7 @@ sub test {
     my ($expected_node, $try_next) = @_;
     
     my $node = PVE::HA::Manager::select_service_node
-	($groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
+	($rules, $groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
 
     my (undef, undef, $line) = caller();
     die "unexpected result: $node != ${expected_node} at line $line\n" 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 10/15] sim: resources: add option to limit start and migrate tries to node
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (9 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 11/15] test: ha tester: add test cases for strict negative colocation rules Daniel Kral
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add an option to the VirtFail resource's name to allow the start and migrate
fail counts to apply only on a certain node, identified by its number in the
nodeX naming scheme.

This allows slightly more elaborate test scenarios, e.g. a service that can
start on any node except one specific node, where it fails to start even
though it is expected to end up there after a migration.
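
To give a (made-up) example of the extended encoding: a service id like
fa:120013 decodes to a = 1, b = 2, c = 0, d = 0, e = 1 and f = 3, i.e. its
first two start attempts fail, but only while it is placed on node3; on any
other node the failure counters are skipped and the service starts right away.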

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Sim/Resources/VirtFail.pm | 37 +++++++++++++++++++---------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/src/PVE/HA/Sim/Resources/VirtFail.pm b/src/PVE/HA/Sim/Resources/VirtFail.pm
index ce88391..fddecd6 100644
--- a/src/PVE/HA/Sim/Resources/VirtFail.pm
+++ b/src/PVE/HA/Sim/Resources/VirtFail.pm
@@ -10,25 +10,36 @@ use base qw(PVE::HA::Sim::Resources);
 # To make it more interesting we can encode some behavior in the VMID
 # with the following format, where fa: is the type and a, b, c, ...
 # are digits in base 10, i.e. the full service ID would be:
-#   fa:abcde
+#   fa:abcdef
 # And the digits after the fa: type prefix would mean:
 #   - a: no meaning but can be used for differentiating similar resources
 #   - b: how many tries are needed to start correctly (0 is normal behavior) (should be set)
 #   - c: how many tries are needed to migrate correctly (0 is normal behavior) (should be set)
 #   - d: should shutdown be successful (0 = yes, anything else no) (optional)
 #   - e: return value of $plugin->exists() defaults to 1 if not set (optional)
+#   - f: limits the constraints of b and c to the nodeX (0 = apply to all nodes) (optional)
 
 my $decode_id = sub {
     my $id = shift;
 
-    my ($start, $migrate, $stop, $exists) = $id =~ /^\d(\d)(\d)(\d)?(\d)?/g;
+    my ($start, $migrate, $stop, $exists, $limit_to_node) = $id =~ /^\d(\d)(\d)(\d)?(\d)?(\d)?/g;
 
     $start = 0 if !defined($start);
     $migrate = 0 if !defined($migrate);
     $stop = 0 if !defined($stop);
     $exists = 1 if !defined($exists);
+    $limit_to_node = 0 if !defined($limit_to_node);
 
-    return ($start, $migrate, $stop, $exists)
+    return ($start, $migrate, $stop, $exists, $limit_to_node);
+};
+
+my $should_retry_action = sub {
+    my ($haenv, $limit_to_node) = @_;
+
+    my ($node) = $haenv->nodename() =~ /^node(\d)/g;
+    $node = 0 if !defined($node);
+
+    return $limit_to_node == 0 || $limit_to_node == $node;
 };
 
 my $tries = {
@@ -53,12 +64,14 @@ sub exists {
 sub start {
     my ($class, $haenv, $id) = @_;
 
-    my ($start_failure_count) = &$decode_id($id);
+    my ($start_failure_count, $limit_to_node) = (&$decode_id($id))[0,4];
 
-    $tries->{start}->{$id} = 0 if !$tries->{start}->{$id};
-    $tries->{start}->{$id}++;
+    if ($should_retry_action->($haenv, $limit_to_node)) {
+	$tries->{start}->{$id} = 0 if !$tries->{start}->{$id};
+	$tries->{start}->{$id}++;
 
-    return if $start_failure_count >= $tries->{start}->{$id};
+	return if $start_failure_count >= $tries->{start}->{$id};
+    }
 
     $tries->{start}->{$id} = 0; # reset counts
 
@@ -79,12 +92,14 @@ sub shutdown {
 sub migrate {
     my ($class, $haenv, $id, $target, $online) = @_;
 
-    my (undef, $migrate_failure_count) = &$decode_id($id);
+    my ($migrate_failure_count, $limit_to_node) = (&$decode_id($id))[1,4];
 
-    $tries->{migrate}->{$id} = 0 if !$tries->{migrate}->{$id};
-    $tries->{migrate}->{$id}++;
+    if ($should_retry_action->($haenv, $limit_to_node)) {
+	$tries->{migrate}->{$id} = 0 if !$tries->{migrate}->{$id};
+	$tries->{migrate}->{$id}++;
 
-    return if $migrate_failure_count >= $tries->{migrate}->{$id};
+	return if $migrate_failure_count >= $tries->{migrate}->{$id};
+    }
 
     $tries->{migrate}->{$id} = 0; # reset counts
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 11/15] test: ha tester: add test cases for strict negative colocation rules
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (10 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 10/15] sim: resources: add option to limit start and migrate tries to node Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 12/15] test: ha tester: add test cases for strict positive " Daniel Kral
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add test cases for strict negative colocation rules, i.e. rules where services
must be kept on separate nodes. These verify the behavior of such services
when the node of one or more of them fails, in the following scenarios (a
sketch of the rule format used by these tests follows the list):

- 2 neg. colocated services in a 3 node cluster; 1 node failing
- 3 neg. colocated services in a 5 node cluster; 1 node failing
- 3 neg. colocated services in a 5 node cluster; 2 nodes failing
- 2 neg. colocated services in a 3 node cluster; 1 node failing, but the
  recovery node cannot start the service
- Pair of 2 neg. colocated services (with one common service in both) in
  a 3 node cluster; 1 node failing
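
As a sketch of the rule format these tests use (service ids and rule names are
only illustrative; the last scenario uses two rules sharing one service, here
vm:101), the rules_config files included below look roughly like this:

    colocation: some-vms-must-be-apart
        services vm:101,vm:102
        affinity separate
        strict 1

    colocation: other-vms-must-be-apart
        services vm:101,vm:103
        affinity separate
        strict 1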

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .../test-colocation-strict-separate1/README   |  13 +++
 .../test-colocation-strict-separate1/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  60 ++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-strict-separate2/README   |  15 +++
 .../test-colocation-strict-separate2/cmdlist  |   4 +
 .../hardware_status                           |   7 ++
 .../log.expect                                |  90 ++++++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |  10 ++
 .../test-colocation-strict-separate3/README   |  16 +++
 .../test-colocation-strict-separate3/cmdlist  |   4 +
 .../hardware_status                           |   7 ++
 .../log.expect                                | 110 ++++++++++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |  10 ++
 .../test-colocation-strict-separate4/README   |  17 +++
 .../test-colocation-strict-separate4/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  69 +++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |   4 +
 .../service_config                            |   6 +
 .../test-colocation-strict-separate5/README   |  11 ++
 .../test-colocation-strict-separate5/cmdlist  |   4 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  56 +++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |   9 ++
 .../service_config                            |   5 +
 35 files changed, 573 insertions(+)
 create mode 100644 src/test/test-colocation-strict-separate1/README
 create mode 100644 src/test/test-colocation-strict-separate1/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate1/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate1/log.expect
 create mode 100644 src/test/test-colocation-strict-separate1/manager_status
 create mode 100644 src/test/test-colocation-strict-separate1/rules_config
 create mode 100644 src/test/test-colocation-strict-separate1/service_config
 create mode 100644 src/test/test-colocation-strict-separate2/README
 create mode 100644 src/test/test-colocation-strict-separate2/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate2/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate2/log.expect
 create mode 100644 src/test/test-colocation-strict-separate2/manager_status
 create mode 100644 src/test/test-colocation-strict-separate2/rules_config
 create mode 100644 src/test/test-colocation-strict-separate2/service_config
 create mode 100644 src/test/test-colocation-strict-separate3/README
 create mode 100644 src/test/test-colocation-strict-separate3/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate3/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate3/log.expect
 create mode 100644 src/test/test-colocation-strict-separate3/manager_status
 create mode 100644 src/test/test-colocation-strict-separate3/rules_config
 create mode 100644 src/test/test-colocation-strict-separate3/service_config
 create mode 100644 src/test/test-colocation-strict-separate4/README
 create mode 100644 src/test/test-colocation-strict-separate4/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate4/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate4/log.expect
 create mode 100644 src/test/test-colocation-strict-separate4/manager_status
 create mode 100644 src/test/test-colocation-strict-separate4/rules_config
 create mode 100644 src/test/test-colocation-strict-separate4/service_config
 create mode 100644 src/test/test-colocation-strict-separate5/README
 create mode 100644 src/test/test-colocation-strict-separate5/cmdlist
 create mode 100644 src/test/test-colocation-strict-separate5/hardware_status
 create mode 100644 src/test/test-colocation-strict-separate5/log.expect
 create mode 100644 src/test/test-colocation-strict-separate5/manager_status
 create mode 100644 src/test/test-colocation-strict-separate5/rules_config
 create mode 100644 src/test/test-colocation-strict-separate5/service_config

diff --git a/src/test/test-colocation-strict-separate1/README b/src/test/test-colocation-strict-separate1/README
new file mode 100644
index 0000000..5a03d99
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/README
@@ -0,0 +1,13 @@
+Test whether a strict negative colocation rule among two services makes one of
+the services be recovered to a different node than the node of the other
+service in case of a failover of its previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102 must be kept separate
+- vm:101 and vm:102 are currently running on node2 and node3 respectively
+- node1 has a higher service count than node2 to test the colocation rule is
+  applied even though the scheduler would prefer the less utilized node
+
+Therefore, the expected outcome is:
+- As node3 fails, vm:102 is migrated to node1; even though the utilization of
+  node1 is high already, the services must be kept separate
diff --git a/src/test/test-colocation-strict-separate1/cmdlist b/src/test/test-colocation-strict-separate1/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-strict-separate1/hardware_status b/src/test/test-colocation-strict-separate1/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-separate1/log.expect b/src/test/test-colocation-strict-separate1/log.expect
new file mode 100644
index 0000000..475db39
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/log.expect
@@ -0,0 +1,60 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node2'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:101
+info     23    node2/lrm: service status vm:101 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service vm:102
+info    241    node1/lrm: service status vm:102 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-separate1/manager_status b/src/test/test-colocation-strict-separate1/manager_status
new file mode 100644
index 0000000..0967ef4
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-colocation-strict-separate1/rules_config b/src/test/test-colocation-strict-separate1/rules_config
new file mode 100644
index 0000000..21c5608
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/rules_config
@@ -0,0 +1,4 @@
+colocation: lonely-must-vms-be
+    services vm:101,vm:102
+    affinity separate
+    strict 1
diff --git a/src/test/test-colocation-strict-separate1/service_config b/src/test/test-colocation-strict-separate1/service_config
new file mode 100644
index 0000000..6582e8c
--- /dev/null
+++ b/src/test/test-colocation-strict-separate1/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node2", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-colocation-strict-separate2/README b/src/test/test-colocation-strict-separate2/README
new file mode 100644
index 0000000..f494d2b
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/README
@@ -0,0 +1,15 @@
+Test whether a strict negative colocation rule among three services makes one
+of the services migrate to a different node than the other services in case of
+a failover of the service's previously assigned node.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are on node3, node4, and node5 respectively
+- node1 and node2 each have a higher service count than node3, node4 and
+  node5 to test the rule is applied even though the scheduler would prefer
+  the less utilized nodes node3, node4, or node5
+
+Therefore, the expected outcome is:
+- As node5 fails, vm:103 is migrated to node2; even though the utilization of
+  node2 is high already, the services must be kept separate; node2 is chosen
+  since node1 has one more service running on it
diff --git a/src/test/test-colocation-strict-separate2/cmdlist b/src/test/test-colocation-strict-separate2/cmdlist
new file mode 100644
index 0000000..89d09c9
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ],
+    [ "network node5 off" ]
+]
diff --git a/src/test/test-colocation-strict-separate2/hardware_status b/src/test/test-colocation-strict-separate2/hardware_status
new file mode 100644
index 0000000..7b8e961
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/hardware_status
@@ -0,0 +1,7 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" },
+  "node4": { "power": "off", "network": "off" },
+  "node5": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-separate2/log.expect b/src/test/test-colocation-strict-separate2/log.expect
new file mode 100644
index 0000000..858d3c9
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/log.expect
@@ -0,0 +1,90 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node4 on
+info     20    node4/crm: status change startup => wait_for_quorum
+info     20    node4/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node5 on
+info     20    node5/crm: status change startup => wait_for_quorum
+info     20    node5/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node4'
+info     20    node1/crm: adding new service 'vm:103' on node 'node5'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: adding new service 'vm:105' on node 'node1'
+info     20    node1/crm: adding new service 'vm:106' on node 'node1'
+info     20    node1/crm: adding new service 'vm:107' on node 'node2'
+info     20    node1/crm: adding new service 'vm:108' on node 'node2'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node4)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node5)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:107': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:108': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     21    node1/lrm: starting service vm:105
+info     21    node1/lrm: service status vm:105 started
+info     21    node1/lrm: starting service vm:106
+info     21    node1/lrm: service status vm:106 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:107
+info     23    node2/lrm: service status vm:107 started
+info     23    node2/lrm: starting service vm:108
+info     23    node2/lrm: service status vm:108 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     26    node4/crm: status change wait_for_quorum => slave
+info     27    node4/lrm: got lock 'ha_agent_node4_lock'
+info     27    node4/lrm: status change wait_for_agent_lock => active
+info     27    node4/lrm: starting service vm:102
+info     27    node4/lrm: service status vm:102 started
+info     28    node5/crm: status change wait_for_quorum => slave
+info     29    node5/lrm: got lock 'ha_agent_node5_lock'
+info     29    node5/lrm: status change wait_for_agent_lock => active
+info     29    node5/lrm: starting service vm:103
+info     29    node5/lrm: service status vm:103 started
+info    120      cmdlist: execute network node5 off
+info    120    node1/crm: node 'node5': state changed from 'online' => 'unknown'
+info    128    node5/crm: status change slave => wait_for_quorum
+info    129    node5/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node5': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node5'
+info    170     watchdog: execute power node5 off
+info    169    node5/crm: killed by poweroff
+info    170    node5/lrm: killed by poweroff
+info    170     hardware: server 'node5' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node5_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node5'
+info    240    node1/crm: node 'node5': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5'
+info    240    node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:103' from fenced node 'node5' to node 'node2'
+info    240    node1/crm: service 'vm:103': state changed from 'recovery' to 'started'  (node = node2)
+info    243    node2/lrm: starting service vm:103
+info    243    node2/lrm: service status vm:103 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-separate2/manager_status b/src/test/test-colocation-strict-separate2/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-separate2/rules_config b/src/test/test-colocation-strict-separate2/rules_config
new file mode 100644
index 0000000..4167bab
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/rules_config
@@ -0,0 +1,4 @@
+colocation: lonely-must-vms-be
+    services vm:101,vm:102,vm:103
+    affinity separate
+    strict 1
diff --git a/src/test/test-colocation-strict-separate2/service_config b/src/test/test-colocation-strict-separate2/service_config
new file mode 100644
index 0000000..2c27816
--- /dev/null
+++ b/src/test/test-colocation-strict-separate2/service_config
@@ -0,0 +1,10 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node4", "state": "started" },
+    "vm:103": { "node": "node5", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" },
+    "vm:105": { "node": "node1", "state": "started" },
+    "vm:106": { "node": "node1", "state": "started" },
+    "vm:107": { "node": "node2", "state": "started" },
+    "vm:108": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-colocation-strict-separate3/README b/src/test/test-colocation-strict-separate3/README
new file mode 100644
index 0000000..44d88ef
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/README
@@ -0,0 +1,16 @@
+Test whether a strict negative colocation rule among three services makes two
+of the services be recovered to two different nodes, which also differ from the
+node of the third service, in case of a failover of their two assigned nodes.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept separate
+- vm:101, vm:102, and vm:103 are respectively on node3, node4, and node5
+- node1 and node2 both have higher service counts than node3, node4 and node5
+  to test the colocation rule is enforced even though the scheduler would
+  prefer the less utilized nodes node3, node4, and node5
+
+Therefore, the expected outcome is:
+- As node4 and node5 fail, vm:102 and vm:103 are migrated to node2 and node1
+  respectively; even though the utilization of node1 and node2 is already
+  high, the services must be kept separate; node2 is chosen first since
+  node1 has one more service running on it
diff --git a/src/test/test-colocation-strict-separate3/cmdlist b/src/test/test-colocation-strict-separate3/cmdlist
new file mode 100644
index 0000000..1934596
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ],
+    [ "network node4 off", "network node5 off" ]
+]
diff --git a/src/test/test-colocation-strict-separate3/hardware_status b/src/test/test-colocation-strict-separate3/hardware_status
new file mode 100644
index 0000000..7b8e961
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/hardware_status
@@ -0,0 +1,7 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" },
+  "node4": { "power": "off", "network": "off" },
+  "node5": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-separate3/log.expect b/src/test/test-colocation-strict-separate3/log.expect
new file mode 100644
index 0000000..4acdcec
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/log.expect
@@ -0,0 +1,110 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node4 on
+info     20    node4/crm: status change startup => wait_for_quorum
+info     20    node4/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node5 on
+info     20    node5/crm: status change startup => wait_for_quorum
+info     20    node5/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node4'
+info     20    node1/crm: adding new service 'vm:103' on node 'node5'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: adding new service 'vm:105' on node 'node1'
+info     20    node1/crm: adding new service 'vm:106' on node 'node1'
+info     20    node1/crm: adding new service 'vm:107' on node 'node2'
+info     20    node1/crm: adding new service 'vm:108' on node 'node2'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node4)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node5)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:107': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:108': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     21    node1/lrm: starting service vm:105
+info     21    node1/lrm: service status vm:105 started
+info     21    node1/lrm: starting service vm:106
+info     21    node1/lrm: service status vm:106 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:107
+info     23    node2/lrm: service status vm:107 started
+info     23    node2/lrm: starting service vm:108
+info     23    node2/lrm: service status vm:108 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     26    node4/crm: status change wait_for_quorum => slave
+info     27    node4/lrm: got lock 'ha_agent_node4_lock'
+info     27    node4/lrm: status change wait_for_agent_lock => active
+info     27    node4/lrm: starting service vm:102
+info     27    node4/lrm: service status vm:102 started
+info     28    node5/crm: status change wait_for_quorum => slave
+info     29    node5/lrm: got lock 'ha_agent_node5_lock'
+info     29    node5/lrm: status change wait_for_agent_lock => active
+info     29    node5/lrm: starting service vm:103
+info     29    node5/lrm: service status vm:103 started
+info    120      cmdlist: execute network node4 off
+info    120      cmdlist: execute network node5 off
+info    120    node1/crm: node 'node4': state changed from 'online' => 'unknown'
+info    120    node1/crm: node 'node5': state changed from 'online' => 'unknown'
+info    126    node4/crm: status change slave => wait_for_quorum
+info    127    node4/lrm: status change active => lost_agent_lock
+info    128    node5/crm: status change slave => wait_for_quorum
+info    129    node5/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node4': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node4'
+info    160    node1/crm: node 'node5': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node5'
+info    168     watchdog: execute power node4 off
+info    167    node4/crm: killed by poweroff
+info    168    node4/lrm: killed by poweroff
+info    168     hardware: server 'node4' stopped by poweroff (watchdog)
+info    170     watchdog: execute power node5 off
+info    169    node5/crm: killed by poweroff
+info    170    node5/lrm: killed by poweroff
+info    170     hardware: server 'node5' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node4_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node4'
+info    240    node1/crm: node 'node4': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node4'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: got lock 'ha_agent_node5_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node5'
+info    240    node1/crm: node 'node5': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5'
+info    240    node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node4' to node 'node2'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:103' from fenced node 'node5' to node 'node1'
+info    240    node1/crm: service 'vm:103': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service vm:103
+info    241    node1/lrm: service status vm:103 started
+info    243    node2/lrm: starting service vm:102
+info    243    node2/lrm: service status vm:102 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-separate3/manager_status b/src/test/test-colocation-strict-separate3/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-separate3/rules_config b/src/test/test-colocation-strict-separate3/rules_config
new file mode 100644
index 0000000..4167bab
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/rules_config
@@ -0,0 +1,4 @@
+colocation: lonely-must-vms-be
+    services vm:101,vm:102,vm:103
+    affinity separate
+    strict 1
diff --git a/src/test/test-colocation-strict-separate3/service_config b/src/test/test-colocation-strict-separate3/service_config
new file mode 100644
index 0000000..2c27816
--- /dev/null
+++ b/src/test/test-colocation-strict-separate3/service_config
@@ -0,0 +1,10 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node4", "state": "started" },
+    "vm:103": { "node": "node5", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" },
+    "vm:105": { "node": "node1", "state": "started" },
+    "vm:106": { "node": "node1", "state": "started" },
+    "vm:107": { "node": "node2", "state": "started" },
+    "vm:108": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-colocation-strict-separate4/README b/src/test/test-colocation-strict-separate4/README
new file mode 100644
index 0000000..31f127d
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/README
@@ -0,0 +1,17 @@
+Test whether a strict negative colocation rule among two services makes one of
+the services migrate to a different recovery node than the other service in
+case of a failover of the service's previously assigned node. As the service
+fails to start on the recovery node (e.g. due to insufficient resources), the
+failing service is kept on the recovery node.
+
+The test scenario is:
+- vm:101 and fa:120001 must be kept separate
+- vm:101 and fa:120001 are on node2 and node3 respectively
+- fa:120001 will fail to start on node1
+- node1 has a higher service count than node2 to test that the colocation rule
+  is applied even though the scheduler would prefer the less utilized node
+
+Therefore, the expected outcome is:
+- As node3 fails, fa:120001 is migrated to node1
+- fa:120001 will stay in recovery, since it cannot be started on node1, but
+  cannot be relocated to another one either due to the strict colocation rule
diff --git a/src/test/test-colocation-strict-separate4/cmdlist b/src/test/test-colocation-strict-separate4/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-strict-separate4/hardware_status b/src/test/test-colocation-strict-separate4/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-separate4/log.expect b/src/test/test-colocation-strict-separate4/log.expect
new file mode 100644
index 0000000..f772ea8
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/log.expect
@@ -0,0 +1,69 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'fa:120001' on node 'node3'
+info     20    node1/crm: adding new service 'vm:101' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: service 'fa:120001': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:101
+info     23    node2/lrm: service status vm:101 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service fa:120001
+info     25    node3/lrm: service status fa:120001 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'fa:120001': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'fa:120001': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'fa:120001' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'fa:120001': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service fa:120001
+warn    241    node1/lrm: unable to start service fa:120001
+warn    241    node1/lrm: restart policy: retry number 1 for service 'fa:120001'
+info    261    node1/lrm: starting service fa:120001
+warn    261    node1/lrm: unable to start service fa:120001
+err     261    node1/lrm: unable to start service fa:120001 on local node after 1 retries
+warn    280    node1/crm: starting service fa:120001 on node 'node1' failed, relocating service.
+warn    280    node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:120001', retry start on current node. Tried nodes: node1
+info    281    node1/lrm: starting service fa:120001
+info    281    node1/lrm: service status fa:120001 started
+info    300    node1/crm: relocation policy successful for 'fa:120001' on node 'node1', failed nodes: node1
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-separate4/manager_status b/src/test/test-colocation-strict-separate4/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-separate4/rules_config b/src/test/test-colocation-strict-separate4/rules_config
new file mode 100644
index 0000000..3db0056
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/rules_config
@@ -0,0 +1,4 @@
+colocation: lonely-must-vms-be
+    services vm:101,fa:120001
+    affinity separate
+    strict 1
diff --git a/src/test/test-colocation-strict-separate4/service_config b/src/test/test-colocation-strict-separate4/service_config
new file mode 100644
index 0000000..f53c2bc
--- /dev/null
+++ b/src/test/test-colocation-strict-separate4/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node2", "state": "started" },
+    "fa:120001": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-colocation-strict-separate5/README b/src/test/test-colocation-strict-separate5/README
new file mode 100644
index 0000000..4cdcbf5
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/README
@@ -0,0 +1,11 @@
+Test whether two pairwise strict negative colocation rules, i.e. where one
+service is in two separate non-colocation relationships with two other
+services, make one of the outer services migrate to the same node as the other
+outer service in case of a failover of its previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102, and vm:101 and vm:103 must each be kept separate
+- vm:101, vm:102, and vm:103 are respectively on node1, node2, and node3
+
+Therefore, the expected outcome is:
+- As node3 fails, vm:103 is migrated to node2 - the same as vm:102
diff --git a/src/test/test-colocation-strict-separate5/cmdlist b/src/test/test-colocation-strict-separate5/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-strict-separate5/hardware_status b/src/test/test-colocation-strict-separate5/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-separate5/log.expect b/src/test/test-colocation-strict-separate5/log.expect
new file mode 100644
index 0000000..16156ad
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/log.expect
@@ -0,0 +1,56 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:102
+info     23    node2/lrm: service status vm:102 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:103': state changed from 'recovery' to 'started'  (node = node2)
+info    243    node2/lrm: starting service vm:103
+info    243    node2/lrm: service status vm:103 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-separate5/manager_status b/src/test/test-colocation-strict-separate5/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-separate5/rules_config b/src/test/test-colocation-strict-separate5/rules_config
new file mode 100644
index 0000000..f72fc66
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/rules_config
@@ -0,0 +1,9 @@
+colocation: lonely-must-some-vms-be1
+    services vm:101,vm:102
+    affinity separate
+    strict 1
+
+colocation: lonely-must-some-vms-be2
+    services vm:101,vm:103
+    affinity separate
+    strict 1
diff --git a/src/test/test-colocation-strict-separate5/service_config b/src/test/test-colocation-strict-separate5/service_config
new file mode 100644
index 0000000..4b26f6b
--- /dev/null
+++ b/src/test/test-colocation-strict-separate5/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node2", "state": "started" },
+    "vm:103": { "node": "node3", "state": "started" }
+}
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 12/15] test: ha tester: add test cases for strict positive colocation rules
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (11 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 11/15] test: ha tester: add test cases for strict negative colocation rules Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 13/15] test: ha tester: add test cases for loose " Daniel Kral
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add test cases for strict positive colocation rules, i.e. rules where
services must be kept together on the same node. These verify the behavior
of services in strict positive colocation rules in case of a failover of
their assigned node in the following scenarios:

- 2 pos. colocated services in a 3 node cluster; 1 node failing
- 3 pos. colocated services in a 3 node cluster; 1 node failing
- 3 pos. colocated services in a 3 node cluster; 1 node failing, but the
  recovery node cannot start one of the services

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
The first test case can probably be dropped, since the second test case
shows the exact same behavior, just with a third service added.
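
For reference, the rule format exercised by these test cases looks as
follows; this matches the rules_config files added below (here the rule
used by the second scenario):

    colocation: vms-must-stick-together
        services vm:101,vm:102,vm:103
        affinity together
        strict 1

All services listed in such a rule are expected to be recovered to the
same node, which the log.expect files below verify.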

 .../test-colocation-strict-together1/README   | 11 +++
 .../test-colocation-strict-together1/cmdlist  |  4 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 66 ++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  6 ++
 .../test-colocation-strict-together2/README   | 11 +++
 .../test-colocation-strict-together2/cmdlist  |  4 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 80 +++++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  8 ++
 .../test-colocation-strict-together3/README   | 17 ++++
 .../test-colocation-strict-together3/cmdlist  |  4 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 89 +++++++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  8 ++
 21 files changed, 338 insertions(+)
 create mode 100644 src/test/test-colocation-strict-together1/README
 create mode 100644 src/test/test-colocation-strict-together1/cmdlist
 create mode 100644 src/test/test-colocation-strict-together1/hardware_status
 create mode 100644 src/test/test-colocation-strict-together1/log.expect
 create mode 100644 src/test/test-colocation-strict-together1/manager_status
 create mode 100644 src/test/test-colocation-strict-together1/rules_config
 create mode 100644 src/test/test-colocation-strict-together1/service_config
 create mode 100644 src/test/test-colocation-strict-together2/README
 create mode 100644 src/test/test-colocation-strict-together2/cmdlist
 create mode 100644 src/test/test-colocation-strict-together2/hardware_status
 create mode 100644 src/test/test-colocation-strict-together2/log.expect
 create mode 100644 src/test/test-colocation-strict-together2/manager_status
 create mode 100644 src/test/test-colocation-strict-together2/rules_config
 create mode 100644 src/test/test-colocation-strict-together2/service_config
 create mode 100644 src/test/test-colocation-strict-together3/README
 create mode 100644 src/test/test-colocation-strict-together3/cmdlist
 create mode 100644 src/test/test-colocation-strict-together3/hardware_status
 create mode 100644 src/test/test-colocation-strict-together3/log.expect
 create mode 100644 src/test/test-colocation-strict-together3/manager_status
 create mode 100644 src/test/test-colocation-strict-together3/rules_config
 create mode 100644 src/test/test-colocation-strict-together3/service_config

diff --git a/src/test/test-colocation-strict-together1/README b/src/test/test-colocation-strict-together1/README
new file mode 100644
index 0000000..ab8a7d5
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/README
@@ -0,0 +1,11 @@
+Test whether a strict positive colocation rule makes two services migrate to
+the same recovery node in case of a failover of their previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102 must be kept together
+- vm:101 and vm:102 are both currently running on node3
+- node1 and node2 have the same service count to test that the rule is applied
+  even though they would usually be balanced between both remaining nodes
+
+Therefore, the expected outcome is:
+- As node3 fails, both services are migrated to node1
diff --git a/src/test/test-colocation-strict-together1/cmdlist b/src/test/test-colocation-strict-together1/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-strict-together1/hardware_status b/src/test/test-colocation-strict-together1/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-together1/log.expect b/src/test/test-colocation-strict-together1/log.expect
new file mode 100644
index 0000000..7d43314
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/log.expect
@@ -0,0 +1,66 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node2'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:104
+info     23    node2/lrm: service status vm:104 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'vm:101': state changed from 'recovery' to 'started'  (node = node1)
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service vm:101
+info    241    node1/lrm: service status vm:101 started
+info    241    node1/lrm: starting service vm:102
+info    241    node1/lrm: service status vm:102 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-together1/manager_status b/src/test/test-colocation-strict-together1/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-together1/rules_config b/src/test/test-colocation-strict-together1/rules_config
new file mode 100644
index 0000000..e6bd30b
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/rules_config
@@ -0,0 +1,4 @@
+colocation: vms-must-stick-together
+	services vm:101,vm:102
+	affinity together
+	strict 1
diff --git a/src/test/test-colocation-strict-together1/service_config b/src/test/test-colocation-strict-together1/service_config
new file mode 100644
index 0000000..9fb091d
--- /dev/null
+++ b/src/test/test-colocation-strict-together1/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-colocation-strict-together2/README b/src/test/test-colocation-strict-together2/README
new file mode 100644
index 0000000..c1abf68
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/README
@@ -0,0 +1,11 @@
+Test whether a strict positive colocation rule makes three services migrate to
+the same recovery node in case of a failover of their previously assigned node.
+
+The test scenario is:
+- vm:101, vm:102, and vm:103 must be kept together
+- vm:101, vm:102, and vm:103 are all currently running on node3
+- node1 has a higher service count than node2 to test that the rule is applied
+  even though they would usually be balanced between both remaining nodes
+
+Therefore, the expected outcome is:
+- As node3 fails, all services are migrated to node2
diff --git a/src/test/test-colocation-strict-together2/cmdlist b/src/test/test-colocation-strict-together2/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-strict-together2/hardware_status b/src/test/test-colocation-strict-together2/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-together2/log.expect b/src/test/test-colocation-strict-together2/log.expect
new file mode 100644
index 0000000..78f4d66
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/log.expect
@@ -0,0 +1,80 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: adding new service 'vm:105' on node 'node1'
+info     20    node1/crm: adding new service 'vm:106' on node 'node2'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     21    node1/lrm: starting service vm:105
+info     21    node1/lrm: service status vm:105 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:106
+info     23    node2/lrm: service status vm:106 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:101': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:103': state changed from 'recovery' to 'started'  (node = node2)
+info    243    node2/lrm: starting service vm:101
+info    243    node2/lrm: service status vm:101 started
+info    243    node2/lrm: starting service vm:102
+info    243    node2/lrm: service status vm:102 started
+info    243    node2/lrm: starting service vm:103
+info    243    node2/lrm: service status vm:103 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-together2/manager_status b/src/test/test-colocation-strict-together2/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-together2/rules_config b/src/test/test-colocation-strict-together2/rules_config
new file mode 100644
index 0000000..904dc1f
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/rules_config
@@ -0,0 +1,4 @@
+colocation: vms-must-stick-together
+	services vm:101,vm:102,vm:103
+	affinity together
+	strict 1
diff --git a/src/test/test-colocation-strict-together2/service_config b/src/test/test-colocation-strict-together2/service_config
new file mode 100644
index 0000000..fd4a87e
--- /dev/null
+++ b/src/test/test-colocation-strict-together2/service_config
@@ -0,0 +1,8 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node3", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" },
+    "vm:105": { "node": "node1", "state": "started" },
+    "vm:106": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-colocation-strict-together3/README b/src/test/test-colocation-strict-together3/README
new file mode 100644
index 0000000..5332696
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/README
@@ -0,0 +1,17 @@
+Test whether a strict positive colocation rule makes three services migrate to
+the same recovery node in case of a failover of their previously assigned node.
+If one of those fails to start on the recovery node (e.g. due to insufficient
+resources), the failing service will be kept on the recovery node.
+
+The test scenario is:
+- vm:101, vm:102, and fa:120002 must be kept together
+- vm:101, vm:102, and fa:120002 are all currently running on node3
+- fa:120002 will fail to start on node2
+- node1 has a higher service count than node2 to test that the rule is applied
+  even though they would usually be balanced between both remaining nodes
+
+Therefore, the expected outcome is:
+- As node3 fails, all services are migrated to node2
+- Two of those services will start successfully, but fa:120002 will stay in
+  recovery, since it cannot be started on this node, but cannot be relocated to
+  another one either due to the strict colocation rule
diff --git a/src/test/test-colocation-strict-together3/cmdlist b/src/test/test-colocation-strict-together3/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-strict-together3/hardware_status b/src/test/test-colocation-strict-together3/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-together3/log.expect b/src/test/test-colocation-strict-together3/log.expect
new file mode 100644
index 0000000..4a54cb3
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/log.expect
@@ -0,0 +1,89 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'fa:120002' on node 'node3'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: adding new service 'vm:105' on node 'node1'
+info     20    node1/crm: adding new service 'vm:106' on node 'node2'
+info     20    node1/crm: service 'fa:120002': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     21    node1/lrm: starting service vm:105
+info     21    node1/lrm: service status vm:105 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:106
+info     23    node2/lrm: service status vm:106 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service fa:120002
+info     25    node3/lrm: service status fa:120002 started
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'fa:120002': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'fa:120002': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'fa:120002' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'fa:120002': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:101': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node2)
+info    243    node2/lrm: starting service fa:120002
+warn    243    node2/lrm: unable to start service fa:120002
+warn    243    node2/lrm: restart policy: retry number 1 for service 'fa:120002'
+info    243    node2/lrm: starting service vm:101
+info    243    node2/lrm: service status vm:101 started
+info    243    node2/lrm: starting service vm:102
+info    243    node2/lrm: service status vm:102 started
+info    263    node2/lrm: starting service fa:120002
+warn    263    node2/lrm: unable to start service fa:120002
+err     263    node2/lrm: unable to start service fa:120002 on local node after 1 retries
+warn    280    node1/crm: starting service fa:120002 on node 'node2' failed, relocating service.
+warn    280    node1/crm: Start Error Recovery: Tried all available nodes for service 'fa:120002', retry start on current node. Tried nodes: node2
+info    283    node2/lrm: starting service fa:120002
+info    283    node2/lrm: service status fa:120002 started
+info    300    node1/crm: relocation policy successful for 'fa:120002' on node 'node2', failed nodes: node2
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-together3/manager_status b/src/test/test-colocation-strict-together3/manager_status
new file mode 100644
index 0000000..0967ef4
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-colocation-strict-together3/rules_config b/src/test/test-colocation-strict-together3/rules_config
new file mode 100644
index 0000000..5feafb5
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/rules_config
@@ -0,0 +1,4 @@
+colocation: vms-must-stick-together
+	services vm:101,vm:102,fa:120002
+	affinity together
+	strict 1
diff --git a/src/test/test-colocation-strict-together3/service_config b/src/test/test-colocation-strict-together3/service_config
new file mode 100644
index 0000000..3ce5f27
--- /dev/null
+++ b/src/test/test-colocation-strict-together3/service_config
@@ -0,0 +1,8 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "fa:120002": { "node": "node3", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" },
+    "vm:105": { "node": "node1", "state": "started" },
+    "vm:106": { "node": "node2", "state": "started" }
+}
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 13/15] test: ha tester: add test cases for loose colocation rules
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (12 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 12/15] test: ha tester: add test cases for strict positive " Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 14/15] test: ha tester: add test cases in more complex scenarios Daniel Kral
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add test cases for loose positive and negative colocation rules, i.e.
rules where services should be kept together on the same node or kept on
separate nodes. These are copies of their strict counterpart tests, but
they verify the behavior when the colocation rule cannot be met, i.e. the
rule is not adhered to. The test scenarios are:

- 2 neg. colocated services in a 3 node cluster; 1 node failing
- 2 neg. colocated services in a 3 node cluster; 1 node failing, but the
  recovery node cannot start the service
- 2 pos. colocated services in a 3 node cluster; 1 node failing
- 3 pos. colocated services in a 3 node cluster; 1 node failing, but the
  recovery node cannot start one of the services

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
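As a sketch, a loose colocation rule is assumed to use the same
rules_config format as its strict counterpart, just with the strictness
flag cleared (the rule name below is illustrative and "strict 0" is an
assumption made here by analogy with the strict test cases; the
authoritative files are part of the diff below):

    colocation: vms-should-stay-apart
        services vm:101,vm:102
        affinity separate
        strict 0
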
 .../test-colocation-loose-separate1/README    | 13 +++
 .../test-colocation-loose-separate1/cmdlist   |  4 +
 .../hardware_status                           |  5 +
 .../log.expect                                | 60 ++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  6 ++
 .../test-colocation-loose-separate4/README    | 17 ++++
 .../test-colocation-loose-separate4/cmdlist   |  4 +
 .../hardware_status                           |  5 +
 .../log.expect                                | 73 +++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  6 ++
 .../test-colocation-loose-together1/README    | 11 +++
 .../test-colocation-loose-together1/cmdlist   |  4 +
 .../hardware_status                           |  5 +
 .../log.expect                                | 66 +++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  6 ++
 .../test-colocation-loose-together3/README    | 16 ++++
 .../test-colocation-loose-together3/cmdlist   |  4 +
 .../hardware_status                           |  5 +
 .../log.expect                                | 93 +++++++++++++++++++
 .../manager_status                            |  1 +
 .../rules_config                              |  4 +
 .../service_config                            |  8 ++
 28 files changed, 431 insertions(+)
 create mode 100644 src/test/test-colocation-loose-separate1/README
 create mode 100644 src/test/test-colocation-loose-separate1/cmdlist
 create mode 100644 src/test/test-colocation-loose-separate1/hardware_status
 create mode 100644 src/test/test-colocation-loose-separate1/log.expect
 create mode 100644 src/test/test-colocation-loose-separate1/manager_status
 create mode 100644 src/test/test-colocation-loose-separate1/rules_config
 create mode 100644 src/test/test-colocation-loose-separate1/service_config
 create mode 100644 src/test/test-colocation-loose-separate4/README
 create mode 100644 src/test/test-colocation-loose-separate4/cmdlist
 create mode 100644 src/test/test-colocation-loose-separate4/hardware_status
 create mode 100644 src/test/test-colocation-loose-separate4/log.expect
 create mode 100644 src/test/test-colocation-loose-separate4/manager_status
 create mode 100644 src/test/test-colocation-loose-separate4/rules_config
 create mode 100644 src/test/test-colocation-loose-separate4/service_config
 create mode 100644 src/test/test-colocation-loose-together1/README
 create mode 100644 src/test/test-colocation-loose-together1/cmdlist
 create mode 100644 src/test/test-colocation-loose-together1/hardware_status
 create mode 100644 src/test/test-colocation-loose-together1/log.expect
 create mode 100644 src/test/test-colocation-loose-together1/manager_status
 create mode 100644 src/test/test-colocation-loose-together1/rules_config
 create mode 100644 src/test/test-colocation-loose-together1/service_config
 create mode 100644 src/test/test-colocation-loose-together3/README
 create mode 100644 src/test/test-colocation-loose-together3/cmdlist
 create mode 100644 src/test/test-colocation-loose-together3/hardware_status
 create mode 100644 src/test/test-colocation-loose-together3/log.expect
 create mode 100644 src/test/test-colocation-loose-together3/manager_status
 create mode 100644 src/test/test-colocation-loose-together3/rules_config
 create mode 100644 src/test/test-colocation-loose-together3/service_config

diff --git a/src/test/test-colocation-loose-separate1/README b/src/test/test-colocation-loose-separate1/README
new file mode 100644
index 0000000..ac7c395
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/README
@@ -0,0 +1,13 @@
+Test whether a loose negative colocation rule among two services makes one of
+the services migrate to a different recovery node than the other in case of a
+failover of their previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102 should be kept separate
+- vm:101 and vm:102 are currently running on node2 and node3 respectively
+- node1 has a higher service count than node2 to test that the rule is applied
+  even though the scheduler would prefer the less utilized node
+
+Therefore, the expected outcome is:
+- As node3 fails, vm:102 is migrated to node1; even though the utilization of
+  node1 is high already, the services should be kept separate
diff --git a/src/test/test-colocation-loose-separate1/cmdlist b/src/test/test-colocation-loose-separate1/cmdlist
new file mode 100644
index 0000000..eee0e40
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on"],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-loose-separate1/hardware_status b/src/test/test-colocation-loose-separate1/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-loose-separate1/log.expect b/src/test/test-colocation-loose-separate1/log.expect
new file mode 100644
index 0000000..475db39
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/log.expect
@@ -0,0 +1,60 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node2'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:101
+info     23    node2/lrm: service status vm:101 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service vm:102
+info    241    node1/lrm: service status vm:102 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-loose-separate1/manager_status b/src/test/test-colocation-loose-separate1/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-loose-separate1/rules_config b/src/test/test-colocation-loose-separate1/rules_config
new file mode 100644
index 0000000..5227309
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/rules_config
@@ -0,0 +1,4 @@
+colocation: lonely-should-vms-be
+    services vm:101,vm:102
+    affinity separate
+    strict 0
diff --git a/src/test/test-colocation-loose-separate1/service_config b/src/test/test-colocation-loose-separate1/service_config
new file mode 100644
index 0000000..6582e8c
--- /dev/null
+++ b/src/test/test-colocation-loose-separate1/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node2", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-colocation-loose-separate4/README b/src/test/test-colocation-loose-separate4/README
new file mode 100644
index 0000000..5b68cde
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/README
@@ -0,0 +1,17 @@
+Test whether a loose negative colocation rule among two services makes one of
+the services migrate to a different recovery node than the other service in
+case of a failover of the service's previously assigned node. As the service
+fails to start on the recovery node (e.g. insufficient resources), it is
+relocated to another node, since the loose rule does not block this.
+
+The test scenario is:
+- vm:101 and fa:120001 should be kept separate
+- vm:101 and fa:120001 are on node2 and node3 respectively
+- fa:120001 will fail to start on node1
+- node1 has a higher service count than node2 to test that the colocation rule
+  is applied even though the scheduler would prefer the less utilized node
+
+Therefore, the expected outcome is:
+- As node3 fails, fa:120001 is migrated to node1
+- fa:120001 will be relocated to another node, since it couldn't start on its
+  initial recovery node
diff --git a/src/test/test-colocation-loose-separate4/cmdlist b/src/test/test-colocation-loose-separate4/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-loose-separate4/hardware_status b/src/test/test-colocation-loose-separate4/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-loose-separate4/log.expect b/src/test/test-colocation-loose-separate4/log.expect
new file mode 100644
index 0000000..bf70aca
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/log.expect
@@ -0,0 +1,73 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'fa:120001' on node 'node3'
+info     20    node1/crm: adding new service 'vm:101' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: service 'fa:120001': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:101
+info     23    node2/lrm: service status vm:101 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service fa:120001
+info     25    node3/lrm: service status fa:120001 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'fa:120001': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'fa:120001': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'fa:120001' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'fa:120001': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service fa:120001
+warn    241    node1/lrm: unable to start service fa:120001
+warn    241    node1/lrm: restart policy: retry number 1 for service 'fa:120001'
+info    261    node1/lrm: starting service fa:120001
+warn    261    node1/lrm: unable to start service fa:120001
+err     261    node1/lrm: unable to start service fa:120001 on local node after 1 retries
+warn    280    node1/crm: starting service fa:120001 on node 'node1' failed, relocating service.
+info    280    node1/crm: relocate service 'fa:120001' to node 'node2'
+info    280    node1/crm: service 'fa:120001': state changed from 'started' to 'relocate'  (node = node1, target = node2)
+info    281    node1/lrm: service fa:120001 - start relocate to node 'node2'
+info    281    node1/lrm: service fa:120001 - end relocate to node 'node2'
+info    300    node1/crm: service 'fa:120001': state changed from 'relocate' to 'started'  (node = node2)
+info    303    node2/lrm: starting service fa:120001
+info    303    node2/lrm: service status fa:120001 started
+info    320    node1/crm: relocation policy successful for 'fa:120001' on node 'node2', failed nodes: node1
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-loose-separate4/manager_status b/src/test/test-colocation-loose-separate4/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-loose-separate4/rules_config b/src/test/test-colocation-loose-separate4/rules_config
new file mode 100644
index 0000000..8a4b869
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/rules_config
@@ -0,0 +1,4 @@
+colocation: lonely-should-vms-be
+    services vm:101,fa:120001
+    affinity separate
+    strict 0
diff --git a/src/test/test-colocation-loose-separate4/service_config b/src/test/test-colocation-loose-separate4/service_config
new file mode 100644
index 0000000..f53c2bc
--- /dev/null
+++ b/src/test/test-colocation-loose-separate4/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node2", "state": "started" },
+    "fa:120001": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-colocation-loose-together1/README b/src/test/test-colocation-loose-together1/README
new file mode 100644
index 0000000..2f5aeec
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/README
@@ -0,0 +1,11 @@
+Test whether a loose positive colocation rule makes two services migrate to
+the same recovery node in case of a failover of their previously assigned node.
+
+The test scenario is:
+- vm:101 and vm:102 should be kept together
+- vm:101 and vm:102 are both currently running on node3
+- node1 and node2 have the same service count to test that the rule is applied
+  even though they would usually be balanced between both remaining nodes
+
+Therefore, the expected outcome is:
+- As node3 fails, both services are migrated to node1
diff --git a/src/test/test-colocation-loose-together1/cmdlist b/src/test/test-colocation-loose-together1/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-loose-together1/hardware_status b/src/test/test-colocation-loose-together1/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-loose-together1/log.expect b/src/test/test-colocation-loose-together1/log.expect
new file mode 100644
index 0000000..7d43314
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/log.expect
@@ -0,0 +1,66 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node2'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:104
+info     23    node2/lrm: service status vm:104 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'vm:101': state changed from 'recovery' to 'started'  (node = node1)
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node1)
+info    241    node1/lrm: starting service vm:101
+info    241    node1/lrm: service status vm:101 started
+info    241    node1/lrm: starting service vm:102
+info    241    node1/lrm: service status vm:102 started
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-loose-together1/manager_status b/src/test/test-colocation-loose-together1/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-loose-together1/rules_config b/src/test/test-colocation-loose-together1/rules_config
new file mode 100644
index 0000000..37f6aab
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/rules_config
@@ -0,0 +1,4 @@
+colocation: vms-might-stick-together
+    services vm:101,vm:102
+    affinity together
+    strict 0
diff --git a/src/test/test-colocation-loose-together1/service_config b/src/test/test-colocation-loose-together1/service_config
new file mode 100644
index 0000000..9fb091d
--- /dev/null
+++ b/src/test/test-colocation-loose-together1/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node2", "state": "started" }
+}
diff --git a/src/test/test-colocation-loose-together3/README b/src/test/test-colocation-loose-together3/README
new file mode 100644
index 0000000..c2aebcf
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/README
@@ -0,0 +1,16 @@
+Test whether a loose positive colocation rule makes three services migrate to
+the same recovery node in case of a failover of their previously assigned node.
+If one of those fails to start on the recovery node (e.g. insufficient
+resources), the failed service will be relocated to another node.
+
+The test scenario is:
+- vm:101, vm:102, and fa:120002 should be kept together
+- vm:101, vm:102, and fa:120002 are all currently running on node3
+- fa:120002 will fail to start on node2
+- node1 has a higher service count than node2 to test that the rule is applied
+  even though they would usually be balanced between both remaining nodes
+
+Therefore, the expected outcome is:
+- As node3 fails, all services are migrated to node2
+- Two of those services will start successfully, but fa:120002 will be
+  relocated to another node, since it couldn't start on the same recovery node
diff --git a/src/test/test-colocation-loose-together3/cmdlist b/src/test/test-colocation-loose-together3/cmdlist
new file mode 100644
index 0000000..c0a4daa
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-colocation-loose-together3/hardware_status b/src/test/test-colocation-loose-together3/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off" },
+  "node2": { "power": "off", "network": "off" },
+  "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-loose-together3/log.expect b/src/test/test-colocation-loose-together3/log.expect
new file mode 100644
index 0000000..6ca8053
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/log.expect
@@ -0,0 +1,93 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'fa:120002' on node 'node3'
+info     20    node1/crm: adding new service 'vm:101' on node 'node3'
+info     20    node1/crm: adding new service 'vm:102' on node 'node3'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: adding new service 'vm:105' on node 'node1'
+info     20    node1/crm: adding new service 'vm:106' on node 'node2'
+info     20    node1/crm: service 'fa:120002': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node2)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     21    node1/lrm: starting service vm:105
+info     21    node1/lrm: service status vm:105 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:106
+info     23    node2/lrm: service status vm:106 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service fa:120002
+info     25    node3/lrm: service status fa:120002 started
+info     25    node3/lrm: starting service vm:101
+info     25    node3/lrm: service status vm:101 started
+info     25    node3/lrm: starting service vm:102
+info     25    node3/lrm: service status vm:102 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'fa:120002': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:101': state changed from 'started' to 'fence'
+info    160    node1/crm: service 'vm:102': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'fa:120002': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info    240    node1/crm: service 'vm:102': state changed from 'fence' to 'recovery'
+info    240    node1/crm: recover service 'fa:120002' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'fa:120002': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:101' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:101': state changed from 'recovery' to 'started'  (node = node2)
+info    240    node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node2'
+info    240    node1/crm: service 'vm:102': state changed from 'recovery' to 'started'  (node = node2)
+info    243    node2/lrm: starting service fa:120002
+warn    243    node2/lrm: unable to start service fa:120002
+warn    243    node2/lrm: restart policy: retry number 1 for service 'fa:120002'
+info    243    node2/lrm: starting service vm:101
+info    243    node2/lrm: service status vm:101 started
+info    243    node2/lrm: starting service vm:102
+info    243    node2/lrm: service status vm:102 started
+info    263    node2/lrm: starting service fa:120002
+warn    263    node2/lrm: unable to start service fa:120002
+err     263    node2/lrm: unable to start service fa:120002 on local node after 1 retries
+warn    280    node1/crm: starting service fa:120002 on node 'node2' failed, relocating service.
+info    280    node1/crm: relocate service 'fa:120002' to node 'node1'
+info    280    node1/crm: service 'fa:120002': state changed from 'started' to 'relocate'  (node = node2, target = node1)
+info    283    node2/lrm: service fa:120002 - start relocate to node 'node1'
+info    283    node2/lrm: service fa:120002 - end relocate to node 'node1'
+info    300    node1/crm: service 'fa:120002': state changed from 'relocate' to 'started'  (node = node1)
+info    301    node1/lrm: starting service fa:120002
+info    301    node1/lrm: service status fa:120002 started
+info    320    node1/crm: relocation policy successful for 'fa:120002' on node 'node1', failed nodes: node2
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-colocation-loose-together3/manager_status b/src/test/test-colocation-loose-together3/manager_status
new file mode 100644
index 0000000..0967ef4
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-colocation-loose-together3/rules_config b/src/test/test-colocation-loose-together3/rules_config
new file mode 100644
index 0000000..b43c087
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/rules_config
@@ -0,0 +1,4 @@
+colocation: vms-might-stick-together
+	services vm:101,vm:102,fa:120002
+	affinity together
+	strict 0
diff --git a/src/test/test-colocation-loose-together3/service_config b/src/test/test-colocation-loose-together3/service_config
new file mode 100644
index 0000000..3ce5f27
--- /dev/null
+++ b/src/test/test-colocation-loose-together3/service_config
@@ -0,0 +1,8 @@
+{
+    "vm:101": { "node": "node3", "state": "started" },
+    "vm:102": { "node": "node3", "state": "started" },
+    "fa:120002": { "node": "node3", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" },
+    "vm:105": { "node": "node1", "state": "started" },
+    "vm:106": { "node": "node2", "state": "started" }
+}
-- 
2.39.5



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 14/15] test: ha tester: add test cases in more complex scenarios
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (13 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 13/15] test: ha tester: add test cases for loose " Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 15/15] test: add test cases for rules config Daniel Kral
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add test cases where colocation rules are used with the static
utilization scheduler and the rebalance on start option enabled (see the
datacenter.cfg excerpt after the list). These verify the behavior in the
following scenarios:

- 7 services with intertwined colocation rules in a 3 node cluster;
  1 node failing
- 3 neg. colocated services in a 3 node cluster, where the rules are
  stated in an intransitive form; 1 node failing
- 5 neg. colocated services in a 5 node cluster, where the rules are
  stated in an intransitive form; 2 nodes failing
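
All three test directories enable this via their datacenter.cfg (added
below for each of them), which selects the static scheduler and turns on
rebalancing on service start:

    {
        "crs": {
            "ha": "static",
            "ha-rebalance-on-start": 1
        }
    }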

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .../test-crs-static-rebalance-coloc1/README   |  26 +++
 .../test-crs-static-rebalance-coloc1/cmdlist  |   4 +
 .../datacenter.cfg                            |   6 +
 .../hardware_status                           |   5 +
 .../log.expect                                | 120 ++++++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |  24 +++
 .../service_config                            |  10 ++
 .../static_service_stats                      |  10 ++
 .../test-crs-static-rebalance-coloc2/README   |  16 ++
 .../test-crs-static-rebalance-coloc2/cmdlist  |   4 +
 .../datacenter.cfg                            |   6 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  86 ++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |  14 ++
 .../service_config                            |   5 +
 .../static_service_stats                      |   5 +
 .../test-crs-static-rebalance-coloc3/README   |  14 ++
 .../test-crs-static-rebalance-coloc3/cmdlist  |   4 +
 .../datacenter.cfg                            |   6 +
 .../hardware_status                           |   7 +
 .../log.expect                                | 156 ++++++++++++++++++
 .../manager_status                            |   1 +
 .../rules_config                              |  49 ++++++
 .../service_config                            |   7 +
 .../static_service_stats                      |   5 +
 27 files changed, 597 insertions(+)
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/README
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc1/static_service_stats
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/README
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc2/static_service_stats
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/README
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/service_config
 create mode 100644 src/test/test-crs-static-rebalance-coloc3/static_service_stats

diff --git a/src/test/test-crs-static-rebalance-coloc1/README b/src/test/test-crs-static-rebalance-coloc1/README
new file mode 100644
index 0000000..c709f45
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/README
@@ -0,0 +1,26 @@
+Test whether a mixed set of strict colocation rules, in conjunction with the
+static load scheduler with auto-rebalancing enabled, is applied correctly on
+service start and in case of a subsequent failover.
+
+The test scenario is:
+- vm:101 and vm:102 are non-colocated services
+- Services that must be kept together:
+    - vm:102, and vm:107
+    - vm:104, vm:106, and vm:108
+- Services that must be kept separate:
+    - vm:103, vm:104, and vm:105
+    - vm:103, vm:106, and vm:107
+    - vm:107, and vm:108
+- Therefore, there are consistent interdependencies between the positive and
+  negative colocation rules' service members
+- vm:101 and vm:102 are currently assigned to node1 and node2 respectively
+- vm:103 through vm:108 are currently assigned to node3
+
+Therefore, the expected outcome is:
+- vm:101, vm:102, vm:103 should be started on node1, node2, and node3
+  respectively, as there is nothing running on these nodes yet
+- vm:104, vm:106, and vm:108 should all be assigned on the same node, which
+  will be node1, since it has the most resources left for vm:104
+- vm:105 and vm:107 should both be assigned on the same node, which will be
+  node2, since both cannot be assigned to the other nodes because of the
+  colocation constraints
diff --git a/src/test/test-crs-static-rebalance-coloc1/cmdlist b/src/test/test-crs-static-rebalance-coloc1/cmdlist
new file mode 100644
index 0000000..eee0e40
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on"],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-crs-static-rebalance-coloc1/datacenter.cfg b/src/test/test-crs-static-rebalance-coloc1/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+    "crs": {
+        "ha": "static",
+        "ha-rebalance-on-start": 1
+    }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc1/hardware_status b/src/test/test-crs-static-rebalance-coloc1/hardware_status
new file mode 100644
index 0000000..84484af
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc1/log.expect b/src/test/test-crs-static-rebalance-coloc1/log.expect
new file mode 100644
index 0000000..cdd2497
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/log.expect
@@ -0,0 +1,120 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'static'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node2'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     20    node1/crm: adding new service 'vm:104' on node 'node3'
+info     20    node1/crm: adding new service 'vm:105' on node 'node3'
+info     20    node1/crm: adding new service 'vm:106' on node 'node3'
+info     20    node1/crm: adding new service 'vm:107' on node 'node3'
+info     20    node1/crm: adding new service 'vm:108' on node 'node3'
+info     20    node1/crm: service vm:101: re-balance selected current node node1 for startup
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service vm:102: re-balance selected current node node2 for startup
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service vm:103: re-balance selected current node node3 for startup
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service vm:104: re-balance selected new node node1 for startup
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'request_start_balance'  (node = node3, target = node1)
+info     20    node1/crm: service vm:105: re-balance selected new node node2 for startup
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'request_start_balance'  (node = node3, target = node2)
+info     20    node1/crm: service vm:106: re-balance selected new node node1 for startup
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'request_start_balance'  (node = node3, target = node1)
+info     20    node1/crm: service vm:107: re-balance selected new node node2 for startup
+info     20    node1/crm: service 'vm:107': state changed from 'request_start' to 'request_start_balance'  (node = node3, target = node2)
+info     20    node1/crm: service vm:108: re-balance selected new node node1 for startup
+info     20    node1/crm: service 'vm:108': state changed from 'request_start' to 'request_start_balance'  (node = node3, target = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:102
+info     23    node2/lrm: service status vm:102 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+info     25    node3/lrm: service vm:104 - start relocate to node 'node1'
+info     25    node3/lrm: service vm:104 - end relocate to node 'node1'
+info     25    node3/lrm: service vm:105 - start relocate to node 'node2'
+info     25    node3/lrm: service vm:105 - end relocate to node 'node2'
+info     25    node3/lrm: service vm:106 - start relocate to node 'node1'
+info     25    node3/lrm: service vm:106 - end relocate to node 'node1'
+info     25    node3/lrm: service vm:107 - start relocate to node 'node2'
+info     25    node3/lrm: service vm:107 - end relocate to node 'node2'
+info     25    node3/lrm: service vm:108 - start relocate to node 'node1'
+info     25    node3/lrm: service vm:108 - end relocate to node 'node1'
+info     40    node1/crm: service 'vm:104': state changed from 'request_start_balance' to 'started'  (node = node1)
+info     40    node1/crm: service 'vm:105': state changed from 'request_start_balance' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:106': state changed from 'request_start_balance' to 'started'  (node = node1)
+info     40    node1/crm: service 'vm:107': state changed from 'request_start_balance' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:108': state changed from 'request_start_balance' to 'started'  (node = node1)
+info     41    node1/lrm: starting service vm:104
+info     41    node1/lrm: service status vm:104 started
+info     41    node1/lrm: starting service vm:106
+info     41    node1/lrm: service status vm:106 started
+info     41    node1/lrm: starting service vm:108
+info     41    node1/lrm: service status vm:108 started
+info     43    node2/lrm: starting service vm:105
+info     43    node2/lrm: service status vm:105 started
+info     43    node2/lrm: starting service vm:107
+info     43    node2/lrm: service status vm:107 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+err     240    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     260    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     280    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     300    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     320    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     340    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     360    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     380    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     400    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     420    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     440    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     460    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     480    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     500    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     520    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     540    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     560    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     580    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     600    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     620    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     640    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     660    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     680    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     700    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-crs-static-rebalance-coloc1/manager_status b/src/test/test-crs-static-rebalance-coloc1/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-static-rebalance-coloc1/rules_config b/src/test/test-crs-static-rebalance-coloc1/rules_config
new file mode 100644
index 0000000..129778f
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/rules_config
@@ -0,0 +1,24 @@
+colocation: vms-must-stick-together1
+	services vm:102,vm:107
+	affinity together
+	strict 1
+
+colocation: vms-must-stick-together2
+	services vm:104,vm:106,vm:108
+	affinity together
+	strict 1
+
+colocation: vms-must-stay-apart1
+	services vm:103,vm:104,vm:105
+	affinity separate
+	strict 1
+
+colocation: vms-must-stay-apart2
+	services vm:103,vm:106,vm:107
+	affinity separate
+	strict 1
+
+colocation: vms-must-stay-apart3
+	services vm:107,vm:108
+	affinity separate
+	strict 1
diff --git a/src/test/test-crs-static-rebalance-coloc1/service_config b/src/test/test-crs-static-rebalance-coloc1/service_config
new file mode 100644
index 0000000..02e4a07
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/service_config
@@ -0,0 +1,10 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node2", "state": "started" },
+    "vm:103": { "node": "node3", "state": "started" },
+    "vm:104": { "node": "node3", "state": "started" },
+    "vm:105": { "node": "node3", "state": "started" },
+    "vm:106": { "node": "node3", "state": "started" },
+    "vm:107": { "node": "node3", "state": "started" },
+    "vm:108": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc1/static_service_stats b/src/test/test-crs-static-rebalance-coloc1/static_service_stats
new file mode 100644
index 0000000..c6472ca
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc1/static_service_stats
@@ -0,0 +1,10 @@
+{
+    "vm:101": { "maxcpu": 8, "maxmem": 16000000000 },
+    "vm:102": { "maxcpu": 4, "maxmem": 24000000000 },
+    "vm:103": { "maxcpu": 2, "maxmem": 32000000000 },
+    "vm:104": { "maxcpu": 4, "maxmem": 48000000000 },
+    "vm:105": { "maxcpu": 8, "maxmem": 16000000000 },
+    "vm:106": { "maxcpu": 4, "maxmem": 32000000000 },
+    "vm:107": { "maxcpu": 2, "maxmem": 64000000000 },
+    "vm:108": { "maxcpu": 8, "maxmem": 48000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc2/README b/src/test/test-crs-static-rebalance-coloc2/README
new file mode 100644
index 0000000..1b788f8
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/README
@@ -0,0 +1,16 @@
+Test whether a set of transitive strict negative colocation rules, i.e. the
+negative colocation relations a->b, b->c and a->c, in conjunction with the
+static load scheduler with auto-rebalancing enabled, is applied correctly on
+service start and in case of a subsequent failover.
+
+The test scenario is:
+- vm:101 and vm:102 must be kept separate
+- vm:102 and vm:103 must be kept separate
+- vm:101 and vm:103 must be kept separate
+- Therefore, vm:101, vm:102, and vm:103 must be kept separate
+
+Therefore, the expected outcome is:
+- vm:101, vm:102, and vm:103 should be started on node1, node2, and node3
+  respectively, just as if the three negative colocation rules had been
+  stated as a single negative colocation rule
+- As node3 fails, vm:103 cannot be recovered
diff --git a/src/test/test-crs-static-rebalance-coloc2/cmdlist b/src/test/test-crs-static-rebalance-coloc2/cmdlist
new file mode 100644
index 0000000..eee0e40
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on"],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-crs-static-rebalance-coloc2/datacenter.cfg b/src/test/test-crs-static-rebalance-coloc2/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+    "crs": {
+        "ha": "static",
+        "ha-rebalance-on-start": 1
+    }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc2/hardware_status b/src/test/test-crs-static-rebalance-coloc2/hardware_status
new file mode 100644
index 0000000..84484af
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc2/log.expect b/src/test/test-crs-static-rebalance-coloc2/log.expect
new file mode 100644
index 0000000..c59f286
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/log.expect
@@ -0,0 +1,86 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'static'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: service vm:101: re-balance selected current node node1 for startup
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service vm:102: re-balance selected new node node2 for startup
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'request_start_balance'  (node = node1, target = node2)
+info     20    node1/crm: service vm:103: re-balance selected new node node3 for startup
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'request_start_balance'  (node = node1, target = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     21    node1/lrm: service vm:102 - start relocate to node 'node2'
+info     21    node1/lrm: service vm:102 - end relocate to node 'node2'
+info     21    node1/lrm: service vm:103 - start relocate to node 'node3'
+info     21    node1/lrm: service vm:103 - end relocate to node 'node3'
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info     40    node1/crm: service 'vm:102': state changed from 'request_start_balance' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:103': state changed from 'request_start_balance' to 'started'  (node = node3)
+info     43    node2/lrm: got lock 'ha_agent_node2_lock'
+info     43    node2/lrm: status change wait_for_agent_lock => active
+info     43    node2/lrm: starting service vm:102
+info     43    node2/lrm: service status vm:102 started
+info     45    node3/lrm: got lock 'ha_agent_node3_lock'
+info     45    node3/lrm: status change wait_for_agent_lock => active
+info     45    node3/lrm: starting service vm:103
+info     45    node3/lrm: service status vm:103 started
+info    120      cmdlist: execute network node3 off
+info    120    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    124    node3/crm: status change slave => wait_for_quorum
+info    125    node3/lrm: status change active => lost_agent_lock
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    160    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+emai    160    node1/crm: FENCE: Try to fence node 'node3'
+info    166     watchdog: execute power node3 off
+info    165    node3/crm: killed by poweroff
+info    166    node3/lrm: killed by poweroff
+info    166     hardware: server 'node3' stopped by poweroff (watchdog)
+info    240    node1/crm: got lock 'ha_agent_node3_lock'
+info    240    node1/crm: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+emai    240    node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3'
+info    240    node1/crm: service 'vm:103': state changed from 'fence' to 'recovery'
+err     240    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     260    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     280    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     300    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     320    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     340    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     360    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     380    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     400    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     420    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     440    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     460    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     480    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     500    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     520    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     540    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     560    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     580    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     600    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     620    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     640    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     660    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     680    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+err     700    node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-crs-static-rebalance-coloc2/manager_status b/src/test/test-crs-static-rebalance-coloc2/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-static-rebalance-coloc2/rules_config b/src/test/test-crs-static-rebalance-coloc2/rules_config
new file mode 100644
index 0000000..1545064
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/rules_config
@@ -0,0 +1,14 @@
+colocation: very-lonely-services1
+    services vm:101,vm:102
+    affinity separate
+    strict 1
+
+colocation: very-lonely-services2
+    services vm:102,vm:103
+    affinity separate
+    strict 1
+
+colocation: very-lonely-services3
+    services vm:101,vm:103
+    affinity separate
+    strict 1
diff --git a/src/test/test-crs-static-rebalance-coloc2/service_config b/src/test/test-crs-static-rebalance-coloc2/service_config
new file mode 100644
index 0000000..57e3579
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/service_config
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc2/static_service_stats b/src/test/test-crs-static-rebalance-coloc2/static_service_stats
new file mode 100644
index 0000000..d9dc9e7
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc2/static_service_stats
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "maxcpu": 8, "maxmem": 16000000000 },
+    "vm:102": { "maxcpu": 4, "maxmem": 24000000000 },
+    "vm:103": { "maxcpu": 2, "maxmem": 32000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc3/README b/src/test/test-crs-static-rebalance-coloc3/README
new file mode 100644
index 0000000..e54a2d4
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/README
@@ -0,0 +1,14 @@
+Test whether a more complex set of transitive strict negative colocation rules,
+i.e. all 10 pairwise negative colocation relations between five services, in
+conjunction with the static load scheduler with auto-rebalancing, is applied
+correctly on service start and in case of a subsequent failover.
+
+The test scenario is:
+- Essentially, all 10 strict negative colocation rules say that vm:101,
+  vm:102, vm:103, vm:104, and vm:105 must be kept separate
+
+Therefore, the expected outcome is:
+- vm:101, vm:102, vm:103, vm:104, and vm:105 should be started on node1,
+  node2, node3, node4, and node5 respectively, just as if the 10 negative
+  colocation rules had been stated as a single negative colocation rule
+- As node1 and node5 fail, vm:101 and vm:105 cannot be recovered
diff --git a/src/test/test-crs-static-rebalance-coloc3/cmdlist b/src/test/test-crs-static-rebalance-coloc3/cmdlist
new file mode 100644
index 0000000..a3d806d
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ],
+    [ "network node1 off", "network node5 off" ]
+]
diff --git a/src/test/test-crs-static-rebalance-coloc3/datacenter.cfg b/src/test/test-crs-static-rebalance-coloc3/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+    "crs": {
+        "ha": "static",
+        "ha-rebalance-on-start": 1
+    }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc3/hardware_status b/src/test/test-crs-static-rebalance-coloc3/hardware_status
new file mode 100644
index 0000000..511afb9
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/hardware_status
@@ -0,0 +1,7 @@
+{
+  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node4": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node5": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc3/log.expect b/src/test/test-crs-static-rebalance-coloc3/log.expect
new file mode 100644
index 0000000..ed36dbe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/log.expect
@@ -0,0 +1,156 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node4 on
+info     20    node4/crm: status change startup => wait_for_quorum
+info     20    node4/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node5 on
+info     20    node5/crm: status change startup => wait_for_quorum
+info     20    node5/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'static'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node4': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node5': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: adding new service 'vm:105' on node 'node1'
+info     20    node1/crm: service vm:101: re-balance selected current node node1 for startup
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service vm:102: re-balance selected new node node2 for startup
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'request_start_balance'  (node = node1, target = node2)
+info     20    node1/crm: service vm:103: re-balance selected new node node3 for startup
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'request_start_balance'  (node = node1, target = node3)
+info     20    node1/crm: service vm:104: re-balance selected new node node4 for startup
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'request_start_balance'  (node = node1, target = node4)
+info     20    node1/crm: service vm:105: re-balance selected new node node5 for startup
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'request_start_balance'  (node = node1, target = node5)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     21    node1/lrm: service vm:102 - start relocate to node 'node2'
+info     21    node1/lrm: service vm:102 - end relocate to node 'node2'
+info     21    node1/lrm: service vm:103 - start relocate to node 'node3'
+info     21    node1/lrm: service vm:103 - end relocate to node 'node3'
+info     21    node1/lrm: service vm:104 - start relocate to node 'node4'
+info     21    node1/lrm: service vm:104 - end relocate to node 'node4'
+info     21    node1/lrm: service vm:105 - start relocate to node 'node5'
+info     21    node1/lrm: service vm:105 - end relocate to node 'node5'
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info     26    node4/crm: status change wait_for_quorum => slave
+info     28    node5/crm: status change wait_for_quorum => slave
+info     40    node1/crm: service 'vm:102': state changed from 'request_start_balance' to 'started'  (node = node2)
+info     40    node1/crm: service 'vm:103': state changed from 'request_start_balance' to 'started'  (node = node3)
+info     40    node1/crm: service 'vm:104': state changed from 'request_start_balance' to 'started'  (node = node4)
+info     40    node1/crm: service 'vm:105': state changed from 'request_start_balance' to 'started'  (node = node5)
+info     43    node2/lrm: got lock 'ha_agent_node2_lock'
+info     43    node2/lrm: status change wait_for_agent_lock => active
+info     43    node2/lrm: starting service vm:102
+info     43    node2/lrm: service status vm:102 started
+info     45    node3/lrm: got lock 'ha_agent_node3_lock'
+info     45    node3/lrm: status change wait_for_agent_lock => active
+info     45    node3/lrm: starting service vm:103
+info     45    node3/lrm: service status vm:103 started
+info     47    node4/lrm: got lock 'ha_agent_node4_lock'
+info     47    node4/lrm: status change wait_for_agent_lock => active
+info     47    node4/lrm: starting service vm:104
+info     47    node4/lrm: service status vm:104 started
+info     49    node5/lrm: got lock 'ha_agent_node5_lock'
+info     49    node5/lrm: status change wait_for_agent_lock => active
+info     49    node5/lrm: starting service vm:105
+info     49    node5/lrm: service status vm:105 started
+info    120      cmdlist: execute network node1 off
+info    120      cmdlist: execute network node5 off
+info    120    node1/crm: status change master => lost_manager_lock
+info    120    node1/crm: status change lost_manager_lock => wait_for_quorum
+info    121    node1/lrm: status change active => lost_agent_lock
+info    128    node5/crm: status change slave => wait_for_quorum
+info    129    node5/lrm: status change active => lost_agent_lock
+info    162     watchdog: execute power node1 off
+info    161    node1/crm: killed by poweroff
+info    162    node1/lrm: killed by poweroff
+info    162     hardware: server 'node1' stopped by poweroff (watchdog)
+info    170     watchdog: execute power node5 off
+info    169    node5/crm: killed by poweroff
+info    170    node5/lrm: killed by poweroff
+info    170     hardware: server 'node5' stopped by poweroff (watchdog)
+info    222    node3/crm: got lock 'ha_manager_lock'
+info    222    node3/crm: status change slave => master
+info    222    node3/crm: using scheduler mode 'static'
+info    222    node3/crm: node 'node1': state changed from 'online' => 'unknown'
+info    222    node3/crm: node 'node5': state changed from 'online' => 'unknown'
+info    282    node3/crm: service 'vm:101': state changed from 'started' to 'fence'
+info    282    node3/crm: service 'vm:105': state changed from 'started' to 'fence'
+info    282    node3/crm: node 'node1': state changed from 'unknown' => 'fence'
+emai    282    node3/crm: FENCE: Try to fence node 'node1'
+info    282    node3/crm: got lock 'ha_agent_node1_lock'
+info    282    node3/crm: fencing: acknowledged - got agent lock for node 'node1'
+info    282    node3/crm: node 'node1': state changed from 'fence' => 'unknown'
+emai    282    node3/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node1'
+info    282    node3/crm: service 'vm:101': state changed from 'fence' to 'recovery'
+info    282    node3/crm: node 'node5': state changed from 'unknown' => 'fence'
+emai    282    node3/crm: FENCE: Try to fence node 'node5'
+info    282    node3/crm: got lock 'ha_agent_node5_lock'
+info    282    node3/crm: fencing: acknowledged - got agent lock for node 'node5'
+info    282    node3/crm: node 'node5': state changed from 'fence' => 'unknown'
+emai    282    node3/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5'
+info    282    node3/crm: service 'vm:105': state changed from 'fence' to 'recovery'
+err     282    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     282    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     302    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     302    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     322    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     322    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     342    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     342    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     362    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     362    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     382    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     382    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     402    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     402    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     422    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     422    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     442    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     442    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     462    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     462    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     482    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     482    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     502    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     502    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     522    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     522    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     542    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     542    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     562    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     562    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     582    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     582    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     602    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     602    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     622    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     622    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     642    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     642    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     662    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     662    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     682    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     682    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+err     702    node3/crm: recovering service 'vm:101' from fenced node 'node1' failed, no recovery node found
+err     702    node3/crm: recovering service 'vm:105' from fenced node 'node5' failed, no recovery node found
+info    720     hardware: exit simulation - done
diff --git a/src/test/test-crs-static-rebalance-coloc3/manager_status b/src/test/test-crs-static-rebalance-coloc3/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-static-rebalance-coloc3/rules_config b/src/test/test-crs-static-rebalance-coloc3/rules_config
new file mode 100644
index 0000000..6047eff
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/rules_config
@@ -0,0 +1,49 @@
+colocation: very-lonely-service1
+    services vm:101,vm:102
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service2
+    services vm:102,vm:103
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service3
+    services vm:103,vm:104
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service4
+    services vm:104,vm:105
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service5
+    services vm:101,vm:103
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service6
+    services vm:101,vm:104
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service7
+    services vm:101,vm:105
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service8
+    services vm:102,vm:104
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service9
+    services vm:102,vm:105
+    affinity separate
+    strict 1
+
+colocation: very-lonely-service10
+    services vm:103,vm:105
+    affinity separate
+    strict 1
diff --git a/src/test/test-crs-static-rebalance-coloc3/service_config b/src/test/test-crs-static-rebalance-coloc3/service_config
new file mode 100644
index 0000000..a1d61f5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/service_config
@@ -0,0 +1,7 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" },
+    "vm:105": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-static-rebalance-coloc3/static_service_stats b/src/test/test-crs-static-rebalance-coloc3/static_service_stats
new file mode 100644
index 0000000..d9dc9e7
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-coloc3/static_service_stats
@@ -0,0 +1,5 @@
+{
+    "vm:101": { "maxcpu": 8, "maxmem": 16000000000 },
+    "vm:102": { "maxcpu": 4, "maxmem": 24000000000 },
+    "vm:103": { "maxcpu": 2, "maxmem": 32000000000 }
+}
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] [PATCH ha-manager 15/15] test: add test cases for rules config
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (14 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 14/15] test: ha tester: add test cases in more complex scenarios Daniel Kral
@ 2025-03-25 15:12 ` Daniel Kral
  2025-03-25 16:47 ` [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
  2025-04-01  1:50 ` DERUMIER, Alexandre
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 15:12 UTC (permalink / raw)
  To: pve-devel

Add test cases to verify the correct transformation of various types of
ill-defined colocation rules:

- Merging multiple, transitive positive colocation rules of the same
  strictness level
- Dropping colocation rules with not enough defined services
- Dropping colocation rules which have inner conflicts

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
These aren't exhaustive yet, since there are no tests in conjunction with
HA groups yet.

 .gitignore                                    |   1 +
 src/test/Makefile                             |   4 +-
 .../connected-positive-colocations.cfg        |  34 ++++++
 .../connected-positive-colocations.cfg.expect |  54 ++++++++++
 .../rules_cfgs/illdefined-colocations.cfg     |   9 ++
 .../illdefined-colocations.cfg.expect         |  12 +++
 .../inner-inconsistent-colocations.cfg        |  14 +++
 .../inner-inconsistent-colocations.cfg.expect |  13 +++
 src/test/test_rules_config.pl                 | 100 ++++++++++++++++++
 9 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 src/test/rules_cfgs/connected-positive-colocations.cfg
 create mode 100644 src/test/rules_cfgs/connected-positive-colocations.cfg.expect
 create mode 100644 src/test/rules_cfgs/illdefined-colocations.cfg
 create mode 100644 src/test/rules_cfgs/illdefined-colocations.cfg.expect
 create mode 100644 src/test/rules_cfgs/inner-inconsistent-colocations.cfg
 create mode 100644 src/test/rules_cfgs/inner-inconsistent-colocations.cfg.expect
 create mode 100755 src/test/test_rules_config.pl

diff --git a/.gitignore b/.gitignore
index c35280e..35de63f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,3 +6,4 @@
 /src/test/test-*/status/*
 /src/test/fence_cfgs/*.cfg.commands
 /src/test/fence_cfgs/*.cfg.write
+/src/test/rules_cfgs/*.cfg.output
diff --git a/src/test/Makefile b/src/test/Makefile
index e54959f..6da9e10 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -5,6 +5,7 @@ all:
 test:
 	@echo "-- start regression tests --"
 	./test_failover1.pl
+	./test_rules_config.pl
 	./ha-tester.pl
 	./test_fence_config.pl
 	@echo "-- end regression tests (success) --"
@@ -12,4 +13,5 @@ test:
 .PHONY: clean
 clean:
 	rm -rf *~ test-*/log  test-*/*~ test-*/status \
-	fence_cfgs/*.cfg.commands fence_cfgs/*.write
+	fence_cfgs/*.cfg.commands fence_cfgs/*.write \
+	rules_cfgs/*.cfg.output
diff --git a/src/test/rules_cfgs/connected-positive-colocations.cfg b/src/test/rules_cfgs/connected-positive-colocations.cfg
new file mode 100644
index 0000000..8cd6e0c
--- /dev/null
+++ b/src/test/rules_cfgs/connected-positive-colocations.cfg
@@ -0,0 +1,34 @@
+colocation: positive1
+    services vm:101,vm:106,vm:108
+    affinity together
+    strict 0
+
+colocation: positive2
+    services vm:106,vm:109
+    affinity together
+    strict 0
+
+colocation: positive3
+    services vm:107,vm:105
+    affinity together
+    strict 0
+
+colocation: positive4
+    services vm:101,vm:102,vm:103
+    affinity together
+    strict 0
+
+colocation: positive5
+    services vm:101,vm:104
+    affinity together
+    strict 1
+
+colocation: positive6
+    services vm:105,vm:110
+    affinity together
+    strict 0
+
+colocation: positive7
+    services vm:108,vm:104,vm:109
+    affinity together
+    strict 1
diff --git a/src/test/rules_cfgs/connected-positive-colocations.cfg.expect b/src/test/rules_cfgs/connected-positive-colocations.cfg.expect
new file mode 100644
index 0000000..f20a87e
--- /dev/null
+++ b/src/test/rules_cfgs/connected-positive-colocations.cfg.expect
@@ -0,0 +1,54 @@
+--- Log ---
+Merge services of positive colocation rule 'positive2' into positive colocation rule 'positive1', because they share at least one service.
+Merge services of positive colocation rule 'positive4' into positive colocation rule 'positive1', because they share at least one service.
+Merge services of positive colocation rule 'positive6' into positive colocation rule 'positive3', because they share at least one service.
+Merge services of positive colocation rule 'positive7' into positive colocation rule 'positive5', because they share at least one service.
+--- Config ---
+$VAR1 = {
+          'digest' => '7781c41891832c7f955d835edcdc38fa6b673bea',
+          'ids' => {
+                     'positive1' => {
+                                      'affinity' => 'together',
+                                      'services' => {
+                                                      'vm:101' => 1,
+                                                      'vm:102' => 1,
+                                                      'vm:103' => 1,
+                                                      'vm:106' => 1,
+                                                      'vm:108' => 1,
+                                                      'vm:109' => 1
+                                                    },
+                                      'strict' => 0,
+                                      'type' => 'colocation'
+                                    },
+                     'positive3' => {
+                                      'affinity' => 'together',
+                                      'services' => {
+                                                      'vm:105' => 1,
+                                                      'vm:107' => 1,
+                                                      'vm:110' => 1
+                                                    },
+                                      'strict' => 0,
+                                      'type' => 'colocation'
+                                    },
+                     'positive5' => {
+                                      'affinity' => 'together',
+                                      'services' => {
+                                                      'vm:101' => 1,
+                                                      'vm:104' => 1,
+                                                      'vm:108' => 1,
+                                                      'vm:109' => 1
+                                                    },
+                                      'strict' => 1,
+                                      'type' => 'colocation'
+                                    }
+                   },
+          'order' => {
+                       'positive1' => 1,
+                       'positive2' => 2,
+                       'positive3' => 3,
+                       'positive4' => 4,
+                       'positive5' => 5,
+                       'positive6' => 6,
+                       'positive7' => 7
+                     }
+        };
diff --git a/src/test/rules_cfgs/illdefined-colocations.cfg b/src/test/rules_cfgs/illdefined-colocations.cfg
new file mode 100644
index 0000000..2a4bf9c
--- /dev/null
+++ b/src/test/rules_cfgs/illdefined-colocations.cfg
@@ -0,0 +1,9 @@
+colocation: lonely-service1
+    services vm:101
+    affinity together
+    strict 1
+
+colocation: lonely-service2
+    services vm:101
+    affinity separate
+    strict 1
diff --git a/src/test/rules_cfgs/illdefined-colocations.cfg.expect b/src/test/rules_cfgs/illdefined-colocations.cfg.expect
new file mode 100644
index 0000000..68ce44a
--- /dev/null
+++ b/src/test/rules_cfgs/illdefined-colocations.cfg.expect
@@ -0,0 +1,12 @@
+--- Log ---
+Drop colocation rule 'lonely-service1', because it does not have enough services defined.
+Drop colocation rule 'lonely-service2', because it does not have enough services defined.
+--- Config ---
+$VAR1 = {
+          'digest' => 'd174e745359cbc8c2e0f950ab5a4d202ffaf15e2',
+          'ids' => {},
+          'order' => {
+                       'lonely-service1' => 1,
+                       'lonely-service2' => 2
+                     }
+        };
diff --git a/src/test/rules_cfgs/inner-inconsistent-colocations.cfg b/src/test/rules_cfgs/inner-inconsistent-colocations.cfg
new file mode 100644
index 0000000..70ae352
--- /dev/null
+++ b/src/test/rules_cfgs/inner-inconsistent-colocations.cfg
@@ -0,0 +1,14 @@
+colocation: keep-apart1
+    services vm:102,vm:103
+    affinity separate
+    strict 1
+
+colocation: keep-apart2
+    services vm:102,vm:104,vm:106
+    affinity separate
+    strict 1
+
+colocation: stick-together1
+    services vm:101,vm:102,vm:103,vm:104,vm:106
+    affinity together
+    strict 1
diff --git a/src/test/rules_cfgs/inner-inconsistent-colocations.cfg.expect b/src/test/rules_cfgs/inner-inconsistent-colocations.cfg.expect
new file mode 100644
index 0000000..ea5b96b
--- /dev/null
+++ b/src/test/rules_cfgs/inner-inconsistent-colocations.cfg.expect
@@ -0,0 +1,13 @@
+--- Log ---
+Drop positive colocation rule 'stick-together1' and negative colocation rule 'keep-apart1', because they share two or more services.
+Drop positive colocation rule 'stick-together1' and negative colocation rule 'keep-apart2', because they share two or more services.
+--- Config ---
+$VAR1 = {
+          'digest' => '1e6a049065bec399e5982d24eb348465eec8520b',
+          'ids' => {},
+          'order' => {
+                       'keep-apart1' => 1,
+                       'keep-apart2' => 2,
+                       'stick-together1' => 3
+                     }
+        };
diff --git a/src/test/test_rules_config.pl b/src/test/test_rules_config.pl
new file mode 100755
index 0000000..0eb55c3
--- /dev/null
+++ b/src/test/test_rules_config.pl
@@ -0,0 +1,100 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use Getopt::Long;
+
+use lib qw(..);
+
+use Test::More;
+use Test::MockModule;
+
+use Data::Dumper;
+
+use PVE::HA::Rules;
+use PVE::HA::Rules::Colocation;
+
+PVE::HA::Rules::Colocation->register();
+
+PVE::HA::Rules->init();
+
+my $opt_nodiff;
+
+if (!GetOptions ("nodiff"   => \$opt_nodiff)) {
+    print "usage: $0 [test.cfg] [--nodiff]\n";
+    exit -1;
+}
+
+sub _log {
+    my ($fh, $source, $message) = @_;
+
+    chomp $message;
+    $message = "[$source] $message" if $source;
+
+    print "$message\n";
+
+    $fh->print("$message\n");
+    $fh->flush();
+};
+
+sub check_cfg {
+    my ($cfg_fn, $outfile) = @_;
+
+    my $raw = PVE::Tools::file_get_contents($cfg_fn);
+
+    open(my $LOG, '>', "$outfile");
+    select($LOG);
+    $| = 1;
+
+    print "--- Log ---\n";
+    my $cfg = PVE::HA::Rules->parse_config($cfg_fn, $raw);
+    PVE::HA::Rules::checked_config($cfg, {}, {});
+    print "--- Config ---\n";
+    {
+	local $Data::Dumper::Sortkeys = 1;
+	print Dumper($cfg);
+    }
+
+    select(STDOUT);
+}
+
+sub run_test {
+    my ($cfg_fn) = @_;
+
+    print "* check: $cfg_fn\n";
+
+    my $outfile = "$cfg_fn.output";
+    my $expect = "$cfg_fn.expect";
+
+    eval {
+	check_cfg($cfg_fn, $outfile);
+    };
+    if (my $err = $@) {
+	die "Test '$cfg_fn' failed:\n$err\n";
+    }
+
+    return if $opt_nodiff;
+
+    my $res;
+
+    if (-f $expect) {
+	my $cmd = ['diff', '-u', $expect, $outfile];
+	$res = system(@$cmd);
+	die "test '$cfg_fn' failed\n" if $res != 0;
+    } else {
+	$res = system('cp', $outfile, $expect);
+	die "test '$cfg_fn' failed\n" if $res != 0;
+    }
+
+    print "* end rules test: $cfg_fn (success)\n\n";
+}
+
+# exec tests
+
+if (my $testcfg = shift) {
+    run_test($testcfg);
+} else {
+    for my $cfg (<rules_cfgs/*cfg>) {
+	run_test($cfg);
+    }
+}
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (15 preceding siblings ...)
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 15/15] test: add test cases for rules config Daniel Kral
@ 2025-03-25 16:47 ` Daniel Kral
  2025-04-01  1:50 ` DERUMIER, Alexandre
  17 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-03-25 16:47 UTC (permalink / raw)
  To: pve-devel

On 3/25/25 16:12, Daniel Kral wrote:
> Colocation Rules
> ----------------
> 
> The two properties of colocation rules, as described in the
> introduction, are rather straightforward. A typical colocation rule
> inside of the config would look like the following:
> 
> colocation: some-lonely-services
> 	services vm:101,vm:103,ct:909
> 	affinity separate
> 	strict 1
> 
> This means that the three services vm:101, vm:103 and ct:909 must be
> kept separate on different nodes. I'm very keen on naming suggestions
> since I think there could be a better word than 'affinity' here. I
> played around with 'keep-services', since then it would always read
> something like 'keep-services separate', which is very declarative, but
> this might suggest that this is a binary option to too much users (I
> mean it is, but not with the values 0 and 1).

Just to document this, I've played around with using a score to decide 
whether the colocation rule is positive or negative, how strict it is, 
and, for optional colocation rules, how strongly it is desired to meet 
the rule, much like pacemaker's version.

But in the end, I ditched the idea, since it didn't integrate well and 
it was also not trivial to find a good scale for this weight value that 
would correspond to something similar to the node priority in HA groups, 
for example, especially when we select a node for each service 
individually.
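
For the record, such a score-based rule could have looked roughly like the 
following (purely hypothetical syntax for the dropped variant, loosely 
modeled after pacemaker's colocation scores): a positive score would have 
meant 'keep together', a negative one 'keep apart', and an 'infinite' score 
would have made the rule mandatory.

colocation: db-and-app
    services vm:101,vm:102
    score 1000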

On 3/25/25 16:12, Daniel Kral wrote:
> [0] https://clusterlabs.org/projects/pacemaker/doc/3.0/Pacemaker_Explained/html/constraints.html#colocation-properties
> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=5260
> [2] https://bugzilla.proxmox.com/show_bug.cgi?id=5332
> [3] https://lore.proxmox.com/pve-devel/c8fa7b8c-fb37-5389-1302-2002780d4ee2@proxmox.com/

I forgot to update the footnotes here when sending this. The first 
footnote pointed to the initial inspiration for a score-based colocation 
rule, but as already said, this was dropped.

So the references for the two quotes from our Bugzilla, [0] and [1], map 
to footnotes [1] and [2] here respectively.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [pve-devel] applied: [PATCH ha-manager 01/15] ignore output of fence config tests in tree
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 01/15] ignore output of fence config tests in tree Daniel Kral
@ 2025-03-25 17:49   ` Thomas Lamprecht
  0 siblings, 0 replies; 30+ messages in thread
From: Thomas Lamprecht @ 2025-03-25 17:49 UTC (permalink / raw)
  To: Proxmox VE development discussion, Daniel Kral

Am 25.03.25 um 16:12 schrieb Daniel Kral:
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  .gitignore | 2 ++
>  1 file changed, 2 insertions(+)
> 
>

applied this one already, thanks!


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines Daniel Kral
@ 2025-03-25 17:53   ` Thomas Lamprecht
  2025-04-03 12:16     ` Fabian Grünbichler
  0 siblings, 1 reply; 30+ messages in thread
From: Thomas Lamprecht @ 2025-03-25 17:53 UTC (permalink / raw)
  To: Proxmox VE development discussion, Daniel Kral

Am 25.03.25 um 16:12 schrieb Daniel Kral:
> Implement helper subroutines, which implement basic set operations done
> on hash sets, i.e. hashes with elements set to a true value, e.g. 1.
> 
> These will be used for various tasks in the HA Manager colocation rules,
> e.g. for verifying the satisfiability of the rules or applying the
> colocation rules on the allowed set of nodes.
> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
> If they're useful somewhere else, I can move them to PVE::Tools
> post-RFC, but it'd be probably useful to prefix them with `hash_` there.

meh, not a big fan of growing the overly generic PVE::Tools more; if so, this
should go into a dedicated module for hash/data structure helpers ...

> AFAICS there weren't any other helpers for this with a quick grep over
> all projects and `PVE::Tools::array_intersect()` wasn't what I needed.

... which those existing ones should then also move into, but out of scope
of this series.

> 
>  src/PVE/HA/Tools.pm | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
> index 0f9e9a5..fc3282c 100644
> --- a/src/PVE/HA/Tools.pm
> +++ b/src/PVE/HA/Tools.pm
> @@ -115,6 +115,48 @@ sub write_json_to_file {
>      PVE::Tools::file_set_contents($filename, $raw);
>  }
>  
> +sub is_disjoint {

IMO a bit too generic name for being in a Tools named module, maybe
prefix them all with hash_ or hashes_ ?

> +    my ($hash1, $hash2) = @_;
> +
> +    for my $key (keys %$hash1) {
> +	return 0 if exists($hash2->{$key});
> +    }
> +
> +    return 1;
> +};
> +
> +sub intersect {
> +    my ($hash1, $hash2) = @_;
> +
> +    my $result = { map { $_ => $hash2->{$_} } keys %$hash1 };
> +
> +    for my $key (keys %$result) {
> +	delete $result->{$key} if !defined($result->{$key});
> +    }
> +
> +    return $result;
> +};
> +
> +sub set_difference {
> +    my ($hash1, $hash2) = @_;
> +
> +    my $result = { map { $_ => 1 } keys %$hash1 };
> +
> +    for my $key (keys %$result) {
> +	delete $result->{$key} if defined($hash2->{$key});
> +    }
> +
> +    return $result;
> +};
> +
> +sub union {
> +    my ($hash1, $hash2) = @_;
> +
> +    my $result = { map { $_ => 1 } keys %$hash1, keys %$hash2 };
> +
> +    return $result;
> +};
> +
>  sub count_fenced_services {
>      my ($ss, $node) = @_;
>  



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
  2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
                   ` (16 preceding siblings ...)
  2025-03-25 16:47 ` [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
@ 2025-04-01  1:50 ` DERUMIER, Alexandre
  2025-04-01  9:39   ` Daniel Kral
  17 siblings, 1 reply; 30+ messages in thread
From: DERUMIER, Alexandre @ 2025-04-01  1:50 UTC (permalink / raw)
  To: pve-devel

Hi Daniel,

thanks for working on this !



>>I chose the name "colocation" in favor of affinity/anti-affinity,
>>since
>>it is a bit more concise that it is about co-locating services
>>between
>>each other in contrast to locating services on nodes, but no hard
>>feelings to change it (same for any other names in this series).

My 2 cents, but everybody in the industry is calling this
affinity/anti-affinity (VMware, Nutanix, Hyper-V, OpenStack, ...).
More precisely, VM affinity rules (vm<->vm) vs node affinity rules
(vm->node, the current HA group).

Personally I don't care, it's just a name ^_^ .

But I have a lot of customers asking "does Proxmox support
affinity/anti-affinity?", and if they are doing their own research, they
will think that it doesn't exist.
(Or, at a minimum, write somewhere in the docs something like "aka VM
affinity", or mention it in commercial presentations ^_^)




More serious question: I haven't read all the code yet, but how does it
play with the current TOPSIS placement algorithm?




>>Additional and/or future ideas
>>------------------------------

Small feature request from students && customers: they often ask to be
able to use VM tags in the colocation/affinity rules.





>>I'd like to suggest to also transform the existing HA groups to location
>>rules, if the rule concept turns out to be a good fit for the colocation
>>feature in the HA Manager, as HA groups seem to integrate quite easily
>>into this concept.

I agree with that too



Thanks again !

Alexandre

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
  2025-04-01  1:50 ` DERUMIER, Alexandre
@ 2025-04-01  9:39   ` Daniel Kral
  2025-04-01 11:05     ` DERUMIER, Alexandre via pve-devel
  2025-04-03 12:26     ` Fabian Grünbichler
  0 siblings, 2 replies; 30+ messages in thread
From: Daniel Kral @ 2025-04-01  9:39 UTC (permalink / raw)
  To: Proxmox VE development discussion, DERUMIER, Alexandre

On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> My 2 cents, but everybody in the industry is calling this
> affinity/anti-affinity (VMware, Nutanix, Hyper-V, OpenStack, ...).
> More precisely, VM affinity rules (vm<->vm) vs node affinity rules
> (vm->node, the current HA group).
> 
> Personally I don't care, it's just a name ^_^ .
> 
> But I have a lot of customers asking "does Proxmox support
> affinity/anti-affinity?", and if they are doing their own research, they
> will think that it doesn't exist.
> (Or, at a minimum, write somewhere in the docs something like "aka VM
> affinity", or mention it in commercial presentations ^_^)

I see your point and also called it affinity/anti-affinity before, but 
if we go for the HA Rules route here, it'd be really neat to have 
"Location Rules" and "Colocation Rules" coexist in the end and clearly 
show the distinction between them, as both are affinity rules, at least 
for me.

I'd definitely make sure that it is clear from the release notes and 
documentation that this adds the ability to assign affinity between 
services, but let's wait for some other comments on this ;).

On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> More serious question: I haven't read all the code yet, but how does it
> play with the current TOPSIS placement algorithm?

I currently implemented the colocation rules to put a constraint on 
which nodes the manager can select from for the to-be-migrated service.

So if users use the static load scheduler (and the basic / service count 
scheduler for that matter too), the colocation rules just make sure that 
no recovery node is selected that would contradict the colocation rules. 
So the TOPSIS algorithm isn't changed at all.
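
To illustrate the idea, here's a minimal sketch (made-up helper and variable 
names, not the code from this series) of how the strict rules prune the 
candidate node set before the scheduler scores what is left:

sub apply_colocation_constraints {
    my ($sid, $online_nodes, $service_locations, $together, $separate) = @_;

    # $online_nodes is a hash set of node names, $service_locations maps
    # service IDs to their current node, $together/$separate are hash sets
    # of services strictly colocated with / strictly separated from $sid.
    my $allowed = { %$online_nodes };

    # strict 'together': restrict to nodes where co-located peers already run
    my %peer_nodes = map { $_ => 1 }
        grep { defined($_) } map { $service_locations->{$_} } keys %$together;
    if (%peer_nodes) {
        delete $allowed->{$_} for grep { !$peer_nodes{$_} } keys %$allowed;
    }

    # strict 'separate': drop nodes that already host a conflicting service
    for my $other (keys %$separate) {
        my $node = $service_locations->{$other};
        delete $allowed->{$node} if defined($node);
    }

    return $allowed; # the scheduler (e.g. TOPSIS) only ranks these nodes
}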

There are two things that should/could be changed in the future (besides 
the many future ideas that I pointed out already), which are

- (1) the schedulers will still consider all online nodes, i.e. even 
though HA groups and/or colocation rules restrict the allowed nodes in 
the end, the calculation is done for all nodes, which could be 
significant overhead for larger clusters, and

- (2) the services are (generally) currently recovered one-by-one in a 
best-fit fashion, i.e. there's no ordering by the services' needed 
resources, etc. There could be some edge cases (e.g. think about a 
failing node with a bunch of services to be kept together; these should 
now be migrated to the same node, if possible, or put on the minimum 
number of nodes), where the algorithm could find better solutions if it 
either orders the to-be-recovered services, and/or the utilization 
scheduler has knowledge about the 'keep together' colocations and 
considers these (and all subsets) as a single service.

For the latter, the complexity explodes a bit and is harder to test for, 
which is why I've gone for the current implementation, as it also 
reduces the burden on users to think about what could happen with a 
specific set of rules and already allows the notion of MUST/SHOULD. This 
gives enough flexibility to improve the decision making of the scheduler 
in the future.

On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> Small feature request from students && customers: they often ask to be
> able to use VM tags in the colocation/affinity rules.

Good idea! We were thinking about this too and I forgot to add it to the 
list, thanks for bringing it up again!

Yes, the idea would be to make pools and tags available as selectors for 
rules here, so that the changes can be made rather dynamic by just 
adding a tag to a service.

The only thing we have to consider here is that HA rules have some 
verification phase and invalid rules will be dropped or modified to make 
them applicable. Also these external changes must be identified somehow 
in the HA stack, as I want to keep the amount of runs through the 
verification code to a minimum, i.e. only when the configuration is 
changed by the user. But that will be a discussion for another series ;).
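
Just to sketch the direction (hypothetical syntax, not part of this series), 
a tag-based selector could then read along these lines, with 'tag:db' 
expanding to all HA-managed services carrying that tag at rule-evaluation 
time:

colocation: keep-db-tier-apart
    services tag:db
    affinity separate
    strict 1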


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
  2025-04-01  9:39   ` Daniel Kral
@ 2025-04-01 11:05     ` DERUMIER, Alexandre via pve-devel
  2025-04-03 12:26     ` Fabian Grünbichler
  1 sibling, 0 replies; 30+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-04-01 11:05 UTC (permalink / raw)
  To: pve-devel, d.kral; +Cc: DERUMIER, Alexandre


From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "d.kral@proxmox.com" <d.kral@proxmox.com>
Subject: Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
Date: Tue, 1 Apr 2025 11:05:57 +0000
Message-ID: <ff6ab6753d00e1d6daa85fc985db90a1d056585e.camel@groupe-cyllene.com>

>>I currently implemented the colocation rules to put a constraint on
>>which nodes the manager can select from for the to-be-migrated service.

>>So if users use the static load scheduler (and the basic / service count
>>scheduler for that matter too), the colocation rules just make sure that
>>no recovery node is selected that would contradict the colocation rules.
>>So the TOPSIS algorithm isn't changed at all.

Ah ok, got it, so it's a hard constraint (MUST) filtering the target
nodes.


>>There are two things that should/could be changed in the future (besides
>>the many future ideas that I pointed out already), which are
>>
>>- (1) the schedulers will still consider all online nodes, i.e. even
>>though HA groups and/or colocation rules restrict the allowed nodes in
>>the end, the calculation is done for all nodes, which could be
>>significant overhead for larger clusters, and
>>
>>- (2) the services are (generally) currently recovered one-by-one in a
>>best-fit fashion, i.e. there's no ordering by the services' needed
>>resources, etc. There could be some edge cases (e.g. think about a
>>failing node with a bunch of services to be kept together; these should
>>now be migrated to the same node, if possible, or put on the minimum
>>number of nodes), where the algorithm could find better solutions if it
>>either orders the to-be-recovered services, and/or the utilization
>>scheduler has knowledge about the 'keep together' colocations and
>>considers these (and all subsets) as a single service.
>>
>>For the latter, the complexity explodes a bit and is harder to test for,
>>which is why I've gone for the current implementation, as it also
>>reduces the burden on users to think about what could happen with a
>>specific set of rules and already allows the notion of MUST/SHOULD. This
>>gives enough flexibility to improve the decision making of the scheduler
>>in the future.

yes, soft constraints (SHOULD) are indeed not so easy.
I remember having done some tests where I put the number of conflicting
constraints per VM on each host into the TOPSIS scoring, and migrated
the VMs with the most constraints first.
The results were not too bad, but this needs to be tested at scale.

Hard constraints are already a good step. (should work for 90% of people
who don't mix 10000 constraints together)
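
(Rough sketch only of the ordering part, with a hypothetical
$violations->{$sid} holding the number of currently violated colocation
rules each service is part of:)

# migrate the most constrained services first
my @migration_order = sort {
    $violations->{$b} <=> $violations->{$a} || $a cmp $b
} keys %$violations;

for my $sid (@migration_order) {
    # ... pick a target node for $sid via the usual TOPSIS scoring ...
}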


On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> Small feature request from students && customers:  they are a lot
> asking to be able to use vm tags in the colocation/affinity

>>Good idea! We were thinking about this too and I forgot to add it to
>>the 
>>list, thanks for bringing it up again!

>>Yes, the idea would be to make pools and tags available as selectors
>>for 
>>rules here, so that the changes can be made rather dynamic by just 
>>adding a tag to a service.

could be perfect :)

>>The only thing we have to consider here is that HA rules have some 
>>verification phase and invalid rules will be dropped or modified to
>>make 
>>them applicable. Also these external changes must be identified
>>somehow 
>>in the HA stack, as I want to keep the amount of runs through the 
>>verification code to a minimum, i.e. only when the configuration is 
>>changed by the user. But that will be a discussion for another series
>>;).

yes sure!


BTW, another improvement could be a hard constraint on storage
availability: currently the HA stack moves the VM blindly, tries to
start it, and then moves the VM to another node where the storage is
available. The only workaround is to create an HA server group, but this
could be an improvement.

Same for the number of cores available on the host (the host's number of
cores must be greater than the VM's cores).


I'll try to take time to follow && test your patches !

Alexandre



[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines
  2025-03-25 17:53   ` Thomas Lamprecht
@ 2025-04-03 12:16     ` Fabian Grünbichler
  2025-04-11 11:24       ` Daniel Kral
  0 siblings, 1 reply; 30+ messages in thread
From: Fabian Grünbichler @ 2025-04-03 12:16 UTC (permalink / raw)
  To: Daniel Kral, Proxmox VE development discussion

On March 25, 2025 6:53 pm, Thomas Lamprecht wrote:
> Am 25.03.25 um 16:12 schrieb Daniel Kral:
>> Implement helper subroutines, which implement basic set operations done
>> on hash sets, i.e. hashes with elements set to a true value, e.g. 1.
>> 
>> These will be used for various tasks in the HA Manager colocation rules,
>> e.g. for verifying the satisfiability of the rules or applying the
>> colocation rules on the allowed set of nodes.
>> 
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>> If they're useful somewhere else, I can move them to PVE::Tools
>> post-RFC, but it'd be probably useful to prefix them with `hash_` there.
> 
> meh, not a big fan of growing the overly generic PVE::Tools more, if, this
> should go into a dedicated module for hash/data structure helpers ...
> 
>> AFAICS there weren't any other helpers for this with a quick grep over
>> all projects and `PVE::Tools::array_intersect()` wasn't what I needed.
> 
> ... which those existing one should then also move into, but out of scope
> of this series.
> 
>> 
>>  src/PVE/HA/Tools.pm | 42 ++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 42 insertions(+)
>> 
>> diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
>> index 0f9e9a5..fc3282c 100644
>> --- a/src/PVE/HA/Tools.pm
>> +++ b/src/PVE/HA/Tools.pm
>> @@ -115,6 +115,48 @@ sub write_json_to_file {
>>      PVE::Tools::file_set_contents($filename, $raw);
>>  }
>>  
>> +sub is_disjoint {
> 
> IMO a bit too generic name for being in a Tools named module, maybe
> prefix them all with hash_ or hashes_ ?

is_disjoint also only really makes sense as a name if you see it as an
operation *on* $hash1, rather than an operation involving both hashes..

i.e., in Rust

set1.is_disjoint(&set2);

makes sense..

in Perl

is_disjoint($set1, $set2)

reads weird, and should maybe be

check_disjoint($set1, $set2)

or something like that?

> 
>> +    my ($hash1, $hash2) = @_;
>> +
>> +    for my $key (keys %$hash1) {
>> +	return 0 if exists($hash2->{$key});
>> +    }
>> +
>> +    return 1;
>> +};
>> +
>> +sub intersect {
>> +    my ($hash1, $hash2) = @_;
>> +
>> +    my $result = { map { $_ => $hash2->{$_} } keys %$hash1 };

this is a bit dangerous if $hash2->{$key} is itself a reference? if I
later modify $result I'll modify $hash2.. I know the commit message says
that the hashes are all just of the form key => 1, but nothing here
tells me that a year later when I am looking for a generic hash
intersection helper ;) I think this should also be clearly mentioned in
the module, and ideally, also in the helper names (i.e., have "set"
there everywhere and a comment above each that it only works for
hashes-as-sets and not generic hashes).

wouldn't it be faster/simpler to iterate over either hash once?

my $result = {};
for my $key (keys %$hash1) {
    $result->{$key} = 1 if $hash1->{$key} && $hash2->{$key};
}
return $result;


>> +
>> +    for my $key (keys %$result) {
>> +	delete $result->{$key} if !defined($result->{$key});
>> +    }
>> +
>> +    return $result;
>> +};
>> +
>> +sub set_difference {
>> +    my ($hash1, $hash2) = @_;
>> +
>> +    my $result = { map { $_ => 1 } keys %$hash1 };

if $hash1 is only of the form key => 1, then this is just

my $result = { %$hash1 };

>> +
>> +    for my $key (keys %$result) {
>> +	delete $result->{$key} if defined($hash2->{$key});
>> +    }
>> +

but the whole thing can be

return { map { !$hash2->{$_} ? ($_ => 1) : () } keys %$hash1 };

this transforms hash1 into its keys, and then returns either ($key => 1)
if the key is not set in $hash2, or the empty tuple if it is. the outer {}
then turns this sequence of tuples into a hash again, which skips empty
tuples ;) can of course also be adapted to use the value from either
hash, check for definedness instead of truthiness, ..

>> +    return $result;
>> +};
>> +
>> +sub union {
>> +    my ($hash1, $hash2) = @_;
>> +
>> +    my $result = { map { $_ => 1 } keys %$hash1, keys %$hash2 };
>> +
>> +    return $result;
>> +};
>> +
>>  sub count_fenced_services {
>>      my ($ss, $node) = @_;
>>  
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin Daniel Kral
@ 2025-04-03 12:16   ` Fabian Grünbichler
  2025-04-11 11:04     ` Daniel Kral
  0 siblings, 1 reply; 30+ messages in thread
From: Fabian Grünbichler @ 2025-04-03 12:16 UTC (permalink / raw)
  To: Proxmox VE development discussion

On March 25, 2025 4:12 pm, Daniel Kral wrote:
> Add the colocation rule plugin to allow users to specify inter-service
> affinity constraints.
> 
> These colocation rules can either be positive (keeping services
> together) or negative (keeping service separate). Their strictness can
> also be specified as either a MUST or a SHOULD, where the first
> specifies that any service the constraint cannot be applied for stays in
> recovery, while the latter specifies that that any service the
> constraint cannot be applied for is lifted from the constraint.
> 
> The initial implementation also implements four basic transformations,
> where colocation rules with not enough services are dropped, transitive
> positive colocation rules are merged, and inter-colocation rule
> inconsistencies as well as colocation rule inconsistencies with respect
> to the location constraints specified in HA groups are dropped.

a high level question: there's a lot of loops and sorts over rules,
services, groups here - granted, that is all in memory, so it should be
reasonably fast - do we have concerns here/should we look for further
optimization potential?

e.g. right now I count (coming in via canonicalize):

- check_services_count
-- sort of ruleids (foreach_colocation_rule)
-- loop over rules (foreach_colocation_rule)
--- keys on services of each rule
- loop over the results (should be empty)
- check_positive_intransitivity
-- sort of ruleids, 1x loop over rules (foreach_colocation_rule via split_colocation_rules)
-- loop over each unique pair of ruleids
--- is_disjoint on services of each pair (loop over service keys)
- loop over resulting ruleids (might be many!)
-- loop over mergeable rules for each merge target
--- loop over services of each mergeable rule
- check_inner_consistency
-- sort of ruleids, 1x loop over rules (foreach_colocation_rule via split_colocation_rules)
-- loop over positive rules
--- for every positive rule, loop over negative rules
---- for each pair of positive+negative rule, check service
intersections
- loop over resulting conflicts (should be empty)
- check_consistency_with_groups
-- sort of ruleids, 1x loop over rules (foreach_colocation_rule via split_colocation_rules)
-- loop over positive rules
--- loop over services
---- loop over nodes of service's group
-- loop over negative rules
--- loop over services
---- loop over nodes of service's group
- loop over resulting conflicts (should be empty)

possibly splitting the rules (instead of just the IDs) once and keeping
a list of sorted rule IDs we could save some overhead?

might not be worth it (yet) though, but something to keep in mind if the
rules are getting more complicated over time..
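
something like this (just a sketch), done once up front and then passed
to the individual check subroutines:

my $colocation_rules = { positive => {}, negative => {} };
for my $ruleid (sort keys %{$rules->{ids}}) {
    my $rule = $rules->{ids}->{$ruleid};
    next if $rule->{type} ne 'colocation';

    my $kind = $rule->{affinity} eq 'together' ? 'positive' : 'negative';
    $colocation_rules->{$kind}->{$ruleid} = $rule;
}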

> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  debian/pve-ha-manager.install  |   1 +
>  src/PVE/HA/Makefile            |   1 +
>  src/PVE/HA/Rules/Colocation.pm | 391 +++++++++++++++++++++++++++++++++
>  src/PVE/HA/Rules/Makefile      |   6 +
>  src/PVE/HA/Tools.pm            |   6 +
>  5 files changed, 405 insertions(+)
>  create mode 100644 src/PVE/HA/Rules/Colocation.pm
>  create mode 100644 src/PVE/HA/Rules/Makefile
> 
> diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
> index 9bbd375..89f9144 100644
> --- a/debian/pve-ha-manager.install
> +++ b/debian/pve-ha-manager.install
> @@ -33,6 +33,7 @@
>  /usr/share/perl5/PVE/HA/Resources/PVECT.pm
>  /usr/share/perl5/PVE/HA/Resources/PVEVM.pm
>  /usr/share/perl5/PVE/HA/Rules.pm
> +/usr/share/perl5/PVE/HA/Rules/Colocation.pm
>  /usr/share/perl5/PVE/HA/Tools.pm
>  /usr/share/perl5/PVE/HA/Usage.pm
>  /usr/share/perl5/PVE/HA/Usage/Basic.pm
> diff --git a/src/PVE/HA/Makefile b/src/PVE/HA/Makefile
> index 489cbc0..e386cbf 100644
> --- a/src/PVE/HA/Makefile
> +++ b/src/PVE/HA/Makefile
> @@ -8,6 +8,7 @@ install:
>  	install -d -m 0755 ${DESTDIR}${PERLDIR}/PVE/HA
>  	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/HA/$$i; done
>  	make -C Resources install
> +	make -C Rules install
>  	make -C Usage install
>  	make -C Env install
>  
> diff --git a/src/PVE/HA/Rules/Colocation.pm b/src/PVE/HA/Rules/Colocation.pm
> new file mode 100644
> index 0000000..808d48e
> --- /dev/null
> +++ b/src/PVE/HA/Rules/Colocation.pm
> @@ -0,0 +1,391 @@
> +package PVE::HA::Rules::Colocation;
> +
> +use strict;
> +use warnings;
> +
> +use Data::Dumper;

leftover dumper ;)

> +
> +use PVE::JSONSchema qw(get_standard_option);
> +use PVE::HA::Tools;
> +
> +use base qw(PVE::HA::Rules);
> +
> +sub type {
> +    return 'colocation';
> +}
> +
> +sub properties {
> +    return {
> +	services => get_standard_option('pve-ha-resource-id-list'),
> +	affinity => {
> +	    description => "Describes whether the services are supposed to be kept on separate"
> +		. " nodes, or are supposed to be kept together on the same node.",
> +	    type => 'string',
> +	    enum => ['separate', 'together'],
> +	    optional => 0,
> +	},
> +	strict => {
> +	    description => "Describes whether the colocation rule is mandatory or optional.",
> +	    type => 'boolean',
> +	    optional => 0,
> +	},
> +    }
> +}
> +
> +sub options {
> +    return {
> +	services => { optional => 0 },
> +	strict => { optional => 0 },
> +	affinity => { optional => 0 },
> +	comment => { optional => 1 },
> +    };
> +};
> +
> +sub decode_value {
> +    my ($class, $type, $key, $value) = @_;
> +
> +    if ($key eq 'services') {
> +	my $res = {};
> +
> +	for my $service (PVE::Tools::split_list($value)) {
> +	    if (PVE::HA::Tools::pve_verify_ha_resource_id($service)) {
> +		$res->{$service} = 1;
> +	    }
> +	}
> +
> +	return $res;
> +    }
> +
> +    return $value;
> +}
> +
> +sub encode_value {
> +    my ($class, $type, $key, $value) = @_;
> +
> +    if ($key eq 'services') {
> +	PVE::HA::Tools::pve_verify_ha_resource_id($_) for (keys %$value);
> +
> +	return join(',', keys %$value);
> +    }
> +
> +    return $value;
> +}
> +
> +sub foreach_colocation_rule {
> +    my ($rules, $func, $opts) = @_;
> +
> +    my $my_opts = { map { $_ => $opts->{$_} } keys %$opts };

why? if the caller doesn't want $opts to be modified, they could just
pass in a copy (or you could require it to be passed by value instead of
by reference?).

there's only a single caller that does (introduced by a later patch) and
that one constructs the hash reference right at the call site, so unless
I am missing something this seems a bit overkill..

> +    $my_opts->{type} = 'colocation';
> +
> +    PVE::HA::Rules::foreach_service_rule($rules, $func, $my_opts);
> +}
> +
> +sub split_colocation_rules {
> +    my ($rules) = @_;
> +
> +    my $positive_ruleids = [];
> +    my $negative_ruleids = [];
> +
> +    foreach_colocation_rule($rules, sub {
> +	my ($rule, $ruleid) = @_;
> +
> +	my $ruleid_set = $rule->{affinity} eq 'together' ? $positive_ruleids : $negative_ruleids;
> +	push @$ruleid_set, $ruleid;
> +    });
> +
> +    return ($positive_ruleids, $negative_ruleids);
> +}
> +
> +=head3 check_service_count($rules)
> +
> +Returns a list of conflicts caused by colocation rules, which do not have
> +enough services in them, defined in C<$rules>.
> +
> +If there are no conflicts, the returned list is empty.
> +
> +=cut
> +
> +sub check_services_count {
> +    my ($rules) = @_;
> +
> +    my $conflicts = [];
> +
> +    foreach_colocation_rule($rules, sub {
> +	my ($rule, $ruleid) = @_;
> +
> +	push @$conflicts, $ruleid if (scalar(keys %{$rule->{services}}) < 2);
> +    });
> +
> +    return $conflicts;
> +}

is this really an issue? a colocation rule with a single service is just
a nop? there's currently no cleanup AFAICT if a resource is removed, but
if we add that part (we maybe should?) then one can easily end up in a
situation where a rule temporarily contains a single or no service?

> +
> +=head3 check_positive_intransitivity($rules)
> +
> +Returns a list of conflicts caused by transitive positive colocation rules
> +defined in C<$rules>.
> +
> +Transitive positive colocation rules exist, if there are at least two positive
> +colocation rules with the same strictness, which put at least the same two
> +services in relation. This means, that these rules can be merged together.
> +
> +If there are no conflicts, the returned list is empty.

The terminology here is quite confusing - conflict meaning that two rules
are "transitive" and thus mergeable (which is good, cause it makes
things easier to handle?) is quite weird, as "conflict" is a rather
negative term..

there's only a single call site in the same module, maybe we could just
rename this into "find_mergeable_positive_ruleids", similar to the
variable where the result is stored?

> +
> +=cut
> +
> +sub check_positive_intransitivity {
> +    my ($rules) = @_;
> +
> +    my $conflicts = {};
> +    my ($positive_ruleids) = split_colocation_rules($rules);
> +
> +    while (my $outerid = shift(@$positive_ruleids)) {
> +	my $outer = $rules->{ids}->{$outerid};
> +
> +	for my $innerid (@$positive_ruleids) {

so this is in practice a sort of "optimized" loop over all pairs of
rules - iterating over the positive rules twice, but skipping pairs that
were already visited by virtue of the shift on the outer loop..

might be worth a short note, together with the $inner and $outer
terminology I was a bit confused at first..

> +	    my $inner = $rules->{ids}->{$innerid};
> +
> +	    next if $outerid eq $innerid;
> +	    next if $outer->{strict} != $inner->{strict};
> +	    next if PVE::HA::Tools::is_disjoint($outer->{services}, $inner->{services});
> +
> +	    push @{$conflicts->{$outerid}}, $innerid;
> +	}
> +    }
> +
> +    return $conflicts;
> +}
> +
> +=head3 check_inner_consistency($rules)
> +
> +Returns a list of conflicts caused by inconsistencies between positive and
> +negative colocation rules defined in C<$rules>.
> +
> +Inner inconsistent colocation rules exist, if there are at least the same two
> +services in a positive and a negative colocation relation, which is an
> +impossible constraint as they are opposites of each other.
> +
> +If there are no conflicts, the returned list is empty.

here the conflicts and check terminology makes sense - we are checking
an invariant that must be satisfied after all :)

> +
> +=cut
> +
> +sub check_inner_consistency {

but 'inner' is a weird term since this is consistency between rules?

it basically checks that no pair of services should both be colocated
and not be colocated at the same time, but not sure how to encode that
concisely..

> +    my ($rules) = @_;
> +
> +    my $conflicts = [];
> +    my ($positive_ruleids, $negative_ruleids) = split_colocation_rules($rules);
> +
> +    for my $outerid (@$positive_ruleids) {
> +	my $outer = $rules->{ids}->{$outerid}->{services};

s/outer/positive ?

> +
> +	for my $innerid (@$negative_ruleids) {
> +	    my $inner = $rules->{ids}->{$innerid}->{services};

s/inner/negative ?

> +
> +	    my $intersection = PVE::HA::Tools::intersect($outer, $inner);
> +	    next if scalar(keys %$intersection < 2);

the keys there is not needed, but the parentheses are in the wrong place
instead ;) it does work by accident though, because the result of keys
will be coerced to a scalar anyway, so you get the result of your
comparison wrapped by another call to scalar, so you end up with either
1 or '' depending on whether the check was true or false..
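
i.e., presumably the intended check was

next if scalar(keys %$intersection) < 2;

which skips pairs of rules that share fewer than two services.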

> +
> +	    push @$conflicts, [$outerid, $innerid];
> +	}
> +    }
> +
> +    return $conflicts;
> +}
> +
> +=head3 check_positive_group_consistency(...)
> +
> +Returns a list of conflicts caused by inconsistencies between positive
> +colocation rules defined in C<$rules> and node restrictions defined in
> +C<$groups> and C<$service>.

services?

> +
> +A positive colocation rule inconsistency with groups exists, if at least two
> +services in a positive colocation rule are restricted to disjoint sets of
> +nodes, i.e. they are in restricted HA groups, which have a disjoint set of
> +nodes.
> +
> +If there are no conflicts, the returned list is empty.
> +
> +=cut
> +
> +sub check_positive_group_consistency {
> +    my ($rules, $groups, $services, $positive_ruleids, $conflicts) = @_;

this could just get $positive_rules (filtered via grep) instead?

> +
> +    for my $ruleid (@$positive_ruleids) {
> +	my $rule_services = $rules->{ids}->{$ruleid}->{services};

and this could be

while (my ($ruleid, $rule) = each %$positive_rules) {
  my $nodes;
  ..
}

> +	my $nodes;
> +
> +	for my $sid (keys %$rule_services) {
> +	    my $groupid = $services->{$sid}->{group};
> +	    return if !$groupid;

should this really be a return?

> +
> +	    my $group = $groups->{ids}->{$groupid};
> +	    return if !$group;
> +	    return if !$group->{restricted};

same here?

> +
> +	    $nodes = { map { $_ => 1 } keys %{$group->{nodes}} } if !defined($nodes);

isn't $group->{nodes} already a hash set of the desired format? so this could be

$nodes = { $group->{nodes}->%* };

?

> +	    $nodes = PVE::HA::Tools::intersect($nodes, $group->{nodes});

could add an early break (`last`) here with the same condition as below?

> +	}
> +
> +	if (defined($nodes) && scalar keys %$nodes < 1) {
> +	    push @$conflicts, ['positive', $ruleid];
> +	}
> +    }
> +}
> +
> +=head3 check_negative_group_consistency(...)
> +
> +Returns a list of conflicts caused by inconsistencies between negative
> +colocation rules defined in C<$rules> and node restrictions defined in
> +C<$groups> and C<$service>.
> +
> +A negative colocation rule inconsistency with groups exists, if at least two
> +services in a negative colocation rule are restricted to less nodes in total
> +than services in the rule, i.e. they are in restricted HA groups, where the
> +union of all restricted node sets have less elements than restricted services.
> +
> +If there are no conflicts, the returned list is empty.
> +
> +=cut
> +
> +sub check_negative_group_consistency {
> +    my ($rules, $groups, $services, $negative_ruleids, $conflicts) = @_;

same question here

> +
> +    for my $ruleid (@$negative_ruleids) {
> +	my $rule_services = $rules->{ids}->{$ruleid}->{services};
> +	my $restricted_services = 0;
> +	my $restricted_nodes;
> +
> +	for my $sid (keys %$rule_services) {
> +	    my $groupid = $services->{$sid}->{group};
> +	    return if !$groupid;

same question as above ;)

> +
> +	    my $group = $groups->{ids}->{$groupid};
> +	    return if !$group;
> +	    return if !$group->{restricted};

same here

> +
> +	    $restricted_services++;
> +
> +	    $restricted_nodes = {} if !defined($restricted_nodes);
> +	    $restricted_nodes = PVE::HA::Tools::union($restricted_nodes, $group->{nodes});

here as well - if restricted_services > restricted_nodes, haven't we
already found a violation of the invariant and should break, even if
another service would then be added in the next iteration that can run
on 5 more new nodes..
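
e.g. something like (sketch only; the check after the loop would then
need to skip rules already reported here):

if ($restricted_services > scalar(keys %$restricted_nodes)) {
    push @$conflicts, ['negative', $ruleid];
    last;
}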

> +	}
> +
> +	if (defined($restricted_nodes)
> +	    && scalar keys %$restricted_nodes < $restricted_services) {
> +	    push @$conflicts, ['negative', $ruleid];
> +	}
> +    }
> +}
> +
> +sub check_consistency_with_groups {
> +    my ($rules, $groups, $services) = @_;
> +
> +    my $conflicts = [];
> +    my ($positive_ruleids, $negative_ruleids) = split_colocation_rules($rules);
> +
> +    check_positive_group_consistency($rules, $groups, $services, $positive_ruleids, $conflicts);
> +    check_negative_group_consistency($rules, $groups, $services, $negative_ruleids, $conflicts);
> +
> +    return $conflicts;
> +}
> +
> +sub canonicalize {
> +    my ($class, $rules, $groups, $services) = @_;

should this note that it will modify $rules in-place? this is only
called by PVE::HA::Rules::checked_config which also does not note that
and could be interpreted as "config is checked now" ;)

> +
> +    my $illdefined_ruleids = check_services_count($rules);
> +
> +    for my $ruleid (@$illdefined_ruleids) {
> +	print "Drop colocation rule '$ruleid', because it does not have enough services defined.\n";
> +
> +	delete $rules->{ids}->{$ruleid};
> +    }
> +
> +    my $mergeable_positive_ruleids = check_positive_intransitivity($rules);
> +
> +    for my $outerid (sort keys %$mergeable_positive_ruleids) {
> +	my $outer = $rules->{ids}->{$outerid};
> +	my $innerids = $mergeable_positive_ruleids->{$outerid};
> +
> +	for my $innerid (@$innerids) {
> +	    my $inner = $rules->{ids}->{$innerid};
> +
> +	    $outer->{services}->{$_} = 1 for (keys %{$inner->{services}});
> +
> +	    print "Merge services of positive colocation rule '$innerid' into positive colocation"
> +		. " rule '$outerid', because they share at least one service.\n";

this is a bit confusing because it modifies the rule while continuing to
refer to it using the old name afterwards.. should we merge them and
give them a new name?

> +
> +	    delete $rules->{ids}->{$innerid};
> +	}
> +    }
> +
> +    my $inner_conflicts = check_inner_consistency($rules);
> +
> +    for my $conflict (@$inner_conflicts) {
> +	my ($positiveid, $negativeid) = @$conflict;
> +
> +	print "Drop positive colocation rule '$positiveid' and negative colocation rule"
> +	    . " '$negativeid', because they share two or more services.\n";
> +
> +	delete $rules->{ids}->{$positiveid};
> +	delete $rules->{ids}->{$negativeid};
> +    }
> +
> +    my $group_conflicts = check_consistency_with_groups($rules, $groups, $services);
> +
> +    for my $conflict (@$group_conflicts) {
> +	my ($type, $ruleid) = @$conflict;
> +
> +	if ($type eq 'positive') {
> +	    print "Drop positive colocation rule '$ruleid', because two or more services are"
> +		. " restricted to different nodes.\n";
> +	} elsif ($type eq 'negative') {
> +	    print "Drop negative colocation rule '$ruleid', because two or more services are"
> +		. " restricted to less nodes than services.\n";
> +	} else {
> +	    die "Invalid group conflict type $type\n";
> +	}
> +
> +	delete $rules->{ids}->{$ruleid};
> +    }
> +}
> +
> +# TODO This will be used to verify modifications to the rules config over the API
> +sub are_satisfiable {

this is basically canonicalize, but
- without deleting rules
- without the transitivity check
- with slightly adapted messages

should they be combined so that we have roughly the same logic when
doing changes via the API and when loading the rules for operations?

> +    my ($class, $rules, $groups, $services) = @_;
> +
> +    my $illdefined_ruleids = check_services_count($rules);
> +
> +    for my $ruleid (@$illdefined_ruleids) {
> +	print "Colocation rule '$ruleid' does not have enough services defined.\n";
> +    }
> +
> +    my $inner_conflicts = check_inner_consistency($rules);
> +
> +    for my $conflict (@$inner_conflicts) {
> +	my ($positiveid, $negativeid) = @$conflict;
> +
> +	print "Positive colocation rule '$positiveid' is inconsistent with negative colocation rule"
> +	    . " '$negativeid', because they share two or more services between them.\n";
> +    }
> +
> +    my $group_conflicts = check_consistency_with_groups($rules, $groups, $services);
> +
> +    for my $conflict (@$group_conflicts) {
> +	my ($type, $ruleid) = @$conflict;
> +
> +	if ($type eq 'positive') {
> +	    print "Positive colocation rule '$ruleid' is unapplicable, because two or more services"
> +		. " are restricted to different nodes.\n";
> +	} elsif ($type eq 'negative') {
> +	    print "Negative colocation rule '$ruleid' is unapplicable, because two or more services"
> +		. " are restricted to less nodes than services.\n";
> +	} else {
> +	    die "Invalid group conflict type $type\n";
> +	}
> +    }
> +
> +    if (scalar(@$inner_conflicts) || scalar(@$group_conflicts)) {
> +	return 0;
> +    }
> +
> +    return 1;
> +}
> +
> +1;
> diff --git a/src/PVE/HA/Rules/Makefile b/src/PVE/HA/Rules/Makefile
> new file mode 100644
> index 0000000..8cb91ac
> --- /dev/null
> +++ b/src/PVE/HA/Rules/Makefile
> @@ -0,0 +1,6 @@
> +SOURCES=Colocation.pm
> +
> +.PHONY: install
> +install:
> +	install -d -m 0755 ${DESTDIR}${PERLDIR}/PVE/HA/Rules
> +	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/HA/Rules/$$i; done
> diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
> index 35107c9..52251d7 100644
> --- a/src/PVE/HA/Tools.pm
> +++ b/src/PVE/HA/Tools.pm
> @@ -46,6 +46,12 @@ PVE::JSONSchema::register_standard_option('pve-ha-resource-id', {
>      type => 'string', format => 'pve-ha-resource-id',
>  });
>  
> +PVE::JSONSchema::register_standard_option('pve-ha-resource-id-list', {
> +    description => "List of HA resource IDs.",
> +    typetext => "<type>:<name>{,<type>:<name>}*",
> +    type => 'string', format => 'pve-ha-resource-id-list',
> +});
> +
>  PVE::JSONSchema::register_format('pve-ha-resource-or-vm-id', \&pve_verify_ha_resource_or_vm_id);
>  sub pve_verify_ha_resource_or_vm_id {
>      my ($sid, $noerr) = @_;
> -- 
> 2.39.5
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes
  2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes Daniel Kral
@ 2025-04-03 12:17   ` Fabian Grünbichler
  2025-04-11 15:56     ` Daniel Kral
  0 siblings, 1 reply; 30+ messages in thread
From: Fabian Grünbichler @ 2025-04-03 12:17 UTC (permalink / raw)
  To: Proxmox VE development discussion

On March 25, 2025 4:12 pm, Daniel Kral wrote:
> Add a mechanism to the node selection subroutine, which enforces the
> colocation rules defined in the rules config.
> 
> The algorithm manipulates the set of nodes directly, which the service
> is allowed to run on, depending on the type and strictness of the
> colocation rules, if there are any.

shouldn't this first attempt to satisfy all rules, and if that fails,
retry with just the strict ones, or something similar? see comments
below (maybe I am missing/misunderstanding something)
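
e.g. roughly (just a sketch - apply_strict_colocation_rules is a
hypothetical variant that only applies the strict rules):

my $nodes = { %$allowed_nodes };
apply_colocation_rules($rules, $sid, $nodes, $online_node_usage);

if (!scalar(keys %$nodes)) {
    # nothing left when honoring all rules, retry with only the strict ones
    $nodes = { %$allowed_nodes };
    apply_strict_colocation_rules($rules, $sid, $nodes, $online_node_usage);
}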

> 
> This makes it depend on the prior removal of any nodes, which are
> unavailable (i.e. offline, unreachable, or weren't able to start the
> service in previous tries) or are not allowed to be run on otherwise
> (i.e. HA group node restrictions) to function correctly.
> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  src/PVE/HA/Manager.pm      | 203 ++++++++++++++++++++++++++++++++++++-
>  src/test/test_failover1.pl |   4 +-
>  2 files changed, 205 insertions(+), 2 deletions(-)
> 
> diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
> index 8f2ab3d..79b6555 100644
> --- a/src/PVE/HA/Manager.pm
> +++ b/src/PVE/HA/Manager.pm
> @@ -157,8 +157,201 @@ sub get_node_priority_groups {
>      return ($pri_groups, $group_members);
>  }
>  
> +=head3 get_colocated_services($rules, $sid, $online_node_usage)
> +
> +Returns a hash map of all services, which are specified as being in a positive
> +or negative colocation in C<$rules> with the given service with id C<$sid>.
> +
> +Each service entry consists of the type of colocation, strictness of colocation
> +and the node the service is currently assigned to, if any, according to
> +C<$online_node_usage>.
> +
> +For example, a service C<'vm:101'> being strictly colocated together (positive)
> +with two other services C<'vm:102'> and C<'vm:103'> and loosely colocated
> +separate with another service C<'vm:104'> results in the hash map:
> +
> +    {
> +	'vm:102' => {
> +	    affinity => 'together',
> +	    strict => 1,
> +	    node => 'node2'
> +	},
> +	'vm:103' => {
> +	    affinity => 'together',
> +	    strict => 1,
> +	    node => 'node2'
> +	},
> +	'vm:104' => {
> +	    affinity => 'separate',
> +	    strict => 0,
> +	    node => undef
> +	}
> +    }
> +
> +=cut
> +
> +sub get_colocated_services {
> +    my ($rules, $sid, $online_node_usage) = @_;
> +
> +    my $services = {};
> +
> +    PVE::HA::Rules::Colocation::foreach_colocation_rule($rules, sub {
> +	my ($rule) = @_;
> +
> +	for my $csid (sort keys %{$rule->{services}}) {
> +	    next if $csid eq $sid;
> +
> +	    $services->{$csid} = {
> +		node => $online_node_usage->get_service_node($csid),
> +		affinity => $rule->{affinity},
> +		strict => $rule->{strict},
> +	    };
> +        }
> +    }, {
> +	sid => $sid,
> +    });
> +
> +    return $services;
> +}
> +
> +=head3 get_colocation_preference($rules, $sid, $online_node_usage)
> +
> +Returns a list of two hashes, where each is a hash map of the colocation
> +preference of C<$sid>, according to the colocation rules in C<$rules> and the
> +service locations in C<$online_node_usage>.
> +
> +The first hash is the positive colocation preference, where each element
> +represents properties for how much C<$sid> prefers to be on the node.
> +Currently, this is a binary C<$strict> field, which means either it should be
> +there (C<0>) or must be there (C<1>).
> +
> +The second hash is the negative colocation preference, where each element
> +represents properties for how much C<$sid> prefers not to be on the node.
> +Currently, this is a binary C<$strict> field, which means either it should not
> +be there (C<0>) or must not be there (C<1>).
> +
> +=cut
> +
> +sub get_colocation_preference {
> +    my ($rules, $sid, $online_node_usage) = @_;
> +
> +    my $services = get_colocated_services($rules, $sid, $online_node_usage);
> +
> +    my $together = {};
> +    my $separate = {};
> +
> +    for my $service (values %$services) {
> +	my $node = $service->{node};
> +
> +	next if !$node;
> +
> +	my $node_set = $service->{affinity} eq 'together' ? $together : $separate;
> +	$node_set->{$node}->{strict} = $node_set->{$node}->{strict} || $service->{strict};
> +    }
> +
> +    return ($together, $separate);
> +}
> +
> +=head3 apply_positive_colocation_rules($together, $allowed_nodes)
> +
> +Applies the positive colocation preference C<$together> on the allowed node
> +hash set C<$allowed_nodes> directly.
> +
> +Positive colocation means keeping services together on a single node, and
> +therefore minimizing the separation of services.
> +
> +The allowed node hash set C<$allowed_nodes> is expected to contain any node,
> +which is available to the service, i.e. each node is currently online, is
> +available according to other location constraints, and the service has not
> +failed running there yet.
> +
> +=cut
> +
> +sub apply_positive_colocation_rules {
> +    my ($together, $allowed_nodes) = @_;
> +
> +    return if scalar(keys %$together) < 1;
> +
> +    my $mandatory_nodes = {};
> +    my $possible_nodes = PVE::HA::Tools::intersect($allowed_nodes, $together);
> +
> +    for my $node (sort keys %$together) {
> +	$mandatory_nodes->{$node} = 1 if $together->{$node}->{strict};
> +    }
> +
> +    if (scalar keys %$mandatory_nodes) {
> +	# limit to only the nodes the service must be on.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($mandatory_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    } elsif (scalar keys %$possible_nodes) {

I am not sure I follow this logic here.. if there are any strict
requirements, we only honor those.. if there are no strict requirements,
we only honor the non-strict ones?

> +	# limit to the possible nodes the service should be on, if there are any.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($possible_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}

this is the same code twice, just operating on different hash
references, so could probably be a lot shorter. the next and delete
could also be combined (`delete .. if !...`).
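
e.g. roughly (sketch), with $node_set being either $mandatory_nodes or
$possible_nodes:

for my $node (keys %$allowed_nodes) {
    delete $allowed_nodes->{$node} if !exists($node_set->{$node});
}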

> +    }
> +}
> +
> +=head3 apply_negative_colocation_rules($separate, $allowed_nodes)
> +
> +Applies the negative colocation preference C<$separate> on the allowed node
> +hash set C<$allowed_nodes> directly.
> +
> +Negative colocation means keeping services separate on multiple nodes, and
> +therefore maximizing the separation of services.
> +
> +The allowed node hash set C<$allowed_nodes> is expected to contain any node,
> +which is available to the service, i.e. each node is currently online, is
> +available according to other location constraints, and the service has not
> +failed running there yet.
> +
> +=cut
> +
> +sub apply_negative_colocation_rules {
> +    my ($separate, $allowed_nodes) = @_;
> +
> +    return if scalar(keys %$separate) < 1;
> +
> +    my $mandatory_nodes = {};
> +    my $possible_nodes = PVE::HA::Tools::set_difference($allowed_nodes, $separate);

this is confusing or I misunderstand something here, see below..

> +
> +    for my $node (sort keys %$separate) {
> +	$mandatory_nodes->{$node} = 1 if $separate->{$node}->{strict};
> +    }
> +
> +    if (scalar keys %$mandatory_nodes) {
> +	# limit to the nodes the service must not be on.

this is missing a not?
we are limiting to the nodes the service must not not be on :-P

should we rename mandatory_nodes to forbidden_nodes?

> +	for my $node (keys %$allowed_nodes) {

this could just loop over the forbidden nodes and delete them from
allowed nodes?
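
e.g. (sketch, using the rename suggested above):

delete $allowed_nodes->{$_} for keys %$forbidden_nodes;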

> +	    next if !exists($mandatory_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    } elsif (scalar keys %$possible_nodes) {

similar to above - if we have strict exclusions, we honor them, but we
ignore the non-strict exclusions unless there are no strict ones?

> +	# limit to the nodes the service should not be on, if any.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($possible_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    }
> +}
> +
> +sub apply_colocation_rules {
> +    my ($rules, $sid, $allowed_nodes, $online_node_usage) = @_;
> +
> +    my ($together, $separate) = get_colocation_preference($rules, $sid, $online_node_usage);
> +
> +    apply_positive_colocation_rules($together, $allowed_nodes);
> +    apply_negative_colocation_rules($separate, $allowed_nodes);
> +}
> +
>  sub select_service_node {
> -    my ($groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
> +    # TODO Cleanup this signature post-RFC
> +    my ($rules, $groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
>  
>      my $group = get_service_group($groups, $online_node_usage, $service_conf);
>  
> @@ -189,6 +382,8 @@ sub select_service_node {
>  
>      return $current_node if (!$try_next && !$best_scored) && $pri_nodes->{$current_node};
>  
> +    apply_colocation_rules($rules, $sid, $pri_nodes, $online_node_usage);
> +
>      my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);
>      my @nodes = sort {
>  	$scores->{$a} <=> $scores->{$b} || $a cmp $b
> @@ -758,6 +953,7 @@ sub next_state_request_start {
>  
>      if ($self->{crs}->{rebalance_on_request_start}) {
>  	my $selected_node = select_service_node(
> +	    $self->{rules},
>  	    $self->{groups},
>  	    $self->{online_node_usage},
>  	    $sid,
> @@ -771,6 +967,9 @@ sub next_state_request_start {
>  	my $select_text = $selected_node ne $current_node ? 'new' : 'current';
>  	$haenv->log('info', "service $sid: re-balance selected $select_text node $selected_node for startup");
>  
> +	# TODO It would be better if this information would be retrieved from $ss/$sd post-RFC
> +	$self->{online_node_usage}->pin_service_node($sid, $selected_node);
> +
>  	if ($selected_node ne $current_node) {
>  	    $change_service_state->($self, $sid, 'request_start_balance', node => $current_node, target => $selected_node);
>  	    return;
> @@ -898,6 +1097,7 @@ sub next_state_started {
>  	    }
>  
>  	    my $node = select_service_node(
> +		$self->{rules},
>  	        $self->{groups},
>  		$self->{online_node_usage},
>  		$sid,
> @@ -1004,6 +1204,7 @@ sub next_state_recovery {
>      $self->recompute_online_node_usage(); # we want the most current node state
>  
>      my $recovery_node = select_service_node(
> +	$self->{rules},
>  	$self->{groups},
>  	$self->{online_node_usage},
>  	$sid,
> diff --git a/src/test/test_failover1.pl b/src/test/test_failover1.pl
> index 308eab3..4c84fbd 100755
> --- a/src/test/test_failover1.pl
> +++ b/src/test/test_failover1.pl
> @@ -8,6 +8,8 @@ use PVE::HA::Groups;
>  use PVE::HA::Manager;
>  use PVE::HA::Usage::Basic;
>  
> +my $rules = {};
> +
>  my $groups = PVE::HA::Groups->parse_config("groups.tmp", <<EOD);
>  group: prefer_node1
>  	nodes node1
> @@ -31,7 +33,7 @@ sub test {
>      my ($expected_node, $try_next) = @_;
>      
>      my $node = PVE::HA::Manager::select_service_node
> -	($groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
> +	($rules, $groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
>  
>      my (undef, undef, $line) = caller();
>      die "unexpected result: $node != ${expected_node} at line $line\n" 
> -- 
> 2.39.5
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
  2025-04-01  9:39   ` Daniel Kral
  2025-04-01 11:05     ` DERUMIER, Alexandre via pve-devel
@ 2025-04-03 12:26     ` Fabian Grünbichler
  1 sibling, 0 replies; 30+ messages in thread
From: Fabian Grünbichler @ 2025-04-03 12:26 UTC (permalink / raw)
  To: DERUMIER, Alexandre, Proxmox VE development discussion

On April 1, 2025 11:39 am, Daniel Kral wrote:
> On 4/1/25 03:50, DERUMIER, Alexandre wrote:
>> Small feature request from students && customers:  they are a lot
>> asking to be able to use vm tags in the colocation/affinity
> 
> Good idea! We were thinking about this too and I forgot to add it to the 
> list, thanks for bringing it up again!
> 
> Yes, the idea would be to make pools and tags available as selectors for 
> rules here, so that the changes can be made rather dynamic by just 
> adding a tag to a service.
> 
> The only thing we have to consider here is that HA rules have some 
> verification phase and invalid rules will be dropped or modified to make 
> them applicable. Also these external changes must be identified somehow 
> in the HA stack, as I want to keep the amount of runs through the 
> verification code to a minimum, i.e. only when the configuration is 
> changed by the user. But that will be a discussion for another series ;).

something to also consider is HA permissions:

https://bugzilla.proxmox.com/show_bug.cgi?id=4597

e.g., who is supposed to define (affinity or other) rules, who sees
them, what if there are conflicts, ..

what about conflicting requests? let's say we have a set of 5 VMs that
should run on the same node, but one is requested to be migrated to node
A, and a second one to node B? if a user doesn't see the rules for lack
of privileges this could get rather confusing behaviour wise in the end?

what about things like VMs X and Y needing to run together, but Z not
being allowed to run together with Y, and user A that only "sees" X
requesting X to be migrated to the node where Z is currently running?


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin
  2025-04-03 12:16   ` Fabian Grünbichler
@ 2025-04-11 11:04     ` Daniel Kral
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-04-11 11:04 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

Thanks for the review, Fabian!

Sorry for the wait, I was more focused on testing other patch series 
which were already ready to merge for PVE 8.4 ;). But I'm going to be 
working on this again now, so that it will be ready for the next release 
or even before that :)

Thanks for the suggestions, I'll implement them shortly. I still have a 
few questions/discussion points below.

On 4/3/25 14:16, Fabian Grünbichler wrote:
> On March 25, 2025 4:12 pm, Daniel Kral wrote:
>> Add the colocation rule plugin to allow users to specify inter-service
>> affinity constraints.
>>
>> These colocation rules can either be positive (keeping services
>> together) or negative (keeping service separate). Their strictness can
>> also be specified as either a MUST or a SHOULD, where the first
>> specifies that any service the constraint cannot be applied for stays in
>> recovery, while the latter specifies that that any service the
>> constraint cannot be applied for is lifted from the constraint.
>>
>> The initial implementation also implements four basic transformations,
>> where colocation rules with not enough services are dropped, transitive
>> positive colocation rules are merged, and inter-colocation rule
>> inconsistencies as well as colocation rule inconsistencies with respect
>> to the location constraints specified in HA groups are dropped.
> 
> a high level question: there's a lot of loops and sorts over rules,
> services, groups here - granted, that is all in memory, so it should be
> reasonably fast - do we have concerns here/should we look for further
> optimization potential?
> 
> e.g. right now I count (coming in via canonicalize):
> 
> - check_services_count
> -- sort of ruleids (foreach_colocation_rule)
> -- loop over rules (foreach_colocation_rule)
> --- keys on services of each rule
> - loop over the results (should be empty)
> - check_positive_intransitivity
> -- sort of ruleids, 1x loop over rules (foreach_colocation_rule via split_colocation_rules)
> -- loop over each unique pair of ruleids
> --- is_disjoint on services of each pair (loop over service keys)
> - loop over resulting ruleids (might be many!)
> -- loop over mergeable rules for each merge target
> --- loop over services of each mergeable rule
> - check_inner_consistency
> -- sort of ruleids, 1x loop over rules (foreach_colocation_rule via split_colocation_rules)
> -- loop over positive rules
> --- for every positive rule, loop over negative rules
> ---- for each pair of positive+negative rule, check service
> intersections
> - loop over resulting conflicts (should be empty)
> - check_consistency_with_groups
> -- sort of ruleids, 1x loop over rules (foreach_colocation_rule via split_colocation_rules)
> -- loop over positive rules
> --- loop over services
> ---- loop over nodes of service's group
> -- loop over negative rules
> --- loop over services
> ---- loop over nodes of service's group
> - loop over resulting conflicts (should be empty)
> 
> possibly splitting the rules (instead of just the IDs) once and keeping
> a list of sorted rule IDs we could save some overhead?
> 
> might not be worth it (yet) though, but something to keep in mind if the
> rules are getting more complicated over time..

Thanks for the nice call graph!

I think it would be reasonable to do this already, especially to reduce 
the code duplication between canonicalize() and are_satisfiable() you 
already mentioned below.

I was thinking about something like $cmddef or another registry-type 
structure, which has an entry for each checking subroutine and also a 
handler for what to print/do for both canonicalize() as well as 
are_satisfiable(). Then those would have to only iterate over the list 
and call the subroutines.

For every checking subroutine, we could pass the whole of $rules, and a 
rule type-specific variable, e.g. [$positive_ids, $negative_ids] here, 
or as you already suggested below [$positive_rules, $negative_rules].
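
A very rough sketch of what I have in mind (names and handler signatures
are only illustrative, nothing final); every check gets ($rules, $groups,
$services) and simply ignores what it doesn't need:

my $colocation_checks = [
    {
        check => \&check_services_count,
        canonicalize => sub {
            my ($rules, $ruleid) = @_;
            print "Drop colocation rule '$ruleid', because it does not"
                . " have enough services defined.\n";
            delete $rules->{ids}->{$ruleid};
        },
        satisfiable => sub {
            my ($rules, $ruleid) = @_;
            print "Colocation rule '$ruleid' does not have enough"
                . " services defined.\n";
        },
    },
    # ... entries for check_positive_intransitivity,
    #     check_inner_consistency and check_consistency_with_groups
    #     in the order they depend on each other
];

sub canonicalize {
    my ($class, $rules, $groups, $services) = @_;

    for my $entry (@$colocation_checks) {
        my $conflicts = $entry->{check}->($rules, $groups, $services);
        $entry->{canonicalize}->($rules, $_) for @$conflicts;
    }
}

# are_satisfiable() would walk the same list and call ->{satisfiable}
# instead, without modifying $rules.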

One small thing I haven't explicitly mentioned here before is that at
least the check for mergeable positive colocation rules
(`check_positive_intransitivity`) and the check for inconsistency
between positive and negative colocation rules
(`check_inner_consistency`) depend on each other somewhat, so the order
of these checks stays important here, and the modifications to $rules
must be written back correctly before the next check handler is called.

I've written about an example why this is necessary in a comment below 
`check_positive_intransitivity` and will document this more clearly in 
the v1.

The only semi-blocker here is that check_consistency_with_groups(...)
also needs access to $groups and $services, but for the time being we
could just pass those two to every subroutine and ignore them where they
aren't needed.

Another approach could be to write any service group membership into
$rules internally already and just work with the data from there, so
that transitioning from "HA Groups" to "HA Location Rules" could go more
smoothly in a future major version, if we want to do that in the end. Or
we could already allow creating location rules explicitly, which are
synchronized with HA groups.

> 
>>
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>>   debian/pve-ha-manager.install  |   1 +
>>   src/PVE/HA/Makefile            |   1 +
>>   src/PVE/HA/Rules/Colocation.pm | 391 +++++++++++++++++++++++++++++++++
>>   src/PVE/HA/Rules/Makefile      |   6 +
>>   src/PVE/HA/Tools.pm            |   6 +
>>   5 files changed, 405 insertions(+)
>>   create mode 100644 src/PVE/HA/Rules/Colocation.pm
>>   create mode 100644 src/PVE/HA/Rules/Makefile
>>
>> diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
>> index 9bbd375..89f9144 100644
>> --- a/debian/pve-ha-manager.install
>> +++ b/debian/pve-ha-manager.install
>> @@ -33,6 +33,7 @@
>>   /usr/share/perl5/PVE/HA/Resources/PVECT.pm
>>   /usr/share/perl5/PVE/HA/Resources/PVEVM.pm
>>   /usr/share/perl5/PVE/HA/Rules.pm
>> +/usr/share/perl5/PVE/HA/Rules/Colocation.pm
>>   /usr/share/perl5/PVE/HA/Tools.pm
>>   /usr/share/perl5/PVE/HA/Usage.pm
>>   /usr/share/perl5/PVE/HA/Usage/Basic.pm
>> diff --git a/src/PVE/HA/Makefile b/src/PVE/HA/Makefile
>> index 489cbc0..e386cbf 100644
>> --- a/src/PVE/HA/Makefile
>> +++ b/src/PVE/HA/Makefile
>> @@ -8,6 +8,7 @@ install:
>>   	install -d -m 0755 ${DESTDIR}${PERLDIR}/PVE/HA
>>   	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/HA/$$i; done
>>   	make -C Resources install
>> +	make -C Rules install
>>   	make -C Usage install
>>   	make -C Env install
>>   
>> diff --git a/src/PVE/HA/Rules/Colocation.pm b/src/PVE/HA/Rules/Colocation.pm
>> new file mode 100644
>> index 0000000..808d48e
>> --- /dev/null
>> +++ b/src/PVE/HA/Rules/Colocation.pm
>> @@ -0,0 +1,391 @@
>> +package PVE::HA::Rules::Colocation;
>> +
>> +use strict;
>> +use warnings;
>> +
>> +use Data::Dumper;
> 
> leftover dumper ;)
> 
>> +
>> +use PVE::JSONSchema qw(get_standard_option);
>> +use PVE::HA::Tools;
>> +
>> +use base qw(PVE::HA::Rules);
>> +
>> +sub type {
>> +    return 'colocation';
>> +}
>> +
>> +sub properties {
>> +    return {
>> +	services => get_standard_option('pve-ha-resource-id-list'),
>> +	affinity => {
>> +	    description => "Describes whether the services are supposed to be kept on separate"
>> +		. " nodes, or are supposed to be kept together on the same node.",
>> +	    type => 'string',
>> +	    enum => ['separate', 'together'],
>> +	    optional => 0,
>> +	},
>> +	strict => {
>> +	    description => "Describes whether the colocation rule is mandatory or optional.",
>> +	    type => 'boolean',
>> +	    optional => 0,
>> +	},
>> +    }
>> +}
>> +
>> +sub options {
>> +    return {
>> +	services => { optional => 0 },
>> +	strict => { optional => 0 },
>> +	affinity => { optional => 0 },
>> +	comment => { optional => 1 },
>> +    };
>> +};
>> +
>> +sub decode_value {
>> +    my ($class, $type, $key, $value) = @_;
>> +
>> +    if ($key eq 'services') {
>> +	my $res = {};
>> +
>> +	for my $service (PVE::Tools::split_list($value)) {
>> +	    if (PVE::HA::Tools::pve_verify_ha_resource_id($service)) {
>> +		$res->{$service} = 1;
>> +	    }
>> +	}
>> +
>> +	return $res;
>> +    }
>> +
>> +    return $value;
>> +}
>> +
>> +sub encode_value {
>> +    my ($class, $type, $key, $value) = @_;
>> +
>> +    if ($key eq 'services') {
>> +	PVE::HA::Tools::pve_verify_ha_resource_id($_) for (keys %$value);
>> +
>> +	return join(',', keys %$value);
>> +    }
>> +
>> +    return $value;
>> +}
>> +
>> +sub foreach_colocation_rule {
>> +    my ($rules, $func, $opts) = @_;
>> +
>> +    my $my_opts = { map { $_ => $opts->{$_} } keys %$opts };
> 
> why? if the caller doesn't want $opts to be modified, they could just
> pass in a copy (or you could require it to be passed by value instead of
> by reference?).
> 
> there's only a single caller that does (introduced by a later patch) and
> that one constructs the hash reference right at the call site, so unless
> I am missing something this seems a bit overkill..

Right, I didn't think about this clearly enough and this could very well 
be just a direct write to the passed hash here.

Will change that in the v1!

> 
>> +    $my_opts->{type} = 'colocation';
>> +
>> +    PVE::HA::Rules::foreach_service_rule($rules, $func, $my_opts);
>> +}
>> +
>> +sub split_colocation_rules {
>> +    my ($rules) = @_;
>> +
>> +    my $positive_ruleids = [];
>> +    my $negative_ruleids = [];
>> +
>> +    foreach_colocation_rule($rules, sub {
>> +	my ($rule, $ruleid) = @_;
>> +
>> +	my $ruleid_set = $rule->{affinity} eq 'together' ? $positive_ruleids : $negative_ruleids;
>> +	push @$ruleid_set, $ruleid;
>> +    });
>> +
>> +    return ($positive_ruleids, $negative_ruleids);
>> +}
>> +
>> +=head3 check_service_count($rules)
>> +
>> +Returns a list of conflicts caused by colocation rules, which do not have
>> +enough services in them, defined in C<$rules>.
>> +
>> +If there are no conflicts, the returned list is empty.
>> +
>> +=cut
>> +
>> +sub check_services_count {
>> +    my ($rules) = @_;
>> +
>> +    my $conflicts = [];
>> +
>> +    foreach_colocation_rule($rules, sub {
>> +	my ($rule, $ruleid) = @_;
>> +
>> +	push @$conflicts, $ruleid if (scalar(keys %{$rule->{services}}) < 2);
>> +    });
>> +
>> +    return $conflicts;
>> +}
> 
> is this really an issue? a colocation rule with a single service is just
> a nop? there's currently no cleanup AFAICT if a resource is removed, but

You're right, AFAICS those are a noop when selecting the service node. I 
guess I was a little pedantic / overprotective here about which rules 
make sense in general instead of what the algorithm does in the end.

And good point about handling when resources are removed, adding that to 
delete_service_from_config comes right on my TODO list for the v1!

> if we add that part (we maybe should?) then one can easily end up in a
> situation where a rule temporarily contains a single or no service?

Hm, yes, especially if we add pools/tags at a later point to select 
services for the rule, then this could happen very easily. But as you 
already mentioned, those two cases would be noops too.

Nevertheless, should we drop this? I think it could help users identify
that some rules might not do what they wanted and give them a reason why
(i.e. there's only one service in there), but at the same time it could
be a little noisy if there are a lot of affected rules.

> 
>> +
>> +=head3 check_positive_intransitivity($rules)
>> +
>> +Returns a list of conflicts caused by transitive positive colocation rules
>> +defined in C<$rules>.
>> +
>> +Transitive positive colocation rules exist, if there are at least two positive
>> +colocation rules with the same strictness, which put at least the same two
>> +services in relation. This means, that these rules can be merged together.
>> +
>> +If there are no conflicts, the returned list is empty.
> 
> The terminology here is quite confusing - conflict meaning that two rules
> are "transitive" and thus mergeable (which is good, cause it makes
> things easier to handle?) is quite weird, as "conflict" is a rather
> negative term..
> 
> there's only a single call site in the same module, maybe we could just
> rename this into "find_mergeable_positive_ruleids", similar to the
> variable where the result is stored?

Yeah, I was probably too keen on the `$conflict = check_something(...)`
pattern here, but it would be much more readable with a simpler name -
I'll change that for the v1!

-----

As for the why: I'll also add some documentation about the rationale for 
why this is needed in the first place.

The main reason is that the later rule check 'check_inner_consistency' 
depends on the positive colocation rules having been merged already, as 
it assumes that each positive colocation rule contains all of the 
services that are positively colocated with each other. If that weren't 
the case, it wouldn't detect that the following three rules are 
inconsistent with each other:

colocation: stick-together1
     services vm:101,vm:104
     affinity together
     strict 1

colocation: stick-together2
     services vm:104,vm:102
     affinity together
     strict 1

colocation: keep-apart
     services vm:101,vm:102,vm:103
     affinity separate
     strict 1

This reduces the complexity of the logic in 'check_inner_consistency' a 
little, since it doesn't have to handle this special case there: 
'stick-together1' and 'stick-together2' are already merged into one 
rule, and it is then easily apparent that vm 101 and vm 102 cannot be 
colocated and non-colocated at the same time.
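
For illustration, after the merge the first two rules above would 
effectively become a single rule along the lines of:

colocation: stick-together1
     services vm:101,vm:102,vm:104
     affinity together
     strict 1

which shares vm 101 and vm 102 with 'keep-apart' and is therefore 
detected as inconsistent.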

-----

Also, I was curious how that would work out for the case where a 
negative colocation relation between three services is split into three 
pairwise rules (essentially a dependency cycle). This should in theory 
have the same semantics as the above rule set:

colocation: stick-together1
     services vm:101,vm:104
     affinity together
     strict 1

colocation: stick-together2
     services vm:104,vm:102
     affinity together
     strict 1

colocation: very-lonely-services1
     services vm:101,vm:102
     affinity separate
     strict 1

colocation: very-lonely-services2
     services vm:102,vm:103
     affinity separate
     strict 1

colocation: very-lonely-services3
     services vm:101,vm:103
     affinity separate
     strict 1

Without the merge of positive rules, 'check_inner_consistency' would 
again not detect the inconsistency here. But with the merge correctly 
applied before checking the consistency, this would be resolved and the 
effective rule set would be:

colocation: very-lonely-services2
     services vm:102,vm:103
     affinity separate
     strict 1

colocation: very-lonely-services3
     services vm:101,vm:103
     affinity separate
     strict 1

It could be argued that the negative colocation rules should be merged 
in a similar manner here, as there's now an "effective" difference in 
the semantics of the two rule sets above: only the negative colocation 
rules between vm 101 and vm 103 and between vm 102 and vm 103 remain.
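
(Just to illustrate, merging them in a similar fashion would combine the 
two remaining negative rules into something like

colocation: very-lonely-services2
     services vm:101,vm:102,vm:103
     affinity separate
     strict 1

though, unlike for the positive rules, this would also introduce a new 
constraint between vm 101 and vm 102, so it wouldn't be a purely 
mechanical merge.)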

What do you think?

> 
>> +
>> +=cut
>> +
>> +sub check_positive_intransitivity {
>> +    my ($rules) = @_;
>> +
>> +    my $conflicts = {};
>> +    my ($positive_ruleids) = split_colocation_rules($rules);
>> +
>> +    while (my $outerid = shift(@$positive_ruleids)) {
>> +	my $outer = $rules->{ids}->{$outerid};
>> +
>> +	for my $innerid (@$positive_ruleids) {
> 
> so this is in practice a sort of "optimized" loop over all pairs of
> rules - iterating over the positive rules twice, but skipping pairs that
> were already visited by virtue of the shift on the outer loop..
> 
> might be worth a short note, together with the $inner and $outer
> terminology I was a bit confused at first..

Sorry, I'll make that clearer in a comment above or with better naming 
of the variables!

The `while(shift ...)` was motivated by not having to prune duplicate 
pairs afterwards and, of course, not having to check the same pair of 
rules twice, but it does lack a little in readability here.
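
Roughly, what I was going for is the following pattern (sketch with 
hopefully clearer names):

    # visit every unordered pair of positive rules exactly once
    while (my $ruleid1 = shift(@$positive_ruleids)) {
	my $rule1 = $rules->{ids}->{$ruleid1};

	# the shift above already removed $ruleid1, so only pairs with
	# rules that haven't been used as $ruleid1 yet are checked here
	for my $ruleid2 (@$positive_ruleids) {
	    my $rule2 = $rules->{ids}->{$ruleid2};
	    # ... compare $rule1 and $rule2 ...
	}
    }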

> 
>> +	    my $inner = $rules->{ids}->{$innerid};
>> +
>> +	    next if $outerid eq $innerid;
>> +	    next if $outer->{strict} != $inner->{strict};
>> +	    next if PVE::HA::Tools::is_disjoint($outer->{services}, $inner->{services});
>> +
>> +	    push @{$conflicts->{$outerid}}, $innerid;
>> +	}
>> +    }
>> +
>> +    return $conflicts;
>> +}
>> +
>> +=head3 check_inner_consistency($rules)
>> +
>> +Returns a list of conflicts caused by inconsistencies between positive and
>> +negative colocation rules defined in C<$rules>.
>> +
>> +Inner inconsistent colocation rules exist, if there are at least the same two
>> +services in a positive and a negative colocation relation, which is an
>> +impossible constraint as they are opposites of each other.
>> +
>> +If there are no conflicts, the returned list is empty.
> 
> here the conflicts and check terminology makes sense - we are checking
> an invariant that must be satisfied after all :)
> 
>> +
>> +=cut
>> +
>> +sub check_inner_consistency {
> 
> but 'inner' is a weird term since this is consistency between rules?
> 
> it basically checks that no pair of services should both be colocated
> and not be colocated at the same time, but not sure how to encode that
> concisely..

Hm right, 'intra' wouldn't make this any simpler. I'll come up with a 
better name for the next revision!

> 
>> +    my ($rules) = @_;
>> +
>> +    my $conflicts = [];
>> +    my ($positive_ruleids, $negative_ruleids) = split_colocation_rules($rules);
>> +
>> +    for my $outerid (@$positive_ruleids) {
>> +	my $outer = $rules->{ids}->{$outerid}->{services};
> 
> s/outer/positive ?

ACK for this and all the following instances ;)

> 
>> +
>> +	for my $innerid (@$negative_ruleids) {
>> +	    my $inner = $rules->{ids}->{$innerid}->{services};
> 
> s/inner/negative ?
> 
>> +
>> +	    my $intersection = PVE::HA::Tools::intersect($outer, $inner);
>> +	    next if scalar(keys %$intersection < 2);
> 
> the keys there is not needed, but the parentheses are in the wrong place
> instead ;) it does work by accident though, because the result of keys
> will be coerced to a scalar anyway, so you get the result of your
> comparison wrapped by another call to scalar, so you end up with either
> 1 or '' depending on whether the check was true or false..

Oh, what a lucky coincidence ;)! Thanks for catching that, I'll fix it!
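
So presumably just:

    next if scalar(keys %$intersection) < 2;

with the parentheses around the keys expression instead of around the 
comparison.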

> 
>> +
>> +	    push @$conflicts, [$outerid, $innerid];
>> +	}
>> +    }
>> +
>> +    return $conflicts;
>> +}
>> +
>> +=head3 check_positive_group_consistency(...)
>> +
>> +Returns a list of conflicts caused by inconsistencies between positive
>> +colocation rules defined in C<$rules> and node restrictions defined in
>> +C<$groups> and C<$service>.
> 
> services?

ACK.

> 
>> +
>> +A positive colocation rule inconsistency with groups exists, if at least two
>> +services in a positive colocation rule are restricted to disjoint sets of
>> +nodes, i.e. they are in restricted HA groups, which have a disjoint set of
>> +nodes.
>> +
>> +If there are no conflicts, the returned list is empty.
>> +
>> +=cut
>> +
>> +sub check_positive_group_consistency {
>> +    my ($rules, $groups, $services, $positive_ruleids, $conflicts) = @_;
> 
> this could just get $positive_rules (filtered via grep) instead?
> 
>> +
>> +    for my $ruleid (@$positive_ruleids) {
>> +	my $rule_services = $rules->{ids}->{$ruleid}->{services};
> 
> and this could be
> 
> while (my ($ruleid, $rule) = each %$positive_rules) {
>    my $nodes;
>    ..
> }

Thanks for the suggestion here and above, will use that for the v1!

> 
>> +	my $nodes;
>> +
>> +	for my $sid (keys %$rule_services) {
>> +	    my $groupid = $services->{$sid}->{group};
>> +	    return if !$groupid;
> 
> should this really be a return?

Oops, no, that shouldn't be a return but a next, obviously. I forgot to 
change them back after I moved the code back out of a handler (since 
there's minimal duplicated code with the next subroutine). I'll change 
this and all the other instances for the v1.

> 
>> +
>> +	    my $group = $groups->{ids}->{$groupid};
>> +	    return if !$group;
>> +	    return if !$group->{restricted};
> 
> same here?
> 
>> +
>> +	    $nodes = { map { $_ => 1 } keys %{$group->{nodes}} } if !defined($nodes);
> 
> isn't $group->{nodes} already a hash set of the desired format? so this could be
> 
> $nodes = { $group->{nodes}->%* };
> 
> ?

Right, yes it is!

I was still somewhat confused about what the ->%* operation actually 
does, or rather didn't really know that it existed before, but now I've 
finally read up on postfix dereferencing ;).

> 
>> +	    $nodes = PVE::HA::Tools::intersect($nodes, $group->{nodes});
> 
> could add a break here with the same condition as below?

Right, both for this and the same comment on 
`check_negative_group_consistency`. I'll definitely also add a comment 
above it to make clear why. Thanks for the suggestion!
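
I.e., presumably something like this right after the intersect() call:

    # no common node left, so the rule cannot be satisfied anymore
    last if scalar(keys %$nodes) < 1;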

> 
>> +	}
>> +
>> +	if (defined($nodes) && scalar keys %$nodes < 1) {
>> +	    push @$conflicts, ['positive', $ruleid];
>> +	}
>> +    }
>> +}
>> +
>> +=head3 check_negative_group_consistency(...)
>> +
>> +Returns a list of conflicts caused by inconsistencies between negative
>> +colocation rules defined in C<$rules> and node restrictions defined in
>> +C<$groups> and C<$service>.
>> +
>> +A negative colocation rule inconsistency with groups exists, if at least two
>> +services in a negative colocation rule are restricted to less nodes in total
>> +than services in the rule, i.e. they are in restricted HA groups, where the
>> +union of all restricted node sets have less elements than restricted services.
>> +
>> +If there are no conflicts, the returned list is empty.
>> +
>> +=cut
>> +
>> +sub check_negative_group_consistency {
>> +    my ($rules, $groups, $services, $negative_ruleids, $conflicts) = @_;
> 
> same question here
> 
>> +
>> +    for my $ruleid (@$negative_ruleids) {
>> +	my $rule_services = $rules->{ids}->{$ruleid}->{services};
>> +	my $restricted_services = 0;
>> +	my $restricted_nodes;
>> +
>> +	for my $sid (keys %$rule_services) {
>> +	    my $groupid = $services->{$sid}->{group};
>> +	    return if !$groupid;
> 
> same question as above ;)
> 
>> +
>> +	    my $group = $groups->{ids}->{$groupid};
>> +	    return if !$group;
>> +	    return if !$group->{restricted};
> 
> same here
> 
>> +
>> +	    $restricted_services++;
>> +
>> +	    $restricted_nodes = {} if !defined($restricted_nodes);
>> +	    $restricted_nodes = PVE::HA::Tools::union($restricted_nodes, $group->{nodes});
> 
> here as well - if restricted_services > restricted_nodes, haven't we
> already found a violation of the invariant and should break even if
> another service would then be added in the next iteration that can run
> on 5 more new nodes..

Thanks for catching this, will do that as already said in the above comment!
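
I.e., roughly:

    # this subset of services already has fewer allowed nodes than
    # members, so the rule cannot be satisfied no matter what follows
    last if scalar(keys %$restricted_nodes) < $restricted_services;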

> 
>> +	}
>> +
>> +	if (defined($restricted_nodes)
>> +	    && scalar keys %$restricted_nodes < $restricted_services) {
>> +	    push @$conflicts, ['negative', $ruleid];
>> +	}
>> +    }
>> +}
>> +
>> +sub check_consistency_with_groups {
>> +    my ($rules, $groups, $services) = @_;
>> +
>> +    my $conflicts = [];
>> +    my ($positive_ruleids, $negative_ruleids) = split_colocation_rules($rules);
>> +
>> +    check_positive_group_consistency($rules, $groups, $services, $positive_ruleids, $conflicts);
>> +    check_negative_group_consistency($rules, $groups, $services, $negative_ruleids, $conflicts);
>> +
>> +    return $conflicts;
>> +}
>> +
>> +sub canonicalize {
>> +    my ($class, $rules, $groups, $services) = @_;
> 
> should this note that it will modify $rules in-place? this is only
> called by PVE::HA::Rules::checked_config which also does not note that
> and could be interpreted as "config is checked now" ;)

Yes, it should really be pointed out by checked_config, but it doesn't 
hurt at all to document it for both. checked_config could also have a 
better name.

> 
>> +
>> +    my $illdefined_ruleids = check_services_count($rules);
>> +
>> +    for my $ruleid (@$illdefined_ruleids) {
>> +	print "Drop colocation rule '$ruleid', because it does not have enough services defined.\n";
>> +
>> +	delete $rules->{ids}->{$ruleid};
>> +    }
>> +
>> +    my $mergeable_positive_ruleids = check_positive_intransitivity($rules);
>> +
>> +    for my $outerid (sort keys %$mergeable_positive_ruleids) {
>> +	my $outer = $rules->{ids}->{$outerid};
>> +	my $innerids = $mergeable_positive_ruleids->{$outerid};
>> +
>> +	for my $innerid (@$innerids) {
>> +	    my $inner = $rules->{ids}->{$innerid};
>> +
>> +	    $outer->{services}->{$_} = 1 for (keys %{$inner->{services}});
>> +
>> +	    print "Merge services of positive colocation rule '$innerid' into positive colocation"
>> +		. " rule '$outerid', because they share at least one service.\n";
> 
> this is a bit confusing because it modifies the rule while continuing to
> refer to it using the old name afterwards.. should we merge them and
> give them a new name?

Good call. I would go for just appending the names, but depending on how 
many rules are affected this could get rather long... We could also use 
some temporary name at each new merge, but that could be harder to 
follow if there are more than two merge actions.

I think I'd prefer just appending it, since it seems that Perl can 
handle hash keys of 2**31 characters "fine" anyway :P, and hope for the 
best that there won't be too many affected rules for users, so that the 
key doesn't grow that long. Or what do you think?
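
(Just to make the "appending" concrete, I'm thinking of something like

    my $mergedid = join('-', $outerid, @$innerids);

as the id of the merged rule, but the exact format is up for debate.)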

> 
>> +
>> +	    delete $rules->{ids}->{$innerid};
>> +	}
>> +    }
>> +
>> +    my $inner_conflicts = check_inner_consistency($rules);
>> +
>> +    for my $conflict (@$inner_conflicts) {
>> +	my ($positiveid, $negativeid) = @$conflict;
>> +
>> +	print "Drop positive colocation rule '$positiveid' and negative colocation rule"
>> +	    . " '$negativeid', because they share two or more services.\n";
>> +
>> +	delete $rules->{ids}->{$positiveid};
>> +	delete $rules->{ids}->{$negativeid};
>> +    }
>> +
>> +    my $group_conflicts = check_consistency_with_groups($rules, $groups, $services);
>> +
>> +    for my $conflict (@$group_conflicts) {
>> +	my ($type, $ruleid) = @$conflict;
>> +
>> +	if ($type eq 'positive') {
>> +	    print "Drop positive colocation rule '$ruleid', because two or more services are"
>> +		. " restricted to different nodes.\n";
>> +	} elsif ($type eq 'negative') {
>> +	    print "Drop negative colocation rule '$ruleid', because two or more services are"
>> +		. " restricted to less nodes than services.\n";
>> +	} else {
>> +	    die "Invalid group conflict type $type\n";
>> +	}
>> +
>> +	delete $rules->{ids}->{$ruleid};
>> +    }
>> +}
>> +
>> +# TODO This will be used to verify modifications to the rules config over the API
>> +sub are_satisfiable {
> 
> this is basically canonicalize, but
> - without deleting rules
> - without the transitivity check
> - with slightly adapted messages
> 
> should they be combined so that we have roughly the same logic when
> doing changes via the API and when loading the rules for operations?

That would be much better, yes, as it's easy to add a new check to one 
of them but forget the other, and it could become cumbersome if more 
checks are needed in the future.

If nothing speaks against it, I would go for the structure I mentioned 
in the first inline comment, so that the check routines and the handlers 
for canonicalize() and are_satisfiable() live closer together.
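
Very roughly, I'm imagining something like (just a sketch, names purely 
illustrative):

    my $checks = [
	{
	    check => \&check_services_count,
	    canonicalize => sub { ... },   # e.g. drop the offending rules
	    satisfiable => sub { ... },    # e.g. only print a warning
	},
	# ... one entry per rule check ...
    ];

so that both entry points just iterate over the same list of checks.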

> 
>> +    my ($class, $rules, $groups, $services) = @_;
>> +
>> +    my $illdefined_ruleids = check_services_count($rules);
>> +
>> +    for my $ruleid (@$illdefined_ruleids) {
>> +	print "Colocation rule '$ruleid' does not have enough services defined.\n";
>> +    }
>> +
>> +    my $inner_conflicts = check_inner_consistency($rules);
>> +
>> +    for my $conflict (@$inner_conflicts) {
>> +	my ($positiveid, $negativeid) = @$conflict;
>> +
>> +	print "Positive colocation rule '$positiveid' is inconsistent with negative colocation rule"
>> +	    . " '$negativeid', because they share two or more services between them.\n";
>> +    }
>> +
>> +    my $group_conflicts = check_consistency_with_groups($rules, $groups, $services);
>> +
>> +    for my $conflict (@$group_conflicts) {
>> +	my ($type, $ruleid) = @$conflict;
>> +
>> +	if ($type eq 'positive') {
>> +	    print "Positive colocation rule '$ruleid' is unapplicable, because two or more services"
>> +		. " are restricted to different nodes.\n";
>> +	} elsif ($type eq 'negative') {
>> +	    print "Negative colocation rule '$ruleid' is unapplicable, because two or more services"
>> +		. " are restricted to less nodes than services.\n";
>> +	} else {
>> +	    die "Invalid group conflict type $type\n";
>> +	}
>> +    }
>> +
>> +    if (scalar(@$inner_conflicts) || scalar(@$group_conflicts)) {
>> +	return 0;
>> +    }
>> +
>> +    return 1;
>> +}
>> +
>> +1;
>> diff --git a/src/PVE/HA/Rules/Makefile b/src/PVE/HA/Rules/Makefile
>> new file mode 100644
>> index 0000000..8cb91ac
>> --- /dev/null
>> +++ b/src/PVE/HA/Rules/Makefile
>> @@ -0,0 +1,6 @@
>> +SOURCES=Colocation.pm
>> +
>> +.PHONY: install
>> +install:
>> +	install -d -m 0755 ${DESTDIR}${PERLDIR}/PVE/HA/Rules
>> +	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/HA/Rules/$$i; done
>> diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
>> index 35107c9..52251d7 100644
>> --- a/src/PVE/HA/Tools.pm
>> +++ b/src/PVE/HA/Tools.pm
>> @@ -46,6 +46,12 @@ PVE::JSONSchema::register_standard_option('pve-ha-resource-id', {
>>       type => 'string', format => 'pve-ha-resource-id',
>>   });
>>   
>> +PVE::JSONSchema::register_standard_option('pve-ha-resource-id-list', {
>> +    description => "List of HA resource IDs.",
>> +    typetext => "<type>:<name>{,<type>:<name>}*",
>> +    type => 'string', format => 'pve-ha-resource-id-list',
>> +});
>> +
>>   PVE::JSONSchema::register_format('pve-ha-resource-or-vm-id', \&pve_verify_ha_resource_or_vm_id);
>>   sub pve_verify_ha_resource_or_vm_id {
>>       my ($sid, $noerr) = @_;
>> -- 
>> 2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines
  2025-04-03 12:16     ` Fabian Grünbichler
@ 2025-04-11 11:24       ` Daniel Kral
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-04-11 11:24 UTC (permalink / raw)
  To: Fabian Grünbichler, Proxmox VE development discussion

Thanks here for the feedback from both of you.

I agree with all the comments and will make the helpers more reusable so 
that they can be moved to a new data structure/hash module in PVE::Tools.

On 4/3/25 14:16, Fabian Grünbichler wrote:
> On March 25, 2025 6:53 pm, Thomas Lamprecht wrote:
>> Am 25.03.25 um 16:12 schrieb Daniel Kral:
>>> Implement helper subroutines, which implement basic set operations done
>>> on hash sets, i.e. hashes with elements set to a true value, e.g. 1.
>>>
>>> These will be used for various tasks in the HA Manager colocation rules,
>>> e.g. for verifying the satisfiability of the rules or applying the
>>> colocation rules on the allowed set of nodes.
>>>
>>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>>> ---
>>> If they're useful somewhere else, I can move them to PVE::Tools
>>> post-RFC, but it'd be probably useful to prefix them with `hash_` there.
>>
>> meh, not a big fan of growing the overly generic PVE::Tools more, if, this
>> should go into a dedicated module for hash/data structure helpers ...
>>
>>> AFAICS there weren't any other helpers for this with a quick grep over
>>> all projects and `PVE::Tools::array_intersect()` wasn't what I needed.
>>
>> ... which those existing one should then also move into, but out of scope
>> of this series.
>>
>>>
>>>   src/PVE/HA/Tools.pm | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 42 insertions(+)
>>>
>>> diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
>>> index 0f9e9a5..fc3282c 100644
>>> --- a/src/PVE/HA/Tools.pm
>>> +++ b/src/PVE/HA/Tools.pm
>>> @@ -115,6 +115,48 @@ sub write_json_to_file {
>>>       PVE::Tools::file_set_contents($filename, $raw);
>>>   }
>>>   
>>> +sub is_disjoint {
>>
>> IMO a bit too generic name for being in a Tools named module, maybe
>> prefix them all with hash_ or hashes_ ?

Yes, good call, I think I'll go for what Fabian mentioned below to 
prefix them with hash_set_ / set_ or something similar.

And as we're working towards making those helpers more accessible for 
other use cases, I'll also move them to a separate PVE::Tools::* module 
as suggested above :)

> 
> is_disjoint also only really makes sense as a name if you see it as an
> operation *on* $hash1, rather than an operation involving both hashes..
> 
> i.e., in Rust
> 
> set1.is_disjoint(&set2);
> 
> makes sense..
> 
> in Perl
> 
> is_disjoint($set1, $set2)
> 
> reads weird, and should maybe be
> 
> check_disjoint($set1, $set2)
> 
> or something like that?

Yes makes sense, I was going for `are_disjoint`, but both are fine for me.

> 
>>
>>> +    my ($hash1, $hash2) = @_;
>>> +
>>> +    for my $key (keys %$hash1) {
>>> +	return 0 if exists($hash2->{$key});
>>> +    }
>>> +
>>> +    return 1;
>>> +};
>>> +
>>> +sub intersect {
>>> +    my ($hash1, $hash2) = @_;
>>> +
>>> +    my $result = { map { $_ => $hash2->{$_} } keys %$hash1 };
> 
> this is a bit dangerous if $hash2->{$key} is itself a reference? if I
> later modify $result I'll modify $hash2.. I know the commit message says
> that the hashes are all just of the form key => 1, but nothing here
> tells me that a year later when I am looking for a generic hash
> intersection helper ;) I think this should also be clearly mentioned in
> the module, and ideally, also in the helper names (i.e., have "set"
> there everywhere and a comment above each that it only works for
> hashes-as-sets and not generic hashes).
> 
> wouldn't it be faster/simpler to iterate over either hash once?
> 
> my $result = {};
> for my $key (keys %$hash1) {
>      $result->{$key} = 1 if $hash1->{$key} && $hash2->{$key};
> }
> return $result;

I hadn't thought too much about what the { map {} } would cost here for 
the RFC, but the above is both easier to read and safer, so I'll adapt 
the subroutine accordingly, thanks :).

> 
> 
>>> +
>>> +    for my $key (keys %$result) {
>>> +	delete $result->{$key} if !defined($result->{$key});
>>> +    }
>>> +
>>> +    return $result;
>>> +};
>>> +
>>> +sub set_difference {
>>> +    my ($hash1, $hash2) = @_;
>>> +
>>> +    my $result = { map { $_ => 1 } keys %$hash1 };
> 
> if $hash1 is only of the form key => 1, then this is just
> 
> my $result = { %$hash1 };

But $result would then be a (shallow) copy instead of a reference to 
%$hash1 here, right? Though only as long as there are no nested 
references in there?
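
If I understand it correctly, it behaves like this:

    my $hash1  = { a => 1, b => 1 };
    my $result = { %$hash1 };   # new hash with the same top-level keys/values
    $result->{c} = 1;           # does not show up in %$hash1

    my $nested = { a => { x => 1 } };
    my $copy   = { %$nested };
    $copy->{a}->{x} = 2;        # also changes $nested->{a}->{x}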

> 
>>> +
>>> +    for my $key (keys %$result) {
>>> +	delete $result->{$key} if defined($hash2->{$key});
>>> +    }
>>> +
> 
> but the whole thing can be
> 
> return { map { !$hash2->{$_} ? ($_ => 1) : () } keys %$hash1 };
> 
> this transforms hash1 into its keys, and then returns either ($key => 1)
> if the key is not set in $hash2, or the empty tuple if it is. the outer {}
> then turn this sequence of tuples into a hash again, which skips empty
> tuples ;) can of course also be adapted to use the value from either
> hash, check for definedness instead of truthiness, ..

I'll have to read up more in the perldoc of the more common functions; 
I didn't know that map skips empty lists here, thanks :)
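
For my own future reference:

    # map flattens the returned lists, so returning () just drops the element
    my @odd = map { $_ % 2 ? ($_) : () } 1..4;                   # (1, 3)
    my $set = { map { $_ ne 'b' ? ($_ => 1) : () } qw(a b c) };  # { a => 1, c => 1 }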

> 
>>> +    return $result;
>>> +};
>>> +
>>> +sub union {
>>> +    my ($hash1, $hash2) = @_;
>>> +
>>> +    my $result = { map { $_ => 1 } keys %$hash1, keys %$hash2 };
>>> +
>>> +    return $result;
>>> +};
>>> +
>>>   sub count_fenced_services {
>>>       my ($ss, $node) = @_;
>>>   



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes
  2025-04-03 12:17   ` Fabian Grünbichler
@ 2025-04-11 15:56     ` Daniel Kral
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Kral @ 2025-04-11 15:56 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

Thanks for taking the time here too!

I'm unsure whether the documentation wasn't clear enough or I'm just 
blind to some details of how the division between strict and non-strict 
should work, but I hope I can clarify some points about my understanding 
here. Please correct me in any case where the current implementation 
would break user expectations; that's definitely not something I want ;).

I'll definitely take some time to improve the control flow and the names 
of the variables/subroutines here to make it easier to understand, and 
add examples of what the contents of $together and $separate look like 
at different stages.

The algorithm is online and quite dependent on many other things, e.g. 
that $allowed_nodes already has those nodes removed that were tried 
before and failed, etc., so it's pretty dynamic here.

On 4/3/25 14:17, Fabian Grünbichler wrote:
> On March 25, 2025 4:12 pm, Daniel Kral wrote:
>> Add a mechanism to the node selection subroutine, which enforces the
>> colocation rules defined in the rules config.
>>
>> The algorithm manipulates the set of nodes directly, which the service
>> is allowed to run on, depending on the type and strictness of the
>> colocation rules, if there are any.
> 
> shouldn't this first attempt to satisfy all rules, and if that fails,
> retry with just the strict ones, or something similar? see comments
> below (maybe I am missing/misunderstanding something)

Hm, I'm not sure if I can follow what you mean here.

I tried to come up with some scenarios where there could be conflicts 
because of "loose" colocation rules being overshadowed by strict 
colocation rules, but I'm currently not seeing any. That said, I've also 
been mostly concerned with smaller clusters (3 to 5 nodes) for now, so 
I'll take a closer look at larger setups/environments.

In general, when applying colocation rules, the logic is less concerned 
with which rules specifically get applied than with making sure that 
none of them is violated.

This is also why a colocation rule with a single service turns out to be 
a noop: it will never depend on the location of another service (the 
rule will never add anything to $together/$separate, since there's only 
an entry there if other services already have a node pinned to them).

I hope the comments below clarify this a little bit or make it clearer 
where I'm missing something, so that the code/behavior/documentation can 
be improved ;).

> 
>>
>> This makes it depend on the prior removal of any nodes, which are
>> unavailable (i.e. offline, unreachable, or weren't able to start the
>> service in previous tries) or are not allowed to be run on otherwise
>> (i.e. HA group node restrictions) to function correctly.
>>
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>>   src/PVE/HA/Manager.pm      | 203 ++++++++++++++++++++++++++++++++++++-
>>   src/test/test_failover1.pl |   4 +-
>>   2 files changed, 205 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
>> index 8f2ab3d..79b6555 100644
>> --- a/src/PVE/HA/Manager.pm
>> +++ b/src/PVE/HA/Manager.pm
>> @@ -157,8 +157,201 @@ sub get_node_priority_groups {
>>       return ($pri_groups, $group_members);
>>   }
>>   
>> +=head3 get_colocated_services($rules, $sid, $online_node_usage)
>> +
>> +Returns a hash map of all services, which are specified as being in a positive
>> +or negative colocation in C<$rules> with the given service with id C<$sid>.
>> +
>> +Each service entry consists of the type of colocation, strictness of colocation
>> +and the node the service is currently assigned to, if any, according to
>> +C<$online_node_usage>.
>> +
>> +For example, a service C<'vm:101'> being strictly colocated together (positive)
>> +with two other services C<'vm:102'> and C<'vm:103'> and loosely colocated
>> +separate with another service C<'vm:104'> results in the hash map:
>> +
>> +    {
>> +	'vm:102' => {
>> +	    affinity => 'together',
>> +	    strict => 1,
>> +	    node => 'node2'
>> +	},
>> +	'vm:103' => {
>> +	    affinity => 'together',
>> +	    strict => 1,
>> +	    node => 'node2'
>> +	},
>> +	'vm:104' => {
>> +	    affinity => 'separate',
>> +	    strict => 0,
>> +	    node => undef
>> +	}
>> +    }
>> +
>> +=cut
>> +
>> +sub get_colocated_services {
>> +    my ($rules, $sid, $online_node_usage) = @_;
>> +
>> +    my $services = {};
>> +
>> +    PVE::HA::Rules::Colocation::foreach_colocation_rule($rules, sub {
>> +	my ($rule) = @_;
>> +
>> +	for my $csid (sort keys %{$rule->{services}}) {
>> +	    next if $csid eq $sid;
>> +
>> +	    $services->{$csid} = {
>> +		node => $online_node_usage->get_service_node($csid),
>> +		affinity => $rule->{affinity},
>> +		strict => $rule->{strict},
>> +	    };
>> +        }
>> +    }, {
>> +	sid => $sid,
>> +    });
>> +
>> +    return $services;
>> +}
>> +
>> +=head3 get_colocation_preference($rules, $sid, $online_node_usage)
>> +
>> +Returns a list of two hashes, where each is a hash map of the colocation
>> +preference of C<$sid>, according to the colocation rules in C<$rules> and the
>> +service locations in C<$online_node_usage>.
>> +
>> +The first hash is the positive colocation preference, where each element
>> +represents properties for how much C<$sid> prefers to be on the node.
>> +Currently, this is a binary C<$strict> field, which means either it should be
>> +there (C<0>) or must be there (C<1>).
>> +
>> +The second hash is the negative colocation preference, where each element
>> +represents properties for how much C<$sid> prefers not to be on the node.
>> +Currently, this is a binary C<$strict> field, which means either it should not
>> +be there (C<0>) or must not be there (C<1>).
>> +
>> +=cut
>> +
>> +sub get_colocation_preference {
>> +    my ($rules, $sid, $online_node_usage) = @_;
>> +
>> +    my $services = get_colocated_services($rules, $sid, $online_node_usage);
>> +
>> +    my $together = {};
>> +    my $separate = {};
>> +
>> +    for my $service (values %$services) {
>> +	my $node = $service->{node};
>> +
>> +	next if !$node;
>> +
>> +	my $node_set = $service->{affinity} eq 'together' ? $together : $separate;
>> +	$node_set->{$node}->{strict} = $node_set->{$node}->{strict} || $service->{strict};
>> +    }
>> +
>> +    return ($together, $separate);
>> +}
>> +
>> +=head3 apply_positive_colocation_rules($together, $allowed_nodes)
>> +
>> +Applies the positive colocation preference C<$together> on the allowed node
>> +hash set C<$allowed_nodes> directly.
>> +
>> +Positive colocation means keeping services together on a single node, and
>> +therefore minimizing the separation of services.
>> +
>> +The allowed node hash set C<$allowed_nodes> is expected to contain any node,
>> +which is available to the service, i.e. each node is currently online, is
>> +available according to other location constraints, and the service has not
>> +failed running there yet.
>> +
>> +=cut
>> +
>> +sub apply_positive_colocation_rules {
>> +    my ($together, $allowed_nodes) = @_;
>> +
>> +    return if scalar(keys %$together) < 1;
>> +
>> +    my $mandatory_nodes = {};
>> +    my $possible_nodes = PVE::HA::Tools::intersect($allowed_nodes, $together);
>> +
>> +    for my $node (sort keys %$together) {
>> +	$mandatory_nodes->{$node} = 1 if $together->{$node}->{strict};
>> +    }
>> +
>> +    if (scalar keys %$mandatory_nodes) {
>> +	# limit to only the nodes the service must be on.
>> +	for my $node (keys %$allowed_nodes) {
>> +	    next if exists($mandatory_nodes->{$node});
>> +
>> +	    delete $allowed_nodes->{$node};
>> +	}
>> +    } elsif (scalar keys %$possible_nodes) {
> 
> I am not sure I follow this logic here.. if there are any strict
> requirements, we only honor those.. if there are no strict requirements,
> we only honor the non-strict ones?

Please correct me if I'm wrong, but to my understanding this seems 
right, because the nodes in $together are the nodes that other 
co-located services are already running on.

If there is a co-located service already running somewhere and the 
services MUST be kept together, then there will be an entry like 'node3' 
=> { strict => 1 } in $together. AFAICS we can then ignore any 
non-strict nodes here, because we already know where the service MUST run.

If there is a co-located service already running somewhere and the 
services SHOULD be kept together, then there will be one or more 
entries, e.g. $together = { 'node1' => { strict => 0 }, 'node2' => { 
strict => 0 } };

If there is no co-located service already running somewhere, then 
$together = {}; and this subroutine won't do anything to $allowed_nodes.

In theory, we could assume that %$mandatory_nodes always contains only 
one node, because it is mandatory. But currently we do not hinder users 
from manually migrating against colocation rules (maybe we should?), and 
rules could also suddenly change from non-strict to strict. We do not 
auto-migrate when rules change (maybe we should?).
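
To make the above concrete with a small example (sketch):

    # strict case: a service that MUST be kept together already runs on node3
    #   $together      = { node3 => { strict => 1 } }
    #   $allowed_nodes = { node1 => 1, node2 => 1, node3 => 1 }
    # => $allowed_nodes is reduced to { node3 => 1 }

    # non-strict case: co-located services run on node1 and node2
    #   $together      = { node1 => { strict => 0 }, node2 => { strict => 0 } }
    # => $allowed_nodes is reduced to { node1 => 1, node2 => 1 }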

-----

On another note, intersect() is used here with $together (and 
set_difference() with $separate below), which goes against what I said 
in patch #5 about only using hash sets, but since only the truthiness of 
the values matters anyway, it happens to work here. I'll make that more 
robust in a v1.

> 
>> +	# limit to the possible nodes the service should be on, if there are any.
>> +	for my $node (keys %$allowed_nodes) {
>> +	    next if exists($possible_nodes->{$node});
>> +
>> +	    delete $allowed_nodes->{$node};
>> +	}
> 
> this is the same code twice, just operating on different hash
> references, so could probably be a lot shorter. the next and delete
> could also be combined (`delete .. if !...`).

Yes, I wanted to break this down more and will improve it, thanks for 
the suggestion with the post-if delete!

I guess we can also move the definition + assignment of $possible_nodes 
down here too, as it isn't needed for the $mandatory_nodes case, 
provided the general behavior doesn't change.
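
I.e., something like:

    for my $node (keys %$allowed_nodes) {
	delete $allowed_nodes->{$node} if !exists($possible_nodes->{$node});
    }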

> 
>> +    }
>> +}
>> +
>> +=head3 apply_negative_colocation_rules($separate, $allowed_nodes)
>> +
>> +Applies the negative colocation preference C<$separate> on the allowed node
>> +hash set C<$allowed_nodes> directly.
>> +
>> +Negative colocation means keeping services separate on multiple nodes, and
>> +therefore maximizing the separation of services.
>> +
>> +The allowed node hash set C<$allowed_nodes> is expected to contain any node,
>> +which is available to the service, i.e. each node is currently online, is
>> +available according to other location constraints, and the service has not
>> +failed running there yet.
>> +
>> +=cut
>> +
>> +sub apply_negative_colocation_rules {
>> +    my ($separate, $allowed_nodes) = @_;
>> +
>> +    return if scalar(keys %$separate) < 1;
>> +
>> +    my $mandatory_nodes = {};
>> +    my $possible_nodes = PVE::HA::Tools::set_difference($allowed_nodes, $separate);
> 
> this is confusing or I misunderstand something here, see below..
> 
>> +
>> +    for my $node (sort keys %$separate) {
>> +	$mandatory_nodes->{$node} = 1 if $separate->{$node}->{strict};
>> +    }
>> +
>> +    if (scalar keys %$mandatory_nodes) {
>> +	# limit to the nodes the service must not be on.
> 
> this is missing a not?
> we are limiting to the nodes the service must not not be on :-P
> 
> should we rename mandatory_nodes to forbidden_nodes?

Good idea, yes this would be a much better fitting name. When I wrote 
$mandatory_nodes as above, I was always thinking 'mandatory to not be 
there'...

> 
>> +	for my $node (keys %$allowed_nodes) {
> 
> this could just loop over the forbidden nodes and delete them from
> allowed nodes?

Yes, this should also be possible. I think I had a counter example in an 
earlier version, where this didn't work, but now it should make sense.
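
I.e. (using the renamed variable from above):

    delete $allowed_nodes->{$_} for keys %$forbidden_nodes;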

> 
>> +	    next if !exists($mandatory_nodes->{$node});
>> +
>> +	    delete $allowed_nodes->{$node};
>> +	}
>> +    } elsif (scalar keys %$possible_nodes) {
> 
> similar to above - if we have strict exclusions, we honor them, but we
> ignore the non-strict exclusions unless there are no strict ones?

Same principle as above, but now $separate holds all the nodes that the 
anti-colocated services are already running on, so we try not to select 
a node from that set.
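
A small example for the strict case (sketch):

    # an anti-colocated service that MUST be kept apart already runs on node2
    #   $separate      = { node2 => { strict => 1 } }
    #   $allowed_nodes = { node1 => 1, node2 => 1, node3 => 1 }
    # => node2 is removed, leaving $allowed_nodes = { node1 => 1, node3 => 1 }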

> 
>> +	# limit to the nodes the service should not be on, if any.
>> +	for my $node (keys %$allowed_nodes) {
>> +	    next if exists($possible_nodes->{$node});
>> +
>> +	    delete $allowed_nodes->{$node};
>> +	}
>> +    }
>> +}
>> +
>> +sub apply_colocation_rules {
>> +    my ($rules, $sid, $allowed_nodes, $online_node_usage) = @_;
>> +
>> +    my ($together, $separate) = get_colocation_preference($rules, $sid, $online_node_usage);
>> +
>> +    apply_positive_colocation_rules($together, $allowed_nodes);
>> +    apply_negative_colocation_rules($separate, $allowed_nodes);
>> +}
>> +
>>   sub select_service_node {
>> -    my ($groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
>> +    # TODO Cleanup this signature post-RFC
>> +    my ($rules, $groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
>>   
>>       my $group = get_service_group($groups, $online_node_usage, $service_conf);
>>   
>> @@ -189,6 +382,8 @@ sub select_service_node {
>>   
>>       return $current_node if (!$try_next && !$best_scored) && $pri_nodes->{$current_node};
>>   
>> +    apply_colocation_rules($rules, $sid, $pri_nodes, $online_node_usage);
>> +
>>       my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);
>>       my @nodes = sort {
>>   	$scores->{$a} <=> $scores->{$b} || $a cmp $b
>> @@ -758,6 +953,7 @@ sub next_state_request_start {
>>   
>>       if ($self->{crs}->{rebalance_on_request_start}) {
>>   	my $selected_node = select_service_node(
>> +	    $self->{rules},
>>   	    $self->{groups},
>>   	    $self->{online_node_usage},
>>   	    $sid,
>> @@ -771,6 +967,9 @@ sub next_state_request_start {
>>   	my $select_text = $selected_node ne $current_node ? 'new' : 'current';
>>   	$haenv->log('info', "service $sid: re-balance selected $select_text node $selected_node for startup");
>>   
>> +	# TODO It would be better if this information would be retrieved from $ss/$sd post-RFC
>> +	$self->{online_node_usage}->pin_service_node($sid, $selected_node);
>> +
>>   	if ($selected_node ne $current_node) {
>>   	    $change_service_state->($self, $sid, 'request_start_balance', node => $current_node, target => $selected_node);
>>   	    return;
>> @@ -898,6 +1097,7 @@ sub next_state_started {
>>   	    }
>>   
>>   	    my $node = select_service_node(
>> +		$self->{rules},
>>   	        $self->{groups},
>>   		$self->{online_node_usage},
>>   		$sid,
>> @@ -1004,6 +1204,7 @@ sub next_state_recovery {
>>       $self->recompute_online_node_usage(); # we want the most current node state
>>   
>>       my $recovery_node = select_service_node(
>> +	$self->{rules},
>>   	$self->{groups},
>>   	$self->{online_node_usage},
>>   	$sid,
>> diff --git a/src/test/test_failover1.pl b/src/test/test_failover1.pl
>> index 308eab3..4c84fbd 100755
>> --- a/src/test/test_failover1.pl
>> +++ b/src/test/test_failover1.pl
>> @@ -8,6 +8,8 @@ use PVE::HA::Groups;
>>   use PVE::HA::Manager;
>>   use PVE::HA::Usage::Basic;
>>   
>> +my $rules = {};
>> +
>>   my $groups = PVE::HA::Groups->parse_config("groups.tmp", <<EOD);
>>   group: prefer_node1
>>   	nodes node1
>> @@ -31,7 +33,7 @@ sub test {
>>       my ($expected_node, $try_next) = @_;
>>       
>>       my $node = PVE::HA::Manager::select_service_node
>> -	($groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
>> +	($rules, $groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
>>   
>>       my (undef, undef, $line) = caller();
>>       die "unexpected result: $node != ${expected_node} at line $line\n"
>> -- 
>> 2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2025-04-11 15:57 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-25 15:12 [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH cluster 1/1] cfs: add 'ha/rules.cfg' to observed files Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 01/15] ignore output of fence config tests in tree Daniel Kral
2025-03-25 17:49   ` [pve-devel] applied: " Thomas Lamprecht
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 02/15] tools: add hash set helper subroutines Daniel Kral
2025-03-25 17:53   ` Thomas Lamprecht
2025-04-03 12:16     ` Fabian Grünbichler
2025-04-11 11:24       ` Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 03/15] usage: add get_service_node and pin_service_node methods Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 04/15] add rules section config base plugin Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 05/15] rules: add colocation rule plugin Daniel Kral
2025-04-03 12:16   ` Fabian Grünbichler
2025-04-11 11:04     ` Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 06/15] config, env, hw: add rules read and parse methods Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 07/15] manager: read and update rules config Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 08/15] manager: factor out prioritized nodes in select_service_node Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes Daniel Kral
2025-04-03 12:17   ` Fabian Grünbichler
2025-04-11 15:56     ` Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 10/15] sim: resources: add option to limit start and migrate tries to node Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 11/15] test: ha tester: add test cases for strict negative colocation rules Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 12/15] test: ha tester: add test cases for strict positive " Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 13/15] test: ha tester: add test cases for loose " Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 14/15] test: ha tester: add test cases in more complex scenarios Daniel Kral
2025-03-25 15:12 ` [pve-devel] [PATCH ha-manager 15/15] test: add test cases for rules config Daniel Kral
2025-03-25 16:47 ` [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules Daniel Kral
2025-04-01  1:50 ` DERUMIER, Alexandre
2025-04-01  9:39   ` Daniel Kral
2025-04-01 11:05     ` DERUMIER, Alexandre via pve-devel
2025-04-03 12:26     ` Fabian Grünbichler
