From: Fiona Ebner <f.ebner@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Daniel Kral <d.kral@proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes
Date: Mon, 28 Apr 2025 14:26:43 +0200
Message-ID: <67d201c9-64ca-4392-b166-aa14cb2bae48@proxmox.com>
In-Reply-To: <20250325151254.193177-11-d.kral@proxmox.com>

Am 25.03.25 um 16:12 schrieb Daniel Kral:
> Add a mechanism to the node selection subroutine that enforces the
> colocation rules defined in the rules config.
> 
> The algorithm directly manipulates the set of nodes the service is
> allowed to run on, depending on the type and strictness of the
> colocation rules, if there are any.
> 
> To function correctly, this depends on the prior removal of any nodes
> that are unavailable (i.e. offline, unreachable, or failed to start the
> service in previous tries) or that the service is otherwise not allowed
> to run on (i.e. HA group node restrictions).
> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  src/PVE/HA/Manager.pm      | 203 ++++++++++++++++++++++++++++++++++++-
>  src/test/test_failover1.pl |   4 +-
>  2 files changed, 205 insertions(+), 2 deletions(-)
> 
> diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
> index 8f2ab3d..79b6555 100644
> --- a/src/PVE/HA/Manager.pm
> +++ b/src/PVE/HA/Manager.pm
> @@ -157,8 +157,201 @@ sub get_node_priority_groups {
>      return ($pri_groups, $group_members);
>  }
>  

I feel like these helper functions should rather go into the colocation
plugin or some other module to not bloat up Manager.pm more.

> +=head3 get_colocated_services($rules, $sid, $online_node_usage)
> +
> +Returns a hash map of all services that are specified in C<$rules> as being in
> +a positive or negative colocation with the given service with id C<$sid>.
> +
> +Each service entry consists of the type of colocation, strictness of colocation
> +and the node the service is currently assigned to, if any, according to
> +C<$online_node_usage>.
> +
> +For example, a service C<'vm:101'> being strictly colocated together (positive)
> +with two other services C<'vm:102'> and C<'vm:103'> and loosely colocated
> +apart (negative) from another service C<'vm:104'> results in the hash map:
> +
> +    {
> +	'vm:102' => {
> +	    affinity => 'together',
> +	    strict => 1,
> +	    node => 'node2'
> +	},
> +	'vm:103' => {
> +	    affinity => 'together',
> +	    strict => 1,
> +	    node => 'node2'
> +	},
> +	'vm:104' => {
> +	    affinity => 'separate',
> +	    strict => 0,
> +	    node => undef

Why is the node undef here?

> +	}
> +    }
> +
> +=cut
> +
> +sub get_colocated_services {
> +    my ($rules, $sid, $online_node_usage) = @_;
> +
> +    my $services = {};
> +
> +    PVE::HA::Rules::Colocation::foreach_colocation_rule($rules, sub {
> +	my ($rule) = @_;
> +
> +	for my $csid (sort keys %{$rule->{services}}) {
> +	    next if $csid eq $sid;
> +
> +	    $services->{$csid} = {
> +		node => $online_node_usage->get_service_node($csid),
> +		affinity => $rule->{affinity},
> +		strict => $rule->{strict},
> +	    };
> +        }
> +    }, {
> +	sid => $sid,
> +    });
> +
> +    return $services;
> +}
> +
> +=head3 get_colocation_preference($rules, $sid, $online_node_usage)
> +
> +Returns a list of two hashes, where each is a hash map of the colocation
> +preference of C<$sid>, according to the colocation rules in C<$rules> and the
> +service locations in C<$online_node_usage>.
> +
> +The first hash is the positive colocation preference, where each element
> +represents properties for how much C<$sid> prefers to be on the node.
> +Currently, this is a binary C<$strict> field, which means either it should be

s/it/the service/

> +there (C<0>) or must be there (C<1>).
> +
> +The second hash is the negative colocation preference, where each element
> +represents properties for how much C<$sid> prefers not to be on the node.
> +Currently, this is a binary C<$strict> field, which means either it should not

s/it/the service/

> +be there (C<0>) or must not be there (C<1>).
> +
> +=cut
> +
> +sub get_colocation_preference {
> +    my ($rules, $sid, $online_node_usage) = @_;
> +
> +    my $services = get_colocated_services($rules, $sid, $online_node_usage);

The name $services is a bit too generic, maybe $colocation_per_service
or something?

Maybe it would be better to just merge this one and the helper above
into a single one? I.e. just handle the info while iterating the rules
directly instead of creating a novel temporary per-service
data-structure and iterate twice.
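
Untested, but roughly what I have in mind, merging both helpers into a
single iteration over the rules (keeping the name of the second helper):

    sub get_colocation_preference {
        my ($rules, $sid, $online_node_usage) = @_;

        my $together = {};
        my $separate = {};

        PVE::HA::Rules::Colocation::foreach_colocation_rule($rules, sub {
            my ($rule) = @_;

            my $node_set = $rule->{affinity} eq 'together' ? $together : $separate;

            for my $csid (sort keys %{$rule->{services}}) {
                next if $csid eq $sid;

                my $node = $online_node_usage->get_service_node($csid);
                next if !$node;

                # a node is strict as soon as any rule mentioning it is strict
                $node_set->{$node}->{strict} ||= $rule->{strict};
            }
        }, {
            sid => $sid,
        });

        return ($together, $separate);
    }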

> +
> +    my $together = {};
> +    my $separate = {};
> +
> +    for my $service (values %$services) {
> +	my $node = $service->{node};
> +
> +	next if !$node;
> +
> +	my $node_set = $service->{affinity} eq 'together' ? $together : $separate;
> +	$node_set->{$node}->{strict} = $node_set->{$node}->{strict} || $service->{strict};
> +    }
> +
> +    return ($together, $separate);
> +}
> +
> +=head3 apply_positive_colocation_rules($together, $allowed_nodes)
> +
> +Applies the positive colocation preference C<$together> on the allowed node
> +hash set C<$allowed_nodes> directly.
> +
> +Positive colocation means keeping services together on a single node, and
> +therefore minimizing the separation of services.
> +
> +The allowed node hash set C<$allowed_nodes> is expected to contain every node
> +that is available to the service, i.e. each node is currently online, is
> +available according to other location constraints, and the service has not
> +failed running there yet.
> +
> +=cut
> +
> +sub apply_positive_colocation_rules {
> +    my ($together, $allowed_nodes) = @_;
> +
> +    return if scalar(keys %$together) < 1;
> +
> +    my $mandatory_nodes = {};
> +    my $possible_nodes = PVE::HA::Tools::intersect($allowed_nodes, $together);
> +
> +    for my $node (sort keys %$together) {
> +	$mandatory_nodes->{$node} = 1 if $together->{$node}->{strict};
> +    }
> +
> +    if (scalar keys %$mandatory_nodes) {
> +	# limit to only the nodes the service must be on.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($mandatory_nodes->{$node});

Style nit: I'd avoid using exists() if you explicitly expect a set
value. Otherwise, it can break because of accidental auto-vivification
in the future.
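
I.e., since the values are explicitly set to 1 above, checking for
truthiness is enough:

    next if $mandatory_nodes->{$node};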

> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    } elsif (scalar keys %$possible_nodes) {
> +	# limit to the possible nodes the service should be on, if there are any.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($possible_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};

This seems wrong. Non-strict rules should not limit the allowed nodes.
See below for more on this.

> +	}
> +    }
> +}
> +
> +=head3 apply_negative_colocation_rules($separate, $allowed_nodes)
> +
> +Applies the negative colocation preference C<$separate> on the allowed node
> +hash set C<$allowed_nodes> directly.
> +
> +Negative colocation means keeping services separate on multiple nodes, and
> +therefore maximizing the separation of services.
> +
> +The allowed node hash set C<$allowed_nodes> is expected to contain every node
> +that is available to the service, i.e. each node is currently online, is
> +available according to other location constraints, and the service has not
> +failed running there yet.
> +
> +=cut
> +
> +sub apply_negative_colocation_rules {
> +    my ($separate, $allowed_nodes) = @_;
> +
> +    return if scalar(keys %$separate) < 1;
> +
> +    my $mandatory_nodes = {};
> +    my $possible_nodes = PVE::HA::Tools::set_difference($allowed_nodes, $separate);
> +
> +    for my $node (sort keys %$separate) {
> +	$mandatory_nodes->{$node} = 1 if $separate->{$node}->{strict};
> +    }
> +
> +    if (scalar keys %$mandatory_nodes) {
> +	# limit to the nodes the service must not be on.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if !exists($mandatory_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    } elsif (scalar keys %$possible_nodes) {
> +	# limit to the nodes the service should not be on, if any.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($possible_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    }
> +}
> +
> +sub apply_colocation_rules {
> +    my ($rules, $sid, $allowed_nodes, $online_node_usage) = @_;
> +
> +    my ($together, $separate) = get_colocation_preference($rules, $sid, $online_node_usage);
> +
> +    apply_positive_colocation_rules($together, $allowed_nodes);
> +    apply_negative_colocation_rules($separate, $allowed_nodes);

I think there could be a problematic scenario with
* no strict positive rules, but loose positive rules
* strict negative rules
where apply_positive_colocation_rules() will limit $allowed_nodes in
such a way that the strict negative rules cannot be satisfied anymore
afterwards. E.g. if a loose positive partner currently runs on the same
node as a strict negative partner, $allowed_nodes is first reduced to
just that node, which the strict negative rule then removes, leaving no
candidates even though other nodes would have been fine.

I feel like what we actually want from non-strict rules is not to limit
the allowed nodes at all, but only express preferences. After scoring,
we could:
1. always take a colocation preference node if present no matter what
the usage score is
2. have a threshold for not following through, if there is a
non-colocation preference node with a significantly better usage score
3. somehow massage it into the score itself. E.g. every node that would
be preferred by colocation gets a 0.5 multiplier score adjustment while
other scores are unchanged - remember that lower score is better.
4. [insert your suggestion here]

So to me it seems like there should be a helper that gives us:
1. a list of nodes that satisfy strict rules - these we can then
intersect with the $pri_nodes
2. a list of nodes that are preferred by non-strict rules - these we can
consider after scoring (rough sketch below)
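
Very rough sketch of how select_service_node() could then use it -
untested, and get_colocation_nodes() is just a placeholder name for such
a helper:

    my ($strict_nodes, $preferred_nodes) =
        get_colocation_nodes($rules, $sid, $online_node_usage);

    if ($strict_nodes) { # undef when there are no strict colocation rules
        # hard requirements: intersect with the prioritized nodes
        for my $node (keys %$pri_nodes) {
            delete $pri_nodes->{$node} if !$strict_nodes->{$node};
        }
    }

    my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);

    # soft preferences: only bias the scoring, e.g. variant 3. from above
    for my $node (keys %$scores) {
        $scores->{$node} *= 0.5 if $preferred_nodes->{$node}; # lower is better
    }

That way, strict rules stay hard constraints, while non-strict rules can
never empty the candidate set.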

> +}
> +
>  sub select_service_node {
> -    my ($groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
> +    # TODO Cleanup this signature post-RFC
> +    my ($rules, $groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
>  
>      my $group = get_service_group($groups, $online_node_usage, $service_conf);
>  
> @@ -189,6 +382,8 @@ sub select_service_node {
>  
>      return $current_node if (!$try_next && !$best_scored) && $pri_nodes->{$current_node};
>  
> +    apply_colocation_rules($rules, $sid, $pri_nodes, $online_node_usage);
> +
>      my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);
>      my @nodes = sort {
>  	$scores->{$a} <=> $scores->{$b} || $a cmp $b


