public inbox for pve-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager 09/15] manager: apply colocation rules when selecting service nodes
Date: Thu, 03 Apr 2025 14:17:02 +0200
Message-ID: <1743681290.ng3l34qeu2.astroid@yuna.none>
In-Reply-To: <20250325151254.193177-11-d.kral@proxmox.com>

On March 25, 2025 4:12 pm, Daniel Kral wrote:
> Add a mechanism to the node selection subroutine, which enforces the
> colocation rules defined in the rules config.
> 
> The algorithm manipulates the set of nodes directly, which the service
> is allowed to run on, depending on the type and strictness of the
> colocation rules, if there are any.

shouldn't this first attempt to satisfy all rules (strict and
non-strict), and only if that leaves no node, retry with just the strict
ones, or something similar? see comments below (maybe I am missing or
misunderstanding something)
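
a rough sketch of such a fallback, to illustrate the idea only. the
helper names (`filter_nodes`, `select_allowed_nodes`) are made up and not
part of the patch, and plain hashes stand in for the real preference and
node data:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: returns a copy of $allowed
# narrowed by the given preferences. With $strict_only set, loose
# (non-strict) entries are ignored.
sub filter_nodes {
    my ($allowed, $together, $separate, $strict_only) = @_;

    # turn a preference hash into a plain node set, optionally strict-only
    my $to_set = sub {
        my ($pref) = @_;
        my %set = map { $_ => 1 }
            grep { !$strict_only || $pref->{$_}->{strict} } keys %$pref;
        return \%set;
    };

    my $keep = $to_set->($together);
    my $avoid = $to_set->($separate);

    my $nodes = { %$allowed };
    for my $node (keys %$nodes) {
        delete $nodes->{$node} if (%$keep && !$keep->{$node}) || $avoid->{$node};
    }

    return $nodes;
}

# Hypothetical wrapper: first try to satisfy all rules, and only fall back
# to the strict ones if that leaves no node at all.
sub select_allowed_nodes {
    my ($allowed, $together, $separate) = @_;

    my $nodes = filter_nodes($allowed, $together, $separate, 0);
    $nodes = filter_nodes($allowed, $together, $separate, 1) if !%$nodes;

    return $nodes;
}
```

unlike the patch, this assumed variant returns a new hash set instead of
modifying the allowed nodes in place, which makes the retry trivial.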

> 
> This makes it depend on the prior removal of any nodes, which are
> unavailable (i.e. offline, unreachable, or weren't able to start the
> service in previous tries) or are not allowed to be run on otherwise
> (i.e. HA group node restrictions) to function correctly.
> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  src/PVE/HA/Manager.pm      | 203 ++++++++++++++++++++++++++++++++++++-
>  src/test/test_failover1.pl |   4 +-
>  2 files changed, 205 insertions(+), 2 deletions(-)
> 
> diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
> index 8f2ab3d..79b6555 100644
> --- a/src/PVE/HA/Manager.pm
> +++ b/src/PVE/HA/Manager.pm
> @@ -157,8 +157,201 @@ sub get_node_priority_groups {
>      return ($pri_groups, $group_members);
>  }
>  
> +=head3 get_colocated_services($rules, $sid, $online_node_usage)
> +
> +Returns a hash map of all services, which are specified as being in a positive
> +or negative colocation in C<$rules> with the given service with id C<$sid>.
> +
> +Each service entry consists of the type of colocation, strictness of colocation
> +and the node the service is currently assigned to, if any, according to
> +C<$online_node_usage>.
> +
> +For example, a service C<'vm:101'> being strictly colocated together (positive)
> +with two other services C<'vm:102'> and C<'vm:103'> and loosely colocated
> +separate with another service C<'vm:104'> results in the hash map:
> +
> +    {
> +	'vm:102' => {
> +	    affinity => 'together',
> +	    strict => 1,
> +	    node => 'node2'
> +	},
> +	'vm:103' => {
> +	    affinity => 'together',
> +	    strict => 1,
> +	    node => 'node2'
> +	},
> +	'vm:104' => {
> +	    affinity => 'separate',
> +	    strict => 0,
> +	    node => undef
> +	}
> +    }
> +
> +=cut
> +
> +sub get_colocated_services {
> +    my ($rules, $sid, $online_node_usage) = @_;
> +
> +    my $services = {};
> +
> +    PVE::HA::Rules::Colocation::foreach_colocation_rule($rules, sub {
> +	my ($rule) = @_;
> +
> +	for my $csid (sort keys %{$rule->{services}}) {
> +	    next if $csid eq $sid;
> +
> +	    $services->{$csid} = {
> +		node => $online_node_usage->get_service_node($csid),
> +		affinity => $rule->{affinity},
> +		strict => $rule->{strict},
> +	    };
> +        }
> +    }, {
> +	sid => $sid,
> +    });
> +
> +    return $services;
> +}
> +
> +=head3 get_colocation_preference($rules, $sid, $online_node_usage)
> +
> +Returns a list of two hashes, where each is a hash map of the colocation
> +preference of C<$sid>, according to the colocation rules in C<$rules> and the
> +service locations in C<$online_node_usage>.
> +
> +The first hash is the positive colocation preference, where each element
> +represents properties for how much C<$sid> prefers to be on the node.
> +Currently, this is a binary C<$strict> field, which means either it should be
> +there (C<0>) or must be there (C<1>).
> +
> +The second hash is the negative colocation preference, where each element
> +represents properties for how much C<$sid> prefers not to be on the node.
> +Currently, this is a binary C<$strict> field, which means either it should not
> +be there (C<0>) or must not be there (C<1>).
> +
> +=cut
> +
> +sub get_colocation_preference {
> +    my ($rules, $sid, $online_node_usage) = @_;
> +
> +    my $services = get_colocated_services($rules, $sid, $online_node_usage);
> +
> +    my $together = {};
> +    my $separate = {};
> +
> +    for my $service (values %$services) {
> +	my $node = $service->{node};
> +
> +	next if !$node;
> +
> +	my $node_set = $service->{affinity} eq 'together' ? $together : $separate;
> +	$node_set->{$node}->{strict} = $node_set->{$node}->{strict} || $service->{strict};
> +    }
> +
> +    return ($together, $separate);
> +}
> +
> +=head3 apply_positive_colocation_rules($together, $allowed_nodes)
> +
> +Applies the positive colocation preference C<$together> on the allowed node
> +hash set C<$allowed_nodes> directly.
> +
> +Positive colocation means keeping services together on a single node, and
> +therefore minimizing the separation of services.
> +
> +The allowed node hash set C<$allowed_nodes> is expected to contain any node,
> +which is available to the service, i.e. each node is currently online, is
> +available according to other location constraints, and the service has not
> +failed running there yet.
> +
> +=cut
> +
> +sub apply_positive_colocation_rules {
> +    my ($together, $allowed_nodes) = @_;
> +
> +    return if scalar(keys %$together) < 1;
> +
> +    my $mandatory_nodes = {};
> +    my $possible_nodes = PVE::HA::Tools::intersect($allowed_nodes, $together);
> +
> +    for my $node (sort keys %$together) {
> +	$mandatory_nodes->{$node} = 1 if $together->{$node}->{strict};
> +    }
> +
> +    if (scalar keys %$mandatory_nodes) {
> +	# limit to only the nodes the service must be on.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($mandatory_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    } elsif (scalar keys %$possible_nodes) {

I am not sure I follow the logic here: if there are any strict
requirements, we only honor those, and only if there are no strict
requirements at all do we honor the non-strict ones?

> +	# limit to the possible nodes the service should be on, if there are any.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($possible_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}

this is the same code twice, just operating on different hash
references, so could probably be a lot shorter. the next and delete
could also be combined (`delete .. if !...`).
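
for example, something along these lines (a sketch only, `limit_to_nodes`
is a made-up name and not an existing helper):

```perl
use strict;
use warnings;

# Hypothetical helper: drop every node from $allowed_nodes that is not in
# $keep_nodes, modifying $allowed_nodes in place.
sub limit_to_nodes {
    my ($allowed_nodes, $keep_nodes) = @_;

    for my $node (keys %$allowed_nodes) {
        delete $allowed_nodes->{$node} if !exists($keep_nodes->{$node});
    }
}
```

both branches would then shrink to a single call each, passing either
$mandatory_nodes or $possible_nodes as the second argument.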

> +    }
> +}
> +
> +=head3 apply_negative_colocation_rules($separate, $allowed_nodes)
> +
> +Applies the negative colocation preference C<$separate> on the allowed node
> +hash set C<$allowed_nodes> directly.
> +
> +Negative colocation means keeping services separate on multiple nodes, and
> +therefore maximizing the separation of services.
> +
> +The allowed node hash set C<$allowed_nodes> is expected to contain any node,
> +which is available to the service, i.e. each node is currently online, is
> +available according to other location constraints, and the service has not
> +failed running there yet.
> +
> +=cut
> +
> +sub apply_negative_colocation_rules {
> +    my ($separate, $allowed_nodes) = @_;
> +
> +    return if scalar(keys %$separate) < 1;
> +
> +    my $mandatory_nodes = {};
> +    my $possible_nodes = PVE::HA::Tools::set_difference($allowed_nodes, $separate);

this is confusing or I misunderstand something here, see below..

> +
> +    for my $node (sort keys %$separate) {
> +	$mandatory_nodes->{$node} = 1 if $separate->{$node}->{strict};
> +    }
> +
> +    if (scalar keys %$mandatory_nodes) {
> +	# limit to the nodes the service must not be on.

this is missing a not?
we are limiting to the nodes the service must not not be on :-P

should we rename mandatory_nodes to forbidden_nodes?

> +	for my $node (keys %$allowed_nodes) {

this could just loop over the forbidden nodes and delete them from
allowed nodes?
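
i.e., roughly like this (sketch, with `$forbidden_nodes` being what the
patch calls $mandatory_nodes and `remove_forbidden_nodes` a made-up
name):

```perl
use strict;
use warnings;

# Hypothetical helper: remove all forbidden nodes from the allowed node set
# in place. Deleting a key that does not exist is a no-op in Perl, so no
# exists() check is needed.
sub remove_forbidden_nodes {
    my ($allowed_nodes, $forbidden_nodes) = @_;

    delete $allowed_nodes->{$_} for keys %$forbidden_nodes;
}
```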

> +	    next if !exists($mandatory_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    } elsif (scalar keys %$possible_nodes) {

similar to above - if we have strict exclusions, we honor them, but we
ignore the non-strict exclusions unless there are no strict ones?
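
one possible way to honor both, sketched under the assumption that loose
exclusions should still apply on top of strict ones as long as they leave
at least one node (`apply_separate_preference` is a made-up name, and
this is not necessarily the intended semantics):

```perl
use strict;
use warnings;

# Hypothetical variant: $separate maps node => { strict => 0|1 } as in the
# patch, $allowed_nodes is modified in place.
sub apply_separate_preference {
    my ($separate, $allowed_nodes) = @_;

    # strict exclusions always apply
    for my $node (keys %$separate) {
        delete $allowed_nodes->{$node} if $separate->{$node}->{strict};
    }

    # loose exclusions only apply if they leave at least one node
    my @remaining = grep { !exists($separate->{$_}) } keys %$allowed_nodes;
    return if !@remaining;

    for my $node (keys %$allowed_nodes) {
        delete $allowed_nodes->{$node} if exists($separate->{$node});
    }
}
```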

> +	# limit to the nodes the service should not be on, if any.
> +	for my $node (keys %$allowed_nodes) {
> +	    next if exists($possible_nodes->{$node});
> +
> +	    delete $allowed_nodes->{$node};
> +	}
> +    }
> +}
> +
> +sub apply_colocation_rules {
> +    my ($rules, $sid, $allowed_nodes, $online_node_usage) = @_;
> +
> +    my ($together, $separate) = get_colocation_preference($rules, $sid, $online_node_usage);
> +
> +    apply_positive_colocation_rules($together, $allowed_nodes);
> +    apply_negative_colocation_rules($separate, $allowed_nodes);
> +}
> +
>  sub select_service_node {
> -    my ($groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
> +    # TODO Cleanup this signature post-RFC
> +    my ($rules, $groups, $online_node_usage, $sid, $service_conf, $current_node, $try_next, $tried_nodes, $maintenance_fallback, $best_scored) = @_;
>  
>      my $group = get_service_group($groups, $online_node_usage, $service_conf);
>  
> @@ -189,6 +382,8 @@ sub select_service_node {
>  
>      return $current_node if (!$try_next && !$best_scored) && $pri_nodes->{$current_node};
>  
> +    apply_colocation_rules($rules, $sid, $pri_nodes, $online_node_usage);
> +
>      my $scores = $online_node_usage->score_nodes_to_start_service($sid, $current_node);
>      my @nodes = sort {
>  	$scores->{$a} <=> $scores->{$b} || $a cmp $b
> @@ -758,6 +953,7 @@ sub next_state_request_start {
>  
>      if ($self->{crs}->{rebalance_on_request_start}) {
>  	my $selected_node = select_service_node(
> +	    $self->{rules},
>  	    $self->{groups},
>  	    $self->{online_node_usage},
>  	    $sid,
> @@ -771,6 +967,9 @@ sub next_state_request_start {
>  	my $select_text = $selected_node ne $current_node ? 'new' : 'current';
>  	$haenv->log('info', "service $sid: re-balance selected $select_text node $selected_node for startup");
>  
> +	# TODO It would be better if this information would be retrieved from $ss/$sd post-RFC
> +	$self->{online_node_usage}->pin_service_node($sid, $selected_node);
> +
>  	if ($selected_node ne $current_node) {
>  	    $change_service_state->($self, $sid, 'request_start_balance', node => $current_node, target => $selected_node);
>  	    return;
> @@ -898,6 +1097,7 @@ sub next_state_started {
>  	    }
>  
>  	    my $node = select_service_node(
> +		$self->{rules},
>  	        $self->{groups},
>  		$self->{online_node_usage},
>  		$sid,
> @@ -1004,6 +1204,7 @@ sub next_state_recovery {
>      $self->recompute_online_node_usage(); # we want the most current node state
>  
>      my $recovery_node = select_service_node(
> +	$self->{rules},
>  	$self->{groups},
>  	$self->{online_node_usage},
>  	$sid,
> diff --git a/src/test/test_failover1.pl b/src/test/test_failover1.pl
> index 308eab3..4c84fbd 100755
> --- a/src/test/test_failover1.pl
> +++ b/src/test/test_failover1.pl
> @@ -8,6 +8,8 @@ use PVE::HA::Groups;
>  use PVE::HA::Manager;
>  use PVE::HA::Usage::Basic;
>  
> +my $rules = {};
> +
>  my $groups = PVE::HA::Groups->parse_config("groups.tmp", <<EOD);
>  group: prefer_node1
>  	nodes node1
> @@ -31,7 +33,7 @@ sub test {
>      my ($expected_node, $try_next) = @_;
>      
>      my $node = PVE::HA::Manager::select_service_node
> -	($groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
> +	($rules, $groups, $online_node_usage, "vm:111", $service_conf, $current_node, $try_next);
>  
>      my (undef, undef, $line) = caller();
>      die "unexpected result: $node != ${expected_node} at line $line\n" 
> -- 
> 2.39.5
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 




