From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH ha-manager 5/9] rules: resource affinity: decouple get_resource_affinity helper from Usage class
Date: Tue, 30 Sep 2025 16:19:15 +0200
Message-ID: <20250930142021.366529-9-d.kral@proxmox.com>
In-Reply-To: <20250930142021.366529-1-d.kral@proxmox.com>
The resource affinity rules need information about the nodes used by the
other HA resources in order to be enacted correctly; previously, this
information was proxied through $online_node_usage.
The new get_used_service_nodes(...) helper mirrors the logic of
recompute_online_node_usage(...) to retrieve the nodes on which an HA
resource $sid currently puts load.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
I wanted to put this information into PVE::HA::Usage directly, but figured
it would make Usage much more dependent on $ss / $sd. We can still do that
later on, or move the helper elsewhere, e.g. by making ServiceStatus its
own class?
src/PVE/HA/Manager.pm | 24 +++++++++++------------
src/PVE/HA/Rules/ResourceAffinity.pm | 15 ++++++++------
src/PVE/HA/Tools.pm | 29 ++++++++++++++++++++++++++++
src/PVE/HA/Usage.pm | 18 -----------------
src/PVE/HA/Usage/Basic.pm | 19 ------------------
src/PVE/HA/Usage/Static.pm | 19 ------------------
src/test/test_failover1.pl | 17 +++++++++-------
7 files changed, 59 insertions(+), 82 deletions(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 468e41eb..0226e427 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -121,12 +121,12 @@ sub flush_master_status {
=head3 select_service_node(...)
-=head3 select_service_node($rules, $online_node_usage, $sid, $service_conf, $sd, $node_preference)
+=head3 select_service_node($rules, $online_node_usage, $sid, $service_conf, $ss, $node_preference)
Used to select the best fitting node for the service C<$sid>, with the
-configuration C<$service_conf> and state C<$sd>, according to the rules defined
-in C<$rules>, available node utilization in C<$online_node_usage>, and the
-given C<$node_preference>.
+configuration C<$service_conf>, according to the rules defined in C<$rules>,
+available node utilization in C<$online_node_usage>, the service states in
+C<$ss> and the given C<$node_preference>.
The C<$node_preference> can be set to:
@@ -143,11 +143,12 @@ The C<$node_preference> can be set to:
=cut
sub select_service_node {
- my ($rules, $online_node_usage, $sid, $service_conf, $sd, $node_preference) = @_;
+ my ($rules, $online_node_usage, $sid, $service_conf, $ss, $node_preference) = @_;
die "'$node_preference' is not a valid node_preference for select_service_node\n"
if $node_preference !~ m/(none|best-score|try-next)/;
+ my $sd = $ss->{$sid};
my ($current_node, $tried_nodes, $maintenance_fallback) =
$sd->@{qw(node failed_nodes maintenance_node)};
@@ -155,7 +156,8 @@ sub select_service_node {
return undef if !%$pri_nodes;
- my ($together, $separate) = get_resource_affinity($rules, $sid, $online_node_usage);
+ my $online_nodes = { map { $_ => 1 } $online_node_usage->list_nodes() };
+ my ($together, $separate) = get_resource_affinity($rules, $sid, $ss, $online_nodes);
# stay on current node if possible (avoids random migrations)
if (
@@ -281,7 +283,6 @@ sub recompute_online_node_usage {
|| $state eq 'recovery'
) {
$online_node_usage->add_service_usage_to_node($sd->{node}, $sid, $sd->{node});
- $online_node_usage->set_service_node($sid, $sd->{node});
} elsif (
$state eq 'migrate'
|| $state eq 'relocate'
@@ -291,11 +292,9 @@ sub recompute_online_node_usage {
# count it for both, source and target as load is put on both
if ($state ne 'request_start_balance') {
$online_node_usage->add_service_usage_to_node($source, $sid, $source, $target);
- $online_node_usage->add_service_node($sid, $source);
}
if ($online_node_usage->contains_node($target)) {
$online_node_usage->add_service_usage_to_node($target, $sid, $source, $target);
- $online_node_usage->add_service_node($sid, $target);
}
} elsif ($state eq 'stopped' || $state eq 'request_start') {
# do nothing
@@ -308,7 +307,6 @@ sub recompute_online_node_usage {
# case a node dies, as we cannot really know if the to-be-aborted incoming migration
# has already cleaned up all used resources
$online_node_usage->add_service_usage_to_node($target, $sid, $sd->{node}, $target);
- $online_node_usage->set_service_node($sid, $target);
}
}
}
@@ -1022,7 +1020,7 @@ sub next_state_request_start {
$self->{online_node_usage},
$sid,
$cd,
- $sd,
+ $self->{ss},
'best-score',
);
my $select_text = $selected_node ne $current_node ? 'new' : 'current';
@@ -1189,7 +1187,7 @@ sub next_state_started {
$self->{online_node_usage},
$sid,
$cd,
- $sd,
+ $self->{ss},
$select_node_preference,
);
@@ -1303,7 +1301,7 @@ sub next_state_recovery {
$self->{online_node_usage},
$sid,
$cd,
- $sd,
+ $self->{ss},
'best-score',
);
diff --git a/src/PVE/HA/Rules/ResourceAffinity.pm b/src/PVE/HA/Rules/ResourceAffinity.pm
index 9bc039ba..968aa088 100644
--- a/src/PVE/HA/Rules/ResourceAffinity.pm
+++ b/src/PVE/HA/Rules/ResourceAffinity.pm
@@ -5,6 +5,7 @@ use warnings;
use PVE::HA::HashTools qw(set_intersect sets_are_disjoint);
use PVE::HA::Rules;
+use PVE::HA::Tools;
use base qw(Exporter);
use base qw(PVE::HA::Rules);
@@ -496,12 +497,12 @@ sub get_affinitive_resources : prototype($$) {
return ($together, $separate);
}
-=head3 get_resource_affinity($rules, $sid, $online_node_usage)
+=head3 get_resource_affinity($rules, $sid, $ss, $online_nodes)
Returns a list of two hashes, where the first describes the positive resource
affinity and the second hash describes the negative resource affinity for
-resource C<$sid> according to the resource affinity rules in C<$rules> and the
-resource locations in C<$online_node_usage>.
+resource C<$sid> according to the resource affinity rules in C<$rules>, the
+service status C<$ss> and the C<$online_nodes> hash.
For the positive resource affinity of a resource C<$sid>, each element in the
hash represents an online node, where other resources, which C<$sid> is in
@@ -529,8 +530,8 @@ resource C<$sid> is in a negative affinity with, the returned value will be:
=cut
-sub get_resource_affinity : prototype($$$) {
- my ($rules, $sid, $online_node_usage) = @_;
+sub get_resource_affinity : prototype($$$$) {
+ my ($rules, $sid, $ss, $online_nodes) = @_;
my $together = {};
my $separate = {};
@@ -543,7 +544,9 @@ sub get_resource_affinity : prototype($$$) {
for my $csid (keys %{ $rule->{resources} }) {
next if $csid eq $sid;
- my $nodes = $online_node_usage->get_service_nodes($csid);
+ my $used_nodes =
+ PVE::HA::Tools::get_used_service_nodes($ss->{$csid}, $online_nodes);
+ my $nodes = [values %$used_nodes];
next if !$nodes || !@$nodes; # skip unassigned nodes
diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
index 71eb5d0b..7f718e25 100644
--- a/src/PVE/HA/Tools.pm
+++ b/src/PVE/HA/Tools.pm
@@ -188,6 +188,35 @@ sub count_fenced_services {
return $count;
}
+sub get_used_service_nodes {
+ my ($sd, $online_nodes) = @_;
+
+ my $result = {};
+
+ my ($state, $node, $target) = $sd->@{qw(state node target)};
+
+ return $result if $state eq 'stopped' || $state eq 'request_start';
+
+ if (
+ $state eq 'started'
+ || $state eq 'request_stop'
+ || $state eq 'fence'
+ || $state eq 'freeze'
+ || $state eq 'error'
+ || $state eq 'recovery'
+ || $state eq 'migrate'
+ || $state eq 'relocate'
+ ) {
+ $result->{current} = $node if $online_nodes->{$node};
+ }
+
+ if ($state eq 'migrate' || $state eq 'relocate' || $state eq 'request_start_balance') {
+ $result->{target} = $target if defined($target) && $online_nodes->{$target};
+ }
+
+ return $result;
+}
+
sub get_verbose_service_state {
my ($service_state, $service_conf) = @_;
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 7f4d9ca3..66d9572e 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -27,24 +27,6 @@ sub list_nodes {
die "implement in subclass";
}
-sub get_service_nodes {
- my ($self, $sid) = @_;
-
- die "implement in subclass";
-}
-
-sub set_service_node {
- my ($self, $sid, $nodename) = @_;
-
- die "implement in subclass";
-}
-
-sub add_service_node {
- my ($self, $sid, $nodename) = @_;
-
- die "implement in subclass";
-}
-
sub contains_node {
my ($self, $nodename) = @_;
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index afe3733c..ead08c54 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -11,7 +11,6 @@ sub new {
return bless {
nodes => {},
haenv => $haenv,
- 'service-nodes' => {},
}, $class;
}
@@ -39,24 +38,6 @@ sub contains_node {
return defined($self->{nodes}->{$nodename});
}
-sub get_service_nodes {
- my ($self, $sid) = @_;
-
- return $self->{'service-nodes'}->{$sid};
-}
-
-sub set_service_node {
- my ($self, $sid, $nodename) = @_;
-
- $self->{'service-nodes'}->{$sid} = [$nodename];
-}
-
-sub add_service_node {
- my ($self, $sid, $nodename) = @_;
-
- push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
-}
-
sub add_service_usage_to_node {
my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 6707a54c..061e74a2 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -22,7 +22,6 @@ sub new {
'service-stats' => {},
haenv => $haenv,
scheduler => $scheduler,
- 'service-nodes' => {},
'service-counts' => {}, # Service count on each node. Fallback if scoring calculation fails.
}, $class;
}
@@ -87,24 +86,6 @@ my sub get_service_usage {
return $service_stats;
}
-sub get_service_nodes {
- my ($self, $sid) = @_;
-
- return $self->{'service-nodes'}->{$sid};
-}
-
-sub set_service_node {
- my ($self, $sid, $nodename) = @_;
-
- $self->{'service-nodes'}->{$sid} = [$nodename];
-}
-
-sub add_service_node {
- my ($self, $sid, $nodename) = @_;
-
- push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
-}
-
sub add_service_usage_to_node {
my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
diff --git a/src/test/test_failover1.pl b/src/test/test_failover1.pl
index 78a001eb..495d4b4b 100755
--- a/src/test/test_failover1.pl
+++ b/src/test/test_failover1.pl
@@ -14,9 +14,10 @@ PVE::HA::Rules::NodeAffinity->register();
PVE::HA::Rules->init(property_isolation => 1);
+my $sid = 'vm:111';
my $rules = PVE::HA::Rules->parse_config("rules.tmp", <<EOD);
node-affinity: prefer_node1
- resources vm:111
+ resources $sid
nodes node1
EOD
@@ -31,10 +32,12 @@ my $service_conf = {
failback => 1,
};
-my $sd = {
- node => $service_conf->{node},
- failed_nodes => undef,
- maintenance_node => undef,
+my $ss = {
+ "$sid" => {
+ node => $service_conf->{node},
+ failed_nodes => undef,
+ maintenance_node => undef,
+ },
};
sub test {
@@ -43,14 +46,14 @@ sub test {
my $select_node_preference = $try_next ? 'try-next' : 'none';
my $node = PVE::HA::Manager::select_service_node(
- $rules, $online_node_usage, "vm:111", $service_conf, $sd, $select_node_preference,
+ $rules, $online_node_usage, "$sid", $service_conf, $ss, $select_node_preference,
);
my (undef, undef, $line) = caller();
die "unexpected result: $node != ${expected_node} at line $line\n"
if $node ne $expected_node;
- $sd->{node} = $node;
+ $ss->{$sid}->{node} = $node;
}
test('node1');
--
2.47.3