From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH ha-manager v3 30/40] usage: use add_service to add service usage to nodes
Date: Mon, 30 Mar 2026 16:30:39 +0200
Message-ID: <20260330144101.668747-31-d.kral@proxmox.com>
In-Reply-To: <20260330144101.668747-1-d.kral@proxmox.com>
The pve_static (and upcoming pve_dynamic) bindings expose the new
add_resource(...) method, which allows adding resources in a single call
with the additional running flag.
The running flag is needed to discriminate starting and started HA
resources from each other, which is needed to correctly account for HA
resources for the dynamic load usage implementation in the next patch.
This is because for the dynamic load usage, any HA resource, which is
scheduled to start by the HA Manager in the same round, will not be
accounted for in the next call to score_nodes_to_start_resource(...).
This is not a problem for the static load usage, because there the
current node usages are derived from the started resources on every
call already.
Passing only the HA resources' 'state' property is not enough, since the
HA Manager will move any HA resource from the 'request_start' state (or
through other transient states such as 'request_start_balance' and a
successful 'migrate'/'relocate') into the 'started' state.
This 'started' state is then picked up by the HA resource's LRM, which
will actually start the HA resource and if successful respond with a
'SUCCESS' LRM result. Only then will the HA Manager acknowledge this by
adding the running flag to the HA resource's state.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
changes v2 -> v3:
- fix setting $running flag only if state 'started' and running is set
or for any non-started state where current_node is set
- Additionally handle the case where only $target_node is set in
add_service(), which can only happen in specific cases; I have some
patches which inline this behavior in get_used_service_nodes() (should
be named something else later) to make this behavior more concise, but
that should be handled separately
- change the $service property names to kebab-case
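
For illustration only (not part of the patch): a minimal standalone sketch
of how the $running flag is derived in add_service_usage() in the diff
below. The helper name derive_running() and the node name 'node1' are
made up for this example; only the boolean expression is taken from the
patch.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the expression from add_service_usage().
# $sd is a service data hash with a 'state' and an optional 'running' key.
sub derive_running {
    my ($sd, $current_node) = @_;

    # 'started' services count as running only once the running flag is set;
    # any other state counts as running if the service still uses a current node
    my $running = ($sd->{state} eq 'started' && $sd->{running})
        || ($sd->{state} ne 'started' && defined($current_node));

    return $running ? 1 : 0;
}

# 'started' but not yet acknowledged as running by the HA Manager
printf "%d\n", derive_running({ state => 'started' }, 'node1'); # prints 0
# 'started' and acknowledged as running
printf "%d\n", derive_running({ state => 'started', running => 1 }, 'node1'); # prints 1
# non-started state with a current node
printf "%d\n", derive_running({ state => 'migrate' }, 'node1'); # prints 1
# non-started state without a current node
printf "%d\n", derive_running({ state => 'request_start_balance' }, undef); # prints 0
```

The first case is the one the commit message describes: the HA resource is
already in the 'started' state, but its LRM has not reported success yet.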
src/PVE/HA/Usage.pm | 13 ++++++++-----
src/PVE/HA/Usage/Basic.pm | 9 ++++++++-
src/PVE/HA/Usage/Static.pm | 30 ++++++++++++++++++++++++------
3 files changed, 40 insertions(+), 12 deletions(-)
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index be3e64d6..43feb041 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -33,9 +33,8 @@ sub contains_node {
die "implement in subclass";
}
-# Logs a warning to $haenv upon failure, but does not die.
-sub add_service_usage_to_node {
- my ($self, $nodename, $sid) = @_;
+sub add_service {
+ my ($self, $sid, $current_node, $target_node, $running) = @_;
die "implement in subclass";
}
@@ -47,8 +46,12 @@ sub add_service_usage {
my $online_nodes = { map { $_ => 1 } $self->list_nodes() };
my ($current_node, $target_node) = get_used_service_nodes($online_nodes, $sd);
- $self->add_service_usage_to_node($current_node, $sid) if $current_node;
- $self->add_service_usage_to_node($target_node, $sid) if $target_node;
+ # some usage implementations need to discern whether a service is truly running;
+ # a service only has the 'running' flag in the 'started' state
+ my $running = ($sd->{state} eq 'started' && $sd->{running})
+ || ($sd->{state} ne 'started' && defined($current_node));
+
+ $self->add_service($sid, $current_node, $target_node, $running);
}
sub remove_service_usage {
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index 2584727b..5aa3ac05 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -38,7 +38,7 @@ sub contains_node {
return defined($self->{nodes}->{$nodename});
}
-sub add_service_usage_to_node {
+my sub add_service_usage_to_node {
my ($self, $nodename, $sid) = @_;
if ($self->contains_node($nodename)) {
@@ -51,6 +51,13 @@ sub add_service_usage_to_node {
}
}
+sub add_service {
+ my ($self, $sid, $current_node, $target_node, $running) = @_;
+
+ add_service_usage_to_node($self, $current_node, $sid) if defined($current_node);
+ add_service_usage_to_node($self, $target_node, $sid) if defined($target_node);
+}
+
sub remove_service_usage {
my ($self, $sid) = @_;
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index b60f5000..8c7a614b 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -71,17 +71,35 @@ my sub get_service_usage {
return $service_stats;
}
-sub add_service_usage_to_node {
- my ($self, $nodename, $sid) = @_;
+sub add_service {
+ my ($self, $sid, $current_node, $target_node, $running) = @_;
- $self->{'node-services'}->{$nodename}->{$sid} = 1;
+ # do not add services which do not put any usage on the nodes
+ return if !defined($current_node) && !defined($target_node);
+
+ # PVE::RS::ResourceScheduling::Static::add_service() expects $current_node
+ # to be set, so consider $target_node as $current_node for unset $current_node;
+ #
+ # currently, this happens for the request_start_balance service state and if
+ # node maintenance causes services to migrate to other nodes
+ if (!defined($current_node)) {
+ $current_node = $target_node;
+ undef $target_node;
+ }
eval {
my $service_usage = get_service_usage($self, $sid);
- $self->{scheduler}->add_service_usage_to_node($nodename, $sid, $service_usage);
+
+ my $service = {
+ stats => $service_usage,
+ running => $running,
+ 'current-node' => $current_node,
+ 'target-node' => $target_node,
+ };
+
+ $self->{scheduler}->add_service($sid, $service);
};
- $self->{haenv}->log('warning', "unable to add service '$sid' usage to node '$nodename' - $@")
- if $@;
+ $self->{haenv}->log('warning', "unable to add service '$sid' - $@") if $@;
}
sub remove_service_usage {
--
2.47.3
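
For illustration only (not part of the patch): a minimal standalone sketch
of the node handling at the top of add_service() in Usage/Static.pm above.
The helper name normalize_nodes() and the node names are assumptions made
for this example; the patch performs these steps inline.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the node handling in Static.pm's add_service().
# Returns an empty list if the service uses no node at all, otherwise the
# ($current_node, $target_node) pair as it would be handed to the scheduler.
sub normalize_nodes {
    my ($current_node, $target_node) = @_;

    # services without any node do not put usage on the cluster
    return if !defined($current_node) && !defined($target_node);

    # a lone target node is treated as the current node
    if (!defined($current_node)) {
        $current_node = $target_node;
        undef $target_node;
    }

    return ($current_node, $target_node);
}

# only a target node is set, e.g. in the request_start_balance state
my ($current, $target) = normalize_nodes(undef, 'node2');
printf "current=%s target=%s\n", $current, $target // 'undef';
# prints "current=node2 target=undef"

# both nodes set, e.g. during a migration
($current, $target) = normalize_nodes('node1', 'node2');
printf "current=%s target=%s\n", $current, $target // 'undef';
# prints "current=node1 target=node2"
```

Folding the lone target node into the current node keeps the hash passed
to the scheduler in the shape the comment in the patch says it expects,
with 'current-node' always set.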