From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH ha-manager v3 30/40] usage: use add_service to add service usage to nodes
Date: Mon, 30 Mar 2026 16:30:39 +0200
Message-ID: <20260330144101.668747-31-d.kral@proxmox.com>
In-Reply-To: <20260330144101.668747-1-d.kral@proxmox.com>
References: <20260330144101.668747-1-d.kral@proxmox.com>
List-Id: Proxmox VE development discussion

The pve_static (and upcoming pve_dynamic) bindings expose the new add_service(...)
method, which allows adding a resource in a single call together with the
additional running flag.

The running flag is needed to distinguish starting HA resources from
started ones, which is needed to correctly account for HA resources in
the dynamic load usage implementation in the next patch. This is
because, for the dynamic load usage, any HA resource which is scheduled
to start by the HA Manager in the same round will not be accounted for
in the next call to score_nodes_to_start_resource(...). This is not a
problem for the static load usage, because there the current node usages
are derived from the started resources on every call anyway.

Passing only the HA resources' 'state' property is not enough, since the
HA Manager will move any HA resource from the 'request_start' state (or
through other transient states such as 'request_start_balance' and a
successful 'migrate'/'relocate') into the 'started' state. This
'started' state is then picked up by the HA resource's LRM, which will
actually start the HA resource and, if successful, respond with a
'SUCCESS' LRM result. Only then will the HA Manager acknowledge this by
adding the running flag to the HA resource's state.
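To illustrate the rule above, the intended semantics of the running flag
can be sketched as follows (a minimal standalone sketch with a
hypothetical helper name, assuming a service state hash $sd shaped like
the manager status; not the actual implementation):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch: derive the running flag from a service's state. A service only
# carries the 'running' flag while in the 'started' state; for any other
# state, a defined current node implies the service still occupies (and
# is assumed to run on) that node.
sub derive_running_flag {
    my ($sd, $current_node) = @_;

    return ($sd->{state} eq 'started' && $sd->{running})
        || ($sd->{state} ne 'started' && defined($current_node));
}

# A freshly requested start: not running yet, no current usage.
print derive_running_flag({ state => 'request_start' }, undef) ? 1 : 0, "\n"; # 0

# Started and confirmed running by the LRM.
print derive_running_flag({ state => 'started', running => 1 }, 'node1') ? 1 : 0, "\n"; # 1

# Migrating away: not in 'started' state, but still occupying its current node.
print derive_running_flag({ state => 'migrate' }, 'node1') ? 1 : 0, "\n"; # 1
```

This mirrors why the 'state' property alone is not enough: a service in
'started' state may not actually be running yet until the LRM confirms
it.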
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
changes v2 -> v3:
- fix setting the $running flag only if state is 'started' and running is
  set, or for any non-started state where current_node is set
- additionally handle the case where only $target_node is set in
  add_service(), which can only happen in specific cases; I have some
  patches which inline this behavior in get_used_service_nodes() (should
  be named something else later) to make this behavior more concise, but
  that should be handled separately
- change the $service property names to kebab-case

 src/PVE/HA/Usage.pm        | 13 ++++++++-----
 src/PVE/HA/Usage/Basic.pm  |  9 ++++++++-
 src/PVE/HA/Usage/Static.pm | 30 ++++++++++++++++++++++++------
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index be3e64d6..43feb041 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -33,9 +33,8 @@ sub contains_node {
     die "implement in subclass";
 }
 
-# Logs a warning to $haenv upon failure, but does not die.
-sub add_service_usage_to_node {
-    my ($self, $nodename, $sid) = @_;
+sub add_service {
+    my ($self, $sid, $current_node, $target_node, $running) = @_;
 
     die "implement in subclass";
 }
@@ -47,8 +46,12 @@ sub add_service_usage {
     my $online_nodes = { map { $_ => 1 } $self->list_nodes() };
     my ($current_node, $target_node) = get_used_service_nodes($online_nodes, $sd);
 
-    $self->add_service_usage_to_node($current_node, $sid) if $current_node;
-    $self->add_service_usage_to_node($target_node, $sid) if $target_node;
+    # some usage implementations need to discern whether a service is truly running;
+    # a service only has the 'running' flag in the 'started' state
+    my $running = ($sd->{state} eq 'started' && $sd->{running})
+        || ($sd->{state} ne 'started' && defined($current_node));
+
+    $self->add_service($sid, $current_node, $target_node, $running);
 }
 
 sub remove_service_usage {
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index 2584727b..5aa3ac05 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -38,7 +38,7 @@ sub contains_node {
     return defined($self->{nodes}->{$nodename});
 }
 
-sub add_service_usage_to_node {
+my sub add_service_usage_to_node {
     my ($self, $nodename, $sid) = @_;
 
     if ($self->contains_node($nodename)) {
@@ -51,6 +51,13 @@ sub add_service_usage_to_node {
     }
 }
 
+sub add_service {
+    my ($self, $sid, $current_node, $target_node, $running) = @_;
+
+    add_service_usage_to_node($self, $current_node, $sid) if defined($current_node);
+    add_service_usage_to_node($self, $target_node, $sid) if defined($target_node);
+}
+
 sub remove_service_usage {
     my ($self, $sid) = @_;
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index b60f5000..8c7a614b 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -71,17 +71,35 @@ my sub get_service_usage {
     return $service_stats;
 }
 
-sub add_service_usage_to_node {
-    my ($self, $nodename, $sid) = @_;
+sub add_service {
+    my ($self, $sid, $current_node, $target_node, $running) = @_;
 
-    $self->{'node-services'}->{$nodename}->{$sid} = 1;
+    # do not add services which do not put any usage on the nodes
+    return if !defined($current_node) && !defined($target_node);
+
+    # PVE::RS::ResourceScheduling::Static::add_service() expects $current_node
+    # to be set, so consider $target_node as $current_node for unset $current_node;
+    #
+    # currently, this happens for the request_start_balance service state and if
+    # node maintenance causes services to migrate to other nodes
+    if (!defined($current_node)) {
+        $current_node = $target_node;
+        undef $target_node;
+    }
 
     eval {
         my $service_usage = get_service_usage($self, $sid);
-        $self->{scheduler}->add_service_usage_to_node($nodename, $sid, $service_usage);
+
+        my $service = {
+            stats => $service_usage,
+            running => $running,
+            'current-node' => $current_node,
+            'target-node' => $target_node,
+        };
+
+        $self->{scheduler}->add_service($sid, $service);
     };
-    $self->{haenv}->log('warning', "unable to add service '$sid' usage to node '$nodename' - $@")
-        if $@;
+    $self->{haenv}->log('warning', "unable to add service '$sid' - $@") if $@;
 }
 
 sub remove_service_usage {
-- 
2.47.3