From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH ha-manager v4 14/28] usage: use add_service to add service usage to nodes
Date: Thu, 2 Apr 2026 14:44:08 +0200
Message-ID: <20260402124817.416232-15-d.kral@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260402124817.416232-1-d.kral@proxmox.com>
References: <20260402124817.416232-1-d.kral@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

The pve_static (and upcoming pve_dynamic) bindings expose the new add_service(...)
method, which allows adding a resource in a single call together with an
additional running flag. The running flag discriminates starting HA resources
from started ones, which is necessary to correctly account for HA resources in
the dynamic load usage implementation introduced in the next patch. This is
because with the dynamic load usage, any HA resource that the HA Manager
schedules to start in the same round would otherwise not be accounted for in
the next call to score_nodes_to_start_resource(...). This is not a problem for
the static load usage, because there the current node usages are derived from
the started resources on every call anyway.

Passing only the HA resources' 'state' property is not enough, since the HA
Manager moves any HA resource from the 'request_start' state (or through other
transient states such as 'request_start_balance' and a successful
'migrate'/'relocate') into the 'started' state. This 'started' state is then
picked up by the HA resource's LRM, which actually starts the HA resource and,
if successful, responds with a 'SUCCESS' LRM result. Only then will the HA
Manager acknowledge this by adding the running flag to the HA resource's state.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
Reviewed-by: Dominik Rusovac
---
changes v3 -> v4:
- none

 src/PVE/HA/Usage.pm        | 13 ++++++++-----
 src/PVE/HA/Usage/Basic.pm  |  9 ++++++++-
 src/PVE/HA/Usage/Static.pm | 30 ++++++++++++++++++++++++------
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index be3e64d6..43feb041 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -33,9 +33,8 @@ sub contains_node {
     die "implement in subclass";
 }
 
-# Logs a warning to $haenv upon failure, but does not die.
-sub add_service_usage_to_node {
-    my ($self, $nodename, $sid) = @_;
+sub add_service {
+    my ($self, $sid, $current_node, $target_node, $running) = @_;
 
     die "implement in subclass";
 }
@@ -47,8 +46,12 @@ sub add_service_usage {
     my $online_nodes = { map { $_ => 1 } $self->list_nodes() };
     my ($current_node, $target_node) = get_used_service_nodes($online_nodes, $sd);
 
-    $self->add_service_usage_to_node($current_node, $sid) if $current_node;
-    $self->add_service_usage_to_node($target_node, $sid) if $target_node;
+    # some usage implementations need to discern whether a service is truly running;
+    # a service only has the 'running' flag in the 'started' state
+    my $running = ($sd->{state} eq 'started' && $sd->{running})
+        || ($sd->{state} ne 'started' && defined($current_node));
+
+    $self->add_service($sid, $current_node, $target_node, $running);
 }
 
 sub remove_service_usage {
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index 2584727b..5aa3ac05 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -38,7 +38,7 @@ sub contains_node {
     return defined($self->{nodes}->{$nodename});
 }
 
-sub add_service_usage_to_node {
+my sub add_service_usage_to_node {
     my ($self, $nodename, $sid) = @_;
 
     if ($self->contains_node($nodename)) {
@@ -51,6 +51,13 @@ sub add_service_usage_to_node {
     }
 }
 
+sub add_service {
+    my ($self, $sid, $current_node, $target_node, $running) = @_;
+
+    add_service_usage_to_node($self, $current_node, $sid) if defined($current_node);
+    add_service_usage_to_node($self, $target_node, $sid) if defined($target_node);
+}
+
 sub remove_service_usage {
     my ($self, $sid) = @_;
 
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index b60f5000..8c7a614b 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -71,17 +71,35 @@ my sub get_service_usage {
     return $service_stats;
 }
 
-sub add_service_usage_to_node {
-    my ($self, $nodename, $sid) = @_;
+sub add_service {
+    my ($self, $sid, $current_node, $target_node, $running) = @_;
 
-    $self->{'node-services'}->{$nodename}->{$sid} = 1;
+    # do not add services which do not put any usage on the nodes
+    return if !defined($current_node) && !defined($target_node);
+
+    # PVE::RS::ResourceScheduling::Static::add_service() expects $current_node
+    # to be set, so consider $target_node as $current_node for unset $current_node;
+    #
+    # currently, this happens for the request_start_balance service state and if
+    # node maintenance causes services to migrate to other nodes
+    if (!defined($current_node)) {
+        $current_node = $target_node;
+        undef $target_node;
+    }
 
     eval {
         my $service_usage = get_service_usage($self, $sid);
-        $self->{scheduler}->add_service_usage_to_node($nodename, $sid, $service_usage);
+
+        my $service = {
+            stats => $service_usage,
+            running => $running,
+            'current-node' => $current_node,
+            'target-node' => $target_node,
+        };
+
+        $self->{scheduler}->add_service($sid, $service);
     };
-    $self->{haenv}->log('warning', "unable to add service '$sid' usage to node '$nodename' - $@")
-        if $@;
+    $self->{haenv}->log('warning', "unable to add service '$sid' - $@") if $@;
 }
 
 sub remove_service_usage {
-- 
2.47.3
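
As a side note for reviewers, the two small pieces of logic this patch adds, the
$running derivation in add_service_usage() and the $current_node fallback in
Static::add_service(), can be exercised in isolation. The following is a
hypothetical standalone sketch (the helper names is_running and normalize_nodes
are invented here and are not part of the patch):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Re-statement of the $running derivation from add_service_usage():
# a service counts as running if the LRM confirmed it with the running
# flag in 'started' state, or if it still occupies a current node while
# in any other (transient) state.
sub is_running {
    my ($sd, $current_node) = @_;
    return (($sd->{state} eq 'started' && $sd->{running})
        || ($sd->{state} ne 'started' && defined($current_node))) ? 1 : 0;
}

# Re-statement of the node normalization in Static::add_service():
# the scheduler binding expects a current node, so an unset
# $current_node is substituted by $target_node, which is then cleared.
sub normalize_nodes {
    my ($current_node, $target_node) = @_;
    if (!defined($current_node)) {
        $current_node = $target_node;
        undef $target_node;
    }
    return ($current_node, $target_node);
}
```

For example, a service in 'migrate' state that still occupies a node is
considered running even though its 'running' flag is not yet set, while a
freshly scheduled 'started' service without the LRM-confirmed flag is not.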