From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Kral
To: pve-devel@lists.proxmox.com
Date: Mon, 27 Oct 2025 17:43:45 +0100
Message-ID: <20251027164513.542678-11-d.kral@proxmox.com>
In-Reply-To: <20251027164513.542678-1-d.kral@proxmox.com>
References: <20251027164513.542678-1-d.kral@proxmox.com>
Subject: [pve-devel] [PATCH ha-manager v3 7/8] manager: make online node usage computation granular
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

The HA Manager builds $online_node_usage in every FSM iteration in
manage(...) and at every HA resource state change in
change_service_state(...). This becomes quite costly with a high HA
resource count and a lot of state changes happening at once, e.g.
starting up multiple nodes with rebalance_on_request_start set or a
failover of a node with many configured HA resources.

To improve this situation, make the changes to the $online_node_usage
more granular by building $online_node_usage only once per call to
manage(...) and changing the nodes a HA resource uses individually on
every HA resource state transition.

This allows the HA Manager to handle many more HA resources with the
static load scheduler.

Signed-off-by: Daniel Kral
Reviewed-by: Fiona Ebner
---
 src/PVE/HA/Manager.pm | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index bf6895ad..3bd6e1a6 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -238,8 +238,6 @@ my $valid_service_states = {
     error => 1,
 };
 
-# FIXME with 'static' mode and thousands of services, the overhead can be noticable and the fact
-# that this function is called for each state change and upon recovery doesn't help.
 sub recompute_online_node_usage {
     my ($self) = @_;
 
@@ -317,7 +315,9 @@ my $change_service_state = sub {
         $sd->{$k} = $v;
     }
 
-    $self->recompute_online_node_usage();
+    $self->{online_node_usage}->remove_service_usage($sid);
+    $self->{online_node_usage}
+        ->add_service_usage($sid, $sd->{state}, $sd->{node}, $sd->{target});
 
     $sd->{uid} = compute_new_uuid($new_state);
 
@@ -709,6 +709,8 @@ sub manage {
         delete $ss->{$sid};
     }
 
+    $self->recompute_online_node_usage();
+
     my $new_rules = $haenv->read_rules_config();
 
     # TODO PVE 10: Remove group migration when HA groups have been fully migrated to rules
@@ -738,8 +740,6 @@ sub manage {
     for (;;) {
         my $repeat = 0;
 
-        $self->recompute_online_node_usage();
-
         foreach my $sid (sort keys %$ss) {
             my $sd = $ss->{$sid};
             my $cd = $sc->{$sid} || { state => 'disabled' };
-- 
2.47.3

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
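
For illustration only, the idea from the commit message above (build the
usage once per manage(...) call, then adjust only the affected HA resource
on each state transition) can be sketched with a toy usage object. The
method names remove_service_usage and add_service_usage and their argument
order are taken from the patch; the ToyUsage package, its internals, the
recompute helper, and the rule that a migrating or relocating resource
counts on both its node and its target are assumptions made for this sketch
and are not the actual PVE::HA usage implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object kept in $self->{online_node_usage}.
# It only counts services per node; real schedulers track more than that.
package ToyUsage;

sub new {
    my ($class, @nodes) = @_;
    my $self = {
        nodes => { map { $_ => 0 } @nodes },    # node name => service count
        service_nodes => {},                    # sid => [ nodes it is counted on ]
    };
    return bless $self, $class;
}

sub add_service_usage {
    my ($self, $sid, $state, $node, $target) = @_;

    # Assumption: a service that is moving is counted on source and target.
    my @used = ($node);
    push @used, $target
        if defined($target) && ($state eq 'migrate' || $state eq 'relocate');

    # Only count nodes known to this usage object (i.e. the online ones).
    @used = grep { exists $self->{nodes}->{$_} } @used;

    $self->{nodes}->{$_}++ for @used;
    $self->{service_nodes}->{$sid} = \@used;
}

sub remove_service_usage {
    my ($self, $sid) = @_;

    # Forget the service and undo exactly what add_service_usage counted.
    my $used = delete($self->{service_nodes}->{$sid}) // [];
    $self->{nodes}->{$_}-- for @$used;
}

package main;

# Full rebuild over all services, the cost the patch avoids per state change.
sub recompute {
    my ($nodes, $ss) = @_;
    my $usage = ToyUsage->new(@$nodes);
    for my $sid (sort keys %$ss) {
        my $sd = $ss->{$sid};
        $usage->add_service_usage($sid, $sd->{state}, $sd->{node}, $sd->{target});
    }
    return $usage;
}

my $nodes = ['node1', 'node2'];
my $ss = {
    'vm:100' => { state => 'started', node => 'node1' },
    'vm:101' => { state => 'started', node => 'node1' },
};

my $usage = recompute($nodes, $ss);    # done once, as at the top of manage(...)

# A single service changes state: update only that service's contribution.
$ss->{'vm:100'} = { state => 'migrate', node => 'node1', target => 'node2' };
$usage->remove_service_usage('vm:100');
$usage->add_service_usage('vm:100', 'migrate', 'node1', 'node2');

# Cross-check against a full rebuild of the same service state.
my $full = recompute($nodes, $ss);
print "incremental: node1=$usage->{nodes}->{node1} node2=$usage->{nodes}->{node2}\n";
print "recomputed:  node1=$full->{nodes}->{node1} node2=$full->{nodes}->{node2}\n";
```

In this sketch the incremental update touches only the entries of the one
resource that changed, while the full rebuild walks every resource, which is
where the saving with many HA resources and many simultaneous state changes
would come from.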