From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH ha-manager v2 7/8] manager: make online node usage computation granular
Date: Mon, 20 Oct 2025 18:45:37 +0200
Message-ID: <20251020164540.517231-12-d.kral@proxmox.com>
In-Reply-To: <20251020164540.517231-1-d.kral@proxmox.com>

The HA Manager rebuilds $online_node_usage in every FSM iteration in
manage(...) and on every HA resource state change in
change_service_state(...). This becomes quite costly with a high HA
resource count and many state changes happening at once, e.g. when
starting up multiple nodes with rebalance_on_request_start set or on a
failover of a node with many configured HA resources.

To improve this situation, make the changes to $online_node_usage more
granular: build $online_node_usage only once per call to manage(...)
and update only the nodes that an HA resource uses on each of its state
transitions. This allows the HA Manager to handle many more HA
resources with the static load scheduler.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
changes since v1:
- remove FIXME
- remove argument about cache from patch message
- use add_service_usage(...) helper from $online_node_usage now
- did not add R-b from Fiona as add_service_usage(...) was moved
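
For illustration only, the idea behind the patch can be sketched with a
toy tracker: rebuild the usage once, then touch only the changed
resource on a state transition. ToyUsage, rebuild() and count() are
hypothetical names, and the rule "a resource counts on its node and, if
set, its target" is a simplifying assumption. Only the call shapes of
remove_service_usage($sid) and add_service_usage($sid, $state, $node,
$target) are taken from the diff below; the real Usage implementations
are not shown here and may behave differently.

```perl
use strict;
use warnings;

# Hypothetical toy tracker, NOT the real PVE::HA::Usage implementation.
package ToyUsage;

sub new {
    my ($class) = @_;
    return bless { nodes => {}, service_nodes => {} }, $class;
}

# Assumption: a resource counts on its current node and, if set, its target.
sub add_service_usage {
    my ($self, $sid, $state, $node, $target) = @_;
    my @used = grep { defined } ($node, $target);
    $self->{nodes}->{$_}++ for @used;
    # Remember which nodes were charged so the entry can be undone later.
    $self->{service_nodes}->{$sid} = \@used;
}

sub remove_service_usage {
    my ($self, $sid) = @_;
    my $used = delete($self->{service_nodes}->{$sid}) // [];
    $self->{nodes}->{$_}-- for @$used;
}

sub count {
    my ($self, $node) = @_;
    return $self->{nodes}->{$node} // 0;
}

package main;

# Full rebuild: walks all resources, so do it once per manage() call.
sub rebuild {
    my ($ss) = @_;
    my $usage = ToyUsage->new();
    for my $sid (sort keys %$ss) {
        my $sd = $ss->{$sid};
        $usage->add_service_usage($sid, $sd->{state}, $sd->{node}, $sd->{target});
    }
    return $usage;
}

my $ss = {
    'vm:100' => { state => 'started', node => 'node1' },
    'vm:101' => { state => 'started', node => 'node1' },
    'vm:102' => { state => 'started', node => 'node2' },
};
my $usage = rebuild($ss);

# State transition of a single resource: update only its own entries.
$ss->{'vm:100'} = { state => 'migrate', node => 'node1', target => 'node2' };
$usage->remove_service_usage('vm:100');
$usage->add_service_usage('vm:100', 'migrate', 'node1', 'node2');

printf "node1=%d node2=%d\n", $usage->count('node1'), $usage->count('node2');
```

In this sketch the incremental update leaves the tracker in the same
state a full rebuild would produce, while its cost per state change no
longer depends on the total number of resources.
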
src/PVE/HA/Manager.pm | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index bf6895ad..3bd6e1a6 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -238,8 +238,6 @@ my $valid_service_states = {
error => 1,
};
-# FIXME with 'static' mode and thousands of services, the overhead can be noticable and the fact
-# that this function is called for each state change and upon recovery doesn't help.
sub recompute_online_node_usage {
my ($self) = @_;
@@ -317,7 +315,9 @@ my $change_service_state = sub {
$sd->{$k} = $v;
}
- $self->recompute_online_node_usage();
+ $self->{online_node_usage}->remove_service_usage($sid);
+ $self->{online_node_usage}
+ ->add_service_usage($sid, $sd->{state}, $sd->{node}, $sd->{target});
$sd->{uid} = compute_new_uuid($new_state);
@@ -709,6 +709,8 @@ sub manage {
delete $ss->{$sid};
}
+ $self->recompute_online_node_usage();
+
my $new_rules = $haenv->read_rules_config();
# TODO PVE 10: Remove group migration when HA groups have been fully migrated to rules
@@ -738,8 +740,6 @@ sub manage {
for (;;) {
my $repeat = 0;
- $self->recompute_online_node_usage();
-
foreach my $sid (sort keys %$ss) {
my $sd = $ss->{$sid};
my $cd = $sc->{$sid} || { state => 'disabled' };
--
2.47.3
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Thread overview: 20+ messages
2025-10-20 16:45 [pve-devel] [PATCH ha-manager/perl-rs/proxmox/qemu-server v2 00/12] Granular online_node_usage accounting Daniel Kral
2025-10-20 16:45 ` [pve-devel] [PATCH qemu-server v2 1/1] config: only fetch necessary default values in get_derived_property helper Daniel Kral
2025-10-21 11:47 ` [pve-devel] applied: " Fiona Ebner
2025-10-20 16:45 ` [pve-devel] [PATCH proxmox v2 1/1] resource-scheduling: change score_nodes_to_start_service signature Daniel Kral
2025-10-21 12:14 ` Fiona Ebner
2025-10-20 16:45 ` [pve-devel] [PATCH perl-rs v2 1/2] pve-rs: resource_scheduling: allow granular usage changes Daniel Kral
2025-10-20 16:45 ` [pve-devel] [PATCH perl-rs v2 2/2] test: resource_scheduling: use score_nodes helper to imitate HA Manager Daniel Kral
2025-10-21 12:14 ` Fiona Ebner
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 1/8] manager: remove redundant recompute_online_node_usage from next_state_recovery Daniel Kral
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 2/8] manager: remove redundant add_service_usage_to_node " Daniel Kral
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 3/8] manager: remove redundant add_service_usage_to_node from next_state_started Daniel Kral
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 4/8] rules: resource affinity: decouple get_resource_affinity helper from Usage class Daniel Kral
2025-10-21 13:02 ` Fiona Ebner
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 5/8] manager: make recompute_online_node_usage use add_service_usage helper Daniel Kral
2025-10-21 13:06 ` Fiona Ebner
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 6/8] usage: allow granular changes to Usage implementations Daniel Kral
2025-10-20 16:45 ` Daniel Kral [this message]
2025-10-21 13:09 ` [pve-devel] [PATCH ha-manager v2 7/8] manager: make online node usage computation granular Fiona Ebner
2025-10-20 16:45 ` [pve-devel] [PATCH ha-manager v2 8/8] implement static service stats cache Daniel Kral
2025-10-21 13:23 ` Fiona Ebner