Date: Fri, 17 Oct 2025 14:32:53 +0200
From: Fiona Ebner
To: Proxmox VE development discussion, Daniel Kral
Subject: Re: [pve-devel] [PATCH ha-manager 8/9] manager: make online node usage computation granular
In-Reply-To: <20250930142021.366529-12-d.kral@proxmox.com>
References: <20250930142021.366529-1-d.kral@proxmox.com> <20250930142021.366529-12-d.kral@proxmox.com>

On 30.09.25 at 4:20 PM, Daniel Kral wrote:
> The HA Manager builds $online_node_usage in every FSM iteration in
> manage(...) and at every HA resource state change in
> change_service_state(...).
> This becomes quite costly with a high HA
> resource count and a lot of state changes happening at once, e.g.
> starting up multiple nodes with rebalance_on_request_start set or a
> failover of a node with many configured HA resources.
>
> To improve this situation, make the changes to the $online_node_usage
> more granular by building $online_node_usage only once per call to
> manage(...) and changing the nodes a HA resource uses individually on
> every HA resource state transition.
>
> The change in service usage "freshness" should be negligible here as the
> static service usage data is cached anyway (except if the cache fails
> for some reason).

But the cache is refreshed on every recompute_online_node_usage(), which
happened much more frequently before, so the fact that it's cached
doesn't seem like a strong argument here?

I /do/ think there is a real tradeoff being made, namely "the ability to
manage much larger fleets of guests" versus "immediately incorporating
every guest config change in decisions". Config changes that would lead
to wildly different decisions would need to be timed very badly to cause
actual issues and should be rare to begin with. Also, with PSI-based
information, things are less "instant" as well, so I don't see an issue
with moving in the same direction.

>
> Signed-off-by: Daniel Kral

Reviewed-by: Fiona Ebner

> ---
> The add_service_usage(...) helper is added in anticipation for the next
> patch, we don't need a helper if we don't go for #9.

I think it's nice to have regardless. Inlining the function would just
bloat change_service_state(), or what would be the alternative?

> @@ -314,7 +329,8 @@ my $change_service_state = sub {
>          $sd->{$k} = $v;
>      }
>
> -    $self->recompute_online_node_usage();
> +    $self->{online_node_usage}->remove_service_usage($sid);
> +    $self->add_service_usage($sid, $sd);

Nice!
>
>      $sd->{uid} = compute_new_uuid($new_state);
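
For illustration, the granular update discussed above can be sketched
with a standalone toy model. This is not the pve-ha-manager code: the
ToyNodeUsage package, its fields, and the "one unit of usage per
service" weighting are assumptions made up for this example, and the
method signatures differ from the real helpers. It is also an assumption
here that a migrating service is counted on both source and target node.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model only, NOT the pve-ha-manager implementation. Package name,
# fields and the unit weight per service are hypothetical.
package ToyNodeUsage;

sub new {
    my ($class, @nodes) = @_;
    my $self = {
        usage => { map { $_ => 0 } @nodes },    # node => accumulated usage
        service_nodes => {},                    # sid => [nodes counted for sid]
    };
    return bless $self, $class;
}

# Count $sid on each given node and remember where it was counted.
sub add_service_usage {
    my ($self, $sid, @nodes) = @_;
    for my $node (@nodes) {
        next if !exists $self->{usage}->{$node};    # skip unknown/offline nodes
        $self->{usage}->{$node}++;
        push @{ $self->{service_nodes}->{$sid} }, $node;
    }
}

# Undo exactly what add_service_usage() recorded for $sid.
sub remove_service_usage {
    my ($self, $sid) = @_;
    my $nodes = delete($self->{service_nodes}->{$sid}) // [];
    $self->{usage}->{$_}-- for @$nodes;
}

package main;

# Built once, e.g. at the start of a manage() round.
my $usage = ToyNodeUsage->new('node1', 'node2');
$usage->add_service_usage('vm:100', 'node1');
$usage->add_service_usage('vm:101', 'node1');

# State change: vm:100 starts migrating to node2 and is assumed to occupy
# both nodes meanwhile. Only this service's contribution is touched.
$usage->remove_service_usage('vm:100');
$usage->add_service_usage('vm:100', 'node1', 'node2');

printf "%s=%d\n", $_, $usage->{usage}->{$_} for sort keys %{ $usage->{usage} };
```

Each state change then costs work proportional to the nodes one service
touches instead of a pass over all services. The price is the one
discussed above: per-service data is only as fresh as the last full
rebuild.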