From: "Daniel Kral"
To: "Fiona Ebner", "Proxmox VE development discussion"
Date: Fri, 17 Oct 2025 18:07:29 +0200
Subject: Re: [pve-devel] [PATCH ha-manager 8/9] manager: make online node usage computation granular

On Fri Oct 17, 2025 at 2:32 PM CEST, Fiona Ebner wrote:
> On 30.09.25 at 4:20 PM, Daniel Kral wrote:
>> The HA Manager builds $online_node_usage in every FSM iteration in
>> manage(...) and at every HA resource state change in
>> change_service_state(...). This becomes quite costly with a high HA
>> resource count and a lot of state changes happening at once, e.g.
>> starting up multiple nodes with rebalance_on_request_start set or a
>> failover of a node with many configured HA resources.
>>
>> To improve this situation, make the changes to the $online_node_usage
>> more granular by building $online_node_usage only once per call to
>> manage(...) and changing the nodes a HA resource uses individually on
>> every HA resource state transition.
>>
>> The change in service usage "freshness" should be negligible here as the
>> static service usage data is cached anyway (except if the cache fails
>> for some reason).
>
> But the cache is refreshed on every recompute_online_node_usage(), which
> happened much more frequently before, so the fact that it's cached
> doesn't seem like a strong argument here?
>
> I /do/ think there is a real tradeoff being made, namely "the ability to
> manage much larger fleets of guests" versus "immediately incorporating
> every guest config change in decisions". Config changes that would lead
> to wildly different decisions would need to be timed very badly to cause
> actual issues and should be rare to begin with. Also, with PSI-based
> information, things are also less "instant", I don't see an issue with
> moving in the same direction.

Right, I'll change that to better reflect the tradeoff here!
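
To illustrate the approach described in the quoted commit message, here is a rough sketch of the difference between rebuilding the usage on every state change and updating it granularly. It is a simplified Python model, not the actual Perl code in ha-manager: the NodeUsage class, its methods (full_rebuild, move_service), and the plain per-node counter are all hypothetical and only assume that usage can be tracked per node.

```python
from collections import defaultdict


class NodeUsage:
    """Toy model of per-node usage counters (hypothetical, not the ha-manager API)."""

    def __init__(self, online_nodes):
        self.online_nodes = set(online_nodes)
        self.usage = defaultdict(int)
        self.rebuilds = 0  # counts how often the expensive path runs

    def full_rebuild(self, services):
        """Recompute usage from scratch; cost grows with the number of services."""
        self.rebuilds += 1
        self.usage = defaultdict(int)
        for node in services.values():
            if node in self.online_nodes:
                self.usage[node] += 1

    def move_service(self, old_node, new_node):
        """Granular update for one state transition; constant cost."""
        if old_node in self.online_nodes:
            self.usage[old_node] -= 1
        if new_node in self.online_nodes:
            self.usage[new_node] += 1


# Six services, all on node1 (service IDs are made up for the example).
services = {f"vm:{100 + i}": "node1" for i in range(6)}
usage = NodeUsage(["node1", "node2", "node3"])
usage.full_rebuild(services)  # assumed to happen once per manage() call

# Move half of the services to node2 without rebuilding each time.
for sid in list(services)[:3]:
    usage.move_service(services[sid], "node2")
    services[sid] = "node2"

incremental = dict(usage.usage)

# A full rebuild afterwards arrives at the same counters.
usage.full_rebuild(services)
assert incremental == dict(usage.usage)
```

In this model the granular path touches only the two affected nodes per transition. What it gives up is exactly the tradeoff discussed above: any data that only a full rebuild would re-read (for example, changed guest configs) is not picked up until the next rebuild.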