From: "Daniel Kral"
To: "Proxmox VE development discussion"
Cc: "pve-devel"
Date: Wed, 27 Aug 2025 16:55:07 +0200
Subject: Re: [pve-devel] [PATCH ha-manager 02/18] manager: retranslate rules if nodes are added or removed
References: <20250821143705.256562-1-d.kral@proxmox.com> <20250821143705.256562-3-d.kral@proxmox.com>
In-Reply-To: <20250821143705.256562-3-d.kral@proxmox.com>

On Thu Aug 21, 2025 at 4:35 PM CEST, Daniel Kral wrote:
> Some rule checks depend on the list of cluster nodes, e.g., to check
> whether a negative resource affinity rule doesn't specify more HA
> resources than cluster nodes.
>
> The HA Manager retranslate rules only in certain conditions to reduce
> unnecessary computations, but lacks a check whether cluster nodes have
> been added or removed, which is different from what users are reported
> through the rules API endpoints and web interface.
>
> Fixes: 6c4c0458 ("rules: add haenv node list to the rules' canonicalization stage")
> Signed-off-by: Daniel Kral
> ---
>  src/PVE/HA/Manager.pm    |  2 ++
>  src/PVE/HA/NodeStatus.pm | 14 ++++++++++++++
>  2 files changed, 16 insertions(+)

As @Michael and I briefly talked about this off-list, the nodelist
shouldn't change much in production (i.e. the PVE2 environment), but
this check makes the HA rules retranslation more correct, as the checks
depend on $nodes.

AFAICT the main reason the nodelist changes in production is that a
node joins or leaves, where PVE::HA::Env::PVE2::get_node_info($self)
gets the nodelist from PVE::Cluster::get_members(). Even though the
pve-ha-crm systemd unit has an ordering dependency on pve-cluster,
pvedaemon, ..., which are restarted on node join, these are only
restarted on the newly added node AFAICS when calling
PVE::Cluster::Setup::join, so the HA Manager isn't updated with the new
nodelist in that case. Therefore, the added condition in this patch is
required.

I noticed this when implementing the plugin_compile for the node
affinity rules, which heavily depend on $nodes. Without this additional
condition, at least 'test-crs-static2' will fail, as it doesn't power on
all nodes at once, but only powers on node4 some time later. The
nodelist won't be updated and therefore the HA node affinity rule in
that test case won't get retranslated either, which will change the
behavior of the node (re)assignment.

The check helper could definitely have been implemented more nicely,
but I didn't want to overcomplicate things here.
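
To illustrate the idea only (this is not the code from the patch, and it
is written in Python rather than Perl just to keep it short), here is a
minimal sketch of such a check. All names (NodeListTracker, has_changed,
should_retranslate) are hypothetical, and comparing a rules digest as
the other retranslation trigger is an assumption on my side:

```python
class NodeListTracker:
    """Hypothetical helper: remembers the node names seen on the last check."""

    def __init__(self):
        self._last_nodes = None  # no node list seen yet

    def has_changed(self, nodes):
        """Return True if nodes were added or removed since the last call."""
        current = frozenset(nodes)  # order of the node list is irrelevant
        changed = current != self._last_nodes
        self._last_nodes = current
        return changed


def should_retranslate(rules_digest, last_rules_digest, tracker, nodes):
    """Decide whether the rules need to be translated again.

    Assumption: a changed rules digest stands in for the manager's
    existing conditions; the node list check is the added condition.
    """
    # Poll the tracker unconditionally so its snapshot stays current even
    # when the digest comparison alone already forces a retranslation.
    nodes_changed = tracker.has_changed(nodes)
    return rules_digest != last_rules_digest or nodes_changed
```

With that sketch, a scenario like 'test-crs-static2' would trigger a
retranslation once node4 shows up in the nodelist, even though the rules
themselves did not change, and would not trigger one on the following
iterations while the nodelist stays the same.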