Date: Tue, 24 Mar 2026 09:51:47 +0100
From: "Daniel Kral"
To: "Daniel Kral", "Thomas Lamprecht",
Subject: Re: [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer
References: <20260217141437.584852-1-d.kral@proxmox.com>
 <3fcd4459-e5ff-48ca-8b70-53411a666247@proxmox.com>
List-Id: Proxmox VE development discussion

On Thu Mar 19, 2026 at 10:12 AM CET, Daniel Kral wrote:
> On Wed Mar 18, 2026 at 5:54 PM CET, Thomas Lamprecht wrote:
>> ScoredMigration's Ord only compares imbalance, so two migrations with
>> the same imbalance but different source/target count as Equal, which
>> makes the BinaryHeap output order unpredictable. Maybe use the
>> Migration field, which is already Ord itself, to break any ties here
>> as a secondary key.
>
> Thanks! I forgot to add this as a FIXME there for the RFC series.
> I had a
>
>     impl Ord for ScoredMigration {
>         fn cmp(&self, other: &Self) -> Ordering {
>             self.imbalance
>                 .total_cmp(&other.imbalance)
>                 .reverse()
>                 .then(self.migration.cmp(&other.migration))
>         }
>     }
>
> before, but while testing it seemed to not sort as expected. I haven't
> looked into this yet, though I guess that different calculations might
> end up with different exponents, which totalOrder does define as
> unequal [1].
>
> I'll briefly test this again, but sorting here in some reasonable way
> is still better than letting the order of the input data decide.
>
> [1] https://en.wikipedia.org/wiki/IEEE_754#Total-ordering_predicate

Indeed, there was a subtle issue: the f64 imbalance values were not
actually the same, even though they "seemed" to be.
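For reference, a minimal, compilable sketch of such a tie-breaking Ord.
The struct fields and the mk() helper below are simplified assumptions
for illustration, not the actual code from the series:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Simplified stand-ins for the real types; the field layout is assumed.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Migration {
    sid: String,
    source_node: String,
    target_node: String,
}

#[derive(Debug, Clone)]
struct ScoredMigration {
    migration: Migration,
    imbalance: f64,
}

impl PartialEq for ScoredMigration {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for ScoredMigration {}

impl PartialOrd for ScoredMigration {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for ScoredMigration {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp is a total order on f64 (never panics on NaN); on an
        // exact tie, Migration's derived lexicographic Ord decides.
        self.imbalance
            .total_cmp(&other.imbalance)
            .reverse()
            .then_with(|| self.migration.cmp(&other.migration))
    }
}

// Hypothetical helper to build test values.
fn mk(target: &str, imbalance: f64) -> ScoredMigration {
    ScoredMigration {
        migration: Migration {
            sid: "vm:102".into(),
            source_node: "node1".into(),
            target_node: target.into(),
        },
        imbalance,
    }
}

fn main() {
    let mut heap = BinaryHeap::new();
    heap.push(mk("node3", 0.5));
    heap.push(mk("node2", 0.5));
    // Equal imbalance: the target node now breaks the tie, so the pop
    // order no longer depends on insertion order.
    assert_eq!(heap.pop().unwrap().migration.target_node, "node3");
    assert_eq!(heap.pop().unwrap().migration.target_node, "node2");
}
```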
I ran into the problem with a pve-ha-manager test case yesterday: the
test case was flaky and _sometimes_ reordered the seemingly identical
imbalance fp numbers wrong:

    { sid: "vm:102", source_node: "node1", target_node: "node3", imbalance: 0.723174948891693 }
    { sid: "vm:102", source_node: "node1", target_node: "node2", imbalance: 0.723174948891693 }

Perl seems to truncate the 16th decimal place here; with some closer
inspection and tracing, the actual values were:

    ScoredMigration { sid: "vm:102", source_node: "node1", target_node: "node3", imbalance: 0.723174948891693 }
    ScoredMigration { sid: "vm:102", source_node: "node1", target_node: "node2", imbalance: 0.7231749488916931 }

I make sure that any interaction with `Usage` in the HA Manager is
deterministic in the sense that we go through the services in the same
order every time, e.g., by sorting the keys. But hashbrown's HashMap
iterator is not deterministic, as the struct itself takes a RandomState,
so we can't rely on the order of the fp operations here.

I tried using BTreeMap and BTreeSet to make the iteration order
deterministic, but this increased the runtime by roughly 10x, which was
unacceptable for larger cluster sizes. It also doesn't fix the problem
directly, because these approximation errors caused by changing the
order of fp operations could even happen with deterministic
calculations.

For the v2, I'll take the easy route and just truncate these numbers
before ordering them, as the 16th decimal place is not a significant
digit anymore and shouldn't be relied on for ordering.
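That truncation could be sketched roughly as follows; the cutoff of 12
decimal places below is only an assumed value for illustration, the
actual v2 may pick a different (still insignificant) cutoff:

```rust
use std::cmp::Ordering;

// Sketch: drop insignificant trailing decimals before ordering, so that
// values differing only by fp rounding noise compare Equal and fall
// through to the deterministic Migration tie-break.
// The cutoff (12 decimal places) is an assumption for illustration.
fn truncate_imbalance(x: f64) -> f64 {
    const SCALE: f64 = 1e12;
    (x * SCALE).trunc() / SCALE
}

fn main() {
    // The two values from the trace above differ only around the 16th
    // decimal place and are distinct f64 bit patterns...
    let a = 0.723174948891693_f64;
    let b = 0.7231749488916931_f64;
    assert_ne!(a.total_cmp(&b), Ordering::Equal);
    // ...but compare Equal after truncation.
    assert_eq!(
        truncate_imbalance(a).total_cmp(&truncate_imbalance(b)),
        Ordering::Equal
    );
}
```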