From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) by lore.proxmox.com (Postfix) with ESMTPS id EFA0C1FF146 for ; Tue, 28 Apr 2026 11:05:43 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id B1C75BA24; Tue, 28 Apr 2026 11:05:43 +0200 (CEST) Content-Type: text/plain; charset=UTF-8 Date: Tue, 28 Apr 2026 11:05:36 +0200 Message-Id: Subject: Re: [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value to unit interval From: "Daniel Kral" To: "Dominik Rusovac" , Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Mailer: aerc 0.21.0-136-gdb9fe9896a79-dirty References: <20260427132031.220468-1-d.rusovac@proxmox.com> <20260427132031.220468-2-d.rusovac@proxmox.com> In-Reply-To: <20260427132031.220468-2-d.rusovac@proxmox.com> X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1777367040478 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.078 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [scheduler.rs,ipbeja.pt,stackexchange.com] Message-ID-Hash: 5JYFTPDTXSHMLH3GH5A26D6L5H4HCWCG X-Message-ID-Hash: 5JYFTPDTXSHMLH3GH5A26D6L5H4HCWCG X-MailFrom: d.kral@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox VE development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Mon Apr 27, 2026 at 3:20 PM CEST, Dominik Rusovac wrote: > The currently used load imbalance value is given as the so-called > coefficient of variation (CV), a value that may exceed 1. As such, the > CV value alone lacks meaning. A CV value of 0.0 means no imbalance, but > what does a value of, say, 1.7 mean? > > Relative to the number of nodes in a cluster, it is possible to > determine the upper bound of the CV value [0][1]. By dividing the CV > value by its upper bound, the load imbalance can be represented as a > value that varies between 0 and 1. Expressing the CV as a percentage > makes the concept of load imbalance easier to interpret. Nice, thanks for the work! Will test the changes over the week, but just from the better readability / interpretability of the imbalance value this should make it more user-friendly overall. > > [0] https://repositorio.ipbeja.pt/server/api/core/bitstreams/8ed9a444-dbe= 0-402f-9d2f-90c5bf6e418c/content > [1] https://stats.stackexchange.com/questions/18621/maximum-value-of-coef= ficient-of-variation-for-bounded-data-set and a good read overall, thanks! A note above Example 1 and the proposition 13 from the first paper [0] is interesting here: All these properties refer to the case where a single sample is considered, however, as noted by [16], Dodd=E2=80=99s corrected coeffic= ient of variation, CV corr is not suitable for comparative purpose, as can be seen in the next example. Example 1. [...] [...] Proposition 13. Dodd=E2=80=99s corrected coefficient of variation, CV c= orr, is sample-size sensitive. AFAICT this should not be a problem for us to compare these values to make decisions since we only compare the values for the same cluster size configuration and therefore this shouldn't affect us badly. On another note, it should also be easier for users to set a imbalance threshold value, which is (at least roughly) invariant to the size of the cluster. If one or more nodes failed or are in maintenance node, this dramatically changes the decisions that the load balancer can make anyway, but I wonder whether how much difference it makes to the sensitivity to trigger the load balancing system. > > Signed-off-by: Dominik Rusovac > --- > proxmox-resource-scheduling/src/scheduler.rs | 33 +++++++++++++------- > 1 file changed, 21 insertions(+), 12 deletions(-) > > diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resou= rce-scheduling/src/scheduler.rs > index 49d16f9f..4eacbff9 100644 > --- a/proxmox-resource-scheduling/src/scheduler.rs > +++ b/proxmox-resource-scheduling/src/scheduler.rs > @@ -17,17 +17,23 @@ pub struct NodeUsage { > pub stats: NodeStats, > } > =20 > -/// Returns the load imbalance among the nodes. > +/// Returns the load imbalance among the nodes, which is a value between= 0 and 1 that describes the > +/// statistical dispersion of the individual node loads around the mean = node load. The lower the > +/// value, the better. > /// > -/// The load balance is measured as the statistical dispersion of the in= dividual node loads. > -/// > -/// The current implementation uses the dimensionless coefficient of var= iation, which expresses the > -/// standard deviation in relation to the average mean of the node loads= . > -/// > -/// The coefficient of variation is not robust, which is a desired prope= rty here, because outliers > -/// should be detected as much as possible. > +/// In more detail, the current implementation computes the so-called co= efficient of variation (CV), > +/// which is the ratio of the standard deviation to the mean of the give= n node loads. The lower > +/// bound of the CV is reached if all node loads are equal. The upper bo= und is reached if all nodes > +/// except one are idle. To present the CV as a value between 0 and 1, i= t's being divided by the > +/// upper bound of the CV for the given number of nodes. > fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeU= sage) -> f64) -> f64 { > - let node_count =3D nodes.len(); > + let node_count =3D nodes.len() as f64; even though this reduces the amount of 'as f64's below, the node count is by its nature a positive integer so it should stay that way. > + > + // imbalance is perfect for less than 2 nodes > + if node_count < 2.0 { > + return 0.0; > + } this could replace the check `load_sum =3D=3D 0.0` below, which also makes sure that we don't ever divide by zero (and return a NaN as a result), and move the assignments for node_loads and load_sum into the else branch's code path. A comment could make it more explicit that this avoids dividing by zero. > + > let node_loads =3D nodes.iter().map(to_load).collect::>(); > =20 > let load_sum =3D node_loads.iter().sum::(); > @@ -36,14 +42,17 @@ fn calculate_node_imbalance(nodes: &[NodeUsage], to_l= oad: impl Fn(&NodeUsage) -> > if load_sum =3D=3D 0.0 { > 0.0 > } else { > - let load_mean =3D load_sum / node_count as f64; > + let load_mean =3D load_sum / node_count; > =20 > let squared_diff_sum =3D node_loads > .iter() > .fold(0.0, |sum, node_load| sum + (node_load - load_mean).po= wi(2)); > - let load_sd =3D (squared_diff_sum / node_count as f64).sqrt(); > + let load_sd =3D (squared_diff_sum / node_count).sqrt(); > + > + let max_cv =3D (node_count - 1.0).sqrt(); > + let cv =3D load_sd / load_mean; nit: just for aesthetics, could be reordered to cv and then max_cv also to not loose the reference from the patch message with future changes, it would be nice to add a comment here why the calculation for max_cv is correct. > =20 > - load_sd / load_mean > + cv / max_cv > } > } > =20