From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	by lore.proxmox.com (Postfix) with ESMTPS id EFA0C1FF146
	for <inbox@lore.proxmox.com>; Tue, 28 Apr 2026 11:05:43 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id B1C75BA24;
	Tue, 28 Apr 2026 11:05:43 +0200 (CEST)
Content-Type: text/plain; charset=UTF-8
Date: Tue, 28 Apr 2026 11:05:36 +0200
Message-Id: <DI4O2EE0OX1J.1B95XY2KTZNQ1@proxmox.com>
Subject: Re: [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value
 to unit interval
From: "Daniel Kral" <d.kral@proxmox.com>
To: "Dominik Rusovac" <d.rusovac@proxmox.com>, <pve-devel@lists.proxmox.com>
Mime-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailer: aerc 0.21.0-136-gdb9fe9896a79-dirty
References: <20260427132031.220468-1-d.rusovac@proxmox.com>
 <20260427132031.220468-2-d.rusovac@proxmox.com>
In-Reply-To: <20260427132031.220468-2-d.rusovac@proxmox.com>
X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2
X-Bm-Transport-Timestamp: 1777367040478
X-SPAM-LEVEL: Spam detection results:  0
	AWL                     0.078 Adjusted score from AWL reputation of From:
 address
	BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
	DMARC_MISSING             0.1 Missing DMARC policy
	KAM_DMARC_STATUS         0.01 Test Rule for DKIM or SPF Failure with Strict
 Alignment
	SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
	SPF_PASS               -0.001 SPF: sender matches SPF record
	URIBL_BLOCKED           0.001 ADMINISTRATOR NOTICE: The query to URIBL was
 blocked.  See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block
 for more information. [scheduler.rs,ipbeja.pt,stackexchange.com]
Message-ID-Hash: 5JYFTPDTXSHMLH3GH5A26D6L5H4HCWCG
X-Message-ID-Hash: 5JYFTPDTXSHMLH3GH5A26D6L5H4HCWCG
X-MailFrom: d.kral@proxmox.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop;
 banned-address; emergency; member-moderation; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size; news-moderation;
 no-subject; digests; suspicious-header
X-Mailman-Version: 3.3.10
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Owner: <mailto:pve-devel-owner@lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Subscribe: <mailto:pve-devel-join@lists.proxmox.com>
List-Unsubscribe: <mailto:pve-devel-leave@lists.proxmox.com>

On Mon Apr 27, 2026 at 3:20 PM CEST, Dominik Rusovac wrote:
> The currently used load imbalance value is given as the so-called
> coefficient of variation (CV), a value that may exceed 1. As such, the
> CV value alone lacks meaning. A CV value of 0.0 means no imbalance, but
> what does a value of, say, 1.7 mean?
>
> Relative to the number of nodes in a cluster, it is possible to
> determine the upper bound of the CV value [0][1]. By dividing the CV
> value by its upper bound, the load imbalance can be represented as a
> value that varies between 0 and 1. Expressing the CV as a percentage
> makes the concept of load imbalance easier to interpret.

Nice, thanks for the work!

Will test the changes over the week, but just from the better
readability / interpretability of the imbalance value this should make
it more user-friendly overall.

>
> [0] https://repositorio.ipbeja.pt/server/api/core/bitstreams/8ed9a444-dbe=
0-402f-9d2f-90c5bf6e418c/content
> [1] https://stats.stackexchange.com/questions/18621/maximum-value-of-coef=
ficient-of-variation-for-bounded-data-set

and a good read overall, thanks!

A note above Example 1 and the proposition 13 from the first paper [0]
is interesting here:

    All these properties refer to the case where a single sample is
    considered, however, as noted by [16], Dodd=E2=80=99s corrected coeffic=
ient
    of variation, CV corr is not suitable for comparative purpose, as
    can be seen in the next example.

    Example 1. [...]

    [...]

    Proposition 13. Dodd=E2=80=99s corrected coefficient of variation, CV c=
orr,
    is sample-size sensitive.

AFAICT this should not be a problem for us to compare these values to
make decisions since we only compare the values for the same cluster
size configuration and therefore this shouldn't affect us badly.


On another note, it should also be easier for users to set a imbalance
threshold value, which is (at least roughly) invariant to the size of
the cluster. If one or more nodes failed or are in maintenance node,
this dramatically changes the decisions that the load balancer can make
anyway, but I wonder whether how much difference it makes to the
sensitivity to trigger the load balancing system.

>
> Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
> ---
>  proxmox-resource-scheduling/src/scheduler.rs | 33 +++++++++++++-------
>  1 file changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resou=
rce-scheduling/src/scheduler.rs
> index 49d16f9f..4eacbff9 100644
> --- a/proxmox-resource-scheduling/src/scheduler.rs
> +++ b/proxmox-resource-scheduling/src/scheduler.rs
> @@ -17,17 +17,23 @@ pub struct NodeUsage {
>      pub stats: NodeStats,
>  }
> =20
> -/// Returns the load imbalance among the nodes.
> +/// Returns the load imbalance among the nodes, which is a value between=
 0 and 1 that describes the
> +/// statistical dispersion of the individual node loads around the mean =
node load. The lower the
> +/// value, the better.
>  ///
> -/// The load balance is measured as the statistical dispersion of the in=
dividual node loads.
> -///
> -/// The current implementation uses the dimensionless coefficient of var=
iation, which expresses the
> -/// standard deviation in relation to the average mean of the node loads=
.
> -///
> -/// The coefficient of variation is not robust, which is a desired prope=
rty here, because outliers
> -/// should be detected as much as possible.
> +/// In more detail, the current implementation computes the so-called co=
efficient of variation (CV),
> +/// which is the ratio of the standard deviation to the mean of the give=
n node loads. The lower
> +/// bound of the CV is reached if all node loads are equal. The upper bo=
und is reached if all nodes
> +/// except one are idle. To present the CV as a value between 0 and 1, i=
t's being divided by the
> +/// upper bound of the CV for the given number of nodes.
>  fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeU=
sage) -> f64) -> f64 {
> -    let node_count =3D nodes.len();
> +    let node_count =3D nodes.len() as f64;

even though this reduces the amount of 'as f64's below, the node count
is by its nature a positive integer so it should stay that way.

> +
> +    // imbalance is perfect for less than 2 nodes
> +    if node_count < 2.0 {
> +        return 0.0;
> +    }

this could replace the check `load_sum =3D=3D 0.0` below, which also makes
sure that we don't ever divide by zero (and return a NaN as a result),
and move the assignments for node_loads and load_sum into the else
branch's code path.

A comment could make it more explicit that this avoids dividing by zero.

> +
>      let node_loads =3D nodes.iter().map(to_load).collect::<Vec<_>>();
> =20
>      let load_sum =3D node_loads.iter().sum::<f64>();
> @@ -36,14 +42,17 @@ fn calculate_node_imbalance(nodes: &[NodeUsage], to_l=
oad: impl Fn(&NodeUsage) ->
>      if load_sum =3D=3D 0.0 {
>          0.0
>      } else {
> -        let load_mean =3D load_sum / node_count as f64;
> +        let load_mean =3D load_sum / node_count;
> =20
>          let squared_diff_sum =3D node_loads
>              .iter()
>              .fold(0.0, |sum, node_load| sum + (node_load - load_mean).po=
wi(2));
> -        let load_sd =3D (squared_diff_sum / node_count as f64).sqrt();
> +        let load_sd =3D (squared_diff_sum / node_count).sqrt();
> +
> +        let max_cv =3D (node_count - 1.0).sqrt();
> +        let cv =3D load_sd / load_mean;

nit: just for aesthetics, could be reordered to cv and then max_cv

also to not loose the reference from the patch message with future
changes, it would be nice to add a comment here why the calculation for
max_cv is correct.

> =20
> -        load_sd / load_mean
> +        cv / max_cv
>      }
>  }
> =20