From: "Daniel Kral" <d.kral@proxmox.com>
To: "Dominik Rusovac" <d.rusovac@proxmox.com>, <pve-devel@lists.proxmox.com>
Subject: Re: [PATCH proxmox v2 1/6] resource-scheduling: clamp imbalance value to unit interval
Date: Thu, 30 Apr 2026 09:48:03 +0200
Message-ID: <DI6BO454X69O.3W55A44HFZA1M@proxmox.com>
In-Reply-To: <20260429122051.179485-2-d.rusovac@proxmox.com>

On Wed Apr 29, 2026 at 2:20 PM CEST, Dominik Rusovac wrote:
> The currently used load imbalance value is given as the so-called
> coefficient of variation (CV), a value that may exceed 1. As such, the
> CV value alone lacks meaning. A CV value of 0.0 means no imbalance, but
> what does a value of, say, 1.7 mean?
>
> Relative to the number of nodes in a cluster, it is possible to
> determine the upper bound of the CV value [0][1]. By dividing the CV
> value by its upper bound, the load imbalance can be represented as a
> value that varies between 0 and 1. Expressing the CV as a percentage
> makes the concept of load imbalance easier to interpret.
>
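
For reference, the upper bound used for the normalization can be checked by
hand for the extreme case where one of n nodes carries the whole load L > 0
and all other nodes are idle. This is only a sanity check of that single
case, using the population standard deviation (dividing by n, as the code in
this patch does). That no other distribution of non-negative loads yields a
larger CV is the claim made in [0] and [1] and is not derived here. The
symbols n, L, \mu and \sigma are my own notation.

```latex
\begin{align*}
\mu &= \frac{L}{n}, \\
\sigma^2 &= \frac{1}{n}\left[\left(L - \frac{L}{n}\right)^2
            + (n-1)\left(\frac{L}{n}\right)^2\right]
          = \frac{L^2\,(n-1)}{n^2}, \\
\mathrm{CV}_{\max} &= \frac{\sigma}{\mu}
          = \frac{L\sqrt{n-1}/n}{L/n} = \sqrt{n-1}, \\
\text{imbalance} &= \frac{\mathrm{CV}}{\sqrt{n-1}} \in [0, 1].
\end{align*}
```

As an illustration with node counts I picked myself: a raw CV of 1.7 would
map to roughly 1.7 / sqrt(3) ≈ 0.98 in a four-node cluster, but only to
1.7 / 3 ≈ 0.57 in a ten-node cluster.
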
> Re-adjust hardcoded imbalance values within tests accordingly.
>
> [0] https://repositorio.ipbeja.pt/server/api/core/bitstreams/8ed9a444-dbe0-402f-9d2f-90c5bf6e418c/content
> [1] https://stats.stackexchange.com/questions/18621/maximum-value-of-coefficient-of-variation-for-bounded-data-set
>
> Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
> ---
>
> Notes:
> changes since v1:
> * squash commit that re-adjusts tests into this one
> * back to multiple `as f64` casts of node_count variable
> * go from if-else to early-return
> * make comment above early return clause more explanatory
> * re-order cv and max_cv bindings
> * add comment with ref relating to computation of max_cv
>
> proxmox-resource-scheduling/src/scheduler.rs | 47 +++++++++++--------
> .../tests/scheduler.rs | 8 ++--
> 2 files changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
> index 49d16f9f..87eccfee 100644
> --- a/proxmox-resource-scheduling/src/scheduler.rs
> +++ b/proxmox-resource-scheduling/src/scheduler.rs
> @@ -17,34 +17,43 @@ pub struct NodeUsage {
> pub stats: NodeStats,
> }
>
> -/// Returns the load imbalance among the nodes.
> +/// Returns the load imbalance among the nodes, which is a value between 0 and 1 that describes the
> +/// statistical dispersion of the individual node loads around the mean node load. The lower the
> +/// value, the better.
> ///
> -/// The load balance is measured as the statistical dispersion of the individual node loads.
> -///
> -/// The current implementation uses the dimensionless coefficient of variation, which expresses the
> -/// standard deviation in relation to the average mean of the node loads.
> -///
> -/// The coefficient of variation is not robust, which is a desired property here, because outliers
> -/// should be detected as much as possible.
> +/// In more detail, the current implementation computes the so-called coefficient of variation (CV),
> +/// which is the ratio of the standard deviation to the mean of the given node loads. The lower
> +/// bound of the CV is reached if all node loads are equal. The upper bound is reached if all nodes
> +/// except one are idle. To present the CV as a value between 0 and 1, it's being divided by the
> +/// upper bound of the CV for the given number of nodes.
> fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeUsage) -> f64) -> f64 {
>      let node_count = nodes.len();
> -    let node_loads = nodes.iter().map(to_load).collect::<Vec<_>>();
>
> +    // early return with perfect imbalance to avoid division by zero
> +    if node_count < 2 {
> +        return 0.0;
> +    }
> +
> +    let node_loads = nodes.iter().map(to_load).collect::<Vec<_>>();
>      let load_sum = node_loads.iter().sum::<f64>();
>
> -    // load_sum is guaranteed to be -0.0 for empty `nodes`
> +    // early return with perfect imbalance to avoid division by zero
>      if load_sum == 0.0 {
> -        0.0
> -    } else {
> -        let load_mean = load_sum / node_count as f64;
> +        return 0.0;
> +    }
>
> -        let squared_diff_sum = node_loads
> -            .iter()
> -            .fold(0.0, |sum, node_load| sum + (node_load - load_mean).powi(2));
> -        let load_sd = (squared_diff_sum / node_count as f64).sqrt();
> +    let load_mean = load_sum / node_count as f64;
> +    let squared_diff_sum = node_loads
> +        .iter()
> +        .fold(0.0, |sum, node_load| sum + (node_load - load_mean).powi(2));
> +    let load_sd = (squared_diff_sum / node_count as f64).sqrt();
>
> -        load_sd / load_mean
> -    }
> +    let cv = load_sd / load_mean;
> +

Small whitespace error here; this could also be fixed on apply, though.

> +    // https://stats.stackexchange.com/questions/18621
> +    let max_cv = ((node_count - 1) as f64).sqrt();
> +
> +    cv / max_cv
> }
>
> criteria_struct! {
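
To double-check the edge cases, here is a minimal standalone sketch of the
computation in this hunk. It is not the crate's code: `normalized_imbalance`
is a hypothetical stand-in that takes plain `f64` loads instead of
`NodeUsage` values and a `to_load` closure, and it assumes non-negative
loads.

```rust
/// Normalized load imbalance in [0, 1] for a slice of non-negative node loads.
///
/// Hypothetical stand-in for the patched function, operating on plain `f64` loads.
fn normalized_imbalance(loads: &[f64]) -> f64 {
    let n = loads.len();

    // Fewer than two nodes: nothing to compare, and max_cv would be 0.
    if n < 2 {
        return 0.0;
    }

    let sum: f64 = loads.iter().sum();

    // All nodes idle: the mean is 0 and the CV is undefined, so report no imbalance.
    if sum == 0.0 {
        return 0.0;
    }

    let mean = sum / n as f64;
    // Population variance (divide by n), matching the hunk above.
    let variance = loads.iter().map(|l| (l - mean).powi(2)).sum::<f64>() / n as f64;
    let cv = variance.sqrt() / mean;

    // Upper bound of the CV for n non-negative values.
    let max_cv = ((n - 1) as f64).sqrt();

    cv / max_cv
}
```

With this sketch, equal loads such as `[2.0, 2.0, 2.0]` give 0.0, and
`[0.0, 0.0, 0.0, 8.0]` (one busy node, three idle ones) gives a value within
floating-point error of 1.0. It also shows why the first guard matters: for a
single node with non-zero load, both `cv` and `max_cv` are 0.0, so without
the guard the final division would evaluate to NaN.
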
> diff --git a/proxmox-resource-scheduling/tests/scheduler.rs b/proxmox-resource-scheduling/tests/scheduler.rs
> index be90e4f9..21dbe451 100644
> --- a/proxmox-resource-scheduling/tests/scheduler.rs
> +++ b/proxmox-resource-scheduling/tests/scheduler.rs
> @@ -172,7 +172,7 @@ fn test_score_best_balancing_migration_candidates_with_no_candidates() {
> fn test_score_best_balancing_migration_candidates_in_homogeneous_cluster() {
> let scheduler = new_homogeneous_cluster_scheduler();
>
> - assert_imbalance(scheduler.node_imbalance(), 0.4893954724628247);
> + assert_imbalance(scheduler.node_imbalance(), 0.3460548572604576);
>
> let (candidates, migration1, migration2) = new_simple_migration_candidates();
>
> @@ -186,7 +186,7 @@ fn test_score_best_balancing_migration_candidates_in_homogeneous_cluster() {
> fn test_score_best_balancing_migration_candidates_in_heterogeneous_cluster() {
> let scheduler = new_heterogeneous_cluster_scheduler();
>
> - assert_imbalance(scheduler.node_imbalance(), 0.33026013056867354);
> + assert_imbalance(scheduler.node_imbalance(), 0.23352917788066363);
>
> let (candidates, migration1, migration2) = new_simple_migration_candidates();
>
> @@ -225,7 +225,7 @@ fn test_score_best_balancing_migration_candidates_topsis_in_homogeneous_cluster(
> ) -> Result<(), Error> {
> let scheduler = new_homogeneous_cluster_scheduler();
>
> - assert_imbalance(scheduler.node_imbalance(), 0.4893954724628247);
> + assert_imbalance(scheduler.node_imbalance(), 0.3460548572604576);
>
> let (candidates, migration1, migration2) = new_simple_migration_candidates();
>
> @@ -242,7 +242,7 @@ fn test_score_best_balancing_migration_candidates_topsis_in_heterogeneous_cluste
> ) -> Result<(), Error> {
> let scheduler = new_heterogeneous_cluster_scheduler();
>
> - assert_imbalance(scheduler.node_imbalance(), 0.33026013056867354);
> + assert_imbalance(scheduler.node_imbalance(), 0.23352917788066363);
>
> let (candidates, migration1, migration2) = new_simple_migration_candidates();
>
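
The re-adjusted expectations look consistent with the new normalization: in
both cases the old value divided by the new one is sqrt(2), which is what
`max_cv` evaluates to for three nodes. The node count of the test clusters is
not visible in this diff, so "three nodes" is an inference from the numbers,
and `rescale` and `matches_node_count` below are hypothetical helpers, not
part of the crate.

```rust
/// Hypothetical helper: maps a raw CV to the unit interval for `node_count` nodes.
fn rescale(raw_cv: f64, node_count: usize) -> f64 {
    raw_cv / ((node_count - 1) as f64).sqrt()
}

/// (old expected CV, new expected value) pairs copied from the test diff above.
const READJUSTED: [(f64, f64); 2] = [
    (0.4893954724628247, 0.3460548572604576),
    (0.33026013056867354, 0.23352917788066363),
];

/// Returns true if every new value equals the old one rescaled for `node_count` nodes.
fn matches_node_count(node_count: usize) -> bool {
    READJUSTED
        .iter()
        .all(|&(old, new)| (rescale(old, node_count) - new).abs() < 1e-12)
}
```

`matches_node_count(3)` should hold for these values, while other node
counts such as 2 or 4 should not.
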