From: Dominik Rusovac <d.rusovac@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH proxmox v2 1/6] resource-scheduling: clamp imbalance value to unit interval
Date: Wed, 29 Apr 2026 14:20:46 +0200
Message-ID: <20260429122051.179485-2-d.rusovac@proxmox.com>
In-Reply-To: <20260429122051.179485-1-d.rusovac@proxmox.com>
The load imbalance is currently reported as the coefficient of
variation (CV), a value that may exceed 1. On its own, such a value is
hard to interpret: a CV of 0.0 means no imbalance, but what does a
value of, say, 1.7 mean?
For a given number of nodes n in a cluster, the CV has an upper bound
of sqrt(n - 1), which is reached if all nodes except one are idle
[0][1]. By dividing the CV by this upper bound, the load imbalance can
be represented as a value between 0 and 1. Expressing the CV as a
fraction of its maximum, that is, as a percentage, makes the concept of
load imbalance easier to interpret.
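As a quick sanity check of the bound (a worked example, not part of the
patch itself), assume n nodes where a single node carries a load L > 0
and all others are idle, and assume the population standard deviation
(dividing by n), as the code in this patch does:

```latex
\[
\mu = \frac{L}{n}, \qquad
\sigma^2 = \frac{1}{n}\left[\left(L - \frac{L}{n}\right)^2
         + (n-1)\left(\frac{L}{n}\right)^2\right]
         = \frac{L^2\,(n-1)}{n^2}, \qquad
\mathrm{CV} = \frac{\sigma}{\mu} = \sqrt{n-1}.
\]
```

Dividing by sqrt(n - 1) therefore maps this worst case to exactly 1,
while equal loads still map to 0.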
Re-adjust hardcoded imbalance values within tests accordingly.
[0] https://repositorio.ipbeja.pt/server/api/core/bitstreams/8ed9a444-dbe0-402f-9d2f-90c5bf6e418c/content
[1] https://stats.stackexchange.com/questions/18621/maximum-value-of-coefficient-of-variation-for-bounded-data-set
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
Notes:
changes since v1:
* squash commit that re-adjusts tests into this one
* back to multiple `as f64` casts of node_count variable
* go from if-else to early-return
* make comment above early return clause more explanatory
* re-order cv and max_cv bindings
* add comment with ref relating to computation of max_cv
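For reviewers who want to try the normalization in isolation, the idea
from the commit message can be sketched as below. This is a minimal
sketch under stated assumptions: `normalized_cv` is a hypothetical
stand-alone helper that takes plain `f64` loads instead of `NodeUsage`,
and it is not part of the crate's API.

```rust
/// Hypothetical helper (not the crate's API): coefficient of variation of
/// `loads`, divided by its upper bound sqrt(n - 1) for n nodes.
fn normalized_cv(loads: &[f64]) -> f64 {
    let n = loads.len();

    // With fewer than two nodes there is nothing to balance, and the
    // upper bound sqrt(n - 1) would be zero (or underflow for n == 0).
    if n < 2 {
        return 0.0;
    }

    let sum: f64 = loads.iter().sum();

    // All nodes idle: the mean is zero and the CV is undefined, so report
    // no imbalance instead of dividing by zero.
    if sum == 0.0 {
        return 0.0;
    }

    let mean = sum / n as f64;

    // Population variance (divide by n, not n - 1), as in the patch.
    let variance = loads.iter().map(|l| (l - mean).powi(2)).sum::<f64>() / n as f64;

    let cv = variance.sqrt() / mean;
    let max_cv = ((n - 1) as f64).sqrt();

    cv / max_cv
}
```

With this sketch, equal loads such as `[1.0, 1.0, 1.0]` give 0, and a
single busy node such as `[3.0, 0.0, 0.0]` gives 1 (up to floating-point
rounding).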
proxmox-resource-scheduling/src/scheduler.rs | 47 +++++++++++--------
.../tests/scheduler.rs | 8 ++--
2 files changed, 32 insertions(+), 23 deletions(-)
diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
index 49d16f9f..87eccfee 100644
--- a/proxmox-resource-scheduling/src/scheduler.rs
+++ b/proxmox-resource-scheduling/src/scheduler.rs
@@ -17,34 +17,43 @@ pub struct NodeUsage {
pub stats: NodeStats,
}
-/// Returns the load imbalance among the nodes.
+/// Returns the load imbalance among the nodes, which is a value between 0 and 1 that describes the
+/// statistical dispersion of the individual node loads around the mean node load. The lower the
+/// value, the better.
///
-/// The load balance is measured as the statistical dispersion of the individual node loads.
-///
-/// The current implementation uses the dimensionless coefficient of variation, which expresses the
-/// standard deviation in relation to the average mean of the node loads.
-///
-/// The coefficient of variation is not robust, which is a desired property here, because outliers
-/// should be detected as much as possible.
+/// In more detail, the current implementation computes the so-called coefficient of variation (CV),
+/// which is the ratio of the standard deviation to the mean of the given node loads. The lower
+/// bound of the CV is reached if all node loads are equal. The upper bound is reached if all nodes
+/// except one are idle. To present the CV as a value between 0 and 1, it is divided by the
+/// upper bound of the CV for the given number of nodes.
fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeUsage) -> f64) -> f64 {
let node_count = nodes.len();
- let node_loads = nodes.iter().map(to_load).collect::<Vec<_>>();
+ // early return with zero imbalance to avoid division by zero
+ if node_count < 2 {
+ return 0.0;
+ }
+
+ let node_loads = nodes.iter().map(to_load).collect::<Vec<_>>();
let load_sum = node_loads.iter().sum::<f64>();
- // load_sum is guaranteed to be -0.0 for empty `nodes`
+ // early return with zero imbalance to avoid division by zero
if load_sum == 0.0 {
- 0.0
- } else {
- let load_mean = load_sum / node_count as f64;
+ return 0.0;
+ }
- let squared_diff_sum = node_loads
- .iter()
- .fold(0.0, |sum, node_load| sum + (node_load - load_mean).powi(2));
- let load_sd = (squared_diff_sum / node_count as f64).sqrt();
+ let load_mean = load_sum / node_count as f64;
+ let squared_diff_sum = node_loads
+ .iter()
+ .fold(0.0, |sum, node_load| sum + (node_load - load_mean).powi(2));
+ let load_sd = (squared_diff_sum / node_count as f64).sqrt();
- load_sd / load_mean
- }
+ let cv = load_sd / load_mean;
+
+ // https://stats.stackexchange.com/questions/18621
+ let max_cv = ((node_count - 1) as f64).sqrt();
+
+ cv / max_cv
}
criteria_struct! {
diff --git a/proxmox-resource-scheduling/tests/scheduler.rs b/proxmox-resource-scheduling/tests/scheduler.rs
index be90e4f9..21dbe451 100644
--- a/proxmox-resource-scheduling/tests/scheduler.rs
+++ b/proxmox-resource-scheduling/tests/scheduler.rs
@@ -172,7 +172,7 @@ fn test_score_best_balancing_migration_candidates_with_no_candidates() {
fn test_score_best_balancing_migration_candidates_in_homogeneous_cluster() {
let scheduler = new_homogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.4893954724628247);
+ assert_imbalance(scheduler.node_imbalance(), 0.3460548572604576);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
@@ -186,7 +186,7 @@ fn test_score_best_balancing_migration_candidates_in_homogeneous_cluster() {
fn test_score_best_balancing_migration_candidates_in_heterogeneous_cluster() {
let scheduler = new_heterogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.33026013056867354);
+ assert_imbalance(scheduler.node_imbalance(), 0.23352917788066363);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
@@ -225,7 +225,7 @@ fn test_score_best_balancing_migration_candidates_topsis_in_homogeneous_cluster(
) -> Result<(), Error> {
let scheduler = new_homogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.4893954724628247);
+ assert_imbalance(scheduler.node_imbalance(), 0.3460548572604576);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
@@ -242,7 +242,7 @@ fn test_score_best_balancing_migration_candidates_topsis_in_heterogeneous_cluste
) -> Result<(), Error> {
let scheduler = new_heterogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.33026013056867354);
+ assert_imbalance(scheduler.node_imbalance(), 0.23352917788066363);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
--
2.47.3