* [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval
@ 2026-04-27 13:20 Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value " Dominik Rusovac
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
# TL;DR
Clamp the load imbalance to a value between 0 and 1, and display it as a
percentage in the HA Status panel of the PVE UI.
# Details
The load imbalance is currently reported as the so-called coefficient of
variation (CV), a value that may exceed 1 and is therefore hard to interpret on
its own. A CV of 0.0 means no imbalance, but what does a value of, say,
1.7 mean?
For a given number of nodes in a cluster, it is possible to determine the upper
bound of the CV [0][1]. Dividing the CV by this upper bound yields a load
imbalance value that always lies between 0 and 1. Expressing this normalized
value as a percentage makes the concept of load imbalance easier to interpret.
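The normalization can be sketched as follows. This is a standalone illustration
under the series' stated approach, not the actual proxmox-resource-scheduling
code; the function name and example loads are made up:

```rust
/// Illustrative sketch: compute the coefficient of variation (CV) of the node
/// loads and normalize it by its upper bound sqrt(n - 1), which is reached
/// when all load sits on a single node.
fn normalized_imbalance(loads: &[f64]) -> f64 {
    let n = loads.len() as f64;
    if n < 2.0 {
        return 0.0; // a single node (or none) cannot be imbalanced
    }
    let sum: f64 = loads.iter().sum();
    if sum == 0.0 {
        return 0.0; // all nodes idle: perfectly balanced
    }
    let mean = sum / n;
    let variance = loads.iter().map(|l| (l - mean).powi(2)).sum::<f64>() / n;
    let cv = variance.sqrt() / mean; // coefficient of variation
    let max_cv = (n - 1.0).sqrt();   // upper bound of the CV for n nodes
    cv / max_cv                      // always in [0, 1]
}

fn main() {
    // Worst case for 3 nodes: all load on one node gives CV = sqrt(2) ~ 1.41,
    // which normalizes to 1.0.
    println!("{:.2}", normalized_imbalance(&[6.0, 0.0, 0.0])); // 1.00
    // Equal loads give a perfectly balanced 0.0.
    println!("{:.2}", normalized_imbalance(&[2.0, 2.0, 2.0])); // 0.00
}
```

This also explains the changed test expectations below: the 3-node test
clusters previously logged a raw CV of up to 1.41 (sqrt(2)), which now maps to
1.00.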
# Summary of Changes
This series:
- represents load imbalance as a value between 0 and 1;
- adds a maximum value of 1.0 for load scheduler options; and
- exposes the load imbalance value via the HA status endpoint, in order to
provide feedback on the prevailing load imbalance in the PVE UI.
# Refs
[0] https://repositorio.ipbeja.pt/server/api/core/bitstreams/8ed9a444-dbe0-402f-9d2f-90c5bf6e418c/content
[1] https://stats.stackexchange.com/questions/18621/maximum-value-of-coefficient-of-variation-for-bounded-data-set
proxmox:
Dominik Rusovac (2):
resource-scheduling: clamp imbalance value to unit interval
resource-scheduling: re-adjust hardcoded imbalance values
proxmox-resource-scheduling/src/scheduler.rs | 33 ++++++++++++-------
.../tests/scheduler.rs | 8 ++---
2 files changed, 25 insertions(+), 16 deletions(-)
pve-manager:
Dominik Rusovac (1):
ui: form/CRSOptions: add maximum for threshold
www/manager6/form/CRSOptions.js | 1 +
1 file changed, 1 insertion(+)
pve-ha-manager:
Dominik Rusovac (3):
test: re-adjust logged imbalance values
manager: add load imbalance to status
api: status: add load imbalance to status
src/PVE/API2/HA/Status.pm | 4 +-
src/PVE/HA/Manager.pm | 1 +
.../log.expect | 4 +-
.../log.expect | 38 +++++++++----------
.../log.expect | 4 +-
.../log.expect | 29 +++++---------
.../log.expect | 2 +-
.../log.expect | 2 +-
.../log.expect | 4 +-
.../log.expect | 4 +-
.../log.expect | 4 +-
.../log.expect | 22 +----------
12 files changed, 47 insertions(+), 71 deletions(-)
pve-cluster:
Dominik Rusovac (1):
datacenter config: add maxima for load scheduler options
src/PVE/DataCenterConfig.pm | 2 ++
1 file changed, 2 insertions(+)
Summary over all repositories:
16 files changed, 75 insertions(+), 87 deletions(-)
--
Generated by murpp 0.11.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value to unit interval
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 2/7] resource-scheduling: re-adjust hardcoded imbalance values Dominik Rusovac
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
The load imbalance is currently reported as the so-called coefficient
of variation (CV), a value that may exceed 1 and is therefore hard to
interpret on its own. A CV of 0.0 means no imbalance, but what does a
value of, say, 1.7 mean?
For a given number of nodes in a cluster, it is possible to determine
the upper bound of the CV [0][1]. Dividing the CV by this upper bound
yields a load imbalance value that always lies between 0 and 1.
Expressing this normalized value as a percentage makes the concept of
load imbalance easier to interpret.
[0] https://repositorio.ipbeja.pt/server/api/core/bitstreams/8ed9a444-dbe0-402f-9d2f-90c5bf6e418c/content
[1] https://stats.stackexchange.com/questions/18621/maximum-value-of-coefficient-of-variation-for-bounded-data-set
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
proxmox-resource-scheduling/src/scheduler.rs | 33 +++++++++++++-------
1 file changed, 21 insertions(+), 12 deletions(-)
diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
index 49d16f9f..4eacbff9 100644
--- a/proxmox-resource-scheduling/src/scheduler.rs
+++ b/proxmox-resource-scheduling/src/scheduler.rs
@@ -17,17 +17,23 @@ pub struct NodeUsage {
pub stats: NodeStats,
}
-/// Returns the load imbalance among the nodes.
+/// Returns the load imbalance among the nodes, which is a value between 0 and 1 that describes the
+/// statistical dispersion of the individual node loads around the mean node load. The lower the
+/// value, the better.
///
-/// The load balance is measured as the statistical dispersion of the individual node loads.
-///
-/// The current implementation uses the dimensionless coefficient of variation, which expresses the
-/// standard deviation in relation to the average mean of the node loads.
-///
-/// The coefficient of variation is not robust, which is a desired property here, because outliers
-/// should be detected as much as possible.
+/// In more detail, the current implementation computes the so-called coefficient of variation (CV),
+/// which is the ratio of the standard deviation to the mean of the given node loads. The lower
+/// bound of the CV is reached if all node loads are equal. The upper bound is reached if all nodes
+/// except one are idle. To present the CV as a value between 0 and 1, it is divided by the
+/// upper bound of the CV for the given number of nodes.
fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeUsage) -> f64) -> f64 {
- let node_count = nodes.len();
+ let node_count = nodes.len() as f64;
+
+ // there is no imbalance with fewer than 2 nodes
+ if node_count < 2.0 {
+ return 0.0;
+ }
+
let node_loads = nodes.iter().map(to_load).collect::<Vec<_>>();
let load_sum = node_loads.iter().sum::<f64>();
@@ -36,14 +42,17 @@ fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeUsage) ->
if load_sum == 0.0 {
0.0
} else {
- let load_mean = load_sum / node_count as f64;
+ let load_mean = load_sum / node_count;
let squared_diff_sum = node_loads
.iter()
.fold(0.0, |sum, node_load| sum + (node_load - load_mean).powi(2));
- let load_sd = (squared_diff_sum / node_count as f64).sqrt();
+ let load_sd = (squared_diff_sum / node_count).sqrt();
+
+ let max_cv = (node_count - 1.0).sqrt();
+ let cv = load_sd / load_mean;
- load_sd / load_mean
+ cv / max_cv
}
}
--
2.47.3
* [PATCH proxmox 2/7] resource-scheduling: re-adjust hardcoded imbalance values
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value " Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-manager 3/7] ui: form/CRSOptions: add maximum for threshold Dominik Rusovac
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
proxmox-resource-scheduling/tests/scheduler.rs | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/proxmox-resource-scheduling/tests/scheduler.rs b/proxmox-resource-scheduling/tests/scheduler.rs
index be90e4f9..21dbe451 100644
--- a/proxmox-resource-scheduling/tests/scheduler.rs
+++ b/proxmox-resource-scheduling/tests/scheduler.rs
@@ -172,7 +172,7 @@ fn test_score_best_balancing_migration_candidates_with_no_candidates() {
fn test_score_best_balancing_migration_candidates_in_homogeneous_cluster() {
let scheduler = new_homogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.4893954724628247);
+ assert_imbalance(scheduler.node_imbalance(), 0.3460548572604576);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
@@ -186,7 +186,7 @@ fn test_score_best_balancing_migration_candidates_in_homogeneous_cluster() {
fn test_score_best_balancing_migration_candidates_in_heterogeneous_cluster() {
let scheduler = new_heterogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.33026013056867354);
+ assert_imbalance(scheduler.node_imbalance(), 0.23352917788066363);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
@@ -225,7 +225,7 @@ fn test_score_best_balancing_migration_candidates_topsis_in_homogeneous_cluster(
) -> Result<(), Error> {
let scheduler = new_homogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.4893954724628247);
+ assert_imbalance(scheduler.node_imbalance(), 0.3460548572604576);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
@@ -242,7 +242,7 @@ fn test_score_best_balancing_migration_candidates_topsis_in_heterogeneous_cluste
) -> Result<(), Error> {
let scheduler = new_heterogeneous_cluster_scheduler();
- assert_imbalance(scheduler.node_imbalance(), 0.33026013056867354);
+ assert_imbalance(scheduler.node_imbalance(), 0.23352917788066363);
let (candidates, migration1, migration2) = new_simple_migration_candidates();
--
2.47.3
* [PATCH pve-manager 3/7] ui: form/CRSOptions: add maximum for threshold
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value " Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 2/7] resource-scheduling: re-adjust hardcoded imbalance values Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-ha-manager 4/7] test: re-adjust logged imbalance values Dominik Rusovac
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
www/manager6/form/CRSOptions.js | 1 +
1 file changed, 1 insertion(+)
diff --git a/www/manager6/form/CRSOptions.js b/www/manager6/form/CRSOptions.js
index b5476bd5..985eb8cf 100644
--- a/www/manager6/form/CRSOptions.js
+++ b/www/manager6/form/CRSOptions.js
@@ -66,6 +66,7 @@ Ext.define('PVE.form.CRSOptions', {
fieldLabel: gettext('Imbalance Threshold'),
emptyText: '0.3',
minValue: 0.0,
+ maxValue: 1.0,
step: 0.01,
bind: {
disabled: '{!enableAutoRebalance.checked}',
--
2.47.3
* [PATCH pve-ha-manager 4/7] test: re-adjust logged imbalance values
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
` (2 preceding siblings ...)
2026-04-27 13:20 ` [PATCH pve-manager 3/7] ui: form/CRSOptions: add maximum for threshold Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-ha-manager 5/7] manager: add load imbalance to status Dominik Rusovac
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
.../log.expect | 4 +-
.../log.expect | 38 +++++++++----------
.../log.expect | 4 +-
.../log.expect | 29 +++++---------
.../log.expect | 2 +-
.../log.expect | 2 +-
.../log.expect | 4 +-
.../log.expect | 4 +-
.../log.expect | 4 +-
.../log.expect | 22 +----------
10 files changed, 43 insertions(+), 70 deletions(-)
diff --git a/src/test/test-crs-dynamic-auto-rebalance-topsis2/log.expect b/src/test/test-crs-dynamic-auto-rebalance-topsis2/log.expect
index 3d79026..83d4e60 100644
--- a/src/test/test-crs-dynamic-auto-rebalance-topsis2/log.expect
+++ b/src/test/test-crs-dynamic-auto-rebalance-topsis2/log.expect
@@ -34,7 +34,7 @@ info 21 node1/lrm: starting service vm:104
info 21 node1/lrm: service status vm:104 started
info 22 node2/crm: status change wait_for_quorum => slave
info 24 node3/crm: status change wait_for_quorum => slave
-info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.41 to 0.94)
+info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.00 to 0.66)
info 80 node1/crm: got crm command: migrate vm:101 node2
info 80 node1/crm: migrate service 'vm:101' to node 'node2'
info 80 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
@@ -45,7 +45,7 @@ info 83 node2/lrm: status change wait_for_agent_lock => active
info 100 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
info 103 node2/lrm: starting service vm:101
info 103 node2/lrm: service status vm:101 started
-info 160 node1/crm: auto rebalance - migrate vm:102 to node3 (expected change for imbalance from 0.94 to 0.35)
+info 160 node1/crm: auto rebalance - migrate vm:102 to node3 (expected change for imbalance from 0.66 to 0.25)
info 160 node1/crm: got crm command: migrate vm:102 node3
info 160 node1/crm: migrate service 'vm:102' to node 'node3'
info 160 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node3)
diff --git a/src/test/test-crs-dynamic-auto-rebalance-topsis3/log.expect b/src/test/test-crs-dynamic-auto-rebalance-topsis3/log.expect
index c9fc29e..c539122 100644
--- a/src/test/test-crs-dynamic-auto-rebalance-topsis3/log.expect
+++ b/src/test/test-crs-dynamic-auto-rebalance-topsis3/log.expect
@@ -53,7 +53,7 @@ info 25 node3/lrm: service status vm:107 started
info 120 cmdlist: execute service vm:105 set-dynamic-stats cpu 7.8 mem 7912
info 120 cmdlist: execute service vm:106 set-dynamic-stats cpu 5.7 mem 8192
info 120 cmdlist: execute service vm:107 set-dynamic-stats cpu 6.0 mem 8011
-info 160 node1/crm: auto rebalance - migrate vm:105 to node2 (expected change for imbalance from 0.85 to 0.42)
+info 160 node1/crm: auto rebalance - migrate vm:105 to node2 (expected change for imbalance from 0.60 to 0.30)
info 160 node1/crm: got crm command: migrate vm:105 node2
info 160 node1/crm: migrate service 'vm:105' to node 'node2'
info 160 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node3, target = node2)
@@ -68,22 +68,22 @@ info 220 cmdlist: execute service vm:104 set-dynamic-stats cpu 6.7 mem 8
info 220 cmdlist: execute service vm:105 set-dynamic-stats cpu 1.8 mem 1201
info 220 cmdlist: execute service vm:106 set-dynamic-stats cpu 2.1 mem 1211
info 220 cmdlist: execute service vm:107 set-dynamic-stats cpu 0.9 mem 1191
-info 240 node1/crm: auto rebalance - migrate vm:103 to node3 (expected change for imbalance from 0.81 to 0.43)
-info 240 node1/crm: got crm command: migrate vm:103 node3
-info 240 node1/crm: migrate service 'vm:103' to node 'node3'
-info 240 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 243 node2/lrm: service vm:103 - start migrate to node 'node3'
-info 243 node2/lrm: service vm:103 - end migrate to node 'node3'
-info 260 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
-info 265 node3/lrm: starting service vm:103
-info 265 node3/lrm: service status vm:103 started
-info 320 node1/crm: auto rebalance - migrate vm:105 to node1 (expected change for imbalance from 0.43 to 0.24)
-info 320 node1/crm: got crm command: migrate vm:105 node1
-info 320 node1/crm: migrate service 'vm:105' to node 'node1'
-info 320 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node2, target = node1)
-info 323 node2/lrm: service vm:105 - start migrate to node 'node1'
-info 323 node2/lrm: service vm:105 - end migrate to node 'node1'
-info 340 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node1)
-info 341 node1/lrm: starting service vm:105
-info 341 node1/lrm: service status vm:105 started
+info 260 node1/crm: auto rebalance - migrate vm:103 to node3 (expected change for imbalance from 0.57 to 0.30)
+info 260 node1/crm: got crm command: migrate vm:103 node3
+info 260 node1/crm: migrate service 'vm:103' to node 'node3'
+info 260 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 263 node2/lrm: service vm:103 - start migrate to node 'node3'
+info 263 node2/lrm: service vm:103 - end migrate to node 'node3'
+info 280 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info 285 node3/lrm: starting service vm:103
+info 285 node3/lrm: service status vm:103 started
+info 340 node1/crm: auto rebalance - migrate vm:105 to node1 (expected change for imbalance from 0.30 to 0.17)
+info 340 node1/crm: got crm command: migrate vm:105 node1
+info 340 node1/crm: migrate service 'vm:105' to node 'node1'
+info 340 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 343 node2/lrm: service vm:105 - start migrate to node 'node1'
+info 343 node2/lrm: service vm:105 - end migrate to node 'node1'
+info 360 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node1)
+info 361 node1/lrm: starting service vm:105
+info 361 node1/lrm: service status vm:105 started
info 820 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/log.expect b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
index 3d79026..83d4e60 100644
--- a/src/test/test-crs-dynamic-auto-rebalance2/log.expect
+++ b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
@@ -34,7 +34,7 @@ info 21 node1/lrm: starting service vm:104
info 21 node1/lrm: service status vm:104 started
info 22 node2/crm: status change wait_for_quorum => slave
info 24 node3/crm: status change wait_for_quorum => slave
-info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.41 to 0.94)
+info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.00 to 0.66)
info 80 node1/crm: got crm command: migrate vm:101 node2
info 80 node1/crm: migrate service 'vm:101' to node 'node2'
info 80 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
@@ -45,7 +45,7 @@ info 83 node2/lrm: status change wait_for_agent_lock => active
info 100 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
info 103 node2/lrm: starting service vm:101
info 103 node2/lrm: service status vm:101 started
-info 160 node1/crm: auto rebalance - migrate vm:102 to node3 (expected change for imbalance from 0.94 to 0.35)
+info 160 node1/crm: auto rebalance - migrate vm:102 to node3 (expected change for imbalance from 0.66 to 0.25)
info 160 node1/crm: got crm command: migrate vm:102 node3
info 160 node1/crm: migrate service 'vm:102' to node 'node3'
info 160 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node3)
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/log.expect b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
index 275f7ae..6f8c1ee 100644
--- a/src/test/test-crs-dynamic-auto-rebalance3/log.expect
+++ b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
@@ -53,7 +53,7 @@ info 25 node3/lrm: service status vm:107 started
info 120 cmdlist: execute service vm:105 set-dynamic-stats cpu 7.8 mem 7912
info 120 cmdlist: execute service vm:106 set-dynamic-stats cpu 5.7 mem 8192
info 120 cmdlist: execute service vm:107 set-dynamic-stats cpu 6.0 mem 8011
-info 160 node1/crm: auto rebalance - migrate vm:105 to node2 (expected change for imbalance from 0.85 to 0.42)
+info 160 node1/crm: auto rebalance - migrate vm:105 to node2 (expected change for imbalance from 0.60 to 0.30)
info 160 node1/crm: got crm command: migrate vm:105 node2
info 160 node1/crm: migrate service 'vm:105' to node 'node2'
info 160 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node3, target = node2)
@@ -68,22 +68,13 @@ info 220 cmdlist: execute service vm:104 set-dynamic-stats cpu 6.7 mem 8
info 220 cmdlist: execute service vm:105 set-dynamic-stats cpu 1.8 mem 1201
info 220 cmdlist: execute service vm:106 set-dynamic-stats cpu 2.1 mem 1211
info 220 cmdlist: execute service vm:107 set-dynamic-stats cpu 0.9 mem 1191
-info 240 node1/crm: auto rebalance - migrate vm:103 to node1 (expected change for imbalance from 0.81 to 0.40)
-info 240 node1/crm: got crm command: migrate vm:103 node1
-info 240 node1/crm: migrate service 'vm:103' to node 'node1'
-info 240 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node1)
-info 243 node2/lrm: service vm:103 - start migrate to node 'node1'
-info 243 node2/lrm: service vm:103 - end migrate to node 'node1'
-info 260 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
-info 261 node1/lrm: starting service vm:103
-info 261 node1/lrm: service status vm:103 started
-info 320 node1/crm: auto rebalance - migrate vm:105 to node3 (expected change for imbalance from 0.40 to 0.21)
-info 320 node1/crm: got crm command: migrate vm:105 node3
-info 320 node1/crm: migrate service 'vm:105' to node 'node3'
-info 320 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 323 node2/lrm: service vm:105 - start migrate to node 'node3'
-info 323 node2/lrm: service vm:105 - end migrate to node 'node3'
-info 340 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node3)
-info 345 node3/lrm: starting service vm:105
-info 345 node3/lrm: service status vm:105 started
+info 260 node1/crm: auto rebalance - migrate vm:103 to node1 (expected change for imbalance from 0.57 to 0.28)
+info 260 node1/crm: got crm command: migrate vm:103 node1
+info 260 node1/crm: migrate service 'vm:103' to node 'node1'
+info 260 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 263 node2/lrm: service vm:103 - start migrate to node 'node1'
+info 263 node2/lrm: service vm:103 - end migrate to node 'node1'
+info 280 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
+info 281 node1/lrm: starting service vm:103
+info 281 node1/lrm: service status vm:103 started
info 820 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-constrained-auto-rebalance1/log.expect b/src/test/test-crs-dynamic-constrained-auto-rebalance1/log.expect
index c926799..30d9721 100644
--- a/src/test/test-crs-dynamic-constrained-auto-rebalance1/log.expect
+++ b/src/test/test-crs-dynamic-constrained-auto-rebalance1/log.expect
@@ -35,7 +35,7 @@ info 120 cmdlist: execute service vm:104 set-static-stats maxcpu 8.0 max
info 120 cmdlist: execute service vm:104 set-dynamic-stats cpu 4.0 mem 4096
info 120 node1/crm: adding new service 'vm:104' on node 'node1'
info 120 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
-info 140 node1/crm: auto rebalance - migrate vm:104 to node2 (expected change for imbalance from 1.41 to 0.98)
+info 140 node1/crm: auto rebalance - migrate vm:104 to node2 (expected change for imbalance from 1.00 to 0.70)
info 140 node1/crm: got crm command: migrate vm:104 node2
info 140 node1/crm: migrate service 'vm:104' to node 'node2'
info 140 node1/crm: service 'vm:104': state changed from 'started' to 'migrate' (node = node1, target = node2)
diff --git a/src/test/test-crs-dynamic-constrained-auto-rebalance2/log.expect b/src/test/test-crs-dynamic-constrained-auto-rebalance2/log.expect
index 26be942..d9189c9 100644
--- a/src/test/test-crs-dynamic-constrained-auto-rebalance2/log.expect
+++ b/src/test/test-crs-dynamic-constrained-auto-rebalance2/log.expect
@@ -31,7 +31,7 @@ info 120 cmdlist: execute service vm:103 set-static-stats maxcpu 8.0 max
info 120 cmdlist: execute service vm:103 set-dynamic-stats cpu 4.0 mem 4096
info 120 node1/crm: adding new service 'vm:103' on node 'node1'
info 120 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
-info 140 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.41 to 0.86)
+info 140 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.00 to 0.61)
info 140 node1/crm: got crm command: migrate vm:101 node2
info 140 node1/crm: crm command 'migrate vm:101 node2' - migrate service 'vm:102' to node 'node2' (service 'vm:102' in positive affinity with service 'vm:101')
info 140 node1/crm: migrate service 'vm:101' to node 'node2'
diff --git a/src/test/test-crs-dynamic-constrained-auto-rebalance3/log.expect b/src/test/test-crs-dynamic-constrained-auto-rebalance3/log.expect
index 35282c7..82b0b13 100644
--- a/src/test/test-crs-dynamic-constrained-auto-rebalance3/log.expect
+++ b/src/test/test-crs-dynamic-constrained-auto-rebalance3/log.expect
@@ -28,7 +28,7 @@ info 24 node3/crm: status change wait_for_quorum => slave
info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
info 41 node1/lrm: starting service vm:101
info 41 node1/lrm: service status vm:101 started
-info 60 node1/crm: auto rebalance - migrate vm:102 to node2 (expected change for imbalance from 1.41 to 0.72)
+info 60 node1/crm: auto rebalance - migrate vm:102 to node2 (expected change for imbalance from 1.00 to 0.51)
info 60 node1/crm: got crm command: migrate vm:102 node2
info 60 node1/crm: migrate service 'vm:102' to node 'node2'
info 60 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node2)
@@ -37,7 +37,7 @@ info 61 node1/lrm: service vm:102 - end migrate to node 'node2'
info 80 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
info 83 node2/lrm: starting service vm:102
info 83 node2/lrm: service status vm:102 started
-info 100 node1/crm: auto rebalance - migrate vm:101 to node3 (expected change for imbalance from 0.72 to 0.27)
+info 100 node1/crm: auto rebalance - migrate vm:101 to node3 (expected change for imbalance from 0.51 to 0.19)
info 100 node1/crm: got crm command: migrate vm:101 node3
info 100 node1/crm: crm command 'migrate vm:101 node3' - migrate service 'vm:103' to node 'node3' (service 'vm:103' in positive affinity with service 'vm:101')
info 100 node1/crm: migrate service 'vm:101' to node 'node3'
diff --git a/src/test/test-crs-dynamic-constrained-auto-rebalance4/log.expect b/src/test/test-crs-dynamic-constrained-auto-rebalance4/log.expect
index cd87f3a..d454328 100644
--- a/src/test/test-crs-dynamic-constrained-auto-rebalance4/log.expect
+++ b/src/test/test-crs-dynamic-constrained-auto-rebalance4/log.expect
@@ -38,7 +38,7 @@ info 25 node3/lrm: got lock 'ha_agent_node3_lock'
info 25 node3/lrm: status change wait_for_agent_lock => active
info 25 node3/lrm: starting service vm:104
info 25 node3/lrm: service status vm:104 started
-info 80 node1/crm: auto rebalance - migrate vm:101 to node3 (expected change for imbalance from 1.04 to 0.72)
+info 80 node1/crm: auto rebalance - migrate vm:101 to node3 (expected change for imbalance from 0.74 to 0.51)
info 80 node1/crm: got crm command: migrate vm:101 node3
info 80 node1/crm: migrate service 'vm:101' to node 'node3'
info 80 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
@@ -47,7 +47,7 @@ info 81 node1/lrm: service vm:101 - end migrate to node 'node3'
info 100 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
info 105 node3/lrm: starting service vm:101
info 105 node3/lrm: service status vm:101 started
-info 160 node1/crm: auto rebalance - migrate vm:104 to node2 (expected change for imbalance from 0.72 to 0.33)
+info 160 node1/crm: auto rebalance - migrate vm:104 to node2 (expected change for imbalance from 0.51 to 0.23)
info 160 node1/crm: got crm command: migrate vm:104 node2
info 160 node1/crm: migrate service 'vm:104' to node 'node2'
info 160 node1/crm: service 'vm:104': state changed from 'started' to 'migrate' (node = node3, target = node2)
diff --git a/src/test/test-crs-static-auto-rebalance2/log.expect b/src/test/test-crs-static-auto-rebalance2/log.expect
index 6a2ab89..e6d7f7b 100644
--- a/src/test/test-crs-static-auto-rebalance2/log.expect
+++ b/src/test/test-crs-static-auto-rebalance2/log.expect
@@ -34,7 +34,7 @@ info 21 node1/lrm: starting service vm:104
info 21 node1/lrm: service status vm:104 started
info 22 node2/crm: status change wait_for_quorum => slave
info 24 node3/crm: status change wait_for_quorum => slave
-info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.41 to 0.94)
+info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected change for imbalance from 1.00 to 0.66)
info 80 node1/crm: got crm command: migrate vm:101 node2
info 80 node1/crm: migrate service 'vm:101' to node 'node2'
info 80 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
@@ -45,7 +45,7 @@ info 83 node2/lrm: status change wait_for_agent_lock => active
info 100 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
info 103 node2/lrm: starting service vm:101
info 103 node2/lrm: service status vm:101 started
-info 160 node1/crm: auto rebalance - migrate vm:102 to node3 (expected change for imbalance from 0.94 to 0.35)
+info 160 node1/crm: auto rebalance - migrate vm:102 to node3 (expected change for imbalance from 0.66 to 0.25)
info 160 node1/crm: got crm command: migrate vm:102 node3
info 160 node1/crm: migrate service 'vm:102' to node 'node3'
info 160 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node3)
diff --git a/src/test/test-crs-static-auto-rebalance3/log.expect b/src/test/test-crs-static-auto-rebalance3/log.expect
index ecf2d18..d3a8080 100644
--- a/src/test/test-crs-static-auto-rebalance3/log.expect
+++ b/src/test/test-crs-static-auto-rebalance3/log.expect
@@ -53,7 +53,7 @@ info 25 node3/lrm: service status vm:107 started
info 120 cmdlist: execute service vm:105 set-static-stats maxcpu 8.0 maxmem 8192
info 120 cmdlist: execute service vm:106 set-static-stats maxcpu 8.0 maxmem 8192
info 120 cmdlist: execute service vm:107 set-static-stats maxcpu 8.0 maxmem 8192
-info 160 node1/crm: auto rebalance - migrate vm:105 to node1 (expected change for imbalance from 0.88 to 0.47)
+info 160 node1/crm: auto rebalance - migrate vm:105 to node1 (expected change for imbalance from 0.62 to 0.33)
info 160 node1/crm: got crm command: migrate vm:105 node1
info 160 node1/crm: migrate service 'vm:105' to node 'node1'
info 160 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node3, target = node1)
@@ -67,7 +67,7 @@ info 220 cmdlist: execute service vm:102 set-static-stats maxcpu 1.0 max
info 220 cmdlist: execute service vm:103 set-static-stats maxcpu 1.0 maxmem 1024
info 220 cmdlist: execute service vm:104 set-static-stats maxcpu 1.0 maxmem 1024
info 220 cmdlist: execute service vm:105 set-static-stats maxcpu 1.0 maxmem 1024
-info 240 node1/crm: auto rebalance - migrate vm:106 to node2 (expected change for imbalance from 0.91 to 0.42)
+info 240 node1/crm: auto rebalance - migrate vm:106 to node2 (expected change for imbalance from 0.64 to 0.30)
info 240 node1/crm: got crm command: migrate vm:106 node2
info 240 node1/crm: migrate service 'vm:106' to node 'node2'
info 240 node1/crm: service 'vm:106': state changed from 'started' to 'migrate' (node = node3, target = node2)
@@ -76,22 +76,4 @@ info 245 node3/lrm: service vm:106 - end migrate to node 'node2'
info 260 node1/crm: service 'vm:106': state changed from 'migrate' to 'started' (node = node2)
info 263 node2/lrm: starting service vm:106
info 263 node2/lrm: service status vm:106 started
-info 320 node1/crm: auto rebalance - migrate vm:103 to node1 (expected change for imbalance from 0.42 to 0.31)
-info 320 node1/crm: got crm command: migrate vm:103 node1
-info 320 node1/crm: migrate service 'vm:103' to node 'node1'
-info 320 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node1)
-info 323 node2/lrm: service vm:103 - start migrate to node 'node1'
-info 323 node2/lrm: service vm:103 - end migrate to node 'node1'
-info 340 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
-info 341 node1/lrm: starting service vm:103
-info 341 node1/lrm: service status vm:103 started
-info 400 node1/crm: auto rebalance - migrate vm:104 to node1 (expected change for imbalance from 0.31 to 0.20)
-info 400 node1/crm: got crm command: migrate vm:104 node1
-info 400 node1/crm: migrate service 'vm:104' to node 'node1'
-info 400 node1/crm: service 'vm:104': state changed from 'started' to 'migrate' (node = node2, target = node1)
-info 403 node2/lrm: service vm:104 - start migrate to node 'node1'
-info 403 node2/lrm: service vm:104 - end migrate to node 'node1'
-info 420 node1/crm: service 'vm:104': state changed from 'migrate' to 'started' (node = node1)
-info 421 node1/lrm: starting service vm:104
-info 421 node1/lrm: service status vm:104 started
info 820 hardware: exit simulation - done
--
2.47.3
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH pve-ha-manager 5/7] manager: add load imbalance to status
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
` (3 preceding siblings ...)
2026-04-27 13:20 ` [PATCH pve-ha-manager 4/7] test: re-adjust logged imbalance values Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-ha-manager 6/7] api: status: " Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-cluster 7/7] datacenter config: add maxima for load scheduler options Dominik Rusovac
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
src/PVE/HA/Manager.pm | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index b69a6bb..ba26fbf 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -285,6 +285,7 @@ sub flush_master_status {
$ms->{node_status} = $ns->{status};
$ms->{service_status} = $ss;
$ms->{timestamp} = $haenv->get_time();
+ $ms->{imbalance} = $self->{online_node_usage}->calculate_node_imbalance();
$haenv->write_manager_status($ms);
}
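[Editorial note: the `calculate_node_imbalance()` value stored here is, per the cover letter, the coefficient of variation (CV) divided by its upper bound of sqrt(n - 1) for n nodes, clamped to [0, 1]. A minimal, self-contained sketch of that normalization follows; the function names are illustrative and not the actual proxmox-resource-scheduling API.]

```rust
/// Coefficient of variation of per-node loads: stddev / mean.
fn coefficient_of_variation(loads: &[f64]) -> f64 {
    let n = loads.len() as f64;
    let mean = loads.iter().sum::<f64>() / n;
    if mean == 0.0 {
        return 0.0;
    }
    let variance = loads.iter().map(|l| (l - mean).powi(2)).sum::<f64>() / n;
    variance.sqrt() / mean
}

/// For n non-negative values the CV is bounded by sqrt(n - 1), so
/// dividing by that bound maps the imbalance into the unit interval.
fn unit_imbalance(loads: &[f64]) -> f64 {
    let n = loads.len();
    if n < 2 {
        return 0.0;
    }
    let upper_bound = ((n - 1) as f64).sqrt();
    (coefficient_of_variation(loads) / upper_bound).clamp(0.0, 1.0)
}

fn main() {
    // Perfectly balanced three-node cluster: no imbalance.
    println!("{:.2}", unit_imbalance(&[4.0, 4.0, 4.0])); // 0.00
    // All load on one of three nodes: CV equals its bound sqrt(2), so 1.00.
    println!("{:.2}", unit_imbalance(&[12.0, 0.0, 0.0])); // 1.00
}
```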
--
2.47.3
* [PATCH pve-ha-manager 6/7] api: status: add load imbalance to status
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
` (4 preceding siblings ...)
2026-04-27 13:20 ` [PATCH pve-ha-manager 5/7] manager: add load imbalance to status Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-cluster 7/7] datacenter config: add maxima for load scheduler options Dominik Rusovac
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
This is a very basic measure that enables users to detect the prevailing
load imbalance in the UI, which currently reveals nothing about it.
In the long run, it is worth considering letting users track how the
load imbalance changes over time (using RRD graphs, for example).
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
src/PVE/API2/HA/Status.pm | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/PVE/API2/HA/Status.pm b/src/PVE/API2/HA/Status.pm
index 4894f3b..acec78e 100644
--- a/src/PVE/API2/HA/Status.pm
+++ b/src/PVE/API2/HA/Status.pm
@@ -199,7 +199,9 @@ __PACKAGE__->register_method({
}
my $datacenter_config = eval { cfs_read_file('datacenter.cfg') } // {};
if (my $crs = $datacenter_config->{crs}) {
- $extra_status .= " - $crs->{ha} load CRS"
+ $extra_status .=
+ " - $crs->{ha} load CRS "
+ . sprintf("(load imbalance: %.2f", 100 * $status->{imbalance}) . "%)"
if $crs->{ha} && $crs->{ha} ne 'basic';
}
my $time_str = localtime($status->{timestamp});
--
2.47.3
* [PATCH pve-cluster 7/7] datacenter config: add maxima for load scheduler options
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
` (5 preceding siblings ...)
2026-04-27 13:20 ` [PATCH pve-ha-manager 6/7] api: status: " Dominik Rusovac
@ 2026-04-27 13:20 ` Dominik Rusovac
6 siblings, 0 replies; 8+ messages in thread
From: Dominik Rusovac @ 2026-04-27 13:20 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
---
src/PVE/DataCenterConfig.pm | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/PVE/DataCenterConfig.pm b/src/PVE/DataCenterConfig.pm
index 6513594..d120017 100644
--- a/src/PVE/DataCenterConfig.pm
+++ b/src/PVE/DataCenterConfig.pm
@@ -44,6 +44,7 @@ EODESC
type => 'number',
optional => 1,
minimum => 0.0,
+ maximum => 1.0,
default => 0.3,
requires => 'ha-auto-rebalance',
description => "The threshold for the cluster node imbalance, which will"
@@ -72,6 +73,7 @@ EODESC
type => 'number',
optional => 1,
minimum => 0.0,
+ maximum => 1.0,
default => 0.1,
requires => 'ha-auto-rebalance',
description => "The minimum relative improvement in cluster node"
--
2.47.3
end of thread, other threads:[~2026-04-27 13:21 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-27 13:20 [RFC PATCH-SERIES cluster/ha-manager/manager/proxmox 0/7] clamp load imbalance to unit interval Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 1/7] resource-scheduling: clamp imbalance value " Dominik Rusovac
2026-04-27 13:20 ` [PATCH proxmox 2/7] resource-scheduling: re-adjust hardcoded imbalance values Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-manager 3/7] ui: from/CRSOptions: add maximum for threshold Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-ha-manager 4/7] test: re-adjust logged imbalance values Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-ha-manager 5/7] manager: add load imbalance to status Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-ha-manager 6/7] api: status: " Dominik Rusovac
2026-04-27 13:20 ` [PATCH pve-cluster 7/7] datacenter config: add maxima for load scheduler options Dominik Rusovac