From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases
Date: Tue, 17 Feb 2026 15:14:28 +0100
Message-ID: <20260217141437.584852-35-d.kral@proxmox.com>
In-Reply-To: <20260217141437.584852-1-d.kral@proxmox.com>

These test cases document the basic behavior of the automatic load
rebalancer with both constant and changing dynamic resource usage.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
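Editorial note, not part of the patch: the "expected target imbalance"
values in the log.expect of test-crs-dynamic-auto-rebalance2 (0.94, then
0.35) are numerically consistent with a population coefficient of
variation (standard deviation divided by mean) over the per-node load.
The sketch below only checks that arithmetic. The function name
`imbalance`, the use of a plain VM count as the load, and the zero-load
convention are assumptions made for illustration; the actual metric is
defined by the resource-scheduling code earlier in this series and may
combine CPU and memory differently.

```python
from statistics import fmean, pstdev


def imbalance(node_loads):
    """Population coefficient of variation (stddev / mean) of node loads.

    ASSUMPTION: inferred from the logged values, not taken from the
    scheduler source.
    """
    mean = fmean(node_loads)
    if mean == 0:
        return 0.0  # assumed convention for a cluster without any load
    return pstdev(node_loads) / mean


# test-crs-dynamic-auto-rebalance2: four identical VMs, counted per node.
states = {
    "initial, all on node1": [4, 0, 0],
    "after vm:101 to node2": [3, 1, 0],
    "after vm:103 to node3": [2, 1, 1],
}
for label, loads in states.items():
    print(f"{label}: {imbalance(loads):.2f}")
# prints 1.41, 0.94 and 0.35

# test-crs-dynamic-auto-rebalance1: a single VM gives the same value on
# whichever node it runs, so no single migration can lower it.
print(round(imbalance([1, 0, 0]), 2), round(imbalance([0, 1, 0]), 2))
# prints 1.41 1.41
```

Under this reading, the first migration leaves the value above the
configured ha-auto-rebalance-threshold of 0.7 and the second brings it
below, which matches the log showing no further migrations after the
second one.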
.../test-crs-dynamic-auto-rebalance0/README | 2 +
.../test-crs-dynamic-auto-rebalance0/cmdlist | 3 +
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 1 +
.../hardware_status | 5 ++
.../log.expect | 11 +++
.../manager_status | 1 +
.../service_config | 1 +
.../static_service_stats | 1 +
.../test-crs-dynamic-auto-rebalance1/README | 6 ++
.../test-crs-dynamic-auto-rebalance1/cmdlist | 3 +
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 3 +
.../hardware_status | 5 ++
.../log.expect | 25 ++++++
.../manager_status | 1 +
.../service_config | 3 +
.../static_service_stats | 3 +
.../test-crs-dynamic-auto-rebalance2/README | 3 +
.../test-crs-dynamic-auto-rebalance2/cmdlist | 3 +
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 6 ++
.../hardware_status | 5 ++
.../log.expect | 59 +++++++++++++
.../manager_status | 1 +
.../service_config | 6 ++
.../static_service_stats | 6 ++
.../test-crs-dynamic-auto-rebalance3/README | 3 +
.../test-crs-dynamic-auto-rebalance3/cmdlist | 24 +++++
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 9 ++
.../hardware_status | 5 ++
.../log.expect | 88 +++++++++++++++++++
.../manager_status | 1 +
.../service_config | 9 ++
.../static_service_stats | 9 ++
36 files changed, 343 insertions(+)
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/README b/src/test/test-crs-dynamic-auto-rebalance0/README
new file mode 100644
index 00000000..54e1d981
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/README
@@ -0,0 +1,2 @@
+Test that the auto rebalance system does not trigger if no HA resources are
+configured in a homogeneous node cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/cmdlist b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/hardware_status b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/log.expect b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
new file mode 100644
index 00000000..27eed635
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
@@ -0,0 +1,11 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/manager_status b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/service_config b/src/test/test-crs-dynamic-auto-rebalance0/service_config
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/service_config
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/README b/src/test/test-crs-dynamic-auto-rebalance1/README
new file mode 100644
index 00000000..c99a7891
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/README
@@ -0,0 +1,6 @@
+Test that the auto rebalance system does not trigger for a single running HA
+resource in a homogeneous cluster.
+
+Even though the single running HA resource will create a high node imbalance,
+which would trigger a rebalancing migration, there is no such migration that can
+improve the imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/cmdlist b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
new file mode 100644
index 00000000..50dd4901
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
@@ -0,0 +1,3 @@
+{
+ "vm:101": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/hardware_status b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/log.expect b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
new file mode 100644
index 00000000..e6ee4402
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
@@ -0,0 +1,25 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'dynamic'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/manager_status b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/service_config b/src/test/test-crs-dynamic-auto-rebalance1/service_config
new file mode 100644
index 00000000..a0ab66d2
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/service_config
@@ -0,0 +1,3 @@
+{
+ "vm:101": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
new file mode 100644
index 00000000..e1bf0839
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
@@ -0,0 +1,3 @@
+{
+ "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/README b/src/test/test-crs-dynamic-auto-rebalance2/README
new file mode 100644
index 00000000..b9acfdb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running,
+homogeneous HA resources on a single node to other cluster nodes to reach a
+minimum cluster node imbalance in the homogeneous cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/cmdlist b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
new file mode 100644
index 00000000..f01fd768
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "cpu": 1.0, "mem": 4294967296 },
+ "vm:102": { "cpu": 1.0, "mem": 4294967296 },
+ "vm:103": { "cpu": 1.0, "mem": 4294967296 },
+ "vm:104": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/hardware_status b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
new file mode 100644
index 00000000..ce8cf0eb
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/log.expect b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
new file mode 100644
index 00000000..a1796c56
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
@@ -0,0 +1,59 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'dynamic'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected target imbalance: 0.94)
+info 80 node1/crm: got crm command: migrate vm:101 node2
+info 80 node1/crm: migrate service 'vm:101' to node 'node2'
+info 80 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 81 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 81 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 83 node2/lrm: got lock 'ha_agent_node2_lock'
+info 83 node2/lrm: status change wait_for_agent_lock => active
+info 100 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 103 node2/lrm: starting service vm:101
+info 103 node2/lrm: service status vm:101 started
+info 160 node1/crm: auto rebalance - migrate vm:103 to node3 (expected target imbalance: 0.35)
+info 160 node1/crm: got crm command: migrate vm:103 node3
+info 160 node1/crm: migrate service 'vm:103' to node 'node3'
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 161 node1/lrm: service vm:103 - start migrate to node 'node3'
+info 161 node1/lrm: service vm:103 - end migrate to node 'node3'
+info 165 node3/lrm: got lock 'ha_agent_node3_lock'
+info 165 node3/lrm: status change wait_for_agent_lock => active
+info 180 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info 185 node3/lrm: starting service vm:103
+info 185 node3/lrm: service status vm:103 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/manager_status b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/service_config b/src/test/test-crs-dynamic-auto-rebalance2/service_config
new file mode 100644
index 00000000..b5960cb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/service_config
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
new file mode 100644
index 00000000..6cf8c106
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 },
+ "vm:102": { "maxcpu": 2.0, "maxmem": 8589934592 },
+ "vm:103": { "maxcpu": 2.0, "maxmem": 8589934592 },
+ "vm:104": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/README b/src/test/test-crs-dynamic-auto-rebalance3/README
new file mode 100644
index 00000000..44791d6f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running HA
+resources with different usages in a homogeneous cluster with changing usages
+over time to reach minimum cluster node imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/cmdlist b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
new file mode 100644
index 00000000..42fb259f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
@@ -0,0 +1,24 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [
+ "service vm:105 set-dynamic-stats cpu 7.8",
+ "service vm:105 set-dynamic-stats mem 7912",
+ "service vm:106 set-dynamic-stats cpu 5.7",
+ "service vm:106 set-dynamic-stats mem 8192",
+ "service vm:107 set-dynamic-stats cpu 6.0",
+ "service vm:107 set-dynamic-stats mem 8011"
+ ],
+ [
+ "service vm:101 set-dynamic-stats mem 1011",
+ "service vm:103 set-dynamic-stats cpu 3.9",
+ "service vm:103 set-dynamic-stats mem 6517",
+ "service vm:104 set-dynamic-stats cpu 6.7",
+ "service vm:104 set-dynamic-stats mem 8001",
+ "service vm:105 set-dynamic-stats cpu 1.8",
+ "service vm:105 set-dynamic-stats mem 1201",
+ "service vm:106 set-dynamic-stats cpu 2.1",
+ "service vm:106 set-dynamic-stats mem 1211",
+ "service vm:107 set-dynamic-stats cpu 0.9",
+ "service vm:107 set-dynamic-stats mem 1191"
+ ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
new file mode 100644
index 00000000..77e72c16
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
@@ -0,0 +1,9 @@
+{
+ "vm:101": { "cpu": 0.9, "mem": 5444206592 },
+ "vm:102": { "cpu": 1.2, "mem": 2621440000 },
+ "vm:103": { "cpu": 0.8, "mem": 5444206592 },
+ "vm:104": { "cpu": 0.9, "mem": 2621440000 },
+ "vm:105": { "cpu": 3.0, "mem": 5444206592 },
+ "vm:106": { "cpu": 2.9, "mem": 2621440000 },
+ "vm:107": { "cpu": 2.1, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/hardware_status b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
new file mode 100644
index 00000000..8f1e695c
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/log.expect b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
new file mode 100644
index 00000000..1832c44f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
@@ -0,0 +1,88 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'dynamic'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node2'
+info 20 node1/crm: adding new service 'vm:104' on node 'node2'
+info 20 node1/crm: adding new service 'vm:105' on node 'node3'
+info 20 node1/crm: adding new service 'vm:106' on node 'node3'
+info 20 node1/crm: adding new service 'vm:107' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:103
+info 23 node2/lrm: service status vm:103 started
+info 23 node2/lrm: starting service vm:104
+info 23 node2/lrm: service status vm:104 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:105
+info 25 node3/lrm: service status vm:105 started
+info 25 node3/lrm: starting service vm:106
+info 25 node3/lrm: service status vm:106 started
+info 25 node3/lrm: starting service vm:107
+info 25 node3/lrm: service status vm:107 started
+info 120 cmdlist: execute service vm:105 set-dynamic-stats cpu 7.8
+info 120 cmdlist: execute service vm:105 set-dynamic-stats mem 7912
+info 120 cmdlist: execute service vm:106 set-dynamic-stats cpu 5.7
+info 120 cmdlist: execute service vm:106 set-dynamic-stats mem 8192
+info 120 cmdlist: execute service vm:107 set-dynamic-stats cpu 6.0
+info 120 cmdlist: execute service vm:107 set-dynamic-stats mem 8011
+info 160 node1/crm: auto rebalance - migrate vm:105 to node2 (expected target imbalance: 0.42)
+info 160 node1/crm: got crm command: migrate vm:105 node2
+info 160 node1/crm: migrate service 'vm:105' to node 'node2'
+info 160 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 165 node3/lrm: service vm:105 - start migrate to node 'node2'
+info 165 node3/lrm: service vm:105 - end migrate to node 'node2'
+info 180 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node2)
+info 183 node2/lrm: starting service vm:105
+info 183 node2/lrm: service status vm:105 started
+info 220 cmdlist: execute service vm:101 set-dynamic-stats mem 1011
+info 220 cmdlist: execute service vm:103 set-dynamic-stats cpu 3.9
+info 220 cmdlist: execute service vm:103 set-dynamic-stats mem 6517
+info 220 cmdlist: execute service vm:104 set-dynamic-stats cpu 6.7
+info 220 cmdlist: execute service vm:104 set-dynamic-stats mem 8001
+info 220 cmdlist: execute service vm:105 set-dynamic-stats cpu 1.8
+info 220 cmdlist: execute service vm:105 set-dynamic-stats mem 1201
+info 220 cmdlist: execute service vm:106 set-dynamic-stats cpu 2.1
+info 220 cmdlist: execute service vm:106 set-dynamic-stats mem 1211
+info 220 cmdlist: execute service vm:107 set-dynamic-stats cpu 0.9
+info 220 cmdlist: execute service vm:107 set-dynamic-stats mem 1191
+info 260 node1/crm: auto rebalance - migrate vm:103 to node1 (expected target imbalance: 0.4)
+info 260 node1/crm: got crm command: migrate vm:103 node1
+info 260 node1/crm: migrate service 'vm:103' to node 'node1'
+info 260 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 263 node2/lrm: service vm:103 - start migrate to node 'node1'
+info 263 node2/lrm: service vm:103 - end migrate to node 'node1'
+info 280 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
+info 281 node1/lrm: starting service vm:103
+info 281 node1/lrm: service status vm:103 started
+info 820 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/manager_status b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/service_config b/src/test/test-crs-dynamic-auto-rebalance3/service_config
new file mode 100644
index 00000000..a44ddd0e
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/service_config
@@ -0,0 +1,9 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node2", "state": "started" },
+ "vm:104": { "node": "node2", "state": "started" },
+ "vm:105": { "node": "node3", "state": "started" },
+ "vm:106": { "node": "node3", "state": "started" },
+ "vm:107": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
new file mode 100644
index 00000000..7a52ea73
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
@@ -0,0 +1,9 @@
+{
+ "vm:101": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:102": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:103": { "maxcpu": 4.0, "maxmem": 8589934592 },
+ "vm:104": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:105": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:106": { "maxcpu": 6.0, "maxmem": 8589934592 },
+ "vm:107": { "maxcpu": 6.0, "maxmem": 8589934592 }
+}
--
2.47.3
2026-02-17 14:14 ` [RFC ha-manager 12/21] env: pve2: implement dynamic node and service stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 13/21] sim: hardware: pass correct types for static stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 14/21] sim: hardware: factor out static stats' default values Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 15/21] sim: hardware: rewrite set-static-stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 16/21] sim: hardware: add set-dynamic-stats for services Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 17/21] usage: add dynamic usage scheduler Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 18/21] manager: rename execute_migration to queue_resource_motion Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 19/21] manager: update_crs_scheduler_mode: factor out crs config Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 20/21] implement automatic rebalancing Daniel Kral
2026-02-17 14:14 ` Daniel Kral [this message]
2026-02-17 14:14 ` [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option Daniel Kral
2026-02-18 11:10 ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC manager 2/2] ui: dc/options: add auto rebalancing options Daniel Kral