From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases
Date: Tue, 17 Feb 2026 15:14:28 +0100
Message-ID: <20260217141437.584852-35-d.kral@proxmox.com>
In-Reply-To: <20260217141437.584852-1-d.kral@proxmox.com>

These test cases document the basic behavior of the automatic load
rebalancer with both constant and changing dynamic resource usage.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
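Reviewer note (not part of the commit): the "expected target imbalance"
values in the log.expect files of this patch are consistent with a
coefficient of variation (population standard deviation divided by the mean)
over the per-node loads. That reading is an assumption inferred from the
logged numbers; the actual metric is defined in the resource-scheduling
patches of this series. The sketch below uses hypothetical helper names
(`imbalance`, `best_migration`) and plain CPU usage as the node load.

```python
from statistics import mean, pstdev


def imbalance(loads):
    """Coefficient of variation of per-node loads (assumed metric)."""
    avg = mean(loads)
    return pstdev(loads) / avg if avg > 0 else 0.0


def node_loads(placement, usage, nodes):
    """Sum the usage of all services placed on each node."""
    return [
        sum(usage[sid] for sid, node in placement.items() if node == name)
        for name in nodes
    ]


def best_migration(placement, usage, nodes):
    """Return (score, sid, target) for the single migration that lowers the
    imbalance the most, or None if no migration improves it.

    Ties are broken by iteration order here, which is an artifact of this
    sketch and not necessarily what the real scheduler does.
    """
    current = imbalance(node_loads(placement, usage, nodes))
    best = None
    for sid, source in sorted(placement.items()):
        for target in nodes:
            if target == source:
                continue
            candidate = dict(placement)
            candidate[sid] = target
            score = imbalance(node_loads(candidate, usage, nodes))
            if score < current and (best is None or score < best[0]):
                best = (score, sid, target)
    return best


NODES = ["node1", "node2", "node3"]

# test-crs-dynamic-auto-rebalance1: a single resource cannot be improved.
single = best_migration({"vm:101": "node1"}, {"vm:101": 1.0}, NODES)

# test-crs-dynamic-auto-rebalance2: four equal resources on node1.
usage = {f"vm:{i}": 1.0 for i in (101, 102, 103, 104)}
placement = {sid: "node1" for sid in usage}
first = best_migration(placement, usage, NODES)

# Apply the migration recorded in log.expect (vm:101 to node2) and search again.
placement["vm:101"] = "node2"
second = best_migration(placement, usage, NODES)
```

Under this assumption the starting imbalance in rebalance2 is about 1.41,
the first step lands at about 0.94 (still above the configured threshold of
0.7) and the second at about 0.35, after which no further migration is
logged. For the single resource in rebalance1, every possible move leaves
the imbalance unchanged, so nothing is migrated.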
.../test-crs-dynamic-auto-rebalance0/README | 2 +
.../test-crs-dynamic-auto-rebalance0/cmdlist | 3 +
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 1 +
.../hardware_status | 5 ++
.../log.expect | 11 +++
.../manager_status | 1 +
.../service_config | 1 +
.../static_service_stats | 1 +
.../test-crs-dynamic-auto-rebalance1/README | 6 ++
.../test-crs-dynamic-auto-rebalance1/cmdlist | 3 +
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 3 +
.../hardware_status | 5 ++
.../log.expect | 25 ++++++
.../manager_status | 1 +
.../service_config | 3 +
.../static_service_stats | 3 +
.../test-crs-dynamic-auto-rebalance2/README | 3 +
.../test-crs-dynamic-auto-rebalance2/cmdlist | 3 +
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 6 ++
.../hardware_status | 5 ++
.../log.expect | 59 +++++++++++++
.../manager_status | 1 +
.../service_config | 6 ++
.../static_service_stats | 6 ++
.../test-crs-dynamic-auto-rebalance3/README | 3 +
.../test-crs-dynamic-auto-rebalance3/cmdlist | 24 +++++
.../datacenter.cfg | 8 ++
.../dynamic_service_stats | 9 ++
.../hardware_status | 5 ++
.../log.expect | 88 +++++++++++++++++++
.../manager_status | 1 +
.../service_config | 9 ++
.../static_service_stats | 9 ++
36 files changed, 343 insertions(+)
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/README
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/cmdlist
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/hardware_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/log.expect
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/manager_status
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/service_config
create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/README b/src/test/test-crs-dynamic-auto-rebalance0/README
new file mode 100644
index 00000000..54e1d981
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/README
@@ -0,0 +1,2 @@
+Test that the auto rebalance system does not trigger if no HA resources are
+configured in a homogeneous node cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/cmdlist b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/hardware_status b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/log.expect b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
new file mode 100644
index 00000000..27eed635
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
@@ -0,0 +1,11 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/manager_status b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/service_config b/src/test/test-crs-dynamic-auto-rebalance0/service_config
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/service_config
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/README b/src/test/test-crs-dynamic-auto-rebalance1/README
new file mode 100644
index 00000000..c99a7891
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/README
@@ -0,0 +1,6 @@
+Test that the auto rebalance system does not trigger for a single running HA
+resource in a homogeneous cluster.
+
+Even though the single running HA resource will create a high node imbalance,
+which would trigger a rebalancing migration, there is no such migration that can
+improve the imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/cmdlist b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
new file mode 100644
index 00000000..50dd4901
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
@@ -0,0 +1,3 @@
+{
+ "vm:101": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/hardware_status b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/log.expect b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
new file mode 100644
index 00000000..e6ee4402
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
@@ -0,0 +1,25 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'dynamic'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/manager_status b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/service_config b/src/test/test-crs-dynamic-auto-rebalance1/service_config
new file mode 100644
index 00000000..a0ab66d2
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/service_config
@@ -0,0 +1,3 @@
+{
+ "vm:101": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
new file mode 100644
index 00000000..e1bf0839
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
@@ -0,0 +1,3 @@
+{
+ "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/README b/src/test/test-crs-dynamic-auto-rebalance2/README
new file mode 100644
index 00000000..b9acfdb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running,
+homogeneous HA resources on a single node to other cluster nodes to reach a
+minimum cluster node imbalance in the homogeneous cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/cmdlist b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
new file mode 100644
index 00000000..f01fd768
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "cpu": 1.0, "mem": 4294967296 },
+ "vm:102": { "cpu": 1.0, "mem": 4294967296 },
+ "vm:103": { "cpu": 1.0, "mem": 4294967296 },
+ "vm:104": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/hardware_status b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
new file mode 100644
index 00000000..ce8cf0eb
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/log.expect b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
new file mode 100644
index 00000000..a1796c56
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
@@ -0,0 +1,59 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'dynamic'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 21 node1/lrm: starting service vm:103
+info 21 node1/lrm: service status vm:103 started
+info 21 node1/lrm: starting service vm:104
+info 21 node1/lrm: service status vm:104 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 80 node1/crm: auto rebalance - migrate vm:101 to node2 (expected target imbalance: 0.94)
+info 80 node1/crm: got crm command: migrate vm:101 node2
+info 80 node1/crm: migrate service 'vm:101' to node 'node2'
+info 80 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 81 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 81 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 83 node2/lrm: got lock 'ha_agent_node2_lock'
+info 83 node2/lrm: status change wait_for_agent_lock => active
+info 100 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 103 node2/lrm: starting service vm:101
+info 103 node2/lrm: service status vm:101 started
+info 160 node1/crm: auto rebalance - migrate vm:103 to node3 (expected target imbalance: 0.35)
+info 160 node1/crm: got crm command: migrate vm:103 node3
+info 160 node1/crm: migrate service 'vm:103' to node 'node3'
+info 160 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 161 node1/lrm: service vm:103 - start migrate to node 'node3'
+info 161 node1/lrm: service vm:103 - end migrate to node 'node3'
+info 165 node3/lrm: got lock 'ha_agent_node3_lock'
+info 165 node3/lrm: status change wait_for_agent_lock => active
+info 180 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node3)
+info 185 node3/lrm: starting service vm:103
+info 185 node3/lrm: service status vm:103 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/manager_status b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/service_config b/src/test/test-crs-dynamic-auto-rebalance2/service_config
new file mode 100644
index 00000000..b5960cb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/service_config
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
new file mode 100644
index 00000000..6cf8c106
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
@@ -0,0 +1,6 @@
+{
+ "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 },
+ "vm:102": { "maxcpu": 2.0, "maxmem": 8589934592 },
+ "vm:103": { "maxcpu": 2.0, "maxmem": 8589934592 },
+ "vm:104": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/README b/src/test/test-crs-dynamic-auto-rebalance3/README
new file mode 100644
index 00000000..44791d6f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running HA
+resources with different usages in a homogeneous cluster with changing usages
+over time to reach minimum cluster node imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/cmdlist b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
new file mode 100644
index 00000000..42fb259f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
@@ -0,0 +1,24 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on" ],
+ [
+ "service vm:105 set-dynamic-stats cpu 7.8",
+ "service vm:105 set-dynamic-stats mem 7912",
+ "service vm:106 set-dynamic-stats cpu 5.7",
+ "service vm:106 set-dynamic-stats mem 8192",
+ "service vm:107 set-dynamic-stats cpu 6.0",
+ "service vm:107 set-dynamic-stats mem 8011"
+ ],
+ [
+ "service vm:101 set-dynamic-stats mem 1011",
+ "service vm:103 set-dynamic-stats cpu 3.9",
+ "service vm:103 set-dynamic-stats mem 6517",
+ "service vm:104 set-dynamic-stats cpu 6.7",
+ "service vm:104 set-dynamic-stats mem 8001",
+ "service vm:105 set-dynamic-stats cpu 1.8",
+ "service vm:105 set-dynamic-stats mem 1201",
+ "service vm:106 set-dynamic-stats cpu 2.1",
+ "service vm:106 set-dynamic-stats mem 1211",
+ "service vm:107 set-dynamic-stats cpu 0.9",
+ "service vm:107 set-dynamic-stats mem 1191"
+ ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+ "crs": {
+ "ha": "dynamic",
+ "ha-auto-rebalance": 1,
+ "ha-auto-rebalance-threshold": 0.7
+ }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
new file mode 100644
index 00000000..77e72c16
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
@@ -0,0 +1,9 @@
+{
+ "vm:101": { "cpu": 0.9, "mem": 5444206592 },
+ "vm:102": { "cpu": 1.2, "mem": 2621440000 },
+ "vm:103": { "cpu": 0.8, "mem": 5444206592 },
+ "vm:104": { "cpu": 0.9, "mem": 2621440000 },
+ "vm:105": { "cpu": 3.0, "mem": 5444206592 },
+ "vm:106": { "cpu": 2.9, "mem": 2621440000 },
+ "vm:107": { "cpu": 2.1, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/hardware_status b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
new file mode 100644
index 00000000..8f1e695c
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+ "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+ "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/log.expect b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
new file mode 100644
index 00000000..1832c44f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
@@ -0,0 +1,88 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: using scheduler mode 'dynamic'
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node2'
+info 20 node1/crm: adding new service 'vm:104' on node 'node2'
+info 20 node1/crm: adding new service 'vm:105' on node 'node3'
+info 20 node1/crm: adding new service 'vm:106' on node 'node3'
+info 20 node1/crm: adding new service 'vm:107' on node 'node3'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node2)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'started' (node = node3)
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: starting service vm:101
+info 21 node1/lrm: service status vm:101 started
+info 21 node1/lrm: starting service vm:102
+info 21 node1/lrm: service status vm:102 started
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:103
+info 23 node2/lrm: service status vm:103 started
+info 23 node2/lrm: starting service vm:104
+info 23 node2/lrm: service status vm:104 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:105
+info 25 node3/lrm: service status vm:105 started
+info 25 node3/lrm: starting service vm:106
+info 25 node3/lrm: service status vm:106 started
+info 25 node3/lrm: starting service vm:107
+info 25 node3/lrm: service status vm:107 started
+info 120 cmdlist: execute service vm:105 set-dynamic-stats cpu 7.8
+info 120 cmdlist: execute service vm:105 set-dynamic-stats mem 7912
+info 120 cmdlist: execute service vm:106 set-dynamic-stats cpu 5.7
+info 120 cmdlist: execute service vm:106 set-dynamic-stats mem 8192
+info 120 cmdlist: execute service vm:107 set-dynamic-stats cpu 6.0
+info 120 cmdlist: execute service vm:107 set-dynamic-stats mem 8011
+info 160 node1/crm: auto rebalance - migrate vm:105 to node2 (expected target imbalance: 0.42)
+info 160 node1/crm: got crm command: migrate vm:105 node2
+info 160 node1/crm: migrate service 'vm:105' to node 'node2'
+info 160 node1/crm: service 'vm:105': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 165 node3/lrm: service vm:105 - start migrate to node 'node2'
+info 165 node3/lrm: service vm:105 - end migrate to node 'node2'
+info 180 node1/crm: service 'vm:105': state changed from 'migrate' to 'started' (node = node2)
+info 183 node2/lrm: starting service vm:105
+info 183 node2/lrm: service status vm:105 started
+info 220 cmdlist: execute service vm:101 set-dynamic-stats mem 1011
+info 220 cmdlist: execute service vm:103 set-dynamic-stats cpu 3.9
+info 220 cmdlist: execute service vm:103 set-dynamic-stats mem 6517
+info 220 cmdlist: execute service vm:104 set-dynamic-stats cpu 6.7
+info 220 cmdlist: execute service vm:104 set-dynamic-stats mem 8001
+info 220 cmdlist: execute service vm:105 set-dynamic-stats cpu 1.8
+info 220 cmdlist: execute service vm:105 set-dynamic-stats mem 1201
+info 220 cmdlist: execute service vm:106 set-dynamic-stats cpu 2.1
+info 220 cmdlist: execute service vm:106 set-dynamic-stats mem 1211
+info 220 cmdlist: execute service vm:107 set-dynamic-stats cpu 0.9
+info 220 cmdlist: execute service vm:107 set-dynamic-stats mem 1191
+info 260 node1/crm: auto rebalance - migrate vm:103 to node1 (expected target imbalance: 0.4)
+info 260 node1/crm: got crm command: migrate vm:103 node1
+info 260 node1/crm: migrate service 'vm:103' to node 'node1'
+info 260 node1/crm: service 'vm:103': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 263 node2/lrm: service vm:103 - start migrate to node 'node1'
+info 263 node2/lrm: service vm:103 - end migrate to node 'node1'
+info 280 node1/crm: service 'vm:103': state changed from 'migrate' to 'started' (node = node1)
+info 281 node1/lrm: starting service vm:103
+info 281 node1/lrm: service status vm:103 started
+info 820 hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/manager_status b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/service_config b/src/test/test-crs-dynamic-auto-rebalance3/service_config
new file mode 100644
index 00000000..a44ddd0e
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/service_config
@@ -0,0 +1,9 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node2", "state": "started" },
+ "vm:104": { "node": "node2", "state": "started" },
+ "vm:105": { "node": "node3", "state": "started" },
+ "vm:106": { "node": "node3", "state": "started" },
+ "vm:107": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
new file mode 100644
index 00000000..7a52ea73
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
@@ -0,0 +1,9 @@
+{
+ "vm:101": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:102": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:103": { "maxcpu": 4.0, "maxmem": 8589934592 },
+ "vm:104": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:105": { "maxcpu": 8.0, "maxmem": 8589934592 },
+ "vm:106": { "maxcpu": 6.0, "maxmem": 8589934592 },
+ "vm:107": { "maxcpu": 6.0, "maxmem": 8589934592 }
+}
--
2.47.3
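
Reviewer note: the fixtures above come in triples (service_config,
static_service_stats, dynamic_service_stats) keyed by the same service IDs.
A small consistency check along the following lines may help when adding
further test cases. The function name `check_fixture` is hypothetical, and
the rule that dynamic usage should not exceed the static maximum is an
assumption of this sketch; it is not something the test harness is known to
enforce.

```python
def check_fixture(service_config, static_stats, dynamic_stats):
    """Return a list of human-readable problems found in one test fixture."""
    problems = []
    for sid in sorted(service_config):
        static = static_stats.get(sid)
        dynamic = dynamic_stats.get(sid)
        if static is None:
            problems.append(f"{sid}: missing static stats")
            continue
        if dynamic is None:
            problems.append(f"{sid}: missing dynamic stats")
            continue
        # Assumed rule: current usage stays within the configured maximum.
        if dynamic["cpu"] > static["maxcpu"]:
            problems.append(f"{sid}: cpu usage exceeds maxcpu")
        if dynamic["mem"] > static["maxmem"]:
            problems.append(f"{sid}: mem usage exceeds maxmem")
    return problems


# Values copied from test-crs-dynamic-auto-rebalance1.
service_config = {"vm:101": {"node": "node1", "state": "started"}}
static_stats = {"vm:101": {"maxcpu": 2.0, "maxmem": 8589934592}}
dynamic_stats = {"vm:101": {"cpu": 1.0, "mem": 4294967296}}

problems = check_fixture(service_config, static_stats, dynamic_stats)
```

In practice the three dictionaries would be loaded from the fixture files
with `json.load`; they are inlined here to keep the sketch self-contained.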