From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases
Date: Tue, 17 Feb 2026 15:14:28 +0100
Message-ID: <20260217141437.584852-35-d.kral@proxmox.com>
In-Reply-To: <20260217141437.584852-1-d.kral@proxmox.com>

These test cases document the basic behavior of the automatic load
rebalancer with both constant and changing dynamic resource usage.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
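The "expected target imbalance" values in
test-crs-dynamic-auto-rebalance2/log.expect (0.94, then 0.35) can be
sanity-checked with a small sketch. It assumes the imbalance is the population
coefficient of variation (standard deviation divided by mean) of the per-node
loads, and that a migration is only considered while the imbalance exceeds
ha-auto-rebalance-threshold. Both are assumptions inferred from the logged
numbers, not taken from the scheduler code. Because the nodes and VMs in that
test are identical, the VM count per node stands in for the node load; the
function name and this load model are hypothetical.

```python
from math import sqrt

# ha-auto-rebalance-threshold from datacenter.cfg in these test cases
THRESHOLD = 0.7


def imbalance(loads):
    """Population coefficient of variation of per-node loads.

    Assumed metric: it reproduces the values logged in rebalance2, but the
    real scheduler may compute node load and imbalance differently.
    """
    mean = sum(loads) / len(loads)
    if mean == 0:
        return 0.0  # no load anywhere, nothing to balance
    variance = sum((x - mean) ** 2 for x in loads) / len(loads)
    return sqrt(variance) / mean


# rebalance2: four identical VMs start on node1 of three identical nodes.
placements = [
    ("initial placement", [4, 0, 0]),
    ("after vm:101 -> node2", [3, 1, 0]),
    ("after vm:103 -> node3", [2, 1, 1]),
]

for label, loads in placements:
    value = imbalance(loads)
    state = "above" if value > THRESHOLD else "below"
    print(f"{label}: imbalance {value:.2f} ({state} threshold {THRESHOLD})")
```

Under the same assumption, the single VM in rebalance1 gives the same imbalance
on whichever node it runs, which matches the README's statement that no
migration can improve the imbalance. The values in rebalance3 depend on how CPU
and memory usage are combined into one node load, which this sketch does not
model.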
 .../test-crs-dynamic-auto-rebalance0/README   |  2 +
 .../test-crs-dynamic-auto-rebalance0/cmdlist  |  3 +
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  1 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 11 +++
 .../manager_status                            |  1 +
 .../service_config                            |  1 +
 .../static_service_stats                      |  1 +
 .../test-crs-dynamic-auto-rebalance1/README   |  6 ++
 .../test-crs-dynamic-auto-rebalance1/cmdlist  |  3 +
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  3 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 25 ++++++
 .../manager_status                            |  1 +
 .../service_config                            |  3 +
 .../static_service_stats                      |  3 +
 .../test-crs-dynamic-auto-rebalance2/README   |  3 +
 .../test-crs-dynamic-auto-rebalance2/cmdlist  |  3 +
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  6 ++
 .../hardware_status                           |  5 ++
 .../log.expect                                | 59 +++++++++++++
 .../manager_status                            |  1 +
 .../service_config                            |  6 ++
 .../static_service_stats                      |  6 ++
 .../test-crs-dynamic-auto-rebalance3/README   |  3 +
 .../test-crs-dynamic-auto-rebalance3/cmdlist  | 24 +++++
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  9 ++
 .../hardware_status                           |  5 ++
 .../log.expect                                | 88 +++++++++++++++++++
 .../manager_status                            |  1 +
 .../service_config                            |  9 ++
 .../static_service_stats                      |  9 ++
 36 files changed, 343 insertions(+)
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/static_service_stats

diff --git a/src/test/test-crs-dynamic-auto-rebalance0/README b/src/test/test-crs-dynamic-auto-rebalance0/README
new file mode 100644
index 00000000..54e1d981
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/README
@@ -0,0 +1,2 @@
+Test that the auto rebalance system does not trigger if no HA resources are
+configured in a homogeneous node cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/cmdlist b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/hardware_status b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/log.expect b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
new file mode 100644
index 00000000..27eed635
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
@@ -0,0 +1,11 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/manager_status b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/service_config b/src/test/test-crs-dynamic-auto-rebalance0/service_config
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/service_config
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/README b/src/test/test-crs-dynamic-auto-rebalance1/README
new file mode 100644
index 00000000..c99a7891
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/README
@@ -0,0 +1,6 @@
+Test that the auto rebalance system does not trigger for a single running HA
+resource in a homogeneous cluster.
+
+Even though the single running HA resource will create a high node imbalance,
+which would trigger a rebalancing migration, there is no such migration that can
+improve the imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/cmdlist b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
new file mode 100644
index 00000000..50dd4901
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
@@ -0,0 +1,3 @@
+{
+    "vm:101": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/hardware_status b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/log.expect b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
new file mode 100644
index 00000000..e6ee4402
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
@@ -0,0 +1,25 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'dynamic'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/manager_status b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/service_config b/src/test/test-crs-dynamic-auto-rebalance1/service_config
new file mode 100644
index 00000000..a0ab66d2
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/service_config
@@ -0,0 +1,3 @@
+{
+    "vm:101": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
new file mode 100644
index 00000000..e1bf0839
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
@@ -0,0 +1,3 @@
+{
+    "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/README b/src/test/test-crs-dynamic-auto-rebalance2/README
new file mode 100644
index 00000000..b9acfdb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running,
+homogeneous HA resources on a single node to other cluster nodes to reach a
+minimum cluster node imbalance in the homogeneous cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/cmdlist b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
new file mode 100644
index 00000000..f01fd768
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "cpu": 1.0, "mem": 4294967296 },
+    "vm:102": { "cpu": 1.0, "mem": 4294967296 },
+    "vm:103": { "cpu": 1.0, "mem": 4294967296 },
+    "vm:104": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/hardware_status b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
new file mode 100644
index 00000000..ce8cf0eb
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/log.expect b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
new file mode 100644
index 00000000..a1796c56
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
@@ -0,0 +1,59 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'dynamic'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     21    node1/lrm: starting service vm:102
+info     21    node1/lrm: service status vm:102 started
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info     80    node1/crm: auto rebalance - migrate vm:101 to node2 (expected target imbalance: 0.94)
+info     80    node1/crm: got crm command: migrate vm:101 node2
+info     80    node1/crm: migrate service 'vm:101' to node 'node2'
+info     80    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node1, target = node2)
+info     81    node1/lrm: service vm:101 - start migrate to node 'node2'
+info     81    node1/lrm: service vm:101 - end migrate to node 'node2'
+info     83    node2/lrm: got lock 'ha_agent_node2_lock'
+info     83    node2/lrm: status change wait_for_agent_lock => active
+info    100    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node2)
+info    103    node2/lrm: starting service vm:101
+info    103    node2/lrm: service status vm:101 started
+info    160    node1/crm: auto rebalance - migrate vm:103 to node3 (expected target imbalance: 0.35)
+info    160    node1/crm: got crm command: migrate vm:103 node3
+info    160    node1/crm: migrate service 'vm:103' to node 'node3'
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node1, target = node3)
+info    161    node1/lrm: service vm:103 - start migrate to node 'node3'
+info    161    node1/lrm: service vm:103 - end migrate to node 'node3'
+info    165    node3/lrm: got lock 'ha_agent_node3_lock'
+info    165    node3/lrm: status change wait_for_agent_lock => active
+info    180    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node3)
+info    185    node3/lrm: starting service vm:103
+info    185    node3/lrm: service status vm:103 started
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/manager_status b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/service_config b/src/test/test-crs-dynamic-auto-rebalance2/service_config
new file mode 100644
index 00000000..b5960cb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
new file mode 100644
index 00000000..6cf8c106
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 },
+    "vm:102": { "maxcpu": 2.0, "maxmem": 8589934592 },
+    "vm:103": { "maxcpu": 2.0, "maxmem": 8589934592 },
+    "vm:104": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/README b/src/test/test-crs-dynamic-auto-rebalance3/README
new file mode 100644
index 00000000..44791d6f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running HA
+resources with different usages in a homogeneous cluster with changing usages
+over time to reach minimum cluster node imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/cmdlist b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
new file mode 100644
index 00000000..42fb259f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
@@ -0,0 +1,24 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [
+        "service vm:105 set-dynamic-stats cpu 7.8",
+        "service vm:105 set-dynamic-stats mem 7912",
+        "service vm:106 set-dynamic-stats cpu 5.7",
+        "service vm:106 set-dynamic-stats mem 8192",
+        "service vm:107 set-dynamic-stats cpu 6.0",
+        "service vm:107 set-dynamic-stats mem 8011"
+    ],
+    [
+        "service vm:101 set-dynamic-stats mem 1011",
+        "service vm:103 set-dynamic-stats cpu 3.9",
+        "service vm:103 set-dynamic-stats mem 6517",
+        "service vm:104 set-dynamic-stats cpu 6.7",
+        "service vm:104 set-dynamic-stats mem 8001",
+        "service vm:105 set-dynamic-stats cpu 1.8",
+        "service vm:105 set-dynamic-stats mem 1201",
+        "service vm:106 set-dynamic-stats cpu 2.1",
+        "service vm:106 set-dynamic-stats mem 1211",
+        "service vm:107 set-dynamic-stats cpu 0.9",
+        "service vm:107 set-dynamic-stats mem 1191"
+    ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
new file mode 100644
index 00000000..77e72c16
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
@@ -0,0 +1,9 @@
+{
+    "vm:101": { "cpu": 0.9, "mem": 5444206592 },
+    "vm:102": { "cpu": 1.2, "mem": 2621440000 },
+    "vm:103": { "cpu": 0.8, "mem": 5444206592 },
+    "vm:104": { "cpu": 0.9, "mem": 2621440000 },
+    "vm:105": { "cpu": 3.0, "mem": 5444206592 },
+    "vm:106": { "cpu": 2.9, "mem": 2621440000 },
+    "vm:107": { "cpu": 2.1, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/hardware_status b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
new file mode 100644
index 00000000..8f1e695c
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/log.expect b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
new file mode 100644
index 00000000..1832c44f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
@@ -0,0 +1,88 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'dynamic'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node2'
+info     20    node1/crm: adding new service 'vm:104' on node 'node2'
+info     20    node1/crm: adding new service 'vm:105' on node 'node3'
+info     20    node1/crm: adding new service 'vm:106' on node 'node3'
+info     20    node1/crm: adding new service 'vm:107' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:107': state changed from 'request_start' to 'started'  (node = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     21    node1/lrm: starting service vm:102
+info     21    node1/lrm: service status vm:102 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:103
+info     23    node2/lrm: service status vm:103 started
+info     23    node2/lrm: starting service vm:104
+info     23    node2/lrm: service status vm:104 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:105
+info     25    node3/lrm: service status vm:105 started
+info     25    node3/lrm: starting service vm:106
+info     25    node3/lrm: service status vm:106 started
+info     25    node3/lrm: starting service vm:107
+info     25    node3/lrm: service status vm:107 started
+info    120      cmdlist: execute service vm:105 set-dynamic-stats cpu 7.8
+info    120      cmdlist: execute service vm:105 set-dynamic-stats mem 7912
+info    120      cmdlist: execute service vm:106 set-dynamic-stats cpu 5.7
+info    120      cmdlist: execute service vm:106 set-dynamic-stats mem 8192
+info    120      cmdlist: execute service vm:107 set-dynamic-stats cpu 6.0
+info    120      cmdlist: execute service vm:107 set-dynamic-stats mem 8011
+info    160    node1/crm: auto rebalance - migrate vm:105 to node2 (expected target imbalance: 0.42)
+info    160    node1/crm: got crm command: migrate vm:105 node2
+info    160    node1/crm: migrate service 'vm:105' to node 'node2'
+info    160    node1/crm: service 'vm:105': state changed from 'started' to 'migrate'  (node = node3, target = node2)
+info    165    node3/lrm: service vm:105 - start migrate to node 'node2'
+info    165    node3/lrm: service vm:105 - end migrate to node 'node2'
+info    180    node1/crm: service 'vm:105': state changed from 'migrate' to 'started'  (node = node2)
+info    183    node2/lrm: starting service vm:105
+info    183    node2/lrm: service status vm:105 started
+info    220      cmdlist: execute service vm:101 set-dynamic-stats mem 1011
+info    220      cmdlist: execute service vm:103 set-dynamic-stats cpu 3.9
+info    220      cmdlist: execute service vm:103 set-dynamic-stats mem 6517
+info    220      cmdlist: execute service vm:104 set-dynamic-stats cpu 6.7
+info    220      cmdlist: execute service vm:104 set-dynamic-stats mem 8001
+info    220      cmdlist: execute service vm:105 set-dynamic-stats cpu 1.8
+info    220      cmdlist: execute service vm:105 set-dynamic-stats mem 1201
+info    220      cmdlist: execute service vm:106 set-dynamic-stats cpu 2.1
+info    220      cmdlist: execute service vm:106 set-dynamic-stats mem 1211
+info    220      cmdlist: execute service vm:107 set-dynamic-stats cpu 0.9
+info    220      cmdlist: execute service vm:107 set-dynamic-stats mem 1191
+info    260    node1/crm: auto rebalance - migrate vm:103 to node1 (expected target imbalance: 0.4)
+info    260    node1/crm: got crm command: migrate vm:103 node1
+info    260    node1/crm: migrate service 'vm:103' to node 'node1'
+info    260    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node2, target = node1)
+info    263    node2/lrm: service vm:103 - start migrate to node 'node1'
+info    263    node2/lrm: service vm:103 - end migrate to node 'node1'
+info    280    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node1)
+info    281    node1/lrm: starting service vm:103
+info    281    node1/lrm: service status vm:103 started
+info    820     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/manager_status b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/service_config b/src/test/test-crs-dynamic-auto-rebalance3/service_config
new file mode 100644
index 00000000..a44ddd0e
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/service_config
@@ -0,0 +1,9 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node2", "state": "started" },
+    "vm:104": { "node": "node2", "state": "started" },
+    "vm:105": { "node": "node3", "state": "started" },
+    "vm:106": { "node": "node3", "state": "started" },
+    "vm:107": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
new file mode 100644
index 00000000..7a52ea73
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
@@ -0,0 +1,9 @@
+{
+    "vm:101": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:102": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:103": { "maxcpu": 4.0, "maxmem": 8589934592 },
+    "vm:104": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:105": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:106": { "maxcpu": 6.0, "maxmem": 8589934592 },
+    "vm:107": { "maxcpu": 6.0, "maxmem": 8589934592 }
+}
-- 
2.47.3






Thread overview: 40+ messages
2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 1/5] resource-scheduling: move score_nodes_to_start_service to scheduler crate Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 2/5] resource-scheduling: introduce generic cluster usage implementation Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 3/5] resource-scheduling: add dynamic node and service stats Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 4/5] resource-scheduling: implement rebalancing migration selection Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 5/5] resource-scheduling: implement Add and Default for {Dynamic,Static}ServiceStats Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 1/6] pve-rs: resource scheduling: use generic cluster usage implementation Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 2/6] pve-rs: resource scheduling: create service_nodes hashset from array Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 3/6] pve-rs: resource scheduling: store service stats independently of node Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 4/6] pve-rs: resource scheduling: expose auto rebalancing methods Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 5/6] pve-rs: resource scheduling: move pve_static into resource_scheduling module Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 6/6] pve-rs: resource scheduling: implement pve_dynamic bindings Daniel Kral
2026-02-17 14:14 ` [RFC cluster 1/2] datacenter config: add dynamic load scheduler option Daniel Kral
2026-02-18 11:06   ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC cluster 2/2] datacenter config: add auto rebalancing options Daniel Kral
2026-02-18 11:15   ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC ha-manager 01/21] rename static node stats to be consistent with similar interfaces Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 02/21] resources: remove redundant load_config fallback for static config Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 03/21] remove redundant service_node and migration_target parameter Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 04/21] factor out common pve to ha resource type mapping Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 05/21] derive static service stats while filling the service stats repository Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 06/21] test: make static service usage explicit for all resources Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 07/21] make static service stats indexable by sid Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 08/21] move static service stats repository to PVE::HA::Usage::Static Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 09/21] usage: augment service stats with node and state information Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 10/21] include running non-HA resources in the scheduler's accounting Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 11/21] env, resources: add dynamic node and service stats abstraction Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 12/21] env: pve2: implement dynamic node and service stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 13/21] sim: hardware: pass correct types for static stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 14/21] sim: hardware: factor out static stats' default values Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 15/21] sim: hardware: rewrite set-static-stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 16/21] sim: hardware: add set-dynamic-stats for services Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 17/21] usage: add dynamic usage scheduler Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 18/21] manager: rename execute_migration to queue_resource_motion Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 19/21] manager: update_crs_scheduler_mode: factor out crs config Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 20/21] implement automatic rebalancing Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases Daniel Kral [this message]
2026-02-17 14:14 ` [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option Daniel Kral
2026-02-18 11:10   ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC manager 2/2] ui: dc/options: add auto rebalancing options Daniel Kral
