* [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer
@ 2026-02-17 14:13 Daniel Kral
From: Daniel Kral @ 2026-02-17 14:13 UTC (permalink / raw)
  To: pve-devel

This RFC series proposes a dynamic scheduler and a manual/automatic,
static/dynamic load rebalancer. It implements the following:

- make the basic and static schedulers take the active usage of
  running, non-HA resources into account when making decisions,

- gather dynamic node and service usage information and use it in the
  dynamic scheduler, and

- implement a load rebalancer that actively moves HA resources to other
  nodes to lower the overall cluster node imbalance while adhering to
  the HA rules.



== Model ==

The automatic load rebalancing system checks whether the cluster node
imbalance exceeds a user-defined threshold for a number of consecutive
HA Manager rounds ("hold duration"). If it does, the system chooses the
best service migration/relocation to reduce the cluster node imbalance
and queues it if the improvement exceeds a user-defined amount
("margin").
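
To make the model above concrete, here is a minimal sketch of that
decision in Rust; all names and the relative interpretation of the
margin are illustrative assumptions, not the actual implementation:

    /// Illustrative-only sketch of the rebalancing trigger described above.
    struct RebalanceConfig {
        imbalance_threshold: f64, // trigger when the imbalance exceeds this
        hold_rounds: u32,         // consecutive rounds it must be exceeded
        margin: f64,              // minimum relative improvement to queue a motion
    }

    fn should_queue_motion(
        cfg: &RebalanceConfig,
        imbalance: f64,
        exceeded_rounds: &mut u32,
        imbalance_after_best_motion: Option<f64>,
    ) -> bool {
        if imbalance <= cfg.imbalance_threshold {
            *exceeded_rounds = 0; // back below the threshold, reset the hold counter
            return false;
        }

        *exceeded_rounds += 1;
        if *exceeded_rounds < cfg.hold_rounds {
            return false; // wait out the hold duration to ignore short spikes
        }

        // queue the best motion only if it improves the imbalance enough
        match imbalance_after_best_motion {
            Some(after) => (imbalance - after) / imbalance >= cfg.margin,
            None => false,
        }
    }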

The best service motion can be selected either by brute force or via
TOPSIS. In this RFC revision, the selection method and some of the other
parameters above can be tweaked at runtime, but they will likely be
reduced to a minimum in a final revision to allow further improvements
in the future without pinning us to a specific model in the background.
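
For reference, both selection methods are exposed on the new
ClusterUsage type from patch "RFC proxmox 4/5"; a minimal usage sketch
with made-up numbers (the candidate generation currently happens on the
Perl side, see the TODO list):

    use proxmox_resource_scheduling::scheduler::{
        ClusterUsage, MigrationCandidate, NodeStats, NodeUsage, ServiceStats,
    };

    fn example() -> Result<(), anyhow::Error> {
        let usage = ClusterUsage::from_nodes(vec![
            NodeUsage {
                name: "node1".to_string(),
                stats: NodeStats { cpu: 6.0, maxcpu: 8, mem: 24 << 30, maxmem: 32 << 30 },
            },
            NodeUsage {
                name: "node2".to_string(),
                stats: NodeStats { cpu: 1.0, maxcpu: 8, mem: 4 << 30, maxmem: 32 << 30 },
            },
        ]);

        let candidates = vec![MigrationCandidate {
            sid: "vm:100".to_string(),
            source_node: "node1".to_string(),
            target_node: "node2".to_string(),
            stats: ServiceStats { cpu: 2.0, maxcpu: 4.0, mem: 8 << 30, maxmem: 8 << 30 },
        }];

        // exhaustive search minimizing the resulting node imbalance ...
        let _best = usage.select_best_balancing_migration(candidates.clone())?;
        // ... or TOPSIS over the shifted RMS/maximum cpu and memory loads
        let _best_topsis = usage.select_best_balancing_migration_topsis(&candidates)?;

        Ok(())
    }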



== Tests ==

I've added some rudimentary test cases to document the more basic
decisions. Otherwise, I've done tests in homogeneous and heterogeneous
virtualized clusters, adding load dynamically to guests with stress-ng,
and plan to rely more on real-world load simulators for the next batch
of tests.

The repositories were all tested individually with their derivations of
`git rebase master --exec 'make clean && make deb'`. Additionally, the
old pve-rs shared library can still be built with the new
proxmox-resource-scheduling package.

Otherwise, I'd very much welcome feedback from a variety of use cases.
In this RFC version, the pve-ha-crm service reports the node imbalance
to the syslog every ~10 seconds to make it easy to keep an eye on it.



== Benchmarks ==

I've also done some theoretical benchmarks, targeting a 48-node cluster
with 9,999 HA resources / guests and a worst-case scenario of each HA
resource being part of 3 HA rules (pairwise positive and negative
resource affinity rules, where each positive resource affinity pair has
a common node affinity rule).

Generating the migration candidates for the huge cluster with the
worst-case HA ruleset takes 243 +- 9 ms.

Generating the migration candidates for the huge cluster without the
worst-case HA ruleset (to obtain the maximum of 459,954 migration
candidates) takes 356 +- 6 ms. This is expected, because more HA
resources' rules need to be evaluated when there are no HA resource
bundles.

Excluding the candidate generation, the brute force and TOPSIS methods
for select_best_balancing_migration() performed roughly the same, both
being in the range of 350 +- 50 ms for the huge cluster without any HA
rules (i.e. with the maximum number of migration candidates), including
the serialization between Perl and Rust.



== TODO ==

- as always, more test cases (high priority)

- fix rebalancing on service start for dynamic mode with
  yet-to-be-started HA resources, see known issues (high priority)

- decide whether to use brute force or TOPSIS to choose a service
  migration when rebalancing. Currently, both methods are included in
  the implementation, but more practical tests will show which is ideal.
  (high priority)

- assess whether an additional individual node load threshold is needed
  to trigger the load balancing when extreme outlier cases are not
  detected (e.g. 4 nodes at 90% load and 1 node at 0% load would need an
  imbalance threshold of 0.5 to be detected; see the worked example
  after this list) (high priority)

- allow individual HA resources to be actively excluded from the
  automatic rebalancing, e.g., because containers cannot be live
  migrated (medium priority)

- move the migration candidate generation to the Rust side; generating
  on the Perl side was chosen first to reduce code duplication, but it
  doesn't seem future-proof or right to copy state to the
  online_node_usage object twice (medium priority)

- user/admin documentation (medium priority)

- factor out the common hashmap build-ups from the static and dynamic
  load scheduler (low priority)

- probably move these to the proxmox-resource-scheduling crate to make
  the perlmod bindings wrapper thinner (low priority)
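
For the outlier case from the list above, using the imbalance metric
introduced in patch "RFC proxmox 4/5" (the coefficient of variation of
the combined node loads), the numbers work out as follows:

    loads     = 0.9, 0.9, 0.9, 0.9, 0.0
    mean      = (4 * 0.9 + 0.0) / 5 = 0.72
    stddev    = sqrt((4 * (0.9 - 0.72)^2 + (0.0 - 0.72)^2) / 5)
              = sqrt(0.648 / 5) = 0.36
    imbalance = stddev / mean = 0.36 / 0.72 = 0.5

so the imbalance threshold would indeed have to be raised to 0.5 for the
single idle node to trigger the rebalancer.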



== Future ideas ==

- include the migration costs in score_best_balancing_migrations(),
  e.g., so that VMs with lots of memory are less likely to be migrated
  if the link between the nodes is slow, but that would need measuring
  and storing the migration network link speeds as a mesh

- apply some filter like a moving average window or exponential
  smoothing on the usage time series to dampen spikes (see the sketch
  after this list); triple exponential smoothing (Holt-Winters) is also
  already implemented in rrdcached and allows for exponential smoothing
  with better time series analysis, but would require changing the
  rrdcached data structure once more

- score_best_balancing_migrations(...) can already provide a
  size-limited list of the best migrations, which could be exposed to
  users to allow manual load balancing actions, e.g., from the web
  interface, to get some insight in the system

- The current scheduler can only solve bin covering, but it would be
  interesting to also allow bin packing if certain criteria are met,
  e.g., for energy preservation while the overall cluster load is low
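
As a minimal sketch of the simpler (single) exponential smoothing
variant mentioned above, independent of rrdcached; the smoothing factor
and where it would be applied are illustrative assumptions:

    /// Simple exponential smoothing over a usage time series; illustrative only.
    /// `alpha` in (0, 1]: smaller values dampen spikes more aggressively.
    fn smooth_usage(series: &[f64], alpha: f64) -> Vec<f64> {
        let mut smoothed = Vec::with_capacity(series.len());
        let mut last = match series.first() {
            Some(first) => *first,
            None => return smoothed,
        };

        for &value in series {
            // blend the new sample with the previous smoothed value
            last = alpha * value + (1.0 - alpha) * last;
            smoothed.push(last);
        }

        smoothed
    }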



== Known issues / Discussion ==

(1) score_nodes_to_start_service() in Dynamic Mode

Since, in dynamic mode, we derive a node's usage exclusively from the
rrddump, an HA resource that is scheduled to be started will not add any
usage to its assigned node(s). This ensures that we always work with the
actual runtime values and do not skew any decisions with the
hypothetical resource commitment values from the HA resource guest's
config.

This has the side effect that score_nodes_to_start_service(...) won't
account for other, already scheduled HA resources. This differs from
what the basic and static load schedulers do; let me illustrate with an
example:

...

a) Basic / Static Mode

A 3 node cluster with no HA resources running yet

1. vm:100 is on node1 and in state 'request_start'

2. 'ha-rebalance-on-start' is set, so check for the best starting node
   with select_service_node(..., 'best-score'); if the cluster is
   homogeneous, score_nodes_to_start_service(...) will score all nodes
   equally, therefore vm:100 stays on node1

3. vm:100 is set to the state 'started'; this will immediately add
   vm:100's maxcpu and maxmem to node1's usage, even though vm:100 has
   not been started by node1's LRM yet

4. Next, vm:101 is also on node1 and in state 'request_start'

5. Same procedure, but now score_nodes_to_start_service(...) will score
   node1 worse than node2 and node3, because node1 already has vm:100's
   usage added, and will choose node2

b) Dynamic Mode

Same setup and same actions, with these steps differing:

3. vm:100 is set to the state 'started'; but as vm:100 is not actually
   started yet, it is not accounted for in node1's usage yet

5. score_nodes_to_start_service(...) will again score all nodes equally,
   because the node usages haven't actually changed, and will therefore
   also choose node1 as its starting node

...

There are multiple solutions to this:

1. This is fine; the actual usage is only known while running, and the
   automatic rebalancer will do its job at the cost of live-migrating
   the guests (worse)

2. Treat HA resources that are scheduled to start, but not started yet,
   specially and add them to a hypothetical node usage, which is only
   used for score_nodes_to_start_service(...)

3. Record a characteristic load that is known from previous runs of that
   HA resource and use it as a weight when scoring the starting node.

N. ...

Solution 2 seems the most reasonable to implement at the moment, even
though it might not be perfect.
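
A minimal sketch of solution 2; only NodeStats, ServiceStats and
add_started_service() come from this series, the rest is an illustrative
assumption about how the scheduled-but-not-started services would be
tracked:

    use proxmox_resource_scheduling::scheduler::{NodeStats, ServiceStats};

    /// Overlay the stats of scheduled-but-not-started services on top of the
    /// measured runtime stats, used only for scoring the starting node.
    fn hypothetical_node_stats(
        measured: NodeStats,
        scheduled_services: &[ServiceStats],
    ) -> NodeStats {
        let mut stats = measured;
        for service in scheduled_services {
            // account for the service as if it had already started on this node
            stats.add_started_service(service);
        }
        stats
    }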



== Diffstat ==


proxmox:

Daniel Kral (5):
  resource-scheduling: move score_nodes_to_start_service to scheduler
    crate
  resource-scheduling: introduce generic cluster usage implementation
  resource-scheduling: add dynamic node and service stats
  resource-scheduling: implement rebalancing migration selection
  resource-scheduling: implement Add and Default for
    {Dynamic,Static}ServiceStats

 proxmox-resource-scheduling/src/lib.rs        |   3 +
 .../src/pve_dynamic.rs                        |  68 +++
 proxmox-resource-scheduling/src/pve_static.rs | 110 ++---
 proxmox-resource-scheduling/src/scheduler.rs  | 436 ++++++++++++++++++
 4 files changed, 552 insertions(+), 65 deletions(-)
 create mode 100644 proxmox-resource-scheduling/src/pve_dynamic.rs
 create mode 100644 proxmox-resource-scheduling/src/scheduler.rs


base-commit: 984affa2c9149710d4f832c7522ddd3eb8802000
prerequisite-patch-id: 6b73d4dd683cbad857fbd527e1f2d53a709f2eef
prerequisite-patch-id: df7b2e2d7d42a588b7ecc18f475967ecdd0f25ef
prerequisite-patch-id: cffcc847e63bae6ce133507882e010da234df429
prerequisite-patch-id: a4699e1ba53fe9003b9b70f11eae50e924516fa1
prerequisite-patch-id: cf1be5cf0be66fc1388be3a269be5b10290d643d
prerequisite-patch-id: 3802a06b0b641438d949480479f83b8bd5000a0f
prerequisite-patch-id: 2890c6a14768e3fdfd3f35daccbe1953b0da199c
prerequisite-patch-id: 54c4904ef279bc1e64fe4904f862f690beb2d221
prerequisite-patch-id: a863374a7d7b814e74b1cacef3d1f7ed7428e45c
prerequisite-patch-id: 9a9452055bc5b1e3cfabadbb72567afe721b5856
prerequisite-patch-id: a4437f7be281e8efbf9628272ad41f9de2586994
prerequisite-patch-id: 14555958f0c6c1f3b6fae29f998139c900a97162

perl-rs:

Daniel Kral (6):
  pve-rs: resource scheduling: use generic cluster usage implementation
  pve-rs: resource scheduling: create service_nodes hashset from array
  pve-rs: resource scheduling: store service stats independently of node
  pve-rs: resource scheduling: expose auto rebalancing methods
  pve-rs: resource scheduling: move pve_static into resource_scheduling
    module
  pve-rs: resource scheduling: implement pve_dynamic bindings

 pve-rs/Makefile                               |   1 +
 pve-rs/src/bindings/mod.rs                    |   3 +-
 .../src/bindings/resource_scheduling/mod.rs   |  20 +
 .../resource_scheduling/pve_dynamic.rs        | 349 +++++++++++++++++
 .../resource_scheduling/pve_static.rs         | 365 ++++++++++++++++++
 .../bindings/resource_scheduling_static.rs    | 215 -----------
 pve-rs/test/resource_scheduling.pl            |   1 +
 7 files changed, 737 insertions(+), 217 deletions(-)
 create mode 100644 pve-rs/src/bindings/resource_scheduling/mod.rs
 create mode 100644 pve-rs/src/bindings/resource_scheduling/pve_dynamic.rs
 create mode 100644 pve-rs/src/bindings/resource_scheduling/pve_static.rs
 delete mode 100644 pve-rs/src/bindings/resource_scheduling_static.rs


cluster:

Daniel Kral (2):
  datacenter config: add dynamic load scheduler option
  datacenter config: add auto rebalancing options

 src/PVE/DataCenterConfig.pm | 43 +++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)


base-commit: 300978f19f91e3e226a80bb69ba6a21ec279e869

ha-manager:

Daniel Kral (17):
  rename static node stats to be consistent with similar interfaces
  resources: remove redundant load_config fallback for static config
  remove redundant service_node and migration_target parameter
  factor out common pve to ha resource type mapping
  derive static service stats while filling the service stats repository
  test: make static service usage explicit for all resources
  make static service stats indexable by sid
  move static service stats repository to PVE::HA::Usage::Static
  usage: augment service stats with node and state information
  include running non-HA resources in the scheduler's accounting
  env, resources: add dynamic node and service stats abstraction
  env: pve2: implement dynamic node and service stats
  usage: add dynamic usage scheduler
  manager: rename execute_migration to queue_resource_motion
  manager: update_crs_scheduler_mode: factor out crs config
  implement automatic rebalancing
  test: add basic automatic rebalancing system test cases

Dominik Rusovac (4):
  sim: hardware: pass correct types for static stats
  sim: hardware: factor out static stats' default values
  sim: hardware: rewrite set-static-stats
  sim: hardware: add set-dynamic-stats for services

 debian/pve-ha-manager.install                 |   1 +
 src/PVE/HA/Config.pm                          |   8 +-
 src/PVE/HA/Env.pm                             |  28 ++-
 src/PVE/HA/Env/PVE2.pm                        | 126 ++++++++--
 src/PVE/HA/Manager.pm                         | 208 ++++++++++++++-
 src/PVE/HA/Resources.pm                       |   4 +-
 src/PVE/HA/Resources/PVECT.pm                 |   7 +-
 src/PVE/HA/Resources/PVEVM.pm                 |   7 +-
 src/PVE/HA/Sim/Env.pm                         |  28 ++-
 src/PVE/HA/Sim/Hardware.pm                    | 236 ++++++++++++++++--
 src/PVE/HA/Sim/RTHardware.pm                  |   3 +-
 src/PVE/HA/Sim/Resources.pm                   |  17 --
 src/PVE/HA/Tools.pm                           |  23 +-
 src/PVE/HA/Usage.pm                           |  48 +++-
 src/PVE/HA/Usage/Basic.pm                     |   6 +-
 src/PVE/HA/Usage/Dynamic.pm                   | 160 ++++++++++++
 src/PVE/HA/Usage/Makefile                     |   2 +-
 src/PVE/HA/Usage/Static.pm                    | 101 +++++---
 .../test-crs-dynamic-auto-rebalance0/README   |   2 +
 .../test-crs-dynamic-auto-rebalance0/cmdlist  |   3 +
 .../datacenter.cfg                            |   8 +
 .../dynamic_service_stats                     |   1 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  11 +
 .../manager_status                            |   1 +
 .../service_config                            |   1 +
 .../static_service_stats                      |   1 +
 .../test-crs-dynamic-auto-rebalance1/README   |   6 +
 .../test-crs-dynamic-auto-rebalance1/cmdlist  |   3 +
 .../datacenter.cfg                            |   8 +
 .../dynamic_service_stats                     |   3 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  25 ++
 .../manager_status                            |   1 +
 .../service_config                            |   3 +
 .../static_service_stats                      |   3 +
 .../test-crs-dynamic-auto-rebalance2/README   |   3 +
 .../test-crs-dynamic-auto-rebalance2/cmdlist  |   3 +
 .../datacenter.cfg                            |   8 +
 .../dynamic_service_stats                     |   6 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  59 +++++
 .../manager_status                            |   1 +
 .../service_config                            |   6 +
 .../static_service_stats                      |   6 +
 .../test-crs-dynamic-auto-rebalance3/README   |   3 +
 .../test-crs-dynamic-auto-rebalance3/cmdlist  |  24 ++
 .../datacenter.cfg                            |   8 +
 .../dynamic_service_stats                     |   9 +
 .../hardware_status                           |   5 +
 .../log.expect                                |  88 +++++++
 .../manager_status                            |   1 +
 .../service_config                            |   9 +
 .../static_service_stats                      |   9 +
 .../hardware_status                           |   6 +-
 .../hardware_status                           |   6 +-
 .../hardware_status                           |  10 +-
 .../hardware_status                           |   6 +-
 .../static_service_stats                      |  52 +++-
 .../hardware_status                           |   6 +-
 .../static_service_stats                      |   9 +-
 src/test/test-crs-static1/hardware_status     |   6 +-
 src/test/test-crs-static2/hardware_status     |  10 +-
 src/test/test-crs-static3/hardware_status     |   6 +-
 src/test/test-crs-static4/hardware_status     |   6 +-
 src/test/test-crs-static5/hardware_status     |   6 +-
 66 files changed, 1290 insertions(+), 195 deletions(-)
 create mode 100644 src/PVE/HA/Usage/Dynamic.pm
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/static_service_stats


manager:

Daniel Kral (2):
  ui: dc/options: add dynamic load scheduler option
  ui: dc/options: add auto rebalancing options

 www/manager6/dc/OptionView.js | 46 +++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)


base-commit: 71482d1833ded40a25a78b67f09cc1975acf92c9
prerequisite-patch-id: 25cc6a017d5278d73a77510dfa90379cef4d66b1

Summary over all repositories:
  79 files changed, 2666 insertions(+), 479 deletions(-)

-- 
Generated by murpp 0.9.0





* [RFC proxmox 1/5] resource-scheduling: move score_nodes_to_start_service to scheduler crate
From: Daniel Kral @ 2026-02-17 14:13 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 proxmox-resource-scheduling/src/lib.rs        |  2 +
 proxmox-resource-scheduling/src/pve_static.rs | 74 +---------------
 proxmox-resource-scheduling/src/scheduler.rs  | 86 +++++++++++++++++++
 3 files changed, 91 insertions(+), 71 deletions(-)
 create mode 100644 proxmox-resource-scheduling/src/scheduler.rs

diff --git a/proxmox-resource-scheduling/src/lib.rs b/proxmox-resource-scheduling/src/lib.rs
index 47980259..c73e7b1e 100644
--- a/proxmox-resource-scheduling/src/lib.rs
+++ b/proxmox-resource-scheduling/src/lib.rs
@@ -1,4 +1,6 @@
 #[macro_use]
 pub mod topsis;
 
+pub mod scheduler;
+
 pub mod pve_static;
diff --git a/proxmox-resource-scheduling/src/pve_static.rs b/proxmox-resource-scheduling/src/pve_static.rs
index b81086dd..184e615d 100644
--- a/proxmox-resource-scheduling/src/pve_static.rs
+++ b/proxmox-resource-scheduling/src/pve_static.rs
@@ -1,7 +1,7 @@
 use anyhow::Error;
 use serde::{Deserialize, Serialize};
 
-use crate::topsis;
+use crate::scheduler;
 
 #[derive(Serialize, Deserialize)]
 #[serde(rename_all = "kebab-case")]
@@ -35,7 +35,7 @@ impl AsRef<StaticNodeUsage> for StaticNodeUsage {
 
 /// Calculate new CPU usage in percent.
 /// `add` being `0.0` means "unlimited" and results in `max` being added.
-fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
+pub fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
     if add == 0.0 {
         old + max
     } else {
@@ -53,23 +53,6 @@ pub struct StaticServiceUsage {
     pub maxmem: usize,
 }
 
-criteria_struct! {
-    /// A given alternative.
-    struct PveTopsisAlternative {
-        #[criterion("average CPU", -1.0)]
-        average_cpu: f64,
-        #[criterion("highest CPU", -2.0)]
-        highest_cpu: f64,
-        #[criterion("average memory", -5.0)]
-        average_memory: f64,
-        #[criterion("highest memory", -10.0)]
-        highest_memory: f64,
-    }
-
-    const N_CRITERIA;
-    static PVE_HA_TOPSIS_CRITERIA;
-}
-
 /// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
 /// and CPU usages of the nodes as if the service would already be running on each.
 ///
@@ -79,56 +62,5 @@ pub fn score_nodes_to_start_service<T: AsRef<StaticNodeUsage>>(
     nodes: &[T],
     service: &StaticServiceUsage,
 ) -> Result<Vec<(String, f64)>, Error> {
-    let len = nodes.len();
-
-    let matrix = nodes
-        .iter()
-        .enumerate()
-        .map(|(target_index, _)| {
-            // Base values on percentages to allow comparing nodes with different stats.
-            let mut highest_cpu = 0.0;
-            let mut squares_cpu = 0.0;
-            let mut highest_mem = 0.0;
-            let mut squares_mem = 0.0;
-
-            for (index, node) in nodes.iter().enumerate() {
-                let node = node.as_ref();
-                let new_cpu = if index == target_index {
-                    add_cpu_usage(node.cpu, node.maxcpu as f64, service.maxcpu)
-                } else {
-                    node.cpu
-                } / (node.maxcpu as f64);
-                highest_cpu = f64::max(highest_cpu, new_cpu);
-                squares_cpu += new_cpu.powi(2);
-
-                let new_mem = if index == target_index {
-                    node.mem + service.maxmem
-                } else {
-                    node.mem
-                } as f64
-                    / node.maxmem as f64;
-                highest_mem = f64::max(highest_mem, new_mem);
-                squares_mem += new_mem.powi(2);
-            }
-
-            // Add 1.0 to avoid boosting tiny differences: e.g. 0.004 is twice as much as 0.002, but
-            // 1.004 is only slightly more than 1.002.
-            PveTopsisAlternative {
-                average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
-                highest_cpu: 1.0 + highest_cpu,
-                average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
-                highest_memory: 1.0 + highest_mem,
-            }
-            .into()
-        })
-        .collect::<Vec<_>>();
-
-    let scores =
-        topsis::score_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
-
-    Ok(scores
-        .into_iter()
-        .enumerate()
-        .map(|(n, score)| (nodes[n].as_ref().name.clone(), score))
-        .collect())
+    scheduler::score_nodes_to_start_service(nodes, service)
 }
diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
new file mode 100644
index 00000000..29353d84
--- /dev/null
+++ b/proxmox-resource-scheduling/src/scheduler.rs
@@ -0,0 +1,86 @@
+use anyhow::Error;
+
+use crate::{
+    pve_static::{add_cpu_usage, StaticNodeUsage, StaticServiceUsage},
+    topsis,
+};
+
+criteria_struct! {
+    /// A given alternative.
+    struct PveTopsisAlternative {
+        #[criterion("average CPU", -1.0)]
+        average_cpu: f64,
+        #[criterion("highest CPU", -2.0)]
+        highest_cpu: f64,
+        #[criterion("average memory", -5.0)]
+        average_memory: f64,
+        #[criterion("highest memory", -10.0)]
+        highest_memory: f64,
+    }
+
+    const N_CRITERIA;
+    static PVE_HA_TOPSIS_CRITERIA;
+}
+
+/// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
+/// and CPU usages of the nodes as if the service would already be running on each.
+///
+/// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
+/// is better.
+pub fn score_nodes_to_start_service<T: AsRef<StaticNodeUsage>>(
+    nodes: &[T],
+    service: &StaticServiceUsage,
+) -> Result<Vec<(String, f64)>, Error> {
+    let len = nodes.len();
+
+    let matrix = nodes
+        .iter()
+        .enumerate()
+        .map(|(target_index, _)| {
+            // Base values on percentages to allow comparing nodes with different stats.
+            let mut highest_cpu = 0.0;
+            let mut squares_cpu = 0.0;
+            let mut highest_mem = 0.0;
+            let mut squares_mem = 0.0;
+
+            for (index, node) in nodes.iter().enumerate() {
+                let node = node.as_ref();
+                let new_cpu = if index == target_index {
+                    add_cpu_usage(node.cpu, node.maxcpu as f64, service.maxcpu)
+                } else {
+                    node.cpu
+                } / (node.maxcpu as f64);
+                highest_cpu = f64::max(highest_cpu, new_cpu);
+                squares_cpu += new_cpu.powi(2);
+
+                let new_mem = if index == target_index {
+                    node.mem + service.maxmem
+                } else {
+                    node.mem
+                } as f64
+                    / node.maxmem as f64;
+                highest_mem = f64::max(highest_mem, new_mem);
+                squares_mem += new_mem.powi(2);
+            }
+
+            // Add 1.0 to avoid boosting tiny differences: e.g. 0.004 is twice as much as 0.002, but
+            // 1.004 is only slightly more than 1.002.
+            PveTopsisAlternative {
+                average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
+                highest_cpu: 1.0 + highest_cpu,
+                average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
+                highest_memory: 1.0 + highest_mem,
+            }
+            .into()
+        })
+        .collect::<Vec<_>>();
+
+    let scores =
+        topsis::score_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
+
+    Ok(scores
+        .into_iter()
+        .enumerate()
+        .map(|(n, score)| (nodes[n].as_ref().name.clone(), score))
+        .collect())
+}
-- 
2.47.3






* [RFC proxmox 2/5] resource-scheduling: introduce generic cluster usage implementation
From: Daniel Kral @ 2026-02-17 14:13 UTC (permalink / raw)
  To: pve-devel

Declare generic NodeStats and ServiceStats structs, into which the
specialized use cases convert their own types, and use these to
implement generic scheduler methods such as the existing scoring of
nodes to start a previously non-running service.
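
A small usage sketch of the conversions and the new ClusterUsage type
introduced below (the construction of the static usage values is
elided):

    let node_usage: NodeUsage = static_node_usage.into();
    let cluster_usage = ClusterUsage::from_nodes(vec![node_usage]);
    let scores = cluster_usage.score_nodes_to_start_service(static_service_usage)?;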

This is best viewed with the git option --ignore-all-space.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 proxmox-resource-scheduling/src/pve_static.rs |  45 ++++-
 proxmox-resource-scheduling/src/scheduler.rs  | 185 ++++++++++++------
 2 files changed, 166 insertions(+), 64 deletions(-)

diff --git a/proxmox-resource-scheduling/src/pve_static.rs b/proxmox-resource-scheduling/src/pve_static.rs
index 184e615d..b269c44f 100644
--- a/proxmox-resource-scheduling/src/pve_static.rs
+++ b/proxmox-resource-scheduling/src/pve_static.rs
@@ -1,9 +1,9 @@
 use anyhow::Error;
 use serde::{Deserialize, Serialize};
 
-use crate::scheduler;
+use crate::scheduler::{ClusterUsage, NodeStats, NodeUsage, ServiceStats};
 
-#[derive(Serialize, Deserialize)]
+#[derive(Clone, Serialize, Deserialize)]
 #[serde(rename_all = "kebab-case")]
 /// Static usage information of a node.
 pub struct StaticNodeUsage {
@@ -33,9 +33,25 @@ impl AsRef<StaticNodeUsage> for StaticNodeUsage {
     }
 }
 
+impl From<StaticNodeUsage> for NodeUsage {
+    fn from(value: StaticNodeUsage) -> Self {
+        let stats = NodeStats {
+            cpu: value.cpu,
+            maxcpu: value.maxcpu,
+            mem: value.mem,
+            maxmem: value.maxmem,
+        };
+
+        Self {
+            name: value.name,
+            stats,
+        }
+    }
+}
+
 /// Calculate new CPU usage in percent.
 /// `add` being `0.0` means "unlimited" and results in `max` being added.
-pub fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
+fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
     if add == 0.0 {
         old + max
     } else {
@@ -43,7 +59,7 @@ pub fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
     }
 }
 
-#[derive(Serialize, Deserialize)]
+#[derive(Clone, Copy, Serialize, Deserialize)]
 #[serde(rename_all = "kebab-case")]
 /// Static usage information of an HA resource.
 pub struct StaticServiceUsage {
@@ -53,14 +69,33 @@ pub struct StaticServiceUsage {
     pub maxmem: usize,
 }
 
+impl From<StaticServiceUsage> for ServiceStats {
+    fn from(value: StaticServiceUsage) -> Self {
+        Self {
+            cpu: value.maxcpu,
+            maxcpu: value.maxcpu,
+            mem: value.maxmem,
+            maxmem: value.maxmem,
+        }
+    }
+}
+
 /// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
 /// and CPU usages of the nodes as if the service would already be running on each.
 ///
 /// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
 /// is better.
+#[deprecated]
 pub fn score_nodes_to_start_service<T: AsRef<StaticNodeUsage>>(
     nodes: &[T],
     service: &StaticServiceUsage,
 ) -> Result<Vec<(String, f64)>, Error> {
-    scheduler::score_nodes_to_start_service(nodes, service)
+    let nodes = nodes
+        .iter()
+        .map(|node| node.as_ref().clone().into())
+        .collect::<Vec<NodeUsage>>();
+
+    let cluster_usage = ClusterUsage::from_nodes(nodes);
+
+    cluster_usage.score_nodes_to_start_service(*service)
 }
diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
index 29353d84..58215f03 100644
--- a/proxmox-resource-scheduling/src/scheduler.rs
+++ b/proxmox-resource-scheduling/src/scheduler.rs
@@ -1,9 +1,66 @@
 use anyhow::Error;
 
-use crate::{
-    pve_static::{add_cpu_usage, StaticNodeUsage, StaticServiceUsage},
-    topsis,
-};
+use crate::topsis;
+
+/// Generic service stats.
+#[derive(Clone, Copy)]
+pub struct ServiceStats {
+    /// CPU utilization in CPU cores.
+    pub cpu: f64,
+    /// Number of assigned CPUs or CPU limit.
+    pub maxcpu: f64,
+    /// Used memory in bytes.
+    pub mem: usize,
+    /// Maximum assigned memory in bytes.
+    pub maxmem: usize,
+}
+
+/// Generic node stats.
+#[derive(Clone, Copy)]
+pub struct NodeStats {
+    /// CPU utilization in CPU cores.
+    pub cpu: f64,
+    /// Total number of CPU cores.
+    pub maxcpu: usize,
+    /// Used memory in bytes.
+    pub mem: usize,
+    /// Total memory in bytes.
+    pub maxmem: usize,
+}
+
+impl NodeStats {
+    /// Adds the service stats to the node stats as if the service has started on the node.
+    pub fn add_started_service(&mut self, service_stats: &ServiceStats) {
+        // a maxcpu value of `0.0` means no cpu usage limit on the node
+        let service_cpu = if service_stats.maxcpu == 0.0 {
+            self.maxcpu as f64
+        } else {
+            service_stats.maxcpu
+        };
+
+        self.cpu += service_cpu;
+        self.mem += service_stats.maxmem;
+    }
+
+    /// Returns the current cpu usage as a percentage.
+    pub fn cpu_load(&self) -> f64 {
+        self.cpu / self.maxcpu as f64
+    }
+
+    /// Returns the current memory usage as a percentage.
+    pub fn mem_load(&self) -> f64 {
+        self.mem as f64 / self.maxmem as f64
+    }
+}
+
+pub struct NodeUsage {
+    pub name: String,
+    pub stats: NodeStats,
+}
+
+pub struct ClusterUsage {
+    nodes: Vec<NodeUsage>,
+}
 
 criteria_struct! {
     /// A given alternative.
@@ -22,65 +79,75 @@ criteria_struct! {
     static PVE_HA_TOPSIS_CRITERIA;
 }
 
-/// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
-/// and CPU usages of the nodes as if the service would already be running on each.
-///
-/// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
-/// is better.
-pub fn score_nodes_to_start_service<T: AsRef<StaticNodeUsage>>(
-    nodes: &[T],
-    service: &StaticServiceUsage,
-) -> Result<Vec<(String, f64)>, Error> {
-    let len = nodes.len();
+impl ClusterUsage {
+    /// Instantiate cluster usage from node usages.
+    pub fn from_nodes<I>(nodes: I) -> Self
+    where
+        I: IntoIterator<Item: Into<NodeUsage>>,
+    {
+        Self {
+            nodes: nodes.into_iter().map(|node| node.into()).collect(),
+        }
+    }
 
-    let matrix = nodes
-        .iter()
-        .enumerate()
-        .map(|(target_index, _)| {
-            // Base values on percentages to allow comparing nodes with different stats.
-            let mut highest_cpu = 0.0;
-            let mut squares_cpu = 0.0;
-            let mut highest_mem = 0.0;
-            let mut squares_mem = 0.0;
+    /// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
+    /// and CPU usages of the nodes as if the service would already be running on each.
+    ///
+    /// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
+    /// is better.
+    pub fn score_nodes_to_start_service<T: Into<ServiceStats>>(
+        &self,
+        service_stats: T,
+    ) -> Result<Vec<(String, f64)>, Error> {
+        let len = self.nodes.len();
+        let service_stats = service_stats.into();
 
-            for (index, node) in nodes.iter().enumerate() {
-                let node = node.as_ref();
-                let new_cpu = if index == target_index {
-                    add_cpu_usage(node.cpu, node.maxcpu as f64, service.maxcpu)
-                } else {
-                    node.cpu
-                } / (node.maxcpu as f64);
-                highest_cpu = f64::max(highest_cpu, new_cpu);
-                squares_cpu += new_cpu.powi(2);
+        let matrix = self
+            .nodes
+            .iter()
+            .enumerate()
+            .map(|(target_index, _)| {
+                // Base values on percentages to allow comparing nodes with different stats.
+                let mut highest_cpu = 0.0;
+                let mut squares_cpu = 0.0;
+                let mut highest_mem = 0.0;
+                let mut squares_mem = 0.0;
 
-                let new_mem = if index == target_index {
-                    node.mem + service.maxmem
-                } else {
-                    node.mem
-                } as f64
-                    / node.maxmem as f64;
-                highest_mem = f64::max(highest_mem, new_mem);
-                squares_mem += new_mem.powi(2);
-            }
+                for (index, node) in self.nodes.iter().enumerate() {
+                    let mut new_stats = node.stats;
 
-            // Add 1.0 to avoid boosting tiny differences: e.g. 0.004 is twice as much as 0.002, but
-            // 1.004 is only slightly more than 1.002.
-            PveTopsisAlternative {
-                average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
-                highest_cpu: 1.0 + highest_cpu,
-                average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
-                highest_memory: 1.0 + highest_mem,
-            }
-            .into()
-        })
-        .collect::<Vec<_>>();
+                    if index == target_index {
+                        new_stats.add_started_service(&service_stats)
+                    };
 
-    let scores =
-        topsis::score_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
+                    let new_cpu = new_stats.cpu_load();
+                    highest_cpu = f64::max(highest_cpu, new_cpu);
+                    squares_cpu += new_cpu.powi(2);
 
-    Ok(scores
-        .into_iter()
-        .enumerate()
-        .map(|(n, score)| (nodes[n].as_ref().name.clone(), score))
-        .collect())
+                    let new_mem = new_stats.mem_load();
+                    highest_mem = f64::max(highest_mem, new_mem);
+                    squares_mem += new_mem.powi(2);
+                }
+
+                // Add 1.0 to avoid boosting tiny differences: e.g. 0.004 is twice as much as 0.002, but
+                // 1.004 is only slightly more than 1.002.
+                PveTopsisAlternative {
+                    average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
+                    highest_cpu: 1.0 + highest_cpu,
+                    average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
+                    highest_memory: 1.0 + highest_mem,
+                }
+                .into()
+            })
+            .collect::<Vec<_>>();
+
+        let scores =
+            topsis::score_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
+
+        Ok(scores
+            .into_iter()
+            .enumerate()
+            .map(|(n, score)| (self.nodes[n].name.to_string(), score))
+            .collect())
+    }
 }
-- 
2.47.3






* [RFC proxmox 3/5] resource-scheduling: add dynamic node and service stats
From: Daniel Kral @ 2026-02-17 14:13 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 proxmox-resource-scheduling/src/lib.rs        |  1 +
 .../src/pve_dynamic.rs                        | 53 +++++++++++++++++++
 2 files changed, 54 insertions(+)
 create mode 100644 proxmox-resource-scheduling/src/pve_dynamic.rs

diff --git a/proxmox-resource-scheduling/src/lib.rs b/proxmox-resource-scheduling/src/lib.rs
index c73e7b1e..2c22dbce 100644
--- a/proxmox-resource-scheduling/src/lib.rs
+++ b/proxmox-resource-scheduling/src/lib.rs
@@ -3,4 +3,5 @@ pub mod topsis;
 
 pub mod scheduler;
 
+pub mod pve_dynamic;
 pub mod pve_static;
diff --git a/proxmox-resource-scheduling/src/pve_dynamic.rs b/proxmox-resource-scheduling/src/pve_dynamic.rs
new file mode 100644
index 00000000..4f480612
--- /dev/null
+++ b/proxmox-resource-scheduling/src/pve_dynamic.rs
@@ -0,0 +1,53 @@
+use serde::{Deserialize, Serialize};
+
+use crate::scheduler::{NodeStats, ServiceStats};
+
+#[derive(Clone, Copy, Serialize, Deserialize)]
+#[serde(rename_all = "kebab-case")]
+/// Dynamic usage stats of a node.
+pub struct DynamicNodeStats {
+    /// CPU utilization in CPU cores.
+    pub cpu: f64,
+    /// Total number of CPU cores.
+    pub maxcpu: usize,
+    /// Used memory in bytes.
+    pub mem: usize,
+    /// Total memory in bytes.
+    pub maxmem: usize,
+}
+
+impl From<DynamicNodeStats> for NodeStats {
+    fn from(value: DynamicNodeStats) -> Self {
+        Self {
+            cpu: value.cpu,
+            maxcpu: value.maxcpu,
+            mem: value.mem,
+            maxmem: value.maxmem,
+        }
+    }
+}
+
+#[derive(Clone, Copy, Default, Serialize, Deserialize)]
+#[serde(rename_all = "kebab-case")]
+/// Dynamic usage stats of an HA resource.
+pub struct DynamicServiceStats {
+    /// CPU utilization in CPU cores.
+    pub cpu: f64,
+    /// Number of assigned CPUs or CPU limit.
+    pub maxcpu: f64,
+    /// Used memory in bytes.
+    pub mem: usize,
+    /// Maximum assigned memory in bytes.
+    pub maxmem: usize,
+}
+
+impl From<DynamicServiceStats> for ServiceStats {
+    fn from(value: DynamicServiceStats) -> Self {
+        Self {
+            cpu: value.cpu,
+            maxcpu: value.maxcpu,
+            mem: value.mem,
+            maxmem: value.maxmem,
+        }
+    }
+}
-- 
2.47.3






* [RFC proxmox 4/5] resource-scheduling: implement rebalancing migration selection
From: Daniel Kral @ 2026-02-17 14:13 UTC (permalink / raw)
  To: pve-devel

Assuming that a service will hold the same dynamic resource usage on a
new node as on the previous node, score possible migrations, where:

- the cluster node imbalance is minimal (brute force), or

- the shifted root mean square and maximum resource usages of the CPU
  and memory are minimal across the cluster nodes (TOPSIS).

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
score_best_balancing_migrations() and select_best_balancing_migration()
are separate because there could be future improvements for the
single-select case, but this might turn out to be unnecessary and
redundant (especially since we need to expose it twice, in perlmod and
in PVE::HA::Usage::{Dynamic,Static}).

 proxmox-resource-scheduling/src/scheduler.rs | 283 +++++++++++++++++++
 1 file changed, 283 insertions(+)

diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
index 58215f03..bd69cb2a 100644
--- a/proxmox-resource-scheduling/src/scheduler.rs
+++ b/proxmox-resource-scheduling/src/scheduler.rs
@@ -2,6 +2,9 @@ use anyhow::Error;
 
 use crate::topsis;
 
+use serde::{Deserialize, Serialize};
+use std::collections::BinaryHeap;
+
 /// Generic service stats.
 #[derive(Clone, Copy)]
 pub struct ServiceStats {
@@ -42,6 +45,18 @@ impl NodeStats {
         self.mem += service_stats.maxmem;
     }
 
+    /// Adds the service stats to the node stats as if the service is running on the node.
+    pub fn add_running_service(&mut self, service_stats: &ServiceStats) {
+        self.cpu += service_stats.cpu;
+        self.mem += service_stats.mem;
+    }
+
+    /// Removes the service stats from the node stats as if the service is not running on the node.
+    pub fn remove_running_service(&mut self, service_stats: &ServiceStats) {
+        self.cpu -= service_stats.cpu;
+        self.mem -= service_stats.mem;
+    }
+
     /// Returns the current cpu usage as a percentage.
     pub fn cpu_load(&self) -> f64 {
         self.cpu / self.maxcpu as f64
@@ -51,6 +66,45 @@ impl NodeStats {
     pub fn mem_load(&self) -> f64 {
         self.mem as f64 / self.maxmem as f64
     }
+
+    /// Returns a combined node usage as a percentage.
+    pub fn load(&self) -> f64 {
+        (self.cpu_load() + self.mem_load()) / 2.0
+    }
+}
+
+fn calculate_node_loads(nodes: &[NodeStats]) -> Vec<f64> {
+    nodes.iter().map(|stats| stats.load()).collect()
+}
+
+/// Returns the load imbalance among the nodes.
+///
+/// The load balance is measured as the statistical dispersion of the individual node loads.
+///
+/// The current implementation uses the dimensionless coefficient of variation, which expresses the
+/// standard deviation in relation to the arithmetic mean of the node loads. Note that the
+/// coefficient of variation is not robust, i.e. it is sensitive to outlier node loads.
+fn calculate_node_imbalance(nodes: &[NodeStats]) -> f64 {
+    let node_count = nodes.len();
+    let node_loads = calculate_node_loads(nodes);
+
+    let load_sum = node_loads
+        .iter()
+        .fold(0.0, |sum, node_load| sum + node_load);
+
+    // load_sum is guaranteed to be 0.0 for empty nodes
+    if load_sum == 0.0 {
+        0.0
+    } else {
+        let load_mean = load_sum / node_count as f64;
+
+        let squared_diff_sum = node_loads
+            .iter()
+            .fold(0.0, |sum, node_load| sum + (node_load - load_mean).powi(2));
+        let load_sd = (squared_diff_sum / node_count as f64).sqrt();
+
+        load_sd / load_mean
+    }
 }
 
 pub struct NodeUsage {
@@ -79,6 +133,71 @@ criteria_struct! {
     static PVE_HA_TOPSIS_CRITERIA;
 }
 
+/// A possible migration.
+#[derive(Eq, PartialEq, Ord, PartialOrd, Serialize, Deserialize)]
+#[serde(rename_all = "kebab-case")]
+pub struct Migration {
+    /// Service identifier.
+    pub sid: String,
+    /// The current node of the service.
+    pub source_node: String,
+    /// The possible migration target node for the service.
+    pub target_node: String,
+}
+
+/// A possible migration with a score.
+#[derive(Serialize, Deserialize)]
+#[serde(rename_all = "kebab-case")]
+pub struct ScoredMigration {
+    /// The possible migration.
+    pub migration: Migration,
+    /// The expected node imbalance after the migration.
+    pub imbalance: f64,
+}
+
+impl Ord for ScoredMigration {
+    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
+        self.imbalance.total_cmp(&other.imbalance).reverse()
+    }
+}
+
+impl PartialOrd for ScoredMigration {
+    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
+        Some(self.cmp(other))
+    }
+}
+
+impl PartialEq for ScoredMigration {
+    fn eq(&self, other: &Self) -> bool {
+        self.cmp(other) == std::cmp::Ordering::Equal
+    }
+}
+
+impl Eq for ScoredMigration {}
+
+/// A possible migration candidate.
+#[derive(Clone)]
+pub struct MigrationCandidate {
+    /// Service identifier of a standalone or leading service.
+    pub sid: String,
+    /// The current node of the service.
+    pub source_node: String,
+    /// The possible migration target node for the service.
+    pub target_node: String,
+    /// The current stats of the service.
+    pub stats: ServiceStats,
+}
+
+impl From<MigrationCandidate> for Migration {
+    fn from(candidate: MigrationCandidate) -> Self {
+        Migration {
+            sid: candidate.sid,
+            source_node: candidate.source_node,
+            target_node: candidate.target_node,
+        }
+    }
+}
+
 impl ClusterUsage {
     /// Instantiate cluster usage from node usages.
     pub fn from_nodes<I>(nodes: I) -> Self
@@ -90,6 +209,170 @@ impl ClusterUsage {
         }
     }
 
+    fn node_stats(&self) -> Vec<NodeStats> {
+        self.nodes.iter().map(|node| node.stats).collect()
+    }
+
+    /// Returns the individual node loads.
+    pub fn node_loads(&self) -> Vec<(String, f64)> {
+        self.nodes
+            .iter()
+            .map(|node| (node.name.to_string(), node.stats.load()))
+            .collect()
+    }
+
+    /// Returns the load imbalance among the nodes.
+    ///
+    /// See [`calculate_node_imbalance`] for more information.
+    pub fn node_imbalance(&self) -> f64 {
+        let node_stats = self.node_stats();
+
+        calculate_node_imbalance(&node_stats)
+    }
+
+    /// Returns the load imbalance among the nodes as if a specific service was moved.
+    ///
+    /// See [`calculate_node_imbalance`] for more information.
+    pub fn node_imbalance_with_migration(&self, migration: &MigrationCandidate) -> f64 {
+        let mut new_node_stats = Vec::with_capacity(self.nodes.len());
+
+        self.nodes.iter().for_each(|node| {
+            let mut new_stats = node.stats;
+
+            if node.name == migration.source_node {
+                new_stats.remove_running_service(&migration.stats);
+            } else if node.name == migration.target_node {
+                new_stats.add_running_service(&migration.stats);
+            }
+
+            new_node_stats.push(new_stats);
+        });
+
+        calculate_node_imbalance(&new_node_stats)
+    }
+
+    /// Score the service motions by the best node imbalance improvement with exhaustive search.
+    pub fn score_best_balancing_migrations<I>(
+        &self,
+        candidates: I,
+        limit: usize,
+    ) -> Result<Vec<ScoredMigration>, Error>
+    where
+        I: IntoIterator<Item = MigrationCandidate>,
+    {
+        let mut scored_migrations = candidates
+            .into_iter()
+            .map(|candidate| {
+                let imbalance = self.node_imbalance_with_migration(&candidate);
+
+                ScoredMigration {
+                    migration: candidate.into(),
+                    imbalance,
+                }
+            })
+            .collect::<BinaryHeap<_>>();
+
+        let mut best_alternatives = Vec::new();
+
+        // BinaryHeap::into_iter_sorted() is still in nightly unfortunately
+        while best_alternatives.len() < limit {
+            match scored_migrations.pop() {
+                Some(alternative) => best_alternatives.push(alternative),
+                None => break,
+            }
+        }
+
+        Ok(best_alternatives)
+    }
+
+    /// Select the service motion with the best node imbalance improvement with exhaustive search.
+    pub fn select_best_balancing_migration<I>(
+        &self,
+        candidates: I,
+    ) -> Result<Option<ScoredMigration>, Error>
+    where
+        I: IntoIterator<Item = MigrationCandidate>,
+    {
+        let migrations = self.score_best_balancing_migrations(candidates, 1)?;
+
+        Ok(migrations.into_iter().next())
+    }
+
+    /// Score the service motions by the best node imbalance improvement with the TOPSIS method.
+    pub fn score_best_balancing_migrations_topsis(
+        &self,
+        candidates: &[MigrationCandidate],
+        limit: usize,
+    ) -> Result<Vec<ScoredMigration>, Error> {
+        let len = self.nodes.len();
+
+        let matrix = candidates
+            .iter()
+            .map(|migration| {
+                let mut highest_cpu = 0.0;
+                let mut squares_cpu = 0.0;
+                let mut highest_mem = 0.0;
+                let mut squares_mem = 0.0;
+
+                let service = &migration.stats;
+                let source_node = &migration.source_node;
+                let target_node = &migration.target_node;
+
+                for node in self.nodes.iter() {
+                    let mut new_stats = node.stats;
+
+                    if &node.name == source_node {
+                        new_stats.remove_running_service(service);
+                    } else if &node.name == target_node {
+                        new_stats.add_running_service(service);
+                    }
+
+                    let new_cpu_load = new_stats.cpu_load();
+                    highest_cpu = f64::max(highest_cpu, new_cpu_load);
+                    squares_cpu += new_cpu_load.powi(2);
+
+                    let new_mem_load = new_stats.mem_load();
+                    highest_mem = f64::max(highest_mem, new_mem_load);
+                    squares_mem += new_mem_load.powi(2);
+                }
+
+                PveTopsisAlternative {
+                    average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
+                    highest_cpu: 1.0 + highest_cpu,
+                    average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
+                    highest_memory: 1.0 + highest_mem,
+                }
+                .into()
+            })
+            .collect::<Vec<_>>();
+
+        let best_alternatives =
+            topsis::rank_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
+
+        Ok(best_alternatives
+            .into_iter()
+            .take(limit)
+            .map(|i| {
+                let imbalance = self.node_imbalance_with_migration(&candidates[i]);
+
+                ScoredMigration {
+                    migration: candidates[i].clone().into(),
+                    imbalance,
+                }
+            })
+            .collect())
+    }
+
+    /// Select the service motion with the best node imbalance improvement with the TOPSIS search.
+    pub fn select_best_balancing_migration_topsis(
+        &self,
+        candidates: &[MigrationCandidate],
+    ) -> Result<Option<ScoredMigration>, Error> {
+        let migrations = self.score_best_balancing_migrations_topsis(candidates, 1)?;
+
+        Ok(migrations.into_iter().next())
+    }
+
     /// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
     /// and CPU usages of the nodes as if the service would already be running on each.
     ///
-- 
2.47.3






* [RFC proxmox 5/5] resource-scheduling: implement Add and Default for {Dynamic,Static}ServiceStats
From: Daniel Kral @ 2026-02-17 14:13 UTC (permalink / raw)
  To: pve-devel

This allows for more elegant code to aggregate multiple
{Dynamic,Static}ServiceStats, e.g., for building service bundles.
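
For example, aggregating the stats of a positive resource affinity
bundle then becomes (variable names illustrative):

    let bundle_stats = stats_a + stats_b + stats_c;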

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 proxmox-resource-scheduling/src/pve_dynamic.rs | 17 ++++++++++++++++-
 proxmox-resource-scheduling/src/pve_static.rs  | 15 ++++++++++++++-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/proxmox-resource-scheduling/src/pve_dynamic.rs b/proxmox-resource-scheduling/src/pve_dynamic.rs
index 4f480612..21a15e81 100644
--- a/proxmox-resource-scheduling/src/pve_dynamic.rs
+++ b/proxmox-resource-scheduling/src/pve_dynamic.rs
@@ -1,8 +1,10 @@
 use serde::{Deserialize, Serialize};
 
+use std::ops::Add;
+
 use crate::scheduler::{NodeStats, ServiceStats};
 
-#[derive(Clone, Copy, Serialize, Deserialize)]
+#[derive(Clone, Copy, Default, Serialize, Deserialize)]
 #[serde(rename_all = "kebab-case")]
 /// Dynamic usage stats of a node.
 pub struct DynamicNodeStats {
@@ -51,3 +53,16 @@ impl From<DynamicServiceStats> for ServiceStats {
         }
     }
 }
+
+impl Add for DynamicServiceStats {
+    type Output = Self;
+
+    fn add(self, rhs: Self) -> Self {
+        Self {
+            cpu: self.cpu + rhs.cpu,
+            maxcpu: self.maxcpu + rhs.maxcpu,
+            mem: self.mem + rhs.mem,
+            maxmem: self.maxmem + rhs.maxmem,
+        }
+    }
+}
diff --git a/proxmox-resource-scheduling/src/pve_static.rs b/proxmox-resource-scheduling/src/pve_static.rs
index b269c44f..c96da100 100644
--- a/proxmox-resource-scheduling/src/pve_static.rs
+++ b/proxmox-resource-scheduling/src/pve_static.rs
@@ -1,6 +1,8 @@
 use anyhow::Error;
 use serde::{Deserialize, Serialize};
 
+use std::ops::Add;
+
 use crate::scheduler::{ClusterUsage, NodeStats, NodeUsage, ServiceStats};
 
 #[derive(Clone, Serialize, Deserialize)]
@@ -59,7 +61,7 @@ fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
     }
 }
 
-#[derive(Clone, Copy, Serialize, Deserialize)]
+#[derive(Clone, Copy, Default, Serialize, Deserialize)]
 #[serde(rename_all = "kebab-case")]
 /// Static usage information of an HA resource.
 pub struct StaticServiceUsage {
@@ -80,6 +82,17 @@ impl From<StaticServiceUsage> for ServiceStats {
     }
 }
 
+impl Add for StaticServiceUsage {
+    type Output = Self;
+
+    fn add(self, rhs: Self) -> Self::Output {
+        Self {
+            maxcpu: self.maxcpu + rhs.maxcpu,
+            maxmem: self.maxmem + rhs.maxmem,
+        }
+    }
+}
+
 /// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
 /// and CPU usages of the nodes as if the service would already be running on each.
 ///
-- 
2.47.3

* [RFC perl-rs 1/6] pve-rs: resource scheduling: use generic cluster usage implementation
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (4 preceding siblings ...)
  2026-02-17 14:13 ` [RFC proxmox 5/5] resource-scheduling: implement Add and Default for {Dynamic,Static}ServiceStats Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC perl-rs 2/6] pve-rs: resource scheduling: create service_nodes hashset from array Daniel Kral
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
In general, in a v2 or later series this could be generalized for the
static and (upcoming) dynamic use cases and moved to
proxmox-resource-scheduling as well, to keep the perlmod bindings as
thin as possible (allowing flexibility in the internals without
introducing unnecessary build breaks).

 .../bindings/resource_scheduling_static.rs    | 41 +++++++++++--------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling_static.rs
index 5b91d36..a51b8a2 100644
--- a/pve-rs/src/bindings/resource_scheduling_static.rs
+++ b/pve-rs/src/bindings/resource_scheduling_static.rs
@@ -13,6 +13,7 @@ pub mod pve_rs_resource_scheduling_static {
 
     use perlmod::Value;
     use proxmox_resource_scheduling::pve_static::{StaticNodeUsage, StaticServiceUsage};
+    use proxmox_resource_scheduling::scheduler::ClusterUsage;
 
     perlmod::declare_magic!(Box<Scheduler> : &Scheduler as "PVE::RS::ResourceScheduling::Static");
 
@@ -175,21 +176,7 @@ pub mod pve_rs_resource_scheduling_static {
         Ok(())
     }
 
-    /// Scores all previously added nodes for starting a `service` on.
-    ///
-    /// Scoring is done according to the static memory and CPU usages of the nodes as if the
-    /// service would already be running on each.
-    ///
-    /// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher
-    /// score is better.
-    ///
-    /// See [`proxmox_resource_scheduling::pve_static::score_nodes_to_start_service`].
-    #[export]
-    pub fn score_nodes_to_start_service(
-        #[try_from_ref] this: &Scheduler,
-        service: StaticServiceUsage,
-    ) -> Result<Vec<(String, f64)>, Error> {
-        let usage = this.inner.lock().unwrap();
+    fn as_cluster_usage(usage: &Usage) -> ClusterUsage {
         let nodes = usage
             .nodes
             .values()
@@ -208,8 +195,28 @@ pub mod pve_rs_resource_scheduling_static {
 
                 node_usage
             })
-            .collect::<Vec<StaticNodeUsage>>();
+            .collect::<Vec<_>>();
 
-        proxmox_resource_scheduling::pve_static::score_nodes_to_start_service(&nodes, &service)
+        ClusterUsage::from_nodes(nodes)
+    }
+
+    /// Scores all previously added nodes for starting a `service` on.
+    ///
+    /// Scoring is done according to the static memory and CPU usages of the nodes as if the
+    /// service would already be running on each.
+    ///
+    /// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher
+    /// score is better.
+    ///
+    /// See [`proxmox_resource_scheduling::pve_static::score_nodes_to_start_service`].
+    #[export]
+    pub fn score_nodes_to_start_service(
+        #[try_from_ref] this: &Scheduler,
+        service: StaticServiceUsage,
+    ) -> Result<Vec<(String, f64)>, Error> {
+        let usage = this.inner.lock().unwrap();
+        let cluster_usage = as_cluster_usage(&usage);
+
+        cluster_usage.score_nodes_to_start_service(service)
     }
 }
-- 
2.47.3

* [RFC perl-rs 2/6] pve-rs: resource scheduling: create service_nodes hashset from array
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (5 preceding siblings ...)
  2026-02-17 14:14 ` [RFC perl-rs 1/6] pve-rs: resource scheduling: use generic cluster usage implementation Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC perl-rs 3/6] pve-rs: resource scheduling: store service stats independently of node Daniel Kral
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 pve-rs/src/bindings/resource_scheduling_static.rs | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling_static.rs
index a51b8a2..4abf742 100644
--- a/pve-rs/src/bindings/resource_scheduling_static.rs
+++ b/pve-rs/src/bindings/resource_scheduling_static.rs
@@ -145,8 +145,7 @@ pub mod pve_rs_resource_scheduling_static {
 
             service_nodes.insert(nodename.to_string());
         } else {
-            let mut service_nodes = HashSet::new();
-            service_nodes.insert(nodename.to_string());
+            let service_nodes = HashSet::from([nodename.to_string()]);
             usage.service_nodes.insert(sid.to_string(), service_nodes);
         }
 
-- 
2.47.3

* [RFC perl-rs 3/6] pve-rs: resource scheduling: store service stats independently of node
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (6 preceding siblings ...)
  2026-02-17 14:14 ` [RFC perl-rs 2/6] pve-rs: resource scheduling: create service_nodes hashset from array Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC perl-rs 4/6] pve-rs: resource scheduling: expose auto rebalancing methods Daniel Kral
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

The static service stats are currently only stored in the services
HashMap of the StaticNodeInfo struct, but for an upcoming patch these
stats need to be retrieved independently of the node they are on.
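
For illustration, a sketch of the lookup this enables (using the names
from the diff below; `usage` and `sid` are assumed to be in scope):

    // Service stats and placement are now reachable by sid alone,
    // without first knowing which node the service runs on.
    if let Some(service) = usage.services.get(sid) {
        let stats = service.stats;   // StaticServiceUsage
        let nodes = &service.nodes;  // HashSet<String> of assigned nodes
    }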

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .../bindings/resource_scheduling_static.rs    | 35 ++++++++++++-------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling_static.rs
index 4abf742..3764aaa 100644
--- a/pve-rs/src/bindings/resource_scheduling_static.rs
+++ b/pve-rs/src/bindings/resource_scheduling_static.rs
@@ -24,9 +24,14 @@ pub mod pve_rs_resource_scheduling_static {
         services: HashMap<String, StaticServiceUsage>,
     }
 
+    struct StaticServiceInfo {
+        stats: StaticServiceUsage,
+        nodes: HashSet<String>,
+    }
+
     struct Usage {
         nodes: HashMap<String, StaticNodeInfo>,
-        service_nodes: HashMap<String, HashSet<String>>,
+        services: HashMap<String, StaticServiceInfo>,
     }
 
     /// A scheduler instance contains the resource usage by node.
@@ -39,7 +44,7 @@ pub mod pve_rs_resource_scheduling_static {
     pub fn new(#[raw] class: Value) -> Result<Value, Error> {
         let inner = Usage {
             nodes: HashMap::new(),
-            service_nodes: HashMap::new(),
+            services: HashMap::new(),
         };
 
         Ok(perlmod::instantiate_magic!(
@@ -81,12 +86,12 @@ pub mod pve_rs_resource_scheduling_static {
 
         if let Some(node) = usage.nodes.remove(nodename) {
             for (sid, _) in node.services.iter() {
-                match usage.service_nodes.get_mut(sid) {
-                    Some(service_nodes) => {
-                        service_nodes.remove(nodename);
+                match usage.services.get_mut(sid) {
+                    Some(service) => {
+                        service.nodes.remove(nodename);
                     }
                     None => bail!(
-                        "service '{}' not present in service_nodes hashmap while removing node '{}'",
+                        "service '{}' not present in services hashmap while removing node '{}'",
                         sid,
                         nodename
                     ),
@@ -138,15 +143,19 @@ pub mod pve_rs_resource_scheduling_static {
             None => bail!("node '{}' not present in usage hashmap", nodename),
         }
 
-        if let Some(service_nodes) = usage.service_nodes.get_mut(sid) {
-            if service_nodes.contains(nodename) {
+        if let Some(service) = usage.services.get_mut(sid) {
+            if service.nodes.contains(nodename) {
                 bail!("node '{}' already added to service '{}'", nodename, sid);
             }
 
-            service_nodes.insert(nodename.to_string());
+            service.nodes.insert(nodename.to_string());
         } else {
-            let service_nodes = HashSet::from([nodename.to_string()]);
-            usage.service_nodes.insert(sid.to_string(), service_nodes);
+            let service = StaticServiceInfo {
+                stats: service_usage,
+                nodes: HashSet::from([nodename.to_string()]),
+            };
+
+            usage.services.insert(sid.to_string(), service);
         }
 
         Ok(())
@@ -157,8 +166,8 @@ pub mod pve_rs_resource_scheduling_static {
     fn remove_service_usage(#[try_from_ref] this: &Scheduler, sid: &str) -> Result<(), Error> {
         let mut usage = this.inner.lock().unwrap();
 
-        if let Some(nodes) = usage.service_nodes.remove(sid) {
-            for nodename in &nodes {
+        if let Some(service) = usage.services.remove(sid) {
+            for nodename in &service.nodes {
                 match usage.nodes.get_mut(nodename) {
                     Some(node) => {
                         node.services.remove(sid);
-- 
2.47.3

* [RFC perl-rs 4/6] pve-rs: resource scheduling: expose auto rebalancing methods
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (7 preceding siblings ...)
  2026-02-17 14:14 ` [RFC perl-rs 3/6] pve-rs: resource scheduling: store service stats independently of node Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC perl-rs 5/6] pve-rs: resource scheduling: move pve_static into resource_scheduling module Daniel Kral
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

In the current implementation, the caller of
{score,select}_best_balancing_migration provides the migration
candidates, which are unpacked with generate_migration_candidates_from().
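
For illustration, a hypothetical candidate (sids and node names are
invented) as it arrives after deserialization: the leader, the services
in its bundle, and the possible target nodes:

    let candidate = CompactMigrationCandidate {
        leader: "vm:100".to_string(),
        services: vec!["vm:100".to_string(), "vm:101".to_string()],
        nodes: vec!["node2".to_string(), "node3".to_string()],
    };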

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 .../bindings/resource_scheduling_static.rs    | 150 +++++++++++++++++-
 1 file changed, 148 insertions(+), 2 deletions(-)

diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling_static.rs
index 3764aaa..84a5497 100644
--- a/pve-rs/src/bindings/resource_scheduling_static.rs
+++ b/pve-rs/src/bindings/resource_scheduling_static.rs
@@ -9,11 +9,15 @@ pub mod pve_rs_resource_scheduling_static {
     use std::collections::{HashMap, HashSet};
     use std::sync::Mutex;
 
-    use anyhow::{Error, bail};
+    use serde::{Deserialize, Serialize};
+
+    use anyhow::{Context, Error, bail};
 
     use perlmod::Value;
     use proxmox_resource_scheduling::pve_static::{StaticNodeUsage, StaticServiceUsage};
-    use proxmox_resource_scheduling::scheduler::ClusterUsage;
+    use proxmox_resource_scheduling::scheduler::{
+        ClusterUsage, MigrationCandidate, ScoredMigration,
+    };
 
     perlmod::declare_magic!(Box<Scheduler> : &Scheduler as "PVE::RS::ResourceScheduling::Static");
 
@@ -208,6 +212,148 @@ pub mod pve_rs_resource_scheduling_static {
         ClusterUsage::from_nodes(nodes)
     }
 
+    /// Method: Calculates the loads for each node.
+    #[export]
+    pub fn calculate_node_loads(#[try_from_ref] this: &Scheduler) -> Vec<(String, f64)> {
+        let usage = this.inner.lock().unwrap();
+        let cluster_usage = as_cluster_usage(&usage);
+
+        cluster_usage.node_loads()
+    }
+
+    /// Method: Calculates the imbalance among the nodes.
+    #[export]
+    pub fn calculate_node_imbalance(#[try_from_ref] this: &Scheduler) -> f64 {
+        let usage = this.inner.lock().unwrap();
+        let cluster_usage = as_cluster_usage(&usage);
+
+        cluster_usage.node_imbalance()
+    }
+
+    /// A compact representation of MigrationCandidate.
+    #[derive(Serialize, Deserialize)]
+    pub struct CompactMigrationCandidate {
+        /// The identifier of the leading service.
+        pub leader: String,
+        /// The services which are part of the leading service's bundle.
+        pub services: Vec<String>,
+        /// The nodes the services can be migrated to.
+        pub nodes: Vec<String>,
+    }
+
+    fn generate_migration_candidates_from(
+        usage: &Usage,
+        candidates: Vec<CompactMigrationCandidate>,
+    ) -> Result<Vec<MigrationCandidate>, Error> {
+        let mut migration_candidates = Vec::new();
+
+        for candidate in candidates.into_iter() {
+            let leader_sid = candidate.leader;
+            let leader = usage.services.get(&leader_sid).with_context(|| {
+                format!(
+                    "leader {} is not present in services usage hashmap",
+                    leader_sid
+                )
+            })?;
+            let source_node = leader.nodes.iter().next().unwrap();
+
+            let mut service_candidates = Vec::new();
+
+            for sid in candidate.services.iter() {
+                let service = usage
+                    .services
+                    .get(sid)
+                    .with_context(|| format!("service {} is not present in usage hashmap", sid))?;
+                let service_nodes = &service.nodes;
+
+                if service_nodes.len() > 1 {
+                    bail!("service {sid} is on multiple nodes");
+                }
+
+                if !service_nodes.contains(source_node) {
+                    bail!("service {sid} is not on common source node {source_node}");
+                }
+
+                service_candidates.push(service);
+            }
+
+            let bundle_stats = service_candidates
+                .into_iter()
+                .fold(StaticServiceUsage::default(), |total_stats, service| {
+                    total_stats + service.stats
+                });
+
+            for target_node in candidate.nodes.into_iter() {
+                migration_candidates.push(MigrationCandidate {
+                    sid: leader_sid.to_string(),
+                    source_node: source_node.to_string(),
+                    target_node,
+                    stats: bundle_stats.into(),
+                });
+            }
+        }
+
+        Ok(migration_candidates)
+    }
+
+    /// Method: Score the service motions by the best node imbalance improvement with exhaustive search.
+    #[export]
+    pub fn score_best_balancing_migrations(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+        limit: usize,
+    ) -> Result<Vec<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.score_best_balancing_migrations(candidates, limit)
+    }
+
+    /// Method: Select the service motion with the best node imbalance improvement with exhaustive search.
+    #[export]
+    pub fn select_best_balancing_migration(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+    ) -> Result<Option<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.select_best_balancing_migration(candidates)
+    }
+
+    /// Method: Score the service motions by the best node imbalance improvement with the TOPSIS method.
+    #[export]
+    pub fn score_best_balancing_migrations_topsis(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+        limit: usize,
+    ) -> Result<Vec<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.score_best_balancing_migrations_topsis(&candidates, limit)
+    }
+
+    /// Method: Select the service motion with the best node imbalance improvement with the TOPSIS method.
+    #[export]
+    pub fn select_best_balancing_migration_topsis(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+    ) -> Result<Option<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.select_best_balancing_migration_topsis(&candidates)
+    }
+
     /// Scores all previously added nodes for starting a `service` on.
     ///
     /// Scoring is done according to the static memory and CPU usages of the nodes as if the
-- 
2.47.3

* [RFC perl-rs 5/6] pve-rs: resource scheduling: move pve_static into resource_scheduling module
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (8 preceding siblings ...)
  2026-02-17 14:14 ` [RFC perl-rs 4/6] pve-rs: resource scheduling: expose auto rebalancing methods Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC perl-rs 6/6] pve-rs: resource scheduling: implement pve_dynamic bindings Daniel Kral
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 pve-rs/src/bindings/mod.rs                                    | 3 +--
 pve-rs/src/bindings/resource_scheduling/mod.rs                | 4 ++++
 .../pve_static.rs}                                            | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)
 create mode 100644 pve-rs/src/bindings/resource_scheduling/mod.rs
 rename pve-rs/src/bindings/{resource_scheduling_static.rs => resource_scheduling/pve_static.rs} (99%)

diff --git a/pve-rs/src/bindings/mod.rs b/pve-rs/src/bindings/mod.rs
index c21b328..853a3dd 100644
--- a/pve-rs/src/bindings/mod.rs
+++ b/pve-rs/src/bindings/mod.rs
@@ -3,8 +3,7 @@
 mod oci;
 pub use oci::pve_rs_oci;
 
-mod resource_scheduling_static;
-pub use resource_scheduling_static::pve_rs_resource_scheduling_static;
+pub mod resource_scheduling;
 
 mod tfa;
 pub use tfa::pve_rs_tfa;
diff --git a/pve-rs/src/bindings/resource_scheduling/mod.rs b/pve-rs/src/bindings/resource_scheduling/mod.rs
new file mode 100644
index 0000000..af1fb6b
--- /dev/null
+++ b/pve-rs/src/bindings/resource_scheduling/mod.rs
@@ -0,0 +1,4 @@
+//! Resource scheduling related bindings.
+
+mod pve_static;
+pub use pve_static::pve_rs_resource_scheduling_static;
diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling/pve_static.rs
similarity index 99%
rename from pve-rs/src/bindings/resource_scheduling_static.rs
rename to pve-rs/src/bindings/resource_scheduling/pve_static.rs
index 84a5497..44ea851 100644
--- a/pve-rs/src/bindings/resource_scheduling_static.rs
+++ b/pve-rs/src/bindings/resource_scheduling/pve_static.rs
@@ -2,7 +2,7 @@
 pub mod pve_rs_resource_scheduling_static {
     //! The `PVE::RS::ResourceScheduling::Static` package.
     //!
-    //! Provides bindings for the resource scheduling module.
+    //! Provides bindings for the static resource scheduling module.
     //!
     //! See [`proxmox_resource_scheduling`].
 
-- 
2.47.3

* [RFC perl-rs 6/6] pve-rs: resource scheduling: implement pve_dynamic bindings
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (9 preceding siblings ...)
  2026-02-17 14:14 ` [RFC perl-rs 5/6] pve-rs: resource scheduling: move pve_static into resource_scheduling module Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC cluster 1/2] datacenter config: add dynamic load scheduler option Daniel Kral
                   ` (24 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

The implementation is similar to pve_static, but extends the node and
service stats with the current dynamic resource usages. Additionally,
the node usage is derived from the node itself instead of from the sum
of the service stats assigned to the node.

The CompactMigrationCandidate struct is shared between the pve_static
and pve_dynamic implementations.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 pve-rs/Makefile                               |   1 +
 .../src/bindings/resource_scheduling/mod.rs   |  16 +
 .../resource_scheduling/pve_dynamic.rs        | 349 ++++++++++++++++++
 .../resource_scheduling/pve_static.rs         |  15 +-
 pve-rs/test/resource_scheduling.pl            |   1 +
 5 files changed, 369 insertions(+), 13 deletions(-)
 create mode 100644 pve-rs/src/bindings/resource_scheduling/pve_dynamic.rs

diff --git a/pve-rs/Makefile b/pve-rs/Makefile
index aa7181e..19695a1 100644
--- a/pve-rs/Makefile
+++ b/pve-rs/Makefile
@@ -30,6 +30,7 @@ PERLMOD_PACKAGES := \
 	  PVE::RS::OCI \
 	  PVE::RS::OpenId \
 	  PVE::RS::ResourceScheduling::Static \
+	  PVE::RS::ResourceScheduling::Dynamic \
 	  PVE::RS::SDN::Fabrics \
 	  PVE::RS::TFA
 
diff --git a/pve-rs/src/bindings/resource_scheduling/mod.rs b/pve-rs/src/bindings/resource_scheduling/mod.rs
index af1fb6b..ff8d94b 100644
--- a/pve-rs/src/bindings/resource_scheduling/mod.rs
+++ b/pve-rs/src/bindings/resource_scheduling/mod.rs
@@ -2,3 +2,19 @@
 
 mod pve_static;
 pub use pve_static::pve_rs_resource_scheduling_static;
+
+mod pve_dynamic;
+pub use pve_dynamic::pve_rs_resource_scheduling_dynamic;
+
+use serde::{Deserialize, Serialize};
+
+/// A compact representation of MigrationCandidate.
+#[derive(Serialize, Deserialize)]
+pub struct CompactMigrationCandidate {
+    /// The identifier of the leading service.
+    pub leader: String,
+    /// The services which are part of the leading service's bundle.
+    pub services: Vec<String>,
+    /// The nodes the services can be migrated to.
+    pub nodes: Vec<String>,
+}
diff --git a/pve-rs/src/bindings/resource_scheduling/pve_dynamic.rs b/pve-rs/src/bindings/resource_scheduling/pve_dynamic.rs
new file mode 100644
index 0000000..7c8ffce
--- /dev/null
+++ b/pve-rs/src/bindings/resource_scheduling/pve_dynamic.rs
@@ -0,0 +1,349 @@
+#[perlmod::package(name = "PVE::RS::ResourceScheduling::Dynamic", lib = "pve_rs")]
+pub mod pve_rs_resource_scheduling_dynamic {
+    //! The `PVE::RS::ResourceScheduling::Dynamic` package.
+    //!
+    //! Provides bindings for the dynamic resource scheduling module.
+    //!
+    //! See [`proxmox_resource_scheduling`].
+
+    use std::collections::{HashMap, HashSet};
+    use std::sync::Mutex;
+
+    use anyhow::{Context, Error, bail};
+
+    use perlmod::Value;
+    use proxmox_resource_scheduling::pve_dynamic::{DynamicNodeStats, DynamicServiceStats};
+    use proxmox_resource_scheduling::scheduler::{
+        ClusterUsage, MigrationCandidate, NodeUsage, ScoredMigration,
+    };
+
+    use crate::bindings::resource_scheduling::CompactMigrationCandidate;
+
+    perlmod::declare_magic!(Box<Scheduler> : &Scheduler as "PVE::RS::ResourceScheduling::Dynamic");
+
+    struct DynamicNodeInfo {
+        stats: DynamicNodeStats,
+        services: HashSet<String>,
+    }
+
+    struct DynamicServiceInfo {
+        stats: DynamicServiceStats,
+        nodes: HashSet<String>,
+    }
+
+    struct Usage {
+        nodes: HashMap<String, DynamicNodeInfo>,
+        services: HashMap<String, DynamicServiceInfo>,
+    }
+
+    /// A scheduler instance contains the resource usage by node.
+    pub struct Scheduler {
+        inner: Mutex<Usage>,
+    }
+
+    /// Class method: Create a new [`Scheduler`] instance.
+    #[export(raw_return)]
+    pub fn new(#[raw] class: Value) -> Result<Value, Error> {
+        let inner = Usage {
+            nodes: HashMap::new(),
+            services: HashMap::new(),
+        };
+
+        Ok(perlmod::instantiate_magic!(
+            &class, MAGIC => Box::new(Scheduler { inner: Mutex::new(inner) })
+        ))
+    }
+
+    /// Method: Add a node with its basic CPU and memory info.
+    ///
+    /// This inserts a [`DynamicNodeInfo`] entry for the node into the scheduler instance.
+    #[export]
+    pub fn add_node(
+        #[try_from_ref] this: &Scheduler,
+        nodename: String,
+        stats: DynamicNodeStats,
+    ) -> Result<(), Error> {
+        let mut usage = this.inner.lock().unwrap();
+
+        if usage.nodes.contains_key(&nodename) {
+            bail!("node {} already added", nodename);
+        }
+
+        let node = DynamicNodeInfo {
+            stats,
+            services: HashSet::new(),
+        };
+
+        usage.nodes.insert(nodename, node);
+        Ok(())
+    }
+
+    /// Method: Remove a node from the scheduler.
+    #[export]
+    pub fn remove_node(#[try_from_ref] this: &Scheduler, nodename: &str) -> Result<(), Error> {
+        let mut usage = this.inner.lock().unwrap();
+
+        if let Some(node) = usage.nodes.remove(nodename) {
+            for sid in node.services.iter() {
+                match usage.services.get_mut(sid) {
+                    Some(service) => {
+                        service.nodes.remove(nodename);
+                    }
+                    None => bail!(
+                        "service '{}' not present in services hashmap while removing node '{}'",
+                        sid,
+                        nodename
+                    ),
+                }
+            }
+        }
+
+        Ok(())
+    }
+
+    /// Method: Get a list of all the nodes in the scheduler.
+    #[export]
+    pub fn list_nodes(#[try_from_ref] this: &Scheduler) -> Vec<String> {
+        let usage = this.inner.lock().unwrap();
+
+        usage
+            .nodes
+            .keys()
+            .map(|nodename| nodename.to_string())
+            .collect()
+    }
+
+    /// Method: Check whether a node exists in the scheduler.
+    #[export]
+    pub fn contains_node(#[try_from_ref] this: &Scheduler, nodename: &str) -> bool {
+        let usage = this.inner.lock().unwrap();
+
+        usage.nodes.contains_key(nodename)
+    }
+
+    /// Method: Add service `sid` and its `service_usage` to the node.
+    #[export]
+    pub fn add_service_usage_to_node(
+        #[try_from_ref] this: &Scheduler,
+        nodename: &str,
+        sid: &str,
+        service_usage: DynamicServiceStats,
+    ) -> Result<(), Error> {
+        let mut usage = this.inner.lock().unwrap();
+
+        match usage.nodes.get_mut(nodename) {
+            Some(node) => {
+                if node.services.contains(sid) {
+                    bail!("service '{}' already added to node '{}'", sid, nodename);
+                }
+
+                node.services.insert(sid.to_string());
+            }
+            None => bail!("node '{}' not present in usage hashmap", nodename),
+        }
+
+        if let Some(service) = usage.services.get_mut(sid) {
+            if service.nodes.contains(nodename) {
+                bail!("node '{}' already added to service '{}'", nodename, sid);
+            }
+
+            service.nodes.insert(nodename.to_string());
+        } else {
+            let service = DynamicServiceInfo {
+                stats: service_usage,
+                nodes: HashSet::from([nodename.to_string()]),
+            };
+
+            usage.services.insert(sid.to_string(), service);
+        }
+
+        Ok(())
+    }
+
+    /// Method: Remove service `sid` and its usage from all assigned nodes.
+    #[export]
+    fn remove_service_usage(#[try_from_ref] this: &Scheduler, sid: &str) -> Result<(), Error> {
+        let mut usage = this.inner.lock().unwrap();
+
+        if let Some(service) = usage.services.remove(sid) {
+            for nodename in &service.nodes {
+                match usage.nodes.get_mut(nodename) {
+                    Some(node) => {
+                        node.services.remove(sid);
+                    }
+                    None => bail!(
+                        "service '{}' not present in usage hashmap on node '{}'",
+                        sid,
+                        nodename
+                    ),
+                }
+            }
+        }
+
+        Ok(())
+    }
+
+    fn as_cluster_usage(usage: &Usage) -> ClusterUsage {
+        let nodes = usage
+            .nodes
+            .iter()
+            .map(|(nodename, node)| NodeUsage {
+                name: nodename.to_string(),
+                stats: node.stats.into(),
+            })
+            .collect::<Vec<_>>();
+
+        ClusterUsage::from_nodes(nodes)
+    }
+
+    /// Method: Calculates the loads for each node.
+    #[export]
+    pub fn calculate_node_loads(#[try_from_ref] this: &Scheduler) -> Vec<(String, f64)> {
+        let usage = this.inner.lock().unwrap();
+        let cluster_usage = as_cluster_usage(&usage);
+
+        cluster_usage.node_loads()
+    }
+
+    /// Method: Calculates the imbalance among the nodes.
+    #[export]
+    pub fn calculate_node_imbalance(#[try_from_ref] this: &Scheduler) -> f64 {
+        let usage = this.inner.lock().unwrap();
+        let cluster_usage = as_cluster_usage(&usage);
+
+        cluster_usage.node_imbalance()
+    }
+
+    fn generate_migration_candidates_from(
+        usage: &Usage,
+        candidates: Vec<CompactMigrationCandidate>,
+    ) -> Result<Vec<MigrationCandidate>, Error> {
+        let mut migration_candidates = Vec::new();
+
+        for candidate in candidates.into_iter() {
+            let leader_sid = candidate.leader;
+            let leader = usage.services.get(&leader_sid).with_context(|| {
+                format!(
+                    "leader {} is not present in services usage hashmap",
+                    leader_sid
+                )
+            })?;
+            let source_node = leader.nodes.iter().next().unwrap();
+
+            let mut service_candidates = Vec::new();
+
+            for sid in candidate.services.iter() {
+                let service = usage
+                    .services
+                    .get(sid)
+                    .with_context(|| format!("service {} is not present in usage hashmap", sid))?;
+                let service_nodes = &service.nodes;
+
+                if service_nodes.len() > 1 {
+                    bail!("service {sid} is on multiple nodes");
+                }
+
+                if !service_nodes.contains(source_node) {
+                    bail!("service {sid} is not on common source node {source_node}");
+                }
+
+                service_candidates.push(service);
+            }
+
+            let bundle_stats = service_candidates
+                .into_iter()
+                .fold(DynamicServiceStats::default(), |total_stats, service| {
+                    total_stats + service.stats
+                });
+
+            for target_node in candidate.nodes.into_iter() {
+                migration_candidates.push(MigrationCandidate {
+                    sid: leader_sid.to_string(),
+                    source_node: source_node.to_string(),
+                    target_node,
+                    stats: bundle_stats.into(),
+                });
+            }
+        }
+
+        Ok(migration_candidates)
+    }
+
+    /// Method: Score the service motions by the best node imbalance improvement with exhaustive search.
+    #[export]
+    pub fn score_best_balancing_migrations(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+        limit: usize,
+    ) -> Result<Vec<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.score_best_balancing_migrations(candidates, limit)
+    }
+
+    /// Method: Select the service motion with the best node imbalance improvement with exhaustive search.
+    #[export]
+    pub fn select_best_balancing_migration(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+    ) -> Result<Option<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.select_best_balancing_migration(candidates)
+    }
+
+    /// Method: Score the service motions by the best node imbalance improvement with the TOPSIS method.
+    #[export]
+    pub fn score_best_balancing_migrations_topsis(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+        limit: usize,
+    ) -> Result<Vec<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.score_best_balancing_migrations_topsis(&candidates, limit)
+    }
+
+    /// Method: Select the service motion with the best node imbalance improvement with the TOPSIS method.
+    #[export]
+    pub fn select_best_balancing_migration_topsis(
+        #[try_from_ref] this: &Scheduler,
+        candidates: Vec<CompactMigrationCandidate>,
+    ) -> Result<Option<ScoredMigration>, Error> {
+        let usage = this.inner.lock().unwrap();
+
+        let cluster_usage = as_cluster_usage(&usage);
+        let candidates = generate_migration_candidates_from(&usage, candidates)?;
+
+        cluster_usage.select_best_balancing_migration_topsis(&candidates)
+    }
+
+    /// Scores all previously added nodes for starting a `service` on.
+    ///
+    /// Scoring is done according to the dynamic memory and CPU usages of the nodes as if the
+    /// service would already be running on each.
+    ///
+    /// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher
+    /// score is better.
+    ///
+    /// See [`proxmox_resource_scheduling::pve_dynamic::score_nodes_to_start_service`].
+    #[export]
+    pub fn score_nodes_to_start_service(
+        #[try_from_ref] this: &Scheduler,
+        service: DynamicServiceStats,
+    ) -> Result<Vec<(String, f64)>, Error> {
+        let usage = this.inner.lock().unwrap();
+        let cluster_usage = as_cluster_usage(&usage);
+
+        cluster_usage.score_nodes_to_start_service(service)
+    }
+}
diff --git a/pve-rs/src/bindings/resource_scheduling/pve_static.rs b/pve-rs/src/bindings/resource_scheduling/pve_static.rs
index 44ea851..f7440d2 100644
--- a/pve-rs/src/bindings/resource_scheduling/pve_static.rs
+++ b/pve-rs/src/bindings/resource_scheduling/pve_static.rs
@@ -9,8 +9,6 @@ pub mod pve_rs_resource_scheduling_static {
     use std::collections::{HashMap, HashSet};
     use std::sync::Mutex;
 
-    use serde::{Deserialize, Serialize};
-
     use anyhow::{Context, Error, bail};
 
     use perlmod::Value;
@@ -19,6 +17,8 @@ pub mod pve_rs_resource_scheduling_static {
         ClusterUsage, MigrationCandidate, ScoredMigration,
     };
 
+    use crate::bindings::resource_scheduling::CompactMigrationCandidate;
+
     perlmod::declare_magic!(Box<Scheduler> : &Scheduler as "PVE::RS::ResourceScheduling::Static");
 
     struct StaticNodeInfo {
@@ -230,17 +230,6 @@ pub mod pve_rs_resource_scheduling_static {
         cluster_usage.node_imbalance()
     }
 
-    /// A compact representation of MigrationCandidate.
-    #[derive(Serialize, Deserialize)]
-    pub struct CompactMigrationCandidate {
-        /// The identifier of the leading service.
-        pub leader: String,
-        /// The services which are part of the leading service's bundle.
-        pub services: Vec<String>,
-        /// The nodes the services can be migrated to.
-        pub nodes: Vec<String>,
-    }
-
     fn generate_migration_candidates_from(
         usage: &Usage,
         candidates: Vec<CompactMigrationCandidate>,
diff --git a/pve-rs/test/resource_scheduling.pl b/pve-rs/test/resource_scheduling.pl
index a332269..3775242 100755
--- a/pve-rs/test/resource_scheduling.pl
+++ b/pve-rs/test/resource_scheduling.pl
@@ -6,6 +6,7 @@ use warnings;
 use Test::More;
 
 use PVE::RS::ResourceScheduling::Static;
+use PVE::RS::ResourceScheduling::Dynamic;
 
 my sub score_nodes {
     my ($static, $service) = @_;
-- 
2.47.3

* [RFC cluster 1/2] datacenter config: add dynamic load scheduler option
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (10 preceding siblings ...)
  2026-02-17 14:14 ` [RFC perl-rs 6/6] pve-rs: resource scheduling: implement pve_dynamic bindings Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-18 11:06   ` Maximiliano Sandoval
  2026-02-17 14:14 ` [RFC cluster 2/2] datacenter config: add auto rebalancing options Daniel Kral
                   ` (23 subsequent siblings)
  35 siblings, 1 reply; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
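A usage illustration (assuming the datacenter.cfg `crs` property-string
syntax of the existing options):

    crs: ha=dynamic
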
 src/PVE/DataCenterConfig.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/PVE/DataCenterConfig.pm b/src/PVE/DataCenterConfig.pm
index 514c867..5c91f80 100644
--- a/src/PVE/DataCenterConfig.pm
+++ b/src/PVE/DataCenterConfig.pm
@@ -13,13 +13,14 @@ my $PROXMOX_OUI = 'BC:24:11';
 my $crs_format = {
     ha => {
         type => 'string',
-        enum => ['basic', 'static'],
+        enum => ['basic', 'static', 'dynamic'],
         optional => 1,
         default => 'basic',
         description => "Use this resource scheduler mode for HA.",
         verbose_description => "Configures how the HA manager should select nodes to start or "
             . "recover services. With 'basic', only the number of services is used, with 'static', "
-            . "static CPU and memory configuration of services is considered.",
+            . "static CPU and memory configuration of services is considered, and with 'dynamic', "
+            . "static and dynamic CPU and memory usage of services is considered.",
     },
     'ha-rebalance-on-start' => {
         type => 'boolean',
-- 
2.47.3

* [RFC cluster 2/2] datacenter config: add auto rebalancing options
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (11 preceding siblings ...)
  2026-02-17 14:14 ` [RFC cluster 1/2] datacenter config: add dynamic load scheduler option Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-18 11:15   ` Maximiliano Sandoval
  2026-02-17 14:14 ` [RFC ha-manager 01/21] rename static node stats to be consistent with similar interfaces Daniel Kral
                   ` (22 subsequent siblings)
  35 siblings, 1 reply; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
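A hypothetical configuration enabling the rebalancer with the defaults
from below (assuming the datacenter.cfg `crs` property-string syntax of
the existing options):

    crs: ha=dynamic,ha-auto-rebalance=1,ha-auto-rebalance-threshold=0.7,ha-auto-rebalance-hold-duration=3,ha-auto-rebalance-margin=0.1
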
 src/PVE/DataCenterConfig.pm | 38 +++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/src/PVE/DataCenterConfig.pm b/src/PVE/DataCenterConfig.pm
index 5c91f80..86bd06a 100644
--- a/src/PVE/DataCenterConfig.pm
+++ b/src/PVE/DataCenterConfig.pm
@@ -30,6 +30,44 @@ my $crs_format = {
             "Set to use CRS for selecting a suited node when a HA services request-state"
             . " changes from stop to start.",
     },
+    'ha-auto-rebalance' => {
+        type => 'boolean',
+        optional => 1,
+        default => 0,
+        description => "Set to use CRS for balancing HA resources automatically depending on"
+            . " the current node imbalance.",
+    },
+    'ha-auto-rebalance-threshold' => {
+        type => 'number',
+        optional => 1,
+        default => 0.7,
+        requires => 'ha-auto-rebalance',
+        description => "The threshold for the node load, which will trigger the automatic"
+            . " HA resource balancing if the threshold is exceeded.",
+    },
+    'ha-auto-rebalance-method' => {
+        type => 'string',
+        enum => ['bruteforce', 'topsis'],
+        optional => 1,
+        default => 'bruteforce',
+        requires => 'ha-auto-rebalance',
+    },
+    'ha-auto-rebalance-hold-duration' => {
+        type => 'number',
+        optional => 1,
+        default => 3,
+        requires => 'ha-auto-rebalance',
+        description => "The duration the threshold must be exceeded for to trigger an automatic"
+            . " HA resource balancing migration in HA rounds.",
+    },
+    'ha-auto-rebalance-margin' => {
+        type => 'number',
+        optional => 1,
+        default => 0.1,
+        requires => 'ha-auto-rebalance',
+        description => "The minimum relative improvement in cluster node imbalance to commit to"
+            . " a HA resource rebalancing migration.",
+    },
 };
 
 my $migration_format = {
-- 
2.47.3

* [RFC ha-manager 01/21] rename static node stats to be consistent with similar interfaces
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (12 preceding siblings ...)
  2026-02-17 14:14 ` [RFC cluster 2/2] datacenter config: add auto rebalancing options Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 02/21] resources: remove redundant load_config fallback for static config Daniel Kral
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

The names `maxcpu` and `maxmem` are used in the static load scheduler
itself and are more telling, as these properties provide the maximum
configured amount of CPU cores and memory.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Env/PVE2.pm                                 |  9 ++++++++-
 src/PVE/HA/Sim/Hardware.pm                             |  8 ++++----
 src/PVE/HA/Usage/Static.pm                             |  6 +++---
 .../hardware_status                                    |  6 +++---
 .../hardware_status                                    |  6 +++---
 .../hardware_status                                    | 10 +++++-----
 src/test/test-crs-static-rebalance1/hardware_status    |  6 +++---
 src/test/test-crs-static-rebalance2/hardware_status    |  6 +++---
 src/test/test-crs-static1/hardware_status              |  6 +++---
 src/test/test-crs-static2/hardware_status              | 10 +++++-----
 src/test/test-crs-static3/hardware_status              |  6 +++---
 src/test/test-crs-static4/hardware_status              |  6 +++---
 src/test/test-crs-static5/hardware_status              |  6 +++---
 13 files changed, 49 insertions(+), 42 deletions(-)

diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 37720f72..ee4fa23d 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -543,7 +543,14 @@ sub get_static_node_stats {
 
     my $stats = PVE::Cluster::get_node_kv('static-info');
     for my $node (keys $stats->%*) {
-        $stats->{$node} = eval { decode_json($stats->{$node}) };
+        $stats->{$node} = eval {
+            my $node_stats = decode_json($stats->{$node});
+
+            return {
+                maxcpu => $node_stats->{cpus},
+                maxmem => $node_stats->{memory},
+            };
+        };
         $self->log('err', "unable to decode static node info for '$node' - $@") if $@;
     }
 
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 97ada580..702500c2 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -488,9 +488,9 @@ sub new {
             || die "Copy failed: $!\n";
     } else {
         my $cstatus = {
-            node1 => { power => 'off', network => 'off', cpus => 24, memory => 131072 },
-            node2 => { power => 'off', network => 'off', cpus => 24, memory => 131072 },
-            node3 => { power => 'off', network => 'off', cpus => 24, memory => 131072 },
+            node1 => { power => 'off', network => 'off', maxcpu => 24, maxmem => 131072 },
+            node2 => { power => 'off', network => 'off', maxcpu => 24, maxmem => 131072 },
+            node3 => { power => 'off', network => 'off', maxcpu => 24, maxmem => 131072 },
         };
         $self->write_hardware_status_nolock($cstatus);
     }
@@ -1088,7 +1088,7 @@ sub get_static_node_stats {
 
     my $stats = {};
     for my $node (keys $cstatus->%*) {
-        $stats->{$node} = { $cstatus->{$node}->%{qw(cpus memory)} };
+        $stats->{$node} = { $cstatus->{$node}->%{qw(maxcpu maxmem)} };
     }
 
     return $stats;
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index d586b603..395be871 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -33,10 +33,10 @@ sub add_node {
 
     my $stats = $self->{'node-stats'}->{$nodename}
         or die "did not get static node usage information for '$nodename'\n";
-    die "static node usage information for '$nodename' missing cpu count\n" if !$stats->{cpus};
-    die "static node usage information for '$nodename' missing memory\n" if !$stats->{memory};
+    die "static node usage information for '$nodename' missing cpu count\n" if !$stats->{maxcpu};
+    die "static node usage information for '$nodename' missing memory\n" if !$stats->{maxmem};
 
-    eval { $self->{scheduler}->add_node($nodename, int($stats->{cpus}), int($stats->{memory})); };
+    eval { $self->{scheduler}->add_node($nodename, int($stats->{maxcpu}), int($stats->{maxmem})); };
     die "initializing static node usage for '$nodename' failed - $@" if $@;
 }
 
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
index 84484af1..3d4cf91f 100644
--- a/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 112000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 112000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 112000000000 }
 }
diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
index 84484af1..3d4cf91f 100644
--- a/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
+++ b/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 112000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 112000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 112000000000 }
 }
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
index b6dcb1a5..7bc741f1 100644
--- a/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
+++ b/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
@@ -1,7 +1,7 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 48000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 36000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 16, "memory": 24000000000 },
-  "node4": { "power": "off", "network": "off", "cpus": 32, "memory": 36000000000 },
-  "node5": { "power": "off", "network": "off", "cpus": 8, "memory": 48000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 48000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 36000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 16, "maxmem": 24000000000 },
+  "node4": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 36000000000 },
+  "node5": { "power": "off", "network": "off", "maxcpu": 8, "maxmem": 48000000000 }
 }
diff --git a/src/test/test-crs-static-rebalance1/hardware_status b/src/test/test-crs-static-rebalance1/hardware_status
index 651ad792..bfdbbf7b 100644
--- a/src/test/test-crs-static-rebalance1/hardware_status
+++ b/src/test/test-crs-static-rebalance1/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 32, "memory": 256000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 256000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 32, "memory": 256000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 256000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 256000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 256000000000 }
 }
diff --git a/src/test/test-crs-static-rebalance2/hardware_status b/src/test/test-crs-static-rebalance2/hardware_status
index 9be70a40..c5cbde3d 100644
--- a/src/test/test-crs-static-rebalance2/hardware_status
+++ b/src/test/test-crs-static-rebalance2/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 40, "memory": 384000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 256000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 32, "memory": 256000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 40, "maxmem": 384000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 256000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 256000000000 }
 }
diff --git a/src/test/test-crs-static1/hardware_status b/src/test/test-crs-static1/hardware_status
index 0fa8c265..bbe44a96 100644
--- a/src/test/test-crs-static1/hardware_status
+++ b/src/test/test-crs-static1/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 200000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 32, "memory": 300000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 200000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 300000000000 }
 }
diff --git a/src/test/test-crs-static2/hardware_status b/src/test/test-crs-static2/hardware_status
index d426023a..815436ef 100644
--- a/src/test/test-crs-static2/hardware_status
+++ b/src/test/test-crs-static2/hardware_status
@@ -1,7 +1,7 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 200000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 32, "memory": 300000000000 },
-  "node4": { "power": "off", "network": "off", "cpus": 64, "memory": 300000000000 },
-  "node5": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 200000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 300000000000 },
+  "node4": { "power": "off", "network": "off", "maxcpu": 64, "maxmem": 300000000000 },
+  "node5": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 }
 }
diff --git a/src/test/test-crs-static3/hardware_status b/src/test/test-crs-static3/hardware_status
index dfbf496e..ed84b8bd 100644
--- a/src/test/test-crs-static3/hardware_status
+++ b/src/test/test-crs-static3/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 64, "memory": 200000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 64, "maxmem": 200000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 }
 }
diff --git a/src/test/test-crs-static4/hardware_status b/src/test/test-crs-static4/hardware_status
index a83a2dcc..b08ba7f9 100644
--- a/src/test/test-crs-static4/hardware_status
+++ b/src/test/test-crs-static4/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 }
 }
diff --git a/src/test/test-crs-static5/hardware_status b/src/test/test-crs-static5/hardware_status
index 3eb9e735..edfd6db2 100644
--- a/src/test/test-crs-static5/hardware_status
+++ b/src/test/test-crs-static5/hardware_status
@@ -1,5 +1,5 @@
 {
-  "node1": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 100000000000 },
-  "node3": { "power": "off", "network": "off", "cpus": 128, "memory": 100000000000 }
+  "node1": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 32, "maxmem": 100000000000 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 128, "maxmem": 100000000000 }
 }
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 02/21] resources: remove redundant load_config fallback for static config
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (13 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 01/21] rename static node stats to be consistent with similar interfaces Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 03/21] remove redundant service_node and migration_target parameter Daniel Kral
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

The return value of get_static_service_stats(...) is fetched through
PVE::Cluster::get_guest_config_properties(...), which in turn reads all
guest configuration files with a memdb_read_nolock(...) in the pmxcfs.

As PVE::AbstractConfig::load_config(...) internally gets the content of
the guest configuration file through cfs_read_file(...), which in turn
receives the return value of the equivalent memdb_read(...) from a
CFS_IPC_GET_CONFIG message, the fallback is likely to fail as well.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
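For illustration, the two call chains look roughly like this (simplified
sketch; function names from the respective modules):

    # primary path used to fill the stats cache:
    #   $haenv->get_static_service_stats($id)
    #     -> PVE::Cluster::get_guest_config_properties(...)
    #       -> memdb_read_nolock(...) in the pmxcfs
    #
    # fallback removed by this patch:
    #   PVE::LXC::Config/PVE::QemuConfig->load_config($id, $service_node)
    #     -> cfs_read_file(...)
    #       -> memdb_read(...) via a CFS_IPC_GET_CONFIG message
    #
    # both paths read from the same pmxcfs database, so if the first
    # fails, the fallback has no independent data source to succeed from
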
 src/PVE/HA/Resources/PVECT.pm | 1 -
 src/PVE/HA/Resources/PVEVM.pm | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index 4cbf6db3..b9ce2ac3 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -163,7 +163,6 @@ sub get_static_stats {
     my ($class, $haenv, $id, $service_node) = @_;
 
     my $conf = $haenv->get_static_service_stats($id);
-    $conf = PVE::LXC::Config->load_config($id, $service_node) if !defined($conf);
 
     return {
         maxcpu => PVE::LXC::Config->get_derived_property($conf, 'max-cpu'),
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 7586da84..303334ba 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -184,7 +184,6 @@ sub get_static_stats {
     my ($class, $haenv, $id, $service_node) = @_;
 
     my $conf = $haenv->get_static_service_stats($id);
-    $conf = PVE::QemuConfig->load_config($id, $service_node) if !defined($conf);
 
     return {
         maxcpu => PVE::QemuConfig->get_derived_property($conf, 'max-cpu'),
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 03/21] remove redundant service_node and migration_target parameter
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (14 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 02/21] resources: remove redundant load_config fallback for static config Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 04/21] factor out common pve to ha resource type mapping Daniel Kral
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

As the retrieval of the static service stats is not dependent on the
location of the guest's config file, there is no need to provide the
$service_node and $migration_target arguments anymore.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Resources.pm       |  2 +-
 src/PVE/HA/Resources/PVECT.pm |  2 +-
 src/PVE/HA/Resources/PVEVM.pm |  2 +-
 src/PVE/HA/Sim/Resources.pm   |  2 +-
 src/PVE/HA/Usage.pm           | 10 ++++------
 src/PVE/HA/Usage/Basic.pm     |  4 ++--
 src/PVE/HA/Usage/Static.pm    | 19 +++++++------------
 7 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index 68d9d16d..38e0841b 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -177,7 +177,7 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id) = @_;
 
     die "implement in subclass";
 }
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index b9ce2ac3..1f4eb2e9 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -160,7 +160,7 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id) = @_;
 
     my $conf = $haenv->get_static_service_stats($id);
 
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 303334ba..760259e4 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -181,7 +181,7 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id) = @_;
 
     my $conf = $haenv->get_static_service_stats($id);
 
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index ed43373e..1b2bfaaf 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -138,7 +138,7 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id) = @_;
 
     my $sid = $class->type() . ":$id";
     my $hardware = $haenv->hardware();
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 92e575cb..1be5fa09 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -35,7 +35,7 @@ sub contains_node {
 
 # Logs a warning to $haenv upon failure, but does not die.
 sub add_service_usage_to_node {
-    my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
+    my ($self, $nodename, $sid) = @_;
 
     die "implement in subclass";
 }
@@ -49,10 +49,8 @@ sub add_service_usage {
     my ($current_node, $target_node) =
         get_used_service_nodes($online_nodes, $service_state, $service_node, $migration_target);
 
-    $self->add_service_usage_to_node($current_node, $sid, $service_node, $migration_target)
-        if $current_node;
-    $self->add_service_usage_to_node($target_node, $sid, $service_node, $migration_target)
-        if $target_node;
+    $self->add_service_usage_to_node($current_node, $sid) if $current_node;
+    $self->add_service_usage_to_node($target_node, $sid) if $target_node;
 }
 
 sub remove_service_usage {
@@ -63,7 +61,7 @@ sub remove_service_usage {
 
 # Returns a hash with $nodename => $score pairs. A lower $score is better.
 sub score_nodes_to_start_service {
-    my ($self, $sid, $service_node) = @_;
+    my ($self, $sid) = @_;
 
     die "implement in subclass";
 }
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index 43817bf6..ef9ae3d6 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -39,7 +39,7 @@ sub contains_node {
 }
 
 sub add_service_usage_to_node {
-    my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
+    my ($self, $nodename, $sid) = @_;
 
     if ($self->contains_node($nodename)) {
         $self->{nodes}->{$nodename}->{$sid} = 1;
@@ -60,7 +60,7 @@ sub remove_service_usage {
 }
 
 sub score_nodes_to_start_service {
-    my ($self, $sid, $service_node) = @_;
+    my ($self, $sid) = @_;
 
     my $nodes = $self->{nodes};
 
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 395be871..2304139c 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -61,20 +61,15 @@ sub contains_node {
 }
 
 my sub get_service_usage {
-    my ($self, $sid, $service_node, $migration_target) = @_;
+    my ($self, $sid) = @_;
 
     return $self->{'service-stats'}->{$sid} if $self->{'service-stats'}->{$sid};
 
     my (undef, $type, $id) = $self->{haenv}->parse_sid($sid);
     my $plugin = PVE::HA::Resources->lookup($type);
 
-    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $id, $service_node) };
-    if (my $err = $@) {
-        # config might've already moved during a migration
-        $stats = eval { $plugin->get_static_stats($self->{haenv}, $id, $migration_target); }
-            if $migration_target;
-        die "did not get static service usage information for '$sid' - $err\n" if !$stats;
-    }
+    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $id) };
+    die "did not get static service usage information for '$sid'\n" if !$stats;
 
     my $service_stats = {
         maxcpu => $stats->{maxcpu} + 0.0, # containers allow non-integer cpulimit
@@ -87,12 +82,12 @@ my sub get_service_usage {
 }
 
 sub add_service_usage_to_node {
-    my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
+    my ($self, $nodename, $sid) = @_;
 
     $self->{'node-services'}->{$nodename}->{$sid} = 1;
 
     eval {
-        my $service_usage = get_service_usage($self, $sid, $service_node, $migration_target);
+        my $service_usage = get_service_usage($self, $sid);
         $self->{scheduler}->add_service_usage_to_node($nodename, $sid, $service_usage);
     };
     $self->{haenv}->log('warning', "unable to add service '$sid' usage to node '$nodename' - $@")
@@ -111,10 +106,10 @@ sub remove_service_usage {
 }
 
 sub score_nodes_to_start_service {
-    my ($self, $sid, $service_node) = @_;
+    my ($self, $sid) = @_;
 
     my $score_list = eval {
-        my $service_usage = get_service_usage($self, $sid, $service_node);
+        my $service_usage = get_service_usage($self, $sid);
         $self->{scheduler}->score_nodes_to_start_service($service_usage);
     };
     if (my $err = $@) {
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 04/21] factor out common pve to ha resource type mapping
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (15 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 03/21] remove redundant service_node and migration_target parameter Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 05/21] derive static service stats while filling the service stats repository Daniel Kral
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
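For illustration, the contract of the new helper and the eval pattern
used by callers that want to skip unknown types:

    for my $pve_type (qw(lxc qemu unknown)) {
        my $type = eval { PVE::HA::Tools::get_ha_resource_type($pve_type) };
        next if $@; # silently ignore unknown pve types
        print "$pve_type => $type\n"; # prints: lxc => ct, qemu => vm
    }
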
 src/PVE/HA/Config.pm |  8 +-------
 src/PVE/HA/Tools.pm  | 23 +++++++++++++++--------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 844ab1b1..bca5bb1c 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -197,13 +197,7 @@ sub parse_sid {
         my $vmlist = PVE::Cluster::get_vmlist();
         if (defined($vmlist->{ids}->{$name})) {
             my $vm_type = $vmlist->{ids}->{$name}->{type};
-            if ($vm_type eq 'lxc') {
-                $type = 'ct';
-            } elsif ($vm_type eq 'qemu') {
-                $type = 'vm';
-            } else {
-                die "internal error";
-            }
+            $type = PVE::HA::Tools::get_ha_resource_type($vm_type);
             $sid = "$type:$name";
         } else {
             die "unable do detect SID from VMID - VM/CT $1 does not exist\n";
diff --git a/src/PVE/HA/Tools.pm b/src/PVE/HA/Tools.pm
index 1fa53df8..26629fb5 100644
--- a/src/PVE/HA/Tools.pm
+++ b/src/PVE/HA/Tools.pm
@@ -289,6 +289,18 @@ sub has_min_version {
     return 1;
 }
 
+sub get_ha_resource_type {
+    my ($pve_resource_type) = @_;
+
+    if ($pve_resource_type eq 'lxc') {
+        return 'ct';
+    } elsif ($pve_resource_type eq 'qemu') {
+        return 'vm';
+    } else {
+        die "unknown PVE resource type '$pve_resource_type'";
+    }
+}
+
 # bash auto completion helper
 
 # NOTE: we use PVE::HA::Config here without declaring an 'use' clause above as
@@ -309,15 +321,10 @@ sub complete_sid {
 
         while (my ($vmid, $info) = each %{ $vmlist->{ids} }) {
 
-            my $sid;
+            my $type = eval { get_ha_resource_type($info->{type}) };
+            next if $@; # silently ignore unknown pve types
 
-            if ($info->{type} eq 'lxc') {
-                $sid = "ct:$vmid";
-            } elsif ($info->{type} eq 'qemu') {
-                $sid = "vm:$vmid";
-            } else {
-                next; # should not happen
-            }
+            my $sid = "$type:$vmid";
 
             next if $cfg->{ids}->{$sid};
 
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 05/21] derive static service stats while filling the service stats repository
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (16 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 04/21] factor out common pve to ha resource type mapping Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 06/21] test: make static service usage explicit for all resources Daniel Kral
                   ` (17 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

This is needed so that the proper static service stats can also be
derived for non-HA resources in a following patch.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
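For illustration, a rough sketch of what the per-plugin helper derives
(illustrative values; the exact rules live in the respective
get_derived_property() implementations):

    # a VM config fragment like ...
    my $conf = { sockets => 2, cores => 4, memory => 8192 };
    # ... would yield static stats along the lines of
    my $stats = { maxcpu => 8, maxmem => 8192 * 1024**2 };
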
 src/PVE/HA/Env/PVE2.pm        | 14 ++++++++------
 src/PVE/HA/Resources.pm       |  6 ++++++
 src/PVE/HA/Resources/PVECT.pm | 12 ++++++++----
 src/PVE/HA/Resources/PVEVM.pm | 12 ++++++++----
 4 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index ee4fa23d..78ece447 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -518,17 +518,19 @@ sub update_static_service_stats {
 
     my $properties = ['cores', 'cpulimit', 'memory', 'sockets', 'vcpus'];
     my $service_stats = eval {
-        my $stats = PVE::Cluster::get_guest_config_properties($properties);
+        my $stats = {};
+        my $confs = PVE::Cluster::get_guest_config_properties($properties);
 
-        # get_guest_config_properties(...) doesn't add guests which do not
-        # specify any of the given properties, but we need to make a distinction
-        # between "not cached" and "not specified" here
         my $vmlist = PVE::Cluster::get_vmlist();
         my $idlist = $vmlist->{ids} // {};
         for my $id (keys %$idlist) {
-            next if defined($stats->{$id});
+            my $type = eval { PVE::HA::Tools::get_ha_resource_type($idlist->{$id}->{type}) };
+            next if $@; # silently ignore unknown pve types
 
-            $stats->{$id} = {};
+            my $conf = $confs->{$id} // {};
+            my $plugin = PVE::HA::Resources->lookup($type);
+
+            $stats->{$id} = $plugin->get_static_stats_from_config($conf);
         }
 
         return $stats;
diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index 38e0841b..c40c260f 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -176,6 +176,12 @@ sub remove_locks {
     die "implement in subclass";
 }
 
+sub get_static_stats_from_config {
+    my ($class, $conf) = @_;
+
+    die "implement in subclass";
+}
+
 sub get_static_stats {
     my ($class, $haenv, $id) = @_;
 
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index 1f4eb2e9..0dc500d1 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -159,10 +159,8 @@ sub remove_locks {
     return undef;
 }
 
-sub get_static_stats {
-    my ($class, $haenv, $id) = @_;
-
-    my $conf = $haenv->get_static_service_stats($id);
+sub get_static_stats_from_config {
+    my ($class, $conf) = @_;
 
     return {
         maxcpu => PVE::LXC::Config->get_derived_property($conf, 'max-cpu'),
@@ -170,4 +168,10 @@ sub get_static_stats {
     };
 }
 
+sub get_static_stats {
+    my ($class, $haenv, $id) = @_;
+
+    return $haenv->get_static_service_stats($id);
+}
+
 1;
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 760259e4..579a7fca 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -180,10 +180,8 @@ sub remove_locks {
     return undef;
 }
 
-sub get_static_stats {
-    my ($class, $haenv, $id) = @_;
-
-    my $conf = $haenv->get_static_service_stats($id);
+sub get_static_stats_from_config {
+    my ($class, $conf) = @_;
 
     return {
         maxcpu => PVE::QemuConfig->get_derived_property($conf, 'max-cpu'),
@@ -191,4 +189,10 @@ sub get_static_stats {
     };
 }
 
+sub get_static_stats {
+    my ($class, $haenv, $id) = @_;
+
+    return $haenv->get_static_service_stats($id);
+}
+
 1;
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 06/21] test: make static service usage explicit for all resources
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (17 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 05/21] derive static service stats while filling the service stats repository Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 07/21] make static service stats indexable by sid Daniel Kral
                   ` (16 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Even though deriving these from the HA resource sid is convenient, this
is needed to get the proper static service stats for all HA resources at
once.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
Could also be passed through get_static_stats_from_config() with an
additional $id / $sid parameter...
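
The explicit values below follow the auto-assignment convention removed
from the simulator here, i.e. for IDs matching /^(\d)(\d\d)/:

    if ('101' =~ /^(\d)(\d\d)/) {
        my ($maxcpu, $maxmem) = (int($1) + 1, (int($2) + 1) * (1 << 29));
        # vm:101 => maxcpu 2, maxmem 1073741824, matching most of the new
        # values below (vm:102 keeps its previously explicit stats)
    }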

 src/PVE/HA/Sim/Resources.pm                   | 10 +---
 .../static_service_stats                      | 52 ++++++++++++++++++-
 .../static_service_stats                      |  9 +++-
 3 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index 1b2bfaaf..2bde9aa3 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -143,15 +143,7 @@ sub get_static_stats {
     my $sid = $class->type() . ":$id";
     my $hardware = $haenv->hardware();
 
-    if (my $service_stats = $hardware->get_static_service_stats($sid)) {
-        return $service_stats;
-    } elsif ($id =~ /^(\d)(\d\d)/) {
-        # auto assign usage calculated from ID for convenience
-        my ($maxcpu, $maxmeory) = (int($1) + 1, (int($2) + 1) * 1 << 29);
-        return { maxcpu => $maxcpu, maxmem => $maxmeory };
-    } else {
-        return {};
-    }
+    return $hardware->get_static_service_stats($sid);
 }
 
 1;
diff --git a/src/test/test-crs-static-rebalance1/static_service_stats b/src/test/test-crs-static-rebalance1/static_service_stats
index 7fb992dd..666af861 100644
--- a/src/test/test-crs-static-rebalance1/static_service_stats
+++ b/src/test/test-crs-static-rebalance1/static_service_stats
@@ -1,3 +1,53 @@
 {
-    "vm:102": { "maxcpu": 2, "maxmem": 4000000000 }
+    "vm:101": { "maxcpu": 2, "maxmem": 1073741824 },
+    "vm:102": { "maxcpu": 2, "maxmem": 4000000000 },
+    "vm:103": { "maxcpu": 2, "maxmem": 2147483648 },
+    "vm:104": { "maxcpu": 2, "maxmem": 2684354560 },
+    "vm:105": { "maxcpu": 2, "maxmem": 3221225472 },
+    "vm:106": { "maxcpu": 2, "maxmem": 3758096384 },
+    "vm:107": { "maxcpu": 2, "maxmem": 4294967296 },
+    "vm:108": { "maxcpu": 2, "maxmem": 4831838208 },
+    "vm:109": { "maxcpu": 2, "maxmem": 5368709120 },
+    "vm:110": { "maxcpu": 2, "maxmem": 5905580032 },
+    "vm:111": { "maxcpu": 2, "maxmem": 6442450944 },
+    "vm:112": { "maxcpu": 2, "maxmem": 6979321856 },
+    "vm:113": { "maxcpu": 2, "maxmem": 7516192768 },
+    "vm:114": { "maxcpu": 2, "maxmem": 8053063680 },
+    "vm:115": { "maxcpu": 2, "maxmem": 8589934592 },
+    "vm:116": { "maxcpu": 2, "maxmem": 9126805504 },
+    "vm:117": { "maxcpu": 2, "maxmem": 9663676416 },
+    "vm:118": { "maxcpu": 2, "maxmem": 10200547328 },
+    "vm:119": { "maxcpu": 2, "maxmem": 10737418240 },
+    "vm:120": { "maxcpu": 2, "maxmem": 11274289152 },
+    "vm:121": { "maxcpu": 2, "maxmem": 11811160064 },
+    "vm:122": { "maxcpu": 2, "maxmem": 12348030976 },
+    "vm:123": { "maxcpu": 2, "maxmem": 12884901888 },
+    "vm:124": { "maxcpu": 2, "maxmem": 13421772800 },
+    "vm:125": { "maxcpu": 2, "maxmem": 13958643712 },
+    "vm:126": { "maxcpu": 2, "maxmem": 14495514624 },
+    "vm:127": { "maxcpu": 2, "maxmem": 15032385536 },
+    "vm:128": { "maxcpu": 2, "maxmem": 15569256448 },
+    "vm:129": { "maxcpu": 2, "maxmem": 16106127360 },
+    "vm:130": { "maxcpu": 2, "maxmem": 16642998272 },
+    "vm:131": { "maxcpu": 2, "maxmem": 17179869184 },
+    "vm:132": { "maxcpu": 2, "maxmem": 17716740096 },
+    "vm:133": { "maxcpu": 2, "maxmem": 18253611008 },
+    "vm:134": { "maxcpu": 2, "maxmem": 18790481920 },
+    "vm:135": { "maxcpu": 2, "maxmem": 19327352832 },
+    "vm:136": { "maxcpu": 2, "maxmem": 19864223744 },
+    "vm:137": { "maxcpu": 2, "maxmem": 20401094656 },
+    "vm:138": { "maxcpu": 2, "maxmem": 20937965568 },
+    "vm:139": { "maxcpu": 2, "maxmem": 21474836480 },
+    "vm:140": { "maxcpu": 2, "maxmem": 22011707392 },
+    "vm:141": { "maxcpu": 2, "maxmem": 22548578304 },
+    "vm:142": { "maxcpu": 2, "maxmem": 23085449216 },
+    "vm:143": { "maxcpu": 2, "maxmem": 23622320128 },
+    "vm:144": { "maxcpu": 2, "maxmem": 24159191040 },
+    "vm:145": { "maxcpu": 2, "maxmem": 24696061952 },
+    "vm:146": { "maxcpu": 2, "maxmem": 25232932864 },
+    "vm:147": { "maxcpu": 2, "maxmem": 25769803776 },
+    "vm:148": { "maxcpu": 2, "maxmem": 26306674688 },
+    "vm:149": { "maxcpu": 2, "maxmem": 26843545600 },
+    "vm:150": { "maxcpu": 2, "maxmem": 27380416512 },
+    "vm:151": { "maxcpu": 2, "maxmem": 27917287424 }
 }
diff --git a/src/test/test-crs-static-rebalance2/static_service_stats b/src/test/test-crs-static-rebalance2/static_service_stats
index 0967ef42..a4049f80 100644
--- a/src/test/test-crs-static-rebalance2/static_service_stats
+++ b/src/test/test-crs-static-rebalance2/static_service_stats
@@ -1 +1,8 @@
-{}
+{
+    "vm:100": { "maxcpu": 2, "maxmem": 536870912 },
+    "vm:101": { "maxcpu": 2, "maxmem": 1073741824 },
+    "vm:102": { "maxcpu": 2, "maxmem": 1610612736 },
+    "vm:103": { "maxcpu": 2, "maxmem": 2147483648 },
+    "vm:104": { "maxcpu": 2, "maxmem": 2684354560 },
+    "vm:105": { "maxcpu": 2, "maxmem": 3221225472 }
+}
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 07/21] make static service stats indexable by sid
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (18 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 06/21] test: make static service usage explicit for all resources Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 08/21] move static service stats repository to PVE::HA::Usage::Static Daniel Kral
                   ` (15 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Env/PVE2.pm        | 3 ++-
 src/PVE/HA/Resources.pm       | 2 +-
 src/PVE/HA/Resources/PVECT.pm | 4 ++--
 src/PVE/HA/Resources/PVEVM.pm | 4 ++--
 src/PVE/HA/Sim/Resources.pm   | 3 +--
 src/PVE/HA/Usage/Static.pm    | 4 ++--
 6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 78ece447..1a9dc4ea 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -527,10 +527,11 @@ sub update_static_service_stats {
             my $type = eval { PVE::HA::Tools::get_ha_resource_type($idlist->{$id}->{type}) };
             next if $@; # silently ignore unknown pve types
 
+            my $sid = "$type:$id";
             my $conf = $confs->{$id} // {};
             my $plugin = PVE::HA::Resources->lookup($type);
 
-            $stats->{$id} = $plugin->get_static_stats_from_config($conf);
+            $stats->{$sid} = $plugin->get_static_stats_from_config($conf);
         }
 
         return $stats;
diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index c40c260f..ea7655e5 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -183,7 +183,7 @@ sub get_static_stats_from_config {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id) = @_;
+    my ($class, $haenv, $sid) = @_;
 
     die "implement in subclass";
 }
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index 0dc500d1..bda33717 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -169,9 +169,9 @@ sub get_static_stats_from_config {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id) = @_;
+    my ($class, $haenv, $sid) = @_;
 
-    return $haenv->get_static_service_stats($id);
+    return $haenv->get_static_service_stats($sid);
 }
 
 1;
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 579a7fca..786c5130 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -190,9 +190,9 @@ sub get_static_stats_from_config {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id) = @_;
+    my ($class, $haenv, $sid) = @_;
 
-    return $haenv->get_static_service_stats($id);
+    return $haenv->get_static_service_stats($sid);
 }
 
 1;
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index 2bde9aa3..f91d3ea2 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -138,9 +138,8 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id) = @_;
+    my ($class, $haenv, $sid) = @_;
 
-    my $sid = $class->type() . ":$id";
     my $hardware = $haenv->hardware();
 
     return $hardware->get_static_service_stats($sid);
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 2304139c..acc3533c 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -65,10 +65,10 @@ my sub get_service_usage {
 
     return $self->{'service-stats'}->{$sid} if $self->{'service-stats'}->{$sid};
 
-    my (undef, $type, $id) = $self->{haenv}->parse_sid($sid);
+    my (undef, $type, undef) = $self->{haenv}->parse_sid($sid);
     my $plugin = PVE::HA::Resources->lookup($type);
 
-    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $id) };
+    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $sid) };
     die "did not get static service usage information for '$sid'\n" if !$stats;
 
     my $service_stats = {
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 08/21] move static service stats repository to PVE::HA::Usage::Static
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (19 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 07/21] make static service stats indexable by sid Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 09/21] usage: augment service stats with node and state information Daniel Kral
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Since the static service stats are already processed by their individual
plugin's implementation of get_static_stats_from_config(...), the static
service stats repository can be used by the static load scheduler
directly.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Env.pm             |  8 +------
 src/PVE/HA/Env/PVE2.pm        | 42 +++++++++++------------------------
 src/PVE/HA/Manager.pm         |  1 -
 src/PVE/HA/Resources.pm       |  6 -----
 src/PVE/HA/Resources/PVECT.pm |  6 -----
 src/PVE/HA/Resources/PVEVM.pm |  6 -----
 src/PVE/HA/Sim/Env.pm         |  8 +------
 src/PVE/HA/Sim/Hardware.pm    | 13 +----------
 src/PVE/HA/Sim/Resources.pm   |  8 -------
 src/PVE/HA/Usage/Static.pm    | 23 +++++--------------
 10 files changed, 22 insertions(+), 99 deletions(-)

diff --git a/src/PVE/HA/Env.pm b/src/PVE/HA/Env.pm
index 64cf3ea5..6843cb30 100644
--- a/src/PVE/HA/Env.pm
+++ b/src/PVE/HA/Env.pm
@@ -301,15 +301,9 @@ sub get_datacenter_settings {
 }
 
 sub get_static_service_stats {
-    my ($self, $id) = @_;
-
-    return $self->{plug}->get_static_service_stats($id);
-}
-
-sub update_static_service_stats {
     my ($self) = @_;
 
-    return $self->{plug}->update_static_service_stats();
+    return $self->{plug}->get_static_service_stats();
 }
 
 sub get_static_node_stats {
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 1a9dc4ea..87b1435a 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -49,8 +49,6 @@ sub new {
 
     $self->{nodename} = $nodename;
 
-    $self->{static_service_stats} = undef;
-
     return $self;
 }
 
@@ -505,40 +503,26 @@ sub get_datacenter_settings {
 }
 
 sub get_static_service_stats {
-    my ($self, $id) = @_;
-
-    # undef if update_static_service_stats(...) failed before
-    return undef if !defined($self->{static_service_stats});
-
-    return $self->{static_service_stats}->{$id};
-}
-
-sub update_static_service_stats {
     my ($self) = @_;
 
     my $properties = ['cores', 'cpulimit', 'memory', 'sockets', 'vcpus'];
-    my $service_stats = eval {
-        my $stats = {};
-        my $confs = PVE::Cluster::get_guest_config_properties($properties);
+    my $stats = {};
+    my $confs = PVE::Cluster::get_guest_config_properties($properties);
 
-        my $vmlist = PVE::Cluster::get_vmlist();
-        my $idlist = $vmlist->{ids} // {};
-        for my $id (keys %$idlist) {
-            my $type = eval { PVE::HA::Tools::get_ha_resource_type($idlist->{$id}->{type}) };
-            next if $@; # silently ignore unknown pve types
+    my $vmlist = PVE::Cluster::get_vmlist();
+    my $idlist = $vmlist->{ids} // {};
+    for my $id (keys %$idlist) {
+        my $type = eval { PVE::HA::Tools::get_ha_resource_type($idlist->{$id}->{type}) };
+        next if $@; # silently ignore unknown pve types
 
-            my $sid = "$type:$id";
-            my $conf = $confs->{$id} // {};
-            my $plugin = PVE::HA::Resources->lookup($type);
+        my $sid = "$type:$id";
+        my $conf = $confs->{$id} // {};
+        my $plugin = PVE::HA::Resources->lookup($type);
 
-            $stats->{$sid} = $plugin->get_static_stats_from_config($conf);
-        }
+        $stats->{$sid} = $plugin->get_static_stats_from_config($conf);
+    }
 
-        return $stats;
-    };
-    $self->log('warning', "unable to update static service stats cache - $@") if $@;
-
-    $self->{static_service_stats} = $service_stats;
+    return $stats;
 }
 
 sub get_static_node_stats {
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 2e31296f..57fd0017 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -253,7 +253,6 @@ sub recompute_online_node_usage {
                 $online_node_usage = eval {
                     my $scheduler = PVE::HA::Usage::Static->new($haenv);
                     $scheduler->add_node($_) for $online_nodes->@*;
-                    $haenv->update_static_service_stats();
                     return $scheduler;
                 };
             } else {
diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index ea7655e5..aa26d019 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -182,12 +182,6 @@ sub get_static_stats_from_config {
     die "implement in subclass";
 }
 
-sub get_static_stats {
-    my ($class, $haenv, $sid) = @_;
-
-    die "implement in subclass";
-}
-
 # package PVE::HA::Resources::IPAddr;
 
 # use strict;
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index bda33717..79bb7c83 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -168,10 +168,4 @@ sub get_static_stats_from_config {
     };
 }
 
-sub get_static_stats {
-    my ($class, $haenv, $sid) = @_;
-
-    return $haenv->get_static_service_stats($sid);
-}
-
 1;
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 786c5130..5a0ac348 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -189,10 +189,4 @@ sub get_static_stats_from_config {
     };
 }
 
-sub get_static_stats {
-    my ($class, $haenv, $sid) = @_;
-
-    return $haenv->get_static_service_stats($sid);
-}
-
 1;
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index 32b5224c..0e3b02e5 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -489,15 +489,9 @@ sub get_datacenter_settings {
 }
 
 sub get_static_service_stats {
-    my ($self, $id) = @_;
-
-    return $self->{hardware}->get_static_service_stats($id);
-}
-
-sub update_static_service_stats {
     my ($self) = @_;
 
-    return $self->{hardware}->update_static_service_stats();
+    return $self->{hardware}->get_static_service_stats();
 }
 
 sub get_static_node_stats {
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 702500c2..a7e28c8a 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -525,8 +525,6 @@ sub new {
 
     $self->{service_config} = $self->read_service_config();
 
-    $self->{static_service_stats} = undef;
-
     return $self;
 }
 
@@ -1063,22 +1061,13 @@ sub watchdog_update {
 }
 
 sub get_static_service_stats {
-    my ($self, $id) = @_;
-
-    # undef if update_static_service_stats(...) failed before
-    return undef if !defined($self->{static_service_stats});
-
-    return $self->{static_service_stats}->{$id};
-}
-
-sub update_static_service_stats {
     my ($self) = @_;
 
     my $filename = "$self->{statusdir}/static_service_stats";
     my $stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
     $self->log('warning', "unable to update static service stats cache - $@") if $@;
 
-    $self->{static_service_stats} = $stats;
+    return $stats;
 }
 
 sub get_static_node_stats {
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index f91d3ea2..09a042b6 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -137,12 +137,4 @@ sub remove_locks {
     return undef;
 }
 
-sub get_static_stats {
-    my ($class, $haenv, $sid) = @_;
-
-    my $hardware = $haenv->hardware();
-
-    return $hardware->get_static_service_stats($sid);
-}
-
 1;
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index acc3533c..48622d62 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -14,12 +14,15 @@ sub new {
     my $node_stats = eval { $haenv->get_static_node_stats() };
     die "did not get static node usage information - $@" if $@;
 
+    my $service_stats = eval { $haenv->get_static_service_stats() };
+    die "did not get static service usage information - $@" if $@;
+
     my $scheduler = eval { PVE::RS::ResourceScheduling::Static->new(); };
     die "unable to initialize static scheduling - $@" if $@;
 
     return bless {
         'node-stats' => $node_stats,
-        'service-stats' => {},
+        'service-stats' => $service_stats,
         haenv => $haenv,
         scheduler => $scheduler,
         'node-services' => {}, # Services on each node. Fallback if scoring calculation fails.
@@ -63,20 +66,8 @@ sub contains_node {
 my sub get_service_usage {
     my ($self, $sid) = @_;
 
-    return $self->{'service-stats'}->{$sid} if $self->{'service-stats'}->{$sid};
-
-    my (undef, $type, undef) = $self->{haenv}->parse_sid($sid);
-    my $plugin = PVE::HA::Resources->lookup($type);
-
-    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $sid) };
-    die "did not get static service usage information for '$sid'\n" if !$stats;
-
-    my $service_stats = {
-        maxcpu => $stats->{maxcpu} + 0.0, # containers allow non-integer cpulimit
-        maxmem => int($stats->{maxmem}),
-    };
-
-    $self->{'service-stats'}->{$sid} = $service_stats;
+    my $service_stats = $self->{'service-stats'}->{$sid}
+        or die "did not get static service usage information for '$sid'\n";
 
     return $service_stats;
 }
@@ -101,8 +92,6 @@ sub remove_service_usage {
 
     eval { $self->{scheduler}->remove_service_usage($sid) };
     $self->{haenv}->log('warning', "unable to remove service '$sid' usage - $@") if $@;
-
-    delete $self->{'service-stats'}->{$sid}; # Invalidate old service stats
 }
 
 sub score_nodes_to_start_service {
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 09/21] usage: augment service stats with node and state information
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (20 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 08/21] move static service stats repository to PVE::HA::Usage::Static Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 10/21] include running non-HA resources in the scheduler's accounting Daniel Kral
                   ` (13 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Augment the service stats with the node and state information, which is
necessary for non-HA resources to be added to the service-node
accounting done by the implementations of PVE::HA::Usage in an upcoming
patch.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
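For illustration, the shape of a single augmented entry as assembled
below (sample values invented):

    my $stats = {
        'vm:100' => {
            id    => 100,
            node  => 'node1',
            state => 'started', # from the RRD status: 'started' if 'running'
            type  => 'vm',
            usage => { maxcpu => 2.0, maxmem => 2147483648 },
        },
    };
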
 src/PVE/HA/Env/PVE2.pm     | 44 +++++++++++++++++++++++++++++++-------
 src/PVE/HA/Sim/Hardware.pm | 27 ++++++++++++++++++++++-
 src/PVE/HA/Usage/Static.pm |  2 +-
 3 files changed, 63 insertions(+), 10 deletions(-)

diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 87b1435a..c43cf3ca 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -502,24 +502,52 @@ sub get_datacenter_settings {
     };
 }
 
-sub get_static_service_stats {
-    my ($self) = @_;
-
-    my $properties = ['cores', 'cpulimit', 'memory', 'sockets', 'vcpus'];
-    my $stats = {};
-    my $confs = PVE::Cluster::get_guest_config_properties($properties);
-
+my sub get_cluster_service_stats {
     my $vmlist = PVE::Cluster::get_vmlist();
     my $idlist = $vmlist->{ids} // {};
+
+    my $rrd = PVE::Cluster::rrd_dump();
+
+    my $stats = {};
     for my $id (keys %$idlist) {
         my $type = eval { PVE::HA::Tools::get_ha_resource_type($idlist->{$id}->{type}) };
         next if $@; # silently ignore unknown pve types
 
+        my $sid = "$type:$id";
+        my $nodename = $idlist->{$id}->{node};
+
+        my $rrdentry = $rrd->{"pve-vm-9.0/$id"} // [];
+        # status can be any QMP RunState, but 'running' is the only active VM state
+        my $status = $rrdentry->[2] // "stopped";
+        my $state = $status eq "running" ? "started" : "stopped";
+
+        $stats->{$sid} = {
+            id => $id,
+            node => $nodename,
+            state => $state,
+            type => $type,
+            usage => {},
+        };
+    }
+
+    return $stats;
+}
+
+sub get_static_service_stats {
+    my ($self) = @_;
+
+    my $properties = ['cores', 'cpulimit', 'memory', 'sockets', 'vcpus'];
+    my $stats = get_cluster_service_stats();
+    my $confs = PVE::Cluster::get_guest_config_properties($properties);
+
+    for my $sid (keys %$stats) {
+        my ($id, $type) = $stats->{$sid}->@{qw(id type)};
+
         my $sid = "$type:$id";
         my $conf = $confs->{$id} // {};
         my $plugin = PVE::HA::Resources->lookup($type);
 
-        $stats->{$sid} = $plugin->get_static_stats_from_config($conf);
+        $stats->{$sid}->{usage} = $plugin->get_static_stats_from_config($conf);
     }
 
     return $stats;
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index a7e28c8a..4b4a187f 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -1060,13 +1060,38 @@ sub watchdog_update {
     return &$modify_watchog($self, $code);
 }
 
+my sub get_cluster_service_stats {
+    my ($self) = @_;
+
+    my $stats = {};
+    for my $sid (keys $self->{service_config}->%*) {
+        my $cfg = $self->{service_config}->{$sid};
+
+        $stats->{$sid} = {
+            node => $cfg->{node},
+            state => $cfg->{state},
+            usage => {},
+        };
+    }
+
+    return $stats;
+}
+
 sub get_static_service_stats {
     my ($self) = @_;
 
+    my $stats = get_cluster_service_stats($self);
+
     my $filename = "$self->{statusdir}/static_service_stats";
-    my $stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
+    my $usage_stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
     $self->log('warning', "unable to update static service stats cache - $@") if $@;
 
+    for my $sid (keys %$stats) {
+        next if !defined($usage_stats->{$sid});
+
+        $stats->{$sid}->{usage} = $usage_stats->{$sid};
+    }
+
     return $stats;
 }
 
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 48622d62..98752691 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -66,7 +66,7 @@ sub contains_node {
 my sub get_service_usage {
     my ($self, $sid) = @_;
 
-    my $service_stats = $self->{'service-stats'}->{$sid}
+    my $service_stats = $self->{'service-stats'}->{$sid}->{usage}
         or die "did not get static service usage information for '$sid'\n";
 
     return $service_stats;
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 10/21] include running non-HA resources in the scheduler's accounting
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (21 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 09/21] usage: augment service stats with node and state information Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 11/21] env, resources: add dynamic node and service stats abstraction Daniel Kral
                   ` (12 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

As the service stats repository includes non-HA resources as well, use
it to add the static usage stats of these running, non-HA resources to
the scheduler.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Env.pm          |  6 ++++++
 src/PVE/HA/Env/PVE2.pm     |  6 ++++++
 src/PVE/HA/Manager.pm      | 15 ++++++++++++++-
 src/PVE/HA/Sim/Env.pm      |  6 ++++++
 src/PVE/HA/Sim/Hardware.pm |  6 ++++++
 src/PVE/HA/Usage.pm        |  2 +-
 src/PVE/HA/Usage/Basic.pm  |  2 +-
 src/PVE/HA/Usage/Static.pm |  5 +----
 8 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/src/PVE/HA/Env.pm b/src/PVE/HA/Env.pm
index 6843cb30..ed8c8c83 100644
--- a/src/PVE/HA/Env.pm
+++ b/src/PVE/HA/Env.pm
@@ -300,6 +300,12 @@ sub get_datacenter_settings {
     return $self->{plug}->get_datacenter_settings();
 }
 
+sub get_basic_service_stats {
+    my ($self) = @_;
+
+    return $self->{plug}->get_basic_service_stats();
+}
+
 sub get_static_service_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index c43cf3ca..d5c20460 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -533,6 +533,12 @@ my sub get_cluster_service_stats {
     return $stats;
 }
 
+sub get_basic_service_stats {
+    my ($self) = @_;
+
+    return get_cluster_service_stats();
+}
+
 sub get_static_service_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 57fd0017..f9dec661 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -245,13 +245,15 @@ sub recompute_online_node_usage {
 
     my $online_nodes = $self->{ns}->list_online_nodes();
 
+    my $service_stats;
     my $online_node_usage;
 
     if (my $mode = $self->{crs}->{scheduler}) {
         if ($mode eq 'static') {
             if ($have_static_scheduling) {
                 $online_node_usage = eval {
-                    my $scheduler = PVE::HA::Usage::Static->new($haenv);
+                    $service_stats = $haenv->get_static_service_stats();
+                    my $scheduler = PVE::HA::Usage::Static->new($haenv, $service_stats);
                     $scheduler->add_node($_) for $online_nodes->@*;
                     return $scheduler;
                 };
@@ -271,6 +273,7 @@ sub recompute_online_node_usage {
 
     # fallback to the basic algorithm in any case
     if (!$online_node_usage) {
+        $service_stats = $haenv->get_basic_service_stats();
         $online_node_usage = PVE::HA::Usage::Basic->new($haenv);
         $online_node_usage->add_node($_) for $online_nodes->@*;
     }
@@ -281,6 +284,16 @@ sub recompute_online_node_usage {
         $online_node_usage->add_service_usage($sid, $sd->{state}, $sd->{node}, $sd->{target});
     }
 
+    # add remaining non-HA resources to online node usage
+    for my $sid (sort keys %$service_stats) {
+        next if $self->{ss}->{$sid};
+
+        my ($node, $state) = $service_stats->{$sid}->@{qw(node state)};
+
+        # the migration target is not known for non-HA resources
+        $online_node_usage->add_service_usage($sid, $state, $node, undef);
+    }
+
     $self->{online_node_usage} = $online_node_usage;
 }
 
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index 0e3b02e5..ad51245c 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -488,6 +488,12 @@ sub get_datacenter_settings {
     };
 }
 
+sub get_basic_service_stats {
+    my ($self) = @_;
+
+    return $self->{hardware}->get_basic_service_stats();
+}
+
 sub get_static_service_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 4b4a187f..37aa28f7 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -1077,6 +1077,12 @@ my sub get_cluster_service_stats {
     return $stats;
 }
 
+sub get_basic_service_stats {
+    my ($self) = @_;
+
+    return get_cluster_service_stats($self);
+}
+
 sub get_static_service_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 1be5fa09..9f19a82b 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -4,7 +4,7 @@ use strict;
 use warnings;
 
 sub new {
-    my ($class, $haenv) = @_;
+    my ($class, $haenv, $service_stats) = @_;
 
     die "implement in subclass";
 }
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index ef9ae3d6..2584727b 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -6,7 +6,7 @@ use warnings;
 use base qw(PVE::HA::Usage);
 
 sub new {
-    my ($class, $haenv) = @_;
+    my ($class, $haenv, $service_stats) = @_;
 
     return bless {
         nodes => {},
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 98752691..c8460fd7 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -9,14 +9,11 @@ use PVE::RS::ResourceScheduling::Static;
 use base qw(PVE::HA::Usage);
 
 sub new {
-    my ($class, $haenv) = @_;
+    my ($class, $haenv, $service_stats) = @_;
 
     my $node_stats = eval { $haenv->get_static_node_stats() };
     die "did not get static node usage information - $@" if $@;
 
-    my $service_stats = eval { $haenv->get_static_service_stats() };
-    die "did not get static service usage information - $@" if $@;
-
     my $scheduler = eval { PVE::RS::ResourceScheduling::Static->new(); };
     die "unable to initialize static scheduling - $@" if $@;
 
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 11/21] env, resources: add dynamic node and service stats abstraction
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (22 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 10/21] include running non-HA resources in the scheduler's accounting Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 12/21] env: pve2: implement dynamic node and service stats Daniel Kral
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Env.pm     | 12 ++++++++++++
 src/PVE/HA/Sim/Env.pm | 12 ++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/src/PVE/HA/Env.pm b/src/PVE/HA/Env.pm
index ed8c8c83..0929d3c7 100644
--- a/src/PVE/HA/Env.pm
+++ b/src/PVE/HA/Env.pm
@@ -312,12 +312,24 @@ sub get_static_service_stats {
     return $self->{plug}->get_static_service_stats();
 }
 
+sub get_dynamic_service_stats {
+    my ($self) = @_;
+
+    return $self->{plug}->get_dynamic_service_stats();
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
     return $self->{plug}->get_static_node_stats();
 }
 
+sub get_dynamic_node_stats {
+    my ($self) = @_;
+
+    return $self->{plug}->get_dynamic_node_stats();
+}
+
 sub get_node_version {
     my ($self, $node) = @_;
 
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index ad51245c..65d4efad 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -500,12 +500,24 @@ sub get_static_service_stats {
     return $self->{hardware}->get_static_service_stats();
 }
 
+sub get_dynamic_service_stats {
+    my ($self) = @_;
+
+    return $self->{hardware}->get_dynamic_service_stats();
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
     return $self->{hardware}->get_static_node_stats();
 }
 
+sub get_dynamic_node_stats {
+    my ($self) = @_;
+
+    return $self->{hardware}->get_dynamic_node_stats();
+}
+
 sub get_node_version {
     my ($self, $node) = @_;
 
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 12/21] env: pve2: implement dynamic node and service stats
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (23 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 11/21] env, resources: add dynamic node and service stats abstraction Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 13/21] sim: hardware: pass correct types for static stats Daniel Kral
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Fetch dynamic node and service stats from the rrd_dump() data, which is
periodically propagated through the pmxcfs.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
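For illustration, how a pve-vm-9.0 RRD entry maps to the usage hash
(column indices as used below; sample values invented, and [6] is
presumably the per-guest CPU load fraction):

    my $rrdentry = [ (0) x 5, 2.0, 0.25, 2147483648, 1073741824 ];
    my $maxcpu = ($rrdentry->[5] || 0.0) + 0.0; # vCPU count
    my $usage = {
        maxcpu => $maxcpu,
        cpu    => (($rrdentry->[6] || 0.0) + 0.0) * $maxcpu, # 0.25 * 2.0 = 0.5
        maxmem => int($rrdentry->[7] || 0),
        mem    => int($rrdentry->[8] || 0),
    };
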
 src/PVE/HA/Env/PVE2.pm | 50 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index d5c20460..bb0d8c74 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -559,6 +559,30 @@ sub get_static_service_stats {
     return $stats;
 }
 
+sub get_dynamic_service_stats {
+    my ($self, $id) = @_;
+
+    my $rrd = PVE::Cluster::rrd_dump();
+
+    my $stats = get_cluster_service_stats();
+    for my $sid (keys %$stats) {
+        my $id = $stats->{$sid}->{id};
+        my $rrdentry = $rrd->{"pve-vm-9.0/$id"} // [];
+
+        # FIXME $rrdentry->[5] is the problematic $d->{cpus} from vmstatus
+        my $maxcpu = ($rrdentry->[5] || 0.0) + 0.0;
+
+        $stats->{$sid}->{usage} = {
+            maxcpu => $maxcpu,
+            cpu => (($rrdentry->[6] || 0.0) + 0.0) * $maxcpu,
+            maxmem => int($rrdentry->[7] || 0),
+            mem => int($rrdentry->[8] || 0),
+        };
+    }
+
+    return $stats;
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
@@ -578,6 +602,32 @@ sub get_static_node_stats {
     return $stats;
 }
 
+sub get_dynamic_node_stats {
+    my ($self) = @_;
+
+    my $rrd = PVE::Cluster::rrd_dump();
+
+    my $stats = {};
+    for my $key (keys %$rrd) {
+        my ($nodename) = $key =~ m/^pve-node-9.0\/(\w+)$/;
+
+        next if !$nodename;
+
+        my $rrdentry = $rrd->{$key} // [];
+
+        my $maxcpu = int($rrdentry->[4] || 0);
+
+        $stats->{$nodename} = {
+            maxcpu => $maxcpu,
+            cpu => (($rrdentry->[5] || 0.0) + 0.0) * $maxcpu,
+            maxmem => int($rrdentry->[7] || 0),
+            mem => int($rrdentry->[8] || 0),
+        };
+    }
+
+    return $stats;
+}
+
 sub get_node_version {
     my ($self, $node) = @_;
 
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 13/21] sim: hardware: pass correct types for static stats
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (24 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 12/21] env: pve2: implement dynamic node and service stats Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 14/21] sim: hardware: factor out static stats' default values Daniel Kral
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

From: Dominik Rusovac <d.rusovac@proxmox.com>

crm expects f64 for cpu-related values and usize for mem-related values.
Hence, we now pass doubles for the former and ints for the latter.

Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
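For illustration, the typing convention this enforces on the Perl side
(assumption: the scheduler bridge distinguishes Perl's integer and
floating-point values when deserializing):

    my $node_stats = {
        maxcpu => 24.0,        # kept as a double => f64 on the Rust side
        maxmem => int(131072), # forced to an integer => usize on the Rust side
    };
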
 src/PVE/HA/Sim/Hardware.pm | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 37aa28f7..ce8a6b94 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -488,9 +488,9 @@ sub new {
             || die "Copy failed: $!\n";
     } else {
         my $cstatus = {
-            node1 => { power => 'off', network => 'off', maxcpu => 24, maxmem => 131072 },
-            node2 => { power => 'off', network => 'off', maxcpu => 24, maxmem => 131072 },
-            node3 => { power => 'off', network => 'off', maxcpu => 24, maxmem => 131072 },
+            node1 => { power => 'off', network => 'off', maxcpu => 24.0, maxmem => 131072 },
+            node2 => { power => 'off', network => 'off', maxcpu => 24.0, maxmem => 131072 },
+            node3 => { power => 'off', network => 'off', maxcpu => 24.0, maxmem => 131072 },
         };
         $self->write_hardware_status_nolock($cstatus);
     }
@@ -507,7 +507,7 @@ sub new {
         copy("$testdir/static_service_stats", "$statusdir/static_service_stats");
     } else {
         my $services = $self->read_service_config();
-        my $stats = { map { $_ => { maxcpu => 4, maxmem => 4096 } } keys %$services };
+        my $stats = { map { $_ => { maxcpu => 4.0, maxmem => 4096 } } keys %$services };
         $self->write_static_service_stats($stats);
     }
 
@@ -874,7 +874,7 @@ sub sim_hardware_cmd {
 
                 $self->set_static_service_stats(
                     $sid,
-                    { maxcpu => $params[0], maxmem => $params[1] },
+                    { maxcpu => 0.0 + $params[0], maxmem => int($params[1]) },
                 );
 
             } elsif ($action eq 'delete') {
-- 
2.47.3





^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC ha-manager 14/21] sim: hardware: factor out static stats' default values
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (25 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 13/21] sim: hardware: pass correct types for static stats Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 15/21] sim: hardware: rewrite set-static-stats Daniel Kral
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

From: Dominik Rusovac <d.rusovac@proxmox.com>

Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
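Note that this also changes the magnitude, not just the location, of the
default values: the previous literals were plain (implicitly MiB) counts,
while the factored-out constants are in bytes, i.e.:

    my $default_service_maxmem = 4096 * 1024**2;   # 4096 MiB = 4 GiB, in bytes
    my $default_node_maxmem = 131072 * 1024**2;    # 131072 MiB = 128 GiB, in bytes
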
 src/PVE/HA/Sim/Hardware.pm | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index ce8a6b94..4d82e18c 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -21,6 +21,11 @@ use PVE::HA::Groups;
 
 my $watchdog_timeout = 60;
 
+my $default_service_maxcpu = 4.0;
+my $default_service_maxmem = 4096 * 1024**2;
+my $default_node_maxcpu = 24.0;
+my $default_node_maxmem = 131072 * 1024**2;
+
 # Status directory layout
 #
 # configuration
@@ -488,9 +493,24 @@ sub new {
             || die "Copy failed: $!\n";
     } else {
         my $cstatus = {
-            node1 => { power => 'off', network => 'off', maxcpu => 24.0, maxmem => 131072 },
-            node2 => { power => 'off', network => 'off', maxcpu => 24.0, maxmem => 131072 },
-            node3 => { power => 'off', network => 'off', maxcpu => 24.0, maxmem => 131072 },
+            node1 => {
+                power => 'off',
+                network => 'off',
+                maxcpu => $default_node_maxcpu,
+                maxmem => $default_node_maxmem,
+            },
+            node2 => {
+                power => 'off',
+                network => 'off',
+                maxcpu => $default_node_maxcpu,
+                maxmem => $default_node_maxmem,
+            },
+            node3 => {
+                power => 'off',
+                network => 'off',
+                maxcpu => $default_node_maxcpu,
+                maxmem => $default_node_maxmem,
+            },
         };
         $self->write_hardware_status_nolock($cstatus);
     }
@@ -507,7 +527,12 @@ sub new {
         copy("$testdir/static_service_stats", "$statusdir/static_service_stats");
     } else {
         my $services = $self->read_service_config();
-        my $stats = { map { $_ => { maxcpu => 4.0, maxmem => 4096 } } keys %$services };
+        my $stats = {
+            map {
+                $_ => { maxcpu => $default_service_maxcpu, maxmem => $default_service_maxmem }
+            }
+                keys %$services
+        };
         $self->write_static_service_stats($stats);
     }
 
-- 
2.47.3

* [RFC ha-manager 15/21] sim: hardware: rewrite set-static-stats
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (26 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 14/21] sim: hardware: factor out static stats' default values Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 16/21] sim: hardware: add set-dynamic-stats for services Daniel Kral
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

From: Dominik Rusovac <d.rusovac@proxmox.com>

This decouples the stats input for the set-static-stats command.
Previously, one had to pass both the maxcpu and the maxmem stat, even
when only one of them should change. It is more convenient to set one
stat at a time than to pass the current value of the unchanged stat
alongside the one that should actually change.

Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
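A short usage sketch of the new one-stat-at-a-time interface (mirroring
the RTHardware.pm hunk below; $hw stands for any PVE::HA::Sim::Hardware
instance and vm:101 is an illustrative sid):

    $hw->sim_hardware_cmd('service vm:101 set-static-stats maxcpu 4', 'command');
    $hw->sim_hardware_cmd('service vm:101 set-static-stats maxmem 4096', 'command');
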
 src/PVE/HA/Sim/Hardware.pm   | 23 ++++++++++++++++-------
 src/PVE/HA/Sim/RTHardware.pm |  3 ++-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 4d82e18c..ec135e09 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -743,7 +743,7 @@ sub get_cfs_state {
 #   service <sid> stop <timeout>
 #   service <sid> lock/unlock [lockname]
 #   service <sid> add <node> [<request-state=started>] [<running=0>]
-#   service <sid> set-static-stats <maxcpu> <maxmem>
+#   service <sid> set-static-stats  <maxcpu|maxmem> <cpu cores|MiB>
 #   service <sid> delete
 sub sim_hardware_cmd {
     my ($self, $cmdstr, $logid) = @_;
@@ -894,14 +894,23 @@ sub sim_hardware_cmd {
                 );
 
             } elsif ($action eq 'set-static-stats') {
-                die "sim_hardware_cmd: missing maxcpu for '$action' command" if !$params[0];
-                die "sim_hardware_cmd: missing maxmem for '$action' command" if !$params[1];
+                my ($target, $val) = ($params[0], $params[1]);
 
-                $self->set_static_service_stats(
-                    $sid,
-                    { maxcpu => 0.0 + $params[0], maxmem => int($params[1]) },
-                );
+                if (!$target) {
+                    die "sim_hardware_cmd: missing target stat for '$action' command";
+                } elsif ($target eq "maxcpu") {
+                    die "sim_hardware_cmd: missing value for '$action $target' command"
+                        if !$val;
 
+                    $self->set_static_service_stats($sid, { $target => 0.0 + $val });
+                } elsif ($target eq "maxmem") {
+                    die "sim_hardware_cmd: missing value for '$action $target' command"
+                        if !$val;
+
+                    $self->set_static_service_stats($sid, { $target => $val * 1024**2 });
+                } else {
+                    die "sim_hardware_cmd: unknown target stat '$target' for '$action' command";
+                }
             } elsif ($action eq 'delete') {
 
                 $self->delete_service($sid);
diff --git a/src/PVE/HA/Sim/RTHardware.pm b/src/PVE/HA/Sim/RTHardware.pm
index 9a83d098..3fc52240 100644
--- a/src/PVE/HA/Sim/RTHardware.pm
+++ b/src/PVE/HA/Sim/RTHardware.pm
@@ -532,7 +532,8 @@ sub show_service_add_dialog {
 
         my $maxcpu = $cpu_count_spin->get_value();
         my $maxmem = $memory_spin->get_value();
-        $self->sim_hardware_cmd("service $sid set-static-stats $maxcpu $maxmem", 'command');
+        $self->sim_hardware_cmd("service $sid set-static-stats maxcpu $maxcpu", 'command');
+        $self->sim_hardware_cmd("service $sid set-static-stats maxmem $maxmem", 'command');
 
         $self->add_service_to_gui($sid);
     }
-- 
2.47.3

* [RFC ha-manager 16/21] sim: hardware: add set-dynamic-stats for services
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (27 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 15/21] sim: hardware: rewrite set-static-stats Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 17/21] usage: add dynamic usage scheduler Daniel Kral
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

From: Dominik Rusovac <d.rusovac@proxmox.com>

This adds the command set-dynamic-stats, to simulate the cpu load (cpu)
and memory usage (mem in MiB) of a service, complementing the existing
command set-static-stats, which configures the number of cores (maxcpu)
and RAM (maxmem in MiB) of a service. In addition to using the
designated command, dynamic service stats can be specified beforehand in
the file dynamic_service_stats.

When set-dynamic-stats is called for a service, the dynamic stats of the
node that the service is running on are aggregated accordingly.

Signed-off-by: Dominik Rusovac <d.rusovac@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
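A usage sketch under the same assumptions as in the previous patch ($hw
being a PVE::HA::Sim::Hardware instance, vm:101 an illustrative sid):

    # simulate vm:101 putting 1.5 cores of load and 2048 MiB of usage on its node
    $hw->sim_hardware_cmd('service vm:101 set-dynamic-stats cpu 1.5', 'command');
    $hw->sim_hardware_cmd('service vm:101 set-dynamic-stats mem 2048', 'command');

The dynamic_service_stats file stores the same information with mem in
bytes, e.g.:

    { "vm:101": { "cpu": 1.5, "mem": 2147483648 } }
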
 src/PVE/HA/Sim/Hardware.pm | 130 +++++++++++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)

diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index ec135e09..910f9718 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -21,8 +21,11 @@ use PVE::HA::Groups;
 
 my $watchdog_timeout = 60;
 
+my $default_service_cpu = 2.0;
 my $default_service_maxcpu = 4.0;
+my $default_service_mem = 2048 * 1024**2;
 my $default_service_maxmem = 4096 * 1024**2;
+
 my $default_node_maxcpu = 24.0;
 my $default_node_maxmem = 131072 * 1024**2;
 
@@ -213,6 +216,25 @@ sub set_static_service_stats {
     $self->write_static_service_stats($stats);
 }
 
+sub set_dynamic_service_stats {
+    my ($self, $sid, $new_stats) = @_;
+
+    my $conf = $self->read_service_config();
+    die "no such service '$sid'" if !$conf->{$sid};
+
+    my $stats = $self->read_dynamic_service_stats();
+
+    if (my $memory = $new_stats->{mem}) {
+        $stats->{$sid}->{mem} = $memory;
+    }
+
+    if (my $cpu = $new_stats->{cpu}) {
+        $stats->{$sid}->{cpu} = $cpu;
+    }
+
+    $self->write_dynamic_service_stats($stats);
+}
+
 sub add_service {
     my ($self, $sid, $opts, $running) = @_;
 
@@ -438,6 +460,16 @@ sub read_static_service_stats {
     return $stats;
 }
 
+sub read_dynamic_service_stats {
+    my ($self) = @_;
+
+    my $filename = "$self->{statusdir}/dynamic_service_stats";
+    my $stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
+    $self->log('error', "loading dynamic service stats failed - $@") if $@;
+
+    return $stats;
+}
+
 sub write_static_service_stats {
     my ($self, $stats) = @_;
 
@@ -446,6 +478,14 @@ sub write_static_service_stats {
     $self->log('error', "writing static service stats failed - $@") if $@;
 }
 
+sub write_dynamic_service_stats {
+    my ($self, $stats) = @_;
+
+    my $filename = "$self->{statusdir}/dynamic_service_stats";
+    eval { PVE::HA::Tools::write_json_to_file($filename, $stats) };
+    $self->log('error', "writing dynamic service stats failed - $@") if $@;
+}
+
 sub new {
     my ($this, $testdir) = @_;
 
@@ -536,6 +576,18 @@ sub new {
         $self->write_static_service_stats($stats);
     }
 
+    if (-f "$testdir/dynamic_service_stats") {
+        copy("$testdir/dynamic_service_stats", "$statusdir/dynamic_service_stats");
+    } else {
+        my $services = $self->read_static_service_stats();
+        my $stats = {
+            map { $_ => { cpu => $default_service_cpu, mem => $default_service_mem } }
+                keys %$services
+        };
+
+        $self->write_dynamic_service_stats($stats);
+    }
+
     my $cstatus = $self->read_hardware_status_nolock();
 
     foreach my $node (sort keys %$cstatus) {
@@ -744,6 +796,7 @@ sub get_cfs_state {
 #   service <sid> lock/unlock [lockname]
 #   service <sid> add <node> [<request-state=started>] [<running=0>]
 #   service <sid> set-static-stats  <maxcpu|maxmem> <cpu cores|MiB>
+#   service <sid> set-dynamic-stats <cpu|mem>       <load in cpu cores|usage in MiB>
 #   service <sid> delete
 sub sim_hardware_cmd {
     my ($self, $cmdstr, $logid) = @_;
@@ -911,6 +964,24 @@ sub sim_hardware_cmd {
                 } else {
                     die "sim_hardware_cmd: unknown target stat '$target' for '$action' command";
                 }
+            } elsif ($action eq 'set-dynamic-stats') {
+                my ($target, $val) = ($params[0], $params[1]);
+
+                if (!$target) {
+                    die "sim_hardware_cmd: missing target stat for '$action' command";
+                } elsif ($target eq "cpu") {
+                    die "sim_hardware_cmd: missing value for '$action $target' command"
+                        if !$val;
+
+                    $self->set_dynamic_service_stats($sid, { $target => 0.0 + $val });
+                } elsif ($target eq "mem") {
+                    die "sim_hardware_cmd: missing value for '$action $target' command"
+                        if !$val;
+
+                    $self->set_dynamic_service_stats($sid, { $target => $val * 1024**2 });
+                } else {
+                    die "sim_hardware_cmd: unknown target stat '$target' for '$action' command";
+                }
             } elsif ($action eq 'delete') {
 
                 $self->delete_service($sid);
@@ -1135,6 +1206,27 @@ sub get_static_service_stats {
     return $stats;
 }
 
+sub get_dynamic_service_stats {
+    my ($self) = @_;
+
+    my $stats = get_cluster_service_stats($self);
+    my $static_stats = $self->read_static_service_stats();
+    my $dynamic_stats = $self->read_dynamic_service_stats();
+
+    for my $sid (keys %$stats) {
+        $stats->{$sid}->{usage} = {
+            $static_stats->{$sid}->%*, $dynamic_stats->{$sid}->%*,
+        };
+
+        die "overcommitted cpu on '$sid'"
+            if $stats->{$sid}->{usage}->{cpu} > $stats->{$sid}->{usage}->{maxcpu};
+        die "overcommitted mem on '$sid'"
+            if $stats->{$sid}->{usage}->{mem} > $stats->{$sid}->{usage}->{maxmem};
+    }
+
+    return $stats;
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
@@ -1148,6 +1240,44 @@ sub get_static_node_stats {
     return $stats;
 }
 
+sub get_dynamic_node_stats {
+    my ($self) = @_;
+
+    my $stats = $self->get_static_node_stats();
+    for my $node (keys %$stats) {
+        $stats->{$node}->{maxcpu} = $stats->{$node}->{maxcpu} // $default_node_maxcpu;
+        $stats->{$node}->{cpu} = $stats->{$node}->{cpu} // 0.0;
+        $stats->{$node}->{maxmem} = $stats->{$node}->{maxmem} // $default_node_maxmem;
+        $stats->{$node}->{mem} = $stats->{$node}->{mem} // 0;
+    }
+
+    my $service_conf = $self->read_service_config();
+    my $dynamic_service_stats = $self->get_dynamic_service_stats();
+
+    my $cstatus = $self->read_hardware_status_nolock();
+    my $node_service_status = { map { $_ => $self->read_service_status($_) } keys %$cstatus };
+
+    for my $sid (keys %$service_conf) {
+        my $node = $service_conf->{$sid}->{node};
+
+        if ($node_service_status->{$node}->{$sid}) {
+            my ($cpu, $mem) = $dynamic_service_stats->{$sid}->{usage}->@{qw(cpu mem)};
+
+            die "unknown cpu load for '$sid'" if !defined($cpu);
+            $stats->{$node}->{cpu} += $cpu;
+            die "overcommitted cpu on '$node'"
+                if $stats->{$node}->{cpu} > $stats->{$node}->{maxcpu};
+
+            die "unknown memory usage for '$sid'" if !defined($mem);
+            $stats->{$node}->{mem} += $mem;
+            die "overcommitted mem on '$node'"
+                if $stats->{$node}->{mem} > $stats->{$node}->{maxmem};
+        }
+    }
+
+    return $stats;
+}
+
 sub get_node_version {
     my ($self, $node) = @_;
 
-- 
2.47.3

* [RFC ha-manager 17/21] usage: add dynamic usage scheduler
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (28 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 16/21] sim: hardware: add set-dynamic-stats for services Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 18/21] manager: rename execute_migration to queue_resource_motion Daniel Kral
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
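The dynamic scheduler mode is selected like the static one via the
datacenter CRS configuration; in the simulator's datacenter.cfg fixture
format (as used by the tests later in this series) this amounts to:

    { "crs": { "ha": "dynamic" } }
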
 debian/pve-ha-manager.install |  1 +
 src/PVE/HA/Manager.pm         | 12 +++++
 src/PVE/HA/Usage/Dynamic.pm   | 99 +++++++++++++++++++++++++++++++++++
 src/PVE/HA/Usage/Makefile     |  2 +-
 4 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100644 src/PVE/HA/Usage/Dynamic.pm

diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index 38d5d60b..75220a0b 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -42,6 +42,7 @@
 /usr/share/perl5/PVE/HA/Usage.pm
 /usr/share/perl5/PVE/HA/Usage/Basic.pm
 /usr/share/perl5/PVE/HA/Usage/Static.pm
+/usr/share/perl5/PVE/HA/Usage/Dynamic.pm
 /usr/share/perl5/PVE/Service/pve_ha_crm.pm
 /usr/share/perl5/PVE/Service/pve_ha_lrm.pm
 /usr/share/pve-manager/templates/default/fencing-body.html.hbs
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index f9dec661..fc0e7fc2 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -14,6 +14,7 @@ use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
 use PVE::HA::Rules::ResourceAffinity
     qw(get_affinitive_resources get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
 use PVE::HA::Usage::Basic;
+use PVE::HA::Usage::Dynamic;
 
 my $have_static_scheduling;
 eval {
@@ -264,6 +265,17 @@ sub recompute_online_node_usage {
                 'warning',
                 "fallback to 'basic' scheduler mode, init for 'static' failed - $@",
             ) if $@;
+        } elsif ($mode eq 'dynamic') {
+            $online_node_usage = eval {
+                $service_stats = $haenv->get_dynamic_service_stats();
+                my $scheduler = PVE::HA::Usage::Dynamic->new($haenv, $service_stats);
+                $scheduler->add_node($_) for $online_nodes->@*;
+                return $scheduler;
+            };
+            $haenv->log(
+                'warning',
+                "fallback to 'basic' scheduler mode, init for 'dynamic' failed - $@",
+            ) if $@;
         } elsif ($mode eq 'basic') {
             # handled below in the general fall-back case
         } else {
diff --git a/src/PVE/HA/Usage/Dynamic.pm b/src/PVE/HA/Usage/Dynamic.pm
new file mode 100644
index 00000000..f4049f62
--- /dev/null
+++ b/src/PVE/HA/Usage/Dynamic.pm
@@ -0,0 +1,99 @@
+package PVE::HA::Usage::Dynamic;
+
+use strict;
+use warnings;
+
+use PVE::HA::Resources;
+use PVE::RS::ResourceScheduling::Dynamic;
+
+use base qw(PVE::HA::Usage);
+
+sub new {
+    my ($class, $haenv, $service_stats) = @_;
+
+    my $node_stats = eval { $haenv->get_dynamic_node_stats() };
+    die "did not get dynamic node usage information - $@" if $@;
+
+    my $scheduler = eval { PVE::RS::ResourceScheduling::Dynamic->new() };
+    die "unable to initialize dynamic scheduling - $@" if $@;
+
+    return bless {
+        'node-stats' => $node_stats,
+        'service-stats' => $service_stats,
+        haenv => $haenv,
+        scheduler => $scheduler,
+    }, $class;
+}
+
+sub add_node {
+    my ($self, $nodename) = @_;
+
+    my $stats = $self->{'node-stats'}->{$nodename}
+        or die "did not get dynamic node usage information for '$nodename'\n";
+    die "dynamic node usage information for '$nodename' missing cpu count\n" if !$stats->{maxcpu};
+    die "dynamic node usage information for '$nodename' missing memory\n" if !$stats->{maxmem};
+
+    eval { $self->{scheduler}->add_node($nodename, $stats); };
+    die "initializing dynamic node usage for '$nodename' failed - $@" if $@;
+}
+
+sub remove_node {
+    my ($self, $nodename) = @_;
+
+    $self->{scheduler}->remove_node($nodename);
+}
+
+sub list_nodes {
+    my ($self) = @_;
+
+    return $self->{scheduler}->list_nodes()->@*;
+}
+
+sub contains_node {
+    my ($self, $nodename) = @_;
+
+    return $self->{scheduler}->contains_node($nodename);
+}
+
+my sub get_service_usage {
+    my ($self, $sid) = @_;
+
+    my $service_stats = $self->{'service-stats'}->{$sid}->{usage}
+        or die "did not get static service usage information for '$sid'\n";
+
+    return $service_stats;
+}
+
+sub add_service_usage_to_node {
+    my ($self, $nodename, $sid) = @_;
+
+    eval {
+        my $service_usage = get_service_usage($self, $sid);
+        $self->{scheduler}->add_service_usage_to_node($nodename, $sid, $service_usage);
+    };
+    $self->{haenv}->log('warning', "unable to add service '$sid' usage to node '$nodename' - $@")
+        if $@;
+}
+
+sub remove_service_usage {
+    my ($self, $sid) = @_;
+
+    eval { $self->{scheduler}->remove_service_usage($sid) };
+    $self->{haenv}->log('warning', "unable to remove service '$sid' usage - $@") if $@;
+}
+
+sub score_nodes_to_start_service {
+    my ($self, $sid) = @_;
+
+    my $score_list = eval {
+        my $service_usage = get_service_usage($self, $sid);
+        $self->{scheduler}->score_nodes_to_start_service($service_usage);
+    };
+    $self->{haenv}
+        ->log('err', "unable to score nodes according to dynamic usage for service '$sid' - $@");
+
+    # Take minus the value, so that a lower score is better, which our caller(s) expect(s).
+    return { map { $_->[0] => -$_->[1] } $score_list->@* };
+}
+
+1;
diff --git a/src/PVE/HA/Usage/Makefile b/src/PVE/HA/Usage/Makefile
index befdda60..5d51a9c1 100644
--- a/src/PVE/HA/Usage/Makefile
+++ b/src/PVE/HA/Usage/Makefile
@@ -1,5 +1,5 @@
 SIM_SOURCES=Basic.pm
-SOURCES=${SIM_SOURCES} Static.pm
+SOURCES=${SIM_SOURCES} Static.pm Dynamic.pm
 
 .PHONY: install
 install:
-- 
2.47.3

* [RFC ha-manager 18/21] manager: rename execute_migration to queue_resource_motion
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (29 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 17/21] usage: add dynamic usage scheduler Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 19/21] manager: update_crs_scheduler_mode: factor out crs config Daniel Kral
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

The name is misleading: the method does not execute the HA resource
migration, but only queues the HA resource's state change to 'migrate'
or 'relocate', which is then picked up by the respective LRM to execute.

The term 'resource motion' also generalizes over the different actions
implied by the 'migrate' and 'relocate' commands and states.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Manager.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index fc0e7fc2..fd71ec44 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -408,7 +408,7 @@ sub read_lrm_status {
     return ($results, $modes);
 }
 
-sub execute_migration {
+sub queue_resource_motion {
     my ($self, $cmd, $task, $sid, $target) = @_;
 
     my ($haenv, $ss) = $self->@{qw(haenv ss)};
@@ -477,7 +477,7 @@ sub update_crm_commands {
                             "ignore crm command - service already on target node: $cmd",
                         );
                     } else {
-                        $self->execute_migration($cmd, $task, $sid, $node);
+                        $self->queue_resource_motion($cmd, $task, $sid, $node);
                     }
                 }
             } else {
-- 
2.47.3

* [RFC ha-manager 19/21] manager: update_crs_scheduler_mode: factor out crs config
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (30 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 18/21] manager: rename execute_migration to queue_resource_motion Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 20/21] implement automatic rebalancing Daniel Kral
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 src/PVE/HA/Manager.pm | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index fd71ec44..e3ab1ee7 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -86,11 +86,12 @@ sub update_crs_scheduler_mode {
 
     my $haenv = $self->{haenv};
     my $dc_cfg = $haenv->get_datacenter_settings();
+    my $crs_cfg = $dc_cfg->{crs};
 
-    $self->{crs}->{rebalance_on_request_start} = !!$dc_cfg->{crs}->{'ha-rebalance-on-start'};
+    $self->{crs}->{rebalance_on_request_start} = !!$crs_cfg->{'ha-rebalance-on-start'};
 
     my $old_mode = $self->{crs}->{scheduler};
-    my $new_mode = $dc_cfg->{crs}->{ha} || 'basic';
+    my $new_mode = $crs_cfg->{ha} || 'basic';
 
     if (!defined($old_mode)) {
         $haenv->log('info', "using scheduler mode '$new_mode'") if $new_mode ne 'basic';
-- 
2.47.3

* [RFC ha-manager 20/21] implement automatic rebalancing
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (31 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 19/21] manager: update_crs_scheduler_mode: factor out crs config Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases Daniel Kral
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

The automatic load rebalancing system checks whether the cluster node
imbalance exceeds a user-defined threshold for a number of consecutive
HA Manager rounds ("hold duration"). If it does, it chooses the service
migration/relocation that best improves the cluster node imbalance and
queues it if the improvement exceeds a user-defined margin.

This introduces resource bundles, which ensure that HA resources in
strict positive resource affinity rules are considered as a whole
"bundle" instead of individually. Additionally, the migration candidate
generation prunes any target nodes that do not adhere to the HA rules
before scoring the migration candidates.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
As noted by the TODO, the migration candidate generation will likely be
moved to the perlmod bindings or proxmox-resource-scheduling.

This version includes a debug log statement that reports the current
node imbalance through the pve-ha-crm log; it will obviously not be part
of a final revision, to avoid syslog spam.
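
For concreteness, a worked example of the checks as implemented below
(numbers illustrative): with threshold 0.7, hold duration 3 and margin
0.1, an imbalance of 0.9 must persist for three consecutive rounds
before migration candidates are scored; a best candidate with an
expected target imbalance of 0.75 then passes the margin check, since

    relative_change = (imbalance - target_imbalance) / imbalance
                    = (0.90 - 0.75) / 0.90
                    ~ 0.17 >= 0.1 (margin)   => queue the resource motion

The corresponding knobs in the simulator's datacenter.cfg fixture format
(option names and defaults as introduced by this patch) would be:

    {
        "crs": {
            "ha": "dynamic",
            "ha-auto-rebalance": 1,
            "ha-auto-rebalance-threshold": 0.7,
            "ha-auto-rebalance-method": "bruteforce",
            "ha-auto-rebalance-hold-duration": 3,
            "ha-auto-rebalance-margin": 0.1
        }
    }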

 src/PVE/HA/Manager.pm       | 171 +++++++++++++++++++++++++++++++++++-
 src/PVE/HA/Usage.pm         |  36 ++++++++
 src/PVE/HA/Usage/Dynamic.pm |  65 +++++++++++++-
 src/PVE/HA/Usage/Static.pm  |  60 +++++++++++++
 4 files changed, 329 insertions(+), 3 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index e3ab1ee7..5915b55a 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -54,10 +54,13 @@ sub new {
 
     my $self = bless {
         haenv => $haenv,
-        crs => {},
+        crs => {
+            auto_rebalance => {},
+        },
         last_rules_digest => '',
         last_groups_digest => '',
         last_services_digest => '',
+        sustained_imbalance_round => 0,
         group_migration_round => 3, # wait a little bit
     }, $class;
 
@@ -89,6 +92,13 @@ sub update_crs_scheduler_mode {
     my $crs_cfg = $dc_cfg->{crs};
 
     $self->{crs}->{rebalance_on_request_start} = !!$crs_cfg->{'ha-rebalance-on-start'};
+    $self->{crs}->{auto_rebalance}->{enable} = !!$crs_cfg->{'ha-auto-rebalance'};
+    $self->{crs}->{auto_rebalance}->{threshold} = $crs_cfg->{'ha-auto-rebalance-threshold'} // 0.7;
+    $self->{crs}->{auto_rebalance}->{method} = $crs_cfg->{'ha-auto-rebalance-method'}
+        // 'bruteforce';
+    $self->{crs}->{auto_rebalance}->{hold_duration} = $crs_cfg->{'ha-auto-rebalance-hold-duration'}
+        // 3;
+    $self->{crs}->{auto_rebalance}->{margin} = $crs_cfg->{'ha-auto-rebalance-margin'} // 0.1;
 
     my $old_mode = $self->{crs}->{scheduler};
     my $new_mode = $crs_cfg->{ha} || 'basic';
@@ -106,6 +116,148 @@ sub update_crs_scheduler_mode {
     return;
 }
 
+# Returns a hash of lists, which contain the running, non-moving HA resource
+# bundles, which are on the same node, implied by the strict positive resource
+# affinity rules.
+#
+# Each resource bundle has a leader, which is the alphabetically first running
+# HA resource in the resource bundle and also the key of each resource bundle
+# in the returned hash.
+my sub get_active_stationary_resource_bundles {
+    my ($ss, $resource_affinity) = @_;
+
+    my $resource_bundles = {};
+    for my $sid (sort keys %$ss) {
+        next if $ss->{$sid}->{state} ne 'started';
+
+        my @resources = ($sid);
+        my $nodes = { $ss->{$sid}->{node} => 1 };
+
+        my ($dependent_resources) = get_affinitive_resources($resource_affinity, $sid);
+        if (%$dependent_resources) {
+            for my $csid (keys %$dependent_resources) {
+                my ($state, $node) = $ss->{$csid}->@{qw(state node)};
+
+                next if $state ne 'started';
+
+                $nodes->{$node} = 1;
+
+                push @resources, $csid;
+            }
+
+            @resources = sort @resources;
+        }
+
+        # skip resource bundles, which are not on the same node yet
+        next if keys %$nodes > 1;
+
+        my $leader_sid = $resources[0];
+
+        $resource_bundles->{$leader_sid} = \@resources;
+    }
+
+    return $resource_bundles;
+}
+
+# Returns a hash of hashes, where each item contains the resource bundle's
+# leader, the list of HA resources in the resource bundle, and the list of
+# possible nodes to migrate to.
+sub get_resource_migration_candidates {
+    my ($self) = @_;
+
+    my ($ss, $compiled_rules, $online_node_usage) =
+        $self->@{qw(ss compiled_rules online_node_usage)};
+    my ($node_affinity, $resource_affinity) =
+        $compiled_rules->@{qw(node-affinity resource-affinity)};
+
+    my $resource_bundles = get_active_stationary_resource_bundles($ss, $resource_affinity);
+
+    my @compact_migration_candidates = ();
+    for my $leader_sid (sort keys %$resource_bundles) {
+        my $current_leader_node = $ss->{$leader_sid}->{node};
+        my $online_nodes = { map { $_ => 1 } $online_node_usage->list_nodes() };
+
+        my (undef, $target_nodes) = get_node_affinity($node_affinity, $leader_sid, $online_nodes);
+        my ($together, $separate) =
+            get_resource_affinity($resource_affinity, $leader_sid, $ss, $online_nodes);
+        apply_negative_resource_affinity($separate, $target_nodes);
+
+        delete $target_nodes->{$current_leader_node};
+
+        next if !%$target_nodes;
+
+        push @compact_migration_candidates,
+            {
+                leader => $leader_sid,
+                nodes => [sort keys %$target_nodes],
+                services => $resource_bundles->{$leader_sid},
+            };
+    }
+
+    return \@compact_migration_candidates;
+}
+
+sub load_balance {
+    my ($self) = @_;
+
+    my ($crs, $haenv, $online_node_usage) = $self->@{qw(crs haenv online_node_usage)};
+    my ($auto_rebalance_opts) = $crs->{auto_rebalance};
+
+    return if !$auto_rebalance_opts->{enable};
+    return if $crs->{scheduler} ne 'static' && $crs->{scheduler} ne 'dynamic';
+    return if $self->any_resource_motion_queued_or_running();
+
+    my ($threshold, $method, $hold_duration, $margin) =
+        $auto_rebalance_opts->@{qw(threshold method hold_duration margin)};
+
+    my $node_loads = $online_node_usage->calculate_node_loads();
+    my $imbalance = $online_node_usage->calculate_node_imbalance();
+
+    $haenv->log('debug', "auto rebalance - node imbalance: $imbalance");
+
+    # do not load balance unless imbalance threshold has been exceeded
+    # consecutively for $hold_duration calls to load_balance()
+    if ($imbalance < $threshold) {
+        $self->{sustained_imbalance_round} = 0;
+        return;
+    } else {
+        $self->{sustained_imbalance_round}++;
+        return if $self->{sustained_imbalance_round} < $hold_duration;
+        $self->{sustained_imbalance_round} = 0;
+    }
+
+    # TODO Move migration candidate generation into PVE::RS::ResourceScheduling
+    my $candidates = $self->get_resource_migration_candidates();
+
+    my $result;
+    if ($method eq 'bruteforce') {
+        $result = $online_node_usage->select_best_balancing_migration($candidates);
+    } elsif ($method eq 'topsis') {
+        $result = $online_node_usage->select_best_balancing_migration_topsis($candidates);
+    }
+
+    return if !$result;
+
+    my ($migration, $target_imbalance) = $result->@{qw(migration imbalance)};
+
+    my $relative_change = ($imbalance - $target_imbalance) / $imbalance;
+    return if $relative_change < $margin;
+
+    my ($sid, $source, $target) = $migration->@{qw(sid source-node target-node)};
+
+    my (undef, $type, $id) = $haenv->parse_sid($sid);
+    my $task = $type eq 'vm' ? "migrate" : "relocate";
+    my $cmd = "$task $sid $target";
+
+    my $target_imbalance_str = int(100 * $target_imbalance + 0.5) / 100;
+    $haenv->log(
+        'info',
+        "auto rebalance - $task $sid to $target (expected target imbalance: $target_imbalance_str)",
+    );
+
+    $self->queue_resource_motion($cmd, $task, $sid, $target);
+}
+
 sub cleanup {
     my ($self) = @_;
 
@@ -455,6 +607,21 @@ sub queue_resource_motion {
     }
 }
 
+sub any_resource_motion_queued_or_running {
+    my ($self) = @_;
+
+    my ($ss) = $self->@{qw(ss)};
+
+    for my $sid (keys %$ss) {
+        my ($cmd, $state) = $ss->{$sid}->@{qw(cmd state)};
+
+        return 1 if $state eq 'migrate' || $state eq 'relocate';
+        return 1 if defined($cmd) && ($cmd->[0] eq 'migrate' || $cmd->[0] eq 'relocate');
+    }
+
+    return 0;
+}
+
 # read new crm commands and save them into crm master status
 sub update_crm_commands {
     my ($self) = @_;
@@ -738,6 +905,8 @@ sub manage {
 
     $self->update_crm_commands();
 
+    $self->load_balance();
+
     for (;;) {
         my $repeat = 0;
 
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 9f19a82b..3515b48f 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -59,6 +59,42 @@ sub remove_service_usage {
     die "implement in subclass";
 }
 
+sub calculate_node_loads {
+    my ($self) = @_;
+
+    die "implement in subclass";
+}
+
+sub calculate_node_imbalance {
+    my ($self) = @_;
+
+    die "implement in subclass";
+}
+
+sub score_best_balancing_migrations {
+    my ($self, $migration_candidates, $limit) = @_;
+
+    die "implement in subclass";
+}
+
+sub select_best_balancing_migration {
+    my ($self, $migration_candidates) = @_;
+
+    die "implement in subclass";
+}
+
+sub score_best_balancing_migrations_topsis {
+    my ($self, $migration_candidates, $limit) = @_;
+
+    die "implement in subclass";
+}
+
+sub select_best_balancing_migration_topsis {
+    my ($self, $migration_candidates) = @_;
+
+    die "implement in subclass";
+}
+
 # Returns a hash with $nodename => $score pairs. A lower $score is better.
 sub score_nodes_to_start_service {
     my ($self, $sid) = @_;
diff --git a/src/PVE/HA/Usage/Dynamic.pm b/src/PVE/HA/Usage/Dynamic.pm
index f4049f62..12bdc383 100644
--- a/src/PVE/HA/Usage/Dynamic.pm
+++ b/src/PVE/HA/Usage/Dynamic.pm
@@ -59,7 +59,7 @@ my sub get_service_usage {
     my ($self, $sid) = @_;
 
     my $service_stats = $self->{'service-stats'}->{$sid}->{usage}
-        or die "did not get static service usage information for '$sid'\n";
+        or die "did not get dynamic service usage information for '$sid'\n";
 
     return $service_stats;
 }
@@ -82,6 +82,66 @@ sub remove_service_usage {
     $self->{haenv}->log('warning', "unable to remove service '$sid' usage - $@") if $@;
 }
 
+sub calculate_node_loads {
+    my ($self) = @_;
+
+    my $node_loads = eval { $self->{scheduler}->calculate_node_loads() };
+    $self->{haenv}->log('warning', "unable to calculate dynamic node loads - $@") if $@;
+
+    return { map { $_->[0] => $_->[1] } $node_loads->@* };
+}
+
+sub calculate_node_imbalance {
+    my ($self) = @_;
+
+    my $node_imbalance = eval { $self->{scheduler}->calculate_node_imbalance() };
+    $self->{haenv}->log('warning', "unable to calculate dynamic node imbalance - $@") if $@;
+
+    return $node_imbalance // 0.0;
+}
+
+sub score_best_balancing_migrations {
+    my ($self, $migration_candidates, $limit) = @_;
+
+    my $migrations =
+        eval { $self->{scheduler}->score_best_balancing_migrations($migration_candidates, $limit); };
+    $self->{haenv}->log('warning', "unable to score best balancing migration - $@") if $@;
+
+    return $migrations;
+}
+
+sub select_best_balancing_migration {
+    my ($self, $migration_candidates) = @_;
+
+    my $result =
+        eval { $self->{scheduler}->select_best_balancing_migration($migration_candidates) };
+    $self->{haenv}->log('warning', "unable to select best balancing migration - $@") if $@;
+
+    return $result;
+}
+
+sub score_best_balancing_migrations_topsis {
+    my ($self, $migration_candidates, $limit) = @_;
+
+    my $migrations = eval {
+        $self->{scheduler}
+            ->score_best_balancing_migrations_topsis($migration_candidates, $limit);
+    };
+    $self->{haenv}->log('warning', "unable to score best balancing migration - $@") if $@;
+
+    return $migrations;
+}
+
+sub select_best_balancing_migration_topsis {
+    my ($self, $migration_candidates) = @_;
+
+    my $result =
+        eval { $self->{scheduler}->select_best_balancing_migration_topsis($migration_candidates) };
+    $self->{haenv}->log('warning', "unable to select best balancing migration - $@") if $@;
+
+    return $result;
+}
+
 sub score_nodes_to_start_service {
     my ($self, $sid) = @_;
 
@@ -90,7 +150,8 @@ sub score_nodes_to_start_service {
         $self->{scheduler}->score_nodes_to_start_service($service_usage);
     };
     $self->{haenv}
-        ->log('err', "unable to score nodes according to dynamic usage for service '$sid' - $@");
+        ->log('err', "unable to score nodes according to dynamic usage for service '$sid' - $@")
+        if $@;
 
     # Take minus the value, so that a lower score is better, which our caller(s) expect(s).
     return { map { $_->[0] => -$_->[1] } $score_list->@* };
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index c8460fd7..ecc2a14f 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -91,6 +91,66 @@ sub remove_service_usage {
     $self->{haenv}->log('warning', "unable to remove service '$sid' usage - $@") if $@;
 }
 
+sub calculate_node_loads {
+    my ($self) = @_;
+
+    my $node_loads = eval { $self->{scheduler}->calculate_node_loads() };
+    $self->{haenv}->log('warning', "unable to calculate static node loads - $@") if $@;
+
+    return { map { $_->[0] => $_->[1] } $node_loads->@* };
+}
+
+sub calculate_node_imbalance {
+    my ($self) = @_;
+
+    my $node_imbalance = eval { $self->{scheduler}->calculate_node_imbalance() };
+    $self->{haenv}->log('warning', "unable to calculate static node imbalance - $@") if $@;
+
+    return $node_imbalance // 0.0;
+}
+
+sub score_best_balancing_migrations {
+    my ($self, $migration_candidates, $limit) = @_;
+
+    my $migrations =
+        eval { $self->{scheduler}->score_best_balancing_migrations($migration_candidates, $limit); };
+    $self->{haenv}->log('warning', "unable to score best balancing migration - $@") if $@;
+
+    return $migrations;
+}
+
+sub select_best_balancing_migration {
+    my ($self, $migration_candidates) = @_;
+
+    my $result =
+        eval { $self->{scheduler}->select_best_balancing_migration($migration_candidates) };
+    $self->{haenv}->log('warning', "unable to select best balancing migration - $@") if $@;
+
+    return $result;
+}
+
+sub score_best_balancing_migrations_topsis {
+    my ($self, $migration_candidates, $limit) = @_;
+
+    my $migrations = eval {
+        $self->{scheduler}
+            ->score_best_balancing_migrations_topsis($migration_candidates, $limit);
+    };
+    $self->{haenv}->log('warning', "unable to score best balancing migration - $@") if $@;
+
+    return $migrations;
+}
+
+sub select_best_balancing_migration_topsis {
+    my ($self, $migration_candidates) = @_;
+
+    my $result =
+        eval { $self->{scheduler}->select_best_balancing_migration_topsis($migration_candidates) };
+    $self->{haenv}->log('warning', "unable to select best balancing migration - $@") if $@;
+
+    return $result;
+}
+
 sub score_nodes_to_start_service {
     my ($self, $sid) = @_;
 
-- 
2.47.3

* [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (32 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 20/21] implement automatic rebalancing Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-17 14:14 ` [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option Daniel Kral
  2026-02-17 14:14 ` [RFC manager 2/2] ui: dc/options: add auto rebalancing options Daniel Kral
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

These test cases document the basic behavior of the automatic load
rebalancer with both constant and changing dynamic resource usages.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
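A sketch of how the changing-usage cases are driven (cmdlist syntax as
in the fixtures below, combined with the set-dynamic-stats command
introduced earlier in this series; the values are illustrative):

    [
        [ "power node1 on", "power node2 on", "power node3 on" ],
        [ "service vm:101 set-dynamic-stats cpu 4.0" ]
    ]
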
 .../test-crs-dynamic-auto-rebalance0/README   |  2 +
 .../test-crs-dynamic-auto-rebalance0/cmdlist  |  3 +
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  1 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 11 +++
 .../manager_status                            |  1 +
 .../service_config                            |  1 +
 .../static_service_stats                      |  1 +
 .../test-crs-dynamic-auto-rebalance1/README   |  6 ++
 .../test-crs-dynamic-auto-rebalance1/cmdlist  |  3 +
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  3 +
 .../hardware_status                           |  5 ++
 .../log.expect                                | 25 ++++++
 .../manager_status                            |  1 +
 .../service_config                            |  3 +
 .../static_service_stats                      |  3 +
 .../test-crs-dynamic-auto-rebalance2/README   |  3 +
 .../test-crs-dynamic-auto-rebalance2/cmdlist  |  3 +
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  6 ++
 .../hardware_status                           |  5 ++
 .../log.expect                                | 59 +++++++++++++
 .../manager_status                            |  1 +
 .../service_config                            |  6 ++
 .../static_service_stats                      |  6 ++
 .../test-crs-dynamic-auto-rebalance3/README   |  3 +
 .../test-crs-dynamic-auto-rebalance3/cmdlist  | 24 +++++
 .../datacenter.cfg                            |  8 ++
 .../dynamic_service_stats                     |  9 ++
 .../hardware_status                           |  5 ++
 .../log.expect                                | 88 +++++++++++++++++++
 .../manager_status                            |  1 +
 .../service_config                            |  9 ++
 .../static_service_stats                      |  9 ++
 36 files changed, 343 insertions(+)
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/README
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/cmdlist
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/hardware_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/log.expect
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/manager_status
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/service_config
 create mode 100644 src/test/test-crs-dynamic-auto-rebalance3/static_service_stats

diff --git a/src/test/test-crs-dynamic-auto-rebalance0/README b/src/test/test-crs-dynamic-auto-rebalance0/README
new file mode 100644
index 00000000..54e1d981
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/README
@@ -0,0 +1,2 @@
+Test that the auto rebalance system does not trigger if no HA resources are
+configured in a homogeneous node cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/cmdlist b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/dynamic_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/hardware_status b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/log.expect b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
new file mode 100644
index 00000000..27eed635
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/log.expect
@@ -0,0 +1,11 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/manager_status b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/service_config b/src/test/test-crs-dynamic-auto-rebalance0/service_config
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/service_config
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance0/static_service_stats
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/README b/src/test/test-crs-dynamic-auto-rebalance1/README
new file mode 100644
index 00000000..c99a7891
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/README
@@ -0,0 +1,6 @@
+Test that the auto rebalance system does not trigger for a single running HA
+resource in a homogeneous cluster.
+
+Even though the single running HA resource will create a high node imbalance,
+which would trigger a rebalancing migration, there is no such migration that can
+improve the imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/cmdlist b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
new file mode 100644
index 00000000..50dd4901
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/dynamic_service_stats
@@ -0,0 +1,3 @@
+{
+    "vm:101": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/hardware_status b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
new file mode 100644
index 00000000..7f97253b
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 4, "maxmem": 17179869184 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/log.expect b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
new file mode 100644
index 00000000..e6ee4402
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/log.expect
@@ -0,0 +1,25 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'dynamic'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/manager_status b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/service_config b/src/test/test-crs-dynamic-auto-rebalance1/service_config
new file mode 100644
index 00000000..a0ab66d2
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/service_config
@@ -0,0 +1,3 @@
+{
+    "vm:101": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
new file mode 100644
index 00000000..e1bf0839
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance1/static_service_stats
@@ -0,0 +1,3 @@
+{
+    "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/README b/src/test/test-crs-dynamic-auto-rebalance2/README
new file mode 100644
index 00000000..b9acfdb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system will auto rebalance multiple running,
+homogeneous HA resources on a single node to other cluster nodes to reach a
+minimum cluster node imbalance in the homogeneous cluster.
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/cmdlist b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
new file mode 100644
index 00000000..13f90cd7
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/cmdlist
@@ -0,0 +1,3 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
new file mode 100644
index 00000000..f01fd768
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/dynamic_service_stats
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "cpu": 1.0, "mem": 4294967296 },
+    "vm:102": { "cpu": 1.0, "mem": 4294967296 },
+    "vm:103": { "cpu": 1.0, "mem": 4294967296 },
+    "vm:104": { "cpu": 1.0, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/hardware_status b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
new file mode 100644
index 00000000..ce8cf0eb
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 34359738368 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/log.expect b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
new file mode 100644
index 00000000..a1796c56
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/log.expect
@@ -0,0 +1,59 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'dynamic'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node1'
+info     20    node1/crm: adding new service 'vm:104' on node 'node1'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node1)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     21    node1/lrm: starting service vm:102
+info     21    node1/lrm: service status vm:102 started
+info     21    node1/lrm: starting service vm:103
+info     21    node1/lrm: service status vm:103 started
+info     21    node1/lrm: starting service vm:104
+info     21    node1/lrm: service status vm:104 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info     80    node1/crm: auto rebalance - migrate vm:101 to node2 (expected target imbalance: 0.94)
+info     80    node1/crm: got crm command: migrate vm:101 node2
+info     80    node1/crm: migrate service 'vm:101' to node 'node2'
+info     80    node1/crm: service 'vm:101': state changed from 'started' to 'migrate'  (node = node1, target = node2)
+info     81    node1/lrm: service vm:101 - start migrate to node 'node2'
+info     81    node1/lrm: service vm:101 - end migrate to node 'node2'
+info     83    node2/lrm: got lock 'ha_agent_node2_lock'
+info     83    node2/lrm: status change wait_for_agent_lock => active
+info    100    node1/crm: service 'vm:101': state changed from 'migrate' to 'started'  (node = node2)
+info    103    node2/lrm: starting service vm:101
+info    103    node2/lrm: service status vm:101 started
+info    160    node1/crm: auto rebalance - migrate vm:103 to node3 (expected target imbalance: 0.35)
+info    160    node1/crm: got crm command: migrate vm:103 node3
+info    160    node1/crm: migrate service 'vm:103' to node 'node3'
+info    160    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node1, target = node3)
+info    161    node1/lrm: service vm:103 - start migrate to node 'node3'
+info    161    node1/lrm: service vm:103 - end migrate to node 'node3'
+info    165    node3/lrm: got lock 'ha_agent_node3_lock'
+info    165    node3/lrm: status change wait_for_agent_lock => active
+info    180    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node3)
+info    185    node3/lrm: starting service vm:103
+info    185    node3/lrm: service status vm:103 started
+info    620     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/manager_status b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
new file mode 100644
index 00000000..0967ef42
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/manager_status
@@ -0,0 +1 @@
+{}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/service_config b/src/test/test-crs-dynamic-auto-rebalance2/service_config
new file mode 100644
index 00000000..b5960cb1
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/service_config
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node1", "state": "started" },
+    "vm:104": { "node": "node1", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
new file mode 100644
index 00000000..6cf8c106
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance2/static_service_stats
@@ -0,0 +1,6 @@
+{
+    "vm:101": { "maxcpu": 2.0, "maxmem": 8589934592 },
+    "vm:102": { "maxcpu": 2.0, "maxmem": 8589934592 },
+    "vm:103": { "maxcpu": 2.0, "maxmem": 8589934592 },
+    "vm:104": { "maxcpu": 2.0, "maxmem": 8589934592 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/README b/src/test/test-crs-dynamic-auto-rebalance3/README
new file mode 100644
index 00000000..44791d6f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/README
@@ -0,0 +1,3 @@
+Test that the auto rebalance system rebalances multiple running HA resources
+whose usages differ and change over time in a homogeneous cluster, reaching a
+minimal cluster node imbalance.
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/cmdlist b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
new file mode 100644
index 00000000..42fb259f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/cmdlist
@@ -0,0 +1,24 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on" ],
+    [
+        "service vm:105 set-dynamic-stats cpu 7.8",
+        "service vm:105 set-dynamic-stats mem 7912",
+        "service vm:106 set-dynamic-stats cpu 5.7",
+        "service vm:106 set-dynamic-stats mem 8192",
+        "service vm:107 set-dynamic-stats cpu 6.0",
+        "service vm:107 set-dynamic-stats mem 8011"
+    ],
+    [
+        "service vm:101 set-dynamic-stats mem 1011",
+        "service vm:103 set-dynamic-stats cpu 3.9",
+        "service vm:103 set-dynamic-stats mem 6517",
+        "service vm:104 set-dynamic-stats cpu 6.7",
+        "service vm:104 set-dynamic-stats mem 8001",
+        "service vm:105 set-dynamic-stats cpu 1.8",
+        "service vm:105 set-dynamic-stats mem 1201",
+        "service vm:106 set-dynamic-stats cpu 2.1",
+        "service vm:106 set-dynamic-stats mem 1211",
+        "service vm:107 set-dynamic-stats cpu 0.9",
+        "service vm:107 set-dynamic-stats mem 1191"
+    ]
+]
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
new file mode 100644
index 00000000..6526c203
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/datacenter.cfg
@@ -0,0 +1,8 @@
+{
+    "crs": {
+        "ha": "dynamic",
+        "ha-auto-rebalance": 1,
+        "ha-auto-rebalance-threshold": 0.7
+    }
+}
+
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
new file mode 100644
index 00000000..77e72c16
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/dynamic_service_stats
@@ -0,0 +1,9 @@
+{
+    "vm:101": { "cpu": 0.9, "mem": 5444206592 },
+    "vm:102": { "cpu": 1.2, "mem": 2621440000 },
+    "vm:103": { "cpu": 0.8, "mem": 5444206592 },
+    "vm:104": { "cpu": 0.9, "mem": 2621440000 },
+    "vm:105": { "cpu": 3.0, "mem": 5444206592 },
+    "vm:106": { "cpu": 2.9, "mem": 2621440000 },
+    "vm:107": { "cpu": 2.1, "mem": 4294967296 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/hardware_status b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
new file mode 100644
index 00000000..8f1e695c
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+  "node2": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 },
+  "node3": { "power": "off", "network": "off", "maxcpu": 24, "maxmem": 51539607552 }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/log.expect b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
new file mode 100644
index 00000000..1832c44f
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/log.expect
@@ -0,0 +1,88 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: using scheduler mode 'dynamic'
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:101' on node 'node1'
+info     20    node1/crm: adding new service 'vm:102' on node 'node1'
+info     20    node1/crm: adding new service 'vm:103' on node 'node2'
+info     20    node1/crm: adding new service 'vm:104' on node 'node2'
+info     20    node1/crm: adding new service 'vm:105' on node 'node3'
+info     20    node1/crm: adding new service 'vm:106' on node 'node3'
+info     20    node1/crm: adding new service 'vm:107' on node 'node3'
+info     20    node1/crm: service 'vm:101': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:102': state changed from 'request_start' to 'started'  (node = node1)
+info     20    node1/crm: service 'vm:103': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:104': state changed from 'request_start' to 'started'  (node = node2)
+info     20    node1/crm: service 'vm:105': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:106': state changed from 'request_start' to 'started'  (node = node3)
+info     20    node1/crm: service 'vm:107': state changed from 'request_start' to 'started'  (node = node3)
+info     21    node1/lrm: got lock 'ha_agent_node1_lock'
+info     21    node1/lrm: status change wait_for_agent_lock => active
+info     21    node1/lrm: starting service vm:101
+info     21    node1/lrm: service status vm:101 started
+info     21    node1/lrm: starting service vm:102
+info     21    node1/lrm: service status vm:102 started
+info     22    node2/crm: status change wait_for_quorum => slave
+info     23    node2/lrm: got lock 'ha_agent_node2_lock'
+info     23    node2/lrm: status change wait_for_agent_lock => active
+info     23    node2/lrm: starting service vm:103
+info     23    node2/lrm: service status vm:103 started
+info     23    node2/lrm: starting service vm:104
+info     23    node2/lrm: service status vm:104 started
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:105
+info     25    node3/lrm: service status vm:105 started
+info     25    node3/lrm: starting service vm:106
+info     25    node3/lrm: service status vm:106 started
+info     25    node3/lrm: starting service vm:107
+info     25    node3/lrm: service status vm:107 started
+info    120      cmdlist: execute service vm:105 set-dynamic-stats cpu 7.8
+info    120      cmdlist: execute service vm:105 set-dynamic-stats mem 7912
+info    120      cmdlist: execute service vm:106 set-dynamic-stats cpu 5.7
+info    120      cmdlist: execute service vm:106 set-dynamic-stats mem 8192
+info    120      cmdlist: execute service vm:107 set-dynamic-stats cpu 6.0
+info    120      cmdlist: execute service vm:107 set-dynamic-stats mem 8011
+info    160    node1/crm: auto rebalance - migrate vm:105 to node2 (expected target imbalance: 0.42)
+info    160    node1/crm: got crm command: migrate vm:105 node2
+info    160    node1/crm: migrate service 'vm:105' to node 'node2'
+info    160    node1/crm: service 'vm:105': state changed from 'started' to 'migrate'  (node = node3, target = node2)
+info    165    node3/lrm: service vm:105 - start migrate to node 'node2'
+info    165    node3/lrm: service vm:105 - end migrate to node 'node2'
+info    180    node1/crm: service 'vm:105': state changed from 'migrate' to 'started'  (node = node2)
+info    183    node2/lrm: starting service vm:105
+info    183    node2/lrm: service status vm:105 started
+info    220      cmdlist: execute service vm:101 set-dynamic-stats mem 1011
+info    220      cmdlist: execute service vm:103 set-dynamic-stats cpu 3.9
+info    220      cmdlist: execute service vm:103 set-dynamic-stats mem 6517
+info    220      cmdlist: execute service vm:104 set-dynamic-stats cpu 6.7
+info    220      cmdlist: execute service vm:104 set-dynamic-stats mem 8001
+info    220      cmdlist: execute service vm:105 set-dynamic-stats cpu 1.8
+info    220      cmdlist: execute service vm:105 set-dynamic-stats mem 1201
+info    220      cmdlist: execute service vm:106 set-dynamic-stats cpu 2.1
+info    220      cmdlist: execute service vm:106 set-dynamic-stats mem 1211
+info    220      cmdlist: execute service vm:107 set-dynamic-stats cpu 0.9
+info    220      cmdlist: execute service vm:107 set-dynamic-stats mem 1191
+info    260    node1/crm: auto rebalance - migrate vm:103 to node1 (expected target imbalance: 0.4)
+info    260    node1/crm: got crm command: migrate vm:103 node1
+info    260    node1/crm: migrate service 'vm:103' to node 'node1'
+info    260    node1/crm: service 'vm:103': state changed from 'started' to 'migrate'  (node = node2, target = node1)
+info    263    node2/lrm: service vm:103 - start migrate to node 'node1'
+info    263    node2/lrm: service vm:103 - end migrate to node 'node1'
+info    280    node1/crm: service 'vm:103': state changed from 'migrate' to 'started'  (node = node1)
+info    281    node1/lrm: starting service vm:103
+info    281    node1/lrm: service status vm:103 started
+info    820     hardware: exit simulation - done
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/manager_status b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/service_config b/src/test/test-crs-dynamic-auto-rebalance3/service_config
new file mode 100644
index 00000000..a44ddd0e
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/service_config
@@ -0,0 +1,9 @@
+{
+    "vm:101": { "node": "node1", "state": "started" },
+    "vm:102": { "node": "node1", "state": "started" },
+    "vm:103": { "node": "node2", "state": "started" },
+    "vm:104": { "node": "node2", "state": "started" },
+    "vm:105": { "node": "node3", "state": "started" },
+    "vm:106": { "node": "node3", "state": "started" },
+    "vm:107": { "node": "node3", "state": "started" }
+}
diff --git a/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
new file mode 100644
index 00000000..7a52ea73
--- /dev/null
+++ b/src/test/test-crs-dynamic-auto-rebalance3/static_service_stats
@@ -0,0 +1,9 @@
+{
+    "vm:101": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:102": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:103": { "maxcpu": 4.0, "maxmem": 8589934592 },
+    "vm:104": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:105": { "maxcpu": 8.0, "maxmem": 8589934592 },
+    "vm:106": { "maxcpu": 6.0, "maxmem": 8589934592 },
+    "vm:107": { "maxcpu": 6.0, "maxmem": 8589934592 }
+}
-- 
2.47.3
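
The expected logs above gate each step on an "expected target imbalance"
value. The series' exact metric is not spelled out in this excerpt; as an
illustration, a coefficient of variation of the relative per-node load is
one plausible shape for it. In the following sketch, the type and function
names and the CPU-only weighting are assumptions, not the series' actual
implementation:

struct NodeLoad {
    cpu: f64,    // currently used CPU (cores)
    maxcpu: f64, // available CPU (cores)
}

/// Coefficient-of-variation style imbalance over relative CPU load.
/// ASSUMPTION: the real metric may also weigh memory in and may
/// normalize differently.
fn cluster_imbalance(nodes: &[NodeLoad]) -> f64 {
    if nodes.is_empty() {
        return 0.0;
    }
    let loads: Vec<f64> = nodes.iter().map(|n| n.cpu / n.maxcpu).collect();
    let mean = loads.iter().sum::<f64>() / loads.len() as f64;
    if mean == 0.0 {
        return 0.0;
    }
    let var =
        loads.iter().map(|l| (l - mean).powi(2)).sum::<f64>() / loads.len() as f64;
    var.sqrt() / mean
}

fn main() {
    // Node loads after the first proposed migration in
    // test-crs-dynamic-auto-rebalance2 (vm:101 moved to node2): three
    // services with cpu 1.0 left on node1, one on node2, none on node3,
    // each node with maxcpu 24.
    let after_first = [
        NodeLoad { cpu: 3.0, maxcpu: 24.0 },
        NodeLoad { cpu: 1.0, maxcpu: 24.0 },
        NodeLoad { cpu: 0.0, maxcpu: 24.0 },
    ];
    println!("{:.2}", cluster_imbalance(&after_first)); // prints 0.94
}

For this test's numbers the sketch also yields ~0.35 for the state after the
second migration, matching the second log line. That consistency makes it a
reasonable mental model, not a confirmed reimplementation.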






* [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (33 preceding siblings ...)
  2026-02-17 14:14 ` [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  2026-02-18 11:10   ` Maximiliano Sandoval
  2026-02-17 14:14 ` [RFC manager 2/2] ui: dc/options: add auto rebalancing options Daniel Kral
  35 siblings, 1 reply; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 www/manager6/dc/OptionView.js | 1 +
 1 file changed, 1 insertion(+)

diff --git a/www/manager6/dc/OptionView.js b/www/manager6/dc/OptionView.js
index e80c6457..46ab95e7 100644
--- a/www/manager6/dc/OptionView.js
+++ b/www/manager6/dc/OptionView.js
@@ -197,6 +197,7 @@ Ext.define('PVE.dc.OptionView', {
                         ['__default__', Proxmox.Utils.defaultText + ' (basic)'],
                         ['basic', 'Basic (Resource Count)'],
                         ['static', 'Static Load'],
+                        ['dynamic', 'Dynamic Load'],
                     ],
                     defaultValue: '__default__',
                 },
-- 
2.47.3






* [RFC manager 2/2] ui: dc/options: add auto rebalancing options
  2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
                   ` (34 preceding siblings ...)
  2026-02-17 14:14 ` [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option Daniel Kral
@ 2026-02-17 14:14 ` Daniel Kral
  35 siblings, 0 replies; 40+ messages in thread
From: Daniel Kral @ 2026-02-17 14:14 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 www/manager6/dc/OptionView.js | 45 +++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/www/manager6/dc/OptionView.js b/www/manager6/dc/OptionView.js
index 46ab95e7..ccc72d47 100644
--- a/www/manager6/dc/OptionView.js
+++ b/www/manager6/dc/OptionView.js
@@ -210,6 +210,51 @@ Ext.define('PVE.dc.OptionView', {
                     ),
                     value: 0,
                 },
+                {
+                    xtype: 'proxmoxcheckbox',
+                    name: 'ha-auto-rebalance',
+                    fieldLabel: gettext('Automatic Rebalance'),
+                    boxLabel: gettext('Automatically rebalance HA resources'),
+                    value: 0,
+                },
+                {
+                    xtype: 'numberfield',
+                    name: 'ha-auto-rebalance-threshold',
+                    fieldLabel: gettext('Automatic Rebalance Threshold'),
+                    emptyText: '0.7',
+                    minValue: 0.0,
+                    step: 0.01,
+                },
+                {
+                    xtype: 'proxmoxKVComboBox',
+                    name: 'ha-auto-rebalance-method',
+                    fieldLabel: gettext('Automatic Rebalance Method'),
+                    deleteEmpty: false,
+                    value: '__default__',
+                    comboItems: [
+                        ['__default__', Proxmox.Utils.defaultText + ' (bruteforce)'],
+                        ['bruteforce', 'Bruteforce'],
+                        ['topsis', 'TOPSIS'],
+                    ],
+                    defaultValue: '__default__',
+                },
+                {
+                    xtype: 'numberfield',
+                    name: 'ha-auto-rebalance-hold-duration',
+                    fieldLabel: gettext('Automatic Rebalance Hold Duration'),
+                    emptyText: '3',
+                    minValue: 0,
+                    step: 1,
+                },
+                {
+                    xtype: 'numberfield',
+                    name: 'ha-auto-rebalance-margin',
+                    fieldLabel: gettext('Automatic Rebalance Margin'),
+                    emptyText: '0.1',
+                    minValue: 0.0,
+                    maxValue: 1.0,
+                    step: 0.01,
+                },
             ],
         });
         me.add_inputpanel_row('u2f', gettext('U2F Settings'), {
-- 
2.47.3
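
The 'topsis' choice exposed above refers to TOPSIS (Technique for Order of
Preference by Similarity to Ideal Solution), a standard multi-criteria
ranking scheme. Below is a compact, generic sketch of TOPSIS selection over
candidate migrations; the criteria, weights, and benefit/cost orientation
the series actually uses are not shown in this excerpt, so the sketch
treats all criteria as benefits and takes the weights as a parameter:

/// Pick the best of `candidates`, each scored on `weights.len()` criteria
/// (higher = better for every criterion in this simplified sketch).
fn topsis_best(candidates: &[Vec<f64>], weights: &[f64]) -> Option<usize> {
    let (n, m) = (candidates.len(), weights.len());
    if n == 0 {
        return None;
    }
    // Vector-normalize each criterion column, then apply the weights.
    let norms: Vec<f64> = (0..m)
        .map(|j| candidates.iter().map(|c| c[j] * c[j]).sum::<f64>().sqrt())
        .collect();
    let weighted: Vec<Vec<f64>> = candidates
        .iter()
        .map(|c| {
            (0..m)
                .map(|j| weights[j] * c[j] / norms[j].max(f64::EPSILON))
                .collect()
        })
        .collect();
    // Ideal best/worst value per criterion.
    let best: Vec<f64> = (0..m)
        .map(|j| weighted.iter().map(|c| c[j]).fold(f64::MIN, f64::max))
        .collect();
    let worst: Vec<f64> = (0..m)
        .map(|j| weighted.iter().map(|c| c[j]).fold(f64::MAX, f64::min))
        .collect();
    let dist = |c: &[f64], p: &[f64]| -> f64 {
        (0..m).map(|j| (c[j] - p[j]).powi(2)).sum::<f64>().sqrt()
    };
    // Rank by relative closeness to the ideal solution; higher is better.
    (0..n).max_by(|&a, &b| {
        let close = |i: usize| {
            let (db, dw) = (dist(&weighted[i], &best), dist(&weighted[i], &worst));
            dw / (db + dw).max(f64::EPSILON)
        };
        close(a).partial_cmp(&close(b)).unwrap()
    })
}

A bruteforce method would instead evaluate one combined scoring function
for every candidate and take the argmax; TOPSIS ranks by geometric
closeness to per-criterion ideals, which avoids hand-tuning a single
combined score.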






* Re: [RFC cluster 1/2] datacenter config: add dynamic load scheduler option
  2026-02-17 14:14 ` [RFC cluster 1/2] datacenter config: add dynamic load scheduler option Daniel Kral
@ 2026-02-18 11:06   ` Maximiliano Sandoval
  0 siblings, 0 replies; 40+ messages in thread
From: Maximiliano Sandoval @ 2026-02-18 11:06 UTC (permalink / raw)
  To: Daniel Kral; +Cc: pve-devel

Daniel Kral <d.kral@proxmox.com> writes:

> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  src/PVE/DataCenterConfig.pm | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/PVE/DataCenterConfig.pm b/src/PVE/DataCenterConfig.pm
> index 514c867..5c91f80 100644
> --- a/src/PVE/DataCenterConfig.pm
> +++ b/src/PVE/DataCenterConfig.pm
> @@ -13,13 +13,14 @@ my $PROXMOX_OUI = 'BC:24:11';
>  my $crs_format = {
>      ha => {
>          type => 'string',
> -        enum => ['basic', 'static'],
> +        enum => ['basic', 'static', 'dynamic'],
>          optional => 1,
>          default => 'basic',
>          description => "Use this resource scheduler mode for HA.",
>          verbose_description => "Configures how the HA manager should select nodes to start or "
>              . "recover services. With 'basic', only the number of services is used, with 'static', "
> -            . "static CPU and memory configuration of services is considered.",
> +            . "static CPU and memory configuration of services is considered, and with 'dynamic', "
> +            . "static and dynamic CPU and memory usage of services is considered.",

I would personally add a full stop and rephrase a bit, e.g.:

...static CPU and memory configuration of services is considered. When
set to 'dynamic', static CPU usage, dynamic CPU usage, and memory usage
of services are considered.

-- 
Maximiliano





* Re: [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option
  2026-02-17 14:14 ` [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option Daniel Kral
@ 2026-02-18 11:10   ` Maximiliano Sandoval
  0 siblings, 0 replies; 40+ messages in thread
From: Maximiliano Sandoval @ 2026-02-18 11:10 UTC (permalink / raw)
  To: Daniel Kral; +Cc: pve-devel

Daniel Kral <d.kral@proxmox.com> writes:

> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  www/manager6/dc/OptionView.js | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/www/manager6/dc/OptionView.js b/www/manager6/dc/OptionView.js
> index e80c6457..46ab95e7 100644
> --- a/www/manager6/dc/OptionView.js
> +++ b/www/manager6/dc/OptionView.js
> @@ -197,6 +197,7 @@ Ext.define('PVE.dc.OptionView', {
>                          ['__default__', Proxmox.Utils.defaultText + ' (basic)'],
>                          ['basic', 'Basic (Resource Count)'],
>                          ['static', 'Static Load'],
> +                        ['dynamic', 'Dynamic Load'],
>                      ],
>                      defaultValue: '__default__',
>                  },

This is pre-existing, but these strings should be translatable.

-- 
Maximiliano





* Re: [RFC cluster 2/2] datacenter config: add auto rebalancing options
  2026-02-17 14:14 ` [RFC cluster 2/2] datacenter config: add auto rebalancing options Daniel Kral
@ 2026-02-18 11:15   ` Maximiliano Sandoval
  0 siblings, 0 replies; 40+ messages in thread
From: Maximiliano Sandoval @ 2026-02-18 11:15 UTC (permalink / raw)
  To: Daniel Kral; +Cc: pve-devel

Daniel Kral <d.kral@proxmox.com> writes:

> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
>  src/PVE/DataCenterConfig.pm | 38 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>
> diff --git a/src/PVE/DataCenterConfig.pm b/src/PVE/DataCenterConfig.pm
> index 5c91f80..86bd06a 100644
> --- a/src/PVE/DataCenterConfig.pm
> +++ b/src/PVE/DataCenterConfig.pm
> @@ -30,6 +30,44 @@ my $crs_format = {
>              "Set to use CRS for selecting a suited node when a HA services request-state"
>              . " changes from stop to start.",
>      },
> +    'ha-auto-rebalance' => {
> +        type => 'boolean',
> +        optional => 1,
> +        default => 0,
> +        description => "Set to use CRS for balancing HA resources automatically depending on"
> +            . " the current node imbalance.",

For boolean parameters I would personally prefer something like "Whether
to use..." or "If true, uses...", and similarly for the rest.

> +    'ha-auto-rebalance-threshold' => {
> +        type => 'number',
> +        optional => 1,
> +        default => 0.7,
> +        requires => 'ha-auto-rebalance',
> +        description => "The threshold for the node load, which will trigger the automatic"
> +            . " HA resource balancing if the threshold is exceeded.",
> +    },
> +    'ha-auto-rebalance-method' => {
> +        type => 'string',
> +        enum => ['bruteforce', 'topsis'],
> +        optional => 1,
> +        default => 'bruteforce',
> +        requires => 'ha-auto-rebalance',
> +    },
> +    'ha-auto-rebalance-hold-duration' => {
> +        type => 'number',
> +        optional => 1,
> +        default => 3,
> +        requires => 'ha-auto-rebalance',
> +        description => "The duration the threshold must be exceeded for to trigger an automatic"
> +            . " HA resource balancing migration in HA rounds.",
> +    },
> +    'ha-auto-rebalance-margin' => {
> +        type => 'number',
> +        optional => 1,
> +        default => 0.1,
> +        requires => 'ha-auto-rebalance',
> +        description => "The minimum relative improvement in cluster node imbalance to commit to"
> +            . " a HA resource rebalancing migration.",
> +    },
>  };
>  
>  my $migration_format = {

-- 
Maximiliano
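
Taken together, the three numeric options quoted above form a simple gate:
the imbalance must stay above 'ha-auto-rebalance-threshold' for
'ha-auto-rebalance-hold-duration' consecutive HA rounds before a migration
is considered at all, and the best candidate is only committed if it
improves the imbalance by at least 'ha-auto-rebalance-margin' (relative).
A minimal sketch of that gating follows, with hypothetical names and the
defaults taken from the quoted schema; the series' actual control flow may
differ:

struct RebalanceCfg {
    threshold: f64,   // ha-auto-rebalance-threshold, default 0.7
    hold_rounds: u32, // ha-auto-rebalance-hold-duration, default 3
    margin: f64,      // ha-auto-rebalance-margin, default 0.1
}

/// One HA round of gating; `rounds_above` is carried across rounds by
/// the caller. Returns true when the best candidate may be committed.
fn may_commit(
    cfg: &RebalanceCfg,
    rounds_above: &mut u32,
    imbalance: f64,
    best_candidate_imbalance: f64,
) -> bool {
    if imbalance > cfg.threshold {
        *rounds_above += 1;
    } else {
        *rounds_above = 0; // imbalance recovered, reset the hold counter
        return false;
    }
    if *rounds_above < cfg.hold_rounds {
        return false;
    }
    // Require a minimum relative improvement before committing.
    (imbalance - best_candidate_imbalance) / imbalance >= cfg.margin
}

With the defaults, a single noisy sample never triggers a migration: the
imbalance has to stay above 0.7 for three consecutive rounds, and the
chosen migration must cut it by at least 10% relative to the current value.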





Thread overview: 40+ messages
2026-02-17 14:13 [RFC PATCH-SERIES many 00/36] dynamic scheduler + load rebalancer Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 1/5] resource-scheduling: move score_nodes_to_start_service to scheduler crate Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 2/5] resource-scheduling: introduce generic cluster usage implementation Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 3/5] resource-scheduling: add dynamic node and service stats Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 4/5] resource-scheduling: implement rebalancing migration selection Daniel Kral
2026-02-17 14:13 ` [RFC proxmox 5/5] resource-scheduling: implement Add and Default for {Dynamic,Static}ServiceStats Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 1/6] pve-rs: resource scheduling: use generic cluster usage implementation Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 2/6] pve-rs: resource scheduling: create service_nodes hashset from array Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 3/6] pve-rs: resource scheduling: store service stats independently of node Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 4/6] pve-rs: resource scheduling: expose auto rebalancing methods Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 5/6] pve-rs: resource scheduling: move pve_static into resource_scheduling module Daniel Kral
2026-02-17 14:14 ` [RFC perl-rs 6/6] pve-rs: resource scheduling: implement pve_dynamic bindings Daniel Kral
2026-02-17 14:14 ` [RFC cluster 1/2] datacenter config: add dynamic load scheduler option Daniel Kral
2026-02-18 11:06   ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC cluster 2/2] datacenter config: add auto rebalancing options Daniel Kral
2026-02-18 11:15   ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC ha-manager 01/21] rename static node stats to be consistent with similar interfaces Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 02/21] resources: remove redundant load_config fallback for static config Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 03/21] remove redundant service_node and migration_target parameter Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 04/21] factor out common pve to ha resource type mapping Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 05/21] derive static service stats while filling the service stats repository Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 06/21] test: make static service usage explicit for all resources Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 07/21] make static service stats indexable by sid Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 08/21] move static service stats repository to PVE::HA::Usage::Static Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 09/21] usage: augment service stats with node and state information Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 10/21] include running non-HA resources in the scheduler's accounting Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 11/21] env, resources: add dynamic node and service stats abstraction Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 12/21] env: pve2: implement dynamic node and service stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 13/21] sim: hardware: pass correct types for static stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 14/21] sim: hardware: factor out static stats' default values Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 15/21] sim: hardware: rewrite set-static-stats Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 16/21] sim: hardware: add set-dynamic-stats for services Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 17/21] usage: add dynamic usage scheduler Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 18/21] manager: rename execute_migration to queue_resource_motion Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 19/21] manager: update_crs_scheduler_mode: factor out crs config Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 20/21] implement automatic rebalancing Daniel Kral
2026-02-17 14:14 ` [RFC ha-manager 21/21] test: add basic automatic rebalancing system test cases Daniel Kral
2026-02-17 14:14 ` [RFC manager 1/2] ui: dc/options: add dynamic load scheduler option Daniel Kral
2026-02-18 11:10   ` Maximiliano Sandoval
2026-02-17 14:14 ` [RFC manager 2/2] ui: dc/options: add auto rebalancing options Daniel Kral
