From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [RFC proxmox 2/5] resource-scheduling: introduce generic cluster usage implementation
Date: Tue, 17 Feb 2026 15:13:56 +0100
Message-ID: <20260217141437.584852-3-d.kral@proxmox.com>
In-Reply-To: <20260217141437.584852-1-d.kral@proxmox.com>

Declare generic NodeStats and ServiceStats structs, into which
use-case-specific types are converted, and use them to implement
generic scheduler methods, starting with the existing scoring of nodes
for starting a previously non-running service.

This is best viewed with the git option --ignore-all-space.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
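Note (not part of the commit): a minimal standalone sketch of the
conversion pattern described in the commit message, for readers who
want to try it outside the diff. NodeStats, ServiceStats and
add_started_service are copied from the patch below; MyServiceConfig
and its fields are hypothetical stand-ins for a use-case-specific type
and do not exist in the crate.

```rust
/// Generic service stats (copied from the patch).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ServiceStats {
    pub cpu: f64,
    pub maxcpu: f64,
    pub mem: usize,
    pub maxmem: usize,
}

/// Generic node stats (copied from the patch).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct NodeStats {
    pub cpu: f64,
    pub maxcpu: usize,
    pub mem: usize,
    pub maxmem: usize,
}

impl NodeStats {
    /// Accounts for a service as if it had been started on this node.
    /// A service `maxcpu` of `0.0` means "no limit" and counts as all node cores.
    pub fn add_started_service(&mut self, service_stats: &ServiceStats) {
        let service_cpu = if service_stats.maxcpu == 0.0 {
            self.maxcpu as f64
        } else {
            service_stats.maxcpu
        };
        self.cpu += service_cpu;
        self.mem += service_stats.maxmem;
    }

    /// CPU usage relative to the node's core count (1.0 means fully used).
    pub fn cpu_load(&self) -> f64 {
        self.cpu / self.maxcpu as f64
    }

    /// Memory usage relative to the node's total memory (1.0 means fully used).
    pub fn mem_load(&self) -> f64 {
        self.mem as f64 / self.maxmem as f64
    }
}

/// Hypothetical use-case-specific type: only configured limits are known.
#[derive(Clone, Copy)]
pub struct MyServiceConfig {
    pub cores: f64,
    pub memory_bytes: usize,
}

impl From<MyServiceConfig> for ServiceStats {
    fn from(value: MyServiceConfig) -> Self {
        // With only static limits available, assume the service uses its
        // maximum, mirroring the StaticServiceUsage conversion in the patch.
        Self {
            cpu: value.cores,
            maxcpu: value.cores,
            mem: value.memory_bytes,
            maxmem: value.memory_bytes,
        }
    }
}
```

With these definitions, a caller converts its own type with .into()
and passes the result to add_started_service on a copy of the node
stats, which is what the scoring loop does for each candidate node.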
proxmox-resource-scheduling/src/pve_static.rs | 45 ++++-
proxmox-resource-scheduling/src/scheduler.rs | 185 ++++++++++++------
2 files changed, 166 insertions(+), 64 deletions(-)
diff --git a/proxmox-resource-scheduling/src/pve_static.rs b/proxmox-resource-scheduling/src/pve_static.rs
index 184e615d..b269c44f 100644
--- a/proxmox-resource-scheduling/src/pve_static.rs
+++ b/proxmox-resource-scheduling/src/pve_static.rs
@@ -1,9 +1,9 @@
use anyhow::Error;
use serde::{Deserialize, Serialize};
-use crate::scheduler;
+use crate::scheduler::{ClusterUsage, NodeStats, NodeUsage, ServiceStats};
-#[derive(Serialize, Deserialize)]
+#[derive(Clone, Serialize, Deserialize)]
#[serde(rename_all = "kebab-case")]
/// Static usage information of a node.
pub struct StaticNodeUsage {
@@ -33,9 +33,25 @@ impl AsRef<StaticNodeUsage> for StaticNodeUsage {
}
}
+impl From<StaticNodeUsage> for NodeUsage {
+ fn from(value: StaticNodeUsage) -> Self {
+ let stats = NodeStats {
+ cpu: value.cpu,
+ maxcpu: value.maxcpu,
+ mem: value.mem,
+ maxmem: value.maxmem,
+ };
+
+ Self {
+ name: value.name,
+ stats,
+ }
+ }
+}
+
/// Calculate new CPU usage in percent.
/// `add` being `0.0` means "unlimited" and results in `max` being added.
-pub fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
+fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
if add == 0.0 {
old + max
} else {
@@ -43,7 +59,7 @@ pub fn add_cpu_usage(old: f64, max: f64, add: f64) -> f64 {
}
}
-#[derive(Serialize, Deserialize)]
+#[derive(Clone, Copy, Serialize, Deserialize)]
#[serde(rename_all = "kebab-case")]
/// Static usage information of an HA resource.
pub struct StaticServiceUsage {
@@ -53,14 +69,33 @@ pub struct StaticServiceUsage {
pub maxmem: usize,
}
+impl From<StaticServiceUsage> for ServiceStats {
+ fn from(value: StaticServiceUsage) -> Self {
+ Self {
+ cpu: value.maxcpu,
+ maxcpu: value.maxcpu,
+ mem: value.maxmem,
+ maxmem: value.maxmem,
+ }
+ }
+}
+
/// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
/// and CPU usages of the nodes as if the service would already be running on each.
///
/// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
/// is better.
+#[deprecated]
pub fn score_nodes_to_start_service<T: AsRef<StaticNodeUsage>>(
nodes: &[T],
service: &StaticServiceUsage,
) -> Result<Vec<(String, f64)>, Error> {
- scheduler::score_nodes_to_start_service(nodes, service)
+ let nodes = nodes
+ .iter()
+ .map(|node| node.as_ref().clone().into())
+ .collect::<Vec<NodeUsage>>();
+
+ let cluster_usage = ClusterUsage::from_nodes(nodes);
+
+ cluster_usage.score_nodes_to_start_service(*service)
}
diff --git a/proxmox-resource-scheduling/src/scheduler.rs b/proxmox-resource-scheduling/src/scheduler.rs
index 29353d84..58215f03 100644
--- a/proxmox-resource-scheduling/src/scheduler.rs
+++ b/proxmox-resource-scheduling/src/scheduler.rs
@@ -1,9 +1,66 @@
use anyhow::Error;
-use crate::{
- pve_static::{add_cpu_usage, StaticNodeUsage, StaticServiceUsage},
- topsis,
-};
+use crate::topsis;
+
+/// Generic service stats.
+#[derive(Clone, Copy)]
+pub struct ServiceStats {
+ /// CPU utilization in CPU cores.
+ pub cpu: f64,
+ /// Number of assigned CPUs or CPU limit.
+ pub maxcpu: f64,
+ /// Used memory in bytes.
+ pub mem: usize,
+ /// Maximum assigned memory in bytes.
+ pub maxmem: usize,
+}
+
+/// Generic node stats.
+#[derive(Clone, Copy)]
+pub struct NodeStats {
+ /// CPU utilization in CPU cores.
+ pub cpu: f64,
+ /// Total number of CPU cores.
+ pub maxcpu: usize,
+ /// Used memory in bytes.
+ pub mem: usize,
+ /// Total memory in bytes.
+ pub maxmem: usize,
+}
+
+impl NodeStats {
+ /// Adds the service stats to the node stats as if the service has started on the node.
+ pub fn add_started_service(&mut self, service_stats: &ServiceStats) {
+ // a service maxcpu of `0.0` means no CPU limit, so count all of the node's cores
+ let service_cpu = if service_stats.maxcpu == 0.0 {
+ self.maxcpu as f64
+ } else {
+ service_stats.maxcpu
+ };
+
+ self.cpu += service_cpu;
+ self.mem += service_stats.maxmem;
+ }
+
+ /// Returns the current CPU usage as a fraction of the node's cores (1.0 means fully used).
+ pub fn cpu_load(&self) -> f64 {
+ self.cpu / self.maxcpu as f64
+ }
+
+ /// Returns the current memory usage as a fraction of the node's total memory (1.0 means fully used).
+ pub fn mem_load(&self) -> f64 {
+ self.mem as f64 / self.maxmem as f64
+ }
+}
+
+pub struct NodeUsage {
+ pub name: String,
+ pub stats: NodeStats,
+}
+
+pub struct ClusterUsage {
+ nodes: Vec<NodeUsage>,
+}
criteria_struct! {
/// A given alternative.
@@ -22,65 +79,75 @@ criteria_struct! {
static PVE_HA_TOPSIS_CRITERIA;
}
-/// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
-/// and CPU usages of the nodes as if the service would already be running on each.
-///
-/// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
-/// is better.
-pub fn score_nodes_to_start_service<T: AsRef<StaticNodeUsage>>(
- nodes: &[T],
- service: &StaticServiceUsage,
-) -> Result<Vec<(String, f64)>, Error> {
- let len = nodes.len();
+impl ClusterUsage {
+ /// Instantiate cluster usage from node usages.
+ pub fn from_nodes<I>(nodes: I) -> Self
+ where
+ I: IntoIterator<Item: Into<NodeUsage>>,
+ {
+ Self {
+ nodes: nodes.into_iter().map(|node| node.into()).collect(),
+ }
+ }
- let matrix = nodes
- .iter()
- .enumerate()
- .map(|(target_index, _)| {
- // Base values on percentages to allow comparing nodes with different stats.
- let mut highest_cpu = 0.0;
- let mut squares_cpu = 0.0;
- let mut highest_mem = 0.0;
- let mut squares_mem = 0.0;
+ /// Scores candidate `nodes` to start a `service` on. Scoring is done according to the static memory
+ /// and CPU usages of the nodes as if the service would already be running on each.
+ ///
+ /// Returns a vector of (nodename, score) pairs. Scores are between 0.0 and 1.0 and a higher score
+ /// is better.
+ pub fn score_nodes_to_start_service<T: Into<ServiceStats>>(
+ &self,
+ service_stats: T,
+ ) -> Result<Vec<(String, f64)>, Error> {
+ let len = self.nodes.len();
+ let service_stats = service_stats.into();
- for (index, node) in nodes.iter().enumerate() {
- let node = node.as_ref();
- let new_cpu = if index == target_index {
- add_cpu_usage(node.cpu, node.maxcpu as f64, service.maxcpu)
- } else {
- node.cpu
- } / (node.maxcpu as f64);
- highest_cpu = f64::max(highest_cpu, new_cpu);
- squares_cpu += new_cpu.powi(2);
+ let matrix = self
+ .nodes
+ .iter()
+ .enumerate()
+ .map(|(target_index, _)| {
+ // Base values on percentages to allow comparing nodes with different stats.
+ let mut highest_cpu = 0.0;
+ let mut squares_cpu = 0.0;
+ let mut highest_mem = 0.0;
+ let mut squares_mem = 0.0;
- let new_mem = if index == target_index {
- node.mem + service.maxmem
- } else {
- node.mem
- } as f64
- / node.maxmem as f64;
- highest_mem = f64::max(highest_mem, new_mem);
- squares_mem += new_mem.powi(2);
- }
+ for (index, node) in self.nodes.iter().enumerate() {
+ let mut new_stats = node.stats;
- // Add 1.0 to avoid boosting tiny differences: e.g. 0.004 is twice as much as 0.002, but
- // 1.004 is only slightly more than 1.002.
- PveTopsisAlternative {
- average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
- highest_cpu: 1.0 + highest_cpu,
- average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
- highest_memory: 1.0 + highest_mem,
- }
- .into()
- })
- .collect::<Vec<_>>();
+ if index == target_index {
+ new_stats.add_started_service(&service_stats)
+ };
- let scores =
- topsis::score_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
+ let new_cpu = new_stats.cpu_load();
+ highest_cpu = f64::max(highest_cpu, new_cpu);
+ squares_cpu += new_cpu.powi(2);
- Ok(scores
- .into_iter()
- .enumerate()
- .map(|(n, score)| (nodes[n].as_ref().name.clone(), score))
- .collect())
+ let new_mem = new_stats.mem_load();
+ highest_mem = f64::max(highest_mem, new_mem);
+ squares_mem += new_mem.powi(2);
+ }
+
+ // Add 1.0 to avoid boosting tiny differences: e.g. 0.004 is twice as much as 0.002, but
+ // 1.004 is only slightly more than 1.002.
+ PveTopsisAlternative {
+ average_cpu: 1.0 + (squares_cpu / len as f64).sqrt(),
+ highest_cpu: 1.0 + highest_cpu,
+ average_memory: 1.0 + (squares_mem / len as f64).sqrt(),
+ highest_memory: 1.0 + highest_mem,
+ }
+ .into()
+ })
+ .collect::<Vec<_>>();
+
+ let scores =
+ topsis::score_alternatives(&topsis::Matrix::new(matrix)?, &PVE_HA_TOPSIS_CRITERIA)?;
+
+ Ok(scores
+ .into_iter()
+ .enumerate()
+ .map(|(n, score)| (self.nodes[n].name.to_string(), score))
+ .collect())
+ }
}
--
2.47.3
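
For readers following the scoring loop in the patch above, here is a
standalone sketch of how the four criteria for one candidate node are
derived from the per-node load fractions: the root mean square and the
maximum over all nodes, each offset by 1.0. The Criteria struct and the
criteria_for_loads function are hypothetical names used only for
illustration; the TOPSIS ranking step itself is left out.

```rust
/// Hypothetical stand-in for the four values fed into `PveTopsisAlternative`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Criteria {
    pub average_cpu: f64,
    pub highest_cpu: f64,
    pub average_memory: f64,
    pub highest_memory: f64,
}

/// Computes the criteria from `(cpu_load, mem_load)` pairs, one pair per node,
/// taken after the service was hypothetically added to the candidate node.
///
/// Assumes `loads` is non-empty; an empty slice would yield NaN averages.
pub fn criteria_for_loads(loads: &[(f64, f64)]) -> Criteria {
    let len = loads.len() as f64;

    let mut highest_cpu = 0.0_f64;
    let mut squares_cpu = 0.0_f64;
    let mut highest_mem = 0.0_f64;
    let mut squares_mem = 0.0_f64;

    for &(cpu, mem) in loads {
        highest_cpu = highest_cpu.max(cpu);
        squares_cpu += cpu.powi(2);
        highest_mem = highest_mem.max(mem);
        squares_mem += mem.powi(2);
    }

    // Adding 1.0 avoids boosting tiny differences: 0.004 is twice 0.002,
    // but 1.004 is only slightly more than 1.002.
    Criteria {
        average_cpu: 1.0 + (squares_cpu / len).sqrt(),
        highest_cpu: 1.0 + highest_cpu,
        average_memory: 1.0 + (squares_mem / len).sqrt(),
        highest_memory: 1.0 + highest_mem,
    }
}
```

Calling criteria_for_loads once per candidate, each time with the
service added to a different node, produces one row of the decision
matrix per candidate, in the same order as the nodes.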