all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: "Daniel Kral" <d.kral@proxmox.com>
To: "Dominik Rusovac" <d.rusovac@proxmox.com>, <pve-devel@lists.proxmox.com>
Subject: Re: [PATCH proxmox v2 09/40] resource-scheduling: implement rebalancing migration selection
Date: Thu, 26 Mar 2026 15:11:28 +0100	[thread overview]
Message-ID: <DHCRWM3F526R.13E4LJL187H4K@proxmox.com> (raw)
In-Reply-To: <DHCNATRHTX0J.1X3B76NP84GES@proxmox.com>

On Thu Mar 26, 2026 at 11:34 AM CET, Dominik Rusovac wrote:
> lgtm 
>
> pls find my comments inline, mostly relating to nits or tiny things
>
> On Tue Mar 24, 2026 at 7:29 PM CET, Daniel Kral wrote:
>> Assuming that a resource will hold the same dynamic resource usage on a
>> new node as on the previous node, score possible migrations, where:
>>
>> - the cluster node imbalance is minimal (bruteforce), or
>> - the shifted root mean square and maximum resource usages of the cpu
>>   and memory is minimal across the cluster nodes (TOPSIS).
>>
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>> ---
>> changes v1 -> v2:
>> - add saturating_sub() in remove_running_resource(...) (as suggested by
>>   @Thomas)
>> - slightly move declarations and impls around so that reading from
>>   top-to-bottom is a little easier
>> - pass NodeUsage vec instead of NodeStats vec to
>>   calculate_node_imbalance(...)
>> - pass a closure to calculate_node_imbalance(...) (as suggested by
>>   @Dominik)
>> - also use `migration` for `Ord` impl of `ScoredMigration`, s.t. the
>>   struct is now ordered first by the imbalance and then the strings in
>>   the `Migration` struct
>> - fix floating-point issue for the imbalance ordering for
>>   ScoredMigration
>> - correctly implement `Ord` (essentially removing the reverse() and
>>   moving these Reverse() wrappers to the usages for the BinaryHeap)
>> - use the `Migration` struct in `MigrationCandidate` as well
>> - drop Scheduler::node_stats() as it's unused now
>> - use Vec::with_capacity(...) where possible
>> - eagerly implement common traits (especially Clone and Debug)
>> - add test cases for the ScoredMigration ordering, node imbalance
>>   calculation and the two rebalancing migration scoring methods
>> - s/score_best_balancing_migrations
>>    /score_best_balancing_migration_candidates
>>   to possibly allow the Scheduler/Usage impls handling the migration
>>   candidate generation in the future instead of the callers
>
> [snip]
>
>> +/// Returns the load imbalance among the nodes.
>> +///
>> +/// The load balance is measured as the statistical dispersion of the individual node loads.
>> +///
>> +/// The current implementation uses the dimensionless coefficient of variation, which expresses the
>> +/// standard deviation in relation to the average mean of the node loads.
>> +///
>> +/// The coefficient of variation is not robust, which is a desired property here, because outliers
>> +/// should be detected as much as possible.
>> +fn calculate_node_imbalance(nodes: &[NodeUsage], to_load: impl Fn(&NodeUsage) -> f64) -> f64 {
>
> very nice docs! 
>
> [snip]
>
>> +/// A possible migration.
>> +#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug, Serialize, Deserialize)]
>> +#[serde(rename_all = "kebab-case")]
>> +pub struct Migration {
>> +    /// The identifier of a leading resource.
>> +    pub sid: String,
>> +    /// The current node of the leading resource.
>> +    pub source_node: String,
>> +    /// The possible migration target node for the resource.
>> +    pub target_node: String,
>
> nit: on the long run, instead of having `ScoredMigration`, 
> it could be more convenient to have a field:
>
>     pub imbalance: Option<f64>,
>

Might make sense, but then we can't reuse the same structure in
`MigrationCandidate` anymore, so I would let ScoredMigration be it's own
type, what do you think?

>> +}
>
> [snip]
>
>> +impl ScoredMigration {
>> +    pub fn new<T: Into<Migration>>(migration: T, imbalance: f64) -> Self {
>> +        // Depending how the imbalance is calculated, it can contain minor approximation errors. As
>
>            // Depending [on] how [...]

Thanks, will change that!

>
>> +        // this struct implements the Ord trait, users of the struct's cmp() can run into cases,
>> +        // where the imbalance is the same up to the significant digits in base 10, but treated as
>> +        // different values.
>> +        //
>> +        // Therefore, truncate any non-significant digits to prevent these cases.
>> +        let factor = 10_f64.powf(f64::DIGITS as f64);
>> +        let truncated_imbalance = f64::trunc(factor * imbalance) / factor;
>
> Nice solution, this appears to be a clean approach to achieve deterministic `Ord`
> for `f64`. 
>
> One small thing, tho: `f64::DIGITS` is technically not a floating number, but `15_u32`.
>
>            let factor = 10_f64.powi(f64::DIGITS as i32);
>
> thus, seems to be the better choice here. `powi` is also generally faster than `powf` [0]. 
>
> [0] https://doc.rust-lang.org/std/primitive.f64.html#method.powi:~:text=Using%20this%20function%20is%20generally%20faster%20than%20using%20powf

Thanks, good catch! I also weren't aware of them being non-deterministic
here, that's good to know.

We briefly also talked about this off-list, I'll adapt the test cases
below to not eq() the ScoredMigration directly as the truncation here
might be non-deterministic (and I assumed otherwise).

More-so, it's more important to verify the order in which the migrations
are scored than their exact imbalance score as you suggested.

>
> [snip]
>
>> +/// A possible migration candidate with the migrated usage stats.
>> +#[derive(Clone, Debug)]
>> +pub struct MigrationCandidate {
>> +    /// The possible migration.
>> +    pub migration: Migration,
>> +    /// The to-be-migrated resource usage stats.
>
> imo, easier to comprehend: 
>
>     /// Usage stats of the resource to be migrated

Nice, will change it to that!

>
>> +    pub stats: ResourceStats,
>> +}
>
> [snip]





  reply	other threads:[~2026-03-26 14:11 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 18:29 [PATCH cluster/ha-manager/perl-rs/proxmox v2 00/40] dynamic scheduler + load rebalancer Daniel Kral
2026-03-24 18:29 ` [PATCH proxmox v2 01/40] resource-scheduling: inline add_cpu_usage in score_nodes_to_start_service Daniel Kral
2026-03-26 10:10   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH proxmox v2 02/40] resource-scheduling: move score_nodes_to_start_service to scheduler crate Daniel Kral
2026-03-26 10:11   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH proxmox v2 03/40] resource-scheduling: rename service to resource where appropriate Daniel Kral
2026-03-26 10:12   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH proxmox v2 04/40] resource-scheduling: introduce generic scheduler implementation Daniel Kral
2026-03-26 10:19   ` Dominik Rusovac
2026-03-26 14:16     ` Daniel Kral
2026-03-24 18:29 ` [PATCH proxmox v2 05/40] resource-scheduling: implement generic cluster usage implementation Daniel Kral
2026-03-26 10:28   ` Dominik Rusovac
2026-03-26 14:15     ` Daniel Kral
2026-03-24 18:29 ` [PATCH proxmox v2 06/40] resource-scheduling: topsis: handle empty criteria without panics Daniel Kral
2026-03-26 10:29   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH proxmox v2 07/40] resource-scheduling: compare by nodename in score_nodes_to_start_resource Daniel Kral
2026-03-26 10:29   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH proxmox v2 08/40] resource-scheduling: factor out topsis alternative mapping Daniel Kral
2026-03-26 10:30   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH proxmox v2 09/40] resource-scheduling: implement rebalancing migration selection Daniel Kral
2026-03-26 10:34   ` Dominik Rusovac
2026-03-26 14:11     ` Daniel Kral [this message]
2026-03-27  9:34       ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH perl-rs v2 10/40] pve-rs: resource-scheduling: remove pedantic error handling from remove_node Daniel Kral
2026-03-27  9:38   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH perl-rs v2 11/40] pve-rs: resource-scheduling: remove pedantic error handling from remove_service_usage Daniel Kral
2026-03-27  9:39   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH perl-rs v2 12/40] pve-rs: resource-scheduling: move pve_static into resource_scheduling module Daniel Kral
2026-03-27  9:41   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH perl-rs v2 13/40] pve-rs: resource-scheduling: use generic usage implementation Daniel Kral
2026-03-27 14:13   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH perl-rs v2 14/40] pve-rs: resource-scheduling: static: replace deprecated usage structs Daniel Kral
2026-03-27 14:18   ` Dominik Rusovac
2026-03-24 18:29 ` [PATCH perl-rs v2 15/40] pve-rs: resource-scheduling: implement pve_dynamic bindings Daniel Kral
2026-03-27 14:15   ` Dominik Rusovac
2026-03-24 18:30 ` [PATCH perl-rs v2 16/40] pve-rs: resource-scheduling: expose auto rebalancing methods Daniel Kral
2026-03-27 14:16   ` Dominik Rusovac
2026-03-24 18:30 ` [PATCH cluster v2 17/40] datacenter config: restructure verbose description for the ha crs option Daniel Kral
2026-03-24 18:30 ` [PATCH cluster v2 18/40] datacenter config: add dynamic load scheduler option Daniel Kral
2026-03-24 18:30 ` [PATCH cluster v2 19/40] datacenter config: add auto rebalancing options Daniel Kral
2026-03-26 16:08   ` Jillian Morgan
2026-03-26 16:20     ` Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 20/40] env: pve2: implement dynamic node and service stats Daniel Kral
2026-03-25 21:43   ` Thomas Lamprecht
2026-03-24 18:30 ` [PATCH ha-manager v2 21/40] sim: hardware: pass correct types for static stats Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 22/40] sim: hardware: factor out static stats' default values Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 23/40] sim: hardware: fix static stats guard Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 24/40] sim: hardware: handle dynamic service stats Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 25/40] sim: hardware: add set-dynamic-stats command Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 26/40] sim: hardware: add getters for dynamic {node,service} stats Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 27/40] usage: pass service data to add_service_usage Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 28/40] usage: pass service data to get_used_service_nodes Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 29/40] add running flag to cluster service stats Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 30/40] usage: use add_service to add service usage to nodes Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 31/40] usage: add dynamic usage scheduler Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 32/40] test: add dynamic usage scheduler test cases Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 33/40] manager: rename execute_migration to queue_resource_motion Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 34/40] manager: update_crs_scheduler_mode: factor out crs config Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 35/40] implement automatic rebalancing Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 36/40] test: add resource bundle generation test cases Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 37/40] test: add dynamic automatic rebalancing system " Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 38/40] test: add static " Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 39/40] test: add automatic rebalancing system test cases with TOPSIS method Daniel Kral
2026-03-24 18:30 ` [PATCH ha-manager v2 40/40] test: add automatic rebalancing system test cases with affinity rules Daniel Kral

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DHCRWM3F526R.13E4LJL187H4K@proxmox.com \
    --to=d.kral@proxmox.com \
    --cc=d.rusovac@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal