From: "Daniel Kral" <d.kral@proxmox.com>
To: "Fiona Ebner" <f.ebner@proxmox.com>,
"Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH perl-rs 1/1] pve-rs: resource_scheduling: allow granular usage changes
Date: Thu, 16 Oct 2025 17:34:52 +0200
Message-ID: <DDJUUR3FA0IC.1J5AYPWLRMS3F@proxmox.com>
In-Reply-To: <88318e2a-e64f-4537-9a6e-09003bd4af53@proxmox.com>
On Thu Oct 16, 2025 at 12:32 PM CEST, Fiona Ebner wrote:
> Am 30.09.25 um 4:21 PM schrieb Daniel Kral:
>> Implements a simple bidirectional map to track which service usages have
>> been added to nodes, so that these can be removed later individually.
>>
>> The StaticServiceUsage is added to the HashMap<> in StaticNodeInfo to
>> reduce unnecessary indirection when summing these values in
>> score_nodes_to_start_service(...).
>>
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>
> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
>
>> ---
>> I started out adding and removing the service_usage in StaticNodeUsage
>> on each add_service_usage_to_node(...) and remove_service_usage(...)
>> call individually, but I think summing those up every time is better for
>> numerical stability..
>
> This can go into the commit message too if worded a bit less informally ;)
Will do! Although I've been asking myself whether the concern about
numerical stability really holds here if we don't go for ha-manager patch
#9: without it, these values only change over a single manage(...) run,
and I'd guess that most users end up with floating-point values that can
be represented exactly (0.0, 1.0, 2.0, 3.0, ... and 0.5, 0.25, 0.125, ...).
For those, numerical stability shouldn't be a problem at all, or am I
missing something?
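To make that concrete (a minimal sketch with made-up numbers, nothing
from the patch): sums of exactly representable values accumulate no
rounding error at all, while incremental add/remove of non-representable
ones can leave a residue:
```
fn main() {
    // Exactly representable values (multiples of powers of two) sum
    // without any rounding error, no matter how often they are added:
    let exact: f64 = std::iter::repeat(0.25).take(1000).sum();
    assert_eq!(exact, 250.0);

    // 0.1 and friends are not exactly representable in binary, so an
    // incremental add/remove need not return exactly to the start:
    let residue = 0.1_f64 + 0.2 - 0.3;
    assert!(residue != 0.0);
    println!("residue: {residue:e}"); // prints 5.551115123125783e-17
}
```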
>
>>
>> .../bindings/resource_scheduling_static.rs | 98 ++++++++++++++++---
>> pve-rs/test/resource_scheduling.pl | 84 ++++++++++++++--
>> 2 files changed, 160 insertions(+), 22 deletions(-)
>>
>> diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling_static.rs
>> index 7cf2f35..535dfe3 100644
>> --- a/pve-rs/src/bindings/resource_scheduling_static.rs
>> +++ b/pve-rs/src/bindings/resource_scheduling_static.rs
>> @@ -16,8 +16,16 @@ pub mod pve_rs_resource_scheduling_static {
>>
>> perlmod::declare_magic!(Box<Scheduler> : &Scheduler as "PVE::RS::ResourceScheduling::Static");
>>
>> + struct StaticNodeInfo {
>> + name: String,
>> + maxcpu: usize,
>> + maxmem: usize,
>> + services: HashMap<String, StaticServiceUsage>,
>> + }
>> +
>> struct Usage {
>> - nodes: HashMap<String, StaticNodeUsage>,
>> + nodes: HashMap<String, StaticNodeInfo>,
>> + service_nodes: HashMap<String, Vec<String>>,
>
> Rather than Vec it could be a HashSet, but I'm fine with either way.
>
>> }
>>
>> /// A scheduler instance contains the resource usage by node.
>
> ---snip---
>
>> @@ -67,10 +75,25 @@ pub mod pve_rs_resource_scheduling_static {
>>
>> /// Method: Remove a node from the scheduler.
>> #[export]
>> - pub fn remove_node(#[try_from_ref] this: &Scheduler, nodename: &str) {
>> + pub fn remove_node(#[try_from_ref] this: &Scheduler, nodename: &str) -> Result<(), Error> {
>> let mut usage = this.inner.lock().unwrap();
>>
>> - usage.nodes.remove(nodename);
>> + if let Some(node) = usage.nodes.remove(nodename) {
>> + for (sid, _) in node.services.iter() {
>> + match usage.service_nodes.get_mut(sid) {
>> + Some(nodes) => {
>> + nodes.retain(|node| !node.eq(nodename));
>> + }
>> + None => bail!(
>> + "service '{}' not present while removing node '{}'",
>
> Nit: for a user (or developer in 6 months time :P), this error message
> is rather unspecific. I'd suggest:
> "internal error: ... not present in service_nodes hash map while ..."
Will do, as those are always helpful, but hopefully this will never reach
any users, since it's just a consistency check of the implementation
itself ;)
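For the archive, the check with the clearer message could look roughly
like this (a sketch with simplified types, not the final patch code):
```
use std::collections::HashMap;

use anyhow::{bail, Error};

// Unlink all services of a removed node from the service -> nodes map,
// bailing with the "internal error" prefix if the maps got inconsistent.
fn unlink_node_services(
    service_nodes: &mut HashMap<String, Vec<String>>,
    node_services: &[String],
    nodename: &str,
) -> Result<(), Error> {
    for sid in node_services {
        match service_nodes.get_mut(sid) {
            Some(nodes) => nodes.retain(|node| node.as_str() != nodename),
            None => bail!(
                "internal error: service '{sid}' not present in service_nodes \
                 hash map while removing node '{nodename}'"
            ),
        }
    }
    Ok(())
}
```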
>
>> + sid,
>> + nodename
>> + ),
>> + }
>> + }
>> + }
>> +
>> + Ok(())
>> }
>>
>> /// Method: Get a list of all the nodes in the scheduler.
>> @@ -93,22 +116,55 @@ pub mod pve_rs_resource_scheduling_static {
>> usage.nodes.contains_key(nodename)
>> }
>>
>> - /// Method: Add usage of `service` to the node's usage.
>> + /// Method: Add service `sid` and its `service_usage` to the node.
>> #[export]
>> pub fn add_service_usage_to_node(
>> #[try_from_ref] this: &Scheduler,
>> nodename: &str,
>> - service: StaticServiceUsage,
>> + sid: &str,
>> + service_usage: StaticServiceUsage,
>
> So this requires a versioned Breaks in d/control. Please mention this in
> the commit notes and in the cover letter.
ACK, I'll add that information here and keep it in mind for future patch
series!
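For reference, on the libpve-rs-perl side that would be something like
the following stanza in d/control (the version is only a placeholder for
whichever pve-ha-manager release first uses the new signature):
```
Package: libpve-rs-perl
Breaks: pve-ha-manager (<< 5.0.0~)
```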
>
>> ) -> Result<(), Error> {
>> let mut usage = this.inner.lock().unwrap();
>>
>> match usage.nodes.get_mut(nodename) {
>> Some(node) => {
>> - node.add_service_usage(&service);
>> - Ok(())
>> + if node.services.contains_key(sid) {
>> + bail!("service '{}' already added to node '{}'", sid, nodename);
>> + }
>> +
>> + node.services.insert(sid.to_string(), service_usage);
>> }
>> None => bail!("node '{}' not present in usage hashmap", nodename),
>> }
>> +
>> + if let Some(nodes) = usage.service_nodes.get_mut(sid) {
>> + nodes.push(nodename.to_string());
>> + } else {
>> + usage
>> + .service_nodes
>> + .insert(sid.to_string(), vec![nodename.to_string()]);
>> + }
>
> Nit: just to be sure, we could check whether the node would be duplicate
> in service_nodes too (is more straightforward with using HashSet). Maybe
> not really worth it though, because if we keep the mappings node ->
> services and service -> nodes consistent with each other (which we of
> course should) then the check above already protects us.
I'll check that again, but I'd go for the HashSet idea then, as it fits
better conceptually. Also, because I was curious:
```
use std::collections::{HashMap, HashSet};
use std::mem::size_of;

fn main() {
    println!("HashMap<String, Vec<String>>: {}", size_of::<HashMap<String, Vec<String>>>());
    println!("HashMap<String, HashSet<String>>: {}", size_of::<HashMap<String, HashSet<String>>>());
}
---
HashMap<String, Vec<String>>: 48
HashMap<String, HashSet<String>>: 48
```
although the inner containers themselves do differ:
```
println!("Vec<String>: {}", size_of::<Vec<String>>());
println!("HashSet<String>: {}", size_of::<HashSet<String>>());
---
Vec<String>: 24
HashSet<String>: 48
```
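With the HashSet variant, the duplicate check you mentioned falls out of
insert()'s return value; a quick sketch of what I have in mind
(hypothetical helper, simplified from the patch):
```
use std::collections::{HashMap, HashSet};

use anyhow::{bail, Error};

// HashSet::insert returns false if the value was already present, so a
// duplicate node registration is detected without a separate lookup.
fn add_service_node(
    service_nodes: &mut HashMap<String, HashSet<String>>,
    sid: &str,
    nodename: &str,
) -> Result<(), Error> {
    let nodes = service_nodes.entry(sid.to_string()).or_default();
    if !nodes.insert(nodename.to_string()) {
        bail!("internal error: service '{sid}' already registered on node '{nodename}'");
    }
    Ok(())
}
```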
>
> ---snip---
>
>> @@ -50,7 +65,54 @@ sub test_balance {
>> is($nodes[1], "A", 'second should be A');
>> }
>>
>> - $static->add_service_usage_to_node($nodes[0], $service);
>> + $static->add_service_usage_to_node($nodes[0], $sid_names[$i], $service);
>
> Nit: could also just use something like "vm:$i" or "vm:" . 100 + $i
> rather than letters. If it's a variable-controlled enumeration that
> seems more natural to me.
Right, I wanted to stick with the alphabetic names here, but generated
names make much more sense for a variable-controlled enumeration ^^.
>
>> + }
>> +}
>> +
>
> ---snip---
>
>> @@ -96,9 +158,9 @@ sub test_balance_small_memory_difference {
>> $static->add_node("C", 4, 8_000_000_000);
>>
>> if ($with_start_load) {
>> - $static->add_service_usage_to_node("A", { maxcpu => 4, maxmem => 1_000_000_000 });
>> - $static->add_service_usage_to_node("B", { maxcpu => 2, maxmem => 1_000_000_000 });
>> - $static->add_service_usage_to_node("C", { maxcpu => 2, maxmem => 1_000_000_000 });
>> + $static->add_service_usage_to_node("A", "a", { maxcpu => 4, maxmem => 1_000_000_000 });
>> + $static->add_service_usage_to_node("B", "b", { maxcpu => 2, maxmem => 1_000_000_000 });
>> + $static->add_service_usage_to_node("C", "c", { maxcpu => 2, maxmem => 1_000_000_000 });
>> }
>>
>> my $service = {
>> @@ -106,6 +168,7 @@ sub test_balance_small_memory_difference {
>> maxmem => 16_000_000,
>> };
>>
>> + my @sid_names = ("d" .. "z");
>> for (my $i = 0; $i < 20; $i++) {
>> my $score_list = $static->score_nodes_to_start_service($service);
>>
>> @@ -131,12 +194,13 @@ sub test_balance_small_memory_difference {
>> die "internal error, got $i % 4 == " . ($i % 4) . "\n";
>> }
>>
>> - $static->add_service_usage_to_node($nodes[0], $service);
>> + $static->add_service_usage_to_node($nodes[0], $sid_names[$i], $service);
>
> Nit: same as above regarding service names, we're already very close to
> the edge of the alphabet in this case ;P
>
>> }
>> }
>>
>> test_basic();
>> test_balance();
>> +test_balance_removal();
>> test_overcommitted();
>> test_balance_small_memory_difference(1);
>> test_balance_small_memory_difference(0);
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel