From: "Daniel Kral" <d.kral@proxmox.com>
To: "Fiona Ebner" <f.ebner@proxmox.com>,
"Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH perl-rs 1/1] pve-rs: resource_scheduling: allow granular usage changes
Date: Thu, 16 Oct 2025 17:34:52 +0200 [thread overview]
Message-ID: <DDJUUR3FA0IC.1J5AYPWLRMS3F@proxmox.com> (raw)
In-Reply-To: <88318e2a-e64f-4537-9a6e-09003bd4af53@proxmox.com>
On Thu Oct 16, 2025 at 12:32 PM CEST, Fiona Ebner wrote:
> Am 30.09.25 um 4:21 PM schrieb Daniel Kral:
>> Implements a simple bidirectional map to track which service usages have
>> been added to nodes, so that these can be removed later individually.
>>
>> The StaticServiceUsage is added to the HashMap<> in StaticNodeInfo to
>> reduce unnecessary indirection when summing these values in
>> score_nodes_to_start_service(...).
>>
>> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
>
> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
>
>> ---
>> I started out adding and removing the service_usage in StaticNodeUsage
>> on each add_service_usage_to_node(...) and remove_service_usage(...)
>> call individually, but I think summing those up every time is better for
>> numerical stability..
>
> This can go into the commit message too if worded a bit less informally ;)
Will do! Though I have been asking myself whether the concern about
numerical stability really holds here if we don't go for ha #9: in that
case these values only change over a single manage(...) run, and I'd
guess that most users end up with floating-point values that can be
represented exactly (0.0, 1.0, 2.0, 3.0, ... and 0.5, 0.25, 0.125, ...).
For those, numerical stability shouldn't be a problem at all, should it?
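To make that point concrete, here is a small standalone sketch (plain
f64 arithmetic, not code from the patch) contrasting exactly
representable values with an incremental add/remove that drifts:

```rust
fn main() {
    // Values with short binary fractions (0.25, 0.5, 1.0, ...) are exact
    // in f64, so summing them yields the mathematically correct result.
    let exact: f64 = (0..8).map(|_| 0.25f64).sum();
    assert_eq!(exact, 2.0);

    // 0.1 has no finite binary representation; repeatedly accumulating
    // it drifts away from the mathematically correct value.
    let drift: f64 = (0..10).map(|_| 0.1f64).sum();
    assert_ne!(drift, 1.0); // sums to 0.9999999999999999

    // Incrementally adding and then removing a usage need not restore
    // the original running sum, which is why recomputing the sum from
    // the per-service map on each call is the more robust choice.
    let mut running = 0.0f64;
    running += 0.1;
    running += 0.2;
    running -= 0.1;
    assert_ne!(running, 0.2);
}
```

Recomputing from the per-service map makes the result independent of the
add/remove history, at the cost of an extra pass over the services.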
>
>>
>> .../bindings/resource_scheduling_static.rs | 98 ++++++++++++++++---
>> pve-rs/test/resource_scheduling.pl | 84 ++++++++++++++--
>> 2 files changed, 160 insertions(+), 22 deletions(-)
>>
>> diff --git a/pve-rs/src/bindings/resource_scheduling_static.rs b/pve-rs/src/bindings/resource_scheduling_static.rs
>> index 7cf2f35..535dfe3 100644
>> --- a/pve-rs/src/bindings/resource_scheduling_static.rs
>> +++ b/pve-rs/src/bindings/resource_scheduling_static.rs
>> @@ -16,8 +16,16 @@ pub mod pve_rs_resource_scheduling_static {
>>
>> perlmod::declare_magic!(Box<Scheduler> : &Scheduler as "PVE::RS::ResourceScheduling::Static");
>>
>> + struct StaticNodeInfo {
>> + name: String,
>> + maxcpu: usize,
>> + maxmem: usize,
>> + services: HashMap<String, StaticServiceUsage>,
>> + }
>> +
>> struct Usage {
>> - nodes: HashMap<String, StaticNodeUsage>,
>> + nodes: HashMap<String, StaticNodeInfo>,
>> + service_nodes: HashMap<String, Vec<String>>,
>
> Rather than Vec it could be a HashSet, but I'm fine with either way.
>
>> }
>>
>> /// A scheduler instance contains the resource usage by node.
>
> ---snip---
>
>> @@ -67,10 +75,25 @@ pub mod pve_rs_resource_scheduling_static {
>>
>> /// Method: Remove a node from the scheduler.
>> #[export]
>> - pub fn remove_node(#[try_from_ref] this: &Scheduler, nodename: &str) {
>> + pub fn remove_node(#[try_from_ref] this: &Scheduler, nodename: &str) -> Result<(), Error> {
>> let mut usage = this.inner.lock().unwrap();
>>
>> - usage.nodes.remove(nodename);
>> + if let Some(node) = usage.nodes.remove(nodename) {
>> + for (sid, _) in node.services.iter() {
>> + match usage.service_nodes.get_mut(sid) {
>> + Some(nodes) => {
>> + nodes.retain(|node| !node.eq(nodename));
>> + }
>> + None => bail!(
>> + "service '{}' not present while removing node '{}'",
>
> Nit: for a user (or developer in 6 months time :P), this error message
> is rather unspecific. I'd suggest:
> "internal error: ... not present in service_nodes hash map while ..."
Will do, as those are always helpful, but hopefully this won't reach any
users, since it's a consistency check of the implementation itself ;)
>
>> + sid,
>> + nodename
>> + ),
>> + }
>> + }
>> + }
>> +
>> + Ok(())
>> }
>>
>> /// Method: Get a list of all the nodes in the scheduler.
>> @@ -93,22 +116,55 @@ pub mod pve_rs_resource_scheduling_static {
>> usage.nodes.contains_key(nodename)
>> }
>>
>> - /// Method: Add usage of `service` to the node's usage.
>> + /// Method: Add service `sid` and its `service_usage` to the node.
>> #[export]
>> pub fn add_service_usage_to_node(
>> #[try_from_ref] this: &Scheduler,
>> nodename: &str,
>> - service: StaticServiceUsage,
>> + sid: &str,
>> + service_usage: StaticServiceUsage,
>
> So this requires a versioned Breaks in d/control. Please mention this in
> the commit notes and in the cover letter.
ACK, I will add that information here and keep it in mind for future
patch series!
>
>> ) -> Result<(), Error> {
>> let mut usage = this.inner.lock().unwrap();
>>
>> match usage.nodes.get_mut(nodename) {
>> Some(node) => {
>> - node.add_service_usage(&service);
>> - Ok(())
>> + if node.services.contains_key(sid) {
>> + bail!("service '{}' already added to node '{}'", sid, nodename);
>> + }
>> +
>> + node.services.insert(sid.to_string(), service_usage);
>> }
>> None => bail!("node '{}' not present in usage hashmap", nodename),
>> }
>> +
>> + if let Some(nodes) = usage.service_nodes.get_mut(sid) {
>> + nodes.push(nodename.to_string());
>> + } else {
>> + usage
>> + .service_nodes
>> + .insert(sid.to_string(), vec![nodename.to_string()]);
>> + }
>
> Nit: just to be sure, we could check whether the node would be duplicate
> in service_nodes too (is more straightforward with using HashSet). Maybe
> not really worth it though, because if we keep the mappings node ->
> services and service -> nodes consistent with each other (which we of
> course should) then the check above already protects us.
I'll check that again, but I'd then go for the HashSet variant, as it
fits better conceptually. Out of curiosity, I also compared the sizes:
```
println!("HashMap<String, Vec<String>>: {}", size_of::<HashMap<String, Vec<String>>>());
println!("HashMap<String, HashSet<String>>: {}", size_of::<HashMap<String, HashSet<String>>>());
---
HashMap<String, Vec<String>>: 48
HashMap<String, HashSet<String>>: 48
```
although the container types themselves differ:
```
println!("Vec<String>: {}", size_of::<Vec<String>>());
println!("HashSet<String>: {}", size_of::<HashSet<String>>());
---
Vec<String>: 24
HashSet<String>: 48
```
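For comparison, a minimal sketch of what a HashSet-backed service_nodes
map could look like (the names "vm:100" and "A" are made up for
illustration; this is not the patch code):

```rust
use std::collections::{HashMap, HashSet};

fn main() {
    let mut service_nodes: HashMap<String, HashSet<String>> = HashMap::new();

    // HashSet::insert() returns false if the node was already recorded
    // for this sid, so the duplicate check comes for free.
    let first = service_nodes
        .entry("vm:100".to_string())
        .or_default()
        .insert("A".to_string());
    let dup = service_nodes
        .entry("vm:100".to_string())
        .or_default()
        .insert("A".to_string());
    assert!(first);
    assert!(!dup);

    // Removal is a direct remove() instead of the Vec::retain() scan.
    if let Some(nodes) = service_nodes.get_mut("vm:100") {
        nodes.remove("A");
    }
    assert!(service_nodes.get("vm:100").is_some_and(|s| s.is_empty()));
}
```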
>
> ---snip---
>
>> @@ -50,7 +65,54 @@ sub test_balance {
>> is($nodes[1], "A", 'second should be A');
>> }
>>
>> - $static->add_service_usage_to_node($nodes[0], $service);
>> + $static->add_service_usage_to_node($nodes[0], $sid_names[$i], $service);
>
> Nit: could also just use something like "vm:$i" or "vm:" . 100 + $i
> rather than letters. If it's a variable-controlled enumeration that
> seems more natural to me.
Right, I wanted to stick with the letter naming here, but a generated
name makes much more sense for a variable-controlled enumeration ^^.
>
>> + }
>> +}
>> +
>
> ---snip---
>
>> @@ -96,9 +158,9 @@ sub test_balance_small_memory_difference {
>> $static->add_node("C", 4, 8_000_000_000);
>>
>> if ($with_start_load) {
>> - $static->add_service_usage_to_node("A", { maxcpu => 4, maxmem => 1_000_000_000 });
>> - $static->add_service_usage_to_node("B", { maxcpu => 2, maxmem => 1_000_000_000 });
>> - $static->add_service_usage_to_node("C", { maxcpu => 2, maxmem => 1_000_000_000 });
>> + $static->add_service_usage_to_node("A", "a", { maxcpu => 4, maxmem => 1_000_000_000 });
>> + $static->add_service_usage_to_node("B", "b", { maxcpu => 2, maxmem => 1_000_000_000 });
>> + $static->add_service_usage_to_node("C", "c", { maxcpu => 2, maxmem => 1_000_000_000 });
>> }
>>
>> my $service = {
>> @@ -106,6 +168,7 @@ sub test_balance_small_memory_difference {
>> maxmem => 16_000_000,
>> };
>>
>> + my @sid_names = ("d" .. "z");
>> for (my $i = 0; $i < 20; $i++) {
>> my $score_list = $static->score_nodes_to_start_service($service);
>>
>> @@ -131,12 +194,13 @@ sub test_balance_small_memory_difference {
>> die "internal error, got $i % 4 == " . ($i % 4) . "\n";
>> }
>>
>> - $static->add_service_usage_to_node($nodes[0], $service);
>> + $static->add_service_usage_to_node($nodes[0], $sid_names[$i], $service);
>
> Nit: same as above regarding service names, we're already very close to
> the edge of the alphabet in this case ;P
>
>> }
>> }
>>
>> test_basic();
>> test_balance();
>> +test_balance_removal();
>> test_overcommitted();
>> test_balance_small_memory_difference(1);
>> test_balance_small_memory_difference(0);
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
2025-09-30 14:19 [pve-devel] [RFC ha-manager/perl-rs/proxmox/qemu-server 00/12] Granular online_node_usage accounting Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH qemu-server 1/1] config: only fetch necessary default values in get_derived_property helper Daniel Kral
2025-10-15 14:31 ` Fiona Ebner
2025-10-16 9:07 ` Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH proxmox 1/1] resource-scheduling: change score_nodes_to_start_service signature Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH perl-rs 1/1] pve-rs: resource_scheduling: allow granular usage changes Daniel Kral
2025-10-16 10:32 ` Fiona Ebner
2025-10-16 15:34 ` Daniel Kral [this message]
2025-10-17 10:55 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 1/9] implement static service stats cache Daniel Kral
2025-10-16 11:12 ` Fiona Ebner
2025-10-16 15:15 ` Daniel Kral
2025-10-17 10:02 ` Fiona Ebner
2025-10-17 10:08 ` Fiona Ebner
2025-10-17 16:18 ` Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 2/9] manager: remove redundant recompute_online_node_usage from next_state_recovery Daniel Kral
2025-10-16 11:25 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 3/9] manager: remove redundant add_service_usage_to_node " Daniel Kral
2025-10-16 11:33 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 4/9] manager: remove redundant add_service_usage_to_node from next_state_started Daniel Kral
2025-10-16 11:39 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 5/9] rules: resource affinity: decouple get_resource_affinity helper from Usage class Daniel Kral
2025-10-17 11:14 ` Fiona Ebner
2025-10-17 15:46 ` Daniel Kral
2025-10-20 15:18 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 6/9] manager: make recompute_online_node_usage use get_service_nodes helper Daniel Kral
2025-10-17 11:25 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 7/9] usage: allow granular changes to Usage implementations Daniel Kral
2025-10-17 11:57 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 8/9] manager: make online node usage computation granular Daniel Kral
2025-10-17 12:32 ` Fiona Ebner
2025-10-17 16:07 ` Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 9/9] manager: make service node usage computation more granular Daniel Kral
2025-10-17 12:42 ` Fiona Ebner
2025-10-17 15:59 ` Daniel Kral
2025-10-20 16:50 ` [pve-devel] superseded: [RFC ha-manager/perl-rs/proxmox/qemu-server 00/12] Granular online_node_usage accounting Daniel Kral