From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) by lore.proxmox.com (Postfix) with ESMTPS id B15AC1FF13F for ; Thu, 26 Mar 2026 11:28:40 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 20F6FC4D4; Thu, 26 Mar 2026 11:29:00 +0100 (CET) Content-Type: text/plain; charset=UTF-8 Date: Thu, 26 Mar 2026 11:28:53 +0100 Message-Id: Subject: Re: [PATCH proxmox v2 05/40] resource-scheduling: implement generic cluster usage implementation From: "Dominik Rusovac" To: "Daniel Kral" , Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 X-Mailer: aerc 0.20.0 References: <20260324183029.1274972-1-d.kral@proxmox.com> <20260324183029.1274972-6-d.kral@proxmox.com> In-Reply-To: <20260324183029.1274972-6-d.kral@proxmox.com> X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1774520885375 X-SPAM-LEVEL: Spam detection results: 0 AWL 0.336 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: W33F2ZGATUL742WYMVA72L4DZ4VHF553 X-Message-ID-Hash: W33F2ZGATUL742WYMVA72L4DZ4VHF553 X-MailFrom: d.rusovac@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox VE development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: lgtm pls find my comments inline, mostly relating to nits or tiny things On Tue Mar 24, 2026 at 7:29 PM CET, Daniel Kral wrote: > This is a more generic version of the `Usage` implementation from the > pve_static bindings in the pve_rs repository. > > As the upcoming load balancing scheduler actions and dynamic resource > scheduler will need more information about each resource, this further > improves on the state tracking of each resource: > > In this implementation, a resource is composed of its usage statistics > and its two essential states: the running state and the node placement. > The non_exhaustive attribute ensures that usages need to construct the > a Resource instance through its API. > > Users can repeatedly use the current state of Usage to make scheduling > decisions with the to_scheduler() method. This method takes an > implementation of UsageAggregator, which dictates how the usage > information is represented to the Scheduler. > > Signed-off-by: Daniel Kral > --- > changes v1 -> v2: > - new! > > This patch is added to move the handling of specific usage stats and > their (de)serialization to the pve-rs bindings and have the general > functionality in this crate. [snip] nit: imo, it's more convenient to expose the more ergonomic `&str` type, using: pub fn resources_iter(&self) -> impl Iterator { self.resources.iter().map(String::as_str) } > + pub fn resources_iter(&self) -> impl Iterator { > + self.resources.iter() > + } [snip]=20 > + pub fn moving_to(&mut self, target_node: String) -> Result<(), Error= > { > + match &self.placement { > + ResourcePlacement::Stationary { current_node } =3D> { > + self.placement =3D ResourcePlacement::Moving { > + current_node: current_node.to_string(), nit:=20 current_node: current_node.to_owned(),=20 represents the intention best, that is, owning rather than converting [snip] > + /// Handles the external removal of a node. > + /// > + /// Returns whether the resource does not have any node left. Considering what it does, I find the name of this function a bit confusing. > + pub fn remove_node(&mut self, nodename: &str) -> bool { > + match &self.placement { > + ResourcePlacement::Stationary { current_node } =3D> current_= node =3D=3D nodename, > + ResourcePlacement::Moving { > + current_node, > + target_node, > + } =3D> { > + if current_node =3D=3D nodename { > + self.placement =3D ResourcePlacement::Stationary { > + current_node: target_node.to_string(), nit: to_owned() represents the intention best > + }; > + } else if target_node =3D=3D nodename { > + self.placement =3D ResourcePlacement::Stationary { > + current_node: current_node.to_string(), nit: to_owned() represents the intention best > + }; > + } > + > + false > + } > + } > + } [snip] > + /// Add a node to the cluster usage. > + /// > + /// This method fails if a node with the same `nodename` already exi= sts. > + pub fn add_node(&mut self, nodename: String, stats: NodeStats) -> Re= sult<(), Error> { > + if self.nodes.contains_key(&nodename) { > + bail!("node '{}' already exists", nodename); nit:=20 bail!("node '{nodename}' already exists"); > + } [snip] we are reading only, consider using a slice for `nodenames` here (just like for `remove_resource_from_nodes`): fn add_resource_to_nodes(&mut self, sid: &str, nodenames: &[&str]) -= > Result<(), Error> { pls find the related changes [0] and [1]. > + fn add_resource_to_nodes(&mut self, sid: &str, nodenames: Vec<&str>)= -> Result<(), Error> { > + if nodenames > + .iter() > + .any(|nodename| !self.nodes.contains_key(*nodename)) > + { > + bail!("resource nodes do not exist"); > + } > + > + nodenames.iter().for_each(|nodename| { > + if let Some(node) =3D self.nodes.get_mut(*nodename) { > + node.add_resource(sid); > + } > + }); > + > + Ok(()) > + } [snip] > + /// Add `resource` with identifier `sid` to cluster usage. > + /// > + /// This method fails if a resource with the same `sid` already exis= ts or the resource's nodes > + /// do not exist in the cluster usage. > + pub fn add_resource(&mut self, sid: String, resource: Resource) -> R= esult<(), Error> { > + if self.resources.contains_key(&sid) { > + bail!("resource '{}' already exists", sid); > + } > + > + self.add_resource_to_nodes(&sid, resource.nodenames())?; [0]: self.add_resource_to_nodes(&sid, &resource.nodenames())?; > + > + self.resources.insert(sid.to_string(), resource); nit: to_owned() instead of of to_string() represents the intention best [snip] > + pub fn add_resource_usage_to_node( > + &mut self, > + nodename: &str, > + sid: &str, > + stats: ResourceStats, > + ) -> Result<(), Error> { > + if let Some(resource) =3D self.resources.get_mut(sid) { > + resource.moving_to(nodename.to_string())?; > + > + self.add_resource_to_nodes(sid, vec![nodename]) [1]: self.add_resource_to_nodes(sid, &[nodename]) [snip] > +#[test] > +fn test_no_duplicate_nodes() -> Result<(), Error> { > + let mut usage =3D Usage::new(); > + > + usage.add_node("node1".to_string(), NodeStats::default())?; > + > + match usage.add_node("node1".to_string(), NodeStats::default()) { > + Ok(_) =3D> bail!("cluster usage does allow duplicate node entrie= s"), > + Err(_) =3D> Ok(()), > + } since this is supposed to be a test case, I would rather assert instead of bail, using: assert!( usage .add_node("node1".to_string(), NodeStats::default()) .is_err(), "cluster usage allows duplicate node entries" ); > +} > + > +#[test] > +fn test_no_duplicate_resources() -> Result<(), Error> { > + let mut usage =3D Usage::new(); > + > + usage.add_node("node1".to_string(), NodeStats::default())?; > + > + let placement =3D ResourcePlacement::Stationary { > + current_node: "node1".to_string(), > + }; > + let resource =3D Resource::new(ResourceStats::default(), ResourceSta= te::Stopped, placement); > + > + usage.add_resource("vm:101".to_string(), resource.clone())?; > + > + match usage.add_resource("vm:101".to_string(), resource) { > + Ok(_) =3D> bail!("cluster usage does allow duplicate resource en= tries"), > + Err(_) =3D> Ok(()), > + } assert instead of bail: assert!( usage.add_resource("vm:101".to_string(), resource).is_err(), "cluster usage allows duplicate resource entries" ); > +} > + > +#[test] > +#[allow(deprecated)] > +fn test_add_resource_usage_to_node() -> Result<(), Error> { > + let mut usage =3D Usage::new(); > + > + usage.add_node("node1".to_string(), NodeStats::default())?; > + usage.add_node("node2".to_string(), NodeStats::default())?; > + usage.add_node("node3".to_string(), NodeStats::default())?; > + > + usage.add_resource_usage_to_node("node1", "vm:101", ResourceStats::d= efault())?; > + usage.add_resource_usage_to_node("node2", "vm:101", ResourceStats::d= efault())?; > + > + if usage > + .add_resource_usage_to_node("node3", "vm:101", ResourceStats::de= fault()) > + .is_ok() > + { > + bail!("add_resource_usage_to_node() allows adding resource to mo= re than two nodes"); > + } assert instead of bail: assert!( usage .add_resource_usage_to_node("node3", "vm:101", ResourceStats= ::default()) .is_err(), "add_resource_usage_to_node() allows adding resource to more tha= n two nodes" ); > + > + Ok(()) > +} > + > +#[test] > +fn test_add_remove_stationary_resource() -> Result<(), Error> { > + let mut usage =3D Usage::new(); > + > + let (sid, nodename) =3D ("vm:101", "node1"); > + > + usage.add_node(nodename.to_string(), NodeStats::default())?; > + > + let placement =3D ResourcePlacement::Stationary { > + current_node: nodename.to_string(), > + }; > + let resource =3D Resource::new(ResourceStats::default(), ResourceSta= te::Stopped, placement); > + > + usage.add_resource(sid.to_string(), resource)?; > + > + match (usage.get_resource(sid), usage.get_node(nodename)) { > + (Some(_), Some(node)) =3D> { > + if !node.contains_resource(sid) { > + bail!("resource '{sid}' was not added to node '{nodename= }'"); > + } > + } > + _ =3D> bail!("resource '{sid}' or node '{nodename}' were not add= ed"), > + } assert instead of bail: assert!( usage.get_resource(sid).is_some(), "resource '{sid}' was not added" ); assert!( usage .get_node(nodename) .map(|node| { assert!( node.contains_resource(sid), "resource '{sid}' was not added to node '{nodename}'= " ); }) .is_some(), "node '{nodename}' was not added" ); > + > + usage.remove_resource(sid); > + > + match (usage.get_resource(sid), usage.get_node(nodename)) { > + (None, Some(node)) =3D> { > + if node.contains_resource(sid) { > + bail!("resource '{sid}' was not removed from node '{node= name}'"); > + } > + } > + _ =3D> bail!("resource '{sid}' was not removed"), > + } assert instead of bail: assert!( usage.get_resource(sid).is_none(), "resource '{sid}' was not removed" ); assert!( usage .get_node(nodename) .map(|node| { assert!( !node.contains_resource(sid), "resource '{sid}' was not removed from node '{nodena= me}'" ); }) .is_some(), "node '{nodename}' was not added" ); > + > + Ok(()) > +} > + > +#[test] > +fn test_add_remove_moving_resource() -> Result<(), Error> { > + let mut usage =3D Usage::new(); > + > + let (sid, current_nodename, target_nodename) =3D ("vm:101", "node1",= "node2"); > + > + usage.add_node(current_nodename.to_string(), NodeStats::default())?; > + usage.add_node(target_nodename.to_string(), NodeStats::default())?; > + > + let placement =3D ResourcePlacement::Moving { > + current_node: current_nodename.to_string(), > + target_node: target_nodename.to_string(), > + }; > + let resource =3D Resource::new(ResourceStats::default(), ResourceSta= te::Stopped, placement); > + > + usage.add_resource(sid.to_string(), resource)?; > + analogously, here I'd find asserting more appropriate than bailing > + match ( > + usage.get_resource(sid), > + usage.get_node(current_nodename), > + usage.get_node(target_nodename), > + ) { > + (Some(_), Some(current_node), Some(target_node)) =3D> { > + if !current_node.contains_resource("vm:101") { > + bail!("resource '{sid}' was not added to current node '{= current_nodename}'"); > + } > + > + if !target_node.contains_resource("vm:101") { > + bail!("resource '{sid}' was not added to target node '{t= arget_nodename}'"); > + } > + } > + _ =3D> bail!("resource '{sid}' or nodes were not added"), > + } > + > + usage.remove_resource(sid); analogously, here I'd find asserting more appropriate than bailing > + > + match ( > + usage.get_resource(sid), > + usage.get_node(current_nodename), > + usage.get_node(target_nodename), > + ) { > + (None, Some(current_node), Some(target_node)) =3D> { > + if current_node.contains_resource(sid) { > + bail!("resource '{sid}' was not removed from current nod= e '{current_nodename}'"); > + } > + > + if target_node.contains_resource(sid) { > + bail!("resource '{sid}' was not removed from target node= '{target_nodename}'"); > + } > + } > + _ =3D> bail!("resource '{sid}' was not removed"), > + } > + > + Ok(()) > +}