From: Dominik Csapak <d.csapak@proxmox.com>
To: Proxmox Datacenter Manager development discussion
<pdm-devel@lists.proxmox.com>,
Lukas Wagner <l.wagner@proxmox.com>
Subject: Re: [pdm-devel] [RFC proxmox-datacenter-manager 1/2] server: add convenience helper to fetch results from many remote nodes in parallel
Date: Fri, 29 Aug 2025 10:57:49 +0200 [thread overview]
Message-ID: <5fca5c4c-cdbb-4cbf-b3a0-475f2d7f1f44@proxmox.com> (raw)
In-Reply-To: <20250828144247.298536-2-l.wagner@proxmox.com>
some minor comments inline, but looks good in general to me
On 8/28/25 4:42 PM, Lukas Wagner wrote:
> Implementing this over and over again is a lot of work; many parts of
> PDM need to fetch some kind of resource from all remotes and all nodes
> of these remotes.
>
> The general approach is the same as for remote task fetching (JoinSet
> and semaphores to control concurrency), it will be refactored in a later
> commit to use this new helper.
>
> Signed-off-by: Lukas Wagner <l.wagner@proxmox.com>
> ---
> server/src/lib.rs | 1 +
> server/src/parallel_fetcher.rs | 236 +++++++++++++++++++++++++++++++++
> 2 files changed, 237 insertions(+)
> create mode 100644 server/src/parallel_fetcher.rs
>
> diff --git a/server/src/lib.rs b/server/src/lib.rs
> index 485bc792..3f8b7708 100644
> --- a/server/src/lib.rs
> +++ b/server/src/lib.rs
> @@ -6,6 +6,7 @@ pub mod auth;
> pub mod context;
> pub mod env;
> pub mod metric_collection;
> +pub mod parallel_fetcher;
> pub mod remote_cache;
> pub mod remote_tasks;
> pub mod resource_cache;
> diff --git a/server/src/parallel_fetcher.rs b/server/src/parallel_fetcher.rs
> new file mode 100644
> index 00000000..3fad68a0
> --- /dev/null
> +++ b/server/src/parallel_fetcher.rs
> @@ -0,0 +1,236 @@
> +use std::{
> + collections::HashMap,
> + fmt::Debug,
> + future::Future,
> + sync::Arc,
> + time::{Duration, Instant},
> +};
> +
> +use anyhow::Error;
> +use pdm_api_types::remotes::{Remote, RemoteType};
> +use pve_api_types::ClusterNodeIndexResponse;
> +use tokio::{
> + sync::{OwnedSemaphorePermit, Semaphore},
> + task::JoinSet,
> +};
> +
> +use crate::connection;
> +
> +pub struct ParallelFetcher<C> {
> + pub max_connections: usize,
> + pub max_connections_per_remote: usize,
> + pub context: C,
> +}
> +
> +pub struct FetchResults<T> {
> + pub remote_results: HashMap<String, Result<RemoteResult<T>, Error>>,
> +}
> +
> +impl<T> Default for FetchResults<T> {
> + fn default() -> Self {
> + Self {
> + remote_results: Default::default(),
> + }
> + }
> +}
> +
> +#[derive(Debug)]
> +pub struct RemoteResult<T> {
> + pub node_results: HashMap<String, Result<NodeResults<T>, Error>>,
> +}
> +
> +impl<T> Default for RemoteResult<T> {
> + fn default() -> Self {
> + Self {
> + node_results: Default::default(),
> + }
> + }
> +}
> +
> +#[derive(Debug)]
> +pub struct NodeResults<T> {
> + pub data: T,
> + pub api_response_time: Duration,
> +}
> +
> +impl<C: Clone + Send + 'static> ParallelFetcher<C> {
> + pub fn new(context: C) -> Self {
> + Self {
> + max_connections: 20,
> + max_connections_per_remote: 5,
you probably thought of it already, but we probably want to
make this settable from outside in a seperate constructor, or at least
put the values in const values for now
> + context,
> + }
> + }
> +
> + pub async fn do_for_all_remote_nodes<A, F, T, Ft>(self, remotes: A, func: F) -> FetchResults<T>
> + where
> + A: Iterator<Item = Remote>,
> + F: Fn(C, Remote, String) -> Ft + Clone + Send + 'static,
> + Ft: Future<Output = Result<T, Error>> + Send + 'static,
> + T: Send + Debug + 'static,
> + {
> + let total_connections_semaphore = Arc::new(Semaphore::new(self.max_connections));
> +
> + let mut remote_join_set = JoinSet::new();
> +
> + for remote in remotes {
> + let semaphore = Arc::clone(&total_connections_semaphore);
> +
> + let f = func.clone();
> +
> + remote_join_set.spawn(Self::fetch_remote(
> + remote,
> + self.context.clone(),
> + semaphore,
> + f,
> + self.max_connections_per_remote,
> + ));
not sure if it would work properly, but couldn't we put the semaphores
into self, and clone that instead?
then `fetch_remote` could become a normal method on self
and we could implement a helper for getting permits and
we wouldn't have to pass the semaphores around all the time.
(maybe that does not work as i think though..)
> + }
> +
> + let mut results = FetchResults::default();
> +
> + while let Some(a) = remote_join_set.join_next().await {
> + match a {
> + Ok((remote_name, remote_result)) => {
> + results.remote_results.insert(remote_name, remote_result);
> + }
> + Err(err) => {
> + log::error!("join error when waiting for future: {err}")
> + }
> + }
> + }
> +
> + results
> + }
> +
> + async fn fetch_remote<F, Ft, T>(
> + remote: Remote,
> + context: C,
> + semaphore: Arc<Semaphore>,
> + func: F,
> + max_connections_per_remote: usize,
> + ) -> (String, Result<RemoteResult<T>, Error>)
> + where
> + F: Fn(C, Remote, String) -> Ft + Clone + Send + 'static,
> + Ft: Future<Output = Result<T, Error>> + Send + 'static,
> + T: Send + Debug + 'static,
> + {
> + let mut per_remote_results = RemoteResult::default();
> +
> + let mut permit = Some(Arc::clone(&semaphore).acquire_owned().await.unwrap());
> + let per_remote_semaphore = Arc::new(Semaphore::new(max_connections_per_remote));
> +
> + match remote.ty {
> + RemoteType::Pve => {
> + let remote_clone = remote.clone();
> +
> + let nodes = match async move {
> + let client = connection::make_pve_client(&remote_clone)?;
> + let nodes = client.list_nodes().await?;
> +
> + Ok::<Vec<ClusterNodeIndexResponse>, Error>(nodes)
> + }
> + .await
> + {
> + Ok(nodes) => nodes,
> + Err(err) => return (remote.id.clone(), Err(err)),
> + };
> +
> + let mut nodes_join_set = JoinSet::new();
> +
> + for node in nodes {
> + let permit = if let Some(permit) = permit.take() {
> + permit
> + } else {
> + Arc::clone(&semaphore).acquire_owned().await.unwrap()
> + };
> +
> + let per_remote_connections_permit = Arc::clone(&per_remote_semaphore)
> + .acquire_owned()
> + .await
> + .unwrap();
> +
> + let func_clone = func.clone();
> + let remote_clone = remote.clone();
> + let node_name = node.node.clone();
> + let context_clone = context.clone();
> +
> + nodes_join_set.spawn(Self::fetch_pve_node(
> + func_clone,
> + context_clone,
> + remote_clone,
> + node_name,
> + permit,
> + per_remote_connections_permit,
> + ));
> + }
> +
> + while let Some(join_result) = nodes_join_set.join_next().await {
> + match join_result {
> + Ok((node_name, per_node_result)) => {
> + per_remote_results
> + .node_results
> + .insert(node_name, per_node_result);
> + }
> + Err(e) => {
> + log::error!("join error when waiting for future: {e}")
> + }
> + }
> + }
> + }
> + RemoteType::Pbs => {
> + let node = "localhost".to_string();
> +
> + let now = Instant::now();
> + let result = func(context, remote.clone(), node.clone()).await;
> + let api_response_time = now.elapsed();
> +
> + match result {
> + Ok(data) => {
> + per_remote_results.node_results.insert(
> + node,
> + Ok(NodeResults {
> + data,
> + api_response_time,
> + }),
> + );
> + }
> + Err(err) => {
> + per_remote_results.node_results.insert(node, Err(err));
> + }
> + }
> + }
> + }
> +
> + (remote.id, Ok(per_remote_results))
> + }
probably a `fn do_for_all_remotes` that only does one thing per remote
would also be nice? (i think stefan said he'll need something like this)
> +
> + async fn fetch_pve_node<F, Ft, T>(
> + func: F,
> + context: C,
> + remote: Remote,
> + node: String,
> + _permit: OwnedSemaphorePermit,
> + _per_remote_connections_permit: OwnedSemaphorePermit,
> + ) -> (String, Result<NodeResults<T>, Error>)
> + where
> + F: Fn(C, Remote, String) -> Ft + Clone + Send + 'static,
> + Ft: Future<Output = Result<T, Error>> + Send + 'static,
> + T: Send + Debug + 'static,
> + {
> + let now = Instant::now();
> + let result = func(context, remote.clone(), node.clone()).await;
> + let api_response_time = now.elapsed();
> +
> + match result {
> + Ok(data) => (
> + node,
> + Ok(NodeResults {
> + data,
> + api_response_time,
> + }),
> + ),
> + Err(err) => (node, Err(err)),
> + }
> + }
> +}
_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel
next prev parent reply other threads:[~2025-08-29 8:58 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-28 14:42 [pdm-devel] [RFC proxmox-datacenter-manager 0/2] add helper to fetch " Lukas Wagner
2025-08-28 14:42 ` [pdm-devel] [RFC proxmox-datacenter-manager 1/2] server: add convenience helper to fetch results " Lukas Wagner
2025-08-29 8:57 ` Dominik Csapak [this message]
2025-08-29 9:02 ` Stefan Hanreich
2025-08-29 9:17 ` Lukas Wagner
2025-08-29 14:05 ` Lukas Wagner
2025-08-28 14:42 ` [pdm-devel] [RFC proxmox-datacenter-manager 2/2] remote task fetching: use ParallelFetcher helper Lukas Wagner
2025-08-29 9:15 ` Dominik Csapak
2025-08-29 9:19 ` Lukas Wagner
2025-08-29 14:11 ` [pdm-devel] superseded: [RFC proxmox-datacenter-manager 0/2] add helper to fetch from many remote nodes in parallel Lukas Wagner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5fca5c4c-cdbb-4cbf-b3a0-475f2d7f1f44@proxmox.com \
--to=d.csapak@proxmox.com \
--cc=l.wagner@proxmox.com \
--cc=pdm-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox