public inbox for pdm-devel@lists.proxmox.com
* [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start
@ 2025-01-29 10:51 Dominik Csapak
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 1/3] server: pve api: add new bulkstart api call Dominik Csapak
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Dominik Csapak @ 2025-01-29 10:51 UTC (permalink / raw)
  To: pdm-devel

Sending this as an RFC, because it's still very rough and I want to get some
early feedback.

This series implements an API call 'bulk-start' that runs on the PDM
itself and mimics PVE's bulk start, but without PVE's single-node
limitation.

Does that make sense? Or would it be better to try to implement this
on the PVE side? The advantage we have here is an external view of the
cluster, which means that things like node failures, synchronisation,
etc. are much easier to handle.

If we'd implement something like this on PVE, there has to be a node
that has control of the API calls to make (or that schedules something via
pmxcfs), and that is probably much harder to do there (pmxcfs sync queue)
or brings some problems with it (a node dying in the middle of an API call).

It's very early, so please don't judge the actual API call code just
now; I'd extend it with failure resolution, polling the tasks, etc.

OTOH there is the question whether the UI makes sense this way, or if we want
to combine the 'select to view details' and 'select to do a bulk action'
interactions into one. Or if we want to do the bulk actions more like in PVE,
with a popup that shows the VM list again.

Dominik Csapak (3):
  server: pve api: add new bulkstart api call
  pdm-client: add bulk_start method
  ui: pve tree: add bulk start action

 lib/pdm-client/src/lib.rs |   9 ++-
 server/src/api/pve/mod.rs |  98 +++++++++++++++++++++++++++-
 ui/src/pve/tree.rs        | 133 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 234 insertions(+), 6 deletions(-)

-- 
2.39.5



_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel



* [pdm-devel] [RFC PATCH datacenter-manager 1/3] server: pve api: add new bulkstart api call
  2025-01-29 10:51 [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Dominik Csapak
@ 2025-01-29 10:51 ` Dominik Csapak
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 2/3] pdm-client: add bulk_start method Dominik Csapak
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2025-01-29 10:51 UTC (permalink / raw)
  To: pdm-devel

similar to the 'bulkstart' of pve itself, but implemented here, since we
can do it across the cluster, and are not bound to one specific node.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 server/src/api/pve/mod.rs | 98 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/server/src/api/pve/mod.rs b/server/src/api/pve/mod.rs
index 2cefbb4..7dc7a34 100644
--- a/server/src/api/pve/mod.rs
+++ b/server/src/api/pve/mod.rs
@@ -1,10 +1,12 @@
 //! Manage PVE instances.
 
+use std::collections::HashMap;
 use std::sync::Arc;
 
 use anyhow::{bail, format_err, Error};
 
 use proxmox_access_control::CachedUserInfo;
+use proxmox_rest_server::WorkerTask;
 use proxmox_router::{
     http_bail, http_err, list_subdirs_api_method, Permission, Router, RpcEnvironment, SubdirMap,
 };
@@ -17,7 +19,7 @@ use pdm_api_types::remotes::{NodeUrl, Remote, RemoteType, REMOTE_ID_SCHEMA};
 use pdm_api_types::resource::PveResource;
 use pdm_api_types::{
     Authid, RemoteUpid, HOST_OPTIONAL_PORT_FORMAT, PRIV_RESOURCE_AUDIT, PRIV_RESOURCE_DELETE,
-    PRIV_SYS_MODIFY,
+    PRIV_RESOURCE_MANAGE, PRIV_SYS_MODIFY, UPID, VMID_SCHEMA,
 };
 
 use pve_api_types::client::PveClient;
@@ -57,6 +59,7 @@ const MAIN_ROUTER: Router = Router::new()
 
 #[sortable]
 const REMOTE_SUBDIRS: SubdirMap = &sorted!([
+    ("bulk-start", &Router::new().post(&API_METHOD_BULK_START)),
     ("lxc", &lxc::ROUTER),
     ("nodes", &NODES_ROUTER),
     ("qemu", &qemu::ROUTER),
@@ -427,3 +430,96 @@ pub async fn list_realm_remote_pve(
 
     Ok(list)
 }
+
+#[api(
+    input: {
+        properties: {
+            remote: { schema: REMOTE_ID_SCHEMA },
+            "vmid-list": {
+                type: Array,
+                description: "A list of vmids to start",
+                items: {
+                    schema: VMID_SCHEMA,
+                },
+            },
+        },
+    },
+    returns: { type: UPID },
+    access: {
+        permission: &Permission::Privilege(&["resource", "{remote}"], PRIV_RESOURCE_MANAGE, false),
+    },
+)]
+/// Bulk start a list of remote VMs and containers.
+pub async fn bulk_start(
+    remote: String,
+    vmid_list: Vec<u32>,
+    rpcenv: &mut dyn RpcEnvironment,
+) -> Result<UPID, Error> {
+    let (remotes, _) = pdm_config::remotes::config()?;
+
+    let pve = connect_to_remote(&remotes, &remote)?;
+
+    let auth_id = rpcenv.get_auth_id().unwrap();
+
+    let upid = WorkerTask::spawn("qmbulkstart", None, auth_id, false, |_| async move {
+        let resources = pve.cluster_resources(Some(ClusterResourceKind::Vm)).await?;
+
+        let mut map = HashMap::new();
+
+        for res in resources {
+            match res.ty {
+                ClusterResourceType::Qemu => {
+                    map.insert(res.vmid.unwrap(), (res.node, res.ty));
+                }
+                ClusterResourceType::Lxc => {
+                    map.insert(res.vmid.unwrap(), (res.node, res.ty));
+                }
+                _ => {}
+            }
+        }
+
+        for vmid in vmid_list {
+            // TODO:
+            // * get boot order/delay?
+            // * wait for start task to finish?
+            // * how to handle errors?
+            // * check privileges for each vmid
+
+            let (node, vmtype) = if let Some((node, vmtype)) = map.get(&vmid) {
+                match node {
+                    Some(node) => (node, vmtype),
+                    None => bail!("vm without node"),
+                }
+            } else {
+                bail!("no such vmid");
+            };
+
+            log::info!("Start VM {vmid} on {node}");
+            let res = match vmtype {
+                ClusterResourceType::Qemu => {
+                    pve.start_qemu_async(node, vmid, Default::default()).await
+                }
+                ClusterResourceType::Lxc => {
+                    pve.start_lxc_async(node, vmid, Default::default()).await
+                }
+                _ => bail!("invalid vm type"),
+            };
+
+            match res {
+                Ok(upid) => {
+                    log::info!("Started Task: {upid}");
+
+                    // track the remote upids
+                    let _ = new_remote_upid(remote.clone(), upid);
+                }
+                Err(err) => {
+                    log::error!("Starting failed: {err}");
+                }
+            }
+        }
+
+        Ok(())
+    })?;
+
+    upid.parse()
+}
-- 
2.39.5




* [pdm-devel] [RFC PATCH datacenter-manager 2/3] pdm-client: add bulk_start method
  2025-01-29 10:51 [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Dominik Csapak
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 1/3] server: pve api: add new bulkstart api call Dominik Csapak
@ 2025-01-29 10:51 ` Dominik Csapak
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 3/3] ui: pve tree: add bulk start action Dominik Csapak
  2025-01-29 18:48 ` [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Thomas Lamprecht
  3 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2025-01-29 10:51 UTC (permalink / raw)
  To: pdm-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 lib/pdm-client/src/lib.rs | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/pdm-client/src/lib.rs b/lib/pdm-client/src/lib.rs
index a41b82c..962c76e 100644
--- a/lib/pdm-client/src/lib.rs
+++ b/lib/pdm-client/src/lib.rs
@@ -7,7 +7,7 @@ use pdm_api_types::resource::{PveResource, RemoteResources, TopEntities};
 use pdm_api_types::rrddata::{
     LxcDataPoint, NodeDataPoint, PbsDatastoreDataPoint, PbsNodeDataPoint, QemuDataPoint,
 };
-use pdm_api_types::BasicRealmInfo;
+use pdm_api_types::{BasicRealmInfo, UPID};
 use pve_api_types::StartQemuMigrationType;
 use serde::{Deserialize, Serialize};
 use serde_json::{json, Value};
@@ -447,6 +447,13 @@ impl<T: HttpApiClient> PdmClient<T> {
             .await
     }
 
+    pub async fn pve_bulk_start(&self, remote: &str, vmid_list: Vec<u32>) -> Result<UPID, Error> {
+        let path = format!("/api2/extjs/pve/remotes/{remote}/bulk-start");
+        let mut request = json!({});
+        request["vmid-list"] = vmid_list.into();
+        Ok(self.0.post(&path, &request).await?.expect_json()?.data)
+    }
+
     pub async fn pve_qemu_shutdown(
         &self,
         remote: &str,
-- 
2.39.5




* [pdm-devel] [RFC PATCH datacenter-manager 3/3] ui: pve tree: add bulk start action
  2025-01-29 10:51 [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Dominik Csapak
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 1/3] server: pve api: add new bulkstart api call Dominik Csapak
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 2/3] pdm-client: add bulk_start method Dominik Csapak
@ 2025-01-29 10:51 ` Dominik Csapak
  2025-01-29 18:48 ` [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Thomas Lamprecht
  3 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2025-01-29 10:51 UTC (permalink / raw)
  To: pdm-devel

This adds a new checkbox column that is independent of the usual
selection. If one or more elements are selected, the 'Bulk Actions' menu
gets enabled, and one can select the 'Start' action, which will start a
task that starts the selected guests.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 ui/src/pve/tree.rs | 133 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 129 insertions(+), 4 deletions(-)

diff --git a/ui/src/pve/tree.rs b/ui/src/pve/tree.rs
index 95fb0ec..5184bdc 100644
--- a/ui/src/pve/tree.rs
+++ b/ui/src/pve/tree.rs
@@ -9,20 +9,26 @@ use yew::{
 
 use proxmox_yew_comp::{
     LoadableComponent, LoadableComponentContext, LoadableComponentLink, LoadableComponentMaster,
+    TaskViewer,
 };
 use pwt::css::{AlignItems, ColorScheme, FlexFit, JustifyContent};
+use pwt::prelude::*;
 use pwt::props::{ContainerBuilder, CssBorderBuilder, ExtractPrimaryKey, WidgetBuilder};
 use pwt::state::{KeyedSlabTree, NavigationContext, NavigationContextExt, Selection, TreeStore};
 use pwt::widget::{
-    data_table::{DataTable, DataTableColumn, DataTableHeader},
+    data_table::{
+        DataTable, DataTableCellRenderArgs, DataTableColumn, DataTableHeader,
+        DataTableKeyboardEvent, DataTableMouseEvent,
+    },
     form::Field,
-    ActionIcon, Column, Container, Fa, MessageBox, MessageBoxButtons, Row, Toolbar, Trigger,
+    menu::{Menu, MenuButton, MenuItem},
+    ActionIcon, Button, Column, Container, Fa, MessageBox, MessageBoxButtons, Row, Toolbar,
+    Trigger,
 };
-use pwt::{prelude::*, widget::Button};
 
 use pdm_api_types::{
     resource::{PveLxcResource, PveNodeResource, PveQemuResource, PveResource},
-    RemoteUpid,
+    RemoteUpid, UPID,
 };
 
 use crate::{get_deep_url, widget::MigrateWindow};
@@ -119,6 +125,7 @@ impl std::fmt::Display for Action {
 pub enum ViewState {
     Confirm(Action, String),  // ID
     MigrateWindow(GuestInfo), // ID
+    ShowPdmTask(UPID),
 }
 
 pub enum Msg {
@@ -126,6 +133,8 @@ pub enum Msg {
     GuestAction(Action, String), //ID
     KeySelected(Option<Key>),
     RouteChanged(String),
+    BulkSelected(Key),
+    BulkStart,
 }
 
 pub struct PveTreeComp {
@@ -135,6 +144,7 @@ pub struct PveTreeComp {
     filter: String,
     _nav_handle: ContextHandle<NavigationContext>,
     view_selection: Selection,
+    selection: Selection,
 }
 
 impl PveTreeComp {
@@ -269,6 +279,7 @@ impl LoadableComponent for PveTreeComp {
 
         let path = _nav_ctx.path();
         ctx.link().send_message(Msg::RouteChanged(path));
+        let selection = Selection::new().multiselect(true);
 
         Self {
             columns: columns(
@@ -276,12 +287,14 @@ impl LoadableComponent for PveTreeComp {
                 store.clone(),
                 ctx.props().remote.clone(),
                 ctx.props().loading,
+                selection.clone(),
             ),
             loaded: false,
             store,
             filter: String::new(),
             _nav_handle,
             view_selection,
+            selection,
         }
     }
 
@@ -390,6 +403,59 @@ impl LoadableComponent for PveTreeComp {
                     });
                 }
             }
+            Msg::BulkSelected(key) => {
+                let props = ctx.props();
+                self.selection.toggle(key.clone());
+                let selected = self.selection.contains(&key);
+
+                let store = self.store.read();
+                let item = store.lookup_node(&key);
+                if item.is_none() {
+                    return false;
+                }
+
+                let item = item.unwrap();
+
+                if let PveTreeNode::Node(_) = item.record() {
+                    for child in item.children() {
+                        let key = child.key();
+                        if selected != self.selection.contains(&key) {
+                            self.selection.toggle(key);
+                        }
+                    }
+                }
+
+                self.columns = columns(
+                    ctx.link(),
+                    self.store.clone(),
+                    props.remote.clone(),
+                    props.loading,
+                    self.selection.clone(),
+                );
+                return true;
+            }
+            Msg::BulkStart => {
+                let mut vmids = Vec::new();
+                for (_, item) in self.store.filtered_data() {
+                    let key = item.key();
+                    if self.selection.contains(&key) {
+                        match *item.record() {
+                            PveTreeNode::Lxc(PveLxcResource { vmid, .. })
+                            | PveTreeNode::Qemu(PveQemuResource { vmid, .. }) => vmids.push(vmid),
+                            _ => {}
+                        }
+                    }
+                }
+
+                let link = ctx.link().clone();
+                let remote = ctx.props().remote.clone();
+                ctx.link().spawn(async move {
+                    match crate::pdm_client().pve_bulk_start(&remote, vmids).await {
+                        Ok(upid) => link.change_view(Some(ViewState::ShowPdmTask(upid))),
+                        Err(err) => link.show_error(tr!("Start failed"), err.to_string(), true),
+                    }
+                });
+            }
         }
         true
     }
@@ -409,6 +475,7 @@ impl LoadableComponent for PveTreeComp {
             self.store.clone(),
             props.remote.clone(),
             props.loading,
+            self.selection.clone(),
         );
 
         true
@@ -447,6 +514,19 @@ impl LoadableComponent for PveTreeComp {
                             .on_input(link.callback(Msg::Filter)),
                     )
                     .with_flex_spacer()
+                    .with_child(
+                        MenuButton::new(tr!("Bulk Actions"))
+                            .disabled(self.selection.is_empty())
+                            .icon_class("fa fa-list")
+                            .show_arrow(true)
+                            .menu(
+                                Menu::new().with_item(
+                                    MenuItem::new(tr!("Start"))
+                                        .icon_class("fa fa-play")
+                                        .on_select(ctx.link().callback(|_| Msg::BulkStart)),
+                                ),
+                            ),
+                    )
                     .with_child(Button::refresh(ctx.props().loading).onclick({
                         let on_reload_click = ctx.props().on_reload_click.clone();
                         move |_| {
@@ -495,6 +575,11 @@ impl LoadableComponent for PveTreeComp {
                     })
                     .into(),
             ),
+            ViewState::ShowPdmTask(upid) => Some(
+                TaskViewer::new(upid.to_string())
+                    .on_close(ctx.link().change_view_callback(|_| None))
+                    .into(),
+            ),
         }
     }
 
@@ -526,8 +611,48 @@ fn columns(
     store: TreeStore<PveTreeNode>,
     remote: String,
     loading: bool,
+    selection: Selection,
 ) -> Rc<Vec<DataTableHeader<PveTreeNode>>> {
     Rc::new(vec![
+        DataTableColumn::new("selection indicator")
+            .width("max-content")
+            //   .width("2.5em")
+            .resizable(false)
+            .show_menu(false)
+            .render_header(|_args: &mut _| Fa::new("check").into())
+            .render_cell({
+                move |args: &mut DataTableCellRenderArgs<PveTreeNode>| {
+                    let selected = selection.contains(args.key());
+                    Fa::new(if selected {
+                        "check-square-o"
+                    } else {
+                        "square-o"
+                    })
+                    .class("pwt-pointer")
+                    .into()
+                }
+            })
+            .on_cell_click({
+                let link = link.clone();
+                move |event: &mut DataTableMouseEvent| {
+                    let record_key = event.record_key.clone();
+                    //selection.toggle(record_key.clone());
+                    event.prevent_default();
+                    event.stop_propagation();
+                    link.send_message(Msg::BulkSelected(record_key));
+                }
+            })
+            .on_cell_keydown({
+                let link = link.clone();
+                move |event: &mut DataTableKeyboardEvent| {
+                    if event.key() == " " {
+                        event.stop_propagation();
+                        event.prevent_default();
+                        link.send_message(Msg::BulkSelected(event.record_key.clone()));
+                    }
+                }
+            })
+            .into(),
         DataTableColumn::new("Type/ID")
             .flex(1)
             .tree_column(store)
-- 
2.39.5




* Re: [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start
  2025-01-29 10:51 [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Dominik Csapak
                   ` (2 preceding siblings ...)
  2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 3/3] ui: pve tree: add bulk start action Dominik Csapak
@ 2025-01-29 18:48 ` Thomas Lamprecht
  2025-01-30  8:14   ` Dominik Csapak
  3 siblings, 1 reply; 7+ messages in thread
From: Thomas Lamprecht @ 2025-01-29 18:48 UTC (permalink / raw)
  To: Proxmox Datacenter Manager development discussion, Dominik Csapak

Am 29.01.25 um 11:51 schrieb Dominik Csapak:
> Sending this as an RFC, because it's still very rough and I want to get some
> early feedback.
> 
> This series implements an API call 'bulk-start' that runs on the PDM
> itself and mimics PVE's bulk start, but without PVE's single-node
> limitation.
> 
> Does that make sense? Or would it be better to try to implement this
> on the PVE side? The advantage we have here is an external view of the
> cluster, which means that things like node failures, synchronisation,
> etc. are much easier to handle.

I think we talked offlist about this a while ago, albeit rather casually,
and yes, IMO exposing this on the PVE side would be better – it can be done
more efficiently there, allows better control of the overall active job count
and avoids some oddities. TBH I'd be surprised if it's easier to do from the
outside with the same feature set.

Having an external service handle this over a potentially flaky connection
seems much more error-prone to me compared to going over the LAN that clusters
require.

IMO we actually should avoid having much of this stuff or dedicated state
(that affects the remotes or their resources) in the PDM directly. The
more things are handled by the end products the 1) simpler PDM stays
(PVE needs some complexity anyway, coupling two complex projects will IMO
amplify maintenance cost more) 2) ensures PVE provides already a powerful
feature set on its own – i.e. PVE already has a good architecture and is
not as limited as VMware ESXi, which requires vSphere for relatively
simple (from user POV, not implementation) things even if they are only
affecting nodes in the same LAN, so we should continue to mainly "empower"
PVE and plug that into PDM 3) PDM will become relatively complex even
with trying to avoid state and such features implemented only there,
all the metrics, tasks, health and SDN tracking is already quite a bit
to handle, if done actually well, flexible and powerful.

> If we'd implement something like this on PVE, there has to be a node
> that has control of the API calls to make (or that schedules something via
> pmxcfs), and that is probably much harder to do there (pmxcfs sync queue)
> or brings some problems with it (a node dying in the middle of an API call).

In the simplest architecture it could be like the SDN reload is
implemented; I'm quite sure that I mentioned that, but would not bet that
much on my (or most) brain(s) that is. 

I.e. a single task on one node that connects to all involved cluster nodes
through the API and creates the respective bulk-tasks for the guests residing
on each node and then polls these. Some generic infrastructure for doing such
things might be nice and would have some reuse between different bulk tasks
and SDN, potentially others in the future.
Switching to an even more efficient channel or method could be done
transparently (from POV of the external user/program of the cluster-wide
bulk-action API), so I'd not worry too much about that now.
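
The fan-out-and-poll flow described above could be sketched like this (a
minimal sketch with hypothetical types; `TaskState`, `group_by_node` and
`collect_failures` are illustration-only assumptions, not actual PVE or PDM
APIs):

```rust
use std::collections::HashMap;

// Hypothetical task state; a real implementation would instead poll a
// PVE task UPID through each involved node's API.
#[derive(Clone, Debug, PartialEq)]
pub enum TaskState {
    Running,
    Finished,
    Failed(String),
}

/// Group guests by the node they reside on, so the coordinating task can
/// create one per-node bulk task per involved cluster node (reusing the
/// per-node bulk-start that PVE already has).
pub fn group_by_node(guests: &[(u32, String)]) -> HashMap<String, Vec<u32>> {
    let mut by_node: HashMap<String, Vec<u32>> = HashMap::new();
    for (vmid, node) in guests {
        by_node.entry(node.clone()).or_default().push(*vmid);
    }
    by_node
}

/// Collect failures from the final per-node task states once polling has
/// finished, instead of aborting the whole bulk action on the first error.
pub fn collect_failures(tasks: &HashMap<String, TaskState>) -> Vec<String> {
    tasks
        .iter()
        .filter_map(|(node, state)| match state {
            TaskState::Failed(err) => Some(format!("{node}: {err}")),
            _ => None,
        })
        .collect()
}
```

The actual polling loop and API fan-out are omitted; the point is only that
the coordinating task works on per-node groups and aggregates results.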

Besides that, there are (most of the time) fewer points of failure between
nodes than between PDM and the nodes, network-wise. If node(s) indeed die in
the middle of an API call, the PDM naturally cannot magically fix that
either, and as node failure is not expected behavior but rather an
extraordinary event, an interrupted bulk-action is not really a big problem
there.

In short: let's do this in PVE directly.

> It's very early, so please don't judge the actual API call code just
> now; I'd extend it with failure resolution, polling the tasks, etc.
> 
> OTOH there is the question whether the UI makes sense this way, or if we want
> to combine the 'select to view details' and 'select to do a bulk action'
> interactions into one. Or if we want to do the bulk actions more like in PVE,
> with a popup that shows the VM list again.
> 
> Dominik Csapak (3):
>   server: pve api: add new bulkstart api call
>   pdm-client: add bulk_start method
>   ui: pve tree: add bulk start action
> 
>  lib/pdm-client/src/lib.rs |   9 ++-
>  server/src/api/pve/mod.rs |  98 +++++++++++++++++++++++++++-
>  ui/src/pve/tree.rs        | 133 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 234 insertions(+), 6 deletions(-)
> 




* Re: [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start
  2025-01-29 18:48 ` [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Thomas Lamprecht
@ 2025-01-30  8:14   ` Dominik Csapak
  2025-01-30 16:15     ` Thomas Lamprecht
  0 siblings, 1 reply; 7+ messages in thread
From: Dominik Csapak @ 2025-01-30  8:14 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox Datacenter Manager development discussion

On 1/29/25 19:48, Thomas Lamprecht wrote:
> Am 29.01.25 um 11:51 schrieb Dominik Csapak:
>> Sending this as an RFC, because it's still very rough and I want to get some
>> early feedback.
>>
>> This series implements an API call 'bulk-start' that runs on the PDM
>> itself and mimics PVE's bulk start, but without PVE's single-node
>> limitation.
>>
>> Does that make sense? Or would it be better to try to implement this
>> on the PVE side? The advantage we have here is an external view of the
>> cluster, which means that things like node failures, synchronisation,
>> etc. are much easier to handle.
> 
> I think we talked offlist about this a while ago, albeit rather casually,
> and yes, IMO exposing this on the PVE side would be better – it can be done
> more efficiently there, allows better control of the overall active job count
> and avoids some oddities. TBH I'd be surprised if it's easier to do from the
> outside with the same feature set.
> 
> Having an external service handle this over a potentially flaky connection
> seems much more error-prone to me compared to going over the LAN that clusters
> require.
> 
> IMO we actually should avoid having much of this stuff or dedicated state
> (that affects the remotes or their resources) in the PDM directly. The
> more things are handled by the end products the 1) simpler PDM stays
> (PVE needs some complexity anyway, coupling two complex projects will IMO
> amplify maintenance cost more) 2) ensures PVE provides already a powerful
> feature set on its own – i.e. PVE already has a good architecture and is
> not as limited like vmware esxi, which requires vsphere for relatively
> simple (from user POV, not implementation) things even if they are only
> affecting nodes in the same LAN, so we should continue to mainly "empower"
> PVE and plug that into PDM 3) PDM will become relatively complex even
> with trying to avoid state and such features implemented only there,
> all the metrics, tasks, health and SDN tracking is already quite a bit
> to handle, if done actually well, flexible and powerful.
> 
>> If we'd implement something like this on PVE, there has to be a node
>> that has control of the API calls to make (or that schedules something via
>> pmxcfs), and that is probably much harder to do there (pmxcfs sync queue)
>> or brings some problems with it (a node dying in the middle of an API call).
> 
> In the simplest architecture it could be like the SDN reload is
> implemented; I'm quite sure that I mentioned that, but would not bet that
> much on my (or most) brain(s) that is.
> 
> I.e. a single task on one node that connects to all involved cluster nodes
> through the API and creates the respective bulk-tasks for the guests residing
> on each node and then polls these. Some generic infrastructure for doing such
> things might be nice and would have some reuse between different bulk tasks
> and SDN, potentially others in the future.
> Switching to an even more efficient channel or method could be done
> transparently (from POV of the external user/program of the cluster-wide
> bulk-action API), so I'd not worry too much about that now.
> 
> Besides that, there are (most of the time) fewer points of failure between
> nodes than between PDM and the nodes, network-wise. If node(s) indeed die in
> the middle of an API call, the PDM naturally cannot magically fix that
> either, and as node failure is not expected behavior but rather an
> extraordinary event, an interrupted bulk-action is not really a big problem
> there.
> 
> In short: let's do this in PVE directly.

Sounds good to me, with one caveat: when implementing this, I would go for
a new API call on the PVE side, to properly separate it. I think otherwise
the existing call would get much more complex (but I have to try it first).
On the PDM side I'd implement it with a fallback to calling the "old"
bulkstart API call on each node in case the new API call does not exist?

That way older nodes/clusters can still profit from the functionality without
much state/logic handling on the PDM side.

We can still remove that fallback from PDM again once the new call is
sufficiently old, and PDM has had no first release yet anyway.

How does that sound?
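
The detection-plus-fallback logic proposed here could be sketched as follows
(hypothetical trait and error type; `cluster_bulk_start` and
`node_bulk_start` are assumed names for illustration, not existing
pdm-client methods):

```rust
// Hypothetical client abstraction; the method names are assumptions for
// illustration and do not exist in pdm-client today.
#[derive(Debug, PartialEq)]
pub enum BulkError {
    NotImplemented,
    Other(String),
}

pub trait BulkClient {
    /// New cluster-wide endpoint; returns Err(NotImplemented) on
    /// clusters that predate it.
    fn cluster_bulk_start(&self, vmids: &[u32]) -> Result<String, BulkError>;
    /// Existing per-node bulkstart, one call per involved node.
    fn node_bulk_start(&self, node: &str, vmids: &[u32]) -> Result<String, BulkError>;
}

/// Prefer the new cluster-wide call; fall back to one bulkstart call per
/// node only when the remote does not know the new endpoint yet.
pub fn bulk_start_with_fallback<C: BulkClient>(
    client: &C,
    by_node: &[(&str, Vec<u32>)],
) -> Result<Vec<String>, BulkError> {
    let all: Vec<u32> = by_node.iter().flat_map(|(_, v)| v.clone()).collect();
    match client.cluster_bulk_start(&all) {
        // one task UPID for the whole cluster
        Ok(upid) => Ok(vec![upid]),
        // old cluster: fan out per node, one task UPID each
        Err(BulkError::NotImplemented) => {
            let mut upids = Vec::new();
            for (node, vmids) in by_node {
                upids.push(client.node_bulk_start(node, vmids)?);
            }
            Ok(upids)
        }
        Err(other) => Err(other),
    }
}
```

This keeps the version probing in one place, so ripping the fallback out
later only means removing the `NotImplemented` arm.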

> 
>> It's very early, so please don't judge the actual api call code just
>> now, I'd extend it with failure resulotion, polling the task, etc.
>>
>> OTOH there is the question if the UI makes sense this way, or if we want
>> to combine the 'select to view details' and 'select to to a bulk action'
>> into one. Or if we want to do the bulk actions more like in pve with
>> a popup that shows the vm list again.
>>
>> Dominik Csapak (3):
>>    server: pve api: add new bulkstart api call
>>    pdm-client: add bulk_start method
>>    ui: pve tree: add bulk start action
>>
>>   lib/pdm-client/src/lib.rs |   9 ++-
>>   server/src/api/pve/mod.rs |  98 +++++++++++++++++++++++++++-
>>   ui/src/pve/tree.rs        | 133 ++++++++++++++++++++++++++++++++++++--
>>   3 files changed, 234 insertions(+), 6 deletions(-)
>>
> 




* Re: [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start
  2025-01-30  8:14   ` Dominik Csapak
@ 2025-01-30 16:15     ` Thomas Lamprecht
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Lamprecht @ 2025-01-30 16:15 UTC (permalink / raw)
  To: Dominik Csapak, Proxmox Datacenter Manager development discussion

Am 30.01.25 um 09:14 schrieb Dominik Csapak:
> Sounds good to me, with one caveat: when implementing this, I would go for
> a new API call on the PVE side, to properly separate it. I think otherwise
> the existing call would get much more complex (but I have to try it first).
> On the PDM side I'd implement it with a fallback to calling the "old"
> bulkstart API call on each node in case the new API call does not exist?

Yes, definitely a new API endpoint. FWIW, I started something at the
/cluster/ level, but that was pretty bare bones, so not sure if it's worth
digging up. And I was not yet set on having a single endpoint or one per
action type; the latter might be slightly nicer for consistency with the
node level, maybe `/cluster/bulk-action/guest-{start,stop,migrate}`.

In the long run I'd also prefer to extend the selection/filter capabilities
of these endpoints, but that's really not relevant for an initial MVP.

> That way older nodes/clusters can still profit from the functionality without
> much state/logic handling on the PDM side.
> 
> We can still remove that fallback from PDM again once the new call is
> sufficiently old, and PDM has had no first release yet anyway.

I'd not bother much with that: the minimum supported PVE version for PDM is
already pretty much the latest, and the FAQ of the PDM alpha announcement
states that PDM and PVE are expected to be developed in lock-step during
stabilization. So if we manage to add the cluster-wide bulk action no later
than PVE 8.4 in Q2, then I think it's fine for PDM to have a hard dependency
on it. And if there are lots of users with somewhat (!) reasonable
complaints, we can still add the fallback then. While I agree with your
sentiment, I think we're a bit special here in PDM, where ripping the
fallback out again, and thus raising the minimum supported version, might be
a bit harder to sell compared to adding a fallback later.




end of thread, other threads:[~2025-01-30 16:16 UTC | newest]

Thread overview: 7+ messages
2025-01-29 10:51 [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Dominik Csapak
2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 1/3] server: pve api: add new bulkstart api call Dominik Csapak
2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 2/3] pdm-client: add bulk_start method Dominik Csapak
2025-01-29 10:51 ` [pdm-devel] [RFC PATCH datacenter-manager 3/3] ui: pve tree: add bulk start action Dominik Csapak
2025-01-29 18:48 ` [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start Thomas Lamprecht
2025-01-30  8:14   ` Dominik Csapak
2025-01-30 16:15     ` Thomas Lamprecht
