From: Dominik Csapak <d.csapak@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
Proxmox Datacenter Manager development discussion
<pdm-devel@lists.proxmox.com>
Subject: Re: [pdm-devel] [RFC PATCH datacenter-manager 0/3] implement bulk start
Date: Thu, 30 Jan 2025 09:14:17 +0100
Message-ID: <15a36f0d-fef5-4ceb-8089-7cc84381e640@proxmox.com>
In-Reply-To: <bc6fb18d-84c5-435b-89ec-f3d61b755687@proxmox.com>
On 1/29/25 19:48, Thomas Lamprecht wrote:
> On 29.01.25 at 11:51, Dominik Csapak wrote:
>> Sending as RFC, because it's still very rough and I want to get some
>> early feedback.
>>
>> This series implements an api call 'bulk-start' which is running on
>> the pdm itself, that mimics the bulkstart from pve, but without the
>> node limitation of pve.
>>
>> Does that make sense? Or would it be better to try to implement that
>> on pve side? The advantage we have here is that we have an
>> external view of the cluster, which means that things like node
>> failures, synchronisation, etc. are much easier to handle.
>
> I think we talked offlist about this a while ago, albeit rather casually,
> and yes IMO exposing this on the PVE side would be better – it can be done
> more efficiently there, better control for overall active job count and
> avoids some oddities. TBH I'd be surprised if it's easier to do from
> external with the same feature set.
>
> Having an external service handle this over a potentially flaky connection
> seems much more error-prone to me compared to going over the LAN that
> clusters require.
>
> IMO we actually should avoid having much of this stuff or dedicated state
> (that affects the remotes or their resources) in the PDM directly. The
> more things are handled by the end products the 1) simpler PDM stays
> (PVE needs some complexity anyway, coupling two complex projects will IMO
> amplify maintenance cost more) 2) ensures PVE provides already a powerful
> feature set on its own – i.e. PVE already has a good architecture and is
> not as limited as VMware ESXi, which requires vSphere for relatively
> simple (from user POV, not implementation) things even if they are only
> affecting nodes in the same LAN, so we should continue to mainly "empower"
> PVE and plug that into PDM 3) PDM will become relatively complex even
> with trying to avoid state and such features implemented only there,
> all the metrics, tasks, health and SDN tracking is already quite a bit
> to handle, if done actually well, flexible and powerful.
>
>> If we'd implement something like this on PVE, there has to be a node
>> that has control of the api calls to make (or to schedule something via
>> pmxcfs) and that is probably much harder to do there (pmxcfs sync queue)
>> or brings some problems with it (node dies in the middle of an api call)
>
> In the simplest architecture it could be implemented like the SDN reload;
> I'm quite sure that I mentioned that, but I would not bet too much on my
> (or most) brain(s).
>
> I.e. a single task on one node that connects to all involved cluster nodes
> through the API and creates the respective bulk-tasks for the guests residing
> on each node and then polls these. Some generic infrastructure for doing such
> things might be nice and would have some reuse between different bulk tasks
> and SDN, potentially others in the future.
> Switching to an even more efficient channel or method could be done
> transparently (from POV of the external user/program of the cluster-wide
> bulk-action API), so I'd not worry too much about that now.
>
> Besides that, there are (most of the time) fewer points of failure between
> nodes than between PDM and nodes, network-wise. If node(s) indeed die in
> the middle of an API call, the PDM naturally cannot magically fix that, and
> as node failure is not expected behavior but rather an extraordinary event,
> it also means that an interrupted bulk-action is not really a big problem
> there.
>
> In short: let's do this in PVE directly.
Sounds good to me, with one caveat: when implementing this, I would go for
a new API call on the PVE side to keep this properly separated.
I think the existing call would otherwise get much more complex (but I have to
try it first). On the PDM side I'd implement it with a fallback that calls
the "old" bulkstart API call on each node in case the new API call does
not exist.
That way older nodes/clusters can still profit from the functionality without
much state/logic handling on the PDM side.
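A minimal sketch of that fallback, under stated assumptions: the `PveClient`
trait, its two method names, the `ApiError` type and the `MockClient` are all
hypothetical and only stand in for whatever the real client ends up looking
like. The idea shown is just "try the new cluster-wide call first; if the
remote does not know it, group the guests by node and use the existing
per-node call".

```rust
use std::collections::BTreeMap;

/// Hypothetical error type; a real client would have its own.
#[derive(Debug, PartialEq)]
pub enum ApiError {
    /// The remote does not know the endpoint (e.g. an older PVE version).
    NotImplemented,
    Other(String),
}

/// A guest to start, identified by the node it resides on and its VMID.
#[derive(Debug, Clone, PartialEq)]
pub struct Guest {
    pub node: String,
    pub vmid: u32,
}

/// Hypothetical abstraction over the PVE API; both method names are assumptions.
pub trait PveClient {
    /// Assumed new cluster-wide call; returns a task ID.
    fn cluster_bulk_start(&mut self, vmids: &[u32]) -> Result<String, ApiError>;
    /// Existing per-node bulk start; returns a task ID.
    fn node_bulk_start(&mut self, node: &str, vmids: &[u32]) -> Result<String, ApiError>;
}

/// Try the cluster-wide call; fall back to one call per node if it is missing.
/// Returns the IDs of all tasks that were started.
pub fn bulk_start<C: PveClient>(client: &mut C, guests: &[Guest]) -> Result<Vec<String>, ApiError> {
    let vmids: Vec<u32> = guests.iter().map(|g| g.vmid).collect();
    match client.cluster_bulk_start(&vmids) {
        Ok(task) => return Ok(vec![task]),
        // Only a missing endpoint triggers the fallback.
        Err(ApiError::NotImplemented) => {}
        Err(err) => return Err(err),
    }

    // Fallback: group the guests by node (BTreeMap keeps the node order stable).
    let mut by_node: BTreeMap<&str, Vec<u32>> = BTreeMap::new();
    for guest in guests {
        by_node.entry(guest.node.as_str()).or_default().push(guest.vmid);
    }

    // One per-node call each; this sketch simply stops at the first error.
    let mut tasks = Vec::new();
    for (node, ids) in by_node {
        tasks.push(client.node_bulk_start(node, &ids)?);
    }
    Ok(tasks)
}

/// Stand-in client for trying out the logic without a real remote.
pub struct MockClient {
    pub has_cluster_call: bool,
    pub calls: Vec<String>,
}

impl PveClient for MockClient {
    fn cluster_bulk_start(&mut self, vmids: &[u32]) -> Result<String, ApiError> {
        if !self.has_cluster_call {
            return Err(ApiError::NotImplemented);
        }
        self.calls.push(format!("cluster:{:?}", vmids));
        Ok("task-cluster".to_string())
    }

    fn node_bulk_start(&mut self, node: &str, vmids: &[u32]) -> Result<String, ApiError> {
        self.calls.push(format!("{}:{:?}", node, vmids));
        Ok(format!("task-{}", node))
    }
}
```

With a `MockClient` whose `has_cluster_call` is `false`, `bulk_start` issues
one per-node call per involved node; with `true`, it issues a single
cluster-wide call. Polling the returned tasks and handling partial failures
are left out here.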
We can still remove that fallback from PDM again once that code is sufficiently
old; PDM has no first release yet anyway.
How does that sound?
>
>> It's very early, so please don't judge the actual api call code just
>> now, I'd extend it with failure resolution, polling the task, etc.
>>
>> OTOH there is the question if the UI makes sense this way, or if we want
>> to combine the 'select to view details' and 'select to do a bulk action'
>> into one. Or if we want to do the bulk actions more like in pve with
>> a popup that shows the vm list again.
>>
>> Dominik Csapak (3):
>> server: pve api: add new bulkstart api call
>> pdm-client: add bulk_start method
>> ui: pve tree: add bulk start action
>>
>> lib/pdm-client/src/lib.rs | 9 ++-
>> server/src/api/pve/mod.rs | 98 +++++++++++++++++++++++++++-
>> ui/src/pve/tree.rs | 133 ++++++++++++++++++++++++++++++++++++--
>> 3 files changed, 234 insertions(+), 6 deletions(-)
>>
>