From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH manager v2] api: implement node-independent bulk actions
Date: Fri, 14 Nov 2025 09:32:29 +0100
Message-ID: <176310914957.64802.13109309341568645230@yuna.proxmox.com>
In-Reply-To: <20250814112659.2584520-1-d.csapak@proxmox.com>
Quoting Dominik Csapak (2025-08-14 13:26:59)
> To achieve this, start a worker task and use our generic api client
> to start the tasks on the relevant nodes. The client always points
> to 'localhost' so we let the pveproxy worry about the proxying etc.
>
> We reuse some logic from the startall/stopall/etc. calls, like getting
> the ordered guest info list. For that to work, we must convert some of
> the private subs into proper subs. We also fix handling loading configs
> from other nodes.
>
> In each worker, for each task, we check if the target is in the desired
> state (e.g. stopped when wanting to start, etc.). If that is the case,
> start the task and put the UPID in a queue to check. This is done until
> the parallel count is at 'max_workers', at which point we wait until at
> least one task is finished before starting the next one.
>
> Failures (e.g. task errors or failure to fetch the tasks) are printed,
> and the vmid is saved and they're collectively printed at the end for
> convenience.
>
> Special handling is required for checking the permissions for suspend:
> We have to load the config of the VM and find the target state storage.
> We can then check the privileges for that storage.
>
> Further improvements can be:
> * filters (I'd prefer starting out with front end filters)
> * failure mode resolution (I'd wait until someone requests that)
> * token handling (probably not necessary since we do check the
> permissions upfront for the correct token.)
I disagree with that last point, see below.
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes from v1:
> * rebased on master (perltidy changes)
> * added missing suspend to index
> * refactored more functionality to be reused
>
> PVE/API2/Cluster.pm | 7 +
> PVE/API2/Cluster/BulkAction.pm | 45 ++
> PVE/API2/Cluster/BulkAction/Guest.pm | 753 +++++++++++++++++++++++++++
> PVE/API2/Cluster/BulkAction/Makefile | 17 +
> PVE/API2/Cluster/Makefile | 4 +-
> PVE/API2/Nodes.pm | 24 +-
> 6 files changed, 838 insertions(+), 12 deletions(-)
> create mode 100644 PVE/API2/Cluster/BulkAction.pm
> create mode 100644 PVE/API2/Cluster/BulkAction/Guest.pm
> create mode 100644 PVE/API2/Cluster/BulkAction/Makefile
>
> diff --git a/PVE/API2/Cluster.pm b/PVE/API2/Cluster.pm
> index 02a7ceff..f0dd88f3 100644
> --- a/PVE/API2/Cluster.pm
> +++ b/PVE/API2/Cluster.pm
> @@ -25,6 +25,7 @@ use PVE::API2::ACMEAccount;
> use PVE::API2::ACMEPlugin;
> use PVE::API2::Backup;
> use PVE::API2::Cluster::BackupInfo;
> +use PVE::API2::Cluster::BulkAction;
> use PVE::API2::Cluster::Ceph;
> use PVE::API2::Cluster::Mapping;
> use PVE::API2::Cluster::Jobs;
> @@ -103,6 +104,11 @@ __PACKAGE__->register_method({
> path => 'mapping',
> });
>
> +__PACKAGE__->register_method({
> + subclass => "PVE::API2::Cluster::BulkAction",
> + path => 'bulk-action',
> +});
> +
> if ($have_sdn) {
> __PACKAGE__->register_method({
> subclass => "PVE::API2::Network::SDN",
> @@ -163,6 +169,7 @@ __PACKAGE__->register_method({
> { name => 'resources' },
> { name => 'status' },
> { name => 'tasks' },
> + { name => 'bulk-action' },
sorting? ;)
> ];
>
> if ($have_sdn) {
> diff --git a/PVE/API2/Cluster/BulkAction.pm b/PVE/API2/Cluster/BulkAction.pm
> new file mode 100644
> index 00000000..df650514
> --- /dev/null
> +++ b/PVE/API2/Cluster/BulkAction.pm
> @@ -0,0 +1,45 @@
> +package PVE::API2::Cluster::BulkAction;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::API2::Cluster::BulkAction::Guest;
> +
> +use base qw(PVE::RESTHandler);
> +
> +__PACKAGE__->register_method({
> + subclass => "PVE::API2::Cluster::BulkAction::Guest",
> + path => 'guest',
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'index',
> + path => '',
> + method => 'GET',
> + description => "List resource types.",
> + permissions => {
> + user => 'all',
> + },
> + parameters => {
> + additionalProperties => 0,
> + properties => {},
> + },
> + returns => {
> + type => 'array',
> + items => {
> + type => "object",
> + },
> + links => [{ rel => 'child', href => "{name}" }],
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $result = [
> + { name => 'guest' },
> + ];
> +
> + return $result;
> + },
> +});
> +
> +1;
> diff --git a/PVE/API2/Cluster/BulkAction/Guest.pm b/PVE/API2/Cluster/BulkAction/Guest.pm
> new file mode 100644
> index 00000000..785c9036
> --- /dev/null
> +++ b/PVE/API2/Cluster/BulkAction/Guest.pm
> @@ -0,0 +1,753 @@
> +package PVE::API2::Cluster::BulkAction::Guest;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::APIClient::LWP;
> +use PVE::AccessControl;
> +use PVE::Cluster;
> +use PVE::Exception qw(raise raise_perm_exc raise_param_exc);
> +use PVE::INotify;
> +use PVE::JSONSchema qw(get_standard_option);
> +use PVE::QemuConfig;
> +use PVE::QemuServer;
> +use PVE::RESTEnvironment qw(log_warn);
> +use PVE::RESTHandler;
> +use PVE::RPCEnvironment;
> +use PVE::Storage;
> +use PVE::Tools qw();
> +
> +use PVE::API2::Nodes;
> +
> +use base qw(PVE::RESTHandler);
> +
> +__PACKAGE__->register_method({
> + name => 'index',
> + path => '',
> + method => 'GET',
> + description => "Bulk action index.",
> + permissions => { user => 'all' },
> + parameters => {
> + additionalProperties => 0,
> + properties => {},
> + },
> + returns => {
> + type => 'array',
> + items => {
> + type => "object",
> + properties => {},
> + },
> + links => [{ rel => 'child', href => "{name}" }],
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + return [
> + { name => 'start' },
> + { name => 'shutdown' },
> + { name => 'migrate' },
> + { name => 'suspend' },
> + ];
> + },
> +});
> +
> +sub create_client {
> + my ($authuser, $request_timeout) = @_;
> + my ($user, undef) = PVE::AccessControl::split_tokenid($authuser, 1);
> +
> + # TODO: How to handle Tokens?
not like below for sure ;) we'd need to make the original credentials queryable via the
RPCEnvironment (and store them there), I guess? maybe opt-in, so the storing only
happens for certain API handlers (e.g., these ones here for a start)?
the code below basically escalates from the token to a full ticket of the owning user,
which is a no-go even if you duplicate the current set of privilege checks here, as that
is just waiting to get out of sync
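purely to illustrate the direction I mean (nothing of this exists yet, the names below are made up):

    # hypothetical: RPCEnvironment remembers the original credentials for opted-in handlers
    my $rpcenv = PVE::RPCEnvironment::get();
    my $creds = $rpcenv->get_request_credentials(); # <- would need to be added first
    if (defined($creds->{'api-token'})) {
        # assuming the API client can be taught to authenticate with the token directly
        $conn_args->{apitoken} = $creds->{'api-token'};
    } else {
        $conn_args->{ticket} = PVE::AccessControl::assemble_ticket($authuser);
    }

that way the calls stay limited to the token's actual privileges instead of silently
widening them to the owning user's.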
> + my $ticket = PVE::AccessControl::assemble_ticket($user || $authuser);
> + my $csrf_token = PVE::AccessControl::assemble_csrf_prevention_token($user || $authuser);
> +
> + my $node = PVE::INotify::nodename();
> + my $fingerprint = PVE::Cluster::get_node_fingerprint($node);
> +
> + my $conn_args = {
> + protocol => 'https',
> + host => 'localhost', # always call the api locally, let pveproxy handle the proxying
> + port => 8006,
> + ticket => $ticket,
> + timeout => $request_timeout // 25, # default slightly shorter than the proxy->daemon timeout
> + cached_fingerprints => {
> + $fingerprint => 1,
> + },
> + };
> +
> + my $api_client = PVE::APIClient::LWP->new($conn_args->%*);
> + if (defined($csrf_token)) {
> + $api_client->update_csrftoken($csrf_token);
> + }
> +
> + return $api_client;
this client doesn't automatically refresh the ticket before it expires, so bigger
bulk actions that run for more than 2h will always fail.
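one (untested) way around that without touching the client would be to periodically
recreate it inside handle_task_foreach_guest, e.g.:

    # sketch: re-create the API client well before the 2h ticket lifetime runs out
    my $client_age = time();
    my $refresh_client = sub {
        if (time() - $client_age > 60 * 60) { # e.g. hourly
            $api_client = create_client($authuser);
            $client_age = time();
        }
    };
    # ... and call $refresh_client->() at the start of every polling iteration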
> +}
> +
> +# starts and awaits a task for each guest given via $startlist.
> +#
> +# takes a vm list in the form of
> +# {
> +# 0 => {
> +# 100 => { .. guest info ..},
> +# 101 => { .. guest info ..},
> +# },
> +# 1 => {
> +# 102 => { .. guest info ..},
> +# 103 => { .. guest info ..},
> +# },
> +# }
> +#
> +# max_workers: how many parallel tasks should be started.
> +# start_task: a sub that returns eiter a upid or 1 (undef means failure)
> +# check_task: if start_task returned a upid, will wait for that to finish and
> +# call check_task with the resulting task status
> +sub handle_task_foreach_guest {
> + my ($startlist, $max_workers, $start_task, $check_task) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> + my $api_client = create_client($authuser);
> +
> + my $failed = [];
> + for my $order (sort { $a <=> $b } keys $startlist->%*) {
> + my $vmlist = $startlist->{$order};
> + my $workers = {};
> +
> + for my $vmid (sort { $a <=> $b } keys $vmlist->%*) {
> +
> + # wait until at least one slot is free
> + while (scalar(keys($workers->%*)) >= $max_workers) {
> + for my $upid (keys($workers->%*)) {
> + my $worker = $workers->{$upid};
> + my $node = $worker->{guest}->{node};
> +
> + my $task = eval { $api_client->get("/nodes/$node/tasks/$upid/status") };
this could easily fail for reasons other than the task having exited - should we
maybe retry a few times before giving up, to avoid marking guests as failed by accident?
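e.g. something like this (untested), which wait_for_task_finished below could reuse as well:

    # sketch: tolerate a few transient API failures before declaring the task lost
    my $get_task_status = sub {
        my ($client, $node, $upid, $retries) = @_;
        $retries //= 3;
        for my $attempt (1 .. $retries) {
            my $task = eval { $client->get("/nodes/$node/tasks/$upid/status") };
            return $task if defined($task);
            log_warn("querying status of task $upid failed (attempt $attempt/$retries): $@");
            sleep(1);
        }
        die "querying status of task $upid failed $retries times\n";
    };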
> + if (my $err = $@) {
> + push $failed->@*, $worker->{vmid};
> +
> + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, 1, undef);
> +
> + delete $workers->{$upid};
> + } elsif ($task->{status} ne 'running') {
> + my $is_error = PVE::Tools::upid_status_is_error($task->{exitstatus});
> + push $failed->@*, $worker->{vmid} if $is_error;
> +
> + $check_task->(
> + $api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task,
> + );
> +
> + delete $workers->{$upid};
> + }
> + }
> + sleep(1); # How much?
> + }
> +
> + my $guest = $vmlist->{$vmid};
> + my $upid = eval { $start_task->($api_client, $vmid, $guest) };
> + warn $@ if $@;
A: here we use warn (see further similar nits below)
> +
> + # success but no task necessary
> + next if defined($upid) && "$upid" eq "1";
> +
> + if (!defined($upid)) {
> + push $failed->@*, $vmid;
> + next;
> + }
> +
> + $workers->{$upid} = {
> + vmid => $vmid,
> + guest => $guest,
> + };
> + }
> +
> + # wait until current order is finished
> + for my $upid (keys($workers->%*)) {
> + my $worker = $workers->{$upid};
> + my $node = $worker->{guest}->{node};
> +
> + my $task = eval { wait_for_task_finished($api_client, $node, $upid) };
> + my $err = $@;
> + my $is_error = ($err || PVE::Tools::upid_status_is_error($task->{exitstatus})) ? 1 : 0;
> + push $failed->@*, $worker->{vmid} if $is_error;
> +
> + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task);
> +
> + delete $workers->{$upid};
> + }
> + }
> +
> + return $failed;
> +}
> +
> +sub get_type_text {
> + my ($type) = @_;
> +
> + if ($type eq 'lxc') {
> + return 'CT';
> + } elsif ($type eq 'qemu') {
> + return 'VM';
> + } else {
> + die "unknown guest type $type\n";
> + }
> +}
> +
> +sub wait_for_task_finished {
> + my ($client, $node, $upid) = @_;
> +
> + while (1) {
> + my $task = $client->get("/nodes/$node/tasks/$upid/status");
same question as above - should we handle transient errors here as well, e.g. via the retry helper sketched above?
> + return $task if $task->{status} ne 'running';
> + sleep(1); # How much time?
> + }
> +}
> +
> +sub check_guest_permissions {
> + my ($rpcenv, $authuser, $vmlist, $priv_list) = @_;
> +
> + if (scalar($vmlist->@*) > 0) {
> + $rpcenv->check($authuser, "/vms/$_", $priv_list) for $vmlist->@*;
> + } elsif (!$rpcenv->check($authuser, "/", $priv_list, 1)) {
> + raise_perm_exc("/, " . join(', ', $priv_list->@*));
> + }
> +}
> +
> +sub extract_vmlist {
> + my ($param) = @_;
> +
> + if (my $vmlist = $param->{vms}) {
> + my $vmlist_string = join(',', $vmlist->@*);
> + return ($vmlist, $vmlist_string);
> + }
> + return ([], undef);
> +}
> +
> +sub print_start_action {
> + my ($vmlist, $prefix, $suffix) = @_;
> +
> + $suffix = defined($suffix) ? " $suffix" : "";
> +
> + if (scalar($vmlist->@*)) {
> + print STDERR "$prefix guests$suffix: " . join(', ', $vmlist->@*) . "\n";
A: here we use STDERR
> + } else {
> + print STDERR "$prefix all guests$suffix\n";
> + }
> +}
> +
> +__PACKAGE__->register_method({
> + name => 'start',
> + path => 'start',
> + method => 'POST',
> + description => "Bulk start or resume all guests on the cluster.",
> + permissions => {
> + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for "
> + . "each ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + timeout => {
> + description =>
> + "Default start timeout in seconds. Only valid for VMs. (default depends on the guest configuration).",
> + type => 'integer',
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
> +
> + my $code = sub {
> + my $startlist =
> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
> +
> + print_start_action($vmlist, "Starting");
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> + my $operation = 'start';
> + my $status =
> + eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") };
> + if (defined($status) && $status->{status} eq 'running') {
> + if (defined($status->{qmpstatus}) && $status->{qmpstatus} ne 'paused') {
> + print STDERR "Skipping $type_text $vmid, already running.\n";
> + return 1;
> + } else {
> + $operation = 'resume';
> + }
> + }
> +
> + my $params = {};
> + if (defined($param->{timeout}) && $operation eq 'start' && $type eq 'qemu') {
> + $params->{timeout} = $param->{timeout};
> + }
> +
> + my $url = "/nodes/$node/$type/$vmid/status/$operation";
> + print STDERR "Starting $type_text $vmid\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + my $node = $guest->{node};
> +
> + my $default_delay = 0;
> +
> + if (!$is_error) {
> + my $delay = defined($guest->{up}) ? int($guest->{up}) : $default_delay;
> + if ($delay > 0) {
> + print STDERR "Waiting for $delay seconds (startup delay)\n"
> + if $guest->{up};
> + for (my $i = 0; $i < $delay; $i++) {
> + sleep(1);
> + }
> + }
> + } else {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Starting $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to start: " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulkstart', undef, $authuser, $code);
> + },
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'shutdown',
> + path => 'shutdown',
> + method => 'POST',
> + description => "Bulk shutdown all guests on the cluster.",
> + permissions => {
> + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for "
> + . "each ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + timeout => {
> + description =>
> + "Default shutdown timeout in seconds if none is configured for the guest.",
> + type => 'integer',
> + default => 180,
> + optional => 1,
> + },
> + 'force-stop' => {
> + description => "Makes sure the Guest stops after the timeout.",
> + type => 'boolean',
> + default => 1,
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
> +
> + my $code = sub {
> + my $startlist =
> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
> +
> + print_start_action($vmlist, "Shutting down");
> +
> + # reverse order for shutdown
> + for my $order (keys $startlist->%*) {
> + my $list = delete $startlist->{$order};
> + $order = $order * -1;
> + $startlist->{$order} = $list;
> + }
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> +
> + my $status =
> + eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") };
> + if (defined($status) && $status->{status} ne 'running') {
> + print STDERR "Skipping $type_text $vmid, not running.\n";
> + return 1;
> + }
> +
> + if (
> + defined($status)
> + && defined($status->{qmpstatus})
> + && $status->{qmpstatus} eq 'paused'
> + && !$param->{'force-stop'}
> + ) {
> + log_warn("Skipping $type_text $vmid, resume paused VM before shutdown.\n");
A: and here we use log_warn - it would be good to settle on one of warn / print STDERR / log_warn for these user-facing messages
> + return 1;
> + }
> +
> + my $timeout = int($guest->{down} // $param->{timeout} // 180);
> + my $forceStop = $param->{'force-stop'} // 1;
> +
> + my $params = {
> + forceStop => $forceStop,
> + timeout => $timeout,
> + };
> +
> + my $url = "/nodes/$node/$type/$vmid/status/shutdown";
> + print STDERR "Shutting down $type_text $vmid (Timeout = $timeout seconds)\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + my $node = $guest->{node};
> + if ($is_error) {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Stopping $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to shutdown " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulkshutdown', undef, $authuser, $code);
> + },
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'suspend',
> + path => 'suspend',
> + method => 'POST',
> + description => "Bulk suspend all guests on the cluster.",
> + permissions => {
> + description =>
> + "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for each"
> + . " ID passed via the 'vms' parameter. Additionally, you need 'VM.Config.Disk' on the"
> + . " '/vms/{vmid}' path and 'Datastore.AllocateSpace' for the configured state-storage(s)",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + statestorage => get_standard_option(
> + 'pve-storage-id',
> + {
> + description => "The storage for the VM state.",
> + requires => 'to-disk',
> + optional => 1,
> + completion => \&PVE::Storage::complete_storage_enabled,
> + },
> + ),
> + 'to-disk' => {
> + description =>
> + "If set, suspends the guests to disk. Will be resumed on next start.",
> + type => 'boolean',
> + default => 0,
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
> +
> + if ($param->{'to-disk'}) {
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Config.Disk']);
> + }
> +
> + if (my $statestorage = $param->{statestorage}) {
> + $rpcenv->check($authuser, "/storage/$statestorage", ['Datastore.AllocateSpace']);
> + } else {
> + # storage access must be done in start task
> + }
this if/else should be nested inside the 'to-disk' check above, since statestorage requires to-disk anyway?
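i.e. something like (untested):

    if ($param->{'to-disk'}) {
        check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Config.Disk']);

        if (my $statestorage = $param->{statestorage}) {
            $rpcenv->check($authuser, "/storage/$statestorage", ['Datastore.AllocateSpace']);
        }
        # else: the state storage is only known per guest, checked in the start task
    }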
> +
> + my $code = sub {
> + my $startlist =
> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
> +
> + print_start_action($vmlist, "Suspending");
> +
> + # reverse order for suspend
> + for my $order (keys $startlist->%*) {
> + my $list = delete $startlist->{$order};
> + $order = $order * -1;
> + $startlist->{$order} = $list;
> + }
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + if ($guest->{type} ne 'qemu') {
> + log_warn("skipping $vmid, only VMs can be suspended");
> + return 1;
> + }
> +
> + if (!$param->{statestorage}) {
this should again be nested inside a check for to-disk being set
> + my $conf = PVE::QemuConfig->load_config($vmid, $node);
> + my $storecfg = PVE::Storage::config();
> + my $statestorage = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
PVE::QemuServer::find_vmstate_storage does not exist, the helper lives in PVE::QemuConfig
> + $rpcenv->check(
> + $authuser,
> + "/storage/$statestorage",
> + ['Datastore.AllocateSpace'],
> + );
> + }
> +
> + my $status =
> + eval { $api_client->get("/nodes/$node/qemu/$vmid/status/current") };
> + if (defined($status) && $status->{status} ne 'running') {
> + print STDERR "Skipping VM $vmid, not running.\n";
> + return 1;
> + }
> +
> + my $params = {};
> + $params->{'todisk'} = $param->{'to-disk'} // 0;
> + $params->{statestorage} = $param->{statestorage}
> + if defined($param->{statestorage});
statestorage only makes sense if to-disk is set, so it should only be passed in that
case here as well (see the sketch below).
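taking the last three points together, the relevant part could look roughly like this
(untested, and assuming the helper keeps its ($conf, $storecfg) signature over in PVE::QemuConfig):

    my $params = { todisk => $param->{'to-disk'} // 0 };

    if ($param->{'to-disk'}) {
        if (my $statestorage = $param->{statestorage}) {
            $params->{statestorage} = $statestorage;
        } else {
            # state storage not given explicitly, derive it from the guest config
            my $conf = PVE::QemuConfig->load_config($vmid, $node);
            my $storecfg = PVE::Storage::config();
            my $statestorage = PVE::QemuConfig::find_vmstate_storage($conf, $storecfg);
            $rpcenv->check($authuser, "/storage/$statestorage", ['Datastore.AllocateSpace']);
        }
    }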
> +
> + my $url = "/nodes/$node/qemu/$vmid/status/suspend";
> + print STDERR "Suspending VM $vmid\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + my $node = $guest->{node};
> + if ($is_error) {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Stopping $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to suspend " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulksuspend', undef, $authuser, $code);
> + },
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'migrate',
> + path => 'migrate',
> + method => 'POST',
> + description => "Bulk migrate all guests on the cluster.",
> + permissions => {
> + description =>
> + "The 'VM.Migrate' permission is required on '/' or on '/vms/<ID>' for each "
> + . "ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + 'target-node' => get_standard_option('pve-node', { description => "Target node." }),
> + online => {
> + type => 'boolean',
> + description => "Enable live migration for VMs and restart migration for CTs.",
> + optional => 1,
> + },
> + "with-local-disks" => {
> + type => 'boolean',
> + description => "Enable live storage migration for local disk",
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Migrate']);
> +
> + my $code = sub {
> + my $list =
> + PVE::API2::Nodes::Nodeinfo::get_filtered_vmlist(undef, $vmlist_string, 1, 1);
> +
> + print_start_action($vmlist, "Migrating", "to $param->{'target-node'}");
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> +
> + if ($node eq $param->{'target-node'}) {
> + print STDERR "$type_text $vmid already on $node, skipping.\n";
> + return 1;
> + }
> +
> + my $params = {
> + target => $param->{'target-node'},
> + };
> +
> + if ($type eq 'lxc') {
> + $params->{restart} = $param->{online} if defined($param->{online});
> + } elsif ($type eq 'qemu') {
> + $params->{online} = $param->{online} if defined($param->{online});
> + $params->{'with-local-disks'} = $param->{'with-local-disks'}
> + if defined($param->{'with-local-disks'});
> + }
> +
> + my $url = "/nodes/$node/$type/$vmid/migrate";
> + print STDERR "Migrating $type_text $vmid\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + if ($is_error) {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Migrating $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest({ '0' => $list }, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to migrate " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulkmigrate', undef, $authuser, $code);
> + },
> +});
> +
> +1;
> diff --git a/PVE/API2/Cluster/BulkAction/Makefile b/PVE/API2/Cluster/BulkAction/Makefile
> new file mode 100644
> index 00000000..822c1c15
> --- /dev/null
> +++ b/PVE/API2/Cluster/BulkAction/Makefile
> @@ -0,0 +1,17 @@
> +include ../../../../defines.mk
> +
> +# for node independent, cluster-wide applicable, API endpoints
> +# ensure we do not conflict with files shipped by pve-cluster!!
> +PERLSOURCE= \
> + Guest.pm
> +
> +all:
> +
> +.PHONY: clean
> +clean:
> + rm -rf *~
> +
> +.PHONY: install
> +install: ${PERLSOURCE}
> + install -d ${PERLLIBDIR}/PVE/API2/Cluster/BulkAction
> + install -m 0644 ${PERLSOURCE} ${PERLLIBDIR}/PVE/API2/Cluster/BulkAction
> diff --git a/PVE/API2/Cluster/Makefile b/PVE/API2/Cluster/Makefile
> index b109e5cb..6cffe4c9 100644
> --- a/PVE/API2/Cluster/Makefile
> +++ b/PVE/API2/Cluster/Makefile
> @@ -1,11 +1,13 @@
> include ../../../defines.mk
>
> -SUBDIRS=Mapping
> +SUBDIRS=Mapping \
> + BulkAction
>
> # for node independent, cluster-wide applicable, API endpoints
> # ensure we do not conflict with files shipped by pve-cluster!!
> PERLSOURCE= \
> BackupInfo.pm \
> + BulkAction.pm \
> MetricServer.pm \
> Mapping.pm \
> Notifications.pm \
> diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
> index ce7eecaf..0c43f5c7 100644
> --- a/PVE/API2/Nodes.pm
> +++ b/PVE/API2/Nodes.pm
> @@ -1908,7 +1908,7 @@ __PACKAGE__->register_method({
> # * vmid whitelist
> # * guest is a template (default: skip)
> # * guest is HA manged (default: skip)
> -my $get_filtered_vmlist = sub {
> +sub get_filtered_vmlist {
> my ($nodename, $vmfilter, $templates, $ha_managed) = @_;
>
> my $vmlist = PVE::Cluster::get_vmlist();
> @@ -1935,28 +1935,29 @@ my $get_filtered_vmlist = sub {
> die "unknown virtual guest type '$d->{type}'\n";
> }
>
> - my $conf = $class->load_config($vmid);
> + my $conf = $class->load_config($vmid, $d->{node});
> return if !$templates && $class->is_template($conf);
> return if !$ha_managed && PVE::HA::Config::vm_is_ha_managed($vmid);
>
> $res->{$vmid}->{conf} = $conf;
> $res->{$vmid}->{type} = $d->{type};
> $res->{$vmid}->{class} = $class;
> + $res->{$vmid}->{node} = $d->{node};
> };
> warn $@ if $@;
> }
>
> return $res;
> -};
> +}
>
> # return all VMs which should get started/stopped on power up/down
> -my $get_start_stop_list = sub {
> +sub get_start_stop_list {
> my ($nodename, $autostart, $vmfilter) = @_;
>
> # do not skip HA vms on force or if a specific VMID set is wanted
> my $include_ha_managed = defined($vmfilter) ? 1 : 0;
>
> - my $vmlist = $get_filtered_vmlist->($nodename, $vmfilter, undef, $include_ha_managed);
> + my $vmlist = get_filtered_vmlist($nodename, $vmfilter, undef, $include_ha_managed);
>
> my $resList = {};
> foreach my $vmid (keys %$vmlist) {
> @@ -1969,15 +1970,16 @@ my $get_start_stop_list = sub {
>
> $resList->{$order}->{$vmid} = $startup;
> $resList->{$order}->{$vmid}->{type} = $vmlist->{$vmid}->{type};
> + $resList->{$order}->{$vmid}->{node} = $vmlist->{$vmid}->{node};
> }
>
> return $resList;
> -};
> +}
>
> my $remove_locks_on_startup = sub {
> my ($nodename) = @_;
>
> - my $vmlist = &$get_filtered_vmlist($nodename, undef, undef, 1);
> + my $vmlist = get_filtered_vmlist($nodename, undef, undef, 1);
>
> foreach my $vmid (keys %$vmlist) {
> my $conf = $vmlist->{$vmid}->{conf};
> @@ -2069,7 +2071,7 @@ __PACKAGE__->register_method({
> warn $@ if $@;
>
> my $autostart = $force ? undef : 1;
> - my $startList = $get_start_stop_list->($nodename, $autostart, $param->{vms});
> + my $startList = get_start_stop_list($nodename, $autostart, $param->{vms});
>
> # Note: use numeric sorting with <=>
> for my $order (sort { $a <=> $b } keys %$startList) {
> @@ -2215,7 +2217,7 @@ __PACKAGE__->register_method({
>
> $rpcenv->{type} = 'priv'; # to start tasks in background
>
> - my $stopList = $get_start_stop_list->($nodename, undef, $param->{vms});
> + my $stopList = get_start_stop_list($nodename, undef, $param->{vms});
>
> my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
> my $datacenterconfig = cfs_read_file('datacenter.cfg');
> @@ -2344,7 +2346,7 @@ __PACKAGE__->register_method({
>
> $rpcenv->{type} = 'priv'; # to start tasks in background
>
> - my $toSuspendList = $get_start_stop_list->($nodename, undef, $param->{vms});
> + my $toSuspendList = get_start_stop_list($nodename, undef, $param->{vms});
>
> my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
> my $datacenterconfig = cfs_read_file('datacenter.cfg');
> @@ -2549,7 +2551,7 @@ __PACKAGE__->register_method({
> my $code = sub {
> $rpcenv->{type} = 'priv'; # to start tasks in background
>
> - my $vmlist = &$get_filtered_vmlist($nodename, $param->{vms}, 1, 1);
> + my $vmlist = get_filtered_vmlist($nodename, $param->{vms}, 1, 1);
> if (!scalar(keys %$vmlist)) {
> warn "no virtual guests matched, nothing to do..\n";
> return;
> --
> 2.39.5
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel