From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH manager v2] api: implement node-independent bulk actions
Date: Fri, 14 Nov 2025 09:32:29 +0100
Message-ID: <176310914957.64802.13109309341568645230@yuna.proxmox.com>
In-Reply-To: <20250814112659.2584520-1-d.csapak@proxmox.com>
Quoting Dominik Csapak (2025-08-14 13:26:59)
> To achieve this, start a worker task and use our generic api client
> to start the tasks on the relevant nodes. The client always points
> to 'localhost' so we let the pveproxy worry about the proxying etc.
>
> We reuse some logic from the startall/stopall/etc. calls, like getting
> the ordered guest info list. For that to work, we must convert some of
> the private subs into proper subs. We also fix handling loading configs
> from other nodes.
>
> In each worker, for each task, we check if the target is in the desired
> state (e.g. stopped when wanting to start, etc.). If that is the case,
> start the task and put the UPID in a queue to check. This is done until
> the parallel count is at 'max_workers', at which point we wait until at
> least one task is finished before starting the next one.
>
> Failures (e.g. task errors or failure to fetch the tasks) are printed,
> and the vmid is saved and they're collectively printed at the end for
> convenience.
>
> Special handling is required for checking the permissions for suspend:
> We have to load the config of the VM and find the target state storage.
> We can then check the privileges for that storage.
>
> Further improvements can be:
> * filters (I'd prefer starting out with front end filters)
> * failure mode resolution (I'd wait until someone requests that)
> * token handling (probably not necessary since we do check the
> permissions upfront for the correct token.)
I disagree with that last point, see below.
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes from v1:
> * rebased on master (perltidy changes)
> * added missing suspend to index
> * refactored more functionality to be reused
>
> PVE/API2/Cluster.pm | 7 +
> PVE/API2/Cluster/BulkAction.pm | 45 ++
> PVE/API2/Cluster/BulkAction/Guest.pm | 753 +++++++++++++++++++++++++++
> PVE/API2/Cluster/BulkAction/Makefile | 17 +
> PVE/API2/Cluster/Makefile | 4 +-
> PVE/API2/Nodes.pm | 24 +-
> 6 files changed, 838 insertions(+), 12 deletions(-)
> create mode 100644 PVE/API2/Cluster/BulkAction.pm
> create mode 100644 PVE/API2/Cluster/BulkAction/Guest.pm
> create mode 100644 PVE/API2/Cluster/BulkAction/Makefile
>
> diff --git a/PVE/API2/Cluster.pm b/PVE/API2/Cluster.pm
> index 02a7ceff..f0dd88f3 100644
> --- a/PVE/API2/Cluster.pm
> +++ b/PVE/API2/Cluster.pm
> @@ -25,6 +25,7 @@ use PVE::API2::ACMEAccount;
> use PVE::API2::ACMEPlugin;
> use PVE::API2::Backup;
> use PVE::API2::Cluster::BackupInfo;
> +use PVE::API2::Cluster::BulkAction;
> use PVE::API2::Cluster::Ceph;
> use PVE::API2::Cluster::Mapping;
> use PVE::API2::Cluster::Jobs;
> @@ -103,6 +104,11 @@ __PACKAGE__->register_method({
> path => 'mapping',
> });
>
> +__PACKAGE__->register_method({
> + subclass => "PVE::API2::Cluster::BulkAction",
> + path => 'bulk-action',
> +});
> +
> if ($have_sdn) {
> __PACKAGE__->register_method({
> subclass => "PVE::API2::Network::SDN",
> @@ -163,6 +169,7 @@ __PACKAGE__->register_method({
> { name => 'resources' },
> { name => 'status' },
> { name => 'tasks' },
> + { name => 'bulk-action' },
nit: sorting? the entries visible here look alphabetically ordered ;)
> ];
>
> if ($have_sdn) {
> diff --git a/PVE/API2/Cluster/BulkAction.pm b/PVE/API2/Cluster/BulkAction.pm
> new file mode 100644
> index 00000000..df650514
> --- /dev/null
> +++ b/PVE/API2/Cluster/BulkAction.pm
> @@ -0,0 +1,45 @@
> +package PVE::API2::Cluster::BulkAction;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::API2::Cluster::BulkAction::Guest;
> +
> +use base qw(PVE::RESTHandler);
> +
> +__PACKAGE__->register_method({
> + subclass => "PVE::API2::Cluster::BulkAction::Guest",
> + path => 'guest',
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'index',
> + path => '',
> + method => 'GET',
> + description => "List resource types.",
> + permissions => {
> + user => 'all',
> + },
> + parameters => {
> + additionalProperties => 0,
> + properties => {},
> + },
> + returns => {
> + type => 'array',
> + items => {
> + type => "object",
> + },
> + links => [{ rel => 'child', href => "{name}" }],
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $result = [
> + { name => 'guest' },
> + ];
> +
> + return $result;
> + },
> +});
> +
> +1;
> diff --git a/PVE/API2/Cluster/BulkAction/Guest.pm b/PVE/API2/Cluster/BulkAction/Guest.pm
> new file mode 100644
> index 00000000..785c9036
> --- /dev/null
> +++ b/PVE/API2/Cluster/BulkAction/Guest.pm
> @@ -0,0 +1,753 @@
> +package PVE::API2::Cluster::BulkAction::Guest;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::APIClient::LWP;
> +use PVE::AccessControl;
> +use PVE::Cluster;
> +use PVE::Exception qw(raise raise_perm_exc raise_param_exc);
> +use PVE::INotify;
> +use PVE::JSONSchema qw(get_standard_option);
> +use PVE::QemuConfig;
> +use PVE::QemuServer;
> +use PVE::RESTEnvironment qw(log_warn);
> +use PVE::RESTHandler;
> +use PVE::RPCEnvironment;
> +use PVE::Storage;
> +use PVE::Tools qw();
> +
> +use PVE::API2::Nodes;
> +
> +use base qw(PVE::RESTHandler);
> +
> +__PACKAGE__->register_method({
> + name => 'index',
> + path => '',
> + method => 'GET',
> + description => "Bulk action index.",
> + permissions => { user => 'all' },
> + parameters => {
> + additionalProperties => 0,
> + properties => {},
> + },
> + returns => {
> + type => 'array',
> + items => {
> + type => "object",
> + properties => {},
> + },
> + links => [{ rel => 'child', href => "{name}" }],
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + return [
> + { name => 'start' },
> + { name => 'shutdown' },
> + { name => 'migrate' },
> + { name => 'suspend' },
> + ];
> + },
> +});
> +
> +sub create_client {
> + my ($authuser, $request_timeout) = @_;
> + my ($user, undef) = PVE::AccessControl::split_tokenid($authuser, 1);
> +
> + # TODO: How to handle Tokens?
not like below for sure ;) I guess we'd need to store the token in the
RPCEnvironment and make it queryable from there? maybe opt-in, so that the
storing only happens for certain API handlers (e.g., these ones here for a
start)?
as is, this basically escalates from the token to a ticket of the user, which
is a no-go even if you duplicate the current set of privilege checks here, as
that is just waiting to get out of sync
> + my $ticket = PVE::AccessControl::assemble_ticket($user || $authuser);
> + my $csrf_token = PVE::AccessControl::assemble_csrf_prevention_token($user || $authuser);
> +
> + my $node = PVE::INotify::nodename();
> + my $fingerprint = PVE::Cluster::get_node_fingerprint($node);
> +
> + my $conn_args = {
> + protocol => 'https',
> + host => 'localhost', # always call the api locally, let pveproxy handle the proxying
> + port => 8006,
> + ticket => $ticket,
> + timeout => $request_timeout // 25, # default slightly shorter than the proxy->daemon timeout
> + cached_fingerprints => {
> + $fingerprint => 1,
> + },
> + };
> +
> + my $api_client = PVE::APIClient::LWP->new($conn_args->%*);
> + if (defined($csrf_token)) {
> + $api_client->update_csrftoken($csrf_token);
> + }
> +
> + return $api_client;
this client doesn't automatically refresh the ticket before it expires, so
bigger sets of bulk actions that take more than 2h will always fail..
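a minimal sketch of one way around the expiry, assuming it is acceptable to
simply rebuild the client once it gets old. make_client_provider, $create and
$max_age are hypothetical names (not existing PVE API), $create stands in for
whatever ends up constructing the client, and this does not address the token
question above:

```perl
use strict;
use warnings;

# Hypothetical wrapper: hands out a cached client and rebuilds it through
# $create once the cached one is $max_age seconds old or older.
sub make_client_provider {
    my ($create, $max_age) = @_;

    my ($client, $created_at);
    return sub {
        my $now = time();
        if (!defined($client) || $now - $created_at >= $max_age) {
            $client = $create->();
            $created_at = $now;
        }
        return $client;
    };
}
```

the polling loop would then ask the provider for a client before each request
instead of holding on to a single instance for the whole worker lifetime, with
$max_age chosen well below the ticket lifetime.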
> +}
> +
> +# starts and awaits a task for each guest given via $startlist.
> +#
> +# takes a vm list in the form of
> +# {
> +# 0 => {
> +# 100 => { .. guest info ..},
> +# 101 => { .. guest info ..},
> +# },
> +# 1 => {
> +# 102 => { .. guest info ..},
> +# 103 => { .. guest info ..},
> +# },
> +# }
> +#
> +# max_workers: how many parallel tasks should be started.
> +# start_task: a sub that returns eiter a upid or 1 (undef means failure)
> +# check_task: if start_task returned a upid, will wait for that to finish and
> +# call check_task with the resulting task status
> +sub handle_task_foreach_guest {
> + my ($startlist, $max_workers, $start_task, $check_task) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> + my $api_client = create_client($authuser);
> +
> + my $failed = [];
> + for my $order (sort { $a <=> $b } keys $startlist->%*) {
> + my $vmlist = $startlist->{$order};
> + my $workers = {};
> +
> + for my $vmid (sort { $a <=> $b } keys $vmlist->%*) {
> +
> + # wait until at least one slot is free
> + while (scalar(keys($workers->%*)) >= $max_workers) {
> + for my $upid (keys($workers->%*)) {
> + my $worker = $workers->{$upid};
> + my $node = $worker->{guest}->{node};
> +
> + my $task = eval { $api_client->get("/nodes/$node/tasks/$upid/status") };
this could easily fail for reasons other than the task having exited? should we
maybe retry a few times to avoid accidents, before giving up?
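something along these lines is what I have in mind, as a rough sketch only.
retry_on_error is a hypothetical helper (it does not exist in PVE::Tools as far
as this patch shows), and the default number of tries and the delay are
arbitrary assumptions:

```perl
use strict;
use warnings;

# Hypothetical helper: call $code up to $max_tries times, sleeping $delay
# seconds between attempts, and rethrow the last error if every attempt fails.
sub retry_on_error {
    my ($code, $max_tries, $delay) = @_;
    $max_tries //= 3;
    $delay //= 1;

    my $err;
    for my $try (1 .. $max_tries) {
        my $res = eval { $code->() };
        return $res if !$@;
        $err = $@;
        sleep($delay) if $try < $max_tries && $delay > 0;
    }
    die $err;
}
```

the status query would then be wrapped like
retry_on_error(sub { $api_client->get("/nodes/$node/tasks/$upid/status") })
inside the existing eval, so the guest is only marked as failed once all
attempts are exhausted.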
> + if (my $err = $@) {
> + push $failed->@*, $worker->{vmid};
> +
> + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, 1, undef);
> +
> + delete $workers->{$upid};
> + } elsif ($task->{status} ne 'running') {
> + my $is_error = PVE::Tools::upid_status_is_error($task->{exitstatus});
> + push $failed->@*, $worker->{vmid} if $is_error;
> +
> + $check_task->(
> + $api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task,
> + );
> +
> + delete $workers->{$upid};
> + }
> + }
> + sleep(1); # How much?
> + }
> +
> + my $guest = $vmlist->{$vmid};
> + my $upid = eval { $start_task->($api_client, $vmid, $guest) };
> + warn $@ if $@;
A: here we use warn (see the other nits marked 'A:' below, the way messages
are printed is not consistent)
> +
> + # success but no task necessary
> + next if defined($upid) && "$upid" eq "1";
> +
> + if (!defined($upid)) {
> + push $failed->@*, $vmid;
> + next;
> + }
> +
> + $workers->{$upid} = {
> + vmid => $vmid,
> + guest => $guest,
> + };
> + }
> +
> + # wait until current order is finished
> + for my $upid (keys($workers->%*)) {
> + my $worker = $workers->{$upid};
> + my $node = $worker->{guest}->{node};
> +
> + my $task = eval { wait_for_task_finished($api_client, $node, $upid) };
> + my $err = $@;
> + my $is_error = ($err || PVE::Tools::upid_status_is_error($task->{exitstatus})) ? 1 : 0;
> + push $failed->@*, $worker->{vmid} if $is_error;
> +
> + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task);
> +
> + delete $workers->{$upid};
> + }
> + }
> +
> + return $failed;
> +}
> +
> +sub get_type_text {
> + my ($type) = @_;
> +
> + if ($type eq 'lxc') {
> + return 'CT';
> + } elsif ($type eq 'qemu') {
> + return 'VM';
> + } else {
> + die "unknown guest type $type\n";
> + }
> +}
> +
> +sub wait_for_task_finished {
> + my ($client, $node, $upid) = @_;
> +
> + while (1) {
> + my $task = $client->get("/nodes/$node/tasks/$upid/status");
same question as above here - should we try to handle transient errors here?
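for the wait loop, a sketch that only gives up after several failed queries in
a row could look like this. wait_for_task_tolerant and $get_status are
hypothetical names, $get_status stands in for the $client->get(...) call, and
the limit of 5 is an arbitrary assumption:

```perl
use strict;
use warnings;

# Sketch: poll a task until it is no longer running, tolerating up to
# $max_errors - 1 consecutive failed status queries before giving up.
sub wait_for_task_tolerant {
    my ($get_status, $max_errors, $interval) = @_;
    $max_errors //= 5;
    $interval //= 1;

    my $errors = 0;
    while (1) {
        my $task = eval { $get_status->() };
        if (my $err = $@) {
            $errors++;
            die "giving up after $errors failed status queries: $err"
                if $errors >= $max_errors;
        } else {
            $errors = 0; # only consecutive failures count
            return $task if $task->{status} ne 'running';
        }
        sleep($interval) if $interval > 0;
    }
}
```

resetting the counter on every successful query means a long-running task
survives occasional hiccups, while a permanently unreachable node still ends
the wait after a bounded time.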
> + return $task if $task->{status} ne 'running';
> + sleep(1); # How much time?
> + }
> +}
> +
> +sub check_guest_permissions {
> + my ($rpcenv, $authuser, $vmlist, $priv_list) = @_;
> +
> + if (scalar($vmlist->@*) > 0) {
> + $rpcenv->check($authuser, "/vms/$_", $priv_list) for $vmlist->@*;
> + } elsif (!$rpcenv->check($authuser, "/", $priv_list, 1)) {
> + raise_perm_exc("/, " . join(', ', $priv_list->@*));
> + }
> +}
> +
> +sub extract_vmlist {
> + my ($param) = @_;
> +
> + if (my $vmlist = $param->{vms}) {
> + my $vmlist_string = join(',', $vmlist->@*);
> + return ($vmlist, $vmlist_string);
> + }
> + return ([], undef);
> +}
> +
> +sub print_start_action {
> + my ($vmlist, $prefix, $suffix) = @_;
> +
> + $suffix = defined($suffix) ? " $suffix" : "";
> +
> + if (scalar($vmlist->@*)) {
> + print STDERR "$prefix guests$suffix: " . join(', ', $vmlist->@*) . "\n";
A: here we print to STDERR directly
> + } else {
> + print STDERR "$prefix all guests$suffix\n";
> + }
> +}
> +
> +__PACKAGE__->register_method({
> + name => 'start',
> + path => 'start',
> + method => 'POST',
> + description => "Bulk start or resume all guests on the cluster.",
> + permissions => {
> + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for "
> + . "each ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + timeout => {
> + description =>
> + "Default start timeout in seconds. Only valid for VMs. (default depends on the guest configuration).",
> + type => 'integer',
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
> +
> + my $code = sub {
> + my $startlist =
> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
> +
> + print_start_action($vmlist, "Starting");
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> + my $operation = 'start';
> + my $status =
> + eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") };
> + if (defined($status) && $status->{status} eq 'running') {
> + if (defined($status->{qmpstatus}) && $status->{qmpstatus} ne 'paused') {
> + print STDERR "Skipping $type_text $vmid, already running.\n";
> + return 1;
> + } else {
> + $operation = 'resume';
> + }
> + }
> +
> + my $params = {};
> + if (defined($param->{timeout}) && $operation eq 'start' && $type eq 'qemu') {
> + $params->{timeout} = $param->{timeout};
> + }
> +
> + my $url = "/nodes/$node/$type/$vmid/status/$operation";
> + print STDERR "Starting $type_text $vmid\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + my $node = $guest->{node};
> +
> + my $default_delay = 0;
> +
> + if (!$is_error) {
> + my $delay = defined($guest->{up}) ? int($guest->{up}) : $default_delay;
> + if ($delay > 0) {
> + print STDERR "Waiting for $delay seconds (startup delay)\n"
> + if $guest->{up};
> + for (my $i = 0; $i < $delay; $i++) {
> + sleep(1);
> + }
> + }
> + } else {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Starting $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to start: " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulkstart', undef, $authuser, $code);
> + },
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'shutdown',
> + path => 'shutdown',
> + method => 'POST',
> + description => "Bulk shutdown all guests on the cluster.",
> + permissions => {
> + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for "
> + . "each ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + timeout => {
> + description =>
> + "Default shutdown timeout in seconds if none is configured for the guest.",
> + type => 'integer',
> + default => 180,
> + optional => 1,
> + },
> + 'force-stop' => {
> + description => "Makes sure the Guest stops after the timeout.",
> + type => 'boolean',
> + default => 1,
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
> +
> + my $code = sub {
> + my $startlist =
> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
> +
> + print_start_action($vmlist, "Shutting down");
> +
> + # reverse order for shutdown
> + for my $order (keys $startlist->%*) {
> + my $list = delete $startlist->{$order};
> + $order = $order * -1;
> + $startlist->{$order} = $list;
> + }
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> +
> + my $status =
> + eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") };
> + if (defined($status) && $status->{status} ne 'running') {
> + print STDERR "Skipping $type_text $vmid, not running.\n";
> + return 1;
> + }
> +
> + if (
> + defined($status)
> + && defined($status->{qmpstatus})
> + && $status->{qmpstatus} eq 'paused'
> + && !$param->{'force-stop'}
> + ) {
> + log_warn("Skipping $type_text $vmid, resume paused VM before shutdown.\n");
A: here we use log_warn
> + return 1;
> + }
> +
> + my $timeout = int($guest->{down} // $param->{timeout} // 180);
> + my $forceStop = $param->{'force-stop'} // 1;
> +
> + my $params = {
> + forceStop => $forceStop,
> + timeout => $timeout,
> + };
> +
> + my $url = "/nodes/$node/$type/$vmid/status/shutdown";
> + print STDERR "Shutting down $type_text $vmid (Timeout = $timeout seconds)\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + my $node = $guest->{node};
> + if ($is_error) {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Stopping $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to shutdown " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulkshutdown', undef, $authuser, $code);
> + },
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'suspend',
> + path => 'suspend',
> + method => 'POST',
> + description => "Bulk suspend all guests on the cluster.",
> + permissions => {
> + description =>
> + "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for each"
> + . " ID passed via the 'vms' parameter. Additionally, you need 'VM.Config.Disk' on the"
> + . " '/vms/{vmid}' path and 'Datastore.AllocateSpace' for the configured state-storage(s)",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + statestorage => get_standard_option(
> + 'pve-storage-id',
> + {
> + description => "The storage for the VM state.",
> + requires => 'to-disk',
> + optional => 1,
> + completion => \&PVE::Storage::complete_storage_enabled,
> + },
> + ),
> + 'to-disk' => {
> + description =>
> + "If set, suspends the guests to disk. Will be resumed on next start.",
> + type => 'boolean',
> + default => 0,
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
> +
> + if ($param->{'to-disk'}) {
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Config.Disk']);
> + }
> +
> + if (my $statestorage = $param->{statestorage}) {
> + $rpcenv->check($authuser, "/storage/$statestorage", ['Datastore.AllocateSpace']);
> + } else {
> + # storage access must be done in start task
> + }
shouldn't this if be nested inside the to-disk one above?
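i.e., roughly the following structure, as a sketch. check_suspend_storage_perm
is a hypothetical name, and $check is a stand-in for
$rpcenv->check($authuser, $path, $privs) so that the snippet is self-contained:

```perl
use strict;
use warnings;

# Sketch of the nesting: storage privileges only matter when suspending to disk.
sub check_suspend_storage_perm {
    my ($param, $check) = @_;

    return if !$param->{'to-disk'};

    if (my $statestorage = $param->{statestorage}) {
        $check->("/storage/$statestorage", ['Datastore.AllocateSpace']);
    }
    # else: the state storage is derived per guest, so that check has to
    # happen in the start task

    return;
}
```

the VM.Config.Disk check from the first if would sit right next to it in the
same to-disk branch.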
> +
> + my $code = sub {
> + my $startlist =
> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
> +
> + print_start_action($vmlist, "Suspending");
> +
> + # reverse order for suspend
> + for my $order (keys $startlist->%*) {
> + my $list = delete $startlist->{$order};
> + $order = $order * -1;
> + $startlist->{$order} = $list;
> + }
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + if ($guest->{type} ne 'qemu') {
> + log_warn("skipping $vmid, only VMs can be suspended");
> + return 1;
> + }
> +
> + if (!$param->{statestorage}) {
this should again be nested inside a check for to-disk being set
> + my $conf = PVE::QemuConfig->load_config($vmid, $node);
> + my $storecfg = PVE::Storage::config();
> + my $statestorage = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
PVE::QemuServer::find_vmstate_storage does not exist, the helper is in
PVE::QemuConfig
> + $rpcenv->check(
> + $authuser,
> + "/storage/$statestorage",
> + ['Datastore.AllocateSpace'],
> + );
> + }
> +
> + my $status =
> + eval { $api_client->get("/nodes/$node/qemu/$vmid/status/current") };
> + if (defined($status) && $status->{status} ne 'running') {
> + print STDERR "Skipping VM $vmid, not running.\n";
> + return 1;
> + }
> +
> + my $params = {};
> + $params->{'todisk'} = $param->{'to-disk'} // 0;
> + $params->{statestorage} = $param->{statestorage}
> + if defined($param->{statestorage});
statestorage only makes sense if to-disk is set, so it should only be passed
along in that case here as well
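a sketch of what I mean. build_suspend_params is a hypothetical helper name,
the parameter names are the ones from the patch:

```perl
use strict;
use warnings;

# Sketch: only forward statestorage when the guest is suspended to disk.
sub build_suspend_params {
    my ($param) = @_;

    my $params = { todisk => $param->{'to-disk'} ? 1 : 0 };

    if ($params->{todisk} && defined($param->{statestorage})) {
        $params->{statestorage} = $param->{statestorage};
    }

    return $params;
}
```

that way a request with to-disk set to 0 never forwards a state storage to the
per-guest suspend call.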
> +
> + my $url = "/nodes/$node/qemu/$vmid/status/suspend";
> + print STDERR "Suspending VM $vmid\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + my $node = $guest->{node};
> + if ($is_error) {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Stopping $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to suspend " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulksuspend', undef, $authuser, $code);
> + },
> +});
> +
> +__PACKAGE__->register_method({
> + name => 'migrate',
> + path => 'migrate',
> + method => 'POST',
> + description => "Bulk migrate all guests on the cluster.",
> + permissions => {
> + description =>
> + "The 'VM.Migrate' permission is required on '/' or on '/vms/<ID>' for each "
> + . "ID passed via the 'vms' parameter.",
> + user => 'all',
> + },
> + protected => 1,
> + parameters => {
> + additionalProperties => 0,
> + properties => {
> + vms => {
> + description => "Only consider guests from this list of VMIDs.",
> + type => 'array',
> + items => get_standard_option('pve-vmid'),
> + optional => 1,
> + },
> + 'target-node' => get_standard_option('pve-node', { description => "Target node." }),
> + online => {
> + type => 'boolean',
> + description => "Enable live migration for VMs and restart migration for CTs.",
> + optional => 1,
> + },
> + "with-local-disks" => {
> + type => 'boolean',
> + description => "Enable live storage migration for local disk",
> + optional => 1,
> + },
> + 'max-workers' => {
> + description => "How many parallel tasks at maximum should be started.",
> + optional => 1,
> + default => 1,
> + type => 'integer',
> + },
> + # TODO:
> + # Failure resolution mode (fail, warn, retry?)
> + # mode-limits (offline only, suspend only, ?)
> + # filter (tags, name, ?)
> + },
> + },
> + returns => {
> + type => 'string',
> + description => "UPID of the worker",
> + },
> + code => sub {
> + my ($param) = @_;
> +
> + my $rpcenv = PVE::RPCEnvironment::get();
> + my $authuser = $rpcenv->get_user();
> +
> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
> +
> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Migrate']);
> +
> + my $code = sub {
> + my $list =
> + PVE::API2::Nodes::Nodeinfo::get_filtered_vmlist(undef, $vmlist_string, 1, 1);
> +
> + print_start_action($vmlist, "Migrating", "to $param->{'target-node'}");
> +
> + my $start_task = sub {
> + my ($api_client, $vmid, $guest) = @_;
> + my $node = $guest->{node};
> +
> + my $type = $guest->{type};
> + my $type_text = get_type_text($type);
> +
> + if ($node eq $param->{'target-node'}) {
> + print STDERR "$type_text $vmid already on $node, skipping.\n";
> + return 1;
> + }
> +
> + my $params = {
> + target => $param->{'target-node'},
> + };
> +
> + if ($type eq 'lxc') {
> + $params->{restart} = $param->{online} if defined($param->{online});
> + } elsif ($type eq 'qemu') {
> + $params->{online} = $param->{online} if defined($param->{online});
> + $params->{'with-local-disks'} = $param->{'with-local-disks'}
> + if defined($param->{'with-local-disks'});
> + }
> +
> + my $url = "/nodes/$node/$type/$vmid/migrate";
> + print STDERR "Migrating $type_text $vmid\n";
> + return $api_client->post($url, $params);
> + };
> +
> + my $check_task = sub {
> + my ($api_client, $vmid, $guest, $is_error, $task) = @_;
> + if ($is_error) {
> + my $err =
> + defined($task) ? $task->{exitstatus} : "could not query task status";
> + my $type_text = get_type_text($guest->{type});
> + print STDERR "Migrating $type_text $vmid failed: $err\n";
> + }
> + };
> +
> + my $max_workers = $param->{'max-workers'} // 1;
> + my $failed =
> + handle_task_foreach_guest({ '0' => $list }, $max_workers, $start_task, $check_task);
> +
> + if (scalar($failed->@*)) {
> + die "Some guests failed to migrate " . join(', ', $failed->@*) . "\n";
> + }
> + };
> +
> + return $rpcenv->fork_worker('bulkmigrate', undef, $authuser, $code);
> + },
> +});
> +
> +1;
> diff --git a/PVE/API2/Cluster/BulkAction/Makefile b/PVE/API2/Cluster/BulkAction/Makefile
> new file mode 100644
> index 00000000..822c1c15
> --- /dev/null
> +++ b/PVE/API2/Cluster/BulkAction/Makefile
> @@ -0,0 +1,17 @@
> +include ../../../../defines.mk
> +
> +# for node independent, cluster-wide applicable, API endpoints
> +# ensure we do not conflict with files shipped by pve-cluster!!
> +PERLSOURCE= \
> + Guest.pm
> +
> +all:
> +
> +.PHONY: clean
> +clean:
> + rm -rf *~
> +
> +.PHONY: install
> +install: ${PERLSOURCE}
> + install -d ${PERLLIBDIR}/PVE/API2/Cluster/BulkAction
> + install -m 0644 ${PERLSOURCE} ${PERLLIBDIR}/PVE/API2/Cluster/BulkAction
> diff --git a/PVE/API2/Cluster/Makefile b/PVE/API2/Cluster/Makefile
> index b109e5cb..6cffe4c9 100644
> --- a/PVE/API2/Cluster/Makefile
> +++ b/PVE/API2/Cluster/Makefile
> @@ -1,11 +1,13 @@
> include ../../../defines.mk
>
> -SUBDIRS=Mapping
> +SUBDIRS=Mapping \
> + BulkAction
>
> # for node independent, cluster-wide applicable, API endpoints
> # ensure we do not conflict with files shipped by pve-cluster!!
> PERLSOURCE= \
> BackupInfo.pm \
> + BulkAction.pm \
> MetricServer.pm \
> Mapping.pm \
> Notifications.pm \
> diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
> index ce7eecaf..0c43f5c7 100644
> --- a/PVE/API2/Nodes.pm
> +++ b/PVE/API2/Nodes.pm
> @@ -1908,7 +1908,7 @@ __PACKAGE__->register_method({
> # * vmid whitelist
> # * guest is a template (default: skip)
> # * guest is HA manged (default: skip)
> -my $get_filtered_vmlist = sub {
> +sub get_filtered_vmlist {
> my ($nodename, $vmfilter, $templates, $ha_managed) = @_;
>
> my $vmlist = PVE::Cluster::get_vmlist();
> @@ -1935,28 +1935,29 @@ my $get_filtered_vmlist = sub {
> die "unknown virtual guest type '$d->{type}'\n";
> }
>
> - my $conf = $class->load_config($vmid);
> + my $conf = $class->load_config($vmid, $d->{node});
> return if !$templates && $class->is_template($conf);
> return if !$ha_managed && PVE::HA::Config::vm_is_ha_managed($vmid);
>
> $res->{$vmid}->{conf} = $conf;
> $res->{$vmid}->{type} = $d->{type};
> $res->{$vmid}->{class} = $class;
> + $res->{$vmid}->{node} = $d->{node};
> };
> warn $@ if $@;
> }
>
> return $res;
> -};
> +}
>
> # return all VMs which should get started/stopped on power up/down
> -my $get_start_stop_list = sub {
> +sub get_start_stop_list {
> my ($nodename, $autostart, $vmfilter) = @_;
>
> # do not skip HA vms on force or if a specific VMID set is wanted
> my $include_ha_managed = defined($vmfilter) ? 1 : 0;
>
> - my $vmlist = $get_filtered_vmlist->($nodename, $vmfilter, undef, $include_ha_managed);
> + my $vmlist = get_filtered_vmlist($nodename, $vmfilter, undef, $include_ha_managed);
>
> my $resList = {};
> foreach my $vmid (keys %$vmlist) {
> @@ -1969,15 +1970,16 @@ my $get_start_stop_list = sub {
>
> $resList->{$order}->{$vmid} = $startup;
> $resList->{$order}->{$vmid}->{type} = $vmlist->{$vmid}->{type};
> + $resList->{$order}->{$vmid}->{node} = $vmlist->{$vmid}->{node};
> }
>
> return $resList;
> -};
> +}
>
> my $remove_locks_on_startup = sub {
> my ($nodename) = @_;
>
> - my $vmlist = &$get_filtered_vmlist($nodename, undef, undef, 1);
> + my $vmlist = get_filtered_vmlist($nodename, undef, undef, 1);
>
> foreach my $vmid (keys %$vmlist) {
> my $conf = $vmlist->{$vmid}->{conf};
> @@ -2069,7 +2071,7 @@ __PACKAGE__->register_method({
> warn $@ if $@;
>
> my $autostart = $force ? undef : 1;
> - my $startList = $get_start_stop_list->($nodename, $autostart, $param->{vms});
> + my $startList = get_start_stop_list($nodename, $autostart, $param->{vms});
>
> # Note: use numeric sorting with <=>
> for my $order (sort { $a <=> $b } keys %$startList) {
> @@ -2215,7 +2217,7 @@ __PACKAGE__->register_method({
>
> $rpcenv->{type} = 'priv'; # to start tasks in background
>
> - my $stopList = $get_start_stop_list->($nodename, undef, $param->{vms});
> + my $stopList = get_start_stop_list($nodename, undef, $param->{vms});
>
> my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
> my $datacenterconfig = cfs_read_file('datacenter.cfg');
> @@ -2344,7 +2346,7 @@ __PACKAGE__->register_method({
>
> $rpcenv->{type} = 'priv'; # to start tasks in background
>
> - my $toSuspendList = $get_start_stop_list->($nodename, undef, $param->{vms});
> + my $toSuspendList = get_start_stop_list($nodename, undef, $param->{vms});
>
> my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
> my $datacenterconfig = cfs_read_file('datacenter.cfg');
> @@ -2549,7 +2551,7 @@ __PACKAGE__->register_method({
> my $code = sub {
> $rpcenv->{type} = 'priv'; # to start tasks in background
>
> - my $vmlist = &$get_filtered_vmlist($nodename, $param->{vms}, 1, 1);
> + my $vmlist = get_filtered_vmlist($nodename, $param->{vms}, 1, 1);
> if (!scalar(keys %$vmlist)) {
> warn "no virtual guests matched, nothing to do..\n";
> return;
> --
> 2.39.5
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>