From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <20250814112659.2584520-1-d.csapak@proxmox.com>
References: <20250814112659.2584520-1-d.csapak@proxmox.com>
From: Fabian Grünbichler
To: Dominik Csapak, pve-devel@lists.proxmox.com
Date: Fri, 14 Nov 2025 09:32:29 +0100
Message-ID: <176310914957.64802.13109309341568645230@yuna.proxmox.com>
Subject: Re: [pve-devel] [PATCH manager v2] api: implement node-independent bulk actions
List-Id: Proxmox VE development discussion

Quoting Dominik Csapak (2025-08-14 13:26:59)
> To achieve this, start a worker task and use our generic api client
> to start the tasks on the relevant nodes. The client always points
> to 'localhost' so we let the pveproxy worry about the proxying etc.
>
> We reuse some logic from the startall/stopall/etc. calls, like getting
> the ordered guest info list. For that to work, we must convert some of
> the private subs into proper subs. We also fix handling loading configs
> from other nodes.
>
> In each worker, for each task, we check if the target is in the desired
> state (e.g. stopped when wanting to start, etc.). If that is the case,
> start the task and put the UPID in a queue to check. This is done until
> the parallel count is at 'max_workers', at which point we wait until at
> least one task is finished before starting the next one.
>
> Failures (e.g. task errors or failure to fetch the tasks) are printed,
> and the vmid is saved and they're collectively printed at the end for
> convenience.
>
> Special handling is required for checking the permissions for suspend:
> We have to load the config of the VM and find the target state storage.
> We can then check the privileges for that storage.
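For reference, the bounded queue the commit message describes boils down to the following pattern (a minimal, simplified sketch, not the patch's implementation; $start_job and $poll_done are hypothetical stand-ins for the task-start and task-status API calls):

```perl
use strict;
use warnings;

# Start jobs until $max_workers are in flight, then poll until a slot
# frees up; jobs whose start failed are collected and returned.
sub run_bounded {
    my ($jobs, $max_workers, $start_job, $poll_done) = @_;

    my %running; # upid => job
    my @failed;

    for my $job (@$jobs) {
        # wait until at least one worker slot is free
        while (scalar(keys %running) >= $max_workers) {
            for my $upid (keys %running) {
                delete $running{$upid} if $poll_done->($upid);
            }
        }

        my $upid = eval { $start_job->($job) };
        if (!defined($upid)) {
            push @failed, $job; # start failed, remember the job
            next;
        }
        $running{$upid} = $job;
    }

    # drain the remaining workers
    while (scalar(keys %running)) {
        for my $upid (keys %running) {
            delete $running{$upid} if $poll_done->($upid);
        }
    }

    return \@failed;
}
```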
> > Further improvements can be: > * filters (I'd prefer starting out with front end filters) > * failure mode resolution (I'd wait until someone requests that) > * token handling (probably not necessary since we do check the > permissions upfront for the correct token.) I disagree with that last point, see below. > > Signed-off-by: Dominik Csapak > --- > changes from v1: > * rebased on master (perltidy changes) > * added missing suspend to index > * refactored more functionality to be reused > > PVE/API2/Cluster.pm | 7 + > PVE/API2/Cluster/BulkAction.pm | 45 ++ > PVE/API2/Cluster/BulkAction/Guest.pm | 753 +++++++++++++++++++++++++++ > PVE/API2/Cluster/BulkAction/Makefile | 17 + > PVE/API2/Cluster/Makefile | 4 +- > PVE/API2/Nodes.pm | 24 +- > 6 files changed, 838 insertions(+), 12 deletions(-) > create mode 100644 PVE/API2/Cluster/BulkAction.pm > create mode 100644 PVE/API2/Cluster/BulkAction/Guest.pm > create mode 100644 PVE/API2/Cluster/BulkAction/Makefile > > diff --git a/PVE/API2/Cluster.pm b/PVE/API2/Cluster.pm > index 02a7ceff..f0dd88f3 100644 > --- a/PVE/API2/Cluster.pm > +++ b/PVE/API2/Cluster.pm > @@ -25,6 +25,7 @@ use PVE::API2::ACMEAccount; > use PVE::API2::ACMEPlugin; > use PVE::API2::Backup; > use PVE::API2::Cluster::BackupInfo; > +use PVE::API2::Cluster::BulkAction; > use PVE::API2::Cluster::Ceph; > use PVE::API2::Cluster::Mapping; > use PVE::API2::Cluster::Jobs; > @@ -103,6 +104,11 @@ __PACKAGE__->register_method({ > path => 'mapping', > }); > > +__PACKAGE__->register_method({ > + subclass => "PVE::API2::Cluster::BulkAction", > + path => 'bulk-action', > +}); > + > if ($have_sdn) { > __PACKAGE__->register_method({ > subclass => "PVE::API2::Network::SDN", > @@ -163,6 +169,7 @@ __PACKAGE__->register_method({ > { name => 'resources' }, > { name => 'status' }, > { name => 'tasks' }, > + { name => 'bulk-action' }, sorting? 
;) > ]; > > if ($have_sdn) { > diff --git a/PVE/API2/Cluster/BulkAction.pm b/PVE/API2/Cluster/BulkAction.pm > new file mode 100644 > index 00000000..df650514 > --- /dev/null > +++ b/PVE/API2/Cluster/BulkAction.pm > @@ -0,0 +1,45 @@ > +package PVE::API2::Cluster::BulkAction; > + > +use strict; > +use warnings; > + > +use PVE::API2::Cluster::BulkAction::Guest; > + > +use base qw(PVE::RESTHandler); > + > +__PACKAGE__->register_method({ > + subclass => "PVE::API2::Cluster::BulkAction::Guest", > + path => 'guest', > +}); > + > +__PACKAGE__->register_method({ > + name => 'index', > + path => '', > + method => 'GET', > + description => "List resource types.", > + permissions => { > + user => 'all', > + }, > + parameters => { > + additionalProperties => 0, > + properties => {}, > + }, > + returns => { > + type => 'array', > + items => { > + type => "object", > + }, > + links => [{ rel => 'child', href => "{name}" }], > + }, > + code => sub { > + my ($param) = @_; > + > + my $result = [ > + { name => 'guest' }, > + ]; > + > + return $result; > + }, > +}); > + > +1; > diff --git a/PVE/API2/Cluster/BulkAction/Guest.pm b/PVE/API2/Cluster/BulkAction/Guest.pm > new file mode 100644 > index 00000000..785c9036 > --- /dev/null > +++ b/PVE/API2/Cluster/BulkAction/Guest.pm > @@ -0,0 +1,753 @@ > +package PVE::API2::Cluster::BulkAction::Guest; > + > +use strict; > +use warnings; > + > +use PVE::APIClient::LWP; > +use PVE::AccessControl; > +use PVE::Cluster; > +use PVE::Exception qw(raise raise_perm_exc raise_param_exc); > +use PVE::INotify; > +use PVE::JSONSchema qw(get_standard_option); > +use PVE::QemuConfig; > +use PVE::QemuServer; > +use PVE::RESTEnvironment qw(log_warn); > +use PVE::RESTHandler; > +use PVE::RPCEnvironment; > +use PVE::Storage; > +use PVE::Tools qw(); > + > +use PVE::API2::Nodes; > + > +use base qw(PVE::RESTHandler); > + > +__PACKAGE__->register_method({ > + name => 'index', > + path => '', > + method => 'GET', > + description => "Bulk action index.", > + 
permissions => { user => 'all' }, > + parameters => { > + additionalProperties => 0, > + properties => {}, > + }, > + returns => { > + type => 'array', > + items => { > + type => "object", > + properties => {}, > + }, > + links => [{ rel => 'child', href => "{name}" }], > + }, > + code => sub { > + my ($param) = @_; > + > + return [ > + { name => 'start' }, > + { name => 'shutdown' }, > + { name => 'migrate' }, > + { name => 'suspend' }, > + ]; > + }, > +}); > + > +sub create_client { > + my ($authuser, $request_timeout) = @_; > + my ($user, undef) = PVE::AccessControl::split_tokenid($authuser, 1); > + > + # TODO: How to handle Tokens? not like below for sure ;) we'd need to make it queriable using the RPCEnvironment (and store it there) I guess? maybe opt-in so the storing only happens for certain API handlers (e.g., these ones here for a start)? this basically escalates from the token to a ticket of the user, which is a nogo even if you duplicate the current set of privilege checks here, as that is just waiting to get out of sync > + my $ticket = PVE::AccessControl::assemble_ticket($user || $authuser); > + my $csrf_token = PVE::AccessControl::assemble_csrf_prevention_token($user || $authuser); > + > + my $node = PVE::INotify::nodename(); > + my $fingerprint = PVE::Cluster::get_node_fingerprint($node); > + > + my $conn_args = { > + protocol => 'https', > + host => 'localhost', # always call the api locally, let pveproxy handle the proxying > + port => 8006, > + ticket => $ticket, > + timeout => $request_timeout // 25, # default slightly shorter than the proxy->daemon timeout > + cached_fingerprints => { > + $fingerprint => 1, > + }, > + }; > + > + my $api_client = PVE::APIClient::LWP->new($conn_args->%*); > + if (defined($csrf_token)) { > + $api_client->update_csrftoken($csrf_token); > + } > + > + return $api_client; this client doesn't automatically refresh the ticket before it expires, so bigger sets of bulk actions that take more than 2h will always fail.. 
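One way the ticket lifetime problem could be tackled (a minimal sketch, not existing code: $make_ticket and $apply_ticket are hypothetical stand-ins for PVE::AccessControl::assemble_ticket and for updating the ticket on the API client):

```perl
use strict;
use warnings;

# Remember when the ticket was issued and re-assemble it once it gets
# close to the ~2h lifetime, so long-running bulk workers keep working.
sub make_ticket_refresher {
    my ($make_ticket, $apply_ticket, $max_age) = @_;
    $max_age //= 60 * 60; # refresh well before the two hour expiry

    my $issued_at = 0; # epoch of the last refresh, 0 forces one upfront
    return sub {
        my $now = time();
        if ($now - $issued_at >= $max_age) {
            $apply_ticket->($make_ticket->());
            $issued_at = $now;
        }
    };
}
```

The returned sub could then be called at the top of every poll iteration in the bulk worker, so a refresh happens at most once per $max_age.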
> +} > + > +# starts and awaits a task for each guest given via $startlist. > +# > +# takes a vm list in the form of > +# { > +# 0 => { > +# 100 => { .. guest info ..}, > +# 101 => { .. guest info ..}, > +# }, > +# 1 => { > +# 102 => { .. guest info ..}, > +# 103 => { .. guest info ..}, > +# }, > +# } > +# > +# max_workers: how many parallel tasks should be started. > +# start_task: a sub that returns eiter a upid or 1 (undef means failure) > +# check_task: if start_task returned a upid, will wait for that to finish and > +# call check_task with the resulting task status > +sub handle_task_foreach_guest { > + my ($startlist, $max_workers, $start_task, $check_task) = @_; > + > + my $rpcenv = PVE::RPCEnvironment::get(); > + my $authuser = $rpcenv->get_user(); > + my $api_client = create_client($authuser); > + > + my $failed = []; > + for my $order (sort { $a <=> $b } keys $startlist->%*) { > + my $vmlist = $startlist->{$order}; > + my $workers = {}; > + > + for my $vmid (sort { $a <=> $b } keys $vmlist->%*) { > + > + # wait until at least one slot is free > + while (scalar(keys($workers->%*)) >= $max_workers) { > + for my $upid (keys($workers->%*)) { > + my $worker = $workers->{$upid}; > + my $node = $worker->{guest}->{node}; > + > + my $task = eval { $api_client->get("/nodes/$node/tasks/$upid/status") }; this could easily fail for reasons other than the task having exited? should we maybe retry a few times to avoid accidents, before giving up? > + if (my $err = $@) { > + push $failed->@*, $worker->{vmid}; > + > + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, 1, undef); > + > + delete $workers->{$upid}; > + } elsif ($task->{status} ne 'running') { > + my $is_error = PVE::Tools::upid_status_is_error($task->{exitstatus}); > + push $failed->@*, $worker->{vmid} if $is_error; > + > + $check_task->( > + $api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task, > + ); > + > + delete $workers->{$upid}; > + } > + } > + sleep(1); # How much? 
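The retry suggested above could look roughly like this (a sketch under assumptions: $query stands in for the $api_client->get("/nodes/$node/tasks/$upid/status") call, and the attempt count and back-off are made up):

```perl
use strict;
use warnings;

# Only give up on the status query after a few consecutive failures, so
# a transient proxy or network hiccup is not counted as a failed guest.
sub query_with_retry {
    my ($query, $retries) = @_;
    $retries //= 3;

    my $err;
    for my $attempt (1 .. $retries) {
        my $result = eval { $query->() };
        return $result if !$@;
        $err = $@;
        sleep(1) if $attempt < $retries; # back off a bit between attempts
    }
    die $err; # still failing after all attempts, give up for real
}
```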
> + } > + > + my $guest = $vmlist->{$vmid}; > + my $upid = eval { $start_task->($api_client, $vmid, $guest) }; > + warn $@ if $@; A: here we use warn (see further similar nits below) > + > + # success but no task necessary > + next if defined($upid) && "$upid" eq "1"; > + > + if (!defined($upid)) { > + push $failed->@*, $vmid; > + next; > + } > + > + $workers->{$upid} = { > + vmid => $vmid, > + guest => $guest, > + }; > + } > + > + # wait until current order is finished > + for my $upid (keys($workers->%*)) { > + my $worker = $workers->{$upid}; > + my $node = $worker->{guest}->{node}; > + > + my $task = eval { wait_for_task_finished($api_client, $node, $upid) }; > + my $err = $@; > + my $is_error = ($err || PVE::Tools::upid_status_is_error($task->{exitstatus})) ? 1 : 0; > + push $failed->@*, $worker->{vmid} if $is_error; > + > + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task); > + > + delete $workers->{$upid}; > + } > + } > + > + return $failed; > +} > + > +sub get_type_text { > + my ($type) = @_; > + > + if ($type eq 'lxc') { > + return 'CT'; > + } elsif ($type eq 'qemu') { > + return 'VM'; > + } else { > + die "unknown guest type $type\n"; > + } > +} > + > +sub wait_for_task_finished { > + my ($client, $node, $upid) = @_; > + > + while (1) { > + my $task = $client->get("/nodes/$node/tasks/$upid/status"); same question as above here - should we try to handle transient errors here? > + return $task if $task->{status} ne 'running'; > + sleep(1); # How much time? > + } > +} > + > +sub check_guest_permissions { > + my ($rpcenv, $authuser, $vmlist, $priv_list) = @_; > + > + if (scalar($vmlist->@*) > 0) { > + $rpcenv->check($authuser, "/vms/$_", $priv_list) for $vmlist->@*; > + } elsif (!$rpcenv->check($authuser, "/", $priv_list, 1)) { > + raise_perm_exc("/, " . 
join(', ', $priv_list->@*)); > + } > +} > + > +sub extract_vmlist { > + my ($param) = @_; > + > + if (my $vmlist = $param->{vms}) { > + my $vmlist_string = join(',', $vmlist->@*); > + return ($vmlist, $vmlist_string); > + } > + return ([], undef); > +} > + > +sub print_start_action { > + my ($vmlist, $prefix, $suffix) = @_; > + > + $suffix = defined($suffix) ? " $suffix" : ""; > + > + if (scalar($vmlist->@*)) { > + print STDERR "$prefix guests$suffix: " . join(', ', $vmlist->@*) . "\n"; A: here we use STDERR > + } else { > + print STDERR "$prefix all guests$suffix\n"; > + } > +} > + > +__PACKAGE__->register_method({ > + name => 'start', > + path => 'start', > + method => 'POST', > + description => "Bulk start or resume all guests on the cluster.", > + permissions => { > + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/' for " > + . "each ID passed via the 'vms' parameter.", > + user => 'all', > + }, > + protected => 1, > + parameters => { > + additionalProperties => 0, > + properties => { > + vms => { > + description => "Only consider guests from this list of VMIDs.", > + type => 'array', > + items => get_standard_option('pve-vmid'), > + optional => 1, > + }, > + timeout => { > + description => > + "Default start timeout in seconds. Only valid for VMs. (default depends on the guest configuration).", > + type => 'integer', > + optional => 1, > + }, > + 'max-workers' => { > + description => "How many parallel tasks at maximum should be started.", > + optional => 1, > + default => 1, > + type => 'integer', > + }, > + # TODO: > + # Failure resolution mode (fail, warn, retry?) > + # mode-limits (offline only, suspend only, ?) > + # filter (tags, name, ?) 
> + }, > + }, > + returns => { > + type => 'string', > + description => "UPID of the worker", > + }, > + code => sub { > + my ($param) = @_; > + > + my $rpcenv = PVE::RPCEnvironment::get(); > + my $authuser = $rpcenv->get_user(); > + > + my ($vmlist, $vmlist_string) = extract_vmlist($param); > + > + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']); > + > + my $code = sub { > + my $startlist = > + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string); > + > + print_start_action($vmlist, "Starting"); > + > + my $start_task = sub { > + my ($api_client, $vmid, $guest) = @_; > + my $node = $guest->{node}; > + > + my $type = $guest->{type}; > + my $type_text = get_type_text($type); > + my $operation = 'start'; > + my $status = > + eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") }; > + if (defined($status) && $status->{status} eq 'running') { > + if (defined($status->{qmpstatus}) && $status->{qmpstatus} ne 'paused') { > + print STDERR "Skipping $type_text $vmid, already running.\n"; > + return 1; > + } else { > + $operation = 'resume'; > + } > + } > + > + my $params = {}; > + if (defined($param->{timeout}) && $operation eq 'start' && $type eq 'qemu') { > + $params->{timeout} = $param->{timeout}; > + } > + > + my $url = "/nodes/$node/$type/$vmid/status/$operation"; > + print STDERR "Starting $type_text $vmid\n"; > + return $api_client->post($url, $params); > + }; > + > + my $check_task = sub { > + my ($api_client, $vmid, $guest, $is_error, $task) = @_; > + my $node = $guest->{node}; > + > + my $default_delay = 0; > + > + if (!$is_error) { > + my $delay = defined($guest->{up}) ? int($guest->{up}) : $default_delay; > + if ($delay > 0) { > + print STDERR "Waiting for $delay seconds (startup delay)\n" > + if $guest->{up}; > + for (my $i = 0; $i < $delay; $i++) { > + sleep(1); > + } > + } > + } else { > + my $err = > + defined($task) ? 
$task->{exitstatus} : "could not query task status"; > + my $type_text = get_type_text($guest->{type}); > + print STDERR "Starting $type_text $vmid failed: $err\n"; > + } > + }; > + > + my $max_workers = $param->{'max-workers'} // 1; > + my $failed = > + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task); > + > + if (scalar($failed->@*)) { > + die "Some guests failed to start: " . join(', ', $failed->@*) . "\n"; > + } > + }; > + > + return $rpcenv->fork_worker('bulkstart', undef, $authuser, $code); > + }, > +}); > + > +__PACKAGE__->register_method({ > + name => 'shutdown', > + path => 'shutdown', > + method => 'POST', > + description => "Bulk shutdown all guests on the cluster.", > + permissions => { > + description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/' for " > + . "each ID passed via the 'vms' parameter.", > + user => 'all', > + }, > + protected => 1, > + parameters => { > + additionalProperties => 0, > + properties => { > + vms => { > + description => "Only consider guests from this list of VMIDs.", > + type => 'array', > + items => get_standard_option('pve-vmid'), > + optional => 1, > + }, > + timeout => { > + description => > + "Default shutdown timeout in seconds if none is configured for the guest.", > + type => 'integer', > + default => 180, > + optional => 1, > + }, > + 'force-stop' => { > + description => "Makes sure the Guest stops after the timeout.", > + type => 'boolean', > + default => 1, > + optional => 1, > + }, > + 'max-workers' => { > + description => "How many parallel tasks at maximum should be started.", > + optional => 1, > + default => 1, > + type => 'integer', > + }, > + # TODO: > + # Failure resolution mode (fail, warn, retry?) > + # mode-limits (offline only, suspend only, ?) > + # filter (tags, name, ?) 
> + }, > + }, > + returns => { > + type => 'string', > + description => "UPID of the worker", > + }, > + code => sub { > + my ($param) = @_; > + > + my $rpcenv = PVE::RPCEnvironment::get(); > + my $authuser = $rpcenv->get_user(); > + > + my ($vmlist, $vmlist_string) = extract_vmlist($param); > + > + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']); > + > + my $code = sub { > + my $startlist = > + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string); > + > + print_start_action($vmlist, "Shutting down"); > + > + # reverse order for shutdown > + for my $order (keys $startlist->%*) { > + my $list = delete $startlist->{$order}; > + $order = $order * -1; > + $startlist->{$order} = $list; > + } > + > + my $start_task = sub { > + my ($api_client, $vmid, $guest) = @_; > + my $node = $guest->{node}; > + > + my $type = $guest->{type}; > + my $type_text = get_type_text($type); > + > + my $status = > + eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") }; > + if (defined($status) && $status->{status} ne 'running') { > + print STDERR "Skipping $type_text $vmid, not running.\n"; > + return 1; > + } > + > + if ( > + defined($status) > + && defined($status->{qmpstatus}) > + && $status->{qmpstatus} eq 'paused' > + && !$param->{'force-stop'} > + ) { > + log_warn("Skipping $type_text $vmid, resume paused VM before shutdown.\n"); A: here we use log_warn > + return 1; > + } > + > + my $timeout = int($guest->{down} // $param->{timeout} // 180); > + my $forceStop = $param->{'force-stop'} // 1; > + > + my $params = { > + forceStop => $forceStop, > + timeout => $timeout, > + }; > + > + my $url = "/nodes/$node/$type/$vmid/status/shutdown"; > + print STDERR "Shutting down $type_text $vmid (Timeout = $timeout seconds)\n"; > + return $api_client->post($url, $params); > + }; > + > + my $check_task = sub { > + my ($api_client, $vmid, $guest, $is_error, $task) = @_; > + my $node = $guest->{node}; > + if ($is_error) { > + my $err = > + 
defined($task) ? $task->{exitstatus} : "could not query task status"; > + my $type_text = get_type_text($guest->{type}); > + print STDERR "Stopping $type_text $vmid failed: $err\n"; > + } > + }; > + > + my $max_workers = $param->{'max-workers'} // 1; > + my $failed = > + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task); > + > + if (scalar($failed->@*)) { > + die "Some guests failed to shutdown " . join(', ', $failed->@*) . "\n"; > + } > + }; > + > + return $rpcenv->fork_worker('bulkshutdown', undef, $authuser, $code); > + }, > +}); > + > +__PACKAGE__->register_method({ > + name => 'suspend', > + path => 'suspend', > + method => 'POST', > + description => "Bulk suspend all guests on the cluster.", > + permissions => { > + description => > + "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/' for each" > + . " ID passed via the 'vms' parameter. Additionally, you need 'VM.Config.Disk' on the" > + . " '/vms/{vmid}' path and 'Datastore.AllocateSpace' for the configured state-storage(s)", > + user => 'all', > + }, > + protected => 1, > + parameters => { > + additionalProperties => 0, > + properties => { > + vms => { > + description => "Only consider guests from this list of VMIDs.", > + type => 'array', > + items => get_standard_option('pve-vmid'), > + optional => 1, > + }, > + statestorage => get_standard_option( > + 'pve-storage-id', > + { > + description => "The storage for the VM state.", > + requires => 'to-disk', > + optional => 1, > + completion => \&PVE::Storage::complete_storage_enabled, > + }, > + ), > + 'to-disk' => { > + description => > + "If set, suspends the guests to disk. Will be resumed on next start.", > + type => 'boolean', > + default => 0, > + optional => 1, > + }, > + 'max-workers' => { > + description => "How many parallel tasks at maximum should be started.", > + optional => 1, > + default => 1, > + type => 'integer', > + }, > + # TODO: > + # Failure resolution mode (fail, warn, retry?) 
> + # mode-limits (offline only, suspend only, ?) > + # filter (tags, name, ?) > + }, > + }, > + returns => { > + type => 'string', > + description => "UPID of the worker", > + }, > + code => sub { > + my ($param) = @_; > + > + my $rpcenv = PVE::RPCEnvironment::get(); > + my $authuser = $rpcenv->get_user(); > + > + my ($vmlist, $vmlist_string) = extract_vmlist($param); > + > + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']); > + > + if ($param->{'to-disk'}) { > + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Config.Disk']); > + } > + > + if (my $statestorage = $param->{statestorage}) { > + $rpcenv->check($authuser, "/storage/$statestorage", ['Datastore.AllocateSpace']); > + } else { > + # storage access must be done in start task > + } this if should be nested in the other if? > + > + my $code = sub { > + my $startlist = > + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string); > + > + print_start_action($vmlist, "Suspending"); > + > + # reverse order for suspend > + for my $order (keys $startlist->%*) { > + my $list = delete $startlist->{$order}; > + $order = $order * -1; > + $startlist->{$order} = $list; > + } > + > + my $start_task = sub { > + my ($api_client, $vmid, $guest) = @_; > + my $node = $guest->{node}; > + > + if ($guest->{type} ne 'qemu') { > + log_warn("skipping $vmid, only VMs can be suspended"); > + return 1; > + } > + > + if (!$param->{statestorage}) { this should again be nested inside a check for to-disk being set > + my $conf = PVE::QemuConfig->load_config($vmid, $node); > + my $storecfg = PVE::Storage::config(); > + my $statestorage = PVE::QemuServer::find_vmstate_storage($conf, $storecfg); this does not exist, it's in QemuConfig > + $rpcenv->check( > + $authuser, > + "/storage/$statestorage", > + ['Datastore.AllocateSpace'], > + ); > + } > + > + my $status = > + eval { $api_client->get("/nodes/$node/qemu/$vmid/status/current") }; > + if (defined($status) && $status->{status} ne 
'running') { > + print STDERR "Skipping VM $vmid, not running.\n"; > + return 1; > + } > + > + my $params = {}; > + $params->{'todisk'} = $param->{'to-disk'} // 0; > + $params->{statestorage} = $param->{statestorage} > + if defined($param->{statestorage}); statestorage only makes sense if you set to-disk, so it should be ordered like that here as well.. > + > + my $url = "/nodes/$node/qemu/$vmid/status/suspend"; > + print STDERR "Suspending VM $vmid\n"; > + return $api_client->post($url, $params); > + }; > + > + my $check_task = sub { > + my ($api_client, $vmid, $guest, $is_error, $task) = @_; > + my $node = $guest->{node}; > + if ($is_error) { > + my $err = > + defined($task) ? $task->{exitstatus} : "could not query task status"; > + my $type_text = get_type_text($guest->{type}); > + print STDERR "Stopping $type_text $vmid failed: $err\n"; > + } > + }; > + > + my $max_workers = $param->{'max-workers'} // 1; > + my $failed = > + handle_task_foreach_guest($startlist, $max_workers, $start_task, $check_task); > + > + if (scalar($failed->@*)) { > + die "Some guests failed to suspend " . join(', ', $failed->@*) . "\n"; > + } > + }; > + > + return $rpcenv->fork_worker('bulksuspend', undef, $authuser, $code); > + }, > +}); > + > +__PACKAGE__->register_method({ > + name => 'migrate', > + path => 'migrate', > + method => 'POST', > + description => "Bulk migrate all guests on the cluster.", > + permissions => { > + description => > + "The 'VM.Migrate' permission is required on '/' or on '/vms/' for each " > + . "ID passed via the 'vms' parameter.", > + user => 'all', > + }, > + protected => 1, > + parameters => { > + additionalProperties => 0, > + properties => { > + vms => { > + description => "Only consider guests from this list of VMIDs.", > + type => 'array', > + items => get_standard_option('pve-vmid'), > + optional => 1, > + }, > + 'target-node' => get_standard_option('pve-node', { description => "Target node." 
}), > + online => { > + type => 'boolean', > + description => "Enable live migration for VMs and restart migration for CTs.", > + optional => 1, > + }, > + "with-local-disks" => { > + type => 'boolean', > + description => "Enable live storage migration for local disk", > + optional => 1, > + }, > + 'max-workers' => { > + description => "How many parallel tasks at maximum should be started.", > + optional => 1, > + default => 1, > + type => 'integer', > + }, > + # TODO: > + # Failure resolution mode (fail, warn, retry?) > + # mode-limits (offline only, suspend only, ?) > + # filter (tags, name, ?) > + }, > + }, > + returns => { > + type => 'string', > + description => "UPID of the worker", > + }, > + code => sub { > + my ($param) = @_; > + > + my $rpcenv = PVE::RPCEnvironment::get(); > + my $authuser = $rpcenv->get_user(); > + > + my ($vmlist, $vmlist_string) = extract_vmlist($param); > + > + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Migrate']); > + > + my $code = sub { > + my $list = > + PVE::API2::Nodes::Nodeinfo::get_filtered_vmlist(undef, $vmlist_string, 1, 1); > + > + print_start_action($vmlist, "Migrating", "to $param->{'target-node'}"); > + > + my $start_task = sub { > + my ($api_client, $vmid, $guest) = @_; > + my $node = $guest->{node}; > + > + my $type = $guest->{type}; > + my $type_text = get_type_text($type); > + > + if ($node eq $param->{'target-node'}) { > + print STDERR "$type_text $vmid already on $node, skipping.\n"; > + return 1; > + } > + > + my $params = { > + target => $param->{'target-node'}, > + }; > + > + if ($type eq 'lxc') { > + $params->{restart} = $param->{online} if defined($param->{online}); > + } elsif ($type eq 'qemu') { > + $params->{online} = $param->{online} if defined($param->{online}); > + $params->{'with-local-disks'} = $param->{'with-local-disks'} > + if defined($param->{'with-local-disks'}); > + } > + > + my $url = "/nodes/$node/$type/$vmid/migrate"; > + print STDERR "Migrating $type_text $vmid\n"; > + return 
$api_client->post($url, $params); > + }; > + > + my $check_task = sub { > + my ($api_client, $vmid, $guest, $is_error, $task) = @_; > + if ($is_error) { > + my $err = > + defined($task) ? $task->{exitstatus} : "could not query task status"; > + my $type_text = get_type_text($guest->{type}); > + print STDERR "Migrating $type_text $vmid failed: $err\n"; > + } > + }; > + > + my $max_workers = $param->{'max-workers'} // 1; > + my $failed = > + handle_task_foreach_guest({ '0' => $list }, $max_workers, $start_task, $check_task); > + > + if (scalar($failed->@*)) { > + die "Some guests failed to migrate " . join(', ', $failed->@*) . "\n"; > + } > + }; > + > + return $rpcenv->fork_worker('bulkmigrate', undef, $authuser, $code); > + }, > +}); > + > +1; > diff --git a/PVE/API2/Cluster/BulkAction/Makefile b/PVE/API2/Cluster/BulkAction/Makefile > new file mode 100644 > index 00000000..822c1c15 > --- /dev/null > +++ b/PVE/API2/Cluster/BulkAction/Makefile > @@ -0,0 +1,17 @@ > +include ../../../../defines.mk > + > +# for node independent, cluster-wide applicable, API endpoints > +# ensure we do not conflict with files shipped by pve-cluster!! > +PERLSOURCE= \ > + Guest.pm > + > +all: > + > +.PHONY: clean > +clean: > + rm -rf *~ > + > +.PHONY: install > +install: ${PERLSOURCE} > + install -d ${PERLLIBDIR}/PVE/API2/Cluster/BulkAction > + install -m 0644 ${PERLSOURCE} ${PERLLIBDIR}/PVE/API2/Cluster/BulkAction > diff --git a/PVE/API2/Cluster/Makefile b/PVE/API2/Cluster/Makefile > index b109e5cb..6cffe4c9 100644 > --- a/PVE/API2/Cluster/Makefile > +++ b/PVE/API2/Cluster/Makefile > @@ -1,11 +1,13 @@ > include ../../../defines.mk > > -SUBDIRS=Mapping > +SUBDIRS=Mapping \ > + BulkAction > > # for node independent, cluster-wide applicable, API endpoints > # ensure we do not conflict with files shipped by pve-cluster!! 
> PERLSOURCE= \ > BackupInfo.pm \ > + BulkAction.pm \ > MetricServer.pm \ > Mapping.pm \ > Notifications.pm \ > diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm > index ce7eecaf..0c43f5c7 100644 > --- a/PVE/API2/Nodes.pm > +++ b/PVE/API2/Nodes.pm > @@ -1908,7 +1908,7 @@ __PACKAGE__->register_method({ > # * vmid whitelist > # * guest is a template (default: skip) > # * guest is HA manged (default: skip) > -my $get_filtered_vmlist = sub { > +sub get_filtered_vmlist { > my ($nodename, $vmfilter, $templates, $ha_managed) = @_; > > my $vmlist = PVE::Cluster::get_vmlist(); > @@ -1935,28 +1935,29 @@ my $get_filtered_vmlist = sub { > die "unknown virtual guest type '$d->{type}'\n"; > } > > - my $conf = $class->load_config($vmid); > + my $conf = $class->load_config($vmid, $d->{node}); > return if !$templates && $class->is_template($conf); > return if !$ha_managed && PVE::HA::Config::vm_is_ha_managed($vmid); > > $res->{$vmid}->{conf} = $conf; > $res->{$vmid}->{type} = $d->{type}; > $res->{$vmid}->{class} = $class; > + $res->{$vmid}->{node} = $d->{node}; > }; > warn $@ if $@; > } > > return $res; > -}; > +} > > # return all VMs which should get started/stopped on power up/down > -my $get_start_stop_list = sub { > +sub get_start_stop_list { > my ($nodename, $autostart, $vmfilter) = @_; > > # do not skip HA vms on force or if a specific VMID set is wanted > my $include_ha_managed = defined($vmfilter) ? 
1 : 0; > > - my $vmlist = $get_filtered_vmlist->($nodename, $vmfilter, undef, $include_ha_managed); > + my $vmlist = get_filtered_vmlist($nodename, $vmfilter, undef, $include_ha_managed); > > my $resList = {}; > foreach my $vmid (keys %$vmlist) { > @@ -1969,15 +1970,16 @@ my $get_start_stop_list = sub { > > $resList->{$order}->{$vmid} = $startup; > $resList->{$order}->{$vmid}->{type} = $vmlist->{$vmid}->{type}; > + $resList->{$order}->{$vmid}->{node} = $vmlist->{$vmid}->{node}; > } > > return $resList; > -}; > +} > > my $remove_locks_on_startup = sub { > my ($nodename) = @_; > > - my $vmlist = &$get_filtered_vmlist($nodename, undef, undef, 1); > + my $vmlist = get_filtered_vmlist($nodename, undef, undef, 1); > > foreach my $vmid (keys %$vmlist) { > my $conf = $vmlist->{$vmid}->{conf}; > @@ -2069,7 +2071,7 @@ __PACKAGE__->register_method({ > warn $@ if $@; > > my $autostart = $force ? undef : 1; > - my $startList = $get_start_stop_list->($nodename, $autostart, $param->{vms}); > + my $startList = get_start_stop_list($nodename, $autostart, $param->{vms}); > > # Note: use numeric sorting with <=> > for my $order (sort { $a <=> $b } keys %$startList) { > @@ -2215,7 +2217,7 @@ __PACKAGE__->register_method({ > > $rpcenv->{type} = 'priv'; # to start tasks in background > > - my $stopList = $get_start_stop_list->($nodename, undef, $param->{vms}); > + my $stopList = get_start_stop_list($nodename, undef, $param->{vms}); > > my $cpuinfo = PVE::ProcFSTools::read_cpuinfo(); > my $datacenterconfig = cfs_read_file('datacenter.cfg'); > @@ -2344,7 +2346,7 @@ __PACKAGE__->register_method({ > > $rpcenv->{type} = 'priv'; # to start tasks in background > > - my $toSuspendList = $get_start_stop_list->($nodename, undef, $param->{vms}); > + my $toSuspendList = get_start_stop_list($nodename, undef, $param->{vms}); > > my $cpuinfo = PVE::ProcFSTools::read_cpuinfo(); > my $datacenterconfig = cfs_read_file('datacenter.cfg'); > @@ -2549,7 +2551,7 @@ __PACKAGE__->register_method({ > my $code = 
sub { > $rpcenv->{type} = 'priv'; # to start tasks in background > > - my $vmlist = &$get_filtered_vmlist($nodename, $param->{vms}, 1, 1); > + my $vmlist = get_filtered_vmlist($nodename, $param->{vms}, 1, 1); > if (!scalar(keys %$vmlist)) { > warn "no virtual guests matched, nothing to do..\n"; > return; > -- > 2.39.5 > > > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel