From: Dominik Csapak <d.csapak@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH manager v2] api: implement node-independent bulk actions
Date: Fri, 14 Nov 2025 10:16:20 +0100 [thread overview]
Message-ID: <c9ff0657-de5f-431a-b493-ea138bd36ad0@proxmox.com> (raw)
In-Reply-To: <176310914957.64802.13109309341568645230@yuna.proxmox.com>
thanks for the review!
just a few questions inline
On 11/14/25 9:32 AM, Fabian Grünbichler wrote:
> Quoting Dominik Csapak (2025-08-14 13:26:59)
[snip]
>> +sub create_client {
>> + my ($authuser, $request_timeout) = @_;
>> + my ($user, undef) = PVE::AccessControl::split_tokenid($authuser, 1);
>> +
>> + # TODO: How to handle Tokens?
>
> not like below for sure ;) we'd need to make it queriable using the
> RPCEnvironment (and store it there) I guess? maybe opt-in so the storing only
> happens for certain API handlers (e.g., these ones here for a start)?
>
> this basically escalates from the token to a ticket of the user, which is a
> nogo even if you duplicate the current set of privilege checks here, as that is
> just waiting to get out of sync
Just to clarify: what's the worst that could happen with this?
the API call checks if the token has the correct permissions for all
resources, and the user can't have less permissions at that point.
So 'upgrading' the token to a full user ticket for the context
of these changes does not inherently break anything?
(Except I'm missing something?)
There is a TOCTOU issue of course, but IMHO it would not that big
of a deal since this can only happen if the permissions change
during the running task, and one could argue that the token
had permission to do these changes when starting the api call,
so it should be able to finish it.
I modeled this after we do the metrics export api call
in PVE/API2/Cluster/MetricServer.pm where we do the same
(or similar), so we probably should change it there as well then?
Anyway, I'll see that we extend the rpcenv to be able to do that,
but how should we mark the api call? in a special way like we do
with 'protected' ? (so e.g. '"save-token-info" => 1,' in the api
call (with a better name ofc))
>
>> + my $ticket = PVE::AccessControl::assemble_ticket($user || $authuser);
>> + my $csrf_token = PVE::AccessControl::assemble_csrf_prevention_token($user || $authuser);
>> +
>> + my $node = PVE::INotify::nodename();
>> + my $fingerprint = PVE::Cluster::get_node_fingerprint($node);
>> +
>> + my $conn_args = {
>> + protocol => 'https',
>> + host => 'localhost', # always call the api locally, let pveproxy handle the proxying
>> + port => 8006,
>> + ticket => $ticket,
>> + timeout => $request_timeout // 25, # default slightly shorter than the proxy->daemon timeout
>> + cached_fingerprints => {
>> + $fingerprint => 1,
>> + },
>> + };
>> +
>> + my $api_client = PVE::APIClient::LWP->new($conn_args->%*);
>> + if (defined($csrf_token)) {
>> + $api_client->update_csrftoken($csrf_token);
>> + }
>> +
>> + return $api_client;
>
> this client doesn't automatically refresh the ticket before it expires, so
> bigger sets of bulk actions that take more than 2h will always fail..
>
true, so i'd abstract away a 'make_request' function that checks
if we should renew the ticket and always use that?
>> +}
>> +
>> +# starts and awaits a task for each guest given via $startlist.
>> +#
>> +# takes a vm list in the form of
>> +# {
>> +# 0 => {
>> +# 100 => { .. guest info ..},
>> +# 101 => { .. guest info ..},
>> +# },
>> +# 1 => {
>> +# 102 => { .. guest info ..},
>> +# 103 => { .. guest info ..},
>> +# },
>> +# }
>> +#
>> +# max_workers: how many parallel tasks should be started.
>> +# start_task: a sub that returns eiter a upid or 1 (undef means failure)
>> +# check_task: if start_task returned a upid, will wait for that to finish and
>> +# call check_task with the resulting task status
>> +sub handle_task_foreach_guest {
>> + my ($startlist, $max_workers, $start_task, $check_task) = @_;
>> +
>> + my $rpcenv = PVE::RPCEnvironment::get();
>> + my $authuser = $rpcenv->get_user();
>> + my $api_client = create_client($authuser);
>> +
>> + my $failed = [];
>> + for my $order (sort { $a <=> $b } keys $startlist->%*) {
>> + my $vmlist = $startlist->{$order};
>> + my $workers = {};
>> +
>> + for my $vmid (sort { $a <=> $b } keys $vmlist->%*) {
>> +
>> + # wait until at least one slot is free
>> + while (scalar(keys($workers->%*)) >= $max_workers) {
>> + for my $upid (keys($workers->%*)) {
>> + my $worker = $workers->{$upid};
>> + my $node = $worker->{guest}->{node};
>> +
>> + my $task = eval { $api_client->get("/nodes/$node/tasks/$upid/status") };
>
> this could easily fail for reasons other than the task having exited? should we
> maybe retry a few times to avoid accidents, before giving up?
>
yep, i'd include such functionality (opt-in just for these calls) in the
'make_request' abstraction?
>> + if (my $err = $@) {
>> + push $failed->@*, $worker->{vmid};
>> +
>> + $check_task->($api_client, $worker->{vmid}, $worker->{guest}, 1, undef);
>> +
>> + delete $workers->{$upid};
>> + } elsif ($task->{status} ne 'running') {
>> + my $is_error = PVE::Tools::upid_status_is_error($task->{exitstatus});
>> + push $failed->@*, $worker->{vmid} if $is_error;
>> +
>> + $check_task->(
>> + $api_client, $worker->{vmid}, $worker->{guest}, $is_error, $task,
>> + );
>> +
>> + delete $workers->{$upid};
>> + }
>> + }
>> + sleep(1); # How much?
>> + }
>> +
>> + my $guest = $vmlist->{$vmid};
>> + my $upid = eval { $start_task->($api_client, $vmid, $guest) };
>> + warn $@ if $@;
>
> A: here we use warn (see further similar nits below)
true, i'll use log_warn for those
>
[snip]
>> + code => sub {
>> + my ($param) = @_;
>> +
>> + my $rpcenv = PVE::RPCEnvironment::get();
>> + my $authuser = $rpcenv->get_user();
>> +
>> + my ($vmlist, $vmlist_string) = extract_vmlist($param);
>> +
>> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.PowerMgmt']);
>> +
>> + if ($param->{'to-disk'}) {
>> + check_guest_permissions($rpcenv, $authuser, $vmlist, ['VM.Config.Disk']);
>> + }
>> +
>> + if (my $statestorage = $param->{statestorage}) {
>> + $rpcenv->check($authuser, "/storage/$statestorage", ['Datastore.AllocateSpace']);
>> + } else {
>> + # storage access must be done in start task
>> + }
>
> this if should be nested in the other if?
we could, but since 'statestorage' requires 'to-disk' in the api anyway,
so changing this would just increase indentation?
i could put it int he if, but the api already does that for us
>
>> +
>> + my $code = sub {
>> + my $startlist =
>> + PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $vmlist_string);
>> +
>> + print_start_action($vmlist, "Suspending");
>> +
>> + # reverse order for suspend
>> + for my $order (keys $startlist->%*) {
>> + my $list = delete $startlist->{$order};
>> + $order = $order * -1;
>> + $startlist->{$order} = $list;
>> + }
>> +
>> + my $start_task = sub {
>> + my ($api_client, $vmid, $guest) = @_;
>> + my $node = $guest->{node};
>> +
>> + if ($guest->{type} ne 'qemu') {
>> + log_warn("skipping $vmid, only VMs can be suspended");
>> + return 1;
>> + }
>> +
>> + if (!$param->{statestorage}) {
>
> this should again be nested inside a check for to-disk being set
again, same argument as above
>
>> + my $conf = PVE::QemuConfig->load_config($vmid, $node);
>> + my $storecfg = PVE::Storage::config();
>> + my $statestorage = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
>
> this does not exist, it's in QemuConfig
ah yes, this was moved betweeen my v1 and v2 IIRC and i forgot to change it
>
>> + $rpcenv->check(
>> + $authuser,
>> + "/storage/$statestorage",
>> + ['Datastore.AllocateSpace'],
>> + );
>> + }
>> +
>> + my $status =
>> + eval { $api_client->get("/nodes/$node/qemu/$vmid/status/current") };
>> + if (defined($status) && $status->{status} ne 'running') {
>> + print STDERR "Skipping VM $vmid, not running.\n";
>> + return 1;
>> + }
>> +
>> + my $params = {};
>> + $params->{'todisk'} = $param->{'to-disk'} // 0;
>> + $params->{statestorage} = $param->{statestorage}
>> + if defined($param->{statestorage});
>
> statestorage only makes sense if you set to-disk, so it should be ordered like
> that here as well..
again same argument as above
if you insist, i'll change it, but IMHO it just increases the
indentation level for not much gain
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2025-11-14 9:16 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-14 11:26 Dominik Csapak
2025-11-14 8:32 ` Fabian Grünbichler
2025-11-14 9:08 ` Thomas Lamprecht
2025-11-14 9:16 ` Dominik Csapak [this message]
2025-11-14 9:32 ` Fabian Grünbichler
2025-11-14 15:00 ` [pve-devel] superseded: " Dominik Csapak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c9ff0657-de5f-431a-b493-ea138bd36ad0@proxmox.com \
--to=d.csapak@proxmox.com \
--cc=f.gruenbichler@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox