From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id 8EC4D1FF15C
	for <inbox@lore.proxmox.com>; Fri,  4 Apr 2025 16:15:06 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id A95F7335BC;
	Fri,  4 Apr 2025 16:14:52 +0200 (CEST)
Message-ID: <bd0f609c-9e47-4b54-9f56-ec23753d81f0@proxmox.com>
Date: Fri, 4 Apr 2025 16:14:48 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: pve-devel@lists.proxmox.com, d.csapak@proxmox.com
References: <20250403141215.3735230-1-d.csapak@proxmox.com>
Content-Language: en-US
From: =?UTF-8?Q?Michael_K=C3=B6ppl?= <m.koeppl@proxmox.com>
In-Reply-To: <20250403141215.3735230-1-d.csapak@proxmox.com>
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.001 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [guest.pm]
Subject: Re: [pve-devel] [PATCH manager] api: implement node-independent
 bulk actions
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

On 4/3/25 16:12, Dominik Csapak wrote:
> To achieve this, start a worker task and use our generic api client
> to start the tasks on the relevant nodes. The client always points
> to 'localhost' so we let the pveproxy worry about the proxying etc.
> 
> We reuse some logic from the startall/stopall/etc. calls, like getting
> the ordered guest info list. For that to work, we must convert some of
> the private subs into proper subs. We also fix handling loading configs
> from other nodes.
> 
> In each worker, for each task, we check if the target is in the desired
> state (e.g. stopped when wanting to start, etc.). If that iss the case,
> start the task and put the UPID in a queue to check. This is done until
> the parallel count is at 'max_workers', at which point we wait until at
> least one task is finished before starting the next one.
> 
> Failures (e.g. task errors or failure to fetch the tasks) are printed,
> ant the vmid is saved and they're collectively printed at the end for
> convenience.
> 
> Special handling is required for checking the permissions for suspend:
> We have to load the config of the VM and find the target state storage.
> We can then check the privileges for that storage.
> 
> Further improvements can be:
> * filters (I'd prefer starting out with front end filters)
> * failure mode resolution (I'd wait until someone requests that)
> * token handling (probably not necessary since we do check the
>    permissions upfront for the correct token.)
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>

I tested start, suspend, shutdown, migrate with a 3-node cluster with 
multiple VMs and CTs. I tested with both ticket and token authentication 
and tried the following scenarios:

- Start (with no vms parameter, selection of VMIDs, and with empty 
string vor vms)
- Shutdown (vms parameter as above, force-stop on and off)
- Suspend (vms parameter as above, to-disk on and off)
- Migrate (vms parameter as above, online on and off, with-local-disks 
on and off)

All of the above I also tried with varying settings for max-workers.

The tested cases seem to work as expected and I find the API intuitive. 
I like it!

I added 3 comments inline

> +++ b/PVE/API2/Cluster/BulkAction/Guest.pm
> @@ -0,0 +1,701 @@
> +package PVE::API2::Cluster::BulkAction::Guest;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::APIClient::LWP;
> +use PVE::AccessControl;
> +use PVE::Cluster;
> +use PVE::Exception qw(raise raise_perm_exc raise_param_exc);
> +use PVE::INotify;
> +use PVE::JSONSchema qw(get_standard_option);
> +use PVE::QemuConfig;
> +use PVE::QemuServer;
> +use PVE::RESTEnvironment qw(log_warn);
> +use PVE::RESTHandler;
> +use PVE::RPCEnvironment;
> +use PVE::Storage;
> +use PVE::Tools qw();
> +
> +use PVE::API2::Nodes;
> +
> +use base qw(PVE::RESTHandler);
> +
> +
> +__PACKAGE__->register_method ({
> +    name => 'index',
> +    path => '',
> +    method => 'GET',
> +    description => "Bulk action index.",
> +    permissions => { user => 'all' },
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {},
> +    },
> +    returns => {
> +	type => 'array',
> +	items => {
> +	    type => "object",
> +	    properties => {},
> +	},
> +	links => [ { rel => 'child', href => "{name}" } ],
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	return [
> +	    { name => 'start' },
> +	    { name => 'shutdown' },
> +	    { name => 'migrate' },
> +	];
> +    }});

'suspend' seems to be missing here.

> +__PACKAGE__->register_method({
> +    name => 'shutdown',
> +    path => 'shutdown',
> +    method => 'POST',
> +    description => "Bulk shutdown all guests on the cluster.",
> +    permissions => {
> +	description => "The 'VM.PowerMgmt' permission is required on '/' or on '/vms/<ID>' for "
> +	    ."each ID passed via the 'vms' parameter.",
> +	user => 'all',
> +    },
> +    protected => 1,
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    vms => {
> +		description => "Only consider guests from this comma separated list of VMIDs.",
> +		type => 'string',  format => 'pve-vmid-list',
> +		optional => 1,
> +	    },
> +	    timeout => {
> +		description => "Default shutdown timeout in seconds if none is configured for the guest.",
> +		type => 'integer',
> +		default => 180,
> +		optional => 1,
> +	    },
> +	    'force-stop' => {
> +		description => "Makes sure the Guest stops after the timeout.",
> +		type => 'boolean',
> +		default => 1,
> +		optional => 1,
> +	    },
> +	    'max-workers' => {
> +		description => "How many parallel tasks at maximum should be started.",
> +		optional => 1,
> +		default => 1,
> +		type => 'integer',
> +	    },
> +	    # TODO:
> +	    # Failure resolution mode (fail, warn, retry?)
> +	    # mode-limits (offline only, suspend only, ?)
> +	    # filter (tags, name, ?)
> +	},
> +    },
> +    returns => {
> +	type => 'string',
> +	description => "UPID of the worker",
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $vmlist = [PVE::Tools::split_list($param->{vms})];
> +
> +	check_guest_permissions($rpcenv, $authuser, $vmlist, [ 'VM.PowerMgmt' ]);
> +
> +	my $code = sub {
> +	    my $startlist = PVE::API2::Nodes::Nodeinfo::get_start_stop_list(undef, undef, $param->{vms});
> +
> +	    if (scalar($vmlist->@*)) {
> +		print STDERR "Shutting down guests: " . join(', ', $vmlist->@*) . "\n";
> +	    } else {
> +		print STDERR "Shutting down all guests\n";
> +	    }

nit: the above part is used (with varying permissions that are required) 
in multiple subroutines. I think a helper function for the initial setup 
of each subroutine and checking the required permissions would make sense.

> +
> +	    # reverse order for shutdown
> +	    for my $order (keys $startlist->%*) {
> +		my $list = delete $startlist->{$order};
> +		$order = $order * -1;
> +		$startlist->{$order} = $list;
> +	    }
> +
> +	    my $start_task = sub {
> +		my ($api_client, $vmid, $guest) = @_;
> +		my $node = $guest->{node};
> +
> +		my $type = $guest->{type};
> +		my $type_text = get_type_text($type);
> +
> +		my $status = eval { $api_client->get("/nodes/$node/$type/$vmid/status/current") };
> +		if (defined($status) && $status->{status} ne 'running') {
> +		    print STDERR "Skipping $type_text $vmid, not running.\n";
> +		    return 1;
> +		}
> +
> +		if (defined($status) && $status->{qmpstatus} eq 'paused' && !$param->{'force-stop'}) {
> +		    log_warn("Skipping $type_text $vmid, resume paused VM before shutdown.\n");
> +		    return 1;
> +		}

Since there is no check if $status->{qmpstatus} is defined, I get a "Use 
of uninitialized value" error printed to the log when running shutdown 
with containers.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel