From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <t.lamprecht@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id B30D860097
 for <pve-devel@lists.proxmox.com>; Fri,  5 Feb 2021 13:40:43 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id A09EC8DE3
 for <pve-devel@lists.proxmox.com>; Fri,  5 Feb 2021 13:40:13 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [212.186.127.180])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id C57698DD3
 for <pve-devel@lists.proxmox.com>; Fri,  5 Feb 2021 13:40:11 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 802E1461ED
 for <pve-devel@lists.proxmox.com>; Fri,  5 Feb 2021 13:40:11 +0100 (CET)
Message-ID: <005c7bc3-a624-68c7-cdbc-cc6000c6ef89@proxmox.com>
Date: Fri, 5 Feb 2021 13:40:10 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:86.0) Gecko/20100101
 Thunderbird/86.0
Content-Language: en-US
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Hannes Laimer <h.laimer@proxmox.com>
References: <20210205103538.742007-1-h.laimer@proxmox.com>
 <20210205103538.742007-2-h.laimer@proxmox.com>
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
In-Reply-To: <20210205103538.742007-2-h.laimer@proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.028 Adjusted score from AWL reputation of From: address
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.182 Looks like a legit reply (A)
 RCVD_IN_DNSWL_MED        -2.3 Sender listed at https://www.dnswl.org/,
 medium trust
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [proxmox.com, nodes.pm]
Subject: Re: [pve-devel] [PATCH pve-manager 1/2] api2: add suspendall
 endpoint
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Fri, 05 Feb 2021 12:40:43 -0000

On 05.02.21 11:35, Hannes Laimer wrote:

Missing reference to bug #804 which this patch addresses.
https://pve.proxmox.com/wiki/Developer_Documentation#Commits_and_Commit_Messages

Mentioning that this is mostly a 1:1 copy of the startall and stopall
methods would also be good; it improves confidence in the code and
explains why some parts seem so familiar ;-)


Looks OK code-wise, but there is a lot of deduplication potential with
the other "start/stop/.. all" endpoints, and IMO some thought on the
required API privileges is missing.

> Signed-off-by: Hannes Laimer <h.laimer@proxmox.com>
> ---
>  PVE/API2/Nodes.pm | 113 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 113 insertions(+)
> 
> diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
> index 8172231e..0c11fe35 100644
> --- a/PVE/API2/Nodes.pm
> +++ b/PVE/API2/Nodes.pm
> @@ -1943,6 +1943,119 @@ __PACKAGE__->register_method ({
>  	return $rpcenv->fork_worker('stopall', undef, $authuser, $code);
>      }});
>  
> +my $create_suspend_worker = sub {
> +    my ($nodename, $type, $vmid, $down_timeout) = @_;
> +
> +    my $upid;
> +    if ($type eq 'qemu') {
> +	return if !PVE::QemuServer::check_running($vmid, 1);
> +	my $timeout =  defined($down_timeout) ? int($down_timeout) : 60*3;
> +	print STDERR "Suspending VM $vmid (timeout = $timeout seconds)\n";
> +	$upid = PVE::API2::Qemu->vm_suspend({node => $nodename, vmid => $vmid});

You could drop $upid and just return the result directly here.

> +    } else {
> +	die "suspension is only supported on VMs, not on '$type'\n";


As we won't get other suspendable types in the foreseeable future we could
make this an early check.
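
A rough sketch of how both points could look together (early type check,
no $upid temporary). The $check_running and $vm_suspend stand-ins are
hypothetical stubs, only there so the snippet runs on its own; the real
code would keep calling PVE::QemuServer::check_running and
PVE::API2::Qemu->vm_suspend:

```perl
use strict;
use warnings;

# Hypothetical stand-ins so the sketch is self-contained; Nodes.pm would call
# PVE::QemuServer::check_running() and PVE::API2::Qemu->vm_suspend() instead.
my $check_running = sub { my ($vmid) = @_; return $vmid != 101; };
my $vm_suspend = sub { my ($param) = @_; return "UPID:$param->{node}:$param->{vmid}"; };

my $create_suspend_worker = sub {
    my ($nodename, $type, $vmid, $down_timeout) = @_;

    # early check: only VMs can be suspended
    die "suspension is only supported on VMs, not on '$type'\n" if $type ne 'qemu';

    # nothing to do for VMs that are not running
    return if !$check_running->($vmid);

    my $timeout = defined($down_timeout) ? int($down_timeout) : 60*3;
    print STDERR "Suspending VM $vmid (timeout = $timeout seconds)\n";

    # return the UPID directly, no temporary variable needed
    return $vm_suspend->({ node => $nodename, vmid => $vmid });
};
```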

> +    }
> +
> +    return $upid;
> +};
> +
> +__PACKAGE__->register_method ({
> +    name => 'suspendall',
> +    path => 'suspendall',
> +    method => 'POST',
> +    protected => 1,
> +    permissions => {
> +	check => ['perm', '/', [ 'VM.PowerMgmt' ]],

In contrast to the pretty much stateless start and stop, suspension results in
saving state to disk, so I'm not sure if just having VM.PowerMgmt is enough?

Or at least I see no reasoning for that in the (rather nonexistent) commit
message.

> +    },
> +    proxyto => 'node',
> +    description => "Suspend all VMs.",

Suspend all or a specific set of VMs.

> +    parameters => {
> +    	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vms => {
> +		description => "Only consider Guests with these IDs.",
> +		type => 'string',  format => 'pve-vmid-list',
> +		optional => 1,
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => 'string',
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $nodename = $param->{node};
> +	$nodename = PVE::INotify::nodename() if $nodename eq 'localhost';
> +
> +	my $code = sub {
> +
> +	    $rpcenv->{type} = 'priv'; # to start tasks in background
> +
> +	    my $stopList = &$get_start_stop_list($nodename, undef, $param->{vms});

Use new call syntax for code references in new code:

$get_start_stop_list->(...)

> +
> +	    my $cpuinfo = PVE::ProcFSTools::read_cpuinfo();
> +	    my $datacenterconfig = cfs_read_file('datacenter.cfg');
> +	    # if not set by user spawn max cpu count number of workers
> +	    my $maxWorkers =  $datacenterconfig->{max_workers} || $cpuinfo->{cpus};

IIRC, above lines are now duplicated two or three times, refactoring it
out into a separate method would avoid code duplication and reduce the
length of this scope.
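
Something along these lines, as a minimal sketch. The helper name is made
up, and the two configs are passed in as arguments only so that the snippet
runs standalone; in Nodes.pm they would come from
cfs_read_file('datacenter.cfg') and PVE::ProcFSTools::read_cpuinfo():

```perl
use strict;
use warnings;

# Hypothetical helper name, not part of the existing code.
my $get_max_workers = sub {
    my ($datacenterconfig, $cpuinfo) = @_;

    # if not set by user spawn max cpu count number of workers
    return $datacenterconfig->{max_workers} || $cpuinfo->{cpus};
};

# usage with made-up values
my $maxWorkers = $get_max_workers->({ max_workers => 4 }, { cpus => 8 });
```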

> +
> +	    foreach my $order (sort {$b <=> $a} keys %$stopList) {
> +		my $vmlist = $stopList->{$order};
> +		my $workers = {};
> +
> +		my $finish_worker = sub {
> +		    my $pid = shift;
> +		    my $d = $workers->{$pid};
> +		    return if !$d;
> +		    delete $workers->{$pid};
> +
> +		    syslog('info', "end task $d->{upid}");
> +		};
> +
> +		foreach my $vmid (sort {$b <=> $a} keys %$vmlist) {
> +		    my $d = $vmlist->{$vmid};
> +		    my $upid;
> +		    eval { $upid = &$create_suspend_worker($nodename, $d->{type}, $vmid, $d->{down}); };

above could be shorter, i.e., write as:

my $upid = eval { $create_suspend_worker->($nodename, $d->{type}, $vmid, $d->{down}) };

> +		    warn $@ if $@;
> +		    next if !$upid;
> +
> +		    my $res = PVE::Tools::upid_decode($upid, 1);
> +		    next if !$res;
> +
> +		    my $pid = $res->{pid};
> +
> +		    $workers->{$pid} = { type => $d->{type}, upid => $upid, vmid => $vmid };
> +		    while (scalar(keys %$workers) >= $maxWorkers) {
> +			foreach my $p (keys %$workers) {
> +			    if (!PVE::ProcFSTools::check_process_running($p)) {
> +				&$finish_worker($p);
> +			    }
> +			}
> +			sleep(1);
> +		    }
> +		}
> +		while (scalar(keys %$workers)) {
> +		    foreach my $p (keys %$workers) {
> +			if (!PVE::ProcFSTools::check_process_running($p)) {
> +			    &$finish_worker($p);
> +			}
> +		    }


We have that worker loop twice per "stop/start/.. all" endpoint. It
would be nice to factor it out into a sub which takes the hash reference
to $workers, with the code from finish_worker inlined, as that is also
duplicated a few times.
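
Roughly like the following sketch. The helper name and its $limit parameter
are assumptions on my side, and $is_running and $log are injected only to
keep the snippet runnable on its own; the real helper would call
PVE::ProcFSTools::check_process_running() and syslog() directly:

```perl
use strict;
use warnings;

# Hypothetical helper: block until at most $limit entries of the $workers
# hash (pid => { upid => ... }) are still running.
my $wait_for_workers = sub {
    my ($workers, $limit, $is_running, $log) = @_;

    while (scalar(keys %$workers) > $limit) {
        foreach my $pid (keys %$workers) {
            next if $is_running->($pid);
            # inlined former $finish_worker
            my $d = delete $workers->{$pid};
            $log->("end task $d->{upid}");
        }
        # unlike the loops in the patch, skip the sleep once we are done
        sleep(1) if scalar(keys %$workers) > $limit;
    }
};
```

The throttling loop would then become a call with $maxWorkers - 1 as limit,
and the final wait per order a call with 0 as limit.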

> +		    sleep(1);
> +		}
> +	    }
> +
> +	    syslog('info', "all VMs suspended");
> +
> +	    return;
> +	};
> +
> +	return $rpcenv->fork_worker('suspendall', undef, $authuser, $code);
> +    }});
> +
>  my $create_migrate_worker = sub {
>      my ($nodename, $type, $vmid, $target, $with_local_disks) = @_;
>  
>