Date: Fri, 17 Nov 2023 14:09:00 +0100
From: Wolfgang Bumiller
To: Friedrich Weber
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [RFC container 2/4] fix #4474: lxc api: add overrule-shutdown parameter to stop endpoint
In-Reply-To: <20230126083214.711099-3-f.weber@proxmox.com>

On Thu, Jan 26, 2023 at 09:32:12AM +0100, Friedrich Weber wrote:
> The new `overrule-shutdown` parameter is boolean and defaults to 0. If
> it is 1, all active `vzshutdown` tasks by the current user for the same
> CT are aborted before attempting to stop the CT.
>
> Passing `overrule-shutdown=1` is forbidden for HA resources.
>
> Signed-off-by: Friedrich Weber
> ---
>  src/PVE/API2/LXC/Status.pm | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/src/PVE/API2/LXC/Status.pm b/src/PVE/API2/LXC/Status.pm
> index f7e3128..d1d67f4 100644
> --- a/src/PVE/API2/LXC/Status.pm
> +++ b/src/PVE/API2/LXC/Status.pm
> @@ -221,6 +221,12 @@ __PACKAGE__->register_method({
>              node => get_standard_option('pve-node'),
>              vmid => get_standard_option('pve-vmid', { completion => \&PVE::LXC::complete_ctid_running }),
>              skiplock => get_standard_option('skiplock'),
> +            'overrule-shutdown' => {
> +                description => "Abort any active 'vzshutdown' task by the current user for this CT before stopping",
> +                optional => 1,
> +                type => 'boolean',
> +                default => 0,
> +            }
>          },
>      },
>      returns => {
> @@ -238,10 +244,15 @@ __PACKAGE__->register_method({
>          raise_param_exc({ skiplock => "Only root may use this option." })
>              if $skiplock && $authuser ne 'root@pam';
>
> +        my $overrule_shutdown = extract_param($param, 'overrule-shutdown');
> +
>          die "CT $vmid not running\n" if !PVE::LXC::check_running($vmid);
>
>          if (PVE::HA::Config::vm_is_ha_managed($vmid) && $rpcenv->{type} ne 'ha') {
>
> +            raise_param_exc({ 'overrule-shutdown' => "Not applicable for HA resources." })
> +                if $overrule_shutdown;
> +
>              my $hacmd = sub {
>                  my $upid = shift;
>
> @@ -272,6 +283,11 @@ __PACKAGE__->register_method({
>                  return $rpcenv->fork_worker('vzstop', $vmid, $authuser, $realcmd);
>              };
>
> +            if ($overrule_shutdown) {
> +                my $overruled_tasks = PVE::GuestHelpers::overrule_tasks('vzshutdown', $authuser, $vmid);
> +                syslog('info', "overruled vzshutdown tasks: " . join(", ", $overruled_tasks->@*) . "\n");
> +            };
> +

^ So this part is fine (mostly¹)

>              return PVE::LXC::Config->lock_config($vmid, $lockcmd);

^ Here we lock first, then fork the worker, then do `vm_stop` with the
config lock inherited.

This means that creating multiple shutdown tasks before using one with
override=true could cause the override task to cancel the *first*
ongoing shutdown task, then move on to the `lock_config` call - in the
meantime a second shutdown task acquires this very lock and performs
another long-running shutdown, causing the `override` parameter to be
ineffective.

We should switch the ordering here: first fork the worker, then lock.
(¹ And your new chunk would go into the worker as well)

Unless I'm missing something, but AFAICT the current ordering there is
rather ... bad :-)
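
To make the ordering problem concrete, here is a small toy model of the interleaving described above. It is only a sketch, not PVE code: `ToyNode`, `request`, `stop_with_overrule` and `run_scenario` are hypothetical names. It assumes that a request blocked on the config lock has not forked its worker yet and is therefore invisible to the overrule step, and that waiters get the lock in arrival order (the worst case for the stop request).

```python
class ToyNode:
    """Toy model of one CT's config lock and its list of abortable tasks."""

    def __init__(self, fork_before_lock):
        self.fork_before_lock = fork_before_lock
        self.lock_queue = []  # index 0 holds the lock, the rest are waiting
        self.tasks = []       # forked workers, i.e. tasks an overrule can see

    def _grant_lock(self):
        # With lock-then-fork, a request only becomes a task once it holds the lock.
        if self.lock_queue and not self.fork_before_lock:
            holder = self.lock_queue[0]
            if holder not in self.tasks:
                self.tasks.append(holder)

    def request(self, name):
        if self.fork_before_lock:
            self.tasks.append(name)  # worker exists before it queues for the lock
        self.lock_queue.append(name)
        self._grant_lock()

    def stop_with_overrule(self):
        # Abort every shutdown task that is visible right now ...
        for name in [t for t in self.tasks if t.startswith("shutdown")]:
            self.tasks.remove(name)
            self.lock_queue.remove(name)
        # ... then queue for the lock like everybody else.
        self.request("stop")
        # Shutdown requests still queued here will run before the stop does.
        return [n for n in self.lock_queue if n.startswith("shutdown")]


def run_scenario(fork_before_lock):
    """Two shutdown requests, then a stop with overrule; return survivors."""
    node = ToyNode(fork_before_lock)
    node.request("shutdown-1")
    node.request("shutdown-2")
    return node.stop_with_overrule()


if __name__ == "__main__":
    print("lock, then fork:", run_scenario(False))
    print("fork, then lock:", run_scenario(True))
```

In this model the lock-then-fork order leaves `shutdown-2` queued ahead of the stop, while the fork-then-lock order leaves no shutdown request behind, because both workers already exist when the overrule runs.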