Date: Fri, 13 Feb 2026 12:40:17 +0100
From: Fabian Grünbichler
Subject: Re: [PATCH qemu-server v2] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
To: Dominik Csapak, pve-devel@lists.proxmox.com
References: <20260210111612.2017883-1-d.csapak@proxmox.com>
In-Reply-To: <20260210111612.2017883-1-d.csapak@proxmox.com>
Message-Id: <1770982503.z90o85edfu.astroid@yuna.none>
List-Id: Proxmox VE development discussion

On February 10, 2026 12:15 pm, Dominik Csapak wrote:
> When qmeventd detects a vm exiting, it starts 'qm cleanup' to cleanup
> files, executing hookscripts, etc.
>
> Since the vm process exit is sometimes not instant, wait up to 30
> seconds here to start the cleanup process instead of immediately
> aborting if the pid still exists. This prevented executing the hookscript
> on the 'post-stop' phase.
>
> This can be easily reproduced by e.g. passing through a usb device,
> which delays the qemu process exit for a few seconds.
>
> Signed-off-by: Dominik Csapak
> ---
> changes from v1:
> * use correct while condition (time() is always >= $starttime)
>
> original comment:
>
> The 30 second timeout was arbitrarily chosen, but we could probably
> start with something smaller, like 10 seconds? Could be adapted on
> applying though.
>
> In my (short) tests the usb passthrough part only adds a single second,
> but i can imagine different devices on other systems could block it for
> much longer.
>
>  src/PVE/CLI/qm.pm | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
> index bdae9641..16875ed2 100755
> --- a/src/PVE/CLI/qm.pm
> +++ b/src/PVE/CLI/qm.pm
> @@ -1101,8 +1101,19 @@ __PACKAGE__->register_method({
>                  60,
>                  sub {
>                      my $conf = PVE::QemuConfig->load_config($vmid);
> +
> +                    # wait for some timeout until vm process exits, since this might not be instant
> +                    my $timeout = 30;
> +                    my $starttime = time();
>                      my $pid = PVE::QemuServer::check_running($vmid);
> -                    die "vm still running\n" if $pid;
> +                    warn "vm still running - waiting up to $timeout seconds\n" if $pid;
> +
> +                    while ($pid && (time() - $starttime) < $timeout) {
> +                        sleep(1);
> +                        $pid = PVE::QemuServer::check_running($vmid);

nit: this helper is deprecated. This call only ever runs in the "guest
is local" context: we already obtained the lock and loaded the config,
so we know that invariant holds. The new code (and the old line above)
can therefore just use PVE::QemuServer::Helpers::vm_running_locally
instead.

> +                    }
> +
> +                    die "vm still running - aborting cleanup\n" if $pid;
>
>                      # Rollback already does cleanup when preparing and afterwards temporarily drops the
>                      # lock on the configuration file to rollback the volumes. Deactivating volumes here
> --
> 2.47.3
>
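
To illustrate the nit above, here is a rough, standalone sketch of the same
wait loop with the liveness check passed in as a code reference. The helper
name wait_for_exit, the $interval parameter, the stub $fake_check and the
PID 4242 are made up for this example and are not part of qemu-server. That
vm_running_locally can be called with just $vmid is likewise an assumption
here, so please check it against the tree before reusing any of this.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of qemu-server): poll $check until it
# returns a false value or until $timeout seconds have elapsed.
# Returns the last result of $check, i.e. a PID if the process is
# still alive, or a false value if it is gone.
sub wait_for_exit {
    my ($check, $timeout, $interval) = @_;
    $interval //= 1;

    my $starttime = time();
    my $pid = $check->();
    warn "vm still running - waiting up to $timeout seconds\n" if $pid;

    while ($pid && (time() - $starttime) < $timeout) {
        sleep($interval);
        $pid = $check->();
    }

    return $pid;
}

# Stand-in for the real check. In qm.pm this would presumably be
#   sub { PVE::QemuServer::Helpers::vm_running_locally($vmid) }
# (assumption: the helper accepts the vmid alone).
my $remaining_polls = 2;
my $fake_check = sub { return $remaining_polls-- > 0 ? 4242 : undef; };

my $pid = wait_for_exit($fake_check, 30);
die "vm still running - aborting cleanup\n" if $pid;
print "vm process gone, continuing with cleanup\n";
```

In the sketch the stub reports a live process for the first two polls and
then nothing, so the loop ends well before the timeout and cleanup would
proceed. With a check that never turns false, the helper returns the PID
once the timeout has elapsed and the die triggers, matching the behaviour
of the patch.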