From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dominik Csapak
To: pve-devel@lists.proxmox.com
Subject: [PATCH qemu-server v3 3/3] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
Date: Thu, 26 Feb 2026 14:52:02 +0100
Message-ID: <20260226140752.1792378-4-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260226140752.1792378-1-d.csapak@proxmox.com>
References: <20260226140752.1792378-1-d.csapak@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

When qmeventd detects a VM exiting, it starts 'qm cleanup'. Since the
VM process does not always exit instantly, wait up to 30 seconds for it
to exit before starting the cleanup, instead of aborting immediately if
the PID still exists.

Aborting immediately prevented the hookscript from being executed in
the 'post-stop' phase when either
* the cleanup mechanism is still the old one
* the guest was powered down from inside, not via the API

This can be reproduced by e.g. passing through a USB device, which
delays the QEMU process exit for a few seconds (for most devices).

Signed-off-by: Dominik Csapak
---
 src/PVE/CLI/qm.pm | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index 6aff5b7a..ee3ccedd 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -1101,7 +1101,7 @@ __PACKAGE__->register_method({
                 60,
                 sub {
                     my $conf = PVE::QemuConfig->load_config($vmid);
-                    my $pid = PVE::QemuServer::check_running($vmid);
+                    my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
 
                     # With a stop mode backup, we might run here into a running vm with a backup
                     # lock, but this already did the cleanup and is an expected state, so abort
@@ -1109,7 +1109,19 @@ __PACKAGE__->register_method({
                     die "skipping cleanup - 'backup' lock is present and vm is running again\n"
                         if $pid && $clean && $conf->{lock} && $conf->{lock} eq 'backup';
 
-                    die "vm still running\n" if $pid;
+                    # wait for some time until the QEMU process exits after the QMP
+                    # 'SHUTDOWN' event, since this might not be instant
+                    my $timeout = 30;
+                    my $starttime = time();
+                    warn "QEMU process $pid for VM $vmid still running (or newly started)\n"
+                        if $pid;
+
+                    while ($pid && (time() - $starttime) < $timeout) {
+                        sleep(1);
+                        $pid = PVE::QemuServer::check_running($vmid);
+                    }
+
+                    die "vm still running after timeout - aborting cleanup\n" if $pid;
 
                     # Rollback already does cleanup when preparing and afterwards temporarily drops the
                     # lock on the configuration file to rollback the volumes. Deactivating volumes here
-- 
2.47.3