From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dominik Csapak
To: pve-devel@lists.proxmox.com
Subject: [PATCH qemu-server v5 3/3] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
Date: Fri, 15 May 2026 14:23:06 +0200
Message-ID: <20260515122437.3153051-4-d.csapak@proxmox.com>
In-Reply-To: <20260515122437.3153051-1-d.csapak@proxmox.com>
References: <20260515122437.3153051-1-d.csapak@proxmox.com>
List-Id: Proxmox VE development discussion

When qmeventd detects a VM exiting, it starts 'qm cleanup'. Since the
VM process exit is sometimes not instant, wait up to 30 seconds here
before starting the cleanup process, instead of immediately aborting
if the PID still exists.

The immediate abort prevented the hookscript from being executed in the
'post-stop' phase when either
* the cleanup mechanism is still the old one
* the guest was powered down from inside, not via the API

This can be reproduced by e.g. passing through a USB device, which
delays the QEMU process exit for a few seconds (for most devices).

Signed-off-by: Dominik Csapak
---
 src/PVE/CLI/qm.pm | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index 3c9e8812..6b796440 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -1101,7 +1101,7 @@ __PACKAGE__->register_method({
             60,
             sub {
                 my $conf = PVE::QemuConfig->load_config($vmid);
-                my $pid = PVE::QemuServer::check_running($vmid);
+                my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
 
                 if ($pid) {
                     # With a stop mode backup, we might run here into a running vm with a backup
@@ -1110,7 +1110,25 @@ __PACKAGE__->register_method({
                     die "skipping cleanup - 'backup' lock is present and vm is running again\n"
                         if $clean && $conf->{lock} && $conf->{lock} eq 'backup';
 
-                    die "vm still running\n";
+                    # wait for some time until the QEMU process exits after the QMP
+                    # 'SHUTDOWN' event, since this might not be instant
+
+                    my $timeout = 30;
+                    my $warned = 0;
+                    my $starttime = time();
+
+                    while ($pid && (time() - $starttime) < $timeout) {
+                        if (!$warned && (time() - $starttime) > 10) {
+                            warn
+                                "VM cleanup: QEMU process $pid for VM $vmid still running (or newly started)\n";
+                            $warned = 1;
+                        }
+                        sleep(1);
+                        $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
+                    }
+
+                    die "aborting cleanup, VM is still running after $timeout seconds\n"
+                        if $pid;
                 }
 
                 # Rollback already does cleanup when preparing and afterwards temporarily drops the
-- 
2.47.3