From: Dominik Csapak
To: pve-devel@lists.proxmox.com
Subject: [PATCH qemu-server v4 3/3] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
Date: Fri, 15 May 2026 12:04:54 +0200
Message-ID: <20260515100842.1980636-4-d.csapak@proxmox.com>
In-Reply-To: <20260515100842.1980636-1-d.csapak@proxmox.com>
References: <20260515100842.1980636-1-d.csapak@proxmox.com>

When qmeventd detects a VM exiting, it starts 'qm cleanup'. Since the
VM process does not always exit instantly, wait up to 30 seconds before
starting the cleanup instead of aborting immediately if the PID still
exists.

The immediate abort prevented the hookscript from running in the
'post-stop' phase when either

* the cleanup mechanism is still the old one
* the guest was powered down from inside, not via the API

This can be reproduced by e.g. passing through a USB device, which
delays the QEMU process exit by a few seconds (for most devices).

Signed-off-by: Dominik Csapak
---
changes from v3:
* adapt to indentation change of previous patch
* use non-deprecated vm_running_locally helper
* improve warning message

 src/PVE/CLI/qm.pm | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index 3c9e8812..a35d9f74 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -1101,7 +1101,7 @@ __PACKAGE__->register_method({
             60,
             sub {
                 my $conf = PVE::QemuConfig->load_config($vmid);
-                my $pid = PVE::QemuServer::check_running($vmid);
+                my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
 
                 if ($pid) {
                     # With a stop mode backup, we might run here into a running vm with a backup
@@ -1110,7 +1110,21 @@ __PACKAGE__->register_method({
                     die "skipping cleanup - 'backup' lock is present and vm is running again\n"
                         if $clean && $conf->{lock} && $conf->{lock} eq 'backup';
 
-                    die "vm still running\n";
+                    # wait for some time until the QEMU process exits after the QMP
+                    # 'SHUTDOWN' event, since this might not be instant
+
+                    my $timeout = 30;
+                    warn
+                        "QEMU process $pid for VM $vmid still running (or newly started), waiting up to $timeout seconds for it to exit\n";
+                    my $starttime = time();
+
+                    while ($pid && (time() - $starttime) < $timeout) {
+                        sleep(1);
+                        $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
+                    }
+
+                    die "aborting cleanup, VM is still running after $timeout seconds\n"
+                        if $pid;
                 }
 
                 # Rollback already does cleanup when preparing and afterwards temporarily drops the
-- 
2.47.3