From: Dominik Csapak <d.csapak@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH qemu-server v3 1/3] cleanup: refactor to make cleanup flow consistent
Date: Thu, 26 Feb 2026 14:52:00 +0100 [thread overview]
Message-ID: <20260226140752.1792378-2-d.csapak@proxmox.com> (raw)
In-Reply-To: <20260226140752.1792378-1-d.csapak@proxmox.com>
There are two ways a cleanup can be triggered:
* When a guest is stopped/shutdown via the API, 'vm_stop' calls 'vm_stop_cleanup'.
* When the guest process disconnects from qmeventd, 'qm cleanup' is
called, which in turn also tries to call 'vm_stop_cleanup'.
Both of these happen under a qemu config lock, so there is no direct
race condition that it will be called out of order, but it could happen
that the 'qm cleanup' call happened in addition so cleanup was called
twice. Which could be a problem when the shutdown was called with
'keepActive' which 'qm cleanup' would simply know nothing of and ignore.
Also the post-stop hook might not be triggered in case e.g. a stop-mode
backup was done, since that was only happening via qm cleanup and this
would sometimes detect the now again running guest and abort.
To improve the situation we move the exec_hookscript call at the end
of vm_stop_cleanup. At this point we know the vm is stopped and we still
have the config lock.
To prevent a double cleanup, create a new cleanup flag on vm startup and
only cleanup when this is still there, then delete it at the end.
This means we can now drop the check for clean/guest shutdown to do
that.
Since this can only work for guests started after this logic is
introduced, create an additional flag that we must use the old logic
which will be cleared on reboot. After that the new logic is used.
That flag will be created with a 'preinst' script, so we create it when
we come from an older version before the code is actually used.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
debian/preinst | 16 ++++++++++++++++
src/PVE/CLI/qm.pm | 12 +++++++++---
src/PVE/QemuServer.pm | 14 ++++++++++++++
src/PVE/QemuServer/RunState.pm | 28 ++++++++++++++++++++++++++++
src/test/MigrationTest/QmMock.pm | 16 ++++++++++++++++
5 files changed, 83 insertions(+), 3 deletions(-)
create mode 100755 debian/preinst
diff --git a/debian/preinst b/debian/preinst
new file mode 100755
index 00000000..1697020f
--- /dev/null
+++ b/debian/preinst
@@ -0,0 +1,16 @@
+#!/bin/sh
+
+set -e
+
+case "$1" in
+ upgrade)
+ if dpkg --compare-versions "$2" 'lt' '9.1.5'; then
+ # set use_old_cleanup flag file so the old cleanup code will be used until reboot
+ touch /run/qemu-server/use_old_cleanup
+ fi
+ ;;
+ install|abort-install)
+ ;;
+esac
+
+exit 0
diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index bdae9641..b7bc4d9f 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -1120,12 +1120,18 @@ __PACKAGE__->register_method({
}
}
- if (!$clean || $guest) {
- # vm was shutdown from inside the guest or crashed, doing api cleanup
+ my $can_use_cleanup_flag = PVE::QemuServer::RunState::can_use_cleanup_flag();
+
+ if (!$clean || $guest || $can_use_cleanup_flag) {
+ # either we can use the new mechanism to check if cleanup is done, or
+ # vm was shutdown from inside the guest or crashed
PVE::QemuServer::vm_stop_cleanup($storecfg, $vmid, $conf, 0, 0, 1);
}
- PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'post-stop');
+ if (!$can_use_cleanup_flag) {
+ # if the new cleanup mechanism is used, this will be called from 'vm_stop_cleanup'
+ PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'post-stop');
+ }
$restart = eval { PVE::QemuServer::clear_reboot_request($vmid) };
warn $@ if $@;
diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 545758dc..2f1d5c11 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -5817,6 +5817,8 @@ sub vm_start_nolock {
syslog("info", "VM $vmid started with PID $pid.");
+ PVE::QemuServer::RunState::create_cleanup_flag($vmid);
+
if (defined(my $migrate = $res->{migrate})) {
if ($migrate->{proto} eq 'tcp') {
my $nodename = nodename();
@@ -6124,6 +6126,11 @@ sub cleanup_pci_devices {
sub vm_stop_cleanup {
my ($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes, $noerr) = @_;
+ my $can_use_cleanup_flag = PVE::QemuServer::RunState::can_use_cleanup_flag();
+ if ($can_use_cleanup_flag) {
+ return if !PVE::QemuServer::RunState::cleanup_flag_exists($vmid);
+ }
+
eval {
PVE::QemuServer::QSD::quit($vmid)
if PVE::QemuServer::Helpers::qsd_running_locally($vmid);
@@ -6158,6 +6165,13 @@ sub vm_stop_cleanup {
die $err if !$noerr;
warn $err;
}
+
+ if ($can_use_cleanup_flag) {
+ # if the old cleanup mechanism is in place, this will be called by 'qm cleanup'
+ PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'post-stop');
+ }
+
+ PVE::QemuServer::RunState::clear_cleanup_flag($vmid);
}
# call only in locked context
diff --git a/src/PVE/QemuServer/RunState.pm b/src/PVE/QemuServer/RunState.pm
index 6a5fdbd7..e2bd4d94 100644
--- a/src/PVE/QemuServer/RunState.pm
+++ b/src/PVE/QemuServer/RunState.pm
@@ -6,6 +6,7 @@ use warnings;
use POSIX qw(strftime);
use PVE::Cluster;
+use PVE::File;
use PVE::RPCEnvironment;
use PVE::Storage;
@@ -183,4 +184,31 @@ sub vm_resume {
);
}
+my sub get_cleanup_path {
+ my ($vmid) = @_;
+ return "/var/run/qemu-server/$vmid.cleanup";
+}
+
+sub create_cleanup_flag {
+ my ($vmid) = @_;
+ PVE::File::file_set_contents(get_cleanup_path($vmid), time());
+}
+
+sub clear_cleanup_flag {
+ my ($vmid) = @_;
+ my $path = get_cleanup_path($vmid);
+ unlink $path or $! == POSIX::ENOENT or die "removing cleanup flag for $vmid failed: $!\n";
+}
+
+sub cleanup_flag_exists {
+ my ($vmid) = @_;
+ return -f get_cleanup_path($vmid);
+}
+
+# checks if /run/qemu-server/use_old_cleanup exists that will be created on
+# package update and cleared on bootup so we can be sure the guests were
+# started recently enough
+sub can_use_cleanup_flag {
+ !-f "/run/qemu-server/use_old_cleanup";
+}
1;
diff --git a/src/test/MigrationTest/QmMock.pm b/src/test/MigrationTest/QmMock.pm
index 78be47d3..4d5fa3ac 100644
--- a/src/test/MigrationTest/QmMock.pm
+++ b/src/test/MigrationTest/QmMock.pm
@@ -77,6 +77,22 @@ $qemu_server_helpers_module->mock(
},
);
+my $qemu_server_runstate_module = Test::MockModule->new("PVE::QemuServer::RunState");
+$qemu_server_runstate_module->mock(
+ create_cleanup_flag => sub {
+ return undef;
+ },
+ clear_cleanup_flag => sub {
+ return undef;
+ },
+ cleanup_flag_exists => sub {
+ return 1;
+ },
+ can_use_cleanup_flag => sub {
+ return 1;
+ },
+);
+
our $qemu_server_machine_module = Test::MockModule->new("PVE::QemuServer::Machine");
$qemu_server_machine_module->mock(
get_current_qemu_machine => sub {
--
2.47.3
next prev parent reply other threads:[~2026-02-26 14:07 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-26 13:51 [PATCH qemu-server v3 0/3] improve guest cleanup handling Dominik Csapak
2026-02-26 13:52 ` Dominik Csapak [this message]
2026-02-26 13:52 ` [PATCH qemu-server v3 2/3] qm cleanup: die early when encountering a running stop mode backup Dominik Csapak
2026-02-26 13:52 ` [PATCH qemu-server v3 3/3] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds Dominik Csapak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260226140752.1792378-2-d.csapak@proxmox.com \
--to=d.csapak@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox