From: Dominik Csapak <d.csapak@proxmox.com>
To: Fiona Ebner <f.ebner@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [RFC PATCH qemu-server] cleanup: refactor to make cleanup flow more consistent
Date: Tue, 24 Feb 2026 11:06:13 +0100
Message-ID: <aff05521-217e-4e0c-8f28-ea1c3b821d96@proxmox.com>
In-Reply-To: <46fd4500-1410-4d9e-98ee-e47bdc42a820@proxmox.com>
On 2/24/26 10:49 AM, Fiona Ebner wrote:
> On 24.02.26 at 10:37 AM, Dominik Csapak wrote:
>> On 2/24/26 10:30 AM, Fiona Ebner wrote:
>>> On 23.02.26 at 4:50 PM, Fiona Ebner wrote:
>>>> On 23.02.26 at 11:56 AM, Dominik Csapak wrote:
>>>>> There are two ways a cleanup can be triggered:
>>>>>
>>>>> * When a guest is stopped/shutdown via the API, 'vm_stop' calls
>>>>>   'vm_stop_cleanup'.
>>>>> * When the guest process disconnects from qmeventd, 'qm cleanup' is
>>>>>   called, which in turn also tries to call 'vm_stop_cleanup'.
>>>>>
>>>>> Both of these happen under a qemu config lock, so they cannot
>>>>> directly race and run out of order. However, 'qm cleanup' could
>>>>> still run in addition, so that cleanup is done twice. This can be
>>>>> a problem when the shutdown was called with 'keepActive', which
>>>>> 'qm cleanup' knows nothing about and therefore ignores.
>>>>>
>>>>> Also, the post-stop hook might not be triggered, e.g. after a
>>>>> stop-mode backup, since it was only run via 'qm cleanup', which
>>>>> would detect the guest running again and abort.
>>>>>
>>>>> To improve the situation, we move the exec_hookscript call to the
>>>>> end of vm_stop_cleanup. At this point we know the vm is stopped
>>>>> and we still have the config lock.
>>>>>
>>>>> On _do_vm_stop (and in the one case for migration) a 'cleanup-flag'
>>>>> is created that marks the vm as already cleaned up by the api, so
>>>>> 'qm cleanup' should not do it again.
>>>>>
>>>>> On vm start, this flag is cleared.
>>>
>>> It feels untidy to have something left after cleaning up, even if it's
>>> just the file indicating that cleanup was done. Maybe we can switch it
>>> around, see below:
>>>
>>>>>
>>>>> There is still a tiny race possible:
>>>>>
>>>>> a guest is stopped from within (or crashes) and the vm is started again
>>>>> via the api before 'qm cleanup' can run
>>>>>
>>>>> This should be a very rare case though, and all operations via the API
>>>>> (reboot, shutdown+start, stop-mode backup, etc.) should work as
>>>>> intended.
>>>>
>>>> How difficult is it to trigger the race with an HA-managed VM?
>>>>
>>>>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>>>>> ---
>>>>> I'm not sure how we could possibly eliminate the race I mentioned:
>>>>> * we can't block on start because I don't think we can sensibly
>>>>>   decide between:
>>>>>   - vm crashed/was powered off
>>>>>   - vm was never started
>>>>>
>>>>> We could maybe leave a 'started' flag somewhere too and clear the
>>>>> cleanup flag also in 'qm cleanup', then we would start the vm
>>>>> only when the cleanup flag is cleared
>>>>> (or let the cleanup flag have multiple states, like 'started',
>>>>> 'finished')
>>>
>>> We already have something very similar, namely, the PID file. The issue
>>> is that the PID file is removed automatically by QEMU upon clean
>>> termination. For our use case we would need a second, more persistent
>>> file. Then we could solve the issue of duplicate cleanup and the issue
>>> of starting another instance before cleanup:
>>>
>>> 1. create a flag file at startup with an identifier for the
>>>    QEMU instance, a second manual PID file?
>>> 2. at cleanup, check the file:
>>>    a) if there is no such file, skip, somebody else already cleaned up
>>>       NOTE: we need to ensure that pre-existing instances are still
>>>       cleaned up. One possible way would be to create a flag file during
>>>       host startup and only use the new behavior when that is present.
>>>    b) if the file exists, check if the QEMU instance is still around. If
>>>       it is, wait for the instance to be gone until hitting some
>>>       timeout. Once it's gone, do cleanup.
>>> 3. make sure to run the post-stop hook whenever we remove the file
>>> 4. if the file still exists at startup, cleanup was not done yet, wait
>>>    until some timeout and when hitting the timeout, either proceed with
>>>    start anyway or suggest running cleanup manually. The latter would be
>>>    safer, but also worse from a UX standpoint, since cleanup is
>>>    root-only
>>>
>>> What do you think?
>>
>> Yes, that seems good to me. I'll play around with that and send a next
>> version.
>
> Ah, one more thing. With stop mode backup, we'd still run into the issue
> that the cleanup triggered by qmeventd might run into a newly started
> instance and then wait around for nothing. We already skip cleanup if we
> detect the 'rollback' lock since we know rollback does its own cleanup.
> I think we can do the same for the backup lock (if shutdown was clean),
> since we know stop mode backup does its own cleanup too. And it might be
> better to do warn+return instead of die, since the situation is not
> really unexpected (the one for rollback could be adapted too).
Sure, makes sense, but I'd split that into a separate patch.
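
For illustration, the flag-file scheme from the quoted numbered list (steps
1 to 4) could look roughly like the following. This is only a minimal Python
sketch of the idea, not qemu-server code (which is Perl). The function names,
the flag file path, the use of a bare PID as the instance identifier and the
timeout values are all assumptions made up for the example. The config lock
that the real code would hold, and the handling of pre-existing instances
from the NOTE in step 2a, are left out.

```python
import os
import time
from pathlib import Path
from typing import Callable


def pid_alive(pid: int) -> bool:
    """Probe whether a process with this PID exists (signal 0 sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def mark_started(flag: Path, pid: int) -> None:
    """Step 1: at startup, record an identifier for the instance (here: its PID)."""
    flag.write_text(f"{pid}\n")


def cleanup(
    flag: Path,
    run_cleanup: Callable[[], None],
    run_post_stop_hook: Callable[[], None],
    timeout: float = 5.0,
    poll: float = 0.1,
    is_alive: Callable[[int], bool] = pid_alive,
) -> bool:
    """Steps 2 and 3. Returns True if this call performed the cleanup."""
    try:
        pid = int(flag.read_text())
    except FileNotFoundError:
        return False  # 2a: no flag file, somebody else already cleaned up

    # 2b: wait for the instance to be gone, up to the timeout
    deadline = time.monotonic() + timeout
    while is_alive(pid):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"instance {pid} still running, not cleaning up")
        time.sleep(poll)

    run_cleanup()
    run_post_stop_hook()  # step 3: the hook runs whenever the file is removed
    flag.unlink()
    return True


def wait_for_cleanup(flag: Path, timeout: float = 5.0, poll: float = 0.1) -> bool:
    """Step 4: before a new start, wait until a pending cleanup removed the file.

    Returns False on timeout; the caller then decides whether to start anyway
    or to ask for a manual cleanup.
    """
    deadline = time.monotonic() + timeout
    while flag.exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

With this shape, a second cleanup call for the same instance returns early
because the file is already gone, and a start that finds the file still
present knows that cleanup is pending. A bare PID can be reused by an
unrelated process, so a real identifier would likely need more than that.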