From: Dominik Csapak <d.csapak@proxmox.com>
To: Fiona Ebner <f.ebner@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [RFC PATCH qemu-server] cleanup: refactor to make cleanup flow more consistent
Date: Tue, 24 Feb 2026 10:37:34 +0100
Message-ID: <7986269c-5fe9-433c-a1c9-bbba3279cb64@proxmox.com>
In-Reply-To: <28c91ab7-b41d-446e-b31c-5800ced4ad61@proxmox.com>



On 2/24/26 10:30 AM, Fiona Ebner wrote:
> On 23.02.26 at 4:50 PM, Fiona Ebner wrote:
>> On 23.02.26 at 11:56 AM, Dominik Csapak wrote:
>>> There are two ways a cleanup can be triggered:
>>>
>>> * When a guest is stopped/shutdown via the API, 'vm_stop' calls 'vm_stop_cleanup'.
>>> * When the guest process disconnects from qmeventd, 'qm cleanup' is
>>>    called, which in turn also tries to call 'vm_stop_cleanup'.
>>>
>>> Both of these happen under a qemu config lock, so there is no direct
>>> race condition where they would be called out of order, but the
>>> 'qm cleanup' call could still happen in addition, so cleanup would run
>>> twice. That could be a problem when the shutdown was called with
>>> 'keepActive', which 'qm cleanup' knows nothing of and would ignore.
>>>
>>> Also, the post-stop hook might not be triggered, e.g. after a stop-mode
>>> backup, since the hook was only run via 'qm cleanup', which would
>>> detect the guest running again and abort.
>>>
>>> To improve the situation, we move the exec_hookscript call to the end
>>> of vm_stop_cleanup. At this point we know the VM is stopped and we still
>>> have the config lock.
>>>
>>> On _do_vm_stop (and in the one case for migration) a 'cleanup-flag' is
>>> created that marks the VM as cleaned up by the API, so 'qm cleanup'
>>> should not do it again.
>>>
>>> On vm start, this flag is cleared.
> 
> It feels untidy to have something left after cleaning up, even if it's
> just the file indicating that cleanup was done. Maybe we can switch it
> around, see below:
> 
>>>
>>> There is still a tiny race possible:
>>>
>>> a guest is stopped from within (or crashes) and the vm is started again
>>> via the api before 'qm cleanup' can run
>>>
>>> This should be a very rare case though, and all operations via the API
>>> (reboot, shutdown+start, stop-mode backup, etc.) should work as intended.
>>
>> How difficult is it to trigger the race with an HA-managed VM?
>>
>>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>>> ---
>>> I'm not sure how we could possibly eliminate the race i mentioned:
>>> * we can't block on start because i don't think we can sensibly decide between:
>>>    - vm was crashing/powered off
>>>    - vm was never started
>>>
>>> We could maybe leave a 'started' flag somewhere too and clear the
>>> cleanup flag also in 'qm cleanup', then we would start the vm
>>> only when the cleanup flag is cleared
>>> (or have the cleanup flags have multiple states, like 'started',
>>> 'finished')
> 
> We already have something very similar, namely, the PID file. The issue
> is that the PID file is removed automatically by QEMU upon clean
> termination. For our use case we would need a second, more persistent
> file. Then we could solve the issue of duplicate cleanup and the issue
> of starting another instance before cleanup:
> 
> 1. create a flag file at startup with an identifier for the
>     QEMU instance, a second manual PID file?
> 2. at cleanup, check the file:
>     a) if there is no such file, skip, somebody else already cleaned up
>        NOTE: we need to ensure that pre-existing instances are still
>        cleaned up. One possible way would be to create a flag file during
>        host startup and only use the new behavior when that is present.
>     b) if the file exists, check if the QEMU instance is still around. If
>        it is, wait for the instance to be gone until hitting some
>        timeout. Once it's gone, do cleanup.
> 3. make sure to run the post-stop hook whenever we remove the file
> 4. if the file still exists at startup, cleanup was not done yet, wait
>     until some timeout and when hitting the timeout, either proceed with
>     start anyway or suggest running cleanup manually. The latter would be
>     safer, but also worse from a UX standpoint, since cleanup is
>     root-only
> 
> What do you think?

Yes, that seems good to me. I'll play around with that and send a next
version.
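
For illustration only, the flag-file flow from the numbered steps quoted above could be sketched roughly like this. It is written in Python purely as a sketch (qemu-server itself is Perl), and every name in it (`mark_started`, `cleanup`, `may_start`, `pid_alive`, `wait_until`, the flag path) is hypothetical and not taken from the patch. It assumes the flag file simply stores the PID, which on its own is not a unique instance identifier because PIDs can be reused. It also leaves out the config lock and the handling of pre-existing instances mentioned in step 2a.

```python
import os
import time
from pathlib import Path
from typing import Callable


def pid_alive(pid: int) -> bool:
    """Best-effort POSIX check whether a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence/permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def wait_until(cond: Callable[[], bool], timeout: float, interval: float = 0.1) -> bool:
    """Poll cond() until it returns True or the timeout (seconds) expires."""
    deadline = time.monotonic() + timeout
    while True:
        if cond():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def mark_started(flag: Path, pid: int) -> None:
    """Step 1: record the instance at startup (assumption: PID as identifier)."""
    flag.write_text(f"{pid}\n")


def cleanup(
    flag: Path,
    do_cleanup: Callable[[], None],
    post_stop_hook: Callable[[], None],
    is_running: Callable[[int], bool] = pid_alive,
    timeout: float = 5.0,
) -> bool:
    """Step 2: return True if this call performed the cleanup."""
    try:
        pid = int(flag.read_text().strip())
    except FileNotFoundError:
        return False  # 2a: somebody else already cleaned up
    # 2b: wait for the instance to be gone before cleaning up
    if not wait_until(lambda: not is_running(pid), timeout):
        raise TimeoutError(f"instance {pid} still running")
    do_cleanup()
    flag.unlink()
    post_stop_hook()  # step 3: hook runs whenever the file is removed
    return True


def may_start(flag: Path, timeout: float = 5.0) -> bool:
    """Step 4: a leftover flag means cleanup is pending; wait for it to vanish."""
    return wait_until(lambda: not flag.exists(), timeout)
```

In this sketch a second `cleanup` call returns False without doing anything, which covers the duplicate-cleanup case. What to do when `may_start` times out (proceed anyway or ask for a manual cleanup) is left to the caller, since that choice is still open in the discussion.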



