From: Fiona Ebner <f.ebner@proxmox.com>
To: "Dominik Csapak" <d.csapak@proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
	pve-devel@lists.proxmox.com
Subject: Re: [PATCH qemu-server v2] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
Date: Thu, 19 Feb 2026 14:27:09 +0100
Message-ID: <e7a2a15b-e992-4824-8330-c89b2bbdc9f3@proxmox.com>
In-Reply-To: <d022adf4-3bde-4440-b30e-8990592f13db@proxmox.com>

On 19.02.26 at 11:15 AM, Dominik Csapak wrote:
> On 2/16/26 10:15 AM, Fiona Ebner wrote:
>> On 16.02.26 at 9:42 AM, Fabian Grünbichler wrote:
>>> On February 13, 2026 2:16 pm, Fiona Ebner wrote:
>>>> On 13.02.26 at 1:20 PM, Fabian Grünbichler wrote:
>>>>> On February 13, 2026 1:14 pm, Fiona Ebner wrote:
>>>>>> On 10.02.26 at 12:14 PM, Dominik Csapak wrote:
>>>>>>> +                my $timeout = 30;
>>>>>>> +                my $starttime = time();
>>>>>>>                   my $pid = PVE::QemuServer::check_running($vmid);
>>>>>>> -                die "vm still running\n" if $pid;
>>>>>>> +                warn "vm still running - waiting up to $timeout
>>>>>>> seconds\n" if $pid;
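
For context, the rest of the hunk presumably continues with a poll loop,
roughly like the following sketch (not the actual patch code, just to
illustrate the 30 second wait):

    # poll check_running() until the process is gone or the timeout hits
    while ($pid && time() - $starttime < $timeout) {
        sleep(1);
        $pid = PVE::QemuServer::check_running($vmid);
    }
    die "vm still running\n" if $pid;
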
>>>>>>
>>>>>> While we're at it, we could improve the message here. Something like
>>>>>> 'QEMU process $pid for VM $vmid still running (or newly started)'
>>>>>> Having the PID is nice info for developers/support engineers, and
>>>>>> the case where a new instance is started before the cleanup was
>>>>>> done is also possible.
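
Spelled out, the suggestion would be something like (just a sketch,
combining the suggested wording with the patch's timeout message):

    warn "QEMU process $pid for VM $vmid still running (or newly started)"
        . " - waiting up to $timeout seconds\n"
        if $pid;
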
>>>>>>
>>>>>> In fact, the case with the new instance is easily triggered by
>>>>>> 'stop' mode backups. Maybe we should fix that up first before
>>>>>> adding a timeout here?
>>>>>>
>>>>>> Feb 13 13:09:48 pve9a1 qm[92975]: <root@pam> end task UPID:pve9a1:00016B30:000CDF80:698F1485:qmshutdown:102:root@pam: OK
>>>>>> Feb 13 13:09:48 pve9a1 systemd[1]: Started 102.scope.
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]: Starting cleanup for 102
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]: trying to acquire lock...
>>>>>> Feb 13 13:09:48 pve9a1 vzdump[92895]: VM 102 started with PID 93116.
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]:  OK
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]: vm still running
>>>>>
>>>>> does this mean we should actually have some sort of mechanism
>>>>> similar to the reboot flag to indicate a pending cleanup, and
>>>>> block/delay starts if it is still set?
>>>>
>>>> Blocking/delaying starts is not what happens for the reboot flag/file:
>>>
>>> that's not what I meant, the similarity was just "have a flag", not
>>> "have a flag that behaves identically" ;)
>>>
>>> my proposal was:
>>> - add a flag that indicates cleanup is pending (similar to reboot is
>>>    pending)
>>> - *handle that flag* in the start flow to wait for the cleanup to be
>>>    done before starting
>>
>> Shouldn't we change the reboot flag to also do this?
>>
>>>>> Feb 13 14:00:16 pve9a1 qm[124470]: <root@pam> starting task UPID:pve9a1:0001E639:001180FE:698F2060:qmreboot:102:root@pam:
>>>>> Feb 13 14:00:16 pve9a1 qm[124472]: <root@pam> starting task UPID:pve9a1:0001E63A:0011811E:698F2060:qmstart:102:root@pam:
>>>>> Feb 13 14:00:16 pve9a1 qm[124474]: start VM 102: UPID:pve9a1:0001E63A:0011811E:698F2060:qmstart:102:root@pam:
>>>>> [...]
>>>>> Feb 13 14:00:22 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
>>>>> Feb 13 14:00:22 pve9a1 systemd[1]: 102.scope: Consumed 2min 3.333s CPU time, 2G memory peak.
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]: Starting cleanup for 102
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]: trying to acquire lock...
>>>>> Feb 13 14:00:23 pve9a1 qm[124470]: <root@pam> end task UPID:pve9a1:0001E639:001180FE:698F2060:qmreboot:102:root@pam: OK
>>>>> Feb 13 14:00:23 pve9a1 systemd[1]: Started 102.scope.
>>>>> Feb 13 14:00:23 pve9a1 qm[124474]: VM 102 started with PID 124620.
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]:  OK
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]: vm still running
>>>>
>>>> Currently, it's just indicating whether the cleanup handler should
>>>> start the VM again afterwards.
>>>>
>>>> Am 13.02.26 um 1:22 PM schrieb Dominik Csapak:
>>>>> Sounds good, one possibility would be to do no cleanup at all when
>>>>> doing a stop mode backup?
>>>>> We already know we'll need the resources (pid/socket/etc. files,
>>>>> vgpus, ...) again?
>>>>>
>>>>> Or is there some situation where that might not be the case?
>>>>
>>>> We do it for reboot (unless another start task sneaks in, like in my
>>>> example above), and off the top of my head, I don't see a good reason
>>>> why 'stop' mode backup should behave differently from a reboot (for
>>>> running VMs). It even applies pending changes just like a reboot
>>>> right now.
>>>
>>> but what about external callers doing something like:
>>>
>>> - stop
>>> - do whatever
>>> - start
>>>
>>> in rapid (automated) succession? those would still (possibly) trigger
>>> cleanup after "doing whatever" and starting the VM again already? and
>>> in particular, if we skip cleanup for "our" cases of stop;start, it
>>> will be easy to introduce side effects in cleanup that break such
>>> usage?
>>
>> I did not argue for skipping cleanup. I argued for being consistent
>> with reboot, where we (try to) do cleanup. I just wasn't sure it's
>> really needed.
>>
>>>> I'm not sure if there is an actual need to do cleanup or if we could
>>
>> I guess the actual need is to have more consistent behavior.
>>
> 
> ok so i think we'd need to
> * create a cleanup flag for each vm when qmeventd detects a vm shutting
> down (in /var/run/qemu-server/VMID.cleanup, possibly with a timestamp)
> * remove that cleanup flag after cleanup (obviously)
> * on start, check for that flag and block for some timeout before
> starting (e.g. check the timestamp in the flag; if it's older than some
> threshold, start it regardless?)

Sounds good to me.
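
To make the proposal a bit more concrete, a rough sketch of how the flag
handling could look (helper names made up, untested; if the flag has to be
created by qmeventd itself rather than by 'qm cleanup', it'd be the C
equivalent there):

    use PVE::Tools;

    my $cleanup_flag = sub {
        my ($vmid) = @_;
        return "/var/run/qemu-server/$vmid.cleanup";
    };

    # cleanup side: create the flag when the shutdown is detected, remove
    # it again once the cleanup is done
    sub mark_cleanup_pending {
        my ($vmid) = @_;
        PVE::Tools::file_set_contents($cleanup_flag->($vmid), time());
    }

    sub clear_cleanup_pending {
        my ($vmid) = @_;
        unlink($cleanup_flag->($vmid));
    }

    # start side: wait for a pending cleanup, but don't block forever on a
    # stale flag
    sub wait_for_pending_cleanup {
        my ($vmid, $timeout) = @_;
        my $flag = $cleanup_flag->($vmid);
        for (my $i = 0; $i < $timeout && -e $flag; $i++) {
            my $created = eval { int(PVE::Tools::file_get_contents($flag)) } // 0;
            last if $created && time() - $created > 60; # stale, start regardless
            sleep(1);
        }
    }

The start path would then call wait_for_pending_cleanup() before actually
starting the VM.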

Unfortunately, something else: it turns out that we kind of rely on
qmeventd not doing the cleanup for the optimization of keeping the
volumes active (i.e. $keepActive). And in practice, whether the
optimization applies is random, depending on who wins the race.

The output below is with an added log line
"doing cleanup for $vmid with keepActive=$keepActive"
in vm_stop_cleanup(), to be able to see what happens.
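
That is, something along the lines of (assuming syslog() from
PVE::SafeSyslog is available there; the exact form doesn't matter):

    syslog('info', "doing cleanup for $vmid with keepActive=" . ($keepActive // 0));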

We try to use the optimization but qmeventd interferes:

> Feb 19 14:09:43 pve9a1 vzdump[168878]: <root@pam> starting task UPID:pve9a1:000293AF:0017CFF8:69970B97:vzdump:102:root@pam:
> Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: starting new backup job: vzdump 102 --storage pbs --mode stop
> Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: Starting Backup of VM 102 (qemu)
> Feb 19 14:09:44 pve9a1 qm[168960]: shutdown VM 102: UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
> Feb 19 14:09:44 pve9a1 qm[168959]: <root@pam> starting task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
> Feb 19 14:09:47 pve9a1 qm[168960]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
> Feb 19 14:09:50 pve9a1 qmeventd[166736]: read: Connection reset by peer
> Feb 19 14:09:50 pve9a1 pvedaemon[166884]: <root@pam> end task UPID:pve9a1:000290CD:0017B515:69970B52:vncproxy:102:root@pam: OK
> Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
> Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Consumed 41.780s CPU time, 1.9G memory peak.
> Feb 19 14:09:51 pve9a1 qm[168960]: doing cleanup for 102 with keepActive=1
> Feb 19 14:09:51 pve9a1 qm[168959]: <root@pam> end task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam: OK
> Feb 19 14:09:51 pve9a1 qmeventd[168986]: Starting cleanup for 102
> Feb 19 14:09:51 pve9a1 qm[168986]: doing cleanup for 102 with keepActive=0
> Feb 19 14:09:51 pve9a1 qmeventd[168986]: Finished cleanup for 102
> Feb 19 14:09:51 pve9a1 systemd[1]: Started 102.scope.
> Feb 19 14:09:51 pve9a1 vzdump[168879]: VM 102 started with PID 169021.

We manage to get the optimization:

> Feb 19 14:16:01 pve9a1 qm[174585]: shutdown VM 102: UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam:
> Feb 19 14:16:04 pve9a1 qm[174585]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
> Feb 19 14:16:07 pve9a1 qmeventd[166736]: read: Connection reset by peer
> Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
> Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Consumed 46.363s CPU time, 2G memory peak.
> Feb 19 14:16:08 pve9a1 qm[174585]: doing cleanup for 102 with keepActive=1
> Feb 19 14:16:08 pve9a1 qm[174582]: <root@pam> end task UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam: OK
> Feb 19 14:16:08 pve9a1 systemd[1]: Started 102.scope.
> Feb 19 14:16:08 pve9a1 qmeventd[174685]: Starting cleanup for 102
> Feb 19 14:16:08 pve9a1 qmeventd[174685]: trying to acquire lock...
> Feb 19 14:16:08 pve9a1 vzdump[174326]: VM 102 started with PID 174718.
> Feb 19 14:16:08 pve9a1 qmeventd[174685]:  OK
> Feb 19 14:16:08 pve9a1 qmeventd[174685]: vm still running

For a regular shutdown, we'll also do the cleanup twice.

Maybe we also need a way to tell qmeventd that we already did the cleanup?
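
One option, building on the flag idea from above (rough sketch, reusing
the hypothetical $cleanup_flag helper): let vm_stop_cleanup() remove the
pending-cleanup flag once it is done, and have the qmeventd-triggered
cleanup bail out early if the flag is already gone:

    # in the qmeventd-triggered 'qm cleanup' path
    if (!-e $cleanup_flag->($vmid)) {
        # someone (e.g. vm_stop_cleanup() run by the shutdown task) already
        # did the cleanup in the meantime, so there is nothing left to do
        return;
    }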



