From: Fiona Ebner <f.ebner@proxmox.com>
To: "Dominik Csapak" <d.csapak@proxmox.com>,
	"Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
	pve-devel@lists.proxmox.com
Subject: Re: [PATCH qemu-server v2] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
Date: Thu, 19 Feb 2026 14:27:09 +0100	[thread overview]
Message-ID: <e7a2a15b-e992-4824-8330-c89b2bbdc9f3@proxmox.com> (raw)
In-Reply-To: <d022adf4-3bde-4440-b30e-8990592f13db@proxmox.com>

On 19.02.26 at 11:15 AM, Dominik Csapak wrote:
> On 2/16/26 10:15 AM, Fiona Ebner wrote:
>> On 16.02.26 at 9:42 AM, Fabian Grünbichler wrote:
>>> On February 13, 2026 2:16 pm, Fiona Ebner wrote:
>>>> On 13.02.26 at 1:20 PM, Fabian Grünbichler wrote:
>>>>> On February 13, 2026 1:14 pm, Fiona Ebner wrote:
>>>>>> On 10.02.26 at 12:14 PM, Dominik Csapak wrote:
>>>>>>> +                my $timeout = 30;
>>>>>>> +                my $starttime = time();
>>>>>>>                   my $pid = PVE::QemuServer::check_running($vmid);
>>>>>>> -                die "vm still running\n" if $pid;
>>>>>>> +                warn "vm still running - waiting up to $timeout seconds\n" if $pid;
>>>>>>
>>>>>> While we're at it, we could improve the message here. Something like
>>>>>> 'QEMU process $pid for VM $vmid still running (or newly started)'
>>>>>> Having the PID is nice info for developers/support engineers and the
>>>>>> case where a new instance is started before the cleanup was done is
>>>>>> also possible.
>>>>>>
>>>>>> In fact, the case with the new instance is easily triggered by 'stop'
>>>>>> mode backups. Maybe we should fix that up first before adding a
>>>>>> timeout here?
>>>>>>
>>>>>> Feb 13 13:09:48 pve9a1 qm[92975]: <root@pam> end task UPID:pve9a1:00016B30:000CDF80:698F1485:qmshutdown:102:root@pam: OK
>>>>>> Feb 13 13:09:48 pve9a1 systemd[1]: Started 102.scope.
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]: Starting cleanup for 102
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]: trying to acquire lock...
>>>>>> Feb 13 13:09:48 pve9a1 vzdump[92895]: VM 102 started with PID 93116.
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]:  OK
>>>>>> Feb 13 13:09:48 pve9a1 qmeventd[93079]: vm still running
>>>>>
>>>>> does this mean we should actually have some sort of mechanism similar
>>>>> to the reboot flag to indicate a pending cleanup, and block/delay
>>>>> starts if it is still set?
>>>>
>>>> Blocking/delaying starts is not what happens for the reboot flag/file:
>>>
>>> that's not what I meant, the similarity was just "have a flag", not
>>> "have a flag that behaves identical" ;)
>>>
>>> my proposal was:
>>> - add a flag that indicates cleanup is pending (similar to reboot is
>>>    pending)
>>> - *handle that flag* in the start flow to wait for the cleanup to be
>>>    done before starting
>>
>> Shouldn't we change the reboot flag to also do this?
>>
>>>>> Feb 13 14:00:16 pve9a1 qm[124470]: <root@pam> starting task UPID:pve9a1:0001E639:001180FE:698F2060:qmreboot:102:root@pam:
>>>>> Feb 13 14:00:16 pve9a1 qm[124472]: <root@pam> starting task UPID:pve9a1:0001E63A:0011811E:698F2060:qmstart:102:root@pam:
>>>>> Feb 13 14:00:16 pve9a1 qm[124474]: start VM 102: UPID:pve9a1:0001E63A:0011811E:698F2060:qmstart:102:root@pam:
>>>>> [...]
>>>>> Feb 13 14:00:22 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
>>>>> Feb 13 14:00:22 pve9a1 systemd[1]: 102.scope: Consumed 2min 3.333s CPU time, 2G memory peak.
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]: Starting cleanup for 102
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]: trying to acquire lock...
>>>>> Feb 13 14:00:23 pve9a1 qm[124470]: <root@pam> end task
>>>>> UPID:pve9a1:0001E639:001180FE:698F2060:qmreboot:102:root@pam: OK
>>>>> Feb 13 14:00:23 pve9a1 systemd[1]: Started 102.scope.
>>>>> Feb 13 14:00:23 pve9a1 qm[124474]: VM 102 started with PID 124620.
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]:  OK
>>>>> Feb 13 14:00:23 pve9a1 qmeventd[124565]: vm still running
>>>>
>>>> Currently, it's just indicating whether the cleanup handler should
>>>> start the VM again afterwards.
>>>>
>>>> On 13.02.26 at 1:22 PM, Dominik Csapak wrote:
>>>>> Sounds good, one possibility would be to do no cleanup at all when
>>>>> doing a stop mode backup?
>>>>> We already know we'll need the resources (pid/socket/etc. files,
>>>>> vgpus,...) again?
>>>>>
>>>>> Or is there some situation where that might not be the case?
>>>>
>>>> We do it for reboot (unless another start task sneaks in like in my
>>>> example above), and I don't see a good reason off the top of my head
>>>> why 'stop' mode backup should behave differently from a reboot (for
>>>> running VMs). It even applies pending changes just like a reboot
>>>> right now.
>>>
>>> but what about external callers doing something like:
>>>
>>> - stop
>>> - do whatever
>>> - start
>>>
>>> in rapid (automated) succession? Those would still (possibly) trigger
>>> cleanup after "doing whatever" and after the VM was already started
>>> again? And in particular, if we skip cleanup for "our" cases of
>>> stop;start, it will be easy to introduce side effects in cleanup that
>>> break such usage?
>>
>> I did not argue for skipping cleanup. I argued for being consistent with
>> reboot where we (try to) do cleanup. I just wasn't sure it's really
>> needed.
>>
>>>> I'm not sure if there is an actual need to do cleanup or if we could
>>
>> I guess the actual need is to have more consistent behavior.
>>
> 
> OK, so I think we'd need to
> * create a cleanup flag for each VM when qmeventd detects a VM shutting
> down (in /var/run/qemu-server/VMID.cleanup, possibly with a timestamp)
> * remove that cleanup flag after cleanup (obviously)
> * on start, check for that flag and block for some timeout before
> starting (e.g. if the timestamp in the flag is older than some limit,
> start regardless?)

Sounds good to me.
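
For illustration, here is a rough sketch of how the start-side check for
such a flag could look (the helper name and the stale-flag cutoff are made
up, and it uses the flag file's mtime instead of a stored timestamp for
simplicity - this is not existing qemu-server code):

use strict;
use warnings;

# Hypothetical helper: wait for a pending qmeventd cleanup before starting.
sub wait_for_pending_cleanup {
    my ($vmid, $timeout) = @_;
    $timeout //= 30;

    my $flag = "/var/run/qemu-server/${vmid}.cleanup";
    return if !-e $flag; # no cleanup pending, start right away

    # treat a very old flag as stale (e.g. qmeventd died) and start anyway
    my $written = (stat($flag))[9] // 0;
    return if time() - $written > 300;

    # otherwise block until the flag is gone or the timeout expires
    my $deadline = time() + $timeout;
    while (-e $flag && time() < $deadline) {
        sleep(1);
    }
    warn "cleanup for VM $vmid still pending after ${timeout}s - starting anyway\n"
        if -e $flag;
}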

Unfortunately, something else: it turns out that we kind of rely on
qmeventd not doing the cleanup for the optimization of keeping the volumes
active (i.e. $keepActive). And in practice, whether the optimization
applies is random, depending on who wins the race.

Below is the output with an added log line
"doing cleanup for $vmid with keepActive=$keepActive"
in vm_stop_cleanup(), to be able to see what happens.
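
(The following only illustrates where that debug line went; the parameter
list of vm_stop_cleanup() is abbreviated and assumed here, the real
signature may differ.)

sub vm_stop_cleanup {
    my ($storecfg, $vmid, $conf, $keepActive) = @_; # real signature may differ
    warn "doing cleanup for $vmid with keepActive=" . ($keepActive // 0) . "\n";
    # ... existing cleanup code ...
}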

We try to use the optimization, but qmeventd interferes:

> Feb 19 14:09:43 pve9a1 vzdump[168878]: <root@pam> starting task UPID:pve9a1:000293AF:0017CFF8:69970B97:vzdump:102:root@pam:
> Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: starting new backup job: vzdump 102 --storage pbs --mode stop
> Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: Starting Backup of VM 102 (qemu)
> Feb 19 14:09:44 pve9a1 qm[168960]: shutdown VM 102: UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
> Feb 19 14:09:44 pve9a1 qm[168959]: <root@pam> starting task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
> Feb 19 14:09:47 pve9a1 qm[168960]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
> Feb 19 14:09:50 pve9a1 qmeventd[166736]: read: Connection reset by peer
> Feb 19 14:09:50 pve9a1 pvedaemon[166884]: <root@pam> end task UPID:pve9a1:000290CD:0017B515:69970B52:vncproxy:102:root@pam: OK
> Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
> Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Consumed 41.780s CPU time, 1.9G memory peak.
> Feb 19 14:09:51 pve9a1 qm[168960]: doing cleanup for 102 with keepActive=1
> Feb 19 14:09:51 pve9a1 qm[168959]: <root@pam> end task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam: OK
> Feb 19 14:09:51 pve9a1 qmeventd[168986]: Starting cleanup for 102
> Feb 19 14:09:51 pve9a1 qm[168986]: doing cleanup for 102 with keepActive=0
> Feb 19 14:09:51 pve9a1 qmeventd[168986]: Finished cleanup for 102
> Feb 19 14:09:51 pve9a1 systemd[1]: Started 102.scope.
> Feb 19 14:09:51 pve9a1 vzdump[168879]: VM 102 started with PID 169021.

We manage to get the optimization:

> Feb 19 14:16:01 pve9a1 qm[174585]: shutdown VM 102: UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam:
> Feb 19 14:16:04 pve9a1 qm[174585]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
> Feb 19 14:16:07 pve9a1 qmeventd[166736]: read: Connection reset by peer
> Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
> Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Consumed 46.363s CPU time, 2G memory peak.
> Feb 19 14:16:08 pve9a1 qm[174585]: doing cleanup for 102 with keepActive=1
> Feb 19 14:16:08 pve9a1 qm[174582]: <root@pam> end task UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam: OK
> Feb 19 14:16:08 pve9a1 systemd[1]: Started 102.scope.
> Feb 19 14:16:08 pve9a1 qmeventd[174685]: Starting cleanup for 102
> Feb 19 14:16:08 pve9a1 qmeventd[174685]: trying to acquire lock...
> Feb 19 14:16:08 pve9a1 vzdump[174326]: VM 102 started with PID 174718.
> Feb 19 14:16:08 pve9a1 qmeventd[174685]:  OK
> Feb 19 14:16:08 pve9a1 qmeventd[174685]: vm still running

For a regular shutdown, we also end up doing the cleanup twice.

Maybe we also need a way to tell qmeventd that we already did the cleanup?
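
One possible shape for that, purely as a sketch (marker file name and
helpers are made up, and a real version would also need proper locking):
the stop/shutdown path records that it already ran the cleanup, and the
qmeventd-triggered cleanup checks the marker first.

use strict;
use warnings;

# Hypothetical marker file - not existing qemu-server code.
sub cleanup_done_marker { return "/var/run/qemu-server/$_[0].cleanup-done"; }

# stop/shutdown path: record that we already ran vm_stop_cleanup() ourselves
sub mark_cleanup_done {
    my ($vmid) = @_;
    open(my $fh, '>', cleanup_done_marker($vmid))
        or die "cannot write cleanup marker: $!\n";
    print $fh time(), "\n";
    close($fh);
}

# qmeventd-triggered cleanup: skip the second cleanup if the marker is fresh
sub cleanup_already_done {
    my ($vmid) = @_;
    my $marker = cleanup_done_marker($vmid);
    return 0 if !-e $marker;
    my $age = time() - ((stat($marker))[9] // 0);
    unlink($marker);
    return $age < 60; # only trust a recent marker
}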




Thread overview: 14+ messages
2026-02-10 11:15 Dominik Csapak
2026-02-12 20:33 ` Benjamin McGuire
2026-02-13 11:40 ` Fabian Grünbichler
2026-02-13 12:14 ` Fiona Ebner
2026-02-13 12:20   ` Fabian Grünbichler
2026-02-13 13:16     ` Fiona Ebner
2026-02-16  8:42       ` Fabian Grünbichler
2026-02-16  9:15         ` Fiona Ebner
2026-02-19 10:15           ` Dominik Csapak
2026-02-19 13:27             ` Fiona Ebner [this message]
2026-02-20  9:36               ` Dominik Csapak
2026-02-20 14:30                 ` Fiona Ebner
2026-02-20 14:51                   ` Dominik Csapak
2026-02-13 12:22   ` Dominik Csapak
