From: Fiona Ebner
To: Dominik Csapak, Fabian Grünbichler, pve-devel@lists.proxmox.com
Date: Mon, 23 Feb 2026 11:27:27 +0100
Subject: Re: [PATCH qemu-server v2] fix #7119: qm cleanup: wait for process exiting for up to 30 seconds
Message-ID: <14d42ac4-634a-468b-8b5e-4e32fa823564@proxmox.com>

On 20.02.26 at 3:51 PM, Dominik Csapak wrote:
> On 2/20/26 3:30 PM, Fiona Ebner wrote:
>> On 20.02.26 at 10:36 AM, Dominik Csapak wrote:
>>> On 2/19/26 2:27 PM, Fiona Ebner wrote:
>>>> On 19.02.26 at 11:15 AM, Dominik Csapak wrote:
>>>>> On 2/16/26 10:15 AM, Fiona Ebner wrote:
>>>>>> On 16.02.26 at 9:42 AM, Fabian Grünbichler wrote:
>>>>>>> On February 13, 2026 2:16 pm, Fiona Ebner wrote:
>>>>>>
>>>>>> I guess the actual need is to have more consistent behavior.
>>>>>>
>>>>>
>>>>> ok so i think we'd need to
>>>>> * create a cleanup flag for each vm when qmeventd detects a vm shutting
>>>>>   down (in /var/run/qemu-server/VMID.cleanup, possibly with timestamp)
>>>>> * remove that cleanup flag after cleanup (obviously)
>>>>> * on start, check for that flag and block for some timeout before
>>>>>   starting (e.g. check the timestamp in the flag if it's longer than
>>>>>   some time, start it regardless?)
>>>>
>>>> Sounds good to me.
>>>>
>>>> Unfortunately, something else: turns out that we kinda rely on qmeventd
>>>> not doing the cleanup for the optimization with keeping the volumes
>>>> active (i.e. $keepActive).
>>>> And actually, the optimization applies randomly depending on who wins
>>>> the race.
>>>>
>>>> Output below with added log line
>>>> "doing cleanup for $vmid with keepActive=$keepActive"
>>>> in vm_stop_cleanup() to be able to see what happens.
>>>>
>>>> We try to use the optimization but qmeventd interferes:
>>>>
>>>>> Feb 19 14:09:43 pve9a1 vzdump[168878]: starting task UPID:pve9a1:000293AF:0017CFF8:69970B97:vzdump:102:root@pam:
>>>>> Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: starting new backup job: vzdump 102 --storage pbs --mode stop
>>>>> Feb 19 14:09:43 pve9a1 vzdump[168879]: INFO: Starting Backup of VM 102 (qemu)
>>>>> Feb 19 14:09:44 pve9a1 qm[168960]: shutdown VM 102: UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
>>>>> Feb 19 14:09:44 pve9a1 qm[168959]: starting task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam:
>>>>> Feb 19 14:09:47 pve9a1 qm[168960]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
>>>>> Feb 19 14:09:50 pve9a1 qmeventd[166736]: read: Connection reset by peer
>>>>> Feb 19 14:09:50 pve9a1 pvedaemon[166884]: end task UPID:pve9a1:000290CD:0017B515:69970B52:vncproxy:102:root@pam: OK
>>>>> Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
>>>>> Feb 19 14:09:50 pve9a1 systemd[1]: 102.scope: Consumed 41.780s CPU time, 1.9G memory peak.
>>>>> Feb 19 14:09:51 pve9a1 qm[168960]: doing cleanup for 102 with keepActive=1
>>>>> Feb 19 14:09:51 pve9a1 qm[168959]: end task UPID:pve9a1:00029400:0017D035:69970B98:qmshutdown:102:root@pam: OK
>>>>> Feb 19 14:09:51 pve9a1 qmeventd[168986]: Starting cleanup for 102
>>>>> Feb 19 14:09:51 pve9a1 qm[168986]: doing cleanup for 102 with keepActive=0
>>>>> Feb 19 14:09:51 pve9a1 qmeventd[168986]: Finished cleanup for 102
>>>>> Feb 19 14:09:51 pve9a1 systemd[1]: Started 102.scope.
>>>>> Feb 19 14:09:51 pve9a1 vzdump[168879]: VM 102 started with PID 169021.
>>>>
>>>> We manage to get the optimization:
>>>>
>>>>> Feb 19 14:16:01 pve9a1 qm[174585]: shutdown VM 102: UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam:
>>>>> Feb 19 14:16:04 pve9a1 qm[174585]: VM 102 qga command failed - VM 102 qga command 'guest-ping' failed - got timeout
>>>>> Feb 19 14:16:07 pve9a1 qmeventd[166736]: read: Connection reset by peer
>>>>> Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Deactivated successfully.
>>>>> Feb 19 14:16:07 pve9a1 systemd[1]: 102.scope: Consumed 46.363s CPU time, 2G memory peak.
>>>>> Feb 19 14:16:08 pve9a1 qm[174585]: doing cleanup for 102 with keepActive=1
>>>>> Feb 19 14:16:08 pve9a1 qm[174582]: end task UPID:pve9a1:0002A9F9:0018636B:69970D11:qmshutdown:102:root@pam: OK
>>>>> Feb 19 14:16:08 pve9a1 systemd[1]: Started 102.scope.
>>>>> Feb 19 14:16:08 pve9a1 qmeventd[174685]: Starting cleanup for 102
>>>>> Feb 19 14:16:08 pve9a1 qmeventd[174685]: trying to acquire lock...
>>>>> Feb 19 14:16:08 pve9a1 vzdump[174326]: VM 102 started with PID 174718.
>>>>> Feb 19 14:16:08 pve9a1 qmeventd[174685]:  OK
>>>>> Feb 19 14:16:08 pve9a1 qmeventd[174685]: vm still running
>>>>
>>>> For regular shutdown, we'll also do the cleanup twice.

So, to expand on this, in the qm cleanup endpoint we have:

> if (!$clean || $guest) {
>     # vm was shutdown from inside the guest or crashed, doing api cleanup
>     PVE::QemuServer::vm_stop_cleanup($storecfg, $vmid, $conf, 0, 0, 1);
> }

and the duplicate cleanup during shutdown happens when $guest evaluates
to true. We have $guest=1 when the shutdown was initiated via
'system_powerdown' and 'guest-shutdown' QMP commands and when initiated
from within the guest. We have $guest=0 with the 'quit' QMP command
which is used for hard stop.
It kinda does look like we wanted to avoid reaching vm_stop_cleanup() a
second time if not required, but we don't have the necessary information
to distinguish between guest-initiated shutdown from inside and
guest-initiated shutdown triggered from outside. I don't see a good way
to get that information off the top of my head. That said, with the
cleanup flag file, we won't even need to look at $guest anymore.

>>>> Maybe we also need a way to tell qmeventd that we already did the
>>>> cleanup?
>>>
>>>
>>> ok well then i'd try to do something like this:
>>>
>>> in 'vm_stop' we'll create a cleanup flag with timestamp + state (e.g.
>>> 'queued')
>>>
>>> in vm_stop_cleanup we change/create the flag with
>>> 'started' and clear the flag after cleanup
>>
>> Why is the one in vm_stop needed? Is there any advantage over creating
>> it directly in vm_stop_cleanup()?
>>
>
> after a bit of experimenting and re-reading the code, i think
> I can simplify the logic
>
> at the beginning of vm_stop, we create the cleanup flag

You'll also need to create one in vm_reboot(), right?

> in 'qm cleanup', we only do the cleanup if the flag does not exist
> in 'vm_start' we clean the flag
>
> this should work because these parts are under a config lock anyway:
> * from vm_stop to vm_stop_cleanup
> * most of the qm cleanup code
> * vm_start
>
> so we only really have to mark that the cleanup was done from
> the vm_stop codepath
>
> (we have to create the flag at the beginning of vm_stop, because
> then there is no race between calling its cleanup and qmeventd
> picking up the vanishing process)
>
> does that make sense to you?

Yes, sounds good :)

>>> (if it's here already in 'started' state within a timelimit, ignore it)
>>>
>>> in vm_start we block until the cleanup flag is gone or until some
>>> timeout
>>>
>>> in 'qm cleanup' we only start it if the flag does not exist
>>
>> Hmm, it does also call vm_stop_cleanup() so we could just re-use the
>> check there for that part? I guess doing an early check doesn't hurt
>> either, as long as we do call the post-stop hook.
>>
>>> I think this should make the behavior consistent?
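
For reference, the simplified flag logic agreed on above could be sketched roughly as follows. This is only an illustration in Python (qemu-server itself is Perl), not the actual patch: the function bodies, the `run_dir` parameter and the log strings are all assumptions made up for this sketch, and the config lock that the real code paths hold is left out. Only the flag file naming (VMID.cleanup) and the create/check/clear points are taken from the discussion.

```python
import os
import tempfile
import time


def flag_path(run_dir, vmid):
    # Mirrors the proposed /var/run/qemu-server/VMID.cleanup naming.
    return os.path.join(run_dir, f"{vmid}.cleanup")


def vm_stop(run_dir, vmid, log):
    # Hypothetical stand-in for vm_stop(): create the flag first, so a
    # 'qm cleanup' triggered by qmeventd sees it no matter who wins the race.
    with open(flag_path(run_dir, vmid), "w") as fh:
        fh.write(str(int(time.time())))
    log.append("vm_stop: cleanup")  # stand-in for vm_stop_cleanup()


def qm_cleanup(run_dir, vmid, log):
    # Hypothetical stand-in for 'qm cleanup': only clean up if the vm_stop
    # code path has not marked the cleanup as its own.
    did_cleanup = not os.path.exists(flag_path(run_dir, vmid))
    if did_cleanup:
        log.append("qm cleanup: cleanup")
    # The post-stop hook is called either way.
    log.append("qm cleanup: post-stop hook")
    return did_cleanup


def vm_start(run_dir, vmid, log):
    # Hypothetical stand-in for vm_start(): clear a leftover flag so a later
    # guest-initiated shutdown or crash is cleaned up by 'qm cleanup' again.
    try:
        os.unlink(flag_path(run_dir, vmid))
    except FileNotFoundError:
        pass
    log.append("vm_start")


if __name__ == "__main__":
    run_dir = tempfile.mkdtemp()
    log = []
    vm_stop(run_dir, 102, log)     # stop via API: flag created, cleanup done
    qm_cleanup(run_dir, 102, log)  # qmeventd path: skips the duplicate cleanup
    vm_start(run_dir, 102, log)    # flag cleared
    qm_cleanup(run_dir, 102, log)  # e.g. crash later: cleanup runs here
    print("\n".join(log))
```

A vm_reboot() path would need to create the flag in the same way as vm_stop() does here.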