From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <7986269c-5fe9-433c-a1c9-bbba3279cb64@proxmox.com>
Date: Tue, 24 Feb 2026 10:37:34 +0100
MIME-Version: 1.0
Subject: Re: [RFC PATCH qemu-server] cleanup: refactor to make cleanup flow more consistent
To: Fiona Ebner, pve-devel@lists.proxmox.com
References: <20260223105542.1525232-1-d.csapak@proxmox.com> <2c237643-99bd-4e9b-8838-efa9a97f9d75@proxmox.com> <28c91ab7-b41d-446e-b31c-5800ced4ad61@proxmox.com>
From: Dominik Csapak <d.csapak@proxmox.com>
In-Reply-To: <28c91ab7-b41d-446e-b31c-5800ced4ad61@proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
List-Id: Proxmox VE development discussion

On 2/24/26 10:30 AM, Fiona Ebner wrote:
> Am 23.02.26 um 4:50 PM schrieb Fiona Ebner:
>> Am 23.02.26 um 11:56 AM schrieb Dominik Csapak:
>>> There are two ways a cleanup can be triggered:
>>>
>>> * When a guest is stopped/shutdown via the API, 'vm_stop' calls
>>>   'vm_stop_cleanup'.
>>> * When the guest process disconnects from qmeventd, 'qm cleanup' is
>>>   called, which in turn also tries to call 'vm_stop_cleanup'.
>>>
>>> Both of these happen under a qemu config lock, so there is no direct
>>> race where they run out of order, but it could happen that the
>>> 'qm cleanup' call ran in addition, so cleanup was called twice. That
>>> could be a problem when the shutdown was called with 'keepActive',
>>> which 'qm cleanup' simply knows nothing about and would ignore.
>>>
>>> Also, the post-stop hook might not be triggered when e.g. a stop-mode
>>> backup was done, since the hook only ran via 'qm cleanup', which
>>> would detect the now-again-running guest and abort.
>>>
>>> To improve the situation, we move the exec_hookscript call to the
>>> end of vm_stop_cleanup. At this point we know the VM is stopped and
>>> we still have the config lock.
>>>
>>> On _do_vm_stop (and in the one case for migration) a 'cleanup-flag'
>>> is created that marks that the VM was cleaned up by the API, so
>>> 'qm cleanup' should not do it again.
>>>
>>> On VM start, this flag is cleared.
>
> It feels untidy to have something left over after cleaning up, even if
> it's just the file indicating that cleanup was done. Maybe we can
> switch it around, see below:
>
>>>
>>> There is still a tiny race possible:
>>>
>>> A guest is stopped from within (or crashes) and the VM is started
>>> again via the API before 'qm cleanup' can run.
>>>
>>> This should be a very rare case though, and all operations via the
>>> API (reboot, shutdown+start, stop-mode backup, etc.) should work as
>>> intended.
>>
>> How difficult is it to trigger the race with an HA-managed VM?
>>
>>> Signed-off-by: Dominik Csapak
>>> ---
>>> I'm not sure how we could possibly eliminate the race I mentioned:
>>> * We can't block on start, because I don't think we can sensibly
>>>   distinguish between:
>>>   - the VM crashed/was powered off
>>>   - the VM was never started
>>>
>>> We could maybe leave a 'started' flag somewhere too and clear the
>>> cleanup flag also in 'qm cleanup'; then we would start the VM only
>>> when the cleanup flag is cleared (or have the cleanup flag carry
>>> multiple states, like 'started', 'finished').
>
> We already have something very similar, namely the PID file. The issue
> is that the PID file is removed automatically by QEMU upon clean
> termination. For our use case we would need a second, more persistent
> file. Then we could solve both the issue of duplicate cleanup and the
> issue of starting another instance before cleanup:
>
> 1. Create a flag file at startup with an identifier for the QEMU
>    instance, a second manual PID file?
> 2. At cleanup, check the file:
>    a) If there is no such file, skip; somebody else already cleaned
>       up.
>       NOTE: we need to ensure that pre-existing instances are still
>       cleaned up.
>       One possible way would be to create a flag file during host
>       startup and only use the new behavior when that is present.
>    b) If the file exists, check whether the QEMU instance is still
>       around. If it is, wait for the instance to be gone until hitting
>       some timeout. Once it's gone, do the cleanup.
> 3. Make sure to run the post-stop hook whenever we remove the file.
> 4. If the file still exists at startup, cleanup was not done yet; wait
>    until some timeout, and when hitting the timeout, either proceed
>    with the start anyway or suggest running cleanup manually. The
>    latter would be safer, but also worse from a UX standpoint, since
>    cleanup is root-only.
>
> What do you think?

Yes, that seems good to me. I'll play around with that and send a next
version.
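For reference, the proposed flag-file lifecycle (steps 1-4 above) could look
roughly like the following. This is only an illustrative Python sketch, not
the actual qemu-server Perl code; the flag directory, function names, and
timeouts are all invented for illustration:

```python
# Illustrative sketch of the proposed per-instance flag-file protocol.
# All paths and names are hypothetical; the real implementation would
# live in qemu-server and use its own locking and directories.
import os
import time

FLAG_DIR = "/tmp/demo-qemu-flags"  # hypothetical; real code would use /run


def flag_path(vmid):
    return os.path.join(FLAG_DIR, f"{vmid}.instance")


def pid_alive(pid):
    """Check whether a process with this PID still exists (signal 0)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by someone else
    return True


def on_start(vmid, qemu_pid, timeout=5.0):
    """Step 4: if a flag file is still present, a previous cleanup is
    pending; wait for it (up to a timeout) before starting. Then step 1:
    record the new instance's identifier."""
    os.makedirs(FLAG_DIR, exist_ok=True)
    path = flag_path(vmid)
    deadline = time.monotonic() + timeout
    while os.path.exists(path) and time.monotonic() < deadline:
        time.sleep(0.1)
    if os.path.exists(path):
        # the safer variant: refuse and suggest manual cleanup
        raise RuntimeError(f"cleanup for VM {vmid} still pending")
    with open(path, "w") as f:
        f.write(str(qemu_pid))


def on_cleanup(vmid, run_post_stop_hook, timeout=5.0):
    """Steps 2 and 3: clean up exactly once per recorded instance."""
    path = flag_path(vmid)
    if not os.path.exists(path):
        return False  # 2a: somebody else already cleaned up
    with open(path) as f:
        pid = int(f.read())
    deadline = time.monotonic() + timeout
    while pid_alive(pid) and time.monotonic() < deadline:
        time.sleep(0.1)  # 2b: wait for the instance to be gone
    if pid_alive(pid):
        raise TimeoutError(f"QEMU instance {pid} of VM {vmid} still running")
    # ... the actual resource cleanup (vm_stop_cleanup) would go here ...
    run_post_stop_hook(vmid)  # step 3: hook runs whenever we remove the file
    os.remove(path)
    return True
```

Note how a second cleanup caller simply sees no flag file and returns early,
which covers the duplicate-cleanup case, while a start attempt before cleanup
blocks on the still-present file, which covers the race discussed above.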