Date: Tue, 10 Oct 2023 16:19:55 +0200
From: Fiona Ebner
To: Proxmox VE development discussion, Thomas Lamprecht, Filip Schauer
References: <20231009132519.115727-1-f.schauer@proxmox.com>
 <01c649c9-0717-404c-98d8-17dc8a5cc4eb@proxmox.com>
In-Reply-To: <01c649c9-0717-404c-98d8-17dc8a5cc4eb@proxmox.com>
Subject: Re: [pve-devel] [PATCH v3 qemu-server] Fix ACPI-suspended VMs resuming after migration

On 10.10.23 at 13:45, Thomas Lamprecht wrote:
> On 09/10/2023 at 15:25, Filip Schauer wrote:
>> Add checks for "suspended" and "prelaunch" runstates when checking
>> whether a VM is paused.
>>
>> This fixes the following issues:
>> * ACPI-suspended VMs automatically resuming after migration
>> * Shutdown and reboot commands timing out instead of failing
>>   immediately on suspended VMs
>>
>
> I checked the call sites, and what I'm wondering is: can the VM wake up
> from those new states without QMP intervention? Say, an ACPI suspension
> being ended by some (virtual) RTC or via network (like wake-on-LAN)?
> Because then we should add a big notice comment on this method to ensure
> new users of it are informed about that possibility.

Sorry! I did not consider this. Yes, it can get woken up without QMP, even
just with key presses in the case of my Debian test VM. So our config locks
are useless here :/

It will also be an issue for migration:
1. migration is started and remembers that "vm_was_paused"
2. migration continues
3. during migration, the VM wakes up
4. migration finishes, but doesn't resume the guest on the target, because
   "vm_was_paused"

We could either:
1. tolerate the old behavior (VM might get resumed on the target if it was
   suspended) - if we declare migration an ACPI wake-up event ;)
2. tolerate the new behavior (VM might wake up and not be resumed on the
   target later) - seems worse than the old behavior when it happens
3. disallow migration when in the 'suspended' runstate
4. add a new check just before moving the guest to the target - but I'm not
   sure how we'd do that without races again...
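To illustrate the stale-flag race in steps 1-4 above, here is a minimal sketch. The runstate strings match QEMU's query-status values, but the function names and the plan dict are illustrative, not qemu-server's actual migration code:

```python
# Runstates that the patched check treats as "paused".
PAUSED_LIKE = {"paused", "suspended", "prelaunch"}

def plan_migration(runstate_at_start: str) -> dict:
    """Step 1: record whether the VM looked paused when migration began."""
    return {"vm_was_paused": runstate_at_start in PAUSED_LIKE}

def finish_migration(plan: dict) -> bool:
    """Step 4: return True if the guest should be resumed on the target.

    The race: if the VM woke up *during* migration (step 3), this stale
    flag still says "paused" and the guest is left stopped on the target.
    """
    return not plan["vm_was_paused"]

plan = plan_migration("suspended")      # VM was ACPI-suspended at the start
# ...VM wakes up mid-migration; the recorded plan does not notice...
assert finish_migration(plan) is False  # guest is not resumed on the target
```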
> Also, with that change we might have added a race for suspend-mode backups,
> at least if VMs really can wake up without a QMP command (which I find
> likely). I.e., between the time we check and set vm_was_paused and the time
> we actually suspend, because if the VM wakes up in between, we might get
> inconsistent stuff and skip things like fs-freeze.

That race already exists without the patch. QEMU does not transition from
the 'suspended' state to the 'paused' state when QMP 'stop' is issued, i.e.
what our 'qm suspend' or vm_suspend() actually does. So it doesn't matter
whether we call that during backup or not when the VM is already in the
'suspended' state. The window for the race is a bit larger now, though:

now: VM wakes up between the check if paused and the 'backup' QMP command
before: VM wakes up after fs-freeze was skipped, because the guest agent
was detected as not running

We could either:
1. (ironically) disallow 'suspend'-mode backup when in the 'suspended'
   runstate
2. resume and pause the VM upon a 'suspend'-mode backup, but that is also
   surprising
3. make the race window smaller again by doing the qga_check_running()
   check, and fs-freeze if it returns true, if the VM was in the
   'suspended' runstate

I don't see how to avoid the possibility of a wake-up at an inconvenient
time except with one of the first two options.

> While we recommend stop and snapshot modes over suspend mode, the latter is
> still exposed via UI and API and should work OK enough.
>
> Note that ACPI S3 suspend and our vm_suspend are very different things: our
> vm_suspend does a "stop" monitor command, resulting in all vCPUs being
> stopped, while S3 is a (virtual) firmware/machine feature - I mean, I guess
> there's a reason those things are reported as different states..
> It doesn't help that our vm_suspend method doesn't actually do an
> (ACPI-S3-like) suspension, but a (vCPU) "stop".
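To make option 3. from the backup discussion above concrete, a sketch of the last-moment re-check. `qga_check_running` stands in for qemu-server's guest-agent probe; the rest of the names and the return shape are illustrative assumptions:

```python
def backup_prepare(runstate: str, qga_check_running) -> dict:
    """Decide, right before issuing the 'backup' QMP command, whether to
    fs-freeze. Sketch only; not qemu-server's actual backup code."""
    if runstate == "suspended":
        # Re-check the agent at the last possible moment: the VM may have
        # woken up since the initial "is it paused?" check, in which case
        # the agent answers and we can still freeze the filesystems.
        freeze = qga_check_running()
    elif runstate == "paused":
        # vCPUs were stopped via QMP 'stop'; the agent cannot answer.
        freeze = False
    else:
        freeze = qga_check_running()
    return {"fsfreeze": freeze}

# VM woke up from ACPI suspend just before the backup started:
assert backup_prepare("suspended", lambda: True) == {"fsfreeze": True}
# VM is still suspended, so the agent does not answer:
assert backup_prepare("suspended", lambda: False) == {"fsfreeze": False}
```

This only shrinks the window; as noted above, a wake-up between this check and the 'backup' command itself remains possible.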
>
> The "prelaunch" state, OTOH, seems pretty much like "paused", with the
> difference that the vCPUs never ran in the former; this seems fine to
> handle the same. But for "suspended", I'm not sure if we'd like to detect
> that as paused only if conflating both is OK for a call site, so maybe
> with an opt-in parameter. Alternatively, we could return the status in
> the "true" case so that call sites can decide what to do for any such
> special handling without an extra parameter.

Yes, I like the opt-in approach.
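Something like the following, combining the opt-in parameter with the "return the status instead of a bare true" idea. The function name and signature are hypothetical, not qemu-server's actual vm_is_paused():

```python
def vm_runstate_if_paused(runstate: str, include_suspended: bool = False):
    """Return the runstate string if it counts as paused, else None.

    'prelaunch' is always treated like 'paused' (the vCPUs never ran);
    'suspended' only counts when the caller explicitly opts in. Returning
    the state rather than a boolean lets call sites distinguish the cases
    without a second query-status round trip.
    """
    paused_states = {"paused", "prelaunch"}
    if include_suspended:
        paused_states.add("suspended")
    return runstate if runstate in paused_states else None

# Call sites that cannot tolerate a spontaneous wake-up keep the default:
assert vm_runstate_if_paused("suspended") is None
# Call sites that handle 'suspended' explicitly opt in and can branch:
assert vm_runstate_if_paused("suspended", include_suspended=True) == "suspended"
```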