From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: <pve-devel-bounces@lists.proxmox.com> Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 36C761FF16E for <inbox@lore.proxmox.com>; Mon, 31 Mar 2025 16:56:02 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 095A596BC; Mon, 31 Mar 2025 16:55:48 +0200 (CEST) From: Fiona Ebner <f.ebner@proxmox.com> To: pve-devel@lists.proxmox.com Date: Mon, 31 Mar 2025 16:55:02 +0200 Message-Id: <20250331145507.196208-2-f.ebner@proxmox.com> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250331145507.196208-1-f.ebner@proxmox.com> References: <20250331145507.196208-1-f.ebner@proxmox.com> MIME-Version: 1.0 X-SPAM-LEVEL: Spam detection results: 0 AWL -0.040 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: [pve-devel] [PATCH qemu 1/6] savevm-async: improve setting state of snapshot operation in savevm-end handler X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com> List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe> List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/> List-Post: <mailto:pve-devel@lists.proxmox.com> List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help> List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe> Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: pve-devel-bounces@lists.proxmox.com Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com> One of the callers of wait_for_close_co() already sets the state to SAVE_STATE_DONE before, but that is not fully correct, because at that moment, the operation is not fully done. In particular, if closing the target later fails, the state would even be set to SAVE_STATE_ERROR afterwards. DONE -> ERROR is not a valid state transition. Although, it should not matter in practice as long as the relevant QMP commands are sequential. The other caller does not set the state and so there seems to be a race that could lead to the state not getting set at all. This is because before this commit, the wait_for_close_co() function could return early when there is no target file anymore. This can only happen when canceling and needs to happen right around the time when the snapshot is already finishing and closing the target. Simply avoid the early return and always set the state within the wait_for_close_co() function rather than at the call site. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- migration/savevm-async.c | 33 +++++++++++++++------------------ 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/migration/savevm-async.c b/migration/savevm-async.c index 1e79fce9ba..e63dc6d8a3 100644 --- a/migration/savevm-async.c +++ b/migration/savevm-async.c @@ -450,23 +450,22 @@ static void coroutine_fn wait_for_close_co(void *opaque) { int64_t timeout; - if (!snap_state.target) { - DPRINTF("savevm-end: no target file open\n"); - return; - } - - /* wait until cleanup is done before returning, this ensures that after this - * call exits the statefile will be closed and can be removed immediately */ - DPRINTF("savevm-end: waiting for cleanup\n"); - timeout = 30L * 1000 * 1000 * 1000; - qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait, - QEMU_CLOCK_REALTIME, timeout); if (snap_state.target) { - save_snapshot_error("timeout waiting for target file close in " - "qmp_savevm_end"); - /* we cannot assume the snapshot finished in this case, so leave the - * state alone - caller has to figure something out */ - return; + /* wait until cleanup is done before returning, this ensures that after this + * call exits the statefile will be closed and can be removed immediately */ + DPRINTF("savevm-end: waiting for cleanup\n"); + timeout = 30L * 1000 * 1000 * 1000; + qemu_co_sleep_ns_wakeable(&snap_state.target_close_wait, + QEMU_CLOCK_REALTIME, timeout); + if (snap_state.target) { + save_snapshot_error("timeout waiting for target file close in " + "qmp_savevm_end"); + /* we cannot assume the snapshot finished in this case, so leave the + * state alone - caller has to figure something out */ + return; + } + } else { + DPRINTF("savevm-end: no target file open\n"); } // File closed and no other error, so ensure next snapshot can be started. @@ -497,8 +496,6 @@ void qmp_savevm_end(Error **errp) snap_state.saved_vm_running = false; } - snap_state.state = SAVE_STATE_DONE; - qemu_coroutine_enter(wait_for_close); } -- 2.39.5 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel