From: Alexandre Derumier
To: pve-devel@lists.proxmox.com
Date: Fri, 29 Sep 2023 10:28:57 +0200
Message-Id: <20230929082859.147270-1-aderumier@odiso.com>
Subject: [pve-devel] [PATCH qemu-server 0/2] migration: fix sporadic nbd-server-stop timeout

Hi,

We have seen sporadic nbd-server-stop errors when migrating VMs with RBD storage + writeback between two different clusters (this is without my other targetcpu patch):

2023-09-28 16:20:39 ERROR: error - tunnel command '{"cmd":"nbdstop"}' failed - failed to handle 'nbdstop' command - VM 140 qmp command 'nbd-server-stop' failed - got timeout
2023-09-28 16:20:39 ERROR: migration finished with problems (duration 00:01:42)

I'm not sure of the cause. It may be related to writeback, because it has never happened with a freshly started VM, but VMs that have been running for some time can trigger it. (Maybe NBD needs to flush pending data from the cache?)

Currently, the tunnel command has a 30s timeout, but the qmp command only has a 5s timeout.

Also, the tunnel v2 command is not wrapped in an eval, so the migration aborts and leaves both the source and target VMs locked.

Unlocking the target VM and resuming it manually works, so the timeout really does seem to be too low.

Alexandre Derumier (2):
  nbd_stop: increase timeout to 25s
  migration: add missing eval on nbdstop with tunnel v2.

 PVE/QemuMigrate.pm | 8 +++++++-
 PVE/QemuServer.pm  | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

-- 
2.39.2