From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Date: Tue, 21 Oct 2025 13:23:32 +0200
Message-ID: <20251021112432.126221-2-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20251021112432.126221-1-f.ebner@proxmox.com>
References: <20251021112432.126221-1-f.ebner@proxmox.com>
MIME-Version: 1.0
Subject: [pve-devel] [PATCH qemu 1/3] fix #6810: add patch to avoid deadlock upon TMF request cancelling with VirtIO
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion
Reply-To: Proxmox VE development discussion
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel"

Because of a regression caused by QEMU commit da6eebb33b ("virtio-scsi:
perform TMFs in appropriate AioContexts") and the introduction of the
requests_lock earlier, there would be a deadlock when a (FreeBSD) guest
cancels SCSI requests. See the commit message of the added patch for
more information.

The issue was also reported in the community forum:
https://forum.proxmox.com/threads/freeze-on-pfsense-vm-running-in-pve-9.171557/

Signed-off-by: Fiona Ebner
---
 ...adlock-upon-TMF-request-cancelling-w.patch | 83 +++++++++++++++++++
 debian/patches/series                         |  1 +
 2 files changed, 84 insertions(+)
 create mode 100644 debian/patches/extra/0014-hw-scsi-avoid-deadlock-upon-TMF-request-cancelling-w.patch

diff --git a/debian/patches/extra/0014-hw-scsi-avoid-deadlock-upon-TMF-request-cancelling-w.patch b/debian/patches/extra/0014-hw-scsi-avoid-deadlock-upon-TMF-request-cancelling-w.patch
new file mode 100644
index 0000000..4c7441e
--- /dev/null
+++ b/debian/patches/extra/0014-hw-scsi-avoid-deadlock-upon-TMF-request-cancelling-w.patch
@@ -0,0 +1,83 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Fiona Ebner
+Date: Fri, 17 Oct 2025 11:43:30 +0200
+Subject: [PATCH] hw/scsi: avoid deadlock upon TMF request cancelling with
+ VirtIO
+
+When scsi_req_dequeue() is reached via
+scsi_req_cancel_async()
+virtio_scsi_tmf_cancel_req()
+virtio_scsi_do_tmf_aio_context(),
+there is a deadlock when trying to acquire the SCSI device's requests
+lock, because it was already acquired in
+virtio_scsi_do_tmf_aio_context().
+
+In particular, the issue happens with a FreeBSD guest (13, 14, 15,
+maybe more), when it cancels SCSI requests, because of timeout.
+
+This is a regression caused by commit da6eebb33b ("virtio-scsi:
+perform TMFs in appropriate AioContexts") and the introduction of the
+requests_lock earlier.
+
+To fix the issue, only cancel the requests after releasing the
+requests_lock. For this, the SCSI device's requests are iterated while
+holding the requests_lock and the requests to be cancelled are
+collected in a list. Then, the collected requests are cancelled
+one by one while not holding the requests_lock. This is safe, because
+only requests from the current AioContext are collected and acted
+upon.
+
+Originally reported by Proxmox VE users:
+https://bugzilla.proxmox.com/show_bug.cgi?id=6810
+https://forum.proxmox.com/threads/173914/
+
+Fixes: da6eebb33b ("virtio-scsi: perform TMFs in appropriate AioContexts")
+Suggested-by: Stefan Hajnoczi
+Signed-off-by: Fiona Ebner
+Message-id: 20251017094518.328905-1-f.ebner@proxmox.com
+[Changed g_list_append() to g_list_prepend() to avoid traversing the
+list each time.
+--Stefan]
+Signed-off-by: Stefan Hajnoczi
+(cherry picked from commit 7d80d6d82db4c73e335f9e738d7a5778124df35e
+ from https://gitlab.com/stefanha/qemu/-/tree/block)
+Signed-off-by: Fiona Ebner
+---
+ hw/scsi/virtio-scsi.c | 14 +++++++++++++-
+ 1 file changed, 13 insertions(+), 1 deletion(-)
+
+diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
+index 34ae14f7bf..3b635053b5 100644
+--- a/hw/scsi/virtio-scsi.c
++++ b/hw/scsi/virtio-scsi.c
+@@ -343,6 +343,7 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
+     SCSIDevice *d = virtio_scsi_device_get(s, tmf->req.tmf.lun);
+     SCSIRequest *r;
+     bool match_tag;
++    g_autoptr(GList) reqs = NULL;
+ 
+     if (!d) {
+         tmf->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET;
+@@ -378,10 +379,21 @@ static void virtio_scsi_do_tmf_aio_context(void *opaque)
+             if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) {
+                 continue;
+             }
+-            virtio_scsi_tmf_cancel_req(tmf, r);
++            /*
++             * Cannot cancel directly, because scsi_req_dequeue() would deadlock
++             * when attempting to acquire the request_lock a second time. Taking
++             * a reference here is paired with an unref after cancelling below.
++             */
++            scsi_req_ref(r);
++            reqs = g_list_prepend(reqs, r);
+         }
+     }
+ 
++    for (GList *elem = g_list_first(reqs); elem; elem = g_list_next(elem)) {
++        virtio_scsi_tmf_cancel_req(tmf, elem->data);
++        scsi_req_unref(elem->data);
++    }
++
+     /* Incremented by virtio_scsi_do_tmf() */
+     virtio_scsi_tmf_dec_remaining(tmf);
+ 
diff --git a/debian/patches/series b/debian/patches/series
index 10ebb56..ee5da2e 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -11,6 +11,7 @@ extra/0010-vfio-igd-Enable-quirks-when-IGD-is-not-the-primary-d.patch
 extra/0011-i386-cpu-Enable-SMM-cpu-address-space-under-KVM.patch
 extra/0012-target-i386-add-compatibility-property-for-arch_capa.patch
 extra/0013-target-i386-add-compatibility-property-for-pdcm-feat.patch
+extra/0014-hw-scsi-avoid-deadlock-upon-TMF-request-cancelling-w.patch
 bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
 bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
 bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
-- 
2.47.3

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
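
For readers who want the locking pattern from the commit message in
isolation, here is a minimal standalone sketch. It is not QEMU code:
Req, Dev, dev_enqueue(), req_cancel() and dev_cancel_matching() are
hypothetical stand-ins for SCSIRequest, SCSIDevice and the functions
named in the patch, a plain pthread mutex stands in for the
requests_lock, and a fixed array stands in for the GList. It only
illustrates "collect while holding the lock, cancel after releasing
it", with a reference taken per collected request.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQS 8

/* Hypothetical stand-in for SCSIRequest. */
typedef struct Req {
    int tag;
    int refcount;
    bool cancelled;
} Req;

/* Hypothetical stand-in for SCSIDevice. */
typedef struct Dev {
    pthread_mutex_t requests_lock; /* non-recursive: locking twice hangs */
    Req *requests[MAX_REQS];
    size_t nreqs;
} Dev;

static void dev_init(Dev *d)
{
    pthread_mutex_init(&d->requests_lock, NULL);
    d->nreqs = 0;
}

static void req_unref(Req *r)
{
    assert(r->refcount > 0);
    r->refcount--;
}

static void dev_enqueue(Dev *d, Req *r)
{
    pthread_mutex_lock(&d->requests_lock);
    assert(d->nreqs < MAX_REQS);
    r->refcount = 1; /* reference owned by the queue */
    r->cancelled = false;
    d->requests[d->nreqs++] = r;
    pthread_mutex_unlock(&d->requests_lock);
}

/*
 * Plays the role of the cancel path that ends in a dequeue: it takes
 * requests_lock itself, so the caller must not already hold it.
 */
static void req_cancel(Dev *d, Req *r)
{
    pthread_mutex_lock(&d->requests_lock);
    for (size_t i = 0; i < d->nreqs; i++) {
        if (d->requests[i] == r) {
            d->requests[i] = d->requests[--d->nreqs];
            r->cancelled = true;
            req_unref(r); /* drop the queue's reference */
            break;
        }
    }
    pthread_mutex_unlock(&d->requests_lock);
}

/* Returns the number of requests that were collected and cancelled. */
static size_t dev_cancel_matching(Dev *d, int tag, bool match_tag)
{
    Req *picked[MAX_REQS];
    size_t n = 0;

    /* Phase 1: iterate under the lock, only collect and take a ref. */
    pthread_mutex_lock(&d->requests_lock);
    for (size_t i = 0; i < d->nreqs; i++) {
        Req *r = d->requests[i];
        if (match_tag && r->tag != tag) {
            continue;
        }
        r->refcount++;
        picked[n++] = r;
    }
    pthread_mutex_unlock(&d->requests_lock);

    /* Phase 2: lock released, so req_cancel() can take it again. */
    for (size_t i = 0; i < n; i++) {
        req_cancel(d, picked[i]);
        req_unref(picked[i]); /* pairs with the ref taken in phase 1 */
    }
    return n;
}
```

Calling req_cancel() from inside the phase 1 loop would lock the same
mutex a second time, which is the shape of the deadlock described
above. The sketch is single-threaded and its refcount is not atomic;
it makes no claim about how QEMU synchronizes the real refcount.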