From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id E2635940CE for ; Wed, 21 Feb 2024 14:02:00 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id CC40717A5E for ; Wed, 21 Feb 2024 14:02:00 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Wed, 21 Feb 2024 14:01:57 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id D2E2E44470 for ; Wed, 21 Feb 2024 14:01:56 +0100 (CET) From: Fiona Ebner To: pve-devel@lists.proxmox.com Date: Wed, 21 Feb 2024 14:01:52 +0100 Message-Id: <20240221130152.81573-1-f.ebner@proxmox.com> X-Mailer: git-send-email 2.39.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL -0.071 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - Subject: [pve-devel] [PATCH qemu] add patch to fix deadlock with VirtIO block and iothread during QMP stop X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 21 Feb 2024 13:02:00 -0000 Backported from commit bfa36802d1 ("virtio-blk: avoid using ioeventfd state in irqfd conditional") because the rework/rename dataplane -> ioeventfd didn't happen yet. Reported in the community forum [0] and reproduced doing a backup loop to PBS with suspend mode with fio doing heavy IO in the guest and using an RBD storage (with krbd). [0]: https://forum.proxmox.com/threads/141320 Signed-off-by: Fiona Ebner --- ...-using-ioeventfd-state-in-irqfd-cond.patch | 61 +++++++++++++++++++ debian/patches/series | 1 + 2 files changed, 62 insertions(+) create mode 100644 debian/patches/extra/0013-virtio-blk-avoid-using-ioeventfd-state-in-irqfd-cond.patch diff --git a/debian/patches/extra/0013-virtio-blk-avoid-using-ioeventfd-state-in-irqfd-cond.patch b/debian/patches/extra/0013-virtio-blk-avoid-using-ioeventfd-state-in-irqfd-cond.patch new file mode 100644 index 0000000..8109e7d --- /dev/null +++ b/debian/patches/extra/0013-virtio-blk-avoid-using-ioeventfd-state-in-irqfd-cond.patch @@ -0,0 +1,61 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Stefan Hajnoczi +Date: Mon, 22 Jan 2024 12:26:25 -0500 +Subject: [PATCH] virtio-blk: avoid using ioeventfd state in irqfd conditional + +Requests that complete in an IOThread use irqfd to notify the guest +while requests that complete in the main loop thread use the traditional +qdev irq code path. The reason for this conditional is that the irq code +path requires the BQL: + + if (s->ioeventfd_started && !s->ioeventfd_disabled) { + virtio_notify_irqfd(vdev, req->vq); + } else { + virtio_notify(vdev, req->vq); + } + +There is a corner case where the conditional invokes the irq code path +instead of the irqfd code path: + + static void virtio_blk_stop_ioeventfd(VirtIODevice *vdev) + { + ... + /* + * Set ->ioeventfd_started to false before draining so that host notifiers + * are not detached/attached anymore. + */ + s->ioeventfd_started = false; + + /* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */ + blk_drain(s->conf.conf.blk); + +During blk_drain() the conditional produces the wrong result because +ioeventfd_started is false. + +Use qemu_in_iothread() instead of checking the ioeventfd state. + +Cc: qemu-stable@nongnu.org +Buglink: https://issues.redhat.com/browse/RHEL-15394 +Signed-off-by: Stefan Hajnoczi +Message-ID: <20240122172625.415386-1-stefanha@redhat.com> +Reviewed-by: Kevin Wolf +Signed-off-by: Kevin Wolf +[FE: backport: dataplane -> ioeventfd rework didn't happen yet] +Signed-off-by: Fiona Ebner +--- + hw/block/virtio-blk.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c +index 39e7f23fab..61bd1f6859 100644 +--- a/hw/block/virtio-blk.c ++++ b/hw/block/virtio-blk.c +@@ -64,7 +64,7 @@ static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status) + iov_discard_undo(&req->inhdr_undo); + iov_discard_undo(&req->outhdr_undo); + virtqueue_push(req->vq, &req->elem, req->in_len); +- if (s->dataplane_started && !s->dataplane_disabled) { ++ if (qemu_in_iothread()) { + virtio_blk_data_plane_notify(s->dataplane, req->vq); + } else { + virtio_notify(vdev, req->vq); diff --git a/debian/patches/series b/debian/patches/series index 4d75ec3..90553de 100644 --- a/debian/patches/series +++ b/debian/patches/series @@ -10,6 +10,7 @@ extra/0009-ui-clipboard-mark-type-as-not-available-when-there-i.patch extra/0010-virtio-scsi-Attach-event-vq-notifier-with-no_poll.patch extra/0011-virtio-Re-enable-notifications-after-drain.patch extra/0012-qemu_init-increase-NOFILE-soft-limit-on-POSIX.patch +extra/0013-virtio-blk-avoid-using-ioeventfd-state-in-irqfd-cond.patch bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch -- 2.39.2