From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 9DDFDB98D1
 for <pve-devel@lists.proxmox.com>; Mon, 11 Dec 2023 14:28:45 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 85F62170CD
 for <pve-devel@lists.proxmox.com>; Mon, 11 Dec 2023 14:28:45 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Mon, 11 Dec 2023 14:28:44 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 995DE45367
 for <pve-devel@lists.proxmox.com>; Mon, 11 Dec 2023 14:28:44 +0100 (CET)
From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Mon, 11 Dec 2023 14:28:38 +0100
Message-Id: <20231211132839.747351-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.077 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: [pve-devel] [PATCH qemu 1/2] add patch to work around stuck guest
 IO with iothread and VirtIO block/SCSI
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Mon, 11 Dec 2023 13:28:45 -0000

When an iothread is used, after commits
1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
it can happen that polling gets stuck when draining. This causes IO in
the guest to become completely stuck.

A workaround for users is stopping and resuming the vCPUs, because
that also stops and resumes the dataplanes, which kicks the host
notifiers.

This can happen with block jobs like backup and drive mirror as well
as with hotplug [2].

There are reports in the community forum that might be about this
issue [0][1], and there is also one in the enterprise support channel.

As a workaround in the code, simply re-enable notifications and kick
the virtqueue after draining. Draining is already costly and rare, so
there is no need to worry about a performance penalty here. This
approach was taken from the following comment by a QEMU developer [3]
(in my debugging, I had already found that re-enabling notifications
works around the issue, but also kicking the queue is more complete).
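
For reference, the resulting call pattern in virtio_blk_drained_end()
is sketched below (the virtio-scsi change is analogous). This only
condenses the diff further down; the function prologue outside the
hunk is paraphrased from upstream QEMU and may differ in detail:

    static void virtio_blk_drained_end(void *opaque)
    {
        VirtIOBlock *s = opaque;
        VirtIODevice *vdev = VIRTIO_DEVICE(s);
        AioContext *ctx = blk_get_aio_context(s->conf.conf.blk);

        for (uint16_t i = 0; i < s->conf.num_queues; i++) {
            VirtQueue *vq = virtio_get_queue(vdev, i);
            /* re-attach the host notifier detached in drained_begin() */
            virtio_queue_aio_attach_host_notifier(vq, ctx);
            /* work around stuck polling: re-enable guest notifications ... */
            virtio_queue_set_notification(vq, 1);
            /* ... and kick the queue to process any requests that arrived
             * while notifications were disabled during the drain */
            virtio_queue_notify(vdev, i);
        }
    }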

[0]: https://forum.proxmox.com/threads/137286/
[1]: https://forum.proxmox.com/threads/137536/
[2]: https://issues.redhat.com/browse/RHEL-3934
[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 ...work-around-iothread-polling-getting.patch | 66 +++++++++++++++++++
 debian/patches/series                         |  1 +
 2 files changed, 67 insertions(+)
 create mode 100644 debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch

diff --git a/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch b/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
new file mode 100644
index 0000000..3ac10a8
--- /dev/null
+++ b/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
@@ -0,0 +1,66 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Fiona Ebner <f.ebner@proxmox.com>
+Date: Tue, 5 Dec 2023 14:05:49 +0100
+Subject: [PATCH] virtio blk/scsi: work around iothread polling getting stuck
+ with drain
+
+When an iothread is used, after commits
+1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
+766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
+it can happen that polling gets stuck when draining. This causes IO in
+the guest to become completely stuck.
+
+A workaround for users is stopping and resuming the vCPUs, because
+that also stops and resumes the dataplanes, which kicks the host
+notifiers.
+
+This can happen with block jobs like backup and drive mirror as well
+as with hotplug [2].
+
+There are reports in the community forum that might be about this
+issue [0][1], and there is also one in the enterprise support channel.
+
+As a workaround in the code, simply re-enable notifications and kick
+the virtqueue after draining. Draining is already costly and rare, so
+there is no need to worry about a performance penalty here. This
+approach was taken from the following comment by a QEMU developer [3]
+(in my debugging, I had already found that re-enabling notifications
+works around the issue, but also kicking the queue is more complete).
+
+[0]: https://forum.proxmox.com/threads/137286/
+[1]: https://forum.proxmox.com/threads/137536/
+[2]: https://issues.redhat.com/browse/RHEL-3934
+[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
+
+Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
+---
+ hw/block/virtio-blk.c | 2 ++
+ hw/scsi/virtio-scsi.c | 2 ++
+ 2 files changed, 4 insertions(+)
+
+diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
+index 39e7f23fab..22502047d5 100644
+--- a/hw/block/virtio-blk.c
++++ b/hw/block/virtio-blk.c
+@@ -1537,6 +1537,8 @@ static void virtio_blk_drained_end(void *opaque)
+     for (uint16_t i = 0; i < s->conf.num_queues; i++) {
+         VirtQueue *vq = virtio_get_queue(vdev, i);
+         virtio_queue_aio_attach_host_notifier(vq, ctx);
++        virtio_queue_set_notification(vq, 1);
++        virtio_queue_notify(vdev, i);
+     }
+ }
+ 
+diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
+index 45b95ea070..a7bddbf899 100644
+--- a/hw/scsi/virtio-scsi.c
++++ b/hw/scsi/virtio-scsi.c
+@@ -1166,6 +1166,8 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
+     for (uint32_t i = 0; i < total_queues; i++) {
+         VirtQueue *vq = virtio_get_queue(vdev, i);
+         virtio_queue_aio_attach_host_notifier(vq, s->ctx);
++        virtio_queue_set_notification(vq, 1);
++        virtio_queue_notify(vdev, i);
+     }
+ }
+ 
diff --git a/debian/patches/series b/debian/patches/series
index 9938b8e..0e21f1f 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -59,3 +59,4 @@ pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
 pve/0043-alloc-track-fix-deadlock-during-drop.patch
 pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
 pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
+pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
-- 
2.39.2