From: dea
To: pve-devel@lists.proxmox.com
Date: Tue, 23 Jan 2024 14:43:41 +0100
Subject: Re: [pve-devel] [PATCH qemu] add patch to work around stuck guest IO with iothread and VirtIO block/SCSI
In-Reply-To: <20240123131320.115359-1-f.ebner@proxmox.com>
References: <20240123131320.115359-1-f.ebner@proxmox.com>

Very good news, Fiona!

For quite some time I have been using patchlevel 5 of the pve-qemu package
(the one with the CPU overload issue), because package 4-6 gives me the
stuck-storage problems you correctly describe in your post.

Many thanks!

On 23/01/24 14:13, Fiona Ebner wrote:
> This essentially repeats commit 6b7c181 ("add patch to work around
> stuck guest IO with iothread and VirtIO block/SCSI") with an added
> fix for the SCSI event virtqueue, which requires special handling.
> This is to avoid the issue [4] that made the revert 2a49e66 ("Revert
> "add patch to work around stuck guest IO with iothread and VirtIO
> block/SCSI"") necessary the first time around.
> 
> When using iothread, after commits
> 1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
> 766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
> it can happen that polling gets stuck when draining. This would cause
> IO in the guest to get completely stuck.
> 
> A workaround for users is stopping and resuming the vCPUs because that
> would also stop and resume the dataplanes which would kick the host
> notifiers.
> 
> This can happen with block jobs like backup and drive mirror as well
> as with hotplug [2].
> 
> Reports in the community forum that might be about this issue[0][1]
> and there is also one in the enterprise support channel.
> 
> As a workaround in the code, just re-enable notifications and kick the
> virt queue after draining. Draining is already costly and rare, so no
> need to worry about a performance penalty here. This was taken from
> the following comment of a QEMU developer [3] (in my debugging,
> I had already found re-enabling notification to work around the issue,
> but also kicking the queue is more complete).
> 
> Take special care to attach the SCSI event virtqueue host notifier
> with the _no_poll() variant like in virtio_scsi_dataplane_start().
> This avoids the issue from the first attempted fix where the iothread
> would suddenly loop with 100% CPU usage whenever some guest IO came in
> [4]. This is necessary because of commit 38738f7dbb ("virtio-scsi:
> don't waste CPU polling the event virtqueue"). See [5] for the
> relevant discussion.
> 
> [0]: https://forum.proxmox.com/threads/137286/
> [1]: https://forum.proxmox.com/threads/137536/
> [2]: https://issues.redhat.com/browse/RHEL-3934
> [3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
> [4]: https://forum.proxmox.com/threads/138140/
> [5]: https://lore.kernel.org/qemu-devel/bfc7b20c-2144-46e9-acbc-e726276c5a31@proxmox.com/
> 
> Signed-off-by: Fiona Ebner
> ---
>  ...work-around-iothread-polling-getting.patch | 87 +++++++++++++++++++
>  debian/patches/series                         |  1 +
>  2 files changed, 88 insertions(+)
>  create mode 100644 debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
> 
> diff --git a/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch b/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
> new file mode 100644
> index 0000000..a268eed
> --- /dev/null
> +++ b/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
> @@ -0,0 +1,87 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Fiona Ebner
> +Date: Tue, 23 Jan 2024 13:21:11 +0100
> +Subject: [PATCH] virtio blk/scsi: work around iothread polling getting stuck
> + with drain
> +
> +When using iothread, after commits
> +1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
> +766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
> +it can happen that polling gets stuck when draining. This would cause
> +IO in the guest to get completely stuck.
> +
> +A workaround for users is stopping and resuming the vCPUs because that
> +would also stop and resume the dataplanes which would kick the host
> +notifiers.
> +
> +This can happen with block jobs like backup and drive mirror as well
> +as with hotplug [2].
> +
> +Reports in the community forum that might be about this issue[0][1]
> +and there is also one in the enterprise support channel.
> +
> +As a workaround in the code, just re-enable notifications and kick the
> +virt queue after draining. Draining is already costly and rare, so no
> +need to worry about a performance penalty here. This was taken from
> +the following comment of a QEMU developer [3] (in my debugging,
> +I had already found re-enabling notification to work around the issue,
> +but also kicking the queue is more complete).
> +
> +Take special care to attach the SCSI event virtqueue host notifier
> +with the _no_poll() variant like in virtio_scsi_dataplane_start().
> +This avoids the issue from the first attempted fix where the iothread
> +would suddenly loop with 100% CPU usage whenever some guest IO came in
> +[4]. This is necessary because of commit 38738f7dbb ("virtio-scsi:
> +don't waste CPU polling the event virtqueue"). See [5] for the
> +relevant discussion.
> +
> +[0]: https://forum.proxmox.com/threads/137286/
> +[1]: https://forum.proxmox.com/threads/137536/
> +[2]: https://issues.redhat.com/browse/RHEL-3934
> +[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
> +[4]: https://forum.proxmox.com/threads/138140/
> +[5]: https://lore.kernel.org/qemu-devel/bfc7b20c-2144-46e9-acbc-e726276c5a31@proxmox.com/
> +
> +Signed-off-by: Fiona Ebner
> +---
> + hw/block/virtio-blk.c |  4 ++++
> + hw/scsi/virtio-scsi.c | 10 +++++++++-
> + 2 files changed, 13 insertions(+), 1 deletion(-)
> +
> +diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> +index 39e7f23fab..d9a655e9b8 100644
> +--- a/hw/block/virtio-blk.c
> ++++ b/hw/block/virtio-blk.c
> +@@ -1536,7 +1536,11 @@ static void virtio_blk_drained_end(void *opaque)
> + 
> +     for (uint16_t i = 0; i < s->conf.num_queues; i++) {
> +         VirtQueue *vq = virtio_get_queue(vdev, i);
> ++        if (!virtio_queue_get_notification(vq)) {
> ++            virtio_queue_set_notification(vq, true);
> ++        }
> +         virtio_queue_aio_attach_host_notifier(vq, ctx);
> ++        virtio_queue_notify(vdev, i);
> +     }
> + }
> + 
> +diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> +index 45b95ea070..93a292df60 100644
> +--- a/hw/scsi/virtio-scsi.c
> ++++ b/hw/scsi/virtio-scsi.c
> +@@ -1165,7 +1165,15 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
> + 
> +     for (uint32_t i = 0; i < total_queues; i++) {
> +         VirtQueue *vq = virtio_get_queue(vdev, i);
> +-        virtio_queue_aio_attach_host_notifier(vq, s->ctx);
> ++        if (!virtio_queue_get_notification(vq)) {
> ++            virtio_queue_set_notification(vq, true);
> ++        }
> ++        if (vq == VIRTIO_SCSI_COMMON(s)->event_vq) {
> ++            virtio_queue_aio_attach_host_notifier_no_poll(vq, s->ctx);
> ++        } else {
> ++            virtio_queue_aio_attach_host_notifier(vq, s->ctx);
> ++        }
> ++        virtio_queue_notify(vdev, i);
> +     }
> + }
> + 
> diff --git a/debian/patches/series b/debian/patches/series
> index b3da8bb..7dcedcb 100644
> --- a/debian/patches/series
> +++ b/debian/patches/series
> @@ -60,3 +60,4 @@ pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
>  pve/0043-alloc-track-fix-deadlock-during-drop.patch
>  pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
>  pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
> +pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
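
For readability, both hunks in the quoted patch apply the same per-queue pattern in the drained_end handlers. The sketch below only consolidates that pattern into a single helper; the QEMU calls are exactly the ones used in the diff, while the helper name and its is_event_vq parameter are illustrative and not part of the patch.

/*
 * Consolidated sketch of the drained_end changes above (assumes a QEMU
 * source tree; helper name and is_event_vq parameter are illustrative).
 */
#include "qemu/osdep.h"
#include "hw/virtio/virtio.h"

static void reattach_and_kick_queue(VirtIODevice *vdev, int idx,
                                    AioContext *ctx, bool is_event_vq)
{
    VirtQueue *vq = virtio_get_queue(vdev, idx);

    /* Draining may have left notifications disabled; re-enable them. */
    if (!virtio_queue_get_notification(vq)) {
        virtio_queue_set_notification(vq, true);
    }

    /*
     * The virtio-scsi event virtqueue must not be polled (see commit
     * 38738f7dbb), so it is attached with the _no_poll() variant, as in
     * virtio_scsi_dataplane_start().
     */
    if (is_event_vq) {
        virtio_queue_aio_attach_host_notifier_no_poll(vq, ctx);
    } else {
        virtio_queue_aio_attach_host_notifier(vq, ctx);
    }

    /* Kick the queue so requests queued during the drain get processed. */
    virtio_queue_notify(vdev, idx);
}

Kicking the queue after re-attaching the notifier should cover requests the guest submitted while notifications were still off, which is why the commit message treats the extra cost as acceptable on the rare drain path.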