From: dea
To: pve-devel@lists.proxmox.com
Date: Tue, 23 Jan 2024 14:43:41 +0100
Subject: Re: [pve-devel] [PATCH qemu] add patch to work around stuck guest IO with iothread and VirtIO block/SCSI
In-Reply-To: <20240123131320.115359-1-f.ebner@proxmox.com>
References: <20240123131320.115359-1-f.ebner@proxmox.com>

Very good news, Fiona!

For quite some time I have been using patchlevel 5 of the pve-qemu package
(the one with the CPU overload issue), because package 4-6 gives me the
stuck-storage problems you correctly describe in your post.

Many thanks!

On 23/01/24 14:13, Fiona Ebner wrote:
> This essentially repeats commit 6b7c181 ("add patch to work around
> stuck guest IO with iothread and VirtIO block/SCSI") with an added
> fix for the SCSI event virtqueue, which requires special handling.
> This is to avoid the issue [4] that made the revert 2a49e66 ("Revert
> "add patch to work around stuck guest IO with iothread and VirtIO
> block/SCSI"") necessary the first time around.
> 
> When using iothread, after commits
> 1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
> 766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
> it can happen that polling gets stuck when draining. This would cause
> IO in the guest to get completely stuck.
> 
> A workaround for users is stopping and resuming the vCPUs because that
> would also stop and resume the dataplanes which would kick the host
> notifiers.
> 
> This can happen with block jobs like backup and drive mirror as well
> as with hotplug [2].
> 
> Reports in the community forum that might be about this issue[0][1]
> and there is also one in the enterprise support channel.
> 
> As a workaround in the code, just re-enable notifications and kick the
> virt queue after draining. Draining is already costly and rare, so no
> need to worry about a performance penalty here. This was taken from
> the following comment of a QEMU developer [3] (in my debugging,
> I had already found re-enabling notification to work around the issue,
> but also kicking the queue is more complete).
> 
> Take special care to attach the SCSI event virtqueue host notifier
> with the _no_poll() variant like in virtio_scsi_dataplane_start().
> This avoids the issue from the first attempted fix where the iothread
> would suddenly loop with 100% CPU usage whenever some guest IO came in
> [4]. This is necessary because of commit 38738f7dbb ("virtio-scsi:
> don't waste CPU polling the event virtqueue"). See [5] for the
> relevant discussion.
> 
> [0]: https://forum.proxmox.com/threads/137286/
> [1]: https://forum.proxmox.com/threads/137536/
> [2]: https://issues.redhat.com/browse/RHEL-3934
> [3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
> [4]: https://forum.proxmox.com/threads/138140/
> [5]: https://lore.kernel.org/qemu-devel/bfc7b20c-2144-46e9-acbc-e726276c5a31@proxmox.com/
> 
> Signed-off-by: Fiona Ebner
> ---
>  ...work-around-iothread-polling-getting.patch | 87 +++++++++++++++++++
>  debian/patches/series                         |  1 +
>  2 files changed, 88 insertions(+)
>  create mode 100644 debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
> 
> diff --git a/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch b/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
> new file mode 100644
> index 0000000..a268eed
> --- /dev/null
> +++ b/debian/patches/pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
> @@ -0,0 +1,87 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Fiona Ebner
> +Date: Tue, 23 Jan 2024 13:21:11 +0100
> +Subject: [PATCH] virtio blk/scsi: work around iothread polling getting stuck
> + with drain
> +
> +When using iothread, after commits
> +1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
> +766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
> +it can happen that polling gets stuck when draining. This would cause
> +IO in the guest to get completely stuck.
> +
> +A workaround for users is stopping and resuming the vCPUs because that
> +would also stop and resume the dataplanes which would kick the host
> +notifiers.
> +
> +This can happen with block jobs like backup and drive mirror as well
> +as with hotplug [2].
> +
> +Reports in the community forum that might be about this issue[0][1]
> +and there is also one in the enterprise support channel.
> +
> +As a workaround in the code, just re-enable notifications and kick the
> +virt queue after draining. Draining is already costly and rare, so no
> +need to worry about a performance penalty here. This was taken from
> +the following comment of a QEMU developer [3] (in my debugging,
> +I had already found re-enabling notification to work around the issue,
> +but also kicking the queue is more complete).
> +
> +Take special care to attach the SCSI event virtqueue host notifier
> +with the _no_poll() variant like in virtio_scsi_dataplane_start().
> +This avoids the issue from the first attempted fix where the iothread
> +would suddenly loop with 100% CPU usage whenever some guest IO came in
> +[4]. This is necessary because of commit 38738f7dbb ("virtio-scsi:
> +don't waste CPU polling the event virtqueue"). See [5] for the
> +relevant discussion.
> +
> +[0]: https://forum.proxmox.com/threads/137286/
> +[1]: https://forum.proxmox.com/threads/137536/
> +[2]: https://issues.redhat.com/browse/RHEL-3934
> +[3]: https://issues.redhat.com/browse/RHEL-3934?focusedId=23562096&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-23562096
> +[4]: https://forum.proxmox.com/threads/138140/
> +[5]: https://lore.kernel.org/qemu-devel/bfc7b20c-2144-46e9-acbc-e726276c5a31@proxmox.com/
> +
> +Signed-off-by: Fiona Ebner
> +---
> + hw/block/virtio-blk.c |  4 ++++
> + hw/scsi/virtio-scsi.c | 10 +++++++++-
> + 2 files changed, 13 insertions(+), 1 deletion(-)
> +
> +diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> +index 39e7f23fab..d9a655e9b8 100644
> +--- a/hw/block/virtio-blk.c
> ++++ b/hw/block/virtio-blk.c
> +@@ -1536,7 +1536,11 @@ static void virtio_blk_drained_end(void *opaque)
> + 
> +     for (uint16_t i = 0; i < s->conf.num_queues; i++) {
> +         VirtQueue *vq = virtio_get_queue(vdev, i);
> ++        if (!virtio_queue_get_notification(vq)) {
> ++            virtio_queue_set_notification(vq, true);
> ++        }
> +         virtio_queue_aio_attach_host_notifier(vq, ctx);
> ++        virtio_queue_notify(vdev, i);
> +     }
> + }
> + 
> +diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> +index 45b95ea070..93a292df60 100644
> +--- a/hw/scsi/virtio-scsi.c
> ++++ b/hw/scsi/virtio-scsi.c
> +@@ -1165,7 +1165,15 @@ static void virtio_scsi_drained_end(SCSIBus *bus)
> + 
> +     for (uint32_t i = 0; i < total_queues; i++) {
> +         VirtQueue *vq = virtio_get_queue(vdev, i);
> +-        virtio_queue_aio_attach_host_notifier(vq, s->ctx);
> ++        if (!virtio_queue_get_notification(vq)) {
> ++            virtio_queue_set_notification(vq, true);
> ++        }
> ++        if (vq == VIRTIO_SCSI_COMMON(s)->event_vq) {
> ++            virtio_queue_aio_attach_host_notifier_no_poll(vq, s->ctx);
> ++        } else {
> ++            virtio_queue_aio_attach_host_notifier(vq, s->ctx);
> ++        }
> ++        virtio_queue_notify(vdev, i);
> +     }
> + }
> + 
> diff --git a/debian/patches/series b/debian/patches/series
> index b3da8bb..7dcedcb 100644
> --- a/debian/patches/series
> +++ b/debian/patches/series
> @@ -60,3 +60,4 @@ pve/0042-Revert-block-rbd-implement-bdrv_co_block_status.patch
>  pve/0043-alloc-track-fix-deadlock-during-drop.patch
>  pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
>  pve/0045-savevm-async-don-t-hold-BQL-during-setup.patch
> +pve/0046-virtio-blk-scsi-work-around-iothread-polling-getting.patch
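
For readability, both hunks in the quoted patch apply the same per-queue pattern in the drained_end handlers. The sketch below only consolidates that pattern into a single helper; the QEMU calls are exactly the ones used in the diff, while the helper name and its is_event_vq parameter are illustrative and not part of the patch.

/*
 * Consolidated sketch of the drained_end changes above (assumes a QEMU
 * source tree; helper name and is_event_vq parameter are illustrative).
 */
#include "qemu/osdep.h"
#include "hw/virtio/virtio.h"

static void reattach_and_kick_queue(VirtIODevice *vdev, int idx,
                                    AioContext *ctx, bool is_event_vq)
{
    VirtQueue *vq = virtio_get_queue(vdev, idx);

    /* Draining may have left notifications disabled; re-enable them. */
    if (!virtio_queue_get_notification(vq)) {
        virtio_queue_set_notification(vq, true);
    }

    /*
     * The virtio-scsi event virtqueue must not be polled (see commit
     * 38738f7dbb), so it is attached with the _no_poll() variant, as in
     * virtio_scsi_dataplane_start().
     */
    if (is_event_vq) {
        virtio_queue_aio_attach_host_notifier_no_poll(vq, ctx);
    } else {
        virtio_queue_aio_attach_host_notifier(vq, ctx);
    }

    /* Kick the queue so requests queued during the drain get processed. */
    virtio_queue_notify(vdev, idx);
}

Kicking the queue after re-attaching the notifier should cover requests the guest submitted while notifications were still off, which is why the commit message treats the extra cost as acceptable on the rare drain path.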