From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Date: Tue, 7 Mar 2023 15:19:44 +0100
Message-Id: <20230307141944.2337485-1-f.ebner@proxmox.com>
Subject: [pve-devel] [PATCH kernel] add patch to fix issue with large IO requests

Several people have reported IO-related issues since kernel 6.1.6 [0].
Things got better with 6.1.10, but apparently the issues are not fully
resolved (e.g. [1]).

I ran into an issue with PBS backup of a VM with passed-through disks
(error with 6.1.6, hang with 6.1.10+) and found that the issue no longer
occurred with v6.3-rc1. Bisecting for the fix led to the commit in this
patch. The hope is that it fixes some of the other reported issues too.

The commit has a CC-stable tag for 5.15+, but judging from the absence
of user reports, the underlying bug was much less likely to trigger
before 6.1.x (it's not clear what x is, because of the other issue in
6.1.6).

The commit says it depends on 613b14884b85 ("block: handle
bio_split_to_limits() NULL return"), which is already present as
a3f1c82e0413 ("block: handle bio_split_to_limits() NULL return") in the
Ubuntu tree.

[0]: https://forum.proxmox.com/threads/119483/post-530365
[1]: https://forum.proxmox.com/threads/119483/post-537991

Signed-off-by: Fiona Ebner
---
 ...w-multiple-bios-for-IOCB_NOWAIT-issu.patch | 69 +++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 patches/kernel/0018-block-don-t-allow-multiple-bios-for-IOCB_NOWAIT-issu.patch

diff --git a/patches/kernel/0018-block-don-t-allow-multiple-bios-for-IOCB_NOWAIT-issu.patch b/patches/kernel/0018-block-don-t-allow-multiple-bios-for-IOCB_NOWAIT-issu.patch
new file mode 100644
index 0000000..12e4453
--- /dev/null
+++ b/patches/kernel/0018-block-don-t-allow-multiple-bios-for-IOCB_NOWAIT-issu.patch
@@ -0,0 +1,69 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Jens Axboe
+Date: Mon, 16 Jan 2023 08:55:53 -0700
+Subject: [PATCH] block: don't allow multiple bios for IOCB_NOWAIT issue
+
+If we're doing a large IO request which needs to be split into multiple
+bios for issue, then we can run into the same situation as the below
+marked commit fixes - parts will complete just fine, one or more parts
+will fail to allocate a request. This will result in a partially
+completed read or write request, where the caller gets EAGAIN even though
+parts of the IO completed just fine.
+
+Do the same for large bios as we do for splits - fail a NOWAIT request
+with EAGAIN. This isn't technically fixing an issue in the below marked
+patch, but for stable purposes, we should have either none of them or
+both.
+
+This depends on: 613b14884b85 ("block: handle bio_split_to_limits() NULL return")
+
+Cc: stable@vger.kernel.org # 5.15+
+Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
+Link: https://github.com/axboe/liburing/issues/766
+Reported-and-tested-by: Michael Kelley
+Signed-off-by: Jens Axboe
+(commit 67d59247d4b52c917e373f05a807027756ab216f upstream)
+Signed-off-by: Fiona Ebner
+---
+ block/fops.c | 21 ++++++++++++++++++---
+ 1 file changed, 18 insertions(+), 3 deletions(-)
+
+diff --git a/block/fops.c b/block/fops.c
+index b90742595317..e406aa605327 100644
+--- a/block/fops.c
++++ b/block/fops.c
+@@ -221,6 +221,24 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
+ 			bio_endio(bio);
+ 			break;
+ 		}
++		if (iocb->ki_flags & IOCB_NOWAIT) {
++			/*
++			 * This is nonblocking IO, and we need to allocate
++			 * another bio if we have data left to map. As we
++			 * cannot guarantee that one of the sub bios will not
++			 * fail getting issued FOR NOWAIT and as error results
++			 * are coalesced across all of them, be safe and ask for
++			 * a retry of this from blocking context.
++			 */
++			if (unlikely(iov_iter_count(iter))) {
++				bio_release_pages(bio, false);
++				bio_clear_flag(bio, BIO_REFFED);
++				bio_put(bio);
++				blk_finish_plug(&plug);
++				return -EAGAIN;
++			}
++			bio->bi_opf |= REQ_NOWAIT;
++		}
+ 
+ 		if (is_read) {
+ 			if (dio->flags & DIO_SHOULD_DIRTY)
+@@ -228,9 +246,6 @@ static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
+ 		} else {
+ 			task_io_account_write(bio->bi_iter.bi_size);
+ 		}
+-		if (iocb->ki_flags & IOCB_NOWAIT)
+-			bio->bi_opf |= REQ_NOWAIT;
+-
+ 		dio->size += bio->bi_iter.bi_size;
+ 		pos += bio->bi_iter.bi_size;
+ 
-- 
2.30.2
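
Illustration only, not part of the patch: the new comment in block/fops.c
asks for "a retry of this from blocking context" when a NOWAIT request
would need more than one bio. Below is a minimal userspace sketch of what
such a retry can look like from the caller's side, assuming preadv2() with
RWF_NOWAIT (which the kernel turns into IOCB_NOWAIT) on Linux with glibc.
The helper names read_nowait_with_fallback() and demo_roundtrip() are
hypothetical, and the demo uses a regular temporary file rather than a
block device opened with O_DIRECT, so it shows the retry pattern and not
the exact code path touched above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): try a nonblocking read first
 * and fall back to a blocking pread() if the kernel asks for a retry.
 */
static ssize_t read_nowait_with_fallback(int fd, void *buf, size_t len,
					 off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);

	ret = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (ret >= 0)
		return ret;

	/*
	 * EAGAIN: the request could not be served without blocking.
	 * EOPNOTSUPP/ENOSYS: NOWAIT is not available for this file or kernel.
	 * In all three cases, retry from blocking context.
	 */
	if (errno == EAGAIN || errno == EOPNOTSUPP || errno == ENOSYS)
		return pread(fd, buf, len, off);

	return -1;
}

/* Round-trip check on a temporary file; returns 0 on success. */
static int demo_roundtrip(void)
{
	char path[] = "/tmp/nowait_demo_XXXXXX";
	char buf[8] = { 0 };
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	unlink(path);

	ok = write(fd, "hello", 5) == 5 &&
	     read_nowait_with_fallback(fd, buf, sizeof(buf), 0) == 5 &&
	     memcmp(buf, "hello", 5) == 0;
	close(fd);
	return ok ? 0 : -1;
}
```

With the fix applied, a caller following this pattern should see either a
complete nonblocking result or a clean -EAGAIN, rather than -EAGAIN after
parts of the IO have already completed.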