From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id 341B71FF15C
	for <inbox@lore.proxmox.com>; Wed,  8 Jan 2025 14:03:24 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 1E95419DDD;
	Wed,  8 Jan 2025 14:03:11 +0100 (CET)
From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Wed,  8 Jan 2025 14:03:04 +0100
Message-Id: <20250108130304.343460-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
Subject: [pve-devel] [PATCH qemu] add fix for crash during live migration in
 combination with block flush
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

Setting blk->root is a graph change operation and thus needs to be
protected by the block graph write lock in blk_remove_bs(). The
assignment to blk->root in blk_insert_bs() is already protected by
the block graph write lock.

In particular, the graph read lock in blk_co_do_flush() previously
could not guarantee that blk_bs(blk) returned the same value
throughout the locked section, which could lead to a segfault [0] in
combination with migration [1].

>From the user-provided backtraces in the forum thread [1], it seems
like blk_co_do_flush() managed to get past the
blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
non-NULL value during the check, but then, when calling
bdrv_co_flush(), blk_bs(blk) returned NULL.

[0]:

> 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901

[1]: https://forum.proxmox.com/threads/158072

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Upstream submission of the same patch:
https://lore.kernel.org/qemu-devel/20250108124649.333668-1-f.ebner@proxmox.com/T/

 ...otect-setting-block-root-to-NULL-wit.patch | 51 +++++++++++++++++++
 debian/patches/series                         |  1 +
 2 files changed, 52 insertions(+)
 create mode 100644 debian/patches/extra/0007-block-backend-protect-setting-block-root-to-NULL-wit.patch

diff --git a/debian/patches/extra/0007-block-backend-protect-setting-block-root-to-NULL-wit.patch b/debian/patches/extra/0007-block-backend-protect-setting-block-root-to-NULL-wit.patch
new file mode 100644
index 0000000..7ff996f
--- /dev/null
+++ b/debian/patches/extra/0007-block-backend-protect-setting-block-root-to-NULL-wit.patch
@@ -0,0 +1,51 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Fiona Ebner <f.ebner@proxmox.com>
+Date: Wed, 8 Jan 2025 12:41:20 +0100
+Subject: [PATCH] block-backend: protect setting block root to NULL with block
+ graph write lock
+
+Setting blk->root is a graph change operation and thus needs to be
+protected by the block graph write lock in blk_remove_bs(). The
+assignment to blk->root in blk_insert_bs() is already protected by
+the block graph write lock.
+
+In particular, the graph read lock in blk_co_do_flush() could
+previously not ensure that blk_bs(blk) would always return the same
+value during the locked section, which could lead to a segfault [0] in
+combination with migration [1].
+
+From the user-provided backtraces in the forum thread [1], it seems
+like blk_co_do_flush() managed to get past the
+blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
+non-NULL value during the check, but then, when calling
+bdrv_co_flush(), blk_bs(blk) returned NULL.
+
+[0]:
+
+> 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
+> 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
+> 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
+
+[1]: https://forum.proxmox.com/threads/158072
+
+Cc: qemu-stable@nongnu.org
+Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
+---
+ block/block-backend.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/block/block-backend.c b/block/block-backend.c
+index db6f9b92a3..68ae681139 100644
+--- a/block/block-backend.c
++++ b/block/block-backend.c
+@@ -896,9 +896,9 @@ void blk_remove_bs(BlockBackend *blk)
+      */
+     blk_drain(blk);
+     root = blk->root;
+-    blk->root = NULL;
+ 
+     bdrv_graph_wrlock();
++    blk->root = NULL;
+     bdrv_root_unref_child(root);
+     bdrv_graph_wrunlock();
+ }
diff --git a/debian/patches/series b/debian/patches/series
index 0b48878..18bf974 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -4,6 +4,7 @@ extra/0003-ide-avoid-potential-deadlock-when-draining-during-tr.patch
 extra/0004-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
 extra/0005-virtio-net-Add-queues-before-loading-them.patch
 extra/0006-virtio-net-Fix-size-check-in-dhclient-workaround.patch
+extra/0007-block-backend-protect-setting-block-root-to-NULL-wit.patch
 bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
 bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
 bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
-- 
2.39.5


