From: Fiona Ebner <f.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH qemu] add fix for crash during live migration in combination with block flush
Date: Thu, 16 Jan 2025 11:30:03 +0100
Message-ID: <760bc33f-0c7d-4df7-9b1d-e44f823c1df7@proxmox.com>
In-Reply-To: <3e462976-81be-4025-b7b0-b546a51c2246@proxmox.com>
On 15.01.25 at 17:28, Thomas Lamprecht wrote:
> On 08.01.25 at 14:03, Fiona Ebner wrote:
>> Setting blk->root is a graph change operation and thus needs to be
>> protected by the block graph write lock in blk_remove_bs(). The
>> assignment to blk->root in blk_insert_bs() is already protected by
>> the block graph write lock.
>>
>> In particular, the graph read lock in blk_co_do_flush() could
>> previously not ensure that blk_bs(blk) would always return the same
>> value during the locked section, which could lead to a segfault [0] in
>> combination with migration [1].
>>
>> From the user-provided backtraces in the forum thread [1], it seems
>> like blk_co_do_flush() managed to get past the
>> blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
>> non-NULL value during the check, but then, when calling
>> bdrv_co_flush(), blk_bs(blk) returned NULL.
>>
>> [0]:
>>
>>> 0 bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
>>> 1 bdrv_co_flush (bs=0x0) at ../block/io.c:2948
>>> 2 bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
>>
>> [1]: https://forum.proxmox.com/threads/158072
>>
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>>
>> Upstream submission of the same patch:
>> https://lore.kernel.org/qemu-devel/20250108124649.333668-1-f.ebner@proxmox.com/T/
>
> I only skimmed the upstream discussion, but seems that there is still some
> issue left; so should I wait this version out?
Yes, we should at least also move the "root = blk->root;" assignment into
the write lock section, as the upstream maintainer suggested. That more
complete change is in the package provided to the forum user.

The change should still be an improvement over the status quo. However,
the user reported that it didn't help with the specific crash, and I don't
currently see other code paths that would fit the provided backtraces :/

I'll ask the user to try again with a more complete GDB script in the
hope of discovering something I missed.