all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Stefan Reiter <s.reiter@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC qemu] migration/block-dirty-bitmap: migrate other bitmaps even if one fails
Date: Tue,  3 Nov 2020 14:57:32 +0100	[thread overview]
Message-ID: <20201103135732.3313-1-s.reiter@proxmox.com> (raw)

If the checks in bdrv_dirty_bitmap_check fail, that only means that this
one specific bitmap cannot be migrated. That is not an error condition
for any other bitmaps on the same block device.

Fixes dirty-bitmap migration with sync=bitmap, as the bitmaps used for
that are obviously marked as "busy", which would cause none at all to be
transferred.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---

NOTE: This is more or less a required workaround. The correct solution would
probably be to somehow exclude the bitmap used for sync=bitmap for migration
purposes, or handle it correctly otherwise.

ALSO: This does *not* fix the original bug, wherein QEMU segfaults on the source
side in sync=bitmap mode if any of the migration checks after the drive-mirror
fail. What this patch does is remove the error condition (which makes sense
either way), but if there is a different error, I'd still expect it to SEGV.

@Fabian: Potentially a bug in error handling from the bitmap sync mode patches
you carried? I'll keep looking too.

Crash is easily reproducible, just live-migrate a VM that has a replicated drive
on 5.1.0-4. 'add_bitmaps_to_list' will fail, and then QEMU segfaults with the
following stack trace (ioc=0 is the fault):

#0    0x0000555555de8b2f in qio_channel_detach_aio_context (ioc=0x0) at io/channel.c:452
#1    0x0000555555d91bd2 in nbd_client_detach_aio_context (bs=0x7fff879e91c0) at block/nbd.c:146
#2    0x0000555555d05050 in bdrv_detach_aio_context (bs=0x7fff879e91c0) at block.c:6267
#3    0x0000555555d05347 in bdrv_set_aio_context_ignore (bs=0x7fff879e91c0, new_context=0x7fffea332500, ignore=0x7fffffffcdc0) at block.c:6346
#4    0x0000555555d056f7 in bdrv_child_try_set_aio_context (bs=0x7fff879e91c0, ctx=0x7fffea332500, ignore_child=0x0, errp=0x0) at block.c:6450
#5    0x0000555555d0574e in bdrv_try_set_aio_context (bs=0x7fff879e91c0, ctx=0x7fffea332500, errp=0x0) at block.c:6459
#6    0x0000555555cfced7 in bdrv_replace_child (child=0x7fffea2cd580, new_bs=0x0) at block.c:2654
#7    0x0000555555cfd35e in bdrv_detach_child (child=0x7fffea2cd580) at block.c:2773
#8    0x0000555555cfd3a0 in bdrv_root_unref_child (child=0x7fffea2cd580) at block.c:2784
#9    0x0000555555d06f5b in block_job_remove_all_bdrv (job=0x7fffea896500) at blockjob.c:191
#10   0x0000555555d7c963 in mirror_exit_common (job=0x7fffea896500) at block/mirror.c:743
#11   0x0000555555d7cb02 in mirror_abort (job=0x7fffea896500) at block/mirror.c:783
#12   0x0000555555d0971a in job_abort (job=0x7fffea896500) at job.c:692
#13   0x0000555555d097be in job_finalize_single (job=0x7fffea896500) at job.c:713
#14   0x0000555555d09a47 in job_completed_txn_abort (job=0x7fffea896500) at job.c:791
#15   0x0000555555d09dc0 in job_completed (job=0x7fffea896500) at job.c:888
#16   0x0000555555d09e21 in job_exit (opaque=0x7fffea896500) at job.c:910
#17   0x0000555555e676fc in aio_bh_call (bh=0x7fffe8c141e0) at util/async.c:136
#18   0x0000555555e67806 in aio_bh_poll (ctx=0x7fffea332500) at util/async.c:164
#19   0x0000555555e51766 in aio_dispatch (ctx=0x7fffea332500) at util/aio-posix.c:380
#20   0x0000555555e67c39 in aio_ctx_dispatch (source=0x7fffea332500, callback=0x0, user_data=0x0) at util/async.c:306
#21   0x00007ffff7bf0f2e in g_main_context_dispatch + 0x2ae () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#22   0x0000555555e6ff77 in glib_pollfds_poll () at util/main-loop.c:217
#23   0x0000555555e6fff1 in os_host_main_loop_wait (timeout=0xd53f4) at util/main-loop.c:240
#24   0x0000555555e700f6 in main_loop_wait (nonblocking=0x0) at util/main-loop.c:516


 migration/block-dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 5bf0d9fbc6..1070c19181 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -323,7 +323,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
 
         if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_DEFAULT, &local_err)) {
             error_report_err(local_err);
-            return -1;
+            continue;
         }
 
         bdrv_ref(bs);
-- 
2.20.1





             reply	other threads:[~2020-11-03 14:00 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-03 13:57 Stefan Reiter [this message]
2020-11-04 17:42 ` [pve-devel] applied: " Thomas Lamprecht

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201103135732.3313-1-s.reiter@proxmox.com \
    --to=s.reiter@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal