From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id EDA9D8928 for ; Fri, 28 Jul 2023 12:59:26 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id D750E1438C for ; Fri, 28 Jul 2023 12:59:26 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Fri, 28 Jul 2023 12:59:25 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 161FF41840 for ; Fri, 28 Jul 2023 12:59:25 +0200 (CEST) Message-ID: <3caaa423-98b9-a56f-bed9-1bb1a3bfe26f@proxmox.com> Date: Fri, 28 Jul 2023 12:59:23 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 To: Proxmox VE development discussion , Fiona Ebner References: <20230728094457.145148-1-f.ebner@proxmox.com> Content-Language: en-US From: Friedrich Weber In-Reply-To: <20230728094457.145148-1-f.ebner@proxmox.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-SPAM-LEVEL: Spam detection results: 0 AWL -0.047 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment NICE_REPLY_A -0.091 Looks like a legit reply (A) SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS 
-0.001 SPF: sender matches SPF record
T_SCC_BODY_TEXT_LINE -0.01 -
URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [proxmox.com]
Subject: Re: [pve-devel] [PATCH qemu] add patch fixing resume for snapshot and hibernate with drive with iothread and a dirty bitmap
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 28 Jul 2023 10:59:27 -0000

As described, I made a PBS backup of an existing VM and then took a
snapshot. Rolling back to the snapshot failed with
"qemu: qemu_mutex_unlock_impl: Operation not permitted".
With this patch applied, rolling back worked.

Tested-by: Friedrich Weber

On 28/07/2023 11:44, Fiona Ebner wrote:
> Not difficult to run into, just have a drive with iothread, take a PBS
> backup and then take a snapshot or hibernate. Resuming will fail with
>> qemu: qemu_mutex_unlock_impl: Operation not permitted
> because of not acquiring the correct AioContext first.
>
> Migration is not affected, because it runs in coroutine context.
>
> Reported in the community forum:
> https://forum.proxmox.com/threads/129899/
>
> Signed-off-by: Fiona Ebner
> ---
>
> Surprised there were not more reports, but it could also be that
> people are now sitting on some snapshots which can't be rolled back
> without this fix.
>
> Will try to reproduce the issue with upstream QEMU (don't see why they
> wouldn't be affected) and upstream the fix if they are affected too.
>
> ...dirty-bitmap-fix-loading-bitmap-when.patch | 48 +++++++++++++++++++
> ...dirty-bitmap-migrate-other-bitmaps-e.patch | 2 +-
> ...apshots-hold-the-BQL-during-setup-ca.patch | 6 +--
> debian/patches/series | 1 +
> 4 files changed, 53 insertions(+), 4 deletions(-)
> create mode 100644 debian/patches/extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
>
> diff --git a/debian/patches/extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch b/debian/patches/extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
> new file mode 100644
> index 0000000..bb01ced
> --- /dev/null
> +++ b/debian/patches/extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
> @@ -0,0 +1,48 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Fiona Ebner
> +Date: Fri, 28 Jul 2023 10:47:48 +0200
> +Subject: [PATCH] migration/block-dirty-bitmap: fix loading bitmap when there
> + is an iothread
> +
> +The bdrv_create_dirty_bitmap() function (which is also called by
> +bdrv_dirty_bitmap_create_successor()) uses bdrv_getlength(bs). This is
> +a wrapper around a coroutine, and thus uses bdrv_poll_co(). Polling
> +tries to release the AioContext which will trigger an assert() if it
> +hasn't been acquired before.
> +
> +The issue does not happen for migration, because there we are in a
> +coroutine already, so the wrapper will just call bdrv_co_getlength()
> +directly without polling.
> +
> +Signed-off-by: Fiona Ebner
> +---
> + migration/block-dirty-bitmap.c | 6 ++++++
> + 1 file changed, 6 insertions(+)
> +
> +diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> +index fe73aa94b1..7eaf498439 100644
> +--- a/migration/block-dirty-bitmap.c
> ++++ b/migration/block-dirty-bitmap.c
> +@@ -805,8 +805,11 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
> + "destination", bdrv_dirty_bitmap_name(s->bitmap));
> + return -EINVAL;
> + } else {
> ++ AioContext *ctx = bdrv_get_aio_context(s->bs);
> ++ aio_context_acquire(ctx);
> + s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
> + s->bitmap_name, &local_err);
> ++ aio_context_release(ctx);
> + if (!s->bitmap) {
> + error_report_err(local_err);
> + return -EINVAL;
> +@@ -833,7 +836,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
> +
> + bdrv_disable_dirty_bitmap(s->bitmap);
> + if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
> ++ AioContext *ctx = bdrv_get_aio_context(s->bs);
> ++ aio_context_acquire(ctx);
> + bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
> ++ aio_context_release(ctx);
> + if (local_err) {
> + error_report_err(local_err);
> + return -EINVAL;
> diff --git a/debian/patches/pve/0035-migration-block-dirty-bitmap-migrate-other-bitmaps-e.patch b/debian/patches/pve/0035-migration-block-dirty-bitmap-migrate-other-bitmaps-e.patch
> index 0e3f38d..bd721fc 100644
> --- a/debian/patches/pve/0035-migration-block-dirty-bitmap-migrate-other-bitmaps-e.patch
> +++ b/debian/patches/pve/0035-migration-block-dirty-bitmap-migrate-other-bitmaps-e.patch
> @@ -19,7 +19,7 @@ Signed-off-by: Thomas Lamprecht
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> -index fe73aa94b1..a6440929fa 100644
> +index 7eaf498439..509f3df0a6 100644
> --- a/migration/block-dirty-bitmap.c
> +++ b/migration/block-dirty-bitmap.c
> @@ -539,7 +539,7 @@ static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
> diff --git a/debian/patches/pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch b/debian/patches/pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
> index cbc39cc..04ef6cb 100644
> --- a/debian/patches/pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
> +++ b/debian/patches/pve/0044-migration-for-snapshots-hold-the-BQL-during-setup-ca.patch
> @@ -67,10 +67,10 @@ index a8dfd8fefd..fa9b0b0f10 100644
> * must_precopy:
> * - must be migrated in precopy or in stopped state
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> -index a6440929fa..69fab3275c 100644
> +index 509f3df0a6..42dc4a8d61 100644
> --- a/migration/block-dirty-bitmap.c
> +++ b/migration/block-dirty-bitmap.c
> -@@ -1214,10 +1214,17 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
> +@@ -1220,10 +1220,17 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
> {
> DBMSaveState *s = &((DBMState *)opaque)->save;
> SaveBitmapState *dbms = NULL;
> @@ -90,7 +90,7 @@ index a6440929fa..69fab3275c 100644
> return -1;
> }
>
> -@@ -1225,7 +1232,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
> +@@ -1231,7 +1238,9 @@ static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
> send_bitmap_start(f, s, dbms);
> }
> qemu_put_bitmap_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
> diff --git a/debian/patches/series b/debian/patches/series
> index c9c96d7..a4dd4c2 100644
> --- a/debian/patches/series
> +++ b/debian/patches/series
> @@ -7,6 +7,7 @@ extra/0006-lsi53c895a-disable-reentrancy-detection-for-script-R.patch
> extra/0007-bcm2835_property-disable-reentrancy-detection-for-io.patch
> extra/0008-raven-disable-reentrancy-detection-for-iomem.patch
> extra/0009-apic-disable-reentrancy-detection-for-apic-msi.patch
> +extra/0010-migration-block-dirty-bitmap-fix-loading-bitmap-when.patch
> bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
> bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
> bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch