Date: Wed, 03 Jul 2024 15:15:36 +0200
From: Fabian Grünbichler
To: Proxmox VE development discussion
References: <20240528085005.45859-1-f.ebner@proxmox.com>
In-Reply-To: <20240528085005.45859-1-f.ebner@proxmox.com>
Message-Id: <1720012217.r27ketiaun.astroid@yuna.none>
Subject: Re: [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk

On May 28, 2024 10:50 am, Fiona Ebner wrote:
> There is a possibility that the drive-mirror job is not yet done when
> the migration wants to inactivate the source's blockdrives:
>
>> bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
>
> This can be prevented by using the 'write-blocking' copy mode (also
> called active mode) for the mirror. However, with active mode, the
> guest write speed is limited by the synchronous writes to the mirror
> target. For this reason, a way to start out in the faster 'background'
> mode and later switch to active mode was introduced in QEMU 8.2.
>
> The switch is done once the mirror job for all drives is ready to be
> completed to reduce the time spent where guest IO is limited.
>
> Reported rarely, but steadily over the years:
> https://forum.proxmox.com/threads/78954/post-353651
> https://forum.proxmox.com/threads/78954/post-380015
> https://forum.proxmox.com/threads/100020/post-431660
> https://forum.proxmox.com/threads/111831/post-482425
> https://forum.proxmox.com/threads/111831/post-499807
> https://forum.proxmox.com/threads/137849/
>
> Signed-off-by: Fiona Ebner
> ---
>
> Changes in v2:
>     * check for running QEMU version instead of installed version
>
>  PVE/QemuMigrate.pm                    |  8 ++++++
>  PVE/QemuServer.pm                     | 41 +++++++++++++++++++++++++++
>  test/MigrationTest/QemuMigrateMock.pm |  6 ++++
>  3 files changed, 55 insertions(+)
>
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 33d5b2d1..d7ee4a5b 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -1145,6 +1145,14 @@ sub phase2 {
>              $self->log('info', "$drive: start migration to $nbd_uri");
>              PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap);
>          }
> +
> +        if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) {
> +            $self->log('info', "switching mirror jobs to actively synced mode");
> +            PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode(
> +                $vmid,
> +                $self->{storage_migration_jobs},
> +            );
> +        }
>      }
>
>      $self->log('info', "starting online/live migration on $migrate_uri");
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 5df0c96d..d472e805 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -8122,6 +8122,47 @@ sub qemu_blockjobs_cancel {
>      }
>  }
>
> +# Callers should version guard this (only available with a binary >= QEMU 8.2)
> +sub qemu_drive_mirror_switch_to_active_mode {
> +    my ($vmid, $jobs) = @_;
> +
> +    my $switching = {};
> +
> +    for my $job (sort keys $jobs->%*) {
> +        print "$job: switching to actively synced mode\n";
> +
> +        eval {
> +            mon_cmd(
> +                $vmid,
> +                "block-job-change",
> +                id => $job,
> +                type => 'mirror',
> +                'copy-mode' => 'write-blocking',
> +            );
> +            $switching->{$job} = 1;
> +        };
> +        die "could not switch mirror job $job to active mode - $@\n" if $@;
> +    }
> +
> +    while (1) {
> +        my $stats = mon_cmd($vmid, "query-block-jobs");
> +
> +        my $running_jobs = {};
> +        $running_jobs->{$_->{device}} = $_ for $stats->@*;
> +
> +        for my $job (sort keys $switching->%*) {
> +            if ($running_jobs->{$job}->{'actively-synced'}) {
> +                print "$job: successfully switched to actively synced mode\n";
> +                delete $switching->{$job};
> +            }
> +        }
> +
> +        last if scalar(keys $switching->%*) == 0;
> +
> +        sleep 1;
> +    }

so what could be the cause here for a job not switching? and do we really
want to loop forever if it happens?

> +}
> +
>  # Check for bug #4525: drive-mirror will open the target drive with the same aio setting as the
>  # source, but some storages have problems with io_uring, sometimes even leading to crashes.
>  my sub clone_disk_check_io_uring {
> diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm
> index 1efabe24..f5b44424 100644
> --- a/test/MigrationTest/QemuMigrateMock.pm
> +++ b/test/MigrationTest/QemuMigrateMock.pm
> @@ -152,6 +152,9 @@ $MigrationTest::Shared::qemu_server_module->mock(
>          }
>          return;
>      },
> +    qemu_drive_mirror_switch_to_active_mode => sub {
> +        return;
> +    },
>      set_migration_caps => sub {
>          return;
>      },
> @@ -185,6 +188,9 @@ $qemu_server_machine_module->mock(
>              if !defined($vm_status->{runningmachine});
>          return $vm_status->{runningmachine};
>      },
> +    runs_at_least_qemu_version => sub {
> +        return 1;
> +    },
>  );
>
>  my $ssh_info_module = Test::MockModule->new("PVE::SSHInfo");
> --
> 2.39.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
>
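
to illustrate the "loop forever" concern, a rough sketch of how the wait could be bounded. Everything named here is made up for illustration and is not part of qemu-server: the helper wait_for_active_mode, its $timeout and $interval parameters, and the $query callback, which stands in for mon_cmd($vmid, "query-block-jobs") so the snippet runs without a VM. It also assumes that a job which no longer shows up in the query result should be treated as an error rather than waited on; whether that matches how QEMU actually reports a failed mirror would need checking.

```perl
use strict;
use warnings;

# Hypothetical helper (not in qemu-server): poll until every job in $jobs
# reports 'actively-synced', and die once $timeout seconds have passed.
# $query is a code ref standing in for mon_cmd($vmid, "query-block-jobs").
sub wait_for_active_mode {
    my ($jobs, $query, $timeout, $interval) = @_;

    my %switching = map { $_ => 1 } keys $jobs->%*;
    my $deadline = time() + $timeout;

    while (1) {
        my $stats = $query->();

        my $running_jobs = {};
        $running_jobs->{$_->{device}} = $_ for $stats->@*;

        for my $job (sort keys %switching) {
            # Assumption: a job missing from the result will never switch.
            my $info = $running_jobs->{$job}
                or die "$job: mirror job vanished while switching to active mode\n";
            delete $switching{$job} if $info->{'actively-synced'};
        }

        return if !%switching;

        die "timeout switching to active mode: " . join(', ', sort keys %switching) . "\n"
            if time() >= $deadline;

        sleep($interval);
    }
}

# Usage with a mocked query: the job reports 'actively-synced' on the
# second poll, so the helper returns after two calls.
my $calls = 0;
my $mock_query = sub {
    $calls++;
    return [ { device => 'drive-scsi0', 'actively-synced' => ($calls >= 2 ? 1 : 0) } ];
};

wait_for_active_mode({ 'drive-scsi0' => {} }, $mock_query, 5, 0);
print "switched after $calls polls\n";    # prints "switched after 2 polls"
```

with something along these lines the migration would abort with an error instead of hanging, and the existing cleanup path for the mirror jobs could take over. What a sensible timeout value is depends on the answer to the first question above.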