From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk
Date: Wed, 03 Jul 2024 15:15:36 +0200
Message-ID: <1720012217.r27ketiaun.astroid@yuna.none>
In-Reply-To: <20240528085005.45859-1-f.ebner@proxmox.com>
On May 28, 2024 10:50 am, Fiona Ebner wrote:
> There is a possibility that the drive-mirror job is not yet done when
> the migration wants to inactivate the source's blockdrives:
>
>> bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
>
> This can be prevented by using the 'write-blocking' copy mode (also
> called active mode) for the mirror. However, with active mode, the
> guest write speed is limited by the synchronous writes to the mirror
> target. For this reason, a way to start out in the faster 'background'
> mode and later switch to active mode was introduced in QEMU 8.2.
>
> The switch is done once the mirror job for all drives is ready to be
> completed to reduce the time spent where guest IO is limited.
>
> Reported rarely, but steadily over the years:
> https://forum.proxmox.com/threads/78954/post-353651
> https://forum.proxmox.com/threads/78954/post-380015
> https://forum.proxmox.com/threads/100020/post-431660
> https://forum.proxmox.com/threads/111831/post-482425
> https://forum.proxmox.com/threads/111831/post-499807
> https://forum.proxmox.com/threads/137849/
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Changes in v2:
> * check for running QEMU version instead of installed version
>
> PVE/QemuMigrate.pm | 8 ++++++
> PVE/QemuServer.pm | 41 +++++++++++++++++++++++++++
> test/MigrationTest/QemuMigrateMock.pm | 6 ++++
> 3 files changed, 55 insertions(+)
>
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 33d5b2d1..d7ee4a5b 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -1145,6 +1145,14 @@ sub phase2 {
> $self->log('info', "$drive: start migration to $nbd_uri");
> PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap);
> }
> +
> + if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) {
> + $self->log('info', "switching mirror jobs to actively synced mode");
> + PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode(
> + $vmid,
> + $self->{storage_migration_jobs},
> + );
> + }
> }
>
> $self->log('info', "starting online/live migration on $migrate_uri");
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 5df0c96d..d472e805 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -8122,6 +8122,47 @@ sub qemu_blockjobs_cancel {
> }
> }
>
> +# Callers should version guard this (only available with a binary >= QEMU 8.2)
> +sub qemu_drive_mirror_switch_to_active_mode {
> + my ($vmid, $jobs) = @_;
> +
> + my $switching = {};
> +
> + for my $job (sort keys $jobs->%*) {
> + print "$job: switching to actively synced mode\n";
> +
> + eval {
> + mon_cmd(
> + $vmid,
> + "block-job-change",
> + id => $job,
> + type => 'mirror',
> + 'copy-mode' => 'write-blocking',
> + );
> + $switching->{$job} = 1;
> + };
> + die "could not switch mirror job $job to active mode - $@\n" if $@;
> + }
> +
> + while (1) {
> + my $stats = mon_cmd($vmid, "query-block-jobs");
> +
> + my $running_jobs = {};
> + $running_jobs->{$_->{device}} = $_ for $stats->@*;
> +
> + for my $job (sort keys $switching->%*) {
> + if ($running_jobs->{$job}->{'actively-synced'}) {
> + print "$job: successfully switched to actively synced mode\n";
> + delete $switching->{$job};
> + }
> + }
> +
> + last if scalar(keys $switching->%*) == 0;
> +
> + sleep 1;
> + }
So what could cause a job to not switch here? And do we really want to
loop forever if that happens?
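
A rough, untested sketch of a bounded variant of the polling loop, to
illustrate the concern. The helper name, the $query callback (standing in
for mon_cmd($vmid, "query-block-jobs")), and the timeout/interval parameters
are all hypothetical and not part of the patch. It also treats a job that no
longer shows up in the query result as an error, which is an assumption about
the desired behavior:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: poll until every job reports
# 'actively-synced', or die once $timeout seconds have passed.
# $query must return an array ref of job hashes, like query-block-jobs does.
sub wait_for_active_mode {
    my ($query, $jobs, $timeout, $interval) = @_;

    my $switching = { map { $_ => 1 } @$jobs };
    my $deadline = time() + $timeout;

    while (1) {
        my $stats = $query->();
        my $running_jobs = { map { $_->{device} => $_ } @$stats };

        for my $job (sort keys %$switching) {
            # Assumption: a job missing from the result can never become
            # actively synced, so fail instead of waiting for it.
            die "$job: vanished while switching to active mode\n"
                if !$running_jobs->{$job};

            delete $switching->{$job}
                if $running_jobs->{$job}->{'actively-synced'};
        }

        return if !%$switching;

        die "timeout waiting for active mode: "
            . join(', ', sort keys %$switching) . "\n"
            if time() >= $deadline;

        sleep($interval);
    }
}
```

What a sensible timeout would be, and whether a timeout should abort the
migration or only log a warning, would still need to be decided.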
> +}
> +
> # Check for bug #4525: drive-mirror will open the target drive with the same aio setting as the
> # source, but some storages have problems with io_uring, sometimes even leading to crashes.
> my sub clone_disk_check_io_uring {
> diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm
> index 1efabe24..f5b44424 100644
> --- a/test/MigrationTest/QemuMigrateMock.pm
> +++ b/test/MigrationTest/QemuMigrateMock.pm
> @@ -152,6 +152,9 @@ $MigrationTest::Shared::qemu_server_module->mock(
> }
> return;
> },
> + qemu_drive_mirror_switch_to_active_mode => sub {
> + return;
> + },
> set_migration_caps => sub {
> return;
> },
> @@ -185,6 +188,9 @@ $qemu_server_machine_module->mock(
> if !defined($vm_status->{runningmachine});
> return $vm_status->{runningmachine};
> },
> + runs_at_least_qemu_version => sub {
> + return 1;
> + },
> );
>
> my $ssh_info_module = Test::MockModule->new("PVE::SSHInfo");
> --
> 2.39.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
>