* [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk @ 2024-05-28 8:50 Fiona Ebner 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 2/4] migration: handle replication: remove outdated and inaccurate check for QEMU version Fiona Ebner ` (3 more replies) 0 siblings, 4 replies; 10+ messages in thread From: Fiona Ebner @ 2024-05-28 8:50 UTC (permalink / raw) To: pve-devel There is a possibility that the drive-mirror job is not yet done when the migration wants to inactivate the source's blockdrives: > bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. This can be prevented by using the 'write-blocking' copy mode (also called active mode) for the mirror. However, with active mode, the guest write speed is limited by the synchronous writes to the mirror target. For this reason, a way to start out in the faster 'background' mode and later switch to active mode was introduced in QEMU 8.2. The switch is done once the mirror job for all drives is ready to be completed to reduce the time spent where guest IO is limited. 
Reported rarely, but steadily over the years: https://forum.proxmox.com/threads/78954/post-353651 https://forum.proxmox.com/threads/78954/post-380015 https://forum.proxmox.com/threads/100020/post-431660 https://forum.proxmox.com/threads/111831/post-482425 https://forum.proxmox.com/threads/111831/post-499807 https://forum.proxmox.com/threads/137849/ Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- Changes in v2: * check for running QEMU version instead of installed version PVE/QemuMigrate.pm | 8 ++++++ PVE/QemuServer.pm | 41 +++++++++++++++++++++++++++ test/MigrationTest/QemuMigrateMock.pm | 6 ++++ 3 files changed, 55 insertions(+) diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index 33d5b2d1..d7ee4a5b 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -1145,6 +1145,14 @@ sub phase2 { $self->log('info', "$drive: start migration to $nbd_uri"); PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap); } + + if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) { + $self->log('info', "switching mirror jobs to actively synced mode"); + PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode( + $vmid, + $self->{storage_migration_jobs}, + ); + } } $self->log('info', "starting online/live migration on $migrate_uri"); diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm index 5df0c96d..d472e805 100644 --- a/PVE/QemuServer.pm +++ b/PVE/QemuServer.pm @@ -8122,6 +8122,47 @@ sub qemu_blockjobs_cancel { } } +# Callers should version guard this (only available with a binary >= QEMU 8.2) +sub qemu_drive_mirror_switch_to_active_mode { + my ($vmid, $jobs) = @_; + + my $switching = {}; + + for my $job (sort keys $jobs->%*) { + print "$job: switching to actively synced mode\n"; + + eval { + mon_cmd( + $vmid, + "block-job-change", + id => $job, + type => 'mirror', + 'copy-mode' => 'write-blocking', + ); + $switching->{$job} = 1; + }; + die "could not switch mirror job 
$job to active mode - $@\n" if $@; + } + + while (1) { + my $stats = mon_cmd($vmid, "query-block-jobs"); + + my $running_jobs = {}; + $running_jobs->{$_->{device}} = $_ for $stats->@*; + + for my $job (sort keys $switching->%*) { + if ($running_jobs->{$job}->{'actively-synced'}) { + print "$job: successfully switched to actively synced mode\n"; + delete $switching->{$job}; + } + } + + last if scalar(keys $switching->%*) == 0; + + sleep 1; + } +} + # Check for bug #4525: drive-mirror will open the target drive with the same aio setting as the # source, but some storages have problems with io_uring, sometimes even leading to crashes. my sub clone_disk_check_io_uring { diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm index 1efabe24..f5b44424 100644 --- a/test/MigrationTest/QemuMigrateMock.pm +++ b/test/MigrationTest/QemuMigrateMock.pm @@ -152,6 +152,9 @@ $MigrationTest::Shared::qemu_server_module->mock( } return; }, + qemu_drive_mirror_switch_to_active_mode => sub { + return; + }, set_migration_caps => sub { return; }, @@ -185,6 +188,9 @@ $qemu_server_machine_module->mock( if !defined($vm_status->{runningmachine}); return $vm_status->{runningmachine}; }, + runs_at_least_qemu_version => sub { + return 1; + }, ); my $ssh_info_module = Test::MockModule->new("PVE::SSHInfo"); -- 2.39.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
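On the QMP wire, the switch that qemu_drive_mirror_switch_to_active_mode performs corresponds to the following commands (a sketch; the job id "drive-scsi0" is a placeholder for whatever mirror job ids are active):

```json
{ "execute": "block-job-change",
  "arguments": { "id": "drive-scsi0", "type": "mirror", "copy-mode": "write-blocking" } }

{ "execute": "query-block-jobs" }
```

The helper then polls query-block-jobs once per second until each job's reply entry reports "actively-synced": true, at which point guest writes are guaranteed to reach the mirror target synchronously.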
* [pve-devel] [PATCH v2 qemu-server 2/4] migration: handle replication: remove outdated and inaccurate check for QEMU version 2024-05-28 8:50 [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fiona Ebner @ 2024-05-28 8:50 ` Fiona Ebner 2024-07-03 13:10 ` [pve-devel] applied: " Fabian Grünbichler 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 3/4] backup: prepare: remove outdated QEMU version check Fiona Ebner ` (2 subsequent siblings) 3 siblings, 1 reply; 10+ messages in thread From: Fiona Ebner @ 2024-05-28 8:50 UTC (permalink / raw) To: pve-devel In Proxmox VE 8, the oldest supported QEMU version is 8.0, so a check for version 4.2 is not required anymore. The check was also wrong, because it checked the installed version and not the currently running one. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- New in v2. PVE/QemuMigrate.pm | 6 ------ 1 file changed, 6 deletions(-) diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index d7ee4a5b..34fc46ee 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -544,12 +544,6 @@ sub handle_replication { if $self->{opts}->{remote}; if ($self->{running}) { - - my $version = PVE::QemuServer::kvm_user_version(); - if (!min_version($version, 4, 2)) { - die "can't live migrate VM with replicated volumes, pve-qemu to old (< 4.2)!\n" - } - my @live_replicatable_volumes = $self->filter_local_volumes('online', 1); foreach my $volid (@live_replicatable_volumes) { my $drive = $local_volumes->{$volid}->{drivename}; -- 2.39.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
* [pve-devel] applied: [PATCH v2 qemu-server 2/4] migration: handle replication: remove outdated and inaccurate check for QEMU version 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 2/4] migration: handle replication: remove outdated and inaccurate check for QEMU version Fiona Ebner @ 2024-07-03 13:10 ` Fabian Grünbichler 0 siblings, 0 replies; 10+ messages in thread From: Fabian Grünbichler @ 2024-07-03 13:10 UTC (permalink / raw) To: Proxmox VE development discussion On May 28, 2024 10:50 am, Fiona Ebner wrote: > In Proxmox VE 8, the oldest supported QEMU version is 8.0, so a > check for version 4.2 is not required anymore. The check was also > wrong, because it checked the installed version and not the currently > running one. > > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> > --- > > New in v2. > > PVE/QemuMigrate.pm | 6 ------ > 1 file changed, 6 deletions(-) > > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm > index d7ee4a5b..34fc46ee 100644 > --- a/PVE/QemuMigrate.pm > +++ b/PVE/QemuMigrate.pm > @@ -544,12 +544,6 @@ sub handle_replication { > if $self->{opts}->{remote}; > > if ($self->{running}) { > - > - my $version = PVE::QemuServer::kvm_user_version(); > - if (!min_version($version, 4, 2)) { > - die "can't live migrate VM with replicated volumes, pve-qemu to old (< 4.2)!\n" > - } > - > my @live_replicatable_volumes = $self->filter_local_volumes('online', 1); > foreach my $volid (@live_replicatable_volumes) { > my $drive = $local_volumes->{$volid}->{drivename}; > -- > 2.39.2 > > > > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
* [pve-devel] [PATCH v2 qemu-server 3/4] backup: prepare: remove outdated QEMU version check 2024-05-28 8:50 [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fiona Ebner 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 2/4] migration: handle replication: remove outdated and inaccurate check for QEMU version Fiona Ebner @ 2024-05-28 8:50 ` Fiona Ebner 2024-07-03 13:10 ` [pve-devel] applied: " Fabian Grünbichler 2024-05-28 8:50 ` [pve-devel] [RFC v2 qemu-server 4/4] move helper to check running QEMU version out of the 'Machine' module Fiona Ebner 2024-07-03 13:15 ` [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fabian Grünbichler 3 siblings, 1 reply; 10+ messages in thread From: Fiona Ebner @ 2024-05-28 8:50 UTC (permalink / raw) To: pve-devel In Proxmox VE 8, the oldest supported QEMU version is 8.0, so a check for version 4.0.1 is not required anymore. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- New in v2. PVE/VZDump/QemuServer.pm | 4 ---- 1 file changed, 4 deletions(-) diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm index 8c97ee62..5248c6eb 100644 --- a/PVE/VZDump/QemuServer.pm +++ b/PVE/VZDump/QemuServer.pm @@ -90,10 +90,6 @@ sub prepare { if (!$volume->{included}) { $self->loginfo("exclude disk '$name' '$volid' ($volume->{reason})"); next; - } elsif ($self->{vm_was_running} && $volume_config->{iothread} && - !PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 4, 0, 1)) { - die "disk '$name' '$volid' (iothread=on) can't use backup feature with running QEMU " . - "version < 4.0.1! 
Either set backup=no for this drive or upgrade QEMU and restart VM\n"; } else { my $log = "include disk '$name' '$volid'"; if (defined(my $size = $volume_config->{size})) { -- 2.39.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
* [pve-devel] applied: [PATCH v2 qemu-server 3/4] backup: prepare: remove outdated QEMU version check 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 3/4] backup: prepare: remove outdated QEMU version check Fiona Ebner @ 2024-07-03 13:10 ` Fabian Grünbichler 0 siblings, 0 replies; 10+ messages in thread From: Fabian Grünbichler @ 2024-07-03 13:10 UTC (permalink / raw) To: Proxmox VE development discussion On May 28, 2024 10:50 am, Fiona Ebner wrote: > In Proxmox VE 8, the oldest supported QEMU version is 8.0, so a check > for version 4.0.1 is not required anymore. > > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> > --- > > New in v2. > > PVE/VZDump/QemuServer.pm | 4 ---- > 1 file changed, 4 deletions(-) > > diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm > index 8c97ee62..5248c6eb 100644 > --- a/PVE/VZDump/QemuServer.pm > +++ b/PVE/VZDump/QemuServer.pm > @@ -90,10 +90,6 @@ sub prepare { > if (!$volume->{included}) { > $self->loginfo("exclude disk '$name' '$volid' ($volume->{reason})"); > next; > - } elsif ($self->{vm_was_running} && $volume_config->{iothread} && > - !PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 4, 0, 1)) { > - die "disk '$name' '$volid' (iothread=on) can't use backup feature with running QEMU " . > - "version < 4.0.1! Either set backup=no for this drive or upgrade QEMU and restart VM\n"; > } else { > my $log = "include disk '$name' '$volid'"; > if (defined(my $size = $volume_config->{size})) { > -- > 2.39.2 > > > > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
* [pve-devel] [RFC v2 qemu-server 4/4] move helper to check running QEMU version out of the 'Machine' module 2024-05-28 8:50 [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fiona Ebner 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 2/4] migration: handle replication: remove outdated and inaccurate check for QEMU version Fiona Ebner 2024-05-28 8:50 ` [pve-devel] [PATCH v2 qemu-server 3/4] backup: prepare: remove outdated QEMU version check Fiona Ebner @ 2024-05-28 8:50 ` Fiona Ebner 2024-07-03 13:32 ` Fabian Grünbichler 2024-07-03 13:15 ` [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fabian Grünbichler 3 siblings, 1 reply; 10+ messages in thread From: Fiona Ebner @ 2024-05-28 8:50 UTC (permalink / raw) To: pve-devel The version of the running QEMU binary is not related to the machine version and so it's a bit confusing to have the helper in the 'Machine' module. It cannot live in the 'Helpers' module, because that would lead to a cyclic inclusion Helpers <-> Monitor. Thus, 'QMPHelpers' is chosen as the new home. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- New in v2. 
PVE/QemuMigrate.pm | 3 ++- PVE/QemuServer/Machine.pm | 12 ------------ PVE/QemuServer/QMPHelpers.pm | 13 +++++++++++++ test/MigrationTest/QemuMigrateMock.pm | 4 ++++ test/run_config2command_tests.pl | 4 ++-- 5 files changed, 21 insertions(+), 15 deletions(-) diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index 34fc46ee..e71face4 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -30,6 +30,7 @@ use PVE::QemuServer::Helpers qw(min_version); use PVE::QemuServer::Machine; use PVE::QemuServer::Monitor qw(mon_cmd); use PVE::QemuServer::Memory qw(get_current_memory); +use PVE::QemuServer::QMPHelpers; use PVE::QemuServer; use PVE::AbstractMigrate; @@ -1140,7 +1141,7 @@ sub phase2 { PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap); } - if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) { + if (PVE::QemuServer::QMPHelpers::runs_at_least_qemu_version($vmid, 8, 2)) { $self->log('info', "switching mirror jobs to actively synced mode"); PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode( $vmid, diff --git a/PVE/QemuServer/Machine.pm b/PVE/QemuServer/Machine.pm index cc92e7e6..a3917dae 100644 --- a/PVE/QemuServer/Machine.pm +++ b/PVE/QemuServer/Machine.pm @@ -161,18 +161,6 @@ sub can_run_pve_machine_version { return 0; } -# dies if a) VM not running or not exisiting b) Version query failed -# So, any defined return value is valid, any invalid state can be caught by eval -sub runs_at_least_qemu_version { - my ($vmid, $major, $minor, $extra) = @_; - - my $v = PVE::QemuServer::Monitor::mon_cmd($vmid, 'query-version'); - die "could not query currently running version for VM $vmid\n" if !defined($v); - $v = $v->{qemu}; - - return PVE::QemuServer::Helpers::version_cmp($v->{major}, $major, $v->{minor}, $minor, $v->{micro}, $extra) >= 0; -} - sub qemu_machine_pxe { my ($vmid, $conf) = @_; diff --git a/PVE/QemuServer/QMPHelpers.pm 
b/PVE/QemuServer/QMPHelpers.pm index d3a52327..0269ea46 100644 --- a/PVE/QemuServer/QMPHelpers.pm +++ b/PVE/QemuServer/QMPHelpers.pm @@ -3,6 +3,7 @@ package PVE::QemuServer::QMPHelpers; use warnings; use strict; +use PVE::QemuServer::Helpers; use PVE::QemuServer::Monitor qw(mon_cmd); use base 'Exporter'; @@ -45,4 +46,16 @@ sub qemu_objectdel { return 1; } +# dies if a) VM not running or not exisiting b) Version query failed +# So, any defined return value is valid, any invalid state can be caught by eval +sub runs_at_least_qemu_version { + my ($vmid, $major, $minor, $extra) = @_; + + my $v = PVE::QemuServer::Monitor::mon_cmd($vmid, 'query-version'); + die "could not query currently running version for VM $vmid\n" if !defined($v); + $v = $v->{qemu}; + + return PVE::QemuServer::Helpers::version_cmp($v->{major}, $major, $v->{minor}, $minor, $v->{micro}, $extra) >= 0; +} + 1; diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm index f5b44424..11c58c08 100644 --- a/test/MigrationTest/QemuMigrateMock.pm +++ b/test/MigrationTest/QemuMigrateMock.pm @@ -188,6 +188,10 @@ $qemu_server_machine_module->mock( if !defined($vm_status->{runningmachine}); return $vm_status->{runningmachine}; }, +); + +my $qemu_server_qmphelpers_module = Test::MockModule->new("PVE::QemuServer::QMPHelpers"); +$qemu_server_qmphelpers_module->mock( runs_at_least_qemu_version => sub { return 1; }, diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl index 7212acc4..d48ef562 100755 --- a/test/run_config2command_tests.pl +++ b/test/run_config2command_tests.pl @@ -16,7 +16,7 @@ use PVE::SysFSTools; use PVE::QemuConfig; use PVE::QemuServer; use PVE::QemuServer::Monitor; -use PVE::QemuServer::Machine; +use PVE::QemuServer::QMPHelpers; use PVE::QemuServer::CPUConfig; my $base_env = { @@ -472,7 +472,7 @@ sub do_test($) { # check if QEMU version set correctly and test version_cmp (my $qemu_major = get_test_qemu_version()) =~ s/\..*$//; die 
"runs_at_least_qemu_version returned false, maybe error in version_cmp?" - if !PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, $qemu_major); + if !PVE::QemuServer::QMPHelpers::runs_at_least_qemu_version($vmid, $qemu_major); $cmdline =~ s/ -/ \\\n -/g; # same as qm showcmd --pretty $cmdline .= "\n"; -- 2.39.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
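The moved helper compares the version reported by QMP's query-version against a wanted (major, minor, extra) triple. A minimal language-agnostic sketch of that comparison (illustrative Python with made-up example values; the real implementation is PVE::QemuServer::Helpers::version_cmp):

```python
def runs_at_least(running, wanted):
    """True if the running (major, minor, micro) tuple is at least the
    wanted version; omitted trailing parts of `wanted` count as zero."""
    wanted = tuple(wanted) + (0,) * (3 - len(wanted))
    return tuple(running) >= wanted  # lexicographic tuple comparison

# QMP 'query-version' returns e.g. {"qemu": {"major": 8, "minor": 2, "micro": 1}}
reply = {"qemu": {"major": 8, "minor": 2, "micro": 1}}
v = reply["qemu"]
running = (v["major"], v["minor"], v["micro"])

assert runs_at_least(running, (8, 2))        # new enough for block-job-change
assert not runs_at_least((8, 1, 5), (8, 2))  # too old
```

Python's lexicographic tuple ordering gives the same semantics as comparing major, then minor, then micro in turn.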
* Re: [pve-devel] [RFC v2 qemu-server 4/4] move helper to check running QEMU version out of the 'Machine' module 2024-05-28 8:50 ` [pve-devel] [RFC v2 qemu-server 4/4] move helper to check running QEMU version out of the 'Machine' module Fiona Ebner @ 2024-07-03 13:32 ` Fabian Grünbichler 0 siblings, 0 replies; 10+ messages in thread From: Fabian Grünbichler @ 2024-07-03 13:32 UTC (permalink / raw) To: Proxmox VE development discussion On May 28, 2024 10:50 am, Fiona Ebner wrote: > The version of the running QEMU binary is not related to the machine > version and so it's a bit confusing to have the helper in the > 'Machine' module. It cannot live in the 'Helpers' module, because that > would lead to a cyclic inclusion Helpers <-> Monitor. Thus, > 'QMPHelpers' is chosen as the new home. > > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Acked-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> but needs the first patch to be applied, or a re-order to move this first ;) > --- > > New in v2. 
> > PVE/QemuMigrate.pm | 3 ++- > PVE/QemuServer/Machine.pm | 12 ------------ > PVE/QemuServer/QMPHelpers.pm | 13 +++++++++++++ > test/MigrationTest/QemuMigrateMock.pm | 4 ++++ > test/run_config2command_tests.pl | 4 ++-- > 5 files changed, 21 insertions(+), 15 deletions(-) > > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm > index 34fc46ee..e71face4 100644 > --- a/PVE/QemuMigrate.pm > +++ b/PVE/QemuMigrate.pm > @@ -30,6 +30,7 @@ use PVE::QemuServer::Helpers qw(min_version); > use PVE::QemuServer::Machine; > use PVE::QemuServer::Monitor qw(mon_cmd); > use PVE::QemuServer::Memory qw(get_current_memory); > +use PVE::QemuServer::QMPHelpers; > use PVE::QemuServer; > > use PVE::AbstractMigrate; > @@ -1140,7 +1141,7 @@ sub phase2 { > PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap); > } > > - if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) { > + if (PVE::QemuServer::QMPHelpers::runs_at_least_qemu_version($vmid, 8, 2)) { > $self->log('info', "switching mirror jobs to actively synced mode"); > PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode( > $vmid, > diff --git a/PVE/QemuServer/Machine.pm b/PVE/QemuServer/Machine.pm > index cc92e7e6..a3917dae 100644 > --- a/PVE/QemuServer/Machine.pm > +++ b/PVE/QemuServer/Machine.pm > @@ -161,18 +161,6 @@ sub can_run_pve_machine_version { > return 0; > } > > -# dies if a) VM not running or not exisiting b) Version query failed > -# So, any defined return value is valid, any invalid state can be caught by eval > -sub runs_at_least_qemu_version { > - my ($vmid, $major, $minor, $extra) = @_; > - > - my $v = PVE::QemuServer::Monitor::mon_cmd($vmid, 'query-version'); > - die "could not query currently running version for VM $vmid\n" if !defined($v); > - $v = $v->{qemu}; > - > - return PVE::QemuServer::Helpers::version_cmp($v->{major}, $major, $v->{minor}, $minor, $v->{micro}, $extra) >= 0; > -} > - > sub 
qemu_machine_pxe { > my ($vmid, $conf) = @_; > > diff --git a/PVE/QemuServer/QMPHelpers.pm b/PVE/QemuServer/QMPHelpers.pm > index d3a52327..0269ea46 100644 > --- a/PVE/QemuServer/QMPHelpers.pm > +++ b/PVE/QemuServer/QMPHelpers.pm > @@ -3,6 +3,7 @@ package PVE::QemuServer::QMPHelpers; > use warnings; > use strict; > > +use PVE::QemuServer::Helpers; > use PVE::QemuServer::Monitor qw(mon_cmd); > > use base 'Exporter'; > @@ -45,4 +46,16 @@ sub qemu_objectdel { > return 1; > } > > +# dies if a) VM not running or not exisiting b) Version query failed > +# So, any defined return value is valid, any invalid state can be caught by eval > +sub runs_at_least_qemu_version { > + my ($vmid, $major, $minor, $extra) = @_; > + > + my $v = PVE::QemuServer::Monitor::mon_cmd($vmid, 'query-version'); > + die "could not query currently running version for VM $vmid\n" if !defined($v); > + $v = $v->{qemu}; > + > + return PVE::QemuServer::Helpers::version_cmp($v->{major}, $major, $v->{minor}, $minor, $v->{micro}, $extra) >= 0; > +} > + > 1; > diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm > index f5b44424..11c58c08 100644 > --- a/test/MigrationTest/QemuMigrateMock.pm > +++ b/test/MigrationTest/QemuMigrateMock.pm > @@ -188,6 +188,10 @@ $qemu_server_machine_module->mock( > if !defined($vm_status->{runningmachine}); > return $vm_status->{runningmachine}; > }, > +); > + > +my $qemu_server_qmphelpers_module = Test::MockModule->new("PVE::QemuServer::QMPHelpers"); > +$qemu_server_qmphelpers_module->mock( > runs_at_least_qemu_version => sub { > return 1; > }, > diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl > index 7212acc4..d48ef562 100755 > --- a/test/run_config2command_tests.pl > +++ b/test/run_config2command_tests.pl > @@ -16,7 +16,7 @@ use PVE::SysFSTools; > use PVE::QemuConfig; > use PVE::QemuServer; > use PVE::QemuServer::Monitor; > -use PVE::QemuServer::Machine; > +use PVE::QemuServer::QMPHelpers; > use 
PVE::QemuServer::CPUConfig; > > my $base_env = { > @@ -472,7 +472,7 @@ sub do_test($) { > # check if QEMU version set correctly and test version_cmp > (my $qemu_major = get_test_qemu_version()) =~ s/\..*$//; > die "runs_at_least_qemu_version returned false, maybe error in version_cmp?" > - if !PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, $qemu_major); > + if !PVE::QemuServer::QMPHelpers::runs_at_least_qemu_version($vmid, $qemu_major); > > $cmdline =~ s/ -/ \\\n -/g; # same as qm showcmd --pretty > $cmdline .= "\n"; > -- > 2.39.2 > > > > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk 2024-05-28 8:50 [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fiona Ebner ` (2 preceding siblings ...) 2024-05-28 8:50 ` [pve-devel] [RFC v2 qemu-server 4/4] move helper to check running QEMU version out of the 'Machine' module Fiona Ebner @ 2024-07-03 13:15 ` Fabian Grünbichler 2024-07-03 13:44 ` Fiona Ebner 3 siblings, 1 reply; 10+ messages in thread From: Fabian Grünbichler @ 2024-07-03 13:15 UTC (permalink / raw) To: Proxmox VE development discussion On May 28, 2024 10:50 am, Fiona Ebner wrote: > There is a possibility that the drive-mirror job is not yet done when > the migration wants to inactivate the source's blockdrives: > >> bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. > > This can be prevented by using the 'write-blocking' copy mode (also > called active mode) for the mirror. However, with active mode, the > guest write speed is limited by the synchronous writes to the mirror > target. For this reason, a way to start out in the faster 'background' > mode and later switch to active mode was introduced in QEMU 8.2. > > The switch is done once the mirror job for all drives is ready to be > completed to reduce the time spent where guest IO is limited. 
> > Reported rarely, but steadily over the years: > https://forum.proxmox.com/threads/78954/post-353651 > https://forum.proxmox.com/threads/78954/post-380015 > https://forum.proxmox.com/threads/100020/post-431660 > https://forum.proxmox.com/threads/111831/post-482425 > https://forum.proxmox.com/threads/111831/post-499807 > https://forum.proxmox.com/threads/137849/ > > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> > --- > > Changes in v2: > * check for running QEMU version instead of installed version > > PVE/QemuMigrate.pm | 8 ++++++ > PVE/QemuServer.pm | 41 +++++++++++++++++++++++++++ > test/MigrationTest/QemuMigrateMock.pm | 6 ++++ > 3 files changed, 55 insertions(+) > > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm > index 33d5b2d1..d7ee4a5b 100644 > --- a/PVE/QemuMigrate.pm > +++ b/PVE/QemuMigrate.pm > @@ -1145,6 +1145,14 @@ sub phase2 { > $self->log('info', "$drive: start migration to $nbd_uri"); > PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap); > } > + > + if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) { > + $self->log('info', "switching mirror jobs to actively synced mode"); > + PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode( > + $vmid, > + $self->{storage_migration_jobs}, > + ); > + } > } > > $self->log('info', "starting online/live migration on $migrate_uri"); > diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm > index 5df0c96d..d472e805 100644 > --- a/PVE/QemuServer.pm > +++ b/PVE/QemuServer.pm > @@ -8122,6 +8122,47 @@ sub qemu_blockjobs_cancel { > } > } > > +# Callers should version guard this (only available with a binary >= QEMU 8.2) > +sub qemu_drive_mirror_switch_to_active_mode { > + my ($vmid, $jobs) = @_; > + > + my $switching = {}; > + > + for my $job (sort keys $jobs->%*) { > + print "$job: switching to actively synced mode\n"; > + > + eval { > + mon_cmd( > + $vmid, > + "block-job-change", > + id => $job, > + 
type => 'mirror', > + 'copy-mode' => 'write-blocking', > + ); > + $switching->{$job} = 1; > + }; > + die "could not switch mirror job $job to active mode - $@\n" if $@; > + } > + > + while (1) { > + my $stats = mon_cmd($vmid, "query-block-jobs"); > + > + my $running_jobs = {}; > + $running_jobs->{$_->{device}} = $_ for $stats->@*; > + > + for my $job (sort keys $switching->%*) { > + if ($running_jobs->{$job}->{'actively-synced'}) { > + print "$job: successfully switched to actively synced mode\n"; > + delete $switching->{$job}; > + } > + } > + > + last if scalar(keys $switching->%*) == 0; > + > + sleep 1; > + } so what could be the cause here for a job not switching? and do we really want to loop forever if it happens? > +} > + > # Check for bug #4525: drive-mirror will open the target drive with the same aio setting as the > # source, but some storages have problems with io_uring, sometimes even leading to crashes. > my sub clone_disk_check_io_uring { > diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm > index 1efabe24..f5b44424 100644 > --- a/test/MigrationTest/QemuMigrateMock.pm > +++ b/test/MigrationTest/QemuMigrateMock.pm > @@ -152,6 +152,9 @@ $MigrationTest::Shared::qemu_server_module->mock( > } > return; > }, > + qemu_drive_mirror_switch_to_active_mode => sub { > + return; > + }, > set_migration_caps => sub { > return; > }, > @@ -185,6 +188,9 @@ $qemu_server_machine_module->mock( > if !defined($vm_status->{runningmachine}); > return $vm_status->{runningmachine}; > }, > + runs_at_least_qemu_version => sub { > + return 1; > + }, > ); > > my $ssh_info_module = Test::MockModule->new("PVE::SSHInfo"); > -- > 2.39.2 > > > > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel > > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com 
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk 2024-07-03 13:15 ` [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk Fabian Grünbichler @ 2024-07-03 13:44 ` Fiona Ebner 2024-07-03 13:49 ` Fiona Ebner 0 siblings, 1 reply; 10+ messages in thread From: Fiona Ebner @ 2024-07-03 13:44 UTC (permalink / raw) To: Proxmox VE development discussion, Fabian Grünbichler Am 03.07.24 um 15:15 schrieb Fabian Grünbichler: > On May 28, 2024 10:50 am, Fiona Ebner wrote: >> + eval { >> + mon_cmd( >> + $vmid, >> + "block-job-change", >> + id => $job, >> + type => 'mirror', >> + 'copy-mode' => 'write-blocking', >> + ); >> + $switching->{$job} = 1; >> + }; >> + die "could not switch mirror job $job to active mode - $@\n" if $@; >> + } >> + >> + while (1) { >> + my $stats = mon_cmd($vmid, "query-block-jobs"); >> + >> + my $running_jobs = {}; >> + $running_jobs->{$_->{device}} = $_ for $stats->@*; >> + >> + for my $job (sort keys $switching->%*) { >> + if ($running_jobs->{$job}->{'actively-synced'}) { >> + print "$job: successfully switched to actively synced mode\n"; >> + delete $switching->{$job}; >> + } >> + } >> + >> + last if scalar(keys $switching->%*) == 0; >> + >> + sleep 1; >> + } > > so what could be the cause here for a job not switching? and do we > really want to loop forever if it happens? > That should never happen. The 'block-job-change' QMP command already succeeded. That means further writes will be done synchronously to the target. Once the remaining dirty parts have been mirrored by the background iteration, the actively-synced flag will be set and we break out of the loop. 
We got to the ready condition already before doing the switch, getting there again is even easier after the switch: https://gitlab.com/qemu-project/qemu/-/blob/stable-9.0/block/mirror.c?ref_type=heads#L1078
* Re: [pve-devel] [PATCH v2 qemu-server 1/4] migration: avoid crash with heavy IO on local VM disk 2024-07-03 13:44 ` Fiona Ebner @ 2024-07-03 13:49 ` Fiona Ebner 0 siblings, 0 replies; 10+ messages in thread From: Fiona Ebner @ 2024-07-03 13:49 UTC (permalink / raw) To: Proxmox VE development discussion, Fabian Grünbichler Am 03.07.24 um 15:44 schrieb Fiona Ebner: > Am 03.07.24 um 15:15 schrieb Fabian Grünbichler: >> On May 28, 2024 10:50 am, Fiona Ebner wrote: >>> + eval { >>> + mon_cmd( >>> + $vmid, >>> + "block-job-change", >>> + id => $job, >>> + type => 'mirror', >>> + 'copy-mode' => 'write-blocking', >>> + ); >>> + $switching->{$job} = 1; >>> + }; >>> + die "could not switch mirror job $job to active mode - $@\n" if $@; >>> + } >>> + >>> + while (1) { >>> + my $stats = mon_cmd($vmid, "query-block-jobs"); >>> + >>> + my $running_jobs = {}; >>> + $running_jobs->{$_->{device}} = $_ for $stats->@*; >>> + >>> + for my $job (sort keys $switching->%*) { >>> + if ($running_jobs->{$job}->{'actively-synced'}) { >>> + print "$job: successfully switched to actively synced mode\n"; >>> + delete $switching->{$job}; >>> + } >>> + } >>> + >>> + last if scalar(keys $switching->%*) == 0; >>> + >>> + sleep 1; >>> + } >> >> so what could be the cause here for a job not switching? and do we >> really want to loop forever if it happens? >> > > That should never happen. The 'block-job-change' QMP command already > succeeded. That means further writes will be done synchronously to the > target. Once the remaining dirty parts have been mirrored by the > background iteration, the actively-synced flag will be set and we break > out of the loop. > > We got to the ready condition already before doing the switch, getting > there again is even easier after the switch: > https://gitlab.com/qemu-project/qemu/-/blob/stable-9.0/block/mirror.c?ref_type=heads#L1078 > Well, "should". If a job fails after switching, then we'd actually be stuck. Will write a v2 that is robust against that. 
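The robustness gap discussed above — the wait loop never finishing if a job fails after the switch succeeded — can be sketched as a polling loop that treats a job vanishing from query-block-jobs as an error instead of waiting on it forever (illustrative Python; the function name and error messages are hypothetical, not taken from the actual v2 patch):

```python
def wait_actively_synced(query_jobs, jobs, max_iterations=600):
    """Poll job stats until every job in `jobs` reports actively-synced.

    `query_jobs` stands in for the QMP 'query-block-jobs' call and must
    return a list of dicts with 'device' and 'actively-synced' keys.
    A job that disappears from the list (failed/cancelled) raises,
    rather than being polled indefinitely.  Real code would sleep
    between iterations; that is omitted here to keep the sketch pure.
    """
    pending = set(jobs)
    for _ in range(max_iterations):
        stats = {j["device"]: j for j in query_jobs()}
        for job in sorted(pending):
            if job not in stats:
                raise RuntimeError(f"job {job} vanished before reaching active mode")
            if stats[job].get("actively-synced"):
                pending.discard(job)
        if not pending:
            return
    raise TimeoutError(f"jobs not actively synced: {sorted(pending)}")

# Happy path: the job becomes actively synced on the second poll.
replies = iter([
    [{"device": "drive-scsi0", "actively-synced": False}],
    [{"device": "drive-scsi0", "actively-synced": True}],
])
wait_actively_synced(lambda: next(replies), ["drive-scsi0"])

# Failure path: a vanished job raises instead of looping forever.
failed = False
try:
    wait_actively_synced(lambda: [], ["drive-scsi0"])
except RuntimeError:
    failed = True
assert failed
```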