public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH v3 qemu-server 1/2] migration: avoid crash with heavy IO on local VM disk
@ 2024-07-04  9:32 Fiona Ebner
  2024-07-04  9:32 ` [pve-devel] [PATCH v3 qemu-server 2/2] move helper to check running QEMU version out of the 'Machine' module Fiona Ebner
  2024-07-30 19:25 ` [pve-devel] applied-series: [PATCH v3 qemu-server 1/2] migration: avoid crash with heavy IO on local VM disk Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Fiona Ebner @ 2024-07-04  9:32 UTC (permalink / raw)
  To: pve-devel

There is a possibility that the drive-mirror job is not yet done when
the migration wants to inactivate the source's blockdrives:

> bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.

This can be prevented by using the 'write-blocking' copy mode (also
called active mode) for the mirror. However, with active mode, the
guest write speed is limited by the synchronous writes to the mirror
target. For this reason, a way to start out in the faster 'background'
mode and later switch to active mode was introduced in QEMU 8.2.

The switch is done once the mirror job for all drives is ready to be
completed to reduce the time spent where guest IO is limited.

The loop waiting for actively-synced to become true is not an endless
loop: Once the remaining dirty parts have been mirrored by the
background iteration, the actively-synced flag will be set. Because
the 'block-job-change' QMP command already succeeded, new writes will
be done synchronously to the target and thus not lead to new dirty
parts. If the job fails or vanishes (shouldn't actually happen,
because auto-dismiss is false), the loop will be exited and the error
propagated.
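
The termination argument above can be illustrated with a small, self-contained
simulation. This is only a sketch: fake_mon_cmd and the shrinking 'dirty'
counter are hypothetical stand-ins for the real QMP monitor and QEMU's dirty
bitmap, and the one-second sleep between polls is left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one mirror job as seen through QMP. The 'dirty'
# counter models the regions the background iteration still has to copy.
my $fake_job = { device => 'drive-scsi0', status => 'ready', dirty => 3, active => 0 };

# Hypothetical replacement for mon_cmd; nothing here talks to a real QEMU.
sub fake_mon_cmd {
    my ($cmd, %params) = @_;

    if ($cmd eq 'block-job-change') {
        # Once the switch succeeded, guest writes go synchronously to the
        # target, so the dirty counter can only shrink from here on.
        $fake_job->{active} = 1;
        return {};
    } elsif ($cmd eq 'query-block-jobs') {
        # The background iteration mirrors one remaining region per poll.
        $fake_job->{dirty}-- if $fake_job->{dirty} > 0;
        my $synced = ($fake_job->{active} && $fake_job->{dirty} == 0) ? 1 : 0;
        return [{
            device => $fake_job->{device},
            status => $fake_job->{status},
            'actively-synced' => $synced,
        }];
    }
    die "unexpected command '$cmd'\n";
}

sub wait_until_actively_synced {
    my ($job) = @_;

    fake_mon_cmd(
        'block-job-change', id => $job, type => 'mirror', 'copy-mode' => 'write-blocking');

    my $polls = 0;
    while (1) {
        $polls++;
        my ($info) = grep { $_->{device} eq $job } fake_mon_cmd('query-block-jobs')->@*;
        # Both error paths leave the loop, so a failed job cannot spin forever.
        die "$job: vanished while switching to active mode\n" if !$info;
        die "$job: job concluded unexpectedly\n" if $info->{status} eq 'concluded';
        return $polls if $info->{'actively-synced'};
    }
}

my $polls = wait_until_actively_synced('drive-scsi0');
print "actively synced after $polls polls\n";
```

With three dirty regions left at the time of the switch, the simulated job
reports actively-synced on the third poll.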

Reported rarely, but steadily over the years:
https://forum.proxmox.com/threads/78954/post-353651
https://forum.proxmox.com/threads/78954/post-380015
https://forum.proxmox.com/threads/100020/post-431660
https://forum.proxmox.com/threads/111831/post-482425
https://forum.proxmox.com/threads/111831/post-499807
https://forum.proxmox.com/threads/137849/

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
    * avoid endless loop when job fails while switching to active mode
    * mention rationale why the loop is not an endless loop in commit
      message

 PVE/QemuMigrate.pm                    |  8 +++++
 PVE/QemuServer.pm                     | 51 +++++++++++++++++++++++++++
 test/MigrationTest/QemuMigrateMock.pm |  6 ++++
 3 files changed, 65 insertions(+)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index bdcc2e54..34fc46ee 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -1139,6 +1139,14 @@ sub phase2 {
 	    $self->log('info', "$drive: start migration to $nbd_uri");
 	    PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap);
 	}
+
+	if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) {
+	    $self->log('info', "switching mirror jobs to actively synced mode");
+	    PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode(
+		$vmid,
+		$self->{storage_migration_jobs},
+	    );
+	}
     }
 
     $self->log('info', "starting online/live migration on $migrate_uri");
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 548f13f0..12872ae2 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -8181,6 +8181,57 @@ sub qemu_blockjobs_cancel {
     }
 }
 
+# Callers should version guard this (only available with a binary >= QEMU 8.2)
+sub qemu_drive_mirror_switch_to_active_mode {
+    my ($vmid, $jobs) = @_;
+
+    my $switching = {};
+
+    for my $job (sort keys $jobs->%*) {
+	print "$job: switching to actively synced mode\n";
+
+	eval {
+	    mon_cmd(
+		$vmid,
+		"block-job-change",
+		id => $job,
+		type => 'mirror',
+		'copy-mode' => 'write-blocking',
+	    );
+	    $switching->{$job} = 1;
+	};
+	die "could not switch mirror job $job to active mode - $@\n" if $@;
+    }
+
+    while (1) {
+	my $stats = mon_cmd($vmid, "query-block-jobs");
+
+	my $running_jobs = {};
+	$running_jobs->{$_->{device}} = $_ for $stats->@*;
+
+	for my $job (sort keys $switching->%*) {
+	    die "$job: vanished while switching to active mode\n" if !$running_jobs->{$job};
+
+	    my $info = $running_jobs->{$job};
+	    if ($info->{status} eq 'concluded') {
+		qemu_handle_concluded_blockjob($vmid, $job, $info);
+		# The 'concluded' state should occur here if and only if the job failed, so the
+		# 'die' below should be unreachable, but play it safe.
+		die "$job: expected job to have failed, but no error was set\n";
+	    }
+
+	    if ($info->{'actively-synced'}) {
+		print "$job: successfully switched to actively synced mode\n";
+		delete $switching->{$job};
+	    }
+	}
+
+	last if scalar(keys $switching->%*) == 0;
+
+	sleep 1;
+    }
+}
+
 # Check for bug #4525: drive-mirror will open the target drive with the same aio setting as the
 # source, but some storages have problems with io_uring, sometimes even leading to crashes.
 my sub clone_disk_check_io_uring {
diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm
index 1efabe24..f5b44424 100644
--- a/test/MigrationTest/QemuMigrateMock.pm
+++ b/test/MigrationTest/QemuMigrateMock.pm
@@ -152,6 +152,9 @@ $MigrationTest::Shared::qemu_server_module->mock(
 	}
 	return;
     },
+    qemu_drive_mirror_switch_to_active_mode => sub {
+	return;
+    },
     set_migration_caps => sub {
 	return;
     },
@@ -185,6 +188,9 @@ $qemu_server_machine_module->mock(
 	    if !defined($vm_status->{runningmachine});
 	return $vm_status->{runningmachine};
     },
+    runs_at_least_qemu_version => sub {
+	return 1;
+    },
 );
 
 my $ssh_info_module = Test::MockModule->new("PVE::SSHInfo");
-- 
2.39.2



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

* [pve-devel] [PATCH v3 qemu-server 2/2] move helper to check running QEMU version out of the 'Machine' module
  2024-07-04  9:32 [pve-devel] [PATCH v3 qemu-server 1/2] migration: avoid crash with heavy IO on local VM disk Fiona Ebner
@ 2024-07-04  9:32 ` Fiona Ebner
  2024-07-30 19:25 ` [pve-devel] applied-series: [PATCH v3 qemu-server 1/2] migration: avoid crash with heavy IO on local VM disk Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Fiona Ebner @ 2024-07-04  9:32 UTC (permalink / raw)
  To: pve-devel

The version of the running QEMU binary is not related to the machine
version and so it's a bit confusing to have the helper in the
'Machine' module. It cannot live in the 'Helpers' module, because that
would lead to a cyclic inclusion Helpers <-> Monitor. Thus,
'QMPHelpers' is chosen as the new home.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 PVE/QemuMigrate.pm                    |  3 ++-
 PVE/QemuServer/Machine.pm             | 12 ------------
 PVE/QemuServer/QMPHelpers.pm          | 13 +++++++++++++
 test/MigrationTest/QemuMigrateMock.pm |  4 ++++
 test/run_config2command_tests.pl      |  4 ++--
 5 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 34fc46ee..e71face4 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -30,6 +30,7 @@ use PVE::QemuServer::Helpers qw(min_version);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Memory qw(get_current_memory);
+use PVE::QemuServer::QMPHelpers;
 use PVE::QemuServer;
 
 use PVE::AbstractMigrate;
@@ -1140,7 +1141,7 @@ sub phase2 {
 	    PVE::QemuServer::qemu_drive_mirror($vmid, $drive, $nbd_uri, $vmid, undef, $self->{storage_migration_jobs}, 'skip', undef, $bwlimit, $bitmap);
 	}
 
-	if (PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, 8, 2)) {
+	if (PVE::QemuServer::QMPHelpers::runs_at_least_qemu_version($vmid, 8, 2)) {
 	    $self->log('info', "switching mirror jobs to actively synced mode");
 	    PVE::QemuServer::qemu_drive_mirror_switch_to_active_mode(
 		$vmid,
diff --git a/PVE/QemuServer/Machine.pm b/PVE/QemuServer/Machine.pm
index cc92e7e6..a3917dae 100644
--- a/PVE/QemuServer/Machine.pm
+++ b/PVE/QemuServer/Machine.pm
@@ -161,18 +161,6 @@ sub can_run_pve_machine_version {
     return 0;
 }
 
-# dies if a) VM not running or not exisiting b) Version query failed
-# So, any defined return value is valid, any invalid state can be caught by eval
-sub runs_at_least_qemu_version {
-    my ($vmid, $major, $minor, $extra) = @_;
-
-    my $v = PVE::QemuServer::Monitor::mon_cmd($vmid, 'query-version');
-    die "could not query currently running version for VM $vmid\n" if !defined($v);
-    $v = $v->{qemu};
-
-    return PVE::QemuServer::Helpers::version_cmp($v->{major}, $major, $v->{minor}, $minor, $v->{micro}, $extra) >= 0;
-}
-
 sub qemu_machine_pxe {
     my ($vmid, $conf) = @_;
 
diff --git a/PVE/QemuServer/QMPHelpers.pm b/PVE/QemuServer/QMPHelpers.pm
index d3a52327..0269ea46 100644
--- a/PVE/QemuServer/QMPHelpers.pm
+++ b/PVE/QemuServer/QMPHelpers.pm
@@ -3,6 +3,7 @@ package PVE::QemuServer::QMPHelpers;
 use warnings;
 use strict;
 
+use PVE::QemuServer::Helpers;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 
 use base 'Exporter';
@@ -45,4 +46,16 @@ sub qemu_objectdel {
     return 1;
 }
 
+# dies if a) VM not running or not exisiting b) Version query failed
+# So, any defined return value is valid, any invalid state can be caught by eval
+sub runs_at_least_qemu_version {
+    my ($vmid, $major, $minor, $extra) = @_;
+
+    my $v = PVE::QemuServer::Monitor::mon_cmd($vmid, 'query-version');
+    die "could not query currently running version for VM $vmid\n" if !defined($v);
+    $v = $v->{qemu};
+
+    return PVE::QemuServer::Helpers::version_cmp($v->{major}, $major, $v->{minor}, $minor, $v->{micro}, $extra) >= 0;
+}
+
 1;
diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm
index f5b44424..11c58c08 100644
--- a/test/MigrationTest/QemuMigrateMock.pm
+++ b/test/MigrationTest/QemuMigrateMock.pm
@@ -188,6 +188,10 @@ $qemu_server_machine_module->mock(
 	    if !defined($vm_status->{runningmachine});
 	return $vm_status->{runningmachine};
     },
+);
+
+my $qemu_server_qmphelpers_module = Test::MockModule->new("PVE::QemuServer::QMPHelpers");
+$qemu_server_qmphelpers_module->mock(
     runs_at_least_qemu_version => sub {
 	return 1;
     },
diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl
index 7212acc4..d48ef562 100755
--- a/test/run_config2command_tests.pl
+++ b/test/run_config2command_tests.pl
@@ -16,7 +16,7 @@ use PVE::SysFSTools;
 use PVE::QemuConfig;
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor;
-use PVE::QemuServer::Machine;
+use PVE::QemuServer::QMPHelpers;
 use PVE::QemuServer::CPUConfig;
 
 my $base_env = {
@@ -472,7 +472,7 @@ sub do_test($) {
     # check if QEMU version set correctly and test version_cmp
     (my $qemu_major = get_test_qemu_version()) =~ s/\..*$//;
     die "runs_at_least_qemu_version returned false, maybe error in version_cmp?"
-	if !PVE::QemuServer::Machine::runs_at_least_qemu_version($vmid, $qemu_major);
+	if !PVE::QemuServer::QMPHelpers::runs_at_least_qemu_version($vmid, $qemu_major);
 
     $cmdline =~ s/ -/ \\\n  -/g; # same as qm showcmd --pretty
     $cmdline .= "\n";
-- 
2.39.2




* [pve-devel] applied-series: [PATCH v3 qemu-server 1/2] migration: avoid crash with heavy IO on local VM disk
  2024-07-04  9:32 [pve-devel] [PATCH v3 qemu-server 1/2] migration: avoid crash with heavy IO on local VM disk Fiona Ebner
  2024-07-04  9:32 ` [pve-devel] [PATCH v3 qemu-server 2/2] move helper to check running QEMU version out of the 'Machine' module Fiona Ebner
@ 2024-07-30 19:25 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2024-07-30 19:25 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

On 04/07/2024 at 11:32, Fiona Ebner wrote:
> There is a possibility that the drive-mirror job is not yet done when
> the migration wants to inactivate the source's blockdrives:
> 
>> bdrv_co_write_req_prepare: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
> 
> This can be prevented by using the 'write-blocking' copy mode (also
> called active mode) for the mirror. However, with active mode, the
> guest write speed is limited by the synchronous writes to the mirror
> target. For this reason, a way to start out in the faster 'background'
> mode and later switch to active mode was introduced in QEMU 8.2.
> 
> The switch is done once the mirror job for all drives is ready to be
> completed to reduce the time spent where guest IO is limited.
> 
> The loop waiting for actively-synced to become true is not an endless
> loop: Once the remaining dirty parts have been mirrored by the
> background iteration, the actively-synced flag will be set. Because
> the 'block-job-change' QMP command already succeeded, new writes will
> be done synchronously to the target and thus not lead to new dirty
> parts. If the job fails or vanishes (shouldn't actually happen,
> because auto-dismiss is false), the loop will be exited and the error
> propagated.
> 
> Reported rarely, but steadily over the years:
> https://forum.proxmox.com/threads/78954/post-353651
> https://forum.proxmox.com/threads/78954/post-380015
> https://forum.proxmox.com/threads/100020/post-431660
> https://forum.proxmox.com/threads/111831/post-482425
> https://forum.proxmox.com/threads/111831/post-499807
> https://forum.proxmox.com/threads/137849/
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
>     * avoid endless loop when job fails while switching to active mode
>     * mention rationale why the loop is not an endless loop in commit
>       message
> 
>  PVE/QemuMigrate.pm                    |  8 +++++
>  PVE/QemuServer.pm                     | 51 +++++++++++++++++++++++++++
>  test/MigrationTest/QemuMigrateMock.pm |  6 ++++
>  3 files changed, 65 insertions(+)
> 
>

applied both patches, thanks!

Although I'm wondering a bit whether we could mock at a deeper level than, e.g.,
qemu_drive_mirror_switch_to_active_mode, maybe by mocking mon_cmd and then also
checking that the monitor commands are triggered as expected and in the correct order.
But that might affect a few more methods and is definitely orthogonal to this.
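
A rough sketch of what such a deeper mock could look like, assuming a
hypothetical miniature package My::Mirror in place of PVE::QemuServer and a
plain glob assignment in place of the Test::MockModule setup used in
QemuMigrateMock.pm. The mock records every monitor command so that the test
can check the order afterwards.

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Hypothetical miniature of the code under test. The real function lives in
# PVE::QemuServer and calls PVE::QemuServer::Monitor::mon_cmd.
package My::Mirror {
    sub mon_cmd { die "no real QMP connection in this sketch\n" }

    sub switch_to_active_mode {
        my ($vmid, $jobs) = @_;

        for my $job (sort keys $jobs->%*) {
            mon_cmd(
                $vmid, 'block-job-change',
                id => $job, type => 'mirror', 'copy-mode' => 'write-blocking',
            );
        }

        while (1) {
            my $running = {};
            $running->{$_->{device}} = $_ for mon_cmd($vmid, 'query-block-jobs')->@*;

            for my $job (keys $jobs->%*) {
                die "$job: vanished while switching to active mode\n" if !$running->{$job};
            }
            last if !grep { !$running->{$_}->{'actively-synced'} } keys $jobs->%*;
        }
    }
}

# Mock at the mon_cmd level: record each command, return canned answers.
my @calls;
my $polls = 0;
{
    no warnings 'redefine';
    *My::Mirror::mon_cmd = sub {
        my ($vmid, $cmd, %params) = @_;
        push @calls, $cmd;
        return {} if $cmd eq 'block-job-change';

        # The first poll reports "not yet synced", the second one "synced".
        my $synced = ++$polls >= 2 ? 1 : 0;
        return [{ device => 'drive-scsi0', status => 'ready', 'actively-synced' => $synced }];
    };
}

My::Mirror::switch_to_active_mode(100, { 'drive-scsi0' => {} });

is_deeply(
    \@calls,
    ['block-job-change', 'query-block-jobs', 'query-block-jobs'],
    'job is switched first, then polled until actively synced',
);
```

Recording only the command names keeps the comparison short; the parameters
could be recorded as well if a test should check them.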

