* [pve-devel] [PATCH v4 qemu-server 0/2] remote-migration: migration with different cpu @ 2023-09-28 14:45 Alexandre Derumier 2023-09-28 14:45 ` [pve-devel] [PATCH v4 qemu-server 1/2] migration: move livemigration code in a dedicated sub Alexandre Derumier 2023-09-28 14:45 ` [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params Alexandre Derumier 0 siblings, 2 replies; 15+ messages in thread
From: Alexandre Derumier @ 2023-09-28 14:45 UTC (permalink / raw)
To: pve-devel

This patch series allows remote migration between clusters with different cpu models.

Two new params are introduced: "target-cpu" && "target-reboot".

If target-cpu is defined, it replaces the cpu model of the target vm.

If the vm is online/running, an extra "target-reboot" safeguard option is needed. Indeed, as the target cpu is different, the live migration with memory transfer is skipped (the target would die with a different cpu anyway). Then, after the storage copy, we switch the source vm disks to the target vm's nbd export, shut down the source vm and restart the target vm (like a virtual reboot between source/target).

Changelog v2:

The first version simply shut down the target vm without doing the block-job-complete. After doing production migrations with around 400 vms, I had some fs corruption, as if some data was still sitting in a buffer. This v2 has been tested with another batch of 400 vms, without any corruption.

Changelog v3:

v2 was not perfect; I still had 1 or 2 fs corruptions with vms doing a lot of writes. This v3 takes up the idea of v1 again, but in a cleaner way:
- disks are migrated to the target vm
- the source vm switches its disks to the nbd export of the target vm (with a block-job-complete, not a block-job-cancel as in standard disk migration). This makes 100% sure that no write is still pending in the migration job.
- the source vm is shut down
- the target vm is restarted

Changelog v4:
- bugfix: do not override the cpu with an empty config if target-cpu is not defined
- small cleanups of the params

We have redone a lot of migrations this summer (maybe another 4000 vms), with 0 corruptions, on both Windows and Linux guest vms.

Alexandre Derumier (2):
  migration: move livemigration code in a dedicated sub
  remote-migration: add target-cpu && target-reboot params

 PVE/API2/Qemu.pm   |  23 ++-
 PVE/CLI/qm.pm      |  11 ++
 PVE/QemuMigrate.pm | 451 ++++++++++++++++++++++++---------------------
 3 files changed, 276 insertions(+), 209 deletions(-)

-- 2.39.2

^ permalink raw reply [flat|nested] 15+ messages in thread
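In code terms, an online migration with both new options set boils down to roughly the following sequence (paraphrased from the two patches below, not verbatim; names follow PVE/QemuMigrate.pm):

    # phase2: disks are mirrored to the target's NBD exports; RAM migration is skipped
    # (live_migration() returns early when target-reboot is set)
    qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, 'complete');
    # the source VM now writes through the remote NBD export, so no write can be left behind

    # phase3_cleanup: cleanly shut down the source VM ...
    PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1, $shutdown_timeout, $shutdown, $force_stop);

    # ... and ask the remote side to stop/start the target VM, which now carries the new cpu model
    PVE::Tunnel::write_tunnel($tunnel, 10, 'restart');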
* [pve-devel] [PATCH v4 qemu-server 1/2] migration: move livemigration code in a dedicated sub 2023-09-28 14:45 [pve-devel] [PATCH v4 qemu-server 0/2] remote-migration: migration with different cpu Alexandre Derumier @ 2023-09-28 14:45 ` Alexandre Derumier 2023-10-09 11:25 ` Fiona Ebner 2023-09-28 14:45 ` [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params Alexandre Derumier 1 sibling, 1 reply; 15+ messages in thread From: Alexandre Derumier @ 2023-09-28 14:45 UTC (permalink / raw) To: pve-devel --- PVE/QemuMigrate.pm | 420 +++++++++++++++++++++++---------------------- 1 file changed, 214 insertions(+), 206 deletions(-) diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index f41c61f..5ea78a7 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -726,6 +726,219 @@ sub cleanup_bitmaps { } } +sub live_migration { + my ($self, $vmid, $migrate_uri, $spice_port) = @_; + + my $conf = $self->{vmconf}; + + $self->log('info', "starting online/live migration on $migrate_uri"); + $self->{livemigration} = 1; + + # load_defaults + my $defaults = PVE::QemuServer::load_defaults(); + + $self->log('info', "set migration capabilities"); + eval { PVE::QemuServer::set_migration_caps($vmid) }; + warn $@ if $@; + + my $qemu_migrate_params = {}; + + # migrate speed can be set via bwlimit (datacenter.cfg and API) and via the + # migrate_speed parameter in qm.conf - take the lower of the two. + my $bwlimit = $self->get_bwlimit(); + + my $migrate_speed = $conf->{migrate_speed} // 0; + $migrate_speed *= 1024; # migrate_speed is in MB/s, bwlimit in KB/s + + if ($bwlimit && $migrate_speed) { + $migrate_speed = ($bwlimit < $migrate_speed) ? $bwlimit : $migrate_speed; + } else { + $migrate_speed ||= $bwlimit; + } + $migrate_speed ||= ($defaults->{migrate_speed} || 0) * 1024; + + if ($migrate_speed) { + $migrate_speed *= 1024; # qmp takes migrate_speed in B/s. + $self->log('info', "migration speed limit: ". render_bytes($migrate_speed, 1) ."/s"); + } else { + # always set migrate speed as QEMU default to 128 MiBps == 1 Gbps, use 16 GiBps == 128 Gbps + $migrate_speed = (16 << 30); + } + $qemu_migrate_params->{'max-bandwidth'} = int($migrate_speed); + + my $migrate_downtime = $defaults->{migrate_downtime}; + $migrate_downtime = $conf->{migrate_downtime} if defined($conf->{migrate_downtime}); + # migrate-set-parameters expects limit in ms + $migrate_downtime *= 1000; + $self->log('info', "migration downtime limit: $migrate_downtime ms"); + $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime); + + # set cachesize to 10% of the total memory + my $memory = get_current_memory($conf->{memory}); + my $cachesize = int($memory * 1048576 / 10); + $cachesize = round_powerof2($cachesize); + + $self->log('info', "migration cachesize: " . 
render_bytes($cachesize, 1)); + $qemu_migrate_params->{'xbzrle-cache-size'} = int($cachesize); + + $self->log('info', "set migration parameters"); + eval { + mon_cmd($vmid, "migrate-set-parameters", %{$qemu_migrate_params}); + }; + $self->log('info', "migrate-set-parameters error: $@") if $@; + + if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && !$self->{opts}->{remote}) { + my $rpcenv = PVE::RPCEnvironment::get(); + my $authuser = $rpcenv->get_user(); + + my (undef, $proxyticket) = PVE::AccessControl::assemble_spice_ticket($authuser, $vmid, $self->{node}); + + my $filename = "/etc/pve/nodes/$self->{node}/pve-ssl.pem"; + my $subject = PVE::AccessControl::read_x509_subject_spice($filename); + + $self->log('info', "spice client_migrate_info"); + + eval { + mon_cmd($vmid, "client_migrate_info", protocol => 'spice', + hostname => $proxyticket, 'port' => 0, 'tls-port' => $spice_port, + 'cert-subject' => $subject); + }; + $self->log('info', "client_migrate_info error: $@") if $@; + + } + + my $start = time(); + + $self->log('info', "start migrate command to $migrate_uri"); + eval { + mon_cmd($vmid, "migrate", uri => $migrate_uri); + }; + my $merr = $@; + $self->log('info', "migrate uri => $migrate_uri failed: $merr") if $merr; + + my $last_mem_transferred = 0; + my $usleep = 1000000; + my $i = 0; + my $err_count = 0; + my $lastrem = undef; + my $downtimecounter = 0; + while (1) { + $i++; + my $avglstat = $last_mem_transferred ? $last_mem_transferred / $i : 0; + + usleep($usleep); + + my $stat = eval { mon_cmd($vmid, "query-migrate") }; + if (my $err = $@) { + $err_count++; + warn "query migrate failed: $err\n"; + $self->log('info', "query migrate failed: $err"); + if ($err_count <= 5) { + usleep(1_000_000); + next; + } + die "too many query migrate failures - aborting\n"; + } + + my $status = $stat->{status}; + if (defined($status) && $status =~ m/^(setup)$/im) { + sleep(1); + next; + } + + if (!defined($status) || $status !~ m/^(active|completed|failed|cancelled)$/im) { + die $merr if $merr; + die "unable to parse migration status '$status' - aborting\n"; + } + $merr = undef; + $err_count = 0; + + my $memstat = $stat->{ram}; + + if ($status eq 'completed') { + my $delay = time() - $start; + if ($delay > 0) { + my $total = $memstat->{total} || 0; + my $avg_speed = render_bytes($total / $delay, 1); + my $downtime = $stat->{downtime} || 0; + $self->log('info', "average migration speed: $avg_speed/s - downtime $downtime ms"); + } + } + + if ($status eq 'failed' || $status eq 'cancelled') { + my $message = $stat->{'error-desc'} ? "$status - $stat->{'error-desc'}" : $status; + $self->log('info', "migration status error: $message"); + die "aborting\n" + } + + if ($status ne 'active') { + $self->log('info', "migration status: $status"); + last; + } + + if ($memstat->{transferred} ne $last_mem_transferred) { + my $trans = $memstat->{transferred} || 0; + my $rem = $memstat->{remaining} || 0; + my $total = $memstat->{total} || 0; + my $speed = ($memstat->{'pages-per-second'} // 0) * ($memstat->{'page-size'} // 0); + my $dirty_rate = ($memstat->{'dirty-pages-rate'} // 0) * ($memstat->{'page-size'} // 0); + + # reduce sleep if remainig memory is lower than the average transfer speed + $usleep = 100_000 if $avglstat && $rem < $avglstat; + + # also reduce loggin if we poll more frequent + my $should_log = $usleep > 100_000 ? 
1 : ($i % 10) == 0; + + my $total_h = render_bytes($total, 1); + my $transferred_h = render_bytes($trans, 1); + my $speed_h = render_bytes($speed, 1); + + my $progress = "transferred $transferred_h of $total_h VM-state, ${speed_h}/s"; + + if ($dirty_rate > $speed) { + my $dirty_rate_h = render_bytes($dirty_rate, 1); + $progress .= ", VM dirties lots of memory: $dirty_rate_h/s"; + } + + $self->log('info', "migration $status, $progress") if $should_log; + + my $xbzrle = $stat->{"xbzrle-cache"} || {}; + my ($xbzrlebytes, $xbzrlepages) = $xbzrle->@{'bytes', 'pages'}; + if ($xbzrlebytes || $xbzrlepages) { + my $bytes_h = render_bytes($xbzrlebytes, 1); + + my $msg = "send updates to $xbzrlepages pages in $bytes_h encoded memory"; + + $msg .= sprintf(", cache-miss %.2f%%", $xbzrle->{'cache-miss-rate'} * 100) + if $xbzrle->{'cache-miss-rate'}; + + $msg .= ", overflow $xbzrle->{overflow}" if $xbzrle->{overflow}; + + $self->log('info', "xbzrle: $msg") if $should_log; + } + + if (($lastrem && $rem > $lastrem) || ($rem == 0)) { + $downtimecounter++; + } + $lastrem = $rem; + + if ($downtimecounter > 5) { + $downtimecounter = 0; + $migrate_downtime *= 2; + $self->log('info', "auto-increased downtime to continue migration: $migrate_downtime ms"); + eval { + # migrate-set-parameters does not touch values not + # specified, so this only changes downtime-limit + mon_cmd($vmid, "migrate-set-parameters", 'downtime-limit' => int($migrate_downtime)); + }; + $self->log('info', "migrate-set-parameters error: $@") if $@; + } + } + + $last_mem_transferred = $memstat->{transferred}; + } +} + sub phase1 { my ($self, $vmid) = @_; @@ -1137,212 +1350,7 @@ sub phase2 { } } - $self->log('info', "starting online/live migration on $migrate_uri"); - $self->{livemigration} = 1; - - # load_defaults - my $defaults = PVE::QemuServer::load_defaults(); - - $self->log('info', "set migration capabilities"); - eval { PVE::QemuServer::set_migration_caps($vmid) }; - warn $@ if $@; - - my $qemu_migrate_params = {}; - - # migrate speed can be set via bwlimit (datacenter.cfg and API) and via the - # migrate_speed parameter in qm.conf - take the lower of the two. - my $bwlimit = $self->get_bwlimit(); - - my $migrate_speed = $conf->{migrate_speed} // 0; - $migrate_speed *= 1024; # migrate_speed is in MB/s, bwlimit in KB/s - - if ($bwlimit && $migrate_speed) { - $migrate_speed = ($bwlimit < $migrate_speed) ? $bwlimit : $migrate_speed; - } else { - $migrate_speed ||= $bwlimit; - } - $migrate_speed ||= ($defaults->{migrate_speed} || 0) * 1024; - - if ($migrate_speed) { - $migrate_speed *= 1024; # qmp takes migrate_speed in B/s. - $self->log('info', "migration speed limit: ". render_bytes($migrate_speed, 1) ."/s"); - } else { - # always set migrate speed as QEMU default to 128 MiBps == 1 Gbps, use 16 GiBps == 128 Gbps - $migrate_speed = (16 << 30); - } - $qemu_migrate_params->{'max-bandwidth'} = int($migrate_speed); - - my $migrate_downtime = $defaults->{migrate_downtime}; - $migrate_downtime = $conf->{migrate_downtime} if defined($conf->{migrate_downtime}); - # migrate-set-parameters expects limit in ms - $migrate_downtime *= 1000; - $self->log('info', "migration downtime limit: $migrate_downtime ms"); - $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime); - - # set cachesize to 10% of the total memory - my $memory = get_current_memory($conf->{memory}); - my $cachesize = int($memory * 1048576 / 10); - $cachesize = round_powerof2($cachesize); - - $self->log('info', "migration cachesize: " . 
render_bytes($cachesize, 1)); - $qemu_migrate_params->{'xbzrle-cache-size'} = int($cachesize); - - $self->log('info', "set migration parameters"); - eval { - mon_cmd($vmid, "migrate-set-parameters", %{$qemu_migrate_params}); - }; - $self->log('info', "migrate-set-parameters error: $@") if $@; - - if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && !$self->{opts}->{remote}) { - my $rpcenv = PVE::RPCEnvironment::get(); - my $authuser = $rpcenv->get_user(); - - my (undef, $proxyticket) = PVE::AccessControl::assemble_spice_ticket($authuser, $vmid, $self->{node}); - - my $filename = "/etc/pve/nodes/$self->{node}/pve-ssl.pem"; - my $subject = PVE::AccessControl::read_x509_subject_spice($filename); - - $self->log('info', "spice client_migrate_info"); - - eval { - mon_cmd($vmid, "client_migrate_info", protocol => 'spice', - hostname => $proxyticket, 'port' => 0, 'tls-port' => $spice_port, - 'cert-subject' => $subject); - }; - $self->log('info', "client_migrate_info error: $@") if $@; - - } - - my $start = time(); - - $self->log('info', "start migrate command to $migrate_uri"); - eval { - mon_cmd($vmid, "migrate", uri => $migrate_uri); - }; - my $merr = $@; - $self->log('info', "migrate uri => $migrate_uri failed: $merr") if $merr; - - my $last_mem_transferred = 0; - my $usleep = 1000000; - my $i = 0; - my $err_count = 0; - my $lastrem = undef; - my $downtimecounter = 0; - while (1) { - $i++; - my $avglstat = $last_mem_transferred ? $last_mem_transferred / $i : 0; - - usleep($usleep); - - my $stat = eval { mon_cmd($vmid, "query-migrate") }; - if (my $err = $@) { - $err_count++; - warn "query migrate failed: $err\n"; - $self->log('info', "query migrate failed: $err"); - if ($err_count <= 5) { - usleep(1_000_000); - next; - } - die "too many query migrate failures - aborting\n"; - } - - my $status = $stat->{status}; - if (defined($status) && $status =~ m/^(setup)$/im) { - sleep(1); - next; - } - - if (!defined($status) || $status !~ m/^(active|completed|failed|cancelled)$/im) { - die $merr if $merr; - die "unable to parse migration status '$status' - aborting\n"; - } - $merr = undef; - $err_count = 0; - - my $memstat = $stat->{ram}; - - if ($status eq 'completed') { - my $delay = time() - $start; - if ($delay > 0) { - my $total = $memstat->{total} || 0; - my $avg_speed = render_bytes($total / $delay, 1); - my $downtime = $stat->{downtime} || 0; - $self->log('info', "average migration speed: $avg_speed/s - downtime $downtime ms"); - } - } - - if ($status eq 'failed' || $status eq 'cancelled') { - my $message = $stat->{'error-desc'} ? "$status - $stat->{'error-desc'}" : $status; - $self->log('info', "migration status error: $message"); - die "aborting\n" - } - - if ($status ne 'active') { - $self->log('info', "migration status: $status"); - last; - } - - if ($memstat->{transferred} ne $last_mem_transferred) { - my $trans = $memstat->{transferred} || 0; - my $rem = $memstat->{remaining} || 0; - my $total = $memstat->{total} || 0; - my $speed = ($memstat->{'pages-per-second'} // 0) * ($memstat->{'page-size'} // 0); - my $dirty_rate = ($memstat->{'dirty-pages-rate'} // 0) * ($memstat->{'page-size'} // 0); - - # reduce sleep if remainig memory is lower than the average transfer speed - $usleep = 100_000 if $avglstat && $rem < $avglstat; - - # also reduce loggin if we poll more frequent - my $should_log = $usleep > 100_000 ? 
1 : ($i % 10) == 0; - - my $total_h = render_bytes($total, 1); - my $transferred_h = render_bytes($trans, 1); - my $speed_h = render_bytes($speed, 1); - - my $progress = "transferred $transferred_h of $total_h VM-state, ${speed_h}/s"; - - if ($dirty_rate > $speed) { - my $dirty_rate_h = render_bytes($dirty_rate, 1); - $progress .= ", VM dirties lots of memory: $dirty_rate_h/s"; - } - - $self->log('info', "migration $status, $progress") if $should_log; - - my $xbzrle = $stat->{"xbzrle-cache"} || {}; - my ($xbzrlebytes, $xbzrlepages) = $xbzrle->@{'bytes', 'pages'}; - if ($xbzrlebytes || $xbzrlepages) { - my $bytes_h = render_bytes($xbzrlebytes, 1); - - my $msg = "send updates to $xbzrlepages pages in $bytes_h encoded memory"; - - $msg .= sprintf(", cache-miss %.2f%%", $xbzrle->{'cache-miss-rate'} * 100) - if $xbzrle->{'cache-miss-rate'}; - - $msg .= ", overflow $xbzrle->{overflow}" if $xbzrle->{overflow}; - - $self->log('info', "xbzrle: $msg") if $should_log; - } - - if (($lastrem && $rem > $lastrem) || ($rem == 0)) { - $downtimecounter++; - } - $lastrem = $rem; - - if ($downtimecounter > 5) { - $downtimecounter = 0; - $migrate_downtime *= 2; - $self->log('info', "auto-increased downtime to continue migration: $migrate_downtime ms"); - eval { - # migrate-set-parameters does not touch values not - # specified, so this only changes downtime-limit - mon_cmd($vmid, "migrate-set-parameters", 'downtime-limit' => int($migrate_downtime)); - }; - $self->log('info', "migrate-set-parameters error: $@") if $@; - } - } - - $last_mem_transferred = $memstat->{transferred}; - } + live_migration($self, $vmid, $migrate_uri, $spice_port); if ($self->{storage_migration}) { # finish block-job with block-job-cancel, to disconnect source VM from NBD -- 2.39.2 ^ permalink raw reply [flat|nested] 15+ messages in thread
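One detail of the moved code that is easy to misread is the double unit conversion for the bandwidth limit; a made-up worked example (values are not from the patch):

    # qm.conf:        migrate_speed: 100        -> 100 MB/s * 1024 = 102400 KB/s
    # datacenter.cfg: bwlimit: migration=81920  -> 81920 KB/s
    # lower of the two: 81920 KB/s; * 1024 = 83886080 B/s (80 MiB/s)
    # => passed to QMP as migrate-set-parameters 'max-bandwidth' => 83886080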
* Re: [pve-devel] [PATCH v4 qemu-server 1/2] migration: move livemigration code in a dedicated sub 2023-09-28 14:45 ` [pve-devel] [PATCH v4 qemu-server 1/2] migration: move livemigration code in a dedicated sub Alexandre Derumier @ 2023-10-09 11:25 ` Fiona Ebner 0 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2023-10-09 11:25 UTC (permalink / raw) To: Proxmox VE development discussion, Alexandre Derumier Missing your Signed-off-by trailer. Apart from that Reviewed-by: Fiona Ebner <f.ebner@proxmox.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
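For the follow-up, the final commit would then carry trailers along these lines (the author address is not spelled out in this thread, so it is left as a placeholder):

    Signed-off-by: Alexandre Derumier <author address>
    Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>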
* [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-09-28 14:45 [pve-devel] [PATCH v4 qemu-server 0/2] remote-migration: migration with different cpu Alexandre Derumier 2023-09-28 14:45 ` [pve-devel] [PATCH v4 qemu-server 1/2] migration: move livemigration code in a dedicated sub Alexandre Derumier @ 2023-09-28 14:45 ` Alexandre Derumier 2023-10-09 12:13 ` Fiona Ebner 1 sibling, 1 reply; 15+ messages in thread From: Alexandre Derumier @ 2023-09-28 14:45 UTC (permalink / raw) To: pve-devel This patch add support for remote migration when target cpu model is different. target-reboot param need to be defined to allow migration whens source vm is online. When defined, only the live storage migration is done, and instead to transfert memory, we cleanly shutdown source vm and restart the target vm. (like a virtual reboot between source/dest) --- PVE/API2/Qemu.pm | 23 ++++++++++++++++++++++- PVE/CLI/qm.pm | 11 +++++++++++ PVE/QemuMigrate.pm | 31 +++++++++++++++++++++++++++++-- 3 files changed, 62 insertions(+), 3 deletions(-) diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm index 774b0c7..38991e9 100644 --- a/PVE/API2/Qemu.pm +++ b/PVE/API2/Qemu.pm @@ -4586,6 +4586,17 @@ __PACKAGE__->register_method({ optional => 1, default => 0, }, + 'target-cpu' => { + optional => 1, + description => "Target Emulated CPU model. For online migration, this require target-reboot option", + type => 'string', + format => 'pve-vm-cpu-conf', + }, + 'target-reboot' => { + type => 'boolean', + description => "For online migration , don't migrate memory, only storage. Then, the source vm is shutdown and the target vm is restarted.", + optional => 1, + }, 'target-storage' => get_standard_option('pve-targetstorage', { completion => \&PVE::QemuServer::complete_migration_storage, optional => 0, @@ -4666,7 +4677,7 @@ __PACKAGE__->register_method({ if (PVE::QemuServer::check_running($source_vmid)) { die "can't migrate running VM without --online\n" if !$param->{online}; - + die "can't migrate running VM without --target-reboot when target cpu is different" if $param->{'target-cpu'} && !$param->{'target-reboot'}; } else { warn "VM isn't running. Doing offline migration instead.\n" if $param->{online}; $param->{online} = 0; @@ -4683,6 +4694,7 @@ __PACKAGE__->register_method({ raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" }) if $@; + die "remote migration requires explicit storage mapping!\n" if $storagemap->{identity}; @@ -5732,6 +5744,15 @@ __PACKAGE__->register_method({ PVE::QemuServer::nbd_stop($state->{vmid}); return; }, + 'restart' => sub { + PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1); + my $info = PVE::QemuServer::vm_start_nolock( + $state->{storecfg}, + $state->{vmid}, + $state->{conf}, + ); + return; + }, 'resume' => sub { if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) { PVE::QemuServer::vm_resume($state->{vmid}, 1, 1); diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm index b17b4fe..9d89cfe 100755 --- a/PVE/CLI/qm.pm +++ b/PVE/CLI/qm.pm @@ -189,6 +189,17 @@ __PACKAGE__->register_method({ optional => 1, default => 0, }, + 'target-cpu' => { + optional => 1, + description => "Target Emulated CPU model. For online migration, this require target-reboot option", + type => 'string', + format => 'pve-vm-cpu-conf', + }, + 'target-reboot' => { + type => 'boolean', + description => "For online migration , don't migrate memory, only storage. 
Then, the source vm is shutdown and the target vm is restarted.", + optional => 1, + }, 'target-storage' => get_standard_option('pve-targetstorage', { completion => \&PVE::QemuServer::complete_migration_storage, optional => 0, diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm index 5ea78a7..8131b0b 100644 --- a/PVE/QemuMigrate.pm +++ b/PVE/QemuMigrate.pm @@ -729,6 +729,11 @@ sub cleanup_bitmaps { sub live_migration { my ($self, $vmid, $migrate_uri, $spice_port) = @_; + if($self->{opts}->{'target-reboot'}){ + $self->log('info', "target reboot - skip live migration."); + return; + } + my $conf = $self->{vmconf}; $self->log('info', "starting online/live migration on $migrate_uri"); @@ -993,6 +998,7 @@ sub phase1_remote { my $remote_conf = PVE::QemuConfig->load_config($vmid); PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map}); + $remote_conf->{cpu} = $self->{opts}->{'target-cpu'} if $self->{opts}->{'target-cpu'}; my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap}); for my $target (keys $bridges->%*) { for my $nic (keys $bridges->{$target}->%*) { @@ -1356,7 +1362,14 @@ sub phase2 { # finish block-job with block-job-cancel, to disconnect source VM from NBD # to avoid it trying to re-establish it. We are in blockjob ready state, # thus, this command changes to it to blockjob complete (see qapi docs) - eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, 'cancel'); }; + my $finish_cmd = "cancel"; + if ($self->{opts}->{'target-reboot'}) { + # no live migration. + # finish block-job with block-job-complete, the source will switch to remote NDB + # then we cleanly stop the source vm at phase3 + $finish_cmd = "complete"; + } + eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, $finish_cmd); }; if (my $err = $@) { die "Failed to complete storage migration: $err\n"; } @@ -1573,7 +1586,17 @@ sub phase3_cleanup { }; # always stop local VM with nocheck, since config is moved already - eval { PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1); }; + my $shutdown_timeout = undef; + my $shutdown = undef; + my $force_stop = undef; + if ($self->{opts}->{'target-reboot'}) { + $shutdown_timeout = 180; + $shutdown = 1; + $force_stop = 1; + $self->log('info', "clean shutdown of source vm."); + } + + eval { PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1, $shutdown_timeout, $shutdown, $force_stop); }; if (my $err = $@) { $self->log('err', "stopping vm failed - $err"); $self->{errors} = 1; @@ -1607,6 +1630,10 @@ sub phase3_cleanup { # clear migrate lock if ($tunnel && $tunnel->{version} >= 2) { PVE::Tunnel::write_tunnel($tunnel, 10, "unlock"); + if ($self->{opts}->{'target-reboot'}) { + $self->log('info', "restart target vm."); + PVE::Tunnel::write_tunnel($tunnel, 10, 'restart'); + } PVE::Tunnel::finish_tunnel($tunnel); } else { -- 2.39.2 ^ permalink raw reply [flat|nested] 15+ messages in thread
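As a usage illustration (not part of the patch): with these options, a restart remote-migration of a running VM to a cluster with a different CPU model could be invoked roughly as follows. The endpoint, bridge, storage and CPU values are placeholders, and the target-endpoint format follows the existing remote-migrate parameter:

    qm remote-migrate 111 111 \
        'apitoken=PVEAPIToken=<user>@<realm>!<tokenid>=<secret>,host=<target-host>,fingerprint=<fp>' \
        --target-bridge vmbr0 \
        --target-storage targetstorage \
        --online --target-reboot \
        --target-cpu <target-cpu-model>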
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-09-28 14:45 ` [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params Alexandre Derumier @ 2023-10-09 12:13 ` Fiona Ebner 2023-10-09 13:47 ` DERUMIER, Alexandre 0 siblings, 1 reply; 15+ messages in thread From: Fiona Ebner @ 2023-10-09 12:13 UTC (permalink / raw) To: Proxmox VE development discussion, Alexandre Derumier There can be other reasons to do a restart-migration, see also https://bugzilla.proxmox.com/show_bug.cgi?id=4530, so I feel like this should be split. One for introducing target-reboot and one for introducing target-cpu. For 'target-reboot', the question is if we should call it 'restart' like for container migration for consistency? Could also be introduced for normal migration while we're at it. One could argue that a true 'restart' migration would migrate the volumes also offline, but right now, I don't see a big downside to do it via NBD like in this patch. Still, something we should think about. If it turns out to be really needed, we'd need two different ways to do a restart migration :/ Am 28.09.23 um 16:45 schrieb Alexandre Derumier: > This patch add support for remote migration when target > cpu model is different. > > target-reboot param need to be defined to allow migration > whens source vm is online. > > When defined, only the live storage migration is done, > and instead to transfert memory, we cleanly shutdown source vm > and restart the target vm. (like a virtual reboot between source/dest) Missing your Signed-off-by > --- > PVE/API2/Qemu.pm | 23 ++++++++++++++++++++++- > PVE/CLI/qm.pm | 11 +++++++++++ > PVE/QemuMigrate.pm | 31 +++++++++++++++++++++++++++++-- > 3 files changed, 62 insertions(+), 3 deletions(-) > > diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm > index 774b0c7..38991e9 100644 > --- a/PVE/API2/Qemu.pm > +++ b/PVE/API2/Qemu.pm > @@ -4586,6 +4586,17 @@ __PACKAGE__->register_method({ > optional => 1, > default => 0, > }, > + 'target-cpu' => { > + optional => 1, > + description => "Target Emulated CPU model. For online migration, this require target-reboot option", To enforce it, you can use: requires => 'target-reboot', > + type => 'string', > + format => 'pve-vm-cpu-conf',> + }, > + 'target-reboot' => { > + type => 'boolean', > + description => "For online migration , don't migrate memory, only storage. Then, the source vm is shutdown and the target vm is restarted.", > + optional => 1, > + }, > 'target-storage' => get_standard_option('pve-targetstorage', { > completion => \&PVE::QemuServer::complete_migration_storage, > optional => 0, > @@ -4666,7 +4677,7 @@ __PACKAGE__->register_method({ > > if (PVE::QemuServer::check_running($source_vmid)) { > die "can't migrate running VM without --online\n" if !$param->{online}; > - > + die "can't migrate running VM without --target-reboot when target cpu is different" if $param->{'target-cpu'} && !$param->{'target-reboot'}; > } else { > warn "VM isn't running. 
Doing offline migration instead.\n" if $param->{online}; > $param->{online} = 0; > @@ -4683,6 +4694,7 @@ __PACKAGE__->register_method({ > raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" }) > if $@; > > + > die "remote migration requires explicit storage mapping!\n" > if $storagemap->{identity}; > Nit: unrelated change > @@ -5732,6 +5744,15 @@ __PACKAGE__->register_method({ > PVE::QemuServer::nbd_stop($state->{vmid}); > return; > }, > + 'restart' => sub { > + PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1); The first parameter is $storecfg and is not optional. To avoid deactivating the volumes, use the $keepActive parameter. > + my $info = PVE::QemuServer::vm_start_nolock( > + $state->{storecfg}, > + $state->{vmid}, > + $state->{conf}, > + ); > + return; > + }, > 'resume' => sub { > if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) { > PVE::QemuServer::vm_resume($state->{vmid}, 1, 1); > diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm > index b17b4fe..9d89cfe 100755 > --- a/PVE/CLI/qm.pm > +++ b/PVE/CLI/qm.pm > @@ -189,6 +189,17 @@ __PACKAGE__->register_method({ > optional => 1, > default => 0, > }, > + 'target-cpu' => { > + optional => 1, > + description => "Target Emulated CPU model. For online migration, this require target-reboot option", Again, can be enforced requires => 'target-reboot', > + type => 'string', > + format => 'pve-vm-cpu-conf', > + }, > + 'target-reboot' => { > + type => 'boolean', > + description => "For online migration , don't migrate memory, only storage. Then, the source vm is shutdown and the target vm is restarted.", > + optional => 1, > + }, > 'target-storage' => get_standard_option('pve-targetstorage', { > completion => \&PVE::QemuServer::complete_migration_storage, > optional => 0, > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm > index 5ea78a7..8131b0b 100644 > --- a/PVE/QemuMigrate.pm > +++ b/PVE/QemuMigrate.pm > @@ -729,6 +729,11 @@ sub cleanup_bitmaps { > sub live_migration { > my ($self, $vmid, $migrate_uri, $spice_port) = @_; > > + if($self->{opts}->{'target-reboot'}){ > + $self->log('info', "target reboot - skip live migration."); using restart migration - skipping live migration > + return; > + } > + > my $conf = $self->{vmconf}; > > $self->log('info', "starting online/live migration on $migrate_uri"); > @@ -993,6 +998,7 @@ sub phase1_remote { > my $remote_conf = PVE::QemuConfig->load_config($vmid); > PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map}); > > + $remote_conf->{cpu} = $self->{opts}->{'target-cpu'} if $self->{opts}->{'target-cpu'}; > my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap}); > for my $target (keys $bridges->%*) { > for my $nic (keys $bridges->{$target}->%*) { > @@ -1356,7 +1362,14 @@ sub phase2 { > # finish block-job with block-job-cancel, to disconnect source VM from NBD > # to avoid it trying to re-establish it. We are in blockjob ready state, > # thus, this command changes to it to blockjob complete (see qapi docs) > - eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, 'cancel'); }; > + my $finish_cmd = "cancel"; > + if ($self->{opts}->{'target-reboot'}) { > + # no live migration. 
> + # finish block-job with block-job-complete, the source will switch to remote NDB > + # then we cleanly stop the source vm at phase3 Nit: "during phase3" or "in phase3" > + $finish_cmd = "complete"; > + } > + eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, $finish_cmd); }; > if (my $err = $@) { > die "Failed to complete storage migration: $err\n"; > } > @@ -1573,7 +1586,17 @@ sub phase3_cleanup { > }; > > # always stop local VM with nocheck, since config is moved already > - eval { PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1); }; > + my $shutdown_timeout = undef; > + my $shutdown = undef; > + my $force_stop = undef; > + if ($self->{opts}->{'target-reboot'}) { > + $shutdown_timeout = 180; > + $shutdown = 1; > + $force_stop = 1; > + $self->log('info', "clean shutdown of source vm."); Sounds like it already happened and with force_stop=1 it might not be as clean in the worst case ;). Maybe just "shutting down source VM"? > + } > + > + eval { PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1, $shutdown_timeout, $shutdown, $force_stop); }; > if (my $err = $@) { > $self->log('err', "stopping vm failed - $err"); > $self->{errors} = 1; > @@ -1607,6 +1630,10 @@ sub phase3_cleanup { > # clear migrate lock > if ($tunnel && $tunnel->{version} >= 2) { > PVE::Tunnel::write_tunnel($tunnel, 10, "unlock"); > + if ($self->{opts}->{'target-reboot'}) { > + $self->log('info', "restart target vm."); > + PVE::Tunnel::write_tunnel($tunnel, 10, 'restart'); > + } > > PVE::Tunnel::finish_tunnel($tunnel); > } else { ^ permalink raw reply [flat|nested] 15+ messages in thread
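For concreteness, applying the `requires` suggestion would turn the schema entry from the patch into roughly the following (sketch only):

    'target-cpu' => {
        optional => 1,
        description => "Target Emulated CPU model. For online migration, this requires the target-reboot option.",
        type => 'string',
        format => 'pve-vm-cpu-conf',
        requires => 'target-reboot',
    },

Note that `requires` is enforced at the schema level for offline migration as well, which is stricter than the manual die in the API handler that only triggers for running VMs.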
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-09 12:13 ` Fiona Ebner @ 2023-10-09 13:47 ` DERUMIER, Alexandre 2023-10-10 9:19 ` Fiona Ebner 0 siblings, 1 reply; 15+ messages in thread
From: DERUMIER, Alexandre @ 2023-10-09 13:47 UTC (permalink / raw)
To: pve-devel, aderumier, f.ebner

Hi Fiona,

Thanks for the review.

>>There can be other reasons to do a restart-migration, see also
>>https://bugzilla.proxmox.com/show_bug.cgi?id=4530, so I feel like this
>>should be split. One for introducing target-reboot and one for
>>introducing target-cpu.

Yes, sure, it can be split into 2 patches, no problem. (v1 had only the target-cpu param; I have since split it into 2 different params, I just forgot to split it into 2 patches.)

>>For 'target-reboot', the question is if we should call it 'restart'
>>like for container migration for consistency? Could also be
>>introduced for normal migration while we're at it.

Well, we have pct reboot <vmid> and pct migrate <vmid> --restart ^_^
We can use "restart", no problem. It makes sense since we are already using it in pct migrate.

>>One could argue that a true 'restart' migration would migrate the
>>volumes also offline, but right now, I don't see a big downside to do it
>>via NBD like in this patch. Still, something we should think about. If
>>it turns out to be really needed, we'd need two different ways to do a
>>restart migration :/

I think that a true offline migration (without starting any source/target vm) can be done with qemu-storage-daemon running an nbd server on the target && qemu-img on the source.

This could also be used for online migration with unused/detached disks.

I'll send a v5 next week, as I'm going to begin a 4-day proxmox training tomorrow.

^ permalink raw reply [flat|nested] 15+ messages in thread
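A rough, untested sketch of that idea (paths, node and export names are invented; raw images are assumed, and the NBD socket would have to be made reachable from the source, e.g. forwarded through the migration tunnel or replaced by a TCP listener):

    # target node: export the pre-created destination volume, no VM involved
    qemu-storage-daemon \
        --blockdev driver=file,node-name=file0,filename=/path/to/vm-111-disk-1.raw \
        --nbd-server addr.type=unix,addr.path=/run/pve-migrate-111.sock \
        --export type=nbd,id=export0,node-name=file0,name=export0,writable=on

    # source node: push the data into the export
    qemu-img convert -n -p -f raw -O raw /path/to/vm-111-disk-1.raw \
        'nbd+unix:///export0?socket=/run/pve-migrate-111.sock'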
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-09 13:47 ` DERUMIER, Alexandre @ 2023-10-10 9:19 ` Fiona Ebner 2023-10-10 16:29 ` DERUMIER, Alexandre 2023-10-23 18:03 ` DERUMIER, Alexandre 0 siblings, 2 replies; 15+ messages in thread From: Fiona Ebner @ 2023-10-10 9:19 UTC (permalink / raw) To: DERUMIER, Alexandre, pve-devel, aderumier Am 09.10.23 um 15:47 schrieb DERUMIER, Alexandre: >>> One could argue that a true 'restart' migration would migrate the >>> volumes also offline, but right now, I don't see a big downside to do >>> it >>> via NBD like in this patch. Still, something we should think about. >>> If >>> it turns out to be really needed, we'd need two different ways to do >>> a >>> restart migration :/ > > I think that a true offline migration (without starting any > source/target vm) can be done with qemu-storage-daemon running nbd > server on target && qemu-img on source. > > This could be also used for online migration with unused/detached > disks. > Yes, we could, but would require additional logic. So the question is if there are enough advantages to do it rather than via a full VM. Starting a VM of course requires more resources. In case of 'restart' migration, we do want to start the VM anyways, so it's actually better, because we can catch config issues early :) Now that I think about it, can we also just start the target VM in prelaunch mode (instead of incoming migration mode), do the NBD migration, shut down the source VM, stop the NBD server and then resume the target? That would avoid the need to stop and start the target again. And therefore might be quite a bit less downtime. In case of offline migration, it can make sense to go via the storage daemon instead to avoid using more resources than necessary. But can always be added later. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-10 9:19 ` Fiona Ebner @ 2023-10-10 16:29 ` DERUMIER, Alexandre 2023-10-11 7:51 ` Fiona Ebner 2023-10-23 18:03 ` DERUMIER, Alexandre 1 sibling, 1 reply; 15+ messages in thread From: DERUMIER, Alexandre @ 2023-10-10 16:29 UTC (permalink / raw) To: pve-devel, aderumier, f.ebner > > I think that a true offline migration (without starting any > source/target vm) can be done with qemu-storage-daemon running nbd > server on target && qemu-img on source. > > This could be also used for online migration with unused/detached > disks. > >>Yes, we could, but would require additional logic. So the question is >>if >>there are enough advantages to do it rather than via a full VM. >>Starting >>a VM of course requires more resources. I don't known if it's possible to use the ndb server in the targetvm, if the target disk is not attached ? (I really don't known) >>In case of 'restart' migration, we do want to start the VM anyways, >>so >>it's actually better, because we can catch config issues early :) Now >>that I think about it, can we also just start the target VM in >>prelaunch >>mode (instead of incoming migration mode), do the NBD migration, shut >>down the source VM, stop the NBD server and then resume the target? >>That >>would avoid the need to stop and start the target again. And >>therefore >>might be quite a bit less downtime. Yes, indeed ! I'll work on this for next version. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-10 16:29 ` DERUMIER, Alexandre @ 2023-10-11 7:51 ` Fiona Ebner 0 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2023-10-11 7:51 UTC (permalink / raw) To: DERUMIER, Alexandre, pve-devel, aderumier Am 10.10.23 um 18:29 schrieb DERUMIER, Alexandre: >> >> I think that a true offline migration (without starting any >> source/target vm) can be done with qemu-storage-daemon running nbd >> server on target && qemu-img on source. >> >> This could be also used for online migration with unused/detached >> disks. >> > >>> Yes, we could, but would require additional logic. So the question is >>> if >>> there are enough advantages to do it rather than via a full VM. >>> Starting >>> a VM of course requires more resources. > > I don't known if it's possible to use the ndb server in the targetvm, > if the target disk is not attached ? (I really don't known) Yes, I think need to attach it as a blockdev first. But for 'restart' migration, it should be fine to start out with NBD and migrate non-attached disks with our current method. If we (later) want to add logic for migrating non-attached disks via storage-daemon we can always do that too and switch storage_migrate() to use that (maybe not unconditionally, ZFS -> ZFS still makes more sense to do with send/recv for example). ^ permalink raw reply [flat|nested] 15+ messages in thread
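An untested sketch of attaching an unused target volume as a blockdev and exporting it over the target VM's NBD server, using the usual mon_cmd helper (node/export names are invented; whether this sidesteps the write-permission conflict reported later in the thread for already-attached drives would need testing):

    # add the unused volume as an unattached blockdev on the target VM ...
    mon_cmd($vmid, 'blockdev-add',
        driver => 'file',
        'node-name' => 'migrate-unused0',
        filename => '/path/to/vm-111-disk-2.raw');
    # ... and export it writable over the already running NBD server
    mon_cmd($vmid, 'block-export-add',
        type => 'nbd',
        id => 'migrate-unused0',
        'node-name' => 'migrate-unused0',
        writable => JSON::true);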
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-10 9:19 ` Fiona Ebner 2023-10-10 16:29 ` DERUMIER, Alexandre @ 2023-10-23 18:03 ` DERUMIER, Alexandre 2023-10-24 8:11 ` Fiona Ebner 1 sibling, 1 reply; 15+ messages in thread
From: DERUMIER, Alexandre @ 2023-10-23 18:03 UTC (permalink / raw)
To: pve-devel, aderumier, f.ebner

Hi Fiona,

>>In case of 'restart' migration, we do want to start the VM anyways, so
>>it's actually better, because we can catch config issues early :) Now
>>that I think about it, can we also just start the target VM in prelaunch
>>mode (instead of incoming migration mode), do the NBD migration, shut
>>down the source VM, stop the NBD server and then resume the target? That
>>would avoid the need to stop and start the target again. And therefore
>>might be quite a bit less downtime.

I have done some tests; it's not currently possible to write to the remote nbd without the --incoming migration flag, with only -S:

https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg05700.html

nbd_add is throwing an error like

2023-10-23 18:45:51 [formationkvm1] VM 111 qmp command 'block-export-add' failed - Permission conflict on node '#block524': permissions 'write' are both required by an unnamed block device (uses node '#block524' as 'root' child) and unshared by block device 'drive-scsi0' (uses node '#block524' as 'root' child).

Looking at the qemu code, there are some specific code paths in the block layer when the incoming flag is set.

So I think the best way for now is to restart the target vm.

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-23 18:03 ` DERUMIER, Alexandre @ 2023-10-24 8:11 ` Fiona Ebner 2023-10-24 12:20 ` DERUMIER, Alexandre 0 siblings, 1 reply; 15+ messages in thread From: Fiona Ebner @ 2023-10-24 8:11 UTC (permalink / raw) To: DERUMIER, Alexandre, pve-devel, aderumier Am 23.10.23 um 20:03 schrieb DERUMIER, Alexandre: > Hi Fiona, > > >>> In case of 'restart' migration, we do want to start the VM anyways, >>> so >>> it's actually better, because we can catch config issues early :) Now >>> that I think about it, can we also just start the target VM in >>> prelaunch >>> mode (instead of incoming migration mode), do the NBD migration, shut >>> down the source VM, stop the NBD server and then resume the target? >>> That >>> would avoid the need to stop and start the target again. And >>> therefore >>> might be quite a bit less downtime. > > > I have done some tests, It's not possible currently to write to the > remote nbd without the --incoming migration flag and only -S > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg05700.html > > > > nbd_add is throwing an error like > > 2023-10-23 18:45:51 [formationkvm1] VM 111 qmp command 'block-export- > add' failed - Permission conflict on node '#block524': permissions > 'write' are both required by an unnamed block device (uses node > '#block524' as 'root' child) and unshared by block device 'drive-scsi0' > (uses node '#block524' as 'root' child). > > > Looking at the qemu code, the are some specific codepath in block when > the incoming flag is setup. > That is unfortunate. But thanks for giving it a shot! I guess if we'd really wanted to go this route, we'd need to add some kind of "empty" migration type without any state, but would be hard to get right and feels like a hack. > > So I think the best way for now is to restart the target vm. > Sure! Going with that is a much cleaner approach then. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-24 8:11 ` Fiona Ebner @ 2023-10-24 12:20 ` DERUMIER, Alexandre 2023-10-25 8:30 ` Fiona Ebner 0 siblings, 1 reply; 15+ messages in thread From: DERUMIER, Alexandre @ 2023-10-24 12:20 UTC (permalink / raw) To: pve-devel, aderumier, f.ebner >>So I think the best way for now is to restart the target vm. >> >>Sure! Going with that is a much cleaner approach then. I'll try to send a v5 today with you're last comments. I don't manage yet the unused disks, I need to test with blockdev, but if it's work, I think we'll need to add config generation in pve- storage for differents blockdriver like: –blockdev driver=file,node-name=file0,filename=vm.img –blockdev driver=rbd,node-name=rbd0,pool=my-pool,image=vm01 So maybe it'll take a little bit more time. (Maybe a second patch serie later to implement it) -------- Message initial -------- De: Fiona Ebner <f.ebner@proxmox.com> À: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>, pve- devel@lists.proxmox.com <pve-devel@lists.proxmox.com>, aderumier@odiso.com <aderumier@odiso.com> Objet: Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params Date: 24/10/2023 10:11:09 Am 23.10.23 um 20:03 schrieb DERUMIER, Alexandre: > Hi Fiona, > > > > > In case of 'restart' migration, we do want to start the VM > > > anyways, > > > so > > > it's actually better, because we can catch config issues early :) > > > Now > > > that I think about it, can we also just start the target VM in > > > prelaunch > > > mode (instead of incoming migration mode), do the NBD migration, > > > shut > > > down the source VM, stop the NBD server and then resume the > > > target? > > > That > > > would avoid the need to stop and start the target again. And > > > therefore > > > might be quite a bit less downtime. > > > I have done some tests, It's not possible currently to write to the > remote nbd without the --incoming migration flag and only -S > > > > https://antiphishing.cetsi.fr/proxy/v3?i=SGI0YVJGNmxZNE90Z2thMFYLWSxJ > OfIERJocpmb73Vs&r=SW5LV3JodE9QZkRVZ3JEYaKpfBJeBDlAX9E2aicRCRO3qsFIBX9 > zb4pDqGdxG45MOoGKkZ3R8w3DjSjAvqYgRg&f=bnJjU3hQT3pQSmNQZVE3aOdk6YB- > 6s0kvu35a0_AsxkSltfWi01kMLld3RaPwuBX&u=https%3A//lists.gnu.org/archiv > e/html/qemu-devel/2017-11/msg05700.html&k=dFBm > > > > nbd_add is throwing an error like > > 2023-10-23 18:45:51 [formationkvm1] VM 111 qmp command 'block-export- > add' failed - Permission conflict on node '#block524': permissions > 'write' are both required by an unnamed block device (uses node > '#block524' as 'root' child) and unshared by block device 'drive- > scsi0' > (uses node '#block524' as 'root' child). > > > Looking at the qemu code, the are some specific codepath in block > when > the incoming flag is setup. > That is unfortunate. But thanks for giving it a shot! I guess if we'd really wanted to go this route, we'd need to add some kind of "empty" migration type without any state, but would be hard to get right and feels like a hack. > > So I think the best way for now is to restart the target vm. > Sure! Going with that is a much cleaner approach then. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-24 12:20 ` DERUMIER, Alexandre @ 2023-10-25 8:30 ` Fiona Ebner 2023-10-25 16:01 ` DERUMIER, Alexandre 0 siblings, 1 reply; 15+ messages in thread From: Fiona Ebner @ 2023-10-25 8:30 UTC (permalink / raw) To: DERUMIER, Alexandre, pve-devel, aderumier Am 24.10.23 um 14:20 schrieb DERUMIER, Alexandre: >>> So I think the best way for now is to restart the target vm. >>> >>> Sure! Going with that is a much cleaner approach then. > > I'll try to send a v5 today with you're last comments. > > I don't manage yet the unused disks, I need to test with blockdev, > Is it required for this series? Unused disks can just be migrated offline via storage_migrate(), or? If we want to switch to migrating disks offline via QEMU instead of our current storage_migrate(), going for QEMU storage daemon + NBD seems the most natural to me. If it's not too complicated to temporarily attach the disks to the VM, that can be done too, but is less re-usable (e.g. pure offline migration won't benefit from that). > but if it's work, I think we'll need to add config generation in pve- > storage for differents blockdriver > > > like: > > –blockdev driver=file,node-name=file0,filename=vm.img > > –blockdev driver=rbd,node-name=rbd0,pool=my-pool,image=vm01 > What other special cases besides (non-krbd) RBD are there? If it's just that, I'd much rather keep the special handling in QEMU itself then burden all other storage plugins with implementing something specific to VMs. Or is there a way to use the path from the storage plugin somehow like we do at the moment, i.e. "rbd:rbd/vm-111-disk-1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbd.keyring"? > So maybe it'll take a little bit more time. > > (Maybe a second patch serie later to implement it) > Yes, I think that makes sense as a dedicated series. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-25 8:30 ` Fiona Ebner @ 2023-10-25 16:01 ` DERUMIER, Alexandre 2023-10-27 9:19 ` Fiona Ebner 0 siblings, 1 reply; 15+ messages in thread From: DERUMIER, Alexandre @ 2023-10-25 16:01 UTC (permalink / raw) To: pve-devel, aderumier, f.ebner > >>Is it required for this series? for this series, no. It's only focus on migrating to remote with different cpu without too much downtime. >> Unused disks can just be migrated >>offline via storage_migrate(), or? currently unused disk can't be migrate through the http tunnel for remote-migration 2023-10-25 17:51:38 ERROR: error - tunnel command '{"format":"raw","migration_snapshot":"1","export_formats":"zfs","allow _rename":"1","snapshot":"__migration__","volname":"vm-1112-disk- 1","cmd":"disk-import","storage":"targetstorage","with_snapshots":1}' failed - failed to handle 'disk-import' command - no matching import/export format found for storage 'preprodkvm' 2023-10-25 17:51:38 aborting phase 1 - cleanup resources tunnel: -> sending command "quit" to remote tunnel: <- got reply tunnel: CMD channel closed, shutting down 2023-10-25 17:51:39 ERROR: migration aborted (duration 00:00:01): error - tunnel command '{"format":"raw","migration_snapshot":"1","export_formats":"zfs","allow _rename":"1","snapshot":"__migration__","volname":"vm-1112-disk- 1","cmd":"disk-import","storage":"targetstorage","with_snapshots":1}' failed - failed to handle 'disk-import' command - no matching import/export format found for storage 'preprodkvm' migration aborted >>If we want to switch to migrating >>disks offline via QEMU instead of our current storage_migrate(), >>going >>for QEMU storage daemon + NBD seems the most natural to me. Yes, I more for this solution. >>If it's not too complicated to temporarily attach the disks to the >>VM, >>that can be done too, but is less re-usable (e.g. pure offline >>migration >>won't benefit from that). No sure about attach/detach temporary once by once, or attach all devices (but this need enough controllers slot). qemu storage daemon seem to be a less hacky solution ^_^ > but if it's work, I think we'll need to add config generation in pv > storage for differents blockdriver > > > like: > > –blockdev driver=file,node-name=file0,filename=vm.img > > –blockdev driver=rbd,node-name=rbd0,pool=my-pool,image=vm01 > >>What other special cases besides (non-krbd) RBD are there? If it's >>just >>that, I'd much rather keep the special handling in QEMU itself then >>burden all other storage plugins with implementing something specific >>to >>VMs. not sure, maybe glusterfs, .raw (should works for block device like lvm,zfs), .qcow2 >>Or is there a way to use the path from the storage plugin somehow >>like >>we do at the moment, i.e. >>"rbd:rbd/vm-111-disk- >>1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbd.key >>ring"? I don't think it's possible just like this.I need to do more test, looking at libvirt before they are not too much doc about it. > So maybe it'll take a little bit more time. > > (Maybe a second patch serie later to implement it) > >>Yes, I think that makes sense as a dedicated series. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params 2023-10-25 16:01 ` DERUMIER, Alexandre @ 2023-10-27 9:19 ` Fiona Ebner 0 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2023-10-27 9:19 UTC (permalink / raw) To: DERUMIER, Alexandre, pve-devel, aderumier Am 25.10.23 um 18:01 schrieb DERUMIER, Alexandre: >>> Unused disks can just be migrated >>> offline via storage_migrate(), or? > > currently unused disk can't be migrate through the http tunnel for > remote-migration > > 2023-10-25 17:51:38 ERROR: error - tunnel command > '{"format":"raw","migration_snapshot":"1","export_formats":"zfs","allow > _rename":"1","snapshot":"__migration__","volname":"vm-1112-disk- > 1","cmd":"disk-import","storage":"targetstorage","with_snapshots":1}' > failed - failed to handle 'disk-import' command - no matching > import/export format found for storage 'preprodkvm' > 2023-10-25 17:51:38 aborting phase 1 - cleanup resources > tunnel: -> sending command "quit" to remote > tunnel: <- got reply > tunnel: CMD channel closed, shutting down > 2023-10-25 17:51:39 ERROR: migration aborted (duration 00:00:01): error > - tunnel command > '{"format":"raw","migration_snapshot":"1","export_formats":"zfs","allow > _rename":"1","snapshot":"__migration__","volname":"vm-1112-disk- > 1","cmd":"disk-import","storage":"targetstorage","with_snapshots":1}' > failed - failed to handle 'disk-import' command - no matching > import/export format found for storage 'preprodkvm' > migration aborted > Well, yes, they can. But there needs to be a common import/export format between the storage types. Which admittedly is a bit limited for certain storage types, e.g. ZFS only supports ZFS and RBD does not implement import/export at all yet (because in a single cluster it wasn't needed). >>> If we want to switch to migrating >>> disks offline via QEMU instead of our current storage_migrate(), >>> going >>> for QEMU storage daemon + NBD seems the most natural to me. > > Yes, I more for this solution. > >>> If it's not too complicated to temporarily attach the disks to the >>> VM, >>> that can be done too, but is less re-usable (e.g. pure offline >>> migration >>> won't benefit from that). > > No sure about attach/detach temporary once by once, or attach all > devices (but this need enough controllers slot). > I think you can attach them to the VM without attaching to a controller by using QMP blockdev-add, but... > qemu storage daemon seem to be a less hacky solution ^_^ > ...sure, this should be nicer and more re-usable. > >> but if it's work, I think we'll need to add config generation in pv >> storage for differents blockdriver >> >> >> like: >> >> –blockdev driver=file,node-name=file0,filename=vm.img >> >> –blockdev driver=rbd,node-name=rbd0,pool=my-pool,image=vm01 >> > >>> What other special cases besides (non-krbd) RBD are there? If it's >>> just >>> that, I'd much rather keep the special handling in QEMU itself then >>> burden all other storage plugins with implementing something specific >>> to >>> VMs. > > not sure, maybe glusterfs, .raw (should works for block device like > lvm,zfs), .qcow2 > There's a whole lot of drivers https://qemu.readthedocs.io/en/v8.1.0/interop/qemu-qmp-ref.html#qapidoc-883 But e.g. for NFS, we don't necessarily need it and can just use qcow2/raw. Currently, with -drive we also just treat it like any other file. I'd like to keep the logic for how to construct the -blockdev command line option (mostly) in qemu-server itself. 
But I guess we can't avoid some amount of coupling. Currently, for -drive we have the coupling in path() which can e.g. return rbd: or gluster: and then QEMU will parse what driver to use from that path. Two approaches that make sense to me (no real preference at the moment): 1. Have a storage plugin method which tells qemu-server about the necessary driver and properties for opening the image. E.g. return the properties as a hash and then have qemu-server join them together and then add the generic properties (e.g. aio,node-name) to construct the full -blockdev option. 2. Do everything in qemu-server and special case for certain storage types that have a dedicated driver. Still needs to get the info like pool name from the RBD storage of course, but that should be possible with existing methods. Happy to hear other suggestions/opinions. > >>> Or is there a way to use the path from the storage plugin somehow >>> like >>> we do at the moment, i.e. >>> "rbd:rbd/vm-111-disk- >>> 1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/rbd.key >>> ring"? > > I don't think it's possible just like this.I need to do more test, > looking at libvirt before they are not too much doc about it. > Probably they decided to get rid of this magic for the newer -blockdev variant. I tried to cheat using driver=file and specify the "rbd:"-path as the filename, but it doesn't work :P ^ permalink raw reply [flat|nested] 15+ messages in thread
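A minimal sketch of the first approach (hypothetical hook name and return values, not an existing storage plugin API):

    # in the RBD plugin: report the driver-specific -blockdev properties
    sub qemu_blockdev_options {
        my ($class, $scfg, $storeid, $volname) = @_;
        return {
            driver => 'rbd',
            pool   => $scfg->{pool},
            image  => $volname,
            conf   => '/etc/pve/ceph.conf',
            user   => $scfg->{username} // 'admin',
        };
    }

    # qemu-server would merge its generic properties (node-name, cache, aio, ...)
    # and join everything into the final option, e.g.:
    #   -blockdev driver=rbd,pool=rbd,image=vm-111-disk-1,conf=/etc/pve/ceph.conf,user=admin,node-name=drive-scsi0,...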