Message-ID: <5ecfa7d0-4525-5f1e-75a2-a6ae1a93356b@proxmox.com>
Date: Mon, 9 Oct 2023 14:13:44 +0200
From: Fiona Ebner
To: Proxmox VE development discussion, Alexandre Derumier
References: <20230928144556.2023558-1-aderumier@odiso.com> <20230928144556.2023558-3-aderumier@odiso.com>
In-Reply-To: <20230928144556.2023558-3-aderumier@odiso.com>
Subject: Re: [pve-devel] [PATCH v4 qemu-server 2/2] remote-migration: add target-cpu && target-reboot params
List-Id: Proxmox VE development discussion

There can be other reasons to do a restart-migration, see also
https://bugzilla.proxmox.com/show_bug.cgi?id=4530, so I feel like this
should be split: one patch introducing target-reboot and one
introducing target-cpu.

For 'target-reboot', the question is whether we should call it
'restart' like for container migration, for consistency? It could also
be introduced for normal migration while we're at it.

One could argue that a true 'restart' migration would also migrate the
volumes offline, but right now, I don't see a big downside to doing it
via NBD like in this patch. Still, something we should think about. If
it turns out to be really needed, we'd need two different ways to do a
restart migration :/

On 28.09.23 at 16:45, Alexandre Derumier wrote:
> This patch add support for remote migration when target
> cpu model is different.
>
> target-reboot param need to be defined to allow migration
> whens source vm is online.
>
> When defined, only the live storage migration is done,
> and instead to transfert memory, we cleanly shutdown source vm
> and restart the target vm.
> (like a virtual reboot between source/dest)

Missing your Signed-off-by

> ---
>  PVE/API2/Qemu.pm   | 23 ++++++++++++++++++++++-
>  PVE/CLI/qm.pm      | 11 +++++++++++
>  PVE/QemuMigrate.pm | 31 +++++++++++++++++++++++++++++--
>  3 files changed, 62 insertions(+), 3 deletions(-)
>
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 774b0c7..38991e9 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4586,6 +4586,17 @@ __PACKAGE__->register_method({
>              optional => 1,
>              default => 0,
>          },
> +        'target-cpu' => {
> +            optional => 1,
> +            description => "Target Emulated CPU model. For online migration, this require target-reboot option",

To enforce it, you can use:
    requires => 'target-reboot',

> +            type => 'string',
> +            format => 'pve-vm-cpu-conf',
> +        },
> +        'target-reboot' => {
> +            type => 'boolean',
> +            description => "For online migration , don't migrate memory, only storage. Then, the source vm is shutdown and the target vm is restarted.",
> +            optional => 1,
> +        },
>          'target-storage' => get_standard_option('pve-targetstorage', {
>              completion => \&PVE::QemuServer::complete_migration_storage,
>              optional => 0,
> @@ -4666,7 +4677,7 @@ __PACKAGE__->register_method({
>
>      if (PVE::QemuServer::check_running($source_vmid)) {
>          die "can't migrate running VM without --online\n" if !$param->{online};
> -
> +        die "can't migrate running VM without --target-reboot when target cpu is different" if $param->{'target-cpu'} && !$param->{'target-reboot'};
>      } else {
>          warn "VM isn't running. Doing offline migration instead.\n" if $param->{online};
>          $param->{online} = 0;
> @@ -4683,6 +4694,7 @@ __PACKAGE__->register_method({
>      raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
>          if $@;
>
> +
>      die "remote migration requires explicit storage mapping!\n"
>          if $storagemap->{identity};
>

Nit: unrelated change

> @@ -5732,6 +5744,15 @@ __PACKAGE__->register_method({
>              PVE::QemuServer::nbd_stop($state->{vmid});
>              return;
>          },
> +        'restart' => sub {
> +            PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);

The first parameter is $storecfg and is not optional. To avoid
deactivating the volumes, use the $keepActive parameter.

> +            my $info = PVE::QemuServer::vm_start_nolock(
> +                $state->{storecfg},
> +                $state->{vmid},
> +                $state->{conf},
> +            );
> +            return;
> +        },
>          'resume' => sub {
>              if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
>                  PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
> diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> index b17b4fe..9d89cfe 100755
> --- a/PVE/CLI/qm.pm
> +++ b/PVE/CLI/qm.pm
> @@ -189,6 +189,17 @@ __PACKAGE__->register_method({
>              optional => 1,
>              default => 0,
>          },
> +        'target-cpu' => {
> +            optional => 1,
> +            description => "Target Emulated CPU model. For online migration, this require target-reboot option",

Again, can be enforced with:
    requires => 'target-reboot',

> +            type => 'string',
> +            format => 'pve-vm-cpu-conf',
> +        },
> +        'target-reboot' => {
> +            type => 'boolean',
> +            description => "For online migration , don't migrate memory, only storage. Then, the source vm is shutdown and the target vm is restarted.",
> +            optional => 1,
> +        },
>          'target-storage' => get_standard_option('pve-targetstorage', {
>              completion => \&PVE::QemuServer::complete_migration_storage,
>              optional => 0,
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 5ea78a7..8131b0b 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -729,6 +729,11 @@ sub cleanup_bitmaps {
>  sub live_migration {
>      my ($self, $vmid, $migrate_uri, $spice_port) = @_;
>
> +    if($self->{opts}->{'target-reboot'}){
> +        $self->log('info', "target reboot - skip live migration.");

using restart migration - skipping live migration

> +        return;
> +    }
> +
>      my $conf = $self->{vmconf};
>
>      $self->log('info', "starting online/live migration on $migrate_uri");
> @@ -993,6 +998,7 @@ sub phase1_remote {
>      my $remote_conf = PVE::QemuConfig->load_config($vmid);
>      PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
>
> +    $remote_conf->{cpu} = $self->{opts}->{'target-cpu'} if $self->{opts}->{'target-cpu'};
>      my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
>      for my $target (keys $bridges->%*) {
>          for my $nic (keys $bridges->{$target}->%*) {
> @@ -1356,7 +1362,14 @@ sub phase2 {
>      # finish block-job with block-job-cancel, to disconnect source VM from NBD
>      # to avoid it trying to re-establish it. We are in blockjob ready state,
>      # thus, this command changes to it to blockjob complete (see qapi docs)
> -    eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, 'cancel'); };
> +    my $finish_cmd = "cancel";
> +    if ($self->{opts}->{'target-reboot'}) {
> +        # no live migration.
> +        # finish block-job with block-job-complete, the source will switch to remote NDB
> +        # then we cleanly stop the source vm at phase3

Nit: "during phase3" or "in phase3"

> +        $finish_cmd = "complete";
> +    }
> +    eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, $finish_cmd); };
>      if (my $err = $@) {
>          die "Failed to complete storage migration: $err\n";
>      }
> @@ -1573,7 +1586,17 @@ sub phase3_cleanup {
>      };
>
>      # always stop local VM with nocheck, since config is moved already
> -    eval { PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1); };
> +    my $shutdown_timeout = undef;
> +    my $shutdown = undef;
> +    my $force_stop = undef;
> +    if ($self->{opts}->{'target-reboot'}) {
> +        $shutdown_timeout = 180;
> +        $shutdown = 1;
> +        $force_stop = 1;
> +        $self->log('info', "clean shutdown of source vm.");

Sounds like it already happened and with force_stop=1 it might not be
as clean in the worst case ;). Maybe just "shutting down source VM"?

> +    }
> +
> +    eval { PVE::QemuServer::vm_stop($self->{storecfg}, $vmid, 1, 1, $shutdown_timeout, $shutdown, $force_stop); };
>      if (my $err = $@) {
>          $self->log('err', "stopping vm failed - $err");
>          $self->{errors} = 1;
> @@ -1607,6 +1630,10 @@ sub phase3_cleanup {
>      # clear migrate lock
>      if ($tunnel && $tunnel->{version} >= 2) {
>          PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
> +        if ($self->{opts}->{'target-reboot'}) {
> +            $self->log('info', "restart target vm.");
> +            PVE::Tunnel::write_tunnel($tunnel, 10, 'restart');
> +        }
>
>          PVE::Tunnel::finish_tunnel($tunnel);
>      } else {
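
To make the `requires` suggestion above concrete: PVE's JSONSchema-style
parameter hashes support a `requires` key that makes one parameter valid
only in combination with another, so the target-cpu/target-reboot
dependency can be rejected at schema validation time rather than via the
hand-rolled die() in the API handler. A rough sketch of the adjusted
property definitions (descriptions reworded for illustration; not final
wording):

```perl
# Sketch: declaring the parameter dependency in the schema, as suggested
# in the review. With 'requires', a request passing 'target-cpu' without
# 'target-reboot' is rejected before the handler runs, making the manual
# die() check in the handler redundant.
'target-cpu' => {
    type => 'string',
    format => 'pve-vm-cpu-conf',
    optional => 1,
    requires => 'target-reboot', # schema-level dependency
    description => "Target emulated CPU model. For online migration,"
        ." this requires the target-reboot option.",
},
'target-reboot' => {
    type => 'boolean',
    optional => 1,
    description => "For online migration, don't migrate memory, only"
        ." storage; then shut down the source VM and start the target VM.",
},
```

The same definitions appear in both PVE/API2/Qemu.pm and PVE/CLI/qm.pm,
so it might be worth factoring them into a standard option to avoid the
duplication.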