From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling
Date: Tue, 4 Jan 2022 14:58:03 +0100
Message-ID: <5a3846bc-04bc-ed4d-4e2e-38a9911390aa@proxmox.com>
In-Reply-To: <20211222135257.3242938-17-f.gruenbichler@proxmox.com>
A few comments inline:
On 22.12.21 14:52, Fabian Grünbichler wrote:
> remote migration uses a websocket connection to a task worker running on
> the target node instead of commands via SSH to control the migration.
> this websocket tunnel is started earlier than the SSH tunnel, and allows
> adding UNIX-socket forwarding over additional websocket connections
> on-demand.
>
> the main differences to regular intra-cluster migration are:
> - source VM config and disks are only removed upon request via --delete
> - shared storages are treated like local storages, since we can't
> assume they are shared across clusters (with potential to extend this
> by marking storages as shared)
> - NBD migrated disks are explicitly pre-allocated on the target node via
> tunnel command before starting the target VM instance
> - in addition to storages, network bridges and the VMID itself are
> transformed via a user-defined mapping
> - all commands and migration data streams are sent via a WS tunnel proxy
>
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
>
> Notes:
> requires bumped pve-guest-common
>
> v3:
> - move WS tunnel helpers to pve-guest-common-perl
> - check bridge mapping early
>
> v2:
> - improve tunnel version info printing and error handling
> - don't cleanup unix sockets twice
> - url escape remote socket path
> - cleanup nits and small issues
>
> requires proxmox-websocket-tunnel
>
> v3:
> - fix misplaced parentheses
>
> PVE/API2/Qemu.pm | 2 +-
> PVE/QemuMigrate.pm | 425 +++++++++++++++++++++++++++++++++++++--------
> PVE/QemuServer.pm | 7 +-
> 3 files changed, 359 insertions(+), 75 deletions(-)
>
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index d188b77..cf90fe7 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4803,7 +4803,7 @@ __PACKAGE__->register_method({
> # bump/reset for breaking changes
> # bump/bump for opt-in changes
> return {
> - api => 2,
> + api => $PVE::QemuMigrate::WS_TUNNEL_VERSION,
> age => 0,
> };
> },
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 897018b..3e9d29e 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -5,11 +5,10 @@ use warnings;
>
> use IO::File;
> use IPC::Open2;
> -use POSIX qw( WNOHANG );
> use Time::HiRes qw( usleep );
>
> -use PVE::Format qw(render_bytes);
> use PVE::Cluster;
> +use PVE::Format qw(render_bytes);
> use PVE::GuestHelpers qw(safe_boolean_ne safe_string_ne);
> use PVE::INotify;
> use PVE::RPCEnvironment;
> @@ -17,6 +16,7 @@ use PVE::Replication;
> use PVE::ReplicationConfig;
> use PVE::ReplicationState;
> use PVE::Storage;
> +use PVE::StorageTunnel;
> use PVE::Tools;
> use PVE::Tunnel;
>
> @@ -31,6 +31,9 @@ use PVE::QemuServer;
> use PVE::AbstractMigrate;
> use base qw(PVE::AbstractMigrate);
>
> +# compared against remote end's minimum version
> +our $WS_TUNNEL_VERSION = 2;
> +
> sub fork_tunnel {
> my ($self, $ssh_forward_info) = @_;
>
> @@ -43,6 +46,35 @@ sub fork_tunnel {
> return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
> }
>
> +sub fork_websocket_tunnel {
> + my ($self, $storages, $bridges) = @_;
> +
> + my $remote = $self->{opts}->{remote};
> + my $conn = $remote->{conn};
> +
> + my $websocket_url = "https://$conn->{host}:$conn->{port}/api2/json/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnelwebsocket";
> + my $url = "/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnel";
> +
> + my $tunnel_params = {
> + url => $websocket_url,
> + };
> +
> + my $storage_list = join(',', keys %$storages);
> + my $bridge_list = join(',', keys %$bridges);
> +
> + my $req_params = {
> + storages => $storage_list,
> + bridges => $bridge_list,
> + };
> +
> + my $log = sub {
> + my $line = shift;
> + $self->log('info', $line);
> + };
> +
> + return PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
> +}
> +
> # tunnel_info:
> # proto: unix (secure) or tcp (insecure/legacy compat)
> # addr: IP or UNIX socket path
> @@ -175,23 +207,34 @@ sub prepare {
> }
>
> my $vollist = PVE::QemuServer::get_vm_volumes($conf);
> +
> + my $storages = {};
> foreach my $volid (@$vollist) {
> my ($sid, $volname) = PVE::Storage::parse_volume_id($volid, 1);
>
> - # check if storage is available on both nodes
> + # check if storage is available on source node
> my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
>
> my $targetsid = $sid;
> - # NOTE: we currently ignore shared source storages in mappings so skip here too for now
> - if (!$scfg->{shared}) {
> + # NOTE: local ignores shared mappings, remote maps them
> + if (!$scfg->{shared} || $self->{opts}->{remote}) {
> $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
> }
>
> - my $target_scfg = PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
> - my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
> + $storages->{$targetsid} = 1;
>
> - die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
> - if !$target_scfg->{content}->{$vtype};
> + if (!$self->{opts}->{remote}) {
> + # check if storage is available on target node
> + my $target_scfg = PVE::Storage::storage_check_enabled(
> + $storecfg,
> + $targetsid,
> + $self->{node},
> + );
> + my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
> +
> + die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
> + if !$target_scfg->{content}->{$vtype};
> + }
>
> if ($scfg->{shared}) {
> # PVE::Storage::activate_storage checks this for non-shared storages
> @@ -201,10 +244,27 @@ sub prepare {
> }
> }
>
> - # test ssh connection
> - my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
> - eval { $self->cmd_quiet($cmd); };
> - die "Can't connect to destination address using public key\n" if $@;
> + if ($self->{opts}->{remote}) {
> + # test & establish websocket connection
> + my $bridges = map_bridges($conf, $self->{opts}->{bridgemap}, 1);
> + my $tunnel = $self->fork_websocket_tunnel($storages, $bridges);
> + my $min_version = $tunnel->{version} - $tunnel->{age};
> + $self->log('info', "local WS tunnel version: $WS_TUNNEL_VERSION");
> + $self->log('info', "remote WS tunnel version: $tunnel->{version}");
> + $self->log('info', "minimum required WS tunnel version: $min_version");
> + die "Remote tunnel endpoint not compatible, upgrade required\n"
> + if $WS_TUNNEL_VERSION < $min_version;
> + die "Remote tunnel endpoint too old, upgrade required\n"
> + if $WS_TUNNEL_VERSION > $tunnel->{version};
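(Not an issue, just spelling out the version handshake for future readers:
the two checks accept the local side iff

    $tunnel->{version} - $tunnel->{age} <= $WS_TUNNEL_VERSION <= $tunnel->{version}

so with the current constants (local 2, remote version => 2, age => 0) both
checks pass, and a remote that later bumps to version => 3, age => 1 would
still accept a version-2 client.)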
> +
> + print "websocket tunnel started\n";
> + $self->{tunnel} = $tunnel;
> + } else {
> + # test ssh connection
> + my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
> + eval { $self->cmd_quiet($cmd); };
> + die "Can't connect to destination address using public key\n" if $@;
> + }
>
> return $running;
> }
> @@ -242,7 +302,7 @@ sub scan_local_volumes {
> my @sids = PVE::Storage::storage_ids($storecfg);
> foreach my $storeid (@sids) {
> my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
> - next if $scfg->{shared};
> + next if $scfg->{shared} && !$self->{opts}->{remote};
> next if !PVE::Storage::storage_check_enabled($storecfg, $storeid, undef, 1);
>
> # get list from PVE::Storage (for unused volumes)
> @@ -251,19 +311,24 @@ sub scan_local_volumes {
> next if @{$dl->{$storeid}} == 0;
>
> my $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $storeid);
> - # check if storage is available on target node
> - my $target_scfg = PVE::Storage::storage_check_enabled(
> - $storecfg,
> - $targetsid,
> - $self->{node},
> - );
> -
> - die "content type 'images' is not available on storage '$targetsid'\n"
> - if !$target_scfg->{content}->{images};
> + my $bwlimit_sids = [$storeid];
> + if (!$self->{opts}->{remote}) {
> + # check if storage is available on target node
> + my $target_scfg = PVE::Storage::storage_check_enabled(
> + $storecfg,
> + $targetsid,
> + $self->{node},
> + );
> +
> + die "content type 'images' is not available on storage '$targetsid'\n"
> + if !$target_scfg->{content}->{images};
> +
> + push @$bwlimit_sids, $targetsid;
> + }
>
> my $bwlimit = PVE::Storage::get_bandwidth_limit(
> 'migration',
> - [$targetsid, $storeid],
> + $bwlimit_sids,
> $self->{opts}->{bwlimit},
> );
>
> @@ -319,14 +384,17 @@ sub scan_local_volumes {
> my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
>
> my $targetsid = $sid;
> - # NOTE: we currently ignore shared source storages in mappings so skip here too for now
> - if (!$scfg->{shared}) {
> + # NOTE: local ignores shared mappings, remote maps them
> + if (!$scfg->{shared} || $self->{opts}->{remote}) {
> $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
> }
>
> - PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
> + # check target storage on target node if intra-cluster migration
> + if (!$self->{opts}->{remote}) {
> + PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
>
> - return if $scfg->{shared};
> + return if $scfg->{shared};
> + }
>
> $local_volumes->{$volid}->{ref} = $attr->{referenced_in_config} ? 'config' : 'snapshot';
> $local_volumes->{$volid}->{ref} = 'storage' if $attr->{is_unused};
> @@ -415,6 +483,9 @@ sub scan_local_volumes {
>
> my $migratable = $scfg->{type} =~ /^(?:dir|btrfs|zfspool|lvmthin|lvm)$/;
>
> + # TODO: what is this even here for?
> + $migratable = 1 if $self->{opts}->{remote};
> +
> die "can't migrate '$volid' - storage type '$scfg->{type}' not supported\n"
> if !$migratable;
>
> @@ -449,6 +520,10 @@ sub handle_replication {
> my $local_volumes = $self->{local_volumes};
>
> return if !$self->{replication_jobcfg};
> +
> + die "can't migrate VM with replicated volumes to remote cluster/node\n"
> + if $self->{opts}->{remote};
> +
> if ($self->{running}) {
>
> my $version = PVE::QemuServer::kvm_user_version();
> @@ -548,24 +623,51 @@ sub sync_offline_local_volumes {
> $self->log('info', "copying local disk images") if scalar(@volids);
>
> foreach my $volid (@volids) {
> - my $targetsid = $local_volumes->{$volid}->{targetsid};
> - my $bwlimit = $local_volumes->{$volid}->{bwlimit};
> - $bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
> -
> - my $storage_migrate_opts = {
> - 'ratelimit_bps' => $bwlimit,
> - 'insecure' => $opts->{migration_type} eq 'insecure',
> - 'with_snapshots' => $local_volumes->{$volid}->{snapshots},
> - 'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
> - };
> + my $new_volid;
>
> - my $logfunc = sub { $self->log('info', $_[0]); };
> - my $new_volid = eval {
> - PVE::Storage::storage_migrate($storecfg, $volid, $self->{ssh_info},
> - $targetsid, $storage_migrate_opts, $logfunc);
> - };
> - if (my $err = $@) {
> - die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
> + my $opts = $self->{opts};
> + if ($opts->{remote}) {
> + my $log = sub {
> + my $line = shift;
> + $self->log('info', $line);
> + };
> +
> + $new_volid = PVE::StorageTunnel::storage_migrate(
> + $self->{tunnel},
> + $storecfg,
> + $volid,
> + $self->{vmid},
> + $opts->{remote}->{vmid},
> + $local_volumes->{$volid},
> + $log,
> + );
> + } else {
> + my $targetsid = $local_volumes->{$volid}->{targetsid};
> +
> + my $bwlimit = $local_volumes->{$volid}->{bwlimit};
> + $bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
> +
> + my $storage_migrate_opts = {
> + 'ratelimit_bps' => $bwlimit,
> + 'insecure' => $opts->{migration_type} eq 'insecure',
> + 'with_snapshots' => $local_volumes->{$volid}->{snapshots},
> + 'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
> + };
> +
> + my $logfunc = sub { $self->log('info', $_[0]); };
> + $new_volid = eval {
> + PVE::Storage::storage_migrate(
> + $storecfg,
> + $volid,
> + $self->{ssh_info},
> + $targetsid,
> + $storage_migrate_opts,
> + $logfunc,
> + );
> + };
> + if (my $err = $@) {
> + die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
> + }
> }
>
> $self->{volume_map}->{$volid} = $new_volid;
> @@ -581,6 +683,12 @@ sub sync_offline_local_volumes {
> sub cleanup_remotedisks {
> my ($self) = @_;
>
> + if ($self->{opts}->{remote}) {
> + PVE::Tunnel::finish_tunnel($self->{tunnel}, 1);
> + delete $self->{tunnel};
> + return;
> + }
> +
> my $local_volumes = $self->{local_volumes};
>
> foreach my $volid (values %{$self->{volume_map}}) {
> @@ -630,8 +738,100 @@ sub phase1 {
> $self->handle_replication($vmid);
>
> $self->sync_offline_local_volumes();
> + $self->phase1_remote($vmid) if $self->{opts}->{remote};
> };
>
> +sub map_bridges {
> + my ($conf, $map, $scan_only) = @_;
> +
> + my $bridges = {};
> +
> + foreach my $opt (keys %$conf) {
> + next if $opt !~ m/^net\d+$/;
> +
> + next if !$conf->{$opt};
> + my $d = PVE::QemuServer::parse_net($conf->{$opt});
> + next if !$d || !$d->{bridge};
> +
> + my $target_bridge = PVE::QemuServer::map_id($map, $d->{bridge});
> + $bridges->{$target_bridge}->{$opt} = $d->{bridge};
> +
> + next if $scan_only;
> +
> + $d->{bridge} = $target_bridge;
> + $conf->{$opt} = PVE::QemuServer::print_net($d);
> + }
> +
> + return $bridges;
> +}
> +
> +sub phase1_remote {
> + my ($self, $vmid) = @_;
> +
> + my $remote_conf = PVE::QemuConfig->load_config($vmid);
> + PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
> +
> + my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
> + for my $target (keys $bridges->%*) {
> + for my $nic (keys $bridges->{$target}->%*) {
> + $self->log('info', "mapped: $nic from $bridges->{$target}->{$nic} to $target");
> + }
> + }
> +
> + my @online_local_volumes = $self->filter_local_volumes('online');
> +
> + my $storage_map = $self->{opts}->{storagemap};
> + $self->{nbd} = {};
> + PVE::QemuConfig->foreach_volume($remote_conf, sub {
> + my ($ds, $drive) = @_;
> +
> + # TODO eject CDROM?
> + return if PVE::QemuServer::drive_is_cdrom($drive);
> +
> + my $volid = $drive->{file};
> + return if !$volid;
> +
> + return if !grep { $_ eq $volid} @online_local_volumes;
> +
> + my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
> + my $scfg = PVE::Storage::storage_config($self->{storecfg}, $storeid);
> + my $source_format = PVE::QemuServer::qemu_img_format($scfg, $volname);
> +
> + # set by target cluster
> + my $oldvolid = delete $drive->{file};
> + delete $drive->{format};
> +
> + my $targetsid = PVE::QemuServer::map_id($storage_map, $storeid);
> +
> + my $params = {
> + format => $source_format,
> + storage => $targetsid,
> + drive => $drive,
> + };
> +
> + $self->log('info', "Allocating volume for drive '$ds' on remote storage '$targetsid'..");
> + my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'disk', $params);
> +
> + $self->log('info', "volume '$oldvolid' os '$res->{volid}' on the target\n");
s/os/is/
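i.e.:

    $self->log('info', "volume '$oldvolid' is '$res->{volid}' on the target\n");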
> + $remote_conf->{$ds} = $res->{drivestr};
> + $self->{nbd}->{$ds} = $res;
> + });
> +
> + my $conf_str = PVE::QemuServer::write_vm_config("remote", $remote_conf);
> +
> + # TODO expose in PVE::Firewall?
> + my $vm_fw_conf_path = "/etc/pve/firewall/$vmid.fw";
> + my $fw_conf_str;
> + $fw_conf_str = PVE::Tools::file_get_contents($vm_fw_conf_path)
> + if -e $vm_fw_conf_path;
> + my $params = {
> + conf => $conf_str,
> + 'firewall-config' => $fw_conf_str,
> + };
> +
> + PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'config', $params);
> +}
> +
> sub phase1_cleanup {
> my ($self, $vmid, $err) = @_;
>
> @@ -662,7 +862,6 @@ sub phase2_start_local_cluster {
> my $local_volumes = $self->{local_volumes};
> my @online_local_volumes = $self->filter_local_volumes('online');
>
> - $self->{storage_migration} = 1 if scalar(@online_local_volumes);
> my $start = $params->{start_params};
> my $migrate = $params->{migrate_opts};
>
> @@ -793,10 +992,34 @@ sub phase2_start_local_cluster {
> return ($tunnel_info, $spice_port);
> }
>
> +sub phase2_start_remote_cluster {
> + my ($self, $vmid, $params) = @_;
> +
> + die "insecure migration to remote cluster not implemented\n"
> + if $params->{migrate_opts}->{type} ne 'websocket';
> +
> + my $remote_vmid = $self->{opts}->{remote}->{vmid};
> +
> + my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 10, "start", $params);
10 seconds feels a bit short to me. The remote 'start' also has to set up the
NBD servers for all migrated disks before it returns, so a loaded target
could plausibly exceed that.
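Maybe scale the timeout with the amount of work the remote end has to do?
E.g. (just a sketch, the constants are made up):

    # give the remote 'start' command more headroom, scaling with the
    # number of online disks that need an NBD server on the target
    my @online_local_volumes = $self->filter_local_volumes('online');
    my $start_timeout = 60 + 10 * scalar(@online_local_volumes);
    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $start_timeout, "start", $params);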
> +
> + foreach my $drive (keys %{$res->{drives}}) {
> + $self->{stopnbd} = 1;
> + $self->{target_drive}->{$drive}->{drivestr} = $res->{drives}->{$drive}->{drivestr};
> + my $nbd_uri = $res->{drives}->{$drive}->{nbd_uri};
> + die "unexpected NBD uri for '$drive': $nbd_uri\n"
> + if $nbd_uri !~ s!/run/qemu-server/$remote_vmid\_!/run/qemu-server/$vmid\_!;
> +
> + $self->{target_drive}->{$drive}->{nbd_uri} = $nbd_uri;
> + }
> +
> + return ($res->{migrate}, $res->{spice_port});
> +}
> +
> sub phase2 {
> my ($self, $vmid) = @_;
>
> my $conf = $self->{vmconf};
> + my $local_volumes = $self->{local_volumes};
>
> # version > 0 for unix socket support
> my $nbd_protocol_version = 1;
> @@ -828,10 +1051,39 @@ sub phase2 {
> },
> };
>
> - my ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
> + my ($tunnel_info, $spice_port);
> +
> + my @online_local_volumes = $self->filter_local_volumes('online');
> + $self->{storage_migration} = 1 if scalar(@online_local_volumes);
> +
> + if (my $remote = $self->{opts}->{remote}) {
> + my $remote_vmid = $remote->{vmid};
> + $params->{migrate_opts}->{remote_node} = $self->{node};
> + ($tunnel_info, $spice_port) = $self->phase2_start_remote_cluster($vmid, $params);
> + die "only UNIX sockets are supported for remote migration\n"
> + if $tunnel_info->{proto} ne 'unix';
> +
> + my $remote_socket = $tunnel_info->{addr};
> + my $local_socket = $remote_socket;
> + $local_socket =~ s/$remote_vmid/$vmid/g;
> + $tunnel_info->{addr} = $local_socket;
> +
> + $self->log('info', "Setting up tunnel for '$local_socket'");
> + PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
> +
> + foreach my $remote_socket (@{$tunnel_info->{unix_sockets}}) {
> + my $local_socket = $remote_socket;
> + $local_socket =~ s/$remote_vmid/$vmid/g;
> + next if $self->{tunnel}->{forwarded}->{$local_socket};
> + $self->log('info', "Setting up tunnel for '$local_socket'");
> + PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
> + }
> + } else {
> + ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
>
> - $self->log('info', "start remote tunnel");
> - $self->start_remote_tunnel($tunnel_info);
> + $self->log('info', "start remote tunnel");
> + $self->start_remote_tunnel($tunnel_info);
> + }
>
> my $migrate_uri = "$tunnel_info->{proto}:$tunnel_info->{addr}";
> $migrate_uri .= ":$tunnel_info->{port}"
> @@ -841,8 +1093,6 @@ sub phase2 {
> $self->{storage_migration_jobs} = {};
> $self->log('info', "starting storage migration");
>
> - my @online_local_volumes = $self->filter_local_volumes('online');
> -
> die "The number of local disks does not match between the source and the destination.\n"
> if (scalar(keys %{$self->{target_drive}}) != scalar(@online_local_volumes));
> foreach my $drive (keys %{$self->{target_drive}}){
> @@ -915,7 +1165,7 @@ sub phase2 {
> };
> $self->log('info', "migrate-set-parameters error: $@") if $@;
>
> - if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
> + if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && !$self->{opts}->{remote}) {
> my $rpcenv = PVE::RPCEnvironment::get();
> my $authuser = $rpcenv->get_user();
>
> @@ -1112,11 +1362,15 @@ sub phase2_cleanup {
>
> my $nodename = PVE::INotify::nodename();
>
> - my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
> - eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> - if (my $err = $@) {
> - $self->log('err', $err);
> - $self->{errors} = 1;
> + if ($self->{tunnel} && $self->{tunnel}->{version} >= 2) {
> + PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'stop');
> + } else {
> + my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
> + eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> + if (my $err = $@) {
> + $self->log('err', $err);
> + $self->{errors} = 1;
> + }
> }
>
> # cleanup after stopping, otherwise disks might be in-use by target VM!
> @@ -1149,7 +1403,7 @@ sub phase3_cleanup {
>
> my $tunnel = $self->{tunnel};
>
> - if ($self->{volume_map}) {
> + if ($self->{volume_map} && !$self->{opts}->{remote}) {
> my $target_drives = $self->{target_drive};
>
> # FIXME: for NBD storage migration we now only update the volid, and
> @@ -1165,27 +1419,34 @@ sub phase3_cleanup {
> }
>
> # transfer replication state before move config
> - $self->transfer_replication_state() if $self->{is_replicated};
> - PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
> - $self->switch_replication_job_target() if $self->{is_replicated};
> + if (!$self->{opts}->{remote}) {
> + $self->transfer_replication_state() if $self->{is_replicated};
> + PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
> + $self->switch_replication_job_target() if $self->{is_replicated};
> + }
>
> if ($self->{livemigration}) {
> if ($self->{stopnbd}) {
> $self->log('info', "stopping NBD storage migration server on target.");
> # stop nbd server on remote vm - requirement for resume since 2.9
> - my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
> + if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 2) {
> + PVE::Tunnel::write_tunnel($tunnel, 30, 'nbdstop');
> + } else {
> + my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
>
> - eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> - if (my $err = $@) {
> - $self->log('err', $err);
> - $self->{errors} = 1;
> + eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> + if (my $err = $@) {
> + $self->log('err', $err);
> + $self->{errors} = 1;
> + }
> }
> }
>
> # config moved and nbd server stopped - now we can resume vm on target
> if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 1) {
> + my $cmd = $tunnel->{version} == 1 ? "resume $vmid" : "resume";
> eval {
> - PVE::Tunnel::write_tunnel($tunnel, 30, "resume $vmid");
> + PVE::Tunnel::write_tunnel($tunnel, 30, $cmd);
> };
> if (my $err = $@) {
> $self->log('err', $err);
> @@ -1205,18 +1466,24 @@ sub phase3_cleanup {
> }
>
> if ($self->{storage_migration} && PVE::QemuServer::parse_guest_agent($conf)->{fstrim_cloned_disks} && $self->{running}) {
> - my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
> - eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> + if ($self->{opts}->{remote}) {
> + PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'fstrim');
> + } else {
> + my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
> + eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> + }
> }
> }
>
> # close tunnel on successful migration, on error phase2_cleanup closed it
> - if ($tunnel) {
> + if ($tunnel && $tunnel->{version} == 1) {
> eval { PVE::Tunnel::finish_tunnel($tunnel); };
> if (my $err = $@) {
> $self->log('err', $err);
> $self->{errors} = 1;
> }
> + $tunnel = undef;
> + delete $self->{tunnel};
> }
>
> eval {
> @@ -1254,6 +1521,9 @@ sub phase3_cleanup {
>
> # destroy local copies
> foreach my $volid (@not_replicated_volumes) {
> + # remote is cleaned up below
> + next if $self->{opts}->{remote};
> +
> eval { PVE::Storage::vdisk_free($self->{storecfg}, $volid); };
> if (my $err = $@) {
> $self->log('err', "removing local copy of '$volid' failed - $err");
> @@ -1263,8 +1533,19 @@ sub phase3_cleanup {
> }
>
> # clear migrate lock
> - my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
> - $self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
> + if ($tunnel && $tunnel->{version} >= 2) {
> + PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
> +
> + PVE::Tunnel::finish_tunnel($tunnel);
> + } else {
> + my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
> + $self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
> + }
> +
> + if ($self->{opts}->{remote} && $self->{opts}->{delete}) {
> + eval { PVE::QemuServer::destroy_vm($self->{storecfg}, $vmid, 1, undef, 0) };
> + warn "Failed to remove source VM - $@\n" if $@;
> + }
> }
>
> sub final_cleanup {
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 9971f2c..9490615 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -5425,7 +5425,10 @@ sub vm_start_nolock {
> my $defaults = load_defaults();
>
> # set environment variable useful inside network script
> - $ENV{PVE_MIGRATED_FROM} = $migratedfrom if $migratedfrom;
> + # for remote migration the config is available on the target node!
> + if (!$migrate_opts->{remote_node}) {
> + $ENV{PVE_MIGRATED_FROM} = $migratedfrom;
> + }
>
> PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
>
> @@ -5667,7 +5670,7 @@ sub vm_start_nolock {
>
> my $migrate_storage_uri;
> # nbd_protocol_version > 0 for unix socket support
> - if ($nbd_protocol_version > 0 && $migration_type eq 'secure') {
> + if ($nbd_protocol_version > 0 && ($migration_type eq 'secure' || $migration_type eq 'websocket')) {
> my $socket_path = "/run/qemu-server/$vmid\_nbd.migrate";
> mon_cmd($vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $socket_path } } );
> $migrate_storage_uri = "nbd:unix:$socket_path";