public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration
@ 2021-12-22 13:52 Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 1/3] migrate: handle migration_network with " Fabian Grünbichler
                   ` (21 more replies)
  0 siblings, 22 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

this series adds remote migration for VMs.

both live and offline migration, including NBD and storage-migrated disks,
should work. the groundwork for extending this to pve-container and pvesr
is already laid.

new in v3: lots of refactoring and edge-case handling

new in v2: dropped parts already applied, incorporated Fabian's and
Dominik's feedback (thanks!)

new in v1: explicit remote endpoint specified as part of API call instead of
remote.cfg

overview of affected repos and changes; see the individual patches for
more details.

proxmox-websocket-tunnel:

new tunnel helper tool for forwarding commands and data over websocket
connections, required by pve-guest-common on source side

pve-guest-common:

new/refactored generic WS/SSH tunnel fork/read/.. helpers
new storage migration over WS tunnel helpers

pve-storage:

refactor storage_migrate to make code-reuse possible

qemu-server:

some refactoring, new mtunnel endpoints, new remote_migration endpoints
TODO: handle pending changes and snapshots
TODO: proper CLI for remote migration
TODO: handle ^C
TODO: strict parser for FW config as well?
potential TODO: precond endpoint?

as usual, some of the patches are best viewed with '-w', especially in
qemu-server..

required dependencies are noted in the individual patches

pve-guest-common:

Fabian Grünbichler (3):
  migrate: handle migration_network with remote migration
  add tunnel helper module
  add storage tunnel module

 src/Makefile               |   2 +
 debian/control             |   1 +
 src/PVE/AbstractMigrate.pm |  37 ++--
 src/PVE/StorageTunnel.pm   | 231 ++++++++++++++++++++++++
 src/PVE/Tunnel.pm          | 356 +++++++++++++++++++++++++++++++++++++
 5 files changed, 611 insertions(+), 16 deletions(-)
 create mode 100644 src/PVE/StorageTunnel.pm
 create mode 100644 src/PVE/Tunnel.pm

proxmox-websocket-tunnel:

Fabian Grünbichler (4):
  initial commit
  add tunnel implementation
  add fingerprint validation
  add packaging

qemu-server:

Fabian Grünbichler (10):
  refactor map_storage to map_id
  schema: use pve-bridge-id
  parse_config: optional strict mode
  update_vm: allow simultaneous setting of boot-order and dev
  nbd alloc helper: allow passing in explicit format
  migrate: move tunnel-helpers to pve-guest-common
  mtunnel: add API endpoints
  migrate: refactor remote VM/tunnel start
  migrate: add remote migration handling
  api: add remote migrate endpoint

 PVE/API2/Qemu.pm                      | 721 ++++++++++++++++++++++++-
 PVE/QemuMigrate.pm                    | 738 ++++++++++++++++----------
 PVE/QemuServer.pm                     | 100 ++--
 debian/control                        |   2 +
 test/MigrationTest/QemuMigrateMock.pm |  28 +-
 5 files changed, 1240 insertions(+), 349 deletions(-)

pve-storage:

Fabian Grünbichler (4):
  volname_for_storage: parse volname before calling
  storage_migrate: pull out snapshot decision
  storage_migrate: pull out import/export_prepare
  add volume_import/export_start helpers

 PVE/Storage.pm | 205 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 153 insertions(+), 52 deletions(-)





* [pve-devel] [PATCH v3 guest-common 1/3] migrate: handle migration_network with remote migration
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module Fabian Grünbichler
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

remote migration always has an explicit endpoint from the start, which is
used for everything. SSH connection info, migration network and migration
type therefore only need to be determined for intra-cluster migrations.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 src/PVE/AbstractMigrate.pm | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/src/PVE/AbstractMigrate.pm b/src/PVE/AbstractMigrate.pm
index af2be38..a20b213 100644
--- a/src/PVE/AbstractMigrate.pm
+++ b/src/PVE/AbstractMigrate.pm
@@ -115,22 +115,27 @@ sub migrate {
 
     $class = ref($class) || $class;
 
-    my $dc_conf = PVE::Cluster::cfs_read_file('datacenter.cfg');
+    my ($ssh_info, $rem_ssh);
+    if (!$opts->{remote}) {
+	my $dc_conf = PVE::Cluster::cfs_read_file('datacenter.cfg');
 
-    my $migration_network = $opts->{migration_network};
-    if (!defined($migration_network)) {
-	$migration_network = $dc_conf->{migration}->{network};
-    }
-    my $ssh_info = PVE::SSHInfo::get_ssh_info($node, $migration_network);
-    $nodeip = $ssh_info->{ip};
-
-    my $migration_type = 'secure';
-    if (defined($opts->{migration_type})) {
-	$migration_type = $opts->{migration_type};
-    } elsif (defined($dc_conf->{migration}->{type})) {
-        $migration_type = $dc_conf->{migration}->{type};
+	my $migration_network = $opts->{migration_network};
+	if (!defined($migration_network)) {
+	    $migration_network = $dc_conf->{migration}->{network};
+	}
+	$ssh_info = PVE::SSHInfo::get_ssh_info($node, $migration_network);
+	$nodeip = $ssh_info->{ip};
+
+	my $migration_type = 'secure';
+	if (defined($opts->{migration_type})) {
+	    $migration_type = $opts->{migration_type};
+	} elsif (defined($dc_conf->{migration}->{type})) {
+	    $migration_type = $dc_conf->{migration}->{type};
+	}
+	$opts->{migration_type} = $migration_type;
+	$opts->{migration_network} = $migration_network;
+	$rem_ssh = PVE::SSHInfo::ssh_info_to_command($ssh_info);
     }
-    $opts->{migration_type} = $migration_type;
 
     my $self = {
 	delayed_interrupt => 0,
@@ -139,7 +144,7 @@ sub migrate {
 	node => $node,
 	ssh_info => $ssh_info,
 	nodeip => $nodeip,
-	rem_ssh => PVE::SSHInfo::ssh_info_to_command($ssh_info)
+	rem_ssh => $rem_ssh,
     };
 
     $self = bless $self, $class;
@@ -162,7 +167,7 @@ sub migrate {
 	&$eval_int($self, sub { $self->{running} = $self->prepare($self->{vmid}); });
 	die $@ if $@;
 
-	if (defined($migration_network)) {
+	if (defined($self->{opts}->{migration_network})) {
 	    $self->log('info', "use dedicated network address for sending " .
 	               "migration traffic ($self->{nodeip})");
 
-- 
2.30.2






* [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 1/3] migrate: handle migration_network with " Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-03 12:30   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module Fabian Grünbichler
                   ` (19 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

lifted from PVE::QemuMigrate, abstracting away use-case specific data.
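
for illustration, a caller is expected to use these helpers roughly like
this (a minimal sketch with made-up values; the actual use sites follow
in the qemu-server patches of this series):

    my $tunnel = PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
    my $res = PVE::Tunnel::write_tunnel($tunnel, 10, 'bwlimit', { storage => 'local-lvm' });
    PVE::Tunnel::forward_unix_socket($tunnel, '/run/local.sock', '/run/remote.sock');
    PVE::Tunnel::finish_tunnel($tunnel, 1);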

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 src/Makefile      |   1 +
 debian/control    |   1 +
 src/PVE/Tunnel.pm | 356 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 358 insertions(+)
 create mode 100644 src/PVE/Tunnel.pm

diff --git a/src/Makefile b/src/Makefile
index 0298d3f..d82162c 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -12,6 +12,7 @@ install: PVE
 	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
+	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
 	install -d ${PERL5DIR}/PVE/VZDump
 	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
 	install -m 0644 PVE/VZDump/Common.pm ${PERL5DIR}/PVE/VZDump/
diff --git a/debian/control b/debian/control
index 4c246d0..73c28bd 100644
--- a/debian/control
+++ b/debian/control
@@ -16,6 +16,7 @@ Depends: libpve-cluster-perl,
          libpve-common-perl (>= 4.0-89),
          libpve-storage-perl (>= 7.0-14),
          pve-cluster,
+         proxmox-websocket-tunnel,
          ${misc:Depends},
          ${perl:Depends},
 Breaks: libpve-common-perl (<< 4.0-89),
diff --git a/src/PVE/Tunnel.pm b/src/PVE/Tunnel.pm
new file mode 100644
index 0000000..bbd1169
--- /dev/null
+++ b/src/PVE/Tunnel.pm
@@ -0,0 +1,356 @@
+package PVE::Tunnel;
+
+use strict;
+use warnings;
+
+use IO::Pipe;
+use IPC::Open2;
+use JSON qw(encode_json decode_json);
+use POSIX qw( WNOHANG );
+use Storable qw(dclone);
+use URI::Escape;
+
+use PVE::APIClient::LWP;
+use PVE::Tools;
+
+my $finish_command_pipe = sub {
+    my ($cmdpipe, $timeout) = @_;
+
+    my $cpid = $cmdpipe->{pid};
+    return if !defined($cpid);
+
+    my $writer = $cmdpipe->{writer};
+    my $reader = $cmdpipe->{reader};
+
+    $writer->close();
+    $reader->close();
+
+    my $collect_child_process = sub {
+	my $res = waitpid($cpid, WNOHANG);
+	if (defined($res) && ($res == $cpid)) {
+	    delete $cmdpipe->{pid};
+	    return 1;
+	} else {
+	    return 0;
+	}
+     };
+
+    if ($timeout) {
+	for (my $i = 0; $i < $timeout; $i++) {
+	    return if &$collect_child_process();
+	    sleep(1);
+	}
+    }
+
+    $cmdpipe->{log}->("tunnel still running - terminating now with SIGTERM\n");
+    kill(15, $cpid);
+
+    # wait again
+    for (my $i = 0; $i < 10; $i++) {
+	return if &$collect_child_process();
+	sleep(1);
+    }
+
+    $cmdpipe->{log}->("tunnel still running - terminating now with SIGKILL\n");
+    kill 9, $cpid;
+    sleep 1;
+
+    $cmdpipe->{log}->("tunnel child process (PID $cpid) couldn't be collected\n")
+	if !&$collect_child_process();
+};
+
+sub read_tunnel {
+    my ($tunnel, $timeout) = @_;
+
+    $timeout = 60 if !defined($timeout);
+
+    my $reader = $tunnel->{reader};
+
+    my $output;
+    eval {
+	PVE::Tools::run_with_timeout($timeout, sub { $output = <$reader>; });
+    };
+    die "reading from tunnel failed: $@\n" if $@;
+
+    chomp $output if defined($output);
+
+    return $output;
+}
+
+sub write_tunnel {
+    my ($tunnel, $timeout, $command, $params) = @_;
+
+    $timeout = 60 if !defined($timeout);
+
+    my $writer = $tunnel->{writer};
+
+    if ($tunnel->{version} && $tunnel->{version} >= 2) {
+	my $object = defined($params) ? dclone($params) : {};
+	$object->{cmd} = $command;
+
+	$command = eval { JSON::encode_json($object) };
+
+	die "failed to encode command as JSON - $@\n"
+	    if $@;
+    }
+
+    eval {
+	PVE::Tools::run_with_timeout($timeout, sub {
+	    print $writer "$command\n";
+	    $writer->flush();
+	});
+    };
+    die "writing to tunnel failed: $@\n" if $@;
+
+    if ($tunnel->{version} && $tunnel->{version} >= 1) {
+	my $res = eval { read_tunnel($tunnel, $timeout); };
+	die "no reply to command '$command': $@\n" if $@;
+
+	if ($tunnel->{version} == 1) {
+	    if ($res eq 'OK') {
+		return;
+	    } else {
+		die "tunnel replied '$res' to command '$command'\n";
+	    }
+	} else {
+	    my $parsed = eval { JSON::decode_json($res) };
+	    die "failed to decode tunnel reply '$res' (command '$command') - $@\n"
+		if $@;
+
+	    if (!$parsed->{success}) {
+		if (defined($parsed->{msg})) {
+		    die "error - tunnel command '$command' failed - $parsed->{msg}\n";
+		} else {
+		    die "error - tunnel command '$command' failed\n";
+		}
+	    }
+
+	    return $parsed;
+	}
+    }
+}
+
+sub fork_ssh_tunnel {
+    my ($rem_ssh, $cmd, $ssh_forward_info, $log) = @_;
+
+    my @localtunnelinfo = ();
+    foreach my $addr (@$ssh_forward_info) {
+	push @localtunnelinfo, '-L', $addr;
+    }
+
+    my $full_cmd = [@$rem_ssh, '-o ExitOnForwardFailure=yes', @localtunnelinfo, @$cmd];
+
+    my $reader = IO::File->new();
+    my $writer = IO::File->new();
+
+    my $orig_pid = $$;
+
+    my $cpid;
+
+    eval { $cpid = open2($reader, $writer, @$full_cmd); };
+
+    my $err = $@;
+
+    # catch exec errors
+    if ($orig_pid != $$) {
+	$log->("can't fork command pipe, aborting\n");
+	POSIX::_exit(1);
+	kill('KILL', $$);
+    }
+
+    die $err if $err;
+
+    my $tunnel = {
+	writer => $writer,
+	reader => $reader,
+	pid => $cpid,
+	rem_ssh => $rem_ssh,
+	log => $log,
+    };
+
+    eval {
+	my $helo = read_tunnel($tunnel, 60);
+	die "no reply\n" if !$helo;
+	die "no quorum on target node\n" if $helo =~ m/^no quorum$/;
+	die "got strange reply from tunnel ('$helo')\n"
+	    if $helo !~ m/^tunnel online$/;
+    };
+    $err = $@;
+
+    eval {
+	my $ver = read_tunnel($tunnel, 10);
+	if ($ver =~ /^ver (\d+)$/) {
+	    $tunnel->{version} = $1;
+	    $log->("ssh tunnel $ver\n");
+	} else {
+	    $err = "received invalid tunnel version string '$ver'\n" if !$err;
+	}
+    };
+
+    if ($err) {
+	$finish_command_pipe->($tunnel);
+	die "can't open tunnel - $err";
+    }
+    return $tunnel;
+}
+
+sub forward_unix_socket {
+    my ($tunnel, $local, $remote) = @_;
+
+    my $params = dclone($tunnel->{params});
+    $params->{unix} = $local;
+    $params->{url} = $params->{url} ."socket=".uri_escape($remote)."&";
+    $params->{ticket} = { path => $remote };
+
+    my $cmd = encode_json({
+	control => JSON::true,
+	cmd => 'forward',
+	data => $params,
+    });
+
+    my $writer = $tunnel->{writer};
+    $tunnel->{forwarded}->{$local} = $remote;
+    eval {
+	unlink $local;
+	PVE::Tools::run_with_timeout(15, sub {
+	    print $writer "$cmd\n";
+	    $writer->flush();
+	});
+    };
+    die "failed to write forwarding command - $@\n" if $@;
+
+    read_tunnel($tunnel);
+}
+
+sub fork_websocket_tunnel {
+    my ($conn, $url, $req_params, $tunnel_params, $log) = @_;
+
+    if (my $apitoken = $conn->{apitoken}) {
+	$tunnel_params->{headers} = [["Authorization", "$apitoken"]];
+    } else {
+	die "can't connect to remote host without credentials\n";
+    }
+
+    if (my $fps = $conn->{cached_fingerprints}) {
+	$tunnel_params->{fingerprint} = (keys %$fps)[0];
+    }
+
+    my $api_client = PVE::APIClient::LWP->new(%$conn);
+
+    my $res = $api_client->post(
+	$url,
+	$req_params,
+    );
+
+    $log->("remote: started migration tunnel worker '$res->{upid}'");
+
+    my $websocket_url = $tunnel_params->{url};
+
+    $tunnel_params->{url} .= "?ticket=".uri_escape($res->{ticket});
+    $tunnel_params->{url} .= "&socket=".uri_escape($res->{socket});
+
+    my $reader = IO::Pipe->new();
+    my $writer = IO::Pipe->new();
+
+    my $cpid = fork();
+    if ($cpid) {
+	$writer->writer();
+	$reader->reader();
+	my $tunnel = { writer => $writer, reader => $reader, pid => $cpid };
+
+	eval {
+	    my $writer = $tunnel->{writer};
+	    my $cmd = encode_json({
+		control => JSON::true,
+		cmd => 'connect',
+		data => $tunnel_params,
+	    });
+
+	    eval {
+		PVE::Tools::run_with_timeout(15, sub {
+		    print {$writer} "$cmd\n";
+		    $writer->flush();
+		});
+	    };
+	    die "failed to write tunnel connect command - $@\n" if $@;
+	};
+	die "failed to connect via WS: $@\n" if $@;
+
+	my $err;
+        eval {
+	    my $writer = $tunnel->{writer};
+	    my $cmd = encode_json({
+		cmd => 'version',
+	    });
+
+	    eval {
+		PVE::Tools::run_with_timeout(15, sub {
+		    print {$writer} "$cmd\n";
+		    $writer->flush();
+		});
+	    };
+	    $err = "failed to write tunnel version command - $@\n" if $@;
+	    my $res = read_tunnel($tunnel, 10);
+	    $res = JSON::decode_json($res);
+	    my $version = $res->{api};
+
+	    if ($version =~ /^(\d+)$/) {
+		$tunnel->{version} = $1;
+		$tunnel->{age} = $res->{age};
+	    } else {
+		$err = "received invalid tunnel version string '$version'\n" if !$err;
+	    }
+	};
+	$err = $@ if !$err;
+
+	if ($err) {
+	    $finish_command_pipe->($tunnel);
+	    die "can't open migration tunnel - $err";
+	}
+
+	$tunnel_params->{url} = "$websocket_url?"; # reset ticket and socket
+
+	$tunnel->{params} = $tunnel_params; # for forwarding
+	$tunnel->{log} = $log;
+
+	return $tunnel;
+    } else {
+	eval {
+	    $writer->reader();
+	    $reader->writer();
+	    PVE::Tools::run_command(
+		['proxmox-websocket-tunnel'],
+		input => "<&".fileno($writer),
+		output => ">&".fileno($reader),
+		errfunc => sub { my $line = shift; print "tunnel: $line\n"; },
+	    );
+	};
+	warn "CMD websocket tunnel died: $@\n" if $@;
+	exit 0;
+    }
+}
+
+sub finish_tunnel {
+    my ($tunnel, $cleanup) = @_;
+
+    $cleanup = $cleanup ? 1 : 0;
+
+    eval { write_tunnel($tunnel, 30, 'quit', { cleanup => $cleanup }); };
+    my $err = $@;
+
+    $finish_command_pipe->($tunnel, 30);
+
+    if (my $unix_sockets = $tunnel->{unix_sockets}) {
+	# ssh does not clean up on local host
+	my $cmd = ['rm', '-f', @$unix_sockets];
+	PVE::Tools::run_command($cmd);
+
+	# .. and just to be sure check on remote side
+	if ($tunnel->{rem_ssh}) {
+	    unshift @{$cmd}, @{$tunnel->{rem_ssh}};
+	    PVE::Tools::run_command($cmd);
+	}
+    }
+
+    die $err if $err;
+}
-- 
2.30.2






* [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 1/3] migrate: handle migration_network with " Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-03 14:30   ` Fabian Ebner
  2022-01-05 10:50   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 1/4] initial commit Fabian Grünbichler
                   ` (18 subsequent siblings)
  21 siblings, 2 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

encapsulating storage-related tunnel methods, currently
- source-side storage-migrate helper
- target-side disk-import handler
- target-side query-disk-import handler

to be extended further with replication-related handlers and helpers.
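
the source side drives the whole exchange via storage_migrate; a minimal
sketch with made-up storage names and VMIDs ($tunnel and $storecfg as set
up by the caller):

    my $new_volid = PVE::StorageTunnel::storage_migrate(
	$tunnel,
	$storecfg,
	'local-lvm:vm-100-disk-0',
	100, # local VMID
	100, # remote VMID
	{ targetsid => 'other-lvm', bwlimit => 10240, snapshots => 0 },
	sub { print shift },
    );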

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    new in v3, includes code previously in qemu-server

 src/Makefile             |   1 +
 src/PVE/StorageTunnel.pm | 231 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 232 insertions(+)
 create mode 100644 src/PVE/StorageTunnel.pm

diff --git a/src/Makefile b/src/Makefile
index d82162c..baa2688 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -12,6 +12,7 @@ install: PVE
 	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
+	install -m 0644 PVE/StorageTunnel.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
 	install -d ${PERL5DIR}/PVE/VZDump
 	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
diff --git a/src/PVE/StorageTunnel.pm b/src/PVE/StorageTunnel.pm
new file mode 100644
index 0000000..06902ef
--- /dev/null
+++ b/src/PVE/StorageTunnel.pm
@@ -0,0 +1,231 @@
+package PVE::StorageTunnel;
+
+use strict;
+use warnings;
+
+use IO::Socket::UNIX;
+use POSIX qw(WNOHANG);
+use Socket qw(SOCK_STREAM);
+
+use PVE::Tools;
+use PVE::Tunnel;
+use PVE::Storage;
+
+sub storage_migrate {
+    my ($tunnel, $storecfg, $volid, $local_vmid, $remote_vmid, $opts, $log) = @_;
+
+    my $bwlimit = $opts->{bwlimit};
+    my $targetsid = $opts->{targetsid};
+
+    # use 'migrate' limit for transfer to other node
+    my $bwlimit_opts = {
+	storage => $targetsid,
+	bwlimit => $bwlimit,
+    };
+    my $remote_bwlimit = PVE::Tunnel::write_tunnel($tunnel, 10, 'bwlimit', $bwlimit_opts);
+    $remote_bwlimit = $remote_bwlimit->{bwlimit};
+    if (defined($remote_bwlimit)) {
+	$bwlimit = $remote_bwlimit if !defined($bwlimit);
+	$bwlimit = $remote_bwlimit if $remote_bwlimit < $bwlimit;
+    }
+
+    # JSONSchema and get_bandwidth_limit use KiB/s - storage_migrate expects bytes/s
+    $bwlimit = $bwlimit * 1024 if defined($bwlimit);
+
+    # adapt volume name for import call
+    my ($sid, undef) = PVE::Storage::parse_volume_id($volid);
+    my (undef, $name, undef, undef, undef, undef, $format) = PVE::Storage::parse_volname($storecfg, $volid);
+    my $scfg = PVE::Storage::storage_config($storecfg, $sid);
+    PVE::Storage::activate_volumes($storecfg, [$volid]);
+    if ($local_vmid != $remote_vmid) {
+	$name =~ s/-$local_vmid-/-$remote_vmid-/g;
+	$name =~ s/^$local_vmid\///; # re-added on target if dir-based storage
+    }
+
+    my $with_snapshots = $opts->{snapshots} ? 1 : 0;
+    my $snapshot;
+    my $migration_snapshot = PVE::Storage::storage_migrate_snapshot($storecfg, $sid);
+    if ($migration_snapshot) {
+	$snapshot = '__migration__';
+	$with_snapshots = 1;
+    }
+
+    my @export_formats = PVE::Storage::volume_export_formats($storecfg, $volid, undef, undef, $with_snapshots);
+
+    my $disk_import_opts = {
+	format => $format,
+	storage => $targetsid,
+	snapshot => $snapshot,
+	migration_snapshot => $migration_snapshot,
+	'with-snapshots' => $with_snapshots,
+	'allow-rename' => !$opts->{is_vmstate},
+	'export-formats' => join(",", @export_formats),
+	volname => $name,
+    };
+    my $res = PVE::Tunnel::write_tunnel($tunnel, 600, 'disk-import', $disk_import_opts);
+    my $local = "/run/qemu-server/$local_vmid.storage";
+    if (!$tunnel->{forwarded}->{$local}) {
+	PVE::Tunnel::forward_unix_socket($tunnel, $local, $res->{socket});
+    }
+    my $socket = IO::Socket::UNIX->new(Peer => $local, Type => SOCK_STREAM())
+	or die "failed to connect to websocket tunnel at $local\n";
+    # we won't be reading from the socket
+    shutdown($socket, 0);
+
+    my $disk_export_opts = {
+	snapshot => $snapshot,
+	migration_snapshot => $migration_snapshot,
+	'with-snapshots' => $with_snapshots,
+	ratelimit_bps => $bwlimit,
+	cmd => {
+	    output => '>&'.fileno($socket),
+	},
+    };
+
+    eval {
+	PVE::Storage::volume_export_start(
+	    $storecfg,
+	    $volid,
+	    $res->{format},
+	    $disk_export_opts,
+	);
+    };
+    my $send_error = $@;
+    warn "$send_error\n" if $send_error;
+
+    # don't close the connection entirely otherwise the
+    # receiving end might not get all buffered data (and
+    # fails with 'connection reset by peer')
+    shutdown($socket, 1);
+
+    # wait for the remote process to finish
+    my $new_volid;
+    while ($res = PVE::Tunnel::write_tunnel($tunnel, 10, 'query-disk-import')) {
+	if ($res->{status} eq 'pending') {
+	    if (my $msg = $res->{msg}) {
+		$log->("disk-import: $msg\n");
+	    } else {
+		$log->("waiting for disk import to finish..\n");
+	    }
+	    sleep(1);
+	} elsif ($res->{status} eq 'complete') {
+	    $new_volid = $res->{volid};
+	    last;
+	} else {
+	    warn "unknown query-disk-import result: $res->{status}\n";
+	    last;
+	}
+    }
+
+    # now close the socket
+    close($socket);
+    if ($snapshot) {
+	eval { PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snapshot, 0) };
+	warn "could not remove source snapshot: $@\n" if $@;
+    }
+    die $send_error if $send_error;
+    die "disk import failed - see log above\n" if !$new_volid;
+
+    return $new_volid;
+}
+
+our $cmd_schema = {
+    'disk-import' => {
+	volname => {
+	    type => 'string',
+	    description => 'volume name to use as preferred target volume name',
+	},
+	format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
+	'export-formats' => {
+	    type => 'string',
+	    description => 'list of supported export formats',
+	},
+	storage => {
+	    type => 'string',
+	    format => 'pve-storage-id',
+	},
+	'with-snapshots' => {
+	    description =>
+	        "Whether the stream includes intermediate snapshots",
+	    type => 'boolean',
+	    optional => 1,
+	    default => 0,
+	},
+	'allow-rename' => {
+	    description => "Choose a new volume ID if the requested " .
+		"volume ID already exists, instead of throwing an error.",
+	    type => 'boolean',
+	    optional => 1,
+	    default => 0,
+	},
+    },
+};
+
+sub handle_disk_import {
+    my ($state, $params) = @_;
+
+    die "disk import already running as PID '$state->{disk_import}->{pid}'\n"
+	if $state->{disk_import}->{pid};
+
+    my $storage = delete $params->{storage};
+    my $format = delete $params->{format};
+    my $volname = delete $params->{volname};
+
+    my $import = PVE::Storage::volume_import_start($state->{storecfg}, $storage, $volname, $format, $state->{vmid}, $params);
+
+    my $socket = $import->{socket};
+    $format = delete $import->{format};
+
+    $state->{sockets}->{$socket} = 1;
+    $state->{disk_import} = $import;
+
+    chown $state->{socket_uid}, -1, $socket;
+
+    return {
+	socket => $socket,
+	format => $format,
+    };
+}
+
+sub handle_query_disk_import {
+    my ($state, $params) = @_;
+
+    die "no disk import running\n"
+	if !$state->{disk_import}->{pid};
+
+    my $pattern = PVE::Storage::volume_imported_message(undef, 1);
+    my $result;
+    eval {
+	my $fh = $state->{disk_import}->{fh};
+	PVE::Tools::run_with_timeout(5, sub { $result = <$fh>; });
+	print "disk-import: $result\n" if $result;
+    };
+    if ($result && $result =~ $pattern) {
+	my $volid = $1;
+	waitpid($state->{disk_import}->{pid}, 0);
+
+	my $unix = $state->{disk_import}->{socket};
+	unlink $unix;
+	delete $state->{sockets}->{$unix};
+	delete $state->{disk_import};
+	$state->{cleanup}->{volumes}->{$volid} = 1;
+	return {
+	    status => "complete",
+	    volid => $volid,
+	};
+    } elsif (!$result && waitpid($state->{disk_import}->{pid}, WNOHANG)) {
+	my $unix = $state->{disk_import}->{socket};
+	unlink $unix;
+	delete $state->{sockets}->{$unix};
+	delete $state->{disk_import};
+
+	return {
+	    status => "error",
+	};
+    } else {
+	return {
+	    status => "pending",
+	    msg => $result,
+	};
+    }
+}
-- 
2.30.2






* [pve-devel] [PATCH v3 proxmox-websocket-tunnel 1/4] initial commit
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (2 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 2/4] add tunnel implementation Fabian Grünbichler
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 .gitignore    |  1 +
 .cargo/config |  5 +++++
 Cargo.toml    | 11 +++++++++++
 3 files changed, 17 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 .cargo/config
 create mode 100644 Cargo.toml

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..ea8c4bf
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+/target
diff --git a/.cargo/config b/.cargo/config
new file mode 100644
index 0000000..3b5b6e4
--- /dev/null
+++ b/.cargo/config
@@ -0,0 +1,5 @@
+[source]
+[source.debian-packages]
+directory = "/usr/share/cargo/registry"
+[source.crates-io]
+replace-with = "debian-packages"
diff --git a/Cargo.toml b/Cargo.toml
new file mode 100644
index 0000000..939184c
--- /dev/null
+++ b/Cargo.toml
@@ -0,0 +1,11 @@
+[package]
+name = "proxmox-websocket-tunnel"
+version = "0.1.0"
+authors = ["Fabian Grünbichler <f.gruenbichler@proxmox.com>"]
+edition = "2018"
+license = "AGPL-3"
+description = "Proxmox websocket tunneling helper"
+
+exclude = ["debian"]
+
+[dependencies]
-- 
2.30.2






* [pve-devel] [PATCH v3 proxmox-websocket-tunnel 2/4] add tunnel implementation
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (3 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 1/4] initial commit Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation Fabian Grünbichler
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

the websocket tunnel helper accepts control commands (encoded as
single-line JSON) on stdin, and prints responses on stdout.

the following commands are available:
- "connect" a 'control' tunnel via a websocket
- "forward" a local unix socket to a remote socket via a websocket
-- if requested, this will ask for a ticket via the control tunnel after
accepting a new connection on the unix socket
- "close" the control tunnel and any forwarded socket

any other JSON input (without the 'control' flag set) is forwarded as-is
to the remote end of the control tunnel (see the example exchange below).

internally, the tunnel helper will spawn tokio tasks for
- handling the control tunnel connection (new commands are passed in via
an mpsc channel together with a oneshot channel for the response)
- handling accepting new connections on each forwarded unix socket
- handling forwarding data over accepted forwarded connections
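
a hypothetical stdin/stdout session might look like this (all values are
made up; the Cmd structs below define the actual fields):

    {"control":true,"cmd":"connect","data":{"url":"wss://host:8006/...","headers":[["Authorization","PVEAPIToken=..."]]}}
    {"cmd":"version"}
    {"success":true,"api":2,"age":0}
    {"control":true,"cmd":"forward","data":{"url":"wss://host:8006/...","unix":"/run/example.sock","ticket":{"path":"/run/remote.sock"}}}
    {"success":true}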

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v3:
    - rebased, use proxmox-sys instead of proxmox for linux::random_data
    
    v2:
    - dropped CloseCmd and related code
    - moved call to get_ticket into forward handler task
    - bubble up errors instead of unwrap()

 Cargo.toml  |  15 ++
 src/main.rs | 396 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 411 insertions(+)
 create mode 100644 src/main.rs

diff --git a/Cargo.toml b/Cargo.toml
index 939184c..7f24602 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -9,3 +9,18 @@ description = "Proxmox websocket tunneling helper"
 exclude = ["debian"]
 
 [dependencies]
+anyhow = "1.0"
+base64 = "0.13"
+futures = "0.3"
+futures-util = "0.3"
+hex = "0.4"
+hyper = { version = "0.14" }
+openssl = "0.10"
+percent-encoding = "2"
+proxmox-http = { version = "0.6", path = "../proxmox/proxmox-http", features = ["websocket", "client"] }
+proxmox-sys = { version = "0.2.2", path = "../proxmox/proxmox-sys" }
+serde = { version = "1.0", features = ["derive"] }
+serde_json = "1.0"
+tokio = { version = "1", features = ["io-std", "io-util", "macros", "rt-multi-thread", "sync"] }
+tokio-stream = { version = "0.1", features = ["io-util"] }
+tokio-util = "0.6"
diff --git a/src/main.rs b/src/main.rs
new file mode 100644
index 0000000..582214c
--- /dev/null
+++ b/src/main.rs
@@ -0,0 +1,396 @@
+use anyhow::{bail, format_err, Error};
+
+use std::collections::VecDeque;
+use std::sync::Arc;
+
+use futures::future::FutureExt;
+use futures::select;
+
+use hyper::client::{Client, HttpConnector};
+use hyper::header::{SEC_WEBSOCKET_KEY, SEC_WEBSOCKET_VERSION, UPGRADE};
+use hyper::upgrade::Upgraded;
+use hyper::{Body, Request, StatusCode};
+
+use openssl::ssl::{SslConnector, SslMethod};
+use percent_encoding::{utf8_percent_encode, NON_ALPHANUMERIC};
+
+use serde::{Deserialize, Serialize};
+use serde_json::{Map, Value};
+use tokio::io::{AsyncBufReadExt, AsyncWriteExt, BufReader};
+use tokio::net::{UnixListener, UnixStream};
+use tokio::sync::{mpsc, oneshot};
+use tokio_stream::wrappers::LinesStream;
+use tokio_stream::StreamExt;
+
+use proxmox_http::client::HttpsConnector;
+use proxmox_http::websocket::{OpCode, WebSocket, WebSocketReader, WebSocketWriter};
+
+#[derive(Serialize, Deserialize, Debug)]
+#[serde(rename_all = "kebab-case")]
+enum CmdType {
+    Connect,
+    Forward,
+    NonControl,
+}
+
+type CmdData = Map<String, Value>;
+
+#[derive(Serialize, Deserialize, Debug)]
+#[serde(rename_all = "kebab-case")]
+struct ConnectCmdData {
+    // target URL for WS connection
+    url: String,
+    // fingerprint of TLS certificate
+    fingerprint: Option<String>,
+    // additional headers such as authorization
+    headers: Option<Vec<(String, String)>>,
+}
+
+#[derive(Serialize, Deserialize, Debug, Clone)]
+#[serde(rename_all = "kebab-case")]
+struct ForwardCmdData {
+    // target URL for WS connection
+    url: String,
+    // additional headers such as authorization
+    headers: Option<Vec<(String, String)>>,
+    // fingerprint of TLS certificate
+    fingerprint: Option<String>,
+    // local UNIX socket path for forwarding
+    unix: String,
+    // request ticket using these parameters
+    ticket: Option<Map<String, Value>>,
+}
+
+struct CtrlTunnel {
+    sender: Option<mpsc::UnboundedSender<(Value, oneshot::Sender<String>)>>,
+}
+
+impl CtrlTunnel {
+    async fn read_cmd_loop(mut self) -> Result<(), Error> {
+        let mut stdin_stream = LinesStream::new(BufReader::new(tokio::io::stdin()).lines());
+        while let Some(res) = stdin_stream.next().await {
+            match res {
+                Ok(line) => {
+                    let (cmd_type, data) = Self::parse_cmd(&line)?;
+                    match cmd_type {
+                        CmdType::Connect => self.handle_connect_cmd(data).await,
+                        CmdType::Forward => {
+                            let res = self.handle_forward_cmd(data).await;
+                            match &res {
+                                Ok(()) => println!("{}", serde_json::json!({"success": true})),
+                                Err(msg) => println!(
+                                    "{}",
+                                    serde_json::json!({"success": false, "msg": msg.to_string()})
+                                ),
+                            };
+                            res
+                        }
+                        CmdType::NonControl => self
+                            .handle_tunnel_cmd(data)
+                            .await
+                            .map(|res| println!("{}", res)),
+                    }
+                }
+                Err(err) => bail!("Failed to read from STDIN - {}", err),
+            }?;
+        }
+
+        Ok(())
+    }
+
+    fn parse_cmd(line: &str) -> Result<(CmdType, CmdData), Error> {
+        let mut json: Map<String, Value> = serde_json::from_str(line)?;
+        match json.remove("control") {
+            Some(Value::Bool(true)) => {
+                match json.remove("cmd").map(serde_json::from_value::<CmdType>) {
+                    None => bail!("input has 'control' flag, but no control 'cmd' set.."),
+                    Some(Err(e)) => bail!("failed to parse control cmd - {}", e),
+                    Some(Ok(cmd_type)) => Ok((cmd_type, json)),
+                }
+            }
+            _ => Ok((CmdType::NonControl, json)),
+        }
+    }
+
+    async fn websocket_connect(
+        url: String,
+        extra_headers: Vec<(String, String)>,
+        fingerprint: Option<String>,
+    ) -> Result<Upgraded, Error> {
+        let ws_key = proxmox_sys::linux::random_data(16)?;
+        let ws_key = base64::encode(&ws_key);
+        let mut req = Request::builder()
+            .uri(url)
+            .header(UPGRADE, "websocket")
+            .header(SEC_WEBSOCKET_VERSION, "13")
+            .header(SEC_WEBSOCKET_KEY, ws_key)
+            .body(Body::empty())?;
+
+        let headers = req.headers_mut();
+        for (name, value) in extra_headers {
+            let name = hyper::header::HeaderName::from_bytes(name.as_bytes())?;
+            let value = hyper::header::HeaderValue::from_str(&value)?;
+            headers.insert(name, value);
+        }
+
+        let mut ssl_connector_builder = SslConnector::builder(SslMethod::tls())?;
+        if fingerprint.is_some() {
+            // FIXME actually verify fingerprint via callback!
+            ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::NONE);
+        } else {
+            ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::PEER);
+        }
+
+        let mut httpc = HttpConnector::new();
+        httpc.enforce_http(false); // we want https...
+        httpc.set_connect_timeout(Some(std::time::Duration::new(10, 0)));
+        let https = HttpsConnector::with_connector(httpc, ssl_connector_builder.build(), 120);
+
+        let client = Client::builder().build::<_, Body>(https);
+        let res = client.request(req).await?;
+        if res.status() != StatusCode::SWITCHING_PROTOCOLS {
+            bail!("server didn't upgrade: {}", res.status());
+        }
+
+        hyper::upgrade::on(res)
+            .await
+            .map_err(|err| format_err!("failed to upgrade - {}", err))
+    }
+
+    async fn handle_connect_cmd(&mut self, mut data: CmdData) -> Result<(), Error> {
+        let mut data: ConnectCmdData = data
+            .remove("data")
+            .ok_or_else(|| format_err!("'connect' command missing 'data'"))
+            .map(serde_json::from_value)??;
+
+        if self.sender.is_some() {
+            bail!("already connected!");
+        }
+
+        let upgraded = Self::websocket_connect(
+            data.url.clone(),
+            data.headers.take().unwrap_or_else(Vec::new),
+            data.fingerprint.take(),
+        )
+        .await?;
+
+        let (tx, rx) = mpsc::unbounded_channel();
+        self.sender = Some(tx);
+        tokio::spawn(async move {
+            if let Err(err) = Self::handle_ctrl_tunnel(upgraded, rx).await {
+                eprintln!("Tunnel to {} failed - {}", data.url, err);
+            }
+        });
+
+        Ok(())
+    }
+
+    async fn handle_forward_cmd(&mut self, mut data: CmdData) -> Result<(), Error> {
+        let data: ForwardCmdData = data
+            .remove("data")
+            .ok_or_else(|| format_err!("'forward' command missing 'data'"))
+            .map(serde_json::from_value)??;
+
+        if self.sender.is_none() && data.ticket.is_some() {
+            bail!("dynamically requesting ticket requires cmd tunnel connection!");
+        }
+
+        let unix_listener = UnixListener::bind(data.unix.clone())?;
+        let data = Arc::new(data);
+        let cmd_sender = self.sender.clone();
+
+        tokio::spawn(async move {
+            let mut tasks: Vec<tokio::task::JoinHandle<()>> = Vec::new();
+            let data2 = data.clone();
+
+            loop {
+                let data3 = data2.clone();
+
+                match unix_listener.accept().await {
+                    Ok((unix_stream, _)) => {
+                        eprintln!("accepted new connection on '{}'", data3.unix);
+                        let cmd_sender2 = cmd_sender.clone();
+
+                        let task = tokio::spawn(async move {
+                            if let Err(err) = Self::handle_forward_tunnel(
+                                cmd_sender2.clone(),
+                                data3.clone(),
+                                unix_stream,
+                            )
+                            .await
+                            {
+                                eprintln!("Tunnel for {} failed - {}", data3.unix, err);
+                            }
+                        });
+                        tasks.push(task);
+                    }
+                    Err(err) => eprintln!(
+                        "Failed to accept unix connection on {} - {}",
+                        data3.unix, err
+                    ),
+                };
+            }
+        });
+
+        Ok(())
+    }
+
+    async fn handle_tunnel_cmd(&mut self, data: CmdData) -> Result<String, Error> {
+        match &mut self.sender {
+            None => bail!("not connected!"),
+            Some(sender) => {
+                let data: Value = data.into();
+                let (tx, rx) = oneshot::channel::<String>();
+                if let Some(cmd) = data.get("cmd") {
+                    eprintln!("-> sending command {} to remote", cmd);
+                } else {
+                    eprintln!("-> sending data line to remote");
+                }
+                sender.send((data, tx))?;
+                let res = rx.await?;
+                eprintln!("<- got reply");
+                Ok(res)
+            }
+        }
+    }
+
+    async fn handle_ctrl_tunnel(
+        websocket: Upgraded,
+        mut cmd_receiver: mpsc::UnboundedReceiver<(Value, oneshot::Sender<String>)>,
+    ) -> Result<(), Error> {
+        let (tunnel_reader, tunnel_writer) = tokio::io::split(websocket);
+        let (ws_close_tx, mut ws_close_rx) = mpsc::unbounded_channel();
+        let ws_reader = WebSocketReader::new(tunnel_reader, ws_close_tx);
+        let mut ws_writer = WebSocketWriter::new(Some([0, 0, 0, 0]), tunnel_writer);
+
+        let mut framed_reader =
+            tokio_util::codec::FramedRead::new(ws_reader, tokio_util::codec::LinesCodec::new());
+
+        let mut resp_tx_queue: VecDeque<oneshot::Sender<String>> = VecDeque::new();
+        let mut shutting_down = false;
+
+        loop {
+            let mut close_future = ws_close_rx.recv().boxed().fuse();
+            let mut frame_future = framed_reader.next().boxed().fuse();
+            let mut cmd_future = cmd_receiver.recv().boxed().fuse();
+
+            select! {
+                res = close_future => {
+                    let res = res.ok_or_else(|| format_err!("WS control channel closed"))?;
+                    eprintln!("WS: received control message: '{:?}'", res);
+                    shutting_down = true;
+                },
+                res = frame_future => {
+                    match res {
+                        None if shutting_down => {
+                            eprintln!("WS closed");
+                            break;
+                        },
+                        None => bail!("WS closed unexpectedly"),
+                        Some(Ok(res)) => {
+                            resp_tx_queue
+                                .pop_front()
+                                .ok_or_else(|| format_err!("no response handler"))?
+                                .send(res)
+                                .map_err(|msg| format_err!("failed to send tunnel response '{}' back to requester - receiver already closed?", msg))?;
+                        },
+                        Some(Err(err)) => {
+                            bail!("reading from control tunnel failed - WS receive failed: {}", err);
+                        },
+                    }
+                },
+                res = cmd_future => {
+                    if shutting_down { continue };
+                    match res {
+                        None => {
+                            eprintln!("CMD channel closed, shutting down");
+                            ws_writer.send_control_frame(Some([1,2,3,4]), OpCode::Close, &[]).await?;
+                            shutting_down = true;
+                        },
+                        Some((msg, resp_tx)) => {
+                            resp_tx_queue.push_back(resp_tx);
+
+                            let line = format!("{}\n", msg);
+                            ws_writer.write_all(line.as_bytes()).await?;
+                            ws_writer.flush().await?;
+                        },
+                    }
+                },
+            };
+        }
+
+        Ok(())
+    }
+
+    async fn handle_forward_tunnel(
+        cmd_sender: Option<mpsc::UnboundedSender<(Value, oneshot::Sender<String>)>>,
+        data: Arc<ForwardCmdData>,
+        unix: UnixStream,
+    ) -> Result<(), Error> {
+        let data = match (&cmd_sender, &data.ticket) {
+            (Some(cmd_sender), Some(_)) => Self::get_ticket(cmd_sender, data.clone()).await,
+            _ => Ok(data.clone()),
+        }?;
+
+        let upgraded = Self::websocket_connect(
+            data.url.clone(),
+            data.headers.clone().unwrap_or_else(Vec::new),
+            data.fingerprint.clone(),
+        )
+        .await?;
+
+        let ws = WebSocket {
+            mask: Some([0, 0, 0, 0]),
+        };
+        eprintln!("established new WS for forwarding '{}'", data.unix);
+        ws.serve_connection(upgraded, unix).await?;
+
+        eprintln!("done handling forwarded connection from '{}'", data.unix);
+
+        Ok(())
+    }
+
+    async fn get_ticket(
+        cmd_sender: &mpsc::UnboundedSender<(Value, oneshot::Sender<String>)>,
+        cmd_data: Arc<ForwardCmdData>,
+    ) -> Result<Arc<ForwardCmdData>, Error> {
+        eprintln!("requesting WS ticket via tunnel");
+        let ticket_cmd = match cmd_data.ticket.clone() {
+            Some(mut ticket_cmd) => {
+                ticket_cmd.insert("cmd".to_string(), serde_json::json!("ticket"));
+                ticket_cmd
+            }
+            None => bail!("can't get ticket without ticket parameters"),
+        };
+        let (tx, rx) = oneshot::channel::<String>();
+        cmd_sender.send((serde_json::json!(ticket_cmd), tx))?;
+        let ticket = rx.await?;
+        let mut ticket: Map<String, Value> = serde_json::from_str(&ticket)?;
+        let ticket = ticket
+            .remove("ticket")
+            .ok_or_else(|| format_err!("failed to retrieve ticket via tunnel"))?;
+
+        let ticket = ticket
+            .as_str()
+            .ok_or_else(|| format_err!("failed to format received ticket"))?;
+        let ticket = utf8_percent_encode(ticket, NON_ALPHANUMERIC).to_string();
+
+        let mut data = cmd_data.clone();
+        let mut url = data.url.clone();
+        url.push_str("ticket=");
+        url.push_str(&ticket);
+        let mut d = Arc::make_mut(&mut data);
+        d.url = url;
+        Ok(data)
+    }
+}
+
+#[tokio::main]
+async fn main() -> Result<(), Error> {
+    do_main().await
+}
+
+async fn do_main() -> Result<(), Error> {
+    let tunnel = CtrlTunnel { sender: None };
+    tunnel.read_cmd_loop().await
+}
-- 
2.30.2






* [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (4 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 2/4] add tunnel implementation Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-04 11:37   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 4/4] add packaging Fabian Grünbichler
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

in case we have no explicit fingerprint, we use openssl's regular "PEER"
verification. if we have a fingerprint, we ignore openssl altogether and
just verify the fingerprint of the presented leaf certificate.
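
the expected value is the colon-separated SHA-256 digest of the leaf
certificate, e.g. as printed by the stock openssl CLI (the path is just
an example):

    openssl x509 -noout -fingerprint -sha256 -in /etc/pve/local/pve-ssl.pem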

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v3: switch to using hex instead of no-longer-existing digest_to_hex
    v2: new

 src/main.rs | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/src/main.rs b/src/main.rs
index 582214c..49d6ffe 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -134,9 +134,50 @@ impl CtrlTunnel {
         }
 
         let mut ssl_connector_builder = SslConnector::builder(SslMethod::tls())?;
-        if fingerprint.is_some() {
-            // FIXME actually verify fingerprint via callback!
-            ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::NONE);
+        if let Some(expected) = fingerprint {
+            ssl_connector_builder.set_verify_callback(
+                openssl::ssl::SslVerifyMode::NONE,
+                move |_valid, ctx| {
+                    let cert = match ctx.current_cert() {
+                        Some(cert) => cert,
+                        None => {
+                            eprintln!("SSL context lacks current certificate.");
+                            return false;
+                        }
+                    };
+
+                    let depth = ctx.error_depth();
+                    if depth != 0 {
+                        return true;
+                    }
+
+                    let fp = match cert.digest(openssl::hash::MessageDigest::sha256()) {
+                        Ok(fp) => fp,
+                        Err(err) => {
+                            // should not happen
+                            eprintln!("failed to calculate certificate FP - {}", err);
+                            return false;
+                        }
+                    };
+                    let fp_string = hex::encode(&fp);
+                    let fp_string = fp_string
+                        .as_bytes()
+                        .chunks(2)
+                        .map(|v| std::str::from_utf8(v).unwrap())
+                        .collect::<Vec<&str>>()
+                        .join(":");
+
+                    let expected = expected.to_lowercase();
+                    if expected == fp_string {
+                        true
+                    } else {
+                        eprintln!("certificate fingerprint does not match expected fingerprint!");
+                        eprintln!("expected:    {}", expected);
+                        eprintln!("encountered: {}", fp_string);
+                        false
+                    }
+                },
+            );
         } else {
             ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::PEER);
         }
-- 
2.30.2






* [pve-devel] [PATCH v3 proxmox-websocket-tunnel 4/4] add packaging
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (5 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 01/10] refactor map_storage to map_id Fabian Grünbichler
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v3: rebased/regenerated

 Makefile             | 44 +++++++++++++++++++++++++++++++++
 debian/changelog     |  5 ++++
 debian/control       | 58 ++++++++++++++++++++++++++++++++++++++++++++
 debian/copyright     | 16 ++++++++++++
 debian/debcargo.toml | 13 ++++++++++
 5 files changed, 136 insertions(+)
 create mode 100644 Makefile
 create mode 100644 debian/changelog
 create mode 100644 debian/control
 create mode 100644 debian/copyright
 create mode 100644 debian/debcargo.toml

diff --git a/Makefile b/Makefile
new file mode 100644
index 0000000..4f2a799
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,44 @@
+.PHONY: all
+all: check
+
+.PHONY: check
+check:
+	cargo test --all-features
+
+.PHONY: dinstall
+dinstall: deb
+	sudo -k dpkg -i build/librust-*.deb
+
+.PHONY: build
+build:
+	rm -rf build
+	rm -f debian/control
+	mkdir build
+	debcargo package \
+	    --config "$(PWD)/debian/debcargo.toml" \
+	    --changelog-ready \
+	    --no-overlay-write-back \
+	    --directory "$(PWD)/build/proxmox-websocket-tunnel" \
+	    "proxmox-websocket-tunnel" \
+	    "$$(dpkg-parsechangelog -l "debian/changelog" -SVersion | sed -e 's/-.*//')"
+	echo system >build/rust-toolchain
+	rm -f build/proxmox-websocket-tunnel/Cargo.lock
+	find build/proxmox-websocket-tunnel/debian -name '*.hint' -delete
+	cp build/proxmox-websocket-tunnel/debian/control debian/control
+
+.PHONY: deb
+deb: build
+	(cd build/proxmox-websocket-tunnel && dpkg-buildpackage -b -uc -us)
+	lintian build/*.deb
+
+.PHONY: clean
+clean:
+	rm -rf build *.deb *.buildinfo *.changes *.orig.tar.gz
+	cargo clean
+
+upload: deb
+	cd build; \
+	    dcmd --deb rust-proxmox-websocket-tunnel_*.changes \
+	    | grep -v '.changes$$' \
+	    | tar -cf- -T- \
+	    | ssh -X repoman@repo.proxmox.com upload --product pve --dist bullseye
diff --git a/debian/changelog b/debian/changelog
new file mode 100644
index 0000000..04751ce
--- /dev/null
+++ b/debian/changelog
@@ -0,0 +1,5 @@
+rust-proxmox-websocket-tunnel (0.1.0-1) unstable; urgency=medium
+
+  * initial release
+
+ -- Proxmox Support Team <support@proxmox.com>  Tue, 18 May 2021 14:18:14 +0200
diff --git a/debian/control b/debian/control
new file mode 100644
index 0000000..7f6b4c3
--- /dev/null
+++ b/debian/control
@@ -0,0 +1,58 @@
+Source: rust-proxmox-websocket-tunnel
+Section: admin
+Priority: optional
+Build-Depends: debhelper (>= 12),
+ dh-cargo (>= 25),
+ cargo:native,
+ rustc:native,
+ libstd-rust-dev,
+ librust-anyhow-1+default-dev,
+ librust-base64-0.13+default-dev,
+ librust-futures-0.3+default-dev,
+ librust-futures-util-0.3+default-dev,
+ librust-hex-0.4+default-dev,
+ librust-hyper-0.14+default-dev,
+ librust-openssl-0.10+default-dev,
+ librust-percent-encoding-2+default-dev,
+ librust-proxmox-http-0.6+client-dev,
+ librust-proxmox-http-0.6+default-dev,
+ librust-proxmox-http-0.6+websocket-dev,
+ librust-proxmox-sys-0.2+default-dev (>= 0.2.2-~~),
+ librust-serde-1+default-dev,
+ librust-serde-1+derive-dev,
+ librust-serde-json-1+default-dev,
+ librust-tokio-1+default-dev,
+ librust-tokio-1+io-std-dev,
+ librust-tokio-1+io-util-dev,
+ librust-tokio-1+macros-dev,
+ librust-tokio-1+rt-multi-thread-dev,
+ librust-tokio-1+sync-dev,
+ librust-tokio-stream-0.1+default-dev,
+ librust-tokio-stream-0.1+io-util-dev,
+ librust-tokio-util-0.6+default-dev
+Maintainer: Proxmox Support Team <support@proxmox.com>
+Standards-Version: 4.5.1
+Vcs-Git: git://git.proxmox.com/git/proxmox-websocket-tunnel.git
+Vcs-Browser: https://git.proxmox.com/?p=proxmox-websocket-tunnel.git
+Rules-Requires-Root: no
+
+Package: proxmox-websocket-tunnel
+Architecture: any
+Multi-Arch: allowed
+Depends:
+ ${misc:Depends},
+ ${shlibs:Depends},
+ ${cargo:Depends}
+Recommends:
+ ${cargo:Recommends}
+Suggests:
+ ${cargo:Suggests}
+Provides:
+ ${cargo:Provides}
+Built-Using: ${cargo:Built-Using}
+XB-X-Cargo-Built-Using: ${cargo:X-Cargo-Built-Using}
+Description: Proxmox websocket tunneling helper
+ Proxmox websocket tunnel helper
+ .
+ This package contains a helper binary for tunnelling UNIX sockets over a
+ websocket connection
diff --git a/debian/copyright b/debian/copyright
new file mode 100644
index 0000000..5661ef6
--- /dev/null
+++ b/debian/copyright
@@ -0,0 +1,16 @@
+Copyright (C) 2021 Proxmox Server Solutions GmbH
+
+This software is written by Proxmox Server Solutions GmbH <support@proxmox.com>
+
+This program is free software: you can redistribute it and/or modify
+it under the terms of the GNU Affero General Public License as published by
+the Free Software Foundation, either version 3 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU Affero General Public License for more details.
+
+You should have received a copy of the GNU Affero General Public License
+along with this program.  If not, see <http://www.gnu.org/licenses/>.
diff --git a/debian/debcargo.toml b/debian/debcargo.toml
new file mode 100644
index 0000000..dfe933e
--- /dev/null
+++ b/debian/debcargo.toml
@@ -0,0 +1,13 @@
+overlay = "."
+crate_src_path = ".."
+maintainer = "Proxmox Support Team <support@proxmox.com>"
+
+[source]
+vcs_git = "git://git.proxmox.com/git/proxmox-websocket-tunnel.git"
+vcs_browser = "https://git.proxmox.com/?p=proxmox-websocket-tunnel.git"
+section = "admin"
+
+[packages.bin]
+description="""Proxmox websocket tunnel helper
+
+This package contains a helper binary for tunnelling UNIX sockets over a websocket connection"""
-- 
2.30.2






* [pve-devel] [PATCH v3 qemu-server 01/10] refactor map_storage to map_id
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (6 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 4/4] add packaging Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 02/10] schema: use pve-bridge-id Fabian Grünbichler
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

since we are going to reuse the same mechanism/code for network bridge
mapping.
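
for illustration, the mapping semantics look roughly like this
(standalone sketch; the entries/default map layout is an assumption
based on how the storagemap option is used, not the actual helper):

    my $map = {
        entries => { 'local-lvm' => 'remote-lvm' },
        default => 'remote-default',
    };

    sub map_id_sketch {
        my ($map, $source) = @_;
        return $source if !defined($map);
        return $map->{entries}->{$source}
            if $map->{entries} && defined($map->{entries}->{$source});
        return $map->{default} if $map->{default};
        return $source; # identity fallback
    }

    print map_id_sketch($map, 'local-lvm'), "\n"; # remote-lvm
    print map_id_sketch($map, 'zfs-a'), "\n";     # remote-default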

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 PVE/QemuMigrate.pm | 6 +++---
 PVE/QemuServer.pm  | 6 ++++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index c9bc39d..3d05242 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -342,7 +342,7 @@ sub prepare {
 	my $targetsid = $sid;
 	# NOTE: we currently ignore shared source storages in mappings so skip here too for now
 	if (!$scfg->{shared}) {
-	    $targetsid = PVE::QemuServer::map_storage($self->{opts}->{storagemap}, $sid);
+	    $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
 	}
 
 	my $target_scfg = PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
@@ -408,7 +408,7 @@ sub scan_local_volumes {
 
 	    next if @{$dl->{$storeid}} == 0;
 
-	    my $targetsid = PVE::QemuServer::map_storage($self->{opts}->{storagemap}, $storeid);
+	    my $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $storeid);
 	    # check if storage is available on target node
 	    my $target_scfg = PVE::Storage::storage_check_enabled(
 		$storecfg,
@@ -479,7 +479,7 @@ sub scan_local_volumes {
 	    my $targetsid = $sid;
 	    # NOTE: we currently ignore shared source storages in mappings so skip here too for now
 	    if (!$scfg->{shared}) {
-		$targetsid = PVE::QemuServer::map_storage($self->{opts}->{storagemap}, $sid);
+		$targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
 	    }
 
 	    PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 0071a06..db3a0b1 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -120,7 +120,9 @@ PVE::JSONSchema::register_standard_option('pve-qemu-machine', {
 });
 
 
-sub map_storage {
+# maps source to target ID
+# currently used for targetstorage and targetbridge when migrating
+sub map_id {
     my ($map, $source) = @_;
 
     return $source if !defined($map);
@@ -5279,7 +5281,7 @@ sub vm_migrate_alloc_nbd_disks {
 	# volume is not available there, fall back to the default format.
 	# Otherwise use the same format as the original.
 	if (!$storagemap->{identity}) {
-	    $storeid = map_storage($storagemap, $storeid);
+	    $storeid = map_id($storagemap, $storeid);
 	    my ($defFormat, $validFormats) = PVE::Storage::storage_default_format($storecfg, $storeid);
 	    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
 	    my $fileFormat = qemu_img_format($scfg, $volname);
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 02/10] schema: use pve-bridge-id
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (7 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 01/10] refactor map_storage to map_id Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 03/10] parse_config: optional strict mode Fabian Grünbichler
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    requires pve-common with pve-bridge-id

 PVE/QemuServer.pm | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index db3a0b1..f5beb1b 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -931,13 +931,10 @@ my $net_fmt = {
         default_key => 1,
     },
     (map { $_ => { keyAlias => 'model', alias => 'macaddr' }} @$nic_model_list),
-    bridge => {
-	type => 'string',
+    bridge => get_standard_option('pve-bridge-id', {
 	description => $net_fmt_bridge_descr,
-	format_description => 'bridge',
-	pattern => '[-_.\w\d]+',
 	optional => 1,
-    },
+    }),
     queues => {
 	type => 'integer',
 	minimum => 0, maximum => 16,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 03/10] parse_config: optional strict mode
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (8 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 02/10] schema: use pve-bridge-id Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-04 11:57   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 04/10] update_vm: allow simultaneous setting of boot-order and dev Fabian Grünbichler
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

when passing a config from one cluster to another, we want to be strict
when parsing - it's better to fail the migration early and upgrade the
target node instead of failing the migration later (when significant
work for transferring disks and/or state has already been done), or not
failing at all but silently losing config settings that the target
doesn't understand.

this also might be helpful in other cases - e.g. when restoring from a
backup.
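
a hedged usage sketch of the new optional third parameter (path and
error message made up):

    my $conf_fn = "incoming/qemu-server/123.conf"; # illustrative
    my $raw = PVE::Tools::file_get_contents($conf_fn);
    my $conf = eval { PVE::QemuServer::parse_vm_config($conf_fn, $raw, 1) };
    die "aborting migration early - config not fully understood: $@" if $@;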

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v3: new

 PVE/QemuServer.pm | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index f5beb1b..be88385 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -2329,7 +2329,7 @@ sub destroy_vm {
 }
 
 sub parse_vm_config {
-    my ($filename, $raw) = @_;
+    my ($filename, $raw, $strict) = @_;
 
     return if !defined($raw);
 
@@ -2339,6 +2339,16 @@ sub parse_vm_config {
 	pending => {},
     };
 
+    my $handle_error = sub {
+	my ($msg) = @_;
+
+	if ($strict) {
+	    die $msg;
+	} else {
+	    warn $msg;
+	}
+    };
+
     $filename =~ m|/qemu-server/(\d+)\.conf$|
 	|| die "got strange filename '$filename'";
 
@@ -2393,14 +2403,14 @@ sub parse_vm_config {
 	    if ($section eq 'pending') {
 		$conf->{delete} = $value; # we parse this later
 	    } else {
-		warn "vm $vmid - propertry 'delete' is only allowed in [PENDING]\n";
+		$handle_error->("vm $vmid - property 'delete' is only allowed in [PENDING]\n");
 	    }
 	} elsif ($line =~ m/^([a-z][a-z_]*\d*):\s*(.+?)\s*$/) {
 	    my $key = $1;
 	    my $value = $2;
 	    eval { $value = check_type($key, $value); };
 	    if ($@) {
-		warn "vm $vmid - unable to parse value of '$key' - $@";
+		$handle_error->("vm $vmid - unable to parse value of '$key' - $@");
 	    } else {
 		$key = 'ide2' if $key eq 'cdrom';
 		my $fmt = $confdesc->{$key}->{format};
@@ -2410,7 +2420,7 @@ sub parse_vm_config {
 			$v->{file} = $volid;
 			$value = print_drive($v);
 		    } else {
-			warn "vm $vmid - unable to parse value of '$key'\n";
+			$handle_error->("vm $vmid - unable to parse value of '$key'\n");
 			next;
 		    }
 		}
@@ -2418,7 +2428,7 @@ sub parse_vm_config {
 		$conf->{$key} = $value;
 	    }
 	} else {
-	    warn "vm $vmid - unable to parse config: $line\n";
+	    $handle_error->("vm $vmid - unable to parse config: $line\n");
 	}
     }
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 04/10] update_vm: allow simultaneous setting of boot-order and dev
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (9 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 03/10] parse_config: optional strict mode Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 05/10] nbd alloc helper: allow passing in explicit format Fabian Grünbichler
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

otherwise this fails when 'boot' is checked before the device has been
put into the config or pending section.
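
for illustration, the kind of call that previously failed (VMID and
storage values made up):

    # add a new disk and boot from it in a single update call:
    #   qm set 100 --scsi1 local-lvm:10 --boot order=scsi1
    # 'scsi1' is not yet in $conf or $conf->{pending} at check time,
    # but it is in $param, so the bootorder check now accepts it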

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    as happens when doing a remote migration and the full config is passed through
    update_vm_api

 PVE/API2/Qemu.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 6992f6f..38a3f1e 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -1411,7 +1411,7 @@ my $update_vm_api  = sub {
 			if ($new_bootcfg->{order}) {
 			    my @devs = PVE::Tools::split_list($new_bootcfg->{order});
 			    for my $dev (@devs) {
-				my $exists = $conf->{$dev} || $conf->{pending}->{$dev};
+				my $exists = $conf->{$dev} || $conf->{pending}->{$dev} || $param->{$dev};
 				my $deleted = grep {$_ eq $dev} @delete;
 				die "invalid bootorder: device '$dev' does not exist'\n"
 				    if !$exists || $deleted;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 05/10] nbd alloc helper: allow passing in explicit format
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (10 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 04/10] update_vm: allow simultaneous setting of boot-order and dev Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 06/10] migrate: move tunnel-helpers to pve-guest-common Fabian Grünbichler
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

and make $volname optional, to support remote migration usage without
code duplication.
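
the resulting format selection can be sketched like this (standalone
illustration of the precedence described in the hunk below, not the
actual helper):

    sub pick_format_sketch {
        my ($requested, $file_format, $def_format, $valid_formats) = @_;
        my %valid = map { $_ => 1 } @$valid_formats;
        return $requested if $requested && $valid{$requested};       # 1. explicit request
        return $file_format if $file_format && $valid{$file_format}; # 2. current volume
        return $def_format;                                          # 3. storage default
    }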

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v2: new
    
    requires
    - pve-storage with UNIX import support
    - pve-access-control with tunnel ticket support
    - pve-http-server with websocket fixes

 PVE/QemuServer.pm | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index be88385..3702d43 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5271,11 +5271,9 @@ sub vm_migrate_get_nbd_disks {
 sub vm_migrate_alloc_nbd_disks {
     my ($storecfg, $vmid, $source_volumes, $storagemap) = @_;
 
-    my $format = undef;
-
     my $nbd = {};
     foreach my $opt (sort keys %$source_volumes) {
-	my ($volid, $storeid, $volname, $drive, $use_existing) = @{$source_volumes->{$opt}};
+	my ($volid, $storeid, $volname, $drive, $use_existing, $format) = @{$source_volumes->{$opt}};
 
 	if ($use_existing) {
 	    $nbd->{$opt}->{drivestr} = print_drive($drive);
@@ -5284,16 +5282,26 @@ sub vm_migrate_alloc_nbd_disks {
 	    next;
 	}
 
-	# If a remote storage is specified and the format of the original
-	# volume is not available there, fall back to the default format.
-	# Otherwise use the same format as the original.
+	# storage mapping + volname = regular migration
+	# storage mapping + format = remote migration
+	# order of precedence, filtered by whether storage supports it:
+	# 1. explicit requested format
+	# 2. format of current volume
+	# 3. default format of storage
 	if (!$storagemap->{identity}) {
 	    $storeid = map_id($storagemap, $storeid);
 	    my ($defFormat, $validFormats) = PVE::Storage::storage_default_format($storecfg, $storeid);
-	    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
-	    my $fileFormat = qemu_img_format($scfg, $volname);
-	    $format = (grep {$fileFormat eq $_} @{$validFormats}) ? $fileFormat : $defFormat;
+	    if (!$format || !grep { $format eq $_ } @$validFormats) {
+		if ($volname) {
+		    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
+		    my $fileFormat = qemu_img_format($scfg, $volname);
+		    $format = $fileFormat
+			if grep { $fileFormat eq $_ } @$validFormats;
+		}
+		$format //= $defFormat;
+	    }
 	} else {
+	    # can't happen for remote migration, so $volname is always defined
 	    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
 	    $format = qemu_img_format($scfg, $volname);
 	}
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 06/10] migrate: move tunnel-helpers to pve-guest-common
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (11 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 05/10] nbd alloc helper: allow passing in explicit format Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 07/10] mtunnel: add API endpoints Fabian Grünbichler
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

besides the log calls these don't need any parts of the migration state,
so let's make them generic and re-use them for container migration and
replication in the future.
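
after the move, callers pass an explicit tunnel handle and log callback
instead of using $self methods - a rough sketch mirroring the calls in
the diff below:

    my $log = sub { my $line = shift; $self->log('info', $line); };
    my $tunnel = PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
    PVE::Tunnel::write_tunnel($tunnel, 30, "resume $vmid");
    PVE::Tunnel::finish_tunnel($tunnel);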

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    new in v3, requires bumped libpve-guest-common-perl

 PVE/QemuMigrate.pm                    | 183 ++------------------------
 test/MigrationTest/QemuMigrateMock.pm |  28 ++--
 2 files changed, 26 insertions(+), 185 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 3d05242..c915b28 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -18,6 +18,7 @@ use PVE::ReplicationConfig;
 use PVE::ReplicationState;
 use PVE::Storage;
 use PVE::Tools;
+use PVE::Tunnel;
 
 use PVE::QemuConfig;
 use PVE::QemuServer::CPUConfig;
@@ -30,180 +31,16 @@ use PVE::QemuServer;
 use PVE::AbstractMigrate;
 use base qw(PVE::AbstractMigrate);
 
-sub fork_command_pipe {
-    my ($self, $cmd) = @_;
-
-    my $reader = IO::File->new();
-    my $writer = IO::File->new();
-
-    my $orig_pid = $$;
-
-    my $cpid;
-
-    eval { $cpid = open2($reader, $writer, @$cmd); };
-
-    my $err = $@;
-
-    # catch exec errors
-    if ($orig_pid != $$) {
-	$self->log('err', "can't fork command pipe\n");
-	POSIX::_exit(1);
-	kill('KILL', $$);
-    }
-
-    die $err if $err;
-
-    return { writer => $writer, reader => $reader, pid => $cpid };
-}
-
-sub finish_command_pipe {
-    my ($self, $cmdpipe, $timeout) = @_;
-
-    my $cpid = $cmdpipe->{pid};
-    return if !defined($cpid);
-
-    my $writer = $cmdpipe->{writer};
-    my $reader = $cmdpipe->{reader};
-
-    $writer->close();
-    $reader->close();
-
-    my $collect_child_process = sub {
-	my $res = waitpid($cpid, WNOHANG);
-	if (defined($res) && ($res == $cpid)) {
-	    delete $cmdpipe->{cpid};
-	    return 1;
-	} else {
-	    return 0;
-	}
-     };
-
-    if ($timeout) {
-	for (my $i = 0; $i < $timeout; $i++) {
-	    return if &$collect_child_process();
-	    sleep(1);
-	}
-    }
-
-    $self->log('info', "ssh tunnel still running - terminating now with SIGTERM\n");
-    kill(15, $cpid);
-
-    # wait again
-    for (my $i = 0; $i < 10; $i++) {
-	return if &$collect_child_process();
-	sleep(1);
-    }
-
-    $self->log('info', "ssh tunnel still running - terminating now with SIGKILL\n");
-    kill 9, $cpid;
-    sleep 1;
-
-    $self->log('err', "ssh tunnel child process (PID $cpid) couldn't be collected\n")
-	if !&$collect_child_process();
-}
-
-sub read_tunnel {
-    my ($self, $tunnel, $timeout) = @_;
-
-    $timeout = 60 if !defined($timeout);
-
-    my $reader = $tunnel->{reader};
-
-    my $output;
-    eval {
-	PVE::Tools::run_with_timeout($timeout, sub { $output = <$reader>; });
-    };
-    die "reading from tunnel failed: $@\n" if $@;
-
-    chomp $output;
-
-    return $output;
-}
-
-sub write_tunnel {
-    my ($self, $tunnel, $timeout, $command) = @_;
-
-    $timeout = 60 if !defined($timeout);
-
-    my $writer = $tunnel->{writer};
-
-    eval {
-	PVE::Tools::run_with_timeout($timeout, sub {
-	    print $writer "$command\n";
-	    $writer->flush();
-	});
-    };
-    die "writing to tunnel failed: $@\n" if $@;
-
-    if ($tunnel->{version} && $tunnel->{version} >= 1) {
-	my $res = eval { $self->read_tunnel($tunnel, 10); };
-	die "no reply to command '$command': $@\n" if $@;
-
-	if ($res eq 'OK') {
-	    return;
-	} else {
-	    die "tunnel replied '$res' to command '$command'\n";
-	}
-    }
-}
-
 sub fork_tunnel {
     my ($self, $ssh_forward_info) = @_;
 
-    my @localtunnelinfo = ();
-    foreach my $addr (@$ssh_forward_info) {
-	push @localtunnelinfo, '-L', $addr;
-    }
-
-    my $cmd = [@{$self->{rem_ssh}}, '-o ExitOnForwardFailure=yes', @localtunnelinfo, '/usr/sbin/qm', 'mtunnel' ];
-
-    my $tunnel = $self->fork_command_pipe($cmd);
-
-    eval {
-	my $helo = $self->read_tunnel($tunnel, 60);
-	die "no reply\n" if !$helo;
-	die "no quorum on target node\n" if $helo =~ m/^no quorum$/;
-	die "got strange reply from mtunnel ('$helo')\n"
-	    if $helo !~ m/^tunnel online$/;
-    };
-    my $err = $@;
-
-    eval {
-	my $ver = $self->read_tunnel($tunnel, 10);
-	if ($ver =~ /^ver (\d+)$/) {
-	    $tunnel->{version} = $1;
-	    $self->log('info', "ssh tunnel $ver\n");
-	} else {
-	    $err = "received invalid tunnel version string '$ver'\n" if !$err;
-	}
+    my $cmd = ['/usr/sbin/qm', 'mtunnel'];
+    my $log = sub {
+	my $line = shift;
+	$self->log('info', $line);
     };
 
-    if ($err) {
-	$self->finish_command_pipe($tunnel);
-	die "can't open migration tunnel - $err";
-    }
-    return $tunnel;
-}
-
-sub finish_tunnel {
-    my ($self, $tunnel) = @_;
-
-    eval { $self->write_tunnel($tunnel, 30, 'quit'); };
-    my $err = $@;
-
-    $self->finish_command_pipe($tunnel, 30);
-
-    if (my $unix_sockets = $tunnel->{unix_sockets}) {
-	# ssh does not clean up on local host
-	my $cmd = ['rm', '-f', @$unix_sockets];
-	PVE::Tools::run_command($cmd);
-
-	# .. and just to be sure check on remote side
-	unshift @{$cmd}, @{$self->{rem_ssh}};
-	PVE::Tools::run_command($cmd);
-    }
-
-    die $err if $err;
+    return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
 }
 
 sub start_remote_tunnel {
@@ -244,7 +81,7 @@ sub start_remote_tunnel {
 	    }
 	    if ($unix_socket_try > 100) {
 		$self->{errors} = 1;
-		$self->finish_tunnel($self->{tunnel});
+		PVE::Tunnel::finish_tunnel($self->{tunnel});
 		die "Timeout, migration socket $ruri did not get ready";
 	    }
 	    $self->{tunnel}->{unix_sockets} = $unix_sockets if (@$unix_sockets);
@@ -1254,7 +1091,7 @@ sub phase2_cleanup {
 
 
     if ($self->{tunnel}) {
-	eval { finish_tunnel($self, $self->{tunnel});  };
+	eval { PVE::Tunnel::finish_tunnel($self->{tunnel});  };
 	if (my $err = $@) {
 	    $self->log('err', $err);
 	    $self->{errors} = 1;
@@ -1312,7 +1149,7 @@ sub phase3_cleanup {
 	# config moved and nbd server stopped - now we can resume vm on target
 	if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 1) {
 	    eval {
-		$self->write_tunnel($tunnel, 30, "resume $vmid");
+		PVE::Tunnel::write_tunnel($tunnel, 30, "resume $vmid");
 	    };
 	    if (my $err = $@) {
 		$self->log('err', $err);
@@ -1339,7 +1176,7 @@ sub phase3_cleanup {
 
     # close tunnel on successful migration, on error phase2_cleanup closed it
     if ($tunnel) {
-	eval { finish_tunnel($self, $tunnel);  };
+	eval { PVE::Tunnel::finish_tunnel($tunnel); };
 	if (my $err = $@) {
 	    $self->log('err', $err);
 	    $self->{errors} = 1;
diff --git a/test/MigrationTest/QemuMigrateMock.pm b/test/MigrationTest/QemuMigrateMock.pm
index 8e0b7d0..f2c0281 100644
--- a/test/MigrationTest/QemuMigrateMock.pm
+++ b/test/MigrationTest/QemuMigrateMock.pm
@@ -51,12 +51,26 @@ $MigrationTest::Shared::qemu_config_module->mock(
     },
 );
 
-my $qemu_migrate_module = Test::MockModule->new("PVE::QemuMigrate");
-$qemu_migrate_module->mock(
+my $tunnel_module = Test::MockModule->new("PVE::Tunnel");
+$tunnel_module->mock(
     finish_tunnel => sub {
 	delete $expected_calls->{'finish_tunnel'};
 	return;
     },
+    write_tunnel => sub {
+	my ($tunnel, $timeout, $command) = @_;
+
+	if ($command =~ m/^resume (\d+)$/) {
+	    my $vmid = $1;
+	    die "resuming wrong VM '$vmid'\n" if $vmid ne $test_vmid;
+	    return;
+	}
+	die "write_tunnel (mocked) - implement me: $command\n";
+    },
+);
+
+my $qemu_migrate_module = Test::MockModule->new("PVE::QemuMigrate");
+$qemu_migrate_module->mock(
     fork_tunnel => sub {
 	die "fork_tunnel (mocked) - implement me\n"; # currently no call should lead here
     },
@@ -73,16 +87,6 @@ $qemu_migrate_module->mock(
 	    version => 1,
 	};
     },
-    write_tunnel => sub {
-	my ($self, $tunnel, $timeout, $command) = @_;
-
-	if ($command =~ m/^resume (\d+)$/) {
-	    my $vmid = $1;
-	    die "resuming wrong VM '$vmid'\n" if $vmid ne $test_vmid;
-	    return;
-	}
-	die "write_tunnel (mocked) - implement me: $command\n";
-    },
     log => sub {
 	my ($self, $level, $message) = @_;
 	$current_log .= "$level: $message\n";
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 07/10] mtunnel: add API endpoints
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (12 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 06/10] migrate: move tunnel-helpers to pve-guest-common Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 08/10] migrate: refactor remote VM/tunnel start Fabian Grünbichler
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

the following two endpoints are used for migration on the remote side

POST /nodes/NODE/qemu/VMID/mtunnel

which creates and locks an empty VM config, and spawns the main qmtunnel
worker which binds to a VM-specific UNIX socket.

this worker handles JSON-encoded migration commands coming in via this
UNIX socket:
- config (set target VM config)
-- checks permissions for updating config
-- strips pending changes and snapshots
-- sets (optional) firewall config
- disk (allocate disk for NBD migration)
-- checks permission for target storage
-- returns drive string for allocated volume
- disk-import (import 'pvesm export' stream for offline migration)
-- checks permission for target storage
-- forks a child running 'pvesm import' reading from a UNIX socket
-- only one import allowed to run at any given moment
- query-disk-import
-- checks output of 'pvesm import' for volume ID message
-- returns volid + success, or 'pending', or 'error'
- start (returning migration info)
- fstrim (via agent)
- bwlimit (query bwlimit for storage)
- ticket (creates a ticket for a WS connection to a specific socket)
- resume
- stop
- nbdstop
- unlock
- quit (+ cleanup)

this worker serves as a replacement for both 'qm mtunnel' and various
manual calls via SSH. the API call will return a ticket valid for
connecting to the worker's UNIX socket via a websocket connection.
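
an illustrative exchange over the control socket (newline-delimited
JSON; concrete values made up, replies wrapped by the success flag):

    -> {"cmd":"version"}
    <- {"success":true,"api":2,"age":0}
    -> {"cmd":"bwlimit","storage":"local-lvm"}
    <- {"success":true,"bwlimit":102400}
    -> {"cmd":"quit","cleanup":0}
    <- {"success":true}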

GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket

gets called for connecting to a UNIX socket via websocket forwarding,
i.e. once for the main command mtunnel, and once each for the memory
migration and each NBD drive-mirror/storage migration.

access is guarded by a short-lived ticket binding the authenticated user
to the socket path. such tickets can be requested over the main mtunnel,
which keeps track of socket paths currently used by that
mtunnel/migration instance.

each command handler should check privileges for the requested action if
necessary.
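
requesting a forwarding ticket and upgrading the connection could look
roughly like this (socket path and ticket value are placeholders):

    -> {"cmd":"ticket","path":"/run/qemu-server/100.migrate"}
    <- {"success":true,"ticket":"<tunnel ticket>"}

    GET /api2/json/nodes/<node>/qemu/100/mtunnelwebsocket
        ?socket=/run/qemu-server/100.migrate&ticket=<tunnel ticket>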

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v3:
    - handle meta and vmgenid better
    - handle failure of 'config' updating
    - move 'disk-import' and 'query-disk-import' handlers to pve-guest-common
    - improve tunnel exit by letting client close the connection
    - use strict VM config parser
    v2: incorporated Fabian Ebner's feedback, mainly:
    - use modified nbd alloc helper instead of duplicating
    - fix disk cleanup, also cleanup imported disks
    - fix firewall-conf vs firewall-config mismatch
    
    requires
    - pve-storage with UNIX import support
    - pve-access-control with tunnel ticket support
    - pve-http-server with websocket fixes
    - pve-guest-common with StorageTunnel

 PVE/API2/Qemu.pm | 513 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 512 insertions(+), 1 deletion(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 38a3f1e..d188b77 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4,10 +4,13 @@ use strict;
 use warnings;
 use Cwd 'abs_path';
 use Net::SSLeay;
-use POSIX;
 use IO::Socket::IP;
+use IO::Socket::UNIX;
+use IPC::Open3;
+use JSON;
 use URI::Escape;
 use Crypt::OpenSSL::Random;
+use Socket qw(SOCK_STREAM);
 
 use PVE::Cluster qw (cfs_read_file cfs_write_file);;
 use PVE::RRD;
@@ -37,6 +40,7 @@ use PVE::VZDump::Plugin;
 use PVE::DataCenterConfig;
 use PVE::SSHInfo;
 use PVE::Replication;
+use PVE::StorageTunnel;
 
 BEGIN {
     if (!$ENV{PVE_GENERATING_DOCS}) {
@@ -857,6 +861,7 @@ __PACKAGE__->register_method({
 	    { subdir => 'spiceproxy' },
 	    { subdir => 'sendkey' },
 	    { subdir => 'firewall' },
+	    { subdir => 'mtunnel' },
 	    ];
 
 	return $res;
@@ -4651,4 +4656,510 @@ __PACKAGE__->register_method({
 	return PVE::QemuServer::Cloudinit::dump_cloudinit_config($conf, $param->{vmid}, $param->{type});
     }});
 
+__PACKAGE__->register_method({
+    name => 'mtunnel',
+    path => '{vmid}/mtunnel',
+    method => 'POST',
+    protected => 1,
+    proxyto => 'node',
+    description => 'Migration tunnel endpoint - only for internal use by VM migration.',
+    permissions => {
+	check => ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
+	description => "You need 'VM.Allocate' permissions on /vms/{vmid}. Further permission checks happen during the actual migration.",
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid'),
+	    storages => {
+		type => 'string',
+		format => 'pve-storage-id-list',
+		optional => 1,
+		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
+	    },
+	},
+    },
+    returns => {
+	additionalProperties => 0,
+	properties => {
+	    upid => { type => 'string' },
+	    ticket => { type => 'string' },
+	    socket => { type => 'string' },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $node = extract_param($param, 'node');
+	my $vmid = extract_param($param, 'vmid');
+
+	my $storages = extract_param($param, 'storages');
+
+	my $storecfg = PVE::Storage::config();
+	foreach my $storeid (PVE::Tools::split_list($storages)) {
+	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
+	}
+
+	PVE::Cluster::check_cfs_quorum();
+
+	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
+
+	my $lock = 'create';
+	eval { PVE::QemuConfig->create_and_lock_config($vmid, 0, $lock); };
+
+	raise_param_exc({ vmid => "unable to create empty VM config - $@"})
+	    if $@;
+
+	my $realcmd = sub {
+	    my $state = {
+		storecfg => PVE::Storage::config(),
+		lock => $lock,
+		vmid => $vmid,
+	    };
+
+	    my $run_locked = sub {
+		my ($code, $params) = @_;
+		return PVE::QemuConfig->lock_config($state->{vmid}, sub {
+		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
+
+		    $state->{conf} = $conf;
+
+		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
+			if $state->{lock} && !PVE::QemuConfig->has_lock($conf, $state->{lock});
+
+		    return $code->($params);
+		});
+	    };
+
+	    my $cmd_desc = {
+		bwlimit => {
+		    storage => {
+			type => 'string',
+			format => 'pve-storage-id',
+			description => "Storage for which bwlimit is queried",
+		    },
+		    bwlimit => {
+			description => "Override I/O bandwidth limit (in KiB/s).",
+			optional => 1,
+			type => 'integer',
+			minimum => '0',
+		    },
+		},
+		config => {
+		    conf => {
+			type => 'string',
+			description => 'Full VM config, adapted for target cluster/node',
+		    },
+		    'firewall-config' => {
+			type => 'string',
+			description => 'VM firewall config',
+			optional => 1,
+		    },
+		},
+		disk => {
+		    format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
+		    storage => {
+			type => 'string',
+			format => 'pve-storage-id',
+		    },
+		    drive => {
+			type => 'object',
+			description => 'parsed drive information without volid and format',
+		    },
+		},
+		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
+		start => {
+		    start_params => {
+			type => 'object',
+			description => 'params passed to vm_start_nolock',
+		    },
+		    migrate_opts => {
+			type => 'object',
+			description => 'migrate_opts passed to vm_start_nolock',
+		    },
+		},
+		ticket => {
+		    path => {
+			type => 'string',
+			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
+		    },
+		},
+		quit => {
+		    cleanup => {
+			type => 'boolean',
+			description => 'remove VM config and disks, aborting migration',
+			default => 0,
+		    },
+		},
+	    };
+
+	    my $cmd_handlers = {
+		'version' => sub {
+		    # compared against other end's version
+		    # bump/reset for breaking changes
+		    # bump/bump for opt-in changes
+		    return {
+			api => 2,
+			age => 0,
+		    };
+		},
+		'config' => sub {
+		    my ($params) = @_;
+
+		    # parse and write out VM FW config if given
+		    if (my $fw_conf = $params->{'firewall-config'}) {
+			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
+
+			my $empty_conf = {
+			    rules => [],
+			    options => {},
+			    aliases => {},
+			    ipset => {},
+			    ipset_comments => {},
+			};
+			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
+
+			# TODO: add flag for strict parsing?
+			# TODO: add import sub that does all this given raw content?
+			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
+			$vmfw_conf->{vmid} = $state->{vmid};
+			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
+
+			$state->{cleanup}->{fw} = 1;
+		    }
+
+		    my $conf_fn = "incoming/qemu-server/$state->{vmid}.conf";
+		    my $new_conf = PVE::QemuServer::parse_vm_config($conf_fn, $params->{conf}, 1);
+		    delete $new_conf->{lock};
+		    delete $new_conf->{digest};
+
+		    # TODO handle properly?
+		    delete $new_conf->{snapshots};
+		    delete $new_conf->{parent};
+		    delete $new_conf->{pending};
+
+		    # not handled by update_vm_api
+		    my $vmgenid = delete $new_conf->{vmgenid};
+		    my $meta = delete $new_conf->{meta};
+
+		    $new_conf->{vmid} = $state->{vmid};
+		    $new_conf->{node} = $node;
+
+		    PVE::QemuConfig->remove_lock($state->{vmid}, 'create');
+
+		    eval {
+			$update_vm_api->($new_conf, 1);
+		    };
+		    if (my $err = $@) {
+			# revert to locked previous config
+			my $conf = PVE::QemuConfig->load_config($state->{vmid});
+			$conf->{lock} = 'create';
+			PVE::QemuConfig->write_config($state->{vmid}, $conf);
+
+			die $err;
+		    }
+
+		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
+		    $conf->{lock} = 'migrate';
+		    $conf->{vmgenid} = $vmgenid if $vmgenid;
+		    $conf->{meta} = $meta if $meta;
+		    PVE::QemuConfig->write_config($state->{vmid}, $conf);
+
+		    $state->{lock} = 'migrate';
+
+		    return;
+		},
+		'bwlimit' => sub {
+		    my ($params) = @_;
+
+		    my $bwlimit = PVE::Storage::get_bandwidth_limit('migration', [$params->{storage}], $params->{bwlimit});
+		    return { bwlimit => $bwlimit };
+
+		},
+		'disk' => sub {
+		    my ($params) = @_;
+
+		    my $format = $params->{format};
+		    my $storeid = $params->{storage};
+		    my $drive = $params->{drive};
+
+		    $check_storage_access_migrate->($rpcenv, $authuser, $state->{storecfg}, $storeid, $node);
+
+		    my $storagemap = {
+			default => $storeid,
+		    };
+
+		    my $source_volumes = {
+			'disk' => [
+			    undef,
+			    $storeid,
+			    undef,
+			    $drive,
+			    0,
+			    $format,
+			],
+		    };
+
+		    my $res = PVE::QemuServer::vm_migrate_alloc_nbd_disks($state->{storecfg}, $state->{vmid}, $source_volumes, $storagemap);
+		    if (defined($res->{disk})) {
+			$state->{cleanup}->{volumes}->{$res->{disk}->{volid}} = 1;
+			return $res->{disk};
+		    } else {
+			die "failed to allocate NBD disk..\n";
+		    }
+		},
+		'disk-import' => sub {
+		    my ($params) = @_;
+
+		    $check_storage_access_migrate->(
+			$rpcenv,
+			$authuser,
+			$state->{storecfg},
+			$params->{storage},
+			$node
+		    );
+
+		    $params->{unix} = "/run/qemu-server/$state->{vmid}.storage";
+
+		    return PVE::StorageTunnel::handle_disk_import($state, $params);
+		},
+		'query-disk-import' => sub {
+		    my ($params) = @_;
+
+		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
+		},
+		'start' => sub {
+		    my ($params) = @_;
+
+		    my $info = PVE::QemuServer::vm_start_nolock(
+			$state->{storecfg},
+			$state->{vmid},
+			$state->{conf},
+			$params->{start_params},
+			$params->{migrate_opts},
+		    );
+
+
+		    if ($info->{migrate}->{proto} ne 'unix') {
+			PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
+			die "migration over non-UNIX sockets not possible\n";
+		    }
+
+		    my $socket = $info->{migrate}->{addr};
+		    chown $state->{socket_uid}, -1, $socket;
+		    $state->{sockets}->{$socket} = 1;
+
+		    my $unix_sockets = $info->{migrate}->{unix_sockets};
+		    foreach my $socket (@$unix_sockets) {
+			chown $state->{socket_uid}, -1, $socket;
+			$state->{sockets}->{$socket} = 1;
+		    }
+		    return $info;
+		},
+		'fstrim' => sub {
+		    if (PVE::QemuServer::qga_check_running($state->{vmid})) {
+			eval { mon_cmd($state->{vmid}, "guest-fstrim") };
+			warn "fstrim failed: $@\n" if $@;
+		    }
+		    return;
+		},
+		'stop' => sub {
+		    PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
+		    return;
+		},
+		'nbdstop' => sub {
+		    PVE::QemuServer::nbd_stop($state->{vmid});
+		    return;
+		},
+		'resume' => sub {
+		    if (PVE::QemuServer::check_running($state->{vmid}, 1)) {
+			PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
+		    } else {
+			die "VM $state->{vmid} not running\n";
+		    }
+		    return;
+		},
+		'unlock' => sub {
+		    PVE::QemuConfig->remove_lock($state->{vmid}, $state->{lock});
+		    delete $state->{lock};
+		    return;
+		},
+		'ticket' => sub {
+		    my ($params) = @_;
+
+		    my $path = $params->{path};
+
+		    die "Not allowed to generate ticket for unknown socket '$path'\n"
+			if !defined($state->{sockets}->{$path});
+
+		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
+		},
+		'quit' => sub {
+		    my ($params) = @_;
+
+		    if ($params->{cleanup}) {
+			if ($state->{cleanup}->{fw}) {
+			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
+			}
+
+			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
+			    print "freeing volume '$volid' as part of cleanup\n";
+			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
+			    warn $@ if $@;
+			}
+
+			PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
+		    }
+
+		    print "switching to exit-mode, waiting for client to disconnect\n";
+		    $state->{exit} = 1;
+		    return;
+		},
+	    };
+
+	    $run_locked->(sub {
+		my $socket_addr = "/run/qemu-server/$state->{vmid}.mtunnel";
+		unlink $socket_addr;
+
+		$state->{socket} = IO::Socket::UNIX->new(
+	            Type => SOCK_STREAM(),
+		    Local => $socket_addr,
+		    Listen => 1,
+		);
+
+		$state->{socket_uid} = getpwnam('www-data')
+		    or die "Failed to resolve user 'www-data' to numeric UID\n";
+		chown $state->{socket_uid}, -1, $socket_addr;
+	    });
+
+	    print "mtunnel started\n";
+
+	    my $conn = $state->{socket}->accept();
+
+	    $state->{conn} = $conn;
+
+	    my $reply_err = sub {
+		my ($msg) = @_;
+
+		my $reply = JSON::encode_json({
+		    success => JSON::false,
+		    msg => $msg,
+		});
+		$conn->print("$reply\n");
+		$conn->flush();
+	    };
+
+	    my $reply_ok = sub {
+		my ($res) = @_;
+
+		$res->{success} = JSON::true;
+		my $reply = JSON::encode_json($res);
+		$conn->print("$reply\n");
+		$conn->flush();
+	    };
+
+	    while (my $line = <$conn>) {
+		chomp $line;
+
+		# untaint, we validate below if needed
+		($line) = $line =~ /^(.*)$/;
+		my $parsed = eval { JSON::decode_json($line) };
+		if ($@) {
+		    $reply_err->("failed to parse command - $@");
+		    next;
+		}
+
+		my $cmd = delete $parsed->{cmd};
+		if (!defined($cmd)) {
+		    $reply_err->("'cmd' missing");
+		} elsif ($state->{exit}) {
+		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
+		    next;
+		} elsif (my $handler = $cmd_handlers->{$cmd}) {
+		    print "received command '$cmd'\n";
+		    eval {
+			if ($cmd_desc->{$cmd}) {
+			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
+			} else {
+			    $parsed = {};
+			}
+			my $res = $run_locked->($handler, $parsed);
+			$reply_ok->($res);
+		    };
+		    $reply_err->("failed to handle '$cmd' command - $@")
+			if $@;
+		} else {
+		    $reply_err->("unknown command '$cmd' given");
+		}
+	    }
+
+	    if ($state->{exit}) {
+		print "mtunnel exited\n";
+	    } else {
+		die "mtunnel exited unexpectedly\n";
+	    }
+	};
+
+	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
+	my $upid = $rpcenv->fork_worker('qmtunnel', $vmid, $authuser, $realcmd);
+
+	return {
+	    ticket => $ticket,
+	    upid => $upid,
+	    socket => $socket_addr,
+	};
+    }});
+
+__PACKAGE__->register_method({
+    name => 'mtunnelwebsocket',
+    path => '{vmid}/mtunnelwebsocket',
+    method => 'GET',
+    proxyto => 'node',
+    permissions => {
+	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
+        user => 'all', # check inside
+    },
+    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid'),
+	    socket => {
+		type => "string",
+		description => "unix socket to forward to",
+	    },
+	    ticket => {
+		type => "string",
+		description => "ticket returned by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
+	    },
+	},
+    },
+    returns => {
+	type => "object",
+	properties => {
+	    port => { type => 'string', optional => 1 },
+	    socket => { type => 'string', optional => 1 },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $vmid = $param->{vmid};
+	# check VM exists
+	PVE::QemuConfig->load_config($vmid);
+
+	my $socket = $param->{socket};
+	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
+
+	return { socket => $socket };
+    }});
+
 1;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 08/10] migrate: refactor remote VM/tunnel start
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (13 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 07/10] mtunnel: add API endpoints Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling Fabian Grünbichler
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

no semantic changes intended, except for:
- no longer passing the main migration UNIX socket to SSH twice for
forwarding
- dropping the 'unix:' prefix in start_remote_tunnel's timeout error message
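
shape of the new tunnel_info hash, matching the comment in the hunk
below (concrete values illustrative):

    my $tunnel_info = {
        proto => 'unix',                          # or 'tcp'
        addr  => '/run/qemu-server/100.migrate',  # IP/hostname for tcp
        port  => undef,                           # only set for tcp
        unix_sockets => { '/run/qemu-server/100_nbd.migrate' => 1 },
    };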

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v2: fixed nits

 PVE/QemuMigrate.pm | 158 ++++++++++++++++++++++++++++-----------------
 PVE/QemuServer.pm  |  34 +++++-----
 2 files changed, 113 insertions(+), 79 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index c915b28..897018b 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -43,19 +43,24 @@ sub fork_tunnel {
     return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
 }
 
+# tunnel_info:
+#   proto: unix (secure) or tcp (insecure/legacy compat)
+#   addr: IP or UNIX socket path
+#   port: optional TCP port
+#   unix_sockets: additional UNIX socket paths to forward
 sub start_remote_tunnel {
-    my ($self, $raddr, $rport, $ruri, $unix_socket_info) = @_;
+    my ($self, $tunnel_info) = @_;
 
     my $nodename = PVE::INotify::nodename();
     my $migration_type = $self->{opts}->{migration_type};
 
     if ($migration_type eq 'secure') {
 
-	if ($ruri =~ /^unix:/) {
-	    my $ssh_forward_info = ["$raddr:$raddr"];
-	    $unix_socket_info->{$raddr} = 1;
+	if ($tunnel_info->{proto} eq 'unix') {
+	    my $ssh_forward_info = [];
 
-	    my $unix_sockets = [ keys %$unix_socket_info ];
+	    my $unix_sockets = [ keys %{$tunnel_info->{unix_sockets}} ];
+	    push @$unix_sockets, $tunnel_info->{addr};
 	    for my $sock (@$unix_sockets) {
 		push @$ssh_forward_info, "$sock:$sock";
 		unlink $sock;
@@ -82,23 +87,23 @@ sub start_remote_tunnel {
 	    if ($unix_socket_try > 100) {
 		$self->{errors} = 1;
 		PVE::Tunnel::finish_tunnel($self->{tunnel});
-		die "Timeout, migration socket $ruri did not get ready";
+		die "Timeout, migration socket $tunnel_info->{addr} did not get ready";
 	    }
 	    $self->{tunnel}->{unix_sockets} = $unix_sockets if (@$unix_sockets);
 
-	} elsif ($ruri =~ /^tcp:/) {
+	} elsif ($tunnel_info->{proto} eq 'tcp') {
 	    my $ssh_forward_info = [];
-	    if ($raddr eq "localhost") {
+	    if ($tunnel_info->{addr} eq "localhost") {
 		# for backwards compatibility with older qemu-server versions
 		my $pfamily = PVE::Tools::get_host_address_family($nodename);
 		my $lport = PVE::Tools::next_migrate_port($pfamily);
-		push @$ssh_forward_info, "$lport:localhost:$rport";
+		push @$ssh_forward_info, "$lport:localhost:$tunnel_info->{port}";
 	    }
 
 	    $self->{tunnel} = $self->fork_tunnel($ssh_forward_info);
 
 	} else {
-	    die "unsupported protocol in migration URI: $ruri\n";
+	    die "unsupported protocol in migration URI: $tunnel_info->{proto}\n";
 	}
     } else {
 	#fork tunnel for insecure migration, to send faster commands like resume
@@ -650,52 +655,40 @@ sub phase1_cleanup {
     }
 }
 
-sub phase2 {
-    my ($self, $vmid) = @_;
+sub phase2_start_local_cluster {
+    my ($self, $vmid, $params) = @_;
 
     my $conf = $self->{vmconf};
     my $local_volumes = $self->{local_volumes};
     my @online_local_volumes = $self->filter_local_volumes('online');
 
     $self->{storage_migration} = 1 if scalar(@online_local_volumes);
+    my $start = $params->{start_params};
+    my $migrate = $params->{migrate_opts};
 
     $self->log('info', "starting VM $vmid on remote node '$self->{node}'");
 
-    my $raddr;
-    my $rport;
-    my $ruri; # the whole migration dst. URI (protocol:address[:port])
-    my $nodename = PVE::INotify::nodename();
+    my $tunnel_info = {};
 
     ## start on remote node
     my $cmd = [@{$self->{rem_ssh}}];
 
-    my $spice_ticket;
-    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
-	my $res = mon_cmd($vmid, 'query-spice');
-	$spice_ticket = $res->{ticket};
-    }
+    push @$cmd, 'qm', 'start', $vmid, '--skiplock';
+    push @$cmd, '--migratedfrom', $migrate->{migratedfrom};
 
-    push @$cmd , 'qm', 'start', $vmid, '--skiplock', '--migratedfrom', $nodename;
+    push @$cmd, '--migration_type', $migrate->{type};
 
-    my $migration_type = $self->{opts}->{migration_type};
+    push @$cmd, '--migration_network', $migrate->{network}
+      if $migrate->{network};
 
-    push @$cmd, '--migration_type', $migration_type;
+    push @$cmd, '--stateuri', $start->{statefile};
 
-    push @$cmd, '--migration_network', $self->{opts}->{migration_network}
-      if $self->{opts}->{migration_network};
-
-    if ($migration_type eq 'insecure') {
-	push @$cmd, '--stateuri', 'tcp';
-    } else {
-	push @$cmd, '--stateuri', 'unix';
+    if ($start->{forcemachine}) {
+	push @$cmd, '--machine', $start->{forcemachine};
     }
 
-    if ($self->{forcemachine}) {
-	push @$cmd, '--machine', $self->{forcemachine};
-    }
-
-    if ($self->{forcecpu}) {
-	push @$cmd, '--force-cpu', $self->{forcecpu};
+    if ($start->{forcecpu}) {
+	push @$cmd, '--force-cpu', $start->{forcecpu};
     }
 
     if ($self->{storage_migration}) {
@@ -703,10 +696,7 @@ sub phase2 {
     }
 
     my $spice_port;
-    my $unix_socket_info = {};
-    # version > 0 for unix socket support
-    my $nbd_protocol_version = 1;
-    my $input = "nbd_protocol_version: $nbd_protocol_version\n";
+    my $input = "nbd_protocol_version: $migrate->{nbd_proto_version}\n";
 
     if ($conf->{tpmstate0}) {
 	my $tpmdrive = PVE::QemuServer::parse_drive('tpmstate0', $conf->{tpmstate0});
@@ -715,7 +705,7 @@ sub phase2 {
 	    if $self->{volume_map}->{$tpmvol} && $tpmvol ne $self->{volume_map}->{$tpmvol};
     }
 
-    $input .= "spice_ticket: $spice_ticket\n" if $spice_ticket;
+    $input .= "spice_ticket: $migrate->{spice_ticket}\n" if $migrate->{spice_ticket};
 
     my @online_replicated_volumes = $self->filter_local_volumes('online', 1);
     foreach my $volid (@online_replicated_volumes) {
@@ -745,20 +735,20 @@ sub phase2 {
     my $exitcode = PVE::Tools::run_command($cmd, input => $input, outfunc => sub {
 	my $line = shift;
 
-	if ($line =~ m/^migration listens on tcp:(localhost|[\d\.]+|\[[\d\.:a-fA-F]+\]):(\d+)$/) {
-	    $raddr = $1;
-	    $rport = int($2);
-	    $ruri = "tcp:$raddr:$rport";
+	if ($line =~ m/^migration listens on (tcp):(localhost|[\d\.]+|\[[\d\.:a-fA-F]+\]):(\d+)$/) {
+	    $tunnel_info->{addr} = $2;
+	    $tunnel_info->{port} = int($3);
+	    $tunnel_info->{proto} = $1;
 	}
-	elsif ($line =~ m!^migration listens on unix:(/run/qemu-server/(\d+)\.migrate)$!) {
-	    $raddr = $1;
-	    die "Destination UNIX sockets VMID does not match source VMID" if $vmid ne $2;
-	    $ruri = "unix:$raddr";
+	elsif ($line =~ m!^migration listens on (unix):(/run/qemu-server/(\d+)\.migrate)$!) {
+	    $tunnel_info->{addr} = $2;
+	    die "Destination UNIX sockets VMID does not match source VMID" if $vmid ne $3;
+	    $tunnel_info->{proto} = $1;
 	}
 	elsif ($line =~ m/^migration listens on port (\d+)$/) {
-	    $raddr = "localhost";
-	    $rport = int($1);
-	    $ruri = "tcp:$raddr:$rport";
+	    $tunnel_info->{addr} = "localhost";
+	    $tunnel_info->{port} = int($1);
+	    $tunnel_info->{proto} = "tcp";
 	}
 	elsif ($line =~ m/^spice listens on port (\d+)$/) {
 	    $spice_port = int($1);
@@ -779,7 +769,7 @@ sub phase2 {
 	    $targetdrive =~ s/drive-//g;
 
 	    $handle_storage_migration_listens->($targetdrive, $drivestr, $nbd_uri);
-	    $unix_socket_info->{$nbd_unix_addr} = 1;
+	    $tunnel_info->{unix_sockets}->{$nbd_unix_addr} = 1;
 	} elsif ($line =~ m/^re-using replicated volume: (\S+) - (.*)$/) {
 	    my $drive = $1;
 	    my $volid = $2;
@@ -794,19 +784,65 @@ sub phase2 {
 
     die "remote command failed with exit code $exitcode\n" if $exitcode;
 
-    die "unable to detect remote migration address\n" if !$raddr;
+    die "unable to detect remote migration address\n" if !$tunnel_info->{addr} || !$tunnel_info->{proto};
 
     if (scalar(keys %$target_replicated_volumes) != scalar(@online_replicated_volumes)) {
 	die "number of replicated disks on source and target node do not match - target node too old?\n"
     }
 
+    return ($tunnel_info, $spice_port);
+}
+
+sub phase2 {
+    my ($self, $vmid) = @_;
+
+    my $conf = $self->{vmconf};
+
+    # version > 0 for unix socket support
+    my $nbd_protocol_version = 1;
+
+    my $spice_ticket;
+    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
+	my $res = mon_cmd($vmid, 'query-spice');
+	$spice_ticket = $res->{ticket};
+    }
+
+    my $migration_type = $self->{opts}->{migration_type};
+    my $state_uri = $migration_type eq 'insecure' ? 'tcp' : 'unix';
+
+    my $params = {
+	start_params => {
+	    statefile => $state_uri,
+	    forcemachine => $self->{forcemachine},
+	    forcecpu => $self->{forcecpu},
+	    skiplock => 1,
+	},
+	migrate_opts => {
+	    spice_ticket => $spice_ticket,
+	    type => $migration_type,
+	    network => $self->{opts}->{migration_network},
+	    storagemap => $self->{opts}->{storagemap},
+	    migratedfrom => PVE::INotify::nodename(),
+	    nbd_proto_version => $nbd_protocol_version,
+	    nbd => $self->{nbd},
+	},
+    };
+
+    my ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
+
     $self->log('info', "start remote tunnel");
-    $self->start_remote_tunnel($raddr, $rport, $ruri, $unix_socket_info);
+    $self->start_remote_tunnel($tunnel_info);
+
+    my $migrate_uri = "$tunnel_info->{proto}:$tunnel_info->{addr}";
+    $migrate_uri .= ":$tunnel_info->{port}"
+	if defined($tunnel_info->{port});
 
     if ($self->{storage_migration}) {
 	$self->{storage_migration_jobs} = {};
 	$self->log('info', "starting storage migration");
 
+	my @online_local_volumes = $self->filter_local_volumes('online');
+
 	die "The number of local disks does not match between the source and the destination.\n"
 	    if (scalar(keys %{$self->{target_drive}}) != scalar(@online_local_volumes));
 	foreach my $drive (keys %{$self->{target_drive}}){
@@ -816,7 +852,7 @@ sub phase2 {
 	    my $source_drive = PVE::QemuServer::parse_drive($drive, $conf->{$drive});
 	    my $source_volid = $source_drive->{file};
 
-	    my $bwlimit = $local_volumes->{$source_volid}->{bwlimit};
+	    my $bwlimit = $self->{local_volumes}->{$source_volid}->{bwlimit};
 	    my $bitmap = $target->{bitmap};
 
 	    $self->log('info', "$drive: start migration to $nbd_uri");
@@ -824,7 +860,7 @@ sub phase2 {
 	}
     }
 
-    $self->log('info', "starting online/live migration on $ruri");
+    $self->log('info', "starting online/live migration on $migrate_uri");
     $self->{livemigration} = 1;
 
     # load_defaults
@@ -901,12 +937,12 @@ sub phase2 {
 
     my $start = time();
 
-    $self->log('info', "start migrate command to $ruri");
+    $self->log('info', "start migrate command to $migrate_uri");
     eval {
-	mon_cmd($vmid, "migrate", uri => $ruri);
+	mon_cmd($vmid, "migrate", uri => $migrate_uri);
     };
     my $merr = $@;
-    $self->log('info', "migrate uri => $ruri failed: $merr") if $merr;
+    $self->log('info', "migrate uri => $migrate_uri failed: $merr") if $merr;
 
     my $last_mem_transferred = 0;
     my $usleep = 1000000;
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 3702d43..9971f2c 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5472,10 +5472,10 @@ sub vm_start_nolock {
 	return $migration_ip;
     };
 
-    my $migrate_uri;
     if ($statefile) {
 	if ($statefile eq 'tcp') {
-	    my $localip = "localhost";
+	    my $migrate = $res->{migrate} = { proto => 'tcp' };
+	    $migrate->{addr} = "localhost";
 	    my $datacenterconf = PVE::Cluster::cfs_read_file('datacenter.cfg');
 	    my $nodename = nodename();
 
@@ -5488,26 +5488,26 @@ sub vm_start_nolock {
 	    }
 
 	    if ($migration_type eq 'insecure') {
-		$localip = $get_migration_ip->($nodename);
-		$localip = "[$localip]" if Net::IP::ip_is_ipv6($localip);
+		$migrate->{addr} = $get_migration_ip->($nodename);
+		$migrate->{addr} = "[$migrate->{addr}]" if Net::IP::ip_is_ipv6($migrate->{addr});
 	    }
 
 	    my $pfamily = PVE::Tools::get_host_address_family($nodename);
-	    my $migrate_port = PVE::Tools::next_migrate_port($pfamily);
-	    $migrate_uri = "tcp:${localip}:${migrate_port}";
-	    push @$cmd, '-incoming', $migrate_uri;
+	    $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
+	    $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
+	    push @$cmd, '-incoming', $migrate->{uri};
 	    push @$cmd, '-S';
 
 	} elsif ($statefile eq 'unix') {
 	    # should be default for secure migrations as a ssh TCP forward
 	    # tunnel is not deterministic reliable ready and fails regurarly
 	    # to set up in time, so use UNIX socket forwards
-	    my $socket_addr = "/run/qemu-server/$vmid.migrate";
-	    unlink $socket_addr;
+	    my $migrate = $res->{migrate} = { proto => 'unix' };
+	    $migrate->{addr} = "/run/qemu-server/$vmid.migrate";
+	    unlink $migrate->{addr};
 
-	    $migrate_uri = "unix:$socket_addr";
-
-	    push @$cmd, '-incoming', $migrate_uri;
+	    $migrate->{uri} = "unix:$migrate->{addr}";
+	    push @$cmd, '-incoming', $migrate->{uri};
 	    push @$cmd, '-S';
 
 	} elsif (-e $statefile) {
@@ -5654,10 +5654,9 @@ sub vm_start_nolock {
     eval { PVE::QemuServer::PCI::reserve_pci_usage($pci_id_list, $vmid, undef, $pid) };
     warn $@ if $@;
 
-    print "migration listens on $migrate_uri\n" if $migrate_uri;
-    $res->{migrate_uri} = $migrate_uri;
-
-    if ($statefile && $statefile ne 'tcp' && $statefile ne 'unix')  {
+    if (defined($res->{migrate})) {
+	print "migration listens on $res->{migrate}->{uri}\n";
+    } elsif ($statefile) {
 	eval { mon_cmd($vmid, "cont"); };
 	warn $@ if $@;
     }
@@ -5672,6 +5671,7 @@ sub vm_start_nolock {
 	    my $socket_path = "/run/qemu-server/$vmid\_nbd.migrate";
 	    mon_cmd($vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $socket_path } } );
 	    $migrate_storage_uri = "nbd:unix:$socket_path";
+	    $res->{migrate}->{unix_sockets} = [$socket_path];
 	} else {
 	    my $nodename = nodename();
 	    my $localip = $get_migration_ip->($nodename);
@@ -5689,8 +5689,6 @@ sub vm_start_nolock {
 	    $migrate_storage_uri = "nbd:${localip}:${storage_migrate_port}";
 	}
 
-	$res->{migrate_storage_uri} = $migrate_storage_uri;
-
 	foreach my $opt (sort keys %$nbd) {
 	    my $drivestr = $nbd->{$opt}->{drivestr};
 	    my $volid = $nbd->{$opt}->{volid};
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (14 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 08/10] migrate: refactor remote VM/tunnel start Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-04 13:58   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 10/10] api: add remote migrate endpoint Fabian Grünbichler
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

remote migration uses a websocket connection to a task worker running on
the target node instead of commands via SSH to control the migration.
this websocket tunnel is started earlier than the SSH tunnel, and allows
adding UNIX-socket forwarding over additional websocket connections
on-demand.

the main differences to regular intra-cluster migration are:
- source VM config and disks are only removed upon request via --delete
- shared storages are treated like local storages, since we can't
assume they are shared across clusters (with potential to extend this
by marking storages as shared)
- NBD migrated disks are explicitly pre-allocated on the target node via
tunnel command before starting the target VM instance
- in addition to storages, network bridges and the VMID itself are
transformed via a user-defined mapping
- all commands and migration data streams are sent via a WS tunnel proxy
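
for orientation, the remote endpoint info consumed by the new
fork_websocket_tunnel below could look like this (all values are
placeholders):

    my $remote = {
        vmid => 4711,
        conn => {
            host => 'target.example.com',
            port => 8006,
        },
    };
    # websocket upgrade URL derived from it:
    # https://target.example.com:8006/api2/json/nodes/<node>/qemu/4711/mtunnelwebsocket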

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    requires bumped pve-guest-common
    requires proxmox-websocket-tunnel

    v3:
    - move WS tunnel helpers to pve-guest-common-perl
    - check bridge mapping early
    - fix misplaced parentheses

    v2:
    - improve tunnel version info printing and error handling
    - don't cleanup unix sockets twice
    - url escape remote socket path
    - cleanup nits and small issues

 PVE/API2/Qemu.pm   |   2 +-
 PVE/QemuMigrate.pm | 425 +++++++++++++++++++++++++++++++++++++--------
 PVE/QemuServer.pm  |   7 +-
 3 files changed, 359 insertions(+), 75 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index d188b77..cf90fe7 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4803,7 +4803,7 @@ __PACKAGE__->register_method({
 		    # bump/reset for breaking changes
 		    # bump/bump for opt-in changes
 		    return {
-			api => 2,
+			api => $PVE::QemuMigrate::WS_TUNNEL_VERSION,
 			age => 0,
 		    };
 		},
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 897018b..3e9d29e 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -5,11 +5,10 @@ use warnings;
 
 use IO::File;
 use IPC::Open2;
-use POSIX qw( WNOHANG );
 use Time::HiRes qw( usleep );
 
-use PVE::Format qw(render_bytes);
 use PVE::Cluster;
+use PVE::Format qw(render_bytes);
 use PVE::GuestHelpers qw(safe_boolean_ne safe_string_ne);
 use PVE::INotify;
 use PVE::RPCEnvironment;
@@ -17,6 +16,7 @@ use PVE::Replication;
 use PVE::ReplicationConfig;
 use PVE::ReplicationState;
 use PVE::Storage;
+use PVE::StorageTunnel;
 use PVE::Tools;
 use PVE::Tunnel;
 
@@ -31,6 +31,9 @@ use PVE::QemuServer;
 use PVE::AbstractMigrate;
 use base qw(PVE::AbstractMigrate);
 
+# compared against remote end's minimum version
+our $WS_TUNNEL_VERSION = 2;
+
 sub fork_tunnel {
     my ($self, $ssh_forward_info) = @_;
 
@@ -43,6 +46,35 @@ sub fork_tunnel {
     return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
 }
 
+sub fork_websocket_tunnel {
+    my ($self, $storages, $bridges) = @_;
+
+    my $remote = $self->{opts}->{remote};
+    my $conn = $remote->{conn};
+
+    my $websocket_url = "https://$conn->{host}:$conn->{port}/api2/json/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnelwebsocket";
+    my $url = "/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnel";
+
+    my $tunnel_params = {
+	url => $websocket_url,
+    };
+
+    my $storage_list = join(',', keys %$storages);
+    my $bridge_list = join(',', keys %$bridges);
+
+    my $req_params = {
+	storages => $storage_list,
+	bridges => $bridge_list,
+    };
+
+    my $log = sub {
+	my $line = shift;
+	$self->log('info', $line);
+    };
+
+    return PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
+}
+
 # tunnel_info:
 #   proto: unix (secure) or tcp (insecure/legacy compat)
 #   addr: IP or UNIX socket path
@@ -175,23 +207,34 @@ sub prepare {
     }
 
     my $vollist = PVE::QemuServer::get_vm_volumes($conf);
+
+    my $storages = {};
     foreach my $volid (@$vollist) {
 	my ($sid, $volname) = PVE::Storage::parse_volume_id($volid, 1);
 
-	# check if storage is available on both nodes
+	# check if storage is available on source node
 	my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
 
 	my $targetsid = $sid;
-	# NOTE: we currently ignore shared source storages in mappings so skip here too for now
-	if (!$scfg->{shared}) {
+	# NOTE: local ignores shared mappings, remote maps them
+	if (!$scfg->{shared} || $self->{opts}->{remote}) {
 	    $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
 	}
 
-	my $target_scfg = PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
-	my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
+	$storages->{$targetsid} = 1;
 
-	die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
-	    if !$target_scfg->{content}->{$vtype};
+	if (!$self->{opts}->{remote}) {
+	    # check if storage is available on target node
+	    my $target_scfg = PVE::Storage::storage_check_enabled(
+		$storecfg,
+		$targetsid,
+		$self->{node},
+	    );
+	    my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
+
+	    die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
+		if !$target_scfg->{content}->{$vtype};
+	}
 
 	if ($scfg->{shared}) {
 	    # PVE::Storage::activate_storage checks this for non-shared storages
@@ -201,10 +244,27 @@ sub prepare {
 	}
     }
 
-    # test ssh connection
-    my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
-    eval { $self->cmd_quiet($cmd); };
-    die "Can't connect to destination address using public key\n" if $@;
+    if ($self->{opts}->{remote}) {
+	# test & establish websocket connection
+	my $bridges = map_bridges($conf, $self->{opts}->{bridgemap}, 1);
+	my $tunnel = $self->fork_websocket_tunnel($storages, $bridges);
+	my $min_version = $tunnel->{version} - $tunnel->{age};
+	$self->log('info', "local WS tunnel version: $WS_TUNNEL_VERSION");
+	$self->log('info', "remote WS tunnel version: $tunnel->{version}");
+	$self->log('info', "minimum required WS tunnel version: $min_version");
+	die "Remote tunnel endpoint not compatible, upgrade required\n"
+	    if $WS_TUNNEL_VERSION < $min_version;
+	die "Remote tunnel endpoint too old, upgrade required\n"
+	    if $WS_TUNNEL_VERSION > $tunnel->{version};
+
+	print "websocket tunnel started\n";
+	$self->{tunnel} = $tunnel;
+    } else {
+	# test ssh connection
+	my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
+	eval { $self->cmd_quiet($cmd); };
+	die "Can't connect to destination address using public key\n" if $@;
+    }
 
     return $running;
 }
@@ -242,7 +302,7 @@ sub scan_local_volumes {
 	my @sids = PVE::Storage::storage_ids($storecfg);
 	foreach my $storeid (@sids) {
 	    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
-	    next if $scfg->{shared};
+	    next if $scfg->{shared} && !$self->{opts}->{remote};
 	    next if !PVE::Storage::storage_check_enabled($storecfg, $storeid, undef, 1);
 
 	    # get list from PVE::Storage (for unused volumes)
@@ -251,19 +311,24 @@ sub scan_local_volumes {
 	    next if @{$dl->{$storeid}} == 0;
 
 	    my $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $storeid);
-	    # check if storage is available on target node
-	    my $target_scfg = PVE::Storage::storage_check_enabled(
-		$storecfg,
-		$targetsid,
-		$self->{node},
-	    );
-
-	    die "content type 'images' is not available on storage '$targetsid'\n"
-		if !$target_scfg->{content}->{images};
+	    my $bwlimit_sids = [$storeid];
+	    if (!$self->{opts}->{remote}) {
+		# check if storage is available on target node
+		my $target_scfg = PVE::Storage::storage_check_enabled(
+		    $storecfg,
+		    $targetsid,
+		    $self->{node},
+		);
+
+		die "content type 'images' is not available on storage '$targetsid'\n"
+		    if !$target_scfg->{content}->{images};
+
+		push @$bwlimit_sids, $targetsid;
+	    }
 
 	    my $bwlimit = PVE::Storage::get_bandwidth_limit(
 		'migration',
-		[$targetsid, $storeid],
+		$bwlimit_sids,
 		$self->{opts}->{bwlimit},
 	    );
 
@@ -319,14 +384,17 @@ sub scan_local_volumes {
 	    my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
 
 	    my $targetsid = $sid;
-	    # NOTE: we currently ignore shared source storages in mappings so skip here too for now
-	    if (!$scfg->{shared}) {
+	    # NOTE: local ignores shared mappings, remote maps them
+	    if (!$scfg->{shared} || $self->{opts}->{remote}) {
 		$targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
 	    }
 
-	    PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
+	    # check target storage on target node if intra-cluster migration
+	    if (!$self->{opts}->{remote}) {
+		PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
 
-	    return if $scfg->{shared};
+		return if $scfg->{shared};
+	    }
 
 	    $local_volumes->{$volid}->{ref} = $attr->{referenced_in_config} ? 'config' : 'snapshot';
 	    $local_volumes->{$volid}->{ref} = 'storage' if $attr->{is_unused};
@@ -415,6 +483,9 @@ sub scan_local_volumes {
 
 	    my $migratable = $scfg->{type} =~ /^(?:dir|btrfs|zfspool|lvmthin|lvm)$/;
 
+	    # TODO: what is this even here for?
+	    $migratable = 1 if $self->{opts}->{remote};
+
 	    die "can't migrate '$volid' - storage type '$scfg->{type}' not supported\n"
 		if !$migratable;
 
@@ -449,6 +520,10 @@ sub handle_replication {
     my $local_volumes = $self->{local_volumes};
 
     return if !$self->{replication_jobcfg};
+
+    die "can't migrate VM with replicated volumes to remote cluster/node\n"
+	if $self->{opts}->{remote};
+
     if ($self->{running}) {
 
 	my $version = PVE::QemuServer::kvm_user_version();
@@ -548,24 +623,51 @@ sub sync_offline_local_volumes {
     $self->log('info', "copying local disk images") if scalar(@volids);
 
     foreach my $volid (@volids) {
-	my $targetsid = $local_volumes->{$volid}->{targetsid};
-	my $bwlimit = $local_volumes->{$volid}->{bwlimit};
-	$bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
-
-	my $storage_migrate_opts = {
-	    'ratelimit_bps' => $bwlimit,
-	    'insecure' => $opts->{migration_type} eq 'insecure',
-	    'with_snapshots' => $local_volumes->{$volid}->{snapshots},
-	    'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
-	};
+	my $new_volid;
 
-	my $logfunc = sub { $self->log('info', $_[0]); };
-	my $new_volid = eval {
-	    PVE::Storage::storage_migrate($storecfg, $volid, $self->{ssh_info},
-					  $targetsid, $storage_migrate_opts, $logfunc);
-	};
-	if (my $err = $@) {
-	    die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
+	my $opts = $self->{opts};
+	if ($opts->{remote}) {
+	    my $log = sub {
+		my $line = shift;
+		$self->log('info', $line);
+	    };
+
+	    $new_volid = PVE::StorageTunnel::storage_migrate(
+		$self->{tunnel},
+		$storecfg,
+		$volid,
+		$self->{vmid},
+		$opts->{remote}->{vmid},
+		$local_volumes->{$volid},
+		$log,
+	    );
+	} else {
+	    my $targetsid = $local_volumes->{$volid}->{targetsid};
+
+	    my $bwlimit = $local_volumes->{$volid}->{bwlimit};
+	    $bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
+
+	    my $storage_migrate_opts = {
+		'ratelimit_bps' => $bwlimit,
+		'insecure' => $opts->{migration_type} eq 'insecure',
+		'with_snapshots' => $local_volumes->{$volid}->{snapshots},
+		'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
+	    };
+
+	    my $logfunc = sub { $self->log('info', $_[0]); };
+	    $new_volid = eval {
+		PVE::Storage::storage_migrate(
+		    $storecfg,
+		    $volid,
+		    $self->{ssh_info},
+		    $targetsid,
+		    $storage_migrate_opts,
+		    $logfunc,
+		);
+	    };
+	    if (my $err = $@) {
+		die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
+	    }
 	}
 
 	$self->{volume_map}->{$volid} = $new_volid;
@@ -581,6 +683,12 @@ sub sync_offline_local_volumes {
 sub cleanup_remotedisks {
     my ($self) = @_;
 
+    if ($self->{opts}->{remote}) {
+	PVE::Tunnel::finish_tunnel($self->{tunnel}, 1);
+	delete $self->{tunnel};
+	return;
+    }
+
     my $local_volumes = $self->{local_volumes};
 
     foreach my $volid (values %{$self->{volume_map}}) {
@@ -630,8 +738,100 @@ sub phase1 {
     $self->handle_replication($vmid);
 
     $self->sync_offline_local_volumes();
+    $self->phase1_remote($vmid) if $self->{opts}->{remote};
 };
 
+sub map_bridges {
+    my ($conf, $map, $scan_only) = @_;
+
+    my $bridges = {};
+
+    foreach my $opt (keys %$conf) {
+	next if $opt !~ m/^net\d+$/;
+
+	next if !$conf->{$opt};
+	my $d = PVE::QemuServer::parse_net($conf->{$opt});
+	next if !$d || !$d->{bridge};
+
+	my $target_bridge = PVE::QemuServer::map_id($map, $d->{bridge});
+	$bridges->{$target_bridge}->{$opt} = $d->{bridge};
+
+	next if $scan_only;
+
+	$d->{bridge} = $target_bridge;
+	$conf->{$opt} = PVE::QemuServer::print_net($d);
+    }
+
+    return $bridges;
+}
+
+sub phase1_remote {
+    my ($self, $vmid) = @_;
+
+    my $remote_conf = PVE::QemuConfig->load_config($vmid);
+    PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
+
+    my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
+    for my $target (keys $bridges->%*) {
+	for my $nic (keys $bridges->{$target}->%*) {
+	    $self->log('info', "mapped: $nic from $bridges->{$target}->{$nic} to $target");
+	}
+    }
+
+    my @online_local_volumes = $self->filter_local_volumes('online');
+
+    my $storage_map = $self->{opts}->{storagemap};
+    $self->{nbd} = {};
+    PVE::QemuConfig->foreach_volume($remote_conf, sub {
+	my ($ds, $drive) = @_;
+
+	# TODO eject CDROM?
+	return if PVE::QemuServer::drive_is_cdrom($drive);
+
+	my $volid = $drive->{file};
+	return if !$volid;
+
+	return if !grep { $_ eq $volid } @online_local_volumes;
+
+	my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
+	my $scfg = PVE::Storage::storage_config($self->{storecfg}, $storeid);
+	my $source_format = PVE::QemuServer::qemu_img_format($scfg, $volname);
+
+	# set by target cluster
+	my $oldvolid = delete $drive->{file};
+	delete $drive->{format};
+
+	my $targetsid = PVE::QemuServer::map_id($storage_map, $storeid);
+
+	my $params = {
+	    format => $source_format,
+	    storage => $targetsid,
+	    drive => $drive,
+	};
+
+	$self->log('info', "Allocating volume for drive '$ds' on remote storage '$targetsid'..");
+	my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'disk', $params);
+
+	$self->log('info', "volume '$oldvolid' is '$res->{volid}' on the target");
+	$remote_conf->{$ds} = $res->{drivestr};
+	$self->{nbd}->{$ds} = $res;
+    });
+
+    my $conf_str = PVE::QemuServer::write_vm_config("remote", $remote_conf);
+
+    # TODO expose in PVE::Firewall?
+    my $vm_fw_conf_path = "/etc/pve/firewall/$vmid.fw";
+    my $fw_conf_str;
+    $fw_conf_str = PVE::Tools::file_get_contents($vm_fw_conf_path)
+	if -e $vm_fw_conf_path;
+    my $params = {
+	conf => $conf_str,
+	'firewall-config' => $fw_conf_str,
+    };
+
+    PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'config', $params);
+}
+
 sub phase1_cleanup {
     my ($self, $vmid, $err) = @_;
 
@@ -662,7 +862,6 @@ sub phase2_start_local_cluster {
     my $local_volumes = $self->{local_volumes};
     my @online_local_volumes = $self->filter_local_volumes('online');
 
-    $self->{storage_migration} = 1 if scalar(@online_local_volumes);
     my $start = $params->{start_params};
     my $migrate = $params->{migrate_opts};
 
@@ -793,10 +992,34 @@ sub phase2_start_local_cluster {
     return ($tunnel_info, $spice_port);
 }
 
+sub phase2_start_remote_cluster {
+    my ($self, $vmid, $params) = @_;
+
+    die "insecure migration to remote cluster not implemented\n"
+	if $params->{migrate_opts}->{type} ne 'websocket';
+
+    my $remote_vmid = $self->{opts}->{remote}->{vmid};
+
+    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 10, "start", $params);
+
+    foreach my $drive (keys %{$res->{drives}}) {
+	$self->{stopnbd} = 1;
+	$self->{target_drive}->{$drive}->{drivestr} = $res->{drives}->{$drive}->{drivestr};
+	my $nbd_uri = $res->{drives}->{$drive}->{nbd_uri};
+	die "unexpected NBD uri for '$drive': $nbd_uri\n"
+	    if $nbd_uri !~ s!/run/qemu-server/$remote_vmid\_!/run/qemu-server/$vmid\_!;
+
+	$self->{target_drive}->{$drive}->{nbd_uri} = $nbd_uri;
+    }
+
+    return ($res->{migrate}, $res->{spice_port});
+}
+
 sub phase2 {
     my ($self, $vmid) = @_;
 
     my $conf = $self->{vmconf};
+    my $local_volumes = $self->{local_volumes};
 
     # version > 0 for unix socket support
     my $nbd_protocol_version = 1;
@@ -828,10 +1051,39 @@ sub phase2 {
 	},
     };
 
-    my ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
+    my ($tunnel_info, $spice_port);
+
+    my @online_local_volumes = $self->filter_local_volumes('online');
+    $self->{storage_migration} = 1 if scalar(@online_local_volumes);
+
+    if (my $remote = $self->{opts}->{remote}) {
+	my $remote_vmid = $remote->{vmid};
+	$params->{migrate_opts}->{remote_node} = $self->{node};
+	($tunnel_info, $spice_port) = $self->phase2_start_remote_cluster($vmid, $params);
+	die "only UNIX sockets are supported for remote migration\n"
+	    if $tunnel_info->{proto} ne 'unix';
+
+	my $remote_socket = $tunnel_info->{addr};
+	my $local_socket = $remote_socket;
+	$local_socket =~ s/$remote_vmid/$vmid/g;
+	$tunnel_info->{addr} = $local_socket;
+
+	$self->log('info', "Setting up tunnel for '$local_socket'");
+	PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+
+	foreach my $remote_socket (@{$tunnel_info->{unix_sockets}}) {
+	    my $local_socket = $remote_socket;
+	    $local_socket =~ s/$remote_vmid/$vmid/g;
+	    next if $self->{tunnel}->{forwarded}->{$local_socket};
+	    $self->log('info', "Setting up tunnel for '$local_socket'");
+	    PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+	}
+    } else {
+	($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
 
-    $self->log('info', "start remote tunnel");
-    $self->start_remote_tunnel($tunnel_info);
+	$self->log('info', "start remote tunnel");
+	$self->start_remote_tunnel($tunnel_info);
+    }
 
     my $migrate_uri = "$tunnel_info->{proto}:$tunnel_info->{addr}";
     $migrate_uri .= ":$tunnel_info->{port}"
@@ -841,8 +1093,6 @@ sub phase2 {
 	$self->{storage_migration_jobs} = {};
 	$self->log('info', "starting storage migration");
 
-	my @online_local_volumes = $self->filter_local_volumes('online');
-
 	die "The number of local disks does not match between the source and the destination.\n"
 	    if (scalar(keys %{$self->{target_drive}}) != scalar(@online_local_volumes));
 	foreach my $drive (keys %{$self->{target_drive}}){
@@ -915,7 +1165,7 @@ sub phase2 {
     };
     $self->log('info', "migrate-set-parameters error: $@") if $@;
 
-    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
+    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && !$self->{opts}->{remote}) {
 	my $rpcenv = PVE::RPCEnvironment::get();
 	my $authuser = $rpcenv->get_user();
 
@@ -1112,11 +1362,15 @@ sub phase2_cleanup {
 
     my $nodename = PVE::INotify::nodename();
 
-    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
-    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
-    if (my $err = $@) {
-        $self->log('err', $err);
-        $self->{errors} = 1;
+    if ($self->{tunnel} && $self->{tunnel}->{version} >= 2) {
+	PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'stop');
+    } else {
+	my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
+	eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+	if (my $err = $@) {
+	    $self->log('err', $err);
+	    $self->{errors} = 1;
+	}
     }
 
     # cleanup after stopping, otherwise disks might be in-use by target VM!
@@ -1149,7 +1403,7 @@ sub phase3_cleanup {
 
     my $tunnel = $self->{tunnel};
 
-    if ($self->{volume_map}) {
+    if ($self->{volume_map} && !$self->{opts}->{remote}) {
 	my $target_drives = $self->{target_drive};
 
 	# FIXME: for NBD storage migration we now only update the volid, and
@@ -1165,27 +1419,34 @@ sub phase3_cleanup {
     }
 
     # transfer replication state before move config
-    $self->transfer_replication_state() if $self->{is_replicated};
-    PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
-    $self->switch_replication_job_target() if $self->{is_replicated};
+    if (!$self->{opts}->{remote}) {
+	$self->transfer_replication_state() if $self->{is_replicated};
+	PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
+	$self->switch_replication_job_target() if $self->{is_replicated};
+    }
 
     if ($self->{livemigration}) {
 	if ($self->{stopnbd}) {
 	    $self->log('info', "stopping NBD storage migration server on target.");
 	    # stop nbd server on remote vm - requirement for resume since 2.9
-	    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
+	    if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 2) {
+		PVE::Tunnel::write_tunnel($tunnel, 30, 'nbdstop');
+	    } else {
+		my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
 
-	    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
-	    if (my $err = $@) {
-		$self->log('err', $err);
-		$self->{errors} = 1;
+		eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+		if (my $err = $@) {
+		    $self->log('err', $err);
+		    $self->{errors} = 1;
+		}
 	    }
 	}
 
 	# config moved and nbd server stopped - now we can resume vm on target
 	if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 1) {
+	    my $cmd = $tunnel->{version} == 1 ? "resume $vmid" : "resume";
 	    eval {
-		PVE::Tunnel::write_tunnel($tunnel, 30, "resume $vmid");
+		PVE::Tunnel::write_tunnel($tunnel, 30, $cmd);
 	    };
 	    if (my $err = $@) {
 		$self->log('err', $err);
@@ -1205,18 +1466,24 @@ sub phase3_cleanup {
 	}
 
 	if ($self->{storage_migration} && PVE::QemuServer::parse_guest_agent($conf)->{fstrim_cloned_disks} && $self->{running}) {
-	    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
-	    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+	    if ($self->{opts}->{remote}) {
+		PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'fstrim');
+	    } else {
+		my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
+		eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+	    }
 	}
     }
 
     # close tunnel on successful migration, on error phase2_cleanup closed it
-    if ($tunnel) {
+    if ($tunnel && $tunnel->{version} == 1) {
 	eval { PVE::Tunnel::finish_tunnel($tunnel); };
 	if (my $err = $@) {
 	    $self->log('err', $err);
 	    $self->{errors} = 1;
 	}
+	$tunnel = undef;
+	delete $self->{tunnel};
     }
 
     eval {
@@ -1254,6 +1521,9 @@ sub phase3_cleanup {
 
     # destroy local copies
     foreach my $volid (@not_replicated_volumes) {
+	# remote is cleaned up below
+	next if $self->{opts}->{remote};
+
 	eval { PVE::Storage::vdisk_free($self->{storecfg}, $volid); };
 	if (my $err = $@) {
 	    $self->log('err', "removing local copy of '$volid' failed - $err");
@@ -1263,8 +1533,19 @@ sub phase3_cleanup {
     }
 
     # clear migrate lock
-    my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
-    $self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
+    if ($tunnel && $tunnel->{version} >= 2) {
+	PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
+
+	PVE::Tunnel::finish_tunnel($tunnel);
+    } else {
+	my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
+	$self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
+    }
+
+    if ($self->{opts}->{remote} && $self->{opts}->{delete}) {
+	eval { PVE::QemuServer::destroy_vm($self->{storecfg}, $vmid, 1, undef, 0) };
+	warn "Failed to remove source VM - $@\n" if $@;
+    }
 }
 
 sub final_cleanup {
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 9971f2c..9490615 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5425,7 +5425,10 @@ sub vm_start_nolock {
     my $defaults = load_defaults();
 
     # set environment variable useful inside network script
-    $ENV{PVE_MIGRATED_FROM} = $migratedfrom if $migratedfrom;
+    # for remote migration the config is available on the target node!
+    if (!$migrate_opts->{remote_node}) {
+	$ENV{PVE_MIGRATED_FROM} = $migratedfrom;
+    }
 
     PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
 
@@ -5667,7 +5670,7 @@ sub vm_start_nolock {
 
 	my $migrate_storage_uri;
 	# nbd_protocol_version > 0 for unix socket support
-	if ($nbd_protocol_version > 0 && $migration_type eq 'secure') {
+	if ($nbd_protocol_version > 0 && ($migration_type eq 'secure' || $migration_type eq 'websocket')) {
 	    my $socket_path = "/run/qemu-server/$vmid\_nbd.migrate";
 	    mon_cmd($vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $socket_path } } );
 	    $migrate_storage_uri = "nbd:unix:$socket_path";
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 10/10] api: add remote migrate endpoint
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (15 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 1/4] volname_for_storage: parse volname before calling Fabian Grünbichler
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
Notes:
    I left the checks here, but for the final version / proper CLI they
    should move to a separate API endpoint or even just be done
    client-side.
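
    for reference, a hypothetical invocation of the new endpoint via
    PVE::APIClient::LWP - host names, token values and mappings below
    are placeholders, not tested values:

        my $client = PVE::APIClient::LWP->new(
            protocol => 'https',
            host => 'source-cluster.example.com',
            port => 8006,
            apitoken => 'PVEAPIToken=root@pam!migrate=XXXX',
        );
        my $upid = $client->post('/nodes/srcnode/qemu/100/remote_migrate', {
            'target-node' => 'dstnode',
            'target-endpoint' => 'host=target.example.com,apitoken=PVEAPIToken=root@pam!migrate=YYYY,fingerprint=AA:BB:...',
            'target-storage' => 'local-zfs:target-zfs',
            'target-bridge' => 'vmbr0',
            online => 1,
        });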

 PVE/API2/Qemu.pm | 206 ++++++++++++++++++++++++++++++++++++++++++++++-
 debian/control   |   2 +
 2 files changed, 206 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index cf90fe7..002d0b7 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -12,6 +12,7 @@ use URI::Escape;
 use Crypt::OpenSSL::Random;
 use Socket qw(SOCK_STREAM);
 
+use PVE::APIClient::LWP;
 use PVE::Cluster qw (cfs_read_file cfs_write_file);;
 use PVE::RRD;
 use PVE::SafeSyslog;
@@ -51,8 +52,6 @@ BEGIN {
     }
 }
 
-use Data::Dumper; # fixme: remove
-
 use base qw(PVE::RESTHandler);
 
 my $opt_force_description = "Force physical removal. Without this, we simple remove the disk from the config file and create an additional configuration entry called 'unused[n]', which contains the volume ID. Unlink of unused[n] always cause physical removal.";
@@ -4000,6 +3999,198 @@ __PACKAGE__->register_method({
 
     }});
 
+__PACKAGE__->register_method({
+    name => 'remote_migrate_vm',
+    path => '{vmid}/remote_migrate',
+    method => 'POST',
+    protected => 1,
+    proxyto => 'node',
+    description => "Migrate virtual machine to a remote cluster. Creates a new migration task.",
+    permissions => {
+	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::QemuServer::complete_vmid }),
+	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
+	    'target-node' => get_standard_option('pve-node', {
+		description => "Target node on remote cluster.",
+            }),
+	    'target-endpoint' => get_standard_option('proxmox-remote', {
+		description => "Remote target endpoint",
+	    }),
+	    online => {
+		type => 'boolean',
+		description => "Use online/live migration if VM is running. Ignored if VM is stopped.",
+		optional => 1,
+	    },
+	    'with-local-disks' => {
+		type => 'boolean',
+		description => "Enable live storage migration for local disk",
+		optional => 1,
+	    },
+	    delete => {
+		type => 'boolean',
+		description => "Delete the original VM and related data after successful migration. By default the original VM is kept on the source cluster in a stopped state.",
+		optional => 1,
+		default => 0,
+	    },
+            'target-storage' => get_standard_option('pve-targetstorage', {
+		completion => \&PVE::QemuServer::complete_migration_storage,
+		optional => 0,
+            }),
+	    'target-bridge' => {
+		type => 'string',
+		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
+		format => 'bridge-pair-list',
+	    },
+	    bwlimit => {
+		description => "Override I/O bandwidth limit (in KiB/s).",
+		optional => 1,
+		type => 'integer',
+		minimum => '0',
+		default => 'migrate limit from datacenter or storage config',
+	    },
+	},
+    },
+    returns => {
+	type => 'string',
+	description => "the task ID.",
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $source_vmid = extract_param($param, 'vmid');
+	my $target_endpoint = extract_param($param, 'target-endpoint');
+	my $target_node = extract_param($param, 'target-node');
+	my $target_vmid = extract_param($param, 'target-vmid') // $source_vmid;
+
+	my $delete = extract_param($param, 'delete') // 0;
+
+	PVE::Cluster::check_cfs_quorum();
+
+	# test if VM exists
+	my $conf = PVE::QemuConfig->load_config($source_vmid);
+
+	PVE::QemuConfig->check_lock($conf);
+
+	raise_param_exc({ vmid => "cannot migrate HA-managed VM to remote cluster" })
+	    if PVE::HA::Config::vm_is_ha_managed($source_vmid);
+
+	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
+
+	# TODO: move this as helper somewhere appropriate?
+	my $conn_args = {
+	    protocol => 'https',
+	    host => $remote->{host},
+	    port => $remote->{port} // 8006,
+	    apitoken => $remote->{apitoken},
+	};
+
+	my $fp;
+	if ($fp = $remote->{fingerprint}) {
+	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 };
+	}
+
+	print "Establishing API connection with remote at '$remote->{host}'\n";
+
+	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
+	my $version = $api_client->get("/version");
+	print "remote: version '$version->{version}'\n";
+
+	if (!defined($fp)) {
+	    my $cert_info = $api_client->get("/nodes/$target_node/certificates/info");
+	    foreach my $cert (@$cert_info) {
+		my $filename = $cert->{filename};
+		next if $filename ne 'pveproxy-ssl.pem' && $filename ne 'pve-ssl.pem';
+		$fp = $cert->{fingerprint} if !$fp || $filename eq 'pveproxy-ssl.pem';
+	    }
+	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 }
+		if defined($fp);
+	}
+
+	if (PVE::QemuServer::check_running($source_vmid)) {
+	    die "can't migrate running VM without --online\n" if !$param->{online};
+
+	    my $repl_conf = PVE::ReplicationConfig->new();
+	    my $is_replicated = $repl_conf->check_for_existing_jobs($source_vmid, 1);
+	    die "cannot remote-migrate replicated VM\n" if $is_replicated;
+	} else {
+	    warn "VM isn't running. Doing offline migration instead.\n" if $param->{online};
+	    $param->{online} = 0;
+	}
+
+	# FIXME: fork worker here to avoid timeout? or poll these periodically
+	# in pvestatd and access cached info here? all of the below is actually
+	# checked at the remote end anyway once we call the mtunnel endpoint,
+	# we could also punt it to the client and not do it here at all..
+	my $resources = $api_client->get("/cluster/resources");
+	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
+	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
+	}
+
+	my $storages = [ grep { $_->{type} eq 'storage' && $_->{node} eq $target_node } @$resources ];
+	my $storecfg = PVE::Storage::config();
+	my $target_storage = extract_param($param, 'target-storage');
+	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
+	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
+	    if $@;
+
+	my $target_bridge = extract_param($param, 'target-bridge');
+	my $bridgemap = eval { PVE::JSONSchema::parse_idmap($target_bridge, 'pve-bridge-id') };
+	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
+	    if $@;
+
+	my $check_remote_storage = sub {
+	    my ($storage) = @_;
+	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
+	    die "remote: storage '$storage' does not exist!\n"
+		if !@$found;
+
+	    $found = $found->[0];
+
+	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
+	    die "remote: storage '$storage' cannot store images\n"
+		if !grep { $_ eq 'images' } @$content_types;
+	};
+
+	foreach my $target_sid (values %{$storagemap->{entries}}) {
+	    $check_remote_storage->($target_sid);
+	}
+
+	$check_remote_storage->($storagemap->{default})
+	    if $storagemap->{default};
+
+	# TODO: or check all referenced storages?
+	die "remote migration requires explicit storage mapping!\n"
+	    if $storagemap->{identity};
+
+	$param->{storagemap} = $storagemap;
+	$param->{bridgemap} = $bridgemap;
+	$param->{remote} = {
+	    conn => $conn_args, # re-use fingerprint for tunnel
+	    client => $api_client,
+	    vmid => $target_vmid,
+	};
+	$param->{migration_type} = 'websocket';
+	$param->{delete} = $delete if $delete;
+
+	my $realcmd = sub {
+	    PVE::QemuMigrate->migrate($target_node, $remote->{host}, $source_vmid, $param);
+	};
+
+	my $worker = sub {
+	    return PVE::GuestHelpers::guest_migration_lock($source_vmid, 10, $realcmd);
+	};
+
+	return $rpcenv->fork_worker('qmigrate', $source_vmid, $authuser, $worker);
+    }});
+
 __PACKAGE__->register_method({
     name => 'monitor',
     path => '{vmid}/monitor',
@@ -4678,6 +4869,12 @@ __PACKAGE__->register_method({
 		optional => 1,
 		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
 	    },
+	    bridges => {
+		type => 'string',
+		format => 'pve-bridge-id-list',
+		optional => 1,
+		description => 'List of network bridges to check availability. Will be checked again for actually used bridges during migration.',
+	    },
 	},
     },
     returns => {
@@ -4698,12 +4895,17 @@ __PACKAGE__->register_method({
 	my $vmid = extract_param($param, 'vmid');
 
 	my $storages = extract_param($param, 'storages');
+	my $bridges = extract_param($param, 'bridges');
 
 	my $storecfg = PVE::Storage::config();
 	foreach my $storeid (PVE::Tools::split_list($storages)) {
 	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
 	}
 
+	foreach my $bridge (PVE::Tools::split_list($bridges)) {
+	    PVE::Network::read_bridge_mtu($bridge);
+	}
+
 	PVE::Cluster::check_cfs_quorum();
 
 	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
diff --git a/debian/control b/debian/control
index 6cf471a..3e91dd6 100644
--- a/debian/control
+++ b/debian/control
@@ -6,6 +6,7 @@ Build-Depends: debhelper (>= 12~),
                libglib2.0-dev,
                libio-multiplex-perl,
                libjson-c-dev,
+               libpve-apiclient-perl,
                libpve-cluster-perl,
                libpve-common-perl (>= 7.0-14),
                libpve-guest-common-perl (>= 3.1-3),
@@ -34,6 +35,7 @@ Depends: dbus,
          libjson-xs-perl,
          libnet-ssleay-perl,
          libpve-access-control (>= 5.0-7),
+         libpve-apiclient-perl,
          libpve-cluster-perl,
          libpve-common-perl (>= 7.0-14),
          libpve-guest-common-perl (>= 3.1-3),
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 storage 1/4] volname_for_storage: parse volname before calling
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (16 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 10/10] api: add remote migrate endpoint Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision Fabian Grünbichler
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

to allow reusing this with remote migration, where parsing of the source
volid has to happen on the source node, but this call has to happen on
the target node.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
new in v3

 PVE/Storage.pm | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/PVE/Storage.pm b/PVE/Storage.pm
index d64019f..05be3dd 100755
--- a/PVE/Storage.pm
+++ b/PVE/Storage.pm
@@ -623,19 +623,20 @@ sub abs_filesystem_path {
     return $path;
 }
 
+# used as last resort to adapt volnames when migrating
 my $volname_for_storage = sub {
-    my ($cfg, $volid, $target_storeid) = @_;
+    my ($cfg, $storeid, $name, $vmid, $format) = @_;
 
-    my (undef, $name, $vmid, undef, undef, undef, $format) = parse_volname($cfg, $volid);
-    my $target_scfg = storage_config($cfg, $target_storeid);
+    my $scfg = storage_config($cfg, $storeid);
 
-    my (undef, $valid_formats) = PVE::Storage::Plugin::default_format($target_scfg);
+    my (undef, $valid_formats) = PVE::Storage::Plugin::default_format($scfg);
     my $format_is_valid = grep { $_ eq $format } @$valid_formats;
-    die "unsupported format '$format' for storage type $target_scfg->{type}\n" if !$format_is_valid;
+    die "unsupported format '$format' for storage type $scfg->{type}\n"
+	if !$format_is_valid;
 
     (my $name_without_extension = $name) =~ s/\.$format$//;
 
-    if ($target_scfg->{path}) {
+    if ($scfg->{path}) {
        return "$vmid/$name_without_extension.$format";
     } else {
        return "$name_without_extension";
@@ -667,7 +668,8 @@ sub storage_migrate {
     } elsif ($scfg->{type} eq $tcfg->{type}) {
 	$target_volname = $volname;
     } else {
-	$target_volname = $volname_for_storage->($cfg, $volid, $target_storeid);
+	my (undef, $name, $vmid, undef, undef, undef, $format) = parse_volname($cfg, $volid);
+	$target_volname = $volname_for_storage->($cfg, $target_storeid, $name, $vmid, $format);
     }
 
     my $target_volid = "${target_storeid}:${target_volname}";
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (17 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 1/4] volname_for_storage: parse volname before calling Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-05  9:00   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 3/4] storage_migrate: pull out import/export_prepare Fabian Grünbichler
                   ` (2 subsequent siblings)
  21 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

into new top-level helper for re-use with remote migration.
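
a short usage sketch - mirroring how the following patches are expected
to call it, with a placeholder storage id:

    # decide whether a '__migration__' snapshot is needed for this storage
    my $migration_snapshot = PVE::Storage::storage_migrate_snapshot($cfg, 'local-zfs');
    my $snapshot = $migration_snapshot ? '__migration__' : undef;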

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
new in v3

 PVE/Storage.pm | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/PVE/Storage.pm b/PVE/Storage.pm
index 05be3dd..4f3a44a 100755
--- a/PVE/Storage.pm
+++ b/PVE/Storage.pm
@@ -643,6 +643,14 @@ my $volname_for_storage = sub {
     }
 };
 
+# whether a migration snapshot is needed for a given storage
+sub storage_migrate_snapshot {
+    my ($cfg, $storeid) = @_;
+    my $scfg = storage_config($cfg, $storeid);
+
+    return $scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs';
+}
+
 sub storage_migrate {
     my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
 
@@ -688,10 +696,8 @@ sub storage_migrate {
 
     my $migration_snapshot;
     if (!defined($snapshot)) {
-	if ($scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs') {
-	    $migration_snapshot = 1;
-	    $snapshot = '__migration__';
-	}
+	$migration_snapshot = storage_migrate_snapshot($cfg, $target_storeid);
+	$snapshot = '__migration__' if $migration_snapshot;
     }
 
     my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $snapshot, $base_snapshot, $with_snapshots);
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 storage 3/4] storage_migrate: pull out import/export_prepare
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (18 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2022-01-05  9:59   ` Fabian Ebner
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 4/4] add volume_import/export_start helpers Fabian Grünbichler
  2021-12-23 13:56 ` [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
  21 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

for re-use with remote migration, where import and export happen on
different clusters connected via a websocket instead of SSH tunnel.

no semantic changes intended.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
new in v3

using $opts across the board in storage_migrate to prevent future bugs
where options get out of sync between helper calls..

 PVE/Storage.pm | 113 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 70 insertions(+), 43 deletions(-)

diff --git a/PVE/Storage.pm b/PVE/Storage.pm
index 4f3a44a..976bd63 100755
--- a/PVE/Storage.pm
+++ b/PVE/Storage.pm
@@ -651,16 +651,70 @@ sub storage_migrate_snapshot {
     return $scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs';
 }
 
-sub storage_migrate {
-    my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
+my $volume_import_prepare = sub {
+    my ($volid, $format, $path, $apiver, $opts) = @_;
 
     my $base_snapshot = $opts->{base_snapshot};
     my $snapshot = $opts->{snapshot};
-    my $ratelimit_bps = $opts->{ratelimit_bps};
-    my $insecure = $opts->{insecure};
     my $with_snapshots = $opts->{with_snapshots} ? 1 : 0;
+    my $migration_snapshot = $opts->{migration_snapshot} ? 1 : 0;
     my $allow_rename = $opts->{allow_rename} ? 1 : 0;
 
+    my $recv = ['pvesm', 'import', $volid, $format, $path, '-with-snapshots', $with_snapshots];
+    if (defined($snapshot)) {
+	push @$recv, '-snapshot', $snapshot;
+    }
+    if ($migration_snapshot) {
+	push @$recv, '-delete-snapshot', $snapshot;
+    }
+    push @$recv, '-allow-rename', $allow_rename if $apiver >= 5;
+
+    if (defined($base_snapshot)) {
+	# Check if the snapshot exists on the remote side:
+	push @$recv, '-base', $base_snapshot if $apiver >= 9;
+    }
+
+    return $recv;
+};
+
+my $volume_export_prepare = sub {
+    my ($cfg, $volid, $format, $logfunc, $opts) = @_;
+    my $base_snapshot = $opts->{base_snapshot};
+    my $snapshot = $opts->{snapshot};
+    my $with_snapshots = $opts->{with_snapshots} ? 1 : 0;
+    my $migration_snapshot = $opts->{migration_snapshot} ? 1 : 0;
+    my $ratelimit_bps = $opts->{ratelimit_bps};
+
+    my $send = ['pvesm', 'export', $volid, $format, '-', '-with-snapshots', $with_snapshots];
+    if (defined($snapshot)) {
+	push @$send, '-snapshot', $snapshot;
+    }
+    if (defined($base_snapshot)) {
+	push @$send, '-base', $base_snapshot;
+    }
+
+    my @cstream;
+    if (defined($ratelimit_bps)) {
+	@cstream = [ '/usr/bin/cstream', '-t', $ratelimit_bps ];
+	$logfunc->("using a bandwidth limit of $ratelimit_bps bps for transferring '$volid'") if $logfunc;
+    }
+
+    volume_snapshot($cfg, $volid, $snapshot) if $migration_snapshot;
+
+    if (defined($snapshot)) {
+	activate_volumes($cfg, [$volid], $snapshot);
+    } else {
+	activate_volumes($cfg, [$volid]);
+    }
+
+    return [ $send, @cstream ];
+};
+
+sub storage_migrate {
+    my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
+
+    my $insecure = $opts->{insecure};
+
     my ($storeid, $volname) = parse_volume_id($volid);
 
     my $scfg = storage_config($cfg, $storeid);
@@ -688,19 +742,12 @@ sub storage_migrate {
     my $ssh_base = PVE::SSHInfo::ssh_info_to_command_base($target_sshinfo);
     local $ENV{RSYNC_RSH} = PVE::Tools::cmd2string($ssh_base);
 
-    my @cstream;
-    if (defined($ratelimit_bps)) {
-	@cstream = ([ '/usr/bin/cstream', '-t', $ratelimit_bps ]);
-	$logfunc->("using a bandwidth limit of $ratelimit_bps bps for transferring '$volid'") if $logfunc;
+    if (!defined($opts->{snapshot})) {
+	$opts->{migration_snapshot} = storage_migrate_snapshot($cfg, $target_storeid);
+	$opts->{snapshot} = '__migration__' if $opts->{migration_snapshot};
     }
 
-    my $migration_snapshot;
-    if (!defined($snapshot)) {
-	$migration_snapshot = storage_migrate_snapshot($cfg, $target_storeid);
-	$snapshot = '__migration__' if $migration_snapshot;
-    }
-
-    my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $snapshot, $base_snapshot, $with_snapshots);
+    my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $opts->{snapshot}, $opts->{base_snapshot}, $opts->{with_snapshots});
     die "cannot migrate from storage type '$scfg->{type}' to '$tcfg->{type}'\n" if !@formats;
     my $format = $formats[0];
 
@@ -715,22 +762,7 @@ sub storage_migrate {
     my $match_api_version = sub { $target_apiver = $1 if $_[0] =~ m!^APIVER (\d+)$!; };
     eval { run_command($get_api_version, logfunc => $match_api_version); };
 
-    my $send = ['pvesm', 'export', $volid, $format, '-', '-with-snapshots', $with_snapshots];
-    my $recv = [@$ssh, '--', 'pvesm', 'import', $target_volid, $format, $import_fn, '-with-snapshots', $with_snapshots];
-    if (defined($snapshot)) {
-	push @$send, '-snapshot', $snapshot;
-	push @$recv, '-snapshot', $snapshot;
-    }
-    if ($migration_snapshot) {
-	push @$recv, '-delete-snapshot', $snapshot;
-    }
-    push @$recv, '-allow-rename', $allow_rename if $target_apiver >= 5;
-
-    if (defined($base_snapshot)) {
-	# Check if the snapshot exists on the remote side:
-	push @$send, '-base', $base_snapshot;
-	push @$recv, '-base', $base_snapshot if $target_apiver >= 9;
-    }
+    my $recv = [ @$ssh, '--', $volume_import_prepare->($target_volid, $format, $import_fn, $target_apiver, $opts)->@* ];
 
     my $new_volid;
     my $pattern = volume_imported_message(undef, 1);
@@ -745,19 +777,13 @@ sub storage_migrate {
 	}
     };
 
-    volume_snapshot($cfg, $volid, $snapshot) if $migration_snapshot;
-
-    if (defined($snapshot)) {
-	activate_volumes($cfg, [$volid], $snapshot);
-    } else {
-	activate_volumes($cfg, [$volid]);
-    }
+    my $cmds = [ $volume_export_prepare->($cfg, $volid, $format, $logfunc, $opts)->@* ];
 
     eval {
 	if ($insecure) {
 	    my $input = IO::File->new();
 	    my $info = IO::File->new();
-	    open3($input, $info, $info, @{$recv})
+	    open3($input, $info, $info, @$recv)
 		or die "receive command failed: $!\n";
 	    close($input);
 
@@ -774,7 +800,7 @@ sub storage_migrate {
 	    # we won't be reading from the socket
 	    shutdown($socket, 0);
 
-	    eval { run_command([$send, @cstream], output => '>&'.fileno($socket), errfunc => $logfunc); };
+	    eval { run_command($cmds, output => '>&'.fileno($socket), errfunc => $logfunc); };
 	    my $send_error = $@;
 
 	    # don't close the connection entirely otherwise the receiving end
@@ -795,7 +821,8 @@ sub storage_migrate {
 
 	    die $send_error if $send_error;
 	} else {
-	    run_command([$send, @cstream, $recv], logfunc => $match_volid_and_log);
+	    push @$cmds, $recv;
+	    run_command($cmds, logfunc => $match_volid_and_log);
 	}
 
 	die "unable to get ID of the migrated volume\n"
@@ -803,8 +830,8 @@ sub storage_migrate {
     };
     my $err = $@;
     warn "send/receive failed, cleaning up snapshot(s)..\n" if $err;
-    if ($migration_snapshot) {
-	eval { volume_snapshot_delete($cfg, $volid, $snapshot, 0) };
+    if ($opts->{migration_snapshot}) {
+	eval { volume_snapshot_delete($cfg, $volid, $opts->{snapshot}, 0) };
 	warn "could not remove source snapshot: $@\n" if $@;
     }
     die $err if $err;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* [pve-devel] [PATCH v3 storage 4/4] add volume_import/export_start helpers
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (19 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 3/4] storage_migrate: pull out import/export_prepare Fabian Grünbichler
@ 2021-12-22 13:52 ` Fabian Grünbichler
  2021-12-23 13:56 ` [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-22 13:52 UTC (permalink / raw)
  To: pve-devel

exposing the two halves of a storage migration for usage across
cluster boundaries.
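
a rough sketch of how the two halves are meant to pair up across the
tunnel - websocket forwarding and error handling omitted, names and
option values are illustrative only:

    # target node: spawn 'pvesm import' listening on a unix socket
    my $import = PVE::Storage::volume_import_start(
        $cfg, $storeid, $volname, $format, $vmid,
        { 'with-snapshots' => 1, 'allow-rename' => 1, 'export-formats' => 'zfs' },
    );
    # $import->{socket} is then forwarded through the tunnel

    # source node: stream 'pvesm export' into the forwarded socket $socket
    PVE::Storage::volume_export_start(
        $cfg, $volid, $import->{format},
        { with_snapshots => 1, cmd => { output => '>&' . fileno($socket) } },
    );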

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    new in v3

 PVE/Storage.pm | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/PVE/Storage.pm b/PVE/Storage.pm
index 976bd63..f0c62aa 100755
--- a/PVE/Storage.pm
+++ b/PVE/Storage.pm
@@ -1833,6 +1833,72 @@ sub volume_imported_message {
     }
 }
 
+# $format and $volname are requests and might be overruled depending on $opts
+# $opts:
+# - with-snapshots: passed to `pvesm import` and used to select import format
+# - allow-rename: passed to `pvesm import`
+# - export-formats: used to select common transport format
+# - unix: unix socket path
+sub volume_import_start {
+    my ($cfg, $storeid, $volname, $format, $vmid, $opts) = @_;
+
+    my $with_snapshots = $opts->{'with-snapshots'} ? 1 : 0;
+    # normalize dashed option keys to the underscore keys expected by
+    # $volume_import_prepare, else 'with-snapshots'/'allow-rename' get lost
+    $opts->{with_snapshots} = $with_snapshots;
+    $opts->{allow_rename} = $opts->{'allow-rename'} ? 1 : 0;
+
+    $volname = $volname_for_storage->($cfg, $storeid, $volname, $vmid, $format);
+
+    my $volid = "$storeid:$volname";
+
+    if (!defined($opts->{snapshot})) {
+	$opts->{migration_snapshot} = storage_migrate_snapshot($cfg, $storeid);
+	$opts->{snapshot} = '__migration__' if $opts->{migration_snapshot};
+    }
+
+    # find common import/export format, like volume_transfer_formats
+    my @import_formats = PVE::Storage::volume_import_formats($cfg, $volid, $opts->{snapshot}, undef, $with_snapshots);
+    my @export_formats = PVE::Tools::split_list($opts->{'export-formats'});
+    my %import_hash = map { $_ => 1 } @import_formats;
+    my @common = grep { $import_hash{$_} } @export_formats;
+    die "no matching import/export format found for storage '$storeid'\n"
+	if !@common;
+    $format = $common[0];
+
+    my $input = IO::File->new();
+    my $info = IO::File->new();
+
+    my $unix = $opts->{unix} // "/run/pve/storage-migrate-$vmid.$$.unix";
+    my $import = $volume_import_prepare->($volid, $format, "unix://$unix", APIVER, $opts);
+
+    unlink $unix;
+    my $cpid = open3($input, $info, $info, @$import)
+	or die "failed to spawn disk-import child - $!\n";
+
+    my $ready;
+    eval {
+	PVE::Tools::run_with_timeout(5, sub { $ready = <$info>; });
+    };
+
+    die "failed to read readiness from disk import child: $@\n" if $@;
+
+    print "$ready\n";
+
+    return {
+	fh => $info,
+	pid => $cpid,
+	socket => $unix,
+	format => $format,
+    };
+}
+
+sub volume_export_start {
+    my ($cfg, $volid, $format, $opts) = @_;
+
+    my $run_command_params = delete $opts->{cmd} // {};
+
+    my @cmds = $volume_export_prepare->($cfg, $volid, $format, undef, $opts);
+
+    PVE::Tools::run_command(@cmds, %$run_command_params);
+}
+
 # bash completion helper
 
 sub complete_storage {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration
  2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
                   ` (20 preceding siblings ...)
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 4/4] add volume_import/export_start helpers Fabian Grünbichler
@ 2021-12-23 13:56 ` Fabian Grünbichler
  21 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2021-12-23 13:56 UTC (permalink / raw)
  To: Proxmox VE development discussion

some feedback/known issues/.. from Dominik's testing:
- there is some mishandling with the export/import (hash keys with '-' 
  vs '_') that breaks rename and qcow2 support (already fixed locally)
- we could probably drop the targetnode parameter and replace it with 
  localhost, if we punt the node -> endpoint selection to the client - 
  otherwise we need to do a lookup including IP in the remote migration 
  API call
- targetstorage should get an adapted description ('1' not valid)
- ceph plugin doesn't implement import/export (yet)
- plugins not implementing import/export should be handled with a good 
  error message and as early as possible (when scanning disks?)
- source being kept around locked if --delete is not passed might not be 
  the best solution - alternatives?
- mtunnel is missing a timeout for the initial connection/first read

thanks for testing and discussion (off-list)!

please add if I forgot anything ;)

On December 22, 2021 2:52 pm, Fabian Grünbichler wrote:
> this series adds remote migration for VMs.
> 
> both live and offline migration including NBD and storage-migrated disks should
> work. groundwork for extending to pve-container and pvesr already laid.
> 
> new in v3: lots of refactoring and edge-case handling
> 
> new in v2: dropped parts already applied, incorporated Fabian's and
> Dominik's feedback (thanks!)
> 
> new in v1: explicit remote endpoint specified as part of API call instead of
> remote.cfg
> 
> overview over affected repos and changes, see individual patches for
> more details.
> 
> proxmox-websocket-tunnel:
> 
> new tunnel helper tool for forwarding commands and data over websocket
> connections, required by pve-guest-common on source side
> 
> pve-guest-common:
> 
> new/refactored generic WS/SSH tunnel fork/read/.. helpers
> new storage migration over WS tunnel helpers
> 
> pve-storage:
> 
> refactor storage_migrate to make code-reuse possible
> 
> qemu-server:
> 
> some refactoring, new mtunnel endpoints, new remote_migration endpoints
> TODO: handle pending changes and snapshots
> TODO: proper CLI for remote migration
> TODO: handle C^c
> TODO: strict parser for FW config as well?
> potential TODO: precond endpoint?
> 
> as usual, some of the patches are best viewed with '-w', especially in
> qemu-server..
> 
> required dependencies are noted
> 
> pve-guest-common:
> 
> Fabian Grünbichler (3):
>   migrate: handle migration_network with remote migration
>   add tunnel helper module
>   add storage tunnel module
> 
>  src/Makefile               |   2 +
>  debian/control             |   1 +
>  src/PVE/AbstractMigrate.pm |  37 ++--
>  src/PVE/StorageTunnel.pm   | 231 ++++++++++++++++++++++++
>  src/PVE/Tunnel.pm          | 356 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 611 insertions(+), 16 deletions(-)
>  create mode 100644 src/PVE/StorageTunnel.pm
>  create mode 100644 src/PVE/Tunnel.pm
> 
> proxmox-websocket-tunnel:
> 
> Fabian Grünbichler (4):
>   initial commit
>   add tunnel implementation
>   add fingerprint validation
>   add packaging
> 
> qemu-server:
> 
> Fabian Grünbichler (10):
>   refactor map_storage to map_id
>   schema: use pve-bridge-id
>   parse_config: optional strict mode
>   update_vm: allow simultaneous setting of boot-order and dev
>   nbd alloc helper: allow passing in explicit format
>   migrate: move tunnel-helpers to pve-guest-common
>   mtunnel: add API endpoints
>   migrate: refactor remote VM/tunnel start
>   migrate: add remote migration handling
>   api: add remote migrate endpoint
> 
>  PVE/API2/Qemu.pm                      | 721 ++++++++++++++++++++++++-
>  PVE/QemuMigrate.pm                    | 738 ++++++++++++++++----------
>  PVE/QemuServer.pm                     | 100 ++--
>  debian/control                        |   2 +
>  test/MigrationTest/QemuMigrateMock.pm |  28 +-
>  5 files changed, 1240 insertions(+), 349 deletions(-)
> 
> pve-storage:
> 
> Fabian Grünbichler (4):
>   volname_for_storage: parse volname before calling
>   storage_migrate: pull out snapshot decision
>   storage_migrate: pull out import/export_prepare
>   add volume_import/export_start helpers
> 
>  PVE/Storage.pm | 205 ++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 153 insertions(+), 52 deletions(-)
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module Fabian Grünbichler
@ 2022-01-03 12:30   ` Fabian Ebner
       [not found]     ` <<47e7d41f-e328-d9fa-25b7-f7585de8ce5b@proxmox.com>
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Ebner @ 2022-01-03 12:30 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

A few nits inline.

On 12/22/21 14:52, Fabian Grünbichler wrote:
> lifted from PVE::QemuMigrate, abstracting away use-case specific data.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
>   src/Makefile      |   1 +
>   debian/control    |   1 +
>   src/PVE/Tunnel.pm | 356 ++++++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 358 insertions(+)
>   create mode 100644 src/PVE/Tunnel.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index 0298d3f..d82162c 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -12,6 +12,7 @@ install: PVE
>   	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
> +	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
>   	install -d ${PERL5DIR}/PVE/VZDump
>   	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
>   	install -m 0644 PVE/VZDump/Common.pm ${PERL5DIR}/PVE/VZDump/
> diff --git a/debian/control b/debian/control
> index 4c246d0..73c28bd 100644
> --- a/debian/control
> +++ b/debian/control
> @@ -16,6 +16,7 @@ Depends: libpve-cluster-perl,
>            libpve-common-perl (>= 4.0-89),
>            libpve-storage-perl (>= 7.0-14),
>            pve-cluster,
> +         proxmox-websocket-tunnel,
>            ${misc:Depends},
>            ${perl:Depends},
>   Breaks: libpve-common-perl (<< 4.0-89),
> diff --git a/src/PVE/Tunnel.pm b/src/PVE/Tunnel.pm
> new file mode 100644
> index 0000000..bbd1169
> --- /dev/null
> +++ b/src/PVE/Tunnel.pm
> @@ -0,0 +1,356 @@
> +package PVE::Tunnel;
> +
> +use strict;
> +use warnings;
> +
> +use IO::Pipe;
> +use IPC::Open2;
> +use JSON qw(encode_json decode_json);
> +use POSIX qw( WNOHANG );
> +use Storable qw(dclone);
> +use URI::Escape;
> +
> +use PVE::APIClient::LWP;
> +use PVE::Tools;
> +
> +my $finish_command_pipe = sub {
> +    my ($cmdpipe, $timeout) = @_;
> +
> +    my $cpid = $cmdpipe->{pid};
> +    return if !defined($cpid);
> +
> +    my $writer = $cmdpipe->{writer};
> +    my $reader = $cmdpipe->{reader};
> +
> +    $writer->close();
> +    $reader->close();
> +
> +    my $collect_child_process = sub {
> +	my $res = waitpid($cpid, WNOHANG);
> +	if (defined($res) && ($res == $cpid)) {
> +	    delete $cmdpipe->{cpid};
> +	    return 1;
> +	} else {
> +	    return 0;
> +	}
> +     };

style nit: white-space error

> +
> +    if ($timeout) {
> +	for (my $i = 0; $i < $timeout; $i++) {
> +	    return if &$collect_child_process();
> +	    sleep(1);
> +	}
> +    }
> +
> +    $cmdpipe->{log}->("tunnel still running - terminating now with SIGTERM\n");
> +    kill(15, $cpid);
> +
> +    # wait again
> +    for (my $i = 0; $i < 10; $i++) {
> +	return if &$collect_child_process();
> +	sleep(1);
> +    }
> +
> +    $cmdpipe->{log}->("tunnel still running - terminating now with SIGKILL\n");
> +    kill 9, $cpid;
> +    sleep 1;
> +
> +    $cmdpipe->{log}->("tunnel child process (PID $cpid) couldn't be collected\n")
> +	if !&$collect_child_process();
> +};
> +
> +sub read_tunnel {
> +    my ($tunnel, $timeout) = @_;
> +
> +    $timeout = 60 if !defined($timeout);
> +
> +    my $reader = $tunnel->{reader};
> +
> +    my $output;
> +    eval {
> +	PVE::Tools::run_with_timeout($timeout, sub { $output = <$reader>; });
> +    };
> +    die "reading from tunnel failed: $@\n" if $@;
> +
> +    chomp $output if defined($output);
> +
> +    return $output;
> +}
> +
> +sub write_tunnel {
> +    my ($tunnel, $timeout, $command, $params) = @_;
> +
> +    $timeout = 60 if !defined($timeout);
> +
> +    my $writer = $tunnel->{writer};
> +
> +    if ($tunnel->{version} && $tunnel->{version} >= 2) {
> +	my $object = defined($params) ? dclone($params) : {};
> +	$object->{cmd} = $command;
> +
> +	$command = eval { JSON::encode_json($object) };
> +
> +	die "failed to encode command as JSON - $@\n"
> +	    if $@;
> +    }
> +
> +    eval {
> +	PVE::Tools::run_with_timeout($timeout, sub {
> +	    print $writer "$command\n";
> +	    $writer->flush();
> +	});
> +    };
> +    die "writing to tunnel failed: $@\n" if $@;
> +
> +    if ($tunnel->{version} && $tunnel->{version} >= 1) {
> +	my $res = eval { read_tunnel($tunnel, $timeout); };
> +	die "no reply to command '$command': $@\n" if $@;
> +
> +	if ($tunnel->{version} == 1) {
> +	    if ($res eq 'OK') {
> +		return;
> +	    } else {
> +		die "tunnel replied '$res' to command '$command'\n";
> +	    }
> +	} else {
> +	    my $parsed = eval { JSON::decode_json($res) };
> +	    die "failed to decode tunnel reply '$res' (command '$command') - $@\n"
> +		if $@;
> +
> +	    if (!$parsed->{success}) {
> +		if (defined($parsed->{msg})) {
> +		    die "error - tunnel command '$command' failed - $parsed->{msg}\n";
> +		} else {
> +		    die "error - tunnel command '$command' failed\n";
> +		}
> +	    }
> +
> +	    return $parsed;
> +	}
> +    }
> +}
> +
> +sub fork_ssh_tunnel {
> +    my ($rem_ssh, $cmd, $ssh_forward_info, $log) = @_;
> +
> +    my @localtunnelinfo = ();
> +    foreach my $addr (@$ssh_forward_info) {
> +	push @localtunnelinfo, '-L', $addr;
> +    }
> +
> +    my $full_cmd = [@$rem_ssh, '-o ExitOnForwardFailure=yes', @localtunnelinfo, @$cmd];
> +
> +    my $reader = IO::File->new();
> +    my $writer = IO::File->new();
> +
> +    my $orig_pid = $$;
> +
> +    my $cpid;
> +
> +    eval { $cpid = open2($reader, $writer, @$full_cmd); };
> +
> +    my $err = $@;
> +
> +    # catch exec errors
> +    if ($orig_pid != $$) {
> +	$log->("can't fork command pipe, aborting\n");
> +	POSIX::_exit(1);
> +	kill('KILL', $$);
> +    }
> +
> +    die $err if $err;
> +
> +    my $tunnel = {
> +	writer => $writer,
> +	reader => $reader,
> +	pid => $cpid,
> +	rem_ssh => $rem_ssh,
> +	log => $log,
> +    };
> +
> +    eval {
> +	my $helo = read_tunnel($tunnel, 60);
> +	die "no reply\n" if !$helo;
> +	die "no quorum on target node\n" if $helo =~ m/^no quorum$/;
> +	die "got strange reply from tunnel ('$helo')\n"
> +	    if $helo !~ m/^tunnel online$/;
> +    };
> +    $err = $@;
> +
> +    eval {
> +	my $ver = read_tunnel($tunnel, 10);
> +	if ($ver =~ /^ver (\d+)$/) {
> +	    $tunnel->{version} = $1;
> +	    $log->('info', "ssh tunnel $ver\n");

This should pass only the message, like the other calls to $log (unless 
the function is intended to behave differently when multiple parameters 
are used, but in qemu-server 06/10 that doesn't happen). What are the 
reasons for not sticking to the two-parameter log interface?
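
One way to keep both call styles working would be a small adapter on the 
caller's side - a rough sketch (the defaulting behaviour here is my 
assumption, not something from the series):

    my $log = sub {
	my ($level, $msg) = @_;
	# a single argument is the message itself, default the level to 'info'
	($level, $msg) = ('info', $level) if !defined($msg);
	$self->log($level, $msg);
    };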

> +	} else {
> +	    $err = "received invalid tunnel version string '$ver'\n" if !$err;
> +	}
> +    };
> +
> +    if ($err) {
> +	$finish_command_pipe->($tunnel);
> +	die "can't open tunnel - $err";
> +    }
> +    return $tunnel;
> +}
> +
> +sub forward_unix_socket {
> +    my ($tunnel, $local, $remote) = @_;
> +
> +    my $params = dclone($tunnel->{params});
> +    $params->{unix} = $local;
> +    $params->{url} = $params->{url} ."socket=".uri_escape($remote)."&";
> +    $params->{ticket} = { path => $remote };
> +
> +    my $cmd = encode_json({
> +	control => JSON::true,
> +	cmd => 'forward',
> +	data => $params,
> +    });
> +
> +    my $writer = $tunnel->{writer};
> +    $tunnel->{forwarded}->{$local} = $remote;
> +    eval {
> +	unlink $local;
> +	PVE::Tools::run_with_timeout(15, sub {
> +	    print $writer "$cmd\n";
> +	    $writer->flush();
> +	});
> +    };
> +    die "failed to write forwarding command - $@\n" if $@;
> +
> +    read_tunnel($tunnel);
> +}
> +
> +sub fork_websocket_tunnel {
> +    my ($conn, $url, $req_params, $tunnel_params, $log) = @_;
> +
> +    if (my $apitoken = $conn->{apitoken}) {
> +	$tunnel_params->{headers} = [["Authorization", "$apitoken"]];
> +    } else {
> +	die "can't connect to remote host without credentials\n";
> +    }
> +
> +    if (my $fps = $conn->{cached_fingerprints}) {
> +	$tunnel_params->{fingerprint} = (keys %$fps)[0];
> +    }
> +
> +    my $api_client = PVE::APIClient::LWP->new(%$conn);
> +
> +    my $res = $api_client->post(
> +	$url,
> +	$req_params,
> +    );
> +
> +    $log->("remote: started migration tunnel worker '$res->{upid}'");

Nit: still mentions migration, as is done...

> +
> +    my $websocket_url = $tunnel_params->{url};
> +
> +    $tunnel_params->{url} .= "?ticket=".uri_escape($res->{ticket});
> +    $tunnel_params->{url} .= "&socket=".uri_escape($res->{socket});
> +
> +    my $reader = IO::Pipe->new();
> +    my $writer = IO::Pipe->new();
> +
> +    my $cpid = fork();
> +    if ($cpid) {
> +	$writer->writer();
> +	$reader->reader();
> +	my $tunnel = { writer => $writer, reader => $reader, pid => $cpid };
> +
> +	eval {
> +	    my $writer = $tunnel->{writer};
> +	    my $cmd = encode_json({
> +		control => JSON::true,
> +		cmd => 'connect',
> +		data => $tunnel_params,
> +	    });
> +
> +	    eval {
> +		PVE::Tools::run_with_timeout(15, sub {
> +		    print {$writer} "$cmd\n";
> +		    $writer->flush();
> +		});
> +	    };
> +	    die "failed to write tunnel connect command - $@\n" if $@;
> +	};
> +	die "failed to connect via WS: $@\n" if $@;
> +
> +	my $err;
> +        eval {
> +	    my $writer = $tunnel->{writer};
> +	    my $cmd = encode_json({
> +		cmd => 'version',
> +	    });
> +
> +	    eval {
> +		PVE::Tools::run_with_timeout(15, sub {
> +		    print {$writer} "$cmd\n";
> +		    $writer->flush();
> +		});
> +	    };
> +	    $err = "failed to write tunnel version command - $@\n" if $@;
> +	    my $res = read_tunnel($tunnel, 10);
> +	    $res = JSON::decode_json($res);
> +	    my $version = $res->{api};
> +
> +	    if ($version =~ /^(\d+)$/) {
> +		$tunnel->{version} = $1;
> +		$tunnel->{age} = $res->{age};
> +	    } else {
> +		$err = "received invalid tunnel version string '$version'\n" if !$err;
> +	    }
> +	};
> +	$err = $@ if !$err;
> +
> +	if ($err) {
> +	    $finish_command_pipe->($tunnel);

Nit: Here, $tunnel->{log} is still undef, but finish_command_pipe might 
try to use it.

> +	    die "can't open migration tunnel - $err";

...here.
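
A defensive default inside $finish_command_pipe would side-step both 
spots; a minimal sketch (assuming a fallback to STDERR is acceptable):

    my $log = $cmdpipe->{log} // sub { print STDERR $_[0]; };
    $log->("tunnel still running - terminating now with SIGTERM\n");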

> +	}
> +
> +	$tunnel_params->{url} = "$websocket_url?"; # reset ticket and socket
> +
> +	$tunnel->{params} = $tunnel_params; # for forwarding
> +	$tunnel->{log} = $log;
> +
> +	return $tunnel;
> +    } else {
> +	eval {
> +	    $writer->reader();
> +	    $reader->writer();
> +	    PVE::Tools::run_command(
> +		['proxmox-websocket-tunnel'],
> +		input => "<&".fileno($writer),
> +		output => ">&".fileno($reader),
> +		errfunc => sub { my $line = shift; print "tunnel: $line\n"; },
> +	    );
> +	};
> +	warn "CMD websocket tunnel died: $@\n" if $@;
> +	exit 0;
> +    }
> +}
> +
> +sub finish_tunnel {
> +    my ($tunnel, $cleanup) = @_;
> +
> +    $cleanup = $cleanup ? 1 : 0;
> +
> +    eval { write_tunnel($tunnel, 30, 'quit', { cleanup => $cleanup }); };
> +    my $err = $@;
> +
> +    $finish_command_pipe->($tunnel, 30);
> +
> +    if (my $unix_sockets = $tunnel->{unix_sockets}) {
> +	# ssh does not clean up on local host
> +	my $cmd = ['rm', '-f', @$unix_sockets];
> +	PVE::Tools::run_command($cmd);
> +
> +	# .. and just to be sure check on remote side
> +	if ($tunnel->{rem_ssh}) {
> +	    unshift @{$cmd}, @{$tunnel->{rem_ssh}};
> +	    PVE::Tools::run_command($cmd);
> +	}
> +    }
> +
> +    die $err if $err;
> +}




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module Fabian Grünbichler
@ 2022-01-03 14:30   ` Fabian Ebner
       [not found]     ` <af15fed1-2d06-540e-cde8-ed1ce772aeb4@proxmox.com>
  2022-01-05 10:50   ` Fabian Ebner
  1 sibling, 1 reply; 41+ messages in thread
From: Fabian Ebner @ 2022-01-03 14:30 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

On 12/22/21 14:52, Fabian Grünbichler wrote:
> encapsulating storage-related tunnel methods, currently
> - source-side storage-migrate helper
> - target-side disk-import handler
> - target-side query-disk-import handler
> 
> to be extended further with replication-related handlers and helpers.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:

Needs a version bump for pve-storage (for storage_migrate_snapshot, 
volume_import_start, etc.)
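
I.e., roughly the following in debian/control - the exact version is just 
a placeholder for whatever pve-storage release ships the new helpers:

    -         libpve-storage-perl (>= 7.0-14),
    +         libpve-storage-perl (>= 7.0-15),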

>      new in v3, includes code previously in qemu-server
> 
>   src/Makefile             |   1 +
>   src/PVE/StorageTunnel.pm | 231 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 232 insertions(+)
>   create mode 100644 src/PVE/StorageTunnel.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index d82162c..baa2688 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -12,6 +12,7 @@ install: PVE
>   	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
> +	install -m 0644 PVE/StorageTunnel.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
>   	install -d ${PERL5DIR}/PVE/VZDump
>   	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
> diff --git a/src/PVE/StorageTunnel.pm b/src/PVE/StorageTunnel.pm
> new file mode 100644
> index 0000000..06902ef
> --- /dev/null
> +++ b/src/PVE/StorageTunnel.pm
> @@ -0,0 +1,231 @@
> +package PVE::StorageTunnel;
> +
> +use strict;
> +use warnings;
> +
> +use IO::Socket::UNIX;
> +use POSIX qw(WNOHANG);
> +use Socket qw(SOCK_STREAM);
> +
> +use PVE::Tools;
> +use PVE::Tunnel;
> +use PVE::Storage;

Nit: not ordered alphabetically

> +
> +sub storage_migrate {
> +    my ($tunnel, $storecfg, $volid, $local_vmid, $remote_vmid, $opts, $log) = @_;
> +
> +    my $bwlimit = $opts->{bwlimit};
> +    my $targetsid = $opts->{targetsid};
> +
> +    # use 'migrate' limit for transfer to other node
> +    my $bwlimit_opts = {
> +	storage => $targetsid,
> +	bwlimit => $bwlimit,
> +    };
> +    my $remote_bwlimit = PVE::Tunnel::write_tunnel($tunnel, 10, 'bwlimit', $bwlimit_opts);

Should this be done on the caller's side instead? For re-using the helper 
with replication, the 'bwlimit' command shouldn't happen, since there the 
only relevant limit is the one from the job configuration (even the 
'default' storage limit isn't used AFAICT).
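
E.g., the tunnel round-trip could live in the migration code and the 
helper would only consume a pre-computed $opts->{bwlimit} - a sketch using 
the names from this patch:

    # in the migration code (migration-specific negotiation):
    my $res = PVE::Tunnel::write_tunnel($tunnel, 10, 'bwlimit', {
	storage => $targetsid,
	bwlimit => $opts->{bwlimit},
    });
    $opts->{bwlimit} = $res->{bwlimit};

    # replication would instead set $opts->{bwlimit} from its job config
    PVE::StorageTunnel::storage_migrate(
	$tunnel, $storecfg, $volid, $vmid, $remote_vmid, $opts, $log);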

> +    $remote_bwlimit = $remote_bwlimit->{bwlimit};
> +    if (defined($remote_bwlimit)) {
> +	$bwlimit = $remote_bwlimit if !defined($bwlimit);
> +	$bwlimit = $remote_bwlimit if $remote_bwlimit < $bwlimit;

This gives the wrong result if $remote_bwlimit == 0 and $bwlimit > 0, but 
maybe that situation is not even possible currently. That aside, isn't the 
content of the if block superfluous, because the 'bwlimit' 
handler/get_bandwidth_limit already determines the minimum from the 
passed-along parameter and the storage limit?
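
If a client-side minimum is kept at all, it should probably special-case 
0; a sketch assuming 0 means 'unlimited' on both ends:

    if (defined($remote_bwlimit) && $remote_bwlimit != 0) {
	$bwlimit = $remote_bwlimit
	    if !defined($bwlimit) || $bwlimit == 0 || $remote_bwlimit < $bwlimit;
    }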

> +    }
> +
> +    # JSONSchema and get_bandwidth_limit use kbps - storage_migrate bps
> +    $bwlimit = $bwlimit * 1024 if defined($bwlimit);
> +
> +    # adapt volume name for import call
> +    my ($sid, undef) = PVE::Storage::parse_volume_id($volid);
> +    my (undef, $name, undef, undef, undef, undef, $format) = PVE::Storage::parse_volname($storecfg, $volid);
> +    my $scfg = PVE::Storage::storage_config($storecfg, $sid);
> +    PVE::Storage::activate_volumes($storecfg, [$volid]);
> +    if ($local_vmid != $remote_vmid) {

There can be guests where the owner's VM ID and the disk's VM ID don't 
match (e.g. manual reassign via 'qm set'), but I'm not sure it's worth 
bothering about.
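
If it were handled, the rename could use the owner that parse_volname 
already returns instead of the guest's VM ID - untested sketch:

    my (undef, $name, $owner_vmid, undef, undef, undef, $format) =
	PVE::Storage::parse_volname($storecfg, $volid);
    if (defined($owner_vmid) && $owner_vmid != $remote_vmid) {
	$name =~ s/-$owner_vmid-/-$remote_vmid-/g;
	$name =~ s/^$owner_vmid\///; # re-added on target if dir-based storage
    }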

> +	$name =~ s/-$local_vmid-/-$remote_vmid-/g;
> +	$name =~ s/^$local_vmid\///; # re-added on target if dir-based storage
> +    }
> +
> +    my $with_snapshots = $opts->{snapshots} ? 1 : 0;
> +    my $snapshot;
> +    my $migration_snapshot = PVE::Storage::storage_migrate_snapshot($storecfg, $sid);
> +    if ($migration_snapshot) {
> +	$snapshot = '__migration__';
> +	$with_snapshots = 1;
> +    }
> +
> +    my @export_formats = PVE::Storage::volume_export_formats($storecfg, $volid, undef, undef, $with_snapshots);
> +
> +    my $disk_import_opts = {
> +	format => $format,
> +	storage => $targetsid,
> +	snapshot => $snapshot,
> +	migration_snapshot => $migration_snapshot,
> +	'with-snapshots' => $with_snapshots,
> +	'allow-rename' => !$opts->{is_vmstate},
> +	'export-formats' => join(",", @export_formats),
> +	volname => $name,
> +    };
> +    my $res = PVE::Tunnel::write_tunnel($tunnel, 600, 'disk-import', $disk_import_opts);
> +    my $local = "/run/qemu-server/$local_vmid.storage";
> +    if (!$tunnel->{forwarded}->{$local}) {
> +	PVE::Tunnel::forward_unix_socket($tunnel, $local, $res->{socket});
> +    }
> +    my $socket = IO::Socket::UNIX->new(Peer => $local, Type => SOCK_STREAM())
> +	or die "failed to connect to websocket tunnel at $local\n";
> +    # we won't be reading from the socket
> +    shutdown($socket, 0);
> +
> +    my $disk_export_opts = {
> +	snapshot => $snapshot,
> +	migration_snapshot => $migration_snapshot,
> +	'with-snapshots' => $with_snapshots,
> +	ratelimit_bps => $bwlimit,
> +	cmd => {
> +	    output => '>&'.fileno($socket),
> +	},
> +    };
> +
> +    eval {
> +	PVE::Storage::volume_export_start(
> +	    $storecfg,
> +	    $volid,
> +	    $res->{format},
> +	    $disk_export_opts,
> +	);
> +    };
> +    my $send_error = $@;
> +    warn "$send_error\n" if $send_error;
> +
> +    # don't close the connection entirely otherwise the
> +    # receiving end might not get all buffered data (and
> +    # fails with 'connection reset by peer')
> +    shutdown($socket, 1);
> +
> +    # wait for the remote process to finish
> +    my $new_volid;
> +    while ($res = PVE::Tunnel::write_tunnel($tunnel, 10, 'query-disk-import')) {
> +	if ($res->{status} eq 'pending') {
> +	    if (my $msg = $res->{msg}) {
> +		$log->("disk-import: $msg\n");
> +	    } else {
> +		$log->("waiting for disk import to finish..\n");
> +	    }
> +	    sleep(1)
> +	} elsif ($res->{status} eq 'complete') {
> +	    $new_volid = $res->{volid};
> +	    last;
> +	} else {
> +	    warn "unknown query-disk-import result: $res->{status}\n";
> +	    last;
> +	}
> +    }
> +
> +    # now close the socket
> +    close($socket);
> +    if ($snapshot) {
> +	eval { PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snapshot, 0) };
> +	warn "could not remove source snapshot: $@\n" if $@;
> +    }
> +    die $send_error if $send_error;
> +    die "disk import failed - see log above\n" if !$new_volid;
> +
> +    return $new_volid;
> +}
> +
> +our $cmd_schema = {

Maybe 'bwlimit' should also live here (and for future re-usability 
become agnostic towards the operation for the limit, rather than 
hard-coding 'migration')?

And maybe we should add
     'query-disk-import' => {},
for completeness?
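
Roughly like this - the parameter names for 'bwlimit' are guessed from the 
handler in the qemu-server patch, so treat it as a sketch:

    bwlimit => {
	storage => {
	    type => 'string',
	    format => 'pve-storage-id',
	    optional => 1,
	},
	bwlimit => {
	    type => 'integer',
	    minimum => 0,
	    optional => 1,
	},
    },
    'query-disk-import' => {},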

> +    'disk-import' => {
> +	volname => {
> +	    type => 'string',
> +	    description => 'volume name to use as preferred target volume name',
> +	},
> +	format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
> +	'export-formats' => {
> +	    type => 'string',
> +	    description => 'list of supported export formats',
> +	},
> +	storage => {
> +	    type => 'string',
> +	    format => 'pve-storage-id',
> +	},
> +	'with-snapshots' => {
> +	    description =>
> +	        "Whether the stream includes intermediate snapshots",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 0,
> +	},
> +	'allow-rename' => {
> +	    description => "Choose a new volume ID if the requested " .
> +		"volume ID already exists, instead of throwing an error.",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 0,
> +	},
> +    },
> +};
> +
> +sub handle_disk_import {
> +    my ($state, $params) = @_;
> +
> +    die "disk import already running as PID '$state->{disk_import}->{pid}'\n"
> +	if $state->{disk_import}->{pid};
> +
> +    my $storage = delete $params->{storage};
> +    my $format = delete $params->{format};
> +    my $volname = delete $params->{volname};
> +
> +    my $import = PVE::Storage::volume_import_start($state->{storecfg}, $storage, $volname, $format, $state->{vmid}, $params);
> +
> +    my $socket = $import->{socket};
> +    $format = delete $import->{format};
> +
> +    $state->{sockets}->{$socket} = 1;
> +    $state->{disk_import} = $import;
> +
> +    chown $state->{socket_uid}, -1, $socket;
> +
> +    return {
> +	socket => $socket,
> +	format => $format,
> +    };
> +}
> +
> +sub handle_query_disk_import {
> +    my ($state, $params) = @_;
> +
> +    die "no disk import running\n"
> +	if !$state->{disk_import}->{pid};
> +
> +    my $pattern = PVE::Storage::volume_imported_message(undef, 1);
> +    my $result;
> +    eval {
> +	my $fh = $state->{disk_import}->{fh};
> +	PVE::Tools::run_with_timeout(5, sub { $result = <$fh>; });
> +	print "disk-import: $result\n" if $result;
> +    };

What if the disk import finishes after here...

> +    if ($result && $result =~ $pattern) {
> +	my $volid = $1;
> +	waitpid($state->{disk_import}->{pid}, 0);
> +
> +	my $unix = $state->{disk_import}->{socket};
> +	unlink $unix;
> +	delete $state->{sockets}->{$unix};
> +	delete $state->{disk_import};
> +	$state->{cleanup}->{volumes}->{$volid} = 1;
> +	return {
> +	    status => "complete",
> +	    volid => $volid,
> +	};
> +    } elsif (!$result && waitpid($state->{disk_import}->{pid}, WNOHANG)) {

...but before the call to waitpid? Or is it guaranteed to keep running 
until we read from the file handle?
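
Buffered pipe output stays readable until EOF even after the child exits, 
so one way to close the race would be to reap first and re-read before 
deciding - e.g., directly before the pattern check:

    my $fh = $state->{disk_import}->{fh};
    my $exited = waitpid($state->{disk_import}->{pid}, WNOHANG);
    # output written before the exit is still buffered in the pipe
    $result = <$fh> if !$result && $exited;

The 'complete' branch could then skip its own waitpid() when $exited is 
already set, and the error branch would test $exited instead of calling 
waitpid() again.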

> +	my $unix = $state->{disk_import}->{socket};
> +	unlink $unix;
> +	delete $state->{sockets}->{$unix};
> +	delete $state->{disk_import};
> +
> +	return {
> +	    status => "error",
> +	};
> +    } else {
> +	return {
> +	    status => "pending",
> +	    msg => $result,
> +	};
> +    }
> +}




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation Fabian Grünbichler
@ 2022-01-04 11:37   ` Fabian Ebner
  2022-01-19 10:34     ` Fabian Grünbichler
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Ebner @ 2022-01-04 11:37 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

On 22.12.21 14:52, Fabian Grünbichler wrote:
> in case we have no explicit fingerprint, we use openssl's regular "PEER"
> verification. if we have a fingerprint, we ignore openssl altogether and
> just verify the fingerprint of the presented leaf certificate.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      v3: switch to using hex instead of no-longer-existing digest_to_hex
>      v2: new
> 
>   src/main.rs | 47 ++++++++++++++++++++++++++++++++++++++++++++---
>   1 file changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/src/main.rs b/src/main.rs
> index 582214c..49d6ffe 100644
> --- a/src/main.rs
> +++ b/src/main.rs
> @@ -134,9 +134,50 @@ impl CtrlTunnel {
>           }
>   
>           let mut ssl_connector_builder = SslConnector::builder(SslMethod::tls())?;
> -        if fingerprint.is_some() {
> -            // FIXME actually verify fingerprint via callback!
> -            ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::NONE);
> +        if let Some(expected) = fingerprint {
> +            ssl_connector_builder.set_verify_callback(
> +                openssl::ssl::SslVerifyMode::NONE,
> +                move |_valid, ctx| {
> +                    let cert = match ctx.current_cert() {
> +                        Some(cert) => cert,
> +                        None => {
> +                            eprintln!("SSL context lacks current certificate.");
> +                            return false;
> +                        }
> +                    };
> +
> +                    let depth = ctx.error_depth();
> +                    if depth != 0 {
> +                        return true;
> +                    }

Sorry about my ignorance. Does using SslVerifyMode::NONE imply that 
there is an error? At depth 0? Why is it fine to return true if not?

> +
> +                    let fp = match cert.digest(openssl::hash::MessageDigest::sha256()) {
> +                        Ok(fp) => fp,
> +                        Err(err) => {
> +                            // should not happen
> +                            eprintln!("failed to calculate certificate FP - {}", err);
> +                            return false;
> +                        }
> +                    };
> +                    let fp_string = hex::encode(&fp);
> +                    let fp_string = fp_string
> +                        .as_bytes()
> +                        .chunks(2)
> +                        .map(|v| std::str::from_utf8(v).unwrap())
> +                        .collect::<Vec<&str>>()
> +                        .join(":");
> +
> +                    let expected = expected.to_lowercase();
> +                    if expected == fp_string {
> +                        true
> +                    } else {
> +                        eprintln!("certificate fingerprint does not match expected fingerprint!");
> +                        eprintln!("expected:    {}", expected);
> +                        eprintln!("encountered: {}", fp_string);
> +                        false
> +                    }
> +                },
> +            );
>           } else {
>               ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::PEER);
>           }




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 03/10] parse_config: optional strict mode
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 03/10] parse_config: optional strict mode Fabian Grünbichler
@ 2022-01-04 11:57   ` Fabian Ebner
  0 siblings, 0 replies; 41+ messages in thread
From: Fabian Ebner @ 2022-01-04 11:57 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

On 22.12.21 14:52, Fabian Grünbichler wrote:
> when passing a config from one cluster to another, we want to be strict
> when parsing - it's better to fail the migration early and upgrade the
> target node instead of failing the migration later (when significant
> work for transferring disks and/or state has already been done) or not
> at all, but silently lose config settings that the target doesn't
> understand.
> 
> this also might be helpful in other cases - e.g. when restoring from a
> backup.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      v3: new
> 
>   PVE/QemuServer.pm | 20 +++++++++++++++-----
>   1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index f5beb1b..be88385 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -2329,7 +2329,7 @@ sub destroy_vm {
>   }
>   
>   sub parse_vm_config {
> -    my ($filename, $raw) = @_;
> +    my ($filename, $raw, $strict) = @_;
>   
>       return if !defined($raw);
>   
> @@ -2339,6 +2339,16 @@ sub parse_vm_config {
>   	pending => {},
>       };
>   
> +    my $handle_error = sub {
> +	my ($msg) = @_;
> +
> +	if ($strict) {
> +	    warn $msg;
> +	} else {
> +	    die $msg;
> +	}

Inverted logic ;)
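
I.e., presumably it was meant to be:

    my $handle_error = sub {
	my ($msg) = @_;

	die $msg if $strict;
	warn $msg;
    };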

> +    };
> +
>       $filename =~ m|/qemu-server/(\d+)\.conf$|
>   	|| die "got strange filename '$filename'";
>   
> @@ -2393,14 +2403,14 @@ sub parse_vm_config {
>   	    if ($section eq 'pending') {
>   		$conf->{delete} = $value; # we parse this later
>   	    } else {
> -		warn "vm $vmid - propertry 'delete' is only allowed in [PENDING]\n";
> +		$handle_error->("vm $vmid - property 'delete' is only allowed in [PENDING]\n");
>   	    }
>   	} elsif ($line =~ m/^([a-z][a-z_]*\d*):\s*(.+?)\s*$/) {
>   	    my $key = $1;
>   	    my $value = $2;
>   	    eval { $value = check_type($key, $value); };
>   	    if ($@) {
> -		warn "vm $vmid - unable to parse value of '$key' - $@";
> +		$handle_error->("vm $vmid - unable to parse value of '$key' - $@");
>   	    } else {
>   		$key = 'ide2' if $key eq 'cdrom';
>   		my $fmt = $confdesc->{$key}->{format};
> @@ -2410,7 +2420,7 @@ sub parse_vm_config {
>   			$v->{file} = $volid;
>   			$value = print_drive($v);
>   		    } else {
> -			warn "vm $vmid - unable to parse value of '$key'\n";
> +			$handle_error->("vm $vmid - unable to parse value of '$key'\n");
>   			next;
>   		    }
>   		}
> @@ -2418,7 +2428,7 @@ sub parse_vm_config {
>   		$conf->{$key} = $value;
>   	    }
>   	} else {
> -	    warn "vm $vmid - unable to parse config: $line\n";
> +	    $handle_error->("vm $vmid - unable to parse config: $line\n");
>   	}
>       }
>   




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling Fabian Grünbichler
@ 2022-01-04 13:58   ` Fabian Ebner
  2022-01-04 16:44     ` Roland
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Ebner @ 2022-01-04 13:58 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

Two comments inline:

On 22.12.21 14:52, Fabian Grünbichler wrote:
> remote migration uses a websocket connection to a task worker running on
> the target node instead of commands via SSH to control the migration.
> this websocket tunnel is started earlier than the SSH tunnel, and allows
> adding UNIX-socket forwarding over additional websocket connections
> on-demand.
> 
> the main differences to regular intra-cluster migration are:
> - source VM config and disks are only removed upon request via --delete
> - shared storages are treated like local storages, since we can't
> assume they are shared across clusters (with potential to extend this
> by marking storages as shared)
> - NBD migrated disks are explicitly pre-allocated on the target node via
> tunnel command before starting the target VM instance
> - in addition to storages, network bridges and the VMID itself is
> transformed via a user defined mapping
> - all commands and migration data streams are sent via a WS tunnel proxy
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      requires bumped pve-guest-common
>      
>      v3:
>      - move WS tunnel helpers to pve-guest-common-perl
>      - check bridge mapping early
>      
>      v2:
>      - improve tunnel version info printing and error handling
>      - don't cleanup unix sockets twice
>      - url escape remote socket path
>      - cleanup nits and small issues
>      
>      requires proxmox-websocket-tunnel
>      
>      v3:
>      - fix misplaced parentheses
>      
>      v2:
>      - improve tunnel version info printing and error handling
>      - don't cleanup unix sockets twice
>      - url escape remote socket path
>      - cleanup nits and small issues
> 
>   PVE/API2/Qemu.pm   |   2 +-
>   PVE/QemuMigrate.pm | 425 +++++++++++++++++++++++++++++++++++++--------
>   PVE/QemuServer.pm  |   7 +-
>   3 files changed, 359 insertions(+), 75 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index d188b77..cf90fe7 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4803,7 +4803,7 @@ __PACKAGE__->register_method({
>   		    # bump/reset for breaking changes
>   		    # bump/bump for opt-in changes
>   		    return {
> -			api => 2,
> +			api => $PVE::QemuMigrate::WS_TUNNEL_VERSION,
>   			age => 0,
>   		    };
>   		},
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 897018b..3e9d29e 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -5,11 +5,10 @@ use warnings;
>   
>   use IO::File;
>   use IPC::Open2;
> -use POSIX qw( WNOHANG );
>   use Time::HiRes qw( usleep );
>   
> -use PVE::Format qw(render_bytes);
>   use PVE::Cluster;
> +use PVE::Format qw(render_bytes);
>   use PVE::GuestHelpers qw(safe_boolean_ne safe_string_ne);
>   use PVE::INotify;
>   use PVE::RPCEnvironment;
> @@ -17,6 +16,7 @@ use PVE::Replication;
>   use PVE::ReplicationConfig;
>   use PVE::ReplicationState;
>   use PVE::Storage;
> +use PVE::StorageTunnel;
>   use PVE::Tools;
>   use PVE::Tunnel;
>   
> @@ -31,6 +31,9 @@ use PVE::QemuServer;
>   use PVE::AbstractMigrate;
>   use base qw(PVE::AbstractMigrate);
>   
> +# compared against remote end's minimum version
> +our $WS_TUNNEL_VERSION = 2;
> +
>   sub fork_tunnel {
>       my ($self, $ssh_forward_info) = @_;
>   
> @@ -43,6 +46,35 @@ sub fork_tunnel {
>       return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
>   }
>   
> +sub fork_websocket_tunnel {
> +    my ($self, $storages, $bridges) = @_;
> +
> +    my $remote = $self->{opts}->{remote};
> +    my $conn = $remote->{conn};
> +
> +    my $websocket_url = "https://$conn->{host}:$conn->{port}/api2/json/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnelwebsocket";
> +    my $url = "/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnel";
> +
> +    my $tunnel_params = {
> +	url => $websocket_url,
> +    };
> +
> +    my $storage_list = join(',', keys %$storages);
> +    my $bridge_list = join(',', keys %$bridges);
> +
> +    my $req_params = {
> +	storages => $storage_list,
> +	bridges => $bridge_list,
> +    };
> +
> +    my $log = sub {
> +	my $line = shift;
> +	$self->log('info', $line);
> +    };
> +
> +    return PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
> +}
> +
>   # tunnel_info:
>   #   proto: unix (secure) or tcp (insecure/legacy compat)
>   #   addr: IP or UNIX socket path
> @@ -175,23 +207,34 @@ sub prepare {
>       }
>   
>       my $vollist = PVE::QemuServer::get_vm_volumes($conf);
> +
> +    my $storages = {};
>       foreach my $volid (@$vollist) {
>   	my ($sid, $volname) = PVE::Storage::parse_volume_id($volid, 1);
>   
> -	# check if storage is available on both nodes
> +	# check if storage is available on source node
>   	my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
>   
>   	my $targetsid = $sid;
> -	# NOTE: we currently ignore shared source storages in mappings so skip here too for now
> -	if (!$scfg->{shared}) {
> +	# NOTE: local ignores shared mappings, remote maps them
> +	if (!$scfg->{shared} || $self->{opts}->{remote}) {
>   	    $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
>   	}
>   
> -	my $target_scfg = PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
> -	my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
> +	$storages->{$targetsid} = 1;
>   
> -	die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
> -	    if !$target_scfg->{content}->{$vtype};
> +	if (!$self->{opts}->{remote}) {
> +	    # check if storage is available on target node
> +	    my $target_scfg = PVE::Storage::storage_check_enabled(
> +		$storecfg,
> +		$targetsid,
> +		$self->{node},
> +	    );
> +	    my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
> +
> +	    die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
> +		if !$target_scfg->{content}->{$vtype};
> +	}
>   
>   	if ($scfg->{shared}) {
>   	    # PVE::Storage::activate_storage checks this for non-shared storages
> @@ -201,10 +244,27 @@ sub prepare {
>   	}
>       }
>   
> -    # test ssh connection
> -    my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
> -    eval { $self->cmd_quiet($cmd); };
> -    die "Can't connect to destination address using public key\n" if $@;
> +    if ($self->{opts}->{remote}) {
> +	# test & establish websocket connection
> +	my $bridges = map_bridges($conf, $self->{opts}->{bridgemap}, 1);
> +	my $tunnel = $self->fork_websocket_tunnel($storages, $bridges);
> +	my $min_version = $tunnel->{version} - $tunnel->{age};
> +	$self->log('info', "local WS tunnel version: $WS_TUNNEL_VERSION");
> +	$self->log('info', "remote WS tunnel version: $tunnel->{version}");
> +	$self->log('info', "minimum required WS tunnel version: $min_version");
> +	die "Remote tunnel endpoint not compatible, upgrade required\n"
> +	    if $WS_TUNNEL_VERSION < $min_version;
> +	 die "Remote tunnel endpoint too old, upgrade required\n"
> +	    if $WS_TUNNEL_VERSION > $tunnel->{version};
> +
> +	print "websocket tunnel started\n";
> +	$self->{tunnel} = $tunnel;
> +    } else {
> +	# test ssh connection
> +	my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
> +	eval { $self->cmd_quiet($cmd); };
> +	die "Can't connect to destination address using public key\n" if $@;
> +    }
>   
>       return $running;
>   }
> @@ -242,7 +302,7 @@ sub scan_local_volumes {
>   	my @sids = PVE::Storage::storage_ids($storecfg);
>   	foreach my $storeid (@sids) {
>   	    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
> -	    next if $scfg->{shared};
> +	    next if $scfg->{shared} && !$self->{opts}->{remote};
>   	    next if !PVE::Storage::storage_check_enabled($storecfg, $storeid, undef, 1);
>   
>   	    # get list from PVE::Storage (for unused volumes)
> @@ -251,19 +311,24 @@ sub scan_local_volumes {
>   	    next if @{$dl->{$storeid}} == 0;
>   
>   	    my $targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $storeid);
> -	    # check if storage is available on target node
> -	    my $target_scfg = PVE::Storage::storage_check_enabled(
> -		$storecfg,
> -		$targetsid,
> -		$self->{node},
> -	    );
> -
> -	    die "content type 'images' is not available on storage '$targetsid'\n"
> -		if !$target_scfg->{content}->{images};
> +	    my $bwlimit_sids = [$storeid];
> +	    if (!$self->{opts}->{remote}) {
> +		# check if storage is available on target node
> +		my $target_scfg = PVE::Storage::storage_check_enabled(
> +		    $storecfg,
> +		    $targetsid,
> +		    $self->{node},
> +		);
> +
> +		die "content type 'images' is not available on storage '$targetsid'\n"
> +		    if !$target_scfg->{content}->{images};
> +
> +		push @$bwlimit_sids, $targetsid;
> +	    }
>   
>   	    my $bwlimit = PVE::Storage::get_bandwidth_limit(
>   		'migration',
> -		[$targetsid, $storeid],
> +		$bwlimit_sids,
>   		$self->{opts}->{bwlimit},
>   	    );
>   
> @@ -319,14 +384,17 @@ sub scan_local_volumes {
>   	    my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
>   
>   	    my $targetsid = $sid;
> -	    # NOTE: we currently ignore shared source storages in mappings so skip here too for now
> -	    if (!$scfg->{shared}) {
> +	    # NOTE: local ignores shared mappings, remote maps them
> +	    if (!$scfg->{shared} || $self->{opts}->{remote}) {
>   		$targetsid = PVE::QemuServer::map_id($self->{opts}->{storagemap}, $sid);
>   	    }
>   
> -	    PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
> +	    # check target storage on target node if intra-cluster migration
> +	    if (!$self->{opts}->{remote}) {
> +		PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
>   
> -	    return if $scfg->{shared};
> +		return if $scfg->{shared};
> +	    }
>   
>   	    $local_volumes->{$volid}->{ref} = $attr->{referenced_in_config} ? 'config' : 'snapshot';
>   	    $local_volumes->{$volid}->{ref} = 'storage' if $attr->{is_unused};
> @@ -415,6 +483,9 @@ sub scan_local_volumes {
>   
>   	    my $migratable = $scfg->{type} =~ /^(?:dir|btrfs|zfspool|lvmthin|lvm)$/;
>   
> +	    # TODO: what is this even here for?
> +	    $migratable = 1 if $self->{opts}->{remote};
> +
>   	    die "can't migrate '$volid' - storage type '$scfg->{type}' not supported\n"
>   		if !$migratable;
>   
> @@ -449,6 +520,10 @@ sub handle_replication {
>       my $local_volumes = $self->{local_volumes};
>   
>       return if !$self->{replication_jobcfg};
> +
> +    die "can't migrate VM with replicated volumes to remote cluster/node\n"
> +	if $self->{opts}->{remote};
> +
>       if ($self->{running}) {
>   
>   	my $version = PVE::QemuServer::kvm_user_version();
> @@ -548,24 +623,51 @@ sub sync_offline_local_volumes {
>       $self->log('info', "copying local disk images") if scalar(@volids);
>   
>       foreach my $volid (@volids) {
> -	my $targetsid = $local_volumes->{$volid}->{targetsid};
> -	my $bwlimit = $local_volumes->{$volid}->{bwlimit};
> -	$bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
> -
> -	my $storage_migrate_opts = {
> -	    'ratelimit_bps' => $bwlimit,
> -	    'insecure' => $opts->{migration_type} eq 'insecure',
> -	    'with_snapshots' => $local_volumes->{$volid}->{snapshots},
> -	    'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
> -	};
> +	my $new_volid;
>   
> -	my $logfunc = sub { $self->log('info', $_[0]); };
> -	my $new_volid = eval {
> -	    PVE::Storage::storage_migrate($storecfg, $volid, $self->{ssh_info},
> -					  $targetsid, $storage_migrate_opts, $logfunc);
> -	};
> -	if (my $err = $@) {
> -	    die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
> +	my $opts = $self->{opts};
> +	if ($opts->{remote}) {
> +	    my $log = sub {
> +		my $line = shift;
> +		$self->log('info', $line);
> +	    };
> +
> +	    $new_volid = PVE::StorageTunnel::storage_migrate(
> +		$self->{tunnel},
> +		$storecfg,
> +		$volid,
> +		$self->{vmid},
> +		$opts->{remote}->{vmid},
> +		$local_volumes->{$volid},
> +		$log,
> +	    );
> +	} else {
> +	    my $targetsid = $local_volumes->{$volid}->{targetsid};
> +
> +	    my $bwlimit = $local_volumes->{$volid}->{bwlimit};
> +	    $bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
> +
> +	    my $storage_migrate_opts = {
> +		'ratelimit_bps' => $bwlimit,
> +		'insecure' => $opts->{migration_type} eq 'insecure',
> +		'with_snapshots' => $local_volumes->{$volid}->{snapshots},
> +		'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
> +	    };
> +
> +	    my $logfunc = sub { $self->log('info', $_[0]); };
> +	    $new_volid = eval {
> +		PVE::Storage::storage_migrate(
> +		    $storecfg,
> +		    $volid,
> +		    $self->{ssh_info},
> +		    $targetsid,
> +		    $storage_migrate_opts,
> +		    $logfunc,
> +		);
> +	    };
> +	    if (my $err = $@) {
> +		die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
> +	    }
>   	}
>   
>   	$self->{volume_map}->{$volid} = $new_volid;
> @@ -581,6 +683,12 @@ sub sync_offline_local_volumes {
>   sub cleanup_remotedisks {
>       my ($self) = @_;
>   
> +    if ($self->{opts}->{remote}) {
> +	PVE::Tunnel::finish_tunnel($self->{tunnel}, 1);
> +	delete $self->{tunnel};
> +	return;
> +    }
> +
>       my $local_volumes = $self->{local_volumes};
>   
>       foreach my $volid (values %{$self->{volume_map}}) {
> @@ -630,8 +738,100 @@ sub phase1 {
>       $self->handle_replication($vmid);
>   
>       $self->sync_offline_local_volumes();
> +    $self->phase1_remote($vmid) if $self->{opts}->{remote};
>   };
>   
> +sub map_bridges {
> +    my ($conf, $map, $scan_only) = @_;
> +
> +    my $bridges = {};
> +
> +    foreach my $opt (keys %$conf) {
> +	next if $opt !~ m/^net\d+$/;
> +
> +	next if !$conf->{$opt};
> +	my $d = PVE::QemuServer::parse_net($conf->{$opt});
> +	next if !$d || !$d->{bridge};
> +
> +	my $target_bridge = PVE::QemuServer::map_id($map, $d->{bridge});
> +	$bridges->{$target_bridge}->{$opt} = $d->{bridge};
> +
> +	next if $scan_only;
> +
> +	$d->{bridge} = $target_bridge;
> +	$conf->{$opt} = PVE::QemuServer::print_net($d);
> +    }
> +
> +    return $bridges;
> +}
> +
> +sub phase1_remote {
> +    my ($self, $vmid) = @_;
> +
> +    my $remote_conf = PVE::QemuConfig->load_config($vmid);
> +    PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
> +
> +    my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
> +    for my $target (keys $bridges->%*) {
> +	for my $nic (keys $bridges->{$target}->%*) {
> +	    $self->log('info', "mapped: $nic from $bridges->{$target}->{$nic} to $target");
> +	}
> +    }
> +
> +    my @online_local_volumes = $self->filter_local_volumes('online');
> +
> +    my $storage_map = $self->{opts}->{storagemap};
> +    $self->{nbd} = {};
> +    PVE::QemuConfig->foreach_volume($remote_conf, sub {
> +	my ($ds, $drive) = @_;
> +
> +	# TODO eject CDROM?
> +	return if PVE::QemuServer::drive_is_cdrom($drive);
> +
> +	my $volid = $drive->{file};
> +	return if !$volid;
> +
> +	return if !grep { $_ eq $volid} @online_local_volumes;
> +
> +	my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
> +	my $scfg = PVE::Storage::storage_config($self->{storecfg}, $storeid);
> +	my $source_format = PVE::QemuServer::qemu_img_format($scfg, $volname);
> +
> +	# set by target cluster
> +	my $oldvolid = delete $drive->{file};
> +	delete $drive->{format};
> +
> +	my $targetsid = PVE::QemuServer::map_id($storage_map, $storeid);
> +
> +	my $params = {
> +	    format => $source_format,
> +	    storage => $targetsid,
> +	    drive => $drive,
> +	};
> +
> +	$self->log('info', "Allocating volume for drive '$ds' on remote storage '$targetsid'..");
> +	my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'disk', $params);
> +
> +	$self->log('info', "volume '$oldvolid' os '$res->{volid}' on the target\n");

s/os/is/

> +	$remote_conf->{$ds} = $res->{drivestr};
> +	$self->{nbd}->{$ds} = $res;
> +    });
> +
> +    my $conf_str = PVE::QemuServer::write_vm_config("remote", $remote_conf);
> +
> +    # TODO expose in PVE::Firewall?
> +    my $vm_fw_conf_path = "/etc/pve/firewall/$vmid.fw";
> +    my $fw_conf_str;
> +    $fw_conf_str = PVE::Tools::file_get_contents($vm_fw_conf_path)
> +	if -e $vm_fw_conf_path;
> +    my $params = {
> +	conf => $conf_str,
> +	'firewall-config' => $fw_conf_str,
> +    };
> +
> +    PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'config', $params);
> +}
> +
>   sub phase1_cleanup {
>       my ($self, $vmid, $err) = @_;
>   
> @@ -662,7 +862,6 @@ sub phase2_start_local_cluster {
>       my $local_volumes = $self->{local_volumes};
>       my @online_local_volumes = $self->filter_local_volumes('online');
>   
> -    $self->{storage_migration} = 1 if scalar(@online_local_volumes);
>       my $start = $params->{start_params};
>       my $migrate = $params->{migrate_opts};
>   
> @@ -793,10 +992,34 @@ sub phase2_start_local_cluster {
>       return ($tunnel_info, $spice_port);
>   }
>   
> +sub phase2_start_remote_cluster {
> +    my ($self, $vmid, $params) = @_;
> +
> +    die "insecure migration to remote cluster not implemented\n"
> +	if $params->{migrate_opts}->{type} ne 'websocket';
> +
> +    my $remote_vmid = $self->{opts}->{remote}->{vmid};
> +
> +    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 10, "start", $params);

10 seconds feels a bit short to me.

> +
> +    foreach my $drive (keys %{$res->{drives}}) {
> +	$self->{stopnbd} = 1;
> +	$self->{target_drive}->{$drive}->{drivestr} = $res->{drives}->{$drive}->{drivestr};
> +	my $nbd_uri = $res->{drives}->{$drive}->{nbd_uri};
> +	die "unexpected NBD uri for '$drive': $nbd_uri\n"
> +	    if $nbd_uri !~ s!/run/qemu-server/$remote_vmid\_!/run/qemu-server/$vmid\_!;
> +
> +	$self->{target_drive}->{$drive}->{nbd_uri} = $nbd_uri;
> +    }
> +
> +    return ($res->{migrate}, $res->{spice_port});
> +}
> +
>   sub phase2 {
>       my ($self, $vmid) = @_;
>   
>       my $conf = $self->{vmconf};
> +    my $local_volumes = $self->{local_volumes};
>   
>       # version > 0 for unix socket support
>       my $nbd_protocol_version = 1;
> @@ -828,10 +1051,39 @@ sub phase2 {
>   	},
>       };
>   
> -    my ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
> +    my ($tunnel_info, $spice_port);
> +
> +    my @online_local_volumes = $self->filter_local_volumes('online');
> +    $self->{storage_migration} = 1 if scalar(@online_local_volumes);
> +
> +    if (my $remote = $self->{opts}->{remote}) {
> +	my $remote_vmid = $remote->{vmid};
> +	$params->{migrate_opts}->{remote_node} = $self->{node};
> +	($tunnel_info, $spice_port) = $self->phase2_start_remote_cluster($vmid, $params);
> +	die "only UNIX sockets are supported for remote migration\n"
> +	    if $tunnel_info->{proto} ne 'unix';
> +
> +	my $remote_socket = $tunnel_info->{addr};
> +	my $local_socket = $remote_socket;
> +	$local_socket =~ s/$remote_vmid/$vmid/g;
> +	$tunnel_info->{addr} = $local_socket;
> +
> +	$self->log('info', "Setting up tunnel for '$local_socket'");
> +	PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
> +
> +	foreach my $remote_socket (@{$tunnel_info->{unix_sockets}}) {
> +	    my $local_socket = $remote_socket;
> +	    $local_socket =~ s/$remote_vmid/$vmid/g;
> +	    next if $self->{tunnel}->{forwarded}->{$local_socket};
> +	    $self->log('info', "Setting up tunnel for '$local_socket'");
> +	    PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
> +	}
> +    } else {
> +	($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
>   
> -    $self->log('info', "start remote tunnel");
> -    $self->start_remote_tunnel($tunnel_info);
> +	$self->log('info', "start remote tunnel");
> +	$self->start_remote_tunnel($tunnel_info);
> +    }
>   
>       my $migrate_uri = "$tunnel_info->{proto}:$tunnel_info->{addr}";
>       $migrate_uri .= ":$tunnel_info->{port}"
> @@ -841,8 +1093,6 @@ sub phase2 {
>   	$self->{storage_migration_jobs} = {};
>   	$self->log('info', "starting storage migration");
>   
> -	my @online_local_volumes = $self->filter_local_volumes('online');
> -
>   	die "The number of local disks does not match between the source and the destination.\n"
>   	    if (scalar(keys %{$self->{target_drive}}) != scalar(@online_local_volumes));
>   	foreach my $drive (keys %{$self->{target_drive}}){
> @@ -915,7 +1165,7 @@ sub phase2 {
>       };
>       $self->log('info', "migrate-set-parameters error: $@") if $@;
>   
> -    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
> +    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && !$self->{opts}->{remote}) {
>   	my $rpcenv = PVE::RPCEnvironment::get();
>   	my $authuser = $rpcenv->get_user();
>   
> @@ -1112,11 +1362,15 @@ sub phase2_cleanup {
>   
>       my $nodename = PVE::INotify::nodename();
>   
> -    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
> -    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> -    if (my $err = $@) {
> -        $self->log('err', $err);
> -        $self->{errors} = 1;
> +    if ($self->{tunnel} && $self->{tunnel}->{version} >= 2) {
> +	PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'stop');
> +    } else {
> +	my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
> +	eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> +	if (my $err = $@) {
> +	    $self->log('err', $err);
> +	    $self->{errors} = 1;
> +	}
>       }
>   
>       # cleanup after stopping, otherwise disks might be in-use by target VM!
> @@ -1149,7 +1403,7 @@ sub phase3_cleanup {
>   
>       my $tunnel = $self->{tunnel};
>   
> -    if ($self->{volume_map}) {
> +    if ($self->{volume_map} && !$self->{opts}->{remote}) {
>   	my $target_drives = $self->{target_drive};
>   
>   	# FIXME: for NBD storage migration we now only update the volid, and
> @@ -1165,27 +1419,34 @@ sub phase3_cleanup {
>       }
>   
>       # transfer replication state before move config
> -    $self->transfer_replication_state() if $self->{is_replicated};
> -    PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
> -    $self->switch_replication_job_target() if $self->{is_replicated};
> +    if (!$self->{opts}->{remote}) {
> +	$self->transfer_replication_state() if $self->{is_replicated};
> +	PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
> +	$self->switch_replication_job_target() if $self->{is_replicated};
> +    }
>   
>       if ($self->{livemigration}) {
>   	if ($self->{stopnbd}) {
>   	    $self->log('info', "stopping NBD storage migration server on target.");
>   	    # stop nbd server on remote vm - requirement for resume since 2.9
> -	    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
> +	    if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 2) {
> +		PVE::Tunnel::write_tunnel($tunnel, 30, 'nbdstop');
> +	    } else {
> +		my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
>   
> -	    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> -	    if (my $err = $@) {
> -		$self->log('err', $err);
> -		$self->{errors} = 1;
> +		eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> +		if (my $err = $@) {
> +		    $self->log('err', $err);
> +		    $self->{errors} = 1;
> +		}
>   	    }
>   	}
>   
>   	# config moved and nbd server stopped - now we can resume vm on target
>   	if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 1) {
> +	    my $cmd = $tunnel->{version} == 1 ? "resume $vmid" : "resume";
>   	    eval {
> -		PVE::Tunnel::write_tunnel($tunnel, 30, "resume $vmid");
> +		PVE::Tunnel::write_tunnel($tunnel, 30, $cmd);
>   	    };
>   	    if (my $err = $@) {
>   		$self->log('err', $err);
> @@ -1205,18 +1466,24 @@ sub phase3_cleanup {
>   	}
>   
>   	if ($self->{storage_migration} && PVE::QemuServer::parse_guest_agent($conf)->{fstrim_cloned_disks} && $self->{running}) {
> -	    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
> -	    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> +	    if ($self->{opts}->{remote}) {
> +		PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'fstrim');
> +	    } else {
> +		my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
> +		eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
> +	    }
>   	}
>       }
>   
>       # close tunnel on successful migration, on error phase2_cleanup closed it
> -    if ($tunnel) {
> +    if ($tunnel && $tunnel->{version} == 1) {
>   	eval { PVE::Tunnel::finish_tunnel($tunnel); };
>   	if (my $err = $@) {
>   	    $self->log('err', $err);
>   	    $self->{errors} = 1;
>   	}
> +	$tunnel = undef;
> +	delete $self->{tunnel};
>       }
>   
>       eval {
> @@ -1254,6 +1521,9 @@ sub phase3_cleanup {
>   
>       # destroy local copies
>       foreach my $volid (@not_replicated_volumes) {
> +	# remote is cleaned up below
> +	next if $self->{opts}->{remote};
> +
>   	eval { PVE::Storage::vdisk_free($self->{storecfg}, $volid); };
>   	if (my $err = $@) {
>   	    $self->log('err', "removing local copy of '$volid' failed - $err");
> @@ -1263,8 +1533,19 @@ sub phase3_cleanup {
>       }
>   
>       # clear migrate lock
> -    my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
> -    $self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
> +    if ($tunnel && $tunnel->{version} >= 2) {
> +	PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
> +
> +	PVE::Tunnel::finish_tunnel($tunnel);
> +    } else {
> +	my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
> +	$self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
> +    }
> +
> +    if ($self->{opts}->{remote} && $self->{opts}->{delete}) {
> +	eval { PVE::QemuServer::destroy_vm($self->{storecfg}, $vmid, 1, undef, 0) };
> +	warn "Failed to remove source VM - $@\n" if $@;
> +    }
>   }
>   
>   sub final_cleanup {
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 9971f2c..9490615 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -5425,7 +5425,10 @@ sub vm_start_nolock {
>       my $defaults = load_defaults();
>   
>       # set environment variable useful inside network script
> -    $ENV{PVE_MIGRATED_FROM} = $migratedfrom if $migratedfrom;
> +    # for remote migration the config is available on the target node!
> +    if (!$migrate_opts->{remote_node}) {
> +	$ENV{PVE_MIGRATED_FROM} = $migratedfrom;
> +    }
>   
>       PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
>   
> @@ -5667,7 +5670,7 @@ sub vm_start_nolock {
>   
>   	my $migrate_storage_uri;
>   	# nbd_protocol_version > 0 for unix socket support
> -	if ($nbd_protocol_version > 0 && $migration_type eq 'secure') {
> +	if ($nbd_protocol_version > 0 && ($migration_type eq 'secure' || $migration_type eq 'websocket')) {
>   	    my $socket_path = "/run/qemu-server/$vmid\_nbd.migrate";
>   	    mon_cmd($vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $socket_path } } );
>   	    $migrate_storage_uri = "nbd:unix:$socket_path";




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling
  2022-01-04 13:58   ` Fabian Ebner
@ 2022-01-04 16:44     ` Roland
  2022-01-11  8:19       ` Thomas Lamprecht
  0 siblings, 1 reply; 41+ messages in thread
From: Roland @ 2022-01-04 16:44 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Ebner, Fabian Grünbichler


>>   +sub phase2_start_remote_cluster {
>> +    my ($self, $vmid, $params) = @_;
>> +
>> +    die "insecure migration to remote cluster not implemented\n"
>> +    if $params->{migrate_opts}->{type} ne 'websocket';
>> +
>> +    my $remote_vmid = $self->{opts}->{remote}->{vmid};
>> +
>> +    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 10,
>> "start", $params);
>
> 10 seconds feels a bit short to me.
>
Please, administrators like tunables and knobs for changing default values.

Not only for being empowered to fix things themselves but also to be
able to dig into a problem and find the root cause...

I remember that I had more than one occasion where I grepped for
timeout or other limiting values in Proxmox or other software's source,
and often gave up in the end, because it was too deeply hidden or I got
too many hits/findings.

Finding such a value without knowing the code can often be like searching
for a needle in a haystack, and extremely frustrating.

I would be happy if such important values were defined with a
descriptive variable name at a suitable location, maybe even with a
comment about what it's for (even if it's not meant to be changed/tuned)

just my 10 cents...

Roland




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision Fabian Grünbichler
@ 2022-01-05  9:00   ` Fabian Ebner
  2022-01-19 14:38     ` Fabian Grünbichler
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Ebner @ 2022-01-05  9:00 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
> into new top-level helper for re-use with remote migration.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> new in v3
> 
>   PVE/Storage.pm | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
> index 05be3dd..4f3a44a 100755
> --- a/PVE/Storage.pm
> +++ b/PVE/Storage.pm
> @@ -643,6 +643,14 @@ my $volname_for_storage = sub {
>       }
>   };
>   
> +# whether a migration snapshot is needed for a given storage
> +sub storage_migrate_snapshot {
> +    my ($cfg, $storeid) = @_;
> +    my $scfg = storage_config($cfg, $storeid);
> +
> +    return $scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs';
> +}
> +
>   sub storage_migrate {
>       my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
>   
> @@ -688,10 +696,8 @@ sub storage_migrate {
>   
>       my $migration_snapshot;
>       if (!defined($snapshot)) {
> -	if ($scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs') {
> -	    $migration_snapshot = 1;
> -	    $snapshot = '__migration__';
> -	}
> +	$migration_snapshot = storage_migrate_snapshot->($cfg, $target_storeid);

The call is wrong, because of the ->
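
For illustration (hypothetical snippet, not from the patch): with a named
sub, the arrow makes Perl do a zero-argument call first and then
dereference the return value:

    storage_migrate_snapshot->($cfg, $target_storeid);
    # parsed as: storage_migrate_snapshot()->($cfg, $target_storeid)
    # i.e. call the sub without arguments, then try to invoke its return
    # value (a plain boolean, not a code ref) as a code ref

    storage_migrate_snapshot($cfg, $target_storeid);  # the intended call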

And why base the decision on the target storage rather than keeping it 
based on the source storage? That breaks migration from e.g. lvm-thin to 
btrfs, because it now tries to use a snapshot and doesn't find any 
common transfer format anymore.

> +	$snapshot = '__migration__' if $migration_snapshot;
>       }
>   
>       my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $snapshot, $base_snapshot, $with_snapshots);




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 storage 3/4] storage_migrate: pull out import/export_prepare
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 3/4] storage_migrate: pull out import/export_prepare Fabian Grünbichler
@ 2022-01-05  9:59   ` Fabian Ebner
  0 siblings, 0 replies; 41+ messages in thread
From: Fabian Ebner @ 2022-01-05  9:59 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
> for re-use with remote migration, where import and export happen on
> different clusters connected via a websocket instead of SSH tunnel.
> 
> no semantic changes intended.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> new in v3
> 
> using $opts across the board in storage_migrate to prevent future bugs
> where options get out of sync between helper calls..
> 
>   PVE/Storage.pm | 113 ++++++++++++++++++++++++++++++-------------------
>   1 file changed, 70 insertions(+), 43 deletions(-)
> 
> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
> index 4f3a44a..976bd63 100755
> --- a/PVE/Storage.pm
> +++ b/PVE/Storage.pm
> @@ -651,16 +651,70 @@ sub storage_migrate_snapshot {
>       return $scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs';
>   }
>   
> -sub storage_migrate {
> -    my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
> +my $volume_import_prepare = sub {
> +    my ($volid, $format, $path, $apiver, $opts) = @_;
>   
>       my $base_snapshot = $opts->{base_snapshot};
>       my $snapshot = $opts->{snapshot};
> -    my $ratelimit_bps = $opts->{ratelimit_bps};
> -    my $insecure = $opts->{insecure};
>       my $with_snapshots = $opts->{with_snapshots} ? 1 : 0;
> +    my $migration_snapshot = $opts->{migration_snapshot} ? 1 : 0;
>       my $allow_rename = $opts->{allow_rename} ? 1 : 0;
>   
> +    my $recv = ['pvesm', 'import', $volid, $format, $path, '-with-snapshots', $with_snapshots];
> +    if (defined($snapshot)) {
> +	push @$recv, '-snapshot', $snapshot;
> +    }
> +    if ($migration_snapshot) {
> +	push @$recv, '-delete-snapshot', $snapshot;
> +    }
> +    push @$recv, '-allow-rename', $allow_rename if $apiver >= 5;
> +
> +    if (defined($base_snapshot)) {
> +	# Check if the snapshot exists on the remote side:
> +	push @$recv, '-base', $base_snapshot if $apiver >= 9;
> +    }
> +
> +    return $recv;
> +};
> +
> +my $volume_export_prepare = sub {
> +    my ($cfg, $volid, $format, $logfunc, $opts) = @_;
> +    my $base_snapshot = $opts->{base_snapshot};
> +    my $snapshot = $opts->{snapshot};
> +    my $with_snapshots = $opts->{with_snapshots} ? 1 : 0;
> +    my $migration_snapshot = $opts->{migration_snapshot} ? 1 : 0;
> +    my $ratelimit_bps = $opts->{ratelimit_bps};
> +
> +    my $send = ['pvesm', 'export', $volid, $format, '-', '-with-snapshots', $with_snapshots];
> +    if (defined($snapshot)) {
> +	push @$send, '-snapshot', $snapshot;
> +    }
> +    if (defined($base_snapshot)) {
> +	push @$send, '-base', $base_snapshot;
> +    }
> +
> +    my @cstream;
> +    if (defined($ratelimit_bps)) {
> +	@cstream = [ '/usr/bin/cstream', '-t', $ratelimit_bps ];

Nit: It's Perl I guess and it seems to work, but assigning an array ref 
to an array looks weird to me. Why not keep the surrounding ( )? Or 
better, can't we just avoid wrapping the ref in an array altogether?
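
For illustration (not from the patch), the variants side by side:

    my @cstream = [ '/usr/bin/cstream', '-t', $ratelimit_bps ];   # works: one-element list
    my @cstream = ([ '/usr/bin/cstream', '-t', $ratelimit_bps ]); # same, but explicit
    my $cstream = [ '/usr/bin/cstream', '-t', $ratelimit_bps ];   # plain scalar holding the ref

(the last variant would of course need the call sites adjusted to pass
$cstream explicitly instead of flattening @cstream)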

> +	$logfunc->("using a bandwidth limit of $ratelimit_bps bps for transferring '$volid'") if $logfunc;
> +    }
> +
> +    volume_snapshot($cfg, $volid, $snapshot) if $migration_snapshot;
> +
> +    if (defined($snapshot)) {
> +	activate_volumes($cfg, [$volid], $snapshot);
> +    } else {
> +	activate_volumes($cfg, [$volid]);
> +    }
> +
> +    return [ $send, @cstream ];
> +};
> +
> +sub storage_migrate {
> +    my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
> +
> +    my $insecure = $opts->{insecure};
> +
>       my ($storeid, $volname) = parse_volume_id($volid);
>   
>       my $scfg = storage_config($cfg, $storeid);
> @@ -688,19 +742,12 @@ sub storage_migrate {
>       my $ssh_base = PVE::SSHInfo::ssh_info_to_command_base($target_sshinfo);
>       local $ENV{RSYNC_RSH} = PVE::Tools::cmd2string($ssh_base);
>   
> -    my @cstream;
> -    if (defined($ratelimit_bps)) {
> -	@cstream = ([ '/usr/bin/cstream', '-t', $ratelimit_bps ]);
> -	$logfunc->("using a bandwidth limit of $ratelimit_bps bps for transferring '$volid'") if $logfunc;
> +    if (!defined($opts->{snapshot})) {
> +	$opts->{migration_snapshot} = storage_migrate_snapshot($cfg, $target_storeid);
> +	$opts->{snapshot} = '__migration__' if $opts->{migration_snapshot};
>       }
>   
> -    my $migration_snapshot;
> -    if (!defined($snapshot)) {
> -	$migration_snapshot = storage_migrate_snapshot->($cfg, $target_storeid);
> -	$snapshot = '__migration__' if $migration_snapshot;
> -    }
> -
> -    my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $snapshot, $base_snapshot, $with_snapshots);
> +    my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $opts->{snapshot}, $opts->{base_snapshot}, $opts->{with_snapshots});
>       die "cannot migrate from storage type '$scfg->{type}' to '$tcfg->{type}'\n" if !@formats;
>       my $format = $formats[0];
>   
> @@ -715,22 +762,7 @@ sub storage_migrate {
>       my $match_api_version = sub { $target_apiver = $1 if $_[0] =~ m!^APIVER (\d+)$!; };
>       eval { run_command($get_api_version, logfunc => $match_api_version); };
>   
> -    my $send = ['pvesm', 'export', $volid, $format, '-', '-with-snapshots', $with_snapshots];
> -    my $recv = [@$ssh, '--', 'pvesm', 'import', $target_volid, $format, $import_fn, '-with-snapshots', $with_snapshots];
> -    if (defined($snapshot)) {
> -	push @$send, '-snapshot', $snapshot;
> -	push @$recv, '-snapshot', $snapshot;
> -    }
> -    if ($migration_snapshot) {
> -	push @$recv, '-delete-snapshot', $snapshot;
> -    }
> -    push @$recv, '-allow-rename', $allow_rename if $target_apiver >= 5;
> -
> -    if (defined($base_snapshot)) {
> -	# Check if the snapshot exists on the remote side:
> -	push @$send, '-base', $base_snapshot;
> -	push @$recv, '-base', $base_snapshot if $target_apiver >= 9;
> -    }
> +    my $recv = [ @$ssh, '--', $volume_import_prepare->($target_volid, $format, $import_fn, $target_apiver, $opts)->@* ];
>   
>       my $new_volid;
>       my $pattern = volume_imported_message(undef, 1);
> @@ -745,19 +777,13 @@ sub storage_migrate {
>   	}
>       };
>   
> -    volume_snapshot($cfg, $volid, $snapshot) if $migration_snapshot;
> -
> -    if (defined($snapshot)) {
> -	activate_volumes($cfg, [$volid], $snapshot);
> -    } else {
> -	activate_volumes($cfg, [$volid]);
> -    }
> +    my $cmds = [ $volume_export_prepare->($cfg, $volid, $format, $logfunc, $opts)->@* ];
>   
>       eval {
>   	if ($insecure) {
>   	    my $input = IO::File->new();
>   	    my $info = IO::File->new();
> -	    open3($input, $info, $info, @{$recv})
> +	    open3($input, $info, $info, @$recv)
>   		or die "receive command failed: $!\n";
>   	    close($input);
>   
> @@ -774,7 +800,7 @@ sub storage_migrate {
>   	    # we won't be reading from the socket
>   	    shutdown($socket, 0);
>   
> -	    eval { run_command([$send, @cstream], output => '>&'.fileno($socket), errfunc => $logfunc); };
> +	    eval { run_command($cmds, output => '>&'.fileno($socket), errfunc => $logfunc); };
>   	    my $send_error = $@;
>   
>   	    # don't close the connection entirely otherwise the receiving end
> @@ -795,7 +821,8 @@ sub storage_migrate {
>   
>   	    die $send_error if $send_error;
>   	} else {
> -	    run_command([$send, @cstream, $recv], logfunc => $match_volid_and_log);
> +	    push @$cmds, $recv;
> +	    run_command($cmds, logfunc => $match_volid_and_log);
>   	}
>   
>   	die "unable to get ID of the migrated volume\n"
> @@ -803,8 +830,8 @@ sub storage_migrate {
>       };
>       my $err = $@;
>       warn "send/receive failed, cleaning up snapshot(s)..\n" if $err;
> -    if ($migration_snapshot) {
> -	eval { volume_snapshot_delete($cfg, $volid, $snapshot, 0) };
> +    if ($opts->{migration_snapshot}) {
> +	eval { volume_snapshot_delete($cfg, $volid, $opts->{snapshot}, 0) };
>   	warn "could not remove source snapshot: $@\n" if $@;
>       }
>       die $err if $err;




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module
  2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module Fabian Grünbichler
  2022-01-03 14:30   ` Fabian Ebner
@ 2022-01-05 10:50   ` Fabian Ebner
  1 sibling, 0 replies; 41+ messages in thread
From: Fabian Ebner @ 2022-01-05 10:50 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
> encapsulating storage-related tunnel methods, currently
> - source-side storage-migrate helper
> - target-side disk-import handler
> - target-side query-disk-import handler
> 
> to be extended further with replication-related handlers and helpers.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      new in v3, includes code previously in qemu-server
> 
>   src/Makefile             |   1 +
>   src/PVE/StorageTunnel.pm | 231 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 232 insertions(+)
>   create mode 100644 src/PVE/StorageTunnel.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index d82162c..baa2688 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -12,6 +12,7 @@ install: PVE
>   	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
> +	install -m 0644 PVE/StorageTunnel.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
>   	install -d ${PERL5DIR}/PVE/VZDump
>   	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
> diff --git a/src/PVE/StorageTunnel.pm b/src/PVE/StorageTunnel.pm
> new file mode 100644
> index 0000000..06902ef
> --- /dev/null
> +++ b/src/PVE/StorageTunnel.pm
> @@ -0,0 +1,231 @@
> +package PVE::StorageTunnel;
> +
> +use strict;
> +use warnings;
> +
> +use IO::Socket::UNIX;
> +use POSIX qw(WNOHANG);
> +use Socket qw(SOCK_STREAM);
> +
> +use PVE::Tools;
> +use PVE::Tunnel;
> +use PVE::Storage;
> +
> +sub storage_migrate {
> +    my ($tunnel, $storecfg, $volid, $local_vmid, $remote_vmid, $opts, $log) = @_;
> +
> +    my $bwlimit = $opts->{bwlimit};
> +    my $targetsid = $opts->{targetsid};
> +
> +    # use 'migrate' limit for transfer to other node
> +    my $bwlimit_opts = {
> +	storage => $targetsid,
> +	bwlimit => $bwlimit,
> +    };
> +    my $remote_bwlimit = PVE::Tunnel::write_tunnel($tunnel, 10, 'bwlimit', $bwlimit_opts);
> +    $remote_bwlimit = $remote_bwlimit->{bwlimit};
> +    if (defined($remote_bwlimit)) {
> +	$bwlimit = $remote_bwlimit if !defined($bwlimit);
> +	$bwlimit = $remote_bwlimit if $remote_bwlimit < $bwlimit;
> +    }
> +
> +    # JSONSchema and get_bandwidth_limit use kbps - storage_migrate bps
> +    $bwlimit = $bwlimit * 1024 if defined($bwlimit);
> +
> +    # adapt volume name for import call
> +    my ($sid, undef) = PVE::Storage::parse_volume_id($volid);
> +    my (undef, $name, undef, undef, undef, undef, $format) = PVE::Storage::parse_volname($storecfg, $volid);
> +    my $scfg = PVE::Storage::storage_config($storecfg, $sid);
> +    PVE::Storage::activate_volumes($storecfg, [$volid]);
> +    if ($local_vmid != $remote_vmid) {
> +	$name =~ s/-$local_vmid-/-$remote_vmid-/g;
> +	$name =~ s/^$local_vmid\///; # re-added on target if dir-based storage
> +    }
> +
> +    my $with_snapshots = $opts->{snapshots} ? 1 : 0;
> +    my $snapshot;
> +    my $migration_snapshot = PVE::Storage::storage_migrate_snapshot($storecfg, $sid);
> +    if ($migration_snapshot) {
> +	$snapshot = '__migration__';
> +	$with_snapshots = 1;
> +    }
> +
> +    my @export_formats = PVE::Storage::volume_export_formats($storecfg, $volid, undef, undef, $with_snapshots);
> +
> +    my $disk_import_opts = {
> +	format => $format,
> +	storage => $targetsid,
> +	snapshot => $snapshot,
> +	migration_snapshot => $migration_snapshot,

Noticed while looking at how volume_import_start is used:
snapshot and migration_snapshot are not defined in the schema for 
'disk-import' below
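
i.e., something like this would probably be needed there (untested
sketch, following the style of the existing entries):

    snapshot => {
        type => 'string',
        description => 'the snapshot to transfer',
        optional => 1,
    },
    migration_snapshot => {
        type => 'boolean',
        description => 'whether the snapshot is a temporary one created for the migration',
        optional => 1,
        default => 0,
    },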

> +	'with-snapshots' => $with_snapshots,
> +	'allow-rename' => !$opts->{is_vmstate},
> +	'export-formats' => join(",", @export_formats),
> +	volname => $name,
> +    };
> +    my $res = PVE::Tunnel::write_tunnel($tunnel, 600, 'disk-import', $disk_import_opts);
> +    my $local = "/run/qemu-server/$local_vmid.storage";
> +    if (!$tunnel->{forwarded}->{$local}) {
> +	PVE::Tunnel::forward_unix_socket($tunnel, $local, $res->{socket});
> +    }
> +    my $socket = IO::Socket::UNIX->new(Peer => $local, Type => SOCK_STREAM())
> +	or die "failed to connect to websocket tunnel at $local\n";
> +    # we won't be reading from the socket
> +    shutdown($socket, 0);
> +
> +    my $disk_export_opts = {
> +	snapshot => $snapshot,
> +	migration_snapshot => $migration_snapshot,
> +	'with-snapshots' => $with_snapshots,
> +	ratelimit_bps => $bwlimit,
> +	cmd => {
> +	    output => '>&'.fileno($socket),
> +	},
> +    };
> +
> +    eval {
> +	PVE::Storage::volume_export_start(
> +	    $storecfg,
> +	    $volid,
> +	    $res->{format},
> +	    $disk_export_opts,
> +	);
> +    };
> +    my $send_error = $@;
> +    warn "$send_error\n" if $send_error;
> +
> +    # don't close the connection entirely otherwise the
> +    # receiving end might not get all buffered data (and
> +    # fails with 'connection reset by peer')
> +    shutdown($socket, 1);
> +
> +    # wait for the remote process to finish
> +    my $new_volid;
> +    while ($res = PVE::Tunnel::write_tunnel($tunnel, 10, 'query-disk-import')) {
> +	if ($res->{status} eq 'pending') {
> +	    if (my $msg = $res->{msg}) {
> +		$log->("disk-import: $msg\n");
> +	    } else {
> +		$log->("waiting for disk import to finish..\n");
> +	    }
> +	    sleep(1)
> +	} elsif ($res->{status} eq 'complete') {
> +	    $new_volid = $res->{volid};
> +	    last;
> +	} else {
> +	    warn "unknown query-disk-import result: $res->{status}\n";
> +	    last;
> +	}
> +    }
> +
> +    # now close the socket
> +    close($socket);
> +    if ($snapshot) {
> +	eval { PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snapshot, 0) };
> +	warn "could not remove source snapshot: $@\n" if $@;
> +    }
> +    die $send_error if $send_error;
> +    die "disk import failed - see log above\n" if !$new_volid;
> +
> +    return $new_volid;
> +}
> +
> +our $cmd_schema = {
> +    'disk-import' => {
> +	volname => {
> +	    type => 'string',
> +	    description => 'volume name to use as preferred target volume name',
> +	},
> +	format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
> +	'export-formats' => {
> +	    type => 'string',
> +	    description => 'list of supported export formats',
> +	},
> +	storage => {
> +	    type => 'string',
> +	    format => 'pve-storage-id',
> +	},
> +	'with-snapshots' => {
> +	    description =>
> +	        "Whether the stream includes intermediate snapshots",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 0,
> +	},
> +	'allow-rename' => {
> +	    description => "Choose a new volume ID if the requested " .
> +		"volume ID already exists, instead of throwing an error.",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 0,
> +	},
> +    },
> +};
> +
> +sub handle_disk_import {
> +    my ($state, $params) = @_;
> +
> +    die "disk import already running as PID '$state->{disk_import}->{pid}'\n"
> +	if $state->{disk_import}->{pid};
> +
> +    my $storage = delete $params->{storage};
> +    my $format = delete $params->{format};
> +    my $volname = delete $params->{volname};
> +
> +    my $import = PVE::Storage::volume_import_start($state->{storecfg}, $storage, $volname, $format, $state->{vmid}, $params);
> +
> +    my $socket = $import->{socket};
> +    $format = delete $import->{format};
> +
> +    $state->{sockets}->{$socket} = 1;
> +    $state->{disk_import} = $import;
> +
> +    chown $state->{socket_uid}, -1, $socket;
> +
> +    return {
> +	socket => $socket,
> +	format => $format,
> +    };
> +}
> +
> +sub handle_query_disk_import {
> +    my ($state, $params) = @_;
> +
> +    die "no disk import running\n"
> +	if !$state->{disk_import}->{pid};
> +
> +    my $pattern = PVE::Storage::volume_imported_message(undef, 1);
> +    my $result;
> +    eval {
> +	my $fh = $state->{disk_import}->{fh};
> +	PVE::Tools::run_with_timeout(5, sub { $result = <$fh>; });
> +	print "disk-import: $result\n" if $result;
> +    };
> +    if ($result && $result =~ $pattern) {
> +	my $volid = $1;
> +	waitpid($state->{disk_import}->{pid}, 0);
> +
> +	my $unix = $state->{disk_import}->{socket};
> +	unlink $unix;
> +	delete $state->{sockets}->{$unix};
> +	delete $state->{disk_import};
> +	$state->{cleanup}->{volumes}->{$volid} = 1;
> +	return {
> +	    status => "complete",
> +	    volid => $volid,
> +	};
> +    } elsif (!$result && waitpid($state->{disk_import}->{pid}, WNOHANG)) {
> +	my $unix = $state->{disk_import}->{socket};
> +	unlink $unix;
> +	delete $state->{sockets}->{$unix};
> +	delete $state->{disk_import};
> +
> +	return {
> +	    status => "error",
> +	};
> +    } else {
> +	return {
> +	    status => "pending",
> +	    msg => $result,
> +	};
> +    }
> +}




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling
  2022-01-04 16:44     ` Roland
@ 2022-01-11  8:19       ` Thomas Lamprecht
       [not found]         ` <<554040de-09d6-974b-143a-80c2d66b9573@proxmox.com>
  0 siblings, 1 reply; 41+ messages in thread
From: Thomas Lamprecht @ 2022-01-11  8:19 UTC (permalink / raw)
  To: Proxmox VE development discussion, Roland, Fabian Ebner,
	Fabian Grünbichler

On 04.01.22 17:44, Roland wrote:
>>>   +sub phase2_start_remote_cluster {
>>> +    my ($self, $vmid, $params) = @_;
>>> +
>>> +    die "insecure migration to remote cluster not implemented\n"
>>> +    if $params->{migrate_opts}->{type} ne 'websocket';
>>> +
>>> +    my $remote_vmid = $self->{opts}->{remote}->{vmid};
>>> +
>>> +    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 10,
>>> "start", $params);
>>
>> 10 seconds feels a bit short to me.

@Fabian(s):

Why not use the same as vm_start_nolock + some buffer? E.g.,

($params->{timeout} // config_aware_timeout($conf, $resume)) + 10;

That should be quite generous, as the worst-case time for the start of an
incoming migration, which is always paused so not doing much, is normally
way lower than a cold-start.

I mean, we're in a worker task and do not really care much if setup needs
a bit longer.
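
i.e., roughly (sketch only, assuming qemu-server's config_aware_timeout
helper is usable here):

    my $start_timeout = ($params->{timeout} // config_aware_timeout($conf, $resume)) + 10;
    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $start_timeout, "start", $params);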

>>
> Please, administrators like tunables and knobs for changing default values.
> 

@Roland

sure, but exposing all hundreds+ of timeout and other parameters will just
overload most users and produce guides to extend timeouts for issues that
should actually be fixed in the setup, i.e., when the timeout is just a symptom
of a bad config/hw/...

> Not only for being empowered to fix things themselves but also to be
> able to dig into a problem and find the root cause...
> 

With PVE et al. being under AGPLv3 you're already empowered to just change the
code, so let's keep it simple(r) for most users. If a timeout is actually too short,
we can always change it to a better-fitting value (at least if reports about that
reach us).

> I remember that i had more then one occasion , where i grepped for
> timeout or other limiting values in proxmox or other softwares source, 
> and often gave up in the end, because it was too deeply hidden or i got
> too many hits/findings.
> 

the same would happen if we exposed every possible parameter as a knob; you'd
have hundreds and get many hits on searches..

> Finding such without knowing the code can often be like searching for
> the needle in a haystack and extremely frustrating.

sure, and I can imagine that it's frustrating, but you can always ask here, on
pve-user or in other channels about the issue at hand, and people more used to
the code can often guide you to the actual parameter location, or give
some other input.

> 
> I would be happy, if such important values would get defined with some
> descriptive variable name at a suitable location, maybe even with some
> comment what's it all about ( even if it's not meant to be changed/tuned)
> 

everybody sees specific knobs as more or less important, so that's not a clear
cut at all. For comments I'd also suggest using git blame to get the commit
(message) that introduced a value; often there's some info encoded in there too.
Also, if you ever dug out the reasoning behind a specific timeout that had no info
or comment whatsoever, we're naturally happy to accept patches that add one.
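
For example (hypothetical file and line, just to show the workflow):

    git blame -L 123,123 PVE/QemuMigrate.pm   # commit that last touched the line
    git show <commit-id>                      # read the full commit message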

> just my 10 cents...

thanks for your input. IMO this specific one can be improved without exposing
a new tunable (see start of my reply), and in general I prefer to avoid to many
tunables, we can still add them if it shows that there's quite some demand for
a specific one, while removing would break compat..

cheers,
Thomas




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation
  2022-01-04 11:37   ` Fabian Ebner
@ 2022-01-19 10:34     ` Fabian Grünbichler
  2022-01-19 12:16       ` Fabian Ebner
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2022-01-19 10:34 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On January 4, 2022 12:37 pm, Fabian Ebner wrote:
> Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
>> in case we have no explicit fingerprint, we use openssl's regular "PEER"
>> verification. if we have a fingerprint, we ignore openssl altogether and
>> just verify the fingerprint of the presented leaf certificate.
>> 
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>> 
>> Notes:
>>      v3: switch to using hex instead of no-longer-existing digest_to_hex
>>      v2: new
>> 
>>   src/main.rs | 47 ++++++++++++++++++++++++++++++++++++++++++++---
>>   1 file changed, 44 insertions(+), 3 deletions(-)
>> 
>> diff --git a/src/main.rs b/src/main.rs
>> index 582214c..49d6ffe 100644
>> --- a/src/main.rs
>> +++ b/src/main.rs
>> @@ -134,9 +134,50 @@ impl CtrlTunnel {
>>           }
>>   
>>           let mut ssl_connector_builder = SslConnector::builder(SslMethod::tls())?;
>> -        if fingerprint.is_some() {
>> -            // FIXME actually verify fingerprint via callback!
>> -            ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::NONE);
>> +        if let Some(expected) = fingerprint {
>> +            ssl_connector_builder.set_verify_callback(
>> +                openssl::ssl::SslVerifyMode::NONE,
>> +                move |_valid, ctx| {
>> +                    let cert = match ctx.current_cert() {
>> +                        Some(cert) => cert,
>> +                        None => {
>> +                            eprintln!("SSL context lacks current certificate.");
>> +                            return false;
>> +                        }
>> +                    };
>> +
>> +                    let depth = ctx.error_depth();
>> +                    if depth != 0 {
>> +                        return true;
>> +                    }
> 
> Sorry about my ignorance. Does using SslVerifyMode::NONE imply that 
> there is an error? At depth 0? Why is it fine to return true if not?

this is a bit.. tricky (did I mention I really really dislike openssl's 
API? ;))

basically what we do in this branch (if we have a pinned fingerprint to 
check - the regular 'connect iff trusted by system' is the else branch 
below) is to set our own callback that gets called for each cert along the 
chain (starting at the top, ending with the leaf/end certificate, but 
the order is not relevant since a single failed callback fails the whole 
verification).

for each cert (== element of the chain == depth value) we get the result 
of openssl's check (`_valid`) and the X509 store context (ctx).

the context (among other things ;)) contains information about where 
(depth) in the chain we currently are:
- depth 0 == peer certificate (the one we are interested in)
- depth 1 == CA certificate (signer of peer cert, not interesting)
- depth 2 == higher CA certificate (signer of CA at 1, not interesting)
- depth X == higher CA certificate (signer of CA at X-1, not 
  interesting)

all but the peer certificate are optional (peer could give us just a 
self-signed certificate, or an incomplete chain).

that the methods here are all referring to 'error' is an OpenSSL 
peculiarity - it basically gives us a cert store with the current cert 
and error depth set to values that are valid if we fail (error) the 
verification.

for each cert/call we do the following:

- ensure there is a current cert in the context or fail verification
- continue verification with next element of the chain if we are not 
  (yet) at the peer certificate (depth != 0)
- calculate fingerprint for current (== peer) cert, or fail
- compare fingerprint with pinned/expected one, fail if not expected

since the verification fails as soon as a single callback fails, we need 
to
- return false if we fail some assumption (like ctx having a current 
  cert, or being able to calculate a cert's FP)
- return true if the current call is at a depth we are not interested in 
  verifying
- return true/false depending on result of FP check if current call is at 
a depth we are interested in
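
(for reference, stripped of the openssl plumbing, the FP check itself
boils down to something like this - rough Perl sketch, not code from the
series, assuming the peer certificate's DER bytes in $der:)

    use Digest::SHA;

    my $fp = Digest::SHA::sha256_hex($der);          # lowercase hex digest
    my $fp_string = join(':', unpack('(A2)*', $fp)); # "aa:bb:cc:..."
    die "fingerprint mismatch\n" if lc($expected) ne $fp_string;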

I'll add a comment to the depth part that it is for skipping the CA 
certs! also verify mode should technically be PEER, so I'll fix that up 
as well.

> 
>> +
>> +                    let fp = match cert.digest(openssl::hash::MessageDigest::sha256()) {
>> +                        Ok(fp) => fp,
>> +                        Err(err) => {
>> +                            // should not happen
>> +                            eprintln!("failed to calculate certificate FP - {}", err);
>> +                            return false;
>> +                        }
>> +                    };
>> +                    let fp_string = hex::encode(&fp);
>> +                    let fp_string = fp_string
>> +                        .as_bytes()
>> +                        .chunks(2)
>> +                        .map(|v| std::str::from_utf8(v).unwrap())
>> +                        .collect::<Vec<&str>>()
>> +                        .join(":");
>> +
>> +                    let expected = expected.to_lowercase();
>> +                    if expected == fp_string {
>> +                        true
>> +                    } else {
>> +                        eprintln!("certificate fingerprint does not match expected fingerprint!");
>> +                        eprintln!("expected:    {}", expected);
>> +                        eprintln!("encountered: {}", fp_string);
>> +                        false
>> +                    }
>> +                },
>> +            );
>>           } else {
>>               ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::PEER);
>>           }
> 




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation
  2022-01-19 10:34     ` Fabian Grünbichler
@ 2022-01-19 12:16       ` Fabian Ebner
  2022-01-19 12:53         ` Josef Johansson
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Ebner @ 2022-01-19 12:16 UTC (permalink / raw)
  To: Fabian Grünbichler, pve-devel

Am 19.01.22 um 11:34 schrieb Fabian Grünbichler:
> On January 4, 2022 12:37 pm, Fabian Ebner wrote:
>> Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
>>> in case we have no explicit fingerprint, we use openssl's regular "PEER"
>>> verification. if we have a fingerprint, we ignore openssl altogether and
>>> just verify the fingerprint of the presented leaf certificate.
>>>
>>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>>> ---
>>>
>>> Notes:
>>>       v3: switch to using hex instead of no-longer-existing digest_to_hex
>>>       v2: new
>>>
>>>    src/main.rs | 47 ++++++++++++++++++++++++++++++++++++++++++++---
>>>    1 file changed, 44 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/src/main.rs b/src/main.rs
>>> index 582214c..49d6ffe 100644
>>> --- a/src/main.rs
>>> +++ b/src/main.rs
>>> @@ -134,9 +134,50 @@ impl CtrlTunnel {
>>>            }
>>>    
>>>            let mut ssl_connector_builder = SslConnector::builder(SslMethod::tls())?;
>>> -        if fingerprint.is_some() {
>>> -            // FIXME actually verify fingerprint via callback!
>>> -            ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::NONE);
>>> +        if let Some(expected) = fingerprint {
>>> +            ssl_connector_builder.set_verify_callback(
>>> +                openssl::ssl::SslVerifyMode::NONE,
>>> +                move |_valid, ctx| {
>>> +                    let cert = match ctx.current_cert() {
>>> +                        Some(cert) => cert,
>>> +                        None => {
>>> +                            eprintln!("SSL context lacks current certificate.");
>>> +                            return false;
>>> +                        }
>>> +                    };
>>> +
>>> +                    let depth = ctx.error_depth();
>>> +                    if depth != 0 {
>>> +                        return true;
>>> +                    }
>>
>> Sorry about my ignorance. Does using SslVerifyMode::NONE imply that
>> there is an error? At depth 0? Why is it fine to return true if not?
> 
> this is a bit.. tricky (did I mention I really really dislike openssl's
> API? ;))
> 
> basically what we do in this branch (if we have a pinned fingerprint to
> check - the regular 'connect iff trusted by system' is the else branch
> below) is to set our own callback that gets called for each cert along the
> chain (starting at the top, ending with the leaf/end certificate, but
> the order is not relevant since a single failed callback fails the whole
> verification).
> 
> for each cert (== element of the chain == depth value) we get the result
> of openssl's check (`_valid`) and the X509 store context (ctx).
> 
> the context (among other things ;)) contains information about where
> (depth) in the chain we currently are:
> - depth 0 == peer certificate (the one we are interested in)
> - depth 1 == CA certificate (signer of peer cert, not interesting)
> - depth 2 == higher CA certificate (signer of CA at 1, not interesting)
> - depth X == higher CA certificate (signer of CA at X-1, not
>    interesting)
> 
> all but the peer certificate are optional (peer could give us just a
> self-signed certificate, or an incomplete chain).
> 
> that the methods here are all referring to 'error' is an OpenSSL
> peculiarity - it basically gives us a cert store with the current cert
> and error depth set to values that are valid if we fail (error) the
> verification.
> 
> for each cert/call we do the following:
> 
> - ensure there is a current cert in the context or fail verification
> - continue verification with next element of the chain if we are not
>    (yet) at the peer certificate (depth != 0)
> - calculate fingerprint for current (== peer) cert, or fail
> - compare fingerprint with pinned/expected one, fail if not expected
> 
> since the verification fails as soon as a single callback fails, we need
> to
> - return false if we fail some assumption (like ctx having a current
>    cert, or being able to calculate a cert's FP)
> - return true if the current call is at a depth we are not interested in
>    verifying
> - return true/false depending on result of FP check if current call is at
> a depth we are interested in
> 
> I'll add a comment to the depth part that it is for skipping the CA
> certs! also verify mode should technically be PEER, so I'll fix that up
> as well.
> 

Thanks for the thorough explanation! I think I get it now (especially 
knowing that the callback can be called multiple times helps a lot).

>>
>>> +
>>> +                    let fp = match cert.digest(openssl::hash::MessageDigest::sha256()) {
>>> +                        Ok(fp) => fp,
>>> +                        Err(err) => {
>>> +                            // should not happen
>>> +                            eprintln!("failed to calculate certificate FP - {}", err);
>>> +                            return false;
>>> +                        }
>>> +                    };
>>> +                    let fp_string = hex::encode(&fp);
>>> +                    let fp_string = fp_string
>>> +                        .as_bytes()
>>> +                        .chunks(2)
>>> +                        .map(|v| std::str::from_utf8(v).unwrap())
>>> +                        .collect::<Vec<&str>>()
>>> +                        .join(":");
>>> +
>>> +                    let expected = expected.to_lowercase();
>>> +                    if expected == fp_string {
>>> +                        true
>>> +                    } else {
>>> +                        eprintln!("certificate fingerprint does not match expected fingerprint!");
>>> +                        eprintln!("expected:    {}", expected);
>>> +                        eprintln!("encountered: {}", fp_string);
>>> +                        false
>>> +                    }
>>> +                },
>>> +            );
>>>            } else {
>>>                ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::PEER);
>>>            }
>>




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation
  2022-01-19 12:16       ` Fabian Ebner
@ 2022-01-19 12:53         ` Josef Johansson
  0 siblings, 0 replies; 41+ messages in thread
From: Josef Johansson @ 2022-01-19 12:53 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Ebner, Fabian Grünbichler

On 1/19/22 13:16, Fabian Ebner wrote:
> Am 19.01.22 um 11:34 schrieb Fabian Grünbichler:
>> On January 4, 2022 12:37 pm, Fabian Ebner wrote:
>>> Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
>>>> in case we have no explicit fingerprint, we use openssl's regular
>>>> "PEER"
>>>> verification. if we have a fingerprint, we ignore openssl
>>>> altogether and
>>>> just verify the fingerprint of the presented leaf certificate.
>>>>
>>>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>>>> ---
>>>>
>>>> Notes:
>>>>       v3: switch to using hex instead of no-longer-existing
>>>> digest_to_hex
>>>>       v2: new
>>>>
>>>>    src/main.rs | 47 ++++++++++++++++++++++++++++++++++++++++++++---
>>>>    1 file changed, 44 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/src/main.rs b/src/main.rs
>>>> index 582214c..49d6ffe 100644
>>>> --- a/src/main.rs
>>>> +++ b/src/main.rs
>>>> @@ -134,9 +134,50 @@ impl CtrlTunnel {
>>>>            }
>>>>               let mut ssl_connector_builder =
>>>> SslConnector::builder(SslMethod::tls())?;
>>>> -        if fingerprint.is_some() {
>>>> -            // FIXME actually verify fingerprint via callback!
>>>> -           
>>>> ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::NONE);
>>>> +        if let Some(expected) = fingerprint {
>>>> +            ssl_connector_builder.set_verify_callback(
>>>> +                openssl::ssl::SslVerifyMode::NONE,
>>>> +                move |_valid, ctx| {
>>>> +                    let cert = match ctx.current_cert() {
>>>> +                        Some(cert) => cert,
>>>> +                        None => {
>>>> +                            eprintln!("SSL context lacks current
>>>> certificate.");
>>>> +                            return false;
>>>> +                        }
>>>> +                    };
>>>> +
>>>> +                    let depth = ctx.error_depth();
>>>> +                    if depth != 0 {
>>>> +                        return true;
>>>> +                    }
>>>
>>> Sorry about my ignorance. Does using SslVerifyMode::NONE imply that
>>> there is an error? At depth 0? Why is it fine to return true if not?
>>
>> this is a bit.. tricky (did I mention I really really dislike openssl's
>> API? ;))
>>
>> basically what we do in this branch (if we have a pinned fingerprint to
>> check - the regular 'connect iff trusted by system' is the else branch
>> below) is to set our own callback that gets called for each cert along the
>> chain (starting at the top, ending with the leaf/end certificate, but
>> the order is not relevant since a single failed callback fails the whole
>> verification).
>>
>> for each cert (== element of the chain == depth value) we get the result
>> of openssl's check (`_valid`) and the X509 store context (ctx).
>>
>> the context (among other things ;)) contains information about where
>> (depth) in the chain we currently are:
>> - depth 0 == peer certificate (the one we are interested in)
>> - depth 1 == CA certificate (signer of peer cert, not interesting)
>> - depth 2 == higher CA certificate (signer of CA at 1, not interesting)
>> - depth X == higher CA certificate (signer of CA at X-1, not
>>    interesting)
>>
>> all but the peer certificate are optional (peer could give us just a
>> self-signed certificate, or an incomplete chain).
>>
>> that the methods here are all referring to 'error' is an OpenSSL
>> peculiarity - it basically gives us a cert store with the current cert
>> and error depth set to values that are valid if we fail (error) the
>> verification.
>>
>> for each cert/call we do the following:
>>
>> - ensure there is a current cert in the context or fail verification
>> - continue verification with next element of the chain if we are not
>>    (yet) at the peer certificate (depth != 0)
>> - calculate fingerprint for current (== peer) cert, or fail
>> - compare fingerprint with pinned/expected one, fail if not expected
>>
>> since the verification fails as soon as a single callback fails, we need
>> to
>> - return false if we fail some assumption (like ctx having a current
>>    cert, or being able to calculate a cert's FP)
>> - return true if the current call is at a depth we are not interested in
>>    verifying
>> - return true/false depending on result of FP check if current call
>> is at
>> a depth we are interested in
>>
>> I'll add a comment to the depth part that it is for skipping the CA
>> certs! also verify mode should technically be PEER, so I'll fix that up
>> as well.
>>
>
> Thanks for the thorough explanation! I think I get it now (especially
> knowing that the callback can be called multiple times helps a lot).
>
>>>
>>>> +
>>>> +                    let fp = match
>>>> cert.digest(openssl::hash::MessageDigest::sha256()) {
>>>> +                        Ok(fp) => fp,
>>>> +                        Err(err) => {
>>>> +                            // should not happen
>>>> +                            eprintln!("failed to calculate
>>>> certificate FP - {}", err);
>>>> +                            return false;
>>>> +                        }
>>>> +                    };
>>>> +                    let fp_string = hex::encode(&fp);
>>>> +                    let fp_string = fp_string
>>>> +                        .as_bytes()
>>>> +                        .chunks(2)
>>>> +                        .map(|v| std::str::from_utf8(v).unwrap())
>>>> +                        .collect::<Vec<&str>>()
>>>> +                        .join(":");
>>>> +
>>>> +                    let expected = expected.to_lowercase();
>>>> +                    if expected == fp_string {
>>>> +                        true
>>>> +                    } else {
>>>> +                        eprintln!("certificate fingerprint does
>>>> not match expected fingerprint!");
>>>> +                        eprintln!("expected:    {}", expected);
>>>> +                        eprintln!("encountered: {}", fp_string);
>>>> +                        false
>>>> +                    }
>>>> +                },
>>>> +            );
>>>>            } else {
>>>>               
>>>> ssl_connector_builder.set_verify(openssl::ssl::SslVerifyMode::PEER);
>>>>            }
>>>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

Hi,

Just as a datapoint, I tested the step client
(https://github.com/smallstep/cli), and saw that it had a pretty sweet
system for verifying remote certificates.

{
  "version": 3,
  "serial_number": "347801840814762006190637676907327486230196",
  "signature_algorithm": {
    "name": "SHA256-RSA",
    "oid": "1.2.840.113549.1.1.11"
  },
  "issuer": {
    "common_name": [
      "R3"
    ],
    "country": [
      "US"
    ],
    "organization": [
      "Let's Encrypt"
    ]
  },
  "issuer_dn": "C=US, O=Let's Encrypt, CN=R3",
  "validity": {
    "start": "2022-01-15T22:05:15Z",
    "end": "2022-04-15T22:05:14Z",
    "length": 7775999
  },
  "subject": {
    "common_name": [
      "www.proxmox.com"
    ]
  },
  "subject_dn": "CN=www.proxmox.com",
  "subject_key_info": {
    "key_algorithm": {
      "name": "RSA"
    },
    "rsa_public_key": {
      "exponent": 65537,
      "modulus": "52/d0AcsH5pbdbTETwqKzKnNNaZqGt35Qs3ciLg3h+wkwqEZhxiSp7tmD4zyyywP84dDvUB+WxDngTsoGnkxhBNibFN+htDrhRiDIwXVqW42MRHOxnfGfyILcjmEPRcw0ej2e9LfR45Kd9o7QWL7L5ano8LgkG5rxoBHeEfnjyRyQR8q28z1EyKvR0/jE5fGRtEdtUIJn3Bu7+SsUBqQXcbKVDDcQFyEbICYxBTofXi2fxf1DVGT3ZGy9X4ja0WtPU8o6pjyfzUFW45mgY4W6UwhAWUtbDC0np+K7BiZjZcQTyuszHdcN0BY9skO9kH679DLzqD275+xkU89sb2xMQ==",
      "length": 2048
    },
    "fingerprint_sha256": "e832ec8ba68141627a1d3b58fde78150553084b14623689cd0be655aafdc5a5d"
  },
  "extensions": {
    "key_usage": {
      "digital_signature": true,
      "key_encipherment": true,
      "value": 5
    },
    "basic_constraints": {
      "is_ca": false
    },
    "subject_alt_name": {
      "dns_names": [
        "proxmox.com",
        "www.proxmox.com"
      ]
    },
    "authority_key_id": "142eb317b75856cbae500940e61faf9d8b14c2c6",
    "subject_key_id": "078f859fc1821a3f838c5b86ce319b8a70e962fc",
    "extended_key_usage": {
      "client_auth": true,
      "server_auth": true
    },
    "certificate_policies": [
      {
        "id": "2.23.140.1.2.1"
      },
      {
        "id": "1.3.6.1.4.1.44947.1.1.1",
        "cps": [
          "http://cps.letsencrypt.org"
        ]
      }
    ],
    "authority_info_access": {
      "ocsp_urls": [
        "http://r3.o.lencr.org"
      ],
      "issuer_urls": [
        "http://r3.i.lencr.org/"
      ]
    },
    "signed_certificate_timestamps": [
      {
        "version": 0,
        "log_id": "QcjKsd8iRkoQxqE6CUKHXk4xixsD6+tLx2jwkGKWBvY=",
        "timestamp": 1642287915,
        "signature": "BAMASDBGAiEA6oCYHTwiAUo6xBAknN5K+IedikfwIETRFFjzVaX/d1ICIQC5QGnOlcA9J1kNNDUjwW8MyZp/kj8U3E7Vm1f8Dr6B4A=="
      },
      {
        "version": 0,
        "log_id": "KXm+8J45OSHwVnOfY6V35b5XfZxgCvj5TV0mXCVdx4Q=",
        "timestamp": 1642287915,
        "signature": "BAMASDBGAiEAtZrDq7wbgxUHzp2VKVDvuIYSJRVP72naI4oS3nS0rMYCIQDHCGH4TPVygVPE73P8jFzX++7ZhamkZbK2IFZgIVBdcg=="
      }
    ]
  },
  "signature": {
    "signature_algorithm": {
      "name": "SHA256-RSA",
      "oid": "1.2.840.113549.1.1.11"
    },
    "value": "iTb8dmJ1JvjG200/hmJ6mPmD3w7YYUR0XlmK8hu7YJTbPYxQtgiqew3ulrfx5z98fe7gbPmnR7fCtdSPqwz4HKbiGLS6kmqEiQn6eDkhbi2pIXLZTJ6qPpKkJJAUyT1b3rgIRF49PhVJZ0sbHTHeyBH41Lbkvs+mDWHBubhsCI6coN1f0+ZmZgctaK2cE9jKn8t1vRmMFW8MeM7SSX951EcRxO4LRrx/yNE9v8reMsLAeUbuQ6fboEh60+ZIK/6+PFupGYsINtp/76rn1AOcDCGeP1enzP7/T8qbFChNN5CnewSYEdzJYkn07y8H1nt+ioj60TsIHdW8JBd0tEajgg==",
    "valid": false,
    "self_signed": false
  },
  "fingerprint_md5": "9102932ce81f268eedddc6b9ff59e6fe",
  "fingerprint_sha1": "19e4425b6321236f4dfd5ee26a592438fec4bcbd",
  "fingerprint_sha256": "3b21f35ed0b6bbb5e7cd1c176038c077c86bd723c10587746ffd10dc9f87ba9b",
  "tbs_noct_fingerprint": "0024bc21edbd4539da91a0a029fd7f3ed7f327336fd90dc150ff2c1f92e45d4a",
  "spki_subject_fingerprint": "8b9da0a8a08a165b399932a91add980e63d2a59299fd1517ad8e70cbb360637a",
  "tbs_fingerprint": "be6bf119f73c655983772411fc88e0142926ab3d5bd7adc14b017fc9c7d2f1a6",
  "validation_level": "DV",
  "names": [
    "www.proxmox.com",
    "proxmox.com"
  ],
  "redacted": false
}





^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module
       [not found]     ` <<47e7d41f-e328-d9fa-25b7-f7585de8ce5b@proxmox.com>
@ 2022-01-19 14:30       ` Fabian Grünbichler
  2022-01-20  9:57         ` Fabian Ebner
  0 siblings, 1 reply; 41+ messages in thread
From: Fabian Grünbichler @ 2022-01-19 14:30 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On January 3, 2022 1:30 pm, Fabian Ebner wrote:
> A few nits inline.
> 
> Am 12/22/21 um 14:52 schrieb Fabian Grünbichler:
>> lifted from PVE::QemuMigrate, abstracting away use-case specific data.
>> 
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>>   src/Makefile      |   1 +
>>   debian/control    |   1 +
>>   src/PVE/Tunnel.pm | 356 ++++++++++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 358 insertions(+)
>>   create mode 100644 src/PVE/Tunnel.pm
>> 
>> diff --git a/src/Makefile b/src/Makefile
>> index 0298d3f..d82162c 100644
>> --- a/src/Makefile
>> +++ b/src/Makefile
>> @@ -12,6 +12,7 @@ install: PVE
>>   	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
>>   	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
>>   	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
>> +	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
>>   	install -d ${PERL5DIR}/PVE/VZDump
>>   	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
>>   	install -m 0644 PVE/VZDump/Common.pm ${PERL5DIR}/PVE/VZDump/
>> diff --git a/debian/control b/debian/control
>> index 4c246d0..73c28bd 100644
>> --- a/debian/control
>> +++ b/debian/control
>> @@ -16,6 +16,7 @@ Depends: libpve-cluster-perl,
>>            libpve-common-perl (>= 4.0-89),
>>            libpve-storage-perl (>= 7.0-14),
>>            pve-cluster,
>> +         proxmox-websocket-tunnel,
>>            ${misc:Depends},
>>            ${perl:Depends},
>>   Breaks: libpve-common-perl (<< 4.0-89),
>> diff --git a/src/PVE/Tunnel.pm b/src/PVE/Tunnel.pm
>> new file mode 100644
>> index 0000000..bbd1169
>> --- /dev/null
>> +++ b/src/PVE/Tunnel.pm
>> @@ -0,0 +1,356 @@
>> +package PVE::Tunnel;
>> +
>> +use strict;
>> +use warnings;
>> +
>> +use IO::Pipe;
>> +use IPC::Open2;
>> +use JSON qw(encode_json decode_json);
>> +use POSIX qw( WNOHANG );
>> +use Storable qw(dclone);
>> +use URI::Escape;
>> +
>> +use PVE::APIClient::LWP;
>> +use PVE::Tools;
>> +
>> +my $finish_command_pipe = sub {
>> +    my ($cmdpipe, $timeout) = @_;
>> +
>> +    my $cpid = $cmdpipe->{pid};
>> +    return if !defined($cpid);
>> +
>> +    my $writer = $cmdpipe->{writer};
>> +    my $reader = $cmdpipe->{reader};
>> +
>> +    $writer->close();
>> +    $reader->close();
>> +
>> +    my $collect_child_process = sub {
>> +	my $res = waitpid($cpid, WNOHANG);
>> +	if (defined($res) && ($res == $cpid)) {
>> +	    delete $cmdpipe->{cpid};
>> +	    return 1;
>> +	} else {
>> +	    return 0;
>> +	}
>> +     };
> 
> style nit: white-space error
> 
>> +
>> +    if ($timeout) {
>> +	for (my $i = 0; $i < $timeout; $i++) {
>> +	    return if &$collect_child_process();
>> +	    sleep(1);
>> +	}
>> +    }
>> +
>> +    $cmdpipe->{log}->("tunnel still running - terminating now with SIGTERM\n");
>> +    kill(15, $cpid);
>> +
>> +    # wait again
>> +    for (my $i = 0; $i < 10; $i++) {
>> +	return if &$collect_child_process();
>> +	sleep(1);
>> +    }
>> +
>> +    $cmdpipe->{log}->("tunnel still running - terminating now with SIGKILL\n");
>> +    kill 9, $cpid;
>> +    sleep 1;
>> +
>> +    $cmdpipe->{log}->("tunnel child process (PID $cpid) couldn't be collected\n")
>> +	if !&$collect_child_process();
>> +};
>> +
>> +sub read_tunnel {
>> +    my ($tunnel, $timeout) = @_;
>> +
>> +    $timeout = 60 if !defined($timeout);
>> +
>> +    my $reader = $tunnel->{reader};
>> +
>> +    my $output;
>> +    eval {
>> +	PVE::Tools::run_with_timeout($timeout, sub { $output = <$reader>; });
>> +    };
>> +    die "reading from tunnel failed: $@\n" if $@;
>> +
>> +    chomp $output if defined($output);
>> +
>> +    return $output;
>> +}
>> +
>> +sub write_tunnel {
>> +    my ($tunnel, $timeout, $command, $params) = @_;
>> +
>> +    $timeout = 60 if !defined($timeout);
>> +
>> +    my $writer = $tunnel->{writer};
>> +
>> +    if ($tunnel->{version} && $tunnel->{version} >= 2) {
>> +	my $object = defined($params) ? dclone($params) : {};
>> +	$object->{cmd} = $command;
>> +
>> +	$command = eval { JSON::encode_json($object) };
>> +
>> +	die "failed to encode command as JSON - $@\n"
>> +	    if $@;
>> +    }
>> +
>> +    eval {
>> +	PVE::Tools::run_with_timeout($timeout, sub {
>> +	    print $writer "$command\n";
>> +	    $writer->flush();
>> +	});
>> +    };
>> +    die "writing to tunnel failed: $@\n" if $@;
>> +
>> +    if ($tunnel->{version} && $tunnel->{version} >= 1) {
>> +	my $res = eval { read_tunnel($tunnel, $timeout); };
>> +	die "no reply to command '$command': $@\n" if $@;
>> +
>> +	if ($tunnel->{version} == 1) {
>> +	    if ($res eq 'OK') {
>> +		return;
>> +	    } else {
>> +		die "tunnel replied '$res' to command '$command'\n";
>> +	    }
>> +	} else {
>> +	    my $parsed = eval { JSON::decode_json($res) };
>> +	    die "failed to decode tunnel reply '$res' (command '$command') - $@\n"
>> +		if $@;
>> +
>> +	    if (!$parsed->{success}) {
>> +		if (defined($parsed->{msg})) {
>> +		    die "error - tunnel command '$command' failed - $parsed->{msg}\n";
>> +		} else {
>> +		    die "error - tunnel command '$command' failed\n";
>> +		}
>> +	    }
>> +
>> +	    return $parsed;
>> +	}
>> +    }
>> +}
>> +
>> +sub fork_ssh_tunnel {
>> +    my ($rem_ssh, $cmd, $ssh_forward_info, $log) = @_;
>> +
>> +    my @localtunnelinfo = ();
>> +    foreach my $addr (@$ssh_forward_info) {
>> +	push @localtunnelinfo, '-L', $addr;
>> +    }
>> +
>> +    my $full_cmd = [@$rem_ssh, '-o ExitOnForwardFailure=yes', @localtunnelinfo, @$cmd];
>> +
>> +    my $reader = IO::File->new();
>> +    my $writer = IO::File->new();
>> +
>> +    my $orig_pid = $$;
>> +
>> +    my $cpid;
>> +
>> +    eval { $cpid = open2($reader, $writer, @$full_cmd); };
>> +
>> +    my $err = $@;
>> +
>> +    # catch exec errors
>> +    if ($orig_pid != $$) {
>> +	$log->("can't fork command pipe, aborting\n");
>> +	POSIX::_exit(1);
>> +	kill('KILL', $$);
>> +    }
>> +
>> +    die $err if $err;
>> +
>> +    my $tunnel = {
>> +	writer => $writer,
>> +	reader => $reader,
>> +	pid => $cpid,
>> +	rem_ssh => $rem_ssh,
>> +	log => $log,
>> +    };
>> +
>> +    eval {
>> +	my $helo = read_tunnel($tunnel, 60);
>> +	die "no reply\n" if !$helo;
>> +	die "no quorum on target node\n" if $helo =~ m/^no quorum$/;
>> +	die "got strange reply from tunnel ('$helo')\n"
>> +	    if $helo !~ m/^tunnel online$/;
>> +    };
>> +    $err = $@;
>> +
>> +    eval {
>> +	my $ver = read_tunnel($tunnel, 10);
>> +	if ($ver =~ /^ver (\d+)$/) {
>> +	    $tunnel->{version} = $1;
>> +	    $log->('info', "ssh tunnel $ver\n");
> 
> Should pass only the message like the other calls to $log (unless the 
> function is intended to behave differently when multiple parameters are 
> used, but in qemu-server 06/10 that doesn't happen). What are the 
> reasons for not sticking to the two-parameter log interface?

no real reason other than 'let's make it more simple' (all the 'err' 
level does is prefix the lines with 'ERROR: '). but we can add it back 
in if we want?

replication uses a simple logfunc, so we need to adapt one to the other 
in any case:
- either migration only using 'info' (current)
- or replication wrapping its logfunc and throwing away the level/adding it 
  as prefix (alternative)

the alternative is less 'lossy', so maybe I'll switch to that..
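
e.g., something like this for the wrapping variant (sketch):

    my $log = sub {
        my ($level, $msg) = @_;
        $msg = "ERROR: $msg" if $level eq 'err';
        $logfunc->($msg);
    };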

> 
>> +	} else {
>> +	    $err = "received invalid tunnel version string '$ver'\n" if !$err;
>> +	}
>> +    };
>> +
>> +    if ($err) {
>> +	$finish_command_pipe->($tunnel);
>> +	die "can't open tunnel - $err";
>> +    }
>> +    return $tunnel;
>> +}
>> +
>> +sub forward_unix_socket {
>> +    my ($tunnel, $local, $remote) = @_;
>> +
>> +    my $params = dclone($tunnel->{params});
>> +    $params->{unix} = $local;
>> +    $params->{url} = $params->{url} ."socket=".uri_escape($remote)."&";
>> +    $params->{ticket} = { path => $remote };
>> +
>> +    my $cmd = encode_json({
>> +	control => JSON::true,
>> +	cmd => 'forward',
>> +	data => $params,
>> +    });
>> +
>> +    my $writer = $tunnel->{writer};
>> +    $tunnel->{forwarded}->{$local} = $remote;
>> +    eval {
>> +	unlink $local;
>> +	PVE::Tools::run_with_timeout(15, sub {
>> +	    print $writer "$cmd\n";
>> +	    $writer->flush();
>> +	});
>> +    };
>> +    die "failed to write forwarding command - $@\n" if $@;
>> +
>> +    read_tunnel($tunnel);
>> +}
>> +
>> +sub fork_websocket_tunnel {
>> +    my ($conn, $url, $req_params, $tunnel_params, $log) = @_;
>> +
>> +    if (my $apitoken = $conn->{apitoken}) {
>> +	$tunnel_params->{headers} = [["Authorization", "$apitoken"]];
>> +    } else {
>> +	die "can't connect to remote host without credentials\n";
>> +    }
>> +
>> +    if (my $fps = $conn->{cached_fingerprints}) {
>> +	$tunnel_params->{fingerprint} = (keys %$fps)[0];
>> +    }
>> +
>> +    my $api_client = PVE::APIClient::LWP->new(%$conn);
>> +
>> +    my $res = $api_client->post(
>> +	$url,
>> +	$req_params,
>> +    );
>> +
>> +    $log->("remote: started migration tunnel worker '$res->{upid}'");
> 
> Nit: still mentions migration, as is done...

thanks, removed

> 
>> +
>> +    my $websocket_url = $tunnel_params->{url};
>> +
>> +    $tunnel_params->{url} .= "?ticket=".uri_escape($res->{ticket});
>> +    $tunnel_params->{url} .= "&socket=".uri_escape($res->{socket});
>> +
>> +    my $reader = IO::Pipe->new();
>> +    my $writer = IO::Pipe->new();
>> +
>> +    my $cpid = fork();
>> +    if ($cpid) {
>> +	$writer->writer();
>> +	$reader->reader();
>> +	my $tunnel = { writer => $writer, reader => $reader, pid => $cpid };
>> +
>> +	eval {
>> +	    my $writer = $tunnel->{writer};
>> +	    my $cmd = encode_json({
>> +		control => JSON::true,
>> +		cmd => 'connect',
>> +		data => $tunnel_params,
>> +	    });
>> +
>> +	    eval {
>> +		PVE::Tools::run_with_timeout(15, sub {
>> +		    print {$writer} "$cmd\n";
>> +		    $writer->flush();
>> +		});
>> +	    };
>> +	    die "failed to write tunnel connect command - $@\n" if $@;
>> +	};
>> +	die "failed to connect via WS: $@\n" if $@;
>> +
>> +	my $err;
>> +        eval {
>> +	    my $writer = $tunnel->{writer};
>> +	    my $cmd = encode_json({
>> +		cmd => 'version',
>> +	    });
>> +
>> +	    eval {
>> +		PVE::Tools::run_with_timeout(15, sub {
>> +		    print {$writer} "$cmd\n";
>> +		    $writer->flush();
>> +		});
>> +	    };
>> +	    $err = "failed to write tunnel version command - $@\n" if $@;
>> +	    my $res = read_tunnel($tunnel, 10);
>> +	    $res = JSON::decode_json($res);
>> +	    my $version = $res->{api};
>> +
>> +	    if ($version =~ /^(\d+)$/) {
>> +		$tunnel->{version} = $1;
>> +		$tunnel->{age} = $res->{age};
>> +	    } else {
>> +		$err = "received invalid tunnel version string '$version'\n" if !$err;
>> +	    }
>> +	};
>> +	$err = $@ if !$err;
>> +
>> +	if ($err) {
>> +	    $finish_command_pipe->($tunnel);
> 
> Nit: Here, $tunnel->{log} is still undef, but finish_command_pipe might 
> try to use it.

ack, since we already have $log anyway we can just set it from the 
start, like in fork_ssh_tunnel.
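
i.e. just (sketch):

    my $tunnel = { writer => $writer, reader => $reader, pid => $cpid, log => $log };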

> 
>> +	    die "can't open migration tunnel - $err";
> 
> ...here.

removed as well.

> 
>> +	}
>> +
>> +	$tunnel_params->{url} = "$websocket_url?"; # reset ticket and socket
>> +
>> +	$tunnel->{params} = $tunnel_params; # for forwarding
>> +	$tunnel->{log} = $log;
>> +
>> +	return $tunnel;
>> +    } else {
>> +	eval {
>> +	    $writer->reader();
>> +	    $reader->writer();
>> +	    PVE::Tools::run_command(
>> +		['proxmox-websocket-tunnel'],
>> +		input => "<&".fileno($writer),
>> +		output => ">&".fileno($reader),
>> +		errfunc => sub { my $line = shift; print "tunnel: $line\n"; },
>> +	    );
>> +	};
>> +	warn "CMD websocket tunnel died: $@\n" if $@;
>> +	exit 0;
>> +    }
>> +}
>> +
>> +sub finish_tunnel {
>> +    my ($tunnel, $cleanup) = @_;
>> +
>> +    $cleanup = $cleanup ? 1 : 0;
>> +
>> +    eval { write_tunnel($tunnel, 30, 'quit', { cleanup => $cleanup }); };
>> +    my $err = $@;
>> +
>> +    $finish_command_pipe->($tunnel, 30);
>> +
>> +    if (my $unix_sockets = $tunnel->{unix_sockets}) {
>> +	# ssh does not clean up on local host
>> +	my $cmd = ['rm', '-f', @$unix_sockets];
>> +	PVE::Tools::run_command($cmd);
>> +
>> +	# .. and just to be sure check on remote side
>> +	if ($tunnel->{rem_ssh}) {
>> +	    unshift @{$cmd}, @{$tunnel->{rem_ssh}};
>> +	    PVE::Tools::run_command($cmd);
>> +	}
>> +    }
>> +
>> +    die $err if $err;
>> +}
> 




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module
       [not found]     ` <<af15fed1-2d06-540e-cde8-ed1ce772aeb4@proxmox.com>
@ 2022-01-19 14:31       ` Fabian Grünbichler
  0 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2022-01-19 14:31 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On January 3, 2022 3:30 pm, Fabian Ebner wrote:
> Am 12/22/21 um 14:52 schrieb Fabian Grünbichler:
>> encapsulating storage-related tunnel methods, currently
>> - source-side storage-migrate helper
>> - target-side disk-import handler
>> - target-side query-disk-import handler
>> 
>> to be extended further with replication-related handlers and helpers.
>> 
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>> 
>> Notes:
> 
> Needs bump for pve-storage (for storage_migration_snapshot, 
> volume_import_start, etc.)
> 
>>      new in v3, includes code previously in qemu-server
>> 
>>   src/Makefile             |   1 +
>>   src/PVE/StorageTunnel.pm | 231 +++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 232 insertions(+)
>>   create mode 100644 src/PVE/StorageTunnel.pm
>> 
>> diff --git a/src/Makefile b/src/Makefile
>> index d82162c..baa2688 100644
>> --- a/src/Makefile
>> +++ b/src/Makefile
>> @@ -12,6 +12,7 @@ install: PVE
>>   	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
>>   	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
>>   	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
>> +	install -m 0644 PVE/StorageTunnel.pm ${PERL5DIR}/PVE/
>>   	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
>>   	install -d ${PERL5DIR}/PVE/VZDump
>>   	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
>> diff --git a/src/PVE/StorageTunnel.pm b/src/PVE/StorageTunnel.pm
>> new file mode 100644
>> index 0000000..06902ef
>> --- /dev/null
>> +++ b/src/PVE/StorageTunnel.pm
>> @@ -0,0 +1,231 @@
>> +package PVE::StorageTunnel;
>> +
>> +use strict;
>> +use warnings;
>> +
>> +use IO::Socket::UNIX;
>> +use POSIX qw(WNOHANG);
>> +use Socket qw(SOCK_STREAM);
>> +
>> +use PVE::Tools;
>> +use PVE::Tunnel;
>> +use PVE::Storage;
> 
> Nit: not ordered alphabetically
> 
>> +
>> +sub storage_migrate {
>> +    my ($tunnel, $storecfg, $volid, $local_vmid, $remote_vmid, $opts, $log) = @_;
>> +
>> +    my $bwlimit = $opts->{bwlimit};
>> +    my $targetsid = $opts->{targetsid};
>> +
>> +    # use 'migrate' limit for transfer to other node
>> +    my $bwlimit_opts = {
>> +	storage => $targetsid,
>> +	bwlimit => $bwlimit,
>> +    };
>> +    my $remote_bwlimit = PVE::Tunnel::write_tunnel($tunnel, 10, 'bwlimit', $bwlimit_opts);
> 
> Should this be done on the call side instead? For re-using the helper 
> with replication, the 'bwlimit' command shouldn't happen, as there, the 
> only relevant limit is the one from the job configuration (even the 
> 'default' storage limit isn't used AFAICT).

actually I am not so sure for remote replication - there it might make 
sense to have bwlimit configurable on either side (although of course, 
since the limiting happens on the client side the remote still needs to 
trust it to actually honor the limit ;))

> 
>> +    $remote_bwlimit = $remote_bwlimit->{bwlimit};
>> +    if (defined($remote_bwlimit)) {
>> +	$bwlimit = $remote_bwlimit if !defined($bwlimit);
>> +	$bwlimit = $remote_bwlimit if $remote_bwlimit < $bwlimit;
> 
> Wrong result if $remote_bwlimit = 0, $bwlimit > 0, but maybe that 
> situation is not even possible currently. That aside, isn't the content 
> of the if block superfluous, because the 'bwlimit' 
> handler/get_bandwith_limit already determines the minimum from the 
> passed-along parameter and the storage limit?

hmm, no, but on a closer look this is actually wrong, as 
get_bandwidth_limit keeps the explicit limit in some situations (based 
on privilege checks, the explicit limit can take precedence over the 
configured ones even if it is higher).

I'll take a closer look for the next iteration - maybe we need to split 
the option here into a source-side bwlimit without override and an 
explicit override bwlimit, pass neither to the remote side, and then 
combine source, remote and override here similar to what 
get_bandwidth_limit does (or by extending it) :-/
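
very rough sketch of that combination step (helper name and exact
semantics are assumptions at this point - 0/undef is treated as
'unlimited' here, which would also cover the edge case above):

    use List::Util qw(min);

    sub combine_bwlimits {
        my ($override, $source, $remote) = @_;

        # a (privileged) explicit override trumps the configured limits
        return $override if defined($override);

        # otherwise the strictest configured limit wins
        my @limits = grep { defined($_) && $_ > 0 } ($source, $remote);
        return @limits ? min(@limits) : undef;
    }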

> 
>> +    }
>> +
>> +    # JSONSchema and get_bandwidth_limit use kbps - storage_migrate bps
>> +    $bwlimit = $bwlimit * 1024 if defined($bwlimit);
>> +
>> +    # adapt volume name for import call
>> +    my ($sid, undef) = PVE::Storage::parse_volume_id($volid);
>> +    my (undef, $name, undef, undef, undef, undef, $format) = PVE::Storage::parse_volname($storecfg, $volid);
>> +    my $scfg = PVE::Storage::storage_config($storecfg, $sid);
>> +    PVE::Storage::activate_volumes($storecfg, [$volid]);
>> +    if ($local_vmid != $remote_vmid) {
> 
> There can be guests where the owner's VM ID and the disk's VM ID don't 
> match (e.g. manual reassign via 'qm set'), but I'm not sure it's worth 
> bothering about.

true, we could detect this here and either abort or warn and override 
$local_vmid..
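
e.g. (untested sketch - parse_volname already returns the owning vmid,
so we could check right where the volname is parsed):

    my (undef, undef, $owner) = PVE::Storage::parse_volname($storecfg, $volid);
    if (defined($owner) && $owner != $local_vmid) {
        warn "volume '$volid' owned by VM $owner, not $local_vmid - using owner for rename\n";
        $local_vmid = $owner; # or die, if aborting is the safer choice
    }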

> 
>> +	$name =~ s/-$local_vmid-/-$remote_vmid-/g;
>> +	$name =~ s/^$local_vmid\///; # re-added on target if dir-based storage
>> +    }
>> +
>> +    my $with_snapshots = $opts->{snapshots} ? 1 : 0;
>> +    my $snapshot;
>> +    my $migration_snapshot = PVE::Storage::storage_migrate_snapshot($storecfg, $sid);
>> +    if ($migration_snapshot) {
>> +	$snapshot = '__migration__';
>> +	$with_snapshots = 1;
>> +    }
>> +
>> +    my @export_formats = PVE::Storage::volume_export_formats($storecfg, $volid, undef, undef, $with_snapshots);
>> +
>> +    my $disk_import_opts = {
>> +	format => $format,
>> +	storage => $targetsid,
>> +	snapshot => $snapshot,
>> +	migration_snapshot => $migration_snapshot,
>> +	'with-snapshots' => $with_snapshots,
>> +	'allow-rename' => !$opts->{is_vmstate},
>> +	'export-formats' => join(",", @export_formats),
>> +	volname => $name,
>> +    };
>> +    my $res = PVE::Tunnel::write_tunnel($tunnel, 600, 'disk-import', $disk_import_opts);
>> +    my $local = "/run/qemu-server/$local_vmid.storage";
>> +    if (!$tunnel->{forwarded}->{$local}) {
>> +	PVE::Tunnel::forward_unix_socket($tunnel, $local, $res->{socket});
>> +    }
>> +    my $socket = IO::Socket::UNIX->new(Peer => $local, Type => SOCK_STREAM())
>> +	or die "failed to connect to websocket tunnel at $local\n";
>> +    # we won't be reading from the socket
>> +    shutdown($socket, 0);
>> +
>> +    my $disk_export_opts = {
>> +	snapshot => $snapshot,
>> +	migration_snapshot => $migration_snapshot,
>> +	'with-snapshots' => $with_snapshots,
>> +	ratelimit_bps => $bwlimit,
>> +	cmd => {
>> +	    output => '>&'.fileno($socket),
>> +	},
>> +    };
>> +
>> +    eval {
>> +	PVE::Storage::volume_export_start(
>> +	    $storecfg,
>> +	    $volid,
>> +	    $res->{format},
>> +	    $disk_export_opts,
>> +	);
>> +    };
>> +    my $send_error = $@;
>> +    warn "$send_error\n" if $send_error;
>> +
>> +    # don't close the connection entirely otherwise the
>> +    # receiving end might not get all buffered data (and
>> +    # fails with 'connection reset by peer')
>> +    shutdown($socket, 1);
>> +
>> +    # wait for the remote process to finish
>> +    my $new_volid;
>> +    while ($res = PVE::Tunnel::write_tunnel($tunnel, 10, 'query-disk-import')) {
>> +	if ($res->{status} eq 'pending') {
>> +	    if (my $msg = $res->{msg}) {
>> +		$log->("disk-import: $msg\n");
>> +	    } else {
>> +		$log->("waiting for disk import to finish..\n");
>> +	    }
>> +	    sleep(1)
>> +	} elsif ($res->{status} eq 'complete') {
>> +	    $new_volid = $res->{volid};
>> +	    last;
>> +	} else {
>> +	    warn "unknown query-disk-import result: $res->{status}\n";
>> +	    last;
>> +	}
>> +    }
>> +
>> +    # now close the socket
>> +    close($socket);
>> +    if ($snapshot) {
>> +	eval { PVE::Storage::volume_snapshot_delete($storecfg, $volid, $snapshot, 0) };
>> +	warn "could not remove source snapshot: $@\n" if $@;
>> +    }
>> +    die $send_error if $send_error;
>> +    die "disk import failed - see log above\n" if !$new_volid;
>> +
>> +    return $new_volid;
>> +}
>> +
>> +our $cmd_schema = {
> 
> Maybe 'bwlimit' should also live here (and for future re-usability 
> become agnostic towards the operation for the limit, rather than 
> hard-coding 'migration')?

yeah, we could do that.
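
e.g. something along these lines (just a sketch of the shape, names
and defaults are up for debate):

    'bwlimit' => {
        storages => {
            type => 'string',
            format => 'pve-storage-id-list',
            description => 'Storage(s) the limit applies to',
        },
        bwlimit => {
            type => 'integer',
            minimum => 0,
            optional => 1,
            description => 'Requested limit in KiB/s',
        },
        operation => {
            type => 'string',
            description => 'Operation to read configured limits for',
            optional => 1,
            default => 'migration',
        },
    },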

> 
> And maybe we should add
>      'query-disk-import' => {},
> for completeness?

doesn't hurt either.

> 
>> +    'disk-import' => {
>> +	volname => {
>> +	    type => 'string',
>> +	    description => 'volume name to use as preferred target volume name',
>> +	},
>> +	format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
>> +	'export-formats' => {
>> +	    type => 'string',
>> +	    description => 'list of supported export formats',
>> +	},
>> +	storage => {
>> +	    type => 'string',
>> +	    format => 'pve-storage-id',
>> +	},
>> +	'with-snapshots' => {
>> +	    description =>
>> +	        "Whether the stream includes intermediate snapshots",
>> +	    type => 'boolean',
>> +	    optional => 1,
>> +	    default => 0,
>> +	},
>> +	'allow-rename' => {
>> +	    description => "Choose a new volume ID if the requested " .
>> +		"volume ID already exists, instead of throwing an error.",
>> +	    type => 'boolean',
>> +	    optional => 1,
>> +	    default => 0,
>> +	},
>> +    },
>> +};
>> +
>> +sub handle_disk_import {
>> +    my ($state, $params) = @_;
>> +
>> +    die "disk import already running as PID '$state->{disk_import}->{pid}'\n"
>> +	if $state->{disk_import}->{pid};
>> +
>> +    my $storage = delete $params->{storage};
>> +    my $format = delete $params->{format};
>> +    my $volname = delete $params->{volname};
>> +
>> +    my $import = PVE::Storage::volume_import_start($state->{storecfg}, $storage, $volname, $format, $state->{vmid}, $params);
>> +
>> +    my $socket = $import->{socket};
>> +    $format = delete $import->{format};
>> +
>> +    $state->{sockets}->{$socket} = 1;
>> +    $state->{disk_import} = $import;
>> +
>> +    chown $state->{socket_uid}, -1, $socket;
>> +
>> +    return {
>> +	socket => $socket,
>> +	format => $format,
>> +    };
>> +}
>> +
>> +sub handle_query_disk_import {
>> +    my ($state, $params) = @_;
>> +
>> +    die "no disk import running\n"
>> +	if !$state->{disk_import}->{pid};
>> +
>> +    my $pattern = PVE::Storage::volume_imported_message(undef, 1);
>> +    my $result;
>> +    eval {
>> +	my $fh = $state->{disk_import}->{fh};
>> +	PVE::Tools::run_with_timeout(5, sub { $result = <$fh>; });
>> +	print "disk-import: $result\n" if $result;
>> +    };
> 
> What if the disk import finishes after here...
> 
>> +    if ($result && $result =~ $pattern) {
>> +	my $volid = $1;
>> +	waitpid($state->{disk_import}->{pid}, 0);
>> +
>> +	my $unix = $state->{disk_import}->{socket};
>> +	unlink $unix;
>> +	delete $state->{sockets}->{$unix};
>> +	delete $state->{disk_import};
>> +	$state->{cleanup}->{volumes}->{$volid} = 1;
>> +	return {
>> +	    status => "complete",
>> +	    volid => $volid,
>> +	};
>> +    } elsif (!$result && waitpid($state->{disk_import}->{pid}, WNOHANG)) {
> 
> ...but before the call to waitpid? Or is it guaranteed to keep running 
> until we read from the file handle?

good catch!

no it's not - there is a (small) race window here (`/tmp/sleep` sleeps 
20s and then prints a line):

use strict;
use warnings;

use IPC::Open3;
use POSIX qw(WNOHANG);

use PVE::Tools;

my $input = IO::File->new();
my $info = IO::File->new();
my $cpid = open3($input, $info, $info, "/tmp/sleep");

my $result;

while (!$result) {
    eval { PVE::Tools::run_with_timeout(5, sub { $result = <$info>; }) };
    if ($result) {
	print "res: $result\n";
	waitpid($cpid, 0);
	print "child exited\n";
    } else {
	sleep 30; # trigger race
	if (waitpid($cpid, WNOHANG)) {
	    print "child exited without result\n";
	    last;
	} else {
	    print "still pending\n";
	}
    }
}

prints:

child exited without result

with the sleep between the attempted read and the waitpid in the else branch removed:

still pending
still pending
still pending
still pending
res: OK

child exited

I guess we can just attempt to read from $fh after the waitpid to get 
all the remaining output, and then either handle it as an error (output, 
but nothing matching $pattern) or as okay (got a line matching 
$pattern). or maybe I'll find some other, more elegant solution that 
doesn't involve doing sysread/select and manual buffering..
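
roughly like this (untested sketch) - once the child has exited, the
write end of the pipe is closed, so draining the handle can't block:

    my $drain_child_output = sub {
        my ($fh, $pattern) = @_;
        my $volid;
        while (defined(my $line = <$fh>)) {
            print "disk-import: $line";
            $volid = $1 if $line =~ $pattern;
        }
        return $volid; # set if the import did complete after all
    };

and then the WNOHANG branch can decide between 'complete' and 'error'
based on whether a matching line was still sitting in the buffer.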

> 
>> +	my $unix = $state->{disk_import}->{socket};
>> +	unlink $unix;
>> +	delete $state->{sockets}->{$unix};
>> +	delete $state->{disk_import};
>> +
>> +	return {
>> +	    status => "error",
>> +	};
>> +    } else {
>> +	return {
>> +	    status => "pending",
>> +	    msg => $result,
>> +	};
>> +    }
>> +}
> 




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling
       [not found]         ` <<554040de-09d6-974b-143a-80c2d66b9573@proxmox.com>
@ 2022-01-19 14:32           ` Fabian Grünbichler
  0 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2022-01-19 14:32 UTC (permalink / raw)
  To: Roland, Fabian Ebner, Proxmox VE development discussion,
	Thomas Lamprecht

On January 11, 2022 9:19 am, Thomas Lamprecht wrote:
> On 04.01.22 17:44, Roland wrote:
>>>>   +sub phase2_start_remote_cluster {
>>>> +    my ($self, $vmid, $params) = @_;
>>>> +
>>>> +    die "insecure migration to remote cluster not implemented\n"
>>>> +    if $params->{migrate_opts}->{type} ne 'websocket';
>>>> +
>>>> +    my $remote_vmid = $self->{opts}->{remote}->{vmid};
>>>> +
>>>> +    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 10,
>>>> "start", $params);
>>>
>>> 10 seconds feels a bit short to me.
> 
> @Fabian(s):
> 
> Why not use the same as vm_start_nolock + some buffer? E.g.,
> 
> ($params->{timeout} // config_aware_timeout($conf, $resume)) + 10;
> 
> That should be quite generous, as the worst case time for the start for
> incoming migration, which is always paused so not doing much, is normally
> way lower than a cold-start.
> 
> I mean, we're in a worker task and do not really care much if setup needs
> a bit longer.

yeah, that sounds like a good idea :) it also gives us future updates / 
special-case handling for setups that take longer 'for free', including 
things like a possible manual override via the guest config.
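
e.g. (sketch, re-using Thomas' suggested expression - assumes $conf
and $resume are available at the call site):

    my $start_timeout = ($params->{timeout} // config_aware_timeout($conf, $resume)) + 10;
    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $start_timeout, "start", $params);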




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision
  2022-01-05  9:00   ` Fabian Ebner
@ 2022-01-19 14:38     ` Fabian Grünbichler
  0 siblings, 0 replies; 41+ messages in thread
From: Fabian Grünbichler @ 2022-01-19 14:38 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On January 5, 2022 10:00 am, Fabian Ebner wrote:
> Am 22.12.21 um 14:52 schrieb Fabian Grünbichler:
>> into new top-level helper for re-use with remote migration.
>> 
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>> new in v3
>> 
>>   PVE/Storage.pm | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>> 
>> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
>> index 05be3dd..4f3a44a 100755
>> --- a/PVE/Storage.pm
>> +++ b/PVE/Storage.pm
>> @@ -643,6 +643,14 @@ my $volname_for_storage = sub {
>>       }
>>   };
>>   
>> +# whether a migration snapshot is needed for a given storage
>> +sub storage_migrate_snapshot {
>> +    my ($cfg, $storeid) = @_;
>> +    my $scfg = storage_config($cfg, $storeid);
>> +
>> +    return $scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs';
>> +}
>> +
>>   sub storage_migrate {
>>       my ($cfg, $volid, $target_sshinfo, $target_storeid, $opts, $logfunc) = @_;
>>   
>> @@ -688,10 +696,8 @@ sub storage_migrate {
>>   
>>       my $migration_snapshot;
>>       if (!defined($snapshot)) {
>> -	if ($scfg->{type} eq 'zfspool' || $scfg->{type} eq 'btrfs') {
>> -	    $migration_snapshot = 1;
>> -	    $snapshot = '__migration__';
>> -	}
>> +	$migration_snapshot = storage_migrate_snapshot->($cfg, $target_storeid);
> 
> The call is wrong, because of the ->

that's a fixup gone wrong (that chunk got cleaned up in a later patch 
instead of being folded into this one)

> And why base the decision on the target storage rather than keeping it 
> based on the source storage? That breaks migration from e.g. lvm-thin to 
> btrfs, because it now tries to use a snapshot and doesn't find any 
> common transfer format anymore.

and yeah, this should be $storeid, not $target_storeid (ZFS only allows 
importing from ZFS, so it was not affected).
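
so the hunk should presumably end up as:

    $migration_snapshot = storage_migrate_snapshot($cfg, $storeid);
    $snapshot = '__migration__' if $migration_snapshot;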

> 
>> +	$snapshot = '__migration__' if $migration_snapshot;
>>       }
>>   
>>       my @formats = volume_transfer_formats($cfg, $volid, $target_volid, $snapshot, $base_snapshot, $with_snapshots);
> 




^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module
  2022-01-19 14:30       ` Fabian Grünbichler
@ 2022-01-20  9:57         ` Fabian Ebner
  0 siblings, 0 replies; 41+ messages in thread
From: Fabian Ebner @ 2022-01-20  9:57 UTC (permalink / raw)
  To: Fabian Grünbichler, pve-devel

Am 19.01.22 um 15:30 schrieb Fabian Grünbichler:
> On January 3, 2022 1:30 pm, Fabian Ebner wrote:
>> A few nits inline.
>>
>> Am 12/22/21 um 14:52 schrieb Fabian Grünbichler:
>>> lifted from PVE::QemuMigrate, abstracting away use-case specific data.
>>>
>>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>>> ---
>>>    src/Makefile      |   1 +
>>>    debian/control    |   1 +
>>>    src/PVE/Tunnel.pm | 356 ++++++++++++++++++++++++++++++++++++++++++++++
>>>    3 files changed, 358 insertions(+)
>>>    create mode 100644 src/PVE/Tunnel.pm
>>>
>>> diff --git a/src/Makefile b/src/Makefile
>>> index 0298d3f..d82162c 100644
>>> --- a/src/Makefile
>>> +++ b/src/Makefile
>>> @@ -12,6 +12,7 @@ install: PVE
>>>    	install -m 0644 PVE/ReplicationConfig.pm ${PERL5DIR}/PVE/
>>>    	install -m 0644 PVE/ReplicationState.pm ${PERL5DIR}/PVE/
>>>    	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
>>> +	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
>>>    	install -d ${PERL5DIR}/PVE/VZDump
>>>    	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
>>>    	install -m 0644 PVE/VZDump/Common.pm ${PERL5DIR}/PVE/VZDump/
>>> diff --git a/debian/control b/debian/control
>>> index 4c246d0..73c28bd 100644
>>> --- a/debian/control
>>> +++ b/debian/control
>>> @@ -16,6 +16,7 @@ Depends: libpve-cluster-perl,
>>>             libpve-common-perl (>= 4.0-89),
>>>             libpve-storage-perl (>= 7.0-14),
>>>             pve-cluster,
>>> +         proxmox-websocket-tunnel,
>>>             ${misc:Depends},
>>>             ${perl:Depends},
>>>    Breaks: libpve-common-perl (<< 4.0-89),
>>> diff --git a/src/PVE/Tunnel.pm b/src/PVE/Tunnel.pm
>>> new file mode 100644
>>> index 0000000..bbd1169
>>> --- /dev/null
>>> +++ b/src/PVE/Tunnel.pm
>>> @@ -0,0 +1,356 @@
>>> +package PVE::Tunnel;
>>> +
>>> +use strict;
>>> +use warnings;
>>> +
>>> +use IO::Pipe;
>>> +use IPC::Open2;
>>> +use JSON qw(encode_json decode_json);
>>> +use POSIX qw( WNOHANG );
>>> +use Storable qw(dclone);
>>> +use URI::Escape;
>>> +
>>> +use PVE::APIClient::LWP;
>>> +use PVE::Tools;
>>> +
>>> +my $finish_command_pipe = sub {
>>> +    my ($cmdpipe, $timeout) = @_;
>>> +
>>> +    my $cpid = $cmdpipe->{pid};
>>> +    return if !defined($cpid);
>>> +
>>> +    my $writer = $cmdpipe->{writer};
>>> +    my $reader = $cmdpipe->{reader};
>>> +
>>> +    $writer->close();
>>> +    $reader->close();
>>> +
>>> +    my $collect_child_process = sub {
>>> +	my $res = waitpid($cpid, WNOHANG);
>>> +	if (defined($res) && ($res == $cpid)) {
>>> +	    delete $cmdpipe->{cpid};
>>> +	    return 1;
>>> +	} else {
>>> +	    return 0;
>>> +	}
>>> +     };
>>
>> style nit: white-space error
>>
>>> +
>>> +    if ($timeout) {
>>> +	for (my $i = 0; $i < $timeout; $i++) {
>>> +	    return if &$collect_child_process();
>>> +	    sleep(1);
>>> +	}
>>> +    }
>>> +
>>> +    $cmdpipe->{log}->("tunnel still running - terminating now with SIGTERM\n");
>>> +    kill(15, $cpid);
>>> +
>>> +    # wait again
>>> +    for (my $i = 0; $i < 10; $i++) {
>>> +	return if &$collect_child_process();
>>> +	sleep(1);
>>> +    }
>>> +
>>> +    $cmdpipe->{log}->("tunnel still running - terminating now with SIGKILL\n");
>>> +    kill 9, $cpid;
>>> +    sleep 1;
>>> +
>>> +    $cmdpipe->{log}->("tunnel child process (PID $cpid) couldn't be collected\n")
>>> +	if !&$collect_child_process();
>>> +};
>>> +
>>> +sub read_tunnel {
>>> +    my ($tunnel, $timeout) = @_;
>>> +
>>> +    $timeout = 60 if !defined($timeout);
>>> +
>>> +    my $reader = $tunnel->{reader};
>>> +
>>> +    my $output;
>>> +    eval {
>>> +	PVE::Tools::run_with_timeout($timeout, sub { $output = <$reader>; });
>>> +    };
>>> +    die "reading from tunnel failed: $@\n" if $@;
>>> +
>>> +    chomp $output if defined($output);
>>> +
>>> +    return $output;
>>> +}
>>> +
>>> +sub write_tunnel {
>>> +    my ($tunnel, $timeout, $command, $params) = @_;
>>> +
>>> +    $timeout = 60 if !defined($timeout);
>>> +
>>> +    my $writer = $tunnel->{writer};
>>> +
>>> +    if ($tunnel->{version} && $tunnel->{version} >= 2) {
>>> +	my $object = defined($params) ? dclone($params) : {};
>>> +	$object->{cmd} = $command;
>>> +
>>> +	$command = eval { JSON::encode_json($object) };
>>> +
>>> +	die "failed to encode command as JSON - $@\n"
>>> +	    if $@;
>>> +    }
>>> +
>>> +    eval {
>>> +	PVE::Tools::run_with_timeout($timeout, sub {
>>> +	    print $writer "$command\n";
>>> +	    $writer->flush();
>>> +	});
>>> +    };
>>> +    die "writing to tunnel failed: $@\n" if $@;
>>> +
>>> +    if ($tunnel->{version} && $tunnel->{version} >= 1) {
>>> +	my $res = eval { read_tunnel($tunnel, $timeout); };
>>> +	die "no reply to command '$command': $@\n" if $@;
>>> +
>>> +	if ($tunnel->{version} == 1) {
>>> +	    if ($res eq 'OK') {
>>> +		return;
>>> +	    } else {
>>> +		die "tunnel replied '$res' to command '$command'\n";
>>> +	    }
>>> +	} else {
>>> +	    my $parsed = eval { JSON::decode_json($res) };
>>> +	    die "failed to decode tunnel reply '$res' (command '$command') - $@\n"
>>> +		if $@;
>>> +
>>> +	    if (!$parsed->{success}) {
>>> +		if (defined($parsed->{msg})) {
>>> +		    die "error - tunnel command '$command' failed - $parsed->{msg}\n";
>>> +		} else {
>>> +		    die "error - tunnel command '$command' failed\n";
>>> +		}
>>> +	    }
>>> +
>>> +	    return $parsed;
>>> +	}
>>> +    }
>>> +}
>>> +
>>> +sub fork_ssh_tunnel {
>>> +    my ($rem_ssh, $cmd, $ssh_forward_info, $log) = @_;
>>> +
>>> +    my @localtunnelinfo = ();
>>> +    foreach my $addr (@$ssh_forward_info) {
>>> +	push @localtunnelinfo, '-L', $addr;
>>> +    }
>>> +
>>> +    my $full_cmd = [@$rem_ssh, '-o ExitOnForwardFailure=yes', @localtunnelinfo, @$cmd];
>>> +
>>> +    my $reader = IO::File->new();
>>> +    my $writer = IO::File->new();
>>> +
>>> +    my $orig_pid = $$;
>>> +
>>> +    my $cpid;
>>> +
>>> +    eval { $cpid = open2($reader, $writer, @$full_cmd); };
>>> +
>>> +    my $err = $@;
>>> +
>>> +    # catch exec errors
>>> +    if ($orig_pid != $$) {
>>> +	$log->("can't fork command pipe, aborting\n");
>>> +	POSIX::_exit(1);
>>> +	kill('KILL', $$);
>>> +    }
>>> +
>>> +    die $err if $err;
>>> +
>>> +    my $tunnel = {
>>> +	writer => $writer,
>>> +	reader => $reader,
>>> +	pid => $cpid,
>>> +	rem_ssh => $rem_ssh,
>>> +	log => $log,
>>> +    };
>>> +
>>> +    eval {
>>> +	my $helo = read_tunnel($tunnel, 60);
>>> +	die "no reply\n" if !$helo;
>>> +	die "no quorum on target node\n" if $helo =~ m/^no quorum$/;
>>> +	die "got strange reply from tunnel ('$helo')\n"
>>> +	    if $helo !~ m/^tunnel online$/;
>>> +    };
>>> +    $err = $@;
>>> +
>>> +    eval {
>>> +	my $ver = read_tunnel($tunnel, 10);
>>> +	if ($ver =~ /^ver (\d+)$/) {
>>> +	    $tunnel->{version} = $1;
>>> +	    $log->('info', "ssh tunnel $ver\n");
>>
>> Should pass only the message like the other calls to $log (unless the
>> function is intended to behave differently when multiple parameters are
>> used, but in qemu-server 06/10 that doesn't happen). What are the
>> reasons for not sticking to the two-parameter log interface?
> 
> no real reason other than 'let's make it more simple' (all the 'err'
> level does is prefix the lines with 'ERROR: '). but we can add it back
> in if we want?
> 
> replication uses a simple logfunc, so we need to adapt one to the other
> in any case:
> - either migration only using 'info' (current)
> - or replication wrapping its logfunc and throwing away the level/adding it
>    as prefix (alternative)
> 
> the alternative is less 'lossy', so maybe I'll switch to that..
> 

I'm fine with either way. I do feel like re-introducing the second 
parameter later on will be more difficult (because of 
package-boundaries), but OTOH there might not be any real need for it.

>>
>>> +	} else {
>>> +	    $err = "received invalid tunnel version string '$ver'\n" if !$err;
>>> +	}
>>> +    };
>>> +
>>> +    if ($err) {
>>> +	$finish_command_pipe->($tunnel);
>>> +	die "can't open tunnel - $err";
>>> +    }
>>> +    return $tunnel;
>>> +}
>>> +
>>> +sub forward_unix_socket {
>>> +    my ($tunnel, $local, $remote) = @_;
>>> +
>>> +    my $params = dclone($tunnel->{params});
>>> +    $params->{unix} = $local;
>>> +    $params->{url} = $params->{url} ."socket=".uri_escape($remote)."&";
>>> +    $params->{ticket} = { path => $remote };
>>> +
>>> +    my $cmd = encode_json({
>>> +	control => JSON::true,
>>> +	cmd => 'forward',
>>> +	data => $params,
>>> +    });
>>> +
>>> +    my $writer = $tunnel->{writer};
>>> +    $tunnel->{forwarded}->{$local} = $remote;
>>> +    eval {
>>> +	unlink $local;
>>> +	PVE::Tools::run_with_timeout(15, sub {
>>> +	    print $writer "$cmd\n";
>>> +	    $writer->flush();
>>> +	});
>>> +    };
>>> +    die "failed to write forwarding command - $@\n" if $@;
>>> +
>>> +    read_tunnel($tunnel);
>>> +}
>>> +
>>> +sub fork_websocket_tunnel {
>>> +    my ($conn, $url, $req_params, $tunnel_params, $log) = @_;
>>> +
>>> +    if (my $apitoken = $conn->{apitoken}) {
>>> +	$tunnel_params->{headers} = [["Authorization", "$apitoken"]];
>>> +    } else {
>>> +	die "can't connect to remote host without credentials\n";
>>> +    }
>>> +
>>> +    if (my $fps = $conn->{cached_fingerprints}) {
>>> +	$tunnel_params->{fingerprint} = (keys %$fps)[0];
>>> +    }
>>> +
>>> +    my $api_client = PVE::APIClient::LWP->new(%$conn);
>>> +
>>> +    my $res = $api_client->post(
>>> +	$url,
>>> +	$req_params,
>>> +    );
>>> +
>>> +    $log->("remote: started migration tunnel worker '$res->{upid}'");
>>
>> Nit: still mentions migration, as is done...
> 
> thanks, removed
> 
>>
>>> +
>>> +    my $websocket_url = $tunnel_params->{url};
>>> +
>>> +    $tunnel_params->{url} .= "?ticket=".uri_escape($res->{ticket});
>>> +    $tunnel_params->{url} .= "&socket=".uri_escape($res->{socket});
>>> +
>>> +    my $reader = IO::Pipe->new();
>>> +    my $writer = IO::Pipe->new();
>>> +
>>> +    my $cpid = fork();
>>> +    if ($cpid) {
>>> +	$writer->writer();
>>> +	$reader->reader();
>>> +	my $tunnel = { writer => $writer, reader => $reader, pid => $cpid };
>>> +
>>> +	eval {
>>> +	    my $writer = $tunnel->{writer};
>>> +	    my $cmd = encode_json({
>>> +		control => JSON::true,
>>> +		cmd => 'connect',
>>> +		data => $tunnel_params,
>>> +	    });
>>> +
>>> +	    eval {
>>> +		PVE::Tools::run_with_timeout(15, sub {
>>> +		    print {$writer} "$cmd\n";
>>> +		    $writer->flush();
>>> +		});
>>> +	    };
>>> +	    die "failed to write tunnel connect command - $@\n" if $@;
>>> +	};
>>> +	die "failed to connect via WS: $@\n" if $@;
>>> +
>>> +	my $err;
>>> +        eval {
>>> +	    my $writer = $tunnel->{writer};
>>> +	    my $cmd = encode_json({
>>> +		cmd => 'version',
>>> +	    });
>>> +
>>> +	    eval {
>>> +		PVE::Tools::run_with_timeout(15, sub {
>>> +		    print {$writer} "$cmd\n";
>>> +		    $writer->flush();
>>> +		});
>>> +	    };
>>> +	    $err = "failed to write tunnel version command - $@\n" if $@;
>>> +	    my $res = read_tunnel($tunnel, 10);
>>> +	    $res = JSON::decode_json($res);
>>> +	    my $version = $res->{api};
>>> +
>>> +	    if ($version =~ /^(\d+)$/) {
>>> +		$tunnel->{version} = $1;
>>> +		$tunnel->{age} = $res->{age};
>>> +	    } else {
>>> +		$err = "received invalid tunnel version string '$version'\n" if !$err;
>>> +	    }
>>> +	};
>>> +	$err = $@ if !$err;
>>> +
>>> +	if ($err) {
>>> +	    $finish_command_pipe->($tunnel);
>>
>> Nit: Here, $tunnel->{log} is still undef, but finish_command_pipe might
>> try to use it.
> 
> ack, since we already have $log anyway we can just set it from the
> start, like in fork_ssh_tunnel.
> 
>>
>>> +	    die "can't open migration tunnel - $err";
>>
>> ...here.
> 
> removed as well.
> 
>>
>>> +	}
>>> +
>>> +	$tunnel_params->{url} = "$websocket_url?"; # reset ticket and socket
>>> +
>>> +	$tunnel->{params} = $tunnel_params; # for forwarding
>>> +	$tunnel->{log} = $log;
>>> +
>>> +	return $tunnel;
>>> +    } else {
>>> +	eval {
>>> +	    $writer->reader();
>>> +	    $reader->writer();
>>> +	    PVE::Tools::run_command(
>>> +		['proxmox-websocket-tunnel'],
>>> +		input => "<&".fileno($writer),
>>> +		output => ">&".fileno($reader),
>>> +		errfunc => sub { my $line = shift; print "tunnel: $line\n"; },
>>> +	    );
>>> +	};
>>> +	warn "CMD websocket tunnel died: $@\n" if $@;
>>> +	exit 0;
>>> +    }
>>> +}
>>> +
>>> +sub finish_tunnel {
>>> +    my ($tunnel, $cleanup) = @_;
>>> +
>>> +    $cleanup = $cleanup ? 1 : 0;
>>> +
>>> +    eval { write_tunnel($tunnel, 30, 'quit', { cleanup => $cleanup }); };
>>> +    my $err = $@;
>>> +
>>> +    $finish_command_pipe->($tunnel, 30);
>>> +
>>> +    if (my $unix_sockets = $tunnel->{unix_sockets}) {
>>> +	# ssh does not clean up on local host
>>> +	my $cmd = ['rm', '-f', @$unix_sockets];
>>> +	PVE::Tools::run_command($cmd);
>>> +
>>> +	# .. and just to be sure check on remote side
>>> +	if ($tunnel->{rem_ssh}) {
>>> +	    unshift @{$cmd}, @{$tunnel->{rem_ssh}};
>>> +	    PVE::Tools::run_command($cmd);
>>> +	}
>>> +    }
>>> +
>>> +    die $err if $err;
>>> +}
>>




^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2022-01-20  9:57 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-22 13:52 [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 1/3] migrate: handle migration_network with " Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 2/3] add tunnel helper module Fabian Grünbichler
2022-01-03 12:30   ` Fabian Ebner
     [not found]     ` <<47e7d41f-e328-d9fa-25b7-f7585de8ce5b@proxmox.com>
2022-01-19 14:30       ` Fabian Grünbichler
2022-01-20  9:57         ` Fabian Ebner
2021-12-22 13:52 ` [pve-devel] [PATCH v3 guest-common 3/3] add storage tunnel module Fabian Grünbichler
2022-01-03 14:30   ` Fabian Ebner
     [not found]     ` <<af15fed1-2d06-540e-cde8-ed1ce772aeb4@proxmox.com>
2022-01-19 14:31       ` Fabian Grünbichler
2022-01-05 10:50   ` Fabian Ebner
2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 1/4] initial commit Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 2/4] add tunnel implementation Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 3/4] add fingerprint validation Fabian Grünbichler
2022-01-04 11:37   ` Fabian Ebner
2022-01-19 10:34     ` Fabian Grünbichler
2022-01-19 12:16       ` Fabian Ebner
2022-01-19 12:53         ` Josef Johansson
2021-12-22 13:52 ` [pve-devel] [PATCH v3 proxmox-websocket-tunnel 4/4] add packaging Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 01/10] refactor map_storage to map_id Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 02/10] schema: use pve-bridge-id Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 03/10] parse_config: optional strict mode Fabian Grünbichler
2022-01-04 11:57   ` Fabian Ebner
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 04/10] update_vm: allow simultaneous setting of boot-order and dev Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 05/10] nbd alloc helper: allow passing in explicit format Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 06/10] migrate: move tunnel-helpers to pve-guest-common Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 07/10] mtunnel: add API endpoints Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 08/10] migrate: refactor remote VM/tunnel start Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 09/10] migrate: add remote migration handling Fabian Grünbichler
2022-01-04 13:58   ` Fabian Ebner
2022-01-04 16:44     ` Roland
2022-01-11  8:19       ` Thomas Lamprecht
     [not found]         ` <<554040de-09d6-974b-143a-80c2d66b9573@proxmox.com>
2022-01-19 14:32           ` Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 qemu-server 10/10] api: add remote migrate endpoint Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 1/4] volname_for_storage: parse volname before calling Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 2/4] storage_migrate: pull out snapshot decision Fabian Grünbichler
2022-01-05  9:00   ` Fabian Ebner
2022-01-19 14:38     ` Fabian Grünbichler
2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 3/4] storage_migrate: pull out import/export_prepare Fabian Grünbichler
2022-01-05  9:59   ` Fabian Ebner
2021-12-22 13:52 ` [pve-devel] [PATCH v3 storage 4/4] add volume_import/export_start helpers Fabian Grünbichler
2021-12-23 13:56 ` [pve-devel] [PATCH v3 qemu-server++ 0/21] remote migration Fabian Grünbichler
