public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH-SERIES v6 0/13] remote migration
@ 2022-09-28 12:50 Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 access-control 1/1] privs: add Sys.Incoming Fabian Grünbichler
                   ` (13 more replies)
  0 siblings, 14 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

this series adds remote migration for VMs and CTs.

both live and offline migration of VMs should work, including NBD and
storage-migrated disks. containers have no live migration, so offline and
restart mode work identically except for the restart itself.

the groundwork for extending this to pvesr is already laid.

uncovered (but not yet fixed):
https://bugzilla.proxmox.com/show_bug.cgi?id=3873
(migration btrfs -> btrfs with snapshots)

dependencies/breaks:
- qemu-server / pve-container -> bumped pve-storage (taint bug in
  storage migration)
- qemu-server / pve-container -> bumped pve-access-control (new priv)
- qemu-server -> bumped pve-common (moved pve-targetstorage option)
- pve-common -BREAKS-> not-bumped qemu-server (same)

follow-ups/todos:
- implement disk export/import for shared storages like rbd
- implement disk export/import raw+size for ZFS zvols
- extend ZFS replication via websocket tunnel to remote cluster
- extend replication to support RBD snapshot-based replication
- extend RBD replication via websocket tunnel to remote cluster
- switch the regular migration SSH mtunnel to version 2 with JSON support
  (related -> S. Hanreich's pre-/post-migrate-hook series)

new in v6:
- --with-local-disks is now always set and no longer a parameter
- `pct remote-migrate`
- new Sys.Incoming privilege + checks
- storage export taintedness bug fix
- properly take over pve-targetstorage option (qemu-server ->
  pve-common)
- review feedback addressed

new in v5: lots of edge cases fixed, PoC for pve-container, and more
helpers moved for re-use in pve-container without duplication

new in v4: lots of small fixes, improved bwlimit handling, `qm` command
(thanks Fabian Ebner and Dominik Csapak for the feedback on v3!)

new in v3: lots of refactoring and edge-case handling

new in v2: dropped parts already applied, incorporated Fabian's and
Dominik's feedback (thanks!)

new in v1: explicit remote endpoint specified as part of the API call
instead of remote.cfg
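
for illustration, such an endpoint is passed as a 'proxmox-remote' property
string (all values below are placeholders; the port defaults to 8006):

    host=<host>,apitoken=PVEAPIToken=<user>@<realm>!<tokenid>=<secret>,fingerprint=<cert-fingerprint>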

overview of affected repos and changes; see the individual patches for
more details.

pve-access-control:

Fabian Grünbichler (1):
  privs: add Sys.Incoming

 src/PVE/AccessControl.pm | 1 +
 1 file changed, 1 insertion(+)

pve-common:

Fabian Grünbichler (1):
  schema: take over 'pve-targetstorage' option

 src/PVE/JSONSchema.pm | 7 +++++++
 1 file changed, 7 insertions(+)

pve-container:

Fabian Grünbichler (3):
  migration: add remote migration
  pct: add 'remote-migrate' command
  migrate: print mapped volume in error

 src/PVE/API2/LXC.pm    | 635 +++++++++++++++++++++++++++++++++++++++++
 src/PVE/CLI/pct.pm     | 124 ++++++++
 src/PVE/LXC/Migrate.pm | 248 +++++++++++++---
 3 files changed, 965 insertions(+), 42 deletions(-)

pve-docs:

Fabian Grünbichler (1):
  pveum: mention Sys.Incoming privilege

 pveum.adoc | 1 +
 1 file changed, 1 insertion(+)

qemu-server:

Fabian Grünbichler (6):
  schema: move 'pve-targetstorage' to pve-common
  mtunnel: add API endpoints
  migrate: refactor remote VM/tunnel start
  migrate: add remote migration handling
  api: add remote migrate endpoint
  qm: add remote-migrate command

 PVE/API2/Qemu.pm   | 709 ++++++++++++++++++++++++++++++++++++++++++++-
 PVE/CLI/qm.pm      | 113 ++++++++
 PVE/QemuMigrate.pm | 590 ++++++++++++++++++++++++++++---------
 PVE/QemuServer.pm  |  48 ++-
 debian/control     |   5 +-
 5 files changed, 1299 insertions(+), 166 deletions(-)

pve-storage:

Fabian Grünbichler (1):
  (remote) export: check and untaint format

 PVE/CLI/pvesm.pm | 6 ++----
 PVE/Storage.pm   | 9 +++++++++
 2 files changed, 11 insertions(+), 4 deletions(-)

-- 
2.30.2

* [pve-devel] [PATCH v6 access-control 1/1] privs: add Sys.Incoming
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-11-07 15:38   ` [pve-devel] applied: " Thomas Lamprecht
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 common 1/1] schema: take over 'pve-targetstorage' option Fabian Grünbichler
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

for guarding cross-cluster data streams like guest migrations and
storage migrations.
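
the endpoints added later in this series check it on '/' in addition to a
guest-specific privilege, e.g. (excerpt from the mtunnel endpoints below,
lightly reformatted):

    permissions => {
	check => [ 'and',
	    ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
	    ['perm', '/', [ 'Sys.Incoming' ]],
	],
    },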

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 src/PVE/AccessControl.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/PVE/AccessControl.pm b/src/PVE/AccessControl.pm
index c32dcc3..2dcb897 100644
--- a/src/PVE/AccessControl.pm
+++ b/src/PVE/AccessControl.pm
@@ -1022,6 +1022,7 @@ my $privgroups = {
 	root => [
 	    'Sys.PowerMgmt',
 	    'Sys.Modify', # edit/change node settings
+	    'Sys.Incoming', # incoming storage/guest migrations
 	],
 	admin => [
 	    'Permissions.Modify',
-- 
2.30.2

* [pve-devel] [PATCH v6 common 1/1] schema: take over 'pve-targetstorage' option
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 access-control 1/1] privs: add Sys.Incoming Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-11-07 15:31   ` [pve-devel] applied: " Thomas Lamprecht
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 1/3] migration: add remote migration Fabian Grünbichler
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

from qemu-server, for re-use in pve-container.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
---

Notes:
    requires versioned breaks on old qemu-server containing the option, to avoid
    registering twice
    
    new in v6/follow-up to v5
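
    for reference, some illustrative values for the option (storage IDs are
    placeholders; note that the remote migration paths added in this series
    reject the identity mapping and require an explicit one):

        srcstore:dststore,other-src:other-dst   (explicit per-storage mapping)
        dststore                                (map all source storages to 'dststore')
        1                                       (identity mapping, not allowed for remote migration)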

 src/PVE/JSONSchema.pm | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/PVE/JSONSchema.pm b/src/PVE/JSONSchema.pm
index 54c149d..527e409 100644
--- a/src/PVE/JSONSchema.pm
+++ b/src/PVE/JSONSchema.pm
@@ -318,6 +318,13 @@ my $verify_idpair = sub {
     return $input;
 };
 
+PVE::JSONSchema::register_standard_option('pve-targetstorage', {
+    description => "Mapping from source to target storages. Providing only a single storage ID maps all source storages to that storage. Providing the special value '1' will map each source storage to itself.",
+    type => 'string',
+    format => 'storage-pair-list',
+    optional => 1,
+});
+
 # note: this only checks a single list entry
 # when using a storage-pair-list map, you need to pass the full parameter to
 # parse_idmap
-- 
2.30.2

* [pve-devel] [PATCH v6 container 1/3] migration: add remote migration
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 access-control 1/1] privs: add Sys.Incoming Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 common 1/1] schema: take over 'pve-targetstorage' option Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-10-03 13:22   ` [pve-devel] [PATCH FOLLOW-UP " Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 2/3] pct: add 'remote-migrate' command Fabian Grünbichler
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

modelled after the VM migration, but folded into a single commit since
the actual migration changes are a lot smaller here.
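
for orientation, a successful remote CT migration issues roughly the
following commands over the websocket tunnel (see the handlers added below;
the exact sequence depends on migration mode and the volumes involved):

    version
      -> bwlimit / disk-import / query-disk-import   (per volume)
      -> config
      -> unlock
      -> start   (restart migration of a running CT)
      -> quit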

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6:
    - check for Sys.Incoming in mtunnel API endpoint
    - mark as experimental
    - test_mp fix for non-snapshot calls
    
    new in v5 - PoC to ensure helpers and abstractions are re-usable
    
    requires bumped pve-storage to avoid tainted issue

 src/PVE/API2/LXC.pm    | 635 +++++++++++++++++++++++++++++++++++++++++
 src/PVE/LXC/Migrate.pm | 245 +++++++++++++---
 2 files changed, 838 insertions(+), 42 deletions(-)

diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
index 589f96f..4e21be4 100644
--- a/src/PVE/API2/LXC.pm
+++ b/src/PVE/API2/LXC.pm
@@ -3,6 +3,8 @@ package PVE::API2::LXC;
 use strict;
 use warnings;
 
+use Socket qw(SOCK_STREAM);
+
 use PVE::SafeSyslog;
 use PVE::Tools qw(extract_param run_command);
 use PVE::Exception qw(raise raise_param_exc raise_perm_exc);
@@ -1089,6 +1091,174 @@ __PACKAGE__->register_method ({
     }});
 
 
+__PACKAGE__->register_method({
+    name => 'remote_migrate_vm',
+    path => '{vmid}/remote_migrate',
+    method => 'POST',
+    protected => 1,
+    proxyto => 'node',
+    description => "Migrate the container to another cluster. Creates a new migration task. EXPERIMENTAL feature!",
+    permissions => {
+	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
+    },
+    parameters => {
+    	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::LXC::complete_ctid }),
+	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
+	    'target-endpoint' => get_standard_option('proxmox-remote', {
+		description => "Remote target endpoint",
+	    }),
+	    online => {
+		type => 'boolean',
+		description => "Use online/live migration.",
+		optional => 1,
+	    },
+	    restart => {
+		type => 'boolean',
+		description => "Use restart migration",
+		optional => 1,
+	    },
+	    timeout => {
+		type => 'integer',
+		description => "Timeout in seconds for shutdown for restart migration",
+		optional => 1,
+		default => 180,
+	    },
+	    delete => {
+		type => 'boolean',
+		description => "Delete the original CT and related data after successful migration. By default the original CT is kept on the source cluster in a stopped state.",
+		optional => 1,
+		default => 0,
+	    },
+	    'target-storage' => get_standard_option('pve-targetstorage', {
+		optional => 0,
+	    }),
+	    'target-bridge' => {
+		type => 'string',
+		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
+		format => 'bridge-pair-list',
+	    },
+	    bwlimit => {
+		description => "Override I/O bandwidth limit (in KiB/s).",
+		optional => 1,
+		type => 'number',
+		minimum => '0',
+		default => 'migrate limit from datacenter or storage config',
+	    },
+	},
+    },
+    returns => {
+	type => 'string',
+	description => "the task ID.",
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $source_vmid = extract_param($param, 'vmid');
+	my $target_endpoint = extract_param($param, 'target-endpoint');
+	my $target_vmid = extract_param($param, 'target-vmid') // $source_vmid;
+
+	my $delete = extract_param($param, 'delete') // 0;
+
+	PVE::Cluster::check_cfs_quorum();
+
+	# test if CT exists
+	my $conf = PVE::LXC::Config->load_config($source_vmid);
+	PVE::LXC::Config->check_lock($conf);
+
+	# try to detect errors early
+	if (PVE::LXC::check_running($source_vmid)) {
+	    die "can't migrate running container without --online or --restart\n"
+		if !$param->{online} && !$param->{restart};
+	}
+
+	raise_param_exc({ vmid => "cannot migrate HA-managed CT to remote cluster" })
+	    if PVE::HA::Config::vm_is_ha_managed($source_vmid);
+
+	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
+
+	# TODO: move this as helper somewhere appropriate?
+	my $conn_args = {
+	    protocol => 'https',
+	    host => $remote->{host},
+	    port => $remote->{port} // 8006,
+	    apitoken => $remote->{apitoken},
+	};
+
+	my $fp;
+	if ($fp = $remote->{fingerprint}) {
+	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 };
+	}
+
+	print "Establishing API connection with remote at '$remote->{host}'\n";
+
+	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
+
+	if (!defined($fp)) {
+	    my $cert_info = $api_client->get("/nodes/localhost/certificates/info");
+	    foreach my $cert (@$cert_info) {
+		my $filename = $cert->{filename};
+		next if $filename ne 'pveproxy-ssl.pem' && $filename ne 'pve-ssl.pem';
+		$fp = $cert->{fingerprint} if !$fp || $filename eq 'pveproxy-ssl.pem';
+	    }
+	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 }
+		if defined($fp);
+	}
+
+	my $storecfg = PVE::Storage::config();
+	my $target_storage = extract_param($param, 'target-storage');
+	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
+	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
+	    if $@;
+
+	my $target_bridge = extract_param($param, 'target-bridge');
+	my $bridgemap = eval { PVE::JSONSchema::parse_idmap($target_bridge, 'pve-bridge-id') };
+	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
+	    if $@;
+
+	die "remote migration requires explicit storage mapping!\n"
+	    if $storagemap->{identity};
+
+	$param->{storagemap} = $storagemap;
+	$param->{bridgemap} = $bridgemap;
+	$param->{remote} = {
+	    conn => $conn_args, # re-use fingerprint for tunnel
+	    client => $api_client,
+	    vmid => $target_vmid,
+	};
+	$param->{migration_type} = 'websocket';
+	$param->{delete} = $delete if $delete;
+
+	my $cluster_status = $api_client->get("/cluster/status");
+	my $target_node;
+	foreach my $entry (@$cluster_status) {
+	    next if $entry->{type} ne 'node';
+	    if ($entry->{local}) {
+		$target_node = $entry->{name};
+		last;
+	    }
+	}
+
+	die "couldn't determine endpoint's node name\n"
+	    if !defined($target_node);
+
+	my $realcmd = sub {
+	    PVE::LXC::Migrate->migrate($target_node, $remote->{host}, $source_vmid, $param);
+	};
+
+	my $worker = sub {
+	    return PVE::GuestHelpers::guest_migration_lock($source_vmid, 10, $realcmd);
+	};
+
+	return $rpcenv->fork_worker('vzmigrate', $source_vmid, $authuser, $worker);
+    }});
+
+
 __PACKAGE__->register_method({
     name => 'migrate_vm',
     path => '{vmid}/migrate',
@@ -2318,4 +2488,469 @@ __PACKAGE__->register_method({
 	return PVE::GuestHelpers::config_with_pending_array($conf, $pending_delete_hash);
     }});
 
+__PACKAGE__->register_method({
+    name => 'mtunnel',
+    path => '{vmid}/mtunnel',
+    method => 'POST',
+    protected => 1,
+    description => 'Migration tunnel endpoint - only for internal use by CT migration.',
+    permissions => {
+	check =>
+	[ 'and',
+	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
+	  ['perm', '/', [ 'Sys.Incoming' ]],
+	],
+	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
+	               " on '/'. Further permission checks happen during the actual migration.",
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid'),
+	    storages => {
+		type => 'string',
+		format => 'pve-storage-id-list',
+		optional => 1,
+		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
+	    },
+	    bridges => {
+		type => 'string',
+		format => 'pve-bridge-id-list',
+		optional => 1,
+		description => 'List of network bridges to check availability. Will be checked again for actually used bridges during migration.',
+	    },
+	},
+    },
+    returns => {
+	additionalProperties => 0,
+	properties => {
+	    upid => { type => 'string' },
+	    ticket => { type => 'string' },
+	    socket => { type => 'string' },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $node = extract_param($param, 'node');
+	my $vmid = extract_param($param, 'vmid');
+
+	my $storages = extract_param($param, 'storages');
+	my $bridges = extract_param($param, 'bridges');
+
+	my $nodename = PVE::INotify::nodename();
+
+	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
+	    if $node ne 'localhost' && $node ne $nodename;
+
+	$node = $nodename;
+
+	my $storecfg = PVE::Storage::config();
+	foreach my $storeid (PVE::Tools::split_list($storages)) {
+	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
+	}
+
+	foreach my $bridge (PVE::Tools::split_list($bridges)) {
+	    PVE::Network::read_bridge_mtu($bridge);
+	}
+
+	PVE::Cluster::check_cfs_quorum();
+
+	my $socket_addr = "/run/pve/ct-$vmid.mtunnel";
+
+	my $lock = 'create';
+	eval { PVE::LXC::Config->create_and_lock_config($vmid, 0, $lock); };
+
+	raise_param_exc({ vmid => "unable to create empty CT config - $@"})
+	    if $@;
+
+	my $realcmd = sub {
+	    my $state = {
+		storecfg => PVE::Storage::config(),
+		lock => $lock,
+		vmid => $vmid,
+	    };
+
+	    my $run_locked = sub {
+		my ($code, $params) = @_;
+		return PVE::LXC::Config->lock_config($state->{vmid}, sub {
+		    my $conf = PVE::LXC::Config->load_config($state->{vmid});
+
+		    $state->{conf} = $conf;
+
+		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
+			if $state->{lock} && !PVE::LXC::Config->has_lock($conf, $state->{lock});
+
+		    return $code->($params);
+		});
+	    };
+
+	    my $cmd_desc = {
+		config => {
+		    conf => {
+			type => 'string',
+			description => 'Full CT config, adapted for target cluster/node',
+		    },
+		    'firewall-config' => {
+			type => 'string',
+			description => 'CT firewall config',
+			optional => 1,
+		    },
+		},
+		ticket => {
+		    path => {
+			type => 'string',
+			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
+		    },
+		},
+		quit => {
+		    cleanup => {
+			type => 'boolean',
+			description => 'remove CT config and volumes, aborting migration',
+			default => 0,
+		    },
+		},
+		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
+		'query-disk-import' => $PVE::StorageTunnel::cmd_schema->{'query-disk-import'},
+		bwlimit => $PVE::StorageTunnel::cmd_schema->{bwlimit},
+	    };
+
+	    my $cmd_handlers = {
+		'version' => sub {
+		    # compared against other end's version
+		    # bump/reset for breaking changes
+		    # bump/bump for opt-in changes
+		    return {
+			api => $PVE::LXC::Migrate::WS_TUNNEL_VERSION,
+			age => 0,
+		    };
+		},
+		'config' => sub {
+		    my ($params) = @_;
+
+		    # parse and write out VM FW config if given
+		    if (my $fw_conf = $params->{'firewall-config'}) {
+			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
+
+			my $empty_conf = {
+			    rules => [],
+			    options => {},
+			    aliases => {},
+			    ipset => {} ,
+			    ipset_comments => {},
+			};
+			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
+
+			# TODO: add flag for strict parsing?
+			# TODO: add import sub that does all this given raw content?
+			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
+			$vmfw_conf->{vmid} = $state->{vmid};
+			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
+
+			$state->{cleanup}->{fw} = 1;
+		    }
+
+		    my $conf_fn = "incoming/lxc/$state->{vmid}.conf";
+		    my $new_conf = PVE::LXC::Config::parse_pct_config($conf_fn, $params->{conf}, 1);
+		    delete $new_conf->{lock};
+		    delete $new_conf->{digest};
+
+		    my $unprivileged = delete $new_conf->{unprivileged};
+		    my $arch = delete $new_conf->{arch};
+
+		    # TODO handle properly?
+		    delete $new_conf->{snapshots};
+		    delete $new_conf->{parent};
+		    delete $new_conf->{pending};
+		    delete $new_conf->{lxc};
+
+		    PVE::LXC::Config->remove_lock($state->{vmid}, 'create');
+
+		    eval {
+			my $conf = {
+			    unprivileged => $unprivileged,
+			    arch => $arch,
+			};
+			PVE::LXC::check_ct_modify_config_perm(
+			    $rpcenv,
+			    $authuser,
+			    $state->{vmid},
+			    undef,
+			    $conf,
+			    $new_conf,
+			    undef,
+			    $unprivileged,
+			);
+			my $errors = PVE::LXC::Config->update_pct_config(
+			    $state->{vmid},
+			    $conf,
+			    0,
+			    $new_conf,
+			    [],
+			    [],
+			);
+			raise_param_exc($errors) if scalar(keys %$errors);
+			PVE::LXC::Config->write_config($state->{vmid}, $conf);
+			PVE::LXC::update_lxc_config($vmid, $conf);
+		    };
+		    if (my $err = $@) {
+			# revert to locked previous config
+			my $conf = PVE::LXC::Config->load_config($state->{vmid});
+			$conf->{lock} = 'create';
+			PVE::LXC::Config->write_config($state->{vmid}, $conf);
+
+			die $err;
+		    }
+
+		    my $conf = PVE::LXC::Config->load_config($state->{vmid});
+		    $conf->{lock} = 'migrate';
+		    PVE::LXC::Config->write_config($state->{vmid}, $conf);
+
+		    $state->{lock} = 'migrate';
+
+		    return;
+		},
+		'bwlimit' => sub {
+		    my ($params) = @_;
+		    return PVE::StorageTunnel::handle_bwlimit($params);
+		},
+		'disk-import' => sub {
+		    my ($params) = @_;
+
+		    $check_storage_access_migrate->(
+			$rpcenv,
+			$authuser,
+			$state->{storecfg},
+			$params->{storage},
+			$node
+		    );
+
+		    $params->{unix} = "/run/pve/ct-$state->{vmid}.storage";
+
+		    return PVE::StorageTunnel::handle_disk_import($state, $params);
+		},
+		'query-disk-import' => sub {
+		    my ($params) = @_;
+
+		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
+		},
+		'unlock' => sub {
+		    PVE::LXC::Config->remove_lock($state->{vmid}, $state->{lock});
+		    delete $state->{lock};
+		    return;
+		},
+		'start' => sub {
+		    PVE::LXC::vm_start(
+			$state->{vmid},
+			$state->{conf},
+			0
+		    );
+
+		    return;
+		},
+		'stop' => sub {
+		    PVE::LXC::vm_stop($state->{vmid}, 1, 10, 1);
+		    return;
+		},
+		'ticket' => sub {
+		    my ($params) = @_;
+
+		    my $path = $params->{path};
+
+		    die "Not allowed to generate ticket for unknown socket '$path'\n"
+			if !defined($state->{sockets}->{$path});
+
+		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
+		},
+		'quit' => sub {
+		    my ($params) = @_;
+
+		    if ($params->{cleanup}) {
+			if ($state->{cleanup}->{fw}) {
+			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
+			}
+
+			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
+			    print "freeing volume '$volid' as part of cleanup\n";
+			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
+			    warn $@ if $@;
+			}
+
+			PVE::LXC::destroy_lxc_container(
+			    $state->{storecfg},
+			    $state->{vmid},
+			    $state->{conf},
+			    undef,
+			    0,
+			);
+		    }
+
+		    print "switching to exit-mode, waiting for client to disconnect\n";
+		    $state->{exit} = 1;
+		    return;
+		},
+	    };
+
+	    $run_locked->(sub {
+		my $socket_addr = "/run/pve/ct-$state->{vmid}.mtunnel";
+		unlink $socket_addr;
+
+		$state->{socket} = IO::Socket::UNIX->new(
+	            Type => SOCK_STREAM(),
+		    Local => $socket_addr,
+		    Listen => 1,
+		);
+
+		$state->{socket_uid} = getpwnam('www-data')
+		    or die "Failed to resolve user 'www-data' to numeric UID\n";
+		chown $state->{socket_uid}, -1, $socket_addr;
+	    });
+
+	    print "mtunnel started\n";
+
+	    my $conn = eval { PVE::Tools::run_with_timeout(300, sub { $state->{socket}->accept() }) };
+	    if ($@) {
+		warn "Failed to accept tunnel connection - $@\n";
+
+		warn "Removing tunnel socket..\n";
+		unlink $state->{socket};
+
+		warn "Removing temporary VM config..\n";
+		$run_locked->(sub {
+		    PVE::LXC::destroy_config($state->{vmid});
+		});
+
+		die "Exiting mtunnel\n";
+	    }
+
+	    $state->{conn} = $conn;
+
+	    my $reply_err = sub {
+		my ($msg) = @_;
+
+		my $reply = JSON::encode_json({
+		    success => JSON::false,
+		    msg => $msg,
+		});
+		$conn->print("$reply\n");
+		$conn->flush();
+	    };
+
+	    my $reply_ok = sub {
+		my ($res) = @_;
+
+		$res->{success} = JSON::true;
+		my $reply = JSON::encode_json($res);
+		$conn->print("$reply\n");
+		$conn->flush();
+	    };
+
+	    while (my $line = <$conn>) {
+		chomp $line;
+
+		# untaint, we validate below if needed
+		($line) = $line =~ /^(.*)$/;
+		my $parsed = eval { JSON::decode_json($line) };
+		if ($@) {
+		    $reply_err->("failed to parse command - $@");
+		    next;
+		}
+
+		my $cmd = delete $parsed->{cmd};
+		if (!defined($cmd)) {
+		    $reply_err->("'cmd' missing");
+		} elsif ($state->{exit}) {
+		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
+		    next;
+		} elsif (my $handler = $cmd_handlers->{$cmd}) {
+		    print "received command '$cmd'\n";
+		    eval {
+			if ($cmd_desc->{$cmd}) {
+			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
+			} else {
+			    $parsed = {};
+			}
+			my $res = $run_locked->($handler, $parsed);
+			$reply_ok->($res);
+		    };
+		    $reply_err->("failed to handle '$cmd' command - $@")
+			if $@;
+		} else {
+		    $reply_err->("unknown command '$cmd' given");
+		}
+	    }
+
+	    if ($state->{exit}) {
+		print "mtunnel exited\n";
+	    } else {
+		die "mtunnel exited unexpectedly\n";
+	    }
+	};
+
+	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
+	my $upid = $rpcenv->fork_worker('vzmtunnel', $vmid, $authuser, $realcmd);
+
+	return {
+	    ticket => $ticket,
+	    upid => $upid,
+	    socket => $socket_addr,
+	};
+    }});
+
+__PACKAGE__->register_method({
+    name => 'mtunnelwebsocket',
+    path => '{vmid}/mtunnelwebsocket',
+    method => 'GET',
+    permissions => {
+	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
+        user => 'all', # check inside
+    },
+    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid'),
+	    socket => {
+		type => "string",
+		description => "unix socket to forward to",
+	    },
+	    ticket => {
+		type => "string",
+		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
+	    },
+	},
+    },
+    returns => {
+	type => "object",
+	properties => {
+	    port => { type => 'string', optional => 1 },
+	    socket => { type => 'string', optional => 1 },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $nodename = PVE::INotify::nodename();
+	my $node = extract_param($param, 'node');
+
+	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
+	    if $node ne 'localhost' && $node ne $nodename;
+
+	my $vmid = $param->{vmid};
+	# check VM exists
+	PVE::LXC::Config->load_config($vmid);
+
+	my $socket = $param->{socket};
+	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
+
+	return { socket => $socket };
+    }});
 1;
diff --git a/src/PVE/LXC/Migrate.pm b/src/PVE/LXC/Migrate.pm
index 2ef1cce..a0ab65e 100644
--- a/src/PVE/LXC/Migrate.pm
+++ b/src/PVE/LXC/Migrate.pm
@@ -17,6 +17,9 @@ use PVE::Replication;
 
 use base qw(PVE::AbstractMigrate);
 
+# compared against remote end's minimum version
+our $WS_TUNNEL_VERSION = 2;
+
 sub lock_vm {
     my ($self, $vmid, $code, @param) = @_;
 
@@ -28,6 +31,7 @@ sub prepare {
 
     my $online = $self->{opts}->{online};
     my $restart= $self->{opts}->{restart};
+    my $remote = $self->{opts}->{remote};
 
     $self->{storecfg} = PVE::Storage::config();
 
@@ -44,6 +48,7 @@ sub prepare {
     }
     $self->{was_running} = $running;
 
+    my $storages = {};
     PVE::LXC::Config->foreach_volume_full($conf, { include_unused => 1 }, sub {
 	my ($ms, $mountpoint) = @_;
 
@@ -70,7 +75,7 @@ sub prepare {
 	die "content type 'rootdir' is not available on storage '$storage'\n"
 	    if !$scfg->{content}->{rootdir};
 
-	if ($scfg->{shared}) {
+	if ($scfg->{shared} && !$remote) {
 	    # PVE::Storage::activate_storage checks this for non-shared storages
 	    my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
 	    warn "Used shared storage '$storage' is not online on source node!\n"
@@ -83,18 +88,63 @@ sub prepare {
 	    $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $storage);
 	}
 
-	my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
+	if (!$remote) {
+	    my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
+
+	    die "$volid: content type 'rootdir' is not available on storage '$targetsid'\n"
+		if !$target_scfg->{content}->{rootdir};
+	}
 
-	die "$volid: content type 'rootdir' is not available on storage '$targetsid'\n"
-	    if !$target_scfg->{content}->{rootdir};
+	$storages->{$targetsid} = 1;
     });
 
     # todo: test if VM uses local resources
 
-    # test ssh connection
-    my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
-    eval { $self->cmd_quiet($cmd); };
-    die "Can't connect to destination address using public key\n" if $@;
+    if ($remote) {
+	# test & establish websocket connection
+	my $bridges = map_bridges($conf, $self->{opts}->{bridgemap}, 1);
+
+	my $remote = $self->{opts}->{remote};
+	my $conn = $remote->{conn};
+
+	my $log = sub {
+	    my ($level, $msg) = @_;
+	    $self->log($level, $msg);
+	};
+
+	my $websocket_url = "https://$conn->{host}:$conn->{port}/api2/json/nodes/$self->{node}/lxc/$remote->{vmid}/mtunnelwebsocket";
+	my $url = "/nodes/$self->{node}/lxc/$remote->{vmid}/mtunnel";
+
+	my $tunnel_params = {
+	    url => $websocket_url,
+	};
+
+	my $storage_list = join(',', keys %$storages);
+	my $bridge_list = join(',', keys %$bridges);
+
+	my $req_params = {
+	    storages => $storage_list,
+	    bridges => $bridge_list,
+	};
+
+	my $tunnel = PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
+	my $min_version = $tunnel->{version} - $tunnel->{age};
+	$self->log('info', "local WS tunnel version: $WS_TUNNEL_VERSION");
+	$self->log('info', "remote WS tunnel version: $tunnel->{version}");
+	$self->log('info', "minimum required WS tunnel version: $min_version");
+	die "Remote tunnel endpoint not compatible, upgrade required\n"
+	    if $WS_TUNNEL_VERSION < $min_version;
+	 die "Remote tunnel endpoint too old, upgrade required\n"
+	    if $WS_TUNNEL_VERSION > $tunnel->{version};
+
+	$self->log('info', "websocket tunnel started\n");
+	$self->{tunnel} = $tunnel;
+    } else {
+	# test ssh connection
+	my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
+	eval { $self->cmd_quiet($cmd); };
+	die "Can't connect to destination address using public key\n" if $@;
+    }
 
     # in restart mode, we shutdown the container before migrating
     if ($restart && $running) {
@@ -113,6 +163,8 @@ sub prepare {
 sub phase1 {
     my ($self, $vmid) = @_;
 
+    my $remote = $self->{opts}->{remote};
+
     $self->log('info', "starting migration of CT $self->{vmid} to node '$self->{node}' ($self->{nodeip})");
 
     my $conf = $self->{vmconf};
@@ -147,7 +199,7 @@ sub phase1 {
 
 	my $targetsid = $sid;
 
-	if ($scfg->{shared}) {
+	if ($scfg->{shared} && !$remote) {
 	    $self->log('info', "volume '$volid' is on shared storage '$sid'")
 		if !$snapname;
 	    return;
@@ -155,7 +207,8 @@ sub phase1 {
 	    $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $sid);
 	}
 
-	PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
+	PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node})
+	    if !$remote;
 
 	my $bwlimit = $self->get_bwlimit($sid, $targetsid);
 
@@ -192,6 +245,9 @@ sub phase1 {
 
 	eval {
 	    &$test_volid($volid, $snapname);
+
+	    die "remote migration with snapshots not supported yet\n"
+		if $remote && $snapname;
 	};
 
 	&$log_error($@, $volid) if $@;
@@ -201,7 +257,7 @@ sub phase1 {
     my @sids = PVE::Storage::storage_ids($self->{storecfg});
     foreach my $storeid (@sids) {
 	my $scfg = PVE::Storage::storage_config($self->{storecfg}, $storeid);
-	next if $scfg->{shared};
+	next if $scfg->{shared} && !$remote;
 	next if !PVE::Storage::storage_check_enabled($self->{storecfg}, $storeid, undef, 1);
 
 	# get list from PVE::Storage (for unreferenced volumes)
@@ -211,10 +267,12 @@ sub phase1 {
 
 	# check if storage is available on target node
 	my $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $storeid);
-	my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
+	if (!$remote) {
+	    my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
 
-	die "content type 'rootdir' is not available on storage '$targetsid'\n"
-	    if !$target_scfg->{content}->{rootdir};
+	    die "content type 'rootdir' is not available on storage '$targetsid'\n"
+		if !$target_scfg->{content}->{rootdir};
+	}
 
 	PVE::Storage::foreach_volid($dl, sub {
 	    my ($volid, $sid, $volname) = @_;
@@ -240,12 +298,21 @@ sub phase1 {
 	    my ($sid, $volname) = PVE::Storage::parse_volume_id($volid);
 	    my $scfg =  PVE::Storage::storage_config($self->{storecfg}, $sid);
 
-	    my $migratable = ($scfg->{type} eq 'dir') || ($scfg->{type} eq 'zfspool')
-		|| ($scfg->{type} eq 'lvmthin') || ($scfg->{type} eq 'lvm')
-		|| ($scfg->{type} eq 'btrfs');
+	    # TODO move to storage plugin layer?
+	    my $migratable_storages = [
+		'dir',
+		'zfspool',
+		'lvmthin',
+		'lvm',
+		'btrfs',
+	    ];
+	    if ($remote) {
+		push @$migratable_storages, 'cifs';
+		push @$migratable_storages, 'nfs';
+	    }
 
 	    die "storage type '$scfg->{type}' not supported\n"
-		if !$migratable;
+		if !grep { $_ eq $scfg->{type} } @$migratable_storages;
 
 	    # image is a linked clone on local storage, se we can't migrate.
 	    if (my $basename = (PVE::Storage::parse_volname($self->{storecfg}, $volid))[3]) {
@@ -280,7 +347,10 @@ sub phase1 {
 
     my $rep_cfg = PVE::ReplicationConfig->new();
 
-    if (my $jobcfg = $rep_cfg->find_local_replication_job($vmid, $self->{node})) {
+    if ($remote) {
+	die "cannot remote-migrate replicated VM\n"
+	    if $rep_cfg->check_for_existing_jobs($vmid, 1);
+    } elsif (my $jobcfg = $rep_cfg->find_local_replication_job($vmid, $self->{node})) {
 	die "can't live migrate VM with replicated volumes\n" if $self->{running};
 	my $start_time = time();
 	my $logfunc = sub { my ($msg) = @_;  $self->log('info', $msg); };
@@ -291,7 +361,6 @@ sub phase1 {
     my $opts = $self->{opts};
     foreach my $volid (keys %$volhash) {
 	next if $rep_volumes->{$volid};
-	my ($sid, $volname) = PVE::Storage::parse_volume_id($volid);
 	push @{$self->{volumes}}, $volid;
 
 	# JSONSchema and get_bandwidth_limit use kbps - storage_migrate bps
@@ -301,22 +370,39 @@ sub phase1 {
 	my $targetsid = $volhash->{$volid}->{targetsid};
 
 	my $new_volid = eval {
-	    my $storage_migrate_opts = {
-		'ratelimit_bps' => $bwlimit,
-		'insecure' => $opts->{migration_type} eq 'insecure',
-		'with_snapshots' => $volhash->{$volid}->{snapshots},
-		'allow_rename' => 1,
-	    };
-
-	    my $logfunc = sub { $self->log('info', $_[0]); };
-	    return PVE::Storage::storage_migrate(
-		$self->{storecfg},
-		$volid,
-		$self->{ssh_info},
-		$targetsid,
-		$storage_migrate_opts,
-		$logfunc,
-	    );
+	    if ($remote) {
+		my $log = sub {
+		    my ($level, $msg) = @_;
+		    $self->log($level, $msg);
+		};
+
+		return PVE::StorageTunnel::storage_migrate(
+		    $self->{tunnel},
+		    $self->{storecfg},
+		    $volid,
+		    $self->{vmid},
+		    $remote->{vmid},
+		    $volhash->{$volid},
+		    $log,
+		);
+	    } else {
+		my $storage_migrate_opts = {
+		    'ratelimit_bps' => $bwlimit,
+		    'insecure' => $opts->{migration_type} eq 'insecure',
+		    'with_snapshots' => $volhash->{$volid}->{snapshots},
+		    'allow_rename' => 1,
+		};
+
+		my $logfunc = sub { $self->log('info', $_[0]); };
+		return PVE::Storage::storage_migrate(
+		    $self->{storecfg},
+		    $volid,
+		    $self->{ssh_info},
+		    $targetsid,
+		    $storage_migrate_opts,
+		    $logfunc,
+		);
+	    }
 	};
 
 	if (my $err = $@) {
@@ -346,13 +432,38 @@ sub phase1 {
     my $vollist = PVE::LXC::Config->get_vm_volumes($conf);
     PVE::Storage::deactivate_volumes($self->{storecfg}, $vollist);
 
-    # transfer replication state before moving config
-    $self->transfer_replication_state() if $rep_volumes;
-    PVE::LXC::Config->update_volume_ids($conf, $self->{volume_map});
-    PVE::LXC::Config->write_config($vmid, $conf);
-    PVE::LXC::Config->move_config_to_node($vmid, $self->{node});
+    if ($remote) {
+	my $remote_conf = PVE::LXC::Config->load_config($vmid);
+	PVE::LXC::Config->update_volume_ids($remote_conf, $self->{volume_map});
+
+	my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
+	for my $target (keys $bridges->%*) {
+	    for my $nic (keys $bridges->{$target}->%*) {
+		$self->log('info', "mapped: $nic from $bridges->{$target}->{$nic} to $target");
+	    }
+	}
+	my $conf_str = PVE::LXC::Config::write_pct_config("remote", $remote_conf);
+
+	# TODO expose in PVE::Firewall?
+	my $vm_fw_conf_path = "/etc/pve/firewall/$vmid.fw";
+	my $fw_conf_str;
+	$fw_conf_str = PVE::Tools::file_get_contents($vm_fw_conf_path)
+	    if -e $vm_fw_conf_path;
+	my $params = {
+	    conf => $conf_str,
+	    'firewall-config' => $fw_conf_str,
+	};
+
+	PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'config', $params);
+    } else {
+	# transfer replication state before moving config
+	$self->transfer_replication_state() if $rep_volumes;
+	PVE::LXC::Config->update_volume_ids($conf, $self->{volume_map});
+	PVE::LXC::Config->write_config($vmid, $conf);
+	PVE::LXC::Config->move_config_to_node($vmid, $self->{node});
+	$self->switch_replication_job_target() if $rep_volumes;
+    }
     $self->{conf_migrated} = 1;
-    $self->switch_replication_job_target() if $rep_volumes;
 }
 
 sub phase1_cleanup {
@@ -366,6 +477,12 @@ sub phase1_cleanup {
 	    # fixme: try to remove ?
 	}
     }
+
+    if ($self->{opts}->{remote}) {
+	# cleans up remote volumes
+	PVE::Tunnel::finish_tunnel($self->{tunnel}, 1);
+	delete $self->{tunnel};
+    }
 }
 
 sub phase3 {
@@ -373,6 +490,9 @@ sub phase3 {
 
     my $volids = $self->{volumes};
 
+    # handled below in final_cleanup
+    return if $self->{opts}->{remote};
+
     # destroy local copies
     foreach my $volid (@$volids) {
 	eval { PVE::Storage::vdisk_free($self->{storecfg}, $volid); };
@@ -401,6 +521,24 @@ sub final_cleanup {
 	    my $skiplock = 1;
 	    PVE::LXC::vm_start($vmid, $self->{vmconf}, $skiplock);
 	}
+    } elsif ($self->{opts}->{remote}) {
+	eval { PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'unlock') };
+	$self->log('err', "Failed to clear migrate lock - $@\n") if $@;
+
+	if ($self->{opts}->{restart} && $self->{was_running}) {
+	    $self->log('info', "start container on target node");
+	    PVE::Tunnel::write_tunnel($self->{tunnel}, 60, 'start');
+	}
+	if ($self->{opts}->{delete}) {
+	    PVE::LXC::destroy_lxc_container(
+		PVE::Storage::config(),
+		$vmid,
+		PVE::LXC::Config->load_config($vmid),
+		undef,
+		0,
+	    );
+	}
+	PVE::Tunnel::finish_tunnel($self->{tunnel});
     } else {
 	my $cmd = [ @{$self->{rem_ssh}}, 'pct', 'unlock', $vmid ];
 	$self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
@@ -413,7 +551,30 @@ sub final_cleanup {
 	    $self->cmd($cmd);
 	}
     }
+}
+
+sub map_bridges {
+    my ($conf, $map, $scan_only) = @_;
+
+    my $bridges = {};
+
+    foreach my $opt (keys %$conf) {
+	next if $opt !~ m/^net\d+$/;
+
+	next if !$conf->{$opt};
+	my $d = PVE::LXC::Config->parse_lxc_network($conf->{$opt});
+	next if !$d || !$d->{bridge};
+
+	my $target_bridge = PVE::JSONSchema::map_id($map, $d->{bridge});
+	$bridges->{$target_bridge}->{$opt} = $d->{bridge};
+
+	next if $scan_only;
+
+	$d->{bridge} = $target_bridge;
+	$conf->{$opt} = PVE::LXC::Config->print_lxc_network($d);
+    }
 
+    return $bridges;
 }
 
 1;
-- 
2.30.2

* [pve-devel] [PATCH v6 container 2/3] pct: add 'remote-migrate' command
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (2 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 1/3] migration: add remote migration Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 3/3] migrate: print mapped volume in error Fabian Grünbichler
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

works the same as `qm remote-migrate`, with the addition of `--restart`
and `--timeout` parameters.
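
an illustrative invocation (IDs, endpoint values and storage/bridge names
are placeholders):

    pct remote-migrate 100 100 \
      'host=<host>,apitoken=PVEAPIToken=<user>@<realm>!<tokenid>=<secret>,fingerprint=<fp>' \
      --target-bridge vmbr0 --target-storage local-zfs:remote-zfs \
      --restart --timeout 120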

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6: new

 src/PVE/CLI/pct.pm | 124 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

diff --git a/src/PVE/CLI/pct.pm b/src/PVE/CLI/pct.pm
index 23793ee..3ade2ba 100755
--- a/src/PVE/CLI/pct.pm
+++ b/src/PVE/CLI/pct.pm
@@ -10,6 +10,7 @@ use POSIX;
 use PVE::CLIHandler;
 use PVE::Cluster;
 use PVE::CpuSet;
+use PVE::Exception qw(raise_param_exc);
 use PVE::GuestHelpers;
 use PVE::INotify;
 use PVE::JSONSchema qw(get_standard_option);
@@ -803,6 +804,128 @@ __PACKAGE__->register_method ({
 	return undef;
     }});
 
+
+__PACKAGE__->register_method({
+    name => 'remote_migrate_vm',
+    path => 'remote_migrate_vm',
+    method => 'POST',
+    description => "Migrate container to a remote cluster. Creates a new migration task. EXPERIMENTAL feature!",
+    permissions => {
+	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::QemuServer::complete_vmid }),
+	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
+	    'target-endpoint' => get_standard_option('proxmox-remote', {
+		description => "Remote target endpoint",
+	    }),
+	    online => {
+		type => 'boolean',
+		description => "Use online/live migration.",
+		optional => 1,
+	    },
+	    restart => {
+		type => 'boolean',
+		description => "Use restart migration",
+		optional => 1,
+	    },
+	    timeout => {
+		type => 'integer',
+		description => "Timeout in seconds for shutdown for restart migration",
+		optional => 1,
+		default => 180,
+	    },
+	    delete => {
+		type => 'boolean',
+		description => "Delete the original CT and related data after successful migration. By default the original CT is kept on the source cluster in a stopped state.",
+		optional => 1,
+		default => 0,
+	    },
+	    'target-storage' => get_standard_option('pve-targetstorage', {
+		completion => \&PVE::QemuServer::complete_migration_storage,
+		optional => 0,
+	    }),
+	    'target-bridge' => {
+		type => 'string',
+		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
+		format => 'bridge-pair-list',
+	    },
+	    bwlimit => {
+		description => "Override I/O bandwidth limit (in KiB/s).",
+		optional => 1,
+		type => 'integer',
+		minimum => '0',
+		default => 'migrate limit from datacenter or storage config',
+	    },
+	},
+    },
+    returns => {
+	type => 'string',
+	description => "the task ID.",
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $source_vmid = $param->{vmid};
+	my $target_endpoint = $param->{'target-endpoint'};
+	my $target_vmid = $param->{'target-vmid'} // $source_vmid;
+
+	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
+
+	# TODO: move this as helper somewhere appropriate?
+	my $conn_args = {
+	    protocol => 'https',
+	    host => $remote->{host},
+	    port => $remote->{port} // 8006,
+	    apitoken => $remote->{apitoken},
+	};
+
+	$conn_args->{cached_fingerprints} = { uc($remote->{fingerprint}) => 1 }
+	    if defined($remote->{fingerprint});
+
+	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
+	my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
+	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
+	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
+	}
+
+	my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });
+
+	my $storecfg = PVE::Storage::config();
+	my $target_storage = $param->{'target-storage'};
+	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
+	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
+	    if $@;
+
+	my $check_remote_storage = sub {
+	    my ($storage) = @_;
+	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
+	    die "remote: storage '$storage' does not exist!\n"
+		if !@$found;
+
+	    $found = @$found[0];
+
+	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
+	    die "remote: storage '$storage' cannot store CT rootdir\n"
+		if !grep { $_ eq 'rootdir' } @$content_types;
+	};
+
+	foreach my $target_sid (values %{$storagemap->{entries}}) {
+	    $check_remote_storage->($target_sid);
+	}
+
+	$check_remote_storage->($storagemap->{default})
+	    if $storagemap->{default};
+
+	return PVE::API2::LXC->remote_migrate_vm($param);
+    }});
+
 our $cmddef = {
     list=> [ 'PVE::API2::LXC', 'vmlist', [], { node => $nodename }, sub {
 	my $res = shift;
@@ -851,6 +974,7 @@ our $cmddef = {
     migrate => [ "PVE::API2::LXC", 'migrate_vm', ['vmid', 'target'], { node => $nodename }, $upid_exit],
     'move-volume' => [ "PVE::API2::LXC", 'move_volume', ['vmid', 'volume', 'storage', 'target-vmid', 'target-volume'], { node => $nodename }, $upid_exit ],
     move_volume => { alias => 'move-volume' },
+    'remote-migrate' => [ __PACKAGE__, 'remote_migrate_vm', ['vmid', 'target-vmid', 'target-endpoint'], { node => $nodename }, $upid_exit ],
 
     snapshot => [ "PVE::API2::LXC::Snapshot", 'snapshot', ['vmid', 'snapname'], { node => $nodename } , $upid_exit ],
     delsnapshot => [ "PVE::API2::LXC::Snapshot", 'delsnapshot', ['vmid', 'snapname'], { node => $nodename } , $upid_exit ],
-- 
2.30.2

* [pve-devel] [PATCH v6 container 3/3] migrate: print mapped volume in error
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (3 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 2/3] pct: add 'remote-migrate' command Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 docs 1/1] pveum: mention Sys.Incoming privilege Fabian Grünbichler
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

since that is the ID on the target node.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 src/PVE/LXC/Migrate.pm | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/PVE/LXC/Migrate.pm b/src/PVE/LXC/Migrate.pm
index a0ab65e..ca1dd08 100644
--- a/src/PVE/LXC/Migrate.pm
+++ b/src/PVE/LXC/Migrate.pm
@@ -473,6 +473,9 @@ sub phase1_cleanup {
 
     if ($self->{volumes}) {
 	foreach my $volid (@{$self->{volumes}}) {
+	    if (my $mapped_volume = $self->{volume_map}->{$volid}) {
+		$volid = $mapped_volume;
+	    }
 	    $self->log('err', "found stale volume copy '$volid' on node '$self->{node}'");
 	    # fixme: try to remove ?
 	}
-- 
2.30.2

* [pve-devel] [PATCH v6 docs 1/1] pveum: mention Sys.Incoming privilege
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (4 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 3/3] migrate: print mapped volume in error Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-11-07 15:45   ` [pve-devel] applied: " Thomas Lamprecht
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 1/6] schema: move 'pve-targetstorage' to pve-common Fabian Grünbichler
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 pveum.adoc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pveum.adoc b/pveum.adoc
index 64d8931..cbd553a 100644
--- a/pveum.adoc
+++ b/pveum.adoc
@@ -753,6 +753,7 @@ Node / System related privileges::
 * `Sys.Syslog`: view syslog
 * `Sys.Audit`: view node status/config, Corosync cluster config, and HA config
 * `Sys.Modify`: create/modify/remove node network parameters
+* `Sys.Incoming`: allow incoming data streams from other clusters (experimental)
 * `Group.Allocate`: create/modify/remove groups
 * `Pool.Allocate`: create/modify/remove a pool
 * `Pool.Audit`: view a pool
-- 
2.30.2

* [pve-devel] [PATCH v6 qemu-server 1/6] schema: move 'pve-targetstorage' to pve-common
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (5 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 docs 1/1] pveum: mention Sys.Incoming privilege Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-11-07 15:31   ` [pve-devel] applied: " Thomas Lamprecht
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints Fabian Grünbichler
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

for proper re-use in pve-container.

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
---

Notes:
    requires versioned dependency on pve-common that has taken over the option
    
    new in v6 / follow-up to v5

 PVE/QemuServer.pm | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 4e85dd02..b1246edf 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -120,13 +120,6 @@ PVE::JSONSchema::register_standard_option('pve-qemu-machine', {
 	optional => 1,
 });
 
-PVE::JSONSchema::register_standard_option('pve-targetstorage', {
-    description => "Mapping from source to target storages. Providing only a single storage ID maps all source storages to that storage. Providing the special value '1' will map each source storage to itself.",
-    type => 'string',
-    format => 'storage-pair-list',
-    optional => 1,
-});
-
 #no warnings 'redefine';
 
 my $nodename_cache;
-- 
2.30.2

* [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (6 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 1/6] schema: move 'pve-targetstorage' to pve-common Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-30 11:52   ` Stefan Hanreich
                     ` (2 more replies)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 3/6] migrate: refactor remote VM/tunnel start Fabian Grünbichler
                   ` (5 subsequent siblings)
  13 siblings, 3 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

the following two endpoints are used for migration on the remote side:

POST /nodes/NODE/qemu/VMID/mtunnel

which creates and locks an empty VM config, and spawns the main qmtunnel
worker which binds to a VM-specific UNIX socket.

this worker handles JSON-encoded migration commands coming in via this
UNIX socket:
- config (set target VM config)
-- checks permissions for updating config
-- strips pending changes and snapshots
-- sets (optional) firewall config
- disk (allocate disk for NBD migration)
-- checks permission for target storage
-- returns drive string for allocated volume
- disk-import, query-disk-import, bwlimit
-- handled by PVE::StorageTunnel
- start (returning migration info)
- fstrim (via agent)
- ticket (creates a ticket for a WS connection to a specific socket)
- resume
- stop
- nbdstop
- unlock
- quit (+ cleanup)

this worker serves as a replacement for both 'qm mtunnel' and various
manual calls via SSH. the API call will return a ticket valid for
connecting to the worker's UNIX socket via a websocket connection.

GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket

gets called for connecting to a UNIX socket via websocket forwarding,
i.e. once for the main command mtunnel, and once each for the memory
migration and each NBD drive-mirror/storage migration.

access is guarded by a short-lived ticket binding the authenticated user
to the socket path. such tickets can be requested over the main mtunnel,
which keeps track of socket paths currently used by that
mtunnel/migration instance.

each command handler should check privileges for the requested action if
necessary.

neither the mtunnel nor the mtunnelwebsocket endpoint is proxied; the
client/caller is responsible for ensuring that the passed 'node' parameter
and the endpoint handling the call match.
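
as a rough sketch of the command protocol (for illustration only - the real
client side lives in PVE::Tunnel and connects through the mtunnelwebsocket
endpoint rather than directly to the UNIX socket):

    use IO::Socket::UNIX;
    use JSON;
    use Socket qw(SOCK_STREAM);

    # $socket_path is the 'socket' field returned by the mtunnel API call
    my $sock = IO::Socket::UNIX->new(Type => SOCK_STREAM(), Peer => $socket_path)
	or die "connect failed - $!\n";

    # each request is a single JSON line with a 'cmd' field plus command-specific
    # parameters, each reply is a single JSON line with 'success' (and 'msg' on error)
    $sock->print(JSON::encode_json({ cmd => 'version' }) . "\n");
    my $reply = JSON::decode_json(scalar <$sock>);
    # e.g. { "success" : true, "api" : 2, "age" : 0 }

    # tickets for the tracked sockets are requested the same way:
    # { "cmd": "ticket", "path": "<socket path>" } -> { "success": true, "ticket": "..." }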

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6:
    - check for Sys.Incoming in mtunnel
    - add definedness checks in 'config' command
    - switch to vm_running_locally in 'resume' command
    - moved $socket_addr closer to usage
    v5:
    - use vm_running_locally
    - move '$socket_addr' declaration closer to usage
    v4:
    - add timeout to accept()
    - move 'bwlimit' to PVE::StorageTunnel and extend it
    - mark mtunnel(websocket) as non-proxied, and check $node accordingly
    v3:
    - handle meta and vmgenid better
    - handle failure of 'config' updating
    - move 'disk-import' and 'query-disk-import' handlers to pve-guest-common
    - improve tunnel exit by letting client close the connection
    - use strict VM config parser
    v2: incorporated Fabian Ebner's feedback, mainly:
    - use modified nbd alloc helper instead of duplicating
    - fix disk cleanup, also cleanup imported disks
    - fix firewall-conf vs firewall-config mismatch
    
    requires
    - pve-access-control with tunnel ticket support (already marked in d/control)
    - pve-access-control with Sys.Incoming privilege (not yet applied/bumped!)
    - pve-http-server with websocket fixes (could be done via breaks? or bumped in
      pve-manager..)

 PVE/API2/Qemu.pm | 527 ++++++++++++++++++++++++++++++++++++++++++++++-
 debian/control   |   2 +-
 2 files changed, 527 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 3ec31c26..9270ca74 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4,10 +4,13 @@ use strict;
 use warnings;
 use Cwd 'abs_path';
 use Net::SSLeay;
-use POSIX;
 use IO::Socket::IP;
+use IO::Socket::UNIX;
+use IPC::Open3;
+use JSON;
 use URI::Escape;
 use Crypt::OpenSSL::Random;
+use Socket qw(SOCK_STREAM);
 
 use PVE::Cluster qw (cfs_read_file cfs_write_file);;
 use PVE::RRD;
@@ -38,6 +41,7 @@ use PVE::VZDump::Plugin;
 use PVE::DataCenterConfig;
 use PVE::SSHInfo;
 use PVE::Replication;
+use PVE::StorageTunnel;
 
 BEGIN {
     if (!$ENV{PVE_GENERATING_DOCS}) {
@@ -1087,6 +1091,7 @@ __PACKAGE__->register_method({
 	    { subdir => 'spiceproxy' },
 	    { subdir => 'sendkey' },
 	    { subdir => 'firewall' },
+	    { subdir => 'mtunnel' },
 	    ];
 
 	return $res;
@@ -4965,4 +4970,524 @@ __PACKAGE__->register_method({
 	return PVE::QemuServer::Cloudinit::dump_cloudinit_config($conf, $param->{vmid}, $param->{type});
     }});
 
+__PACKAGE__->register_method({
+    name => 'mtunnel',
+    path => '{vmid}/mtunnel',
+    method => 'POST',
+    protected => 1,
+    description => 'Migration tunnel endpoint - only for internal use by VM migration.',
+    permissions => {
+	check =>
+	[ 'and',
+	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
+	  ['perm', '/', [ 'Sys.Incoming' ]],
+	],
+	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
+	               " on '/'. Further permission checks happen during the actual migration.",
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid'),
+	    storages => {
+		type => 'string',
+		format => 'pve-storage-id-list',
+		optional => 1,
+		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
+	    },
+	},
+    },
+    returns => {
+	additionalProperties => 0,
+	properties => {
+	    upid => { type => 'string' },
+	    ticket => { type => 'string' },
+	    socket => { type => 'string' },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $node = extract_param($param, 'node');
+	my $vmid = extract_param($param, 'vmid');
+
+	my $storages = extract_param($param, 'storages');
+
+	my $nodename = PVE::INotify::nodename();
+
+	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
+	    if $node ne 'localhost' && $node ne $nodename;
+
+	$node = $nodename;
+
+	my $storecfg = PVE::Storage::config();
+	foreach my $storeid (PVE::Tools::split_list($storages)) {
+	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
+	}
+
+	PVE::Cluster::check_cfs_quorum();
+
+	my $lock = 'create';
+	eval { PVE::QemuConfig->create_and_lock_config($vmid, 0, $lock); };
+
+	raise_param_exc({ vmid => "unable to create empty VM config - $@"})
+	    if $@;
+
+	my $realcmd = sub {
+	    my $state = {
+		storecfg => PVE::Storage::config(),
+		lock => $lock,
+		vmid => $vmid,
+	    };
+
+	    my $run_locked = sub {
+		my ($code, $params) = @_;
+		return PVE::QemuConfig->lock_config($state->{vmid}, sub {
+		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
+
+		    $state->{conf} = $conf;
+
+		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
+			if $state->{lock} && !PVE::QemuConfig->has_lock($conf, $state->{lock});
+
+		    return $code->($params);
+		});
+	    };
+
+	    my $cmd_desc = {
+		config => {
+		    conf => {
+			type => 'string',
+			description => 'Full VM config, adapted for target cluster/node',
+		    },
+		    'firewall-config' => {
+			type => 'string',
+			description => 'VM firewall config',
+			optional => 1,
+		    },
+		},
+		disk => {
+		    format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
+		    storage => {
+			type => 'string',
+			format => 'pve-storage-id',
+		    },
+		    drive => {
+			type => 'object',
+			description => 'parsed drive information without volid and format',
+		    },
+		},
+		start => {
+		    start_params => {
+			type => 'object',
+			description => 'params passed to vm_start_nolock',
+		    },
+		    migrate_opts => {
+			type => 'object',
+			description => 'migrate_opts passed to vm_start_nolock',
+		    },
+		},
+		ticket => {
+		    path => {
+			type => 'string',
+			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
+		    },
+		},
+		quit => {
+		    cleanup => {
+			type => 'boolean',
+			description => 'remove VM config and disks, aborting migration',
+			default => 0,
+		    },
+		},
+		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
+		'query-disk-import' => $PVE::StorageTunnel::cmd_schema->{'query-disk-import'},
+		bwlimit => $PVE::StorageTunnel::cmd_schema->{bwlimit},
+	    };
+
+	    my $cmd_handlers = {
+		'version' => sub {
+		    # compared against other end's version
+		    # bump/reset for breaking changes
+		    # bump/bump for opt-in changes
+		    return {
+			api => 2,
+			age => 0,
+		    };
+		},
+		'config' => sub {
+		    my ($params) = @_;
+
+		    # parse and write out VM FW config if given
+		    if (my $fw_conf = $params->{'firewall-config'}) {
+			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
+
+			my $empty_conf = {
+			    rules => [],
+			    options => {},
+			    aliases => {},
+			    ipset => {} ,
+			    ipset_comments => {},
+			};
+			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
+
+			# TODO: add flag for strict parsing?
+			# TODO: add import sub that does all this given raw content?
+			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
+			$vmfw_conf->{vmid} = $state->{vmid};
+			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
+
+			$state->{cleanup}->{fw} = 1;
+		    }
+
+		    my $conf_fn = "incoming/qemu-server/$state->{vmid}.conf";
+		    my $new_conf = PVE::QemuServer::parse_vm_config($conf_fn, $params->{conf}, 1);
+		    delete $new_conf->{lock};
+		    delete $new_conf->{digest};
+
+		    # TODO handle properly?
+		    delete $new_conf->{snapshots};
+		    delete $new_conf->{parent};
+		    delete $new_conf->{pending};
+
+		    # not handled by update_vm_api
+		    my $vmgenid = delete $new_conf->{vmgenid};
+		    my $meta = delete $new_conf->{meta};
+
+		    $new_conf->{vmid} = $state->{vmid};
+		    $new_conf->{node} = $node;
+
+		    PVE::QemuConfig->remove_lock($state->{vmid}, 'create');
+
+		    eval {
+			$update_vm_api->($new_conf, 1);
+		    };
+		    if (my $err = $@) {
+			# revert to locked previous config
+			my $conf = PVE::QemuConfig->load_config($state->{vmid});
+			$conf->{lock} = 'create';
+			PVE::QemuConfig->write_config($state->{vmid}, $conf);
+
+			die $err;
+		    }
+
+		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
+		    $conf->{lock} = 'migrate';
+		    $conf->{vmgenid} = $vmgenid if defined($vmgenid);
+		    $conf->{meta} = $meta if defined($meta);
+		    PVE::QemuConfig->write_config($state->{vmid}, $conf);
+
+		    $state->{lock} = 'migrate';
+
+		    return;
+		},
+		'bwlimit' => sub {
+		    my ($params) = @_;
+		    return PVE::StorageTunnel::handle_bwlimit($params);
+		},
+		'disk' => sub {
+		    my ($params) = @_;
+
+		    my $format = $params->{format};
+		    my $storeid = $params->{storage};
+		    my $drive = $params->{drive};
+
+		    $check_storage_access_migrate->($rpcenv, $authuser, $state->{storecfg}, $storeid, $node);
+
+		    my $storagemap = {
+			default => $storeid,
+		    };
+
+		    my $source_volumes = {
+			'disk' => [
+			    undef,
+			    $storeid,
+			    undef,
+			    $drive,
+			    0,
+			    $format,
+			],
+		    };
+
+		    my $res = PVE::QemuServer::vm_migrate_alloc_nbd_disks($state->{storecfg}, $state->{vmid}, $source_volumes, $storagemap);
+		    if (defined($res->{disk})) {
+			$state->{cleanup}->{volumes}->{$res->{disk}->{volid}} = 1;
+			return $res->{disk};
+		    } else {
+			die "failed to allocate NBD disk..\n";
+		    }
+		},
+		'disk-import' => sub {
+		    my ($params) = @_;
+
+		    $check_storage_access_migrate->(
+			$rpcenv,
+			$authuser,
+			$state->{storecfg},
+			$params->{storage},
+			$node
+		    );
+
+		    $params->{unix} = "/run/qemu-server/$state->{vmid}.storage";
+
+		    return PVE::StorageTunnel::handle_disk_import($state, $params);
+		},
+		'query-disk-import' => sub {
+		    my ($params) = @_;
+
+		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
+		},
+		'start' => sub {
+		    my ($params) = @_;
+
+		    my $info = PVE::QemuServer::vm_start_nolock(
+			$state->{storecfg},
+			$state->{vmid},
+			$state->{conf},
+			$params->{start_params},
+			$params->{migrate_opts},
+		    );
+
+
+		    if ($info->{migrate}->{proto} ne 'unix') {
+			PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
+			die "migration over non-UNIX sockets not possible\n";
+		    }
+
+		    my $socket = $info->{migrate}->{addr};
+		    chown $state->{socket_uid}, -1, $socket;
+		    $state->{sockets}->{$socket} = 1;
+
+		    my $unix_sockets = $info->{migrate}->{unix_sockets};
+		    foreach my $socket (@$unix_sockets) {
+			chown $state->{socket_uid}, -1, $socket;
+			$state->{sockets}->{$socket} = 1;
+		    }
+		    return $info;
+		},
+		'fstrim' => sub {
+		    if (PVE::QemuServer::qga_check_running($state->{vmid})) {
+			eval { mon_cmd($state->{vmid}, "guest-fstrim") };
+			warn "fstrim failed: $@\n" if $@;
+		    }
+		    return;
+		},
+		'stop' => sub {
+		    PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
+		    return;
+		},
+		'nbdstop' => sub {
+		    PVE::QemuServer::nbd_stop($state->{vmid});
+		    return;
+		},
+		'resume' => sub {
+		    if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
+			PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
+		    } else {
+			die "VM $state->{vmid} not running\n";
+		    }
+		    return;
+		},
+		'unlock' => sub {
+		    PVE::QemuConfig->remove_lock($state->{vmid}, $state->{lock});
+		    delete $state->{lock};
+		    return;
+		},
+		'ticket' => sub {
+		    my ($params) = @_;
+
+		    my $path = $params->{path};
+
+		    die "Not allowed to generate ticket for unknown socket '$path'\n"
+			if !defined($state->{sockets}->{$path});
+
+		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
+		},
+		'quit' => sub {
+		    my ($params) = @_;
+
+		    if ($params->{cleanup}) {
+			if ($state->{cleanup}->{fw}) {
+			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
+			}
+
+			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
+			    print "freeing volume '$volid' as part of cleanup\n";
+			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
+			    warn $@ if $@;
+			}
+
+			PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
+		    }
+
+		    print "switching to exit-mode, waiting for client to disconnect\n";
+		    $state->{exit} = 1;
+		    return;
+		},
+	    };
+
+	    $run_locked->(sub {
+		my $socket_addr = "/run/qemu-server/$state->{vmid}.mtunnel";
+		unlink $socket_addr;
+
+		$state->{socket} = IO::Socket::UNIX->new(
+	            Type => SOCK_STREAM(),
+		    Local => $socket_addr,
+		    Listen => 1,
+		);
+
+		$state->{socket_uid} = getpwnam('www-data')
+		    or die "Failed to resolve user 'www-data' to numeric UID\n";
+		chown $state->{socket_uid}, -1, $socket_addr;
+	    });
+
+	    print "mtunnel started\n";
+
+	    my $conn = eval { PVE::Tools::run_with_timeout(300, sub { $state->{socket}->accept() }) };
+	    if ($@) {
+		warn "Failed to accept tunnel connection - $@\n";
+
+		warn "Removing tunnel socket..\n";
+		unlink $state->{socket};
+
+		warn "Removing temporary VM config..\n";
+		$run_locked->(sub {
+		    PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
+		});
+
+		die "Exiting mtunnel\n";
+	    }
+
+	    $state->{conn} = $conn;
+
+	    my $reply_err = sub {
+		my ($msg) = @_;
+
+		my $reply = JSON::encode_json({
+		    success => JSON::false,
+		    msg => $msg,
+		});
+		$conn->print("$reply\n");
+		$conn->flush();
+	    };
+
+	    my $reply_ok = sub {
+		my ($res) = @_;
+
+		$res->{success} = JSON::true;
+		my $reply = JSON::encode_json($res);
+		$conn->print("$reply\n");
+		$conn->flush();
+	    };
+
+	    while (my $line = <$conn>) {
+		chomp $line;
+
+		# untaint, we validate below if needed
+		($line) = $line =~ /^(.*)$/;
+		my $parsed = eval { JSON::decode_json($line) };
+		if ($@) {
+		    $reply_err->("failed to parse command - $@");
+		    next;
+		}
+
+		my $cmd = delete $parsed->{cmd};
+		if (!defined($cmd)) {
+		    $reply_err->("'cmd' missing");
+		} elsif ($state->{exit}) {
+		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
+		    next;
+		} elsif (my $handler = $cmd_handlers->{$cmd}) {
+		    print "received command '$cmd'\n";
+		    eval {
+			if ($cmd_desc->{$cmd}) {
+			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
+			} else {
+			    $parsed = {};
+			}
+			my $res = $run_locked->($handler, $parsed);
+			$reply_ok->($res);
+		    };
+		    $reply_err->("failed to handle '$cmd' command - $@")
+			if $@;
+		} else {
+		    $reply_err->("unknown command '$cmd' given");
+		}
+	    }
+
+	    if ($state->{exit}) {
+		print "mtunnel exited\n";
+	    } else {
+		die "mtunnel exited unexpectedly\n";
+	    }
+	};
+
+	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
+	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
+	my $upid = $rpcenv->fork_worker('qmtunnel', $vmid, $authuser, $realcmd);
+
+	return {
+	    ticket => $ticket,
+	    upid => $upid,
+	    socket => $socket_addr,
+	};
+    }});
+
+__PACKAGE__->register_method({
+    name => 'mtunnelwebsocket',
+    path => '{vmid}/mtunnelwebsocket',
+    method => 'GET',
+    permissions => {
+	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
+        user => 'all', # check inside
+    },
+    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid'),
+	    socket => {
+		type => "string",
+		description => "unix socket to forward to",
+	    },
+	    ticket => {
+		type => "string",
+		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
+	    },
+	},
+    },
+    returns => {
+	type => "object",
+	properties => {
+	    port => { type => 'string', optional => 1 },
+	    socket => { type => 'string', optional => 1 },
+	},
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $nodename = PVE::INotify::nodename();
+	my $node = extract_param($param, 'node');
+
+	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
+	    if $node ne 'localhost' && $node ne $nodename;
+
+	my $vmid = $param->{vmid};
+	# check VM exists
+	PVE::QemuConfig->load_config($vmid);
+
+	my $socket = $param->{socket};
+	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
+
+	return { socket => $socket };
+    }});
+
 1;
diff --git a/debian/control b/debian/control
index a90ecd6f..ce469cbd 100644
--- a/debian/control
+++ b/debian/control
@@ -33,7 +33,7 @@ Depends: dbus,
          libjson-perl,
          libjson-xs-perl,
          libnet-ssleay-perl,
-         libpve-access-control (>= 5.0-7),
+         libpve-access-control (>= 7.0-7),
          libpve-cluster-perl,
          libpve-common-perl (>= 7.1-4),
          libpve-guest-common-perl (>= 4.1-1),
-- 
2.30.2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 qemu-server 3/6] migrate: refactor remote VM/tunnel start
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (7 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 4/6] migrate: add remote migration handling Fabian Grünbichler
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

no semantic changes intended, except for:
- no longer passing the main migration UNIX socket to SSH twice for
forwarding
- dropping the 'unix:' prefix in start_remote_tunnel's timeout error message
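
for reference, a rough sketch of the new $tunnel_info hash replacing
the separate $raddr/$rport/$ruri variables (example values made up):

    my $tunnel_info = {
        proto => 'unix',                          # 'unix' (secure) or 'tcp' (insecure/legacy compat)
        addr => '/run/qemu-server/100.migrate',   # UNIX socket path, or IP for tcp
        port => undef,                            # only set for tcp
        unix_sockets => {                         # additional UNIX sockets to forward (e.g. NBD)
            '/run/qemu-server/100_nbd.migrate' => 1,
        },
    };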

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6:
    - rport/port
    - properly conditionalize 'skiplock'

 PVE/QemuMigrate.pm | 159 ++++++++++++++++++++++++++++-----------------
 PVE/QemuServer.pm  |  34 +++++-----
 2 files changed, 116 insertions(+), 77 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index d52dc8db..d1fedbb8 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -43,19 +43,24 @@ sub fork_tunnel {
     return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
 }
 
+# tunnel_info:
+#   proto: unix (secure) or tcp (insecure/legacy compat)
+#   addr: IP or UNIX socket path
+#   port: optional TCP port
+#   unix_sockets: additional UNIX socket paths to forward
 sub start_remote_tunnel {
-    my ($self, $raddr, $rport, $ruri, $unix_socket_info) = @_;
+    my ($self, $tunnel_info) = @_;
 
     my $nodename = PVE::INotify::nodename();
     my $migration_type = $self->{opts}->{migration_type};
 
     if ($migration_type eq 'secure') {
 
-	if ($ruri =~ /^unix:/) {
-	    my $ssh_forward_info = ["$raddr:$raddr"];
-	    $unix_socket_info->{$raddr} = 1;
+	if ($tunnel_info->{proto} eq 'unix') {
+	    my $ssh_forward_info = [];
 
-	    my $unix_sockets = [ keys %$unix_socket_info ];
+	    my $unix_sockets = [ keys %{$tunnel_info->{unix_sockets}} ];
+	    push @$unix_sockets, $tunnel_info->{addr};
 	    for my $sock (@$unix_sockets) {
 		push @$ssh_forward_info, "$sock:$sock";
 		unlink $sock;
@@ -82,23 +87,23 @@ sub start_remote_tunnel {
 	    if ($unix_socket_try > 100) {
 		$self->{errors} = 1;
 		PVE::Tunnel::finish_tunnel($self->{tunnel});
-		die "Timeout, migration socket $ruri did not get ready";
+		die "Timeout, migration socket $tunnel_info->{addr} did not get ready";
 	    }
 	    $self->{tunnel}->{unix_sockets} = $unix_sockets if (@$unix_sockets);
 
-	} elsif ($ruri =~ /^tcp:/) {
+	} elsif ($tunnel_info->{proto} eq 'tcp') {
 	    my $ssh_forward_info = [];
-	    if ($raddr eq "localhost") {
+	    if ($tunnel_info->{addr} eq "localhost") {
 		# for backwards compatibility with older qemu-server versions
 		my $pfamily = PVE::Tools::get_host_address_family($nodename);
 		my $lport = PVE::Tools::next_migrate_port($pfamily);
-		push @$ssh_forward_info, "$lport:localhost:$rport";
+		push @$ssh_forward_info, "$lport:localhost:$tunnel_info->{port}";
 	    }
 
 	    $self->{tunnel} = $self->fork_tunnel($ssh_forward_info);
 
 	} else {
-	    die "unsupported protocol in migration URI: $ruri\n";
+	    die "unsupported protocol in migration URI: $tunnel_info->{proto}\n";
 	}
     } else {
 	#fork tunnel for insecure migration, to send faster commands like resume
@@ -652,52 +657,45 @@ sub phase1_cleanup {
     }
 }
 
-sub phase2 {
-    my ($self, $vmid) = @_;
+sub phase2_start_local_cluster {
+    my ($self, $vmid, $params) = @_;
 
     my $conf = $self->{vmconf};
     my $local_volumes = $self->{local_volumes};
     my @online_local_volumes = $self->filter_local_volumes('online');
 
     $self->{storage_migration} = 1 if scalar(@online_local_volumes);
+    my $start = $params->{start_params};
+    my $migrate = $params->{migrate_opts};
 
     $self->log('info', "starting VM $vmid on remote node '$self->{node}'");
 
-    my $raddr;
-    my $rport;
-    my $ruri; # the whole migration dst. URI (protocol:address[:port])
-    my $nodename = PVE::INotify::nodename();
+    my $tunnel_info = {};
 
     ## start on remote node
     my $cmd = [@{$self->{rem_ssh}}];
 
-    my $spice_ticket;
-    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
-	my $res = mon_cmd($vmid, 'query-spice');
-	$spice_ticket = $res->{ticket};
+    push @$cmd, 'qm', 'start', $vmid;
+
+    if ($start->{skiplock}) {
+	push @$cmd, '--skiplock';
     }
 
-    push @$cmd , 'qm', 'start', $vmid, '--skiplock', '--migratedfrom', $nodename;
+    push @$cmd, '--migratedfrom', $migrate->{migratedfrom};
 
-    my $migration_type = $self->{opts}->{migration_type};
+    push @$cmd, '--migration_type', $migrate->{type};
 
-    push @$cmd, '--migration_type', $migration_type;
+    push @$cmd, '--migration_network', $migrate->{network}
+      if $migrate->{network};
 
-    push @$cmd, '--migration_network', $self->{opts}->{migration_network}
-      if $self->{opts}->{migration_network};
+    push @$cmd, '--stateuri', $start->{statefile};
 
-    if ($migration_type eq 'insecure') {
-	push @$cmd, '--stateuri', 'tcp';
-    } else {
-	push @$cmd, '--stateuri', 'unix';
-    }
-
-    if ($self->{forcemachine}) {
-	push @$cmd, '--machine', $self->{forcemachine};
+    if ($start->{forcemachine}) {
+	push @$cmd, '--machine', $start->{forcemachine};
     }
 
-    if ($self->{forcecpu}) {
-	push @$cmd, '--force-cpu', $self->{forcecpu};
+    if ($start->{forcecpu}) {
+	push @$cmd, '--force-cpu', $start->{forcecpu};
     }
 
     if ($self->{storage_migration}) {
@@ -705,10 +703,7 @@ sub phase2 {
     }
 
     my $spice_port;
-    my $unix_socket_info = {};
-    # version > 0 for unix socket support
-    my $nbd_protocol_version = 1;
-    my $input = "nbd_protocol_version: $nbd_protocol_version\n";
+    my $input = "nbd_protocol_version: $migrate->{nbd_proto_version}\n";
 
     my @offline_local_volumes = $self->filter_local_volumes('offline');
     for my $volid (@offline_local_volumes) {
@@ -726,7 +721,7 @@ sub phase2 {
 	}
     }
 
-    $input .= "spice_ticket: $spice_ticket\n" if $spice_ticket;
+    $input .= "spice_ticket: $migrate->{spice_ticket}\n" if $migrate->{spice_ticket};
 
     my @online_replicated_volumes = $self->filter_local_volumes('online', 1);
     foreach my $volid (@online_replicated_volumes) {
@@ -756,20 +751,20 @@ sub phase2 {
     my $exitcode = PVE::Tools::run_command($cmd, input => $input, outfunc => sub {
 	my $line = shift;
 
-	if ($line =~ m/^migration listens on tcp:(localhost|[\d\.]+|\[[\d\.:a-fA-F]+\]):(\d+)$/) {
-	    $raddr = $1;
-	    $rport = int($2);
-	    $ruri = "tcp:$raddr:$rport";
+	if ($line =~ m/^migration listens on (tcp):(localhost|[\d\.]+|\[[\d\.:a-fA-F]+\]):(\d+)$/) {
+	    $tunnel_info->{addr} = $2;
+	    $tunnel_info->{port} = int($3);
+	    $tunnel_info->{proto} = $1;
 	}
-	elsif ($line =~ m!^migration listens on unix:(/run/qemu-server/(\d+)\.migrate)$!) {
-	    $raddr = $1;
-	    die "Destination UNIX sockets VMID does not match source VMID" if $vmid ne $2;
-	    $ruri = "unix:$raddr";
+	elsif ($line =~ m!^migration listens on (unix):(/run/qemu-server/(\d+)\.migrate)$!) {
+	    $tunnel_info->{addr} = $2;
+	    die "Destination UNIX sockets VMID does not match source VMID" if $vmid ne $3;
+	    $tunnel_info->{proto} = $1;
 	}
 	elsif ($line =~ m/^migration listens on port (\d+)$/) {
-	    $raddr = "localhost";
-	    $rport = int($1);
-	    $ruri = "tcp:$raddr:$rport";
+	    $tunnel_info->{addr} = "localhost";
+	    $tunnel_info->{port} = int($1);
+	    $tunnel_info->{proto} = "tcp";
 	}
 	elsif ($line =~ m/^spice listens on port (\d+)$/) {
 	    $spice_port = int($1);
@@ -790,7 +785,7 @@ sub phase2 {
 	    $targetdrive =~ s/drive-//g;
 
 	    $handle_storage_migration_listens->($targetdrive, $drivestr, $nbd_uri);
-	    $unix_socket_info->{$nbd_unix_addr} = 1;
+	    $tunnel_info->{unix_sockets}->{$nbd_unix_addr} = 1;
 	} elsif ($line =~ m/^re-using replicated volume: (\S+) - (.*)$/) {
 	    my $drive = $1;
 	    my $volid = $2;
@@ -805,19 +800,65 @@ sub phase2 {
 
     die "remote command failed with exit code $exitcode\n" if $exitcode;
 
-    die "unable to detect remote migration address\n" if !$raddr;
+    die "unable to detect remote migration address\n" if !$tunnel_info->{addr} || !$tunnel_info->{proto};
 
     if (scalar(keys %$target_replicated_volumes) != scalar(@online_replicated_volumes)) {
 	die "number of replicated disks on source and target node do not match - target node too old?\n"
     }
 
+    return ($tunnel_info, $spice_port);
+}
+
+sub phase2 {
+    my ($self, $vmid) = @_;
+
+    my $conf = $self->{vmconf};
+
+    # version > 0 for unix socket support
+    my $nbd_protocol_version = 1;
+
+    my $spice_ticket;
+    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
+	my $res = mon_cmd($vmid, 'query-spice');
+	$spice_ticket = $res->{ticket};
+    }
+
+    my $migration_type = $self->{opts}->{migration_type};
+    my $state_uri = $migration_type eq 'insecure' ? 'tcp' : 'unix';
+
+    my $params = {
+	start_params => {
+	    statefile => $state_uri,
+	    forcemachine => $self->{forcemachine},
+	    forcecpu => $self->{forcecpu},
+	    skiplock => 1,
+	},
+	migrate_opts => {
+	    spice_ticket => $spice_ticket,
+	    type => $migration_type,
+	    network => $self->{opts}->{migration_network},
+	    storagemap => $self->{opts}->{storagemap},
+	    migratedfrom => PVE::INotify::nodename(),
+	    nbd_proto_version => $nbd_protocol_version,
+	    nbd => $self->{nbd},
+	},
+    };
+
+    my ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
+
     $self->log('info', "start remote tunnel");
-    $self->start_remote_tunnel($raddr, $rport, $ruri, $unix_socket_info);
+    $self->start_remote_tunnel($tunnel_info);
+
+    my $migrate_uri = "$tunnel_info->{proto}:$tunnel_info->{addr}";
+    $migrate_uri .= ":$tunnel_info->{port}"
+	if defined($tunnel_info->{port});
 
     if ($self->{storage_migration}) {
 	$self->{storage_migration_jobs} = {};
 	$self->log('info', "starting storage migration");
 
+	my @online_local_volumes = $self->filter_local_volumes('online');
+
 	die "The number of local disks does not match between the source and the destination.\n"
 	    if (scalar(keys %{$self->{target_drive}}) != scalar(@online_local_volumes));
 	foreach my $drive (keys %{$self->{target_drive}}){
@@ -827,7 +868,7 @@ sub phase2 {
 	    my $source_drive = PVE::QemuServer::parse_drive($drive, $conf->{$drive});
 	    my $source_volid = $source_drive->{file};
 
-	    my $bwlimit = $local_volumes->{$source_volid}->{bwlimit};
+	    my $bwlimit = $self->{local_volumes}->{$source_volid}->{bwlimit};
 	    my $bitmap = $target->{bitmap};
 
 	    $self->log('info', "$drive: start migration to $nbd_uri");
@@ -835,7 +876,7 @@ sub phase2 {
 	}
     }
 
-    $self->log('info', "starting online/live migration on $ruri");
+    $self->log('info', "starting online/live migration on $migrate_uri");
     $self->{livemigration} = 1;
 
     # load_defaults
@@ -912,12 +953,12 @@ sub phase2 {
 
     my $start = time();
 
-    $self->log('info', "start migrate command to $ruri");
+    $self->log('info', "start migrate command to $migrate_uri");
     eval {
-	mon_cmd($vmid, "migrate", uri => $ruri);
+	mon_cmd($vmid, "migrate", uri => $migrate_uri);
     };
     my $merr = $@;
-    $self->log('info', "migrate uri => $ruri failed: $merr") if $merr;
+    $self->log('info', "migrate uri => $migrate_uri failed: $merr") if $merr;
 
     my $last_mem_transferred = 0;
     my $usleep = 1000000;
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index b1246edf..024b0af0 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5521,10 +5521,10 @@ sub vm_start_nolock {
 	return $migration_ip;
     };
 
-    my $migrate_uri;
     if ($statefile) {
 	if ($statefile eq 'tcp') {
-	    my $localip = "localhost";
+	    my $migrate = $res->{migrate} = { proto => 'tcp' };
+	    $migrate->{addr} = "localhost";
 	    my $datacenterconf = PVE::Cluster::cfs_read_file('datacenter.cfg');
 	    my $nodename = nodename();
 
@@ -5537,26 +5537,26 @@ sub vm_start_nolock {
 	    }
 
 	    if ($migration_type eq 'insecure') {
-		$localip = $get_migration_ip->($nodename);
-		$localip = "[$localip]" if Net::IP::ip_is_ipv6($localip);
+		$migrate->{addr} = $get_migration_ip->($nodename);
+		$migrate->{addr} = "[$migrate->{addr}]" if Net::IP::ip_is_ipv6($migrate->{addr});
 	    }
 
 	    my $pfamily = PVE::Tools::get_host_address_family($nodename);
-	    my $migrate_port = PVE::Tools::next_migrate_port($pfamily);
-	    $migrate_uri = "tcp:${localip}:${migrate_port}";
-	    push @$cmd, '-incoming', $migrate_uri;
+	    $migrate->{port} = PVE::Tools::next_migrate_port($pfamily);
+	    $migrate->{uri} = "tcp:$migrate->{addr}:$migrate->{port}";
+	    push @$cmd, '-incoming', $migrate->{uri};
 	    push @$cmd, '-S';
 
 	} elsif ($statefile eq 'unix') {
 	    # should be default for secure migrations as a ssh TCP forward
 	    # tunnel is not deterministic reliable ready and fails regurarly
 	    # to set up in time, so use UNIX socket forwards
-	    my $socket_addr = "/run/qemu-server/$vmid.migrate";
-	    unlink $socket_addr;
+	    my $migrate = $res->{migrate} = { proto => 'unix' };
+	    $migrate->{addr} = "/run/qemu-server/$vmid.migrate";
+	    unlink $migrate->{addr};
 
-	    $migrate_uri = "unix:$socket_addr";
-
-	    push @$cmd, '-incoming', $migrate_uri;
+	    $migrate->{uri} = "unix:$migrate->{addr}";
+	    push @$cmd, '-incoming', $migrate->{uri};
 	    push @$cmd, '-S';
 
 	} elsif (-e $statefile) {
@@ -5709,10 +5709,9 @@ sub vm_start_nolock {
     eval { PVE::QemuServer::PCI::reserve_pci_usage($pci_id_list, $vmid, undef, $pid) };
     warn $@ if $@;
 
-    print "migration listens on $migrate_uri\n" if $migrate_uri;
-    $res->{migrate_uri} = $migrate_uri;
-
-    if ($statefile && $statefile ne 'tcp' && $statefile ne 'unix')  {
+    if (defined($res->{migrate})) {
+	print "migration listens on $res->{migrate}->{uri}\n";
+    } elsif ($statefile) {
 	eval { mon_cmd($vmid, "cont"); };
 	warn $@ if $@;
     }
@@ -5727,6 +5726,7 @@ sub vm_start_nolock {
 	    my $socket_path = "/run/qemu-server/$vmid\_nbd.migrate";
 	    mon_cmd($vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $socket_path } } );
 	    $migrate_storage_uri = "nbd:unix:$socket_path";
+	    $res->{migrate}->{unix_sockets} = [$socket_path];
 	} else {
 	    my $nodename = nodename();
 	    my $localip = $get_migration_ip->($nodename);
@@ -5744,8 +5744,6 @@ sub vm_start_nolock {
 	    $migrate_storage_uri = "nbd:${localip}:${storage_migrate_port}";
 	}
 
-	$res->{migrate_storage_uri} = $migrate_storage_uri;
-
 	foreach my $opt (sort keys %$nbd) {
 	    my $drivestr = $nbd->{$opt}->{drivestr};
 	    my $volid = $nbd->{$opt}->{volid};
-- 
2.30.2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 qemu-server 4/6] migrate: add remote migration handling
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (8 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 3/6] migrate: refactor remote VM/tunnel start Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 5/6] api: add remote migrate endpoint Fabian Grünbichler
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

remote migration uses a websocket connection to a task worker running on
the target node instead of commands via SSH to control the migration.
this websocket tunnel is started earlier than the SSH tunnel, and allows
adding UNIX-socket forwarding over additional websocket connections
on demand.

the main differences to regular intra-cluster migration are:
- source VM config and disks are only removed upon request via --delete
- shared storages are treated like local storages, since we can't
assume they are shared across clusters (with the potential to extend this
by marking storages as shared)
- NBD migrated disks are explicitly pre-allocated on the target node via
tunnel command before starting the target VM instance
- in addition to storages, network bridges and the VMID itself are
transformed via user-defined mappings
- all commands and migration data streams are sent via a WS tunnel proxy
- pending changes and snapshots are discarded on the target side (for
  the time being)
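
as a rough sketch (IDs and paths illustrative), the on-demand socket
forwarding boils down to: the target allocates its sockets under the
(possibly remapped) remote VMID, and the source forwards them under
paths matching the local VMID:

    my ($vmid, $remote_vmid) = (100, 2100);
    my $remote_socket = "/run/qemu-server/$remote_vmid.migrate";
    (my $local_socket = $remote_socket) =~ s/$remote_vmid/$vmid/g;
    # forward the local UNIX socket over an additional websocket connection
    PVE::Tunnel::forward_unix_socket($tunnel, $local_socket, $remote_socket);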

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6:
    - add proxmox-websocket-tunnel dependency
    
    v5:
    - move merge_bwlimits helper to PVE::AbstractMigrate and extend it
    - adapt to map_id move
    - add check on source side for VM snapshots (not yet supported/implemented)
    
    v4:
    - new merge_bwlimits helper, improved bwlimit handling
    - use config-aware remote start timeout
    - switch tunnel log to match migration log sub
    
    v3:
    - move WS tunnel helpers to pve-guest-common-perl
    - check bridge mapping early
    - fix misplaced parentheses
    
    v2:
    - improve tunnel version info printing and error handling
    - don't cleanup unix sockets twice
    - url escape remote socket path
    - cleanup nits and small issues
    
    requires bumped pve-storage to avoid tainted issue for storage migrations

 PVE/API2/Qemu.pm   |   2 +-
 PVE/QemuMigrate.pm | 439 +++++++++++++++++++++++++++++++++++++--------
 PVE/QemuServer.pm  |   7 +-
 debian/control     |   1 +
 4 files changed, 367 insertions(+), 82 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 9270ca74..898b4518 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -5115,7 +5115,7 @@ __PACKAGE__->register_method({
 		    # bump/reset for breaking changes
 		    # bump/bump for opt-in changes
 		    return {
-			api => 2,
+			api => $PVE::QemuMigrate::WS_TUNNEL_VERSION,
 			age => 0,
 		    };
 		},
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index d1fedbb8..75495bce 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -5,11 +5,10 @@ use warnings;
 
 use IO::File;
 use IPC::Open2;
-use POSIX qw( WNOHANG );
 use Time::HiRes qw( usleep );
 
-use PVE::Format qw(render_bytes);
 use PVE::Cluster;
+use PVE::Format qw(render_bytes);
 use PVE::GuestHelpers qw(safe_boolean_ne safe_string_ne);
 use PVE::INotify;
 use PVE::RPCEnvironment;
@@ -17,6 +16,7 @@ use PVE::Replication;
 use PVE::ReplicationConfig;
 use PVE::ReplicationState;
 use PVE::Storage;
+use PVE::StorageTunnel;
 use PVE::Tools;
 use PVE::Tunnel;
 
@@ -31,6 +31,9 @@ use PVE::QemuServer;
 use PVE::AbstractMigrate;
 use base qw(PVE::AbstractMigrate);
 
+# compared against remote end's minimum version
+our $WS_TUNNEL_VERSION = 2;
+
 sub fork_tunnel {
     my ($self, $ssh_forward_info) = @_;
 
@@ -43,6 +46,35 @@ sub fork_tunnel {
     return PVE::Tunnel::fork_ssh_tunnel($self->{rem_ssh}, $cmd, $ssh_forward_info, $log);
 }
 
+sub fork_websocket_tunnel {
+    my ($self, $storages, $bridges) = @_;
+
+    my $remote = $self->{opts}->{remote};
+    my $conn = $remote->{conn};
+
+    my $log = sub {
+	my ($level, $msg) = @_;
+	$self->log($level, $msg);
+    };
+
+    my $websocket_url = "https://$conn->{host}:$conn->{port}/api2/json/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnelwebsocket";
+    my $url = "/nodes/$self->{node}/qemu/$remote->{vmid}/mtunnel";
+
+    my $tunnel_params = {
+	url => $websocket_url,
+    };
+
+    my $storage_list = join(',', keys %$storages);
+    my $bridge_list = join(',', keys %$bridges);
+
+    my $req_params = {
+	storages => $storage_list,
+	bridges => $bridge_list,
+    };
+
+    return PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
+}
+
 # tunnel_info:
 #   proto: unix (secure) or tcp (insecure/legacy compat)
 #   addr: IP or UNIX socket path
@@ -177,23 +209,34 @@ sub prepare {
     }
 
     my $vollist = PVE::QemuServer::get_vm_volumes($conf);
+
+    my $storages = {};
     foreach my $volid (@$vollist) {
 	my ($sid, $volname) = PVE::Storage::parse_volume_id($volid, 1);
 
-	# check if storage is available on both nodes
+	# check if storage is available on source node
 	my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
 
 	my $targetsid = $sid;
-	# NOTE: we currently ignore shared source storages in mappings so skip here too for now
-	if (!$scfg->{shared}) {
+	# NOTE: local ignores shared mappings, remote maps them
+	if (!$scfg->{shared} || $self->{opts}->{remote}) {
 	    $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $sid);
 	}
 
-	my $target_scfg = PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
-	my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
+	$storages->{$targetsid} = 1;
 
-	die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
-	    if !$target_scfg->{content}->{$vtype};
+	if (!$self->{opts}->{remote}) {
+	    # check if storage is available on target node
+	    my $target_scfg = PVE::Storage::storage_check_enabled(
+		$storecfg,
+		$targetsid,
+		$self->{node},
+	    );
+	    my ($vtype) = PVE::Storage::parse_volname($storecfg, $volid);
+
+	    die "$volid: content type '$vtype' is not available on storage '$targetsid'\n"
+		if !$target_scfg->{content}->{$vtype};
+	}
 
 	if ($scfg->{shared}) {
 	    # PVE::Storage::activate_storage checks this for non-shared storages
@@ -203,10 +246,27 @@ sub prepare {
 	}
     }
 
-    # test ssh connection
-    my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
-    eval { $self->cmd_quiet($cmd); };
-    die "Can't connect to destination address using public key\n" if $@;
+    if ($self->{opts}->{remote}) {
+	# test & establish websocket connection
+	my $bridges = map_bridges($conf, $self->{opts}->{bridgemap}, 1);
+	my $tunnel = $self->fork_websocket_tunnel($storages, $bridges);
+	my $min_version = $tunnel->{version} - $tunnel->{age};
+	$self->log('info', "local WS tunnel version: $WS_TUNNEL_VERSION");
+	$self->log('info', "remote WS tunnel version: $tunnel->{version}");
+	$self->log('info', "minimum required WS tunnel version: $min_version");
+	die "Remote tunnel endpoint not compatible, upgrade required\n"
+	    if $WS_TUNNEL_VERSION < $min_version;
+	 die "Remote tunnel endpoint too old, upgrade required\n"
+	    if $WS_TUNNEL_VERSION > $tunnel->{version};
+
+	print "websocket tunnel started\n";
+	$self->{tunnel} = $tunnel;
+    } else {
+	# test ssh connection
+	my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
+	eval { $self->cmd_quiet($cmd); };
+	die "Can't connect to destination address using public key\n" if $@;
+    }
 
     return $running;
 }
@@ -244,7 +304,7 @@ sub scan_local_volumes {
 	my @sids = PVE::Storage::storage_ids($storecfg);
 	foreach my $storeid (@sids) {
 	    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
-	    next if $scfg->{shared};
+	    next if $scfg->{shared} && !$self->{opts}->{remote};
 	    next if !PVE::Storage::storage_check_enabled($storecfg, $storeid, undef, 1);
 
 	    # get list from PVE::Storage (for unused volumes)
@@ -253,21 +313,20 @@ sub scan_local_volumes {
 	    next if @{$dl->{$storeid}} == 0;
 
 	    my $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $storeid);
-	    # check if storage is available on target node
-	    my $target_scfg = PVE::Storage::storage_check_enabled(
-		$storecfg,
-		$targetsid,
-		$self->{node},
-	    );
+	    if (!$self->{opts}->{remote}) {
+		# check if storage is available on target node
+		my $target_scfg = PVE::Storage::storage_check_enabled(
+		    $storecfg,
+		    $targetsid,
+		    $self->{node},
+		);
 
-	    die "content type 'images' is not available on storage '$targetsid'\n"
-		if !$target_scfg->{content}->{images};
+		die "content type 'images' is not available on storage '$targetsid'\n"
+		    if !$target_scfg->{content}->{images};
 
-	    my $bwlimit = PVE::Storage::get_bandwidth_limit(
-		'migration',
-		[$targetsid, $storeid],
-		$self->{opts}->{bwlimit},
-	    );
+	    }
+
+	    my $bwlimit = $self->get_bwlimit($storeid, $targetsid);
 
 	    PVE::Storage::foreach_volid($dl, sub {
 		my ($volid, $sid, $volinfo) = @_;
@@ -321,14 +380,17 @@ sub scan_local_volumes {
 	    my $scfg = PVE::Storage::storage_check_enabled($storecfg, $sid);
 
 	    my $targetsid = $sid;
-	    # NOTE: we currently ignore shared source storages in mappings so skip here too for now
-	    if (!$scfg->{shared}) {
+	    # NOTE: local ignores shared mappings, remote maps them
+	    if (!$scfg->{shared} || $self->{opts}->{remote}) {
 		$targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $sid);
 	    }
 
-	    PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
+	    # check target storage on target node if intra-cluster migration
+	    if (!$self->{opts}->{remote}) {
+		PVE::Storage::storage_check_enabled($storecfg, $targetsid, $self->{node});
 
-	    return if $scfg->{shared};
+		return if $scfg->{shared};
+	    }
 
 	    $local_volumes->{$volid}->{ref} = $attr->{referenced_in_config} ? 'config' : 'snapshot';
 	    $local_volumes->{$volid}->{ref} = 'storage' if $attr->{is_unused};
@@ -361,6 +423,8 @@ sub scan_local_volumes {
 		# exceptions: 'zfspool' or 'qcow2' files (on directory storage)
 
 		die "online storage migration not possible if snapshot exists\n" if $self->{running};
+		die "remote migration with snapshots not supported yet\n" if $self->{opts}->{remote};
+
 		if (!($scfg->{type} eq 'zfspool'
 		    || ($scfg->{type} eq 'btrfs' && $local_volumes->{$volid}->{format} eq 'raw')
 		    || $local_volumes->{$volid}->{format} eq 'qcow2'
@@ -417,6 +481,9 @@ sub scan_local_volumes {
 
 	    my $migratable = $scfg->{type} =~ /^(?:dir|btrfs|zfspool|lvmthin|lvm)$/;
 
+	    # TODO: what is this even here for?
+	    $migratable = 1 if $self->{opts}->{remote};
+
 	    die "can't migrate '$volid' - storage type '$scfg->{type}' not supported\n"
 		if !$migratable;
 
@@ -451,6 +518,10 @@ sub handle_replication {
     my $local_volumes = $self->{local_volumes};
 
     return if !$self->{replication_jobcfg};
+
+    die "can't migrate VM with replicated volumes to remote cluster/node\n"
+	if $self->{opts}->{remote};
+
     if ($self->{running}) {
 
 	my $version = PVE::QemuServer::kvm_user_version();
@@ -550,24 +621,51 @@ sub sync_offline_local_volumes {
     $self->log('info', "copying local disk images") if scalar(@volids);
 
     foreach my $volid (@volids) {
-	my $targetsid = $local_volumes->{$volid}->{targetsid};
-	my $bwlimit = $local_volumes->{$volid}->{bwlimit};
-	$bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
-
-	my $storage_migrate_opts = {
-	    'ratelimit_bps' => $bwlimit,
-	    'insecure' => $opts->{migration_type} eq 'insecure',
-	    'with_snapshots' => $local_volumes->{$volid}->{snapshots},
-	    'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
-	};
+	my $new_volid;
 
-	my $logfunc = sub { $self->log('info', $_[0]); };
-	my $new_volid = eval {
-	    PVE::Storage::storage_migrate($storecfg, $volid, $self->{ssh_info},
-					  $targetsid, $storage_migrate_opts, $logfunc);
-	};
-	if (my $err = $@) {
-	    die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
+	my $opts = $self->{opts};
+	if ($opts->{remote}) {
+	    my $log = sub {
+		my ($level, $msg) = @_;
+		$self->log($level, $msg);
+	    };
+
+	    $new_volid = PVE::StorageTunnel::storage_migrate(
+		$self->{tunnel},
+		$storecfg,
+		$volid,
+		$self->{vmid},
+		$opts->{remote}->{vmid},
+		$local_volumes->{$volid},
+		$log,
+	    );
+	} else {
+	    my $targetsid = $local_volumes->{$volid}->{targetsid};
+
+	    my $bwlimit = $local_volumes->{$volid}->{bwlimit};
+	    $bwlimit = $bwlimit * 1024 if defined($bwlimit); # storage_migrate uses bps
+
+	    my $storage_migrate_opts = {
+		'ratelimit_bps' => $bwlimit,
+		'insecure' => $opts->{migration_type} eq 'insecure',
+		'with_snapshots' => $local_volumes->{$volid}->{snapshots},
+		'allow_rename' => !$local_volumes->{$volid}->{is_vmstate},
+	    };
+
+	    my $logfunc = sub { $self->log('info', $_[0]); };
+	    $new_volid = eval {
+		PVE::Storage::storage_migrate(
+		    $storecfg,
+		    $volid,
+		    $self->{ssh_info},
+		    $targetsid,
+		    $storage_migrate_opts,
+		    $logfunc,
+		);
+	    };
+	    if (my $err = $@) {
+		die "storage migration for '$volid' to storage '$targetsid' failed - $err\n";
+	    }
 	}
 
 	$self->{volume_map}->{$volid} = $new_volid;
@@ -583,6 +681,12 @@ sub sync_offline_local_volumes {
 sub cleanup_remotedisks {
     my ($self) = @_;
 
+    if ($self->{opts}->{remote}) {
+	PVE::Tunnel::finish_tunnel($self->{tunnel}, 1);
+	delete $self->{tunnel};
+	return;
+    }
+
     my $local_volumes = $self->{local_volumes};
 
     foreach my $volid (values %{$self->{volume_map}}) {
@@ -632,8 +736,100 @@ sub phase1 {
     $self->handle_replication($vmid);
 
     $self->sync_offline_local_volumes();
+    $self->phase1_remote($vmid) if $self->{opts}->{remote};
 };
 
+sub map_bridges {
+    my ($conf, $map, $scan_only) = @_;
+
+    my $bridges = {};
+
+    foreach my $opt (keys %$conf) {
+	next if $opt !~ m/^net\d+$/;
+
+	next if !$conf->{$opt};
+	my $d = PVE::QemuServer::parse_net($conf->{$opt});
+	next if !$d || !$d->{bridge};
+
+	my $target_bridge = PVE::JSONSchema::map_id($map, $d->{bridge});
+	$bridges->{$target_bridge}->{$opt} = $d->{bridge};
+
+	next if $scan_only;
+
+	$d->{bridge} = $target_bridge;
+	$conf->{$opt} = PVE::QemuServer::print_net($d);
+    }
+
+    return $bridges;
+}
+
+sub phase1_remote {
+    my ($self, $vmid) = @_;
+
+    my $remote_conf = PVE::QemuConfig->load_config($vmid);
+    PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
+
+    my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
+    for my $target (keys $bridges->%*) {
+	for my $nic (keys $bridges->{$target}->%*) {
+	    $self->log('info', "mapped: $nic from $bridges->{$target}->{$nic} to $target");
+	}
+    }
+
+    my @online_local_volumes = $self->filter_local_volumes('online');
+
+    my $storage_map = $self->{opts}->{storagemap};
+    $self->{nbd} = {};
+    PVE::QemuConfig->foreach_volume($remote_conf, sub {
+	my ($ds, $drive) = @_;
+
+	# TODO eject CDROM?
+	return if PVE::QemuServer::drive_is_cdrom($drive);
+
+	my $volid = $drive->{file};
+	return if !$volid;
+
+	return if !grep { $_ eq $volid} @online_local_volumes;
+
+	my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
+	my $scfg = PVE::Storage::storage_config($self->{storecfg}, $storeid);
+	my $source_format = PVE::QemuServer::qemu_img_format($scfg, $volname);
+
+	# set by target cluster
+	my $oldvolid = delete $drive->{file};
+	delete $drive->{format};
+
+	my $targetsid = PVE::JSONSchema::map_id($storage_map, $storeid);
+
+	my $params = {
+	    format => $source_format,
+	    storage => $targetsid,
+	    drive => $drive,
+	};
+
+	$self->log('info', "Allocating volume for drive '$ds' on remote storage '$targetsid'..");
+	my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'disk', $params);
+
+	$self->log('info', "volume '$oldvolid' is '$res->{volid}' on the target\n");
+	$remote_conf->{$ds} = $res->{drivestr};
+	$self->{nbd}->{$ds} = $res;
+    });
+
+    my $conf_str = PVE::QemuServer::write_vm_config("remote", $remote_conf);
+
+    # TODO expose in PVE::Firewall?
+    my $vm_fw_conf_path = "/etc/pve/firewall/$vmid.fw";
+    my $fw_conf_str;
+    $fw_conf_str = PVE::Tools::file_get_contents($vm_fw_conf_path)
+	if -e $vm_fw_conf_path;
+    my $params = {
+	conf => $conf_str,
+	'firewall-config' => $fw_conf_str,
+    };
+
+    PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'config', $params);
+}
+
 sub phase1_cleanup {
     my ($self, $vmid, $err) = @_;
 
@@ -664,7 +860,6 @@ sub phase2_start_local_cluster {
     my $local_volumes = $self->{local_volumes};
     my @online_local_volumes = $self->filter_local_volumes('online');
 
-    $self->{storage_migration} = 1 if scalar(@online_local_volumes);
     my $start = $params->{start_params};
     my $migrate = $params->{migrate_opts};
 
@@ -809,10 +1004,37 @@ sub phase2_start_local_cluster {
     return ($tunnel_info, $spice_port);
 }
 
+sub phase2_start_remote_cluster {
+    my ($self, $vmid, $params) = @_;
+
+    die "insecure migration to remote cluster not implemented\n"
+	if $params->{migrate_opts}->{type} ne 'websocket';
+
+    my $remote_vmid = $self->{opts}->{remote}->{vmid};
+
+    # like regular start but with some overhead accounted for
+    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
+
+    my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
+
+    foreach my $drive (keys %{$res->{drives}}) {
+	$self->{stopnbd} = 1;
+	$self->{target_drive}->{$drive}->{drivestr} = $res->{drives}->{$drive}->{drivestr};
+	my $nbd_uri = $res->{drives}->{$drive}->{nbd_uri};
+	die "unexpected NBD uri for '$drive': $nbd_uri\n"
+	    if $nbd_uri !~ s!/run/qemu-server/$remote_vmid\_!/run/qemu-server/$vmid\_!;
+
+	$self->{target_drive}->{$drive}->{nbd_uri} = $nbd_uri;
+    }
+
+    return ($res->{migrate}, $res->{spice_port});
+}
+
 sub phase2 {
     my ($self, $vmid) = @_;
 
     my $conf = $self->{vmconf};
+    my $local_volumes = $self->{local_volumes};
 
     # version > 0 for unix socket support
     my $nbd_protocol_version = 1;
@@ -844,10 +1066,39 @@ sub phase2 {
 	},
     };
 
-    my ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
+    my ($tunnel_info, $spice_port);
 
-    $self->log('info', "start remote tunnel");
-    $self->start_remote_tunnel($tunnel_info);
+    my @online_local_volumes = $self->filter_local_volumes('online');
+    $self->{storage_migration} = 1 if scalar(@online_local_volumes);
+
+    if (my $remote = $self->{opts}->{remote}) {
+	my $remote_vmid = $remote->{vmid};
+	$params->{migrate_opts}->{remote_node} = $self->{node};
+	($tunnel_info, $spice_port) = $self->phase2_start_remote_cluster($vmid, $params);
+	die "only UNIX sockets are supported for remote migration\n"
+	    if $tunnel_info->{proto} ne 'unix';
+
+	my $remote_socket = $tunnel_info->{addr};
+	my $local_socket = $remote_socket;
+	$local_socket =~ s/$remote_vmid/$vmid/g;
+	$tunnel_info->{addr} = $local_socket;
+
+	$self->log('info', "Setting up tunnel for '$local_socket'");
+	PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+
+	foreach my $remote_socket (@{$tunnel_info->{unix_sockets}}) {
+	    my $local_socket = $remote_socket;
+	    $local_socket =~ s/$remote_vmid/$vmid/g;
+	    next if $self->{tunnel}->{forwarded}->{$local_socket};
+	    $self->log('info', "Setting up tunnel for '$local_socket'");
+	    PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+	}
+    } else {
+	($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
+
+	$self->log('info', "start remote tunnel");
+	$self->start_remote_tunnel($tunnel_info);
+    }
 
     my $migrate_uri = "$tunnel_info->{proto}:$tunnel_info->{addr}";
     $migrate_uri .= ":$tunnel_info->{port}"
@@ -857,8 +1108,6 @@ sub phase2 {
 	$self->{storage_migration_jobs} = {};
 	$self->log('info', "starting storage migration");
 
-	my @online_local_volumes = $self->filter_local_volumes('online');
-
 	die "The number of local disks does not match between the source and the destination.\n"
 	    if (scalar(keys %{$self->{target_drive}}) != scalar(@online_local_volumes));
 	foreach my $drive (keys %{$self->{target_drive}}){
@@ -890,7 +1139,8 @@ sub phase2 {
 
     # migrate speed can be set via bwlimit (datacenter.cfg and API) and via the
     # migrate_speed parameter in qm.conf - take the lower of the two.
-    my $bwlimit = PVE::Storage::get_bandwidth_limit('migration', undef, $self->{opts}->{bwlimit}) // 0;
+    my $bwlimit = $self->get_bwlimit();
+
     my $migrate_speed = $conf->{migrate_speed} // 0;
     $migrate_speed *= 1024; # migrate_speed is in MB/s, bwlimit in KB/s
 
@@ -931,7 +1181,7 @@ sub phase2 {
     };
     $self->log('info', "migrate-set-parameters error: $@") if $@;
 
-    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga})) {
+    if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && !$self->{opts}->{remote}) {
 	my $rpcenv = PVE::RPCEnvironment::get();
 	my $authuser = $rpcenv->get_user();
 
@@ -1144,11 +1394,15 @@ sub phase2_cleanup {
 
     my $nodename = PVE::INotify::nodename();
 
-    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
-    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
-    if (my $err = $@) {
-        $self->log('err', $err);
-        $self->{errors} = 1;
+    if ($self->{tunnel} && $self->{tunnel}->{version} >= 2) {
+	PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'stop');
+    } else {
+	my $cmd = [@{$self->{rem_ssh}}, 'qm', 'stop', $vmid, '--skiplock', '--migratedfrom', $nodename];
+	eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+	if (my $err = $@) {
+	    $self->log('err', $err);
+	    $self->{errors} = 1;
+	}
     }
 
     # cleanup after stopping, otherwise disks might be in-use by target VM!
@@ -1181,7 +1435,7 @@ sub phase3_cleanup {
 
     my $tunnel = $self->{tunnel};
 
-    if ($self->{volume_map}) {
+    if ($self->{volume_map} && !$self->{opts}->{remote}) {
 	my $target_drives = $self->{target_drive};
 
 	# FIXME: for NBD storage migration we now only update the volid, and
@@ -1197,28 +1451,35 @@ sub phase3_cleanup {
     }
 
     # transfer replication state before move config
-    $self->transfer_replication_state() if $self->{is_replicated};
-    PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
-    $self->switch_replication_job_target() if $self->{is_replicated};
+    if (!$self->{opts}->{remote}) {
+	$self->transfer_replication_state() if $self->{is_replicated};
+	PVE::QemuConfig->move_config_to_node($vmid, $self->{node});
+	$self->switch_replication_job_target() if $self->{is_replicated};
+    }
 
     if ($self->{livemigration}) {
 	if ($self->{stopnbd}) {
 	    $self->log('info', "stopping NBD storage migration server on target.");
 	    # stop nbd server on remote vm - requirement for resume since 2.9
-	    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
+	    if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 2) {
+		PVE::Tunnel::write_tunnel($tunnel, 30, 'nbdstop');
+	    } else {
+		my $cmd = [@{$self->{rem_ssh}}, 'qm', 'nbdstop', $vmid];
 
-	    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
-	    if (my $err = $@) {
-		$self->log('err', $err);
-		$self->{errors} = 1;
+		eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+		if (my $err = $@) {
+		    $self->log('err', $err);
+		    $self->{errors} = 1;
+		}
 	    }
 	}
 
 	if (!$self->{vm_was_paused}) {
 	    # config moved and nbd server stopped - now we can resume vm on target
 	    if ($tunnel && $tunnel->{version} && $tunnel->{version} >= 1) {
+		my $cmd = $tunnel->{version} == 1 ? "resume $vmid" : "resume";
 		eval {
-		    PVE::Tunnel::write_tunnel($tunnel, 30, "resume $vmid");
+		    PVE::Tunnel::write_tunnel($tunnel, 30, $cmd);
 		};
 		if (my $err = $@) {
 		    $self->log('err', $err);
@@ -1245,11 +1506,15 @@ sub phase3_cleanup {
 	) {
 	    if (!$self->{vm_was_paused}) {
 		$self->log('info', "issuing guest fstrim");
-		my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
-		eval { PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
-		if (my $err = $@) {
-		    $self->log('err', "fstrim failed - $err");
-		    $self->{errors} = 1;
+		if ($self->{opts}->{remote}) {
+		    PVE::Tunnel::write_tunnel($self->{tunnel}, 600, 'fstrim');
+		} else {
+		    my $cmd = [@{$self->{rem_ssh}}, 'qm', 'guest', 'cmd', $vmid, 'fstrim'];
+		    eval{ PVE::Tools::run_command($cmd, outfunc => sub {}, errfunc => sub {}) };
+		    if (my $err = $@) {
+			$self->log('err', "fstrim failed - $err");
+			$self->{errors} = 1;
+		    }
 		}
 	    } else {
 		$self->log('info', "skipping guest fstrim, because VM is paused");
@@ -1258,12 +1523,14 @@ sub phase3_cleanup {
     }
 
     # close tunnel on successful migration, on error phase2_cleanup closed it
-    if ($tunnel) {
+    if ($tunnel && $tunnel->{version} == 1) {
 	eval { PVE::Tunnel::finish_tunnel($tunnel); };
 	if (my $err = $@) {
 	    $self->log('err', $err);
 	    $self->{errors} = 1;
 	}
+	$tunnel = undef;
+	delete $self->{tunnel};
     }
 
     eval {
@@ -1301,6 +1568,9 @@ sub phase3_cleanup {
 
     # destroy local copies
     foreach my $volid (@not_replicated_volumes) {
+	# remote is cleaned up below
+	next if $self->{opts}->{remote};
+
 	eval { PVE::Storage::vdisk_free($self->{storecfg}, $volid); };
 	if (my $err = $@) {
 	    $self->log('err', "removing local copy of '$volid' failed - $err");
@@ -1310,8 +1580,19 @@ sub phase3_cleanup {
     }
 
     # clear migrate lock
-    my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
-    $self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
+    if ($tunnel && $tunnel->{version} >= 2) {
+	PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
+
+	PVE::Tunnel::finish_tunnel($tunnel);
+    } else {
+	my $cmd = [ @{$self->{rem_ssh}}, 'qm', 'unlock', $vmid ];
+	$self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
+    }
+
+    if ($self->{opts}->{remote} && $self->{opts}->{delete}) {
+	eval { PVE::QemuServer::destroy_vm($self->{storecfg}, $vmid, 1, undef, 0) };
+	warn "Failed to remove source VM - $@\n" if $@;
+    }
 }
 
 sub final_cleanup {
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 024b0af0..8a3b213d 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5474,7 +5474,10 @@ sub vm_start_nolock {
     my $defaults = load_defaults();
 
     # set environment variable useful inside network script
-    $ENV{PVE_MIGRATED_FROM} = $migratedfrom if $migratedfrom;
+    # for remote migration the config is available on the target node!
+    if (!$migrate_opts->{remote_node}) {
+	$ENV{PVE_MIGRATED_FROM} = $migratedfrom;
+    }
 
     PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
 
@@ -5722,7 +5725,7 @@ sub vm_start_nolock {
 
 	my $migrate_storage_uri;
 	# nbd_protocol_version > 0 for unix socket support
-	if ($nbd_protocol_version > 0 && $migration_type eq 'secure') {
+	if ($nbd_protocol_version > 0 && ($migration_type eq 'secure' || $migration_type eq 'websocket')) {
 	    my $socket_path = "/run/qemu-server/$vmid\_nbd.migrate";
 	    mon_cmd($vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $socket_path } } );
 	    $migrate_storage_uri = "nbd:unix:$socket_path";
diff --git a/debian/control b/debian/control
index ce469cbd..790f6ce2 100644
--- a/debian/control
+++ b/debian/control
@@ -42,6 +42,7 @@ Depends: dbus,
          libuuid-perl,
          libxml-libxml-perl,
          perl (>= 5.10.0-19),
+         proxmox-websocket-tunnel,
          pve-cluster,
          pve-edk2-firmware (>= 3.20210831-1),
          pve-firewall,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 qemu-server 5/6] api: add remote migrate endpoint
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (9 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 4/6] migrate: add remote migration handling Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command Fabian Grünbichler
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

entry point for the remote migration on the source side, mainly
preparing the API client that gets passed to the actual migration code
and doing some parameter parsing.

querying of the remote side's resources (like available storages, free
VMIDs, lookup of endpoint details for specific nodes, ...) should be
done by the client - see the next commit for a CLI example.
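
as a rough sketch of what such a client-side check looks like (this mirrors
the CLI wrapper added in the next commit, with $remote and $target_vmid
standing in for already-parsed user input):

    use PVE::APIClient::LWP;

    my $api_client = PVE::APIClient::LWP->new(
        protocol => 'https',
        host => $remote->{host},
        port => $remote->{port} // 8006,
        apitoken => $remote->{apitoken},
        cached_fingerprints => { uc($remote->{fingerprint}) => 1 },
    );

    # is the requested VMID still free on the remote cluster?
    my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
    die "Guest with ID '$target_vmid' already exists on remote cluster\n"
        if grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources;

    # do the mapped target storages exist and allow 'images' content?
    my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });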

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6:
    - mark as experimental
    - remove `with-local-disks` from API parameters, always set to true
    v5:
    - add to API index
    v4:
    - removed target_node parameter, now determined by querying /cluster/status on the remote
    - moved checks to CLI

 PVE/API2/Qemu.pm | 213 ++++++++++++++++++++++++++++++++++++++++++++++-
 debian/control   |   2 +
 2 files changed, 212 insertions(+), 3 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 898b4518..fa31e973 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -12,6 +12,7 @@ use URI::Escape;
 use Crypt::OpenSSL::Random;
 use Socket qw(SOCK_STREAM);
 
+use PVE::APIClient::LWP;
 use PVE::Cluster qw (cfs_read_file cfs_write_file);;
 use PVE::RRD;
 use PVE::SafeSyslog;
@@ -52,8 +53,6 @@ BEGIN {
     }
 }
 
-use Data::Dumper; # fixme: remove
-
 use base qw(PVE::RESTHandler);
 
 my $opt_force_description = "Force physical removal. Without this, we simple remove the disk from the config file and create an additional configuration entry called 'unused[n]', which contains the volume ID. Unlink of unused[n] always cause physical removal.";
@@ -1092,7 +1091,8 @@ __PACKAGE__->register_method({
 	    { subdir => 'sendkey' },
 	    { subdir => 'firewall' },
 	    { subdir => 'mtunnel' },
-	    ];
+	    { subdir => 'remote_migrate' },
+	];
 
 	return $res;
     }});
@@ -4300,6 +4300,202 @@ __PACKAGE__->register_method({
 
     }});
 
+__PACKAGE__->register_method({
+    name => 'remote_migrate_vm',
+    path => '{vmid}/remote_migrate',
+    method => 'POST',
+    protected => 1,
+    proxyto => 'node',
+    description => "Migrate virtual machine to a remote cluster. Creates a new migration task. EXPERIMENTAL feature!",
+    permissions => {
+	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::QemuServer::complete_vmid }),
+	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
+	    'target-endpoint' => get_standard_option('proxmox-remote', {
+		description => "Remote target endpoint",
+	    }),
+	    online => {
+		type => 'boolean',
+		description => "Use online/live migration if VM is running. Ignored if VM is stopped.",
+		optional => 1,
+	    },
+	    delete => {
+		type => 'boolean',
+		description => "Delete the original VM and related data after successful migration. By default the original VM is kept on the source cluster in a stopped state.",
+		optional => 1,
+		default => 0,
+	    },
+	    'target-storage' => get_standard_option('pve-targetstorage', {
+		completion => \&PVE::QemuServer::complete_migration_storage,
+		optional => 0,
+	    }),
+	    'target-bridge' => {
+		type => 'string',
+		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
+		format => 'bridge-pair-list',
+	    },
+	    bwlimit => {
+		description => "Override I/O bandwidth limit (in KiB/s).",
+		optional => 1,
+		type => 'integer',
+		minimum => '0',
+		default => 'migrate limit from datacenter or storage config',
+	    },
+	},
+    },
+    returns => {
+	type => 'string',
+	description => "the task ID.",
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $source_vmid = extract_param($param, 'vmid');
+	my $target_endpoint = extract_param($param, 'target-endpoint');
+	my $target_vmid = extract_param($param, 'target-vmid') // $source_vmid;
+
+	my $delete = extract_param($param, 'delete') // 0;
+
+	PVE::Cluster::check_cfs_quorum();
+
+	# test if VM exists
+	my $conf = PVE::QemuConfig->load_config($source_vmid);
+
+	PVE::QemuConfig->check_lock($conf);
+
+	raise_param_exc({ vmid => "cannot migrate HA-managed VM to remote cluster" })
+	    if PVE::HA::Config::vm_is_ha_managed($source_vmid);
+
+	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
+
+	# TODO: move this as helper somewhere appropriate?
+	my $conn_args = {
+	    protocol => 'https',
+	    host => $remote->{host},
+	    port => $remote->{port} // 8006,
+	    apitoken => $remote->{apitoken},
+	};
+
+	my $fp;
+	if ($fp = $remote->{fingerprint}) {
+	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 };
+	}
+
+	print "Establishing API connection with remote at '$remote->{host}'\n";
+
+	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
+
+	if (!defined($fp)) {
+	    my $cert_info = $api_client->get("/nodes/localhost/certificates/info");
+	    foreach my $cert (@$cert_info) {
+		my $filename = $cert->{filename};
+		next if $filename ne 'pveproxy-ssl.pem' && $filename ne 'pve-ssl.pem';
+		$fp = $cert->{fingerprint} if !$fp || $filename eq 'pveproxy-ssl.pem';
+	    }
+	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 }
+		if defined($fp);
+	}
+
+	my $repl_conf = PVE::ReplicationConfig->new();
+	my $is_replicated = $repl_conf->check_for_existing_jobs($source_vmid, 1);
+	die "cannot remote-migrate replicated VM\n" if $is_replicated;
+
+	if (PVE::QemuServer::check_running($source_vmid)) {
+	    die "can't migrate running VM without --online\n" if !$param->{online};
+
+	} else {
+	    warn "VM isn't running. Doing offline migration instead.\n" if $param->{online};
+	    $param->{online} = 0;
+	}
+
+	# FIXME: fork worker hear to avoid timeout? or poll these periodically
+	# in pvestatd and access cached info here? all of the below is actually
+	# checked at the remote end anyway once we call the mtunnel endpoint,
+	# we could also punt it to the client and not do it here at all..
+	my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
+	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
+	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
+	}
+
+	my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });
+
+	my $storecfg = PVE::Storage::config();
+	my $target_storage = extract_param($param, 'target-storage');
+	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
+	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
+	    if $@;
+
+	my $target_bridge = extract_param($param, 'target-bridge');
+	my $bridgemap = eval { PVE::JSONSchema::parse_idmap($target_bridge, 'pve-bridge-id') };
+	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
+	    if $@;
+
+	my $check_remote_storage = sub {
+	    my ($storage) = @_;
+	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
+	    die "remote: storage '$storage' does not exist!\n"
+		if !@$found;
+
+	    $found = @$found[0];
+
+	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
+	    die "remote: storage '$storage' cannot store images\n"
+		if !grep { $_ eq 'images' } @$content_types;
+	};
+
+	foreach my $target_sid (values %{$storagemap->{entries}}) {
+	    $check_remote_storage->($target_sid);
+	}
+
+	$check_remote_storage->($storagemap->{default})
+	    if $storagemap->{default};
+
+	die "remote migration requires explicit storage mapping!\n"
+	    if $storagemap->{identity};
+
+	$param->{storagemap} = $storagemap;
+	$param->{bridgemap} = $bridgemap;
+	$param->{remote} = {
+	    conn => $conn_args, # re-use fingerprint for tunnel
+	    client => $api_client,
+	    vmid => $target_vmid,
+	};
+	$param->{migration_type} = 'websocket';
+	$param->{'with-local-disks'} = 1;
+	$param->{delete} = $delete if $delete;
+
+	my $cluster_status = $api_client->get("/cluster/status");
+	my $target_node;
+	foreach my $entry (@$cluster_status) {
+	    next if $entry->{type} ne 'node';
+	    if ($entry->{local}) {
+		$target_node = $entry->{name};
+		last;
+	    }
+	}
+
+	die "couldn't determine endpoint's node name\n"
+	    if !defined($target_node);
+
+	my $realcmd = sub {
+	    PVE::QemuMigrate->migrate($target_node, $remote->{host}, $source_vmid, $param);
+	};
+
+	my $worker = sub {
+	    return PVE::GuestHelpers::guest_migration_lock($source_vmid, 10, $realcmd);
+	};
+
+	return $rpcenv->fork_worker('qmigrate', $source_vmid, $authuser, $worker);
+    }});
+
 __PACKAGE__->register_method({
     name => 'monitor',
     path => '{vmid}/monitor',
@@ -4996,6 +5192,12 @@ __PACKAGE__->register_method({
 		optional => 1,
 		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
 	    },
+	    bridges => {
+		type => 'string',
+		format => 'pve-bridge-id-list',
+		optional => 1,
+		description => 'List of network bridges to check availability. Will be checked again for actually used bridges during migration.',
+	    },
 	},
     },
     returns => {
@@ -5016,6 +5218,7 @@ __PACKAGE__->register_method({
 	my $vmid = extract_param($param, 'vmid');
 
 	my $storages = extract_param($param, 'storages');
+	my $bridges = extract_param($param, 'bridges');
 
 	my $nodename = PVE::INotify::nodename();
 
@@ -5029,6 +5232,10 @@ __PACKAGE__->register_method({
 	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
 	}
 
+	foreach my $bridge (PVE::Tools::split_list($bridges)) {
+	    PVE::Network::read_bridge_mtu($bridge);
+	}
+
 	PVE::Cluster::check_cfs_quorum();
 
 	my $lock = 'create';
diff --git a/debian/control b/debian/control
index 790f6ce2..18fde803 100644
--- a/debian/control
+++ b/debian/control
@@ -6,6 +6,7 @@ Build-Depends: debhelper (>= 12~),
                libglib2.0-dev,
                libio-multiplex-perl,
                libjson-c-dev,
+               libpve-apiclient-perl,
                libpve-cluster-perl,
                libpve-common-perl (>= 7.1-4),
                libpve-guest-common-perl (>= 4.1-1),
@@ -34,6 +35,7 @@ Depends: dbus,
          libjson-xs-perl,
          libnet-ssleay-perl,
          libpve-access-control (>= 7.0-7),
+         libpve-apiclient-perl,
          libpve-cluster-perl,
          libpve-common-perl (>= 7.1-4),
          libpve-guest-common-perl (>= 4.1-1),
-- 
2.30.2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (10 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 5/6] api: add remote migrate endpoint Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-10-17 14:40   ` DERUMIER, Alexandre
  2022-10-17 17:22   ` DERUMIER, Alexandre
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 storage 1/1] (remote) export: check and untaint format Fabian Grünbichler
  2022-10-04 15:29 ` [pve-devel] [PATCH-SERIES v6 0/13] remote migration DERUMIER, Alexandre
  13 siblings, 2 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

which wraps the remote_migrate_vm API endpoint, but itself performs the
precondition checks that can be done up front.

this now just leaves the fingerprint retrieval and target node name lookup to
the synchronous part of the API endpoint, which should be doable in <30s ..

an example invocation:

$ qm remote-migrate 1234 4321 'host=123.123.123.123,apitoken=pveapitoken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' --target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e --online

will migrate the local VM 1234 to the host 123.123.123.123 using the
given API token, mapping the VMID to 4321 on the target cluster, all its
virtual NICs to the target vm bridge 'vmbr0', any volumes on storage
zfs-a to storage rbd-b, any on storage nfs-c to storage dir-d, and any
other volumes to storage zfs-e. the source VM will be stopped but remain
on the source node/cluster after the migration has finished.
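
for reference, the storage/bridge maps are parsed with
PVE::JSONSchema::parse_idmap - a minimal sketch of the structure the endpoint
works with (layout inferred from how 'entries'/'default'/'identity' are used
in the previous patch):

    use PVE::JSONSchema;

    my $storagemap = PVE::JSONSchema::parse_idmap(
        'zfs-a:rbd-b,nfs-c:dir-d,zfs-e', 'pve-storage-id');

    # roughly:
    # {
    #     entries => { 'zfs-a' => 'rbd-b', 'nfs-c' => 'dir-d' },
    #     default => 'zfs-e',
    # }
    # passing just '1' instead would set the 'identity' flag, which remote
    # migration rejects since it requires an explicit storage mapping.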

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6:
    - mark as experimental
    - drop `with-local-disks` parameter from API, always set to true
    - add example invocation to commit message
    
    v5: rename to 'remote-migrate'

 PVE/API2/Qemu.pm |  31 -------------
 PVE/CLI/qm.pm    | 113 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+), 31 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index fa31e973..57083601 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4416,17 +4416,6 @@ __PACKAGE__->register_method({
 	    $param->{online} = 0;
 	}
 
-	# FIXME: fork worker hear to avoid timeout? or poll these periodically
-	# in pvestatd and access cached info here? all of the below is actually
-	# checked at the remote end anyway once we call the mtunnel endpoint,
-	# we could also punt it to the client and not do it here at all..
-	my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
-	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
-	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
-	}
-
-	my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });
-
 	my $storecfg = PVE::Storage::config();
 	my $target_storage = extract_param($param, 'target-storage');
 	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
@@ -4438,26 +4427,6 @@ __PACKAGE__->register_method({
 	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
 	    if $@;
 
-	my $check_remote_storage = sub {
-	    my ($storage) = @_;
-	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
-	    die "remote: storage '$storage' does not exist!\n"
-		if !@$found;
-
-	    $found = @$found[0];
-
-	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
-	    die "remote: storage '$storage' cannot store images\n"
-		if !grep { $_ eq 'images' } @$content_types;
-	};
-
-	foreach my $target_sid (values %{$storagemap->{entries}}) {
-	    $check_remote_storage->($target_sid);
-	}
-
-	$check_remote_storage->($storagemap->{default})
-	    if $storagemap->{default};
-
 	die "remote migration requires explicit storage mapping!\n"
 	    if $storagemap->{identity};
 
diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
index ca5d25fc..a6a63566 100755
--- a/PVE/CLI/qm.pm
+++ b/PVE/CLI/qm.pm
@@ -15,6 +15,7 @@ use POSIX qw(strftime);
 use Term::ReadLine;
 use URI::Escape;
 
+use PVE::APIClient::LWP;
 use PVE::Cluster;
 use PVE::Exception qw(raise_param_exc);
 use PVE::GuestHelpers;
@@ -158,6 +159,117 @@ __PACKAGE__->register_method ({
 	return;
     }});
 
+
+__PACKAGE__->register_method({
+    name => 'remote_migrate_vm',
+    path => 'remote_migrate_vm',
+    method => 'POST',
+    description => "Migrate virtual machine to a remote cluster. Creates a new migration task. EXPERIMENTAL feature!",
+    permissions => {
+	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    node => get_standard_option('pve-node'),
+	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::QemuServer::complete_vmid }),
+	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
+	    'target-endpoint' => get_standard_option('proxmox-remote', {
+		description => "Remote target endpoint",
+	    }),
+	    online => {
+		type => 'boolean',
+		description => "Use online/live migration if VM is running. Ignored if VM is stopped.",
+		optional => 1,
+	    },
+	    delete => {
+		type => 'boolean',
+		description => "Delete the original VM and related data after successful migration. By default the original VM is kept on the source cluster in a stopped state.",
+		optional => 1,
+		default => 0,
+	    },
+	    'target-storage' => get_standard_option('pve-targetstorage', {
+		completion => \&PVE::QemuServer::complete_migration_storage,
+		optional => 0,
+	    }),
+	    'target-bridge' => {
+		type => 'string',
+		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
+		format => 'bridge-pair-list',
+	    },
+	    bwlimit => {
+		description => "Override I/O bandwidth limit (in KiB/s).",
+		optional => 1,
+		type => 'integer',
+		minimum => '0',
+		default => 'migrate limit from datacenter or storage config',
+	    },
+	},
+    },
+    returns => {
+	type => 'string',
+	description => "the task ID.",
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+
+	my $source_vmid = $param->{vmid};
+	my $target_endpoint = $param->{'target-endpoint'};
+	my $target_vmid = $param->{'target-vmid'} // $source_vmid;
+
+	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
+
+	# TODO: move this as helper somewhere appropriate?
+	my $conn_args = {
+	    protocol => 'https',
+	    host => $remote->{host},
+	    port => $remote->{port} // 8006,
+	    apitoken => $remote->{apitoken},
+	};
+
+	$conn_args->{cached_fingerprints} = { uc($remote->{fingerprint}) => 1 }
+	    if defined($remote->{fingerprint});
+
+	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
+	my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
+	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
+	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
+	}
+
+	my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });
+
+	my $storecfg = PVE::Storage::config();
+	my $target_storage = $param->{'target-storage'};
+	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
+	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
+	    if $@;
+
+	my $check_remote_storage = sub {
+	    my ($storage) = @_;
+	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
+	    die "remote: storage '$storage' does not exist!\n"
+		if !@$found;
+
+	    $found = @$found[0];
+
+	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
+	    die "remote: storage '$storage' cannot store images\n"
+		if !grep { $_ eq 'images' } @$content_types;
+	};
+
+	foreach my $target_sid (values %{$storagemap->{entries}}) {
+	    $check_remote_storage->($target_sid);
+	}
+
+	$check_remote_storage->($storagemap->{default})
+	    if $storagemap->{default};
+
+	return PVE::API2::Qemu->remote_migrate_vm($param);
+    }});
+
 __PACKAGE__->register_method ({
     name => 'status',
     path => 'status',
@@ -900,6 +1012,7 @@ our $cmddef = {
     clone => [ "PVE::API2::Qemu", 'clone_vm', ['vmid', 'newid'], { node => $nodename }, $upid_exit ],
 
     migrate => [ "PVE::API2::Qemu", 'migrate_vm', ['vmid', 'target'], { node => $nodename }, $upid_exit ],
+    'remote-migrate' => [ __PACKAGE__, 'remote_migrate_vm', ['vmid', 'target-vmid', 'target-endpoint'], { node => $nodename }, $upid_exit ],
 
     set => [ "PVE::API2::Qemu", 'update_vm', ['vmid'], { node => $nodename } ],
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 storage 1/1] (remote) export: check and untaint format
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (11 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command Fabian Grünbichler
@ 2022-09-28 12:50 ` Fabian Grünbichler
  2022-09-29 12:39   ` [pve-devel] applied: " Thomas Lamprecht
  2022-10-04 15:29 ` [pve-devel] [PATCH-SERIES v6 0/13] remote migration DERUMIER, Alexandre
  13 siblings, 1 reply; 29+ messages in thread
From: Fabian Grünbichler @ 2022-09-28 12:50 UTC (permalink / raw)
  To: pve-devel

this format comes from the remote cluster, so it might not be supported
on the source side - checking whether it's known (as an additional
safeguard) and untainting it (to avoid an open3 failure) are required.
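
the untainting works by picking the matching element out of the hard-coded
list instead of re-using the string received over the tunnel - a minimal
illustration of the pattern (not the exact patch hunk):

    my $known = ['raw+size', 'tar+size', 'qcow2+size', 'vmdk+size', 'zfs', 'btrfs'];

    # $format is tainted, it came from the remote end
    my ($match) = grep { $_ eq $format } @$known;
    die "unknown export format '$format'\n" if !defined($match);

    # $match is an element of the literal list above and therefore untainted
    $format = $match;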

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---

Notes:
    v6: new

 PVE/CLI/pvesm.pm | 6 ++----
 PVE/Storage.pm   | 9 +++++++++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/PVE/CLI/pvesm.pm b/PVE/CLI/pvesm.pm
index 003b019..9b9676b 100755
--- a/PVE/CLI/pvesm.pm
+++ b/PVE/CLI/pvesm.pm
@@ -30,8 +30,6 @@ use PVE::CLIHandler;
 
 use base qw(PVE::CLIHandler);
 
-my $KNOWN_EXPORT_FORMATS = ['raw+size', 'tar+size', 'qcow2+size', 'vmdk+size', 'zfs', 'btrfs'];
-
 my $nodename = PVE::INotify::nodename();
 
 sub param_mapping {
@@ -269,7 +267,7 @@ __PACKAGE__->register_method ({
 	    format => {
 		description => "Export stream format",
 		type => 'string',
-		enum => $KNOWN_EXPORT_FORMATS,
+		enum => $PVE::Storage::KNOWN_EXPORT_FORMATS,
 	    },
 	    filename => {
 		description => "Destination file name",
@@ -355,7 +353,7 @@ __PACKAGE__->register_method ({
 	    format => {
 		description => "Import stream format",
 		type => 'string',
-		enum => $KNOWN_EXPORT_FORMATS,
+		enum => $PVE::Storage::KNOWN_EXPORT_FORMATS,
 	    },
 	    filename => {
 		description => "Source file name. For '-' stdin is used, the " .
diff --git a/PVE/Storage.pm b/PVE/Storage.pm
index b9c53a1..ce61fee 100755
--- a/PVE/Storage.pm
+++ b/PVE/Storage.pm
@@ -48,6 +48,8 @@ use constant APIVER => 10;
 # see https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
 use constant APIAGE => 1;
 
+our $KNOWN_EXPORT_FORMATS = ['raw+size', 'tar+size', 'qcow2+size', 'vmdk+size', 'zfs', 'btrfs'];
+
 # load standard plugins
 PVE::Storage::DirPlugin->register();
 PVE::Storage::LVMPlugin->register();
@@ -1949,6 +1951,13 @@ sub volume_import_start {
 sub volume_export_start {
     my ($cfg, $volid, $format, $log, $opts) = @_;
 
+    my $known_format = [ grep { $_ eq $format } $KNOWN_EXPORT_FORMATS->@* ];
+    if (!$known_format->@*) {
+	die "Cannot export '$volid' using unknown export format '$format'\n";
+    }
+
+    $format = ($known_format->@*)[0];
+
     my $run_command_params = delete $opts->{cmd} // {};
 
     my $cmds = $volume_export_prepare->($cfg, $volid, $format, $log, $opts);
-- 
2.30.2





^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] applied: [PATCH v6 storage 1/1] (remote) export: check and untaint format
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 storage 1/1] (remote) export: check and untaint format Fabian Grünbichler
@ 2022-09-29 12:39   ` Thomas Lamprecht
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Lamprecht @ 2022-09-29 12:39 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 28/09/2022 at 14:50, Fabian Grünbichler wrote:
> this format comes from the remote cluster, so it might not be supported
> on the source side - checking whether it's known (as additional
> safeguard) and untainting (to avoid open3 failure) is required.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>     v6: new
> 
>  PVE/CLI/pvesm.pm | 6 ++----
>  PVE/Storage.pm   | 9 +++++++++
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 
>

applied, with a small code style fixup (`($foo->@*)[0]` vs `$foo->[0]`) squashed in, thanks!
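
i.e. the hunk above's

    $format = ($known_format->@*)[0];

presumably became the equivalent but simpler

    $format = $known_format->[0];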




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints Fabian Grünbichler
@ 2022-09-30 11:52   ` Stefan Hanreich
  2022-10-03  7:11     ` Fabian Grünbichler
  2022-10-03 13:22   ` [pve-devel] [PATCH FOLLOW-UP " Fabian Grünbichler
  2022-10-18  6:23   ` [pve-devel] [PATCH " DERUMIER, Alexandre
  2 siblings, 1 reply; 29+ messages in thread
From: Stefan Hanreich @ 2022-09-30 11:52 UTC (permalink / raw)
  To: pve-devel



On 9/28/22 14:50, Fabian Grünbichler wrote:
> the following two endpoints are used for migration on the remote side
> 
> POST /nodes/NODE/qemu/VMID/mtunnel
> 
> which creates and locks an empty VM config, and spawns the main qmtunnel
> worker which binds to a VM-specific UNIX socket.
> 
> this worker handles JSON-encoded migration commands coming in via this
> UNIX socket:
> - config (set target VM config)
> -- checks permissions for updating config
> -- strips pending changes and snapshots
> -- sets (optional) firewall config
> - disk (allocate disk for NBD migration)
> -- checks permission for target storage
> -- returns drive string for allocated volume
> - disk-import, query-disk-import, bwlimit
> -- handled by PVE::StorageTunnel
> - start (returning migration info)
> - fstrim (via agent)
> - ticket (creates a ticket for a WS connection to a specific socket)
> - resume
> - stop
> - nbdstop
> - unlock
> - quit (+ cleanup)
> 
> this worker serves as a replacement for both 'qm mtunnel' and various
> manual calls via SSH. the API call will return a ticket valid for
> connecting to the worker's UNIX socket via a websocket connection.
> 
> GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket
> 
> gets called for connecting to a UNIX socket via websocket forwarding,
> i.e. once for the main command mtunnel, and once each for the memory
> migration and each NBD drive-mirror/storage migration.
> 
> access is guarded by a short-lived ticket binding the authenticated user
> to the socket path. such tickets can be requested over the main mtunnel,
> which keeps track of socket paths currently used by that
> mtunnel/migration instance.
> 
> each command handler should check privileges for the requested action if
> necessary.
> 
> both mtunnel and mtunnelwebsocket endpoints are not proxied, the
> client/caller is responsible for ensuring the passed 'node' parameter
> and the endpoint handling the call are matching.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      v6:
>      - check for Sys.Incoming in mtunnel
>      - add definedness checks in 'config' command
>      - switch to vm_running_locally in 'resume' command
>      - moved $socket_addr closer to usage
>      v5:
>      - us vm_running_locally
>      - move '$socket_addr' declaration closer to usage
>      v4:
>      - add timeout to accept()
>      - move 'bwlimit' to PVE::StorageTunnel and extend it
>      - mark mtunnel(websocket) as non-proxied, and check $node accordingly
>      v3:
>      - handle meta and vmgenid better
>      - handle failure of 'config' updating
>      - move 'disk-import' and 'query-disk-import' handlers to pve-guest-common
>      - improve tunnel exit by letting client close the connection
>      - use strict VM config parser
>      v2: incorporated Fabian Ebner's feedback, mainly:
>      - use modified nbd alloc helper instead of duplicating
>      - fix disk cleanup, also cleanup imported disks
>      - fix firewall-conf vs firewall-config mismatch
>      
>      requires
>      - pve-access-control with tunnel ticket support (already marked in d/control)
>      - pve-access-control with Sys.Incoming privilege (not yet applied/bumped!)
>      - pve-http-server with websocket fixes (could be done via breaks? or bumped in
>        pve-manager..)
> 
>   PVE/API2/Qemu.pm | 527 ++++++++++++++++++++++++++++++++++++++++++++++-
>   debian/control   |   2 +-
>   2 files changed, 527 insertions(+), 2 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 3ec31c26..9270ca74 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4,10 +4,13 @@ use strict;
>   use warnings;
>   use Cwd 'abs_path';
>   use Net::SSLeay;
> -use POSIX;
>   use IO::Socket::IP;
> +use IO::Socket::UNIX;
> +use IPC::Open3;
> +use JSON;
>   use URI::Escape;
>   use Crypt::OpenSSL::Random;
> +use Socket qw(SOCK_STREAM);
>   
>   use PVE::Cluster qw (cfs_read_file cfs_write_file);;
>   use PVE::RRD;
> @@ -38,6 +41,7 @@ use PVE::VZDump::Plugin;
>   use PVE::DataCenterConfig;
>   use PVE::SSHInfo;
>   use PVE::Replication;
> +use PVE::StorageTunnel;
>   
>   BEGIN {
>       if (!$ENV{PVE_GENERATING_DOCS}) {
> @@ -1087,6 +1091,7 @@ __PACKAGE__->register_method({
>   	    { subdir => 'spiceproxy' },
>   	    { subdir => 'sendkey' },
>   	    { subdir => 'firewall' },
> +	    { subdir => 'mtunnel' },
>   	    ];
>   
>   	return $res;
> @@ -4965,4 +4970,524 @@ __PACKAGE__->register_method({
>   	return PVE::QemuServer::Cloudinit::dump_cloudinit_config($conf, $param->{vmid}, $param->{type});
>       }});
>   
> +__PACKAGE__->register_method({
> +    name => 'mtunnel',
> +    path => '{vmid}/mtunnel',
> +    method => 'POST',
> +    protected => 1,
> +    description => 'Migration tunnel endpoint - only for internal use by VM migration.',
> +    permissions => {
> +	check =>
> +	[ 'and',
> +	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
> +	  ['perm', '/', [ 'Sys.Incoming' ]],
> +	],
> +	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
> +	               " on '/'. Further permission checks happen during the actual migration.",
> +    },
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    storages => {
> +		type => 'string',
> +		format => 'pve-storage-id-list',
> +		optional => 1,
> +		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
> +	    },
> +	},
> +    },
> +    returns => {
> +	additionalProperties => 0,
> +	properties => {
> +	    upid => { type => 'string' },
> +	    ticket => { type => 'string' },
> +	    socket => { type => 'string' },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $node = extract_param($param, 'node');
> +	my $vmid = extract_param($param, 'vmid');
> +
> +	my $storages = extract_param($param, 'storages');
> +
> +	my $nodename = PVE::INotify::nodename();
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	$node = $nodename;
> +
> +	my $storecfg = PVE::Storage::config();
> +	foreach my $storeid (PVE::Tools::split_list($storages)) {
> +	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
> +	}
> +
> +	PVE::Cluster::check_cfs_quorum();
> +
> +	my $lock = 'create';
> +	eval { PVE::QemuConfig->create_and_lock_config($vmid, 0, $lock); };
> +
> +	raise_param_exc({ vmid => "unable to create empty VM config - $@"})
> +	    if $@;
> +
> +	my $realcmd = sub {
> +	    my $state = {
> +		storecfg => PVE::Storage::config(),
> +		lock => $lock,
> +		vmid => $vmid,
> +	    };
> +
> +	    my $run_locked = sub {
> +		my ($code, $params) = @_;
> +		return PVE::QemuConfig->lock_config($state->{vmid}, sub {
> +		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +
> +		    $state->{conf} = $conf;
> +
> +		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
> +			if $state->{lock} && !PVE::QemuConfig->has_lock($conf, $state->{lock});
> +
> +		    return $code->($params);
> +		});
> +	    };
> +
> +	    my $cmd_desc = {
> +		config => {
> +		    conf => {
> +			type => 'string',
> +			description => 'Full VM config, adapted for target cluster/node',
> +		    },
> +		    'firewall-config' => {
> +			type => 'string',
> +			description => 'VM firewall config',
> +			optional => 1,
> +		    },
> +		},
> +		disk => {
> +		    format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
> +		    storage => {
> +			type => 'string',
> +			format => 'pve-storage-id',
> +		    },
> +		    drive => {
> +			type => 'object',
> +			description => 'parsed drive information without volid and format',
> +		    },
> +		},
> +		start => {
> +		    start_params => {
> +			type => 'object',
> +			description => 'params passed to vm_start_nolock',
> +		    },
> +		    migrate_opts => {
> +			type => 'object',
> +			description => 'migrate_opts passed to vm_start_nolock',
> +		    },
> +		},
> +		ticket => {
> +		    path => {
> +			type => 'string',
> +			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
> +		    },
> +		},
> +		quit => {
> +		    cleanup => {
> +			type => 'boolean',
> +			description => 'remove VM config and disks, aborting migration',
> +			default => 0,
> +		    },
> +		},
> +		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
> +		'query-disk-import' => $PVE::StorageTunnel::cmd_schema->{'query-disk-import'},
> +		bwlimit => $PVE::StorageTunnel::cmd_schema->{bwlimit},
> +	    };
> +
> +	    my $cmd_handlers = {
> +		'version' => sub {
> +		    # compared against other end's version
> +		    # bump/reset for breaking changes
> +		    # bump/bump for opt-in changes
> +		    return {
> +			api => 2,
> +			age => 0,
> +		    };
> +		},
> +		'config' => sub {
> +		    my ($params) = @_;
> +
> +		    # parse and write out VM FW config if given
> +		    if (my $fw_conf = $params->{'firewall-config'}) {
> +			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
> +
> +			my $empty_conf = {
> +			    rules => [],
> +			    options => {},
> +			    aliases => {},
> +			    ipset => {} ,
> +			    ipset_comments => {},
> +			};
> +			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
> +
> +			# TODO: add flag for strict parsing?
> +			# TODO: add import sub that does all this given raw content?
> +			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
> +			$vmfw_conf->{vmid} = $state->{vmid};
> +			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
> +
> +			$state->{cleanup}->{fw} = 1;
> +		    }
> +
> +		    my $conf_fn = "incoming/qemu-server/$state->{vmid}.conf";
> +		    my $new_conf = PVE::QemuServer::parse_vm_config($conf_fn, $params->{conf}, 1);
> +		    delete $new_conf->{lock};
> +		    delete $new_conf->{digest};
> +
> +		    # TODO handle properly?
> +		    delete $new_conf->{snapshots};
> +		    delete $new_conf->{parent};
> +		    delete $new_conf->{pending};
> +
> +		    # not handled by update_vm_api
> +		    my $vmgenid = delete $new_conf->{vmgenid};
> +		    my $meta = delete $new_conf->{meta};
> +
> +		    $new_conf->{vmid} = $state->{vmid};
> +		    $new_conf->{node} = $node;
> +
> +		    PVE::QemuConfig->remove_lock($state->{vmid}, 'create');
> +
> +		    eval {
> +			$update_vm_api->($new_conf, 1);
> +		    };
> +		    if (my $err = $@) {
> +			# revert to locked previous config
> +			my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +			$conf->{lock} = 'create';
> +			PVE::QemuConfig->write_config($state->{vmid}, $conf);
> +
> +			die $err;
> +		    }
> +
> +		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +		    $conf->{lock} = 'migrate';
> +		    $conf->{vmgenid} = $vmgenid if defined($vmgenid);
> +		    $conf->{meta} = $meta if defined($meta);
> +		    PVE::QemuConfig->write_config($state->{vmid}, $conf);
> +
> +		    $state->{lock} = 'migrate';
> +
> +		    return;
> +		},
> +		'bwlimit' => sub {
> +		    my ($params) = @_;
> +		    return PVE::StorageTunnel::handle_bwlimit($params);
> +		},
> +		'disk' => sub {
> +		    my ($params) = @_;
> +
> +		    my $format = $params->{format};
> +		    my $storeid = $params->{storage};
> +		    my $drive = $params->{drive};
> +
> +		    $check_storage_access_migrate->($rpcenv, $authuser, $state->{storecfg}, $storeid, $node);
> +
> +		    my $storagemap = {
> +			default => $storeid,
> +		    };
> +
> +		    my $source_volumes = {
> +			'disk' => [
> +			    undef,
> +			    $storeid,
> +			    undef,
> +			    $drive,
> +			    0,
> +			    $format,
> +			],
> +		    };
> +
> +		    my $res = PVE::QemuServer::vm_migrate_alloc_nbd_disks($state->{storecfg}, $state->{vmid}, $source_volumes, $storagemap);
> +		    if (defined($res->{disk})) {
> +			$state->{cleanup}->{volumes}->{$res->{disk}->{volid}} = 1;
> +			return $res->{disk};
> +		    } else {
> +			die "failed to allocate NBD disk..\n";
> +		    }
> +		},
> +		'disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    $check_storage_access_migrate->(
> +			$rpcenv,
> +			$authuser,
> +			$state->{storecfg},
> +			$params->{storage},
> +			$node
> +		    );
> +
> +		    $params->{unix} = "/run/qemu-server/$state->{vmid}.storage";
> +
> +		    return PVE::StorageTunnel::handle_disk_import($state, $params);
> +		},
> +		'query-disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
> +		},
> +		'start' => sub {
> +		    my ($params) = @_;
> +
> +		    my $info = PVE::QemuServer::vm_start_nolock(
> +			$state->{storecfg},
> +			$state->{vmid},
> +			$state->{conf},
> +			$params->{start_params},
> +			$params->{migrate_opts},
> +		    );
> +
> +
> +		    if ($info->{migrate}->{proto} ne 'unix') {
> +			PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> +			die "migration over non-UNIX sockets not possible\n";
> +		    }
> +
> +		    my $socket = $info->{migrate}->{addr};
> +		    chown $state->{socket_uid}, -1, $socket;
> +		    $state->{sockets}->{$socket} = 1;
> +
> +		    my $unix_sockets = $info->{migrate}->{unix_sockets};
> +		    foreach my $socket (@$unix_sockets) {
> +			chown $state->{socket_uid}, -1, $socket;
> +			$state->{sockets}->{$socket} = 1;
> +		    }
> +		    return $info;
> +		},
> +		'fstrim' => sub {
> +		    if (PVE::QemuServer::qga_check_running($state->{vmid})) {
> +			eval { mon_cmd($state->{vmid}, "guest-fstrim") };
> +			warn "fstrim failed: $@\n" if $@;
> +		    }
> +		    return;
> +		},
> +		'stop' => sub {
> +		    PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> +		    return;
> +		},
> +		'nbdstop' => sub {
> +		    PVE::QemuServer::nbd_stop($state->{vmid});
> +		    return;
> +		},
> +		'resume' => sub {
> +		    if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
> +			PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
> +		    } else {
> +			die "VM $state->{vmid} not running\n";
> +		    }
> +		    return;
> +		},
> +		'unlock' => sub {
> +		    PVE::QemuConfig->remove_lock($state->{vmid}, $state->{lock});
> +		    delete $state->{lock};
> +		    return;
> +		},
> +		'ticket' => sub {
> +		    my ($params) = @_;
> +
> +		    my $path = $params->{path};
> +
> +		    die "Not allowed to generate ticket for unknown socket '$path'\n"
> +			if !defined($state->{sockets}->{$path});
> +
> +		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
> +		},
> +		'quit' => sub {
> +		    my ($params) = @_;
> +
> +		    if ($params->{cleanup}) {
> +			if ($state->{cleanup}->{fw}) {
> +			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
> +			}
> +
> +			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
> +			    print "freeing volume '$volid' as part of cleanup\n";
> +			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
> +			    warn $@ if $@;
> +			}
> +
> +			PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
> +		    }
> +
> +		    print "switching to exit-mode, waiting for client to disconnect\n";
> +		    $state->{exit} = 1;
> +		    return;
> +		},
> +	    };
> +
> +	    $run_locked->(sub {
> +		my $socket_addr = "/run/qemu-server/$state->{vmid}.mtunnel";
> +		unlink $socket_addr;
> +
> +		$state->{socket} = IO::Socket::UNIX->new(
> +	            Type => SOCK_STREAM(),
> +		    Local => $socket_addr,
> +		    Listen => 1,
> +		);
> +
> +		$state->{socket_uid} = getpwnam('www-data')
> +		    or die "Failed to resolve user 'www-data' to numeric UID\n";
> +		chown $state->{socket_uid}, -1, $socket_addr;
> +	    });
> +
> +	    print "mtunnel started\n";
> +
> +	    my $conn = eval { PVE::Tools::run_with_timeout(300, sub { $state->{socket}->accept() }) };
> +	    if ($@) {
> +		warn "Failed to accept tunnel connection - $@\n";
> +
> +		warn "Removing tunnel socket..\n";
> +		unlink $state->{socket};
> +
> +		warn "Removing temporary VM config..\n";
> +		$run_locked->(sub {
> +		    PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
> +		});
> +
> +		die "Exiting mtunnel\n";
> +	    }
> +
> +	    $state->{conn} = $conn;
> +
> +	    my $reply_err = sub {
> +		my ($msg) = @_;
> +
> +		my $reply = JSON::encode_json({
> +		    success => JSON::false,
> +		    msg => $msg,
> +		});
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    my $reply_ok = sub {
> +		my ($res) = @_;
> +
> +		$res->{success} = JSON::true;
> +		my $reply = JSON::encode_json($res);
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    while (my $line = <$conn>) {
> +		chomp $line;
> +
> +		# untaint, we validate below if needed
> +		($line) = $line =~ /^(.*)$/;
> +		my $parsed = eval { JSON::decode_json($line) };
> +		if ($@) {
> +		    $reply_err->("failed to parse command - $@");
> +		    next;
> +		}
> +
> +		my $cmd = delete $parsed->{cmd};
> +		if (!defined($cmd)) {
> +		    $reply_err->("'cmd' missing");
> +		} elsif ($state->{exit}) {
> +		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
> +		    next;
> +		} elsif (my $handler = $cmd_handlers->{$cmd}) {
> +		    print "received command '$cmd'\n";
> +		    eval {
> +			if ($cmd_desc->{$cmd}) {
> +			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);

might the params be flipped here?

> +			} else {
> +			    $parsed = {};
> +			}
> +			my $res = $run_locked->($handler, $parsed);
> +			$reply_ok->($res);
> +		    };
> +		    $reply_err->("failed to handle '$cmd' command - $@")
> +			if $@;
> +		} else {
> +		    $reply_err->("unknown command '$cmd' given");
> +		}
> +	    }
> +
> +	    if ($state->{exit}) {
> +		print "mtunnel exited\n";
> +	    } else {
> +		die "mtunnel exited unexpectedly\n";
> +	    }
> +	};
> +
> +	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
> +	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
> +	my $upid = $rpcenv->fork_worker('qmtunnel', $vmid, $authuser, $realcmd);
> +
> +	return {
> +	    ticket => $ticket,
> +	    upid => $upid,
> +	    socket => $socket_addr,
> +	};
> +    }});
> +
> +__PACKAGE__->register_method({
> +    name => 'mtunnelwebsocket',
> +    path => '{vmid}/mtunnelwebsocket',
> +    method => 'GET',
> +    permissions => {
> +	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
> +        user => 'all', # check inside
> +    },
> +    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    socket => {
> +		type => "string",
> +		description => "unix socket to forward to",
> +	    },
> +	    ticket => {
> +		type => "string",
> +		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => "object",
> +	properties => {
> +	    port => { type => 'string', optional => 1 },
> +	    socket => { type => 'string', optional => 1 },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $nodename = PVE::INotify::nodename();
> +	my $node = extract_param($param, 'node');
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	my $vmid = $param->{vmid};
> +	# check VM exists
> +	PVE::QemuConfig->load_config($vmid);
> +
> +	my $socket = $param->{socket};
> +	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
> +
> +	return { socket => $socket };
> +    }});
> +
>   1;
> diff --git a/debian/control b/debian/control
> index a90ecd6f..ce469cbd 100644
> --- a/debian/control
> +++ b/debian/control
> @@ -33,7 +33,7 @@ Depends: dbus,
>            libjson-perl,
>            libjson-xs-perl,
>            libnet-ssleay-perl,
> -         libpve-access-control (>= 5.0-7),
> +         libpve-access-control (>= 7.0-7),
>            libpve-cluster-perl,
>            libpve-common-perl (>= 7.1-4),
>            libpve-guest-common-perl (>= 4.1-1),




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints
  2022-09-30 11:52   ` Stefan Hanreich
@ 2022-10-03  7:11     ` Fabian Grünbichler
  0 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-10-03  7:11 UTC (permalink / raw)
  To: Proxmox VE development discussion

On September 30, 2022 1:52 pm, Stefan Hanreich wrote:
> 
> 
> On 9/28/22 14:50, Fabian Grünbichler wrote:
>> the following two endpoints are used for migration on the remote side
>> 
>> POST /nodes/NODE/qemu/VMID/mtunnel
>> 
>> which creates and locks an empty VM config, and spawns the main qmtunnel
>> worker which binds to a VM-specific UNIX socket.
>> 
>> this worker handles JSON-encoded migration commands coming in via this
>> UNIX socket:
>> - config (set target VM config)
>> -- checks permissions for updating config
>> -- strips pending changes and snapshots
>> -- sets (optional) firewall config
>> - disk (allocate disk for NBD migration)
>> -- checks permission for target storage
>> -- returns drive string for allocated volume
>> - disk-import, query-disk-import, bwlimit
>> -- handled by PVE::StorageTunnel
>> - start (returning migration info)
>> - fstrim (via agent)
>> - ticket (creates a ticket for a WS connection to a specific socket)
>> - resume
>> - stop
>> - nbdstop
>> - unlock
>> - quit (+ cleanup)
>> 
>> this worker serves as a replacement for both 'qm mtunnel' and various
>> manual calls via SSH. the API call will return a ticket valid for
>> connecting to the worker's UNIX socket via a websocket connection.
>> 
>> GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket
>> 
>> gets called for connecting to a UNIX socket via websocket forwarding,
>> i.e. once for the main command mtunnel, and once each for the memory
>> migration and each NBD drive-mirror/storage migration.
>> 
>> access is guarded by a short-lived ticket binding the authenticated user
>> to the socket path. such tickets can be requested over the main mtunnel,
>> which keeps track of socket paths currently used by that
>> mtunnel/migration instance.
>> 
>> each command handler should check privileges for the requested action if
>> necessary.
>> 
>> both mtunnel and mtunnelwebsocket endpoints are not proxied, the
>> client/caller is responsible for ensuring the passed 'node' parameter
>> and the endpoint handling the call are matching.
>> 
>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>> ---
>> 
>> Notes:
>>      v6:
>>      - check for Sys.Incoming in mtunnel
>>      - add definedness checks in 'config' command
>>      - switch to vm_running_locally in 'resume' command
>>      - moved $socket_addr closer to usage
>>      v5:
>>      - us vm_running_locally
>>      - move '$socket_addr' declaration closer to usage
>>      v4:
>>      - add timeout to accept()
>>      - move 'bwlimit' to PVE::StorageTunnel and extend it
>>      - mark mtunnel(websocket) as non-proxied, and check $node accordingly
>>      v3:
>>      - handle meta and vmgenid better
>>      - handle failure of 'config' updating
>>      - move 'disk-import' and 'query-disk-import' handlers to pve-guest-common
>>      - improve tunnel exit by letting client close the connection
>>      - use strict VM config parser
>>      v2: incorporated Fabian Ebner's feedback, mainly:
>>      - use modified nbd alloc helper instead of duplicating
>>      - fix disk cleanup, also cleanup imported disks
>>      - fix firewall-conf vs firewall-config mismatch
>>      
>>      requires
>>      - pve-access-control with tunnel ticket support (already marked in d/control)
>>      - pve-access-control with Sys.Incoming privilege (not yet applied/bumped!)
>>      - pve-http-server with websocket fixes (could be done via breaks? or bumped in
>>        pve-manager..)
>> 
>>   PVE/API2/Qemu.pm | 527 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   debian/control   |   2 +-
>>   2 files changed, 527 insertions(+), 2 deletions(-)
>> 
>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> index 3ec31c26..9270ca74 100644
>> --- a/PVE/API2/Qemu.pm
>> +++ b/PVE/API2/Qemu.pm
>> @@ -4,10 +4,13 @@ use strict;
>>   use warnings;
>>   use Cwd 'abs_path';
>>   use Net::SSLeay;
>> -use POSIX;
>>   use IO::Socket::IP;
>> +use IO::Socket::UNIX;
>> +use IPC::Open3;
>> +use JSON;
>>   use URI::Escape;
>>   use Crypt::OpenSSL::Random;
>> +use Socket qw(SOCK_STREAM);
>>   
>>   use PVE::Cluster qw (cfs_read_file cfs_write_file);;
>>   use PVE::RRD;
>> @@ -38,6 +41,7 @@ use PVE::VZDump::Plugin;
>>   use PVE::DataCenterConfig;
>>   use PVE::SSHInfo;
>>   use PVE::Replication;
>> +use PVE::StorageTunnel;
>>   
>>   BEGIN {
>>       if (!$ENV{PVE_GENERATING_DOCS}) {
>> @@ -1087,6 +1091,7 @@ __PACKAGE__->register_method({
>>   	    { subdir => 'spiceproxy' },
>>   	    { subdir => 'sendkey' },
>>   	    { subdir => 'firewall' },
>> +	    { subdir => 'mtunnel' },
>>   	    ];
>>   
>>   	return $res;
>> @@ -4965,4 +4970,524 @@ __PACKAGE__->register_method({
>>   	return PVE::QemuServer::Cloudinit::dump_cloudinit_config($conf, $param->{vmid}, $param->{type});
>>       }});
>>   
>> +__PACKAGE__->register_method({
>> +    name => 'mtunnel',
>> +    path => '{vmid}/mtunnel',
>> +    method => 'POST',
>> +    protected => 1,
>> +    description => 'Migration tunnel endpoint - only for internal use by VM migration.',
>> +    permissions => {
>> +	check =>
>> +	[ 'and',
>> +	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
>> +	  ['perm', '/', [ 'Sys.Incoming' ]],
>> +	],
>> +	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
>> +	               " on '/'. Further permission checks happen during the actual migration.",
>> +    },
>> +    parameters => {
>> +	additionalProperties => 0,
>> +	properties => {
>> +	    node => get_standard_option('pve-node'),
>> +	    vmid => get_standard_option('pve-vmid'),
>> +	    storages => {
>> +		type => 'string',
>> +		format => 'pve-storage-id-list',
>> +		optional => 1,
>> +		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
>> +	    },
>> +	},
>> +    },
>> +    returns => {
>> +	additionalProperties => 0,
>> +	properties => {
>> +	    upid => { type => 'string' },
>> +	    ticket => { type => 'string' },
>> +	    socket => { type => 'string' },
>> +	},
>> +    },
>> +    code => sub {

[...]

>> +		my $cmd = delete $parsed->{cmd};
>> +		if (!defined($cmd)) {
>> +		    $reply_err->("'cmd' missing");
>> +		} elsif ($state->{exit}) {
>> +		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
>> +		    next;
>> +		} elsif (my $handler = $cmd_handlers->{$cmd}) {
>> +		    print "received command '$cmd'\n";
>> +		    eval {
>> +			if ($cmd_desc->{$cmd}) {
>> +			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
> 
> might the params be flipped here?
> 

yes! thanks for catching (and wow - our schema handling is flexible enough
that it never choked on that!).

I'll do some tests and see whether flipping breaks anything (the same is 
also true for pve-container, since this part is duplicated there).

>> +			} else {
>> +			    $parsed = {};
>> +			}
>> +			my $res = $run_locked->($handler, $parsed);
>> +			$reply_ok->($res);
>> +		    };
>> +		    $reply_err->("failed to handle '$cmd' command - $@")
>> +			if $@;
>> +		} else {
>> +		    $reply_err->("unknown command '$cmd' given");
>> +		}
>> +	    }
>> +
>> +	    if ($state->{exit}) {
>> +		print "mtunnel exited\n";
>> +	    } else {
>> +		die "mtunnel exited unexpectedly\n";
>> +	    }
>> +	};
>> +
>> +	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
>> +	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
>> +	my $upid = $rpcenv->fork_worker('qmtunnel', $vmid, $authuser, $realcmd);
>> +
>> +	return {
>> +	    ticket => $ticket,
>> +	    upid => $upid,
>> +	    socket => $socket_addr,
>> +	};
>> +    }});
>> +
>> +__PACKAGE__->register_method({
>> +    name => 'mtunnelwebsocket',
>> +    path => '{vmid}/mtunnelwebsocket',
>> +    method => 'GET',
>> +    permissions => {
>> +	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
>> +        user => 'all', # check inside
>> +    },
>> +    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
>> +    parameters => {
>> +	additionalProperties => 0,
>> +	properties => {
>> +	    node => get_standard_option('pve-node'),
>> +	    vmid => get_standard_option('pve-vmid'),
>> +	    socket => {
>> +		type => "string",
>> +		description => "unix socket to forward to",
>> +	    },
>> +	    ticket => {
>> +		type => "string",
>> +		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
>> +	    },
>> +	},
>> +    },
>> +    returns => {
>> +	type => "object",
>> +	properties => {
>> +	    port => { type => 'string', optional => 1 },
>> +	    socket => { type => 'string', optional => 1 },
>> +	},
>> +    },
>> +    code => sub {
>> +	my ($param) = @_;
>> +
>> +	my $rpcenv = PVE::RPCEnvironment::get();
>> +	my $authuser = $rpcenv->get_user();
>> +
>> +	my $nodename = PVE::INotify::nodename();
>> +	my $node = extract_param($param, 'node');
>> +
>> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
>> +	    if $node ne 'localhost' && $node ne $nodename;
>> +
>> +	my $vmid = $param->{vmid};
>> +	# check VM exists
>> +	PVE::QemuConfig->load_config($vmid);
>> +
>> +	my $socket = $param->{socket};
>> +	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
>> +
>> +	return { socket => $socket };
>> +    }});
>> +
>>   1;
>> diff --git a/debian/control b/debian/control
>> index a90ecd6f..ce469cbd 100644
>> --- a/debian/control
>> +++ b/debian/control
>> @@ -33,7 +33,7 @@ Depends: dbus,
>>            libjson-perl,
>>            libjson-xs-perl,
>>            libnet-ssleay-perl,
>> -         libpve-access-control (>= 5.0-7),
>> +         libpve-access-control (>= 7.0-7),
>>            libpve-cluster-perl,
>>            libpve-common-perl (>= 7.1-4),
>>            libpve-guest-common-perl (>= 4.1-1),
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH FOLLOW-UP v6 qemu-server 2/6] mtunnel: add API endpoints
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints Fabian Grünbichler
  2022-09-30 11:52   ` Stefan Hanreich
@ 2022-10-03 13:22   ` Fabian Grünbichler
  2022-10-18  6:23   ` [pve-devel] [PATCH " DERUMIER, Alexandre
  2 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-10-03 13:22 UTC (permalink / raw)
  To: Proxmox VE development discussion

as reported by Stefan Hanreich, the following follow-up should be 
squashed into this patch if applied:

----8<----
diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 57083601..4da37678 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -5581,7 +5581,7 @@ __PACKAGE__->register_method({
 		    print "received command '$cmd'\n";
 		    eval {
 			if ($cmd_desc->{$cmd}) {
-			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
+			    PVE::JSONSchema::validate($parsed, $cmd_desc->{$cmd});
 			} else {
 			    $parsed = {};
 			}
---->8----

On September 28, 2022 2:50 pm, Fabian Grünbichler wrote:
> the following two endpoints are used for migration on the remote side
> 
> POST /nodes/NODE/qemu/VMID/mtunnel
> 
> which creates and locks an empty VM config, and spawns the main qmtunnel
> worker which binds to a VM-specific UNIX socket.
> 
> this worker handles JSON-encoded migration commands coming in via this
> UNIX socket:
> - config (set target VM config)
> -- checks permissions for updating config
> -- strips pending changes and snapshots
> -- sets (optional) firewall config
> - disk (allocate disk for NBD migration)
> -- checks permission for target storage
> -- returns drive string for allocated volume
> - disk-import, query-disk-import, bwlimit
> -- handled by PVE::StorageTunnel
> - start (returning migration info)
> - fstrim (via agent)
> - ticket (creates a ticket for a WS connection to a specific socket)
> - resume
> - stop
> - nbdstop
> - unlock
> - quit (+ cleanup)
> 
> this worker serves as a replacement for both 'qm mtunnel' and various
> manual calls via SSH. the API call will return a ticket valid for
> connecting to the worker's UNIX socket via a websocket connection.
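
for illustration, a minimal client-side sketch of the line protocol spoken
on that socket - one JSON object per line, 'cmd' selects the handler, the
reply carries a 'success' flag. this ignores the websocket/ticket layer and
pretends direct local access to the UNIX socket, purely to show the framing
(vmid 100 is a placeholder):

    use strict;
    use warnings;
    use IO::Socket::UNIX;
    use Socket qw(SOCK_STREAM);
    use JSON;

    # socket path as returned by the 'mtunnel' API call
    my $sock = IO::Socket::UNIX->new(
        Type => SOCK_STREAM(),
        Peer => '/run/qemu-server/100.mtunnel',
    ) or die "connect failed: $!\n";

    # one JSON object per line, 'cmd' selects the handler, the remaining
    # keys are validated against that command's schema
    $sock->print(JSON::encode_json({ cmd => 'version' }) . "\n");
    $sock->flush();

    # reply is a single JSON line with a 'success' flag (and 'msg' on errors)
    my $reply = JSON::decode_json(scalar <$sock>);
    die "command failed: $reply->{msg}\n" if !$reply->{success};
    print "tunnel api version $reply->{api}, age $reply->{age}\n";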
> 
> GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket
> 
> gets called for connecting to a UNIX socket via websocket forwarding,
> i.e. once for the main command mtunnel, and once each for the memory
> migration and each NBD drive-mirror/storage migration.
> 
> access is guarded by a short-lived ticket binding the authenticated user
> to the socket path. such tickets can be requested over the main mtunnel,
> which keeps track of socket paths currently used by that
> mtunnel/migration instance.
> 
> each command handler should check privileges for the requested action if
> necessary.
> 
> both mtunnel and mtunnelwebsocket endpoints are not proxied, the
> client/caller is responsible for ensuring the passed 'node' parameter
> and the endpoint handling the call are matching.
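
a hedged sketch of what the remote caller does with these two endpoints -
host, token, fingerprint and vmid are placeholders, and it assumes
PVE::APIClient::LWP's post() helper:

    use strict;
    use warnings;
    use PVE::APIClient::LWP;

    # all connection details below are placeholders
    my $api = PVE::APIClient::LWP->new(
        protocol => 'https',
        host => 'target.example.com',
        port => 8006,
        apitoken => 'PVEAPIToken=root@pam!migrate=<secret>',
        cached_fingerprints => { 'AA:BB:...' => 1 },
    );

    # creates + locks the empty target config and spawns the qmtunnel worker
    my $res = $api->post('/nodes/localhost/qemu/200/mtunnel', { storages => 'local-zfs' });

    # $res->{ticket} and $res->{socket} are then used for the websocket upgrade:
    #   GET /nodes/localhost/qemu/200/mtunnelwebsocket?socket=...&ticket=...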
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>     v6:
>     - check for Sys.Incoming in mtunnel
>     - add definedness checks in 'config' command
>     - switch to vm_running_locally in 'resume' command
>     - moved $socket_addr closer to usage
>     v5:
>     - use vm_running_locally
>     - move '$socket_addr' declaration closer to usage
>     v4:
>     - add timeout to accept()
>     - move 'bwlimit' to PVE::StorageTunnel and extend it
>     - mark mtunnel(websocket) as non-proxied, and check $node accordingly
>     v3:
>     - handle meta and vmgenid better
>     - handle failure of 'config' updating
>     - move 'disk-import' and 'query-disk-import' handlers to pve-guest-common
>     - improve tunnel exit by letting client close the connection
>     - use strict VM config parser
>     v2: incorporated Fabian Ebner's feedback, mainly:
>     - use modified nbd alloc helper instead of duplicating
>     - fix disk cleanup, also cleanup imported disks
>     - fix firewall-conf vs firewall-config mismatch
>     
>     requires
>     - pve-access-control with tunnel ticket support (already marked in d/control)
>     - pve-access-control with Sys.Incoming privilege (not yet applied/bumped!)
>     - pve-http-server with websocket fixes (could be done via breaks? or bumped in
>       pve-manager..)
> 
>  PVE/API2/Qemu.pm | 527 ++++++++++++++++++++++++++++++++++++++++++++++-
>  debian/control   |   2 +-
>  2 files changed, 527 insertions(+), 2 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 3ec31c26..9270ca74 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4,10 +4,13 @@ use strict;
>  use warnings;
>  use Cwd 'abs_path';
>  use Net::SSLeay;
> -use POSIX;
>  use IO::Socket::IP;
> +use IO::Socket::UNIX;
> +use IPC::Open3;
> +use JSON;
>  use URI::Escape;
>  use Crypt::OpenSSL::Random;
> +use Socket qw(SOCK_STREAM);
>  
>  use PVE::Cluster qw (cfs_read_file cfs_write_file);;
>  use PVE::RRD;
> @@ -38,6 +41,7 @@ use PVE::VZDump::Plugin;
>  use PVE::DataCenterConfig;
>  use PVE::SSHInfo;
>  use PVE::Replication;
> +use PVE::StorageTunnel;
>  
>  BEGIN {
>      if (!$ENV{PVE_GENERATING_DOCS}) {
> @@ -1087,6 +1091,7 @@ __PACKAGE__->register_method({
>  	    { subdir => 'spiceproxy' },
>  	    { subdir => 'sendkey' },
>  	    { subdir => 'firewall' },
> +	    { subdir => 'mtunnel' },
>  	    ];
>  
>  	return $res;
> @@ -4965,4 +4970,524 @@ __PACKAGE__->register_method({
>  	return PVE::QemuServer::Cloudinit::dump_cloudinit_config($conf, $param->{vmid}, $param->{type});
>      }});
>  
> +__PACKAGE__->register_method({
> +    name => 'mtunnel',
> +    path => '{vmid}/mtunnel',
> +    method => 'POST',
> +    protected => 1,
> +    description => 'Migration tunnel endpoint - only for internal use by VM migration.',
> +    permissions => {
> +	check =>
> +	[ 'and',
> +	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
> +	  ['perm', '/', [ 'Sys.Incoming' ]],
> +	],
> +	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
> +	               " on '/'. Further permission checks happen during the actual migration.",
> +    },
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    storages => {
> +		type => 'string',
> +		format => 'pve-storage-id-list',
> +		optional => 1,
> +		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
> +	    },
> +	},
> +    },
> +    returns => {
> +	additionalProperties => 0,
> +	properties => {
> +	    upid => { type => 'string' },
> +	    ticket => { type => 'string' },
> +	    socket => { type => 'string' },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $node = extract_param($param, 'node');
> +	my $vmid = extract_param($param, 'vmid');
> +
> +	my $storages = extract_param($param, 'storages');
> +
> +	my $nodename = PVE::INotify::nodename();
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	$node = $nodename;
> +
> +	my $storecfg = PVE::Storage::config();
> +	foreach my $storeid (PVE::Tools::split_list($storages)) {
> +	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
> +	}
> +
> +	PVE::Cluster::check_cfs_quorum();
> +
> +	my $lock = 'create';
> +	eval { PVE::QemuConfig->create_and_lock_config($vmid, 0, $lock); };
> +
> +	raise_param_exc({ vmid => "unable to create empty VM config - $@"})
> +	    if $@;
> +
> +	my $realcmd = sub {
> +	    my $state = {
> +		storecfg => PVE::Storage::config(),
> +		lock => $lock,
> +		vmid => $vmid,
> +	    };
> +
> +	    my $run_locked = sub {
> +		my ($code, $params) = @_;
> +		return PVE::QemuConfig->lock_config($state->{vmid}, sub {
> +		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +
> +		    $state->{conf} = $conf;
> +
> +		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
> +			if $state->{lock} && !PVE::QemuConfig->has_lock($conf, $state->{lock});
> +
> +		    return $code->($params);
> +		});
> +	    };
> +
> +	    my $cmd_desc = {
> +		config => {
> +		    conf => {
> +			type => 'string',
> +			description => 'Full VM config, adapted for target cluster/node',
> +		    },
> +		    'firewall-config' => {
> +			type => 'string',
> +			description => 'VM firewall config',
> +			optional => 1,
> +		    },
> +		},
> +		disk => {
> +		    format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
> +		    storage => {
> +			type => 'string',
> +			format => 'pve-storage-id',
> +		    },
> +		    drive => {
> +			type => 'object',
> +			description => 'parsed drive information without volid and format',
> +		    },
> +		},
> +		start => {
> +		    start_params => {
> +			type => 'object',
> +			description => 'params passed to vm_start_nolock',
> +		    },
> +		    migrate_opts => {
> +			type => 'object',
> +			description => 'migrate_opts passed to vm_start_nolock',
> +		    },
> +		},
> +		ticket => {
> +		    path => {
> +			type => 'string',
> +			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
> +		    },
> +		},
> +		quit => {
> +		    cleanup => {
> +			type => 'boolean',
> +			description => 'remove VM config and disks, aborting migration',
> +			default => 0,
> +		    },
> +		},
> +		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
> +		'query-disk-import' => $PVE::StorageTunnel::cmd_schema->{'query-disk-import'},
> +		bwlimit => $PVE::StorageTunnel::cmd_schema->{bwlimit},
> +	    };
> +
> +	    my $cmd_handlers = {
> +		'version' => sub {
> +		    # compared against other end's version
> +		    # bump/reset for breaking changes
> +		    # bump/bump for opt-in changes
> +		    return {
> +			api => 2,
> +			age => 0,
> +		    };
> +		},
> +		'config' => sub {
> +		    my ($params) = @_;
> +
> +		    # parse and write out VM FW config if given
> +		    if (my $fw_conf = $params->{'firewall-config'}) {
> +			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
> +
> +			my $empty_conf = {
> +			    rules => [],
> +			    options => {},
> +			    aliases => {},
> +			    ipset => {} ,
> +			    ipset_comments => {},
> +			};
> +			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
> +
> +			# TODO: add flag for strict parsing?
> +			# TODO: add import sub that does all this given raw content?
> +			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
> +			$vmfw_conf->{vmid} = $state->{vmid};
> +			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
> +
> +			$state->{cleanup}->{fw} = 1;
> +		    }
> +
> +		    my $conf_fn = "incoming/qemu-server/$state->{vmid}.conf";
> +		    my $new_conf = PVE::QemuServer::parse_vm_config($conf_fn, $params->{conf}, 1);
> +		    delete $new_conf->{lock};
> +		    delete $new_conf->{digest};
> +
> +		    # TODO handle properly?
> +		    delete $new_conf->{snapshots};
> +		    delete $new_conf->{parent};
> +		    delete $new_conf->{pending};
> +
> +		    # not handled by update_vm_api
> +		    my $vmgenid = delete $new_conf->{vmgenid};
> +		    my $meta = delete $new_conf->{meta};
> +
> +		    $new_conf->{vmid} = $state->{vmid};
> +		    $new_conf->{node} = $node;
> +
> +		    PVE::QemuConfig->remove_lock($state->{vmid}, 'create');
> +
> +		    eval {
> +			$update_vm_api->($new_conf, 1);
> +		    };
> +		    if (my $err = $@) {
> +			# revert to locked previous config
> +			my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +			$conf->{lock} = 'create';
> +			PVE::QemuConfig->write_config($state->{vmid}, $conf);
> +
> +			die $err;
> +		    }
> +
> +		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +		    $conf->{lock} = 'migrate';
> +		    $conf->{vmgenid} = $vmgenid if defined($vmgenid);
> +		    $conf->{meta} = $meta if defined($meta);
> +		    PVE::QemuConfig->write_config($state->{vmid}, $conf);
> +
> +		    $state->{lock} = 'migrate';
> +
> +		    return;
> +		},
> +		'bwlimit' => sub {
> +		    my ($params) = @_;
> +		    return PVE::StorageTunnel::handle_bwlimit($params);
> +		},
> +		'disk' => sub {
> +		    my ($params) = @_;
> +
> +		    my $format = $params->{format};
> +		    my $storeid = $params->{storage};
> +		    my $drive = $params->{drive};
> +
> +		    $check_storage_access_migrate->($rpcenv, $authuser, $state->{storecfg}, $storeid, $node);
> +
> +		    my $storagemap = {
> +			default => $storeid,
> +		    };
> +
> +		    my $source_volumes = {
> +			'disk' => [
> +			    undef,
> +			    $storeid,
> +			    undef,
> +			    $drive,
> +			    0,
> +			    $format,
> +			],
> +		    };
> +
> +		    my $res = PVE::QemuServer::vm_migrate_alloc_nbd_disks($state->{storecfg}, $state->{vmid}, $source_volumes, $storagemap);
> +		    if (defined($res->{disk})) {
> +			$state->{cleanup}->{volumes}->{$res->{disk}->{volid}} = 1;
> +			return $res->{disk};
> +		    } else {
> +			die "failed to allocate NBD disk..\n";
> +		    }
> +		},
> +		'disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    $check_storage_access_migrate->(
> +			$rpcenv,
> +			$authuser,
> +			$state->{storecfg},
> +			$params->{storage},
> +			$node
> +		    );
> +
> +		    $params->{unix} = "/run/qemu-server/$state->{vmid}.storage";
> +
> +		    return PVE::StorageTunnel::handle_disk_import($state, $params);
> +		},
> +		'query-disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
> +		},
> +		'start' => sub {
> +		    my ($params) = @_;
> +
> +		    my $info = PVE::QemuServer::vm_start_nolock(
> +			$state->{storecfg},
> +			$state->{vmid},
> +			$state->{conf},
> +			$params->{start_params},
> +			$params->{migrate_opts},
> +		    );
> +
> +
> +		    if ($info->{migrate}->{proto} ne 'unix') {
> +			PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> +			die "migration over non-UNIX sockets not possible\n";
> +		    }
> +
> +		    my $socket = $info->{migrate}->{addr};
> +		    chown $state->{socket_uid}, -1, $socket;
> +		    $state->{sockets}->{$socket} = 1;
> +
> +		    my $unix_sockets = $info->{migrate}->{unix_sockets};
> +		    foreach my $socket (@$unix_sockets) {
> +			chown $state->{socket_uid}, -1, $socket;
> +			$state->{sockets}->{$socket} = 1;
> +		    }
> +		    return $info;
> +		},
> +		'fstrim' => sub {
> +		    if (PVE::QemuServer::qga_check_running($state->{vmid})) {
> +			eval { mon_cmd($state->{vmid}, "guest-fstrim") };
> +			warn "fstrim failed: $@\n" if $@;
> +		    }
> +		    return;
> +		},
> +		'stop' => sub {
> +		    PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> +		    return;
> +		},
> +		'nbdstop' => sub {
> +		    PVE::QemuServer::nbd_stop($state->{vmid});
> +		    return;
> +		},
> +		'resume' => sub {
> +		    if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
> +			PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
> +		    } else {
> +			die "VM $state->{vmid} not running\n";
> +		    }
> +		    return;
> +		},
> +		'unlock' => sub {
> +		    PVE::QemuConfig->remove_lock($state->{vmid}, $state->{lock});
> +		    delete $state->{lock};
> +		    return;
> +		},
> +		'ticket' => sub {
> +		    my ($params) = @_;
> +
> +		    my $path = $params->{path};
> +
> +		    die "Not allowed to generate ticket for unknown socket '$path'\n"
> +			if !defined($state->{sockets}->{$path});
> +
> +		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
> +		},
> +		'quit' => sub {
> +		    my ($params) = @_;
> +
> +		    if ($params->{cleanup}) {
> +			if ($state->{cleanup}->{fw}) {
> +			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
> +			}
> +
> +			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
> +			    print "freeing volume '$volid' as part of cleanup\n";
> +			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
> +			    warn $@ if $@;
> +			}
> +
> +			PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
> +		    }
> +
> +		    print "switching to exit-mode, waiting for client to disconnect\n";
> +		    $state->{exit} = 1;
> +		    return;
> +		},
> +	    };
> +
> +	    $run_locked->(sub {
> +		my $socket_addr = "/run/qemu-server/$state->{vmid}.mtunnel";
> +		unlink $socket_addr;
> +
> +		$state->{socket} = IO::Socket::UNIX->new(
> +	            Type => SOCK_STREAM(),
> +		    Local => $socket_addr,
> +		    Listen => 1,
> +		);
> +
> +		$state->{socket_uid} = getpwnam('www-data')
> +		    or die "Failed to resolve user 'www-data' to numeric UID\n";
> +		chown $state->{socket_uid}, -1, $socket_addr;
> +	    });
> +
> +	    print "mtunnel started\n";
> +
> +	    my $conn = eval { PVE::Tools::run_with_timeout(300, sub { $state->{socket}->accept() }) };
> +	    if ($@) {
> +		warn "Failed to accept tunnel connection - $@\n";
> +
> +		warn "Removing tunnel socket..\n";
> +		unlink $state->{socket};
> +
> +		warn "Removing temporary VM config..\n";
> +		$run_locked->(sub {
> +		    PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
> +		});
> +
> +		die "Exiting mtunnel\n";
> +	    }
> +
> +	    $state->{conn} = $conn;
> +
> +	    my $reply_err = sub {
> +		my ($msg) = @_;
> +
> +		my $reply = JSON::encode_json({
> +		    success => JSON::false,
> +		    msg => $msg,
> +		});
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    my $reply_ok = sub {
> +		my ($res) = @_;
> +
> +		$res->{success} = JSON::true;
> +		my $reply = JSON::encode_json($res);
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    while (my $line = <$conn>) {
> +		chomp $line;
> +
> +		# untaint, we validate below if needed
> +		($line) = $line =~ /^(.*)$/;
> +		my $parsed = eval { JSON::decode_json($line) };
> +		if ($@) {
> +		    $reply_err->("failed to parse command - $@");
> +		    next;
> +		}
> +
> +		my $cmd = delete $parsed->{cmd};
> +		if (!defined($cmd)) {
> +		    $reply_err->("'cmd' missing");
> +		} elsif ($state->{exit}) {
> +		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
> +		    next;
> +		} elsif (my $handler = $cmd_handlers->{$cmd}) {
> +		    print "received command '$cmd'\n";
> +		    eval {
> +			if ($cmd_desc->{$cmd}) {
> +			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
> +			} else {
> +			    $parsed = {};
> +			}
> +			my $res = $run_locked->($handler, $parsed);
> +			$reply_ok->($res);
> +		    };
> +		    $reply_err->("failed to handle '$cmd' command - $@")
> +			if $@;
> +		} else {
> +		    $reply_err->("unknown command '$cmd' given");
> +		}
> +	    }
> +
> +	    if ($state->{exit}) {
> +		print "mtunnel exited\n";
> +	    } else {
> +		die "mtunnel exited unexpectedly\n";
> +	    }
> +	};
> +
> +	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
> +	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
> +	my $upid = $rpcenv->fork_worker('qmtunnel', $vmid, $authuser, $realcmd);
> +
> +	return {
> +	    ticket => $ticket,
> +	    upid => $upid,
> +	    socket => $socket_addr,
> +	};
> +    }});
> +
> +__PACKAGE__->register_method({
> +    name => 'mtunnelwebsocket',
> +    path => '{vmid}/mtunnelwebsocket',
> +    method => 'GET',
> +    permissions => {
> +	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
> +        user => 'all', # check inside
> +    },
> +    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    socket => {
> +		type => "string",
> +		description => "unix socket to forward to",
> +	    },
> +	    ticket => {
> +		type => "string",
> +		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => "object",
> +	properties => {
> +	    port => { type => 'string', optional => 1 },
> +	    socket => { type => 'string', optional => 1 },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $nodename = PVE::INotify::nodename();
> +	my $node = extract_param($param, 'node');
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	my $vmid = $param->{vmid};
> +	# check VM exists
> +	PVE::QemuConfig->load_config($vmid);
> +
> +	my $socket = $param->{socket};
> +	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
> +
> +	return { socket => $socket };
> +    }});
> +
>  1;
> diff --git a/debian/control b/debian/control
> index a90ecd6f..ce469cbd 100644
> --- a/debian/control
> +++ b/debian/control
> @@ -33,7 +33,7 @@ Depends: dbus,
>           libjson-perl,
>           libjson-xs-perl,
>           libnet-ssleay-perl,
> -         libpve-access-control (>= 5.0-7),
> +         libpve-access-control (>= 7.0-7),
>           libpve-cluster-perl,
>           libpve-common-perl (>= 7.1-4),
>           libpve-guest-common-perl (>= 4.1-1),
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH FOLLOW-UP v6 container 1/3] migration: add remote migration
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 1/3] migration: add remote migration Fabian Grünbichler
@ 2022-10-03 13:22   ` Fabian Grünbichler
  0 siblings, 0 replies; 29+ messages in thread
From: Fabian Grünbichler @ 2022-10-03 13:22 UTC (permalink / raw)
  To: Proxmox VE development discussion

same as in qemu-server, the following should be squashed into this 
patch/commit:

----8<----
diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
index 4e21be4..3573b59 100644
--- a/src/PVE/API2/LXC.pm
+++ b/src/PVE/API2/LXC.pm
@@ -2870,7 +2870,7 @@ __PACKAGE__->register_method({
 		    print "received command '$cmd'\n";
 		    eval {
 			if ($cmd_desc->{$cmd}) {
-			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
+			    PVE::JSONSchema::validate($parsed, $cmd_desc->{$cmd});
 			} else {
 			    $parsed = {};
 			}
---->8----
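
as an aside, compatibility between both tunnel ends is decided from the
'version' reply; a small sketch with hypothetical numbers, mirroring the
check in prepare() further below:

    # hypothetical 'version' reply from the remote end: api => 3, age => 1,
    # i.e. it speaks version 3 but still accepts version-2 peers
    my ($remote_api, $remote_age) = (3, 1);
    my $local = 2;                        # $WS_TUNNEL_VERSION on this end
    my $min_remote = $remote_api - $remote_age;

    die "Remote tunnel endpoint not compatible, upgrade required\n"
        if $local < $min_remote;          # we are too old for the remote
    die "Remote tunnel endpoint too old, upgrade required\n"
        if $local > $remote_api;          # the remote is too old for us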

On September 28, 2022 2:50 pm, Fabian Grünbichler wrote:
> modelled after the VM migration, but folded into a single commit since
> the actual migration changes are a lot smaller here.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>     v6:
>     - check for Sys.Incoming in mtunnel API endpoint
>     - mark as experimental
>     - test_mp fix for non-snapshot calls
>     
>     new in v5 - PoC to ensure helpers and abstractions are re-usable
>     
>     requires bumped pve-storage to avoid tainted issue
> 
>  src/PVE/API2/LXC.pm    | 635 +++++++++++++++++++++++++++++++++++++++++
>  src/PVE/LXC/Migrate.pm | 245 +++++++++++++---
>  2 files changed, 838 insertions(+), 42 deletions(-)
> 
> diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
> index 589f96f..4e21be4 100644
> --- a/src/PVE/API2/LXC.pm
> +++ b/src/PVE/API2/LXC.pm
> @@ -3,6 +3,8 @@ package PVE::API2::LXC;
>  use strict;
>  use warnings;
>  
> +use Socket qw(SOCK_STREAM);
> +
>  use PVE::SafeSyslog;
>  use PVE::Tools qw(extract_param run_command);
>  use PVE::Exception qw(raise raise_param_exc raise_perm_exc);
> @@ -1089,6 +1091,174 @@ __PACKAGE__->register_method ({
>      }});
>  
>  
> +__PACKAGE__->register_method({
> +    name => 'remote_migrate_vm',
> +    path => '{vmid}/remote_migrate',
> +    method => 'POST',
> +    protected => 1,
> +    proxyto => 'node',
> +    description => "Migrate the container to another cluster. Creates a new migration task. EXPERIMENTAL feature!",
> +    permissions => {
> +	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
> +    },
> +    parameters => {
> +    	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::LXC::complete_ctid }),
> +	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
> +	    'target-endpoint' => get_standard_option('proxmox-remote', {
> +		description => "Remote target endpoint",
> +	    }),
> +	    online => {
> +		type => 'boolean',
> +		description => "Use online/live migration.",
> +		optional => 1,
> +	    },
> +	    restart => {
> +		type => 'boolean',
> +		description => "Use restart migration",
> +		optional => 1,
> +	    },
> +	    timeout => {
> +		type => 'integer',
> +		description => "Timeout in seconds for shutdown for restart migration",
> +		optional => 1,
> +		default => 180,
> +	    },
> +	    delete => {
> +		type => 'boolean',
> +		description => "Delete the original CT and related data after successful migration. By default the original CT is kept on the source cluster in a stopped state.",
> +		optional => 1,
> +		default => 0,
> +	    },
> +	    'target-storage' => get_standard_option('pve-targetstorage', {
> +		optional => 0,
> +	    }),
> +	    'target-bridge' => {
> +		type => 'string',
> +		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
> +		format => 'bridge-pair-list',
> +	    },
> +	    bwlimit => {
> +		description => "Override I/O bandwidth limit (in KiB/s).",
> +		optional => 1,
> +		type => 'number',
> +		minimum => '0',
> +		default => 'migrate limit from datacenter or storage config',
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => 'string',
> +	description => "the task ID.",
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $source_vmid = extract_param($param, 'vmid');
> +	my $target_endpoint = extract_param($param, 'target-endpoint');
> +	my $target_vmid = extract_param($param, 'target-vmid') // $source_vmid;
> +
> +	my $delete = extract_param($param, 'delete') // 0;
> +
> +	PVE::Cluster::check_cfs_quorum();
> +
> +	# test if CT exists
> +	my $conf = PVE::LXC::Config->load_config($source_vmid);
> +	PVE::LXC::Config->check_lock($conf);
> +
> +	# try to detect errors early
> +	if (PVE::LXC::check_running($source_vmid)) {
> +	    die "can't migrate running container without --online or --restart\n"
> +		if !$param->{online} && !$param->{restart};
> +	}
> +
> +	raise_param_exc({ vmid => "cannot migrate HA-managed CT to remote cluster" })
> +	    if PVE::HA::Config::vm_is_ha_managed($source_vmid);
> +
> +	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
> +
> +	# TODO: move this as helper somewhere appropriate?
> +	my $conn_args = {
> +	    protocol => 'https',
> +	    host => $remote->{host},
> +	    port => $remote->{port} // 8006,
> +	    apitoken => $remote->{apitoken},
> +	};
> +
> +	my $fp;
> +	if ($fp = $remote->{fingerprint}) {
> +	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 };
> +	}
> +
> +	print "Establishing API connection with remote at '$remote->{host}'\n";
> +
> +	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
> +
> +	if (!defined($fp)) {
> +	    my $cert_info = $api_client->get("/nodes/localhost/certificates/info");
> +	    foreach my $cert (@$cert_info) {
> +		my $filename = $cert->{filename};
> +		next if $filename ne 'pveproxy-ssl.pem' && $filename ne 'pve-ssl.pem';
> +		$fp = $cert->{fingerprint} if !$fp || $filename eq 'pveproxy-ssl.pem';
> +	    }
> +	    $conn_args->{cached_fingerprints} = { uc($fp) => 1 }
> +		if defined($fp);
> +	}
> +
> +	my $storecfg = PVE::Storage::config();
> +	my $target_storage = extract_param($param, 'target-storage');
> +	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
> +	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
> +	    if $@;
> +
> +	my $target_bridge = extract_param($param, 'target-bridge');
> +	my $bridgemap = eval { PVE::JSONSchema::parse_idmap($target_bridge, 'pve-bridge-id') };
> +	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
> +	    if $@;
> +
> +	die "remote migration requires explicit storage mapping!\n"
> +	    if $storagemap->{identity};
> +
> +	$param->{storagemap} = $storagemap;
> +	$param->{bridgemap} = $bridgemap;
> +	$param->{remote} = {
> +	    conn => $conn_args, # re-use fingerprint for tunnel
> +	    client => $api_client,
> +	    vmid => $target_vmid,
> +	};
> +	$param->{migration_type} = 'websocket';
> +	$param->{delete} = $delete if $delete;
> +
> +	my $cluster_status = $api_client->get("/cluster/status");
> +	my $target_node;
> +	foreach my $entry (@$cluster_status) {
> +	    next if $entry->{type} ne 'node';
> +	    if ($entry->{local}) {
> +		$target_node = $entry->{name};
> +		last;
> +	    }
> +	}
> +
> +	die "couldn't determine endpoint's node name\n"
> +	    if !defined($target_node);
> +
> +	my $realcmd = sub {
> +	    PVE::LXC::Migrate->migrate($target_node, $remote->{host}, $source_vmid, $param);
> +	};
> +
> +	my $worker = sub {
> +	    return PVE::GuestHelpers::guest_migration_lock($source_vmid, 10, $realcmd);
> +	};
> +
> +	return $rpcenv->fork_worker('vzmigrate', $source_vmid, $authuser, $worker);
> +    }});
> +
> +
>  __PACKAGE__->register_method({
>      name => 'migrate_vm',
>      path => '{vmid}/migrate',
> @@ -2318,4 +2488,469 @@ __PACKAGE__->register_method({
>  	return PVE::GuestHelpers::config_with_pending_array($conf, $pending_delete_hash);
>      }});
>  
> +__PACKAGE__->register_method({
> +    name => 'mtunnel',
> +    path => '{vmid}/mtunnel',
> +    method => 'POST',
> +    protected => 1,
> +    description => 'Migration tunnel endpoint - only for internal use by CT migration.',
> +    permissions => {
> +	check =>
> +	[ 'and',
> +	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
> +	  ['perm', '/', [ 'Sys.Incoming' ]],
> +	],
> +	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
> +	               " on '/'. Further permission checks happen during the actual migration.",
> +    },
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    storages => {
> +		type => 'string',
> +		format => 'pve-storage-id-list',
> +		optional => 1,
> +		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
> +	    },
> +	    bridges => {
> +		type => 'string',
> +		format => 'pve-bridge-id-list',
> +		optional => 1,
> +		description => 'List of network bridges to check availability. Will be checked again for actually used bridges during migration.',
> +	    },
> +	},
> +    },
> +    returns => {
> +	additionalProperties => 0,
> +	properties => {
> +	    upid => { type => 'string' },
> +	    ticket => { type => 'string' },
> +	    socket => { type => 'string' },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $node = extract_param($param, 'node');
> +	my $vmid = extract_param($param, 'vmid');
> +
> +	my $storages = extract_param($param, 'storages');
> +	my $bridges = extract_param($param, 'bridges');
> +
> +	my $nodename = PVE::INotify::nodename();
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	$node = $nodename;
> +
> +	my $storecfg = PVE::Storage::config();
> +	foreach my $storeid (PVE::Tools::split_list($storages)) {
> +	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
> +	}
> +
> +	foreach my $bridge (PVE::Tools::split_list($bridges)) {
> +	    PVE::Network::read_bridge_mtu($bridge);
> +	}
> +
> +	PVE::Cluster::check_cfs_quorum();
> +
> +	my $socket_addr = "/run/pve/ct-$vmid.mtunnel";
> +
> +	my $lock = 'create';
> +	eval { PVE::LXC::Config->create_and_lock_config($vmid, 0, $lock); };
> +
> +	raise_param_exc({ vmid => "unable to create empty CT config - $@"})
> +	    if $@;
> +
> +	my $realcmd = sub {
> +	    my $state = {
> +		storecfg => PVE::Storage::config(),
> +		lock => $lock,
> +		vmid => $vmid,
> +	    };
> +
> +	    my $run_locked = sub {
> +		my ($code, $params) = @_;
> +		return PVE::LXC::Config->lock_config($state->{vmid}, sub {
> +		    my $conf = PVE::LXC::Config->load_config($state->{vmid});
> +
> +		    $state->{conf} = $conf;
> +
> +		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
> +			if $state->{lock} && !PVE::LXC::Config->has_lock($conf, $state->{lock});
> +
> +		    return $code->($params);
> +		});
> +	    };
> +
> +	    my $cmd_desc = {
> +		config => {
> +		    conf => {
> +			type => 'string',
> +			description => 'Full CT config, adapted for target cluster/node',
> +		    },
> +		    'firewall-config' => {
> +			type => 'string',
> +			description => 'CT firewall config',
> +			optional => 1,
> +		    },
> +		},
> +		ticket => {
> +		    path => {
> +			type => 'string',
> +			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
> +		    },
> +		},
> +		quit => {
> +		    cleanup => {
> +			type => 'boolean',
> +			description => 'remove CT config and volumes, aborting migration',
> +			default => 0,
> +		    },
> +		},
> +		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
> +		'query-disk-import' => $PVE::StorageTunnel::cmd_schema->{'query-disk-import'},
> +		bwlimit => $PVE::StorageTunnel::cmd_schema->{bwlimit},
> +	    };
> +
> +	    my $cmd_handlers = {
> +		'version' => sub {
> +		    # compared against other end's version
> +		    # bump/reset for breaking changes
> +		    # bump/bump for opt-in changes
> +		    return {
> +			api => $PVE::LXC::Migrate::WS_TUNNEL_VERSION,
> +			age => 0,
> +		    };
> +		},
> +		'config' => sub {
> +		    my ($params) = @_;
> +
> +		    # parse and write out VM FW config if given
> +		    if (my $fw_conf = $params->{'firewall-config'}) {
> +			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
> +
> +			my $empty_conf = {
> +			    rules => [],
> +			    options => {},
> +			    aliases => {},
> +			    ipset => {} ,
> +			    ipset_comments => {},
> +			};
> +			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
> +
> +			# TODO: add flag for strict parsing?
> +			# TODO: add import sub that does all this given raw content?
> +			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
> +			$vmfw_conf->{vmid} = $state->{vmid};
> +			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
> +
> +			$state->{cleanup}->{fw} = 1;
> +		    }
> +
> +		    my $conf_fn = "incoming/lxc/$state->{vmid}.conf";
> +		    my $new_conf = PVE::LXC::Config::parse_pct_config($conf_fn, $params->{conf}, 1);
> +		    delete $new_conf->{lock};
> +		    delete $new_conf->{digest};
> +
> +		    my $unprivileged = delete $new_conf->{unprivileged};
> +		    my $arch = delete $new_conf->{arch};
> +
> +		    # TODO handle properly?
> +		    delete $new_conf->{snapshots};
> +		    delete $new_conf->{parent};
> +		    delete $new_conf->{pending};
> +		    delete $new_conf->{lxc};
> +
> +		    PVE::LXC::Config->remove_lock($state->{vmid}, 'create');
> +
> +		    eval {
> +			my $conf = {
> +			    unprivileged => $unprivileged,
> +			    arch => $arch,
> +			};
> +			PVE::LXC::check_ct_modify_config_perm(
> +			    $rpcenv,
> +			    $authuser,
> +			    $state->{vmid},
> +			    undef,
> +			    $conf,
> +			    $new_conf,
> +			    undef,
> +			    $unprivileged,
> +			);
> +			my $errors = PVE::LXC::Config->update_pct_config(
> +			    $state->{vmid},
> +			    $conf,
> +			    0,
> +			    $new_conf,
> +			    [],
> +			    [],
> +			);
> +			raise_param_exc($errors) if scalar(keys %$errors);
> +			PVE::LXC::Config->write_config($state->{vmid}, $conf);
> +			PVE::LXC::update_lxc_config($vmid, $conf);
> +		    };
> +		    if (my $err = $@) {
> +			# revert to locked previous config
> +			my $conf = PVE::LXC::Config->load_config($state->{vmid});
> +			$conf->{lock} = 'create';
> +			PVE::LXC::Config->write_config($state->{vmid}, $conf);
> +
> +			die $err;
> +		    }
> +
> +		    my $conf = PVE::LXC::Config->load_config($state->{vmid});
> +		    $conf->{lock} = 'migrate';
> +		    PVE::LXC::Config->write_config($state->{vmid}, $conf);
> +
> +		    $state->{lock} = 'migrate';
> +
> +		    return;
> +		},
> +		'bwlimit' => sub {
> +		    my ($params) = @_;
> +		    return PVE::StorageTunnel::handle_bwlimit($params);
> +		},
> +		'disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    $check_storage_access_migrate->(
> +			$rpcenv,
> +			$authuser,
> +			$state->{storecfg},
> +			$params->{storage},
> +			$node
> +		    );
> +
> +		    $params->{unix} = "/run/pve/ct-$state->{vmid}.storage";
> +
> +		    return PVE::StorageTunnel::handle_disk_import($state, $params);
> +		},
> +		'query-disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
> +		},
> +		'unlock' => sub {
> +		    PVE::LXC::Config->remove_lock($state->{vmid}, $state->{lock});
> +		    delete $state->{lock};
> +		    return;
> +		},
> +		'start' => sub {
> +		    PVE::LXC::vm_start(
> +			$state->{vmid},
> +			$state->{conf},
> +			0
> +		    );
> +
> +		    return;
> +		},
> +		'stop' => sub {
> +		    PVE::LXC::vm_stop($state->{vmid}, 1, 10, 1);
> +		    return;
> +		},
> +		'ticket' => sub {
> +		    my ($params) = @_;
> +
> +		    my $path = $params->{path};
> +
> +		    die "Not allowed to generate ticket for unknown socket '$path'\n"
> +			if !defined($state->{sockets}->{$path});
> +
> +		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
> +		},
> +		'quit' => sub {
> +		    my ($params) = @_;
> +
> +		    if ($params->{cleanup}) {
> +			if ($state->{cleanup}->{fw}) {
> +			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
> +			}
> +
> +			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
> +			    print "freeing volume '$volid' as part of cleanup\n";
> +			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
> +			    warn $@ if $@;
> +			}
> +
> +			PVE::LXC::destroy_lxc_container(
> +			    $state->{storecfg},
> +			    $state->{vmid},
> +			    $state->{conf},
> +			    undef,
> +			    0,
> +			);
> +		    }
> +
> +		    print "switching to exit-mode, waiting for client to disconnect\n";
> +		    $state->{exit} = 1;
> +		    return;
> +		},
> +	    };
> +
> +	    $run_locked->(sub {
> +		my $socket_addr = "/run/pve/ct-$state->{vmid}.mtunnel";
> +		unlink $socket_addr;
> +
> +		$state->{socket} = IO::Socket::UNIX->new(
> +	            Type => SOCK_STREAM(),
> +		    Local => $socket_addr,
> +		    Listen => 1,
> +		);
> +
> +		$state->{socket_uid} = getpwnam('www-data')
> +		    or die "Failed to resolve user 'www-data' to numeric UID\n";
> +		chown $state->{socket_uid}, -1, $socket_addr;
> +	    });
> +
> +	    print "mtunnel started\n";
> +
> +	    my $conn = eval { PVE::Tools::run_with_timeout(300, sub { $state->{socket}->accept() }) };
> +	    if ($@) {
> +		warn "Failed to accept tunnel connection - $@\n";
> +
> +		warn "Removing tunnel socket..\n";
> +		unlink $state->{socket};
> +
> +		warn "Removing temporary VM config..\n";
> +		$run_locked->(sub {
> +		    PVE::LXC::destroy_config($state->{vmid});
> +		});
> +
> +		die "Exiting mtunnel\n";
> +	    }
> +
> +	    $state->{conn} = $conn;
> +
> +	    my $reply_err = sub {
> +		my ($msg) = @_;
> +
> +		my $reply = JSON::encode_json({
> +		    success => JSON::false,
> +		    msg => $msg,
> +		});
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    my $reply_ok = sub {
> +		my ($res) = @_;
> +
> +		$res->{success} = JSON::true;
> +		my $reply = JSON::encode_json($res);
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    while (my $line = <$conn>) {
> +		chomp $line;
> +
> +		# untaint, we validate below if needed
> +		($line) = $line =~ /^(.*)$/;
> +		my $parsed = eval { JSON::decode_json($line) };
> +		if ($@) {
> +		    $reply_err->("failed to parse command - $@");
> +		    next;
> +		}
> +
> +		my $cmd = delete $parsed->{cmd};
> +		if (!defined($cmd)) {
> +		    $reply_err->("'cmd' missing");
> +		} elsif ($state->{exit}) {
> +		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
> +		    next;
> +		} elsif (my $handler = $cmd_handlers->{$cmd}) {
> +		    print "received command '$cmd'\n";
> +		    eval {
> +			if ($cmd_desc->{$cmd}) {
> +			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
> +			} else {
> +			    $parsed = {};
> +			}
> +			my $res = $run_locked->($handler, $parsed);
> +			$reply_ok->($res);
> +		    };
> +		    $reply_err->("failed to handle '$cmd' command - $@")
> +			if $@;
> +		} else {
> +		    $reply_err->("unknown command '$cmd' given");
> +		}
> +	    }
> +
> +	    if ($state->{exit}) {
> +		print "mtunnel exited\n";
> +	    } else {
> +		die "mtunnel exited unexpectedly\n";
> +	    }
> +	};
> +
> +	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
> +	my $upid = $rpcenv->fork_worker('vzmtunnel', $vmid, $authuser, $realcmd);
> +
> +	return {
> +	    ticket => $ticket,
> +	    upid => $upid,
> +	    socket => $socket_addr,
> +	};
> +    }});
> +
> +__PACKAGE__->register_method({
> +    name => 'mtunnelwebsocket',
> +    path => '{vmid}/mtunnelwebsocket',
> +    method => 'GET',
> +    permissions => {
> +	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
> +        user => 'all', # check inside
> +    },
> +    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    socket => {
> +		type => "string",
> +		description => "unix socket to forward to",
> +	    },
> +	    ticket => {
> +		type => "string",
> +		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => "object",
> +	properties => {
> +	    port => { type => 'string', optional => 1 },
> +	    socket => { type => 'string', optional => 1 },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $nodename = PVE::INotify::nodename();
> +	my $node = extract_param($param, 'node');
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	my $vmid = $param->{vmid};
> +	# check VM exists
> +	PVE::LXC::Config->load_config($vmid);
> +
> +	my $socket = $param->{socket};
> +	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
> +
> +	return { socket => $socket };
> +    }});
>  1;
> diff --git a/src/PVE/LXC/Migrate.pm b/src/PVE/LXC/Migrate.pm
> index 2ef1cce..a0ab65e 100644
> --- a/src/PVE/LXC/Migrate.pm
> +++ b/src/PVE/LXC/Migrate.pm
> @@ -17,6 +17,9 @@ use PVE::Replication;
>  
>  use base qw(PVE::AbstractMigrate);
>  
> +# compared against remote end's minimum version
> +our $WS_TUNNEL_VERSION = 2;
> +
>  sub lock_vm {
>      my ($self, $vmid, $code, @param) = @_;
>  
> @@ -28,6 +31,7 @@ sub prepare {
>  
>      my $online = $self->{opts}->{online};
>      my $restart= $self->{opts}->{restart};
> +    my $remote = $self->{opts}->{remote};
>  
>      $self->{storecfg} = PVE::Storage::config();
>  
> @@ -44,6 +48,7 @@ sub prepare {
>      }
>      $self->{was_running} = $running;
>  
> +    my $storages = {};
>      PVE::LXC::Config->foreach_volume_full($conf, { include_unused => 1 }, sub {
>  	my ($ms, $mountpoint) = @_;
>  
> @@ -70,7 +75,7 @@ sub prepare {
>  	die "content type 'rootdir' is not available on storage '$storage'\n"
>  	    if !$scfg->{content}->{rootdir};
>  
> -	if ($scfg->{shared}) {
> +	if ($scfg->{shared} && !$remote) {
>  	    # PVE::Storage::activate_storage checks this for non-shared storages
>  	    my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
>  	    warn "Used shared storage '$storage' is not online on source node!\n"
> @@ -83,18 +88,63 @@ sub prepare {
>  	    $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $storage);
>  	}
>  
> -	my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
> +	if (!$remote) {
> +	    my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
> +
> +	    die "$volid: content type 'rootdir' is not available on storage '$targetsid'\n"
> +		if !$target_scfg->{content}->{rootdir};
> +	}
>  
> -	die "$volid: content type 'rootdir' is not available on storage '$targetsid'\n"
> -	    if !$target_scfg->{content}->{rootdir};
> +	$storages->{$targetsid} = 1;
>      });
>  
>      # todo: test if VM uses local resources
>  
> -    # test ssh connection
> -    my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
> -    eval { $self->cmd_quiet($cmd); };
> -    die "Can't connect to destination address using public key\n" if $@;
> +    if ($remote) {
> +	# test & establish websocket connection
> +	my $bridges = map_bridges($conf, $self->{opts}->{bridgemap}, 1);
> +
> +	my $remote = $self->{opts}->{remote};
> +	my $conn = $remote->{conn};
> +
> +	my $log = sub {
> +	    my ($level, $msg) = @_;
> +	    $self->log($level, $msg);
> +	};
> +
> +	my $websocket_url = "https://$conn->{host}:$conn->{port}/api2/json/nodes/$self->{node}/lxc/$remote->{vmid}/mtunnelwebsocket";
> +	my $url = "/nodes/$self->{node}/lxc/$remote->{vmid}/mtunnel";
> +
> +	my $tunnel_params = {
> +	    url => $websocket_url,
> +	};
> +
> +	my $storage_list = join(',', keys %$storages);
> +	my $bridge_list = join(',', keys %$bridges);
> +
> +	my $req_params = {
> +	    storages => $storage_list,
> +	    bridges => $bridge_list,
> +	};
> +
> +	my $tunnel = PVE::Tunnel::fork_websocket_tunnel($conn, $url, $req_params, $tunnel_params, $log);
> +	my $min_version = $tunnel->{version} - $tunnel->{age};
> +	$self->log('info', "local WS tunnel version: $WS_TUNNEL_VERSION");
> +	$self->log('info', "remote WS tunnel version: $tunnel->{version}");
> +	$self->log('info', "minimum required WS tunnel version: $min_version");
> +	die "Remote tunnel endpoint not compatible, upgrade required\n"
> +	    if $WS_TUNNEL_VERSION < $min_version;
> +	 die "Remote tunnel endpoint too old, upgrade required\n"
> +	    if $WS_TUNNEL_VERSION > $tunnel->{version};
> +
> +	$self->log('info', "websocket tunnel started\n");
> +	$self->{tunnel} = $tunnel;
> +    } else {
> +	# test ssh connection
> +	my $cmd = [ @{$self->{rem_ssh}}, '/bin/true' ];
> +	eval { $self->cmd_quiet($cmd); };
> +	die "Can't connect to destination address using public key\n" if $@;
> +    }
>  
>      # in restart mode, we shutdown the container before migrating
>      if ($restart && $running) {
> @@ -113,6 +163,8 @@ sub prepare {
>  sub phase1 {
>      my ($self, $vmid) = @_;
>  
> +    my $remote = $self->{opts}->{remote};
> +
>      $self->log('info', "starting migration of CT $self->{vmid} to node '$self->{node}' ($self->{nodeip})");
>  
>      my $conf = $self->{vmconf};
> @@ -147,7 +199,7 @@ sub phase1 {
>  
>  	my $targetsid = $sid;
>  
> -	if ($scfg->{shared}) {
> +	if ($scfg->{shared} && !$remote) {
>  	    $self->log('info', "volume '$volid' is on shared storage '$sid'")
>  		if !$snapname;
>  	    return;
> @@ -155,7 +207,8 @@ sub phase1 {
>  	    $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $sid);
>  	}
>  
> -	PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
> +	PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node})
> +	    if !$remote;
>  
>  	my $bwlimit = $self->get_bwlimit($sid, $targetsid);
>  
> @@ -192,6 +245,9 @@ sub phase1 {
>  
>  	eval {
>  	    &$test_volid($volid, $snapname);
> +
> +	    die "remote migration with snapshots not supported yet\n"
> +		if $remote && $snapname;
>  	};
>  
>  	&$log_error($@, $volid) if $@;
> @@ -201,7 +257,7 @@ sub phase1 {
>      my @sids = PVE::Storage::storage_ids($self->{storecfg});
>      foreach my $storeid (@sids) {
>  	my $scfg = PVE::Storage::storage_config($self->{storecfg}, $storeid);
> -	next if $scfg->{shared};
> +	next if $scfg->{shared} && !$remote;
>  	next if !PVE::Storage::storage_check_enabled($self->{storecfg}, $storeid, undef, 1);
>  
>  	# get list from PVE::Storage (for unreferenced volumes)
> @@ -211,10 +267,12 @@ sub phase1 {
>  
>  	# check if storage is available on target node
>  	my $targetsid = PVE::JSONSchema::map_id($self->{opts}->{storagemap}, $storeid);
> -	my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
> +	if (!$remote) {
> +	    my $target_scfg = PVE::Storage::storage_check_enabled($self->{storecfg}, $targetsid, $self->{node});
>  
> -	die "content type 'rootdir' is not available on storage '$targetsid'\n"
> -	    if !$target_scfg->{content}->{rootdir};
> +	    die "content type 'rootdir' is not available on storage '$targetsid'\n"
> +		if !$target_scfg->{content}->{rootdir};
> +	}
>  
>  	PVE::Storage::foreach_volid($dl, sub {
>  	    my ($volid, $sid, $volname) = @_;
> @@ -240,12 +298,21 @@ sub phase1 {
>  	    my ($sid, $volname) = PVE::Storage::parse_volume_id($volid);
>  	    my $scfg =  PVE::Storage::storage_config($self->{storecfg}, $sid);
>  
> -	    my $migratable = ($scfg->{type} eq 'dir') || ($scfg->{type} eq 'zfspool')
> -		|| ($scfg->{type} eq 'lvmthin') || ($scfg->{type} eq 'lvm')
> -		|| ($scfg->{type} eq 'btrfs');
> +	    # TODO move to storage plugin layer?
> +	    my $migratable_storages = [
> +		'dir',
> +		'zfspool',
> +		'lvmthin',
> +		'lvm',
> +		'btrfs',
> +	    ];
> +	    if ($remote) {
> +		push @$migratable_storages, 'cifs';
> +		push @$migratable_storages, 'nfs';
> +	    }
>  
>  	    die "storage type '$scfg->{type}' not supported\n"
> -		if !$migratable;
> +		if !grep { $_ eq $scfg->{type} } @$migratable_storages;
>  
>  	    # image is a linked clone on local storage, se we can't migrate.
>  	    if (my $basename = (PVE::Storage::parse_volname($self->{storecfg}, $volid))[3]) {
> @@ -280,7 +347,10 @@ sub phase1 {
>  
>      my $rep_cfg = PVE::ReplicationConfig->new();
>  
> -    if (my $jobcfg = $rep_cfg->find_local_replication_job($vmid, $self->{node})) {
> +    if ($remote) {
> +	die "cannot remote-migrate replicated VM\n"
> +	    if $rep_cfg->check_for_existing_jobs($vmid, 1);
> +    } elsif (my $jobcfg = $rep_cfg->find_local_replication_job($vmid, $self->{node})) {
>  	die "can't live migrate VM with replicated volumes\n" if $self->{running};
>  	my $start_time = time();
>  	my $logfunc = sub { my ($msg) = @_;  $self->log('info', $msg); };
> @@ -291,7 +361,6 @@ sub phase1 {
>      my $opts = $self->{opts};
>      foreach my $volid (keys %$volhash) {
>  	next if $rep_volumes->{$volid};
> -	my ($sid, $volname) = PVE::Storage::parse_volume_id($volid);
>  	push @{$self->{volumes}}, $volid;
>  
>  	# JSONSchema and get_bandwidth_limit use kbps - storage_migrate bps
> @@ -301,22 +370,39 @@ sub phase1 {
>  	my $targetsid = $volhash->{$volid}->{targetsid};
>  
>  	my $new_volid = eval {
> -	    my $storage_migrate_opts = {
> -		'ratelimit_bps' => $bwlimit,
> -		'insecure' => $opts->{migration_type} eq 'insecure',
> -		'with_snapshots' => $volhash->{$volid}->{snapshots},
> -		'allow_rename' => 1,
> -	    };
> -
> -	    my $logfunc = sub { $self->log('info', $_[0]); };
> -	    return PVE::Storage::storage_migrate(
> -		$self->{storecfg},
> -		$volid,
> -		$self->{ssh_info},
> -		$targetsid,
> -		$storage_migrate_opts,
> -		$logfunc,
> -	    );
> +	    if ($remote) {
> +		my $log = sub {
> +		    my ($level, $msg) = @_;
> +		    $self->log($level, $msg);
> +		};
> +
> +		return PVE::StorageTunnel::storage_migrate(
> +		    $self->{tunnel},
> +		    $self->{storecfg},
> +		    $volid,
> +		    $self->{vmid},
> +		    $remote->{vmid},
> +		    $volhash->{$volid},
> +		    $log,
> +		);
> +	    } else {
> +		my $storage_migrate_opts = {
> +		    'ratelimit_bps' => $bwlimit,
> +		    'insecure' => $opts->{migration_type} eq 'insecure',
> +		    'with_snapshots' => $volhash->{$volid}->{snapshots},
> +		    'allow_rename' => 1,
> +		};
> +
> +		my $logfunc = sub { $self->log('info', $_[0]); };
> +		return PVE::Storage::storage_migrate(
> +		    $self->{storecfg},
> +		    $volid,
> +		    $self->{ssh_info},
> +		    $targetsid,
> +		    $storage_migrate_opts,
> +		    $logfunc,
> +		);
> +	    }
>  	};
>  
>  	if (my $err = $@) {
> @@ -346,13 +432,38 @@ sub phase1 {
>      my $vollist = PVE::LXC::Config->get_vm_volumes($conf);
>      PVE::Storage::deactivate_volumes($self->{storecfg}, $vollist);
>  
> -    # transfer replication state before moving config
> -    $self->transfer_replication_state() if $rep_volumes;
> -    PVE::LXC::Config->update_volume_ids($conf, $self->{volume_map});
> -    PVE::LXC::Config->write_config($vmid, $conf);
> -    PVE::LXC::Config->move_config_to_node($vmid, $self->{node});
> +    if ($remote) {
> +	my $remote_conf = PVE::LXC::Config->load_config($vmid);
> +	PVE::LXC::Config->update_volume_ids($remote_conf, $self->{volume_map});
> +
> +	my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
> +	for my $target (keys $bridges->%*) {
> +	    for my $nic (keys $bridges->{$target}->%*) {
> +		$self->log('info', "mapped: $nic from $bridges->{$target}->{$nic} to $target");
> +	    }
> +	}
> +	my $conf_str = PVE::LXC::Config::write_pct_config("remote", $remote_conf);
> +
> +	# TODO expose in PVE::Firewall?
> +	my $vm_fw_conf_path = "/etc/pve/firewall/$vmid.fw";
> +	my $fw_conf_str;
> +	$fw_conf_str = PVE::Tools::file_get_contents($vm_fw_conf_path)
> +	    if -e $vm_fw_conf_path;
> +	my $params = {
> +	    conf => $conf_str,
> +	    'firewall-config' => $fw_conf_str,
> +	};
> +
> +	PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'config', $params);
> +    } else {
> +	# transfer replication state before moving config
> +	$self->transfer_replication_state() if $rep_volumes;
> +	PVE::LXC::Config->update_volume_ids($conf, $self->{volume_map});
> +	PVE::LXC::Config->write_config($vmid, $conf);
> +	PVE::LXC::Config->move_config_to_node($vmid, $self->{node});
> +	$self->switch_replication_job_target() if $rep_volumes;
> +    }
>      $self->{conf_migrated} = 1;
> -    $self->switch_replication_job_target() if $rep_volumes;
>  }
>  
>  sub phase1_cleanup {
> @@ -366,6 +477,12 @@ sub phase1_cleanup {
>  	    # fixme: try to remove ?
>  	}
>      }
> +
> +    if ($self->{opts}->{remote}) {
> +	# cleans up remote volumes
> +	PVE::Tunnel::finish_tunnel($self->{tunnel}, 1);
> +	delete $self->{tunnel};
> +    }
>  }
>  
>  sub phase3 {
> @@ -373,6 +490,9 @@ sub phase3 {
>  
>      my $volids = $self->{volumes};
>  
> +    # handled below in final_cleanup
> +    return if $self->{opts}->{remote};
> +
>      # destroy local copies
>      foreach my $volid (@$volids) {
>  	eval { PVE::Storage::vdisk_free($self->{storecfg}, $volid); };
> @@ -401,6 +521,24 @@ sub final_cleanup {
>  	    my $skiplock = 1;
>  	    PVE::LXC::vm_start($vmid, $self->{vmconf}, $skiplock);
>  	}
> +    } elsif ($self->{opts}->{remote}) {
> +	eval { PVE::Tunnel::write_tunnel($self->{tunnel}, 10, 'unlock') };
> +	$self->log('err', "Failed to clear migrate lock - $@\n") if $@;
> +
> +	if ($self->{opts}->{restart} && $self->{was_running}) {
> +	    $self->log('info', "start container on target node");
> +	    PVE::Tunnel::write_tunnel($self->{tunnel}, 60, 'start');
> +	}
> +	if ($self->{opts}->{delete}) {
> +	    PVE::LXC::destroy_lxc_container(
> +		PVE::Storage::config(),
> +		$vmid,
> +		PVE::LXC::Config->load_config($vmid),
> +		undef,
> +		0,
> +	    );
> +	}
> +	PVE::Tunnel::finish_tunnel($self->{tunnel});
>      } else {
>  	my $cmd = [ @{$self->{rem_ssh}}, 'pct', 'unlock', $vmid ];
>  	$self->cmd_logerr($cmd, errmsg => "failed to clear migrate lock");
> @@ -413,7 +551,30 @@ sub final_cleanup {
>  	    $self->cmd($cmd);
>  	}
>      }
> +}
> +
> +sub map_bridges {
> +    my ($conf, $map, $scan_only) = @_;
> +
> +    my $bridges = {};
> +
> +    foreach my $opt (keys %$conf) {
> +	next if $opt !~ m/^net\d+$/;
> +
> +	next if !$conf->{$opt};
> +	my $d = PVE::LXC::Config->parse_lxc_network($conf->{$opt});
> +	next if !$d || !$d->{bridge};
> +
> +	my $target_bridge = PVE::JSONSchema::map_id($map, $d->{bridge});
> +	$bridges->{$target_bridge}->{$opt} = $d->{bridge};
> +
> +	next if $scan_only;
> +
> +	$d->{bridge} = $target_bridge;
> +	$conf->{$opt} = PVE::LXC::Config->print_lxc_network($d);
> +    }
>  
> +    return $bridges;
>  }
>  
>  1;
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 





* Re: [pve-devel] [PATCH-SERIES v6 0/13] remote migration
  2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
                   ` (12 preceding siblings ...)
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 storage 1/1] (remote) export: check and untaint format Fabian Grünbichler
@ 2022-10-04 15:29 ` DERUMIER, Alexandre
  13 siblings, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2022-10-04 15:29 UTC (permalink / raw)
  To: pve-devel

Hi Fabian,
I'll try to test it this month. (Very busy currently, maybe in 2
weekends.)

Do you have a roadmap for officially merging this patch series? Is it
already stable?

I'll have a lot of migrations between different clusters in the coming
months (moving part of our infra to a new datacenter), so I'll be able
to test it at scale.


Le mercredi 28 septembre 2022 à 14:50 +0200, Fabian Grünbichler a
écrit :
> this series adds remote migration for VMs and CTs.
> 
> both live and offline migration of VMs including NBD and
> storage-migrated disks should work, containers don't have any live
> migration so both offline and restart mode work identical except for
> the
> restart part.
> 
> groundwork for extending to pvesr already laid.
> 
> uncovered (but still not fixed)
> https://bugzilla.proxmox.com/show_bug.cgi?id=3873
> (migration btrfs -> btrfs with snapshots)
> 
> dependencies/breaks:
> - qemu-server / pve-container -> bumped pve-storage (taint bug
>   storage migration)
> - qemu-server / pve-container -> bumped pve-access-control (new priv)
> - qemu-server -> bumped pve-common (moved pve-targetstorage option)
> - pve-common -BREAKS-> not-bumped qemu-server (same)
> 
> follow-ups/todos:
> - implement disk export/import for shared storages like rbd
> - implement disk export/import raw+size for ZFS zvols
> - extend ZFS replication via websocket tunnel to remote cluster
> - extend replication to support RBD snapshot-based replication
> - extend RBD replication via websocket tunnel to remote cluster
> - switch regular migration SSH mtunnel to version 2 with json support
>   (related -> s.hanreichs pre-/post-migrate-hook series)
> 
> new in v6:
> - --with-local-disks always set and not a parameter
> - `pct remote-migrate`
> - new Sys.Incoming privilege + checks
> - storage export taintedness bug fix
> - properly take over pve-targetstorage option (qemu-server ->
>   pve-common)
> - review feedback addressed
> 
> new in v5: lots of edge cases fixed, PoC for pve-container, some more
> helper moving for re-use in pve-container without duplication
> 
> new in v4: lots of small fixes, improved bwlimit handling, `qm`
> command
> (thanks Fabian Ebner and Dominik Csapak for the feedback on v3!)
> 
> new in v3: lots of refactoring and edge-case handling
> 
> new in v2: dropped parts already applied, incorporated Fabian's and
> Dominik's feedback (thanks!)
> 
> new in v1: explicit remote endpoint specified as part of API call
> instead of
> remote.cfg
> 
> overview over affected repos and changes, see individual patches for
> more details.
> 
> pve-access-control:
> 
> Fabian Grünbichler (1):
>   privs: add Sys.Incoming
> 
>  src/PVE/AccessControl.pm | 1 +
>  1 file changed, 1 insertion(+)
> 
> pve-common:
> 
> Fabian Grünbichler (1):
>   schema: take over 'pve-targetstorage' option
> 
>  src/PVE/JSONSchema.pm | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> pve-container:
> 
> Fabian Grünbichler (3):
>   migration: add remote migration
>   pct: add 'remote-migrate' command
>   migrate: print mapped volume in error
> 
>  src/PVE/API2/LXC.pm    | 635
> +++++++++++++++++++++++++++++++++++++++++
>  src/PVE/CLI/pct.pm     | 124 ++++++++
>  src/PVE/LXC/Migrate.pm | 248 +++++++++++++---
>  3 files changed, 965 insertions(+), 42 deletions(-)
> 
> pve-docs:
> 
> Fabian Grünbichler (1):
>   pveum: mention Sys.Incoming privilege
> 
>  pveum.adoc | 1 +
>  1 file changed, 1 insertion(+)
> 
> qemu-server:
> 
> Fabian Grünbichler (6):
>   schema: move 'pve-targetstorage' to pve-common
>   mtunnel: add API endpoints
>   migrate: refactor remote VM/tunnel start
>   migrate: add remote migration handling
>   api: add remote migrate endpoint
>   qm: add remote-migrate command
> 
>  PVE/API2/Qemu.pm   | 709
> ++++++++++++++++++++++++++++++++++++++++++++-
>  PVE/CLI/qm.pm      | 113 ++++++++
>  PVE/QemuMigrate.pm | 590 ++++++++++++++++++++++++++++---------
>  PVE/QemuServer.pm  |  48 ++-
>  debian/control     |   5 +-
>  5 files changed, 1299 insertions(+), 166 deletions(-)
> 
> pve-storage:
> 
> Fabian Grünbichler (1):
>   (remote) export: check and untaint format
> 
>  PVE/CLI/pvesm.pm | 6 ++----
>  PVE/Storage.pm   | 9 +++++++++
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 



* Re: [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command Fabian Grünbichler
@ 2022-10-17 14:40   ` DERUMIER, Alexandre
  2022-10-18  6:39     ` Thomas Lamprecht
  2022-10-17 17:22   ` DERUMIER, Alexandre
  1 sibling, 1 reply; 29+ messages in thread
From: DERUMIER, Alexandre @ 2022-10-17 14:40 UTC (permalink / raw)
  To: pve-devel

Hi Fabian,


> an example invocation:
>
> $ qm remote-migrate 1234 4321 
'host=123.123.123.123,apitoken=pveapitoken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' 
--target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e 
--online


Maybe it would be better to (optionally) store the long

"'host=123.123.123.123,apitoken=pveapitoken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' 
"

in a config file in /etc/pve/priv/<targethost>.conf ?


That way, the API token would not end up in the bash history.

maybe something like:

qm remote-migration 1234 4321 <targethost> ....

?
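
For illustration, a purely hypothetical sketch of that idea (neither this
file format nor the shortened CLI form exists today, and the values are the
placeholders from the commit message) could look like:

    # /etc/pve/priv/targethost.conf (hypothetical, key=value per line)
    host=123.123.123.123
    apitoken=PVEAPIToken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
    fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb

    # hypothetical invocation referencing the stored remote by name
    $ qm remote-migrate 1234 4321 targethost --target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e --online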





Le 28/09/22 à 14:50, Fabian Grünbichler a écrit :
> which wraps the remote_migrate_vm API endpoint, but does the
> precondition checks that can be done up front itself.
> 
> this now just leaves the FP retrieval and target node name lookup to the
> sync part of the API endpoint, which should be do-able in <30s ..
> 
> 
> will migrate the local VM 1234 to the host 123.123.123.123 using the
> given API token, mapping the VMID to 4321 on the target cluster, all its
> virtual NICs to the target vm bridge 'vmbr0', any volumes on storage
> zfs-a to storage rbd-b, any on storage nfs-c to storage dir-d, and any
> other volumes to storage zfs-e. the source VM will be stopped but remain
> on the source node/cluster after the migration has finished.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      v6:
>      - mark as experimental
>      - drop `with-local-disks` parameter from API, always set to true
>      - add example invocation to commit message
>      
>      v5: rename to 'remote-migrate'
> 
>   PVE/API2/Qemu.pm |  31 -------------
>   PVE/CLI/qm.pm    | 113 +++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 113 insertions(+), 31 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index fa31e973..57083601 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4416,17 +4416,6 @@ __PACKAGE__->register_method({
>   	    $param->{online} = 0;
>   	}
>   
> -	# FIXME: fork worker hear to avoid timeout? or poll these periodically
> -	# in pvestatd and access cached info here? all of the below is actually
> -	# checked at the remote end anyway once we call the mtunnel endpoint,
> -	# we could also punt it to the client and not do it here at all..
> -	my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
> -	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
> -	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
> -	}
> -
> -	my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });
> -
>   	my $storecfg = PVE::Storage::config();
>   	my $target_storage = extract_param($param, 'target-storage');
>   	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
> @@ -4438,26 +4427,6 @@ __PACKAGE__->register_method({
>   	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
>   	    if $@;
>   
> -	my $check_remote_storage = sub {
> -	    my ($storage) = @_;
> -	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
> -	    die "remote: storage '$storage' does not exist!\n"
> -		if !@$found;
> -
> -	    $found = @$found[0];
> -
> -	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
> -	    die "remote: storage '$storage' cannot store images\n"
> -		if !grep { $_ eq 'images' } @$content_types;
> -	};
> -
> -	foreach my $target_sid (values %{$storagemap->{entries}}) {
> -	    $check_remote_storage->($target_sid);
> -	}
> -
> -	$check_remote_storage->($storagemap->{default})
> -	    if $storagemap->{default};
> -
>   	die "remote migration requires explicit storage mapping!\n"
>   	    if $storagemap->{identity};
>   
> diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> index ca5d25fc..a6a63566 100755
> --- a/PVE/CLI/qm.pm
> +++ b/PVE/CLI/qm.pm
> @@ -15,6 +15,7 @@ use POSIX qw(strftime);
>   use Term::ReadLine;
>   use URI::Escape;
>   
> +use PVE::APIClient::LWP;
>   use PVE::Cluster;
>   use PVE::Exception qw(raise_param_exc);
>   use PVE::GuestHelpers;
> @@ -158,6 +159,117 @@ __PACKAGE__->register_method ({
>   	return;
>       }});
>   
> +
> +__PACKAGE__->register_method({
> +    name => 'remote_migrate_vm',
> +    path => 'remote_migrate_vm',
> +    method => 'POST',
> +    description => "Migrate virtual machine to a remote cluster. Creates a new migration task. EXPERIMENTAL feature!",
> +    permissions => {
> +	check => ['perm', '/vms/{vmid}', [ 'VM.Migrate' ]],
> +    },
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid', { completion => \&PVE::QemuServer::complete_vmid }),
> +	    'target-vmid' => get_standard_option('pve-vmid', { optional => 1 }),
> +	    'target-endpoint' => get_standard_option('proxmox-remote', {
> +		description => "Remote target endpoint",
> +	    }),
> +	    online => {
> +		type => 'boolean',
> +		description => "Use online/live migration if VM is running. Ignored if VM is stopped.",
> +		optional => 1,
> +	    },
> +	    delete => {
> +		type => 'boolean',
> +		description => "Delete the original VM and related data after successful migration. By default the original VM is kept on the source cluster in a stopped state.",
> +		optional => 1,
> +		default => 0,
> +	    },
> +	    'target-storage' => get_standard_option('pve-targetstorage', {
> +		completion => \&PVE::QemuServer::complete_migration_storage,
> +		optional => 0,
> +	    }),
> +	    'target-bridge' => {
> +		type => 'string',
> +		description => "Mapping from source to target bridges. Providing only a single bridge ID maps all source bridges to that bridge. Providing the special value '1' will map each source bridge to itself.",
> +		format => 'bridge-pair-list',
> +	    },
> +	    bwlimit => {
> +		description => "Override I/O bandwidth limit (in KiB/s).",
> +		optional => 1,
> +		type => 'integer',
> +		minimum => '0',
> +		default => 'migrate limit from datacenter or storage config',
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => 'string',
> +	description => "the task ID.",
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $source_vmid = $param->{vmid};
> +	my $target_endpoint = $param->{'target-endpoint'};
> +	my $target_vmid = $param->{'target-vmid'} // $source_vmid;
> +
> +	my $remote = PVE::JSONSchema::parse_property_string('proxmox-remote', $target_endpoint);
> +
> +	# TODO: move this as helper somewhere appropriate?
> +	my $conn_args = {
> +	    protocol => 'https',
> +	    host => $remote->{host},
> +	    port => $remote->{port} // 8006,
> +	    apitoken => $remote->{apitoken},
> +	};
> +
> +	$conn_args->{cached_fingerprints} = { uc($remote->{fingerprint}) => 1 }
> +	    if defined($remote->{fingerprint});
> +
> +	my $api_client = PVE::APIClient::LWP->new(%$conn_args);
> +	my $resources = $api_client->get("/cluster/resources", { type => 'vm' });
> +	if (grep { defined($_->{vmid}) && $_->{vmid} eq $target_vmid } @$resources) {
> +	    raise_param_exc({ target_vmid => "Guest with ID '$target_vmid' already exists on remote cluster" });
> +	}
> +
> +	my $storages = $api_client->get("/nodes/localhost/storage", { enabled => 1 });
> +
> +	my $storecfg = PVE::Storage::config();
> +	my $target_storage = $param->{'target-storage'};
> +	my $storagemap = eval { PVE::JSONSchema::parse_idmap($target_storage, 'pve-storage-id') };
> +	raise_param_exc({ 'target-storage' => "failed to parse storage map: $@" })
> +	    if $@;
> +
> +	my $check_remote_storage = sub {
> +	    my ($storage) = @_;
> +	    my $found = [ grep { $_->{storage} eq $storage } @$storages ];
> +	    die "remote: storage '$storage' does not exist!\n"
> +		if !@$found;
> +
> +	    $found = @$found[0];
> +
> +	    my $content_types = [ PVE::Tools::split_list($found->{content}) ];
> +	    die "remote: storage '$storage' cannot store images\n"
> +		if !grep { $_ eq 'images' } @$content_types;
> +	};
> +
> +	foreach my $target_sid (values %{$storagemap->{entries}}) {
> +	    $check_remote_storage->($target_sid);
> +	}
> +
> +	$check_remote_storage->($storagemap->{default})
> +	    if $storagemap->{default};
> +
> +	return PVE::API2::Qemu->remote_migrate_vm($param);
> +    }});
> +
>   __PACKAGE__->register_method ({
>       name => 'status',
>       path => 'status',
> @@ -900,6 +1012,7 @@ our $cmddef = {
>       clone => [ "PVE::API2::Qemu", 'clone_vm', ['vmid', 'newid'], { node => $nodename }, $upid_exit ],
>   
>       migrate => [ "PVE::API2::Qemu", 'migrate_vm', ['vmid', 'target'], { node => $nodename }, $upid_exit ],
> +    'remote-migrate' => [ __PACKAGE__, 'remote_migrate_vm', ['vmid', 'target-vmid', 'target-endpoint'], { node => $nodename }, $upid_exit ],
>   
>       set => [ "PVE::API2::Qemu", 'update_vm', ['vmid'], { node => $nodename } ],
>   



* Re: [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command Fabian Grünbichler
  2022-10-17 14:40   ` DERUMIER, Alexandre
@ 2022-10-17 17:22   ` DERUMIER, Alexandre
  1 sibling, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2022-10-17 17:22 UTC (permalink / raw)
  To: pve-devel

Le 28/09/22 à 14:50, Fabian Grünbichler a écrit :
> which wraps the remote_migrate_vm API endpoint, but does the
> precondition checks that can be done up front itself.
> 
> this now just leaves the FP retrieval and target node name lookup to the
> sync part of the API endpoint, which should be do-able in <30s ..
> 
> an example invocation:
> 
> $ qm remote-migrate 1234 4321 'host=123.123.123.123,apitoken=pveapitoken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' --target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e --online



"apitoken=pveapitoken=user@pve!incoming

should be

"apitoken=PVEAPIToken=user@pve!incoming"

as PVEAPIToken is case sensitive
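
For reference, the full example invocation from the commit message with the
corrected token prefix (same placeholder values) would then read:

$ qm remote-migrate 1234 4321 'host=123.123.123.123,apitoken=PVEAPIToken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' --target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e --online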




* Re: [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints Fabian Grünbichler
  2022-09-30 11:52   ` Stefan Hanreich
  2022-10-03 13:22   ` [pve-devel] [PATCH FOLLOW-UP " Fabian Grünbichler
@ 2022-10-18  6:23   ` DERUMIER, Alexandre
  2 siblings, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2022-10-18  6:23 UTC (permalink / raw)
  To: pve-devel

I think cleanup of the VM on the remote side is missing if the
migration is aborted in phase1 or phase2.

It is safe to delete the VM config && disks on the remote host until
the end of phase2, when the disks are switched over.


If the job is aborted, it triggers "mtunnel exited unexpectedly" on the
remote host, without any cleanup.
(and even in phase2_cleanup, the remote host doesn't receive the "stop"
command in the tunnel)



example 1:

in phase1, I had a hanging source storage, and the job was hanging in
scan_local_volumes().

I manually aborted the job.

on source:
"2022-10-18 06:00:25 local WS tunnel version: 2
2022-10-18 06:00:25 remote WS tunnel version: 2
2022-10-18 06:00:25 minimum required WS tunnel version: 2
websocket tunnel started
2022-10-18 06:00:25 starting migration of VM 111 to node 'formationkvm1' 
(10.3.94.10)
tunnel: -> sending command "bwlimit" to remote
tunnel: <- got reply
CMD websocket tunnel died: command 'proxmox-websocket-tunnel' failed: 
interrupted by signal

2022-10-18 06:02:03 ERROR: Problem found while scanning volumes - rbd 
error: interrupted by signal
2022-10-18 06:02:03 aborting phase 1 - cleanup resources
2022-10-18 06:02:04 ERROR: writing to tunnel failed: broken pipe
2022-10-18 06:02:04 ERROR: migration aborted (duration 00:01:39): 
Problem found while scanning volumes - rbd error: interrupted by signal
"


on remote host:

"
mtunnel started
received command 'version'
TASK ERROR: mtunnel exited unexpectedly
"
with a lock on the vm config. (no disk created yet).





example 2:
manually abort the job while the disk transfer is running


on source:
"
tunnel: -> sending command "version" to remote
tunnel: <- got reply
2022-10-18 06:35:47 local WS tunnel version: 2
2022-10-18 06:35:47 remote WS tunnel version: 2
2022-10-18 06:35:47 minimum required WS tunnel version: 2
websocket tunnel started
2022-10-18 06:35:47 starting migration of VM 111 to node 'formationkvm1' 
(10.3.94.10)
toto monpbs
toto replicat2x
tunnel: -> sending command "bwlimit" to remote
tunnel: <- got reply

2022-10-18 06:35:48 found local disk 'replicat2x:vm-111-disk-0' (in 
current VM config)
2022-10-18 06:35:48 mapped: net0 from vmbr0 to vmbr0
2022-10-18 06:35:48 mapped: net1 from vmbr0 to vmbr0
2022-10-18 06:35:48 Allocating volume for drive 'scsi0' on remote 
storage 'local-lvm'..
tunnel: -> sending command "disk" to remote
tunnel: <- got reply
2022-10-18 06:35:49 volume 'replicat2x:vm-111-disk-0' is 
'local-lvm:vm-10005-disk-0' on the target
tunnel: -> sending command "config" to remote
tunnel: <- got reply
tunnel: -> sending command "start" to remote
tunnel: <- got reply
2022-10-18 06:35:52 Setting up tunnel for '/run/qemu-server/111.migrate'
2022-10-18 06:35:52 Setting up tunnel for '/run/qemu-server/111_nbd.migrate'
2022-10-18 06:35:52 starting storage migration
2022-10-18 06:35:52 scsi0: start migration to 
nbd:unix:/run/qemu-server/111_nbd.migrate:exportname=drive-scsi0
drive mirror is starting for drive-scsi0
tunnel: accepted new connection on '/run/qemu-server/111_nbd.migrate'
tunnel: requesting WS ticket via tunnel
tunnel: established new WS for forwarding '/run/qemu-server/111_nbd.migrate'
drive-scsi0: transferred 0.0 B of 32.0 GiB (0.00%) in 0s
drive-scsi0: transferred 101.0 MiB of 32.0 GiB (0.31%) in 1s
drive-scsi0: transferred 209.0 MiB of 32.0 GiB (0.64%) in 2s
drive-scsi0: transferred 318.0 MiB of 32.0 GiB (0.97%) in 3s
drive-scsi0: transferred 428.0 MiB of 32.0 GiB (1.31%) in 4s

drive-scsi0: Cancelling block job
CMD websocket tunnel died: command 'proxmox-websocket-tunnel' failed: 
interrupted by signal

drive-scsi0: Done.
2022-10-18 06:36:31 ERROR: online migrate failure - block job (mirror) 
error: interrupted by signal
2022-10-18 06:36:31 aborting phase 2 - cleanup resources
2022-10-18 06:36:31 migrate_cancel
2022-10-18 06:36:31 ERROR: writing to tunnel failed: broken pipe
2022-10-18 06:36:31 ERROR: migration finished with problems (duration 
00:00:44)

TASK ERROR: migration problems
"

on remote host:
"
mtunnel started
received command 'version'
received command 'bwlimit'
received command 'disk'
   WARNING: You have not turned on protection against thin pools running 
out of space.
   WARNING: Set activation/thin_pool_autoextend_threshold below 100 to 
trigger automatic extension of thin pools before they get full.
   Logical volume "vm-10005-disk-0" created.
   WARNING: Sum of all thin volume sizes (69.00 GiB) exceeds the size of 
thin pool pve/data and the amount of free space in volume group (13.87 GiB).
received command 'config'
update VM 10005: -boot order=scsi0;ide2;net0 -cores 4 -ide2 
none,media=cdrom -machine pc-i440fx-6.2 -memory 2048 -name debian8 -net0 
virtio=5E:57:C7:01:CB:FC,bridge=vmbr0,firewall=1,queues=2,tag=94 -net1 
virtio=5E:57:C7:01:CB:FE,bridge=vmbr0,firewall=1,queues=2,tag=94 -numa 0 
-ostype l26 -scsi0 local-lvm:vm-10005-disk-0,format=raw,size=32G -scsihw 
virtio-scsi-pci -smbios1 uuid=5bd68bc8-c0fd-4f50-9649-f78a8f805e4c 
-sockets 2
received command 'start'
migration listens on unix:/run/qemu-server/10005.migrate
storage migration listens on 
nbd:unix:/run/qemu-server/10005_nbd.migrate:exportname=drive-scsi0 
volume:local-lvm:vm-10005-disk-0,format=raw,size=32G
received command 'ticket'
TASK ERROR: mtunnel exited unexpectedly
"

with a lock on the vmconfig in migrate state.


Le 28/09/22 à 14:50, Fabian Grünbichler a écrit :
> the following two endpoints are used for migration on the remote side
> 
> POST /nodes/NODE/qemu/VMID/mtunnel
> 
> which creates and locks an empty VM config, and spawns the main qmtunnel
> worker which binds to a VM-specific UNIX socket.
> 
> this worker handles JSON-encoded migration commands coming in via this
> UNIX socket:
> - config (set target VM config)
> -- checks permissions for updating config
> -- strips pending changes and snapshots
> -- sets (optional) firewall config
> - disk (allocate disk for NBD migration)
> -- checks permission for target storage
> -- returns drive string for allocated volume
> - disk-import, query-disk-import, bwlimit
> -- handled by PVE::StorageTunnel
> - start (returning migration info)
> - fstrim (via agent)
> - ticket (creates a ticket for a WS connection to a specific socket)
> - resume
> - stop
> - nbdstop
> - unlock
> - quit (+ cleanup)
> 
> this worker serves as a replacement for both 'qm mtunnel' and various
> manual calls via SSH. the API call will return a ticket valid for
> connecting to the worker's UNIX socket via a websocket connection.
> 
> GET+WebSocket upgrade /nodes/NODE/qemu/VMID/mtunnelwebsocket
> 
> gets called for connecting to a UNIX socket via websocket forwarding,
> i.e. once for the main command mtunnel, and once each for the memory
> migration and each NBD drive-mirror/storage migration.
> 
> access is guarded by a short-lived ticket binding the authenticated user
> to the socket path. such tickets can be requested over the main mtunnel,
> which keeps track of socket paths currently used by that
> mtunnel/migration instance.
> 
> each command handler should check privileges for the requested action if
> necessary.
> 
> both mtunnel and mtunnelwebsocket endpoints are not proxied, the
> client/caller is responsible for ensuring the passed 'node' parameter
> and the endpoint handling the call are matching.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
> 
> Notes:
>      v6:
>      - check for Sys.Incoming in mtunnel
>      - add definedness checks in 'config' command
>      - switch to vm_running_locally in 'resume' command
>      - moved $socket_addr closer to usage
>      v5:
>      - us vm_running_locally
>      - move '$socket_addr' declaration closer to usage
>      v4:
>      - add timeout to accept()
>      - move 'bwlimit' to PVE::StorageTunnel and extend it
>      - mark mtunnel(websocket) as non-proxied, and check $node accordingly
>      v3:
>      - handle meta and vmgenid better
>      - handle failure of 'config' updating
>      - move 'disk-import' and 'query-disk-import' handlers to pve-guest-common
>      - improve tunnel exit by letting client close the connection
>      - use strict VM config parser
>      v2: incorporated Fabian Ebner's feedback, mainly:
>      - use modified nbd alloc helper instead of duplicating
>      - fix disk cleanup, also cleanup imported disks
>      - fix firewall-conf vs firewall-config mismatch
>      
>      requires
>      - pve-access-control with tunnel ticket support (already marked in d/control)
>      - pve-access-control with Sys.Incoming privilege (not yet applied/bumped!)
>      - pve-http-server with websocket fixes (could be done via breaks? or bumped in
>        pve-manager..)
> 
>   PVE/API2/Qemu.pm | 527 ++++++++++++++++++++++++++++++++++++++++++++++-
>   debian/control   |   2 +-
>   2 files changed, 527 insertions(+), 2 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 3ec31c26..9270ca74 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4,10 +4,13 @@ use strict;
>   use warnings;
>   use Cwd 'abs_path';
>   use Net::SSLeay;
> -use POSIX;
>   use IO::Socket::IP;
> +use IO::Socket::UNIX;
> +use IPC::Open3;
> +use JSON;
>   use URI::Escape;
>   use Crypt::OpenSSL::Random;
> +use Socket qw(SOCK_STREAM);
>   
>   use PVE::Cluster qw (cfs_read_file cfs_write_file);;
>   use PVE::RRD;
> @@ -38,6 +41,7 @@ use PVE::VZDump::Plugin;
>   use PVE::DataCenterConfig;
>   use PVE::SSHInfo;
>   use PVE::Replication;
> +use PVE::StorageTunnel;
>   
>   BEGIN {
>       if (!$ENV{PVE_GENERATING_DOCS}) {
> @@ -1087,6 +1091,7 @@ __PACKAGE__->register_method({
>   	    { subdir => 'spiceproxy' },
>   	    { subdir => 'sendkey' },
>   	    { subdir => 'firewall' },
> +	    { subdir => 'mtunnel' },
>   	    ];
>   
>   	return $res;
> @@ -4965,4 +4970,524 @@ __PACKAGE__->register_method({
>   	return PVE::QemuServer::Cloudinit::dump_cloudinit_config($conf, $param->{vmid}, $param->{type});
>       }});
>   
> +__PACKAGE__->register_method({
> +    name => 'mtunnel',
> +    path => '{vmid}/mtunnel',
> +    method => 'POST',
> +    protected => 1,
> +    description => 'Migration tunnel endpoint - only for internal use by VM migration.',
> +    permissions => {
> +	check =>
> +	[ 'and',
> +	  ['perm', '/vms/{vmid}', [ 'VM.Allocate' ]],
> +	  ['perm', '/', [ 'Sys.Incoming' ]],
> +	],
> +	description => "You need 'VM.Allocate' permissions on '/vms/{vmid}' and Sys.Incoming" .
> +	               " on '/'. Further permission checks happen during the actual migration.",
> +    },
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    storages => {
> +		type => 'string',
> +		format => 'pve-storage-id-list',
> +		optional => 1,
> +		description => 'List of storages to check permission and availability. Will be checked again for all actually used storages during migration.',
> +	    },
> +	},
> +    },
> +    returns => {
> +	additionalProperties => 0,
> +	properties => {
> +	    upid => { type => 'string' },
> +	    ticket => { type => 'string' },
> +	    socket => { type => 'string' },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $node = extract_param($param, 'node');
> +	my $vmid = extract_param($param, 'vmid');
> +
> +	my $storages = extract_param($param, 'storages');
> +
> +	my $nodename = PVE::INotify::nodename();
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	$node = $nodename;
> +
> +	my $storecfg = PVE::Storage::config();
> +	foreach my $storeid (PVE::Tools::split_list($storages)) {
> +	    $check_storage_access_migrate->($rpcenv, $authuser, $storecfg, $storeid, $node);
> +	}
> +
> +	PVE::Cluster::check_cfs_quorum();
> +
> +	my $lock = 'create';
> +	eval { PVE::QemuConfig->create_and_lock_config($vmid, 0, $lock); };
> +
> +	raise_param_exc({ vmid => "unable to create empty VM config - $@"})
> +	    if $@;
> +
> +	my $realcmd = sub {
> +	    my $state = {
> +		storecfg => PVE::Storage::config(),
> +		lock => $lock,
> +		vmid => $vmid,
> +	    };
> +
> +	    my $run_locked = sub {
> +		my ($code, $params) = @_;
> +		return PVE::QemuConfig->lock_config($state->{vmid}, sub {
> +		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +
> +		    $state->{conf} = $conf;
> +
> +		    die "Encountered wrong lock - aborting mtunnel command handling.\n"
> +			if $state->{lock} && !PVE::QemuConfig->has_lock($conf, $state->{lock});
> +
> +		    return $code->($params);
> +		});
> +	    };
> +
> +	    my $cmd_desc = {
> +		config => {
> +		    conf => {
> +			type => 'string',
> +			description => 'Full VM config, adapted for target cluster/node',
> +		    },
> +		    'firewall-config' => {
> +			type => 'string',
> +			description => 'VM firewall config',
> +			optional => 1,
> +		    },
> +		},
> +		disk => {
> +		    format => PVE::JSONSchema::get_standard_option('pve-qm-image-format'),
> +		    storage => {
> +			type => 'string',
> +			format => 'pve-storage-id',
> +		    },
> +		    drive => {
> +			type => 'object',
> +			description => 'parsed drive information without volid and format',
> +		    },
> +		},
> +		start => {
> +		    start_params => {
> +			type => 'object',
> +			description => 'params passed to vm_start_nolock',
> +		    },
> +		    migrate_opts => {
> +			type => 'object',
> +			description => 'migrate_opts passed to vm_start_nolock',
> +		    },
> +		},
> +		ticket => {
> +		    path => {
> +			type => 'string',
> +			description => 'socket path for which the ticket should be valid. must be known to current mtunnel instance.',
> +		    },
> +		},
> +		quit => {
> +		    cleanup => {
> +			type => 'boolean',
> +			description => 'remove VM config and disks, aborting migration',
> +			default => 0,
> +		    },
> +		},
> +		'disk-import' => $PVE::StorageTunnel::cmd_schema->{'disk-import'},
> +		'query-disk-import' => $PVE::StorageTunnel::cmd_schema->{'query-disk-import'},
> +		bwlimit => $PVE::StorageTunnel::cmd_schema->{bwlimit},
> +	    };
> +
> +	    my $cmd_handlers = {
> +		'version' => sub {
> +		    # compared against other end's version
> +		    # bump/reset for breaking changes
> +		    # bump/bump for opt-in changes
> +		    return {
> +			api => 2,
> +			age => 0,
> +		    };
> +		},
> +		'config' => sub {
> +		    my ($params) = @_;
> +
> +		    # parse and write out VM FW config if given
> +		    if (my $fw_conf = $params->{'firewall-config'}) {
> +			my ($path, $fh) = PVE::Tools::tempfile_contents($fw_conf, 700);
> +
> +			my $empty_conf = {
> +			    rules => [],
> +			    options => {},
> +			    aliases => {},
> +			    ipset => {} ,
> +			    ipset_comments => {},
> +			};
> +			my $cluster_fw_conf = PVE::Firewall::load_clusterfw_conf();
> +
> +			# TODO: add flag for strict parsing?
> +			# TODO: add import sub that does all this given raw content?
> +			my $vmfw_conf = PVE::Firewall::generic_fw_config_parser($path, $cluster_fw_conf, $empty_conf, 'vm');
> +			$vmfw_conf->{vmid} = $state->{vmid};
> +			PVE::Firewall::save_vmfw_conf($state->{vmid}, $vmfw_conf);
> +
> +			$state->{cleanup}->{fw} = 1;
> +		    }
> +
> +		    my $conf_fn = "incoming/qemu-server/$state->{vmid}.conf";
> +		    my $new_conf = PVE::QemuServer::parse_vm_config($conf_fn, $params->{conf}, 1);
> +		    delete $new_conf->{lock};
> +		    delete $new_conf->{digest};
> +
> +		    # TODO handle properly?
> +		    delete $new_conf->{snapshots};
> +		    delete $new_conf->{parent};
> +		    delete $new_conf->{pending};
> +
> +		    # not handled by update_vm_api
> +		    my $vmgenid = delete $new_conf->{vmgenid};
> +		    my $meta = delete $new_conf->{meta};
> +
> +		    $new_conf->{vmid} = $state->{vmid};
> +		    $new_conf->{node} = $node;
> +
> +		    PVE::QemuConfig->remove_lock($state->{vmid}, 'create');
> +
> +		    eval {
> +			$update_vm_api->($new_conf, 1);
> +		    };
> +		    if (my $err = $@) {
> +			# revert to locked previous config
> +			my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +			$conf->{lock} = 'create';
> +			PVE::QemuConfig->write_config($state->{vmid}, $conf);
> +
> +			die $err;
> +		    }
> +
> +		    my $conf = PVE::QemuConfig->load_config($state->{vmid});
> +		    $conf->{lock} = 'migrate';
> +		    $conf->{vmgenid} = $vmgenid if defined($vmgenid);
> +		    $conf->{meta} = $meta if defined($meta);
> +		    PVE::QemuConfig->write_config($state->{vmid}, $conf);
> +
> +		    $state->{lock} = 'migrate';
> +
> +		    return;
> +		},
> +		'bwlimit' => sub {
> +		    my ($params) = @_;
> +		    return PVE::StorageTunnel::handle_bwlimit($params);
> +		},
> +		'disk' => sub {
> +		    my ($params) = @_;
> +
> +		    my $format = $params->{format};
> +		    my $storeid = $params->{storage};
> +		    my $drive = $params->{drive};
> +
> +		    $check_storage_access_migrate->($rpcenv, $authuser, $state->{storecfg}, $storeid, $node);
> +
> +		    my $storagemap = {
> +			default => $storeid,
> +		    };
> +
> +		    my $source_volumes = {
> +			'disk' => [
> +			    undef,
> +			    $storeid,
> +			    undef,
> +			    $drive,
> +			    0,
> +			    $format,
> +			],
> +		    };
> +
> +		    my $res = PVE::QemuServer::vm_migrate_alloc_nbd_disks($state->{storecfg}, $state->{vmid}, $source_volumes, $storagemap);
> +		    if (defined($res->{disk})) {
> +			$state->{cleanup}->{volumes}->{$res->{disk}->{volid}} = 1;
> +			return $res->{disk};
> +		    } else {
> +			die "failed to allocate NBD disk..\n";
> +		    }
> +		},
> +		'disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    $check_storage_access_migrate->(
> +			$rpcenv,
> +			$authuser,
> +			$state->{storecfg},
> +			$params->{storage},
> +			$node
> +		    );
> +
> +		    $params->{unix} = "/run/qemu-server/$state->{vmid}.storage";
> +
> +		    return PVE::StorageTunnel::handle_disk_import($state, $params);
> +		},
> +		'query-disk-import' => sub {
> +		    my ($params) = @_;
> +
> +		    return PVE::StorageTunnel::handle_query_disk_import($state, $params);
> +		},
> +		'start' => sub {
> +		    my ($params) = @_;
> +
> +		    my $info = PVE::QemuServer::vm_start_nolock(
> +			$state->{storecfg},
> +			$state->{vmid},
> +			$state->{conf},
> +			$params->{start_params},
> +			$params->{migrate_opts},
> +		    );
> +
> +
> +		    if ($info->{migrate}->{proto} ne 'unix') {
> +			PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> +			die "migration over non-UNIX sockets not possible\n";
> +		    }
> +
> +		    my $socket = $info->{migrate}->{addr};
> +		    chown $state->{socket_uid}, -1, $socket;
> +		    $state->{sockets}->{$socket} = 1;
> +
> +		    my $unix_sockets = $info->{migrate}->{unix_sockets};
> +		    foreach my $socket (@$unix_sockets) {
> +			chown $state->{socket_uid}, -1, $socket;
> +			$state->{sockets}->{$socket} = 1;
> +		    }
> +		    return $info;
> +		},
> +		'fstrim' => sub {
> +		    if (PVE::QemuServer::qga_check_running($state->{vmid})) {
> +			eval { mon_cmd($state->{vmid}, "guest-fstrim") };
> +			warn "fstrim failed: $@\n" if $@;
> +		    }
> +		    return;
> +		},
> +		'stop' => sub {
> +		    PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
> +		    return;
> +		},
> +		'nbdstop' => sub {
> +		    PVE::QemuServer::nbd_stop($state->{vmid});
> +		    return;
> +		},
> +		'resume' => sub {
> +		    if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
> +			PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
> +		    } else {
> +			die "VM $state->{vmid} not running\n";
> +		    }
> +		    return;
> +		},
> +		'unlock' => sub {
> +		    PVE::QemuConfig->remove_lock($state->{vmid}, $state->{lock});
> +		    delete $state->{lock};
> +		    return;
> +		},
> +		'ticket' => sub {
> +		    my ($params) = @_;
> +
> +		    my $path = $params->{path};
> +
> +		    die "Not allowed to generate ticket for unknown socket '$path'\n"
> +			if !defined($state->{sockets}->{$path});
> +
> +		    return { ticket => PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$path") };
> +		},
> +		'quit' => sub {
> +		    my ($params) = @_;
> +
> +		    if ($params->{cleanup}) {
> +			if ($state->{cleanup}->{fw}) {
> +			    PVE::Firewall::remove_vmfw_conf($state->{vmid});
> +			}
> +
> +			for my $volid (keys $state->{cleanup}->{volumes}->%*) {
> +			    print "freeing volume '$volid' as part of cleanup\n";
> +			    eval { PVE::Storage::vdisk_free($state->{storecfg}, $volid) };
> +			    warn $@ if $@;
> +			}
> +
> +			PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
> +		    }
> +
> +		    print "switching to exit-mode, waiting for client to disconnect\n";
> +		    $state->{exit} = 1;
> +		    return;
> +		},
> +	    };
> +
> +	    $run_locked->(sub {
> +		my $socket_addr = "/run/qemu-server/$state->{vmid}.mtunnel";
> +		unlink $socket_addr;
> +
> +		$state->{socket} = IO::Socket::UNIX->new(
> +	            Type => SOCK_STREAM(),
> +		    Local => $socket_addr,
> +		    Listen => 1,
> +		);
> +
> +		$state->{socket_uid} = getpwnam('www-data')
> +		    or die "Failed to resolve user 'www-data' to numeric UID\n";
> +		chown $state->{socket_uid}, -1, $socket_addr;
> +	    });
> +
> +	    print "mtunnel started\n";
> +
> +	    my $conn = eval { PVE::Tools::run_with_timeout(300, sub { $state->{socket}->accept() }) };
> +	    if ($@) {
> +		warn "Failed to accept tunnel connection - $@\n";
> +
> +		warn "Removing tunnel socket..\n";
> +		unlink $state->{socket};
> +
> +		warn "Removing temporary VM config..\n";
> +		$run_locked->(sub {
> +		    PVE::QemuServer::destroy_vm($state->{storecfg}, $state->{vmid}, 1);
> +		});
> +
> +		die "Exiting mtunnel\n";
> +	    }
> +
> +	    $state->{conn} = $conn;
> +
> +	    my $reply_err = sub {
> +		my ($msg) = @_;
> +
> +		my $reply = JSON::encode_json({
> +		    success => JSON::false,
> +		    msg => $msg,
> +		});
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    my $reply_ok = sub {
> +		my ($res) = @_;
> +
> +		$res->{success} = JSON::true;
> +		my $reply = JSON::encode_json($res);
> +		$conn->print("$reply\n");
> +		$conn->flush();
> +	    };
> +
> +	    while (my $line = <$conn>) {
> +		chomp $line;
> +
> +		# untaint, we validate below if needed
> +		($line) = $line =~ /^(.*)$/;
> +		my $parsed = eval { JSON::decode_json($line) };
> +		if ($@) {
> +		    $reply_err->("failed to parse command - $@");
> +		    next;
> +		}
> +
> +		my $cmd = delete $parsed->{cmd};
> +		if (!defined($cmd)) {
> +		    $reply_err->("'cmd' missing");
> +		} elsif ($state->{exit}) {
> +		    $reply_err->("tunnel is in exit-mode, processing '$cmd' cmd not possible");
> +		    next;
> +		} elsif (my $handler = $cmd_handlers->{$cmd}) {
> +		    print "received command '$cmd'\n";
> +		    eval {
> +			if ($cmd_desc->{$cmd}) {
> +			    PVE::JSONSchema::validate($cmd_desc->{$cmd}, $parsed);
> +			} else {
> +			    $parsed = {};
> +			}
> +			my $res = $run_locked->($handler, $parsed);
> +			$reply_ok->($res);
> +		    };
> +		    $reply_err->("failed to handle '$cmd' command - $@")
> +			if $@;
> +		} else {
> +		    $reply_err->("unknown command '$cmd' given");
> +		}
> +	    }
> +
> +	    if ($state->{exit}) {
> +		print "mtunnel exited\n";
> +	    } else {
> +		die "mtunnel exited unexpectedly\n";
> +	    }
> +	};
> +
> +	my $socket_addr = "/run/qemu-server/$vmid.mtunnel";
> +	my $ticket = PVE::AccessControl::assemble_tunnel_ticket($authuser, "/socket/$socket_addr");
> +	my $upid = $rpcenv->fork_worker('qmtunnel', $vmid, $authuser, $realcmd);
> +
> +	return {
> +	    ticket => $ticket,
> +	    upid => $upid,
> +	    socket => $socket_addr,
> +	};
> +    }});
> +
> +__PACKAGE__->register_method({
> +    name => 'mtunnelwebsocket',
> +    path => '{vmid}/mtunnelwebsocket',
> +    method => 'GET',
> +    permissions => {
> +	description => "You need to pass a ticket valid for the selected socket. Tickets can be created via the mtunnel API call, which will check permissions accordingly.",
> +        user => 'all', # check inside
> +    },
> +    description => 'Migration tunnel endpoint for websocket upgrade - only for internal use by VM migration.',
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    vmid => get_standard_option('pve-vmid'),
> +	    socket => {
> +		type => "string",
> +		description => "unix socket to forward to",
> +	    },
> +	    ticket => {
> +		type => "string",
> +		description => "ticket return by initial 'mtunnel' API call, or retrieved via 'ticket' tunnel command",
> +	    },
> +	},
> +    },
> +    returns => {
> +	type => "object",
> +	properties => {
> +	    port => { type => 'string', optional => 1 },
> +	    socket => { type => 'string', optional => 1 },
> +	},
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $authuser = $rpcenv->get_user();
> +
> +	my $nodename = PVE::INotify::nodename();
> +	my $node = extract_param($param, 'node');
> +
> +	raise_param_exc({ node => "node needs to be 'localhost' or local hostname '$nodename'" })
> +	    if $node ne 'localhost' && $node ne $nodename;
> +
> +	my $vmid = $param->{vmid};
> +	# check VM exists
> +	PVE::QemuConfig->load_config($vmid);
> +
> +	my $socket = $param->{socket};
> +	PVE::AccessControl::verify_tunnel_ticket($param->{ticket}, $authuser, "/socket/$socket");
> +
> +	return { socket => $socket };
> +    }});
> +
>   1;
> diff --git a/debian/control b/debian/control
> index a90ecd6f..ce469cbd 100644
> --- a/debian/control
> +++ b/debian/control
> @@ -33,7 +33,7 @@ Depends: dbus,
>            libjson-perl,
>            libjson-xs-perl,
>            libnet-ssleay-perl,
> -         libpve-access-control (>= 5.0-7),
> +         libpve-access-control (>= 7.0-7),
>            libpve-cluster-perl,
>            libpve-common-perl (>= 7.1-4),
>            libpve-guest-common-perl (>= 4.1-1),



* Re: [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command
  2022-10-17 14:40   ` DERUMIER, Alexandre
@ 2022-10-18  6:39     ` Thomas Lamprecht
  2022-10-18  6:56       ` DERUMIER, Alexandre
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Lamprecht @ 2022-10-18  6:39 UTC (permalink / raw)
  To: Proxmox VE development discussion, DERUMIER, Alexandre

Hi,

Am 17/10/2022 um 16:40 schrieb DERUMIER, Alexandre:
>> an example invocation:
>>
>> $ qm remote-migrate 1234 4321 
> 'host=123.123.123.123,apitoken=pveapitoken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' 
> --target-bridge vmbr0 --target-storage zfs-a:rbd-b,nfs-c:dir-d,zfs-e 
> --online
> 
> 
> Maybe it would be better to (optionally) store the long
> 
> "'host=123.123.123.123,apitoken=pveapitoken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee,fingerprint=aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb' 
> "
> 
> in a config file in /etc/pve/priv/<targethost>.conf ?
> 
> 
> That way, the API token would not end up in the bash history.
> 
> maybe something like:
> 
> qm remote-migration 1234 4321 <targethost> ....
> 
> ?

We plan to have such functionality in the datacenter manager, as that should provide
a better way to manage such remotes and interfacing; in PVE it'd be bolted on and
would require managing this on every host/cluster separately.
IOW, this is rather the lower-level interface.

It may still make sense to allow passing a remote via more private channels, like the
environment or `stdin`. It wouldn't be hard to do: just mark the target-endpoint as optional
and fall back to $ENV{'..'} and maybe a JSON string from STDIN - which an admin who wants
to use this lower-level part directly can then even use with a file config served via
bash input redirection: `qm remote-migration 1234 4321 </etc/pve/priv/target-host.json`
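
As a rough sketch of that variant (hypothetical - none of this is implemented,
and the field names just mirror the 'proxmox-remote' property string), the file
could contain:

    {
        "host": "123.123.123.123",
        "apitoken": "PVEAPIToken=user@pve!incoming=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
        "fingerprint": "aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb:cc:dd:ee:ff:aa:bb"
    }

and be consumed via redirection as above, so the secret never ends up in the
shell history or the process list.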






* Re: [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command
  2022-10-18  6:39     ` Thomas Lamprecht
@ 2022-10-18  6:56       ` DERUMIER, Alexandre
  0 siblings, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2022-10-18  6:56 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Le mardi 18 octobre 2022 à 08:39 +0200, Thomas Lamprecht a écrit :
> 
> We plan to have such functionality in the datacenter manager, as that
> should provide
> a better way to manage such remotes and interfacing, in PVE it'd be
> bolted on and
> would require the need to manage this on every host/cluster
> separately.
> IOW. this is rather the lower level interface.
> 
oh, ok! Makes sense. I don't know too much about the roadmap,
but when you say "the datacenter manager", do you plan to release some
kind of external tool to manage multiple clusters?


> It may still make sense to allow passing a remote via more private
> channels, like the
> environment or `stdin`, wouldn't be hard to do, just mark the target-
> endpoint as optional
> and fallback to $ENV{'..'} and maybe a json string from STDIN - which
> an admin that wants
> to use this lower level part directly can then even use with a file
> config served via
> bash input redicrection `qm remote-migration 1234 4321 
> </etc/pve/priv/target-host.json`
> 
> 
yep, something like that. Could be useful in case of emergency, at
night when you only have ssh access and don't remember the full syntax,
the fingerprint, the token, ... ^_^



(BTW, I have begun testing: the migration through the websocket tunnel
is working really well! Love it with the vmid reservation!)



* [pve-devel] applied: [PATCH v6 common 1/1] schema: take over 'pve-targetstorage' option
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 common 1/1] schema: take over 'pve-targetstorage' option Fabian Grünbichler
@ 2022-11-07 15:31   ` Thomas Lamprecht
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Lamprecht @ 2022-11-07 15:31 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

Am 28/09/2022 um 14:50 schrieb Fabian Grünbichler:
> from qemu-server, for re-use in pve-container.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Notes:
>     requires versioned breaks on old qemu-server containing the option, to avoid
>     registering twice
>     
>     new in v6/follow-up to v5
> 
>  src/PVE/JSONSchema.pm | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
>

applied, with breaks of older qemu-server recorded in d/control, thanks!





* [pve-devel] applied: [PATCH v6 qemu-server 1/6] schema: move 'pve-targetstorage' to pve-common
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 1/6] schema: move 'pve-targetstorage' to pve-common Fabian Grünbichler
@ 2022-11-07 15:31   ` Thomas Lamprecht
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Lamprecht @ 2022-11-07 15:31 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

Am 28/09/2022 um 14:50 schrieb Fabian Grünbichler:
> for proper re-use in pve-container.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Notes:
>     requires versioned dependency on pve-common that has taken over the option
>     
>     new in v6 / follow-up to v5
> 
>  PVE/QemuServer.pm | 7 -------
>  1 file changed, 7 deletions(-)
> 
>

applied, with versioned d & b-d bumps for pve-common in d/control, thanks!





* [pve-devel] applied: [PATCH v6 access-control 1/1] privs: add Sys.Incoming
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 access-control 1/1] privs: add Sys.Incoming Fabian Grünbichler
@ 2022-11-07 15:38   ` Thomas Lamprecht
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Lamprecht @ 2022-11-07 15:38 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

Am 28/09/2022 um 14:50 schrieb Fabian Grünbichler:
> for guarding cross-cluster data streams like guest migrations and
> storage migrations.
> 
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
>  src/PVE/AccessControl.pm | 1 +
>  1 file changed, 1 insertion(+)
> 
>

applied, thanks!





* [pve-devel] applied: [PATCH v6 docs 1/1] pveum: mention Sys.Incoming privilege
  2022-09-28 12:50 ` [pve-devel] [PATCH v6 docs 1/1] pveum: mention Sys.Incoming privilege Fabian Grünbichler
@ 2022-11-07 15:45   ` Thomas Lamprecht
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Lamprecht @ 2022-11-07 15:45 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

Am 28/09/2022 um 14:50 schrieb Fabian Grünbichler:
> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> ---
>  pveum.adoc | 1 +
>  1 file changed, 1 insertion(+)
> 
>

applied, thanks!





end of thread, other threads:[~2022-11-07 15:45 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-28 12:50 [pve-devel] [PATCH-SERIES v6 0/13] remote migration Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 access-control 1/1] privs: add Sys.Incoming Fabian Grünbichler
2022-11-07 15:38   ` [pve-devel] applied: " Thomas Lamprecht
2022-09-28 12:50 ` [pve-devel] [PATCH v6 common 1/1] schema: take over 'pve-targetstorage' option Fabian Grünbichler
2022-11-07 15:31   ` [pve-devel] applied: " Thomas Lamprecht
2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 1/3] migration: add remote migration Fabian Grünbichler
2022-10-03 13:22   ` [pve-devel] [PATCH FOLLOW-UP " Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 2/3] pct: add 'remote-migrate' command Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 container 3/3] migrate: print mapped volume in error Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 docs 1/1] pveum: mention Sys.Incoming privilege Fabian Grünbichler
2022-11-07 15:45   ` [pve-devel] applied: " Thomas Lamprecht
2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 1/6] schema: move 'pve-targetstorage' to pve-common Fabian Grünbichler
2022-11-07 15:31   ` [pve-devel] applied: " Thomas Lamprecht
2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 2/6] mtunnel: add API endpoints Fabian Grünbichler
2022-09-30 11:52   ` Stefan Hanreich
2022-10-03  7:11     ` Fabian Grünbichler
2022-10-03 13:22   ` [pve-devel] [PATCH FOLLOW-UP " Fabian Grünbichler
2022-10-18  6:23   ` [pve-devel] [PATCH " DERUMIER, Alexandre
2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 3/6] migrate: refactor remote VM/tunnel start Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 4/6] migrate: add remote migration handling Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 5/6] api: add remote migrate endpoint Fabian Grünbichler
2022-09-28 12:50 ` [pve-devel] [PATCH v6 qemu-server 6/6] qm: add remote-migrate command Fabian Grünbichler
2022-10-17 14:40   ` DERUMIER, Alexandre
2022-10-18  6:39     ` Thomas Lamprecht
2022-10-18  6:56       ` DERUMIER, Alexandre
2022-10-17 17:22   ` DERUMIER, Alexandre
2022-09-28 12:50 ` [pve-devel] [PATCH v6 storage 1/1] (remote) export: check and untaint format Fabian Grünbichler
2022-09-29 12:39   ` [pve-devel] applied: " Thomas Lamprecht
2022-10-04 15:29 ` [pve-devel] [PATCH-SERIES v6 0/13] remote migration DERUMIER, Alexandre
