* [pve-devel] [RFC] qemu-server: add migration_type=insecure to remote-migrate
@ 2026-04-25  1:10 Bogdan Ionescu
  2026-04-30 12:40 ` Fabian Grünbichler
  0 siblings, 1 reply; 7+ messages in thread
From: Bogdan Ionescu @ 2026-04-25  1:10 UTC (permalink / raw)
  To: pve-devel@lists.proxmox.com

Hi all,

I'd like to gauge interest in adding a migration_type=insecure option to
the qm remote-migrate / remote_migrate_vm endpoint, before investing
time in a review-ready patch series.

== Motivation ==

The current remote-migrate implementation tunnels both control plane
and data plane through the websocket connection to the target's API
endpoint on 8006/tcp. This is the right default for trust reasons
(API token + TLS fingerprint, no SSH trust between clusters needed),
but the data plane throughput is severely bottlenecked by:

  - userspace bouncing through PVE::Tunnel + pveproxy + qmtunnel
    (3 Perl processes in the data path, each context-switching per
    chunk)
  - per-byte WebSocket masking in pure Perl (RFC 6455 §5.3)
  - TLS framing on top
  - lack of zero-copy / TSO offload for the streamed bytes
  - multiple TCP segments end-to-end with independent flow control

In our deployment between two DCs connected by WireGuard over a
10 Gbps link, we observe sustained ~1 MB/s for remote-migrate while
intra-cluster `qm migrate --migration_type insecure` between the same
hosts saturates the link at ~300+ MB/s. The bottleneck is clearly
the WS tunnel data path on a single Perl-bound core, not the network.

For VMs with 32+ GB of RAM, this gap decides whether a migration
finishes in 2 minutes or fails to converge entirely because the dirty
rate exceeds the throughput.
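
A back-of-the-envelope sketch of those numbers, for a single pass over
32 GiB of guest RAM. It assumes MB means 10^6 bytes, ignores pages that
are dirtied and re-sent, and transfer_seconds is a name made up for
this illustration:

```perl
use strict;
use warnings;

# Seconds needed to stream $ram_gib GiB once at a sustained rate of
# $rate_mb_per_s MB/s. Assumption: no dirty-page retransmission.
sub transfer_seconds {
    my ($ram_gib, $rate_mb_per_s) = @_;

    my $bytes = $ram_gib * 1024**3;
    return $bytes / ($rate_mb_per_s * 1_000_000);
}

# The two sustained rates quoted above.
printf "%-10s %8.0f s\n", 'tunneled', transfer_seconds(32, 1);
printf "%-10s %8.0f s\n", 'direct', transfer_seconds(32, 300);
```

Under these assumptions a single pass takes roughly 9.5 hours through
the tunnel and just under 2 minutes over direct TCP, so at ~1 MB/s any
guest that dirties memory faster than that can never converge.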

== Proposal ==

Mirror the local-cluster migration model: keep secure (WS-tunneled) as
the default, allow opt-in 'insecure' for trusted networks where the
operator has out-of-band guarantees (private cross-connect, VPN,
overlay encryption at L2/L3).

  qm remote-migrate <vmid> <target-vmid> 'apitoken=...,host=...,fp=...' \
      --target-storage ... --target-bridge ... --online \
      --migration_type insecure \
      --migration_network 10.50.0.0/24

Semantics:
  - control plane (config, NBD allocation requests, tunnel commands,
    spice ticket, etc.) still goes through the WS tunnel as today
  - data plane (QEMU memory stream + NBD storage drive-mirror) goes
    direct TCP between source and target on the standard
    60000-60050 range, with the target's listener IP resolved from
    --migration_network (same logic as local-cluster insecure)
  - root-only on the source side, consistent with migrate_vm
  - target advertises an 'insecure-remote' capability in the mtunnel
    version response so source can fail closed on older targets
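
To illustrate the listener-address part of the semantics above, here is
a minimal IPv4-only sketch of picking the one local address that falls
inside --migration_network. ipv4_to_int, ipv4_in_cidr and pick_listener
are hypothetical helpers written for this mail; the existing
local-cluster code path may resolve the address differently.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Convert a dotted-quad string to a 32-bit integer. The regex keeps
# inet_aton from attempting a hostname lookup on non-numeric input.
sub ipv4_to_int {
    my ($ip) = @_;

    die "invalid IPv4 address '$ip'\n" if $ip !~ /^\d{1,3}(?:\.\d{1,3}){3}$/;
    my $packed = inet_aton($ip) // die "invalid IPv4 address '$ip'\n";
    return unpack('N', $packed);
}

# True if $ip lies inside $cidr (e.g. '10.50.0.0/24').
sub ipv4_in_cidr {
    my ($ip, $cidr) = @_;

    my ($net, $bits) = split m{/}, $cidr;
    my $mask = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    return (ipv4_to_int($ip) & $mask) == (ipv4_to_int($net) & $mask);
}

# Return the single candidate address inside $cidr, or die if the
# choice would be missing or ambiguous (assumed policy for this sketch).
sub pick_listener {
    my ($cidr, @addrs) = @_;

    my @hits = grep { ipv4_in_cidr($_, $cidr) } @addrs;
    die "no local address in $cidr\n" if !@hits;
    die "multiple local addresses in $cidr\n" if @hits > 1;
    return $hits[0];
}

print pick_listener('10.50.0.0/24', '192.168.1.5', '10.50.0.7'), "\n";
```

Refusing to guess when zero or several addresses match keeps the data
plane from silently binding to an interface outside the trusted
network.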

== Backward compatibility approach ==

Rather than bumping WS_TUNNEL_VERSION (which would break
new-source -> old-target combinations because of the existing
"$WS_TUNNEL_VERSION > $tunnel->{version}" check), I'd add a
forward-compatible 'caps' field to the version response. Old sources
ignore unknown JSON keys; new sources require 'insecure-remote' to be
present in caps before allowing migration_type=insecure, and otherwise
fall through to the existing WS-tunneled path with no behavioral
change.

This means all four mix matrices are clean:
  - patched <-> patched, secure: identical to today
  - unpatched src -> patched tgt: caps ignored, WS path as today
  - patched src -> unpatched tgt, secure: caps absent, not checked,
    WS path as today
  - patched src -> unpatched tgt, insecure: source dies early with a
    clear "upgrade target or omit migration_type=insecure" error,
    no side effects on target
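
The source-side part of this matrix can be sketched as a small decision
function. This is an illustration only: select_data_path and the hash
it takes are hypothetical, and the only things taken from the proposal
are the 'caps' list, the 'insecure-remote' entry and the fail-closed
behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the data path from the requested
# migration type and the target's 'version' response.
sub select_data_path {
    my ($requested_type, $version_res) = @_;

    # Unpatched targets send no 'caps' key; treat that as an empty list.
    my $caps = ref($version_res->{caps}) eq 'ARRAY' ? $version_res->{caps} : [];

    # Secure requests never look at caps, so old targets keep working.
    return 'websocket' if $requested_type ne 'insecure';

    # Fail closed: never fall back silently when insecure was requested.
    die "upgrade target or omit migration_type=insecure\n"
        if !grep { $_ eq 'insecure-remote' } @$caps;

    return 'insecure';
}

# patched src -> unpatched tgt, secure
print select_data_path('secure', { api => 2, age => 0 }), "\n";
# patched src -> patched tgt, insecure
print select_data_path('insecure', { api => 2, age => 0, caps => ['insecure-remote'] }), "\n";
# patched src -> unpatched tgt, insecure
my $ok = eval { select_data_path('insecure', { api => 2, age => 0 }); 1 };
print "refused: $@" if !$ok;
```

The unpatched-source case needs no code at all, since an old source
never reads the 'caps' key in the first place.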

== Security considerations ==

  - root-only at the API/CLI layer, same as the local-cluster knob
  - documented as requiring trusted/private network between clusters
  - no change to control plane or auth (still API token + TLS fp)
  - data plane confidentiality drops to network-layer controls only,
    which is identical to the trade-off operators already make for
    intra-cluster insecure migration
  - no new ports beyond the existing 60000-60050 range that
    insecure migration already uses
  - source-side caps check ensures no silent downgrade when target
    doesn't support it

== Open questions ==

  1. Is this direction acceptable in principle, or would you prefer
     a different direction?

  2. Should the 'caps' mechanism be added in a standalone preliminary
     patch (useful as future-proofing for any opt-in mtunnel feature),
     or rolled into the same series?

  3. Should NBD direct-TCP be gated by a separate flag, or is it fine
     to have migration_type=insecure imply both RAM and NBD direct?
     The intra-cluster knob ties them together today.

  4. Any preference on the parameter name? I matched migrate_vm
     ('migration_type', 'migration_network') for consistency, but
     'data-direct-tcp' or similar would also work and arguably be
     less misleading since the control plane is still encrypted.


Thanks,
Bogdan




* Re: [pve-devel] [RFC] qemu-server: add migration_type=insecure to remote-migrate
  2026-04-25  1:10 [pve-devel] [RFC] qemu-server: add migration_type=insecure to remote-migrate Bogdan Ionescu
@ 2026-04-30 12:40 ` Fabian Grünbichler
  2026-05-14 22:25   ` Bogdan Ionescu
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Fabian Grünbichler @ 2026-04-30 12:40 UTC (permalink / raw)
  To: Bogdan Ionescu, pve-devel@lists.proxmox.com

On April 25, 2026 3:10 am, Bogdan Ionescu wrote:
> Hi all,
> 
> I'd like to gauge interest in adding a migration_type=insecure option to
> the qm remote-migrate / remote_migrate_vm endpoint, before investing
> time in a review-ready patch series.

Hi!

This is something that we will need sooner or later as well, in the
context of PDM and fabrics.

> == Motivation ==
> 
> The current remote-migrate implementation tunnels both control plane
> and data plane through the websocket connection to the target's API
> endpoint on 8006/tcp. This is the right default for trust reasons
> (API token + TLS fingerprint, no SSH trust between clusters needed),
> but the data plane throughput is severely bottlenecked by:
> 
>   - userspace bouncing through PVE::Tunnel + pveproxy + qmtunnel
>     (3 Perl processes in the data path, each context-switching per
>     chunk)
>   - per-byte WebSocket masking in pure Perl (RFC 6455 §5.3)
>   - TLS framing on top
>   - lack of zero-copy / TSO offload for the streamed bytes
>   - multiple TCP segments end-to-end with independent flow control
> 
> In our deployment between two DCs connected by WireGuard over a
> 10 Gbps link, we observe sustained ~1 MB/s for remote-migrate while
> intra-cluster `qm migrate --migration_type insecure` between the same
> hosts saturates the link at ~300+ MB/s. The bottleneck is clearly
> the WS tunnel data path on a single Perl-bound core, not the network.
> 
> For VMs with 32+ GB of RAM, this gap decides whether a migration
> finishes in 2 minutes or fails to converge entirely because the dirty
> rate exceeds the throughput.
> 
> == Proposal ==
> 
> Mirror the local-cluster migration model: keep secure (WS-tunneled) as
> the default, allow opt-in 'insecure' for trusted networks where the
> operator has out-of-band guarantees (private cross-connect, VPN,
> overlay encryption at L2/L3).
> 
>   qm remote-migrate <vmid> <target-vmid> 'apitoken=...,host=...,fp=...' \
>       --target-storage ... --target-bridge ... --online \
>       --migration_type insecure \
>       --migration_network 10.50.0.0/24
> 
> Semantics:
>   - control plane (config, NBD allocation requests, tunnel commands,
>     spice ticket, etc.) still goes through the WS tunnel as today

this makes sense

>   - data plane (QEMU memory stream + NBD storage drive-mirror) goes
>     direct TCP between source and target on the standard
>     60000-60050 range, with the target's listener IP resolved from
>     --migration_network (same logic as local-cluster insecure)

this as well, though as an alternative one might consider providing an
interface name as well?

>   - root-only on the source side, consistent with migrate_vm

here we have a slight difference between intra-cluster and inter-cluster
migrations:
- within a cluster, we have established trust and a shared
  authentication scope - node A asking node B about its migration
  address is okay (post-authentication), since a regular user cannot
  override it
- between clusters, we have fewer guarantees - while the target has to
  trust the source somewhat (which is why we require a separate
  privilege for allowing incoming remote migrations in the first place),
  I am not sure whether we would not want to require some additional
  privileges for allowing insecure migrations as well? e.g. Sys.Modify
  somewhere, or something similar?

we might also consider whether it makes sense to pre-configure remote
migration networks and allow selecting them by ID, though that could be
added later as follow-up as well.

>   - target advertises an 'insecure-remote' capability in the mtunnel
>     version response so source can fail closed on older targets

right, without this an outdated remote node would start the VM with tcp
migration, but the mtunnel endpoint would then die because it only
allows unix sockets atm..

> 
> == Backward compatibility approach ==
> 
> Rather than bumping WS_TUNNEL_VERSION (which would break
> new-source -> old-target combinations because of the existing
> "$WS_TUNNEL_VERSION > $tunnel->{version}" check), I'd add a
> forward-compatible 'caps' field to the version response. Old sources
> ignore unknown JSON keys; new sources require 'insecure-remote' to be
> present in caps before allowing migration_type=insecure, and otherwise
> fall through to the existing WS-tunneled path with no behavioral
> change.

what do you think @Fiona?

> This means all four mix matrices are clean:
>   - patched <-> patched, secure: identical to today
>   - unpatched src -> patched tgt: caps ignored, WS path as today
>   - patched src -> unpatched tgt, secure: caps absent, not checked,
>     WS path as today
>   - patched src -> unpatched tgt, insecure: source dies early with a
>     clear "upgrade target or omit migration_type=insecure" error,
>     no side effects on target
> 
> == Security considerations ==
> 
>   - root-only at the API/CLI layer, same as the local-cluster knob
>   - documented as requiring trusted/private network between clusters

this part here is a big one - it really needs documentation that screams
"double check to ensure this doesn't accidentally broadcast clear text
migration data over the internet"

>   - no change to control plane or auth (still API token + TLS fp)
>   - data plane confidentiality drops to network-layer controls only,
>     which is identical to the trade-off operators already make for
>     intra-cluster insecure migration
>   - no new ports beyond the existing 60000-60050 range that
>     insecure migration already uses
>   - source-side caps check ensures no silent downgrade when target
>     doesn't support it
> 
> == Open questions ==
> 
>   1. Is this direction acceptable in principle, or would you prefer
>      a different direction?

it mostly looks good to me at a quick glance, but it might be sensible to
wait for additional input from Fiona or Thomas before diving in.

>   2. Should the 'caps' mechanism be added in a standalone preliminary
>      patch (useful as future-proofing for any opt-in mtunnel feature),
>      or rolled into the same series?
> 
>   3. Should NBD direct-TCP be gated by a separate flag, or is it fine
>      to have migration_type=insecure imply both RAM and NBD direct?
>      The intra-cluster knob ties them together today.

I think it's fine to tie them together, the same considerations apply to
them.

>   4. Any preference on the parameter name? I matched migrate_vm
>      ('migration_type', 'migration_network') for consistency, but
>      'data-direct-tcp' or similar would also work and arguably be
>      less misleading since the control plane is still encrypted.

that's also true for local migration - the control plane is SSH in that
case..





* Re: [pve-devel] [RFC] qemu-server: add migration_type=insecure to remote-migrate
  2026-04-30 12:40 ` Fabian Grünbichler
@ 2026-05-14 22:25   ` Bogdan Ionescu
  2026-05-14 22:27   ` [PATCH qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
  2026-05-15 21:54   ` [PATCH v2 qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
  2 siblings, 0 replies; 7+ messages in thread
From: Bogdan Ionescu @ 2026-05-14 22:25 UTC (permalink / raw)
  To: pve-devel@lists.proxmox.com

Hi,

following the feedback, I reworked the implementation and split it into a small patch series.

Changes compared to the RFC version:

  - kept the websocket tunnel for the control plane only
  - migration and NBD storage traffic use direct TCP only when
    migration_type=insecure
  - added capability-based negotiation (insecure-remote) instead of
    bumping the tunnel API version
  - added support for migration_type / migration_network to
    qm remote-migrate
  - added additional permission checks requiring Sys.Modify on /
  - added more explicit warnings that guest RAM and disk migration data
    may be transferred unencrypted

I tested this with online remote migration between two PVE 9 nodes using local ZFS-backed disks.

The remote migration path successfully used:

  - NBD over direct TCP for storage migration
  - direct TCP for QEMU migration state
  - websocket tunnel only for the control plane

Patch series follows.

Kind regards, 
Bogdan





* [PATCH qemu-server] remote migration: allow insecure TCP data plane
  2026-04-30 12:40 ` Fabian Grünbichler
  2026-05-14 22:25   ` Bogdan Ionescu
@ 2026-05-14 22:27   ` Bogdan Ionescu
  2026-05-14 22:27     ` [PATCH pve-guest-common] tunnel: propagate remote capabilities Bogdan Ionescu
  2026-05-15 21:54   ` [PATCH v2 qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
  2 siblings, 1 reply; 7+ messages in thread
From: Bogdan Ionescu @ 2026-05-14 22:27 UTC (permalink / raw)
  To: f.gruenbichler; +Cc: pve-devel, Bogdan Ionescu

Signed-off-by: Bogdan Ionescu <bogdan@ionescu.at>
---
 src/PVE/API2/Qemu.pm   | 47 +++++++++++++++++++++++++++++++++++-------
 src/PVE/CLI/qm.pm      | 17 +++++++++++++++
 src/PVE/QemuMigrate.pm | 10 +++++++++
 3 files changed, 66 insertions(+), 8 deletions(-)

diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
index d762401b..43888ab5 100644
--- a/src/PVE/API2/Qemu.pm
+++ b/src/PVE/API2/Qemu.pm
@@ -5668,6 +5668,23 @@ __PACKAGE__->register_method({
                 minimum => '0',
                 default => 'migrate limit from datacenter or storage config',
             },
+            migration_type => {
+                type => 'string',
+                enum => ['secure', 'insecure'],
+                description =>
+                    "Migration traffic is encrypted using a websocket tunnel by default. "
+                    . "On secure, completely private networks this can be disabled to "
+                    . "increase performance. WARNING: with 'insecure', VM RAM and disk "
+                    . "migration data is transferred in clear text over the selected "
+                    . "migration network.",
+                optional => 1,
+            },
+            migration_network => {
+                type => 'string',
+                format => 'CIDR',
+                description => "CIDR of the trusted private network used for insecure remote migration.",
+                optional => 1,
+            },
         },
     },
     returns => {
@@ -5683,9 +5700,16 @@ __PACKAGE__->register_method({
         my $source_vmid = extract_param($param, 'vmid');
         my $target_endpoint = extract_param($param, 'target-endpoint');
         my $target_vmid = extract_param($param, 'target-vmid') // $source_vmid;
+        my $migration_type = extract_param($param, 'migration_type') // 'secure';
+        my $migration_network = extract_param($param, 'migration_network');
 
         my $delete = extract_param($param, 'delete') // 0;
 
+        # insecure remote migration can transfer VM RAM and disk data in clear text
+        if ($migration_type eq 'insecure' || defined($migration_network)) {
+            $rpcenv->check_full($authuser, "/", ['Sys.Modify']);
+        }
+
         PVE::Cluster::check_cfs_quorum();
 
         # test if VM exists
@@ -5760,7 +5784,8 @@ __PACKAGE__->register_method({
             client => $api_client,
             vmid => $target_vmid,
         };
-        $param->{migration_type} = 'websocket';
+        $param->{migration_type} = $migration_type eq 'insecure' ? 'insecure' : 'websocket';
+        $param->{migration_network} = $migration_network if defined($migration_network);
         $param->{'with-local-disks'} = 1;
         $param->{delete} = $delete if $delete;
 
@@ -6716,6 +6741,7 @@ __PACKAGE__->register_method({
                     return {
                         api => $PVE::QemuMigrate::WS_TUNNEL_VERSION,
                         age => 0,
+                        caps => ['insecure-remote'],
                     };
                 },
                 'config' => sub {
@@ -6866,19 +6892,24 @@ __PACKAGE__->register_method({
                         $params->{migrate_opts},
                     );
 
-                    if ($info->{migrate}->{proto} ne 'unix') {
+                    if (
+                        $params->{migrate_opts}->{type} ne 'insecure'
+                        && $info->{migrate}->{proto} ne 'unix'
+                    ) {
                         PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
                         die "migration over non-UNIX sockets not possible\n";
                     }
 
-                    my $socket = $info->{migrate}->{addr};
-                    chown $state->{socket_uid}, -1, $socket;
-                    $state->{sockets}->{$socket} = 1;
-
-                    my $unix_sockets = $info->{migrate}->{unix_sockets};
-                    foreach my $socket (@$unix_sockets) {
+                    if ($info->{migrate}->{proto} eq 'unix') {
+                        my $socket = $info->{migrate}->{addr};
                         chown $state->{socket_uid}, -1, $socket;
                         $state->{sockets}->{$socket} = 1;
+
+                        my $unix_sockets = $info->{migrate}->{unix_sockets} // [];
+                        foreach my $socket (@$unix_sockets) {
+                            chown $state->{socket_uid}, -1, $socket;
+                            $state->{sockets}->{$socket} = 1;
+                        }
                     }
                     return $info;
                 },
diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index bfa0d1d5..43ec442a 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -224,6 +224,23 @@ __PACKAGE__->register_method({
                 minimum => '0',
                 default => 'migrate limit from datacenter or storage config',
             },
+            migration_type => {
+                type => 'string',
+                enum => ['secure', 'insecure'],
+                description =>
+                    "Migration traffic is encrypted using a websocket tunnel by default. "
+                    . "On secure, completely private networks this can be disabled to "
+                    . "increase performance. WARNING: with 'insecure', VM RAM and disk "
+                    . "migration data is transferred in clear text over the selected "
+                    . "migration network.",
+                optional => 1,
+            },
+            migration_network => {
+                type => 'string',
+                format => 'CIDR',
+                description => "CIDR of the trusted private network used for insecure remote migration.",
+                optional => 1,
+            },
         },
     },
     returns => {
diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
index 8f38bf69..617cc1de 100644
--- a/src/PVE/QemuMigrate.pm
+++ b/src/PVE/QemuMigrate.pm
@@ -46,6 +46,12 @@ use base qw(PVE::AbstractMigrate);
 # compared against remote end's minimum version
 our $WS_TUNNEL_VERSION = 2;
 
+sub remote_tunnel_has_cap {
+    my ($tunnel, $cap) = @_;
+
+    return grep { $_ eq $cap } @{ $tunnel->{caps} // [] };
+}
+
 sub fork_tunnel {
     my ($self, $ssh_forward_info) = @_;
 
@@ -351,6 +357,10 @@ sub prepare {
             if $WS_TUNNEL_VERSION < $min_version;
         die "Remote tunnel endpoint too old, upgrade required\n"
             if $WS_TUNNEL_VERSION > $tunnel->{version};
+        die "Remote tunnel endpoint does not support insecure remote migration, upgrade target or"
+            . " omit migration_type=insecure\n"
+            if $self->{opts}->{migration_type} eq 'insecure'
+            && !remote_tunnel_has_cap($tunnel, 'insecure-remote');
 
         print "websocket tunnel started\n";
         $self->{tunnel} = $tunnel;
-- 
2.47.3





* [PATCH pve-guest-common] tunnel: propagate remote capabilities
  2026-05-14 22:27   ` [PATCH qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
@ 2026-05-14 22:27     ` Bogdan Ionescu
  0 siblings, 0 replies; 7+ messages in thread
From: Bogdan Ionescu @ 2026-05-14 22:27 UTC (permalink / raw)
  To: f.gruenbichler; +Cc: pve-devel, Bogdan Ionescu

Signed-off-by: Bogdan Ionescu <bogdan@ionescu.at>
---
 src/PVE/Tunnel.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/PVE/Tunnel.pm b/src/PVE/Tunnel.pm
index 791b465..952fd66 100644
--- a/src/PVE/Tunnel.pm
+++ b/src/PVE/Tunnel.pm
@@ -315,6 +315,8 @@ sub fork_websocket_tunnel {
             if ($version =~ /^(\d+)$/) {
                 $tunnel->{version} = $1;
                 $tunnel->{age} = $res->{age};
+                $tunnel->{caps} = $res->{caps}
+                    if defined($res->{caps}) && ref($res->{caps}) eq 'ARRAY';
             } else {
                 $err = "received invalid tunnel version string '$version'\n" if !$err;
             }
-- 
2.47.3





* [PATCH v2 qemu-server] remote migration: allow insecure TCP data plane
  2026-04-30 12:40 ` Fabian Grünbichler
  2026-05-14 22:25   ` Bogdan Ionescu
  2026-05-14 22:27   ` [PATCH qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
@ 2026-05-15 21:54   ` Bogdan Ionescu
  2026-05-15 21:54     ` [PATCH v2 pve-guest-common] tunnel: propagate remote capabilities Bogdan Ionescu
  2 siblings, 1 reply; 7+ messages in thread
From: Bogdan Ionescu @ 2026-05-15 21:54 UTC (permalink / raw)
  To: f.gruenbichler; +Cc: pve-devel, Bogdan Ionescu

Expose migration_type=insecure and migration_network for remote
migration.

The websocket tunnel remains used for control commands, while QEMU
migration state and NBD storage migration can use a direct TCP data
plane. This avoids websocket masking/TLS overhead on trusted private
migration networks.

Gate the feature with an insecure-remote mtunnel capability so source
nodes fail early when the target does not support it. Require Sys.Modify
on / when using the insecure data plane or explicitly selecting a
migration network.

This mode is only intended for fully trusted private networks, as guest
RAM and disk migration data may be transferred in clear text.

Signed-off-by: Bogdan Ionescu <bogdan@ionescu.at>
---
 src/PVE/API2/Qemu.pm   | 47 +++++++++++++++++++++++-----
 src/PVE/CLI/qm.pm      | 17 ++++++++++
 src/PVE/QemuMigrate.pm | 70 ++++++++++++++++++++++++++++++------------
 3 files changed, 107 insertions(+), 27 deletions(-)

diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
index d762401b..43888ab5 100644
--- a/src/PVE/API2/Qemu.pm
+++ b/src/PVE/API2/Qemu.pm
@@ -5668,6 +5668,23 @@ __PACKAGE__->register_method({
                 minimum => '0',
                 default => 'migrate limit from datacenter or storage config',
             },
+            migration_type => {
+                type => 'string',
+                enum => ['secure', 'insecure'],
+                description =>
+                    "Migration traffic is encrypted using a websocket tunnel by default. "
+                    . "On secure, completely private networks this can be disabled to "
+                    . "increase performance. WARNING: with 'insecure', VM RAM and disk "
+                    . "migration data is transferred in clear text over the selected "
+                    . "migration network.",
+                optional => 1,
+            },
+            migration_network => {
+                type => 'string',
+                format => 'CIDR',
+                description => "CIDR of the trusted private network used for insecure remote migration.",
+                optional => 1,
+            },
         },
     },
     returns => {
@@ -5683,9 +5700,16 @@ __PACKAGE__->register_method({
         my $source_vmid = extract_param($param, 'vmid');
         my $target_endpoint = extract_param($param, 'target-endpoint');
         my $target_vmid = extract_param($param, 'target-vmid') // $source_vmid;
+        my $migration_type = extract_param($param, 'migration_type') // 'secure';
+        my $migration_network = extract_param($param, 'migration_network');
 
         my $delete = extract_param($param, 'delete') // 0;
 
+        # insecure remote migration can transfer VM RAM and disk data in clear text
+        if ($migration_type eq 'insecure' || defined($migration_network)) {
+            $rpcenv->check_full($authuser, "/", ['Sys.Modify']);
+        }
+
         PVE::Cluster::check_cfs_quorum();
 
         # test if VM exists
@@ -5760,7 +5784,8 @@ __PACKAGE__->register_method({
             client => $api_client,
             vmid => $target_vmid,
         };
-        $param->{migration_type} = 'websocket';
+        $param->{migration_type} = $migration_type eq 'insecure' ? 'insecure' : 'websocket';
+        $param->{migration_network} = $migration_network if defined($migration_network);
         $param->{'with-local-disks'} = 1;
         $param->{delete} = $delete if $delete;
 
@@ -6716,6 +6741,7 @@ __PACKAGE__->register_method({
                     return {
                         api => $PVE::QemuMigrate::WS_TUNNEL_VERSION,
                         age => 0,
+                        caps => ['insecure-remote'],
                     };
                 },
                 'config' => sub {
@@ -6866,19 +6892,24 @@ __PACKAGE__->register_method({
                         $params->{migrate_opts},
                     );
 
-                    if ($info->{migrate}->{proto} ne 'unix') {
+                    if (
+                        $params->{migrate_opts}->{type} ne 'insecure'
+                        && $info->{migrate}->{proto} ne 'unix'
+                    ) {
                         PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
                         die "migration over non-UNIX sockets not possible\n";
                     }
 
-                    my $socket = $info->{migrate}->{addr};
-                    chown $state->{socket_uid}, -1, $socket;
-                    $state->{sockets}->{$socket} = 1;
-
-                    my $unix_sockets = $info->{migrate}->{unix_sockets};
-                    foreach my $socket (@$unix_sockets) {
+                    if ($info->{migrate}->{proto} eq 'unix') {
+                        my $socket = $info->{migrate}->{addr};
                         chown $state->{socket_uid}, -1, $socket;
                         $state->{sockets}->{$socket} = 1;
+
+                        my $unix_sockets = $info->{migrate}->{unix_sockets} // [];
+                        foreach my $socket (@$unix_sockets) {
+                            chown $state->{socket_uid}, -1, $socket;
+                            $state->{sockets}->{$socket} = 1;
+                        }
                     }
                     return $info;
                 },
diff --git a/src/PVE/CLI/qm.pm b/src/PVE/CLI/qm.pm
index bfa0d1d5..43ec442a 100755
--- a/src/PVE/CLI/qm.pm
+++ b/src/PVE/CLI/qm.pm
@@ -224,6 +224,23 @@ __PACKAGE__->register_method({
                 minimum => '0',
                 default => 'migrate limit from datacenter or storage config',
             },
+            migration_type => {
+                type => 'string',
+                enum => ['secure', 'insecure'],
+                description =>
+                    "Migration traffic is encrypted using a websocket tunnel by default. "
+                    . "On secure, completely private networks this can be disabled to "
+                    . "increase performance. WARNING: with 'insecure', VM RAM and disk "
+                    . "migration data is transferred in clear text over the selected "
+                    . "migration network.",
+                optional => 1,
+            },
+            migration_network => {
+                type => 'string',
+                format => 'CIDR',
+                description => "CIDR of the trusted private network used for insecure remote migration.",
+                optional => 1,
+            },
         },
     },
     returns => {
diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
index 8f38bf69..455024f0 100644
--- a/src/PVE/QemuMigrate.pm
+++ b/src/PVE/QemuMigrate.pm
@@ -46,6 +46,12 @@ use base qw(PVE::AbstractMigrate);
 # compared against remote end's minimum version
 our $WS_TUNNEL_VERSION = 2;
 
+sub remote_tunnel_has_cap {
+    my ($tunnel, $cap) = @_;
+
+    return grep { $_ eq $cap } @{ $tunnel->{caps} // [] };
+}
+
 sub fork_tunnel {
     my ($self, $ssh_forward_info) = @_;
 
@@ -351,6 +357,10 @@ sub prepare {
             if $WS_TUNNEL_VERSION < $min_version;
         die "Remote tunnel endpoint too old, upgrade required\n"
             if $WS_TUNNEL_VERSION > $tunnel->{version};
+        die "Remote tunnel endpoint does not support insecure remote migration, upgrade target or"
+            . " omit migration_type=insecure\n"
+            if $self->{opts}->{migration_type} eq 'insecure'
+            && !remote_tunnel_has_cap($tunnel, 'insecure-remote');
 
         print "websocket tunnel started\n";
         $self->{tunnel} = $tunnel;
@@ -1144,8 +1154,9 @@ sub phase2_start_local_cluster {
 sub phase2_start_remote_cluster {
     my ($self, $vmid, $params) = @_;
 
-    die "insecure migration to remote cluster not implemented\n"
-        if $params->{migrate_opts}->{type} ne 'websocket';
+    die "unsupported remote migration type '$params->{migrate_opts}->{type}'\n"
+        if $params->{migrate_opts}->{type} ne 'websocket'
+        && $params->{migrate_opts}->{type} ne 'insecure';
 
     my $remote_vmid = $self->{opts}->{remote}->{vmid};
 
@@ -1159,8 +1170,13 @@ sub phase2_start_remote_cluster {
         $self->{stopnbd} = 1;
         $self->{target_drive}->{$drive}->{drivestr} = $res->{drives}->{$drive}->{drivestr};
         my $nbd_uri = $res->{drives}->{$drive}->{nbd_uri};
-        die "unexpected NBD uri for '$drive': $nbd_uri\n"
-            if $nbd_uri !~ s!/run/qemu-server/$remote_vmid\_!/run/qemu-server/$vmid\_!;
+        if ($params->{migrate_opts}->{type} eq 'websocket') {
+            die "unexpected NBD uri for '$drive': $nbd_uri\n"
+                if $nbd_uri !~ s!/run/qemu-server/$remote_vmid\_!/run/qemu-server/$vmid\_!;
+        } elsif ($params->{migrate_opts}->{type} eq 'insecure') {
+            die "unexpected NBD uri for '$drive': $nbd_uri\n"
+                if $nbd_uri !~ m!^nbd:(?:localhost|[\d\.]+|\[[\d\.:a-fA-F]+\]):\d+:exportname=drive-[A-Za-z0-9_.-]+$!;
+        }
 
         $self->{target_drive}->{$drive}->{nbd_uri} = $nbd_uri;
     }
@@ -1254,28 +1270,44 @@ sub phase2 {
         my $remote_vmid = $remote->{vmid};
         $params->{migrate_opts}->{remote_node} = $self->{node};
         ($tunnel_info, $spice_port) = $self->phase2_start_remote_cluster($vmid, $params);
-        die "only UNIX sockets are supported for remote migration\n"
-            if $tunnel_info->{proto} ne 'unix';
-
-        # untaint
-        my ($remote_socket) = $tunnel_info->{addr} =~ m|^(/run/qemu-server/\d+\.migrate)$|
-            or die "unexpected socket address '$tunnel_info->{addr}'\n";
-        my $local_socket = $remote_socket;
-        $local_socket =~ s/$remote_vmid/$vmid/g;
-        $tunnel_info->{addr} = $local_socket;
 
-        $self->log('info', "Setting up tunnel for '$local_socket'");
-        PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+        if ($params->{migrate_opts}->{type} eq 'websocket') {
+            die "only UNIX sockets are supported for remote migration\n"
+                if $tunnel_info->{proto} ne 'unix';
 
-        foreach my $remote_socket (@{ $tunnel_info->{unix_sockets} }) {
             # untaint
-            ($remote_socket) = $remote_socket =~ m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|
-                or die "unexpected socket address '$remote_socket'\n";
+            my ($remote_socket) = $tunnel_info->{addr} =~ m|^(/run/qemu-server/\d+\.migrate)$|
+                or die "unexpected socket address '$tunnel_info->{addr}'\n";
             my $local_socket = $remote_socket;
             $local_socket =~ s/$remote_vmid/$vmid/g;
-            next if $self->{tunnel}->{forwarded}->{$local_socket};
+            $tunnel_info->{addr} = $local_socket;
+
             $self->log('info', "Setting up tunnel for '$local_socket'");
             PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+
+            foreach my $remote_socket (@{ $tunnel_info->{unix_sockets} // [] }) {
+                # untaint
+                ($remote_socket) = $remote_socket =~ m|^(/run/qemu-server/(?:(?!\.\./).)+\.migrate)$|
+                    or die "unexpected socket address '$remote_socket'\n";
+                my $local_socket = $remote_socket;
+                $local_socket =~ s/$remote_vmid/$vmid/g;
+                next if $self->{tunnel}->{forwarded}->{$local_socket};
+                $self->log('info', "Setting up tunnel for '$local_socket'");
+                PVE::Tunnel::forward_unix_socket($self->{tunnel}, $local_socket, $remote_socket);
+            }
+        } elsif ($params->{migrate_opts}->{type} eq 'insecure') {
+            die "only TCP sockets are supported for insecure remote migration\n"
+                if $tunnel_info->{proto} ne 'tcp';
+
+            die "unexpected TCP migration address '$tunnel_info->{addr}'\n"
+                if $tunnel_info->{addr} !~ m/^(?:localhost|[\d\.]+|\[[\d\.:a-fA-F]+\])$/;
+
+            die "unexpected TCP migration port '$tunnel_info->{port}'\n"
+                if $tunnel_info->{port} !~ /^\d+$/ || $tunnel_info->{port} <= 0 || $tunnel_info->{port} > 65535;
+
+            $self->log('info', "using direct TCP migration to $tunnel_info->{addr}:$tunnel_info->{port}");
+        } else {
+            die "unsupported remote migration type '$params->{migrate_opts}->{type}'\n";
         }
     } else {
         ($tunnel_info, $spice_port) = $self->phase2_start_local_cluster($vmid, $params);
-- 
2.47.3





* [PATCH v2 pve-guest-common] tunnel: propagate remote capabilities
  2026-05-15 21:54   ` [PATCH v2 qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
@ 2026-05-15 21:54     ` Bogdan Ionescu
  0 siblings, 0 replies; 7+ messages in thread
From: Bogdan Ionescu @ 2026-05-15 21:54 UTC (permalink / raw)
  To: f.gruenbichler; +Cc: pve-devel, Bogdan Ionescu

Signed-off-by: Bogdan Ionescu <bogdan@ionescu.at>
---
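As a rough illustration of what the two added lines do, the sketch
below applies the same guard to hand-written replies. The helper name
copy_caps and the sample reply hashes are hypothetical and only stand
in for the decoded 'version' reply handled in fork_websocket_tunnel.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the guard added in this patch:
# caps is only stored when the reply carries an array reference.
sub copy_caps {
    my ($tunnel, $res) = @_;
    $tunnel->{caps} = $res->{caps}
        if defined($res->{caps}) && ref($res->{caps}) eq 'ARRAY';
    return $tunnel;
}

# Reply from a target that advertises capabilities.
my $new = copy_caps({ version => 2 }, { api => 2, age => 0, caps => ['insecure-remote'] });
# Reply from an older target without a caps field.
my $old = copy_caps({ version => 2 }, { api => 2, age => 0 });
# Malformed reply where caps is a plain string.
my $bad = copy_caps({ version => 2 }, { api => 2, age => 0, caps => 'insecure-remote' });

printf "new target caps: %s\n", join(',', @{ $new->{caps} // [] });
printf "old target has caps key: %s\n", exists($old->{caps}) ? 'yes' : 'no';
printf "malformed caps kept: %s\n", exists($bad->{caps}) ? 'yes' : 'no';
```

Since older targets simply leave the key unset, a caller that treats a
missing list as empty sees "capability absent" and does not need a
tunnel version bump to distinguish the two cases.
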
 src/PVE/Tunnel.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/PVE/Tunnel.pm b/src/PVE/Tunnel.pm
index 791b465..952fd66 100644
--- a/src/PVE/Tunnel.pm
+++ b/src/PVE/Tunnel.pm
@@ -315,6 +315,8 @@ sub fork_websocket_tunnel {
             if ($version =~ /^(\d+)$/) {
                 $tunnel->{version} = $1;
                 $tunnel->{age} = $res->{age};
+                $tunnel->{caps} = $res->{caps}
+                    if defined($res->{caps}) && ref($res->{caps}) eq 'ARRAY';
             } else {
                 $err = "received invalid tunnel version string '$version'\n" if !$err;
             }
-- 
2.47.3





Thread overview: 7+ messages
2026-04-25  1:10 [pve-devel] [RFC] qemu-server: add migration_type=insecure to remote-migrate Bogdan Ionescu
2026-04-30 12:40 ` Fabian Grünbichler
2026-05-14 22:25   ` Bogdan Ionescu
2026-05-14 22:27   ` [PATCH qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
2026-05-14 22:27     ` [PATCH pve-guest-common] tunnel: propagate remote capabilities Bogdan Ionescu
2026-05-15 21:54   ` [PATCH v2 qemu-server] remote migration: allow insecure TCP data plane Bogdan Ionescu
2026-05-15 21:54     ` [PATCH v2 pve-guest-common] tunnel: propagate remote capabilities Bogdan Ionescu
