* [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs
From: Fiona Ebner @ 2025-09-04 12:40 UTC
To: pve-devel

Changes in v3:
* add part three - snapshot handling
* backport full migration+start handling
* schema description: explicitly mention that a value of 0 means to
  not use host_mtu
* die when host_mtu > bridge MTU also upon migration and expand error
  message
* less bloaty code by always mentioning migrated host_mtu value
* move get_nets_host_mtu to network module for re-use with snapshots
* avoid overly long line in tests

Changes in v2:
* push make tidy change already to master
* add part two - migration
* move version_guard() call to outside of print_netdevice_full() call
* add comment about why host_mtu is always set in source code

The virtual hardware is generated differently (at least for i440fx
machines) when host_mtu is set or not set on the netdev command line
[0]. When the MTU is the same value as the default 1500, Proxmox VE
did not add a host_mtu parameter. This is problematic for migration
where host_mtu is present on one end of the migration, but not on the
other [1].
Always set the host_mtu parameter starting with machine version
10.0+pve1 to avoid this issue going forward. For snapshots, the
nets-host-mtu information is recorded in the snapshot config. When the
information is not present, this series keeps the behavior on Proxmox
VE 8 and Proxmox VE 9 as-is, i.e. loading a Proxmox VE 8 snapshot on
Proxmox VE 9 when the bridge MTU has a mismatch can still be
problematic. Loading snapshots made on the same major version works.
The VM start parameter already provides an escape hatch. We could also
think about doing a follow-up and automatically try to fall back to
Proxmox VE 8 default behavior when loading the snapshot fails (for
machine version < 10.0+pve1).
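
The version-gated rule described above (always emit host_mtu from
machine version 10.0+pve1 on, otherwise only for non-default values)
can be sketched roughly as follows. This is a simplified illustration
and not the code from the series: version_at_least() and
host_mtu_option() are hypothetical names, and machine versions are
passed as [major, minor, pve-revision] triples for brevity.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a machine version check: returns true if
# $have is at least $want, both given as [major, minor, pve-revision].
sub version_at_least {
    my ($have, $want) = @_;
    for my $i (0 .. 2) {
        return 1 if $have->[$i] > $want->[$i];
        return 0 if $have->[$i] < $want->[$i];
    }
    return 1; # all three components are equal
}

# Hypothetical helper: returns the host_mtu suffix that would be
# appended to the virtio-net device string.
sub host_mtu_option {
    my ($machine_version, $mtu) = @_;

    # New behavior: the option is always present, so both ends of a
    # migration generate the virtual hardware the same way.
    return ",host_mtu=$mtu" if version_at_least($machine_version, [10, 0, 1]);

    # Legacy behavior: the option is omitted for the default MTU.
    return $mtu != 1500 ? ",host_mtu=$mtu" : '';
}

printf "[%s]\n", host_mtu_option([9, 2, 0], 1500);  # prints "[]"
printf "[%s]\n", host_mtu_option([9, 2, 0], 9000);  # prints "[,host_mtu=9000]"
printf "[%s]\n", host_mtu_option([10, 0, 1], 1500); # prints "[,host_mtu=1500]"
```

The first two calls show the legacy mismatch: the same VM gets the
option on a node where the MTU is 9000, but not on a node where it is
1500.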
Moreover, the effective setting in the guest (state) will
still be the host_mtu from the source side, even if a different value
is used for host_mtu on the target instance's commandline. This will
not lead to an error loading the migration stream in QEMU, but having
a larger host_mtu than the bridge MTU is still problematic for certain
network traffic like
> iperf3 -c 10.10.10.11 -u -l 2k
when host_mtu=9000 and bridge MTU=1500.
Add the necessary parameter for VM start and pass the values along for
migration to preserve the values going forward.
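
As a rough sketch of the idea, assuming the value from the migration
source takes precedence over the locally derived one and is still
checked against the bridge MTU: effective_host_mtu() and its arguments
are hypothetical and only illustrate the check, the error text is
made up for this example, and the actual parameter handling in the
series may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the host_mtu for one NIC on the target.
#   $migrated_mtu - value passed along from the source (undef if none,
#                   0 meaning "do not use host_mtu")
#   $config_mtu   - value derived from the local VM configuration
#   $bridge_mtu   - MTU of the bridge the NIC is attached to
sub effective_host_mtu {
    my ($netid, $migrated_mtu, $config_mtu, $bridge_mtu) = @_;

    # Prefer the migrated value so the guest-visible setting is preserved.
    my $mtu = defined($migrated_mtu) ? $migrated_mtu : $config_mtu;

    # A host_mtu larger than the bridge MTU is refused, like for a cold start.
    die "netdev $netid: host_mtu '$mtu' is bigger than the bridge MTU '$bridge_mtu'\n"
        if $mtu > $bridge_mtu;

    return $mtu;
}

# Source ran with host_mtu=9000, target bridge only supports 1500.
my $mtu = eval { effective_host_mtu('net0', 9000, 1500, 1500) };
print $@ if !defined($mtu);
# prints "netdev net0: host_mtu '9000' is bigger than the bridge MTU '1500'"
```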
For Proxmox VE 8, the migration handling fixes are backported.

stable-bookworm:

Fiona Ebner (2):
  api: vm start: introduce nets-host-mtu parameter for migration compat
  migration: preserve host_mtu for virtio-net devices

master:

Fiona Ebner (6):
  virtio-net: fix migration between default/non-default MTUs starting
    with machine version 10.0+pve1
  api: vm start: introduce nets-host-mtu parameter for migration compat
  migration: preserve host_mtu for virtio-net devices
  snapshot: save vmstate: avoid using deprecated check_running()
    function
  snapshot: save vmstate: die when PID cannot be obtained
  snapshot: introduce running-nets-host-mtu property

src/PVE/API2/Qemu.pm | 13 ++++
src/PVE/QemuConfig.pm | 13 ++--
src/PVE/QemuMigrate.pm | 8 +++
src/PVE/QemuServer.pm | 62 +++++++++++++++++--
src/PVE/QemuServer/Machine.pm | 6 ++
src/PVE/QemuServer/Network.pm | 29 +++++++++
src/PVE/QemuServer/RunState.pm | 3 +-
src/test/MigrationTest/QemuMigrateMock.pm | 15 +++++
src/test/cfg2cmd/bootorder-empty.conf.cmd | 4 +-
src/test/cfg2cmd/bootorder-legacy.conf.cmd | 4 +-
src/test/cfg2cmd/bootorder.conf.cmd | 4 +-
src/test/cfg2cmd/efidisk-on-rbd.conf.cmd | 4 +-
src/test/cfg2cmd/ide.conf.cmd | 4 +-
.../cfg2cmd/netdev-7.1-multiqueues.conf.cmd | 2 +-
src/test/cfg2cmd/netdev-7.1.conf.cmd | 2 +-
src/test/cfg2cmd/netdev_vxlan.conf.cmd | 2 +-
src/test/cfg2cmd/q35-ide.conf.cmd | 4 +-
.../q35-linux-hostpci-mapping.conf.cmd | 4 +-
.../q35-linux-hostpci-multifunction.conf.cmd | 4 +-
...q35-linux-hostpci-x-pci-overrides.conf.cmd | 4 +-
src/test/cfg2cmd/q35-linux-hostpci.conf.cmd | 4 +-
src/test/cfg2cmd/q35-simple.conf.cmd | 4 +-
src/test/cfg2cmd/seabios_serial.conf.cmd | 4 +-
src/test/cfg2cmd/simple-btrfs.conf.cmd | 4 +-
.../cfg2cmd/simple-disk-passthrough.conf.cmd | 4 +-
src/test/cfg2cmd/simple-rbd.conf.cmd | 4 +-
src/test/cfg2cmd/simple-virtio-blk.conf.cmd | 4 +-
.../cfg2cmd/simple-zfs-over-iscsi.conf.cmd | 4 +-
src/test/cfg2cmd/simple1.conf.cmd | 4 +-
29 files changed, 177 insertions(+), 50 deletions(-)
--
2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
* [pve-devel] [PATCH qemu-server v3 1/8] virtio-net: fix migration between default/non-default MTUs starting with machine version 10.0+pve1
From: Fiona Ebner @ 2025-09-04 12:40 UTC
To: pve-devel

The virtual hardware is generated differently (at least for i440fx
machines) when host_mtu is set or not set on the netdev command line
[0]. When the MTU is the same value as the default 1500, Proxmox VE
did not add a host_mtu parameter. This is problematic for migration
where host_mtu is present on one end of the migration, but not on the
other [1].

Always set the host_mtu parameter starting with machine version
10.0+pve1 to avoid this issue going forward. Handling migrations with
older machine versions is more involved and will be done in separate
patches.

Thanks to Stefan Hanreich and Fabian Grünbichler for discussing this
with me!

Since print_netdevice_full() is also called for hotplug, it cannot
always use the $version_guard helper and needs to fall back to
min_version() then.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
[1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/QemuServer.pm                                  | 10 +++++++++-
 src/PVE/QemuServer/Machine.pm                          |  6 ++++++
 src/test/cfg2cmd/bootorder-empty.conf.cmd              |  4 ++--
 src/test/cfg2cmd/bootorder-legacy.conf.cmd             |  4 ++--
 src/test/cfg2cmd/bootorder.conf.cmd                    |  4 ++--
 src/test/cfg2cmd/efidisk-on-rbd.conf.cmd               |  4 ++--
 src/test/cfg2cmd/ide.conf.cmd                          |  4 ++--
 src/test/cfg2cmd/netdev-7.1-multiqueues.conf.cmd       |  2 +-
 src/test/cfg2cmd/netdev-7.1.conf.cmd                   |  2 +-
 src/test/cfg2cmd/netdev_vxlan.conf.cmd                 |  2 +-
 src/test/cfg2cmd/q35-ide.conf.cmd                      |  4 ++--
 src/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd    |  4 ++--
 .../cfg2cmd/q35-linux-hostpci-multifunction.conf.cmd   |  4 ++--
 .../cfg2cmd/q35-linux-hostpci-x-pci-overrides.conf.cmd |  4 ++--
 src/test/cfg2cmd/q35-linux-hostpci.conf.cmd            |  4 ++--
 src/test/cfg2cmd/q35-simple.conf.cmd                   |  4 ++--
 src/test/cfg2cmd/seabios_serial.conf.cmd               |  4 ++--
 src/test/cfg2cmd/simple-btrfs.conf.cmd                 |  4 ++--
 src/test/cfg2cmd/simple-disk-passthrough.conf.cmd      |  4 ++--
 src/test/cfg2cmd/simple-rbd.conf.cmd                   |  4 ++--
 src/test/cfg2cmd/simple-virtio-blk.conf.cmd            |  4 ++--
 src/test/cfg2cmd/simple-zfs-over-iscsi.conf.cmd        |  4 ++--
 src/test/cfg2cmd/simple1.conf.cmd                      |  4 ++--
 23 files changed, 54 insertions(+), 40 deletions(-)

diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 38fa3f83..5b7087dc 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -1495,7 +1495,13 @@ sub print_netdevice_full {
             die "netdev $netid: MTU '$mtu' is bigger than the bridge MTU '$bridge_mtu'\n";
         }
 
-        $tmpstr .= ",host_mtu=$mtu" if $mtu != 1500;
+        if (min_version($machine_version, 10, 0, 1)) {
+            # Always add host_mtu for migration compatibility, because the presence of host_mtu
+            # means that the virtual hardware is generated differently (at least for i440fx)
+            $tmpstr .= ",host_mtu=$mtu";
+        } else {
+            $tmpstr .= ",host_mtu=$mtu" if $mtu != 1500;
+        }
     } elsif (defined($mtu)) {
         warn
             "WARN: netdev $netid: ignoring MTU '$mtu', not using VirtIO or no bridge configured.\n";
@@ -3819,6 +3825,8 @@ sub config_to_command {
         my $netdevfull = print_netdev_full($vmid, $conf, $arch, $d, $netname);
         push @$devices, '-netdev', $netdevfull;
 
+        # force +pve1 if machine version 10.0, for host_mtu differentiation
+        $version_guard->(10, 0, 1);
         my $netdevicefull = print_netdevice_full(
             $vmid,
             $conf,
diff --git a/src/PVE/QemuServer/Machine.pm b/src/PVE/QemuServer/Machine.pm
index 9d17344a..4c135a20 100644
--- a/src/PVE/QemuServer/Machine.pm
+++ b/src/PVE/QemuServer/Machine.pm
@@ -37,6 +37,12 @@ our $PVE_MACHINE_VERSION = {
             '+pve1' => 'Disables S3/S4 power states by default.',
         },
     },
+    '10.0' => {
+        highest => 1,
+        revisions => {
+            '+pve1' => 'Set host_mtu vNIC option even with default value for migration compat.',
+        },
+    },
 };
 
 my $machine_fmt = {
diff --git a/src/test/cfg2cmd/bootorder-empty.conf.cmd b/src/test/cfg2cmd/bootorder-empty.conf.cmd
index 3516b344..af4a5ba6 100644
--- a/src/test/cfg2cmd/bootorder-empty.conf.cmd
+++ b/src/test/cfg2cmd/bootorder-empty.conf.cmd
@@ -39,5 +39,5 @@
   -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.qcow2","node-name":"eeb683fb9c516c1a8707c917f0d7a38","read-only":false},"node-name":"feb683fb9c516c1a8707c917f0d7a38","read-only":false},"node-name":"drive-virtio1","read-only":false,"throttle-group":"throttle-drive-virtio1"}' \
   -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb,iothread=iothread-virtio1,write-cache=on' \
   -netdev
'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/bootorder-legacy.conf.cmd b/src/test/cfg2cmd/bootorder-legacy.conf.cmd index c86ab6f9..6b848a9b 100644 --- a/src/test/cfg2cmd/bootorder-legacy.conf.cmd +++ b/src/test/cfg2cmd/bootorder-legacy.conf.cmd @@ -39,5 +39,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.qcow2","node-name":"eeb683fb9c516c1a8707c917f0d7a38","read-only":false},"node-name":"feb683fb9c516c1a8707c917f0d7a38","read-only":false},"node-name":"drive-virtio1","read-only":false,"throttle-group":"throttle-drive-virtio1"}' \ -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb,iothread=iothread-virtio1,bootindex=302,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=100' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=100,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/bootorder.conf.cmd 
b/src/test/cfg2cmd/bootorder.conf.cmd index 48f9da8b..a3c6bd39 100644 --- a/src/test/cfg2cmd/bootorder.conf.cmd +++ b/src/test/cfg2cmd/bootorder.conf.cmd @@ -39,5 +39,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.qcow2","node-name":"eeb683fb9c516c1a8707c917f0d7a38","read-only":false},"node-name":"feb683fb9c516c1a8707c917f0d7a38","read-only":false},"node-name":"drive-virtio1","read-only":false,"throttle-group":"throttle-drive-virtio1"}' \ -device 'virtio-blk-pci,drive=drive-virtio1,id=virtio1,bus=pci.0,addr=0xb,iothread=iothread-virtio1,bootindex=100,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=101' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=101,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/efidisk-on-rbd.conf.cmd b/src/test/cfg2cmd/efidisk-on-rbd.conf.cmd index 5d0c8aff..dda9d91b 100644 --- a/src/test/cfg2cmd/efidisk-on-rbd.conf.cmd +++ b/src/test/cfg2cmd/efidisk-on-rbd.conf.cmd @@ -31,5 +31,5 @@ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 
'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=pc+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=pc+pve1' diff --git a/src/test/cfg2cmd/ide.conf.cmd b/src/test/cfg2cmd/ide.conf.cmd index 6b5a52a9..23282a18 100644 --- a/src/test/cfg2cmd/ide.conf.cmd +++ b/src/test/cfg2cmd/ide.conf.cmd @@ -42,5 +42,5 @@ -blockdev '{"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vz/images/100/vm-100-disk-2.qcow2","node-name":"ec11e0572184321efc5835152b95d5d","read-only":false},"node-name":"fc11e0572184321efc5835152b95d5d","read-only":false},"node-name":"drive-scsi0","read-only":false,"throttle-group":"throttle-drive-scsi0"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,device_id=drive-scsi0,bootindex=100,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/netdev-7.1-multiqueues.conf.cmd b/src/test/cfg2cmd/netdev-7.1-multiqueues.conf.cmd index 776bab30..43e40742 100644 --- 
a/src/test/cfg2cmd/netdev-7.1-multiqueues.conf.cmd +++ b/src/test/cfg2cmd/netdev-7.1-multiqueues.conf.cmd @@ -25,4 +25,4 @@ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on,queues=2' \ -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,vectors=6,mq=on,packed=on,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=900' \ - -machine 'type=pc+pve0' + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/netdev-7.1.conf.cmd b/src/test/cfg2cmd/netdev-7.1.conf.cmd index 0d6b3ad2..10404de4 100644 --- a/src/test/cfg2cmd/netdev-7.1.conf.cmd +++ b/src/test/cfg2cmd/netdev-7.1.conf.cmd @@ -25,4 +25,4 @@ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=900' \ - -machine 'type=pc+pve0' + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/netdev_vxlan.conf.cmd b/src/test/cfg2cmd/netdev_vxlan.conf.cmd index a2f3579d..7de574a7 100644 --- a/src/test/cfg2cmd/netdev_vxlan.conf.cmd +++ b/src/test/cfg2cmd/netdev_vxlan.conf.cmd @@ -25,4 +25,4 @@ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1450' \ - -machine 'type=pc+pve0' + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/q35-ide.conf.cmd b/src/test/cfg2cmd/q35-ide.conf.cmd index 475e58d9..9af48002 100644 --- 
a/src/test/cfg2cmd/q35-ide.conf.cmd +++ b/src/test/cfg2cmd/q35-ide.conf.cmd @@ -41,5 +41,5 @@ -blockdev '{"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vz/images/100/vm-100-disk-2.qcow2","node-name":"ec11e0572184321efc5835152b95d5d","read-only":false},"node-name":"fc11e0572184321efc5835152b95d5d","read-only":false},"node-name":"drive-scsi0","read-only":false,"throttle-group":"throttle-drive-scsi0"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,device_id=drive-scsi0,bootindex=100,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=q35+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=q35+pve1' diff --git a/src/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd b/src/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd index b0c3e587..7413a651 100644 --- a/src/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd +++ b/src/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd @@ -35,5 +35,5 @@ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 
'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve1' diff --git a/src/test/cfg2cmd/q35-linux-hostpci-multifunction.conf.cmd b/src/test/cfg2cmd/q35-linux-hostpci-multifunction.conf.cmd index b4aa46f5..f8435778 100644 --- a/src/test/cfg2cmd/q35-linux-hostpci-multifunction.conf.cmd +++ b/src/test/cfg2cmd/q35-linux-hostpci-multifunction.conf.cmd @@ -35,5 +35,5 @@ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve1' diff --git a/src/test/cfg2cmd/q35-linux-hostpci-x-pci-overrides.conf.cmd b/src/test/cfg2cmd/q35-linux-hostpci-x-pci-overrides.conf.cmd index 6c4937c7..b314b8ad 100644 --- a/src/test/cfg2cmd/q35-linux-hostpci-x-pci-overrides.conf.cmd +++ b/src/test/cfg2cmd/q35-linux-hostpci-x-pci-overrides.conf.cmd @@ -34,5 +34,5 @@ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 
'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve1' diff --git a/src/test/cfg2cmd/q35-linux-hostpci.conf.cmd b/src/test/cfg2cmd/q35-linux-hostpci.conf.cmd index 19e6ba3c..b6914255 100644 --- a/src/test/cfg2cmd/q35-linux-hostpci.conf.cmd +++ b/src/test/cfg2cmd/q35-linux-hostpci.conf.cmd @@ -40,5 +40,5 @@ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve1' diff --git a/src/test/cfg2cmd/q35-simple.conf.cmd b/src/test/cfg2cmd/q35-simple.conf.cmd index e3f712c3..9cdb5bdb 100644 --- a/src/test/cfg2cmd/q35-simple.conf.cmd +++ b/src/test/cfg2cmd/q35-simple.conf.cmd @@ -28,5 +28,5 @@ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \ -netdev 
'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve0' + -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'pflash0=pflash0,pflash1=drive-efidisk0,type=q35+pve1' diff --git a/src/test/cfg2cmd/seabios_serial.conf.cmd b/src/test/cfg2cmd/seabios_serial.conf.cmd index 8fc0509b..ce2d7cf2 100644 --- a/src/test/cfg2cmd/seabios_serial.conf.cmd +++ b/src/test/cfg2cmd/seabios_serial.conf.cmd @@ -31,5 +31,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.qcow2","node-name":"ecd04be4259153b8293415fefa2a84c","read-only":false},"node-name":"fcd04be4259153b8293415fefa2a84c","read-only":false},"node-name":"drive-scsi0","read-only":false,"throttle-group":"throttle-drive-scsi0"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,device_id=drive-scsi0,bootindex=100,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'smm=off,type=pc+pve0' + -device 
'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'smm=off,type=pc+pve1' diff --git a/src/test/cfg2cmd/simple-btrfs.conf.cmd b/src/test/cfg2cmd/simple-btrfs.conf.cmd index f80421ad..b73aae79 100644 --- a/src/test/cfg2cmd/simple-btrfs.conf.cmd +++ b/src/test/cfg2cmd/simple-btrfs.conf.cmd @@ -40,5 +40,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/butter/bread/images/8006/vm-8006-disk-0/disk.raw","node-name":"e7487c01d831e2b51a5446980170ec9","read-only":false},"node-name":"f7487c01d831e2b51a5446980170ec9","read-only":false},"node-name":"drive-scsi3","read-only":false,"throttle-group":"throttle-drive-scsi3"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=3,drive=drive-scsi3,id=scsi3,device_id=drive-scsi3,write-cache=off' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/simple-disk-passthrough.conf.cmd b/src/test/cfg2cmd/simple-disk-passthrough.conf.cmd index 987a6c82..2b3a22e5 100644 --- a/src/test/cfg2cmd/simple-disk-passthrough.conf.cmd +++ b/src/test/cfg2cmd/simple-disk-passthrough.conf.cmd @@ -36,5 +36,5 @@ -blockdev 
'{"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/mnt/file.raw","node-name":"e234a4e3b89ac3adac9bdbf0c3dd6b4","read-only":false},"node-name":"f234a4e3b89ac3adac9bdbf0c3dd6b4","read-only":false},"node-name":"drive-scsi1","read-only":false,"throttle-group":"throttle-drive-scsi1"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1,device_id=drive-scsi1,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/simple-rbd.conf.cmd b/src/test/cfg2cmd/simple-rbd.conf.cmd index b848672c..29dfaacc 100644 --- a/src/test/cfg2cmd/simple-rbd.conf.cmd +++ b/src/test/cfg2cmd/simple-rbd.conf.cmd @@ -52,5 +52,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"raw","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"host_device","filename":"/dev/rbd-pve/fc4181a6-56eb-4f68-b452-8ba1f381ca2a/cpool/vm-8006-disk-0","node-name":"eb0b017124a47505c97a5da052e0141","read-only":false},"node-name":"fb0b017124a47505c97a5da052e0141","read-only":false},"node-name":"drive-scsi7","read-only":false,"throttle-group":"throttle-drive-scsi7"}' \ 
-device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=7,drive=drive-scsi7,id=scsi7,device_id=drive-scsi7,write-cache=off' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/simple-virtio-blk.conf.cmd b/src/test/cfg2cmd/simple-virtio-blk.conf.cmd index a9acb0cf..efec4a20 100644 --- a/src/test/cfg2cmd/simple-virtio-blk.conf.cmd +++ b/src/test/cfg2cmd/simple-virtio-blk.conf.cmd @@ -31,5 +31,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.qcow2","node-name":"edd19f6c1b3a6d5a6248c3376a91a16","read-only":false},"node-name":"fdd19f6c1b3a6d5a6248c3376a91a16","read-only":false},"node-name":"drive-virtio0","read-only":false,"throttle-group":"throttle-drive-virtio0"}' \ -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,iothread=iothread-virtio0,bootindex=100,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 
'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/simple-zfs-over-iscsi.conf.cmd b/src/test/cfg2cmd/simple-zfs-over-iscsi.conf.cmd index 4fa6a5a9..21bfd638 100644 --- a/src/test/cfg2cmd/simple-zfs-over-iscsi.conf.cmd +++ b/src/test/cfg2cmd/simple-zfs-over-iscsi.conf.cmd @@ -40,5 +40,5 @@ -blockdev '{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"raw","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"iscsi","lun":0,"node-name":"e915a332310039f7a3feed6901eb5da","portal":"127.0.0.1","read-only":false,"target":"iqn.2019-10.org.test:foobar","transport":"tcp"},"node-name":"f915a332310039f7a3feed6901eb5da","read-only":false},"node-name":"drive-scsi3","read-only":false,"throttle-group":"throttle-drive-scsi3"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=3,drive=drive-scsi3,id=scsi3,device_id=drive-scsi3,write-cache=off' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' diff --git a/src/test/cfg2cmd/simple1.conf.cmd b/src/test/cfg2cmd/simple1.conf.cmd index 49b848f2..eef2868b 100644 --- a/src/test/cfg2cmd/simple1.conf.cmd +++ b/src/test/cfg2cmd/simple1.conf.cmd @@ -31,5 +31,5 @@ -blockdev 
'{"detect-zeroes":"unmap","discard":"unmap","driver":"throttle","file":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"unmap","discard":"unmap","driver":"file","filename":"/var/lib/vz/images/8006/vm-8006-disk-0.qcow2","node-name":"ecd04be4259153b8293415fefa2a84c","read-only":false},"node-name":"fcd04be4259153b8293415fefa2a84c","read-only":false},"node-name":"drive-scsi0","read-only":false,"throttle-group":"throttle-drive-scsi0"}' \ -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,device_id=drive-scsi0,bootindex=100,write-cache=on' \ -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/usr/libexec/qemu-server/pve-bridge,downscript=/usr/libexec/qemu-server/pve-bridgedown,vhost=on' \ - -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300' \ - -machine 'type=pc+pve0' + -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=300,host_mtu=1500' \ + -machine 'type=pc+pve1' -- 2.47.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 2/8] api: vm start: introduce nets-host-mtu parameter for migration compat 2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 1/8] virtio-net: fix migration between default/non-default MTUs starting with machine version 10.0+pve1 Fiona Ebner @ 2025-09-04 12:40 ` Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 3/8] migration: preserve host_mtu for virtio-net devices Fiona Ebner ` (7 subsequent siblings) 9 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw) To: pve-devel The virtual hardware is generated differently (at least for i440fx machines) when host_mtu is set or not set on the netdev command line [0]. When the MTU is the same value as the default 1500, Proxmox VE did not add a host_mtu parameter. This is problematic for migration where host_mtu is present on one end of the migration, but not on the other [1]. Moreover, the effective setting in the guest (state) will still be the host_mtu from the source side, even if a different value is used for host_mtu on the target instance's commandline. This will not lead to an error loading the migration stream in QEMU, but having a larger host_mtu than the bridge MTU is still problematic for certain network traffic like > iperf3 -c 10.10.10.11 -u -l 2k when host_mtu=9000 and bridge MTU=1500. Starting a VM cold with such a configuration is already prohibited, so also prevent it for migration. Add the necessary parameter for VM start to allow preserving the values going forward. 
[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346 [1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379 Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- Changes in v3: * explicitly mention that a value of 0 means to not use host_mtu * die when host_mtu > bridge MTU also upon migration and expand error message * less bloaty code by always mentioning migrated host_mtu value src/PVE/API2/Qemu.pm | 12 ++++++++++++ src/PVE/QemuServer.pm | 45 +++++++++++++++++++++++++++++++++++++------ 2 files changed, 51 insertions(+), 6 deletions(-) diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm index b571e6c1..4770fbf8 100644 --- a/src/PVE/API2/Qemu.pm +++ b/src/PVE/API2/Qemu.pm @@ -3383,6 +3383,16 @@ __PACKAGE__->register_method({ default => 0, description => 'Whether to migrate conntrack entries for running VMs.', }, + 'nets-host-mtu' => { + type => 'string', + pattern => 'net\d+=\d+(,net\d+=\d+)*', + optional => 1, + description => + 'Used for migration compat. List of VirtIO network devices and their effective' + . ' host_mtu setting according to the QEMU object model on the source side of' + . ' the migration. A value of 0 means that the host_mtu parameter is to be' + . 
' avoided for the corresponding device.', + }, }, }, returns => { @@ -3414,6 +3424,7 @@ __PACKAGE__->register_method({ my $targetstorage = $get_root_param->('targetstorage'); my $force_cpu = $get_root_param->('force-cpu'); my $with_conntrack_state = $get_root_param->('with-conntrack-state'); + my $nets_host_mtu = $get_root_param->('nets-host-mtu'); my $storagemap; @@ -3501,6 +3512,7 @@ __PACKAGE__->register_method({ forcemachine => $machine, timeout => $timeout, forcecpu => $force_cpu, + 'nets-host-mtu' => $nets_host_mtu, }; PVE::QemuServer::vm_start($storecfg, $vmid, $params, $migrate_opts); diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm index 5b7087dc..20e26cd5 100644 --- a/src/PVE/QemuServer.pm +++ b/src/PVE/QemuServer.pm @@ -1457,7 +1457,17 @@ sub print_pbs_blockdev { } sub print_netdevice_full { - my ($vmid, $conf, $net, $netid, $bridges, $use_old_bios_files, $arch, $machine_version) = @_; + my ( + $vmid, + $conf, + $net, + $netid, + $bridges, + $use_old_bios_files, + $arch, + $machine_version, + $host_mtu_migration, # force this value for host_mtu, 0 means force absence of param + ) = @_; my $device = $net->{model}; if ($net->{model} eq 'virtio') { @@ -1484,18 +1494,29 @@ sub print_netdevice_full { my $mtu = $net->{mtu}; - if ($net->{model} eq 'virtio' && $net->{bridge}) { + my $migration_skip_host_mtu = defined($host_mtu_migration) && $host_mtu_migration == 0; + print "netdev $netid: not adding 'host_mtu' parameter for migration compat\n" + if $migration_skip_host_mtu; + + if ($net->{model} eq 'virtio' && $net->{bridge} && !$migration_skip_host_mtu) { my $bridge_mtu = PVE::Network::read_bridge_mtu($net->{bridge}); + if ($host_mtu_migration) { + print "netdev $netid: using 'host_mtu=$host_mtu_migration' for migration compat\n"; + $mtu = $host_mtu_migration; + } + if (!defined($mtu) || $mtu == 1) { $mtu = $bridge_mtu; } elsif ($mtu < 576) { die "netdev $netid: MTU '$mtu' is smaller than the IP minimum MTU '576'\n"; } elsif ($mtu > $bridge_mtu) { - 
die "netdev $netid: MTU '$mtu' is bigger than the bridge MTU '$bridge_mtu'\n"; + die "netdev $netid: MTU '$mtu' is bigger than the bridge MTU '$bridge_mtu'" + . " - adjust the MTU for the network device in the VM configuration, while ensuring" + . " that the bridge is configured as desired.\n"; } - if (min_version($machine_version, 10, 0, 1)) { + if (min_version($machine_version, 10, 0, 1) || $host_mtu_migration) { # Always add host_mtu for migration compatibility, because the presence of host_mtu # means that the virtual hardware is generated differently (at least for i440fx) $tmpstr .= ",host_mtu=$mtu"; @@ -1503,8 +1524,14 @@ sub print_netdevice_full { $tmpstr .= ",host_mtu=$mtu" if $mtu != 1500; } } elsif (defined($mtu)) { - warn - "WARN: netdev $netid: ignoring MTU '$mtu', not using VirtIO or no bridge configured.\n"; + my $msg_prefix = "netdev $netid: ignoring MTU '$mtu'"; + if ($migration_skip_host_mtu) { + log_warn("$msg_prefix, not used on the source side according to migration parameters"); + } elsif (!$net->{bridge}) { + log_warn("$msg_prefix, no bridge configured"); + } else { + log_warn("$msg_prefix, not using VirtIO"); + } } if ($use_old_bios_files) { @@ -3810,6 +3837,8 @@ sub config_to_command { }, ); + my $nets_host_mtu = + { map { split('=', $_) } PVE::Tools::split_list($options->{'nets-host-mtu'}) }; for (my $i = 0; $i < $MAX_NETS; $i++) { my $netname = "net$i"; @@ -3836,6 +3865,7 @@ sub config_to_command { $use_old_bios_files, $arch, $machine_version, + $nets_host_mtu->{$netname}, ); push @$devices, '-device', $netdevicefull; @@ -5566,6 +5596,8 @@ sub vm_start { # }, # virtio2 => ... # } +# nets-host-mtu => Used for migration compat. List of VirtIO network devices and their effective +# host_mtu setting according to the QEMU object model on the source side of the migration. 
# migrate_opts: # nbd => volumes for NBD exports (vm_migrate_alloc_nbd_disks) # migratedfrom => source node @@ -5678,6 +5710,7 @@ sub vm_start_nolock { 'force-machine' => $forcemachine, 'force-cpu' => $forcecpu, 'live-restore-backing' => $params->{'live-restore-backing'}, + 'nets-host-mtu' => $params->{'nets-host-mtu'}, }, ); -- 2.47.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 15+ messages in thread
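The handling of the nets-host-mtu value described in this patch can be sketched in isolation. The following is a minimal standalone sketch, not the code from the patch: parse_nets_host_mtu() and host_mtu_action() are hypothetical helper names, and a plain split on commas stands in for PVE::Tools::split_list(). It only illustrates the three cases the patch distinguishes: no override, a value of 0 (force absence of host_mtu), and a positive value (force that host_mtu).

```perl
use strict;
use warnings;

# Hypothetical helper: turn "net0=1500,net1=0" into { net0 => 1500, net1 => 0 }.
# A plain split on ',' stands in for PVE::Tools::split_list() here.
sub parse_nets_host_mtu {
    my ($list) = @_;
    return {} if !defined($list) || $list eq '';
    return { map { split(/=/, $_, 2) } split(/,/, $list) };
}

# Hypothetical helper: describe what the forced value means for one device.
#   undef -> no migration override, the regular MTU logic applies
#   0     -> force absence of the host_mtu parameter
#   N > 0 -> force host_mtu=N
sub host_mtu_action {
    my ($forced) = @_;
    return 'default' if !defined($forced);
    return 'skip' if $forced == 0;
    return "host_mtu=$forced";
}

my $forced = parse_nets_host_mtu('net0=1500,net1=0');
for my $netid (qw(net0 net1 net2)) {
    printf "%s: %s\n", $netid, host_mtu_action($forced->{$netid});
}
```

A device that is missing from the list (net2 above) is treated like a cold start, which matches passing an undefined $host_mtu_migration to print_netdevice_full().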
* [pve-devel] [PATCH qemu-server v3 3/8] migration: preserve host_mtu for virtio-net devices 2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 1/8] virtio-net: fix migration between default/non-default MTUs starting with machine version 10.0+pve1 Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 2/8] api: vm start: introduce nets-host-mtu parameter for migration compat Fiona Ebner @ 2025-09-04 12:40 ` Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 4/8] snapshot: save vmstate: avoid using deprecated check_running() function Fiona Ebner ` (6 subsequent siblings) 9 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw) To: pve-devel The virtual hardware is generated differently (at least for i440fx machines) when host_mtu is set or not set on the netdev command line [0]. When the MTU is the same value as the default 1500, Proxmox VE did not add a host_mtu parameter. This is problematic for migration where host_mtu is present on one end of the migration, but not on the other [1]. Moreover, the effective setting in the guest (state) will still be the host_mtu from the source side, even if a different value is used for host_mtu on the target instance's commandline. This will not lead to an error loading the migration stream in QEMU, but having a larger host_mtu than the bridge MTU is still problematic for certain network traffic like > iperf3 -c 10.10.10.11 -u -l 2k when host_mtu=9000 and bridge MTU=1500. Pass the values from the source to the target during migration to be able to preserve them. 
[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346 [1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379 Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- Changes in v3: * move get_nets_host_mtu to network module * avoid overly long line in tests src/PVE/QemuMigrate.pm | 8 +++++++ src/PVE/QemuServer/Network.pm | 29 +++++++++++++++++++++++ src/test/MigrationTest/QemuMigrateMock.pm | 15 ++++++++++++ 3 files changed, 52 insertions(+) diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm index e18cc2aa..4381b542 100644 --- a/src/PVE/QemuMigrate.pm +++ b/src/PVE/QemuMigrate.pm @@ -998,6 +998,10 @@ sub phase2_start_local_cluster { push @$cmd, '--force-cpu', $start->{forcecpu}; } + if ($start->{'nets-host-mtu'}) { + push @$cmd, '--nets-host-mtu', $start->{'nets-host-mtu'}; + } + if ($self->{storage_migration}) { push @$cmd, '--targetstorage', ($self->{opts}->{targetstorage} // '1'); } @@ -1187,6 +1191,10 @@ sub phase2 { }, }; + if (my $nets_host_mtu = PVE::QemuServer::Network::get_nets_host_mtu($vmid, $conf)) { + $params->{start_params}->{'nets-host-mtu'} = $nets_host_mtu; + } + my ($tunnel_info, $spice_port); my @online_local_volumes = $self->filter_local_volumes('online'); diff --git a/src/PVE/QemuServer/Network.pm b/src/PVE/QemuServer/Network.pm index 56df83fb..eb8222e8 100644 --- a/src/PVE/QemuServer/Network.pm +++ b/src/PVE/QemuServer/Network.pm @@ -11,6 +11,8 @@ use PVE::Network::SDN::Zones; use PVE::RESTEnvironment qw(log_warn); use PVE::Tools qw($IPV6RE file_read_firstline); +use PVE::QemuServer::Monitor qw(mon_cmd); + my $nic_model_list = [ 'e1000', 'e1000-82540em', @@ -330,4 +332,31 @@ sub tap_plug { PVE::Network::SDN::Zones::tap_plug($iface, $bridge, $tag, $firewall, $trunks, $rate); } +sub get_nets_host_mtu { + my ($vmid, $conf) = @_; + + my $nets_host_mtu = []; + for my $opt (sort keys $conf->%*) { + next if $opt !~ m/^net(\d+)$/; + my $net = parse_net($conf->{$opt}); + next if $net->{model} ne 'virtio'; + + my 
$host_mtu = eval { + mon_cmd( + $vmid, 'qom-get', + path => "/machine/peripheral/$opt", + property => 'host_mtu', + ); + }; + if (my $err = $@) { + log_warn("$opt: could not query host_mtu - $err"); + } elsif (defined($host_mtu)) { + push $nets_host_mtu->@*, "${opt}=${host_mtu}"; + } else { + log_warn("$opt: got undefined value when querying host_mtu"); + } + } + return join(',', $nets_host_mtu->@*); +} + 1; diff --git a/src/test/MigrationTest/QemuMigrateMock.pm b/src/test/MigrationTest/QemuMigrateMock.pm index b04cf78b..421f0bb7 100644 --- a/src/test/MigrationTest/QemuMigrateMock.pm +++ b/src/test/MigrationTest/QemuMigrateMock.pm @@ -225,6 +225,21 @@ $qemu_server_machine_module->mock( my $qemu_server_network_module = Test::MockModule->new("PVE::QemuServer::Network"); $qemu_server_network_module->mock( del_nets_bridge_fdb => sub { return; }, + mon_cmd => sub { + my ($vmid, $command, %params) = @_; + + if ($command eq 'qom-get') { + if ( + $params{path} =~ m|^/machine/peripheral/net\d+$| + && $params{property} eq 'host_mtu' + ) { + return 1500; + } + die "mon_cmd (mocked) - implement me: $command for path '$params{path}' property" + . " '$params{property}'"; + } + die "mon_cmd (mocked) - implement me: $command"; + }, ); my $qemu_server_qmphelpers_module = Test::MockModule->new("PVE::QemuServer::QMPHelpers"); -- 2.47.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 15+ messages in thread
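How the source side assembles the list can be tried out without a running VM. This is a simplified sketch under stated assumptions: build_nets_host_mtu() is a hypothetical stand-in for get_nets_host_mtu(), the $query code reference replaces the QMP 'qom-get' call, and the configuration maps each netN key directly to its model instead of going through parse_net().

```perl
use strict;
use warnings;

# Hypothetical stand-in for get_nets_host_mtu(): $query replaces the QMP
# 'qom-get' call and $conf maps netN keys directly to the device model.
sub build_nets_host_mtu {
    my ($conf, $query) = @_;

    my @entries;
    for my $opt (sort keys %$conf) {
        next if $opt !~ m/^net\d+$/;
        next if $conf->{$opt} ne 'virtio'; # only VirtIO devices are listed

        my $host_mtu = eval { $query->($opt) };
        if (my $err = $@) {
            warn "$opt: could not query host_mtu - $err";
        } elsif (defined($host_mtu)) {
            push @entries, "$opt=$host_mtu";
        } else {
            warn "$opt: got undefined value when querying host_mtu\n";
        }
    }
    return join(',', @entries);
}

my $conf = { net0 => 'virtio', net1 => 'e1000', net2 => 'virtio', memory => 2048 };
my $values = { net0 => 1500, net2 => 0 };
print build_nets_host_mtu($conf, sub { $values->{ $_[0] } }), "\n";
```

A device whose query fails is left out of the list, so the target falls back to its regular MTU logic for that device instead of aborting the migration.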
* [pve-devel] [PATCH qemu-server v3 4/8] snapshot: save vmstate: avoid using deprecated check_running() function
From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw)
To: pve-devel

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
New in v3.

 src/PVE/QemuConfig.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/PVE/QemuConfig.pm b/src/PVE/QemuConfig.pm
index e0853d65..e3ba240e 100644
--- a/src/PVE/QemuConfig.pm
+++ b/src/PVE/QemuConfig.pm
@@ -244,7 +244,7 @@ sub __snapshot_save_vmstate {
 
     # get current QEMU -cpu argument to ensure consistency of custom CPU models
     my $runningcpu;
-    if (my $pid = PVE::QemuServer::check_running($vmid)) {
+    if (my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
         $runningcpu = PVE::QemuServer::CPUConfig::get_cpu_from_running_vm($pid);
     }
 
-- 
2.47.2
* [pve-devel] [PATCH qemu-server v3 5/8] snapshot: save vmstate: die when PID cannot be obtained
From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw)
To: pve-devel

The call get_current_qemu_machine() already depends on the virtual
machine running, so not being able to obtain the PID is very
unexpected. Quietly not including the running CPU in the snapshot can
lead to not being able to restore the snapshot later, so die early
instead.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
New in v3.

 src/PVE/QemuConfig.pm | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/PVE/QemuConfig.pm b/src/PVE/QemuConfig.pm
index e3ba240e..33cac3be 100644
--- a/src/PVE/QemuConfig.pm
+++ b/src/PVE/QemuConfig.pm
@@ -243,10 +243,9 @@ sub __snapshot_save_vmstate {
     my $runningmachine = PVE::QemuServer::Machine::get_current_qemu_machine($vmid);
 
     # get current QEMU -cpu argument to ensure consistency of custom CPU models
-    my $runningcpu;
-    if (my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
-        $runningcpu = PVE::QemuServer::CPUConfig::get_cpu_from_running_vm($pid);
-    }
+    my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid)
+        or die "cannot obtain PID for VM $vmid!\n";
+    my $runningcpu = PVE::QemuServer::CPUConfig::get_cpu_from_running_vm($pid);
 
     if (!$suspend) {
         $conf = $conf->{snapshots}->{$snapname};
-- 
2.47.2
* [pve-devel] [PATCH qemu-server v3 6/8] snapshot: introduce running-nets-host-mtu property 2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner ` (4 preceding siblings ...) 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 5/8] snapshot: save vmstate: die when PID cannot be obtained Fiona Ebner @ 2025-09-04 12:40 ` Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 stable-bookworm 7/8] api: vm start: introduce nets-host-mtu parameter for migration compat Fiona Ebner ` (3 subsequent siblings) 9 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw) To: pve-devel For VirtIO network devices, it is necessary to preserve the values and presence of the host_mtu setting when restoring a snapshot. See commit "migration: preserve host_mtu for virtio-net devices" for details. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- New in v3. src/PVE/API2/Qemu.pm | 1 + src/PVE/QemuConfig.pm | 6 ++++++ src/PVE/QemuServer.pm | 9 +++++++++ src/PVE/QemuServer/RunState.pm | 3 ++- 4 files changed, 18 insertions(+), 1 deletion(-) diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm index 4770fbf8..ec9f31d8 100644 --- a/src/PVE/API2/Qemu.pm +++ b/src/PVE/API2/Qemu.pm @@ -2113,6 +2113,7 @@ my $update_vm_api = sub { } push @delete, 'runningmachine' if $conf->{runningmachine}; push @delete, 'runningcpu' if $conf->{runningcpu}; + push @delete, 'running-nets-host-mtu' if $conf->{'running-nets-host-mtu'}; } PVE::QemuConfig->check_lock($conf) if !$skiplock; diff --git a/src/PVE/QemuConfig.pm b/src/PVE/QemuConfig.pm index 33cac3be..d0844c4c 100644 --- a/src/PVE/QemuConfig.pm +++ b/src/PVE/QemuConfig.pm @@ -247,6 +247,8 @@ sub __snapshot_save_vmstate { or die "cannot obtain PID for VM $vmid!\n"; my $runningcpu = PVE::QemuServer::CPUConfig::get_cpu_from_running_vm($pid); + my $nets_host_mtu = PVE::QemuServer::Network::get_nets_host_mtu($vmid, 
$conf); + if (!$suspend) { $conf = $conf->{snapshots}->{$snapname}; } @@ -254,6 +256,7 @@ sub __snapshot_save_vmstate { $conf->{vmstate} = $statefile; $conf->{runningmachine} = $runningmachine; $conf->{runningcpu} = $runningcpu; + $conf->{'running-nets-host-mtu'} = $nets_host_mtu; return $statefile; } @@ -473,6 +476,8 @@ sub __snapshot_rollback_hook { # re-initializing its random number generator $conf->{vmgenid} = PVE::QemuServer::generate_uuid(); } + + $data->{'nets-host-mtu'} = delete($conf->{'running-nets-host-mtu'}); } return; @@ -513,6 +518,7 @@ sub __snapshot_rollback_vm_start { statefile => $vmstate, forcemachine => $data->{forcemachine}, forcecpu => $data->{forcecpu}, + 'nets-host-mtu' => $data->{'nets-host-mtu'}, }; PVE::QemuServer::vm_start($storecfg, $vmid, $params); } diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm index 20e26cd5..e312acb6 100644 --- a/src/PVE/QemuServer.pm +++ b/src/PVE/QemuServer.pm @@ -635,6 +635,15 @@ EODESCR pattern => $PVE::QemuServer::CPUConfig::qemu_cmdline_cpu_re, format_description => 'QEMU -cpu parameter', }, + 'running-nets-host-mtu' => { + type => 'string', + pattern => 'net\d+=\d+(,net\d+=\d+)*', + optional => 1, + description => + 'List of VirtIO network devices and their effective host_mtu setting. A value of 0' + . ' means that the host_mtu parameter is to be avoided for the corresponding device.' + . ' This is used internally for snapshots.', + }, machine => get_standard_option('pve-qemu-machine'), arch => { description => "Virtual processor architecture. 
Defaults to the host.", diff --git a/src/PVE/QemuServer/RunState.pm b/src/PVE/QemuServer/RunState.pm index 05e7fb47..6a5fdbd7 100644 --- a/src/PVE/QemuServer/RunState.pm +++ b/src/PVE/QemuServer/RunState.pm @@ -104,7 +104,8 @@ sub vm_suspend { warn $@ if $@; PVE::Storage::deactivate_volumes($storecfg, [$vmstate]); PVE::Storage::vdisk_free($storecfg, $vmstate); - delete $conf->@{qw(vmstate runningmachine runningcpu)}; + delete $conf->@{ + qw(vmstate runningmachine runningcpu running-nets-host-mtu)}; PVE::QemuConfig->write_config($vmid, $conf); }; warn $@ if $@; -- 2.47.2 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 15+ messages in thread
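The round trip of the recorded value through a snapshot can be illustrated with plain hashes. This is a simplified sketch, not the patch's code: save_vmstate_info() and rollback_hook() are hypothetical stand-ins for the relevant lines in __snapshot_save_vmstate() and __snapshot_rollback_hook(), and the configuration is reduced to a single hash.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the line added to __snapshot_save_vmstate():
# record the effective host_mtu values next to the saved VM state.
sub save_vmstate_info {
    my ($snap_conf, $nets_host_mtu) = @_;
    $snap_conf->{'running-nets-host-mtu'} = $nets_host_mtu;
    return;
}

# Hypothetical stand-in for the line added to __snapshot_rollback_hook():
# delete() returns the removed value, so the property leaves the config and
# becomes a start parameter in one step.
sub rollback_hook {
    my ($conf, $data) = @_;
    $data->{'nets-host-mtu'} = delete($conf->{'running-nets-host-mtu'});
    return;
}

my $snapshot_conf = {};
save_vmstate_info($snapshot_conf, 'net0=1500,net1=0');

my $start_data = {};
rollback_hook($snapshot_conf, $start_data);
print "start parameter: $start_data->{'nets-host-mtu'}\n";

# A snapshot taken before this series has no such property. The start
# parameter then stays undefined and the previous behavior applies.
my $old_data = {};
rollback_hook({}, $old_data);
print "old snapshot: ", defined($old_data->{'nets-host-mtu'}) ? 'set' : 'unset', "\n";
```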
* [pve-devel] [PATCH qemu-server v3 stable-bookworm 7/8] api: vm start: introduce nets-host-mtu parameter for migration compat 2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner ` (5 preceding siblings ...) 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 6/8] snapshot: introduce running-nets-host-mtu property Fiona Ebner @ 2025-09-04 12:40 ` Fiona Ebner 2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices Fiona Ebner ` (2 subsequent siblings) 9 siblings, 0 replies; 15+ messages in thread From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw) To: pve-devel The virtual hardware is generated differently (at least for i440fx machines) when host_mtu is set or not set on the netdev command line [0]. When the MTU is the same value as the default 1500, Proxmox VE did not add a host_mtu parameter. This is problematic for migration where host_mtu is present on one end of the migration, but not on the other [1]. Moreover, the effective setting in the guest (state) will still be the host_mtu from the source side, even if a different value is used for host_mtu on the target instance's commandline. This will not lead to an error loading the migration stream in QEMU, but having a larger host_mtu than the bridge MTU is still problematic for certain network traffic like > iperf3 -c 10.10.10.11 -u -l 2k when host_mtu=9000 and bridge MTU=1500. Starting a VM cold with such a configuration is already prohibited, so also prevent it for migration. Add the necessary parameter for VM start to allow preserving the values going forward. [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346 [1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379 Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> --- New in v3. 
src/PVE/API2/Qemu.pm | 12 ++++++++++ src/PVE/QemuServer.pm | 54 ++++++++++++++++++++++++++++++++++++++----- 2 files changed, 60 insertions(+), 6 deletions(-) diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm index ce6f362d..9d026bc5 100644 --- a/src/PVE/API2/Qemu.pm +++ b/src/PVE/API2/Qemu.pm @@ -3364,6 +3364,16 @@ __PACKAGE__->register_method({ default => 'max(30, vm memory in GiB)', optional => 1, }, + 'nets-host-mtu' => { + type => 'string', + pattern => 'net\d+=\d+(,net\d+=\d+)*', + optional => 1, + description => + 'Used for migration compat. List of VirtIO network devices and their effective' + . ' host_mtu setting according to the QEMU object model on the source side of' + . ' the migration. A value of 0 means that the host_mtu parameter is to be' + . ' avoided for the corresponding device.', + }, }, }, returns => { @@ -3394,6 +3404,7 @@ __PACKAGE__->register_method({ my $migration_network = $get_root_param->('migration_network'); my $targetstorage = $get_root_param->('targetstorage'); my $force_cpu = $get_root_param->('force-cpu'); + my $nets_host_mtu = $get_root_param->('nets-host-mtu'); my $storagemap; @@ -3480,6 +3491,7 @@ __PACKAGE__->register_method({ forcemachine => $machine, timeout => $timeout, forcecpu => $force_cpu, + 'nets-host-mtu' => $nets_host_mtu, }; PVE::QemuServer::vm_start($storecfg, $vmid, $params, $migrate_opts); diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm index 0f46b396..47c96726 100644 --- a/src/PVE/QemuServer.pm +++ b/src/PVE/QemuServer.pm @@ -1647,7 +1647,17 @@ sub print_pbs_blockdev { } sub print_netdevice_full { - my ($vmid, $conf, $net, $netid, $bridges, $use_old_bios_files, $arch, $machine_version) = @_; + my ( + $vmid, + $conf, + $net, + $netid, + $bridges, + $use_old_bios_files, + $arch, + $machine_version, + $host_mtu_migration, # force this value for host_mtu, 0 means force absence of param + ) = @_; my $device = $net->{model}; if ($net->{model} eq 'virtio') { @@ -1673,19 +1683,37 @@ sub 
print_netdevice_full { $tmpstr .= ",bootindex=$net->{bootindex}" if $net->{bootindex}; if (my $mtu = $net->{mtu}) { - if ($net->{model} eq 'virtio' && $net->{bridge}) { + my $migration_skip_host_mtu = defined($host_mtu_migration) && $host_mtu_migration == 0; + print "netdev $netid: not adding 'host_mtu' parameter for migration compat\n" + if $migration_skip_host_mtu; + + if ($net->{model} eq 'virtio' && $net->{bridge} && !$migration_skip_host_mtu) { my $bridge_mtu = PVE::Network::read_bridge_mtu($net->{bridge}); + + if ($host_mtu_migration) { + print "netdev $netid: using 'host_mtu=$host_mtu_migration' for migration compat\n"; + $mtu = $host_mtu_migration; + } + if ($mtu == 1) { $mtu = $bridge_mtu; } elsif ($mtu < 576) { die "netdev $netid: MTU '$mtu' is smaller than the IP minimum MTU '576'\n"; } elsif ($mtu > $bridge_mtu) { - die "netdev $netid: MTU '$mtu' is bigger than the bridge MTU '$bridge_mtu'\n"; + die "netdev $netid: MTU '$mtu' is bigger than the bridge MTU '$bridge_mtu'" + . " - adjust the MTU for the network device in the VM configuration, while ensuring" + . 
" that the bridge is configured as desired.\n"; } $tmpstr .= ",host_mtu=$mtu"; } else { - warn - "WARN: netdev $netid: ignoring MTU '$mtu', not using VirtIO or no bridge configured.\n"; + my $msg_prefix = "netdev $netid: ignoring MTU '$mtu'"; + if ($migration_skip_host_mtu) { + log_warn("$msg_prefix, not used on the source side according to migration parameters"); + } elsif (!$net->{bridge}) { + log_warn("$msg_prefix, no bridge configured"); + } else { + log_warn("$msg_prefix, not using VirtIO"); + } } } @@ -3557,7 +3585,16 @@ my sub get_vga_properties { } sub config_to_command { - my ($storecfg, $vmid, $conf, $defaults, $forcemachine, $forcecpu, $live_restore_backing) = @_; + my ( + $storecfg, + $vmid, + $conf, + $defaults, + $forcemachine, + $forcecpu, + $live_restore_backing, + $nets_host_mtu, + ) = @_; # minimize config for templates, they can only start for backup, # so most options besides the disks are irrelevant @@ -4127,6 +4164,7 @@ sub config_to_command { }, ); + my $nets_host_mtu_hash = { map { split('=', $_) } PVE::Tools::split_list($nets_host_mtu) }; for (my $i = 0; $i < $MAX_NETS; $i++) { my $netname = "net$i"; @@ -4151,6 +4189,7 @@ sub config_to_command { $use_old_bios_files, $arch, $machine_version, + $nets_host_mtu_hash->{$netname}, ); push @$devices, '-device', $netdevicefull; @@ -5929,6 +5968,8 @@ sub vm_start { # }, # virtio2 => ... # } +# nets-host-mtu => Used for migration compat. List of VirtIO network devices and their effective +# host_mtu setting according to the QEMU object model on the source side of the migration. 
# migrate_opts: # nbd => volumes for NBD exports (vm_migrate_alloc_nbd_disks) # migratedfrom => source node @@ -6008,6 +6049,7 @@ sub vm_start_nolock { $forcemachine, $forcecpu, $params->{'live-restore-backing'}, + $params->{'nets-host-mtu'}, ); my $migration_ip; -- 2.39.5 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices
2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner
` (6 preceding siblings ...)
2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 stable-bookworm 7/8] api: vm start: introduce nets-host-mtu parameter for migration compat Fiona Ebner
@ 2025-09-04 12:40 ` Fiona Ebner
2025-09-04 18:11 ` Thomas Lamprecht
2025-09-04 13:06 ` [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fabian Grünbichler
2025-09-04 18:16 ` [pve-devel] applied: " Thomas Lamprecht
9 siblings, 1 reply; 15+ messages in thread
From: Fiona Ebner @ 2025-09-04 12:40 UTC (permalink / raw)
To: pve-devel

The virtual hardware is generated differently (at least for i440fx
machines) when host_mtu is set or not set on the netdev command line
[0]. When the MTU is the same value as the default 1500, Proxmox VE
did not add a host_mtu parameter. This is problematic for migration
where host_mtu is present on one end of the migration, but not on the
other [1]. Moreover, the effective setting in the guest (state) will
still be the host_mtu from the source side, even if a different value
is used for host_mtu on the target instance's commandline. This will
not lead to an error loading the migration stream in QEMU, but having
a larger host_mtu than the bridge MTU is still problematic for certain
network traffic like
> iperf3 -c 10.10.10.11 -u -l 2k
when host_mtu=9000 and bridge MTU=1500.

Pass the values from the source to the target during migration to be
able to preserve them.

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
[1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* Adapt to moved/changed get_nets_host_mtu() helper. The Network
  module does not exist yet in PVE 8, and parse_net() lives in
  QemuServer itself, so the helper is also moved there.

 src/PVE/QemuMigrate.pm                    |  8 +++++++
 src/PVE/QemuServer.pm                     | 27 +++++++++++++++++++++++
 src/test/MigrationTest/QemuMigrateMock.pm | 15 +++++++++++++
 3 files changed, 50 insertions(+)

diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
index 28d7ac56..3cd7069a 100644
--- a/src/PVE/QemuMigrate.pm
+++ b/src/PVE/QemuMigrate.pm
@@ -959,6 +959,10 @@ sub phase2_start_local_cluster {
         push @$cmd, '--force-cpu', $start->{forcecpu};
     }
 
+    if ($start->{'nets-host-mtu'}) {
+        push @$cmd, '--nets-host-mtu', $start->{'nets-host-mtu'};
+    }
+
     if ($self->{storage_migration}) {
         push @$cmd, '--targetstorage', ($self->{opts}->{targetstorage} // '1');
     }
@@ -1144,6 +1148,10 @@ sub phase2 {
         },
     };
 
+    if (my $nets_host_mtu = PVE::QemuServer::get_nets_host_mtu($vmid, $conf)) {
+        $params->{start_params}->{'nets-host-mtu'} = $nets_host_mtu;
+    }
+
     my ($tunnel_info, $spice_port);
 
     my @online_local_volumes = $self->filter_local_volumes('online');
diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 47c96726..5b08b3e3 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -9533,4 +9533,31 @@ sub delete_ifaces_ipams_ips {
     }
 }
 
+sub get_nets_host_mtu {
+    my ($vmid, $conf) = @_;
+
+    my $nets_host_mtu = [];
+    for my $opt (sort keys $conf->%*) {
+        next if $opt !~ m/^net(\d+)$/;
+        my $net = parse_net($conf->{$opt});
+        next if $net->{model} ne 'virtio';
+
+        my $host_mtu = eval {
+            mon_cmd(
+                $vmid, 'qom-get',
+                path => "/machine/peripheral/$opt",
+                property => 'host_mtu',
+            );
+        };
+        if (my $err = $@) {
+            log_warn("$opt: could not query host_mtu - $err");
+        } elsif (defined($host_mtu)) {
+            push $nets_host_mtu->@*, "${opt}=${host_mtu}";
+        } else {
+            log_warn("$opt: got undefined value when querying host_mtu");
+        }
+    }
+    return join(',', $nets_host_mtu->@*);
+}
+
 1;
diff --git a/src/test/MigrationTest/QemuMigrateMock.pm b/src/test/MigrationTest/QemuMigrateMock.pm
index f678f9ec..05b2c5c1 100644
--- a/src/test/MigrationTest/QemuMigrateMock.pm
+++ b/src/test/MigrationTest/QemuMigrateMock.pm
@@ -175,6 +175,21 @@ $MigrationTest::Shared::qemu_server_module->mock(
         delete $expected_calls->{'vm_stop'};
     },
     del_nets_bridge_fdb => sub { return; },
+    mon_cmd => sub {
+        my ($vmid, $command, %params) = @_;
+
+        if ($command eq 'qom-get') {
+            if (
+                $params{path} =~ m|^/machine/peripheral/net\d+$|
+                && $params{property} eq 'host_mtu'
+            ) {
+                return 1500;
+            }
+            die "mon_cmd (mocked) - implement me: $command for path '$params{path}' property"
+                . " '$params{property}'";
+        }
+        die "mon_cmd (mocked) - implement me: $command";
+    },
 );
 
 my $qemu_server_cpuconfig_module = Test::MockModule->new("PVE::QemuServer::CPUConfig");
-- 
2.39.5

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread
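For illustration only: get_nets_host_mtu() in the patch above returns a comma-separated string of "netN=MTU" pairs. The following is a minimal sketch of how a receiving side might split such a string back into a hash. The name parse_nets_host_mtu is hypothetical and not part of this series; the actual parsing on the target side is not shown in this message.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the patch): turn a string such as
# "net0=1500,net1=9000" into { net0 => 1500, net1 => 9000 }.
sub parse_nets_host_mtu {
    my ($value) = @_;

    my $res = {};
    for my $entry (split(/,/, $value // '')) {
        # Each entry is expected to look like "net<index>=<mtu>",
        # matching what get_nets_host_mtu() pushes in the patch.
        if ($entry =~ m/^(net\d+)=(\d+)$/) {
            $res->{$1} = int($2);
        } else {
            die "unable to parse nets-host-mtu entry '$entry'\n";
        }
    }
    return $res;
}

my $parsed = parse_nets_host_mtu('net0=1500,net1=9000');
print "$_: $parsed->{$_}\n" for sort keys %$parsed;
```

Dying on a malformed entry is a design assumption of this sketch; a real implementation might prefer to warn and continue.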
* Re: [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices
2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices Fiona Ebner
@ 2025-09-04 18:11 ` Thomas Lamprecht
2025-09-05 8:54 ` Fiona Ebner
0 siblings, 1 reply; 15+ messages in thread
From: Thomas Lamprecht @ 2025-09-04 18:11 UTC (permalink / raw)
To: Proxmox VE development discussion, Fiona Ebner

On 04.09.25 at 14:42, Fiona Ebner wrote:
> The virtual hardware is generated differently (at least for i440fx
> machines) when host_mtu is set or not set on the netdev command line
> [0]. When the MTU is the same value as the default 1500, Proxmox VE
> did not add a host_mtu parameter. This is problematic for migration
> where host_mtu is present on one end of the migration, but not on the
> other [1]. Moreover, the effective setting in the guest (state) will
> still be the host_mtu from the source side, even if a different value
> is used for host_mtu on the target instance's commandline. This will
> not lead to an error loading the migration stream in QEMU, but having
> a larger host_mtu than the bridge MTU is still problematic for certain
> network traffic like
>> iperf3 -c 10.10.10.11 -u -l 2k
> when host_mtu=9000 and bridge MTU=1500.
>
> Pass the values from the source to the target during migration to be
> able to preserve them.

Which breaks migration from new to old, which can be fine, but seems
avoidable given that we got a tunnel that we can query stuff over.

Maybe we could at least catch the "Unknown option: nets-host-mtu"
error explicitly and add some context that the target likely just
needs to be updated to make the migration work.

>
> [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
> [1]: https://forum.proxmox.com/threads/live-vm-migration-fails.169537/post-796379
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>
> Changes in v3:
> * Adapt to moved/changed get_nets_host_mtu() helper. The Network
>   module does not exist yet in PVE 8, and parse_net() lives in
>   QemuServer itself, so the helper is also moved there.
>
>  src/PVE/QemuMigrate.pm                    |  8 +++++++
>  src/PVE/QemuServer.pm                     | 27 +++++++++++++++++++++++
>  src/test/MigrationTest/QemuMigrateMock.pm | 15 +++++++++++++
>  3 files changed, 50 insertions(+)
>
> diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
> index 28d7ac56..3cd7069a 100644
> --- a/src/PVE/QemuMigrate.pm
> +++ b/src/PVE/QemuMigrate.pm
> @@ -959,6 +959,10 @@ sub phase2_start_local_cluster {
>          push @$cmd, '--force-cpu', $start->{forcecpu};
>      }
>
> +    if ($start->{'nets-host-mtu'}) {
> +        push @$cmd, '--nets-host-mtu', $start->{'nets-host-mtu'};
> +    }
> +
>      if ($self->{storage_migration}) {
>          push @$cmd, '--targetstorage', ($self->{opts}->{targetstorage} // '1');
>      }
> @@ -1144,6 +1148,10 @@ sub phase2 {
>          },
>      };
>
> +    if (my $nets_host_mtu = PVE::QemuServer::get_nets_host_mtu($vmid, $conf)) {
> +        $params->{start_params}->{'nets-host-mtu'} = $nets_host_mtu;
> +    }
> +
>      my ($tunnel_info, $spice_port);
>
>      my @online_local_volumes = $self->filter_local_volumes('online');
> diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
> index 47c96726..5b08b3e3 100644
> --- a/src/PVE/QemuServer.pm
> +++ b/src/PVE/QemuServer.pm
> @@ -9533,4 +9533,31 @@ sub delete_ifaces_ipams_ips {
>      }
>  }
>
> +sub get_nets_host_mtu {
> +    my ($vmid, $conf) = @_;
> +
> +    my $nets_host_mtu = [];
> +    for my $opt (sort keys $conf->%*) {
> +        next if $opt !~ m/^net(\d+)$/;
> +        my $net = parse_net($conf->{$opt});
> +        next if $net->{model} ne 'virtio';
> +
> +        my $host_mtu = eval {
> +            mon_cmd(
> +                $vmid, 'qom-get',
> +                path => "/machine/peripheral/$opt",
> +                property => 'host_mtu',
> +            );
> +        };
> +        if (my $err = $@) {
> +            log_warn("$opt: could not query host_mtu - $err");
> +        } elsif (defined($host_mtu)) {
> +            push $nets_host_mtu->@*, "${opt}=${host_mtu}";
> +        } else {
> +            log_warn("$opt: got undefined value when querying host_mtu");
> +        }
> +    }
> +    return join(',', $nets_host_mtu->@*);
> +}
> +
>  1;
> diff --git a/src/test/MigrationTest/QemuMigrateMock.pm b/src/test/MigrationTest/QemuMigrateMock.pm
> index f678f9ec..05b2c5c1 100644
> --- a/src/test/MigrationTest/QemuMigrateMock.pm
> +++ b/src/test/MigrationTest/QemuMigrateMock.pm
> @@ -175,6 +175,21 @@ $MigrationTest::Shared::qemu_server_module->mock(
>          delete $expected_calls->{'vm_stop'};
>      },
>      del_nets_bridge_fdb => sub { return; },
> +    mon_cmd => sub {
> +        my ($vmid, $command, %params) = @_;
> +
> +        if ($command eq 'qom-get') {
> +            if (
> +                $params{path} =~ m|^/machine/peripheral/net\d+$|
> +                && $params{property} eq 'host_mtu'
> +            ) {
> +                return 1500;
> +            }
> +            die "mon_cmd (mocked) - implement me: $command for path '$params{path}' property"
> +                . " '$params{property}'";
> +        }
> +        die "mon_cmd (mocked) - implement me: $command";
> +    },
>  );
>
>  my $qemu_server_cpuconfig_module = Test::MockModule->new("PVE::QemuServer::CPUConfig");

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices
2025-09-04 18:11 ` Thomas Lamprecht
@ 2025-09-05 8:54 ` Fiona Ebner
2025-09-05 9:09 ` Thomas Lamprecht
0 siblings, 1 reply; 15+ messages in thread
From: Fiona Ebner @ 2025-09-05 8:54 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion

On 04.09.25 at 8:11 PM, Thomas Lamprecht wrote:
> On 04.09.25 at 14:42, Fiona Ebner wrote:
>> The virtual hardware is generated differently (at least for i440fx
>> machines) when host_mtu is set or not set on the netdev command line
>> [0]. When the MTU is the same value as the default 1500, Proxmox VE
>> did not add a host_mtu parameter. This is problematic for migration
>> where host_mtu is present on one end of the migration, but not on the
>> other [1]. Moreover, the effective setting in the guest (state) will
>> still be the host_mtu from the source side, even if a different value
>> is used for host_mtu on the target instance's commandline. This will
>> not lead to an error loading the migration stream in QEMU, but having
>> a larger host_mtu than the bridge MTU is still problematic for certain
>> network traffic like
>>> iperf3 -c 10.10.10.11 -u -l 2k
>> when host_mtu=9000 and bridge MTU=1500.
>>
>> Pass the values from the source to the target during migration to be
>> able to preserve them.
>
> Which breaks migration from new to old, which can be fine, but seems
> avoidable given that we got a tunnel that we can query stuff over.

How can we query? The old tunnel only supports very specific commands
like 'quit' and 'resume $vmid'. Note that remote migration using the new
tunnel version is not broken - an old node will just ignore the
additional parameter in the passed-along JSON.

We could do something like

ssh ... qm start 0 --nets-host-mtu

and match for "Unknown option: nets-host-mtu" for detection.

Alternatively, we could bump the pve-manager version and guard adding
the option via the pmxcfs 'version-info' node kv. That mechanism wasn't
super reliable in the past though.

> Maybe we could at least catch the "Unknown option: nets-host-mtu"
> error explicitly and add some context that the target likely just
> needs to be updated to make the migration work.

If we don't want to go for either of the above or if there isn't
another way to query, I'll go for that?

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread
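As a rough sketch of the detection idea discussed in this message: the only piece taken from the thread is the error text "Unknown option: nets-host-mtu". The helper name target_lacks_option and the way the output is obtained are assumptions made for illustration, not code from qemu-server.

```perl
use strict;
use warnings;

# Hypothetical helper: decide from the captured output of a probe command
# whether the target reported the given option as unknown.
sub target_lacks_option {
    my ($output, $option) = @_;

    # \Q...\E quotes the option name; the lookahead avoids matching a
    # longer option that merely starts with the same name.
    return $output =~ m/Unknown option: \Q$option\E(?![\w-])/ ? 1 : 0;
}

# Stand-in for the output that would be captured from the target node.
my $probe_output = "Unknown option: nets-host-mtu\n";

if (target_lacks_option($probe_output, 'nets-host-mtu')) {
    print "target node likely needs to be updated for this migration\n";
}
```

The same check could be applied either to a dummy probe command up front or to the error of the actual start command, which are the two variants weighed in the thread.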
* Re: [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices
2025-09-05 8:54 ` Fiona Ebner
@ 2025-09-05 9:09 ` Thomas Lamprecht
2025-09-05 9:17 ` Fiona Ebner
0 siblings, 1 reply; 15+ messages in thread
From: Thomas Lamprecht @ 2025-09-05 9:09 UTC (permalink / raw)
To: Fiona Ebner, Proxmox VE development discussion

On 05.09.25 at 10:54, Fiona Ebner wrote:
> On 04.09.25 at 8:11 PM, Thomas Lamprecht wrote:
>> On 04.09.25 at 14:42, Fiona Ebner wrote:
>>> The virtual hardware is generated differently (at least for i440fx
>>> machines) when host_mtu is set or not set on the netdev command line
>>> [0]. When the MTU is the same value as the default 1500, Proxmox VE
>>> did not add a host_mtu parameter. This is problematic for migration
>>> where host_mtu is present on one end of the migration, but not on the
>>> other [1]. Moreover, the effective setting in the guest (state) will
>>> still be the host_mtu from the source side, even if a different value
>>> is used for host_mtu on the target instance's commandline. This will
>>> not lead to an error loading the migration stream in QEMU, but having
>>> a larger host_mtu than the bridge MTU is still problematic for certain
>>> network traffic like
>>>> iperf3 -c 10.10.10.11 -u -l 2k
>>> when host_mtu=9000 and bridge MTU=1500.
>>>
>>> Pass the values from the source to the target during migration to be
>>> able to preserve them.
>>
>> Which breaks migration from new to old, which can be fine, but seems
>> avoidable given that we got a tunnel that we can query stuff over.
>
> How can we query? The old tunnel only supports very specific commands
> like 'quit' and 'resume $vmid'. Note that remote migration using the new
> tunnel version is not broken - an old node will just ignore the
> additional parameter in the passed-along JSON.

The absence of a command also gives you information.

>
> We could do something like
>
> ssh ... qm start 0 --nets-host-mtu
>
> and match for "Unknown option: nets-host-mtu" for detection.

Yeah, that's exactly what I wrote later in my reply.

> Alternatively, we could bump the pve-manager version and guard adding
> the option via the pmxcfs 'version-info' node kv. That mechanism wasn't
> super reliable in the past though.

FWIW, we now re-broadcast that periodically and IIRC even on pmxcfs
start up though.

>> Maybe we could at least catch the "Unknown option: nets-host-mtu"
>> error explicitly and add some context that the target likely just
>> needs to be updated to make the migration work.
>
> If we don't want to go for either of the above or if there isn't an
> other way to query, I'll go for that?

Would be fine for me, it's the simplest thing to do for now.

Adding some more fleshed out general approach for such things might
be nice to have available for the future. That could be some
versioning or a more structured capabilities query, that is split
into required ones (which block the migration) and hints, that are
for best-effort stuff, probably also including some basic version
info like qemu-server, as that often is needed to know if a
capability is required or not. Like here: when migrating to
another 8.x node it won't matter, but for a 9.x target node we
should enforce that e.g. nets-host-mtu is available.

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread
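To make the "required vs. hints" split from this message more concrete, here is a minimal sketch. Everything in it is hypothetical: no such capabilities query exists in the series, and the function name check_capabilities, the data layout, and the capability names are assumptions chosen only to illustrate the idea.

```perl
use strict;
use warnings;

# Hypothetical: compare the capabilities the source wants with those the
# target advertises. Missing required ones would block the migration,
# missing hints would only be reported.
sub check_capabilities {
    my ($wanted, $target_caps) = @_;

    my %have = map { $_ => 1 } @$target_caps;
    my @missing_required = grep { !$have{$_} } @{ $wanted->{required} // [] };
    my @missing_hints = grep { !$have{$_} } @{ $wanted->{hints} // [] };

    return (\@missing_required, \@missing_hints);
}

# Example with made-up capability names.
my $wanted = {
    required => ['nets-host-mtu'],
    hints => ['example-best-effort-feature'],
};
my ($missing_required, $missing_hints) = check_capabilities($wanted, ['nets-host-mtu']);

print "would block migration, missing: @$missing_required\n" if @$missing_required;
print "warning, target lacks hint: $_\n" for @$missing_hints;
```

Whether a capability counts as required could additionally depend on the target's version, as suggested above for 8.x versus 9.x targets; that decision is left out of this sketch.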
* Re: [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices
2025-09-05 9:09 ` Thomas Lamprecht
@ 2025-09-05 9:17 ` Fiona Ebner
0 siblings, 0 replies; 15+ messages in thread
From: Fiona Ebner @ 2025-09-05 9:17 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion

On 05.09.25 at 11:09 AM, Thomas Lamprecht wrote:
> On 05.09.25 at 10:54, Fiona Ebner wrote:
>> On 04.09.25 at 8:11 PM, Thomas Lamprecht wrote:
>>> On 04.09.25 at 14:42, Fiona Ebner wrote:
>>>> The virtual hardware is generated differently (at least for i440fx
>>>> machines) when host_mtu is set or not set on the netdev command line
>>>> [0]. When the MTU is the same value as the default 1500, Proxmox VE
>>>> did not add a host_mtu parameter. This is problematic for migration
>>>> where host_mtu is present on one end of the migration, but not on the
>>>> other [1]. Moreover, the effective setting in the guest (state) will
>>>> still be the host_mtu from the source side, even if a different value
>>>> is used for host_mtu on the target instance's commandline. This will
>>>> not lead to an error loading the migration stream in QEMU, but having
>>>> a larger host_mtu than the bridge MTU is still problematic for certain
>>>> network traffic like
>>>>> iperf3 -c 10.10.10.11 -u -l 2k
>>>> when host_mtu=9000 and bridge MTU=1500.
>>>>
>>>> Pass the values from the source to the target during migration to be
>>>> able to preserve them.
>>>
>>> Which breaks migration from new to old, which can be fine, but seems
>>> avoidable given that we got a tunnel that we can query stuff over.
>>
>> How can we query? The old tunnel only supports very specific commands
>> like 'quit' and 'resume $vmid'. Note that remote migration using the new
>> tunnel version is not broken - an old node will just ignore the
>> additional parameter in the passed-along JSON.
>
> The absence of a command also gives you information.

Okay, so you mean adding a new command and using that to detect that the
node is recent enough? What should that command be? The capabilities one
you suggest below?

>>
>> We could do something like
>>
>> ssh ... qm start 0 --nets-host-mtu
>>
>> and match for "Unknown option: nets-host-mtu" for detection.
>
> Yeah, that's exactly what I wrote later in my reply.

I thought you meant matching the error for the actual command. My
suggestion is to use a dummy command for early detection and to guard
using the new option for the actual command based on that.

>> Alternatively, we could bump the pve-manager version and guard adding
>> the option via the pmxcfs 'version-info' node kv. That mechanism wasn't
>> super reliable in the past though.
>
> FWIW, we now re-broadcast that periodically and IIRC even on pmxcfs
> start up though.

Yes, and if we really can't get the info we can err on the side of
"assume it's recent enough".

>>> Maybe we could at least catch the "Unknown option: nets-host-mtu"
>>> error explicitly and add some context that the target likely just
>>> needs to be updated to make the migration work.
>>
>> If we don't want to go for either of the above or if there isn't an
>> other way to query, I'll go for that?
>
> Would be fine for me, it's the simplest thing to do for now.
>
> Adding some more fleshed out general approach for such things might
> be nice to have available for the future. That could be some
> versioning or a more structured capabilities query, that is split
> into required ones (which block the migration) and hints, that are
> for best-effort stuff, probably also including some basic version
> info like qemu-server, as that often is needed to know if a
> capability is required or not. Like here: when migrating to
> another 8.x node it won't matter, but for a 9.x target node we
> should enforce that e.g. nets-host-mtu is available.

Sounds sensible.

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs
2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner
` (7 preceding siblings ...)
2025-09-04 12:40 ` [pve-devel] [PATCH qemu-server v3 stable-bookworm 8/8] migration: preserve host_mtu for virtio-net devices Fiona Ebner
@ 2025-09-04 13:06 ` Fabian Grünbichler
2025-09-04 18:16 ` [pve-devel] applied: " Thomas Lamprecht
9 siblings, 0 replies; 15+ messages in thread
From: Fabian Grünbichler @ 2025-09-04 13:06 UTC (permalink / raw)
To: Proxmox VE development discussion

consider this

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

haven't had time yet for in-depth tests. the 'nets-host-mtu' name reads
a bit strange, but there is precedent for similarly named stuff in the
code already, and the parameter is really for internal use only anyway,
so..

On September 4, 2025 2:40 pm, Fiona Ebner wrote:
> Changes in v3:
> * add part three - snapshot handling
> * backport full migration+start handling
> * schema description: explicitly mention that a value of 0 means to
>   not use host_mtu
> * die when host_mtu > bridge MTU also upon migration and expand error
>   message
> * less bloaty code by always mentioning migrated host_mtu value
> * move get_nets_host_mtu to network module for re-use with snapshots
> * avoid overly long line in tests
>
> Changes in v2:
> * push make tidy change already to master
> * add part two - migration
> * move version_guard() call to outside of print_netdevice_full() call
> * add comment about why host_mtu is always set in source code
>
> The virtual hardware is generated differently (at least for i440fx
> machines) when host_mtu is set or not set on the netdev command line
> [0]. When the MTU is the same value as the default 1500, Proxmox VE
> did not add a host_mtu parameter. This is problematic for migration
> where host_mtu is present on one end of the migration, but not on the
> other [1].
>
> Always set the host_mtu parameter starting with machine version
> 10.0+pve1 to avoid this issue going forward. For snapshots, the
> nets-host-mtu information is recorded in the snapshot config. When the
> information is not present, this series keeps the behavior on Proxmox
> VE 8 and Proxmox VE 9 as-is, i.e. loading a Proxmox VE 8 snapshot on
> Proxmox VE 9 when the bridge MTU has a mismatch can still be
> problematic. Loading snapshots made on the same major version works.
> The VM start parameter already provides an escape hatch. We could also
> think about doing a follow-up and automatically try to fallback to
> Proxmox VE 8 default behavior when loading the snapshot fails (for
> machine version < 10.0+pve1).
>
> Moreover, the effective setting in the guest (state) will
> still be the host_mtu from the source side, even if a different value
> is used for host_mtu on the target instance's commandline. This will
> not lead to an error loading the migration stream in QEMU, but having
> a larger host_mtu than the bridge MTU is still problematic for certain
> network traffic like
>> iperf3 -c 10.10.10.11 -u -l 2k
> when host_mtu=9000 and bridge MTU=1500.
>
> Add the necessary parameter for VM start and pass the values along for
> migration to preserve the values going forward.
>
> For Proxmox VE 8, the migration handling fixes are backported.
>
> stable-bookworm
>
> Fiona Ebner (2):
>   api: vm start: introduce nets-host-mtu parameter for migration compat
>   migration: preserve host_mtu for virtio-net devices
>
>
> master:
>
> Fiona Ebner (6):
>   virtio-net: fix migration between default/non-default MTUs starting
>     with machine version 10.0+pve1
>   api: vm start: introduce nets-host-mtu parameter for migration compat
>   migration: preserve host_mtu for virtio-net devices
>   snapshot: save vmstate: avoid using deprecated check_running()
>     function
>   snapshot: save vmstate: die when PID cannot be obtained
>   snapshot: introduce running-nets-host-mtu property
>
>  src/PVE/API2/Qemu.pm                          | 13 ++++
>  src/PVE/QemuConfig.pm                         | 13 ++--
>  src/PVE/QemuMigrate.pm                        |  8 +++
>  src/PVE/QemuServer.pm                         | 62 +++++++++++++++++--
>  src/PVE/QemuServer/Machine.pm                 |  6 ++
>  src/PVE/QemuServer/Network.pm                 | 29 +++++++++
>  src/PVE/QemuServer/RunState.pm                |  3 +-
>  src/test/MigrationTest/QemuMigrateMock.pm     | 15 +++++
>  src/test/cfg2cmd/bootorder-empty.conf.cmd     |  4 +-
>  src/test/cfg2cmd/bootorder-legacy.conf.cmd    |  4 +-
>  src/test/cfg2cmd/bootorder.conf.cmd           |  4 +-
>  src/test/cfg2cmd/efidisk-on-rbd.conf.cmd      |  4 +-
>  src/test/cfg2cmd/ide.conf.cmd                 |  4 +-
>  .../cfg2cmd/netdev-7.1-multiqueues.conf.cmd   |  2 +-
>  src/test/cfg2cmd/netdev-7.1.conf.cmd          |  2 +-
>  src/test/cfg2cmd/netdev_vxlan.conf.cmd        |  2 +-
>  src/test/cfg2cmd/q35-ide.conf.cmd             |  4 +-
>  .../q35-linux-hostpci-mapping.conf.cmd        |  4 +-
>  .../q35-linux-hostpci-multifunction.conf.cmd  |  4 +-
>  ...q35-linux-hostpci-x-pci-overrides.conf.cmd |  4 +-
>  src/test/cfg2cmd/q35-linux-hostpci.conf.cmd   |  4 +-
>  src/test/cfg2cmd/q35-simple.conf.cmd          |  4 +-
>  src/test/cfg2cmd/seabios_serial.conf.cmd      |  4 +-
>  src/test/cfg2cmd/simple-btrfs.conf.cmd        |  4 +-
>  .../cfg2cmd/simple-disk-passthrough.conf.cmd  |  4 +-
>  src/test/cfg2cmd/simple-rbd.conf.cmd          |  4 +-
>  src/test/cfg2cmd/simple-virtio-blk.conf.cmd   |  4 +-
>  .../cfg2cmd/simple-zfs-over-iscsi.conf.cmd    |  4 +-
>  src/test/cfg2cmd/simple1.conf.cmd             |  4 +-
>  29 files changed, 177 insertions(+), 50 deletions(-)
>
> --
> 2.47.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread
* [pve-devel] applied: [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs
2025-09-04 12:40 [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fiona Ebner
` (8 preceding siblings ...)
2025-09-04 13:06 ` [pve-devel] [PATCH-SERIES qemu-server v3 0/8] virtio-net: fix migration between default/non-default MTUs Fabian Grünbichler
@ 2025-09-04 18:16 ` Thomas Lamprecht
9 siblings, 0 replies; 15+ messages in thread
From: Thomas Lamprecht @ 2025-09-04 18:16 UTC (permalink / raw)
To: Proxmox VE development discussion, Fiona Ebner

On 04.09.25 at 14:41, Fiona Ebner wrote:
> Changes in v3:
> * add part three - snapshot handling
> * backport full migration+start handling
> * schema description: explicitly mention that a value of 0 means to
>   not use host_mtu
> * die when host_mtu > bridge MTU also upon migration and expand error
>   message
> * less bloaty code by always mentioning migrated host_mtu value
> * move get_nets_host_mtu to network module for re-use with snapshots
> * avoid overly long line in tests
>
> Changes in v2:
> * push make tidy change already to master
> * add part two - migration
> * move version_guard() call to outside of print_netdevice_full() call
> * add comment about why host_mtu is always set in source code
>
> The virtual hardware is generated differently (at least for i440fx
> machines) when host_mtu is set or not set on the netdev command line
> [0]. When the MTU is the same value as the default 1500, Proxmox VE
> did not add a host_mtu parameter. This is problematic for migration
> where host_mtu is present on one end of the migration, but not on the
> other [1].
>
> Always set the host_mtu parameter starting with machine version
> 10.0+pve1 to avoid this issue going forward. For snapshots, the
> nets-host-mtu information is recorded in the snapshot config. When the
> information is not present, this series keeps the behavior on Proxmox
> VE 8 and Proxmox VE 9 as-is, i.e. loading a Proxmox VE 8 snapshot on
> Proxmox VE 9 when the bridge MTU has a mismatch can still be
> problematic. Loading snapshots made on the same major version works.
> The VM start parameter already provides an escape hatch. We could also
> think about doing a follow-up and automatically try to fallback to
> Proxmox VE 8 default behavior when loading the snapshot fails (for
> machine version < 10.0+pve1).
>
> Moreover, the effective setting in the guest (state) will
> still be the host_mtu from the source side, even if a different value
> is used for host_mtu on the target instance's commandline. This will
> not lead to an error loading the migration stream in QEMU, but having
> a larger host_mtu than the bridge MTU is still problematic for certain
> network traffic like
>> iperf3 -c 10.10.10.11 -u -l 2k
> when host_mtu=9000 and bridge MTU=1500.
>
> Add the necessary parameter for VM start and pass the values along for
> migration to preserve the values going forward.
>
> For Proxmox VE 8, the migration handling fixes are backported.

Even though I think that breaking new to old here might be avoidable
without bending over backward much, I still applied the series now. An
annoying up-front error (even if potentially not required) is better
than a crashing VM and can still be improved later on.

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply [flat|nested] 15+ messages in thread