* [pve-devel] [PATCH qemu-server v3] fix #6608: expose viommu driver aw-bits option
@ 2025-09-05 14:15 Daniel Kral
  2025-09-08  9:35 ` Fiona Ebner
  2025-09-08 10:07 ` [pve-devel] applied: " Fiona Ebner
  0 siblings, 2 replies; 4+ messages in thread
From: Daniel Kral @ 2025-09-05 14:15 UTC
  To: pve-devel

Since QEMU 9.2 [0], the default I/O address space bit width of the Intel
vIOMMU driver is 48 bits instead of 39 bits, which causes the aw-bits
check introduced in [1] to trip on host CPUs with fewer than 48 bits of
physical address width:

vfio 0000:XX:YY.Z: Failed to set vIOMMU: aw-bits 48 > host aw-bits 39

For VFIO devices where a vIOMMU is in use, QEMU fetches the IOVA ranges
with the iommufd ioctl IOMMU_IOAS_IOVA_RANGES or the vfio_iommu_type1's
VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE info, so 'phys-bits' doesn't change
the behavior of the check.

Therefore, expose the 'aw-bits' option of the intel-iommu and
virtio-iommu QEMU drivers to allow users to set the value.

[0] qemu ddd84fd0c1 ("intel_iommu: Set default aw_bits to 48 starting from QEMU 9.2")
[1] qemu 77f6efc0ab ("intel_iommu: Check compatibility with host IOMMU capabilities")

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
changes from v2:
  - changed quotes from qemu-devel lore to upstream qemu commits
  - added die when aw-bits is set without viommu
  - added test case for aw-bits without viommu

 src/PVE/QemuServer.pm                         |  9 +++++--
 src/PVE/QemuServer/Machine.pm                 | 24 +++++++++++++++---
 .../cfg2cmd/q35-no-viommu-with-aw-bits.conf   |  3 +++
 .../cfg2cmd/q35-viommu-intel-aw-bits.conf     |  2 ++
 .../cfg2cmd/q35-viommu-intel-aw-bits.conf.cmd | 25 +++++++++++++++++++
 .../cfg2cmd/q35-viommu-virtio-aw-bits.conf    |  2 ++
 .../q35-viommu-virtio-aw-bits.conf.cmd        | 25 +++++++++++++++++++
 7 files changed, 85 insertions(+), 5 deletions(-)
 create mode 100644 src/test/cfg2cmd/q35-no-viommu-with-aw-bits.conf
 create mode 100644 src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf
 create mode 100644 src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf.cmd
 create mode 100644 src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf
 create mode 100644 src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf.cmd

diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 9597d316..04e988c7 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -3903,11 +3903,16 @@ sub config_to_command {
     PVE::QemuServer::Machine::assert_valid_machine_property($machine_conf);
 
     if (my $viommu = $machine_conf->{viommu}) {
+        my $viommu_devstr = '';
+        $viommu_devstr .= ",aw-bits=$machine_conf->{'aw-bits'}" if $machine_conf->{'aw-bits'};
+
         if ($viommu eq 'intel') {
-            unshift @$devices, '-device', 'intel-iommu,intremap=on,caching-mode=on';
+            $viommu_devstr = "intel-iommu,intremap=on,caching-mode=on$viommu_devstr";
+            unshift @$devices, '-device', $viommu_devstr;
             push @$machineFlags, 'kernel-irqchip=split';
         } elsif ($viommu eq 'virtio') {
-            push @$devices, '-device', 'virtio-iommu-pci';
+            $viommu_devstr = "virtio-iommu-pci$viommu_devstr";
+            push @$devices, '-device', $viommu_devstr;
         }
     }
 
diff --git a/src/PVE/QemuServer/Machine.pm b/src/PVE/QemuServer/Machine.pm
index b61667e0..7e6ee6a4 100644
--- a/src/PVE/QemuServer/Machine.pm
+++ b/src/PVE/QemuServer/Machine.pm
@@ -58,6 +58,16 @@ my $machine_fmt = {
         enum => ['intel', 'virtio'],
         optional => 1,
     },
+    'aw-bits' => {
+        type => 'number',
+        description => "Specifies the vIOMMU address space bit width.",
+        verbose_description => "Specifies the vIOMMU address space bit width.\n\n"
+            . "Intel vIOMMU supports a bit width of either 39 or 48 bits and"
+            . " VirtIO vIOMMU supports any bit width between 32 and 64 bits.",
+        minimum => 32,
+        maximum => 64,
+        optional => 1,
+    },
     'enable-s3' => {
         type => 'boolean',
         description =>
@@ -112,10 +122,18 @@ sub default_machine_for_arch {
 
 sub assert_valid_machine_property {
     my ($machine_conf) = @_;
-    my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
-    if ($machine_conf->{viommu} && $machine_conf->{viommu} eq "intel" && !$q35) {
-        die "to use Intel vIOMMU please set the machine type to q35\n";
+    if ($machine_conf->{viommu} && $machine_conf->{viommu} eq "intel") {
+        my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
+        die "to use Intel vIOMMU please set the machine type to q35\n" if !$q35;
+
+        die "Intel vIOMMU supports only 39 or 48 bits as address width\n"
+            if $machine_conf->{'aw-bits'}
+            && $machine_conf->{'aw-bits'} != 39
+            && $machine_conf->{'aw-bits'} != 48;
     }
+
+    die "cannot set aw-bits if no vIOMMU is configured\n"
+        if $machine_conf->{'aw-bits'} && !$machine_conf->{viommu};
 }
 
 sub machine_type_is_q35 {
diff --git a/src/test/cfg2cmd/q35-no-viommu-with-aw-bits.conf b/src/test/cfg2cmd/q35-no-viommu-with-aw-bits.conf
new file mode 100644
index 00000000..06db33ab
--- /dev/null
+++ b/src/test/cfg2cmd/q35-no-viommu-with-aw-bits.conf
@@ -0,0 +1,3 @@
+# TEST: Check that aw-bits cannot be set without viommu
+# EXPECT_ERROR: cannot set aw-bits if no vIOMMU is configured
+machine: q35,aw-bits=39
diff --git a/src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf b/src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf
new file mode 100644
index 00000000..9e84e42e
--- /dev/null
+++ b/src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf
@@ -0,0 +1,2 @@
+# TEST: Check if aw-bits are propagated correctly to intel-iommu device
+machine: q35,viommu=intel,aw-bits=39
diff --git a/src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf.cmd b/src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf.cmd
new file mode 100644
index 00000000..030ccaa5
--- /dev/null
+++ b/src/test/cfg2cmd/q35-viommu-intel-aw-bits.conf.cmd
@@ -0,0 +1,25 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'vm8006,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect-ms=5000' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smp '1,sockets=1,cores=1,maxcpus=1' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 512 \
+  -global 'ICH9-LPC.disable_s3=1' \
+  -global 'ICH9-LPC.disable_s4=1' \
+  -device 'intel-iommu,intremap=on,caching-mode=on,aw-bits=39' \
+  -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
+  -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
+  -device 'VGA,id=vga,bus=pcie.0,addr=0x1' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=q35+pve0,kernel-irqchip=split'
diff --git a/src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf b/src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf
new file mode 100644
index 00000000..dd8ef1fd
--- /dev/null
+++ b/src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf
@@ -0,0 +1,2 @@
+# TEST: Check if aw-bits are propagated correctly to virtio-iommu-pci device
+machine: q35,viommu=virtio,aw-bits=39
diff --git a/src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf.cmd b/src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf.cmd
new file mode 100644
index 00000000..c3b12eee
--- /dev/null
+++ b/src/test/cfg2cmd/q35-viommu-virtio-aw-bits.conf.cmd
@@ -0,0 +1,25 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'vm8006,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect-ms=5000' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smp '1,sockets=1,cores=1,maxcpus=1' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 512 \
+  -global 'ICH9-LPC.disable_s3=1' \
+  -global 'ICH9-LPC.disable_s4=1' \
+  -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
+  -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
+  -device 'VGA,id=vga,bus=pcie.0,addr=0x1' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -device 'virtio-iommu-pci,aw-bits=39' \
+  -machine 'type=q35+pve0'
-- 
2.47.2
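
For readers of this archive, the rules added by the patch above can be tried
outside of qemu-server with a minimal standalone sketch. The sub names
check_aw_bits and viommu_device_string are hypothetical and not part of the
patch, and the wording of the range error is an assumption (in the patch the
32..64 range is enforced by the 'aw-bits' schema entry, not by a die). The
q35 requirement for Intel vIOMMU is left out here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): mirrors the aw-bits checks
# added to assert_valid_machine_property() plus the 32..64 schema range.
sub check_aw_bits {
    my ($machine_conf) = @_;

    my $aw_bits = $machine_conf->{'aw-bits'};
    return if !$aw_bits;    # nothing to validate

    my $viommu = $machine_conf->{viommu};
    die "cannot set aw-bits if no vIOMMU is configured\n" if !$viommu;

    # Assumption: this message is made up for the sketch.
    die "aw-bits must be between 32 and 64\n" if $aw_bits < 32 || $aw_bits > 64;

    die "Intel vIOMMU supports only 39 or 48 bits as address width\n"
        if $viommu eq 'intel' && $aw_bits != 39 && $aw_bits != 48;

    return;
}

# Hypothetical helper: builds the '-device' argument the same way the
# patched config_to_command() does.
sub viommu_device_string {
    my ($machine_conf) = @_;

    my $viommu = $machine_conf->{viommu} or return;
    my $suffix = $machine_conf->{'aw-bits'} ? ",aw-bits=$machine_conf->{'aw-bits'}" : '';

    return "intel-iommu,intremap=on,caching-mode=on$suffix" if $viommu eq 'intel';
    return "virtio-iommu-pci$suffix" if $viommu eq 'virtio';
    return;
}

# The three configurations used by the new cfg2cmd test cases.
for my $conf (
    { type => 'q35', viommu => 'intel',  'aw-bits' => 39 },
    { type => 'q35', viommu => 'virtio', 'aw-bits' => 39 },
    { type => 'q35', 'aw-bits' => 39 },
) {
    my $devstr = eval { check_aw_bits($conf); viommu_device_string($conf) };
    print defined($devstr) ? "$devstr\n" : "error: $@";
}
```

The first two configurations print the same device strings that appear in the
expected .conf.cmd files above, and the third prints the error expected by
q35-no-viommu-with-aw-bits.conf.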



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* Re: [pve-devel] [PATCH qemu-server v3] fix #6608: expose viommu driver aw-bits option
  2025-09-05 14:15 [pve-devel] [PATCH qemu-server v3] fix #6608: expose viommu driver aw-bits option Daniel Kral
@ 2025-09-08  9:35 ` Fiona Ebner
  2025-09-08  9:40   ` Daniel Kral
  2025-09-08 10:07 ` [pve-devel] applied: " Fiona Ebner
  1 sibling, 1 reply; 4+ messages in thread
From: Fiona Ebner @ 2025-09-08  9:35 UTC
  To: Proxmox VE development discussion, Daniel Kral

On 05.09.25 at 4:15 PM, Daniel Kral wrote:
> Since QEMU 9.2 [0], the default I/O address space bit width of the Intel
> vIOMMU driver is 48 bits instead of 39 bits, which causes the aw-bits
> check introduced in [1] to trip on host CPUs with fewer than 48 bits of
> physical address width:
> 
> vfio 0000:XX:YY.Z: Failed to set vIOMMU: aw-bits 48 > host aw-bits 39
> 
> For VFIO devices where a vIOMMU is in use, QEMU fetches the IOVA ranges
> with the iommufd ioctl IOMMU_IOAS_IOVA_RANGES or the vfio_iommu_type1's
> VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE info, so 'phys-bits' doesn't change
> the behavior of the check.
> 
> Therefore, expose the 'aw-bits' option of the intel-iommu and
> virtio-iommu QEMU drivers to allow users to set the value.
> 
> [0] qemu ddd84fd0c1 ("intel_iommu: Set default aw_bits to 48 starting from QEMU 9.2")
> [1] qemu 77f6efc0ab ("intel_iommu: Check compatibility with host IOMMU capabilities")
> 
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>

I'll go ahead and apply this and the below, if that addition is fine by you?

> commit 05eb8e6394ca83e53cbf6a01e1a8848ff3d4d3e8
> Author: Fiona Ebner <f.ebner@proxmox.com>
> Date:   Mon Sep 8 10:37:24 2025 +0200
> 
>     cfg2cmd: inform users that setting guest-phys-bits might be necessary when setting aw-bits
>     
>     Until QEMU warns about this itself, inform the users here. Commit
>     message below copied from [1].
>     
>     If a virtual machine is set up with an intel-iommu device, QEMU
>     allocates and maps the (virtual) I/O address space (IOAS) for a VFIO
>     passthrough device with iommufd.
>     
>     In case of a mismatch between the address widths of the host CPU and
>     the IOMMU, the guest physical address space (GPAS) and memory-type
>     range registers (MTRRs) are set up to the host CPU's address width,
>     which causes the IOAS to be allocated and mapped outside of the
>     IOMMU's maximum guest address width (MGAW) and causes the following
>     error from QEMU (the error message is copied from the user forum [0]):
>     
>         kvm: vfio_container_dma_map(0x5c9222494280, 0x380000000000, 0x10000, 0x78075ee70000) = -22 (Invalid argument)
>     
>     [0]: https://forum.proxmox.com/threads/169586/page-3#post-795717
>     [1]: https://lore.proxmox.com/pve-devel/20250902112307.124706-5-d.kral@proxmox.com/
>     
>     Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  src/PVE/QemuServer.pm | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
> index c428e2d7..bf229610 100644
> --- a/src/PVE/QemuServer.pm
> +++ b/src/PVE/QemuServer.pm
> @@ -3944,7 +3944,13 @@ sub config_to_command {
>  
>      if (my $viommu = $machine_conf->{viommu}) {
>          my $viommu_devstr = '';
> -        $viommu_devstr .= ",aw-bits=$machine_conf->{'aw-bits'}" if $machine_conf->{'aw-bits'};
> +        if ($machine_conf->{'aw-bits'}) {
> +            $viommu_devstr .= ",aw-bits=$machine_conf->{'aw-bits'}";
> +
> +            # TODO remove message once this gets properly checked/warned about in QEMU itself.
> +            print "vIOMMU 'aw-bits' set to $machine_conf->{'aw-bits'}. Sometimes it is necessary to"
> +                . " set the CPU's 'guest-phys-bits' to the same value.\n";
> +        }
>  
>          if ($viommu eq 'intel') {
>              $viommu_devstr = "intel-iommu,intremap=on,caching-mode=on$viommu_devstr";




* Re: [pve-devel] [PATCH qemu-server v3] fix #6608: expose viommu driver aw-bits option
  2025-09-08  9:35 ` Fiona Ebner
@ 2025-09-08  9:40   ` Daniel Kral
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Kral @ 2025-09-08  9:40 UTC
  To: Fiona Ebner, Proxmox VE development discussion

On Mon Sep 8, 2025 at 11:35 AM CEST, Fiona Ebner wrote:
> I'll go ahead and apply this and the below, if that addition is fine by you?

Yes, looks good to me, thanks!



* [pve-devel] applied: [PATCH qemu-server v3] fix #6608: expose viommu driver aw-bits option
  2025-09-05 14:15 [pve-devel] [PATCH qemu-server v3] fix #6608: expose viommu driver aw-bits option Daniel Kral
  2025-09-08  9:35 ` Fiona Ebner
@ 2025-09-08 10:07 ` Fiona Ebner
  1 sibling, 0 replies; 4+ messages in thread
From: Fiona Ebner @ 2025-09-08 10:07 UTC
  To: pve-devel, Daniel Kral

On Fri, 05 Sep 2025 16:15:06 +0200, Daniel Kral wrote:
> Since QEMU 9.2 [0], the default I/O address space bit width of the Intel
> vIOMMU driver is 48 bits instead of 39 bits, which causes the aw-bits
> check introduced in [1] to trip on host CPUs with fewer than 48 bits of
> physical address width:
> 
> vfio 0000:XX:YY.Z: Failed to set vIOMMU: aw-bits 48 > host aw-bits 39
> 
> [...]

Applied, together with my follow-up, thanks!

[1/1] fix #6608: expose viommu driver aw-bits option
      commit: dc52c006ce0527181556aaa363f49082cd613b5c

