From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <w.bumiller@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id A703294854
 for <pve-devel@lists.proxmox.com>; Fri, 13 Jan 2023 10:51:52 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 883A035F1
 for <pve-devel@lists.proxmox.com>; Fri, 13 Jan 2023 10:51:52 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Fri, 13 Jan 2023 10:51:50 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 1F1F844874
 for <pve-devel@lists.proxmox.com>; Fri, 13 Jan 2023 10:51:50 +0100 (CET)
Date: Fri, 13 Jan 2023 10:51:48 +0100
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Markus Frank <m.frank@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Message-ID: <20230113095148.27nhzxmqdr4nmr6h@fwblub>
References: <20221125140857.121622-1-m.frank@proxmox.com>
 <20221125140857.121622-3-m.frank@proxmox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20221125140857.121622-3-m.frank@proxmox.com>
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.211 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [qemu.pm, qemuconfig.pm, machine.pm, qemu.org, qemuserver.pm]
Subject: Re: [pve-devel] [PATCH qemu-server v4 2/5] fix #3784: Parameter for
 guest vIOMMU & machine as property-string
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Fri, 13 Jan 2023 09:51:52 -0000

On Fri, Nov 25, 2022 at 03:08:54PM +0100, Markus Frank wrote:
> vIOMMU enables the option to passthrough pci devices to L2 VMs
> in L1 VMs via Nested Virtualisation.
> 
> QEMU-Parameters:
> https://www.qemu.org/docs/master/system/qemu-manpage.html
> https://wiki.qemu.org/Features/VT-d
> 
> -machine ...,kernel-irqchip=split:
> 
> "split" because of intremap see below.
> 
> 
> -device intel-iommu:

AFAICT qemu also has an amd-iommu - so shouldn't we check the host arch
for which variant we need to use?
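
Roughly what I have in mind, as an untested sketch: the helper names below
are made up and do not exist in qemu-server, and it assumes a Linux host
where /proc/cpuinfo carries a `vendor_id` line. `intel-iommu` and
`amd-iommu` are the two QEMU device names in question.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of qemu-server): map a /proc/cpuinfo
# vendor_id string to the matching QEMU vIOMMU device name.
sub viommu_device_for_vendor {
    my ($vendor) = @_;
    return 'intel-iommu' if $vendor eq 'GenuineIntel';
    return 'amd-iommu' if $vendor eq 'AuthenticAMD';
    die "no known vIOMMU device for CPU vendor '$vendor'\n";
}

# Hypothetical helper: return the vendor_id of the first CPU listed in
# /proc/cpuinfo. Assumes an x86 Linux host.
sub host_cpu_vendor {
    open(my $fh, '<', '/proc/cpuinfo')
	or die "cannot read /proc/cpuinfo: $!\n";
    while (my $line = <$fh>) {
	return $1 if $line =~ m/^vendor_id\s*:\s*(\S+)/;
    }
    die "no vendor_id found in /proc/cpuinfo\n";
}
```

The caller would then use `viommu_device_for_vendor(host_cpu_vendor())`
instead of hardcoding `intel-iommu`. Whether the other parameters below
apply to amd-iommu as well would need to be checked separately.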

> 
> * caching-mode=on:
> 
> "It is required for -device vfio-pci to work with the VT-d device, because host
> assigned devices requires to setup the DMA mapping on the host before guest DMA
> starts."
> 
> * intremap=on:
> 
> "This enables interrupt remapping feature. It's required to enable complete
> x2apic. Currently it only supports kvm kernel-irqchip modes off or split, while
> full kernel-irqchip is not yet supported."
> 
> 
> Signed-off-by: Markus Frank <m.frank@proxmox.com>
> ---
> 
> for dmar on virtio-devices:
> 
> * device-iotlb
> 
> "This enables device-iotlb capability for the emulated VT-d device. So far
> virtio/vhost should be the only real user for this parameter, paired with
> ats=on configured for the device."
> 
> * disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on:
> 
> I did not find any good documentation.
> Maybe someone can explain these parameters and how to use them right.
> As I tried them with virtio-net-pci I got about 4-9 times less transfer-speed
> when sending then without them.

I mean, the viommu adds overhead, so I'd expect some downsides.

- iommu_platform=on:
Now, normally virtio devices can just directly access the guest memory
since the hypervisor has full access. `iommu_platform=on` disables this,
and it'll go through some generic DMA process that is supposed to deal
with things such as AMD-SEV where the hypervisor doesn't actually have
access to the full guest memory. I'd expect a large performance hit from
that.

I don't expect the others to make much of a difference. In fact, AFAICT
disable-legacy shouldn't do much at all on modern guests.

- disable-legacy=on:
Virtio has evolved quite a bit, and AFAICT this option disables support
for the "legacy" (pre-virtio-1.0) parts. I don't know the details, but
you can probably read them in the virtio spec. It mentions things such as
the pci configuration space having been native-endian rather than
little-endian as is defined by PCI (apparently).

There are apparently 3 "flavors" of virtio devices: legacy,
transitional (supporting "IO" and "MMIO" modes, according to qemu's
docs/pcie.txt), and modern. Qemu seems to decide the defaults there
depending on whether the device is on a pci or pcie port.
disable-legacy and disable-modern override this explicitly.
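
If "dmar for virtio" does become its own feature later, the device string
could be extended along these lines. This is only a sketch: the helper
name and the `$dmar` flag are made up for illustration, and the property
list is just the one quoted from your notes above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of qemu-server): append the virtio
# properties quoted above to an existing -device string when $dmar is set.
sub virtio_dmar_device {
    my ($device, $dmar) = @_;
    return $device if !$dmar;

    my @props = (
	'disable-legacy=on',   # drop the pre-virtio-1.0 interface
	'disable-modern=off',  # keep the virtio-1.0 interface
	'iommu_platform=on',   # go through the (v)IOMMU for DMA
	'ats=on',              # pairs with device-iotlb on the vIOMMU device
    );
    return join(',', $device, @props);
}

print virtio_dmar_device('virtio-net-pci,netdev=net0', 1), "\n";
```

With the flag unset the device string is returned unchanged, so existing
configs would not be affected.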

> 
> However these Parameters seem not to be necessary for passthroughing
> Assigned Devices, so I would say "dmar for virtio" would be its own
> separate feature.
> 
> v4:
> * added kvm/q35 checks in API
> * reused pve-qemu-machine
> 
> v3:
> * replaced old machine type with property-string with viommu-parameter
> 
> v2:
> * moved viommu-parameter inside of machine_fmt and added it the new
> parameter machine_properties
> new Config -> machine_properties: viommu=1,etc
> * check if kvm and q35 are set
> 
> 
>  PVE/API2/Qemu.pm          | 21 ++++++++++++---
>  PVE/QemuConfig.pm         |  3 ++-
>  PVE/QemuServer.pm         | 55 ++++++++++++++++++++++++++++++++++++---
>  PVE/QemuServer/Machine.pm |  6 +++--
>  4 files changed, 75 insertions(+), 10 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index badfc37..5268e56 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -979,13 +979,19 @@ __PACKAGE__->register_method({
>  			$conf->{vmgenid} = PVE::QemuServer::generate_uuid();
>  		    }
>  
> -		    my $machine = $conf->{machine};
> +		    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +		    my $machine = $machine_conf->{type};
>  		    if (!$machine || $machine =~ m/^(?:pc|q35|virt)$/) {
>  			# always pin Windows' machine version on create, they get to easily confused
> -			if (PVE::QemuServer::Helpers::windows_version($conf->{ostype})) {
> -			    $conf->{machine} = PVE::QemuServer::windows_get_pinned_machine_version($machine);
> +			if (PVE::QemuServer::windows_version($conf->{ostype})) {

You dropped the `Helpers::` part here, is this intentional? AFAICT
`windows_version` still lives in Helpers.pm?

> +			    $machine_conf->{type} = PVE::QemuServer::windows_get_pinned_machine_version($machine);
> +			    $conf->{machine} = PVE::QemuServer::print_machine($machine_conf);
>  			}
>  		    }
> +		    my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
> +		    if ((!$conf->{kvm} || !$q35) && $machine_conf->{viommu}) {
> +			die "to use vIOMMU please enable kvm and set the machine type to q35\n"
> +		    }
>  
>  		    PVE::QemuConfig->write_config($vmid, $conf);
>