From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 09620D0FD
 for <pve-devel@lists.proxmox.com>; Thu, 17 Aug 2023 14:29:03 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id E01403051F
 for <pve-devel@lists.proxmox.com>; Thu, 17 Aug 2023 14:29:02 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Thu, 17 Aug 2023 14:29:02 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id CC4EA42012
 for <pve-devel@lists.proxmox.com>; Thu, 17 Aug 2023 14:29:01 +0200 (CEST)
Message-ID: <a4840906-d627-40f5-f533-6d5ad218a092@proxmox.com>
Date: Thu, 17 Aug 2023 14:29:00 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.13.1
Content-Language: en-US
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Markus Frank <m.frank@proxmox.com>
References: <20230118135800.131382-1-m.frank@proxmox.com>
 <20230118135800.131382-3-m.frank@proxmox.com>
From: Fiona Ebner <f.ebner@proxmox.com>
In-Reply-To: <20230118135800.131382-3-m.frank@proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 1.524 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -3.165 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [qemuconfig.pm, machine.pm, qemu.org, qemuserver.pm]
Subject: Re: [pve-devel] [PATCH qemu-server v5 2/5] feature #3784: Parameter
 for guest vIOMMU & machine as property-string
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Thu, 17 Aug 2023 12:29:03 -0000

Am 18.01.23 um 14:57 schrieb Markus Frank:
> vIOMMU enables the option to passthrough pci devices to L2 VMs
> in L1 VMs via Nested Virtualisation.
> 
> QEMU-Parameters:
> https://www.qemu.org/docs/master/system/qemu-manpage.html
> https://wiki.qemu.org/Features/VT-d
> 
> -machine ...,kernel-irqchip=split:
> 
> "split" because of intremap see below.
> 
> -device intel-iommu:
> 
> * caching-mode=on:
> 
> "It is required for -device vfio-pci to work with the VT-d device, because host
> assigned devices requires to setup the DMA mapping on the host before guest DMA
> starts."
> 
> * intremap=on:
> 
> "This enables interrupt remapping feature. It's required to enable complete
> x2apic. Currently it only supports kvm kernel-irqchip modes off or split, while
> full kernel-irqchip is not yet supported."
> 
> Signed-off-by: Markus Frank <m.frank@proxmox.com>
> ---

This one needs a rebase

(...)

> +		    my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
> +		    my $kvm = $conf->{kvm};
> +		    $kvm //= 1 if PVE::QemuServer::is_native($arch);
> +		    if ($machine_conf->{viommu} && (!$kvm || !$q35)) {
> +			die "to use vIOMMU please enable kvm and set the machine type to q35\n";
> +		    }
>  
>  		    PVE::QemuConfig->write_config($vmid, $conf);
>  
> @@ -1770,7 +1778,16 @@ my $update_vm_api  = sub {
>  		} elsif ($opt eq 'tags') {
>  		    assert_tag_permissions($vmid, $conf->{$opt}, $param->{$opt}, $rpcenv, $authuser);
>  		    $conf->{pending}->{$opt} = PVE::GuestHelpers::get_unique_tags($param->{$opt});
> -		} else {
> +		} elsif ($opt eq 'machine') {
> +		    my $machine_conf = PVE::QemuServer::parse_machine($param->{$opt});
> +		    my $q35 = $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
> +		    my $kvm = $conf->{kvm};
> +		    $kvm //= 1 if PVE::QemuServer::is_native($arch);
> +		    if ($machine_conf->{viommu} && (!$kvm || !$q35)) {
> +			die "to use vIOMMU please enable kvm and set the machine type to q35\n";
> +		    }

Maybe worth adding a helper function taking in the config and the
machine option. It's the very same check as above.

> +		    $conf->{pending}->{$opt} = $param->{$opt};
> +		}else {

Style nit: missing space before else

>  		    $conf->{pending}->{$opt} = $param->{$opt};
>  
>  		    if ($opt eq 'boot') {
> diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
> index 051382c..7c998ef 100644
> --- a/PVE/QemuConfig.pm
> +++ b/PVE/QemuConfig.pm
> @@ -433,7 +433,8 @@ sub __snapshot_rollback_hook {
>  	} else {
>  	    # Note: old code did not store 'machine', so we try to be smart
>  	    # and guess the snapshot was generated with kvm 1.4 (pc-i440fx-1.4).
> -	    $data->{forcemachine} = $conf->{machine} || 'pc-i440fx-1.4';
> +	    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +	    $data->{forcemachine} = $machine_conf->{type} || 'pc-i440fx-1.4';
>  
>  	    # we remove the 'machine' configuration if not explicitly specified
>  	    # in the original config.
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 987908d..55c11d5 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -124,6 +124,19 @@ PVE::JSONSchema::register_standard_option('pve-qemu-machine', {
>  	optional => 1,
>  });
>  
> +my $machine_fmt = {
> +    type => get_standard_option('pve-qemu-machine', {

Any reason against changing the standard option itself to be the whole
property string rather than keep the option just being the type? I
noticed that 'runningmachine' still uses only
"get_standard_option('pve-qemu-machine'", but that is wrong after this
patch. Changing the standard option itself would avoid that.

> +	default_key => 1,
> +	format_description => "pve-qemu-machine-type",

That format description is not very telling at all. Usually, this is
used to clarify what exact format the string is, e.g. base64. I don't
think it's needed here and there's already a description of the property
itself.

> +    }),
> +    viommu => {
> +	type => 'boolean',
> +	description => "enable guest vIOMMU (needs kvm to be enabled and q35 to be set as machine)",

Nit: "as machine type" sounds slightly better now that machine is more
than just the type

> +	default => 0,
> +	optional => 1,
> +    },
> +};
> +
>  # FIXME: remove in favor of just using the INotify one, it's cached there exactly the same way
>  my $nodename_cache;
>  sub nodename {
> @@ -626,7 +639,12 @@ EODESCR
>  	pattern => $PVE::QemuServer::CPUConfig::qemu_cmdline_cpu_re,
>  	format_description => 'QEMU -cpu parameter'
>      },
> -    machine => get_standard_option('pve-qemu-machine'),
> +    machine => {
> +	description => "Specifies the Qemu machine type.",

Nit: QEMU should be capitalized

> +	type => 'string',
> +	optional => 1,
> +	format => $machine_fmt,
> +    },
>      arch => {
>  	description => "Virtual processor architecture. Defaults to the host.",
>  	optional => 1,
> @@ -2134,6 +2152,21 @@ sub parse_watchdog {
>      return $res;
>  }
>  
> +sub parse_machine {
> +    my ($value) = @_;
> +
> +    return if !$value;
> +
> +    my $res = eval { parse_property_string($machine_fmt, $value) };
> +    die $@ if $@;

No need for eval if you just die with the exact same error ;)

> +    return $res;
> +}
> +
> +sub print_machine {
> +    my ($machine_conf) = @_;
> +    return PVE::JSONSchema::print_property_string($machine_conf, $machine_fmt);
> +}
> +
>  sub parse_guest_agent {
>      my ($conf) = @_;
>  

(...)

> @@ -4137,6 +4174,15 @@ sub config_to_command {
>      }
>      push @$machineFlags, "type=${machine_type_min}";
>  
> +    if ($machine_conf->{viommu} && (!$kvm || !$q35)) {
> +        die "to use vIOMMU please enable kvm and set the machine type to q35\n";
> +    }
> +
> +    if ($machine_conf->{viommu}) {

Style nit: the check for kvm+q35 and the die from above could be moved
into this block

Or instead re-use the check helper I suggested above

> +	unshift @$devices, '-device', "intel-iommu,intremap=on,caching-mode=on";
> +	push @$machineFlags, 'kernel-irqchip=split';
> +    }
> +
>      push @$cmd, @$devices;
>      push @$cmd, '-rtc', join(',', @$rtcFlags) if scalar(@$rtcFlags);
>      push @$cmd, '-machine', join(',', @$machineFlags) if scalar(@$machineFlags);
> diff --git a/PVE/QemuServer/Machine.pm b/PVE/QemuServer/Machine.pm
> index d9429ed..bfbde59 100644
> --- a/PVE/QemuServer/Machine.pm
> +++ b/PVE/QemuServer/Machine.pm
> @@ -15,7 +15,8 @@ our $PVE_MACHINE_VERSION = {
>  sub machine_type_is_q35 {
>      my ($conf) = @_;
>  
> -    return $conf->{machine} && ($conf->{machine} =~ m/q35/) ? 1 : 0;
> +    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +    return $machine_conf->{type} && ($machine_conf->{type} =~ m/q35/) ? 1 : 0;
>  }

The parse and print functions and option definition should really live
inside here, i.e. Machine.pm. The Machine module should not call back
into PVE::QemuServer if we can somehow avoid it.

>  
>  sub current_from_query_machines {
> @@ -120,7 +121,8 @@ sub qemu_machine_pxe {
>  
>      my $machine =  get_current_qemu_machine($vmid);
>  
> -    if ($conf->{machine} && $conf->{machine} =~ m/\.pxe$/) {
> +    my $machine_conf = PVE::QemuServer::parse_machine($conf->{machine});
> +    if ($machine_conf->{type} && $machine_conf->{type} =~ m/\.pxe$/) {
>  	$machine .= '.pxe';
>      }
>