From mboxrd@z Thu Jan 1 00:00:00 1970
From: Noel Ullreich
To: pve-devel@lists.proxmox.com
Date: Tue, 14 Mar 2023 13:48:04 +0100
Message-Id: <20230314124804.62223-1-n.ullreich@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [PATCH pve-docs] update the PCI(e) docs
List-Id: Proxmox VE development discussion

A little update to the PCI(e) docs with the plan of reworking the PCI wiki as
well.

Some questions and reasoning to the patch:
* I would only mention the ACS patch in the PCI examples wiki, since it is a
  last-ditch effort to get IOMMU to work and who knows how long we will support
  the patch.
* Should I move the blacklisting example to the example-wiki and just link to
  it? I don't want people blindly copy-pasting commands. Same goes for the
  softdep example.

Signed-off-by: Noel Ullreich
---
 qm-pci-passthrough.adoc | 87 +++++++++++++++++++++++++++++++++++------
 1 file changed, 75 insertions(+), 12 deletions(-)

diff --git a/qm-pci-passthrough.adoc b/qm-pci-passthrough.adoc
index df6cf21..ed17b9c 100644
--- a/qm-pci-passthrough.adoc
+++ b/qm-pci-passthrough.adoc
@@ -16,16 +16,17 @@ device anymore on the host or in any other VM.
 General Requirements
 ~~~~~~~~~~~~~~~~~~~~
 
-Since passthrough is a feature which also needs hardware support, there are
-some requirements to check and preparations to be done to make it work.
-
+Since passthrough is performed on real hardware, the hardware needs to fulfill
+some requirements. A brief overview of these requirements is given below, for more
+information on specific devices, see
+https://pve.proxmox.com/wiki/PCI_Passthrough[PCI Passthrough Examples].
 
 Hardware
 ^^^^^^^^
 Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
 **U**nit) interrupt remapping, this includes the CPU and the mainboard.
 
-Generally, Intel systems with VT-d, and AMD systems with AMD-Vi support this.
+Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
 But it is not guaranteed that everything will work out of the box, due
 to bad hardware implementation and missing or low quality drivers.
 
@@ -44,8 +45,8 @@ some configuration to enable PCI(e) passthrough.
 
 .IOMMU
 
-First, you have to enable IOMMU support in your BIOS/UEFI. Usually the
-corresponding setting is called `IOMMU` or `VT-d`,but you should find the exact
+First, you will have to enable IOMMU support in your BIOS/UEFI. Usually the
+corresponding setting is called `IOMMU` or `VT-d`, but you should find the exact
 option name in the manual of your motherboard.
 
 For Intel CPUs, you may also need to enable the IOMMU on the
@@ -72,6 +73,9 @@ hardware IOMMU. To enable these options, add:
 
 to the xref:sysboot_edit_kernel_cmdline[kernel commandline].
 
+For a complete list of kernel commandline options (of kernel 5.15), see
+https://www.kernel.org/doc/html/v5.15/admin-guide/kernel-parameters.html[kernel.org].
+
 .Kernel Modules
 
 You have to make sure the following modules are loaded. This can be achieved by
@@ -92,6 +96,14 @@ After changing anything modules related, you need to refresh your
 # update-initramfs -u -k all
 ----
 
+To check if the modules are being loaded, the output of
+
+----
+# lsmod | grep vfio
+----
+
+should include the four modules from above.
+
 .Finish Configuration
 
 Finally reboot to bring the changes into effect and check that it is indeed
@@ -105,8 +117,22 @@ should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
 enabled, depending on hardware and kernel the exact message can vary.
 
 It is also important that the device(s) you want to pass through
-are in a *separate* `IOMMU` group. This can be checked with:
+are in a *separate* `IOMMU` group. This can be checked either with:
 
+* a call to the {pve} API:
++
+----
+# pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
+----
+
+* a bash oneliner:
++
+----
+# for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU group %s ' "$n"; lspci -nns "${d##*/}"; done
+----
+
+* this command, although it gives less information than the other two:
++
 ----
 # find /sys/kernel/iommu_groups/ -type l
 ----
@@ -148,6 +174,10 @@ desktop software (for example, VNC or RDP) inside the guest.
 If you want to use the GPU as a hardware accelerator, for example, for
 programs using OpenCL or CUDA, this is not required.
 
+In this case, to use NoVNC or SPICE, you might need to unset the 'primary GPU'
+flag (see xref:qm_pci_passthrough_vm_config[VM configuration]) and make sure the
+GPU is not physically connected to a monitor.
+
 Host Device Passthrough
 ~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -159,8 +189,8 @@ PCI(e) card, for example a GPU or a network card.
 Host Configuration
 ^^^^^^^^^^^^^^^^^^
 
-In this case, the host must not use the card. There are two methods to achieve
-this:
+{pve} tries to automatically make the PCI(e) device unavailable for the host.
+However, if this doesn't work, there are two things that can be done:
 
 * pass the device IDs to the options of the 'vfio-pci' modules by adding
 +
@@ -175,7 +205,7 @@ the vendor and device IDs obtained by:
 # lspci -nn
 ----
 
-* blacklist the driver completely on the host, ensuring that it is free to bind
+* blacklist the driver on the host completely, ensuring that it is free to bind
 for passthrough, with
 +
 ----
@@ -183,11 +213,46 @@ for passthrough, with
 ----
 +
 in a .conf file in */etc/modprobe.d/*.
++
+To find the driver name, execute
++
+----
+# lspci -k
+----
++
+for example:
++
+----
+# lspci -k | grep -A 3 "VGA"
+
+// The output tells us that the driver is called `nvidia`
+01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
+	Subsystem: Micro-Star International Co., Ltd. [MSI] GP108 [GeForce GT 1030]
+	Kernel driver in use: nvidia
+	Kernel modules: nvidia
+----
++
+Now we can blacklist the driver by writing it into a .conf file:
++
+----
+# echo "blacklist nvidia" >> /etc/modprobe.d/blacklist.conf
+----
 
 For both methods you need to
 xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
 reboot after that.
 
+Should this not work, you might need to set a soft dependency to load
+'vfio-pci' before the GPU modules. This can be done with the 'softdep' flag, see
+also the manpages on 'modprobe.d' for more information.
+
+For example, if you are using an NVIDIA GPU with the 'nouveau' drivers:
+
+----
+# echo "softdep nouveau pre: vfio-pci" >> /etc/modprobe.d/nouveau.conf
+----
+
+
 .Verify Configuration
 
 To check if your changes were successful, you can use
@@ -262,7 +327,6 @@ For example:
 # qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
 ----
 
-
 Other considerations
 ^^^^^^^^^^^^^^^^^^^^
 
@@ -288,7 +352,6 @@ Currently, the most common use case for this are NICs (**N**etwork
 physical port. This allows using features such as checksum offloading, etc. to
 be used inside a VM, reducing the (host) CPU overhead.
 
-
 Host Configuration
 ^^^^^^^^^^^^^^^^^^
 
-- 
2.30.2
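
A possible companion for the examples wiki: the bash oneliner in the IOMMU group hunk can be unrolled into a small function that is easier to read and to test. This is only a sketch under stated assumptions. The function name `list_iommu_groups` and its optional root-directory argument are hypothetical (the argument exists only so the function can be pointed at a test tree), and unlike the oneliner it prints just the group number and the PCI address, without calling `lspci`.

```shell
#!/bin/sh
# Sketch: print "<iommu-group> <pci-address>" for every device, sorted by group.
# list_iommu_groups and its optional root argument are hypothetical names used
# for illustration; the default root is the sysfs path used in the patch.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for d in "$root"/*/devices/*; do
        # Skip the literal pattern when the glob matches nothing (IOMMU off).
        [ -e "$d" ] || [ -L "$d" ] || continue
        g=${d#"$root"/}   # drop the root prefix, leaving "<group>/devices/<addr>"
        g=${g%%/*}        # keep only the group number
        printf '%s %s\n' "$g" "${d##*/}"
    done | sort -n        # numeric sort, so group 2 is listed before group 10
}

list_iommu_groups "$@"
```

Each address in the second column can then be passed to `lspci -nns` to get the device description, as the oneliner does.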