Message-ID: <4456228e-e580-d0ad-f12c-3e92e2e17454@proxmox.com>
Date: Wed, 7 Jun 2023 13:34:42 +0200
To: Proxmox VE development discussion, Noel Ullreich
References: <20230417124502.76121-1-n.ullreich@proxmox.com>
From: Dominik Csapak
In-Reply-To: <20230417124502.76121-1-n.ullreich@proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-docs v5] update the PCI(e) docs

mostly LGTM, a few minor comments inline (that could probably be followed up?)

On 4/17/23 14:45, Noel Ullreich wrote:
> A little update to the PCI(e) docs with the plan of reworking the PCI
> wiki as well.
>
> Along some minor grammar fixes added:
> * how to check if kernelmodules are being loaded
> * how to check which drivers to blacklist
> * how to add softdeps for module loading
> * where to find kernel params
>
> Signed-off-by: Noel Ullreich
> ---
> changes from v1:
> * fixed spelling mistakes
> * reduced code snippets of how to check iommu groupings to one
> * moved where to find kernel params to kernel cmdline section
> * removed wrong info on display output. will add correct info to
>   Examples-Wiki
> * changed module names to variable-names, so that people can't
>   blindly copy-paste.
> * restructured commit message ;)
>
> changes from v2:
> * while moving where to find the kernel params to the kernel
>   cmdline section, I forgot to remove it from the pci(e) section
> * fixed typo in the link to the kernel param section
>
> changes from v3:
> * Some restructuring of the layout as well as moving parts of the
>   PCI examples wiki to the docs here. This should lead to
>   well-structured, concise docs that are independent from the PCI wiki.
> * found some more minor grammar errors
> * found a spelling mistake in qm.adoc
>
> changes from v4:
> * formatted the git message wrong again :/
>
>  qm-pci-passthrough.adoc | 149 +++++++++++++++++++++++++++++++---------
>  qm.adoc                 |   2 +-
>  system-booting.adoc     |   9 +++
>  3 files changed, 127 insertions(+), 33 deletions(-)
>
> diff --git a/qm-pci-passthrough.adoc b/qm-pci-passthrough.adoc
> index df6cf21..dbce383 100644
> --- a/qm-pci-passthrough.adoc
> +++ b/qm-pci-passthrough.adoc
> @@ -13,19 +13,27 @@ features (e.g., offloading).
>  But, if you pass through a device to a virtual machine, you cannot use that
>  device anymore on the host or in any other VM.
>
> +Note that, while PCI passthrough is available for i440fx and q35 machines, PCIe
> +passthrough is only available on q35 machines. This does not mean that
> +PCIe capable devices that are passed through as PCI devices will only run at
> +PCI speeds. Passing through devices as PCIe just sets a flag for the guest to
> +tell it that the device is a PCIe device instead of a "really fast legacy PCI
> +device". Some guest applications benefit from this.
> +
>  General Requirements
>  ~~~~~~~~~~~~~~~~~~~~
>
> -Since passthrough is a feature which also needs hardware support, there are
> -some requirements to check and preparations to be done to make it work.
> -
> +Since passthrough is performed on real hardware, it needs to fulfill some
> +requirements. A brief overview of these requirements is given below, for more
> +information on specific devices, see
> +https://pve.proxmox.com/wiki/PCI_Passthrough[PCI Passthrough Examples].
>
>  Hardware
>  ^^^^^^^^
>  Your hardware needs to support `IOMMU` (*I*/*O* **M**emory **M**anagement
>  **U**nit) interrupt remapping, this includes the CPU and the mainboard.
>
> -Generally, Intel systems with VT-d, and AMD systems with AMD-Vi support this.
> +Generally, Intel systems with VT-d and AMD systems with AMD-Vi support this.
>  But it is not guaranteed that everything will work out of the box, due
>  to bad hardware implementation and missing or low quality drivers.
>
> @@ -35,6 +43,17 @@ hardware, but even then, many modern system can support this.
>  Please refer to your hardware vendor to check if they support this feature
>  under Linux for your specific setup.
>
> +Determining PCI Card Address
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +The easiest way is to use the GUI to add a device of type "Host PCI" in the VM's
> +hardware tab. Alternatively, you can use the command line.
> +
> +You can locate your card using
> +
> +----
> + lspci
> +----
>
>  Configuration
>  ^^^^^^^^^^^^^
> @@ -44,8 +63,8 @@ some configuration to enable PCI(e) passthrough.
>
>  .IOMMU
>
> -First, you have to enable IOMMU support in your BIOS/UEFI. Usually the
> -corresponding setting is called `IOMMU` or `VT-d`,but you should find the exact
> +First, you will have to enable IOMMU support in your BIOS/UEFI. Usually the
> +corresponding setting is called `IOMMU` or `VT-d`, but you should find the exact
>  option name in the manual of your motherboard.
>
>  For Intel CPUs, you may also need to enable the IOMMU on the
> @@ -92,6 +111,14 @@ After changing anything modules related, you need to refresh your
>  # update-initramfs -u -k all
>  ----
>
> +To check if the modules are being loaded, the output of
> +
> +----
> +# lsmod | grep vfio
> +----
> +
> +should include the four modules from above.
> +
>  .Finish Configuration
>
>  Finally reboot to bring the changes into effect and check that it is indeed
> @@ -104,11 +131,16 @@ enabled.
>  should display that `IOMMU`, `Directed I/O` or `Interrupt Remapping` is
>  enabled, depending on hardware and kernel the exact message can vary.
>
> +For notes on how to troubleshoot or verify if IOMMU is working as intended, please
> +see the link:/wiki/Pci_passthroughi#Verifying_IOMMU_Parameters[Verifying IOMMU Parameters]
> +section in our wiki.
> +

AFAIK you cannot link to the wiki this way, at least it didn't work here when
applying the patch

>  It is also important that the device(s) you want to pass through
> -are in a *separate* `IOMMU` group. This can be checked with:
> +are in a *separate* `IOMMU` group. This can be checked with a call to the {pve}
> +API:
>
>  ----
> -# find /sys/kernel/iommu_groups/ -type l
> +# pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
>  ----
>
>  It is okay if the device is in an `IOMMU` group together with its functions
> @@ -159,8 +191,8 @@ PCI(e) card, for example a GPU or a network card.
>  Host Configuration
>  ^^^^^^^^^^^^^^^^^^
>
> -In this case, the host must not use the card. There are two methods to achieve
> -this:
> +{pve} tries to automatically make the PCI(e) device unavailable for the host.
> +However, if this doesn't work, there are two things that can be done:
>
>  * pass the device IDs to the options of the 'vfio-pci' modules by adding
>  +
> @@ -175,7 +207,7 @@ the vendor and device IDs obtained by:
>  # lspci -nn
>  ----
>
> -* blacklist the driver completely on the host, ensuring that it is free to bind
> +* blacklist the driver on the host completely, ensuring that it is free to bind
>  for passthrough, with
>  +
>  ----
> @@ -183,11 +215,49 @@ for passthrough, with
>  ----
>  +
>  in a .conf file in */etc/modprobe.d/*.
> ++
> +To find the drivername, execute
> ++
> +----
> +# lspci -k
> +----
> ++
> +for example:
> ++
> +----
> +# lspci -k | grep -A 3 "VGA"
> +----
> ++
> +will output something similar to
> ++
> +----
> +01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
> + Subsystem: Micro-Star International Co., Ltd. [MSI] GP108 [GeForce GT 1030]
> + Kernel driver in use:
> + Kernel modules:
> +----
> ++
> +Now we can blacklist the drivers by writing them into a .conf file:
> ++
> +----
> +echo "blacklist " >> /etc/modprobe.d/blacklist.conf
> +----
>
>  For both methods you need to
>  xref:qm_pci_passthrough_update_initramfs[update the `initramfs`] again and
>  reboot after that.
>
> +Should this not work, you might need to set a soft dependency to load the gpu
> +modules before loading 'vfio-pci'. This can be done with the 'softdep' flag, see
> +also the manpages on 'modprobe.d' for more information.
> +
> +For example, if you are using drivers named :
> +
> +----
> +# echo "softdep pre: vfio-pci" >> /etc/modprobe.d/.conf
> +----
> +
> +
>  .Verify Configuration
>
>  To check if your changes were successful, you can use
> @@ -208,13 +278,42 @@ passthrough.
>  [[qm_pci_passthrough_vm_config]]
>  VM Configuration
>  ^^^^^^^^^^^^^^^^
> -To pass through the device you need to set the *hostpciX* option in the VM
> +When passing through a GPU, the best compatibility is reached when using
> +'q35' as machine type, 'OVMF' ('UEFI' for VMs) instead of SeaBIOS and PCIe
> +instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
> +GPU needs to have an UEFI capable ROM, otherwise use SeaBIOS instead. To check if
> +the ROM is UEFI capable, see the
> +link:/wiki/Pci_passthrough#How_to_know_if_a_Graphics_Card_is_UEFI_.28OVMF.29_compatible[PCI Passthrough Examples]
> +wiki.

same here

> +
> +Furthermore, using OVMF, disabling vga arbitration may be possible, reducing the
> +amount of legacy code needed to be run during boot. To disable vga arbitration:
> +
> +----
> + echo "options vfio-pci ids=, disable_vga=1" > /etc/modprobe.d/vfio.conf
> +----
> +
> +replacing the and with the ones obtained from
> +
> +----
> +# lspci -nn
> +----
> +
> +PCI devices can be added in the web interface in the hardware section of the VM.
> +Alternatively, you can use the command line; set the *hostpciX* option in the VM
>  configuration, for example by executing:
>
>  ----
>  # qm set VMID -hostpci0 00:02.0
>  ----
>
> +or by adding a line to the VM configuration file:
> +
> +----
> + hostpci0: 00:02.0
> +----
> +
> +
>  If your device has multiple functions (e.g., ``00:02.0`' and ``00:02.1`' ),
>  you can pass them through all together with the shortened syntax ``00:02`'.
>  This is equivalent with checking the ``All Functions`' checkbox in the
> @@ -262,21 +361,17 @@ For example:
>  # qm set VMID -hostpci0 02:00,device-id=0x10f6,sub-vendor-id=0x0000
>  ----
>
> -
> -Other considerations
> -^^^^^^^^^^^^^^^^^^^^
> -
> -When passing through a GPU, the best compatibility is reached when using
> -'q35' as machine type, 'OVMF' ('EFI' for VMs) instead of SeaBIOS and PCIe
> -instead of PCI. Note that if you want to use 'OVMF' for GPU passthrough, the
> -GPU needs to have an EFI capable ROM, otherwise use SeaBIOS instead.
> -
>  SR-IOV
>  ~~~~~~
>
> -Another variant for passing through PCI(e) devices, is to use the hardware
> +Another variant for passing through PCI(e) devices is to use the hardware
>  virtualization features of your devices, if available.
>
> +{{Note | To use SR-IOV, platform support is especially important. It may be necessary
> +to enable this feature in the BIOS/UEFI first, or to use a specific PCI(e) port
> +for it to work. In doubt, consult the manual of the platform or contact its
> +vendor.}}
> +
>  'SR-IOV' (**S**ingle-**R**oot **I**nput/**O**utput **V**irtualization) enables
>  a single device to provide multiple 'VF' (**V**irtual **F**unctions) to the
>  system. Each of those 'VF' can be used in a different VM, with full hardware
> @@ -288,7 +383,6 @@ Currently, the most common use case for this are NICs (**N**etwork
>  physical port. This allows using features such as checksum offloading, etc. to
>  be used inside a VM, reducing the (host) CPU overhead.
>
> -
>  Host Configuration
>  ^^^^^^^^^^^^^^^^^^
>
> @@ -326,14 +420,6 @@ After creating VFs, you should see them as separate PCI(e) devices when
>  outputting them with `lspci`. Get their ID and pass them through like a
>  xref:qm_pci_passthrough_vm_config[normal PCI(e) device].
>
> -Other considerations
> -^^^^^^^^^^^^^^^^^^^^
> -
> -For this feature, platform support is especially important. It may be necessary
> -to enable this feature in the BIOS/EFI first, or to use a specific PCI(e) port
> -for it to work. In doubt, consult the manual of the platform or contact its
> -vendor.
> -
>  Mediated Devices (vGPU, GVT-g)
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> @@ -346,7 +432,6 @@ With this, a physical Card is able to create virtual cards, similar to SR-IOV.
>  The difference is that mediated devices do not appear as PCI(e) devices in the
>  host, and are such only suited for using in virtual machines.
>
> -
>  Host Configuration
>  ^^^^^^^^^^^^^^^^^^
>
> diff --git a/qm.adoc b/qm.adoc
> index bd535a2..8f46cd6 100644
> --- a/qm.adoc
> +++ b/qm.adoc
> @@ -139,7 +139,7 @@ snapshots) more intelligently.
>  {pve} allows to boot VMs with different firmware and machine types, namely
>  xref:qm_bios_and_uefi[SeaBIOS and OVMF]. In most cases you want to switch from
>  the default SeaBIOS to OVMF only if you plan to use
> -xref:qm_pci_passthrough[PCIe pass through]. A VMs 'Machine Type' defines the
> +xref:qm_pci_passthrough[PCIe passthrough]. A VMs 'Machine Type' defines the
>  hardware layout of the VM's virtual motherboard. You can choose between the
>  default https://en.wikipedia.org/wiki/Intel_440FX[Intel 440FX] or the
>  https://ark.intel.com/content/www/us/en/ark/products/31918/intel-82q35-graphics-and-memory-controller.html[Q35]
> diff --git a/system-booting.adoc b/system-booting.adoc
> index 30621a6..c80d19c 100644
> --- a/system-booting.adoc
> +++ b/system-booting.adoc
> @@ -272,6 +272,15 @@ initrd /EFI/proxmox/5.0.15-1-pve/initrd.img-5.0.15-1-pve
>  Editing the Kernel Commandline
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> +A complete list of kernel parameters can be found at
> +'https://www.kernel.org/doc/html/v/admin-guide/kernel-parameters.html'.
> +replace with the major.minor version (e.g. 5.15). You can
> +find your kernel version by running
> +
> +----
> +# uname -r
> +----
> +

i'd move this hunk to the end of the chapter instead of the beginning

>  You can modify the kernel commandline in the following places, depending on the
>  bootloader used:
>
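
Side note on the kernel-parameters hunk above: the major.minor part of the `uname -r` output could also be derived mechanically instead of by hand. A minimal sketch, assuming the release string starts with `major.minor` (as in `5.15.107-2-pve`); the function name `kernel_params_url` is made up for illustration and is not part of the patch:

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): build the kernel.org
# kernel-parameters URL for a given kernel release string.
kernel_params_url() {
    release=$1
    # Everything before the first dot is the major version.
    major=${release%%.*}
    # Drop "major." and keep only the leading digits of what follows.
    rest=${release#*.}
    minor=${rest%%[!0-9]*}
    printf 'https://www.kernel.org/doc/html/v%s.%s/admin-guide/kernel-parameters.html\n' \
        "$major" "$minor"
}

# Print the URL matching the currently running kernel.
kernel_params_url "$(uname -r)"
```

Called with `5.15.107-2-pve`, this prints the URL for v5.15. The script does not check whether kernel.org actually publishes a page for that version.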