From: Dominik Csapak
To: Christoph Heiss
CC: pve-devel@lists.proxmox.com
Date: Mon, 23 Mar 2026 12:01:44 +0100
Subject: Re: [PATCH proxmox-perl-rs 1/1] pve: add binding for accessing vgpu info
Message-ID: <1df04afb-7dcd-4576-bc78-b36cfbe50a92@proxmox.com>
References: <20260305091711.1221589-1-d.csapak@proxmox.com> <20260305091711.1221589-10-d.csapak@proxmox.com>
List-Id: Proxmox VE development discussion

On 3/19/26 12:16 PM, Christoph Heiss wrote:
>
> Two comments inline.
>
> Other than that, please consider it:
>
> Reviewed-by: Christoph Heiss
>
> On Thu Mar 5, 2026 at 10:16 AM CET, Dominik Csapak wrote:
> [..]
>> diff --git a/pve-rs/Cargo.toml b/pve-rs/Cargo.toml
>> index 45389b5..3b6c2fc 100644
>> --- a/pve-rs/Cargo.toml
>> +++ b/pve-rs/Cargo.toml
>> @@ -20,6 +20,7 @@ hex = "0.4"
>>  http = "1"
>>  libc = "0.2"
>>  nix = "0.29"
>> +nvml-wrapper = "0.12"
>
> Missing the respective entry in d/control.
>
> [..]
>> diff --git a/pve-rs/src/bindings/nvml.rs b/pve-rs/src/bindings/nvml.rs
>> new file mode 100644
>> index 0000000..0f4c81e
>> --- /dev/null
>> +++ b/pve-rs/src/bindings/nvml.rs
>> @@ -0,0 +1,91 @@
>> +//! Provides access to the state of NVIDIA (v)GPU devices connected to the system.
>> +
>> +#[perlmod::package(name = "PVE::RS::NVML", lib = "pve_rs")]
>> +pub mod pve_rs_nvml {
>> +    //! The `PVE::RS::NVML` package.
>> +    //!
>> +    //! Provides high level helpers to get info from the system with NVML.
>> +
>> +    use anyhow::Result;
>> +    use nvml_wrapper::Nvml;
>> +    use perlmod::Value;
>> +
>> +    /// Retrieves a list of *creatable* vGPU types for the specified GPU by bus id.
>> +    ///
>> +    /// The [`bus_id`] is of format "\:\:\.\",
>> +    /// e.g. "0000:01:01.0".
>> +    ///
>> +    /// # See also
>> +    ///
>> +    /// [`nvmlDeviceGetCreatableVgpus`]:
>> +    /// [`nvmlDeviceGetHandleByPciBusId_v2`]:
>> +    /// [`struct nvmlPciInfo_t`]:
>> +    #[export]
>> +    fn creatable_vgpu_types_for_dev(bus_id: &str) -> Result> {
>> +        let nvml = Nvml::init()?;
>
> Looking at this, I was wondering how expensive that call is, considering
> this path is triggered from the API. Same for
> supported_vgpu_types_for_dev() below.
>
> Did some quick & simple benchmarking - on average, `Nvml::init()` took
> ~32ms, with quite some variance; at best ~26ms up to a worst case
> of >150ms.
>
> IMO nothing worth blocking the series on, as this falls into premature
> optimization territory and can be fixed in the future, if needed.
>
> Holding an instance in memory might also be problematic on driver
> upgrades? I.e. we keep an old version of the library loaded, and thus
> mismatched API.
>
> The above results were done with one GPU only though, so potentially
> could be worse on multi-GPU systems.

What we could do is cache the results of this call, either here or in
Perl (I think it's easier to do on the Perl side). That way the cost
only has to be paid once, and the amount of cached data should only be
in the KBs.

I think this should work, because the available models/devices can't
change while the server is up?

>
>> +        let device = nvml.device_by_pci_bus_id(bus_id)?;
>> +
>> +        build_vgpu_type_list(device.vgpu_creatable_types()?)
>> +    }
>> +
>> +    /// Retrieves a list of *supported* vGPU types for the specified GPU by bus id.
>> +    ///
>> +    /// The [`bus_id`] is of format "\:\:\.\",
>> +    /// e.g. "0000:01:01.0".
>> +    ///
>> +    /// # See also
>> +    ///
>> +    /// [`nvmlDeviceGetSupportedVgpus`]:
>> +    /// [`nvmlDeviceGetHandleByPciBusId_v2`]:
>> +    /// [`struct nvmlPciInfo_t`]:
>> +    #[export]
>> +    fn supported_vgpu_types_for_dev(bus_id: &str) -> Result> {
>> +        let nvml = Nvml::init()?;
>> +        let device = nvml.device_by_pci_bus_id(bus_id)?;
>> +
>> +        build_vgpu_type_list(device.vgpu_supported_types()?)
>> +    }
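
For reference, a rough sketch of how the caching idea from my reply could
look if it were done on the Rust side instead of in Perl. This is not part
of the patch: `VgpuTypeInfo`, `cached_types` and the `query` closure are
hypothetical names, and the closure stands in for the
`Nvml::init()` + lookup path so the sketch does not depend on NVML itself.
It assumes the list for a given bus id stays stable for the lifetime of the
process, which would still need to be verified (especially for the
*creatable* list).

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the per-type data handed back to Perl.
#[derive(Clone, Debug, PartialEq)]
pub struct VgpuTypeInfo {
    pub id: u32,
    pub name: String,
}

/// Process-wide cache, keyed by PCI bus id (e.g. "0000:01:01.0").
static CACHE: OnceLock<Mutex<HashMap<String, Vec<VgpuTypeInfo>>>> = OnceLock::new();

/// Returns the cached list for `bus_id`, calling `query` only on a miss.
///
/// `query` is an assumption of this sketch: it represents the expensive
/// path (initializing NVML and asking the device for its vGPU types).
/// Errors are passed through and never cached, so a later call retries.
pub fn cached_types<F>(bus_id: &str, query: F) -> Result<Vec<VgpuTypeInfo>, String>
where
    F: FnOnce(&str) -> Result<Vec<VgpuTypeInfo>, String>,
{
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    // The lock is held across the query, so concurrent callers are
    // serialized and the expensive path runs at most once per bus id.
    let mut map = cache.lock().map_err(|e| e.to_string())?;

    if let Some(hit) = map.get(bus_id) {
        return Ok(hit.clone());
    }

    let fresh = query(bus_id)?;
    map.insert(bus_id.to_string(), fresh.clone());
    Ok(fresh)
}
```

With something like this, the exported functions would wrap their existing
body in the closure, and no `Nvml` instance is kept alive between calls, so
the driver-upgrade concern only applies to the cached data, not to a loaded
library handle.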