* Re: [PVE-User] vGPU scheduling
From: Dominik Csapak @ 2023-05-24 13:47 UTC
To: pve-user

On 5/24/23 15:03, Eneko Lacunza via pve-user wrote:
> Hi,

Hi,

>
> We're looking to move a PoC in a customer to full-scale production.
>
> Proxmox/Ceph cluster will be for VDI, and some VMs will use vGPU.
>
> I'd like to know if vGPU status is being exposed right now (as of 7.4) for each node through API,
> as it is done for RAM/CPU, and if not, about any plans to implement that so that a scheduler (in
> our case that should be UDS Enterprise VDI manager) can choose a node with free vGPUs to deploy
> VDIs.

what exactly do you mean with vGPU status?

there currently is no api to see which pci devices are in use of a vm
(though that could be done per node, not really for mediated devices though)

there is the /nodes/NODENAME/hardware/pci api call which shows what devices exist
and if they have mdev (mediated device) capability (e.g. NVIDIA GRID vGPU)

for those cards there also exists the api call

  /nodes/NODENAME/hardware/pci/PCIID/mdev

which gives a list of mdev types and how many are available of them

does that help?

if you have more specific requirements (or i misunderstood you), please
open a bug/feature request on https://bugzilla.proxmox.com

>
> Thanks

Kind Regards
Dominik
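A quick way to try both of the calls mentioned above is pvesh on the node itself; NODENAME and the
PCI address below are placeholders, to be replaced with a real node name and a card that the first
call reports as mdev-capable:

  # list the PCI devices on a node, including whether they are mdev (vGPU) capable
  pvesh get /nodes/NODENAME/hardware/pci

  # for an mdev-capable card, list the vGPU types it offers and how many
  # instances of each type are still available
  pvesh get /nodes/NODENAME/hardware/pci/0000:01:00.0/mdev

The same data is available over HTTPS under the /api2/json/ prefix with an API token, which is what
an external scheduler such as a VDI broker would query.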
* Re: [PVE-User] vGPU scheduling
From: DERUMIER, Alexandre @ 2023-05-25  7:32 UTC
To: pve-user

Hi Dominik,

any news about your patches "add cluster-wide hardware device mapping" ?

Do you think it'll be merged for proxmox 8 ?

I think it could help for this usecase.
* Re: [PVE-User] vGPU scheduling
From: Dominik Csapak @ 2023-05-25  7:43 UTC
To: pve-user

On 5/25/23 09:32, DERUMIER, Alexandre wrote:
> Hi Dominik,
>
> any news about your patches "add cluster-wide hardware device mapping"

i'm currently on a new version of this
first part was my recent series for the section config/api array support

i think i can send the new version for the backend this week

> ?
>
> Do you think it'll be merged for proxmox 8 ?

i don't know, but this also depends on the capacity of my colleagues to review ;)

>
> I think it could help for this usecase.
>

yes i think so too, but it was not directly connected to the request so
i did not mention it
* Re: [PVE-User] vGPU scheduling
From: Thomas Lamprecht @ 2023-05-25  9:03 UTC
To: Proxmox VE user list, Dominik Csapak, Alexandre Derumier

On 25/05/2023 at 09:43, Dominik Csapak wrote:
>>
>> Do you think it'll be merged for proxmox 8 ?
>
> i don't know, but this also depends on the capacity of my colleagues to review 😉

making it easy to digest and adding (good) tests will surely help to
accelerate this ;-P

But, you're naturally right, and tbh., while I'll try hard to get the
access-control and some other fundaments in, especially those where we can
profit from the higher freedom/flexibility of a major release, I cannot
definitely say that the actual HW mapping will make it for the initial 8.0.

For an initial major release I prefer having a bit fewer features and
focusing more on keeping the existing features working, with a stable and
well-tested upgrade path.
* Re: [PVE-User] vGPU scheduling
From: Eneko Lacunza <elacunza@binovo.es> @ 2024-05-31  7:59 UTC
To: pve-user@lists.proxmox.com

Hi Dominik,

Do you have any expected timeline/version for this to be merged?

Thanks

On 25/5/23 at 9:43, Dominik Csapak wrote:
> On 5/25/23 09:32, DERUMIER, Alexandre wrote:
>> Hi Dominik,
>>
>> any news about your patches "add cluster-wide hardware device mapping"
>
> i'm currently on a new version of this
> first part was my recent series for the section config/api array support
>
> i think i can send the new version for the backend this week

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/
* Re: [PVE-User] vGPU scheduling
From: Dominik Csapak @ 2024-05-31  8:08 UTC
To: Eneko Lacunza, pve-user

On 5/31/24 09:59, Eneko Lacunza wrote:
>
> Hi Dominik,
>
> Do you have any expected timeline/version for this to be merged?

the cluster wide device mapping was merged last year already and is
included in pve 8.0 and onwards.

or do you mean something different?
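A rough sketch of how such a cluster-wide mapping is used on PVE 8: the mappings defined under
Datacenter -> Resource Mappings can be listed via the API and then referenced from a VM config by
name instead of a raw host PCI address. The mapping name, VM ID and vGPU type below are made-up
examples, and the exact option names should be checked against the current documentation:

  # list the cluster-wide PCI resource mappings
  pvesh get /cluster/mapping/pci

  # attach a mapped vGPU to a VM by mapping name rather than by host address
  qm set 100 -hostpci0 mapping=some-vgpu-mapping,mdev=nvidia-63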
* Re: [PVE-User] vGPU scheduling
From: Dominik Csapak @ 2023-05-25  7:53 UTC
To: pve-user

On 5/25/23 09:36, Eneko Lacunza via pve-user wrote:
> Hi Dominik,
>
> On 25/5/23 at 9:24, Dominik Csapak wrote:
>>
>>> 2.12.0 (qemu-kvm-2.12.0-64.el8.2.27782638)
>>> * Microsoft Windows Server with Hyper-V 2019 Datacenter edition
>>> * Red Hat Enterprise Linux Kernel-based Virtual Machine (KVM) 9.0 and 9.1
>>> * Red Hat Virtualization 4.3
>>> * Ubuntu Hypervisor 22.04
>>> * VMware vSphere Hypervisor (ESXi) 7.0.1, 7.0.2, and 7.0.3
>>>
>>> Is there any effort planned or on the way to have Proxmox added to the above list?
>>
>> We'd generally like to be on the supported hypervisor list, but currently
>> none of our efforts to contact NVIDIA regarding this were successful,
>> but i hope we can solve this sometime in the future...
>
> I can try to report this via customer request to nvidia, where should I refer them to?

You can refer them directly to me (d.csapak@proxmox.com) or our office mail
(office@proxmox.com). Maybe it helps if the request comes also from the
customer side.

>
>>
>>> As Ubuntu 22.04 is in it and the Proxmox kernel is derived from it, the technical effort may
>>> not be so large.
>>
>> Yes, their current Linux KVM package (15.2) should work with our 5.15 kernel,
>> it's what i use here locally to test, e.g. [0]
>
> We had varying success with 5.15 kernels, some versions work but others do not (refused to work
> after kernel upgrade and had to pin older kernel). Maybe it would be worth to keep a list of
> known to work/known not to work kernels?

Normally i have my tests running with each update of the 5.15 kernel and i did
not see any special problems there. The only recent thing was that we had to
change how we clean up the mediated devices for their newer versions [0]

Note that i only test the latest supported GRID version though (currently 15.2)

Regards
Dominik

0: https://git.proxmox.com/?p=qemu-server.git;a=commit;h=49c51a60db7f12d7fe2073b755d18b4d9b628fbd
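For the "pin an older kernel" workaround mentioned above, current PVE releases ship a helper for
this; the version string below is only an example, pick a known-good one from the list output (the
pin takes effect on the next reboot):

  # show the installed kernels and the currently selected one
  proxmox-boot-tool kernel list

  # keep booting a specific, known-good kernel across future upgrades
  proxmox-boot-tool kernel pin 5.15.107-2-pve

  # go back to booting the newest installed kernel
  proxmox-boot-tool kernel unpin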
Thread overview: 7+ messages (newest: 2024-05-31  8:08 UTC)

      [not found] <mailman.190.1684933430.348.pve-user@lists.proxmox.com>
2023-05-24 13:47 ` [PVE-User] vGPU scheduling Dominik Csapak
2023-05-25  7:32   ` DERUMIER, Alexandre
2023-05-25  7:43     ` Dominik Csapak
2023-05-25  9:03       ` Thomas Lamprecht
2024-05-31  7:59       ` Eneko Lacunza via pve-user
      [not found]        ` <cbb9c11d-5b8e-4f1a-98dd-b3e1cf4be45c@binovo.es>
2024-05-31  8:08           ` Dominik Csapak
      [not found] <mailman.193.1684936457.348.pve-user@lists.proxmox.com>
      [not found]   ` <mailman.222.1684948008.348.pve-user@lists.proxmox.com>
      [not found]     ` <0e48cb4a-7fa0-2bd7-9d4e-f18ab8e03d20@proxmox.com>
      [not found]       ` <mailman.252.1685000179.348.pve-user@lists.proxmox.com>
2023-05-25  7:53         ` Dominik Csapak