From: Christoph Heiss <c.heiss@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [RFC PATCH common] SysFSTools: mdev: retrieve Nvidia vGPU description from nvidia-smi
Date: Wed, 30 Oct 2024 12:09:41 +0100 [thread overview]
Message-ID: <6obxtzisy4patvz6yifrtib7xoudrfmrgagv3udr5wssbqbjxq@7krizy3j7ys7> (raw)
In-Reply-To: <cec48210-e391-4fd0-b911-89eb30159dc9@proxmox.com>
Thanks for the review!
On Wed, Oct 30, 2024 at 11:13:36AM GMT, Dominik Csapak wrote:
> Hi,
>
> thanks for tackling this!
>
> a few comments inline
>
> On 10/28/24 12:31, Christoph Heiss wrote:
> > [..]
> > And FWIW, these properties could also be retrieved without going through
> > nvidia-smi using the NVML API directly [0], the same API nvidia-smi uses
> > anyway under the hood.
> >
> > But that would require either using something like e.g. DynaLoader in
> > perl [1] or calling it from Rust using e.g. the nvml-wrapper-sys [2] and
> > wrapping it using perlmod.
> >
> > Both ways would be a bit involved of course, but also a lot more
> > future-proof than parsing the human-readable output from `nvidia-smi`.
> > If preferred I'd be happy to re-write it in some way or another.
>
> another alternative could be to wrap the few necessary calls in a small
> c binary that we can call in place of nvidia-smi.
> This tool could then output more structured (e.g. json) output,
> though not sure if it's worth that,
Sounds like a good idea, TBH. Although that should then be probably be
packaged separately from libpve-common-perl, so there is still that
"overhead".
> since I hope that the nvidia-smi output is relatively stable
I'd hope too, but I (personally, at least) wouldn't want to count on it.
It is still a proprietary tool after all, so really could be changed
any way seen fit w/o notice.
>
> (we can still change this method should parsing nvidia-smi output
> turn out to be too brittle)
Using it as a stop-gap measure basically would work for me, so that this
isn't blocked unnecessarily long and gives us time to properly implement
something like this.
>
> > [..]
> > diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
> > index 0bde6d7..fc6282d 100644
> > --- a/src/PVE/SysFSTools.pm
> > +++ b/src/PVE/SysFSTools.pm
> > @@ -4,8 +4,10 @@ use strict;
> > [..]
> > +
> > + my $command = ['nvidia-smi', 'vgpu', '--creatable', '--verbose'];
>
> IMO we should not use 'creatable' here but 'available' (or however it's named)
FWIW, for reference, would be `--supported`.
> AFAIR the 'creatable' are only the ones currently allowed to be created,
> not the ones the cards generally support.
>
> This could become a problem e.g. when a vm is started with some profile,
> then 'creatable' will only return models compatible to that, and even if
> the vm is stopped after, since we cached the output, will not show the
> description of the others)
Oh I see, yeah, that makes sense then. I'll change it for v2, thanks!
>
> Of course we could instead change the caching logic to reparse it, whenever
> we don't find the particular id in the cached config, but that
> seems too much work if we can just have a list of all 'available'
>
> > + my $parsefn = sub {
> > + my ($line) = @_;
> > + return if $line =~ m/^GPU/;
> > +
> > + my @parts = split(':', $line);
> > + return if scalar(@parts) != 2;
> > +
> > + my ($key, $value) = @parts;
> > +
> > + $key =~ s/^\s+|\s+$//g; # trim whitespace from start and end
> > + $value =~ s/\s+//g; # trim all whitespace
> > + $value =~ s/,/-/g; # replace any commas with dashes
> > +
> > + if ($key eq 'vGPU Type ID') {
> > + $cur_id = hex($value);
> > + } elsif (defined($generic_propmap->{$key}) && $value ne 'N/A') {
> > + $configs->{$cur_id}->{$generic_propmap->{$key}} = $value;
> > + }
> > +
> > + # `nvidia-smi` prints these keys/values in a deterministic order,
> > + # so the order they appear in can be relied upon.
> > + if ($key eq 'Maximum X Resolution') {
> > + $configs->{$cur_id}->{'max-resolution'} = $value;
> > + } elsif ($key eq 'Maximum Y Resolution') {
> > + $configs->{$cur_id}->{'max-resolution'} .= "x$value";
> > + }
> > + };
>
> it would be great to have some tests here with some example output, that
> way we can compare output if it changes, and it makes reasoning about the
> parsing logic a bit easier.
Sure, will add some tests with the next revision!
>
> > +
> > + eval {
> > + PVE::Tools::run_command($command, outfunc => $parsefn);
> > + };
> > +
> > + if (my $err = $@) {
> > + warn "failed to run nvidia-smi: $err\n";
> > + return undef;
> > + }
> > +
> > + for my $k (keys %$configs) {
> > + $configs->{$k} = PVE::JSONSchema::print_property_string($configs->{$k}, $prop_schema);
> > + }
>
> since the schema above is mostly emptly besides defining which keys there are,
> is there a specific reason you don't just use something like (untested!):
>
> join(",", map { $_ ."=". $configs->{$_} } keys $configs->%*);
>
> ?
> that way you could omit the prop_schema entirely
No particular reason, other than of not thinking of it :^)
I'll rework it, since the property schema is indeed mostly redundant
w.r.t to the property map above it.
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2024-10-30 11:10 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-28 11:31 Christoph Heiss
2024-10-30 10:13 ` Dominik Csapak
2024-10-30 11:09 ` Christoph Heiss [this message]
2024-10-30 12:08 ` Thomas Lamprecht
2024-10-30 13:50 ` Christoph Heiss
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6obxtzisy4patvz6yifrtib7xoudrfmrgagv3udr5wssbqbjxq@7krizy3j7ys7 \
--to=c.heiss@proxmox.com \
--cc=d.csapak@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox