public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Christoph Heiss <c.heiss@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [RFC PATCH common] SysFSTools: mdev: retrieve Nvidia vGPU description from nvidia-smi
Date: Wed, 30 Oct 2024 12:09:41 +0100	[thread overview]
Message-ID: <6obxtzisy4patvz6yifrtib7xoudrfmrgagv3udr5wssbqbjxq@7krizy3j7ys7> (raw)
In-Reply-To: <cec48210-e391-4fd0-b911-89eb30159dc9@proxmox.com>

Thanks for the review!

On Wed, Oct 30, 2024 at 11:13:36AM GMT, Dominik Csapak wrote:
> Hi,
>
> thanks for tackling this!
>
> a few comments inline
>
> On 10/28/24 12:31, Christoph Heiss wrote:
> > [..]
> > And FWIW, these properties could also be retrieved without going through
> > nvidia-smi using the NVML API directly [0], the same API nvidia-smi uses
> > anyway under the hood.
> >
> > But that would require either using something like e.g. DynaLoader in
> > perl [1] or calling it from Rust using e.g. the nvml-wrapper-sys [2] and
> > wrapping it using perlmod.
> >
> > Both ways would be a bit involved of course, but also a lot more
> > future-proof than parsing the human-readable output from `nvidia-smi`.
> > If preferred I'd be happy to re-write it in some way or another.
>
> another alternative could be to wrap the few necessary calls in a small
> c binary that we can call in place of nvidia-smi.
> This tool could then output more structured (e.g. json) output,
> though not sure if it's worth that,

Sounds like a good idea, TBH. Although that should then be probably be
packaged separately from libpve-common-perl, so there is still that
"overhead".

> since I hope that the nvidia-smi output is relatively stable

I'd hope too, but I (personally, at least) wouldn't want to count on it.
It is still a proprietary tool after all, so really could be changed
any way seen fit w/o notice.

>
> (we can still change this method should parsing nvidia-smi output
> turn out to be too brittle)

Using it as a stop-gap measure basically would work for me, so that this
isn't blocked unnecessarily long and gives us time to properly implement
something like this.

>
> > [..]
> > diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
> > index 0bde6d7..fc6282d 100644
> > --- a/src/PVE/SysFSTools.pm
> > +++ b/src/PVE/SysFSTools.pm
> > @@ -4,8 +4,10 @@ use strict;
> > [..]
> > +
> > +    my $command = ['nvidia-smi', 'vgpu', '--creatable', '--verbose'];
>
> IMO we should not use 'creatable' here but 'available' (or however it's named)

FWIW, for reference, would be `--supported`.

> AFAIR the 'creatable' are only the ones currently allowed to be created,
> not the ones the cards generally support.
>
> This could become a  problem e.g. when a vm is started with some profile,
> then 'creatable' will only return models compatible to that, and even if
> the vm is stopped after, since we cached the output, will not show the
> description of the others)

Oh I see, yeah, that makes sense then. I'll change it for v2, thanks!

>
> Of course we could instead change the caching logic to reparse it, whenever
> we don't find the particular id in the cached config, but that
> seems too much work if we can just have a list of all 'available'
>
> > +    my $parsefn = sub {
> > +	my ($line) = @_;
> > +	return if $line =~ m/^GPU/;
> > +
> > +	my @parts = split(':', $line);
> > +	return if scalar(@parts) != 2;
> > +
> > +	my ($key, $value) = @parts;
> > +
> > +	$key =~ s/^\s+|\s+$//g; # trim whitespace from start and end
> > +	$value =~ s/\s+//g; # trim all whitespace
> > +	$value =~ s/,/-/g; # replace any commas with dashes
> > +
> > +	if ($key eq 'vGPU Type ID') {
> > +	    $cur_id = hex($value);
> > +	} elsif (defined($generic_propmap->{$key}) && $value ne 'N/A') {
> > +	    $configs->{$cur_id}->{$generic_propmap->{$key}} = $value;
> > +	}
> > +
> > +	# `nvidia-smi` prints these keys/values in a deterministic order,
> > +	# so the order they appear in can be relied upon.
> > +	if ($key eq 'Maximum X Resolution') {
> > +	    $configs->{$cur_id}->{'max-resolution'} = $value;
> > +	} elsif ($key eq 'Maximum Y Resolution') {
> > +	    $configs->{$cur_id}->{'max-resolution'} .= "x$value";
> > +	}
> > +    };
>
> it would be great to have some tests here with some example output, that
> way we can compare output if it changes, and it makes reasoning about the
> parsing logic a bit easier.

Sure, will add some tests with the next revision!

>
> > +
> > +    eval {
> > +	PVE::Tools::run_command($command, outfunc => $parsefn);
> > +    };
> > +
> > +    if (my $err = $@) {
> > +	warn "failed to run nvidia-smi: $err\n";
> > +	return undef;
> > +    }
> > +
> > +    for my $k (keys %$configs) {
> > +	$configs->{$k} = PVE::JSONSchema::print_property_string($configs->{$k}, $prop_schema);
> > +    }
>
> since the schema above is mostly emptly besides defining which keys there are,
> is there a specific reason you don't just use something like (untested!):
>
> join(",", map { $_ ."=". $configs->{$_} } keys $configs->%*);
>
> ?
> that way you could omit the prop_schema entirely

No particular reason, other than of not thinking of it :^)

I'll rework it, since the property schema is indeed mostly redundant
w.r.t to the property map above it.



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


  reply	other threads:[~2024-10-30 11:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-28 11:31 Christoph Heiss
2024-10-30 10:13 ` Dominik Csapak
2024-10-30 11:09   ` Christoph Heiss [this message]
2024-10-30 12:08 ` Thomas Lamprecht
2024-10-30 13:50   ` Christoph Heiss

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6obxtzisy4patvz6yifrtib7xoudrfmrgagv3udr5wssbqbjxq@7krizy3j7ys7 \
    --to=c.heiss@proxmox.com \
    --cc=d.csapak@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal