all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Christoph Heiss <c.heiss@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [RFC PATCH common] SysFSTools: mdev: retrieve Nvidia vGPU description from nvidia-smi
Date: Wed, 30 Oct 2024 12:09:41 +0100	[thread overview]
Message-ID: <6obxtzisy4patvz6yifrtib7xoudrfmrgagv3udr5wssbqbjxq@7krizy3j7ys7> (raw)
In-Reply-To: <cec48210-e391-4fd0-b911-89eb30159dc9@proxmox.com>

Thanks for the review!

On Wed, Oct 30, 2024 at 11:13:36AM GMT, Dominik Csapak wrote:
> Hi,
>
> thanks for tackling this!
>
> a few comments inline
>
> On 10/28/24 12:31, Christoph Heiss wrote:
> > [..]
> > And FWIW, these properties could also be retrieved without going through
> > nvidia-smi using the NVML API directly [0], the same API nvidia-smi uses
> > anyway under the hood.
> >
> > But that would require either using something like e.g. DynaLoader in
> > perl [1] or calling it from Rust using e.g. the nvml-wrapper-sys [2] and
> > wrapping it using perlmod.
> >
> > Both ways would be a bit involved of course, but also a lot more
> > future-proof than parsing the human-readable output from `nvidia-smi`.
> > If preferred I'd be happy to re-write it in some way or another.
>
> another alternative could be to wrap the few necessary calls in a small
> c binary that we can call in place of nvidia-smi.
> This tool could then output more structured (e.g. json) output,
> though not sure if it's worth that,

Sounds like a good idea, TBH. Although that should then be probably be
packaged separately from libpve-common-perl, so there is still that
"overhead".

> since I hope that the nvidia-smi output is relatively stable

I'd hope too, but I (personally, at least) wouldn't want to count on it.
It is still a proprietary tool after all, so really could be changed
any way seen fit w/o notice.

>
> (we can still change this method should parsing nvidia-smi output
> turn out to be too brittle)

Using it as a stop-gap measure basically would work for me, so that this
isn't blocked unnecessarily long and gives us time to properly implement
something like this.

>
> > [..]
> > diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
> > index 0bde6d7..fc6282d 100644
> > --- a/src/PVE/SysFSTools.pm
> > +++ b/src/PVE/SysFSTools.pm
> > @@ -4,8 +4,10 @@ use strict;
> > [..]
> > +
> > +    my $command = ['nvidia-smi', 'vgpu', '--creatable', '--verbose'];
>
> IMO we should not use 'creatable' here but 'available' (or however it's named)

FWIW, for reference, would be `--supported`.

> AFAIR the 'creatable' are only the ones currently allowed to be created,
> not the ones the cards generally support.
>
> This could become a  problem e.g. when a vm is started with some profile,
> then 'creatable' will only return models compatible to that, and even if
> the vm is stopped after, since we cached the output, will not show the
> description of the others)

Oh I see, yeah, that makes sense then. I'll change it for v2, thanks!

>
> Of course we could instead change the caching logic to reparse it, whenever
> we don't find the particular id in the cached config, but that
> seems too much work if we can just have a list of all 'available'
>
> > +    my $parsefn = sub {
> > +	my ($line) = @_;
> > +	return if $line =~ m/^GPU/;
> > +
> > +	my @parts = split(':', $line);
> > +	return if scalar(@parts) != 2;
> > +
> > +	my ($key, $value) = @parts;
> > +
> > +	$key =~ s/^\s+|\s+$//g; # trim whitespace from start and end
> > +	$value =~ s/\s+//g; # trim all whitespace
> > +	$value =~ s/,/-/g; # replace any commas with dashes
> > +
> > +	if ($key eq 'vGPU Type ID') {
> > +	    $cur_id = hex($value);
> > +	} elsif (defined($generic_propmap->{$key}) && $value ne 'N/A') {
> > +	    $configs->{$cur_id}->{$generic_propmap->{$key}} = $value;
> > +	}
> > +
> > +	# `nvidia-smi` prints these keys/values in a deterministic order,
> > +	# so the order they appear in can be relied upon.
> > +	if ($key eq 'Maximum X Resolution') {
> > +	    $configs->{$cur_id}->{'max-resolution'} = $value;
> > +	} elsif ($key eq 'Maximum Y Resolution') {
> > +	    $configs->{$cur_id}->{'max-resolution'} .= "x$value";
> > +	}
> > +    };
>
> it would be great to have some tests here with some example output, that
> way we can compare output if it changes, and it makes reasoning about the
> parsing logic a bit easier.

Sure, will add some tests with the next revision!

>
> > +
> > +    eval {
> > +	PVE::Tools::run_command($command, outfunc => $parsefn);
> > +    };
> > +
> > +    if (my $err = $@) {
> > +	warn "failed to run nvidia-smi: $err\n";
> > +	return undef;
> > +    }
> > +
> > +    for my $k (keys %$configs) {
> > +	$configs->{$k} = PVE::JSONSchema::print_property_string($configs->{$k}, $prop_schema);
> > +    }
>
> since the schema above is mostly emptly besides defining which keys there are,
> is there a specific reason you don't just use something like (untested!):
>
> join(",", map { $_ ."=". $configs->{$_} } keys $configs->%*);
>
> ?
> that way you could omit the prop_schema entirely

No particular reason, other than of not thinking of it :^)

I'll rework it, since the property schema is indeed mostly redundant
w.r.t to the property map above it.



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


  reply	other threads:[~2024-10-30 11:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-28 11:31 Christoph Heiss
2024-10-30 10:13 ` Dominik Csapak
2024-10-30 11:09   ` Christoph Heiss [this message]
2024-10-30 12:08 ` Thomas Lamprecht
2024-10-30 13:50   ` Christoph Heiss

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6obxtzisy4patvz6yifrtib7xoudrfmrgagv3udr5wssbqbjxq@7krizy3j7ys7 \
    --to=c.heiss@proxmox.com \
    --cc=d.csapak@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal