From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Filip Schauer <f.schauer@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH v2 container 1/1] Add device passthrough
Date: Mon, 30 Oct 2023 14:34:54 +0100 [thread overview]
Message-ID: <2rzmdty5ax4v5fssxkvjey4rfhzrcdmjzx5dti4m73lpbekqcf@3wna2j3j2jks> (raw)
In-Reply-To: <20231024125554.131800-2-f.schauer@proxmox.com>
On Tue, Oct 24, 2023 at 02:55:53PM +0200, Filip Schauer wrote:
> Add a dev[n] argument to the container config to pass devices through to
> a container. A device can be passed by its path. Alternatively a mapped
> USB device can be passed through with usbmapping=<name>.
>
> Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
> ---
> src/PVE/LXC.pm | 34 +++++++++++++++++++++++-
> src/PVE/LXC/Config.pm | 60 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 93 insertions(+), 1 deletion(-)
>
> diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
> index c9b5ba7..a3ddb62 100644
> --- a/src/PVE/LXC.pm
> +++ b/src/PVE/LXC.pm
> @@ -5,7 +5,8 @@ use warnings;
>
> use Cwd qw();
> use Errno qw(ELOOP ENOTDIR EROFS ECONNREFUSED EEXIST);
> -use Fcntl qw(O_RDONLY O_WRONLY O_NOFOLLOW O_DIRECTORY);
> +use Fcntl qw(O_RDONLY O_WRONLY O_NOFOLLOW O_DIRECTORY :mode);
> +use File::Basename;
> use File::Path;
> use File::Spec;
> use IO::Poll qw(POLLIN POLLHUP);
> @@ -639,6 +640,37 @@ sub update_lxc_config {
> $raw .= "lxc.mount.auto = sys:mixed\n";
> }
>
> + # Clear passthrough directory from previous run
> + my $passthrough_dir = "/var/lib/lxc/$vmid/passthrough";
> + File::Path::rmtree($passthrough_dir);
I think we need to make a few changes here.
First: we don't necessarily need this directory.
Having a device list would certainly be nice, but it makes more sense to
just have a file we can easily parse (possibly even just a json hash),
like the `devices` file we already create in the pre-start hook, except
prepared *for* the pre-start hook, which *should* be able to just
`mknod` the devices right into the container's `/dev` on startup.
We'd also avoid "lingering" device nodes with potentially harmful
uid/permissions in /var, which is certainly better from a security POV.
But note that we do need the `lxc.cgroup2.*` entries before starting the
container in order to ensure the devices cgroup has the right
permissions.
> +
> + PVE::LXC::Config->foreach_passthrough_device($conf, sub {
> + my ($key, $sanitized_path) = @_;
> +
> + my $absolute_path = "/$sanitized_path";
> + my ($mode, $rdev) = (stat($absolute_path))[2, 6];
> + die "Could not find major and minor ids of device $absolute_path.\n"
> + unless ($mode && $rdev);
> +
> + my $major = PVE::Tools::dev_t_major($rdev);
> + my $minor = PVE::Tools::dev_t_minor($rdev);
> + my $device_type_char = S_ISBLK($mode) ? 'b' : 'c';
> + my $passthrough_device_path = "$passthrough_dir/$sanitized_path";
> + File::Path::make_path(dirname($passthrough_device_path));
> + PVE::Tools::run_command([
> + '/usr/bin/mknod',
> + '-m', '0660',
> + $passthrough_device_path,
> + $device_type_char,
> + $major,
> + $minor
> + ]);
It's probably worth adding a helper for the mknod syscall to
`PVE::Tools`, there are a bunch of syscalls in there already.
> + chown 100000, 100000, $passthrough_device_path if ($unprivileged);
^ This isn't necessarily the correct id. Users may have custom id
mappings.
`PVE::LXC::parse_id_maps($conf)` returns the mapping alongside the root
uid and gid. (See for example `sub mount_all` for how it's used.
> +
> + $raw .= "lxc.cgroup2.devices.allow = $device_type_char $major:$minor rw\n";
> + $raw .= "lxc.mount.entry = $passthrough_device_path $sanitized_path none bind,create=file\n";
> + });
> +
> # WARNING: DO NOT REMOVE this without making sure that loop device nodes
> # cannot be exposed to the container with r/w access (cgroup perms).
> # When this is enabled mounts will still remain in the monitor's namespace
> diff --git a/src/PVE/LXC/Config.pm b/src/PVE/LXC/Config.pm
> index 56e1f10..edd813e 100644
> --- a/src/PVE/LXC/Config.pm
> +++ b/src/PVE/LXC/Config.pm
> @@ -29,6 +29,7 @@ mkdir $lockdir;
> mkdir "/etc/pve/nodes/$nodename/lxc";
> my $MAX_MOUNT_POINTS = 256;
> my $MAX_UNUSED_DISKS = $MAX_MOUNT_POINTS;
> +my $MAX_DEVICES = 256;
>
> # BEGIN implemented abstract methods from PVE::AbstractConfig
>
> @@ -908,6 +909,49 @@ for (my $i = 0; $i < $MAX_UNUSED_DISKS; $i++) {
> }
> }
>
> +PVE::JSONSchema::register_format('pve-lxc-dev-string', \&verify_lxc_dev_string);
> +sub verify_lxc_dev_string {
> + my ($dev, $noerr) = @_;
> +
> + if (
> + $dev =~ m@/\.\.?/@ ||
> + $dev =~ m@/\.\.?$@ ||
> + $dev !~ m!^/dev/!
> + ) {
> + return undef if $noerr;
> + die "$dev is not a valid device path\n";
> + }
> +
> + return $dev;
> +}
> +
> +my $dev_desc = {
> + path => {
> + optional => 1,
> + type => 'string',
> + default_key => 1,
> + format => 'pve-lxc-dev-string',
> + format_description => 'Path',
> + description => 'Device to pass through to the container',
> + verbose_description => 'Path to the device to pass through to the container'
> + },
> + usbmapping => {
> + optional => 1,
> + type => 'string',
> + format => 'pve-configid',
> + format_description => 'mapping-id',
> + description => 'The ID of a cluster wide USB mapping.'
> + }
> +};
> +
> +for (my $i = 0; $i < $MAX_DEVICES; $i++) {
> + $confdesc->{"dev$i"} = {
> + optional => 1,
> + type => 'string', format => $dev_desc,
> + description => "Device to pass through to the container",
> + }
> +}
> +
> sub parse_pct_config {
> my ($filename, $raw, $strict) = @_;
>
> @@ -1255,6 +1299,22 @@ sub parse_volume {
> return;
> }
>
> +sub parse_device {
> + my ($class, $device_string, $noerr) = @_;
> +
> + my $res;
> + eval { $res = PVE::JSONSchema::parse_property_string($dev_desc, $device_string) };
> + if ($@) {
> + return undef if $noerr;
> + die $@;
> + }
> +
> + die "Either path or usbmapping has to be defined"
> + unless (defined($res->{path}) || defined($res->{usbmapping}));
> +
> + return $res;
> +}
> +
> sub print_volume {
> my ($class, $key, $volume) = @_;
>
> --
> 2.39.2
>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
next prev parent reply other threads:[~2023-10-30 13:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-24 12:55 [pve-devel] [PATCH many v2] Add container " Filip Schauer
2023-10-24 12:55 ` [pve-devel] [PATCH v2 container 1/1] Add " Filip Schauer
2023-10-30 13:34 ` Wolfgang Bumiller [this message]
2023-11-02 14:28 ` Filip Schauer
2023-11-03 8:14 ` Wolfgang Bumiller
2023-11-07 13:49 ` Filip Schauer
2023-10-24 12:55 ` [pve-devel] [PATCH v2 guest-common 1/1] Add foreach_passthrough_device Filip Schauer
2023-10-30 13:12 ` Wolfgang Bumiller
2023-10-31 9:00 ` Thomas Lamprecht
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2rzmdty5ax4v5fssxkvjey4rfhzrcdmjzx5dti4m73lpbekqcf@3wna2j3j2jks \
--to=w.bumiller@proxmox.com \
--cc=f.schauer@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox