public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend
@ 2023-05-25 10:17 Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH cluster v4 1/1] add cfg files for resource mapping Dominik Csapak
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

this series aims to add a cluster-wide device mapping for pci and usb
devices. so that an admin can configure a device to be availble for
migration and configuring for uses that are non-root
(the existing pattern can be copied easily for other types, e.g.
markus upcoming folder sharing)

note that this series requires the array support in api/section
config[0] but how the api is formed/where the data comes should not be
hard to change

also since the api changed quite drastically, the gui must be adapted,
and i'm not done with that yet, so sending the backend only for now

the series is a bigger change to the v3, so a closer look is probably
warranted

changes from v3:
* the configs are now split by type (for ease of use of the section
  config) and live in pve-guest-common, to avoid a cyclic dependcy
* the configs are section configs now (with mentioned array support)
* the api is now only defined in /cluster/resource/{TYPE} and has
  no nodespecific api anymore, besides a 'check-node' parameter
  (see the pve-manager patch for more details on that)
* the internal structure of the pci parsing changed completely, making
  the structure more understandable
* a single map entry now has the same semantic as the qemu-server
  hostpci config entry, meaning if you want multiple mappings per host,
  you have to add multiple map entries. this is a more flexible
  approach, and the parsing code gets a bit simpler
* combined some properties in the config (e.g. vendor/device) so that
  we don't have too many
* squashed some changes together, as they didn't make much sense
  separately anyway (e.g. api/config patches) and it didn't make
  reviewing easier
* changed the ACL paths & privileges to be more general
* surely some other changes i forgot..

changes from v2:
* some bug fixes (e.g use of unitialized variable)
* don't set mdev for multifunction devices
  -> this should fix alexandres issue, since it's not possible anymore
  to select a mediated device when having a multifunction device
  selected

changes from v1:
* dropped 'check_hw_perm' (just use 'check_full' now)
* added some cleanups
* renamed the buttons in the ui (hopefully better now)
* added multi device mapping for each host
  this includes a new 'multi pci' selector for that window, which
  automatically adds entries for the whole slots which, when selected,
  disabled the selection of the individual functions
* fixed some issues (e.g. missing entries in the 'caps' object, wrong
  usb config parsing, etc.)

changes from the rfc:
* new cluster wide gui instead of node-local one (removed that, since
  it's not necessary when we have a cluster-wide one)
* uses json instead of a section config
* api is quite different overall, i split the type into its own level
  for configuring, similar to what we do in pbs
  (e.g. /nodes/NODENAME/hardware/mapping/usb/)
* fixed quite some bugs the rfc had
* added patch for handling the gui with limited permissions better
* added a 'comment' field for mappings

dependencies are pretty straight forward this time around (if i'm not
overlooking something):

qemu-server/pve-manager -> new access-control/pve-guest-common -> new pve-cluster

0: https://lists.proxmox.com/pipermail/pve-devel/2023-May/056739.html

pve-cluster:

Dominik Csapak (1):
  add cfg files for resource mapping

 src/PVE/Cluster.pm  | 2 ++
 src/pmxcfs/status.c | 2 ++
 2 files changed, 4 insertions(+)

pve-access-control:

Dominik Csapak (1):
  add privileges and paths for cluster resource mapping

 src/PVE/AccessControl.pm  | 20 +++++++++++++++++++-
 src/PVE/RPCEnvironment.pm |  7 +++++--
 2 files changed, 24 insertions(+), 3 deletions(-)

pve-guest-common:

Dominik Csapak (1):
  add PCI/USB Resource configs

 src/Makefile            |   3 +
 src/PVE/Resource/PCI.pm | 226 ++++++++++++++++++++++++++++++++++++++++
 src/PVE/Resource/USB.pm | 183 ++++++++++++++++++++++++++++++++
 3 files changed, 412 insertions(+)
 create mode 100644 src/PVE/Resource/PCI.pm
 create mode 100644 src/PVE/Resource/USB.pm

qemu-server:

Dominik Csapak (6):
  enable cluster mapped USB devices for guests
  enable cluster mapped PCI devices for guests
  check_local_resources: extend for mapped resources
  api: migrate preconditions: use new check_local_resources info
  migration: check for mapped resources
  add test for mapped pci devices

 PVE/API2/Qemu.pm                              | 110 +++++++-
 PVE/QemuMigrate.pm                            |  23 +-
 PVE/QemuServer.pm                             | 111 +++++---
 PVE/QemuServer/PCI.pm                         | 243 +++++++++++++++---
 PVE/QemuServer/USB.pm                         |  22 +-
 test/MigrationTest/Shared.pm                  |  14 +
 test/cfg2cmd/q35-linux-hostpci-mapping.conf   |  17 ++
 .../q35-linux-hostpci-mapping.conf.cmd        |  36 +++
 test/cfg2cmd/q35-linux-hostpci.conf           |   2 +-
 test/cfg2cmd/q35-linux-hostpci.conf.cmd       |   2 +-
 test/run_config2command_tests.pl              |  83 ++++++
 11 files changed, 575 insertions(+), 88 deletions(-)
 create mode 100644 test/cfg2cmd/q35-linux-hostpci-mapping.conf
 create mode 100644 test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd

pve-manager:

Dominik Csapak (2):
  pvesh: fix parameters for proxyto_callback
  api: add resource map api endpoints for PCI and USB

 PVE/API2/Cluster.pm                |   8 +
 PVE/API2/Cluster/Makefile          |   5 +
 PVE/API2/Cluster/Resource.pm       |  53 +++++
 PVE/API2/Cluster/Resource/Makefile |  18 ++
 PVE/API2/Cluster/Resource/PCI.pm   | 297 +++++++++++++++++++++++++++++
 PVE/API2/Cluster/Resource/USB.pm   | 262 +++++++++++++++++++++++++
 PVE/API2/Hardware.pm               |   1 -
 PVE/API2/Nodes.pm                  |   1 +
 PVE/CLI/pvesh.pm                   |  10 +-
 9 files changed, 650 insertions(+), 5 deletions(-)
 create mode 100644 PVE/API2/Cluster/Resource.pm
 create mode 100644 PVE/API2/Cluster/Resource/Makefile
 create mode 100644 PVE/API2/Cluster/Resource/PCI.pm
 create mode 100644 PVE/API2/Cluster/Resource/USB.pm

-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH cluster v4 1/1] add cfg files for resource mapping
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-06-05  9:11   ` [pve-devel] applied: " Thomas Lamprecht
  2023-05-25 10:17 ` [pve-devel] [PATCH access-control v4 1/1] add privileges and paths for cluster " Dominik Csapak
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

resource/pci.cfg and
resource/usb.cfg

to PVE/Cluster.pm
and status.c

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/PVE/Cluster.pm  | 2 ++
 src/pmxcfs/status.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/PVE/Cluster.pm b/src/PVE/Cluster.pm
index efca58f..05f84bf 100644
--- a/src/PVE/Cluster.pm
+++ b/src/PVE/Cluster.pm
@@ -78,6 +78,8 @@ my $observed = {
     'sdn/dns.cfg' => 1,
     'sdn/.running-config' => 1,
     'virtual-guest/cpu-models.conf' => 1,
+    'resource/pci.cfg' => 1,
+    'resource/usb.cfg' => 1,
 };
 
 sub prepare_observed_file_basedirs {
diff --git a/src/pmxcfs/status.c b/src/pmxcfs/status.c
index 8d62986..77ed968 100644
--- a/src/pmxcfs/status.c
+++ b/src/pmxcfs/status.c
@@ -108,6 +108,8 @@ static memdb_change_t memdb_change_array[] = {
 	{ .path = "sdn/.running-config" },
 	{ .path = "virtual-guest/cpu-models.conf" },
 	{ .path = "firewall/cluster.fw" },
+	{ .path = "resource/pci.cfg" },
+	{ .path = "resource/usb.cfg" },
 };
 
 static GMutex mutex;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH access-control v4 1/1] add privileges and paths for cluster resource mapping
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH cluster v4 1/1] add cfg files for resource mapping Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH guest-common v4 1/1] add PCI/USB Resource configs Dominik Csapak
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

uses the privileges:

Resource.Use
Resource.Modify

on /resource/{TYPE}/{id}

so that we can assign privileges on resource level

this will generate new roles (PVEResourceUser, PVEResourceAdmin)

note that every user with Permissions.Modify on '/' and propagate can add these
new roles to themselves (Administrator and PVEAdmin roles currently have
this privilege)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/PVE/AccessControl.pm  | 20 +++++++++++++++++++-
 src/PVE/RPCEnvironment.pm |  7 +++++--
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/src/PVE/AccessControl.pm b/src/PVE/AccessControl.pm
index 89b7d90..d093970 100644
--- a/src/PVE/AccessControl.pm
+++ b/src/PVE/AccessControl.pm
@@ -1159,13 +1159,22 @@ my $privgroups = {
 	    'Pool.Audit',
 	],
     },
+    Resource => {
+	root => [],
+	admin => [
+	    'Resource.Modify',
+	],
+	user => [
+	    'Resource.Use',
+	],
+    },
 };
 
 my $valid_privs = {};
 
 my $special_roles = {
     'NoAccess' => {}, # no privileges
-    'Administrator' => $valid_privs, # all privileges
+    'Administrator' => {}, # all privileges
 };
 
 sub create_roles {
@@ -1175,6 +1184,7 @@ sub create_roles {
 	foreach my $p (@{$cd->{root}}, @{$cd->{admin}},
 		       @{$cd->{user}}, @{$cd->{audit}}) {
 	    $valid_privs->{$p} = 1;
+	    $special_roles->{"Administrator"}->{$p} = 1;
 	}
 	foreach my $p (@{$cd->{admin}}, @{$cd->{user}}, @{$cd->{audit}}) {
 
@@ -1191,6 +1201,11 @@ sub create_roles {
 	}
     }
 
+    # remove Resource.Modify from Administrator and PVEAdmin, only
+    # root@pam and PVEResourceAdmin should be able to use that for now
+    delete $special_roles->{"Administrator"}->{"Resource.Modify"};
+    delete $special_roles->{"PVEAdmin"}->{"Resource.Modify"};
+
     $special_roles->{"PVETemplateUser"} = { 'VM.Clone' => 1, 'VM.Audit' => 1 };
 };
 
@@ -1288,6 +1303,9 @@ sub check_path {
 	|/storage/[[:alnum:]\.\-\_]+
 	|/vms
 	|/vms/[1-9][0-9]{2,}
+	|/resource
+	|/resource/[[:alnum:]\.\-\_]+/
+	|/resource/[[:alnum:]\.\-\_]+/[[:alnum:]\.\-\_]+
     )$!xs;
 }
 
diff --git a/src/PVE/RPCEnvironment.pm b/src/PVE/RPCEnvironment.pm
index 8586938..8454939 100644
--- a/src/PVE/RPCEnvironment.pm
+++ b/src/PVE/RPCEnvironment.pm
@@ -137,7 +137,9 @@ sub permissions {
 
     if ($user eq 'root@pam') { # root can do anything
 	my $cfg = $self->{user_cfg};
-	return { map { $_ => 1 } keys %{$cfg->{roles}->{'Administrator'}} };
+	my $permissions = { map { $_ => 1 } keys %{$cfg->{roles}->{'Administrator'}} };
+	$permissions->{"Resource.Modify"} = 1;
+	return $permissions;
     }
 
     if (!defined($path)) {
@@ -187,10 +189,11 @@ sub compute_api_permission {
 	nodes => qr/Sys\.|Permissions\.Modify/,
 	sdn => qr/SDN\.|Permissions\.Modify/,
 	dc => qr/Sys\.Audit|SDN\./,
+	resource => qr/Resource\./,
     };
     map { $res->{$_} = {} } keys %$priv_re_map;
 
-    my $required_paths = ['/', '/nodes', '/access/groups', '/vms', '/storage', '/sdn'];
+    my $required_paths = ['/', '/nodes', '/access/groups', '/vms', '/storage', '/sdn', '/resource'];
     my $defined_paths = [];
     PVE::AccessControl::iterate_acl_tree("/", $usercfg->{acl_root}, sub {
 	my ($path, $node) = @_;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH guest-common v4 1/1] add PCI/USB Resource configs
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH cluster v4 1/1] add cfg files for resource mapping Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH access-control v4 1/1] add privileges and paths for cluster " Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:40   ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 1/6] enable cluster mapped USB devices for guests Dominik Csapak
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

adds a config file for each type of resource (usb/pci) by using a 'map'
array propertystring for each node mapping

in each mapping we save the path(s) and some other information to detect
hardware changes (if possible) like the vendor/device id

both configs have custom header parser/formatter to omit the type (since
we only want one type per config here)

also each config has some helpers like find_on_current_node

the resulting config (e.g. for pci) would look like this:

---
some-pci-device
    description some device
    map node=node1,path=0000:00:01.0,id=1234:1234,iommugroup=4
    map node=node2,path=0000:01:01.0,id=1234:1234,iommugroup=5
    map node=node3,path=0000:02:01.0,id=1234:1234,iommugroup=6
---

some special notes:
* mdev is a per config entry, since it does not make sense to mix mdev
  and non-mdev devices
* there can be multiple pci paths given per node mapping, this is
  intended to have a different semantic than in the qemu config, namely
  it will select the first available instead of passing all through as a
  multifunction device

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/Makefile            |   3 +
 src/PVE/Resource/PCI.pm | 226 ++++++++++++++++++++++++++++++++++++++++
 src/PVE/Resource/USB.pm | 183 ++++++++++++++++++++++++++++++++
 3 files changed, 412 insertions(+)
 create mode 100644 src/PVE/Resource/PCI.pm
 create mode 100644 src/PVE/Resource/USB.pm

diff --git a/src/Makefile b/src/Makefile
index 57a360b..d92b441 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -14,6 +14,9 @@ install: PVE
 	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/StorageTunnel.pm ${PERL5DIR}/PVE/
 	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
+	install -d ${PERL5DIR}/PVE/Resource
+	install -m 0644 PVE/Resource/PCI.pm ${PERL5DIR}/PVE/Resource/
+	install -m 0644 PVE/Resource/USB.pm ${PERL5DIR}/PVE/Resource/
 	install -d ${PERL5DIR}/PVE/VZDump
 	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
 	install -m 0644 PVE/VZDump/Common.pm ${PERL5DIR}/PVE/VZDump/
diff --git a/src/PVE/Resource/PCI.pm b/src/PVE/Resource/PCI.pm
new file mode 100644
index 0000000..d4d9c44
--- /dev/null
+++ b/src/PVE/Resource/PCI.pm
@@ -0,0 +1,226 @@
+package PVE::Resource::PCI;
+
+use strict;
+use warnings;
+
+use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_lock_file cfs_write_file);
+use PVE::INotify;
+use PVE::JSONSchema qw(get_standard_option parse_property_string);
+use PVE::SectionConfig;
+use PVE::SysFSTools;
+
+use base qw(PVE::SectionConfig);
+
+my $FILENAME = 'resource/pci.cfg';
+
+cfs_register_file($FILENAME,
+		  sub { __PACKAGE__->parse_config(@_); },
+		  sub { __PACKAGE__->write_config(@_); });
+
+
+# so we don't have to repeat the type every time
+sub parse_section_header {
+    my ($class, $line) = @_;
+
+    if ($line =~ m/^(\S+)\s*$/) {
+	my $id = $1;
+	my $errmsg = undef; # set if you want to skip whole section
+	eval { PVE::JSONSchema::pve_verify_configid($id) };
+	$errmsg = $@ if $@;
+	my $config = {}; # to return additional attributes
+	return ('pci', $id, $errmsg, $config);
+    }
+    return undef;
+}
+
+sub format_section_header {
+    my ($class, $type, $sectionId, $scfg, $done_hash) = @_;
+
+    return "$sectionId\n";
+}
+
+sub type {
+    return 'pci';
+}
+
+my $PCI_RE = "[a-f0-9]{4,}:[a-f0-9]{2}:[a-f0-9]{2}(?:\.[a-f0-9])?";
+
+my $map_fmt = {
+    node => get_standard_option('pve-node'),
+    id =>{
+	description => "The vendor and device ID that is expected. Used for".
+	" detecting hardware changes",
+	type => 'string',
+	pattern => qr/^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{4}$/,
+    },
+    'subsystem-id' => {
+	description => "The subsystem vendor and device ID that is expected. Used".
+	" for detecting hardware changes.",
+	type => 'string',
+	pattern => qr/^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{4}$/,
+	optional => 1,
+    },
+    path => {
+	description => "The path to the device. If the function is omitted, the whole device is"
+	." mapped. In that case use the attributes of the first device. You can give"
+	." multiple paths as a semicolon seperated list, the first available will then"
+	." be chosen on guest start.",
+	type => 'string',
+	pattern => "(?:${PCI_RE};)*${PCI_RE}",
+    },
+    iommugroup => {
+	type => 'integer',
+	description => "The IOMMU group in which the device is to be expected in.".
+	"Used for detecting hardware changes.",
+	optional => 1,
+    },
+    description => {
+	description => "Description of the node specific device.",
+	type => 'string',
+	optional => 1,
+	maxLength => 4096,
+    },
+};
+
+my $defaultData = {
+    propertyList => {
+	id => {
+	    type => 'string',
+	    description => "The ID of the logical PCI resource.",
+	    format => 'pve-configid',
+	},
+	description => {
+	    description => "Description of the logical PCI device.",
+	    type => 'string',
+	    optional => 1,
+	    maxLength => 4096,
+	},
+	mdev => {
+	    type => 'boolean',
+	    optional => 1,
+	},
+	map => {
+	    type => 'array',
+	    description => 'A list of maps for the cluster nodes.',
+	    optional => 1,
+	    items => {
+		type => 'string',
+		format => $map_fmt,
+	    },
+	},
+    },
+};
+
+sub private {
+    return $defaultData;
+}
+
+sub options {
+    return {
+	description => { optional => 1 },
+	mdev => { optional => 1 },
+	map => {},
+    };
+}
+
+# checks if the given config is valid for the current node
+sub assert_valid {
+    my ($name, $cfg) = @_;
+
+    my @paths = split(';', $cfg->{path} // '');
+
+    my $idx = 0;
+    for my $path (@paths) {
+
+	my $multifunction = 0;
+	if ($path !~ m/\.[a-f0-9]/i) {
+	    # whole device, add .0 (must exist)
+	    $path = "$path.0";
+	    $multifunction = 1;
+	}
+
+	my $info = PVE::SysFSTools::pci_device_info($path, 1);
+	die "pci device '$path' not found\n" if !defined($info);
+
+	my $correct_props = {
+	    id => "$info->{vendor}:$info->{device}",
+	    iommugroup => $info->{iommugroup},
+	};
+
+	if (defined($info->{'subsystem_vendor'}) && defined($info->{'subsystem_device'})) {
+	    $correct_props->{'subsystem-id'} = "$info->{'subsystem_vendor'}:$info->{'subsystem_device'}";
+	}
+
+	for my $prop (sort keys %$correct_props) {
+	    next if $prop eq 'iommugroup' && $idx > 0; # check iommu only on the first device
+
+	    next if !defined($correct_props->{$prop}) && !defined($cfg->{$prop});
+	    die "no '$prop' for device '$path'\n"
+		if defined($correct_props->{$prop}) && !defined($cfg->{$prop});
+	    die "'$prop' configured but should not be\n"
+		if !defined($correct_props->{$prop}) && defined($cfg->{$prop});
+
+	    my $correct_prop = $correct_props->{$prop};
+	    $correct_prop =~ s/0x//g;
+	    my $configured_prop = $cfg->{$prop};
+	    $configured_prop =~ s/0x//g;
+
+	    die "'$prop' does not match for '$name' ($correct_prop != $configured_prop)\n"
+		if $correct_prop ne $configured_prop;
+	}
+	$idx++;
+    }
+
+    return 1;
+};
+
+sub config {
+    return cfs_read_file($FILENAME);
+}
+
+sub lock_pci_config {
+    my ($code, $errmsg) = @_;
+
+    cfs_lock_file($FILENAME, undef, $code);
+    if (my $err = $@) {
+	$errmsg ? die "$errmsg: $err" : die $err;
+    }
+}
+
+sub write_pci_config {
+    my ($cfg) = @_;
+
+    cfs_write_file($FILENAME, $cfg);
+}
+
+sub find_on_current_node {
+    my ($id) = @_;
+
+    my $cfg = PVE::Resource::PCI::config();
+    my $node = PVE::INotify::nodename();
+
+    # ignore errors
+    return get_node_mapping($cfg, $id, $node);
+}
+
+sub get_node_mapping {
+    my ($cfg, $id, $nodename) = @_;
+
+    return undef if !defined($cfg->{ids}->{$id});
+
+    my $res = [];
+    for my $map ($cfg->{ids}->{$id}->{map}->@*) {
+	my $entry = eval { parse_property_string($map_fmt, $map) };
+	warn $@ if $@;
+	if ($entry && $entry->{node} eq $nodename) {
+	    push $res->@*, $entry;
+	}
+    }
+
+    return $res;
+}
+
+PVE::Resource::PCI->register();
+PVE::Resource::PCI->init();
+
+1;
diff --git a/src/PVE/Resource/USB.pm b/src/PVE/Resource/USB.pm
new file mode 100644
index 0000000..05ea789
--- /dev/null
+++ b/src/PVE/Resource/USB.pm
@@ -0,0 +1,183 @@
+package PVE::Resource::USB;
+
+use strict;
+use warnings;
+
+use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_lock_file cfs_write_file);
+use PVE::INotify;
+use PVE::JSONSchema qw(get_standard_option parse_property_string);
+use PVE::SectionConfig;
+use PVE::SysFSTools;
+
+use base qw(PVE::SectionConfig);
+
+my $FILENAME = 'resource/usb.cfg';
+
+cfs_register_file($FILENAME,
+		  sub { __PACKAGE__->parse_config(@_); },
+		  sub { __PACKAGE__->write_config(@_); });
+
+
+# so we don't have to repeat the type every time
+sub parse_section_header {
+    my ($class, $line) = @_;
+
+    if ($line =~ m/^(\S+)\s*$/) {
+	my $id = $1;
+	my $errmsg = undef; # set if you want to skip whole section
+	eval { PVE::JSONSchema::pve_verify_configid($id) };
+	$errmsg = $@ if $@;
+	my $config = {}; # to return additional attributes
+	return ('usb', $id, $errmsg, $config);
+    }
+    return undef;
+}
+
+sub format_section_header {
+    my ($class, $type, $sectionId, $scfg, $done_hash) = @_;
+
+    return "$sectionId\n";
+}
+
+sub type {
+    return 'usb';
+}
+
+my $map_fmt = {
+    node => get_standard_option('pve-node'),
+    'id' => {
+	description => "The vendor and device ID that is expected. If a USB path".
+	" is given, it is only used for detecting hardware changes",
+	type => 'string',
+	pattern => qr/^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{4}$/,
+    },
+    path => {
+	description => "The path to the usb device.",
+	type => 'string',
+	optional => 1,
+	pattern => qr/^(\d+)\-(\d+(\.\d+)*)$/,
+    },
+    description => {
+	description => "Description of the node specific device.",
+	type => 'string',
+	optional => 1,
+	maxLength => 4096,
+    },
+};
+
+my $defaultData = {
+    propertyList => {
+	id => {
+	    type => 'string',
+	    description => "The ID of the logical PCI resource.",
+	    format => 'pve-configid',
+	},
+	description => {
+	    description => "Description of the logical PCI device.",
+	    type => 'string',
+	    optional => 1,
+	    maxLength => 4096,
+	},
+	map => {
+	    type => 'array',
+	    description => 'A list of maps for the cluster nodes.',
+	    items => {
+		type => 'string',
+		format => $map_fmt,
+	    },
+	},
+    },
+};
+sub private {
+    return $defaultData;
+}
+
+sub options {
+    return {
+	description => { optional => 1 },
+	map => {},
+    };
+}
+
+# checks if the given device is valid for the current node
+sub assert_valid {
+    my ($name, $cfg) = @_;
+
+    my $id = $cfg->{id};
+
+    my $usb_list = PVE::SysFSTools::scan_usb();
+
+    my $info;
+    if (my $path = $cfg->{path}) {
+	for my $dev (@$usb_list) {
+	    next if !$dev->{usbpath} || !$dev->{busnum};
+	    my $usbpath = "$dev->{busnum}-$dev->{usbpath}";
+	    next if $usbpath ne $path;
+	    $info = $dev;
+	}
+	die "usb device '$path' not found\n" if !defined($info);
+
+	my $realId = "$info->{vendid}:$info->{prodid}";
+	die "'id' does not match for '$name' ($realId != $id)\n"
+	    if $realId ne $id;
+    } else {
+	for my $dev (@$usb_list) {
+	    my $realId = "$dev->{vendid}:$dev->{prodid}";
+	    next if $realId ne $id;
+	    $info = $dev;
+	}
+	die "usb device '$id' not found\n" if !defined($info);
+    }
+
+    return 1;
+};
+
+sub config {
+    return cfs_read_file($FILENAME);
+}
+
+sub lock_usb_config {
+    my ($code, $errmsg) = @_;
+
+    cfs_lock_file($FILENAME, undef, $code);
+    if (my $err = $@) {
+	$errmsg ? die "$errmsg: $err" : die $err;
+    }
+}
+
+sub write_usb_config {
+    my ($cfg) = @_;
+
+    cfs_write_file($FILENAME, $cfg);
+}
+
+sub find_on_current_node {
+    my ($id) = @_;
+
+    my $cfg = config();
+    my $node = PVE::INotify::nodename();
+
+    return get_node_mapping($cfg, $id, $node);
+}
+
+sub get_node_mapping {
+    my ($cfg, $id, $nodename) = @_;
+
+    return undef if !defined($cfg->{ids}->{$id});
+
+    my $res = [];
+    for my $map ($cfg->{ids}->{$id}->{map}->@*) {
+	my $entry = eval { parse_property_string($map_fmt, $map) };
+	warn $@ if $@;
+	if ($entry && $entry->{node} eq $nodename) {
+	    push $res->@*, $entry;
+	}
+    }
+
+    return $res;
+}
+
+PVE::Resource::USB->register();
+PVE::Resource::USB->init();
+
+1;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH qemu-server v4 1/6] enable cluster mapped USB devices for guests
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (2 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH guest-common v4 1/1] add PCI/USB Resource configs Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 2/6] enable cluster mapped PCI " Dominik Csapak
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

this patch allows configuring usb devices that are mapped via
cluster resource mapping when the user has 'Resource.Use' on the ACL
path '/resource/usb/{ID}' (in addition to the usual required vm config
privileges)

for now, this is only valid if there is exactly one mapping for the
host, since we don't track passed through usb devices yet

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Qemu.pm      | 40 +++++++++++++++++++++++++++++++++++++---
 PVE/QemuServer.pm     |  4 ++++
 PVE/QemuServer/USB.pm | 22 +++++++++++++++++++++-
 3 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 587bb222..cedddf97 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::USB qw(parse_usb_device);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -590,8 +591,13 @@ my $check_vm_create_usb_perm = sub {
 
     foreach my $opt (keys %{$param}) {
 	next if $opt !~ m/^usb\d+$/;
+	my $entry = PVE::JSONSchema::parse_property_string('pve-qm-usb', $param->{$opt});
+	my $device = parse_usb_device($entry->{host});
 
-	if ($param->{$opt} =~ m/spice/) {
+	if ($device->{spice}) {
+	    $rpcenv->check_vm_perm($authuser, $vmid, $pool, ['VM.Config.HWType']);
+	} elsif ($device->{mapped}) {
+	    $rpcenv->check_full($authuser, "/resource/usb/$entry->{host}", ['Resource.Use']);
 	    $rpcenv->check_vm_perm($authuser, $vmid, $pool, ['VM.Config.HWType']);
 	} else {
 	    die "only root can set '$opt' config for real devices\n";
@@ -1696,7 +1702,12 @@ my $update_vm_api  = sub {
 		    PVE::QemuConfig->add_to_pending_delete($conf, $opt, $force);
 		    PVE::QemuConfig->write_config($vmid, $conf);
 		} elsif ($opt =~ m/^usb\d+$/) {
-		    if ($val =~ m/spice/) {
+		    my $device = PVE::JSONSchema::parse_property_string('pve-qm-usb', $val);
+		    my $host = parse_usb_device($device->{host});
+		    if ($host->{spice}) {
+			$rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
+		    } elsif ($host->{mapped}) {
+			$rpcenv->check_full($authuser, "/resource/usb/$device->{host}", ['Resource.Use']);
 			$rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
 		    } elsif ($authuser ne 'root@pam') {
 			die "only root can delete '$opt' config for real devices\n";
@@ -1761,7 +1772,30 @@ my $update_vm_api  = sub {
 		    }
 		    $conf->{pending}->{$opt} = $param->{$opt};
 		} elsif ($opt =~ m/^usb\d+/) {
-		    if ((!defined($conf->{$opt}) || $conf->{$opt} =~ m/spice/) && $param->{$opt} =~ m/spice/) {
+		    my $olddevice;
+		    my $oldhost;
+		    if (defined($conf->{$opt})) {
+			$olddevice = PVE::JSONSchema::parse_property_string('pve-qm-usb', $conf->{$opt});
+			$oldhost = parse_usb_device($olddevice->{host});
+		    }
+		    if (defined($oldhost)) {
+			if ($oldhost->{spice}) {
+			    $rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
+			} elsif ($oldhost->{mapped}) {
+			    $rpcenv->check_full($authuser, "/resource/usb/$olddevice->{host}", ['Resource.Use']);
+			    $rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
+			} elsif ($authuser ne 'root@pam') {
+			    die "only root can modify '$opt' config for real devices\n";
+			}
+		    }
+
+		    my $newdevice = PVE::JSONSchema::parse_property_string('pve-qm-usb', $param->{$opt});
+		    my $newhost = parse_usb_device($newdevice->{host});
+
+		    if ($newhost->{spice}) {
+			$rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
+		    } elsif ($newhost->{mapped}) {
+			$rpcenv->check_full($authuser, "/resource/usb/$newdevice->{host}", ['Resource.Use']);
 			$rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
 		    } elsif ($authuser ne 'root@pam') {
 			die "only root can modify '$opt' config for real devices\n";
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index ab33aa37..bdc4f3d9 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -1090,6 +1090,8 @@ The Host USB device or port or the value 'spice'. HOSTUSBDEVICE syntax is:
 
 You can use the 'lsusb -t' command to list existing usb devices.
 
+Alternatively, you can used an ID of a mapped usb device.
+
 NOTE: This option allows direct access to host hardware. So it is no longer possible to migrate such
 machines - use with special care.
 
@@ -1106,6 +1108,8 @@ EODESCR
     },
 };
 
+PVE::JSONSchema::register_format('pve-qm-usb', $usb_fmt);
+
 my $usbdesc = {
     optional => 1,
     type => 'string', format => $usb_fmt,
diff --git a/PVE/QemuServer/USB.pm b/PVE/QemuServer/USB.pm
index 686461cc..b764ce39 100644
--- a/PVE/QemuServer/USB.pm
+++ b/PVE/QemuServer/USB.pm
@@ -6,6 +6,7 @@ use PVE::QemuServer::PCI qw(print_pci_addr);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Helpers qw(min_version windows_version);
 use PVE::JSONSchema;
+use PVE::Resource::USB;
 use base 'Exporter';
 
 our @EXPORT_OK = qw(
@@ -31,7 +32,26 @@ sub parse_usb_device {
     } elsif ($value =~ m/^spice$/i) {
 	$res->{spice} = 1;
     } else {
-	return;
+	# we have no ordinary usb device, must be a mapping
+	my $devices = PVE::Resource::USB::find_on_current_node($value);
+	return undef if scalar($devices->@*) != 1;
+	eval {
+	    PVE::Resource::USB::assert_valid($value, $devices->[0]);
+	};
+	if (my $err = $@) {
+	    warn "USB Mapping invalid (hardware probably changed): $err\n";
+	    return;
+	}
+
+	my $device = $devices->[0];
+
+	if ($device->{path}) {
+	    $res = parse_usb_device($device->{path});
+	} else {
+	    $res = parse_usb_device($device->{id});
+	}
+
+	$res->{mapped} = 1;
     }
 
     return $res;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH qemu-server v4 2/6] enable cluster mapped PCI devices for guests
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (3 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 1/6] enable cluster mapped USB devices for guests Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 3/6] check_local_resources: extend for mapped resources Dominik Csapak
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

this patch allows configuring pci devices that are mapped via cluster
resource mapping when the user has 'Resource.Use' on the ACL path
'/resource/pci/{ID}' (in  addition to the usual required vm config
privileges)

When given multiple mappings in the config, we use them as alternatives
for the passthrough, and will select the first free one on startup.
It is using our regular pci reservation mechanism for regular devices and
we introduce a selection mechanism for mediated devices.

A few changes to the inner workings were required to make this work well:
* parse_hostpci now returns a different structure where we have a list
  of lists (first level is for the different alternatives and second
  level is for the different devices that should be passed through
  together)
* factor out the 'parse_hostpci_devices' which parses each device from
  the config and does some precondition checks
* reserve_pci_usage now behaves slightly different when trying to
  reserve an device with the same VMID that's already reserved for,
  since for checking which alternative we can use, we already must
  reserve one (this means that qm showcmd can actually reserve devices,
  albeit only for up to 10 seconds)
* configuring a mediated device on a multifunction device is not
  supported anymore, and results in failure to start (previously, it
  just chose the first device to do it). This is a breaking change
* configuring a single pci device twice on different hostpci slots now
  fails during commandline generation instead on qemu start, so we had
  to adapt one test where this occurred (it could never have worked
  anyway)

Fixes #3574: Improve SR-IOV usability
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Qemu.pm                        |  53 +++++-
 PVE/QemuServer.pm                       |  64 ++++---
 PVE/QemuServer/PCI.pm                   | 243 ++++++++++++++++++++----
 test/cfg2cmd/q35-linux-hostpci.conf     |   2 +-
 test/cfg2cmd/q35-linux-hostpci.conf.cmd |   2 +-
 test/run_config2command_tests.pl        |   1 +
 6 files changed, 291 insertions(+), 74 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index cedddf97..68d1b818 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::PCI;
 use PVE::QemuServer::USB qw(parse_usb_device);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
@@ -607,6 +608,26 @@ my $check_vm_create_usb_perm = sub {
     return 1;
 };
 
+my $check_vm_create_hostpci_perm = sub {
+    my ($rpcenv, $authuser, $vmid, $pool, $param) = @_;
+
+    return 1 if $authuser eq 'root@pam';
+
+    foreach my $opt (keys %{$param}) {
+	next if $opt !~ m/^hostpci\d+$/;
+
+	my $device = PVE::JSONSchema::parse_property_string('pve-qm-hostpci', $param->{$opt});
+	if ($device->{host} !~ m/:/) {
+	    $rpcenv->check_full($authuser, "/resource/pci/$device->{host}", ['Resource.Use']);
+	    $rpcenv->check_vm_perm($authuser, $vmid, $pool, ['VM.Config.HWType']);
+	} else {
+	    die "only root can set '$opt' config for non-mapped devices\n";
+	}
+    }
+
+    return 1;
+};
+
 my $check_vm_modify_config_perm = sub {
     my ($rpcenv, $authuser, $vmid, $pool, $key_list) = @_;
 
@@ -617,7 +638,7 @@ my $check_vm_modify_config_perm = sub {
 	# else, as there the permission can be value dependend
 	next if PVE::QemuServer::is_valid_drivename($opt);
 	next if $opt eq 'cdrom';
-	next if $opt =~ m/^(?:unused|serial|usb)\d+$/;
+	next if $opt =~ m/^(?:unused|serial|usb|hostpci)\d+$/;
 	next if $opt eq 'tags';
 
 
@@ -646,7 +667,7 @@ my $check_vm_modify_config_perm = sub {
 	    # also needs privileges on the storage, that will be checked later
 	    $rpcenv->check_vm_perm($authuser, $vmid, $pool, ['VM.Config.Disk', 'VM.PowerMgmt' ]);
 	} else {
-	    # catches hostpci\d+, args, lock, etc.
+	    # catches args, lock, etc.
 	    # new options will be checked here
 	    die "only root can set '$opt' config\n";
 	}
@@ -884,6 +905,7 @@ __PACKAGE__->register_method({
 
 	    &$check_vm_create_serial_perm($rpcenv, $authuser, $vmid, $pool, $param);
 	    &$check_vm_create_usb_perm($rpcenv, $authuser, $vmid, $pool, $param);
+	    &$check_vm_create_hostpci_perm($rpcenv, $authuser, $vmid, $pool, $param);
 
 	    &$check_cpu_model_access($rpcenv, $authuser, $param);
 
@@ -1714,6 +1736,16 @@ my $update_vm_api  = sub {
 		    }
 		    PVE::QemuConfig->add_to_pending_delete($conf, $opt, $force);
 		    PVE::QemuConfig->write_config($vmid, $conf);
+		} elsif ($opt =~ m/^hostpci\d+$/) {
+		    my $olddevice = PVE::JSONSchema::parse_property_string('pve-qm-hostpci', $val);
+		    if ($olddevice->{host} !~ m/:/) {
+			$rpcenv->check_full($authuser, "/resource/pci/$olddevice->{host}", ['Resource.Use']);
+			$rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
+		    } elsif ($authuser ne 'root@pam') {
+			die "only root can set '$opt' config for non-mapped devices\n";
+		    }
+		    PVE::QemuConfig->add_to_pending_delete($conf, $opt, $force);
+		    PVE::QemuConfig->write_config($vmid, $conf);
 		} elsif ($opt eq 'tags') {
 		    assert_tag_permissions($vmid, $val, '', $rpcenv, $authuser);
 		    delete $conf->{$opt};
@@ -1800,6 +1832,23 @@ my $update_vm_api  = sub {
 		    } elsif ($authuser ne 'root@pam') {
 			die "only root can modify '$opt' config for real devices\n";
 		    }
+
+		    $conf->{pending}->{$opt} = $param->{$opt};
+		} elsif ($opt =~ m/^hostpci\d+$/) {
+		    my $olddevice;
+		    if (defined($conf->{$opt})) {
+			$olddevice = PVE::JSONSchema::parse_property_string('pve-qm-hostpci', $conf->{$opt});
+		    }
+		    my $newdevice = PVE::JSONSchema::parse_property_string('pve-qm-hostpci', $param->{$opt});
+		    if ((!defined($olddevice) || $olddevice->{host} !~ m/:/) && $newdevice->{host} !~ m/:/) {
+			if (defined($olddevice)) {
+			    $rpcenv->check_full($authuser, "/resource/pci/$olddevice->{host}", ['Resource.Use']);
+			}
+			$rpcenv->check_full($authuser, "/resource/pci/$newdevice->{host}", ['Resource.Use']);
+			$rpcenv->check_vm_perm($authuser, $vmid, undef, ['VM.Config.HWType']);
+		    } elsif ($authuser ne 'root@pam') {
+			die "only root can set '$opt' config for non-mapped devices\n";
+		    }
 		    $conf->{pending}->{$opt} = $param->{$opt};
 		} elsif ($opt eq 'tags') {
 		    assert_tag_permissions($vmid, $conf->{$opt}, $param->{$opt}, $rpcenv, $authuser);
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index bdc4f3d9..cfdf9918 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3773,8 +3773,8 @@ sub config_to_command {
     my $bootorder = device_bootorder($conf);
 
     # host pci device passthrough
-    my ($kvm_off, $gpu_passthrough, $legacy_igd) = PVE::QemuServer::PCI::print_hostpci_devices(
-	$vmid, $conf, $devices, $vga, $winversion, $q35, $bridges, $arch, $machine_type, $bootorder);
+    my ($kvm_off, $gpu_passthrough, $legacy_igd, $pci_devices) = PVE::QemuServer::PCI::print_hostpci_devices(
+	$vmid, $conf, $devices, $vga, $winversion, $bridges, $arch, $machine_type, $bootorder);
 
     # usb devices
     my $usb_dev_features = {};
@@ -4193,7 +4193,7 @@ sub config_to_command {
 	push @$cmd, @$aa;
     }
 
-    return wantarray ? ($cmd, $vollist, $spice_port) : $cmd;
+    return wantarray ? ($cmd, $vollist, $spice_port, $pci_devices) : $cmd;
 }
 
 sub check_rng_source {
@@ -5750,7 +5750,7 @@ sub vm_start_nolock {
 	print "Resuming suspended VM\n";
     }
 
-    my ($cmd, $vollist, $spice_port) = config_to_command($storecfg, $vmid,
+    my ($cmd, $vollist, $spice_port, $pci_devices) = config_to_command($storecfg, $vmid,
 	$conf, $defaults, $forcemachine, $forcecpu, $params->{'pbs-backing'});
 
     my $migration_ip;
@@ -5835,37 +5835,43 @@ sub vm_start_nolock {
 
     my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $resume);
 
-    my $pci_devices = {}; # host pci devices
-    for (my $i = 0; $i < $PVE::QemuServer::PCI::MAX_HOSTPCI_DEVICES; $i++)  {
-	my $dev = $conf->{"hostpci$i"} or next;
-	$pci_devices->{$i} = parse_hostpci($dev);
+    my $pci_reserve_list = [];
+    for my $device (values $pci_devices->%*) {
+	next if $device->{mdev}; # we don't reserve for mdev devices
+	push $pci_reserve_list->@*, map { $_->{id} } $device->{ids}->@*;
     }
 
-    # do not reserve pciid for mediated devices, sysfs will error out for duplicate assignment
-    my $real_pci_devices = [ grep { !(defined($_->{mdev}) && scalar($_->{pciid}->@*) == 1) } values $pci_devices->%* ];
-
-    # map to a flat list of pci ids
-    my $pci_id_list = [ map { $_->{id} } map { $_->{pciid}->@* } $real_pci_devices->@* ];
-
     # reserve all PCI IDs before actually doing anything with them
-    PVE::QemuServer::PCI::reserve_pci_usage($pci_id_list, $vmid, $start_timeout);
+    PVE::QemuServer::PCI::reserve_pci_usage($pci_reserve_list, $vmid, $start_timeout);
 
     eval {
 	my $uuid;
 	for my $id (sort keys %$pci_devices) {
 	    my $d = $pci_devices->{$id};
-	    for my $dev ($d->{pciid}->@*) {
-		my $info = PVE::QemuServer::PCI::prepare_pci_device($vmid, $dev->{id}, $id, $d->{mdev});
-
-		# nvidia grid needs the qemu parameter '-uuid' set
-		# use smbios uuid or mdev uuid as fallback for that
-		if ($d->{mdev} && !defined($uuid) && $info->{vendor} eq '10de') {
-		    if (defined($conf->{smbios1})) {
-			my $smbios_conf = parse_smbios1($conf->{smbios1});
-			$uuid = $smbios_conf->{uuid} if defined($smbios_conf->{uuid});
-		    }
-		    $uuid = PVE::QemuServer::PCI::generate_mdev_uuid($vmid, $id) if !defined($uuid);
+	    my ($index) = ($id =~ m/^hostpci(\d+)$/);
+
+	    my $chosen_mdev;
+	    for my $dev ($d->{ids}->@*) {
+		my $info = eval { PVE::QemuServer::PCI::prepare_pci_device($vmid, $dev->{id}, $index, $d->{mdev}) };
+		if ($d->{mdev}) {
+		    warn $@ if $@;
+		    $chosen_mdev = $info;
+		    last if $chosen_mdev; # if successful, we're done
+		} else {
+		    die $@ if $@;
+		}
+	    }
+
+	    next if !$d->{mdev};
+	    die "could not create mediated device\n" if !defined($chosen_mdev);
+
+	    # nvidia grid needs the uuid of the mdev as qemu parameter
+	    if (!defined($uuid) && $chosen_mdev->{vendor} =~ m/^(0x)?10de$/) {
+		if (defined($conf->{smbios1})) {
+		    my $smbios_conf = parse_smbios1($conf->{smbios1});
+		    $uuid = $smbios_conf->{uuid} if defined($smbios_conf->{uuid});
 		}
+		$uuid = PVE::QemuServer::PCI::generate_mdev_uuid($vmid, $index) if !defined($uuid);
 	    }
 	}
 	push @$cmd, '-uuid', $uuid if defined($uuid);
@@ -5980,7 +5986,7 @@ sub vm_start_nolock {
 
     # re-reserve all PCI IDs now that we can know the actual VM PID
     my $pid = PVE::QemuServer::Helpers::vm_running_locally($vmid);
-    eval { PVE::QemuServer::PCI::reserve_pci_usage($pci_id_list, $vmid, undef, $pid) };
+    eval { PVE::QemuServer::PCI::reserve_pci_usage($pci_reserve_list, $vmid, undef, $pid) };
     warn $@ if $@;
 
     if (defined($res->{migrate})) {
@@ -6175,9 +6181,7 @@ sub cleanup_pci_devices {
 
 	    # some nvidia vgpu driver versions want to clean the mdevs up themselves, and error
 	    # out when we do it first. so wait for 10 seconds and then try it
-	    my $pciid = $d->{pciid}->[0]->{id};
-	    my $info = PVE::SysFSTools::pci_device_info("$pciid");
-	    if ($info->{vendor} eq '10de') {
+	    if ($d->{ids}->[0]->[0]->{vendor} =~ m/^(0x)?10de$/) {
 		sleep 10;
 	    }
 
diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index a18b9747..27e7c847 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -4,6 +4,7 @@ use warnings;
 use strict;
 
 use PVE::JSONSchema;
+use PVE::Resource::PCI;
 use PVE::SysFSTools;
 use PVE::Tools;
 
@@ -23,8 +24,8 @@ my $hostpci_fmt = {
     host => {
 	default_key => 1,
 	type => 'string',
-	pattern => qr/$PCIRE(;$PCIRE)*/,
-	format_description => 'HOSTPCIID[;HOSTPCIID2...]',
+	pattern => qr/(:?$PCIRE(;$PCIRE)*)|(:?$PVE::JSONSchema::CONFIGID_RE)/,
+	format_description => 'HOSTPCIID[;HOSTPCIID2...] or configured mapping id',
 	description => <<EODESCR,
 Host PCI device pass through. The PCI ID of a host's PCI device or a list
 of PCI virtual functions of the host. HOSTPCIID syntax is:
@@ -32,6 +33,8 @@ of PCI virtual functions of the host. HOSTPCIID syntax is:
 'bus:dev.func' (hexadecimal numbers)
 
 You can us the 'lspci' command to list existing PCI devices.
+
+Alternatively use the ID of a mapped pci device.
 EODESCR
     },
     rombar => {
@@ -376,6 +379,32 @@ sub print_pcie_root_port {
     return $res;
 }
 
+# returns the parsed pci config but parses the 'host' part into
+# a list if lists into the 'id' property like this:
+#
+# {
+#   mdev => 1,
+#   rombar => ...
+#   ...
+#   ids => [
+#       # this contains a list of alternative devices,
+#       [
+#           # which are itself lists of ids for one multifunction device
+#           {
+#               id => "0000:00:00.0",
+#               vendor => "...",
+#           },
+#           {
+#               id => "0000:00:00.1",
+#               vendor => "...",
+#           },
+#       ],
+#       [
+#           ...
+#       ],
+#       ...
+#   ],
+# }
 sub parse_hostpci {
     my ($value) = @_;
 
@@ -383,31 +412,167 @@ sub parse_hostpci {
 
     my $res = PVE::JSONSchema::parse_property_string($hostpci_fmt, $value);
 
-    my @idlist = split(/;/, $res->{host});
+    my $alternatives = [];
+    if ($res->{host} !~ m/:/) {
+	# we have no ordinary pci id, must be a mapping
+	my $devices = PVE::Resource::PCI::find_on_current_node($res->{host});
+	die "PCI device mapping not found for '$res->{host}'\n" if !$devices || !scalar($devices->@*);
+
+	for my $device ($devices->@*) {
+	    eval { PVE::Resource::PCI::assert_valid($res->{host}, $device) };
+	    die "PCI device mapping invalid (hardware probably changed): $@\n" if $@;
+	    push $alternatives->@*, [split(/;/, $device->{path})];
+	}
+    } else {
+	push $alternatives->@*, [split(/;/, $res->{host})];
+    }
     delete $res->{host};
-    foreach my $id (@idlist) {
-	my $devs = PVE::SysFSTools::lspci($id);
-	die "no PCI device found for '$id'\n" if !scalar(@$devs);
-	push @{$res->{pciid}}, @$devs;
+
+    $res->{ids} = [];
+    for my $alternative ($alternatives->@*) {
+	my $ids = [];
+	foreach my $id ($alternative->@*) {
+	    my $devs = PVE::SysFSTools::lspci($id);
+	    die "no PCI device found for '$id'\n" if !scalar($devs->@*);
+	    push $ids->@*, @$devs;
+	}
+	if (scalar($ids->@*) > 1) {
+	    $res->{'has-multifunction'} = 1;
+	    die "cannot use mediated device with multifunction device\n" if $res->{mdev};
+	}
+	push $res->{ids}->@*, $ids;
     }
+
     return $res;
 }
 
+# parses all hostpci devices from a config and does some sanity checks
+# returns a hash like this:
+# {
+#     hostpci0 => {
+#         # hash from parse_hostpci function
+#     },
+#     hostpci1 => { ... },
+#     ...
+# }
+sub parse_hostpci_devices {
+    my ($conf) = @_;
+
+    my $q35 = PVE::QemuServer::Machine::machine_type_is_q35($conf);
+    my $legacy_igd = 0;
+
+    my $parsed_devices = {};
+    for (my $i = 0; $i < $MAX_HOSTPCI_DEVICES; $i++)  {
+	my $id = "hostpci$i";
+	my $d = parse_hostpci($conf->{$id});
+	next if !$d;
+
+	# check syntax
+	die "q35 machine model is not enabled" if !$q35 && $d->{pcie};
+
+	if ($d->{'legacy-igd'}) {
+	    die "only one device can be assigned in legacy-igd mode\n"
+		if $legacy_igd;
+	    $legacy_igd = 1;
+
+	    die "legacy IGD assignment requires VGA mode to be 'none'\n"
+		if !defined($conf->{'vga'}) || $conf->{'vga'} ne 'none';
+	    die "legacy IGD assignment requires rombar to be enabled\n"
+		if defined($d->{rombar}) && !$d->{rombar};
+	    die "legacy IGD assignment is not compatible with x-vga\n"
+		if $d->{'x-vga'};
+	    die "legacy IGD assignment is not compatible with mdev\n"
+		if $d->{mdev};
+	    die "legacy IGD assignment is not compatible with q35\n"
+		if $q35;
+	    die "legacy IGD assignment is not compatible with multifunction devices\n"
+		if $d->{'has-multifunction'};
+	    die "legacy IGD assignment is not compatible with alternate devices\n"
+		if scalar($d->{ids}->@*) > 1;
+	    # check first device for valid id
+	    die "legacy IGD assignment only works for devices on host bus 00:02.0\n"
+		if $d->{ids}->[0]->[0]->{id} !~ m/02\.0$/;
+	}
+
+	$parsed_devices->{$id} = $d;
+    }
+
+    return $parsed_devices;
+}
+
+# takes the hash returned by parse_hostpci_devices and for all non mdev gpus,
+# selects one of the given alternatives by trying to reserve it
+#
+# mdev devices must be chosen later when we actually allocate it, but we
+# flatten the inner list since there can only be one device per alternative anyway
+my sub choose_hostpci_devices {
+    my ($devices, $vmid) = @_;
+
+    my $used = {};
+
+    my $add_used_device = sub {
+	my ($devices) = @_;
+	for my $used_device ($devices->@*) {
+	    my $used_id = $used_device->{id};
+	    die "device '$used_id' assigned more than once\n" if $used->{$used_id};
+	    $used->{$used_id} = 1;
+	}
+    };
+
+    for (my $i = 0; $i < $MAX_HOSTPCI_DEVICES; $i++)  {
+	my $device = $devices->{"hostpci$i"};
+	next if !$device;
+
+	if ($device->{mdev}) {
+	    $device->{ids} = [ map { $_->[0] } $device->{ids}->@* ];
+	    next;
+	}
+
+	if (scalar($device->{ids}->@* == 1)) {
+	    # we only have one alternative, use that
+	    $device->{ids} = $device->{ids}->[0];
+	    $add_used_device->($device->{ids});
+	    next;
+	}
+
+	my $found = 0;
+	for my $alternative ($device->{ids}->@*) {
+	    my $ids = [map { $_->{id} } @$alternative];
+
+	    next if grep { defined($used->{$_}) } @$ids; # already used
+	    eval { reserve_pci_usage($ids, $vmid, 10, undef) };
+	    next if $@;
+
+	    # found one that is not used or reserved
+	    $add_used_device->($alternative);
+	    $device->{ids} = $alternative;
+	    $found = 1;
+	    last;
+	}
+	die "could not find a free device for 'hostpci$i'\n" if !$found;
+    }
+
+    return $devices;
+}
+
 sub print_hostpci_devices {
-    my ($vmid, $conf, $devices, $vga, $winversion, $q35, $bridges, $arch, $machine_type, $bootorder) = @_;
+    my ($vmid, $conf, $devices, $vga, $winversion, $bridges, $arch, $machine_type, $bootorder) = @_;
 
     my $kvm_off = 0;
     my $gpu_passthrough = 0;
     my $legacy_igd = 0;
 
     my $pciaddr;
+    my $pci_devices = choose_hostpci_devices(parse_hostpci_devices($conf), $vmid);
+
     for (my $i = 0; $i < $MAX_HOSTPCI_DEVICES; $i++)  {
 	my $id = "hostpci$i";
-	my $d = parse_hostpci($conf->{$id});
+	my $d = $pci_devices->{$id};
 	next if !$d;
 
+	$legacy_igd = 1 if $d->{'legacy-igd'};
+
 	if (my $pcie = $d->{pcie}) {
-	    die "q35 machine model is not enabled" if !$q35;
 	    # win7 wants to have the pcie devices directly on the pcie bus
 	    # instead of in the root port
 	    if ($winversion == 7) {
@@ -425,29 +590,8 @@ sub print_hostpci_devices {
 	    $pciaddr = print_pci_addr($pci_name, $bridges, $arch, $machine_type);
 	}
 
-	my $pcidevices = $d->{pciid};
-	my $multifunction = @$pcidevices > 1;
-
-	if ($d->{'legacy-igd'}) {
-	    die "only one device can be assigned in legacy-igd mode\n"
-		if $legacy_igd;
-	    $legacy_igd = 1;
-
-	    die "legacy IGD assignment requires VGA mode to be 'none'\n"
-		if !defined($conf->{'vga'}) || $conf->{'vga'} ne 'none';
-	    die "legacy IGD assignment requires rombar to be enabled\n"
-		if defined($d->{rombar}) && !$d->{rombar};
-	    die "legacy IGD assignment is not compatible with x-vga\n"
-		if $d->{'x-vga'};
-	    die "legacy IGD assignment is not compatible with mdev\n"
-		if $d->{mdev};
-	    die "legacy IGD assignment is not compatible with q35\n"
-		if $q35;
-	    die "legacy IGD assignment is not compatible with multifunction devices\n"
-		if $multifunction;
-	    die "legacy IGD assignment only works for devices on host bus 00:02.0\n"
-		if $pcidevices->[0]->{id} !~ m/02\.0$/;
-	}
+	my $num_devices = scalar($d->{ids}->@*);
+	my $multifunction = $num_devices > 1 && !$d->{mdev};
 
 	my $xvga = '';
 	if ($d->{'x-vga'}) {
@@ -458,15 +602,13 @@ sub print_hostpci_devices {
 	}
 
 	my $sysfspath;
-	if ($d->{mdev} && scalar(@$pcidevices) == 1) {
+	if ($d->{mdev}) {
 	    my $uuid = generate_mdev_uuid($vmid, $i);
 	    $sysfspath = "/sys/bus/mdev/devices/$uuid";
-	} elsif ($d->{mdev}) {
-	    warn "ignoring mediated device '$id' with multifunction device\n";
 	}
 
-	my $j = 0;
-	foreach my $pcidevice (@$pcidevices) {
+	for (my $j = 0; $j < $num_devices; $j++) {
+	    my $pcidevice = $d->{ids}->[$j];
 	    my $devicestr = "vfio-pci";
 
 	    if ($sysfspath) {
@@ -489,12 +631,13 @@ sub print_hostpci_devices {
 		}
 	    }
 
+
 	    push @$devices, '-device', $devicestr;
-	    $j++;
+	    last if $d->{mdev};
 	}
     }
 
-    return ($kvm_off, $gpu_passthrough, $legacy_igd);
+    return ($kvm_off, $gpu_passthrough, $legacy_igd, $pci_devices);
 }
 
 sub prepare_pci_device {
@@ -596,6 +739,26 @@ sub reserve_pci_usage {
 			warn "leftover PCI reservation found for $id, lets take it...\n";
 		    }
 		}
+	    } elsif ($reservation) {
+		# already reserved by the same vmid
+		if (my $reserved_time = $reservation->{time}) {
+		    if (defined($timeout)) {
+			# use the longer timeout
+			my $old_timeout = $reservation->{time} - 5 - $ctime;
+			$timeout = $old_timeout if $old_timeout > $timeout;
+		    }
+		} elsif (my $reserved_pid = $reservation->{pid}) {
+		    my $running_pid = PVE::QemuServer::Helpers::vm_running_locally($reservation->{vmid});
+		    if (defined($running_pid) && $running_pid == $reservation->{pid}) {
+			if (defined($pid)) {
+			    die "PCI device '$id' already in use by running VMID '$reservation->{vmid}'\n";
+			} elsif (defined($timeout)) {
+			    # ignore timeout reservation for running vms, can happen with e.g.
+			    # qm showcmd
+			    return;
+			}
+		    }
+		}
 	    }
 
 	    $reservation_list->{$id} = { vmid => $vmid };
diff --git a/test/cfg2cmd/q35-linux-hostpci.conf b/test/cfg2cmd/q35-linux-hostpci.conf
index 749f9839..7290120a 100644
--- a/test/cfg2cmd/q35-linux-hostpci.conf
+++ b/test/cfg2cmd/q35-linux-hostpci.conf
@@ -8,7 +8,7 @@ hostpci1: d0:13.0,pcie=1
 hostpci2: 00:f4.0
 hostpci3: d0:15.1,pcie=1
 hostpci4: d0:17.0,pcie=1,rombar=0
-hostpci7: d0:15.1,pcie=1
+hostpci7: d0:15.2,pcie=1
 machine: q35
 memory: 512
 net0: virtio=2E:01:68:F9:9C:87,bridge=vmbr0
diff --git a/test/cfg2cmd/q35-linux-hostpci.conf.cmd b/test/cfg2cmd/q35-linux-hostpci.conf.cmd
index 0774047d..0fedbd2c 100644
--- a/test/cfg2cmd/q35-linux-hostpci.conf.cmd
+++ b/test/cfg2cmd/q35-linux-hostpci.conf.cmd
@@ -32,7 +32,7 @@
   -device 'pcie-root-port,id=ich9-pcie-port-5,addr=10.0,x-speed=16,x-width=32,multifunction=on,bus=pcie.0,port=5,chassis=5' \
   -device 'vfio-pci,host=0000:d0:17.0,id=hostpci4,bus=ich9-pcie-port-5,addr=0x0,rombar=0' \
   -device 'pcie-root-port,id=ich9-pcie-port-8,addr=10.3,x-speed=16,x-width=32,multifunction=on,bus=pcie.0,port=8,chassis=8' \
-  -device 'vfio-pci,host=0000:d0:15.1,id=hostpci7,bus=ich9-pcie-port-8,addr=0x0' \
+  -device 'vfio-pci,host=0000:d0:15.2,id=hostpci7,bus=ich9-pcie-port-8,addr=0x0' \
   -device 'VGA,id=vga,bus=pcie.0,addr=0x1' \
   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl
index c4966902..dda44d99 100755
--- a/test/run_config2command_tests.pl
+++ b/test/run_config2command_tests.pl
@@ -81,6 +81,7 @@ my $pci_devs = [
     "0000:0f:f2.0",
     "0000:d0:13.0",
     "0000:d0:15.1",
+    "0000:d0:15.2",
     "0000:d0:17.0",
     "0000:f0:42.0",
     "0000:f0:43.0",
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH qemu-server v4 3/6] check_local_resources: extend for mapped resources
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (4 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 2/6] enable cluster mapped PCI " Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 4/6] api: migrate preconditions: use new check_local_resources info Dominik Csapak
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

by adding them to their own list, saving the nodes where
they are not allowed, and return those on 'wantarray' so we don't break
existing callers that don't expect it.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuServer.pm            | 43 ++++++++++++++++++++++++++++++++++--
 test/MigrationTest/Shared.pm | 14 ++++++++++++
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index cfdf9918..ca689002 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -34,6 +34,8 @@ use PVE::DataCenterConfig;
 use PVE::Exception qw(raise raise_param_exc);
 use PVE::Format qw(render_duration render_bytes);
 use PVE::GuestHelpers qw(safe_string_ne safe_num_ne safe_boolean_ne);
+use PVE::Resource::PCI;
+use PVE::Resource::USB;
 use PVE::INotify;
 use PVE::JSONSchema qw(get_standard_option parse_property_string);
 use PVE::ProcFSTools;
@@ -2699,6 +2701,28 @@ sub check_local_resources {
     my ($conf, $noerr) = @_;
 
     my @loc_res = ();
+    my $mapped_res = [];
+
+    my $nodelist = PVE::Cluster::get_nodelist();
+    my $pci_map = PVE::Resource::PCI::config();
+    my $usb_map = PVE::Resource::USB::config();
+
+    my $not_allowed_nodes = { map { $_ => [] } @$nodelist };
+
+    my $add_not_allowed_nodes = sub {
+	my ($type, $key, $id) = @_;
+	for my $node (@$nodelist) {
+	    my $entry;
+	    if ($type eq 'pci') {
+		$entry = PVE::Resource::PCI::get_node_mapping($pci_map, $id, $node);
+	    } elsif ($type eq 'usb') {
+		$entry = PVE::Resource::USB::get_node_mapping($usb_map, $id, $node);
+	    }
+	    if (!scalar($entry->@*)) {
+		push @{$not_allowed_nodes->{$node}}, $key;
+	    }
+	}
+    };
 
     push @loc_res, "hostusb" if $conf->{hostusb}; # old syntax
     push @loc_res, "hostpci" if $conf->{hostpci}; # old syntax
@@ -2706,7 +2730,22 @@ sub check_local_resources {
     push @loc_res, "ivshmem" if $conf->{ivshmem};
 
     foreach my $k (keys %$conf) {
-	next if $k =~ m/^usb/ && ($conf->{$k} =~ m/^spice(?![^,])/);
+	if ($k =~ m/^usb/) {
+	    my $entry = parse_property_string($usb_fmt, $conf->{$k});
+	    my $usb = PVE::QemuServer::USB::parse_usb_device($entry->{host});
+	    next if $usb->{spice};
+	    if ($usb->{mapped}) {
+		$add_not_allowed_nodes->('usb', $k, $entry->{host});
+		push @$mapped_res, $k;
+	    }
+	}
+	if ($k =~ m/^hostpci/) {
+	    my $entry = parse_property_string('pve-qm-hostpci', $conf->{$k});
+	    if ($entry->{host} !~ m/:/) {
+		$add_not_allowed_nodes->('pci', $k, $entry->{host});
+		push @$mapped_res, $k;
+	    }
+	}
 	# sockets are safe: they will recreated be on the target side post-migrate
 	next if $k =~ m/^serial/ && ($conf->{$k} eq 'socket');
 	push @loc_res, $k if $k =~ m/^(usb|hostpci|serial|parallel)\d+$/;
@@ -2714,7 +2753,7 @@ sub check_local_resources {
 
     die "VM uses local resources\n" if scalar @loc_res && !$noerr;
 
-    return \@loc_res;
+    return wantarray ? (\@loc_res, $mapped_res, $not_allowed_nodes) : \@loc_res;
 }
 
 # check if used storages are available on all nodes (use by migrate)
diff --git a/test/MigrationTest/Shared.pm b/test/MigrationTest/Shared.pm
index bd4d20c0..4d95018d 100644
--- a/test/MigrationTest/Shared.pm
+++ b/test/MigrationTest/Shared.pm
@@ -76,6 +76,20 @@ $cluster_module->mock(
     },
 );
 
+our $resource_usb_module = Test::MockModule->new("PVE::Resource::USB");
+$resource_usb_module->mock(
+    config => sub {
+	return {};
+    },
+);
+
+our $resource_pci_module = Test::MockModule->new("PVE::Resource::PCI");
+$resource_pci_module->mock(
+    config => sub {
+	return {};
+    },
+);
+
 our $ha_config_module = Test::MockModule->new("PVE::HA::Config");
 $ha_config_module->mock(
     vm_is_ha_managed => sub {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH qemu-server v4 4/6] api: migrate preconditions: use new check_local_resources info
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (5 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 3/6] check_local_resources: extend for mapped resources Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 5/6] migration: check for mapped resources Dominik Csapak
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

for offline migration, limit the allowed nodes to the ones where the
mapped resources are available

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Qemu.pm | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 68d1b818..e73a35c9 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4319,7 +4319,11 @@ __PACKAGE__->register_method({
 
 	$res->{running} = PVE::QemuServer::check_running($vmid) ? 1:0;
 
-	# if vm is not running, return target nodes where local storage is available
+	my ($local_resources, $mapped_resources, $not_allowed_nodes) =
+	    PVE::QemuServer::check_local_resources($vmconf, 1);
+	delete $not_allowed_nodes->{$localnode};
+
+	# if vm is not running, return target nodes where local storage/mapped devices are available
 	# for offline migration
 	if (!$res->{running}) {
 	    $res->{allowed_nodes} = [];
@@ -4327,7 +4331,12 @@ __PACKAGE__->register_method({
 	    delete $checked_nodes->{$localnode};
 
 	    foreach my $node (keys %$checked_nodes) {
-		if (!defined $checked_nodes->{$node}->{unavailable_storages}) {
+		if (scalar($not_allowed_nodes->{$node}->@*)) {
+		    $checked_nodes->{$node}->{unavailable_resources} = $not_allowed_nodes->{$node};
+		    next;
+		}
+
+		if (!defined($checked_nodes->{$node}->{unavailable_storages})) {
 		    push @{$res->{allowed_nodes}}, $node;
 		}
 
@@ -4335,13 +4344,11 @@ __PACKAGE__->register_method({
 	    $res->{not_allowed_nodes} = $checked_nodes;
 	}
 
-
 	my $local_disks = &$check_vm_disks_local($storecfg, $vmconf, $vmid);
 	$res->{local_disks} = [ values %$local_disks ];;
 
-	my $local_resources =  PVE::QemuServer::check_local_resources($vmconf, 1);
-
 	$res->{local_resources} = $local_resources;
+	$res->{mapped_resources} = $mapped_resources;
 
 	return $res;
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH qemu-server v4 5/6] migration: check for mapped resources
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (6 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 4/6] api: migrate preconditions: use new check_local_resources info Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 6/6] add test for mapped pci devices Dominik Csapak
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

they can only be migrated to nodes where there exists a mapping and if
the migration is done offline

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuMigrate.pm | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 09cc1d83..03847c5a 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -210,15 +210,32 @@ sub prepare {
 	$self->{vm_was_paused} = 1 if PVE::QemuServer::vm_is_paused($vmid);
     }
 
-    my $loc_res = PVE::QemuServer::check_local_resources($conf, 1);
-    if (scalar @$loc_res) {
+    my ($loc_res, $mapped_res, $not_allowed_nodes) = PVE::QemuServer::check_local_resources($conf, 1);
+    my $blocking_resources = [];
+    for my $res ($loc_res->@*) {
+	if (!grep($res, $mapped_res->@*)) {
+	    push $blocking_resources->@*, $res;
+	}
+    }
+    if (scalar($blocking_resources->@*)) {
 	if ($self->{running} || !$self->{opts}->{force}) {
-	    die "can't migrate VM which uses local devices: " . join(", ", @$loc_res) . "\n";
+	    die "can't migrate VM which uses local devices: " . join(", ", $blocking_resources->@*) . "\n";
 	} else {
 	    $self->log('info', "migrating VM which uses local devices");
 	}
     }
 
+    if (scalar($mapped_res->@*)) {
+	my $not_available = $not_allowed_nodes->{$self->{node}};
+	if ($running) {
+	    die "can't migrate running VM which uses mapped devices: " . join(", ", $mapped_res->@*) . "\n";
+	} elsif (scalar($not_available->@*)) {
+	    die "can't migrate to '$self->{node}': missing mapped devices " . join(", ", $not_available->@*) . "\n";
+	} else {
+	    $self->log('info', "migrating VM which uses mapped local devices");
+	}
+    }
+
     my $vollist = PVE::QemuServer::get_vm_volumes($conf);
 
     my $storages = {};
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH qemu-server v4 6/6] add test for mapped pci devices
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (7 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 5/6] migration: check for mapped resources Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH manager v4 1/2] pvesh: fix parameters for proxyto_callback Dominik Csapak
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 test/cfg2cmd/q35-linux-hostpci-mapping.conf   | 17 ++++
 .../q35-linux-hostpci-mapping.conf.cmd        | 36 ++++++++
 test/run_config2command_tests.pl              | 82 +++++++++++++++++++
 3 files changed, 135 insertions(+)
 create mode 100644 test/cfg2cmd/q35-linux-hostpci-mapping.conf
 create mode 100644 test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd

diff --git a/test/cfg2cmd/q35-linux-hostpci-mapping.conf b/test/cfg2cmd/q35-linux-hostpci-mapping.conf
new file mode 100644
index 00000000..2402cf2a
--- /dev/null
+++ b/test/cfg2cmd/q35-linux-hostpci-mapping.conf
@@ -0,0 +1,17 @@
+# TEST: Config with q35, NUMA, hostpci mapping passthrough, EFI & Linux
+bios: ovmf
+bootdisk: scsi0
+cores: 1
+efidisk0: local:100/vm-100-disk-1.qcow2,size=128K
+hostpci0: someNic
+hostpci1: someGpu,mdev=some-model
+hostpci2: someNic
+machine: q35
+memory: 512
+net0: virtio=2E:01:68:F9:9C:87,bridge=vmbr0
+numa: 1
+ostype: l26
+scsihw: virtio-scsi-pci
+smbios1: uuid=3dd750ce-d910-44d0-9493-525c0be4e687
+sockets: 2
+vmgenid: 54d1c06c-8f5b-440f-b5b2-6eab1380e13d
diff --git a/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd b/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd
new file mode 100644
index 00000000..814a9021
--- /dev/null
+++ b/test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd
@@ -0,0 +1,36 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'vm8006,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=3dd750ce-d910-44d0-9493-525c0be4e687' \
+  -drive 'if=pflash,unit=0,format=raw,readonly=on,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' \
+  -drive 'if=pflash,unit=1,id=drive-efidisk0,format=qcow2,file=/var/lib/vz/images/100/vm-100-disk-1.qcow2' \
+  -global 'ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off' \
+  -smp '2,sockets=2,cores=1,maxcpus=2' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 512 \
+  -object 'memory-backend-ram,id=ram-node0,size=256M' \
+  -numa 'node,nodeid=0,cpus=0,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=256M' \
+  -numa 'node,nodeid=1,cpus=1,memdev=ram-node1' \
+  -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg \
+  -device 'vmgenid,guid=54d1c06c-8f5b-440f-b5b2-6eab1380e13d' \
+  -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
+  -device 'vfio-pci,host=0000:07:10.0,id=hostpci0,bus=pci.0,addr=0x10' \
+  -device 'vfio-pci,sysfsdev=/sys/bus/mdev/devices/00000001-0000-0000-0000-000000008006,id=hostpci1,bus=pci.0,addr=0x11' \
+  -device 'vfio-pci,host=0000:07:10.4,id=hostpci2,bus=pci.0,addr=0x1b' \
+  -device 'VGA,id=vga,bus=pcie.0,addr=0x1' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
+  -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=1024,bootindex=300' \
+  -machine 'type=q35+pve0'
diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl
index dda44d99..eaea9047 100755
--- a/test/run_config2command_tests.pl
+++ b/test/run_config2command_tests.pl
@@ -87,8 +87,37 @@ my $pci_devs = [
     "0000:f0:43.0",
     "0000:f0:43.1",
     "1234:f0:43.1",
+    "0000:01:00.4",
+    "0000:01:00.5",
+    "0000:01:00.6",
+    "0000:07:10.0",
+    "0000:07:10.1",
+    "0000:07:10.4",
 ];
 
+my $pci_map_config = {
+    ids => {
+	someGpu => {
+	    type => 'pci',
+	    map => [
+		'node=localhost,path=0000:01:00.4,id=10de:2231,iommugroup=1',
+		'node=localhost,path=0000:01:00.5,id=10de:2231,iommugroup=1',
+		'node=localhost,path=0000:01:00.6,id=10de:2231,iommugroup=1',
+	    ],
+	},
+	someNic => {
+	    type => 'pci',
+	    map => [
+		'node=localhost,path=0000:07:10.0,id=8086:1520,iommugroup=2',
+		'node=localhost,path=0000:07:10.1,id=8086:1520,iommugroup=2',
+		'node=localhost,path=0000:07:10.4,id=8086:1520,iommugroup=2',
+	    ],
+	},
+    },
+};
+
+my $usb_map_config = {},
+
 my $current_test; # = {
 #   description => 'Test description', # if available
 #   qemu_version => '2.12',
@@ -275,6 +304,28 @@ $pve_common_sysfstools->mock(
 	    } sort @$pci_devs
 	];
     },
+    pci_device_info => sub {
+	my ($path, $noerr) = @_;
+
+	if ($path =~ m/^0000:01:00/) {
+	    return {
+		mdev => 1,
+		iommugroup => 1,
+		mdev => 1,
+		vendor => "0x10de",
+		device => "0x2231",
+	    };
+	} elsif ($path =~ m/^0000:07:10/) {
+	    return {
+		iommugroup => 2,
+		mdev => 0,
+		vendor => "0x8086",
+		device => "0x1520",
+	    };
+	} else {
+	    return {};
+	}
+    },
 );
 
 my $qemu_monitor_module;
@@ -303,6 +354,37 @@ $qemu_monitor_module->mock(
 );
 $qemu_monitor_module->mock('qmp_cmd', \&qmp_cmd);
 
+my $resource_usb_module = Test::MockModule->new("PVE::Resource::USB");
+$resource_usb_module->mock(
+    config => sub {
+	return $usb_map_config;
+    },
+);
+
+my $resource_pci_module = Test::MockModule->new("PVE::Resource::PCI");
+$resource_pci_module->mock(
+    config => sub {
+	return $pci_map_config;
+    },
+);
+
+my $pci_module = Test::MockModule->new("PVE::QemuServer::PCI");
+$pci_module->mock(
+    reserve_pci_usage => sub {
+	my ($ids, $vmid, $timeout, $pid, $dryrun) = @_;
+
+	$ids = [$ids] if !ref($ids);
+
+	for my $id (@$ids) {
+	    if ($id eq "0000:07:10.1") {
+		die "reserved";
+	    }
+	}
+
+	return undef;
+    },
+);
+
 sub diff($$) {
     my ($a, $b) = @_;
     return if $a eq $b;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH manager v4 1/2] pvesh: fix parameters for proxyto_callback
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (8 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 6/6] add test for mapped pci devices Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-25 10:17 ` [pve-devel] [PATCH manager v4 2/2] api: add resource map api endpoints for PCI and USB Dominik Csapak
  2023-05-26 16:09 ` [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend DERUMIER, Alexandre
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

in pve-http-server the proxyto_callback always has a complete list of
parameters, not only the ones in the url, so adapt the implementation
here to do the same

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/CLI/pvesh.pm | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/PVE/CLI/pvesh.pm b/PVE/CLI/pvesh.pm
index 9acf292ac..28e2518d5 100755
--- a/PVE/CLI/pvesh.pm
+++ b/PVE/CLI/pvesh.pm
@@ -82,13 +82,15 @@ my $method_map = {
 };
 
 sub check_proxyto {
-    my ($info, $uri_param) = @_;
+    my ($info, $uri_param, $params) = @_;
 
     my $rpcenv = PVE::RPCEnvironment->get();
 
+    my $all_params = { %$uri_param, %$params };
+
     if ($info->{proxyto} || $info->{proxyto_callback}) {
 	my $node = PVE::API2Tools::resolve_proxyto(
-	    $rpcenv, $info->{proxyto_callback}, $info->{proxyto}, $uri_param);
+	    $rpcenv, $info->{proxyto_callback}, $info->{proxyto}, $all_params);
 
 	if ($node ne 'localhost' && ($node ne PVE::INotify::nodename())) {
 	    die "proxy loop detected - aborting\n" if $disable_proxy;
@@ -301,7 +303,7 @@ sub call_api_method {
     }
 
     my $data;
-    my ($node, $remip) = check_proxyto($info, $uri_param);
+    my ($node, $remip) = check_proxyto($info, $uri_param, $param);
     if ($node) {
 	$data = proxy_handler($node, $remip, $path, $cmd, $param);
     } else {
@@ -345,7 +347,7 @@ __PACKAGE__->register_method ({
 
 	my $res;
 
-	my ($node, $remip) = check_proxyto($info, $uri_param);
+	my ($node, $remip) = check_proxyto($info, $uri_param, $param);
 	if ($node) {
 	    $res = proxy_handler($node, $remip, $path, 'ls', $param);
 	} else {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] [PATCH manager v4 2/2] api: add resource map api endpoints for PCI and USB
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (9 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH manager v4 1/2] pvesh: fix parameters for proxyto_callback Dominik Csapak
@ 2023-05-25 10:17 ` Dominik Csapak
  2023-05-26 16:09 ` [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend DERUMIER, Alexandre
  11 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:17 UTC (permalink / raw)
  To: pve-devel

this adds the typical section config crud API calls for
USB and PCI resource mapping to /cluster/resource/{TYPE}

the only special thing that this series does is the list call
for both has a special 'check-node' parameter that uses the
'proxyto_callback' to reroute the api call to the given node
so that it can check the validity of the mapping for that node

in the future when we e.g. broadcast the lspci output via pmxcfs
we drop the proxyto_callback and directly use the info from
pmxcfs (or we drop the parameter and always check all nodes)

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Cluster.pm                |   8 +
 PVE/API2/Cluster/Makefile          |   5 +
 PVE/API2/Cluster/Resource.pm       |  53 +++++
 PVE/API2/Cluster/Resource/Makefile |  18 ++
 PVE/API2/Cluster/Resource/PCI.pm   | 297 +++++++++++++++++++++++++++++
 PVE/API2/Cluster/Resource/USB.pm   | 262 +++++++++++++++++++++++++
 PVE/API2/Hardware.pm               |   1 -
 PVE/API2/Nodes.pm                  |   1 +
 8 files changed, 644 insertions(+), 1 deletion(-)
 create mode 100644 PVE/API2/Cluster/Resource.pm
 create mode 100644 PVE/API2/Cluster/Resource/Makefile
 create mode 100644 PVE/API2/Cluster/Resource/PCI.pm
 create mode 100644 PVE/API2/Cluster/Resource/USB.pm

diff --git a/PVE/API2/Cluster.pm b/PVE/API2/Cluster.pm
index 2e9423685..60e1f5563 100644
--- a/PVE/API2/Cluster.pm
+++ b/PVE/API2/Cluster.pm
@@ -26,6 +26,7 @@ use PVE::API2::ACMEPlugin;
 use PVE::API2::Backup;
 use PVE::API2::Cluster::BackupInfo;
 use PVE::API2::Cluster::Ceph;
+use PVE::API2::Cluster::Resource;
 use PVE::API2::Cluster::Jobs;
 use PVE::API2::Cluster::MetricServer;
 use PVE::API2::ClusterConfig;
@@ -90,6 +91,12 @@ __PACKAGE__->register_method ({
     subclass => "PVE::API2::Cluster::Jobs",
     path => 'jobs',
 });
+
+__PACKAGE__->register_method ({
+    subclass => "PVE::API2::Cluster::Resource",
+    path => 'resource',
+});
+
 if ($have_sdn) {
     __PACKAGE__->register_method ({
        subclass => "PVE::API2::Network::SDN",
@@ -145,6 +152,7 @@ __PACKAGE__->register_method ({
 	    { name => 'options' },
 	    { name => 'replication' },
 	    { name => 'resources' },
+	    { name => 'resource' },
 	    { name => 'status' },
 	    { name => 'tasks' },
 	];
diff --git a/PVE/API2/Cluster/Makefile b/PVE/API2/Cluster/Makefile
index 8d3065078..56b37a0d7 100644
--- a/PVE/API2/Cluster/Makefile
+++ b/PVE/API2/Cluster/Makefile
@@ -1,10 +1,13 @@
 include ../../../defines.mk
 
+SUBDIRS=Resource
+
 # for node independent, cluster-wide applicable, API endpoints
 # ensure we do not conflict with files shipped by pve-cluster!!
 PERLSOURCE= 			\
 	BackupInfo.pm		\
 	MetricServer.pm		\
+	Resource.pm		\
 	Jobs.pm			\
 	Ceph.pm
 
@@ -13,8 +16,10 @@ all:
 .PHONY: clean
 clean:
 	rm -rf *~
+	set -e && for i in ${SUBDIRS}; do ${MAKE} -C $$i $@; done
 
 .PHONY: install
 install: ${PERLSOURCE}
 	install -d ${PERLLIBDIR}/PVE/API2/Cluster
 	install -m 0644 ${PERLSOURCE} ${PERLLIBDIR}/PVE/API2/Cluster
+	set -e && for i in ${SUBDIRS}; do ${MAKE} -C $$i $@; done
diff --git a/PVE/API2/Cluster/Resource.pm b/PVE/API2/Cluster/Resource.pm
new file mode 100644
index 000000000..2c288d8f5
--- /dev/null
+++ b/PVE/API2/Cluster/Resource.pm
@@ -0,0 +1,53 @@
+package PVE::API2::Cluster::Resource;
+
+use strict;
+use warnings;
+
+use PVE::RESTHandler;
+
+use PVE::API2::Cluster::Resource::PCI;
+use PVE::API2::Cluster::Resource::USB;
+
+use base qw(PVE::RESTHandler);
+
+__PACKAGE__->register_method ({
+    subclass => "PVE::API2::Cluster::Resource::PCI",
+    path => 'pci',
+});
+
+__PACKAGE__->register_method ({
+    subclass => "PVE::API2::Cluster::Resource::USB",
+    path => 'usb',
+});
+
+__PACKAGE__->register_method ({
+    name => 'index',
+    path => '',
+    method => 'GET',
+    description => "List resource types.",
+    permissions => {
+	user => 'all',
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {},
+    },
+    returns => {
+	type => 'array',
+	items => {
+	    type => "object",
+	},
+	links => [ { rel => 'child', href => "{name}" } ],
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $result = [
+	    { name => 'pci' },
+	    { name => 'usb' },
+	];
+
+	return $result;
+    }});
+
+1;
diff --git a/PVE/API2/Cluster/Resource/Makefile b/PVE/API2/Cluster/Resource/Makefile
new file mode 100644
index 000000000..720e91549
--- /dev/null
+++ b/PVE/API2/Cluster/Resource/Makefile
@@ -0,0 +1,18 @@
+include ../../../../defines.mk
+
+# for node independent, cluster-wide applicable, API endpoints
+# ensure we do not conflict with files shipped by pve-cluster!!
+PERLSOURCE= 	\
+	PCI.pm	\
+	USB.pm
+
+all:
+
+.PHONY: clean
+clean:
+	rm -rf *~
+
+.PHONY: install
+install: ${PERLSOURCE}
+	install -d ${PERLLIBDIR}/PVE/API2/Cluster/Resource
+	install -m 0644 ${PERLSOURCE} ${PERLLIBDIR}/PVE/API2/Cluster/Resource
diff --git a/PVE/API2/Cluster/Resource/PCI.pm b/PVE/API2/Cluster/Resource/PCI.pm
new file mode 100644
index 000000000..cb641cc4d
--- /dev/null
+++ b/PVE/API2/Cluster/Resource/PCI.pm
@@ -0,0 +1,297 @@
+package PVE::API2::Cluster::Resource::PCI;
+
+use strict;
+use warnings;
+
+use Storable qw(dclone);
+
+use PVE::Cluster qw(cfs_lock_file);
+use PVE::Resource::PCI;
+use PVE::JSONSchema qw(get_standard_option);
+use PVE::Tools qw(extract_param);
+
+use PVE::RESTHandler;
+
+use base qw(PVE::RESTHandler);
+
+__PACKAGE__->register_method ({
+    name => 'index',
+    path => '',
+    method => 'GET',
+    # only proxy if we give the 'check-node' parameter
+    proxyto_callback => sub {
+	my ($rpcenv, $proxyto, $param) = @_;
+	return $param->{'check-node'} // 'localhost';
+    },
+    description => "PCI Hardware Mapping",
+    permissions => {
+	description => "Only lists entries where you have 'Resource.Modify', 'Resource.Use' ".
+	    "permissions on '/resource/pci/<name>'.",
+	user => 'all',
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    'check-node' => get_standard_option('pve-node', {
+		description => "If given, checks the configurations on the given node for ".
+		    "correctness, and adds relevant errors to the devices.",
+		optional => 1,
+	    }),
+	},
+    },
+    returns => {
+	type => 'array',
+	items => {
+	    type => "object",
+	    properties => {
+		id => {
+		    type => 'string',
+		    description => "The logical ID of the resource."
+		},
+		map => {
+		    type => 'array',
+		    description => "The node mappings for the resource.",
+		    items => {
+			type => 'string',
+			description => "A mapping for a node.",
+		    },
+		},
+		description => {
+		    type => 'string',
+		    description => "A description of the logical resource.",
+		},
+		error => {
+		    properties => {
+			severity => {
+			    type => "string",
+			    description => "The severity of the error",
+			},
+			message => {
+			    type => "string",
+			    description => "The message of the error",
+			},
+		    },
+		},
+	    },
+	},
+	links => [ { rel => 'child', href => "{name}" } ],
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+	my $node = $param->{'check-node'};
+
+	die "Wrong node to check\n" if defined($node) && $node ne PVE::INotify::nodename();
+
+	my $cfg = PVE::Resource::PCI::config();
+
+	my $res = [];
+
+	my $privs = ['Resource.Modify', 'Resource.Use'];
+
+	for my $id (keys $cfg->{ids}->%*) {
+	    next if !$rpcenv->check_full($authuser, "/resource/pci/$id", $privs, 1, 1);
+	    next if !$cfg->{ids}->{$id};
+
+	    my $entry = dclone($cfg->{ids}->{$id});
+	    $entry->{id} = $id;
+
+	    if (defined($node)) {
+		$entry->{errors} = [];
+		if (my $mappings = PVE::Resource::PCI::get_node_mapping($cfg, $id, $node)) {
+		    if (!scalar($mappings->@*)) {
+			push $entry->{errors}->@*, {
+			    severity => 'warning',
+			    message => "No mapping for node $node.",
+			};
+		    }
+		    for my $mapping ($mappings->@*) {
+			eval {
+			    PVE::Resource::PCI::assert_valid($id, $mapping);
+			};
+			if (my $err = $@) {
+			    push $entry->{errors}->@*, {
+				severity => 'error',
+				message => "Invalid configuration: $err",
+			    };
+			}
+		    }
+		}
+	    }
+
+	    push @$res, $entry;
+	}
+
+	return $res;
+    },
+});
+
+__PACKAGE__->register_method ({
+    name => 'get',
+    protected => 1,
+    path => '{id}',
+    method => 'GET',
+    description => "GET PCI Resource.",
+    permissions => {
+	check =>['or',
+	    ['perm', '/resource/pci/{name}', ['Resource.Use']],
+	    ['perm', '/resource/pci/{name}', ['Resource.Modify']],
+	],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    id => {
+		type => 'string',
+		format => 'pve-configid',
+	    },
+	}
+    },
+    returns => { type => 'object' },
+    code => sub {
+	my ($param) = @_;
+
+	my $cfg = PVE::Resource::PCI::config();
+	my $name = $param->{id};
+
+	die "mapping '$param->{name}' not found\n" if !defined($cfg->{ids}->{$name});
+
+	my $data = dclone($cfg->{ids}->{$name});
+
+	$data->{digest} = $cfg->{digest};
+
+	return $data;
+    }});
+
+__PACKAGE__->register_method ({
+    name => 'create',
+    protected => 1,
+    path => '',
+    method => 'POST',
+    description => "Create a new hardware mapping.",
+    permissions => {
+	check => ['perm', '/resource/pci/{name}', ['Resource.Modify']],
+    },
+    # todo parameters
+    parameters => PVE::Resource::PCI->createSchema(1),
+    returns => {
+	type => 'null',
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $id = extract_param($param, 'id');
+
+	$param->{map} = [$param->{map}] if defined($param->{map}) && !ref($param->{map});
+
+	my $plugin = PVE::Resource::PCI->lookup('pci');
+	my $opts = $plugin->check_config($id, $param, 1, 1);
+
+	PVE::Resource::PCI::lock_pci_config(sub {
+	    my $cfg = PVE::Resource::PCI::config();
+
+	    die "pci ID '$id' already defined\n" if defined($cfg->{ids}->{$id});
+
+	    $cfg->{ids}->{$id} = $opts;
+
+	    PVE::Resource::PCI::write_pci_config($cfg);
+
+	}, "create hardware mapping failed");
+
+	return;
+    },
+});
+
+__PACKAGE__->register_method ({
+    name => 'update',
+    protected => 1,
+    path => '{id}',
+    method => 'PUT',
+    description => "Update a hardware mapping.",
+    permissions => {
+	check => ['perm', '/resource/pci/{id}', ['Resource.Modify']],
+    },
+    parameters => PVE::Resource::PCI->updateSchema(),
+    returns => {
+	type => 'null',
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $digest = extract_param($param, 'digest');
+	my $delete = extract_param($param, 'delete');
+	my $id = extract_param($param, 'id');
+
+	if ($delete) {
+	    $delete = [ PVE::Tools::split_list($delete) ];
+	}
+
+	$param->{map} = [$param->{map}] if defined($param->{map}) && !ref($param->{map});
+
+	PVE::Resource::PCI::lock_pci_config(sub {
+	    my $cfg = PVE::Resource::PCI::config();
+
+	    PVE::Tools::assert_if_modified($cfg->{digest}, $digest) if defined($digest);
+
+	    die "pci ID '$id' does not exist\n" if !defined($cfg->{ids}->{$id});
+
+	    my $plugin = PVE::Resource::PCI->lookup('pci');
+	    my $opts = $plugin->check_config($id, $param, 1, 1);
+
+	    my $data = $cfg->{ids}->{$id};
+
+	    my $options = $plugin->private()->{options}->{pci};
+	    PVE::SectionConfig::delete_from_config($data, $options, $opts, $delete);
+
+	    $data->{$_} = $opts->{$_} for keys $opts->%*;
+
+	    PVE::Resource::PCI::write_pci_config($cfg);
+
+	}, "update hardware mapping failed");
+
+	return;
+    },
+});
+
+__PACKAGE__->register_method ({
+    name => 'delete',
+    protected => 1,
+    path => '{id}',
+    method => 'DELETE',
+    description => "Remove Hardware Mapping.",
+    permissions => {
+	check => [ 'perm', '/resource/pci/{id}', ['Resource.Modify']],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    id => {
+		type => 'string',
+		format => 'pve-configid',
+	    },
+	}
+    },
+    returns => { type => 'null' },
+    code => sub {
+	my ($param) = @_;
+
+	my $id = $param->{id};
+
+	PVE::Resource::PCI::lock_pci_config(sub {
+	    my $cfg = PVE::Resource::PCI::config();
+
+	    if ($cfg->{ids}->{$id}) {
+		delete $cfg->{ids}->{$id};
+	    }
+
+	    PVE::Resource::PCI::write_pci_config($cfg);
+
+	}, "delete pci mapping failed");
+
+	return;
+    }
+});
+
+1;
diff --git a/PVE/API2/Cluster/Resource/USB.pm b/PVE/API2/Cluster/Resource/USB.pm
new file mode 100644
index 000000000..c47066b69
--- /dev/null
+++ b/PVE/API2/Cluster/Resource/USB.pm
@@ -0,0 +1,262 @@
+package PVE::API2::Cluster::Resource::USB;
+
+use strict;
+use warnings;
+
+use Storable qw(dclone);
+
+use PVE::Cluster qw(cfs_lock_file);
+use PVE::Resource::USB;
+use PVE::JSONSchema qw(get_standard_option);
+use PVE::Tools qw(extract_param);
+
+use PVE::RESTHandler;
+
+use base qw(PVE::RESTHandler);
+
+__PACKAGE__->register_method ({
+    name => 'index',
+    path => '',
+    method => 'GET',
+    description => "USB Hardware Mapping",
+    permissions => {
+	description => "Only lists entries where you have 'Resource.Modify', 'Resource.Use' permissions on '/resource/usb/<name>'.",
+	user => 'all',
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    'check-node' => get_standard_option('pve-node', {
+		description => "If given, checks the configurations on the given node for ".
+		    "correctness, and adds relevant errors to the devices.",
+		optional => 1,
+	    }),
+	},
+    },
+    returns => {
+	type => 'array',
+	items => {
+	    type => "object",
+	    properties => { name => { type => 'string'} },
+	},
+	links => [ { rel => 'child', href => "{name}" } ],
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $rpcenv = PVE::RPCEnvironment::get();
+	my $authuser = $rpcenv->get_user();
+	my $node = $param->{'check-node'};
+
+	die "Wrong node to check\n" if defined($node) && $node ne PVE::INotify::nodename();
+
+	my $cfg = PVE::Resource::USB::config();
+
+	my $res = [];
+
+	my $privs = ['Resource.Modify', 'Resource.Use'];
+
+	for my $id (keys $cfg->{ids}->%*) {
+	    next if !$rpcenv->check_full($authuser, "/resource/usb/$id", $privs, 1, 1);
+	    next if !$cfg->{ids}->{$id};
+
+	    my $entry = dclone($cfg->{ids}->{$id});
+	    $entry->{name} = $id;
+
+	    if (defined($node)) {
+		$entry->{errors} = [];
+		if (my $mappings = PVE::Resource::USB::get_node_mapping($cfg, $id, $node)) {
+		    if (!scalar($mappings->@*)) {
+			push $entry->{errors}->@*, {
+			    severity => 'warning',
+			    message => "No mapping for node $node.",
+			};
+		    }
+		    for my $mapping ($mappings->@*) {
+			eval {
+			    PVE::Resource::USB::assert_valid($id, $mapping);
+			};
+			if (my $err = $@) {
+			    push $entry->{errors}->@*, {
+				severity => 'error',
+				message => "Invalid configuration: $err",
+			    };
+			}
+		    }
+		}
+	    }
+
+	    push @$res, $entry;
+	}
+
+	return $res;
+    },
+});
+
+__PACKAGE__->register_method ({
+    name => 'get',
+    protected => 1,
+    path => '{id}',
+    method => 'GET',
+    description => "GET USB Resource.",
+    permissions => {
+	check =>['or',
+	    ['perm', '/resource/usb/{name}', ['Resource.Use']],
+	    ['perm', '/resource/usb/{name}', ['Resource.Modify']],
+	],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    id => {
+		type => 'string',
+		format => 'pve-configid',
+	    },
+	}
+    },
+    returns => { type => 'object' },
+    code => sub {
+	my ($param) = @_;
+
+	my $cfg = PVE::Resource::USB::config();
+	my $name = $param->{id};
+
+	die "mapping '$param->{name}' not found\n" if !defined($cfg->{ids}->{$name});
+
+	my $data = dclone($cfg->{ids}->{$name});
+
+	$data->{digest} = $cfg->{digest};
+
+	return $data;
+    }});
+
+__PACKAGE__->register_method ({
+    name => 'create',
+    protected => 1,
+    path => '',
+    method => 'POST',
+    description => "Create a new hardware mapping.",
+    permissions => {
+	check => ['perm', '/resource/usb/{name}', ['Resource.Modify']],
+    },
+    # todo parameters
+    parameters => PVE::Resource::USB->createSchema(1),
+    returns => {
+	type => 'null',
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $id = extract_param($param, 'id');
+
+	$param->{map} = [$param->{map}] if defined($param->{map}) && !ref($param->{map});
+
+	my $plugin = PVE::Resource::USB->lookup('usb');
+	my $opts = $plugin->check_config($id, $param, 1, 1);
+
+	PVE::Resource::USB::lock_usb_config(sub {
+	    my $cfg = PVE::Resource::USB::config();
+
+	    die "usb ID '$id' already defined\n" if defined($cfg->{ids}->{$id});
+
+	    $cfg->{ids}->{$id} = $opts;
+
+	    PVE::Resource::USB::write_usb_config($cfg);
+
+	}, "create hardware mapping failed");
+
+	return;
+    },
+});
+
+__PACKAGE__->register_method ({
+    name => 'update',
+    protected => 1,
+    path => '{id}',
+    method => 'PUT',
+    description => "Update a hardware mapping.",
+    permissions => {
+	check => ['perm', '/resource/usb/{id}', ['Resource.Modify']],
+    },
+    parameters => PVE::Resource::USB->updateSchema(),
+    returns => {
+	type => 'null',
+    },
+    code => sub {
+	my ($param) = @_;
+
+	my $digest = extract_param($param, 'digest');
+	my $delete = extract_param($param, 'delete');
+	my $id = extract_param($param, 'id');
+
+	if ($delete) {
+	    $delete = [ PVE::Tools::split_list($delete) ];
+	}
+
+	$param->{map} = [$param->{map}] if defined($param->{map}) && !ref($param->{map});
+
+	PVE::Resource::USB::lock_usb_config(sub {
+	    my $cfg = PVE::Resource::USB::config();
+
+	    PVE::Tools::assert_if_modified($cfg->{digest}, $digest) if defined($digest);
+
+	    die "usb ID '$id' does not exist\n" if !defined($cfg->{ids}->{$id});
+
+	    my $plugin = PVE::Resource::USB->lookup('usb');
+	    my $opts = $plugin->check_config($id, $param, 1, 1);
+
+	    my $data = $cfg->{ids}->{$id};
+
+	    my $options = $plugin->private()->{options}->{usb};
+	    PVE::SectionConfig::delete_from_config($data, $options, $opts, $delete);
+
+	    $data->{$_} = $opts->{$_} for keys $opts->%*;
+
+	    PVE::Resource::USB::write_usb_config($cfg);
+
+	}, "update hardware mapping failed");
+
+	return;
+    },
+});
+
+__PACKAGE__->register_method ({
+    name => 'delete',
+    protected => 1,
+    path => '{id}',
+    method => 'DELETE',
+    description => "Remove Hardware Mapping.",
+    permissions => {
+	check => [ 'perm', '/resource/usb/{id}', ['Resource.Modify']],
+    },
+    parameters => {
+	additionalProperties => 0,
+	properties => {
+	    id => {
+		type => 'string',
+		format => 'pve-configid',
+	    },
+	}
+    },
+    returns => { type => 'null' },
+    code => sub {
+	my ($param) = @_;
+
+	my $id = $param->{id};
+
+	PVE::Resource::USB::lock_usb_config(sub {
+	    my $cfg = PVE::Resource::USB::config();
+
+	    if ($cfg->{ids}->{$id}) {
+		delete $cfg->{ids}->{$id};
+	    }
+
+	    PVE::Resource::USB::write_usb_config($cfg);
+
+	}, "delete usb mapping failed");
+
+	return;
+    }
+});
+
+1;
diff --git a/PVE/API2/Hardware.pm b/PVE/API2/Hardware.pm
index f59bfbe0e..1c6fd8f5c 100644
--- a/PVE/API2/Hardware.pm
+++ b/PVE/API2/Hardware.pm
@@ -21,7 +21,6 @@ __PACKAGE__->register_method ({
     path => 'usb',
 });
 
-
 __PACKAGE__->register_method ({
     name => 'index',
     path => '',
diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
index bfe5c40a1..bf498bedf 100644
--- a/PVE/API2/Nodes.pm
+++ b/PVE/API2/Nodes.pm
@@ -278,6 +278,7 @@ __PACKAGE__->register_method ({
 	    { name => 'query-url-metadata' },
 	    { name => 'replication' },
 	    { name => 'report' },
+	    { name => 'resource-map' },
 	    { name => 'rrd' }, # fixme: remove?
 	    { name => 'rrddata' },# fixme: remove?
 	    { name => 'scan' },
-- 
2.30.2





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pve-devel] [PATCH guest-common v4 1/1] add PCI/USB Resource configs
  2023-05-25 10:17 ` [pve-devel] [PATCH guest-common v4 1/1] add PCI/USB Resource configs Dominik Csapak
@ 2023-05-25 10:40   ` Dominik Csapak
  0 siblings, 0 replies; 15+ messages in thread
From: Dominik Csapak @ 2023-05-25 10:40 UTC (permalink / raw)
  To: pve-devel

On 5/25/23 12:17, Dominik Csapak wrote:
> adds a config file for each type of resource (usb/pci) by using a 'map'
> array propertystring for each node mapping
> 
> in each mapping we save the path(s) and some other information to detect
> hardware changes (if possible) like the vendor/device id
> 
> both configs have custom header parser/formatter to omit the type (since
> we only want one type per config here)
> 
> also each config has some helpers like find_on_current_node
> 
> the resulting config (e.g. for pci) would look like this:
> 
> ---
> some-pci-device
>      description some device
>      map node=node1,path=0000:00:01.0,id=1234:1234,iommugroup=4
>      map node=node2,path=0000:01:01.0,id=1234:1234,iommugroup=5
>      map node=node3,path=0000:02:01.0,id=1234:1234,iommugroup=6
> ---
> 
> some special notes:
> * mdev is a per config entry, since it does not make sense to mix mdev
>    and non-mdev devices
> * there can be multiple pci paths given per node mapping, this is
>    intended to have a different semantic than in the qemu config, namely
>    it will select the first available instead of passing all through as a
>    multifunction device

just noticed that commit message is slightly outdated, a single mapping
has the same semantic as in the qemu config, and when you want multiple mappings
per node there must be multiple entries

> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>   src/Makefile            |   3 +
>   src/PVE/Resource/PCI.pm | 226 ++++++++++++++++++++++++++++++++++++++++
>   src/PVE/Resource/USB.pm | 183 ++++++++++++++++++++++++++++++++
>   3 files changed, 412 insertions(+)
>   create mode 100644 src/PVE/Resource/PCI.pm
>   create mode 100644 src/PVE/Resource/USB.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index 57a360b..d92b441 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -14,6 +14,9 @@ install: PVE
>   	install -m 0644 PVE/Replication.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/StorageTunnel.pm ${PERL5DIR}/PVE/
>   	install -m 0644 PVE/Tunnel.pm ${PERL5DIR}/PVE/
> +	install -d ${PERL5DIR}/PVE/Resource
> +	install -m 0644 PVE/Resource/PCI.pm ${PERL5DIR}/PVE/Resource/
> +	install -m 0644 PVE/Resource/USB.pm ${PERL5DIR}/PVE/Resource/
>   	install -d ${PERL5DIR}/PVE/VZDump
>   	install -m 0644 PVE/VZDump/Plugin.pm ${PERL5DIR}/PVE/VZDump/
>   	install -m 0644 PVE/VZDump/Common.pm ${PERL5DIR}/PVE/VZDump/
> diff --git a/src/PVE/Resource/PCI.pm b/src/PVE/Resource/PCI.pm
> new file mode 100644
> index 0000000..d4d9c44
> --- /dev/null
> +++ b/src/PVE/Resource/PCI.pm
> @@ -0,0 +1,226 @@
> +package PVE::Resource::PCI;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_lock_file cfs_write_file);
> +use PVE::INotify;
> +use PVE::JSONSchema qw(get_standard_option parse_property_string);
> +use PVE::SectionConfig;
> +use PVE::SysFSTools;
> +
> +use base qw(PVE::SectionConfig);
> +
> +my $FILENAME = 'resource/pci.cfg';
> +
> +cfs_register_file($FILENAME,
> +		  sub { __PACKAGE__->parse_config(@_); },
> +		  sub { __PACKAGE__->write_config(@_); });
> +
> +
> +# so we don't have to repeat the type every time
> +sub parse_section_header {
> +    my ($class, $line) = @_;
> +
> +    if ($line =~ m/^(\S+)\s*$/) {
> +	my $id = $1;
> +	my $errmsg = undef; # set if you want to skip whole section
> +	eval { PVE::JSONSchema::pve_verify_configid($id) };
> +	$errmsg = $@ if $@;
> +	my $config = {}; # to return additional attributes
> +	return ('pci', $id, $errmsg, $config);
> +    }
> +    return undef;
> +}
> +
> +sub format_section_header {
> +    my ($class, $type, $sectionId, $scfg, $done_hash) = @_;
> +
> +    return "$sectionId\n";
> +}
> +
> +sub type {
> +    return 'pci';
> +}
> +
> +my $PCI_RE = "[a-f0-9]{4,}:[a-f0-9]{2}:[a-f0-9]{2}(?:\.[a-f0-9])?";
> +
> +my $map_fmt = {
> +    node => get_standard_option('pve-node'),
> +    id =>{
> +	description => "The vendor and device ID that is expected. Used for".
> +	" detecting hardware changes",
> +	type => 'string',
> +	pattern => qr/^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{4}$/,
> +    },
> +    'subsystem-id' => {
> +	description => "The subsystem vendor and device ID that is expected. Used".
> +	" for detecting hardware changes.",
> +	type => 'string',
> +	pattern => qr/^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{4}$/,
> +	optional => 1,
> +    },
> +    path => {
> +	description => "The path to the device. If the function is omitted, the whole device is"
> +	." mapped. In that case use the attributes of the first device. You can give"
> +	." multiple paths as a semicolon seperated list, the first available will then"
> +	." be chosen on guest start.",
> +	type => 'string',
> +	pattern => "(?:${PCI_RE};)*${PCI_RE}",
> +    },
> +    iommugroup => {
> +	type => 'integer',
> +	description => "The IOMMU group in which the device is to be expected in.".
> +	"Used for detecting hardware changes.",
> +	optional => 1,
> +    },
> +    description => {
> +	description => "Description of the node specific device.",
> +	type => 'string',
> +	optional => 1,
> +	maxLength => 4096,
> +    },
> +};
> +
> +my $defaultData = {
> +    propertyList => {
> +	id => {
> +	    type => 'string',
> +	    description => "The ID of the logical PCI resource.",
> +	    format => 'pve-configid',
> +	},
> +	description => {
> +	    description => "Description of the logical PCI device.",
> +	    type => 'string',
> +	    optional => 1,
> +	    maxLength => 4096,
> +	},
> +	mdev => {
> +	    type => 'boolean',
> +	    optional => 1,
> +	},
> +	map => {
> +	    type => 'array',
> +	    description => 'A list of maps for the cluster nodes.',
> +	    optional => 1,
> +	    items => {
> +		type => 'string',
> +		format => $map_fmt,
> +	    },
> +	},
> +    },
> +};
> +
> +sub private {
> +    return $defaultData;
> +}
> +
> +sub options {
> +    return {
> +	description => { optional => 1 },
> +	mdev => { optional => 1 },
> +	map => {},
> +    };
> +}
> +
> +# checks if the given config is valid for the current node
> +sub assert_valid {
> +    my ($name, $cfg) = @_;
> +
> +    my @paths = split(';', $cfg->{path} // '');
> +
> +    my $idx = 0;
> +    for my $path (@paths) {
> +
> +	my $multifunction = 0;
> +	if ($path !~ m/\.[a-f0-9]/i) {
> +	    # whole device, add .0 (must exist)
> +	    $path = "$path.0";
> +	    $multifunction = 1;
> +	}
> +
> +	my $info = PVE::SysFSTools::pci_device_info($path, 1);
> +	die "pci device '$path' not found\n" if !defined($info);
> +
> +	my $correct_props = {
> +	    id => "$info->{vendor}:$info->{device}",
> +	    iommugroup => $info->{iommugroup},
> +	};
> +
> +	if (defined($info->{'subsystem_vendor'}) && defined($info->{'subsystem_device'})) {
> +	    $correct_props->{'subsystem-id'} = "$info->{'subsystem_vendor'}:$info->{'subsystem_device'}";
> +	}
> +
> +	for my $prop (sort keys %$correct_props) {
> +	    next if $prop eq 'iommugroup' && $idx > 0; # check iommu only on the first device
> +
> +	    next if !defined($correct_props->{$prop}) && !defined($cfg->{$prop});
> +	    die "no '$prop' for device '$path'\n"
> +		if defined($correct_props->{$prop}) && !defined($cfg->{$prop});
> +	    die "'$prop' configured but should not be\n"
> +		if !defined($correct_props->{$prop}) && defined($cfg->{$prop});
> +
> +	    my $correct_prop = $correct_props->{$prop};
> +	    $correct_prop =~ s/0x//g;
> +	    my $configured_prop = $cfg->{$prop};
> +	    $configured_prop =~ s/0x//g;
> +
> +	    die "'$prop' does not match for '$name' ($correct_prop != $configured_prop)\n"
> +		if $correct_prop ne $configured_prop;
> +	}
> +	$idx++;
> +    }
> +
> +    return 1;
> +};
> +
> +sub config {
> +    return cfs_read_file($FILENAME);
> +}
> +
> +sub lock_pci_config {
> +    my ($code, $errmsg) = @_;
> +
> +    cfs_lock_file($FILENAME, undef, $code);
> +    if (my $err = $@) {
> +	$errmsg ? die "$errmsg: $err" : die $err;
> +    }
> +}
> +
> +sub write_pci_config {
> +    my ($cfg) = @_;
> +
> +    cfs_write_file($FILENAME, $cfg);
> +}
> +
> +sub find_on_current_node {
> +    my ($id) = @_;
> +
> +    my $cfg = PVE::Resource::PCI::config();
> +    my $node = PVE::INotify::nodename();
> +
> +    # ignore errors
> +    return get_node_mapping($cfg, $id, $node);
> +}
> +
> +sub get_node_mapping {
> +    my ($cfg, $id, $nodename) = @_;
> +
> +    return undef if !defined($cfg->{ids}->{$id});
> +
> +    my $res = [];
> +    for my $map ($cfg->{ids}->{$id}->{map}->@*) {
> +	my $entry = eval { parse_property_string($map_fmt, $map) };
> +	warn $@ if $@;
> +	if ($entry && $entry->{node} eq $nodename) {
> +	    push $res->@*, $entry;
> +	}
> +    }
> +
> +    return $res;
> +}
> +
> +PVE::Resource::PCI->register();
> +PVE::Resource::PCI->init();
> +
> +1;
> diff --git a/src/PVE/Resource/USB.pm b/src/PVE/Resource/USB.pm
> new file mode 100644
> index 0000000..05ea789
> --- /dev/null
> +++ b/src/PVE/Resource/USB.pm
> @@ -0,0 +1,183 @@
> +package PVE::Resource::USB;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_lock_file cfs_write_file);
> +use PVE::INotify;
> +use PVE::JSONSchema qw(get_standard_option parse_property_string);
> +use PVE::SectionConfig;
> +use PVE::SysFSTools;
> +
> +use base qw(PVE::SectionConfig);
> +
> +my $FILENAME = 'resource/usb.cfg';
> +
> +cfs_register_file($FILENAME,
> +		  sub { __PACKAGE__->parse_config(@_); },
> +		  sub { __PACKAGE__->write_config(@_); });
> +
> +
> +# so we don't have to repeat the type every time
> +sub parse_section_header {
> +    my ($class, $line) = @_;
> +
> +    if ($line =~ m/^(\S+)\s*$/) {
> +	my $id = $1;
> +	my $errmsg = undef; # set if you want to skip whole section
> +	eval { PVE::JSONSchema::pve_verify_configid($id) };
> +	$errmsg = $@ if $@;
> +	my $config = {}; # to return additional attributes
> +	return ('usb', $id, $errmsg, $config);
> +    }
> +    return undef;
> +}
> +
> +sub format_section_header {
> +    my ($class, $type, $sectionId, $scfg, $done_hash) = @_;
> +
> +    return "$sectionId\n";
> +}
> +
> +sub type {
> +    return 'usb';
> +}
> +
> +my $map_fmt = {
> +    node => get_standard_option('pve-node'),
> +    'id' => {
> +	description => "The vendor and device ID that is expected. If a USB path".
> +	" is given, it is only used for detecting hardware changes",
> +	type => 'string',
> +	pattern => qr/^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{4}$/,
> +    },
> +    path => {
> +	description => "The path to the usb device.",
> +	type => 'string',
> +	optional => 1,
> +	pattern => qr/^(\d+)\-(\d+(\.\d+)*)$/,
> +    },
> +    description => {
> +	description => "Description of the node specific device.",
> +	type => 'string',
> +	optional => 1,
> +	maxLength => 4096,
> +    },
> +};
> +
> +my $defaultData = {
> +    propertyList => {
> +	id => {
> +	    type => 'string',
> +	    description => "The ID of the logical PCI resource.",
> +	    format => 'pve-configid',
> +	},
> +	description => {
> +	    description => "Description of the logical PCI device.",
> +	    type => 'string',
> +	    optional => 1,
> +	    maxLength => 4096,
> +	},
> +	map => {
> +	    type => 'array',
> +	    description => 'A list of maps for the cluster nodes.',
> +	    items => {
> +		type => 'string',
> +		format => $map_fmt,
> +	    },
> +	},
> +    },
> +};
> +sub private {
> +    return $defaultData;
> +}
> +
> +sub options {
> +    return {
> +	description => { optional => 1 },
> +	map => {},
> +    };
> +}
> +
> +# checks if the given device is valid for the current node
> +sub assert_valid {
> +    my ($name, $cfg) = @_;
> +
> +    my $id = $cfg->{id};
> +
> +    my $usb_list = PVE::SysFSTools::scan_usb();
> +
> +    my $info;
> +    if (my $path = $cfg->{path}) {
> +	for my $dev (@$usb_list) {
> +	    next if !$dev->{usbpath} || !$dev->{busnum};
> +	    my $usbpath = "$dev->{busnum}-$dev->{usbpath}";
> +	    next if $usbpath ne $path;
> +	    $info = $dev;
> +	}
> +	die "usb device '$path' not found\n" if !defined($info);
> +
> +	my $realId = "$info->{vendid}:$info->{prodid}";
> +	die "'id' does not match for '$name' ($realId != $id)\n"
> +	    if $realId ne $id;
> +    } else {
> +	for my $dev (@$usb_list) {
> +	    my $realId = "$dev->{vendid}:$dev->{prodid}";
> +	    next if $realId ne $id;
> +	    $info = $dev;
> +	}
> +	die "usb device '$id' not found\n" if !defined($info);
> +    }
> +
> +    return 1;
> +};
> +
> +sub config {
> +    return cfs_read_file($FILENAME);
> +}
> +
> +sub lock_usb_config {
> +    my ($code, $errmsg) = @_;
> +
> +    cfs_lock_file($FILENAME, undef, $code);
> +    if (my $err = $@) {
> +	$errmsg ? die "$errmsg: $err" : die $err;
> +    }
> +}
> +
> +sub write_usb_config {
> +    my ($cfg) = @_;
> +
> +    cfs_write_file($FILENAME, $cfg);
> +}
> +
> +sub find_on_current_node {
> +    my ($id) = @_;
> +
> +    my $cfg = config();
> +    my $node = PVE::INotify::nodename();
> +
> +    return get_node_mapping($cfg, $id, $node);
> +}
> +
> +sub get_node_mapping {
> +    my ($cfg, $id, $nodename) = @_;
> +
> +    return undef if !defined($cfg->{ids}->{$id});
> +
> +    my $res = [];
> +    for my $map ($cfg->{ids}->{$id}->{map}->@*) {
> +	my $entry = eval { parse_property_string($map_fmt, $map) };
> +	warn $@ if $@;
> +	if ($entry && $entry->{node} eq $nodename) {
> +	    push $res->@*, $entry;
> +	}
> +    }
> +
> +    return $res;
> +}
> +
> +PVE::Resource::USB->register();
> +PVE::Resource::USB->init();
> +
> +1;





^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend
  2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
                   ` (10 preceding siblings ...)
  2023-05-25 10:17 ` [pve-devel] [PATCH manager v4 2/2] api: add resource map api endpoints for PCI and USB Dominik Csapak
@ 2023-05-26 16:09 ` DERUMIER, Alexandre
  11 siblings, 0 replies; 15+ messages in thread
From: DERUMIER, Alexandre @ 2023-05-26 16:09 UTC (permalink / raw)
  To: pve-devel

Thanks for sharing your progress Dominik.

It'll try to retest them with an nvidia mdev vgpu cluster in coming
weeks. 

Le jeudi 25 mai 2023 à 12:17 +0200, Dominik Csapak a écrit :
> this series aims to add a cluster-wide device mapping for pci and usb
> devices. so that an admin can configure a device to be availble for
> migration and configuring for uses that are non-root
> (the existing pattern can be copied easily for other types, e.g.
> markus upcoming folder sharing)
> 
> note that this series requires the array support in api/section
> config[0] but how the api is formed/where the data comes should not
> be
> hard to change
> 
> also since the api changed quite drastically, the gui must be
> adapted,
> and i'm not done with that yet, so sending the backend only for now
> 
> the series is a bigger change to the v3, so a closer look is probably
> warranted
> 
> changes from v3:
> * the configs are now split by type (for ease of use of the section
>   config) and live in pve-guest-common, to avoid a cyclic dependcy
> * the configs are section configs now (with mentioned array support)
> * the api is now only defined in /cluster/resource/{TYPE} and has
>   no nodespecific api anymore, besides a 'check-node' parameter
>   (see the pve-manager patch for more details on that)
> * the internal structure of the pci parsing changed completely,
> making
>   the structure more understandable
> * a single map entry now has the same semantic as the qemu-server
>   hostpci config entry, meaning if you want multiple mappings per
> host,
>   you have to add multiple map entries. this is a more flexible
>   approach, and the parsing code gets a bit simpler
> * combined some properties in the config (e.g. vendor/device) so that
>   we don't have too many
> * squashed some changes together, as they didn't make much sense
>   separately anyway (e.g. api/config patches) and it didn't make
>   reviewing easier
> * changed the ACL paths & privileges to be more general
> * surely some other changes i forgot..
> 
> changes from v2:
> * some bug fixes (e.g use of unitialized variable)
> * don't set mdev for multifunction devices
>   -> this should fix alexandres issue, since it's not possible
> anymore
>   to select a mediated device when having a multifunction device
>   selected
> 
> changes from v1:
> * dropped 'check_hw_perm' (just use 'check_full' now)
> * added some cleanups
> * renamed the buttons in the ui (hopefully better now)
> * added multi device mapping for each host
>   this includes a new 'multi pci' selector for that window, which
>   automatically adds entries for the whole slots which, when
> selected,
>   disabled the selection of the individual functions
> * fixed some issues (e.g. missing entries in the 'caps' object, wrong
>   usb config parsing, etc.)
> 
> changes from the rfc:
> * new cluster wide gui instead of node-local one (removed that, since
>   it's not necessary when we have a cluster-wide one)
> * uses json instead of a section config
> * api is quite different overall, i split the type into its own level
>   for configuring, similar to what we do in pbs
>   (e.g. /nodes/NODENAME/hardware/mapping/usb/)
> * fixed quite some bugs the rfc had
> * added patch for handling the gui with limited permissions better
> * added a 'comment' field for mappings
> 
> dependencies are pretty straight forward this time around (if i'm not
> overlooking something):
> 
> qemu-server/pve-manager -> new access-control/pve-guest-common -> new
> pve-cluster
> 
> 0:
> https://antiphishing.cetsi.fr/proxy/v3?i=SHV0Y1JZQjNyckJFa3dUQiblhF5YcUqtiWCaK_ri0kk&r=T0hnMlUyVEgwNmlmdHc1NSqeTQ1pLQVNn4UvDLnWe4fCxNuytxXrtkvXRfHgEH29SgNUOJTfU-F2je9BBTq-sg&f=V3p0eFlQOUZ4czh2enpJS6vlBYwhEUcOwTmUN-Hu71ZWogcUGH-slS7gYzVrVVB6_wb2zNaC4g2GRLF4nWvKLw&u=https%3A//lists.proxmox.com/pipermail/pve-devel/2023-May/056739.html&k=ZVd0
> 
> pve-cluster:
> 
> Dominik Csapak (1):
>   add cfg files for resource mapping
> 
>  src/PVE/Cluster.pm  | 2 ++
>  src/pmxcfs/status.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> pve-access-control:
> 
> Dominik Csapak (1):
>   add privileges and paths for cluster resource mapping
> 
>  src/PVE/AccessControl.pm  | 20 +++++++++++++++++++-
>  src/PVE/RPCEnvironment.pm |  7 +++++--
>  2 files changed, 24 insertions(+), 3 deletions(-)
> 
> pve-guest-common:
> 
> Dominik Csapak (1):
>   add PCI/USB Resource configs
> 
>  src/Makefile            |   3 +
>  src/PVE/Resource/PCI.pm | 226
> ++++++++++++++++++++++++++++++++++++++++
>  src/PVE/Resource/USB.pm | 183 ++++++++++++++++++++++++++++++++
>  3 files changed, 412 insertions(+)
>  create mode 100644 src/PVE/Resource/PCI.pm
>  create mode 100644 src/PVE/Resource/USB.pm
> 
> qemu-server:
> 
> Dominik Csapak (6):
>   enable cluster mapped USB devices for guests
>   enable cluster mapped PCI devices for guests
>   check_local_resources: extend for mapped resources
>   api: migrate preconditions: use new check_local_resources info
>   migration: check for mapped resources
>   add test for mapped pci devices
> 
>  PVE/API2/Qemu.pm                              | 110 +++++++-
>  PVE/QemuMigrate.pm                            |  23 +-
>  PVE/QemuServer.pm                             | 111 +++++---
>  PVE/QemuServer/PCI.pm                         | 243 +++++++++++++++-
> --
>  PVE/QemuServer/USB.pm                         |  22 +-
>  test/MigrationTest/Shared.pm                  |  14 +
>  test/cfg2cmd/q35-linux-hostpci-mapping.conf   |  17 ++
>  .../q35-linux-hostpci-mapping.conf.cmd        |  36 +++
>  test/cfg2cmd/q35-linux-hostpci.conf           |   2 +-
>  test/cfg2cmd/q35-linux-hostpci.conf.cmd       |   2 +-
>  test/run_config2command_tests.pl              |  83 ++++++
>  11 files changed, 575 insertions(+), 88 deletions(-)
>  create mode 100644 test/cfg2cmd/q35-linux-hostpci-mapping.conf
>  create mode 100644 test/cfg2cmd/q35-linux-hostpci-mapping.conf.cmd
> 
> pve-manager:
> 
> Dominik Csapak (2):
>   pvesh: fix parameters for proxyto_callback
>   api: add resource map api endpoints for PCI and USB
> 
>  PVE/API2/Cluster.pm                |   8 +
>  PVE/API2/Cluster/Makefile          |   5 +
>  PVE/API2/Cluster/Resource.pm       |  53 +++++
>  PVE/API2/Cluster/Resource/Makefile |  18 ++
>  PVE/API2/Cluster/Resource/PCI.pm   | 297
> +++++++++++++++++++++++++++++
>  PVE/API2/Cluster/Resource/USB.pm   | 262 +++++++++++++++++++++++++
>  PVE/API2/Hardware.pm               |   1 -
>  PVE/API2/Nodes.pm                  |   1 +
>  PVE/CLI/pvesh.pm                   |  10 +-
>  9 files changed, 650 insertions(+), 5 deletions(-)
>  create mode 100644 PVE/API2/Cluster/Resource.pm
>  create mode 100644 PVE/API2/Cluster/Resource/Makefile
>  create mode 100644 PVE/API2/Cluster/Resource/PCI.pm
>  create mode 100644 PVE/API2/Cluster/Resource/USB.pm
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [pve-devel] applied: [PATCH cluster v4 1/1] add cfg files for resource mapping
  2023-05-25 10:17 ` [pve-devel] [PATCH cluster v4 1/1] add cfg files for resource mapping Dominik Csapak
@ 2023-06-05  9:11   ` Thomas Lamprecht
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Lamprecht @ 2023-06-05  9:11 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

Am 25/05/2023 um 12:17 schrieb Dominik Csapak:
> resource/pci.cfg and
> resource/usb.cfg
> 
> to PVE/Cluster.pm
> and status.c
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  src/PVE/Cluster.pm  | 2 ++
>  src/pmxcfs/status.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
>

applied, with finalizing our wording for this through s/resource/mapping/, as
talked off-list - thanks!




^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-06-05  9:11 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-25 10:17 [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH cluster v4 1/1] add cfg files for resource mapping Dominik Csapak
2023-06-05  9:11   ` [pve-devel] applied: " Thomas Lamprecht
2023-05-25 10:17 ` [pve-devel] [PATCH access-control v4 1/1] add privileges and paths for cluster " Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH guest-common v4 1/1] add PCI/USB Resource configs Dominik Csapak
2023-05-25 10:40   ` Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 1/6] enable cluster mapped USB devices for guests Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 2/6] enable cluster mapped PCI " Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 3/6] check_local_resources: extend for mapped resources Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 4/6] api: migrate preconditions: use new check_local_resources info Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 5/6] migration: check for mapped resources Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH qemu-server v4 6/6] add test for mapped pci devices Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH manager v4 1/2] pvesh: fix parameters for proxyto_callback Dominik Csapak
2023-05-25 10:17 ` [pve-devel] [PATCH manager v4 2/2] api: add resource map api endpoints for PCI and USB Dominik Csapak
2023-05-26 16:09 ` [pve-devel] [PATCH cluster/access-control/guest-common/qemu-server/manager v4] cluster mapping backend DERUMIER, Alexandre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal