public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH qemu-server 0/3] fix #3258: check for in-use pci devices on vm start
@ 2021-10-05 13:11 Dominik Csapak
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE Dominik Csapak
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dominik Csapak @ 2021-10-05 13:11 UTC (permalink / raw)
  To: pve-devel

by having a vmid <-> pciid mapping in /var/run
i did not check if the vm has the pci device really in the config,
but we should not need that, since we remove the reservation again
in the cleanup step.

if wanted we can of course parse the target vms config and check if
the pci device is still configured, or alternatively, ask qmp and or
parse the /proc/PID/cmdline for the pcidevice, but both options seem
too expensive?

Dominik Csapak (3):
  pci: to not capture first group in PCIRE
  pci: add helpers to (un)reserve pciids for a vm
  fix #3258: block vm start when pci device is already in use

 PVE/QemuServer.pm     |  8 ++++
 PVE/QemuServer/PCI.pm | 91 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 98 insertions(+), 1 deletion(-)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE
  2021-10-05 13:11 [pve-devel] [PATCH qemu-server 0/3] fix #3258: check for in-use pci devices on vm start Dominik Csapak
@ 2021-10-05 13:11 ` Dominik Csapak
  2021-10-05 14:13   ` [pve-devel] applied: " Thomas Lamprecht
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 2/3] pci: add helpers to (un)reserve pciids for a vm Dominik Csapak
  2021-10-05 13:12 ` [pve-devel] [PATCH qemu-server 3/3] fix #3258: block vm start when pci device is already in use Dominik Csapak
  2 siblings, 1 reply; 6+ messages in thread
From: Dominik Csapak @ 2021-10-05 13:11 UTC (permalink / raw)
  To: pve-devel

we do not need this group, but want to use the regex where we have
multiple groups, so make it a non-capture group

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuServer/PCI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index 2ee142f..5608207 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -17,7 +17,7 @@ parse_hostpci
 
 our $MAX_HOSTPCI_DEVICES = 16;
 
-my $PCIRE = qr/([a-f0-9]{4}:)?[a-f0-9]{2}:[a-f0-9]{2}(?:\.[a-f0-9])?/;
+my $PCIRE = qr/(?:[a-f0-9]{4}:)?[a-f0-9]{2}:[a-f0-9]{2}(?:\.[a-f0-9])?/;
 my $hostpci_fmt = {
     host => {
 	default_key => 1,
-- 
2.30.2





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH qemu-server 2/3] pci: add helpers to (un)reserve pciids for a vm
  2021-10-05 13:11 [pve-devel] [PATCH qemu-server 0/3] fix #3258: check for in-use pci devices on vm start Dominik Csapak
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE Dominik Csapak
@ 2021-10-05 13:11 ` Dominik Csapak
  2021-10-05 14:13   ` Thomas Lamprecht
  2021-10-05 13:12 ` [pve-devel] [PATCH qemu-server 3/3] fix #3258: block vm start when pci device is already in use Dominik Csapak
  2 siblings, 1 reply; 6+ messages in thread
From: Dominik Csapak @ 2021-10-05 13:11 UTC (permalink / raw)
  To: pve-devel

saves a list of pciid <-> vmid mappings in /var/run
that we can check when we start a vm

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuServer/PCI.pm | 89 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index 5608207..0626b76 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -5,6 +5,7 @@ use strict;
 
 use PVE::JSONSchema;
 use PVE::SysFSTools;
+use PVE::Tools;
 
 use base 'Exporter';
 
@@ -461,4 +462,92 @@ sub print_hostpci_devices {
     return ($kvm_off, $gpu_passthrough, $legacy_igd);
 }
 
+my $PCIID_RESERVATION_FILE = "/var/run/pve-reserved-pciids";
+my $PCIID_RESERVATION_LCK = "/var/lock/pve-reserved-pciids.lck";
+
+my $parse_pci_reservation = sub {
+    my $pciids = {};
+
+    if (my $fh = IO::File->new ($PCIID_RESERVATION_FILE, "r")) {
+	while (my $line = <$fh>) {
+	    if ($line =~ m/^($PCIRE)\s(\d+)\s(\d+)$/) {
+		$pciids->{$1} = {
+		    timestamp => $2,
+		    vmid => $3,
+		};
+	    }
+	}
+    }
+
+    return $pciids;
+};
+
+my $write_pci_reservation = sub {
+    my ($pciids) = @_;
+
+    my $data = "";
+    foreach my $p (keys %$pciids) {
+	$data .= "$p $pciids->{$p}->{timestamp} $pciids->{$p}->{vmid}\n";
+    }
+
+    PVE::Tools::file_set_contents($PCIID_RESERVATION_FILE, $data);
+};
+
+sub remove_pci_reservation {
+    my ($id) = @_;
+
+    my $code = sub {
+	my $pciids = $parse_pci_reservation->();
+
+	delete $pciids->{$id};
+
+	$write_pci_reservation->($pciids);
+    };
+
+    PVE::Tools::lock_file($PCIID_RESERVATION_LCK, 10, $code);
+    die $@ if $@;
+
+    return;
+}
+
+sub reserve_pci_usage {
+    my ($id, $vmid) = @_;
+
+    my $code = sub {
+
+	# have a 60 second grace period, since we reserve before
+	# we actually start the vm
+	my $graceperiod = 60;
+	my $ctime = time();
+
+	my $pciids = $parse_pci_reservation->();
+
+	if (my $pciid = $pciids->{$id}) {
+	    if ($pciid->{vmid} == $vmid) {
+		return; # already reserved
+	    }
+
+	    if (($pciid->{timestamp} + $graceperiod > $ctime) ||
+		PVE::QemuServer::Helpers::vm_running_locally($vmid))
+	    {
+		die "PCI device '$id' already in use\n";
+	    }
+	}
+
+	$pciids->{$id} = {
+	    timestamp => $ctime,
+	    vmid => $vmid,
+	};
+
+	$write_pci_reservation->($pciids);
+
+	return;
+    };
+
+    PVE::Tools::lock_file($PCIID_RESERVATION_LCK, 10, $code);
+    die $@ if $@;
+
+    return;
+}
+
 1;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] [PATCH qemu-server 3/3] fix #3258: block vm start when pci device is already in use
  2021-10-05 13:11 [pve-devel] [PATCH qemu-server 0/3] fix #3258: check for in-use pci devices on vm start Dominik Csapak
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE Dominik Csapak
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 2/3] pci: add helpers to (un)reserve pciids for a vm Dominik Csapak
@ 2021-10-05 13:12 ` Dominik Csapak
  2 siblings, 0 replies; 6+ messages in thread
From: Dominik Csapak @ 2021-10-05 13:12 UTC (permalink / raw)
  To: pve-devel

on vm start, we reserve all pciids that we use, and
remove the reservation again in vm_stop_cleanup

this way, when a vm starts with a pci device that is already configured
for a different running vm, will not be started and the user gets
the error that the device is already in use

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuServer.pm | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 076ce59..1e8cd53 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5365,6 +5365,13 @@ sub vm_start_nolock {
       my $d = parse_hostpci($conf->{"hostpci$i"});
       next if !$d;
       my $pcidevices = $d->{pciid};
+
+      # reserve all pciids
+      foreach my $pcidevice (@$pcidevices) {
+	  my $pciid = $pcidevice->{id};
+	  PVE::QemuServer::PCI::reserve_pci_usage($pciid, $vmid);
+      }
+
       foreach my $pcidevice (@$pcidevices) {
 	    my $pciid = $pcidevice->{id};
 
@@ -5676,6 +5683,7 @@ sub vm_stop_cleanup {
 	    foreach my $pci (@{$d->{pciid}}) {
 		my $pciid = $pci->{id};
 		PVE::SysFSTools::pci_cleanup_mdev_device($pciid, $uuid);
+		PVE::QemuServer::PCI::remove_pci_reservation($pciid);
 	    }
 	}
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 2/3] pci: add helpers to (un)reserve pciids for a vm
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 2/3] pci: add helpers to (un)reserve pciids for a vm Dominik Csapak
@ 2021-10-05 14:13   ` Thomas Lamprecht
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2021-10-05 14:13 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 05.10.21 15:11, Dominik Csapak wrote:
> saves a list of pciid <-> vmid mappings in /var/run
> that we can check when we start a vm

a few style nits but also one serious note inline

> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  PVE/QemuServer/PCI.pm | 89 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 89 insertions(+)
> 
> diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
> index 5608207..0626b76 100644
> --- a/PVE/QemuServer/PCI.pm
> +++ b/PVE/QemuServer/PCI.pm
> @@ -5,6 +5,7 @@ use strict;
>  
>  use PVE::JSONSchema;
>  use PVE::SysFSTools;
> +use PVE::Tools;
>  
>  use base 'Exporter';
>  
> @@ -461,4 +462,92 @@ sub print_hostpci_devices {
>      return ($kvm_off, $gpu_passthrough, $legacy_igd);
>  }
>  
> +my $PCIID_RESERVATION_FILE = "/var/run/pve-reserved-pciids";
> +my $PCIID_RESERVATION_LCK = "/var/lock/pve-reserved-pciids.lck";
> +
> +my $parse_pci_reservation = sub {
> +    my $pciids = {};
> +
> +    if (my $fh = IO::File->new ($PCIID_RESERVATION_FILE, "r")) {
> +	while (my $line = <$fh>) {
> +	    if ($line =~ m/^($PCIRE)\s(\d+)\s(\d+)$/) {
> +		$pciids->{$1} = {
> +		    timestamp => $2,
> +		    vmid => $3,
> +		};
> +	    }
> +	}
> +    }
> +
> +    return $pciids;
> +};
> +
> +my $write_pci_reservation = sub {
> +    my ($pciids) = @_;
> +
> +    my $data = "";
> +    foreach my $p (keys %$pciids) {

prefer for over foreach

> +	$data .= "$p $pciids->{$p}->{timestamp} $pciids->{$p}->{vmid}\n";
> +    }

my $data = join("\n", map { "$_ $pciids->{$_}->{timestamp} $pciids->{$_}->{vmid}" }, keys $pciids->%*);

> +
> +    PVE::Tools::file_set_contents($PCIID_RESERVATION_FILE, $data);
> +};
> +
> +sub remove_pci_reservation {
> +    my ($id) = @_;
> +
> +    my $code = sub {
> +	my $pciids = $parse_pci_reservation->();
> +
> +	delete $pciids->{$id};
> +
> +	$write_pci_reservation->($pciids);
> +    };
> +
> +    PVE::Tools::lock_file($PCIID_RESERVATION_LCK, 10, $code);

IMO it has some benefit for passing the closure directly, less lines and slightly
more obvious about the locking (as at least I read methods from top to bottom):

PVE::Tools::lock_file($PCIID_RESERVATION_LCK, 10, sub {
    my $pciids = $parse_pci_reservation->();
    ...
});

but we have no clear style guide regarding this and def. use both variants, so no
hard feelings here.

> +    die $@ if $@;
> +
> +    return;
> +}
> +
> +sub reserve_pci_usage {
> +    my ($id, $vmid) = @_;
> +
> +    my $code = sub {
> +
> +	# have a 60 second grace period, since we reserve before
> +	# we actually start the vm

huh, whats the use on that? so I can "steal" PCI devices the first 60s, feels weird...

Why not either:
* catch any start error somewhere centrally and clear the reservation in that
  case again, a kill/crash could still result into false-positives though
* save timestamp now and then once we know it the PID of the VM as third
  param, VMID + PID are quite good in being resistent against PID-reuse
  and an future start could check if the process still lives to decide
  if the reservation is still valid

> +	my $graceperiod = 60;
> +	my $ctime = time();
> +
> +	my $pciids = $parse_pci_reservation->();
> +
> +	if (my $pciid = $pciids->{$id}) {
> +	    if ($pciid->{vmid} == $vmid) {
> +		return; # already reserved
> +	    }

I'd prefer a onliner for above, less lines/noise while not yet being
code-golfy, so easier to read IMO. i.e.:

return if $pciid->{vmid} == $vmid; # already reserved

> +
> +	    if (($pciid->{timestamp} + $graceperiod > $ctime) ||
> +		PVE::QemuServer::Helpers::vm_running_locally($vmid))
> +	    {

style nit², we (nowadays) normally place the if's closing ) also on the new
line:

if (($pciid->{timestamp} + $graceperiod > $ctime) ||
    PVE::QemuServer::Helpers::vm_running_locally($vmid)
) {
    ....
}

honestly I'd like it 1000% more the rust way, but well..




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [pve-devel] applied: [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE
  2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE Dominik Csapak
@ 2021-10-05 14:13   ` Thomas Lamprecht
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Lamprecht @ 2021-10-05 14:13 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 05.10.21 15:11, Dominik Csapak wrote:
> we do not need this group, but want to use the regex where we have
> multiple groups, so make it a non-capture group
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  PVE/QemuServer/PCI.pm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
>

applied, thanks!




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-05 14:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-05 13:11 [pve-devel] [PATCH qemu-server 0/3] fix #3258: check for in-use pci devices on vm start Dominik Csapak
2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 1/3] pci: to not capture first group in PCIRE Dominik Csapak
2021-10-05 14:13   ` [pve-devel] applied: " Thomas Lamprecht
2021-10-05 13:11 ` [pve-devel] [PATCH qemu-server 2/3] pci: add helpers to (un)reserve pciids for a vm Dominik Csapak
2021-10-05 14:13   ` Thomas Lamprecht
2021-10-05 13:12 ` [pve-devel] [PATCH qemu-server 3/3] fix #3258: block vm start when pci device is already in use Dominik Csapak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal