public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour
@ 2024-11-11  8:12 Dominik Csapak
  2024-11-11  8:12 ` [pve-devel] [PATCH common v3 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Dominik Csapak @ 2024-11-11  8:12 UTC (permalink / raw)
  To: pve-devel

and fix passthrough regressions

As I feared previously in [0], turning failed sysfs writes into hard
errors uncovered some situations where our code was too strict to keep
some setups working.

One such case is resetting devices, which is seemingly not necessary
at all times, so this series

* downgrades that error to a warning
* adds some more logging to `file_write` to make debugging easier

Another case that broke was passing through similar devices with the
same vendor/modelid since the write to vfio-pci's 'new_id' works only
once for the same vendor/modelid.

To fix that, make some errors ignorable for file_write.

changes from v2:
* improve comment on file_write
* shorten code with suggestions from stoiko
* fix commit message

changes from v1:
* also include error ignore list
* ignore EEXIST for writing to new_id

0: https://lore.proxmox.com/pve-devel/20240723082925.934603-1-d.csapak@proxmox.com/

pve-common:

Dominik Csapak (2):
  sysfstools: file_write: extend with logging and ignore list
  sysfstools: fix regression on binding to vfio-pci

 src/PVE/SysFSTools.pm | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

qemu-server:

Dominik Csapak (1):
  pci: don't hard require resetting devices for passthrough

 PVE/QemuServer/PCI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH common v3 1/2] sysfstools: file_write: extend with logging and ignore list
  2024-11-11  8:12 [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour Dominik Csapak
@ 2024-11-11  8:12 ` Dominik Csapak
  2024-11-11  9:02   ` Thomas Lamprecht
  2024-11-11  8:12 ` [pve-devel] [PATCH common v3 2/2] sysfstools: fix regression on binding to vfio-pci Dominik Csapak
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Dominik Csapak @ 2024-11-11  8:12 UTC (permalink / raw)
  To: pve-devel

The actual error and path are useful to know when trying to debug or
figure out what did not work, so warn here if there was an error.

file_write now also takes an optional list of errors to ignore. If
such an error is encountered, it returns success instead of failure.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v2:
* enhance comment
* improve commit message
* shorten return statement
* remove indentation by default initializing $ignore_list to []

 src/PVE/SysFSTools.pm | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
index 0bde6d7..0aeff5f 100644
--- a/src/PVE/SysFSTools.pm
+++ b/src/PVE/SysFSTools.pm
@@ -211,17 +211,28 @@ sub check_iommu_support{
     return PVE::Tools::dir_glob_regex('/sys/class/iommu/', "[^\.].*");
 }
 
+# writes $buf into $filename, returns false and warns on errors not listed in the optional $ignore_list
+# error to ignore come from the POSIX module e.g. 'EEXIST'
 sub file_write {
-    my ($filename, $buf) = @_;
+    my ($filename, $buf, $ignore_list) = @_;
+    $ignore_list //= [];
 
     my $fh = IO::File->new($filename, "w");
     return undef if !$fh;
 
-    my $res = defined(syswrite($fh, $buf)) ? 1 : 0;
-
+    my $res = syswrite($fh, $buf);
     $fh->close();
 
-    return $res;
+    if (defined($res)) {
+	return 1;
+    } elsif (my $err = $!) {
+	for my $to_ignore ($ignore_list->@*) {
+	    return 1 if $err == $to_ignore;
+	}
+	warn "error writing '$buf' to '$filename': $err\n";
+    }
+
+    return 0;
 }
 
 sub pci_device_info {
-- 
2.39.5




* [pve-devel] [PATCH common v3 2/2] sysfstools: fix regression on binding to vfio-pci
  2024-11-11  8:12 [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour Dominik Csapak
  2024-11-11  8:12 ` [pve-devel] [PATCH common v3 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
@ 2024-11-11  8:12 ` Dominik Csapak
  2024-11-11  8:12 ` [pve-devel] [PATCH qemu-server v3 1/1] pci: don't hard require resetting devices for passthrough Dominik Csapak
  2024-11-11 10:20 ` [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour Dominik Csapak
  3 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2024-11-11  8:12 UTC (permalink / raw)
  To: pve-devel

When starting a VM with passthrough, we have to bind all normal PCI
devices to vfio-pci. This happens by

* unbinding from the current driver
* telling vfio-pci the 'vendorid modelid' combo so it knows this device
  class can use the driver (by writing to 'new_id')
* actually binding the device to vfio-pci

If there are multiple devices of the same 'vendorid modelid' class on
the host (and passed through), only the first write to 'new_id' is
successful; all subsequent ones return EEXIST.

This could happen e.g. for setups with multiple GPUs that have the same
audio chip.

To fix this, ignore the EEXIST error for this write to new_id.
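
The three steps listed above can be sketched roughly as follows. This is
only an illustration, not the code in PVE::SysFSTools: `try_write` and
`bind_to_vfio_sketch` are hypothetical names, the sysfs root is passed in
as a parameter so the sketch can be run against a scratch directory, and
the steps follow the order of the list above (the patched function itself
writes to new_id before unbinding).

```perl
use strict;
use warnings;
use POSIX qw(EEXIST);

# Hypothetical stand-in for a sysfs write: returns 1 on success or when the
# failure's errno is listed in $ignore, 0 otherwise.
sub try_write {
    my ($path, $data, $ignore) = @_;

    open(my $fh, '>', $path) or return 0;
    my $ok = defined(syswrite($fh, $data));
    my $errno = $! + 0;    # capture errno before close() can overwrite it
    close($fh);

    return 1 if $ok;
    return (grep { $errno == $_ } @{ $ignore // [] }) ? 1 : 0;
}

# Sketch of the three steps for one device. $sysfs would be '/sys/bus/pci'
# on a real host (assumed layout); $dev holds name, vendor and device IDs.
sub bind_to_vfio_sketch {
    my ($sysfs, $dev) = @_;
    my $vfio = "$sysfs/drivers/vfio-pci";

    # 1. unbind from the current driver
    try_write("$sysfs/devices/$dev->{name}/driver/unbind", $dev->{name})
        or return 0;

    # 2. register the 'vendor device' pair; a second device with the same
    #    pair gets EEXIST here, which is treated as success
    try_write("$vfio/new_id", "$dev->{vendor} $dev->{device}", [EEXIST])
        or return 0;

    # 3. actually bind the device to vfio-pci
    return try_write("$vfio/bind", $dev->{name});
}
```

Only the new_id write passes an ignore list, so a failing unbind or bind
still makes the whole sequence fail.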

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v2:
* fix typo in commit message

 src/PVE/SysFSTools.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
index 0aeff5f..c0a1b76 100644
--- a/src/PVE/SysFSTools.pm
+++ b/src/PVE/SysFSTools.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 
 use IO::File;
+use POSIX qw(EEXIST);
 
 use PVE::Tools qw(file_read_firstline dir_glob_foreach);
 
@@ -314,7 +315,7 @@ sub pci_dev_bind_to_vfio {
     return 1 if -d $testdir;
 
     my $data = "$dev->{vendor} $dev->{device}";
-    return undef if !file_write("$vfio_basedir/new_id", $data);
+    return undef if !file_write("$vfio_basedir/new_id", $data, [EEXIST]);
 
     my $fn = "$pcisysfs/devices/$name/driver/unbind";
     if (!file_write($fn, $name)) {
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v3 1/1] pci: don't hard require resetting devices for passthrough
  2024-11-11  8:12 [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour Dominik Csapak
  2024-11-11  8:12 ` [pve-devel] [PATCH common v3 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
  2024-11-11  8:12 ` [pve-devel] [PATCH common v3 2/2] sysfstools: fix regression on binding to vfio-pci Dominik Csapak
@ 2024-11-11  8:12 ` Dominik Csapak
  2024-11-11  9:05   ` Thomas Lamprecht
  2024-11-11 10:20 ` [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour Dominik Csapak
  3 siblings, 1 reply; 7+ messages in thread
From: Dominik Csapak @ 2024-11-11  8:12 UTC (permalink / raw)
  To: pve-devel

Since pve-common commit:

 eff5957 (sysfstools: file_write: properly catch errors)

this check here fails now when the reset does not work. It turns out
that resetting the device is not always necessary, and we previously
ignored most errors when trying to do so.

To restore that functionality, downgrade this `die` to a warning.

If the device really needs a reset to work, it will either fail later
during startup, or not work correctly in the guest, but that behavior
existed before and is AFAIK not really detectable from our side.
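
The resulting control flow can be illustrated with a small stand-alone
sketch. The function name and the two code references are hypothetical
stand-ins for the bind and reset helpers, not the qemu-server code: a
failed bind still aborts, while a failed reset is only reported.

```perl
use strict;
use warnings;

# Hypothetical sketch: $bind and $reset are code refs standing in for the
# bind and reset helpers; both return true on success.
sub prepare_device_sketch {
    my ($pciid, $has_reset, $bind, $reset) = @_;

    # failing to bind is still fatal
    die "can't unbind/bind PCI group to VFIO '$pciid'\n"
        if !$bind->($pciid);

    # a failed reset is only reported, preparation continues
    warn "can't reset PCI device '$pciid'\n"
        if $has_reset && !$reset->($pciid);

    return 1;
}
```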

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
no changes from v2

 PVE/QemuServer/PCI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index 75eac134..dceb8938 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -728,7 +728,7 @@ sub prepare_pci_device {
     } else {
 	die "can't unbind/bind PCI group to VFIO '$pciid'\n"
 	    if !PVE::SysFSTools::pci_dev_group_bind_to_vfio($pciid);
-	die "can't reset PCI device '$pciid'\n"
+	warn "can't reset PCI device '$pciid'\n"
 	    if $info->{has_fl_reset} && !PVE::SysFSTools::pci_dev_reset($info);
     }
 
-- 
2.39.5




* Re: [pve-devel] [PATCH common v3 1/2] sysfstools: file_write: extend with logging and ignore list
  2024-11-11  8:12 ` [pve-devel] [PATCH common v3 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
@ 2024-11-11  9:02   ` Thomas Lamprecht
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Lamprecht @ 2024-11-11  9:02 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 11.11.24 at 09:12, Dominik Csapak wrote:
> the actual error and path is useful to know when trying to debug or
> figure out what did not work, so warn here if there was an error.
> 
> Now also takes an optional error list that can be ignored. If
> encountering such an error, returns success instead of failure.
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes from v2:
> * enhance comment
> * improve commit message
> * shorten return statement
> * remove indentation by default initializing $ignore_list to []
> 
>  src/PVE/SysFSTools.pm | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
> index 0bde6d7..0aeff5f 100644
> --- a/src/PVE/SysFSTools.pm
> +++ b/src/PVE/SysFSTools.pm
> @@ -211,17 +211,28 @@ sub check_iommu_support{
>      return PVE::Tools::dir_glob_regex('/sys/class/iommu/', "[^\.].*");
>  }
>  
> +# writes $buf into $filename, returns false and warns on errors not listed in the optional $ignore_list
> +# error to ignore come from the POSIX module e.g. 'EEXIST'
>  sub file_write {
> -    my ($filename, $buf) = @_;
> +    my ($filename, $buf, $ignore_list) = @_;

The parameter name could be better, i.e.: ignore _what_ list?

But actually I'd prefer having an $allow_existing flag; the list is too
generic for my taste, at least for this use case.

Alternatively, return undef, relay $! if it's set, and let the call site
decide; the current version feels like a sort of mixed approach. If the
EEXIST case is common enough, a wrapper here could handle it explicitly.
Note that one can also check $! like a hash, i.e. `$!{EEXIST}`, so there
is no need to include POSIX then.
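
A minimal sketch of that alternative, assuming hypothetical names
(`file_write_sketch`, `register_id_sketch`) that appear in neither patch:
the writer only reports failure and leaves errno in `$!`, and the call
site checks `%!`, which Perl backs with the Errno module automatically.

```perl
use strict;
use warnings;

# Hypothetical variant without an ignore list: on failure, return undef and
# leave the errno in $! for the caller to inspect.
sub file_write_sketch {
    my ($filename, $buf) = @_;

    open(my $fh, '>', $filename) or return undef;
    my $res = syswrite($fh, $buf);
    my $saved = $! + 0;    # close() may overwrite $!
    close($fh);

    return 1 if defined($res);
    $! = $saved;           # restore errno of the failed write
    return undef;
}

# The call site decides which errors are acceptable; using %! needs no
# POSIX import.
sub register_id_sketch {
    my ($path, $ids) = @_;

    return 1 if file_write_sketch($path, $ids);
    return $!{EEXIST} ? 1 : 0;
}
```

With this split, the knowledge that EEXIST is harmless stays next to the
new_id write instead of inside the generic helper.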

> +    $ignore_list //= [];
>  
>      my $fh = IO::File->new($filename, "w");
>      return undef if !$fh;
>  
> -    my $res = defined(syswrite($fh, $buf)) ? 1 : 0;
> -
> +    my $res = syswrite($fh, $buf);
>      $fh->close();
>  
> -    return $res;
> +    if (defined($res)) {
> +	return 1;
> +    } elsif (my $err = $!) {
> +	for my $to_ignore ($ignore_list->@*) {
> +	    return 1 if $err == $to_ignore;
> +	}
> +	warn "error writing '$buf' to '$filename': $err\n";
> +    }
> +
> +    return 0;
>  }
>  
>  sub pci_device_info {




* Re: [pve-devel] [PATCH qemu-server v3 1/1] pci: don't hard require resetting devices for passthrough
  2024-11-11  8:12 ` [pve-devel] [PATCH qemu-server v3 1/1] pci: don't hard require resetting devices for passthrough Dominik Csapak
@ 2024-11-11  9:05   ` Thomas Lamprecht
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Lamprecht @ 2024-11-11  9:05 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 11.11.24 at 09:12, Dominik Csapak wrote:
> Since pve-common commit:
> 
>  eff5957 (sysfstools: file_write: properly catch errors)
> 
> this check here fails now when the reset does not work. It turns out
> that resetting the device is not always necessary, and we previously
> ignored most errors when trying to do so.
> 
> To restore that functionality, downgrade this `die` to a warning.
> 
> If the device really needs a reset to work, it will either fail later
> during startup, or not work correctly in the guest, but that behavior
> existed before and is AFAIK not really detectable from our side.
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> no changes from v2
> 
>  PVE/QemuServer/PCI.pm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
> index 75eac134..dceb8938 100644
> --- a/PVE/QemuServer/PCI.pm
> +++ b/PVE/QemuServer/PCI.pm
> @@ -728,7 +728,7 @@ sub prepare_pci_device {
>      } else {
>  	die "can't unbind/bind PCI group to VFIO '$pciid'\n"
>  	    if !PVE::SysFSTools::pci_dev_group_bind_to_vfio($pciid);
> -	die "can't reset PCI device '$pciid'\n"
> +	warn "can't reset PCI device '$pciid'\n"

Maybe also tell the user what you replied to Stoiko, something like:

"can't reset PCI device '$pciid', trying to continue as for some devices it will still work"

(just better worded, i.e. not written off the top of my head without
knowing the details of why/where reset is odd here).

>  	    if $info->{has_fl_reset} && !PVE::SysFSTools::pci_dev_reset($info);
>      }
>  




* Re: [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour
  2024-11-11  8:12 [pve-devel] [PATCH common/qemu-server v3] improve sysfs write behaviour Dominik Csapak
                   ` (2 preceding siblings ...)
  2024-11-11  8:12 ` [pve-devel] [PATCH qemu-server v3 1/1] pci: don't hard require resetting devices for passthrough Dominik Csapak
@ 2024-11-11 10:20 ` Dominik Csapak
  3 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2024-11-11 10:20 UTC (permalink / raw)
  To: pve-devel

replaced by v4:

https://lore.proxmox.com/pve-devel/20241111101758.1259669-1-d.csapak@proxmox.com/


