public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour
@ 2024-11-08  9:32 Dominik Csapak
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Dominik Csapak @ 2024-11-08  9:32 UTC (permalink / raw)
  To: pve-devel

and fix passthrough regressions

As I feared previously in [0], turning errors during sysfs writes into
hard errors uncovered some situations where our code was too strict to
keep some setups working.

One such case is resetting devices, which is seemingly not always
necessary, so this series

* downgrades that error to a warning
* adds some more logging to `file_write` to make debugging easier

Another case that broke was passing through similar devices with the
same vendor/modelid, since the write to vfio-pci's 'new_id' works only
once for the same vendor/modelid.

To fix that, make some errors ignorable for file_write.

changes from v1:
* also include error ignore list
* ignore EEXIST for writing to new_id

0: https://lore.proxmox.com/pve-devel/20240723082925.934603-1-d.csapak@proxmox.com/

pve-common:

Dominik Csapak (2):
  sysfstools: file_write: extend with logging and ignore list
  sysfstools: fix regression on binding to vfio-pci

 src/PVE/SysFSTools.pm | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

qemu-server:

Dominik Csapak (1):
  pci: don't hard require resetting devices for passthrough

 PVE/QemuServer/PCI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH common v2 1/2] sysfstools: file_write: extend with logging and ignore list
  2024-11-08  9:32 [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour Dominik Csapak
@ 2024-11-08  9:32 ` Dominik Csapak
  2024-11-08 10:24   ` Stoiko Ivanov
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 2/2] sysfstools: fix regression on binding to vfio-pci Dominik Csapak
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Dominik Csapak @ 2024-11-08  9:32 UTC (permalink / raw)
  To: pve-devel

The actual error and path are useful to know when trying to debug or
figure out what did not work, so warn here if there was an error.

Also takes now an optional error list that can be ignored. If
encountering such an error, returns success instead of failure.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/PVE/SysFSTools.pm | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
index 0bde6d7..156fee6 100644
--- a/src/PVE/SysFSTools.pm
+++ b/src/PVE/SysFSTools.pm
@@ -211,17 +211,31 @@ sub check_iommu_support{
     return PVE::Tools::dir_glob_regex('/sys/class/iommu/', "[^\.].*");
 }
 
+# writes $buf into $filename
+# returns success when encountering an error from the given $ignore_list, e.g. EEXIST
 sub file_write {
-    my ($filename, $buf) = @_;
+    my ($filename, $buf, $ignore_list) = @_;
 
     my $fh = IO::File->new($filename, "w");
     return undef if !$fh;
 
-    my $res = defined(syswrite($fh, $buf)) ? 1 : 0;
-
+    my $res = syswrite($fh, $buf);
     $fh->close();
 
-    return $res;
+    if (defined($res)) {
+	return 1;
+    } elsif (my $err = $!) {
+	if (defined($ignore_list)) {
+	    for my $to_ignore ($ignore_list->@*) {
+		if ($err == $to_ignore) {
+		    return 1;
+		}
+	    }
+	}
+	warn "error writing '$buf' to '$filename': $err\n";
+    }
+
+    return 0;
 }
 
 sub pci_device_info {
-- 
2.39.5




* [pve-devel] [PATCH common v2 2/2] sysfstools: fix regression on binding to vfio-pci
  2024-11-08  9:32 [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour Dominik Csapak
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
@ 2024-11-08  9:32 ` Dominik Csapak
  2024-11-08 10:26   ` Stoiko Ivanov
  2024-11-08  9:33 ` [pve-devel] [PATCH qemu-server v2 1/1] pci: don't hard require resetting devices for passthrough Dominik Csapak
  2024-11-08 10:28 ` [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour Stoiko Ivanov
  3 siblings, 1 reply; 7+ messages in thread
From: Dominik Csapak @ 2024-11-08  9:32 UTC (permalink / raw)
  To: pve-devel

When starting a VM with passthrough, we have to bind all normal PCI
devices to vfio-pci. This happens by

* unbinding from current driver
* telling vfio-pci the 'vendorid modelid' combo to it know this device
  class can use the driver (by writing to 'new_id')
* actually binding the device to vfio-pci

If there are multiple devices of the same 'vendorid modelid' class on
the host (and passed through), only the first write to 'new_id' is
successful; all subsequent ones fail with EEXIST.

This could happen e.g. for setups with multiple GPUs that have the same
audio chip.

To fix this, ignore the EEXIST error for this write to new_id.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/PVE/SysFSTools.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
index 156fee6..569693e 100644
--- a/src/PVE/SysFSTools.pm
+++ b/src/PVE/SysFSTools.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 
 use IO::File;
+use POSIX qw(EEXIST);
 
 use PVE::Tools qw(file_read_firstline dir_glob_foreach);
 
@@ -317,7 +318,7 @@ sub pci_dev_bind_to_vfio {
     return 1 if -d $testdir;
 
     my $data = "$dev->{vendor} $dev->{device}";
-    return undef if !file_write("$vfio_basedir/new_id", $data);
+    return undef if !file_write("$vfio_basedir/new_id", $data, [EEXIST]);
 
     my $fn = "$pcisysfs/devices/$name/driver/unbind";
     if (!file_write($fn, $name)) {
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v2 1/1] pci: don't hard require resetting devices for passthrough
  2024-11-08  9:32 [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour Dominik Csapak
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 2/2] sysfstools: fix regression on binding to vfio-pci Dominik Csapak
@ 2024-11-08  9:33 ` Dominik Csapak
  2024-11-08 10:28 ` [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour Stoiko Ivanov
  3 siblings, 0 replies; 7+ messages in thread
From: Dominik Csapak @ 2024-11-08  9:33 UTC (permalink / raw)
  To: pve-devel

Since pve-common commit:

 eff5957 (sysfstools: file_write: properly catch errors)

this check here fails now when the reset does not work. It turns out
that resetting the device is not always necessary, and we previously
ignored most errors when trying to do so.

To restore that functionality, downgrade this `die` to a warning.

If the device really needs a reset to work, it will either fail later
during startup, or not work correctly in the guest, but that behavior
existed before and is AFAIK not really detectable from our side.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuServer/PCI.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index 75eac134..dceb8938 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -728,7 +728,7 @@ sub prepare_pci_device {
     } else {
 	die "can't unbind/bind PCI group to VFIO '$pciid'\n"
 	    if !PVE::SysFSTools::pci_dev_group_bind_to_vfio($pciid);
-	die "can't reset PCI device '$pciid'\n"
+	warn "can't reset PCI device '$pciid'\n"
 	    if $info->{has_fl_reset} && !PVE::SysFSTools::pci_dev_reset($info);
     }
 
-- 
2.39.5




* Re: [pve-devel] [PATCH common v2 1/2] sysfstools: file_write: extend with logging and ignore list
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 1/2] sysfstools: file_write: extend with logging and ignore list Dominik Csapak
@ 2024-11-08 10:24   ` Stoiko Ivanov
  0 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-11-08 10:24 UTC (permalink / raw)
  To: Dominik Csapak; +Cc: Proxmox VE development discussion

2 cosmetic nits inline:
On Fri,  8 Nov 2024 10:32:58 +0100
Dominik Csapak <d.csapak@proxmox.com> wrote:

> The actual error and path are useful to know when trying to debug or
> figure out what did not work, so warn here if there was an error.
> 
> Also takes now an optional error list that can be ignored. If
> encountering such an error, returns success instead of failure.
suggestion:
Now also takes an optional list of errors that can be ignored.

> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  src/PVE/SysFSTools.pm | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
> index 0bde6d7..156fee6 100644
> --- a/src/PVE/SysFSTools.pm
> +++ b/src/PVE/SysFSTools.pm
> @@ -211,17 +211,31 @@ sub check_iommu_support{
>      return PVE::Tools::dir_glob_regex('/sys/class/iommu/', "[^\.].*");
>  }
>  
> +# writes $buf into $filename
> +# returns success when encountering an error from the given $ignore_list, e.g. EEXIST
I somehow read this as the only condition under which it returns success
(also not sure whether the comment is needed, or whether this becomes
clear when reading through the code)?

maybe:
# writes $buf into $filename, returns false and warns on errors not listed in the optional $ignore_list
?

>  sub file_write {
> -    my ($filename, $buf) = @_;
> +    my ($filename, $buf, $ignore_list) = @_;
if you do:
$ignore_list //= [];
here...

>  
>      my $fh = IO::File->new($filename, "w");
>      return undef if !$fh;
>  
> -    my $res = defined(syswrite($fh, $buf)) ? 1 : 0;
> -
> +    my $res = syswrite($fh, $buf);
>      $fh->close();
>  
> -    return $res;
> +    if (defined($res)) {
> +	return 1;
> +    } elsif (my $err = $!) {
> +	if (defined($ignore_list)) {
.. this nesting can be omitted.
> +	    for my $to_ignore ($ignore_list->@*) {
> +		if ($err == $to_ignore) {
> +		    return 1;
> +		}
the inner if could be shortened a bit to:
return 1 if ($err == $to_ignore);

> +	    }
> +	}
> +	warn "error writing '$buf' to '$filename': $err\n";
> +    }
> +
> +    return 0;
>  }
>  
>  sub pci_device_info {
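For illustration only (this is not part of the posted patch): with both
nits applied, the function could look roughly like the sketch below.
Saving `$!` before `close()` is an extra precaution assumed here, since
`close()` may itself change `$!`; the posted patch reads `$!` after
closing. The postfix dereference `->@*` needs Perl 5.24 or newer.

```perl
use strict;
use warnings;

use IO::File;
use POSIX qw(EEXIST);

# Sketch only: writes $buf into $filename, returns false and warns on
# errors not listed in the optional $ignore_list.
sub file_write {
    my ($filename, $buf, $ignore_list) = @_;
    $ignore_list //= [];

    my $fh = IO::File->new($filename, "w");
    return undef if !$fh;

    my $res = syswrite($fh, $buf);
    my $err = $!; # assumption: save errno before close() can change it
    $fh->close();

    return 1 if defined($res);

    # errors from the ignore list count as success
    for my $to_ignore ($ignore_list->@*) {
        return 1 if $err == $to_ignore;
    }
    warn "error writing '$buf' to '$filename': $err\n";

    return 0;
}
```

A caller would then pass the ignorable errors as in patch 2/2, e.g.
file_write("$vfio_basedir/new_id", $data, [EEXIST]).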




* Re: [pve-devel] [PATCH common v2 2/2] sysfstools: fix regression on binding to vfio-pci
  2024-11-08  9:32 ` [pve-devel] [PATCH common v2 2/2] sysfstools: fix regression on binding to vfio-pci Dominik Csapak
@ 2024-11-08 10:26   ` Stoiko Ivanov
  0 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-11-08 10:26 UTC (permalink / raw)
  To: Dominik Csapak; +Cc: Proxmox VE development discussion

On Fri,  8 Nov 2024 10:32:59 +0100
Dominik Csapak <d.csapak@proxmox.com> wrote:

> When starting a VM with passthrough, we have to bind all normal PCI
> devices to vfio-pci. This happens by
> 
> * unbinding from current driver
> * telling vfio-pci the 'vendorid modelid' combo to it know this device
s/to it know/so it knows/ ?
>   class can use the driver (by writing to 'new_id')
> * actually binding the device to vfio-pci
> 
> If there are multiple devices of the same 'vendorid modelid' class on
> the host (and passed through), only the first write to 'new_id' is
> successful; all subsequent ones fail with EEXIST.
> 
> This could happen e.g. for setups with multiple GPUs that have the same
> audio chip.
> 
> To fix this, ignore the EEXIST error for this write to new_id.
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  src/PVE/SysFSTools.pm | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/PVE/SysFSTools.pm b/src/PVE/SysFSTools.pm
> index 156fee6..569693e 100644
> --- a/src/PVE/SysFSTools.pm
> +++ b/src/PVE/SysFSTools.pm
> @@ -4,6 +4,7 @@ use strict;
>  use warnings;
>  
>  use IO::File;
> +use POSIX qw(EEXIST);
>  
>  use PVE::Tools qw(file_read_firstline dir_glob_foreach);
>  
> @@ -317,7 +318,7 @@ sub pci_dev_bind_to_vfio {
>      return 1 if -d $testdir;
>  
>      my $data = "$dev->{vendor} $dev->{device}";
> -    return undef if !file_write("$vfio_basedir/new_id", $data);
> +    return undef if !file_write("$vfio_basedir/new_id", $data, [EEXIST]);
>  
>      my $fn = "$pcisysfs/devices/$name/driver/unbind";
>      if (!file_write($fn, $name)) {
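A side note on why the numeric comparison in file_write matches the
[EEXIST] passed here: `$!` is a dualvar, so a saved copy compares
numerically against the POSIX constant while still stringifying to the
error message. A minimal standalone sketch, where the helper name
`errno_ignored` is hypothetical and setting `$!` by hand only simulates
the failed second write to new_id:

```perl
use strict;
use warnings;

use POSIX qw(EEXIST ENOENT);

# hypothetical helper: true if the saved errno is in the ignore list
sub errno_ignored {
    my ($err, $ignore_list) = @_;
    return scalar(grep { $err == $_ } @{ $ignore_list // [] });
}

# simulate the error a second write to new_id would produce
$! = EEXIST;
my $err = $!; # the copy keeps both the numeric and the string value

printf "errno=%d message=%s\n", $err, $err;
print "ignored with [EEXIST]: ", (errno_ignored($err, [EEXIST]) ? "yes" : "no"), "\n";
print "ignored with [ENOENT]: ", (errno_ignored($err, [ENOENT]) ? "yes" : "no"), "\n";
```

Any error that is not in the list still falls through to the warning
and the failure return, so only the duplicate-ID case is silenced.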




* Re: [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour
  2024-11-08  9:32 [pve-devel] [PATCH common/qemu-server v2] improve sysfs write behaviour Dominik Csapak
                   ` (2 preceding siblings ...)
  2024-11-08  9:33 ` [pve-devel] [PATCH qemu-server v2 1/1] pci: don't hard require resetting devices for passthrough Dominik Csapak
@ 2024-11-08 10:28 ` Stoiko Ivanov
  3 siblings, 0 replies; 7+ messages in thread
From: Stoiko Ivanov @ 2024-11-08 10:28 UTC (permalink / raw)
  To: Dominik Csapak; +Cc: Proxmox VE development discussion

gave this another spin on my reproducer for the failing device-reset
- still works fine!


2 cosmetic nits for the patches aside - which from my pov can also be left
as is (or maybe fixed up when applying):
Tested-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Reviewed-by: Stoiko Ivanov <s.ivanov@proxmox.com>

On Fri,  8 Nov 2024 10:32:57 +0100
Dominik Csapak <d.csapak@proxmox.com> wrote:

> and fix passthrough regressions
> 
> As I feared previously in [0], turning errors during sysfs writes into
> hard errors uncovered some situations where our code was too strict to
> keep some setups working.
> 
> One such case is resetting devices, which is seemingly not always
> necessary, so this series
> 
> * downgrades that error to a warning
> * adds some more logging to `file_write` to make debugging easier
> 
> Another case that broke was passing through similar devices with the
> same vendor/modelid, since the write to vfio-pci's 'new_id' works only
> once for the same vendor/modelid.
> 
> To fix that, make some errors ignorable for file_write.
> 
> changes from v1:
> * also include error ignore list
> * ignore EEXIST for writing to new_id
> 
> 0: https://lore.proxmox.com/pve-devel/20240723082925.934603-1-d.csapak@proxmox.com/
> 
> pve-common:
> 
> Dominik Csapak (2):
>   sysfstools: file_write: extend with logging and ignore list
>   sysfstools: fix regression on binding to vfio-pci
> 
>  src/PVE/SysFSTools.pm | 25 ++++++++++++++++++++-----
>  1 file changed, 20 insertions(+), 5 deletions(-)
> 
> qemu-server:
> 
> Dominik Csapak (1):
>   pci: don't hard require resetting devices for passthrough
> 
>  PVE/QemuServer/PCI.pm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 


