public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [RFC storage v2 1/3] api: smart: return unknown health instead of error message
@ 2025-04-15  7:11 Daniel Kral
  2025-04-15  7:11 ` [pve-devel] [PATCH storage v2 2/3] disks: get: separate error path for retrieving SMART data Daniel Kral
  2025-04-15  7:11 ` [pve-devel] [PATCH storage v2 3/3] fix #6224: disks: get: set timeout for retrieval of SMART stat data Daniel Kral
  0 siblings, 2 replies; 3+ messages in thread
From: Daniel Kral @ 2025-04-15  7:11 UTC (permalink / raw)
  To: pve-devel

If retrieving SMART data fails, the SMART data API endpoint currently
returns the error message directly, while the WebGUI expects it to
return a health value. To make this more user-friendly, catch the error
in the API handler and report the health as 'UNKNOWN' instead.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
The current behavior may be intentional, since the error message is
passed to the CLI and API response correctly, which is why this is an
RFC.
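
For illustration only, here is a minimal standalone sketch of the
masking behavior in the handler. get_smart_data_stub() and
smart_health() are hypothetical names, not part of the patch; the stub
always dies to simulate a failing smartctl call.

```perl
use strict;
use warnings;

# Hypothetical stand-in for PVE::Diskmanage::get_smart_data(); it always
# dies to simulate a failing smartctl call.
sub get_smart_data_stub {
    my ($disk, $healthonly) = @_;
    die "Error getting S.M.A.R.T. data: Exit code: 2\n";
}

# Hypothetical wrapper mirroring the three handler lines touched by the patch.
sub smart_health {
    my ($disk, $healthonly) = @_;

    # eval catches the exception; $result stays undef on failure
    my $result = eval { get_smart_data_stub($disk, $healthonly) };

    # dereferencing the undef scalar autovivifies a hash reference,
    # so the fallback value can be stored even after a failure
    $result->{health} = 'UNKNOWN' if !defined $result->{health};
    $result = { health => $result->{health} } if $healthonly;

    return $result;
}

my $res = smart_health('/dev/sda', 1);
print "health: $res->{health}\n"; # prints "health: UNKNOWN"
```

Note that in this sketch the error message in $@ is dropped without
being logged, which is the trade-off the RFC asks about.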

 src/PVE/API2/Disks.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/PVE/API2/Disks.pm b/src/PVE/API2/Disks.pm
index 408bdbe..fe8d08e 100644
--- a/src/PVE/API2/Disks.pm
+++ b/src/PVE/API2/Disks.pm
@@ -216,7 +216,9 @@ __PACKAGE__->register_method ({
 
 	my $disk = PVE::Diskmanage::verify_blockdev_path($param->{disk});
 
-	my $result = PVE::Diskmanage::get_smart_data($disk, $param->{healthonly});
+	my $result = eval {
+	    PVE::Diskmanage::get_smart_data($disk, $param->{healthonly})
+	};
 
 	$result->{health} = 'UNKNOWN' if !defined $result->{health};
 	$result = { health => $result->{health} } if $param->{healthonly};
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] [PATCH storage v2 2/3] disks: get: separate error path for retrieving SMART data
  2025-04-15  7:11 [pve-devel] [RFC storage v2 1/3] api: smart: return unknown health instead of error message Daniel Kral
@ 2025-04-15  7:11 ` Daniel Kral
  2025-04-15  7:11 ` [pve-devel] [PATCH storage v2 3/3] fix #6224: disks: get: set timeout for retrieval of SMART stat data Daniel Kral
  1 sibling, 0 replies; 3+ messages in thread
From: Daniel Kral @ 2025-04-15  7:11 UTC (permalink / raw)
  To: pve-devel

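Make the subroutine get_smart_data() die with the error message raised
while running the `smartctl` command, before the exit code is checked.
Previously, such an error was reported through the exit code check,
which dropped the original message. This is in preparation for the next
patch, which makes that command fail in certain scenarios.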

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
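For illustration only, a standalone sketch of the two error paths after
this change. check_smartctl() is a hypothetical helper and the code ref
$run stands in for the run_command() call; the "got timeout" message is
an assumed example, not taken from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper; $run is a code ref standing in for run_command().
sub check_smartctl {
    my ($run) = @_;

    my $returncode = eval { $run->() };
    # path 1: the command could not be run to completion; keep its message
    die "Error getting S.M.A.R.T. data: $@\n" if $@;

    # path 2: the command ran, but bit 0 or 1 of the exit code is set
    if (defined($returncode) && ($returncode & 0b00000011)) {
        die "Error getting S.M.A.R.T. data: Exit code: $returncode\n";
    }

    return $returncode;
}

# An exception, a fatal exit code (bit 1 set), and a non-fatal exit code
# (only bit 6 set, which the mask ignores).
for my $case (sub { die "got timeout\n" }, sub { 2 }, sub { 64 }) {
    my $rc = eval { check_smartctl($case) };
    print defined($rc) ? "ok, exit code $rc\n" : "failed: $@";
}
```
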
 src/PVE/Diskmanage.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/PVE/Diskmanage.pm b/src/PVE/Diskmanage.pm
index 4272668..0cf7175 100644
--- a/src/PVE/Diskmanage.pm
+++ b/src/PVE/Diskmanage.pm
@@ -148,10 +148,10 @@ sub get_smart_data {
 	    }
 	})
     };
-    my $err = $@;
+    die "Error getting S.M.A.R.T. data: $@\n" if $@;
 
     # bit 0 and 1 mark a fatal error, other bits are for disk status -> ignore (see man 8 smartctl)
-    if ((defined($returncode) && ($returncode & 0b00000011)) || $err) {
+    if (defined($returncode) && ($returncode & 0b00000011)) {
 	die "Error getting S.M.A.R.T. data: Exit code: $returncode\n";
     }
 
-- 
2.39.5




* [pve-devel] [PATCH storage v2 3/3] fix #6224: disks: get: set timeout for retrieval of SMART stat data
  2025-04-15  7:11 [pve-devel] [RFC storage v2 1/3] api: smart: return unknown health instead of error message Daniel Kral
  2025-04-15  7:11 ` [pve-devel] [PATCH storage v2 2/3] disks: get: separate error path for retrieving SMART data Daniel Kral
@ 2025-04-15  7:11 ` Daniel Kral
  1 sibling, 0 replies; 3+ messages in thread
From: Daniel Kral @ 2025-04-15  7:11 UTC (permalink / raw)
  To: pve-devel

In rare scenarios, `smartctl` takes up to 60 seconds to time out while
waiting for SCSI commands to complete, as reported in our user forum [0]
and Bugzilla [1]. USB drives handled by the USB Attached SCSI (UAS)
kernel module seem more likely to be affected by this [2], but it
varies from case to case.

Therefore, set a more reasonable timeout of 10 seconds, so that callers
(e.g. the Node Disks view in the WebGUI) neither wait too long nor
appear unresponsive.

[0] https://forum.proxmox.com/threads/164799/
[1] https://bugzilla.proxmox.com/show_bug.cgi?id=6224
[2] https://www.smartmontools.org/wiki/SAT-with-UAS-Linux

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
Reviewed-by: Max Carrara <m.carrara@proxmox.com>
---
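For illustration only, a generic sketch of bounding a blocking call
with a timeout in plain Perl, so that the caller gets an error instead
of hanging. with_timeout() is a hypothetical helper built on alarm();
it is not claimed to be how run_command() implements its timeout
option.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but die if it takes longer than
# $timeout seconds.
sub with_timeout {
    my ($timeout, $code) = @_;

    my $res = eval {
        local $SIG{ALRM} = sub { die "got timeout\n" };
        alarm($timeout);
        my $r = $code->();
        alarm(0);
        $r;
    };
    my $err = $@;
    alarm(0); # make sure no alarm stays pending on the error path
    die $err if $err;

    return $res;
}

# Simulate a command that blocks for 5 seconds with a 1 second limit.
my $res = eval { with_timeout(1, sub { select(undef, undef, undef, 5); 'done' }) };
print defined($res) ? "result: $res\n" : "error: $@";
```
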
 src/PVE/Diskmanage.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/PVE/Diskmanage.pm b/src/PVE/Diskmanage.pm
index 0cf7175..d4f2692 100644
--- a/src/PVE/Diskmanage.pm
+++ b/src/PVE/Diskmanage.pm
@@ -98,7 +98,7 @@ sub get_smart_data {
     push @$cmd, $disk;
 
     my $returncode = eval {
-	run_command($cmd, noerr => 1, outfunc => sub {
+	run_command($cmd, noerr => 1, timeout => 10, outfunc => sub {
 	    my ($line) = @_;
 
 # ATA SMART attributes, e.g.:
-- 
2.39.5


