From: Alwin Antreich via pve-user <pve-user@lists.proxmox.com>
To: casati@kona.it
Cc: Alwin Antreich <alwin@antreich.com>,
Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Dell R350, Proxmox VE 8.2.2, sas-megaraid error and system hang
Date: Sat, 10 Aug 2024 09:22:11 +0200
Message-ID: <mailman.171.1723274935.302.pve-user@lists.proxmox.com>
In-Reply-To: <f7806f1a-6604-453b-96c4-d127df37cd17@kona.it>
On August 9, 2024 1:30:22 PM GMT+02:00, Andrea Casati <casati@kona.it> wrote:
>Hello
>
>Dell R350 with PERC H755.
>Tried with kernel 6.8.4, 6.8.8 and 6.5.13.
>System hangs (need to physically power off/on the machine) every day during compressed backup, and sometimes during normal VM usage.
>
>Log with kernel 6.8.4:
>*Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Adapter is OPERATIONAL for scsi:0
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Snap dump wait time : 15
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Reset successful for scsi0.
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: 3296 (774378251s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c
>Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: 3300 (boot + 5s/0x0020/CRIT) - Controller encountered an error and was reset*
>
>Errors on console with kernel 6.5.13:
>*kvm_intel: kvm [2225]: vcpu0, guest rIP: 0xfffff80277d68f93 Unhandled WRMSR(0x1d9) = 0x1*
>*megaraid_sas 0000:01:00.0: FW in FAULT state Fault code:0x10000 subcode:0x0 func:megasas_wait_for_outstanding_fusion*
>
>
>IDRAC reports no errors - Dell support reports no problems.
>
>Has anyone seen something like this before?
Hi Andrea,

I've seen similar issues with other controllers when a faulty disk was present.
And do you have the latest firmware on the controller?

Cheers,
Alwin
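
To see whether the controller faults line up with the backup window or with a particular disk event, it may help to pull only the megaraid_sas fault lines out of a saved kernel log. Below is a minimal sketch of that idea. The function name `find_controller_faults`, the regular expression and the marker list are assumptions made for illustration (the markers are simply copied from the log lines quoted above); none of this is part of any Proxmox or Dell tooling.

```python
import re

# Substrings copied from the quoted log excerpts; an assumed, non-exhaustive list.
FAULT_MARKERS = (
    "Fatal firmware error",
    "FW in FAULT state",
    "Controller encountered an error",
)

# Matches "megaraid_sas <pci address>: <message>" anywhere in a log line.
LINE_RE = re.compile(
    r"megaraid_sas (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): (?P<msg>.*)"
)


def find_controller_faults(lines):
    """Return (pci_address, message) for each megaraid_sas line that contains a fault marker."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and any(marker in match.group("msg") for marker in FAULT_MARKERS):
            hits.append((match.group("pci"), match.group("msg").strip()))
    return hits


# Sample input taken from the log lines quoted in this thread.
sample = [
    r"Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: Reset successful for scsi0.",
    r"Jul 15 19:04:45 r350ve kernel: megaraid_sas 0000:01:00.0: 3296 (774378251s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c",
    r"megaraid_sas 0000:01:00.0: FW in FAULT state Fault code:0x10000 subcode:0x0 func:megasas_wait_for_outstanding_fusion",
]

for pci, msg in find_controller_faults(sample):
    print(pci, msg)
```

Feeding it the lines of a saved kernel log instead of `sample` would give a short list of fault events to compare against the backup schedule.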