From: Gilberto Ferreira <gilberto.nunes32@gmail.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] 6.5.13-3-pve kernel panic on shutdown
Date: Thu, 28 Mar 2024 12:18:19 -0300
Message-ID: <CAOKSTBvmuqr7R8KntqREdoP5f5hE2U+n-m0WTsXsN=wNPWkaKw@mail.gmail.com>
In-Reply-To: <mailman.755.1711637904.434.pve-user@lists.proxmox.com>
Try to update the server firmware.
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram
On Thu, 28 Mar 2024 at 11:58, Stefan Radman via pve-user <pve-user@lists.proxmox.com> wrote:
>
>
>
> ---------- Forwarded message ----------
> From: Stefan Radman <stefan.radman@me.com>
> To: PVE User List <pve-user@pve.proxmox.com>
> Cc:
> Bcc:
> Date: Thu, 28 Mar 2024 15:50:02 +0100
> Subject: 6.5.13-3-pve kernel panic on shutdown
> I recently noticed that a Dell PowerEdge R540 currently running Proxmox VE
> 8.1.8 (kernel 6.5.13-3-pve) throws a kernel panic on shutdown.
>
> The kernel panic is triggered 3-4 seconds after the last network interface
> goes down (onboard BCM5720 LOM), while the system enters S5 (sleep) state.
>
> [84459.970212] bond0: (slave eno1): link status definitely down, disabling slave
> [84459.982170] bond0: (slave eno2): link status definitely down, disabling slave
> [84459.990037] tg3 0000:04:00.0 eno1: left promiscuous mode
> [84459.995822] tg3 0000:04:00.0 eno1: left allmulticast mode
> [84460.001615] bond0: now running without any active interface!
> [84460.018133] vmbr0: port 1(bond0) entered disabled state
> [84460.291379] ACPI: PM: Preparing to enter system sleep state S5
> [84463.685113] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
>
> This is reproducible on every reboot.
>
> R540 and BCM5720 are running the latest firmware available from the Dell
> support website.
>
> Link [2] below seems to suggest that my problem is related to a combination
> of ACPI S5, the tg3 driver and the BCM5720 on-board NIC.
>
> Has anyone else seen this lately (or ever) with Proxmox VE?
>
> Thank you
>
> Stefan
>
> [1] Use ACPI S5 for reboot #1904225: causes reboot crash on Dell T440
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1962730
>
> [2] [SRU][Regression] Revert "PM: ACPI: reboot: Use S5 for reboot" which
> causes Bus Fatal Error when rebooting system with BCM5720 NIC
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471
>
> [3] tg3: Disable tg3 device on system reboot to avoid triggering AER
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2ca1c94ce0b65a2ce7512b718f3d8a0fe6224bca
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/broadcom/tg3.c?id=2ca1c94ce0b65a2ce7512b718f3d8a0fe6224bca#n18074
>
> [4] [PATCH] tg3: Disable tg3 device on system reboot to avoid triggering AER
> https://lore.kernel.org/netdev/CAAd53p7PmEp+vWLz+fGdDntGQ2KqgL54fo86Bpy7oy9tKzXsAg@mail.gmail.com/T/
>
> [5] [v4,2/2] PM: ACPI: reboot: Reinstate S5 for reboot
> https://patches.linaro.org/project/linux-acpi/patch/20220916043319.119716-2-kai.heng.feng@canonical.com/
>
> [6] [PATCH] tg3: add new module param to force device power down on reboot
> https://lore.kernel.org/lkml/d8ed4af1-5c83-4895-9fc3-9aea25724fd9@gmail.com/T/
>
>
> [84458.600189] systemd-shutdown[1]: Syncing filesystems and block devices.
> [84458.607141] systemd-shutdown[1]: Rebooting.
> [84458.612283] spi-nor spi0.0: Software reset failed: -524
> [84459.777370] megaraid_sas 0000:17:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [84459.970212] bond0: (slave eno1): link status definitely down, disabling slave
> [84459.982170] bond0: (slave eno2): link status definitely down, disabling slave
> [84459.990037] tg3 0000:04:00.0 eno1: left promiscuous mode
> [84459.995822] tg3 0000:04:00.0 eno1: left allmulticast mode
> [84460.001615] bond0: now running without any active interface!
> [84460.018133] vmbr0: port 1(bond0) entered disabled state
> [84460.291379] ACPI: PM: Preparing to enter system sleep state S5
> [84463.685113] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
> [84463.685116] {1}[Hardware Error]: event severity: fatal
> [84463.685117] {1}[Hardware Error]: Error 0, type: fatal
> [84463.685119] {1}[Hardware Error]: section_type: PCIe error
> [84463.685120] {1}[Hardware Error]: port_type: 0, PCIe end point
> [84463.685121] {1}[Hardware Error]: version: 3.0
> [84463.685122] {1}[Hardware Error]: command: 0x0002, status: 0x0010
> [84463.685123] {1}[Hardware Error]: device_id: 0000:04:00.1
> [84463.685125] {1}[Hardware Error]: slot: 0
> [84463.685126] {1}[Hardware Error]: secondary_bus: 0x00
> [84463.685127] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
> [84463.685128] {1}[Hardware Error]: class_code: 020000
> [84463.685129] {1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
> [84463.685130] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
> [84463.685131] {1}[Hardware Error]: TLP Header: 40000001 0000010f 90028090 00000000
> [84463.685134] Kernel panic - not syncing: Fatal hardware error!
> [84463.685136] CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: P O 6.5.13-3-pve #1
> [84463.685139] Hardware name: Dell Inc. PowerEdge R540/0VC7DK, BIOS 2.21.1 03/07/2024
> [84463.685140] Call Trace:
> [84463.685142] <NMI>
> …
>
> root@pve:~# pveversion
> pve-manager/8.1.8/d29041d9f87575d0 (running kernel: 6.5.13-3-pve)
> root@pve:~# ethtool -i eno2
> driver: tg3
> version: 6.5.13-3-pve
> firmware-version: FFV22.71.3 bc 5720-v1.39
> expansion-rom-version:
> bus-info: 0000:04:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: no
> root@pve:~# lspci | fgrep 04:00.1
> 04:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
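
The aer_uncor_status and aer_uncor_mask values in the quoted panic log are bit fields. As a reading aid, here is a minimal sketch for decoding them, assuming the standard PCIe AER Uncorrectable Error Status bit layout (the same layout as the PCI_ERR_UNC_* constants in the Linux kernel headers). The table AER_UNCOR_BITS and the function decode_aer_uncor are hypothetical helpers written for this note, not part of any tool mentioned in the thread.

```python
# Assumption: bit positions follow the PCIe AER Uncorrectable Error
# Status register layout (mirrored by PCI_ERR_UNC_* in the Linux kernel).
AER_UNCOR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
    21: "ACS Violation",
}


def decode_aer_uncor(value):
    """Return the names of all bits set in a 32-bit AER uncorrectable word."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits not in the table are reported by number rather than guessed.
            names.append(AER_UNCOR_BITS.get(bit, "unlisted bit %d" % bit))
    return names


if __name__ == "__main__":
    # Values copied from the panic log quoted above.
    print("status:", decode_aer_uncor(0x00100000))
    print("mask:  ", decode_aer_uncor(0x00010000))
```

With the values from the log, the status word 0x00100000 decodes to an Unsupported Request Error reported for 0000:04:00.1, and the mask word 0x00010000 corresponds to Unexpected Completion.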