From: Gilberto Ferreira <gilberto.nunes32@gmail.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] 6.5.13-3-pve kernel panic on shutdown
Date: Thu, 28 Mar 2024 12:18:19 -0300
Message-ID: <CAOKSTBvmuqr7R8KntqREdoP5f5hE2U+n-m0WTsXsN=wNPWkaKw@mail.gmail.com>
In-Reply-To: <mailman.755.1711637904.434.pve-user@lists.proxmox.com>
Try to update the server firmware.
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram
On Thu, 28 Mar 2024 at 11:58, Stefan Radman via pve-user
<pve-user@lists.proxmox.com> wrote:
>
>
>
> ---------- Forwarded message ----------
> From: Stefan Radman <stefan.radman@me.com>
> To: PVE User List <pve-user@pve.proxmox.com>
> Cc:
> Bcc:
> Date: Thu, 28 Mar 2024 15:50:02 +0100
> Subject: 6.5.13-3-pve kernel panic on shutdown
> I recently noticed that a Dell PowerEdge R540 currently running Proxmox VE
> 8.1.8 (kernel 6.5.13-3-pve) throws a kernel panic on shutdown.
>
> The kernel panic is triggered 3-4 seconds after the last network interface
> goes down (onboard BCM5720 LOM), while the system enters S5 (sleep) state.
>
> [84459.970212] bond0: (slave eno1): link status definitely down, disabling slave
> [84459.982170] bond0: (slave eno2): link status definitely down, disabling slave
> [84459.990037] tg3 0000:04:00.0 eno1: left promiscuous mode
> [84459.995822] tg3 0000:04:00.0 eno1: left allmulticast mode
> [84460.001615] bond0: now running without any active interface!
> [84460.018133] vmbr0: port 1(bond0) entered disabled state
> [84460.291379] ACPI: PM: Preparing to enter system sleep state S5
> [84463.685113] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
>
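
The "3-4 seconds" figure can be checked against the timestamps in the
excerpt. A minimal sketch, assuming the usual dmesg "[seconds.micros]"
prefix; the helper names are made up for illustration and the log lines
are copied from the quoted output with wrapped lines rejoined:

```python
import re
from decimal import Decimal

# Lines copied from the quoted excerpt (wrapped lines rejoined).
LOG = """\
[84460.001615] bond0: now running without any active interface!
[84460.018133] vmbr0: port 1(bond0) entered disabled state
[84460.291379] ACPI: PM: Preparing to enter system sleep state S5
[84463.685113] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
"""

LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_events(log):
    """Return (timestamp, message) pairs for every dmesg-style line."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            # Decimal keeps the microsecond digits exact.
            events.append((Decimal(match.group(1)), match.group(2)))
    return events


def first_time(events, needle):
    """Timestamp of the first message containing `needle`."""
    for stamp, message in events:
        if needle in message:
            return stamp
    raise ValueError(f"no log line contains {needle!r}")


events = parse_events(LOG)
bridge_down_at = first_time(events, "entered disabled state")
s5_at = first_time(events, "sleep state S5")
error_at = first_time(events, "Hardware error from APEI")

print("bridge port down -> hardware error:", error_at - bridge_down_at, "s")
print("S5 preparation   -> hardware error:", error_at - s5_at, "s")
```

That gives roughly 3.7 s after the bridge port is disabled and roughly
3.4 s after the kernel starts preparing for S5.
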
> This is reproducible on every reboot.
>
> R540 and BCM5720 are running the latest firmware available from the Dell
> support website.
>
> Link [2] below seems to suggest that my problem is related to a combination
> of ACPI S5, the tg3 driver and the BCM5720 on-board NIC.
>
> Has anyone else seen this lately (or ever) with Proxmox VE?
>
> Thank you
>
> Stefan
>
> [1] Use ACPI S5 for reboot #1904225: causes reboot crash on Dell T440
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1962730
>
> [2] [SRU][Regression] Revert "PM: ACPI: reboot: Use S5 for reboot" which
> causes Bus Fatal Error when rebooting system with BCM5720 NIC
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471
>
> [3] tg3: Disable tg3 device on system reboot to avoid triggering AER
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2ca1c94ce0b65a2ce7512b718f3d8a0fe6224bca
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/broadcom/tg3.c?id=2ca1c94ce0b65a2ce7512b718f3d8a0fe6224bca#n18074
>
> [4] [PATCH] tg3: Disable tg3 device on system reboot to avoid triggering AER
> https://lore.kernel.org/netdev/CAAd53p7PmEp+vWLz+fGdDntGQ2KqgL54fo86Bpy7oy9tKzXsAg@mail.gmail.com/T/
>
> [5] [v4,2/2] PM: ACPI: reboot: Reinstate S5 for reboot
> https://patches.linaro.org/project/linux-acpi/patch/20220916043319.119716-2-kai.heng.feng@canonical.com/
>
> [6] [PATCH] tg3: add new module param to force device power down on reboot
> https://lore.kernel.org/lkml/d8ed4af1-5c83-4895-9fc3-9aea25724fd9@gmail.com/T/
>
>
> [84458.600189] systemd-shutdown[1]: Syncing filesystems and block devices.
> [84458.607141] systemd-shutdown[1]: Rebooting.
> [84458.612283] spi-nor spi0.0: Software reset failed: -524
> [84459.777370] megaraid_sas 0000:17:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [84459.970212] bond0: (slave eno1): link status definitely down, disabling slave
> [84459.982170] bond0: (slave eno2): link status definitely down, disabling slave
> [84459.990037] tg3 0000:04:00.0 eno1: left promiscuous mode
> [84459.995822] tg3 0000:04:00.0 eno1: left allmulticast mode
> [84460.001615] bond0: now running without any active interface!
> [84460.018133] vmbr0: port 1(bond0) entered disabled state
> [84460.291379] ACPI: PM: Preparing to enter system sleep state S5
> [84463.685113] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
> [84463.685116] {1}[Hardware Error]: event severity: fatal
> [84463.685117] {1}[Hardware Error]: Error 0, type: fatal
> [84463.685119] {1}[Hardware Error]: section_type: PCIe error
> [84463.685120] {1}[Hardware Error]: port_type: 0, PCIe end point
> [84463.685121] {1}[Hardware Error]: version: 3.0
> [84463.685122] {1}[Hardware Error]: command: 0x0002, status: 0x0010
> [84463.685123] {1}[Hardware Error]: device_id: 0000:04:00.1
> [84463.685125] {1}[Hardware Error]: slot: 0
> [84463.685126] {1}[Hardware Error]: secondary_bus: 0x00
> [84463.685127] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
> [84463.685128] {1}[Hardware Error]: class_code: 020000
> [84463.685129] {1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
> [84463.685130] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
> [84463.685131] {1}[Hardware Error]: TLP Header: 40000001 0000010f 90028090 00000000
> [84463.685134] Kernel panic - not syncing: Fatal hardware error!
> [84463.685136] CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: P O 6.5.13-3-pve #1
> [84463.685139] Hardware name: Dell Inc. PowerEdge R540/0VC7DK, BIOS 2.21.1 03/07/2024
> [84463.685140] Call Trace:
> [84463.685142] <NMI>
> …
>
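
The aer_uncor_* values in the report are bit fields, so it may help to
decode them. A rough sketch, assuming the bit layout of the PCIe AER
Uncorrectable Error Status register as named by the PCI_ERR_UNC_*
constants in the Linux pci_regs.h header; the function name is only for
illustration:

```python
# Bit positions in the PCIe AER uncorrectable error status/mask/severity
# registers (assumed to match PCI_ERR_UNC_* in Linux's pci_regs.h).
AER_UNCOR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
    22: "Uncorrectable Internal Error",
    23: "MC Blocked TLP",
    24: "AtomicOp Egress Blocked",
    25: "TLP Prefix Blocked",
}


def decode_aer_uncor(value):
    """Return the names of all bits set in a 32-bit AER uncorrectable field."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(AER_UNCOR_BITS.get(bit, f"reserved/unknown bit {bit}"))
    return names


# Values taken from the quoted panic log.
status = 0x00100000
mask = 0x00010000
severity = 0x000EF030

print("status:  ", decode_aer_uncor(status))
print("mask:    ", decode_aer_uncor(mask))
print("severity:", decode_aer_uncor(severity))
print("unmasked errors:", decode_aer_uncor(status & ~mask))
```

Under that layout the status decodes to a single Unsupported Request
error, and the mask only covers Unexpected Completion, so the reported
error is not masked. The device in the record (0000:04:00.1, vendor
0x14e4, device 0x165f) is the same BCM5720 port that ethtool shows as
eno2.
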
> root@pve:~# pveversion
> pve-manager/8.1.8/d29041d9f87575d0 (running kernel: 6.5.13-3-pve)
> root@pve:~# ethtool -i eno2
> driver: tg3
> version: 6.5.13-3-pve
> firmware-version: FFV22.71.3 bc 5720-v1.39
> expansion-rom-version:
> bus-info: 0000:04:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: no
root@pve:~# lspci | fgrep 04:00.1
04:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe