From: Hermann Himmelbauer <hermann@qwer.tk>
To: pve-user@lists.proxmox.com
Subject: Re: [PVE-User] Server freezing randomly with Proxmox 6.2-4 on AMD Ryzen system
Date: Mon, 16 Nov 2020 18:21:24 +0100
Message-ID: <23e655af-534e-3066-e3cd-514421730aa2@qwer.tk>
In-Reply-To: <0e58d1d5-384b-55d2-9042-ae8c1e2ade6c@qwer.tk>
Hi,
In case someone is interested: the problem is now solved. The system
seems to be rock solid after about 2 months of testing.
I replaced the AMD Ryzen 3 3200G with an AMD Ryzen 5 3600 on one node and
with an AMD Ryzen 3 3100 on the two other nodes, and now the problem is gone.
I don't really know why, but I can think of two possible reasons:
1) The 3200G does not support ECC, but I use ECC RAM. Maybe this led to
errors (although intensive memory testing with memtest86 did not report
anything).
2) The new CPUs do not have integrated graphics. I noticed
that the two onboard 10 Gbit Ethernet adapters now have different PCI
addresses with the new CPUs. With the old CPUs there were problems
with these 10G adapters malfunctioning.
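For anyone who wants to check these two points on their own nodes, here is a
minimal sketch (not something I ran as part of the fix). It assumes a Linux
system with the usual sysfs layout: /sys/class/net/<iface>/device for the
device behind a network interface, and /sys/devices/system/edac/mc/mc*/ for
the EDAC memory controllers with their ce_count / ue_count files. The
function names and the sys_root parameter are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: list NIC PCI addresses and EDAC (ECC) error counters from sysfs."""
import glob
import os


def nic_pci_addresses(sys_root="/sys"):
    """Map each network interface to the name of its backing device.

    For PCI NICs that name is the PCI address (e.g. 0000:2b:00.0).
    Interfaces without a 'device' entry (lo, bridges, ...) are skipped.
    """
    result = {}
    for iface_dir in sorted(glob.glob(os.path.join(sys_root, "class", "net", "*"))):
        device_link = os.path.join(iface_dir, "device")
        if not os.path.exists(device_link):
            continue
        # 'device' is a symlink; the last path component of its target
        # is the bus address of the device.
        result[os.path.basename(iface_dir)] = os.path.basename(
            os.path.realpath(device_link)
        )
    return result


def edac_error_counts(sys_root="/sys"):
    """Return corrected (ce) and uncorrected (ue) error counts per controller."""
    counts = {}
    pattern = os.path.join(sys_root, "devices", "system", "edac", "mc", "mc*")
    for mc_dir in sorted(glob.glob(pattern)):
        entry = {}
        for name in ("ce_count", "ue_count"):
            path = os.path.join(mc_dir, name)
            if os.path.isfile(path):
                with open(path) as fh:
                    entry[name] = int(fh.read().strip())
        counts[os.path.basename(mc_dir)] = entry
    return counts


if __name__ == "__main__":
    for iface, address in nic_pci_addresses().items():
        print(f"{iface}: {address}")
    edac = edac_error_counts()
    if not edac:
        # Assumption: an empty result may mean the EDAC driver is not
        # loaded or ECC is not active; it does not prove either.
        print("no EDAC memory controllers found")
    for mc, entry in edac.items():
        print(f"{mc}: {entry}")
```

Running this before and after a CPU swap would show whether the PCI addresses
of the 10G adapters changed, and whether the kernel reports any ECC memory
controller at all.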
Many thanks for your input and help.
The ASRock Rack X470D4U2-2T is definitely stable now.
Best Regards,
Hermann
On 04.09.20 at 16:45, Hermann Himmelbauer wrote:
> Dear Proxmox users,
>
> I'm trying to install a 3-node cluster (latest proxmox/ceph) and
> experience random freezes. The node can either be completely frozen (no
> blinking cursor on console, no ping) or can get somewhat blocked / slow etc.
>
> This happens most often on node 2 (approx. 3-4 times / day), node 3
> never got stuck within 14 days runtime, node 1 once.
>
> Unfortunately I did not find any way to trigger this behaviour. However,
> I *think* that this happens most often if I stress the machine in some
> way (performance test within a virtual machine) and then let it idle.
>
> When the machine freezes completely, there is no logfile. However, if it
> is partially frozen, some info can be acquired via dmesg. (See attached
> file). ("device=2b:00.0" is an Intel 10 Gbit Ethernet adapter (X550T). So
> perhaps there is some driver issue regarding this Ethernet adapter?)
>
> The system consists of the following components:
>
> - AMD Ryzen 3 3200G, 4x 3.60GHz, boxed (YD3200C5FHBOX)
> - ASRock Rack X470D4U2-2T (Mainboard)
> - Samsung SSD 970 EVO Plus 250GB, M.2 (MZ-V7S250BW) (builtin SSD for OS)
> - 2 * Kingston Server Premier DIMM 16GB, DDR4-2666, CL19-19-19, ECC (BOM
> Number: 9965745-002.A00G, Part Number: KSM26ED8/16ME)
> - be quiet! Pure Power 11 CM 400W ATX 2.4 (BN296) (Power supply)
> - 2 * Micron 5300 PRO - Read Intensive 960GB, SATA
> (MTFDDAK960TDS-1AW1Z6) (SSD for Ceph)
> - LogiLink PC0075, 2x RJ-45, PCIe 2.0 x1 (second NIC with two ports)
>
> The system is Linux Debian 10.4 (Proxmox 6.2-4) with kernel 5.4.34-1-pve
> #1 SMP PVE 5.4.34-2 (Thu, 07 May 2020 10:02:02 +0200) x86_64 GNU/Linux.
>
> What I did so far (without success):
>
> - Disabled C6 as I read that this CPU-state can lead to unstable systems
> (via "python zenstates.py --c6-disable" -> still errors).
> - Updated my BIOS to the latest version (3.30)
> - Checked that the CPU + RAM are compatible with the mainboard (they are
> listed as compatible on the ASRock website)
> - Checked logs in IPMI (undervoltage, temperature etc., nothing is logged)
> - Memory test (memtest86, no errors)
>
> Do you have any clue what could be the reason for these freezes? Should
> I think of some hardware error? Or is this some known Linux bug that can
> be fixed?
>
> Best Regards,
> Hermann