Date: Mon, 7 Sep 2020 13:21:00 +0200 (CEST)
From: Wolfgang Link
To: Proxmox VE user list, Hermann Himmelbauer
Message-ID: <2073914112.857.1599477660280@webmail.proxmox.com>
In-Reply-To: <0e58d1d5-384b-55d2-9042-ae8c1e2ade6c@qwer.tk>
Subject: Re: [PVE-User] Server freezing randomly with Proxmox 6.2-4 on AMD Ryzen system

Hi Hermann,

this board with this BIOS version and a Ryzen 9 3900X has been running
perfectly for over 4 months, also with very high load in the VM.

What have you set in the BIOS?

Regards,
Wolfgang

> On 09/04/2020 4:45 PM Hermann Himmelbauer wrote:
>
>
> Dear Proxmox users,
>
> I'm trying to install a 3-node cluster (latest proxmox/ceph) and
> experience random freezes. A node can either be completely frozen (no
> blinking cursor on console, no ping) or become somewhat blocked / slow etc.
>
> This happens most often on node 2 (approx. 3-4 times / day); node 3
> never got stuck within 14 days runtime, node 1 once.
>
> Unfortunately I did not find any way to trigger this behaviour; however,
> I *think* that this happens most often if I stress the machine in some
> way (performance test within a virtual machine) and then let it idle.
>
> When the machine freezes completely, there is no logfile. However, if it
> is partially frozen, some info can be acquired via dmesg (see attached
> file). "device=2b:00.0" is an Intel 10GBit ethernet adapter (X550T), so
> perhaps there is some driver issue regarding this ethernet adapter?
>
> The system consists of the following components:
>
> - AMD Ryzen 3 3200G, 4x 3.60GHz, boxed (YD3200C5FHBOX)
> - ASRock Rack X470D4U2-2T (Mainboard)
> - Samsung SSD 970 EVO Plus 250GB, M.2 (MZ-V7S250BW) (builtin SSD for OS)
> - 2 * Kingston Server Premier DIMM 16GB, DDR4-2666, CL19-19-19, ECC (BOM
>   Number: 9965745-002.A00G, Part Number: KSM26ED8/16ME)
> - be quiet! Pure Power 11 CM 400W ATX 2.4 (BN296) (Power supply)
> - 2 * Micron 5300 PRO - Read Intensive 960GB, SATA
>   (MTFDDAK960TDS-1AW1Z6) (SSD for Ceph)
> - LogiLink PC0075, 2x RJ-45, PCIe 2.0 x1 (second NIC with two ports)
>
> The system is Linux Debian 10.4 (Proxmox 6.2-4) with kernel 5.4.34-1-pve
> #1 SMP PVE 5.4.34-2 (Thu, 07 May 2020 10:02:02 +0200) x86_64 GNU/Linux.
>
> What I did so far (without success):
>
> - Disabled C6, as I read that this CPU state can lead to unstable systems
>   (via "python zenstates.py --c6-disable" -> still errors).
> - Updated my BIOS to the latest version (3.30)
> - Checked that the CPU + RAM are compatible with the mainboard (they are
>   listed as compatible on the ASRock website)
> - Checked logs in IPMI (undervoltage, temperature etc., nothing is logged)
> - Memory test (memtest86, no errors)
>
> Do you have any clue what could be the reason for these freezes? Should
> I think of some hardware error? Or is this some known Linux bug that can
> be fixed?
>
> Best Regards,
> Hermann
>
> --
> hermann@qwer.tk
> PGP/GPG: 299893C7 (on keyservers)