From: Hermann Himmelbauer
To: Wolfgang Link, Proxmox VE user list
Subject: Re: [PVE-User] Server freezing randomly with Proxmox 6.2-4 on AMD Ryzen system
Date: Mon, 7 Sep 2020 20:44:43 +0200
Message-ID: <2be2c52e-4c86-6ca6-9f4a-49a25315c994@qwer.tk>
In-Reply-To: <2073914112.857.1599477660280@webmail.proxmox.com>
References: <0e58d1d5-384b-55d2-9042-ae8c1e2ade6c@qwer.tk> <2073914112.857.1599477660280@webmail.proxmox.com>
List-Id: Proxmox VE user list

Dear Wolfgang,

Thank you for your reply. Glad to hear that the board is stable for you.

My BIOS is at its default values, so no overclocking or the like. Did
you make any changes? Did you disable C6 in some way?

Maybe this really is a hardware defect (mainboard, RAM, CPU, power
supply...) - since my posting I have managed to crash node 2, whereas
node 1 + node 3 are stable.

BTW - did you manage to get ECC running? I do have ECC memory, but it
does not seem to be detected. Maybe this is due to the AMD Ryzen 3
3200G - I read somewhere that the CPUs with integrated graphics do not
report ECC?

Could you perhaps send me a list of the other components of your system?

The board itself + the AMD CPUs are a very price-efficient combination.
The onboard 10GBit ethernet is great for Ceph; I get quite good I/O
speeds. If things become stable, it's a perfect combination for a
cost-efficient HA cluster, I think.

Best Regards,
Hermann

On 07.09.20 at 13:21, Wolfgang Link wrote:
> Hi Hermann,
>
> This board with this BIOS version and a Ryzen 9 3900X has been running
> perfectly for over 4 months, even with very high load in the VM.
>
> What have you set in the BIOS?
>
> Regards
>
> Wolfgang
>> On 09/04/2020 4:45 PM Hermann Himmelbauer wrote:
>>
>>
>> Dear Proxmox users,
>>
>> I'm trying to set up a 3-node cluster (latest Proxmox/Ceph) and am
>> experiencing random freezes. A node can either be completely frozen (no
>> blinking cursor on the console, no ping) or become partially blocked or
>> slow.
>>
>> This happens most often on node 2 (approx. 3-4 times / day); node 3
>> has never got stuck within 14 days of runtime, node 1 once.
>>
>> Unfortunately, I have not found a way to trigger this behaviour
>> reliably. However, I *think* that it happens most often when I stress
>> the machine in some way (performance test within a virtual machine) and
>> then let it idle.
>>
>> When the machine freezes completely, there is no logfile. However, if it
>> is only partially frozen, some info can be acquired via dmesg (see
>> attached file). "device=2b:00.0" is an Intel 10GBit ethernet adapter
>> (X550T), so perhaps there is some driver issue with this ethernet
>> adapter?
>>
>> The system consists of the following components:
>>
>> - AMD Ryzen 3 3200G, 4x 3.60GHz, boxed (YD3200C5FHBOX)
>> - ASRock Rack X470D4U2-2T (Mainboard)
>> - Samsung SSD 970 EVO Plus 250GB, M.2 (MZ-V7S250BW) (builtin SSD for OS)
>> - 2 * Kingston Server Premier DIMM 16GB, DDR4-2666, CL19-19-19, ECC (BOM
>>   Number: 9965745-002.A00G, Part Number: KSM26ED8/16ME)
>> - be quiet! Pure Power 11 CM 400W ATX 2.4 (BN296) (Power supply)
>> - 2 * Micron 5300 PRO - Read Intensive 960GB, SATA
>>   (MTFDDAK960TDS-1AW1Z6) (SSD for Ceph)
>> - LogiLink PC0075, 2x RJ-45, PCIe 2.0 x1 (second NIC with two ports)
>>
>> The system is Linux Debian 10.4 (Proxmox 6.2-4) with kernel 5.4.34-1-pve
>> #1 SMP PVE 5.4.34-2 (Thu, 07 May 2020 10:02:02 +0200) x86_64 GNU/Linux.
>>
>> What I have tried so far (without success):
>>
>> - Disabled C6, as I read that this CPU state can lead to unstable systems
>>   (via "python zenstates.py --c6-disable" -> still errors).
>> - Updated my BIOS to the latest version (3.30)
>> - Checked that the CPU + RAM are compatible with the mainboard (they are
>>   listed as compatible on the ASRock website)
>> - Checked logs in IPMI (undervoltage, temperature etc., nothing is logged)
>> - Memory test (memtest86, no errors)
>>
>> Do you have any clue what could be the reason for these freezes? Should
>> I suspect a hardware fault? Or is this some known Linux bug that can
>> be fixed?
>>
>> Best Regards,
>> Hermann
>>
>> --
>> hermann@qwer.tk
>> PGP/GPG: 299893C7 (on keyservers)
>

--
hermann@qwer.tk
PGP/GPG: 299893C7 (on keyservers)
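
On the ECC question raised in this thread, here is a minimal sketch of
one way to check whether the kernel has registered a memory controller
with its EDAC subsystem. It is an illustration under assumptions, not
something taken from the thread. It assumes the standard Linux EDAC
sysfs layout under /sys/devices/system/edac/mc, with the per-controller
files mc_name, ce_count and ue_count. The function names are
hypothetical. An empty result only means that no EDAC driver has
registered a controller; it does not by itself prove that the hardware
lacks ECC.

```python
from pathlib import Path

# Assumption: standard Linux EDAC sysfs location.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_text(path):
    """Return the stripped content of a file, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def to_int(value):
    """Convert a counter string to int, or None if it is missing or malformed."""
    return int(value) if value is not None and value.isdigit() else None


def edac_controllers(root=EDAC_ROOT):
    """List the memory controllers registered with EDAC and their error counters."""
    root = Path(root)
    if not root.is_dir():
        return []
    controllers = []
    for mc in sorted(root.glob("mc[0-9]*")):
        controllers.append({
            "name": mc.name,
            "driver": read_text(mc / "mc_name"),
            "corrected": to_int(read_text(mc / "ce_count")),
            "uncorrected": to_int(read_text(mc / "ue_count")),
        })
    return controllers


if __name__ == "__main__":
    found = edac_controllers()
    if not found:
        print("No EDAC memory controller registered.")
    for mc in found:
        print(f"{mc['name']} ({mc['driver']}): "
              f"corrected={mc['corrected']} uncorrected={mc['uncorrected']}")
```

Run on each node, this would show whether any controller is registered
at all and, if so, whether corrected or uncorrected errors have been
counted since boot.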