From: "Mariusz Suchodolski"
To: "'Proxmox VE user list'"
Date: Mon, 7 Nov 2022 09:32:01 +0100
Organization: Suzuki Motor Poland
Subject: Re: [PVE-User] Quorum Activity blocked
Message-ID: <011a01d8f283$69085160$3b18f420$@suzuki.com.pl>
In-Reply-To: <7523843a-2cdb-fd3a-1bb5-3423f47ee7ab@riminilug.it>

Hi Piviul,

Is the output of "pvecm status" the same on all machines?
This looks like the same issue I had some time ago:
https://forum.proxmox.com/threads/large-delay-on-pvecm-status-webui-unresponsive-node-failed-to-rejoin-cluster.96402/

MS.

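To make the comparison across machines less error-prone, here is a minimal Python sketch that parses the "Key: value" lines of "pvecm status" and reports the fields on which nodes disagree. The function names and the list of compared keys are my own choices for illustration, not part of any Proxmox tool, and the sample text is the pve02 output quoted below. The sketch assumes you collect each node's output yourself (for example by running the command on every node and saving the text).

```python
# Hypothetical helper names; only the sample text comes from the quoted mail.
PVE02_STATUS = """\
Quorum information
------------------
Date: Mon Nov 7 08:37:20 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2.91e
Quorate: No

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:
"""

# Fields worth comparing between nodes (an assumption, adjust as needed).
COMPARE_KEYS = ("Nodes", "Quorate", "Expected votes", "Total votes", "Quorum", "Ring ID")


def parse_pvecm_status(text):
    """Collect every 'Key: value' line of `pvecm status` output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def differing_fields(status_by_node):
    """Return {key: {node: value}} for each compared key the nodes disagree on."""
    diffs = {}
    for key in COMPARE_KEYS:
        values = {node: fields.get(key) for node, fields in status_by_node.items()}
        if len(set(values.values())) > 1:
            diffs[key] = values
    return diffs


if __name__ == "__main__":
    status_by_node = {"pve02": parse_pvecm_status(PVE02_STATUS)}
    # Add the other nodes' parsed output here before comparing.
    print(status_by_node["pve02"]["Quorate"])
    print(differing_fields(status_by_node))
```

With only one node loaded there is nothing to compare, so the second print shows an empty dict. Once all three outputs are added, any key that appears in the result is a point where the nodes see the cluster differently.
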
-----Original Message-----
From: pve-user On Behalf Of Piviul
Sent: Monday, November 7, 2022 8:44 AM
To: pve-user@lists.proxmox.com
Subject: [PVE-User] Quorum Activity blocked

Good morning,

In a 3-node Proxmox 6.4 cluster, all 3 nodes seem to work and all VM guests keep running, but if I try to start a VM guest, the start fails with the message: "cluster not ready - no quorum? (500)".

This is the cluster manager status:

# pvecm status
Cluster information
-------------------
Name: CSA-cluster1
Config Version: 3
Transport: knet
Secure auth: on

Quorum information
------------------
Date: Mon Nov 7 08:37:20 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2.91e
Quorate: No

Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:

Membership information
----------------------
Nodeid      Votes  Name
0x00000002  1      192.168.255.2 (local)

These are the first syslog entries showing that a problem occurred:

Nov 4 23:38:01 pve02 systemd[1]: Started Proxmox VE replication runner.
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] link: host: 3 link: 0 is down
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] host: host: 3 has no active links
Nov 4 23:38:28 pve02 corosync[1703]: [TOTEM ] Token has not been received in 2737 ms
Nov 4 23:38:30 pve02 corosync[1703]: [KNET ] rx: host: 3 link: 0 is up
Nov 4 23:38:30 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync members[2]: 1 2
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync left[1]: 3
Nov 4 23:38:32 pve02 corosync[1703]: [TOTEM ] A new membership (1.873) was formed. Members left: 3
Nov 4 23:38:32 pve02 corosync[1703]: [TOTEM ] Failed to receive the leave message. failed: 3
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: starting data syncronisation
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: starting data syncronisation
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Members[2]: 1 2
Nov 4 23:38:32 pve02 corosync[1703]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: received sync request (epoch 1/1626/00000009)
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: received sync request (epoch 1/1626/00000009)
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: received all states
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: leader is 1/1626
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: synced members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: all data is up to date
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: dfsm_deliver_queue: queue length 2
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: received all states
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: all data is up to date
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: dfsm_deliver_queue: queue length 46
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] link: host: 3 link: 0 is down
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] host: host: 3 has no active links
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] link: host: 1 link: 0 is down
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] host: host: 1 has no active links
Nov 4 23:38:42 pve02 corosync[1703]: [TOTEM ] Token has not been received in 2737 ms
Nov 4 23:38:43 pve02 corosync[1703]: [TOTEM ] A processor failed, forming new configuration: token timed out (3650ms), waiting 4380ms for consensus.
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Sync members[1]: 2
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Sync left[1]: 1
Nov 4 23:38:48 pve02 corosync[1703]: [TOTEM ] A new membership (2.877) was formed. Members left: 1
Nov 4 23:38:48 pve02 corosync[1703]: [TOTEM ] Failed to receive the leave message. failed: 1
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] notice: members: 2/1578
Nov 4 23:38:48 pve02 pmxcfs[1578]: [status] notice: members: 2/1578
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Members[1]: 2
Nov 4 23:38:48 pve02 corosync[1703]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 4 23:38:48 pve02 pmxcfs[1578]: [status] notice: node lost quorum
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] crit: received write while not quorate - trigger resync
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] crit: leaving CPG group
Nov 4 23:38:48 pve02 pve-ha-lrm[1943]: unable to write lrm status file - unable to open file '/etc/pve/nodes/pve02/lrm_status.tmp.1943' - Permission denied
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] notice: start cluster connection
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] crit: cpg_join failed: 14
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] crit: can't initialize service
Nov 4 23:38:55 pve02 pmxcfs[1578]: [dcdb] notice: members: 2/1578
Nov 4 23:38:55 pve02 pmxcfs[1578]: [dcdb] notice: all data is up to date
Nov 4 23:39:00 pve02 systemd[1]: Starting Proxmox VE replication runner...
Nov 4 23:39:01 pve02 pvesr[2146320]: trying to acquire cfs lock 'file-replication_cfg' ...
[...]

What happened to my cluster? Does anyone have suggestions for troubleshooting the problem?

Piviul
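
For reference, the numbers in the status output and in the log above fit together arithmetically. The Python sketch below assumes plain majority voting (no two_node or similar votequorum options) and the corosync.conf(5) defaults, where the consensus timeout is 1.2 times the token timeout and the "Token has not been received" warning fires at 75% of it. These are assumptions about the configuration, since corosync.conf itself was not posted, and the function names are made up for illustration.

```python
def quorum_threshold(expected_votes):
    """Votes needed for quorum under simple majority voting (assumed)."""
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes):
    """True when the partition holds at least a majority of expected votes."""
    return total_votes >= quorum_threshold(expected_votes)


def token_timings(token_ms):
    """Derived timeouts, assuming corosync defaults (75% warning, 1.2x consensus)."""
    return {
        "warning_ms": token_ms * 75 // 100,
        "consensus_ms": token_ms * 12 // 10,
    }


if __name__ == "__main__":
    # Values taken from the quoted `pvecm status` output and syslog.
    print(quorum_threshold(3))   # "Quorum: 2"
    print(is_quorate(1, 3))      # "Quorate: No", hence "Activity blocked"
    print(token_timings(3650))   # compare with 2737 ms and 4380ms in the log
```

So pve02, seeing only its own single vote out of 3 expected, is below the threshold of 2, which is why starting a guest is refused while already running guests are unaffected. The 2737 ms warning and the 4380ms consensus wait both follow from the 3650ms token timeout reported in the log.
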