From: "Mariusz Suchodolski" <mariusz.suchodolski@suzuki.com.pl>
To: "'Proxmox VE user list'" <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Quorum Activity blocked
Date: Mon, 7 Nov 2022 09:32:01 +0100
Message-ID: <011a01d8f283$69085160$3b18f420$@suzuki.com.pl>
In-Reply-To: <7523843a-2cdb-fd3a-1bb5-3423f47ee7ab@riminilug.it>
Hi Piviul,
Is the output of "pvecm status" the same on all machines?
This looks like the same issue I had some time ago: https://forum.proxmox.com/threads/large-delay-on-pvecm-status-webui-unresponsive-node-failed-to-rejoin-cluster.96402/
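A minimal sketch for comparing the nodes quickly, assuming root SSH access between them. The helper name quorum_summary and the host names pve01 and pve03 are hypothetical (only pve02 appears in the logs below); the helper just filters the fields that "pvecm status" already prints.

```shell
#!/bin/sh
# Hypothetical helper: reads "pvecm status" output on stdin and prints
# the quorum-related fields on a single line.
quorum_summary() {
    awk -F': *' '
        /^Quorate:/        { q = $2 }
        /^Expected votes:/ { e = $2 }
        /^Total votes:/    { t = $2 }
        END { printf "quorate=%s expected=%s total=%s\n", q, e, t }
    '
}

# Example use (host names are assumptions, adjust to your cluster):
# for node in pve01 pve02 pve03; do
#     printf '%s: ' "$node"
#     ssh "root@$node" pvecm status | quorum_summary
# done
```

If each node reports only itself as a member, the nodes cannot see each other over the corosync link, which would match the symptoms in the forum thread above.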
MS.
-----Original Message-----
From: pve-user <pve-user-bounces@lists.proxmox.com> On Behalf Of Piviul
Sent: Monday, November 7, 2022 8:44 AM
To: pve-user@lists.proxmox.com
Subject: [PVE-User] Quorum Activity blocked
Good morning. In a 3-node Proxmox 6.4 cluster, all 3 nodes seem to work and all VM guests continue to run, but if I try to start a VM guest, the start fails with the message: "cluster not ready - no quorum? (500)". This is the cluster manager status:
# pvecm status
Cluster information
-------------------
Name: CSA-cluster1
Config Version: 3
Transport: knet
Secure auth: on
Quorum information
------------------
Date: Mon Nov 7 08:37:20 2022
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 0x00000002
Ring ID: 2.91e
Quorate: No
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 1
Quorum: 2 Activity blocked
Flags:
Membership information
----------------------
Nodeid Votes Name
0x00000002 1 192.168.255.2 (local)
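For reference, the numbers above are consistent with simple majority quorum: with 3 expected votes, 2 are needed, and this node only sees its own vote. A small sketch of that arithmetic, assuming the default votequorum behaviour with no special options (the Flags line above is empty); the function names are hypothetical.

```shell
#!/bin/sh
# Majority quorum: more than half of the expected votes.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Prints "yes" if the total votes reach the majority, "no" otherwise.
# Arguments: expected votes, total votes.
is_quorate() {
    if [ "$2" -ge "$(quorum_needed "$1")" ]; then
        echo yes
    else
        echo no
    fi
}

quorum_needed 3    # expected votes from the output above
is_quorate 3 1     # only the local node is voting
is_quorate 3 2     # two of the three nodes would be enough
```

So as soon as this node can reach one other node again, the cluster partition containing both would become quorate.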
These are the first syslog entries showing that a problem occurred:
Nov 4 23:38:01 pve02 systemd[1]: Started Proxmox VE replication runner.
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] link: host: 3 link: 0 is down
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:26 pve02 corosync[1703]: [KNET ] host: host: 3 has no active links
Nov 4 23:38:28 pve02 corosync[1703]: [TOTEM ] Token has not been received in 2737 ms
Nov 4 23:38:30 pve02 corosync[1703]: [KNET ] rx: host: 3 link: 0 is up
Nov 4 23:38:30 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync members[2]: 1 2
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Sync left[1]: 3
Nov 4 23:38:32 pve02 corosync[1703]: [TOTEM ] A new membership (1.873) was formed. Members left: 3
Nov 4 23:38:32 pve02 corosync[1703]: [TOTEM ] Failed to receive the leave message. failed: 3
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: starting data syncronisation
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: starting data syncronisation
Nov 4 23:38:32 pve02 corosync[1703]: [QUORUM] Members[2]: 1 2
Nov 4 23:38:32 pve02 corosync[1703]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: received sync request (epoch 1/1626/00000009)
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: received sync request (epoch 1/1626/00000009)
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: received all states
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: leader is 1/1626
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: synced members: 1/1626, 2/1578
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: all data is up to date
Nov 4 23:38:32 pve02 pmxcfs[1578]: [dcdb] notice: dfsm_deliver_queue: queue length 2
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: received all states
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: all data is up to date
Nov 4 23:38:32 pve02 pmxcfs[1578]: [status] notice: dfsm_deliver_queue: queue length 46
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] link: host: 3 link: 0 is down
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Nov 4 23:38:34 pve02 corosync[1703]: [KNET ] host: host: 3 has no active links
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] link: host: 1 link: 0 is down
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Nov 4 23:38:41 pve02 corosync[1703]: [KNET ] host: host: 1 has no active links
Nov 4 23:38:42 pve02 corosync[1703]: [TOTEM ] Token has not been received in 2737 ms
Nov 4 23:38:43 pve02 corosync[1703]: [TOTEM ] A processor failed, forming new configuration: token timed out (3650ms), waiting 4380ms for consensus.
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Sync members[1]: 2
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Sync left[1]: 1
Nov 4 23:38:48 pve02 corosync[1703]: [TOTEM ] A new membership (2.877) was formed. Members left: 1
Nov 4 23:38:48 pve02 corosync[1703]: [TOTEM ] Failed to receive the leave message. failed: 1
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] notice: members: 2/1578
Nov 4 23:38:48 pve02 pmxcfs[1578]: [status] notice: members: 2/1578
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] This node is within the non-primary component and will NOT provide any services.
Nov 4 23:38:48 pve02 corosync[1703]: [QUORUM] Members[1]: 2
Nov 4 23:38:48 pve02 corosync[1703]: [MAIN ] Completed service synchronization, ready to provide service.
Nov 4 23:38:48 pve02 pmxcfs[1578]: [status] notice: node lost quorum
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] crit: received write while not quorate - trigger resync
Nov 4 23:38:48 pve02 pmxcfs[1578]: [dcdb] crit: leaving CPG group
Nov 4 23:38:48 pve02 pve-ha-lrm[1943]: unable to write lrm status file - unable to open file '/etc/pve/nodes/pve02/lrm_status.tmp.1943' - Permission denied
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] notice: start cluster connection
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] crit: cpg_join failed: 14
Nov 4 23:38:49 pve02 pmxcfs[1578]: [dcdb] crit: can't initialize service
Nov 4 23:38:55 pve02 pmxcfs[1578]: [dcdb] notice: members: 2/1578
Nov 4 23:38:55 pve02 pmxcfs[1578]: [dcdb] notice: all data is up to date
Nov 4 23:39:00 pve02 systemd[1]: Starting Proxmox VE replication runner...
Nov 4 23:39:01 pve02 pvesr[2146320]: trying to acquire cfs lock 'file-replication_cfg' ...
[...]
What happened to my cluster? Does anyone have suggestions for troubleshooting the problem?
Piviul