From: Alexandre DERUMIER <aderumier@odiso.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Cc: Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
Date: Wed, 16 Sep 2020 09:34:27 +0200 (CEST)
Message-ID: <1097647242.851726.1600241667098.JavaMail.zimbra@odiso.com>
In-Reply-To: <597522514.840749.1600185513450.JavaMail.zimbra@odiso.com>

Hi,

I have reproduced the problem now,

and this time I did not restart corosync a second time after /etc/pve locked up,

so it is currently read-only.


I haven't used gdb in a long time,

could you tell me how to attach it to the running pmxcfs process, and which gdb commands to run?
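
I guess it is something like this (a rough sketch, assuming a single pmxcfs instance and the pve-cluster dbgsym package installed):

gdb -p $(pidof pmxcfs)           # attach to the running daemon
(gdb) info threads               # list all threads
(gdb) thread apply all bt full   # full backtrace of every thread, with locals
(gdb) detach                     # leave the process running
(gdb) quit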

----- Original Message -----
From: "aderumier" <aderumier@odiso.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Cc: "Thomas Lamprecht" <t.lamprecht@proxmox.com>
Sent: Tuesday, September 15, 2020 17:58:33
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

Another short lock-up at 17:41:09. 

To be sure, I ran a small loop on node2 that writes to /etc/pve every second. 

It hangs at the first corosync restart; then, on the second corosync restart, it works again. 

I'll try to improve this tomorrow to be able to debug the corosync process, roughly as follows (loop sketch below): 
- restart corosync 
- do some writes in /etc/pve/ 
- if it hangs, don't restart corosync again 
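
The write loop on node2 is essentially this (a sketch reconstructed from the output below; the timestamp format is the same):

while true; do
    echo "Current time : $(date +%H:%M:%S)"
    echo test > /etc/pve/test    # one write per second into pmxcfs
    sleep 1
done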



node2: echo test > /etc/pve/test loop 
-------------------------------------- 
Current time : 17:41:01 
Current time : 17:41:02 
Current time : 17:41:03 
Current time : 17:41:04 
Current time : 17:41:05 
Current time : 17:41:06 
Current time : 17:41:07 
Current time : 17:41:08 
Current time : 17:41:09 

hang 

Current time : 17:42:05 
Current time : 17:42:06 
Current time : 17:42:07 



node1 
----- 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: PMTUD completed for host: 6 link: 0 current link mtu: 1397 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: Starting PMTUD for host: 10 link: 0 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] udp: detected kernel MTU: 1500 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [TOTEM ] Knet pMTU change: 1397 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: PMTUD link change for host: 10 link: 0 from 469 to 1397 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: PMTUD completed for host: 10 link: 0 current link mtu: 1397 
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: Global data MTU changed to: 1397 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-31-zx6KJM/qb) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239] 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] lib_init_fn: conn=0x556c2918d5f0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] got quorum_type request on 0x556c2918d5f0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] got trackstart request on 0x556c2918d5f0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] sending initial status to 0x556c2918d5f0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] sending quorum notification to 0x556c2918d5f0, length = 52 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-32-I7ZZ6e/qb) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239] 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CMAP ] lib_init_fn: conn=0x556c2918ef20 
Sep 15 17:41:09 m6kvm1 pmxcfs[16239]: [status] notice: update cluster info (cluster name m6kvm, version = 20) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-33-6RKbvH/qb) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239] 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] lib_init_fn: conn=0x556c2918ad00, cpd=0x556c2918b50c 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-34-GAY5T9/qb) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239] 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] lib_init_fn: conn=0x556c2918c740, cpd=0x556c2918ce8c 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Creating commit token because I am the rep. 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Saving state aru 5 high seq received 5 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Storing new sequence id for ring 1197 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] entering COMMIT state. 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] got commit token 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] entering RECOVERY state. 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] TRANS [0] member 1: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [0] member 1: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (1.1193) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 5 high delivered 5 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [1] member 2: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [2] member 3: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [3] member 4: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [4] member 5: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [5] member 6: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [6] member 7: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [7] member 8: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [8] member 9: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [9] member 10: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [10] member 11: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [11] member 12: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [12] member 13: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [13] member 14: 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Did not need to originate any messages in recovery. 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] got commit token 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Sending initial ORF token 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Resetting old ring state 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] recovery to regular 1-0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] waiting_trans_ack changed to 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.90) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.91) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.92) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.93) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.94) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.95) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.96) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.97) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.107) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.108) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.109) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.110) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.111) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] call init for locally known services 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] entering OPERATIONAL state. 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] A new membership (1.1197) was formed. Members joined: 2 3 4 5 6 7 8 9 10 11 12 13 14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] enter sync process 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] Committing synchronization for corosync configuration map access 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 2 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 3 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 4 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 5 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 6 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 7 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 8 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 9 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 10 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 11 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 12 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 13 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] my downlist: members(old:1 left:0) 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[0] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[1] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[2] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[3] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[4] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[5] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[6] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[7] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[8] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[9] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[10] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[11] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[12] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[13] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[14] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[15] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[16] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[17] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[18] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[19] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[20] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[21] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[22] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[23] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[24] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[25] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: No Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] Sending nodelist callback. ring_id = 1.1197 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 13 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[13]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=2, expected_votes=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 13 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[14]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=3, expected_votes=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 14 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 14 flags: 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: No Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=3, expected_votes=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 14 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 2 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=4, expected_votes=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 2 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 14 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 2 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 3 
.... 
.... 
next corosync restart 

Sep 15 17:42:03 m6kvm1 corosync[18145]: [MAIN ] Node was shut down by a signal 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [SERV ] Unloading all Corosync service engines. 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [QB ] withdrawing server sockets 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [QB ] qb_ipcs_unref() - destroying 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [SERV ] Service engine unloaded: corosync vote quorum service v1.0 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [QB ] qb_ipcs_disconnect(/dev/shm/qb-18145-16239-32-I7ZZ6e/qb) state:2 
Sep 15 17:42:03 m6kvm1 pmxcfs[16239]: [confdb] crit: cmap_dispatch failed: 2 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [MAIN ] cs_ipcs_connection_closed() 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [CMAP ] exit_fn for conn=0x556c2918ef20 
Sep 15 17:42:03 m6kvm1 corosync[18145]: [MAIN ] cs_ipcs_connection_destroyed() 


node2 
----- 



Sep 15 17:41:05 m6kvm2 corosync[25411]: [KNET ] pmtud: Starting PMTUD for host: 10 link: 0 
Sep 15 17:41:05 m6kvm2 corosync[25411]: [KNET ] udp: detected kernel MTU: 1500 
Sep 15 17:41:05 m6kvm2 corosync[25411]: [KNET ] pmtud: PMTUD completed for host: 10 link: 0 current link mtu: 1397 
Sep 15 17:41:07 m6kvm2 corosync[25411]: [KNET ] rx: host: 1 link: 0 received pong: 2 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [TOTEM ] entering GATHER state from 11(merge during join). 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: host: 1 link: 0 is up 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Knet host change callback. nodeid: 1 reachable: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] pmtud: Starting PMTUD for host: 1 link: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] udp: detected kernel MTU: 1500 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 1397 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] got commit token 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Saving state aru 123 high seq received 123 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [MAIN ] Storing new sequence id for ring 1197 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] entering COMMIT state. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] got commit token 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] entering RECOVERY state. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [0] member 2: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [1] member 3: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [2] member 4: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [3] member 5: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [4] member 6: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [5] member 7: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [6] member 8: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [7] member 9: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [8] member 10: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [9] member 11: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [10] member 12: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [11] member 13: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [12] member 14: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [0] member 1: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (1.1193) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 5 high delivered 5 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [1] member 2: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [2] member 3: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [3] member 4: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [4] member 5: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [5] member 6: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [6] member 7: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [7] member 8: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [8] member 9: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [9] member 10: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [10] member 11: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [11] member 12: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [12] member 13: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [13] member 14: 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Did not need to originate any messages in recovery. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru ffffffff 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Resetting old ring state 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] recovery to regular 1-0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] waiting_trans_ack changed to 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [MAIN ] Member joined: r(0) ip(10.3.94.89) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] call init for locally known services 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] entering OPERATIONAL state. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] A new membership (1.1197) was formed. Members joined: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] enter sync process 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] Committing synchronization for corosync configuration map access 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CMAP ] Not first sync -> no action 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 3 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 4 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 5 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 6 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 7 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 8 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 9 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 10 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 11 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 12 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 13 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] my downlist: members(old:13 left:0) 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[0] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[1] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[2] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[3] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[4] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[5] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[6] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[7] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[8] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[9] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[10] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[11] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[12] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[13] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[14] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[15] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[16] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[17] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[18] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[19] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[20] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[21] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[22] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[23] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[24] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[25] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] Sending nodelist callback. ring_id = 1.1197 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 13 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[13]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 13 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[14]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 14 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: No Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] total_votes=14, expected_votes=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 3 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 4 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 5 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 6 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 7 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 8 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 9 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 10 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 11 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 12 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 14 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 2 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] lowest node id: 1 us: 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] highest node id: 14 us: 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] total_votes=14, expected_votes=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 3 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 4 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 5 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 6 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 7 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 8 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 9 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 10 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 11 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 12 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 14 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 2 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] lowest node id: 1 us: 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] highest node id: 14 us: 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 3 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[3]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 3 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 4 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[4]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 4 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 5 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[5]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 5 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 6 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[6]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 6 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 7 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[7]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 7 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 8 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[8]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 8 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 9 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[9]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 9 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 10 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[10]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 10 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 11 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[11]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 11 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 12 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[12]: votes: 1, expected: 14 flags: 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 12 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] Committing synchronization for corosync vote quorum service v1.0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] total_votes=14, expected_votes=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 1 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 3 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 4 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 5 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 6 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 7 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 8 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 9 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 10 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 11 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 12 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 13 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 14 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 2 state=1, votes=1, expected=14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] lowest node id: 1 us: 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] highest node id: 14 us: 2 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [QUORUM] Members[14]: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [QUORUM] sending quorum notification to (nil), length = 104 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] Sending quorum callback, quorate = 1 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [MAIN ] Completed service synchronization, ready to provide service. 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] waiting_trans_ack changed to 0 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got procjoin message from cluster node 1 (r(0) ip(10.3.94.89) ) for pid 16239 
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got procjoin message from cluster node 1 (r(0) ip(10.3.94.89) ) for pid 16239 

----- Original Message ----- 
From: "aderumier" <aderumier@odiso.com> 
To: "Thomas Lamprecht" <t.lamprecht@proxmox.com> 
Cc: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com> 
Sent: Tuesday, September 15, 2020 16:57:46 
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown 

>>I mean this is bad, but also great! 
>>Can you do a coredump of the whole thing and upload it somewhere with the version info 
>>used (for the dbgsym package)? That could help a lot. 

I'll try to reproduce it again (with the full lock everywhere) and capture the coredump. 
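
For the dump I guess something like this (a sketch; gcore ships with gdb, the output path is my choice):

gcore -o /tmp/pmxcfs $(pidof pmxcfs)   # writes /tmp/pmxcfs.<pid>
pveversion -v | grep pve-cluster       # version info for the dbgsym package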




I have tried the real-time scheduling, 

but I have still been able to reproduce the "lrm too long" for 60s (though as I'm restarting corosync each minute, I think the next corosync restart unlocks something). 
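
To verify the policy was applied I can check with chrt (from util-linux):

chrt -p $(pidof pmxcfs)
# expected:
#   pid <pid>'s current scheduling policy: SCHED_RR
#   pid <pid>'s current scheduling priority: 99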


This time one node was blocked in: 

work { 
... 
} elsif ($state eq 'active') { 
.... 
$self->update_lrm_status(); 
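# context: update_lrm_status() writes the node's lrm_status file under
# /etc/pve, so it blocks while pmxcfs hangs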


and another node was blocked in: 

if ($fence_request) { 
$haenv->log('err', "node need to be fenced - releasing agent_lock\n"); 
$self->set_local_status({ state => 'lost_agent_lock'}); 
} elsif (!$self->get_protected_ha_agent_lock()) { 
$self->set_local_status({ state => 'lost_agent_lock'}); 
} elsif ($self->{mode} eq 'maintenance') { 
$self->set_local_status({ state => 'maintenance'}); 
} 
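# context: get_protected_ha_agent_lock() takes the cluster-wide agent lock
# through /etc/pve, so it can hang the same way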





----- Original Message ----- 
From: "Thomas Lamprecht" <t.lamprecht@proxmox.com> 
To: "aderumier" <aderumier@odiso.com> 
Cc: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com> 
Sent: Tuesday, September 15, 2020 16:32:52 
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown 

On 9/15/20 4:09 PM, Alexandre DERUMIER wrote: 
>>> Can you try to give pmxcfs real time scheduling, e.g., by doing: 
>>> 
>>> # systemctl edit pve-cluster 
>>> 
>>> And then add snippet: 
>>> 
>>> 
>>> [Service] 
>>> CPUSchedulingPolicy=rr 
>>> CPUSchedulingPriority=99 
> yes, sure, I'll do it now 
> 
> 
>> I'm currently digging the logs 
>>> Is your simplest / most stable reproducer still a periodic restart of corosync on one node? 
> yes, a simple "systemctl restart corosync" on 1 node each minute 
> 
> 
> 
> After 1 hour, it's still locked. 
> 
> on other nodes, I still have pmxfs logs like: 
> 

I mean this is bad, but also great! 
Can you do a coredump of the whole thing and upload it somewhere with the version info 
used (for the dbgsym package)? That could help a lot. 


> manual "pmxcfs -d" 
> https://gist.github.com/aderumier/4cd91d17e1f8847b93ea5f621f257c2e 
> 

Hmm, the fuse connection of the previous one got into a weird state (or something is still 
running), but I'd rather say this is a side effect, not directly connected to the real bug. 

> 
> some interesting dmesg about "pvesr" 
> 
> [Tue Sep 15 14:45:34 2020] INFO: task pvesr:19038 blocked for more than 120 seconds. 
> [Tue Sep 15 14:45:34 2020] Tainted: P O 5.4.60-1-pve #1 
> [Tue Sep 15 14:45:34 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
> [Tue Sep 15 14:45:34 2020] pvesr D 0 19038 1 0x00000080 
> [Tue Sep 15 14:45:34 2020] Call Trace: 
> [Tue Sep 15 14:45:34 2020] __schedule+0x2e6/0x6f0 
> [Tue Sep 15 14:45:34 2020] ? filename_parentat.isra.57.part.58+0xf7/0x180 
> [Tue Sep 15 14:45:34 2020] schedule+0x33/0xa0 
> [Tue Sep 15 14:45:34 2020] rwsem_down_write_slowpath+0x2ed/0x4a0 
> [Tue Sep 15 14:45:34 2020] down_write+0x3d/0x40 
> [Tue Sep 15 14:45:34 2020] filename_create+0x8e/0x180 
> [Tue Sep 15 14:45:34 2020] do_mkdirat+0x59/0x110 
> [Tue Sep 15 14:45:34 2020] __x64_sys_mkdir+0x1b/0x20 
> [Tue Sep 15 14:45:34 2020] do_syscall_64+0x57/0x190 
> [Tue Sep 15 14:45:34 2020] entry_SYSCALL_64_after_hwframe+0x44/0xa9 
> 

Hmm, it hangs in mkdir (cluster-wide locking). 
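
pvesr runs every minute from its systemd timer and serializes via a mkdir-based cluster lock below /etc/pve/priv/lock (going from memory here), so it's a good canary for a stuck pmxcfs: 

# ls -d /etc/pve/priv/lock/* 

each of those locks is just a directory created through the fuse mount; the mkdir in the trace above is exactly such an attempt hanging. 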


_______________________________________________ 
pve-devel mailing list 
pve-devel@lists.proxmox.com 
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 







Thread overview: 84+ messages
2020-09-03 14:11 Alexandre DERUMIER
2020-09-04 12:29 ` Alexandre DERUMIER
2020-09-04 15:42   ` Dietmar Maurer
2020-09-05 13:32     ` Alexandre DERUMIER
2020-09-05 15:23       ` dietmar
2020-09-05 17:30         ` Alexandre DERUMIER
2020-09-06  4:21           ` dietmar
2020-09-06  5:36             ` Alexandre DERUMIER
2020-09-06  6:33               ` Alexandre DERUMIER
2020-09-06  8:43               ` Alexandre DERUMIER
2020-09-06 12:14                 ` dietmar
2020-09-06 12:19                   ` dietmar
2020-09-07  7:00                     ` Thomas Lamprecht
2020-09-07  7:19                   ` Alexandre DERUMIER
2020-09-07  8:18                     ` dietmar
2020-09-07  9:32                       ` Alexandre DERUMIER
2020-09-07 13:23                         ` Alexandre DERUMIER
2020-09-08  4:41                           ` dietmar
2020-09-08  7:11                             ` Alexandre DERUMIER
2020-09-09 20:05                               ` Thomas Lamprecht
2020-09-10  4:58                                 ` Alexandre DERUMIER
2020-09-10  8:21                                   ` Thomas Lamprecht
2020-09-10 11:34                                     ` Alexandre DERUMIER
2020-09-10 18:21                                       ` Thomas Lamprecht
2020-09-14  4:54                                         ` Alexandre DERUMIER
2020-09-14  7:14                                           ` Dietmar Maurer
2020-09-14  8:27                                             ` Alexandre DERUMIER
2020-09-14  8:51                                               ` Thomas Lamprecht
2020-09-14 15:45                                                 ` Alexandre DERUMIER
2020-09-15  5:45                                                   ` dietmar
2020-09-15  6:27                                                     ` Alexandre DERUMIER
2020-09-15  7:13                                                       ` dietmar
2020-09-15  8:42                                                         ` Alexandre DERUMIER
2020-09-15  9:35                                                           ` Alexandre DERUMIER
2020-09-15  9:46                                                             ` Thomas Lamprecht
2020-09-15 10:15                                                               ` Alexandre DERUMIER
2020-09-15 11:04                                                                 ` Alexandre DERUMIER
2020-09-15 12:49                                                                   ` Alexandre DERUMIER
2020-09-15 13:00                                                                     ` Thomas Lamprecht
2020-09-15 14:09                                                                       ` Alexandre DERUMIER
2020-09-15 14:19                                                                         ` Alexandre DERUMIER
2020-09-15 14:32                                                                         ` Thomas Lamprecht
2020-09-15 14:57                                                                           ` Alexandre DERUMIER
2020-09-15 15:58                                                                             ` Alexandre DERUMIER
2020-09-16  7:34                                                                               ` Alexandre DERUMIER [this message]
2020-09-16  7:58                                                                                 ` Alexandre DERUMIER
2020-09-16  8:30                                                                                   ` Alexandre DERUMIER
2020-09-16  8:53                                                                                     ` Alexandre DERUMIER
     [not found]                                                                                     ` <1894376736.864562.1600253445817.JavaMail.zimbra@odiso.com>
2020-09-16 13:15                                                                                       ` Alexandre DERUMIER
2020-09-16 14:45                                                                                         ` Thomas Lamprecht
2020-09-16 15:17                                                                                           ` Alexandre DERUMIER
2020-09-17  9:21                                                                                             ` Fabian Grünbichler
2020-09-17  9:59                                                                                               ` Alexandre DERUMIER
2020-09-17 10:02                                                                                                 ` Alexandre DERUMIER
2020-09-17 11:35                                                                                                   ` Thomas Lamprecht
2020-09-20 23:54                                                                                                     ` Alexandre DERUMIER
2020-09-22  5:43                                                                                                       ` Alexandre DERUMIER
2020-09-24 14:02                                                                                                         ` Fabian Grünbichler
2020-09-24 14:29                                                                                                           ` Alexandre DERUMIER
2020-09-24 18:07                                                                                                             ` Alexandre DERUMIER
2020-09-25  6:44                                                                                                               ` Alexandre DERUMIER
2020-09-25  7:15                                                                                                                 ` Alexandre DERUMIER
2020-09-25  9:19                                                                                                                   ` Fabian Grünbichler
2020-09-25  9:46                                                                                                                     ` Alexandre DERUMIER
2020-09-25 12:51                                                                                                                       ` Fabian Grünbichler
2020-09-25 16:29                                                                                                                         ` Alexandre DERUMIER
2020-09-28  9:17                                                                                                                           ` Fabian Grünbichler
2020-09-28  9:35                                                                                                                             ` Alexandre DERUMIER
2020-09-28 15:59                                                                                                                               ` Alexandre DERUMIER
2020-09-29  5:30                                                                                                                                 ` Alexandre DERUMIER
2020-09-29  8:51                                                                                                                                 ` Fabian Grünbichler
2020-09-29  9:37                                                                                                                                   ` Alexandre DERUMIER
2020-09-29 10:52                                                                                                                                     ` Alexandre DERUMIER
2020-09-29 11:43                                                                                                                                       ` Alexandre DERUMIER
2020-09-29 11:50                                                                                                                                         ` Alexandre DERUMIER
2020-09-29 13:28                                                                                                                                           ` Fabian Grünbichler
2020-09-29 13:52                                                                                                                                             ` Alexandre DERUMIER
2020-09-30  6:09                                                                                                                                               ` Alexandre DERUMIER
2020-09-30  6:26                                                                                                                                                 ` Thomas Lamprecht
2020-09-15  7:58                                                       ` Thomas Lamprecht
2020-12-29 14:21   ` Josef Johansson
2020-09-04 15:46 ` Alexandre DERUMIER
2020-09-30 15:50 ` Thomas Lamprecht
2020-10-15  9:16   ` Eneko Lacunza
