From: Alexandre DERUMIER <aderumier@odiso.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Cc: Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
Date: Wed, 16 Sep 2020 09:34:27 +0200 (CEST)
Message-ID: <1097647242.851726.1600241667098.JavaMail.zimbra@odiso.com>
In-Reply-To: <597522514.840749.1600185513450.JavaMail.zimbra@odiso.com>
Hi,
I have reproduced the problem now, and this time I did not restart corosync right after /etc/pve locked up,
so it is currently read-only.
I have not used gdb in a long time;
could you tell me how to attach to the running pmxcfs process, and which gdb commands to run?
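(I guess the usual approach is something like the following, but please correct me; the dbgsym package name is just my assumption:)

# apt install pve-cluster-dbgsym gdb
# gdb -p $(pidof pmxcfs)
(gdb) set pagination off
(gdb) thread apply all bt
(gdb) generate-core-file /tmp/pmxcfs.core
(gdb) detach
(gdb) quit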
----- Original Message -----
From: "aderumier" <aderumier@odiso.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Cc: "Thomas Lamprecht" <t.lamprecht@proxmox.com>
Sent: Tuesday, 15 September 2020 17:58:33
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
Another small lock at 17:41:09.
To be sure, I ran a small loop on node2, writing to /etc/pve every second.
It hangs at the first corosync restart; after a second corosync restart, it works again.
I'll try to improve this tomorrow to be able to debug the corosync process:
- restart corosync
- do some writes in /etc/pve/
- if it hangs, don't restart corosync again
node2: echo test > /etc/pve/test loop
--------------------------------------
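(the loop is roughly the following, from memory:)

while true; do
    date +"Current time : %H:%M:%S"
    echo test > /etc/pve/test
    sleep 1
done

output: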
Current time : 17:41:01
Current time : 17:41:02
Current time : 17:41:03
Current time : 17:41:04
Current time : 17:41:05
Current time : 17:41:06
Current time : 17:41:07
Current time : 17:41:08
Current time : 17:41:09
hang
Current time : 17:42:05
Current time : 17:42:06
Current time : 17:42:07
node1
-----
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: PMTUD completed for host: 6 link: 0 current link mtu: 1397
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: Starting PMTUD for host: 10 link: 0
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] udp: detected kernel MTU: 1500
Sep 15 17:41:08 m6kvm1 corosync[18145]: [TOTEM ] Knet pMTU change: 1397
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: PMTUD link change for host: 10 link: 0 from 469 to 1397
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: PMTUD completed for host: 10 link: 0 current link mtu: 1397
Sep 15 17:41:08 m6kvm1 corosync[18145]: [KNET ] pmtud: Global data MTU changed to: 1397
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-31-zx6KJM/qb)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239]
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] lib_init_fn: conn=0x556c2918d5f0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] got quorum_type request on 0x556c2918d5f0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] got trackstart request on 0x556c2918d5f0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] sending initial status to 0x556c2918d5f0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QUORUM] sending quorum notification to 0x556c2918d5f0, length = 52
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-32-I7ZZ6e/qb)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239]
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CMAP ] lib_init_fn: conn=0x556c2918ef20
Sep 15 17:41:09 m6kvm1 pmxcfs[16239]: [status] notice: update cluster info (cluster name m6kvm, version = 20)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-33-6RKbvH/qb)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239]
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] lib_init_fn: conn=0x556c2918ad00, cpd=0x556c2918b50c
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] IPC credentials authenticated (/dev/shm/qb-18145-16239-34-GAY5T9/qb)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] connecting to client [16239]
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [QB ] shm size:1048589; real_size:1052672; rb->word_size:263168
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] connection created
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] lib_init_fn: conn=0x556c2918c740, cpd=0x556c2918ce8c
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Creating commit token because I am the rep.
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Saving state aru 5 high seq received 5
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Storing new sequence id for ring 1197
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] entering COMMIT state.
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] got commit token
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] entering RECOVERY state.
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] TRANS [0] member 1:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [0] member 1:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (1.1193)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 5 high delivered 5 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [1] member 2:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [2] member 3:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [3] member 4:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [4] member 5:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [5] member 6:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [6] member 7:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [7] member 8:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [8] member 9:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [9] member 10:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [10] member 11:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [11] member 12:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [12] member 13:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] position [13] member 14:
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Did not need to originate any messages in recovery.
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] got commit token
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Sending initial ORF token
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] Resetting old ring state
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] recovery to regular 1-0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] waiting_trans_ack changed to 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.90)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.91)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.92)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.93)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.94)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.95)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.96)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.97)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.107)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.108)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.109)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.110)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [MAIN ] Member joined: r(0) ip(10.3.94.111)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] call init for locally known services
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] entering OPERATIONAL state.
Sep 15 17:41:09 m6kvm1 corosync[18145]: [TOTEM ] A new membership (1.1197) was formed. Members joined: 2 3 4 5 6 7 8 9 10 11 12 13 14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] enter sync process
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] Committing synchronization for corosync configuration map access
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 2
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 3
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 4
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 5
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 6
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 7
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 8
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 9
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 10
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 11
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 12
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] got joinlist message from node 13
Sep 15 17:41:09 m6kvm1 corosync[18145]: [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] my downlist: members(old:1 left:0)
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[0] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[1] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[2] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[3] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[4] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[5] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[6] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[7] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[8] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[9] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[10] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[11] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[12] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[13] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[14] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[15] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[16] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[17] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[18] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[19] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[20] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[21] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[22] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[23] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[24] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634
Sep 15 17:41:09 m6kvm1 corosync[18145]: [CPG ] joinlist_messages[25] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: No Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] Sending nodelist callback. ring_id = 1.1197
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 13
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[13]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=2, expected_votes=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 13
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[14]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=3, expected_votes=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 14 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 14 flags: 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: No Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=3, expected_votes=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 14 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 2
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] total_votes=4, expected_votes=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 2 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 14 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 2
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm1 corosync[18145]: [VOTEQ ] got nodeinfo message from cluster node 3
....
....
At the next corosync restart:
Sep 15 17:42:03 m6kvm1 corosync[18145]: [MAIN ] Node was shut down by a signal
Sep 15 17:42:03 m6kvm1 corosync[18145]: [SERV ] Unloading all Corosync service engines.
Sep 15 17:42:03 m6kvm1 corosync[18145]: [QB ] withdrawing server sockets
Sep 15 17:42:03 m6kvm1 corosync[18145]: [QB ] qb_ipcs_unref() - destroying
Sep 15 17:42:03 m6kvm1 corosync[18145]: [SERV ] Service engine unloaded: corosync vote quorum service v1.0
Sep 15 17:42:03 m6kvm1 corosync[18145]: [QB ] qb_ipcs_disconnect(/dev/shm/qb-18145-16239-32-I7ZZ6e/qb) state:2
Sep 15 17:42:03 m6kvm1 pmxcfs[16239]: [confdb] crit: cmap_dispatch failed: 2
Sep 15 17:42:03 m6kvm1 corosync[18145]: [MAIN ] cs_ipcs_connection_closed()
Sep 15 17:42:03 m6kvm1 corosync[18145]: [CMAP ] exit_fn for conn=0x556c2918ef20
Sep 15 17:42:03 m6kvm1 corosync[18145]: [MAIN ] cs_ipcs_connection_destroyed()
node2
-----
Sep 15 17:41:05 m6kvm2 corosync[25411]: [KNET ] pmtud: Starting PMTUD for host: 10 link: 0
Sep 15 17:41:05 m6kvm2 corosync[25411]: [KNET ] udp: detected kernel MTU: 1500
Sep 15 17:41:05 m6kvm2 corosync[25411]: [KNET ] pmtud: PMTUD completed for host: 10 link: 0 current link mtu: 1397
Sep 15 17:41:07 m6kvm2 corosync[25411]: [KNET ] rx: host: 1 link: 0 received pong: 2
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [TOTEM ] entering GATHER state from 11(merge during join).
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:08 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: Source host 1 not reachable yet. Discarding packet.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] rx: host: 1 link: 0 is up
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Knet host change callback. nodeid: 1 reachable: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] pmtud: Starting PMTUD for host: 1 link: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] udp: detected kernel MTU: 1500
Sep 15 17:41:09 m6kvm2 corosync[25411]: [KNET ] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 1397
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] got commit token
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Saving state aru 123 high seq received 123
Sep 15 17:41:09 m6kvm2 corosync[25411]: [MAIN ] Storing new sequence id for ring 1197
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] entering COMMIT state.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] got commit token
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] entering RECOVERY state.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [0] member 2:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [1] member 3:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [2] member 4:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [3] member 5:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [4] member 6:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [5] member 7:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [6] member 8:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [7] member 9:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [8] member 10:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [9] member 11:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [10] member 12:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [11] member 13:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] TRANS [12] member 14:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [0] member 1:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (1.1193)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 5 high delivered 5 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [1] member 2:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [2] member 3:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [3] member 4:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [4] member 5:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [5] member 6:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [6] member 7:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [7] member 8:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [8] member 9:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [9] member 10:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [10] member 11:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [11] member 12:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [12] member 13:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] position [13] member 14:
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] previous ringid (2.1192)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] aru 123 high delivered 123 received flag 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Did not need to originate any messages in recovery.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru ffffffff
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] install seq 0 aru 0 high seq received 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] Resetting old ring state
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] recovery to regular 1-0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] waiting_trans_ack changed to 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [MAIN ] Member joined: r(0) ip(10.3.94.89)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] call init for locally known services
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] entering OPERATIONAL state.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] A new membership (1.1197) was formed. Members joined: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] enter sync process
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] Committing synchronization for corosync configuration map access
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CMAP ] Not first sync -> no action
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 3
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 4
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 5
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 6
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 7
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 8
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 9
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 10
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 11
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 12
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] downlist left_list: 0 received
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got joinlist message from node 13
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] my downlist: members(old:13 left:0)
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[0] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[1] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.110) , pid:30209
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[2] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[3] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.109) , pid:31350
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[4] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[5] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.108) , pid:3569
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[6] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[7] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.107) , pid:19504
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[8] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[9] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.97) , pid:11947
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[10] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[11] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.96) , pid:20814
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[12] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[13] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.95) , pid:39420
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[14] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[15] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.94) , pid:12452
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[16] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[17] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.93) , pid:44300
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[18] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[19] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.92) , pid:42259
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[20] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[21] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.91) , pid:40630
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[22] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[23] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.90) , pid:25870
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[24] group:pve_kvstore_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] joinlist_messages[25] group:pve_dcdb_v1\x00, ip:r(0) ip(10.3.94.111) , pid:25634
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] Sending nodelist callback. ring_id = 1.1197
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 13
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[13]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 13
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[14]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 14 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: No Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] total_votes=14, expected_votes=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 3 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 4 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 5 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 6 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 7 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 8 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 9 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 10 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 11 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 12 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 14 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 2 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] lowest node id: 1 us: 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] highest node id: 14 us: 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] total_votes=14, expected_votes=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 3 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 4 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 5 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 6 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 7 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 8 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 9 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 10 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 11 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 12 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 14 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 2 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] lowest node id: 1 us: 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] highest node id: 14 us: 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 3
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[3]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 3
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 4
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[4]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 4
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 5
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[5]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 5
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 6
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[6]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 6
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 7
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[7]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 7
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 8
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[8]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 8
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 9
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[9]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 9
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 10
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[10]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 10
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 11
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[11]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 11
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 12
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[12]: votes: 1, expected: 14 flags: 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] got nodeinfo message from cluster node 12
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [SYNC ] Committing synchronization for corosync vote quorum service v1.0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] total_votes=14, expected_votes=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 1 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 3 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 4 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 5 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 6 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 7 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 8 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 9 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 10 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 11 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 12 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 13 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 14 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] node 2 state=1, votes=1, expected=14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] lowest node id: 1 us: 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] highest node id: 14 us: 2
Sep 15 17:41:09 m6kvm2 corosync[25411]: [QUORUM] Members[14]: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Sep 15 17:41:09 m6kvm2 corosync[25411]: [QUORUM] sending quorum notification to (nil), length = 104
Sep 15 17:41:09 m6kvm2 corosync[25411]: [VOTEQ ] Sending quorum callback, quorate = 1
Sep 15 17:41:09 m6kvm2 corosync[25411]: [MAIN ] Completed service synchronization, ready to provide service.
Sep 15 17:41:09 m6kvm2 corosync[25411]: [TOTEM ] waiting_trans_ack changed to 0
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got procjoin message from cluster node 1 (r(0) ip(10.3.94.89) ) for pid 16239
Sep 15 17:41:09 m6kvm2 corosync[25411]: [CPG ] got procjoin message from cluster node 1 (r(0) ip(10.3.94.89) ) for pid 16239
----- Original Message -----
From: "aderumier" <aderumier@odiso.com>
To: "Thomas Lamprecht" <t.lamprecht@proxmox.com>
Cc: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Sent: Tuesday, 15 September 2020 16:57:46
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
>>I mean this is bad, but also great!
>>Can you do a coredump of the whole thing and upload it somewhere with the version info
>>used (for dbgsym package)? That could help a lot.
I'll try to reproduce it again (with the full lock everywhere), and do the coredump.
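(if a plain core dump of the running process is enough, I assume something like this would do:)

# gcore -o /tmp/pmxcfs $(pidof pmxcfs)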
I have tried the real-time scheduling, but I was still able to reproduce the "lrm too long" for 60s
(though as I'm restarting corosync each minute, I think something gets unlocked at the next corosync restart).
This time it was blocked, at the same moment, on one node in:

work {
    ...
    } elsif ($state eq 'active') {
        ...
        $self->update_lrm_status();
and on another node in:

if ($fence_request) {
    $haenv->log('err', "node need to be fenced - releasing agent_lock\n");
    $self->set_local_status({ state => 'lost_agent_lock'});
} elsif (!$self->get_protected_ha_agent_lock()) {
    $self->set_local_status({ state => 'lost_agent_lock'});
} elsif ($self->{mode} eq 'maintenance') {
    $self->set_local_status({ state => 'maintenance'});
}
----- Original Message -----
From: "Thomas Lamprecht" <t.lamprecht@proxmox.com>
To: "aderumier" <aderumier@odiso.com>
Cc: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Sent: Tuesday, 15 September 2020 16:32:52
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
On 9/15/20 4:09 PM, Alexandre DERUMIER wrote:
>>> Can you try to give pmxcfs real time scheduling, e.g., by doing:
>>>
>>> # systemctl edit pve-cluster
>>>
>>> And then add snippet:
>>>
>>>
>>> [Service]
>>> CPUSchedulingPolicy=rr
>>> CPUSchedulingPriority=99
> yes, sure, I'll do it now
>
>
>> I'm currently digging the logs
>>> Is your simplest/most stable reproducer still a periodic restart of corosync on one node?
> yes, a simple "systemctl restart corosync" on 1 node each minute
>
>
>
> After 1hour, it's still locked.
>
> on other nodes, I still have pmxcfs logs like:
>
I mean this is bad, but also great!
Can you do a coredump of the whole thing and upload it somewhere with the version info
used (for dbgsym package)? That could help a lot.
> manual "pmxcfs -d"
> https://gist.github.com/aderumier/4cd91d17e1f8847b93ea5f621f257c2e
>
Hmm, the fuse connection of the previous one got into a weird state (or something is still
running) but I'd rather say this is a side-effect not directly connected to the real bug.
>
> some interesting dmesg about "pvesr"
>
> [Tue Sep 15 14:45:34 2020] INFO: task pvesr:19038 blocked for more than 120 seconds.
> [Tue Sep 15 14:45:34 2020] Tainted: P O 5.4.60-1-pve #1
> [Tue Sep 15 14:45:34 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Tue Sep 15 14:45:34 2020] pvesr D 0 19038 1 0x00000080
> [Tue Sep 15 14:45:34 2020] Call Trace:
> [Tue Sep 15 14:45:34 2020] __schedule+0x2e6/0x6f0
> [Tue Sep 15 14:45:34 2020] ? filename_parentat.isra.57.part.58+0xf7/0x180
> [Tue Sep 15 14:45:34 2020] schedule+0x33/0xa0
> [Tue Sep 15 14:45:34 2020] rwsem_down_write_slowpath+0x2ed/0x4a0
> [Tue Sep 15 14:45:34 2020] down_write+0x3d/0x40
> [Tue Sep 15 14:45:34 2020] filename_create+0x8e/0x180
> [Tue Sep 15 14:45:34 2020] do_mkdirat+0x59/0x110
> [Tue Sep 15 14:45:34 2020] __x64_sys_mkdir+0x1b/0x20
> [Tue Sep 15 14:45:34 2020] do_syscall_64+0x57/0x190
> [Tue Sep 15 14:45:34 2020] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
hmm, hangs in mkdir (cluster wide locking)
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel