From: Alexandre DERUMIER <aderumier@odiso.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
Date: Tue, 15 Sep 2020 16:09:50 +0200 (CEST)
Message-ID: <1798333820.838842.1600178990068.JavaMail.zimbra@odiso.com>
In-Reply-To: <43250fdc-55ba-03d9-2507-a2b08c5945ce@proxmox.com>

>>
>>Can you try to give pmxcfs real time scheduling, e.g., by doing: 
>>
>># systemctl edit pve-cluster 
>>
>>And then add snippet: 
>>
>>
>>[Service] 
>>CPUSchedulingPolicy=rr 
>>CPUSchedulingPriority=99 

yes, sure, I'll do it now


> I'm currently digging the logs 
>>Is your most simplest/stable reproducer still a periodic restart of corosync in one node? 

Yes, a simple "systemctl restart corosync" on one node every minute.
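
A minimal sketch of this reproducer, assuming it is driven from Python rather than cron or a shell loop. The function name, the iteration count and the injectable `run`/`sleep` parameters are illustrative assumptions, not something used in this thread:

```python
import subprocess
import time


def restart_periodically(cmd, interval_s, iterations,
                         run=subprocess.run, sleep=time.sleep):
    """Run cmd `iterations` times, waiting interval_s seconds between runs.

    Returns the list of exit codes so a failing restart is easy to spot.
    `run` and `sleep` are injectable only to allow dry runs.
    """
    codes = []
    for i in range(iterations):
        result = run(cmd, check=False)
        codes.append(result.returncode)
        # No need to wait after the final restart.
        if i < iterations - 1:
            sleep(interval_s)
    return codes


# Hypothetical usage, as root on a single node (one restart per minute
# for one hour):
#   restart_periodically(["systemctl", "restart", "corosync"], 60, 60)
```

The exit codes are collected instead of raising on failure, so the loop keeps going even if one restart fails.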



After 1 hour, /etc/pve is still locked.

On the other nodes, I still see pmxcfs logs like:

Sep 15 15:36:31 m6kvm2 pmxcfs[3474]: [status] notice: received log
Sep 15 15:46:21 m6kvm2 pmxcfs[3474]: [status] notice: received log
Sep 15 15:46:23 m6kvm2 pmxcfs[3474]: [status] notice: received log
...


On node1, I just restarted the pve-cluster service with "systemctl restart pve-cluster".
The pmxcfs process was killed, but it could not be started again.
After that, /etc/pve became writable again on the other nodes.

(I haven't rebooted node1 yet, in case you want more tests on pmxcfs.)



root@m6kvm1:~# systemctl status pve-cluster
● pve-cluster.service - The Proxmox VE cluster filesystem
   Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-09-15 15:52:11 CEST; 3min 29s ago
  Process: 12536 ExecStart=/usr/bin/pmxcfs (code=exited, status=255/EXCEPTION)

Sep 15 15:52:11 m6kvm1 systemd[1]: pve-cluster.service: Service RestartSec=100ms expired, scheduling restart.
Sep 15 15:52:11 m6kvm1 systemd[1]: pve-cluster.service: Scheduled restart job, restart counter is at 5.
Sep 15 15:52:11 m6kvm1 systemd[1]: Stopped The Proxmox VE cluster filesystem.
Sep 15 15:52:11 m6kvm1 systemd[1]: pve-cluster.service: Start request repeated too quickly.
Sep 15 15:52:11 m6kvm1 systemd[1]: pve-cluster.service: Failed with result 'exit-code'.
Sep 15 15:52:11 m6kvm1 systemd[1]: Failed to start The Proxmox VE cluster filesystem.

Output of a manual "pmxcfs -d":
https://gist.github.com/aderumier/4cd91d17e1f8847b93ea5f621f257c2e




Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_initialize failed: 2
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [quorum] crit: can't initialize service
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_initialize failed: 2
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [confdb] crit: can't initialize service
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [dcdb] notice: start cluster connection
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_initialize failed: 2
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [dcdb] crit: can't initialize service
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [status] notice: start cluster connection
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [status] crit: cpg_initialize failed: 2
Sep 15 14:38:24 m6kvm1 pmxcfs[3491]: [status] crit: can't initialize service
Sep 15 14:38:30 m6kvm1 pmxcfs[3491]: [status] notice: update cluster info (cluster name  m6kvm, version = 20)
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: node has quorum
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: starting data syncronisation
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: starting data syncronisation
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: received sync request (epoch 1/3491/00000064)
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: received sync request (epoch 1/3491/00000063)
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: received all states
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: leader is 2/3474
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: synced members: 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: waiting for updates from leader
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: dfsm_deliver_queue: queue length 23
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: received all states
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: all data is up to date
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [status] notice: dfsm_deliver_queue: queue length 157
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: update complete - trying to commit (got 4 inode updates)
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: all data is up to date
Sep 15 14:38:32 m6kvm1 pmxcfs[3491]: [dcdb] notice: dfsm_deliver_sync_queue: queue length 31
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_dispatch failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [status] crit: cpg_dispatch failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [status] crit: cpg_leave failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_dispatch failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_leave failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_dispatch failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [status] notice: node lost quorum
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_initialize failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [quorum] crit: can't initialize service
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_initialize failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [confdb] crit: can't initialize service
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [dcdb] notice: start cluster connection
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_initialize failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [dcdb] crit: can't initialize service
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [status] notice: start cluster connection
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [status] crit: cpg_initialize failed: 2
Sep 15 14:39:25 m6kvm1 pmxcfs[3491]: [status] crit: can't initialize service
Sep 15 14:39:31 m6kvm1 pmxcfs[3491]: [status] notice: update cluster info (cluster name  m6kvm, version = 20)
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [status] notice: node has quorum
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: starting data syncronisation
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [status] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [status] notice: starting data syncronisation
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: received sync request (epoch 1/3491/00000065)
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [status] notice: received sync request (epoch 1/3491/00000064)
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: received all states
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: leader is 2/3474
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: synced members: 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: waiting for updates from leader
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: dfsm_deliver_queue: queue length 20
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [status] notice: received all states
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [status] notice: all data is up to date
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: update complete - trying to commit (got 9 inode updates)
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: all data is up to date
Sep 15 14:39:33 m6kvm1 pmxcfs[3491]: [dcdb] notice: dfsm_deliver_sync_queue: queue length 25
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_dispatch failed: 2
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [status] crit: cpg_dispatch failed: 2
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [status] crit: cpg_leave failed: 2
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_dispatch failed: 2
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_leave failed: 2
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_dispatch failed: 2
Sep 15 14:40:26 m6kvm1 pmxcfs[3491]: [status] notice: node lost quorum
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_initialize failed: 2
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [quorum] crit: can't initialize service
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_initialize failed: 2
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [confdb] crit: can't initialize service
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [dcdb] notice: start cluster connection
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_initialize failed: 2
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [dcdb] crit: can't initialize service
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [status] notice: start cluster connection
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [status] crit: cpg_initialize failed: 2
Sep 15 14:40:27 m6kvm1 pmxcfs[3491]: [status] crit: can't initialize service
Sep 15 14:40:33 m6kvm1 pmxcfs[3491]: [status] notice: update cluster info (cluster name  m6kvm, version = 20)
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: node has quorum
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: starting data syncronisation
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: starting data syncronisation
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: received sync request (epoch 1/3491/00000066)
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: received sync request (epoch 1/3491/00000065)
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: received all states
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: leader is 2/3474
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: synced members: 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: waiting for updates from leader
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: dfsm_deliver_queue: queue length 23
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: received all states
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: all data is up to date
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [status] notice: dfsm_deliver_queue: queue length 87
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: update complete - trying to commit (got 6 inode updates)
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: all data is up to date
Sep 15 14:40:34 m6kvm1 pmxcfs[3491]: [dcdb] notice: dfsm_deliver_sync_queue: queue length 33
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_dispatch failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [status] crit: cpg_dispatch failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [status] crit: cpg_leave failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_dispatch failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_leave failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_dispatch failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [status] notice: node lost quorum
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [quorum] crit: quorum_initialize failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [quorum] crit: can't initialize service
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [confdb] crit: cmap_initialize failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [confdb] crit: can't initialize service
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [dcdb] notice: start cluster connection
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [dcdb] crit: cpg_initialize failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [dcdb] crit: can't initialize service
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [status] notice: start cluster connection
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [status] crit: cpg_initialize failed: 2
Sep 15 14:41:28 m6kvm1 pmxcfs[3491]: [status] crit: can't initialize service
Sep 15 14:41:34 m6kvm1 pmxcfs[3491]: [status] notice: update cluster info (cluster name  m6kvm, version = 20)
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [status] notice: node has quorum
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [dcdb] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [dcdb] notice: starting data syncronisation
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [status] notice: members: 1/3491, 2/3474, 3/3566, 4/3805, 5/3835, 6/3862, 7/3797, 8/3808, 9/9541, 10/3787, 11/3799, 12/3795, 13/3776, 14/3778
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [status] notice: starting data syncronisation
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [dcdb] notice: received sync request (epoch 1/3491/00000067)
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [status] notice: received sync request (epoch 1/3491/00000066)
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [status] notice: received all states
Sep 15 14:41:35 m6kvm1 pmxcfs[3491]: [status] notice: all data is up to date
Sep 15 14:47:54 m6kvm1 pmxcfs[3491]: [status] notice: received log
Sep 15 15:02:55 m6kvm1 pmxcfs[3491]: [status] notice: received log
Sep 15 15:17:56 m6kvm1 pmxcfs[3491]: [status] notice: received log
Sep 15 15:32:57 m6kvm1 pmxcfs[3491]: [status] notice: received log
Sep 15 15:47:58 m6kvm1 pmxcfs[3491]: [status] notice: received log
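
In the last cycle above (14:41:35), [dcdb] logs a sync request (epoch 1/3491/00000067) but never logs "received all states", while [status] does. That may be the point where things get stuck. A minimal sketch for spotting such unfinished syncs in a saved log follows; the function name and the regular expressions are illustrative assumptions based only on the line format shown here, not part of pmxcfs:

```python
import re

# Matches syslog-style pmxcfs notice lines as they appear in this mail.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ [\d:]+) \S+ pmxcfs\[\d+\]: \[(\w+)\] notice: (.*)$"
)
EPOCH_RE = re.compile(r"^received sync request \(epoch ([^)]+)\)")


def unfinished_syncs(lines):
    """Return (timestamp, subsystem, epoch) for every sync request that is
    not followed by "received all states" from the same subsystem."""
    pending = {}      # subsystem -> (timestamp, subsystem, epoch)
    unfinished = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # crit lines and unrelated output are ignored
        ts, subsys, msg = m.groups()
        epoch = EPOCH_RE.match(msg)
        if epoch:
            # A new request while one is still pending: the old one never finished.
            if subsys in pending:
                unfinished.append(pending[subsys])
            pending[subsys] = (ts, subsys, epoch.group(1))
        elif msg == "received all states":
            pending.pop(subsys, None)
    unfinished.extend(pending.values())
    return unfinished
```

Fed the log above line by line, this should report only the [dcdb] request with epoch 1/3491/00000067.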

----> restart
 2352  [ 15/09/2020 15:52:00 ] systemctl restart pve-cluster


Sep 15 15:52:10 m6kvm1 pmxcfs[12438]: [main] crit: fuse_mount error: Transport endpoint is not connected
Sep 15 15:52:10 m6kvm1 pmxcfs[12438]: [main] notice: exit proxmox configuration filesystem (-1)
Sep 15 15:52:10 m6kvm1 pmxcfs[12529]: [main] crit: fuse_mount error: Transport endpoint is not connected
Sep 15 15:52:10 m6kvm1 pmxcfs[12529]: [main] notice: exit proxmox configuration filesystem (-1)
Sep 15 15:52:10 m6kvm1 pmxcfs[12531]: [main] crit: fuse_mount error: Transport endpoint is not connected
Sep 15 15:52:10 m6kvm1 pmxcfs[12531]: [main] notice: exit proxmox configuration filesystem (-1)
Sep 15 15:52:11 m6kvm1 pmxcfs[12533]: [main] crit: fuse_mount error: Transport endpoint is not connected
Sep 15 15:52:11 m6kvm1 pmxcfs[12533]: [main] notice: exit proxmox configuration filesystem (-1)
Sep 15 15:52:11 m6kvm1 pmxcfs[12536]: [main] crit: fuse_mount error: Transport endpoint is not connected
Sep 15 15:52:11 m6kvm1 pmxcfs[12536]: [main] notice: exit proxmox configuration filesystem (-1)
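
"Transport endpoint is not connected" is the message for errno ENOTCONN, which is what a FUSE mountpoint typically returns when its daemon is gone but the mount is still in place. Whether that is really the state of /etc/pve on node1 is an assumption that was not verified here. A minimal sketch for checking it (the function name and the injectable `stat` parameter are illustrative):

```python
import errno
import os


def is_stale_fuse_mount(path, stat=os.stat):
    """Return True if stat(path) fails with ENOTCONN.

    ENOTCONN on a mountpoint usually means the FUSE daemon behind it has
    exited while the mount itself was left behind.
    """
    try:
        stat(path)
    except OSError as exc:
        return exc.errno == errno.ENOTCONN
    return False


# Hypothetical usage on the affected node:
#   is_stale_fuse_mount("/etc/pve")
```

If this returns True, unmounting the leftover mountpoint (for example with "fusermount -u /etc/pve" or "umount -l /etc/pve") before starting pve-cluster again might help, but that was not tested in this thread.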


Some interesting dmesg output about "pvesr":

[Tue Sep 15 14:45:34 2020] INFO: task pvesr:19038 blocked for more than 120 seconds.
[Tue Sep 15 14:45:34 2020]       Tainted: P           O      5.4.60-1-pve #1
[Tue Sep 15 14:45:34 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Sep 15 14:45:34 2020] pvesr           D    0 19038      1 0x00000080
[Tue Sep 15 14:45:34 2020] Call Trace:
[Tue Sep 15 14:45:34 2020]  __schedule+0x2e6/0x6f0
[Tue Sep 15 14:45:34 2020]  ? filename_parentat.isra.57.part.58+0xf7/0x180
[Tue Sep 15 14:45:34 2020]  schedule+0x33/0xa0
[Tue Sep 15 14:45:34 2020]  rwsem_down_write_slowpath+0x2ed/0x4a0
[Tue Sep 15 14:45:34 2020]  down_write+0x3d/0x40
[Tue Sep 15 14:45:34 2020]  filename_create+0x8e/0x180
[Tue Sep 15 14:45:34 2020]  do_mkdirat+0x59/0x110
[Tue Sep 15 14:45:34 2020]  __x64_sys_mkdir+0x1b/0x20
[Tue Sep 15 14:45:34 2020]  do_syscall_64+0x57/0x190
[Tue Sep 15 14:45:34 2020]  entry_SYSCALL_64_after_hwframe+0x44/0xa9




----- Original Message -----
From: "Thomas Lamprecht" <t.lamprecht@proxmox.com>
To: "aderumier" <aderumier@odiso.com>, "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Sent: Tuesday, 15 September 2020 15:00:03
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

On 9/15/20 2:49 PM, Alexandre DERUMIER wrote: 
> Hi, 
> 
> I have reproduced it again, 
> 
> now I can't write to /etc/pve/ from any node 
> 

OK, so seems to really be an issue in pmxcfs or between corosync and pmxcfs, 
not the HA LRM or watchdog mux itself. 

Can you try to give pmxcfs real time scheduling, e.g., by doing: 

# systemctl edit pve-cluster 

And then add snippet: 


[Service] 
CPUSchedulingPolicy=rr 
CPUSchedulingPriority=99 


And restart pve-cluster 

> I have also added some debug logs to pve-ha-lrm, and it was stuck in: 
> (but if /etc/pve is locked, this is normal) 
> 
> if ($fence_request) {
>     $haenv->log('err', "node need to be fenced - releasing agent_lock\n");
>     $self->set_local_status({ state => 'lost_agent_lock'});
> } elsif (!$self->get_protected_ha_agent_lock()) {
>     $self->set_local_status({ state => 'lost_agent_lock'});
> } elsif ($self->{mode} eq 'maintenance') {
>     $self->set_local_status({ state => 'maintenance'});
> }
> 
> 
> corosync quorum is currently ok 
> 
> I'm currently digging the logs 
Is your most simplest/stable reproducer still a periodic restart of corosync in one node? 



