From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <aderumier@odiso.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 69DB96202E
 for <pve-devel@lists.proxmox.com>; Tue, 15 Sep 2020 16:57:49 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 599381BD5C
 for <pve-devel@lists.proxmox.com>; Tue, 15 Sep 2020 16:57:49 +0200 (CEST)
Received: from mailpro.odiso.net (mailpro.odiso.net [89.248.211.110])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 667CD1BD50
 for <pve-devel@lists.proxmox.com>; Tue, 15 Sep 2020 16:57:47 +0200 (CEST)
Received: from localhost (localhost [127.0.0.1])
 by mailpro.odiso.net (Postfix) with ESMTP id 0A6221A5F8DF;
 Tue, 15 Sep 2020 16:57:47 +0200 (CEST)
Received: from mailpro.odiso.net ([127.0.0.1])
 by localhost (mailpro.odiso.net [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id HuuqKSI2eD2c; Tue, 15 Sep 2020 16:57:46 +0200 (CEST)
Received: from localhost (localhost [127.0.0.1])
 by mailpro.odiso.net (Postfix) with ESMTP id E09C61A5F8E3;
 Tue, 15 Sep 2020 16:57:46 +0200 (CEST)
X-Virus-Scanned: amavisd-new at mailpro.odiso.com
Received: from mailpro.odiso.net ([127.0.0.1])
 by localhost (mailpro.odiso.net [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id FvrfqgdOm77N; Tue, 15 Sep 2020 16:57:46 +0200 (CEST)
Received: from mailpro.odiso.net (mailpro.odiso.net [10.1.31.111])
 by mailpro.odiso.net (Postfix) with ESMTP id C91821A5F8DF;
 Tue, 15 Sep 2020 16:57:46 +0200 (CEST)
Date: Tue, 15 Sep 2020 16:57:46 +0200 (CEST)
From: Alexandre DERUMIER <aderumier@odiso.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Message-ID: <132388307.839866.1600181866529.JavaMail.zimbra@odiso.com>
In-Reply-To: <6b680921-12d0-006b-6d04-bbe1c4bb04f8@proxmox.com>
References: <216436814.339545.1599142316781.JavaMail.zimbra@odiso.com>
 <98e79e8d-9001-db77-c032-bdfcdb3698a6@proxmox.com>
 <1282130277.831843.1600164947209.JavaMail.zimbra@odiso.com>
 <1732268946.834480.1600167871823.JavaMail.zimbra@odiso.com>
 <1800811328.836757.1600174194769.JavaMail.zimbra@odiso.com>
 <43250fdc-55ba-03d9-2507-a2b08c5945ce@proxmox.com>
 <1798333820.838842.1600178990068.JavaMail.zimbra@odiso.com>
 <6b680921-12d0-006b-6d04-bbe1c4bb04f8@proxmox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Mailer: Zimbra 8.8.12_GA_3866 (ZimbraWebClient - GC83 (Linux)/8.8.12_GA_3844)
Thread-Topic: corosync bug: cluster break after 1 node clean shutdown
Thread-Index: fqzQ8CV4gT3UroNXiJlm8US/HHWe/A==
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.089 Adjusted score from AWL reputation of From: address
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_DNSWL_NONE     -0.0001 Sender listed at https://www.dnswl.org/,
 no trust
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean
 shutdown
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 15 Sep 2020 14:57:49 -0000

>>I mean this is bad, but also great!
>>Can you do a coredump of the whole thing and upload it somewhere with the version info
>>used (for dbgsym package)? That could help a lot.

I'll try to reproduce it again (with the full lock everywhere), and do the coredump.
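
A minimal sketch of how the dump and version info could be collected together, assuming gdb's gcore and pidof are installed on the node. The function name collect_pmxcfs_dump and the output directory are made up for illustration; they are not part of any PVE tooling.

```shell
# Hypothetical helper: dump the running pmxcfs process and record package
# versions next to it, so the matching dbgsym packages can be installed.
collect_pmxcfs_dump() {
    outdir="${1:-/var/tmp/pmxcfs-dump}"   # assumed location, adjust as needed

    # pidof -s prints a single PID; bail out before touching the disk
    # if pmxcfs is not running.
    pid="$(pidof -s pmxcfs)" || {
        echo "pmxcfs is not running" >&2
        return 1
    }

    mkdir -p "$outdir" || return 1

    # gcore writes the core file as <prefix>.<pid>
    gcore -o "$outdir/pmxcfs.core" "$pid" || return 1

    # Package versions of the PVE stack
    pveversion -v > "$outdir/versions.txt"
}
```

Calling `collect_pmxcfs_dump /var/tmp/dump1` on the blocked node would leave the core file and versions.txt in that directory, ready to upload.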


I have tried the real-time scheduling, but I was still able to reproduce the "lrm too long" for 60s (as I'm restarting corosync each minute, I think the next corosync restart unlocks something).
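
For reference, the reproducer amounts to a loop along these lines. This is a sketch only: restart_loop is a name made up here, and the timestamp line is added just to make it easier to correlate restarts with the logs.

```shell
# Hypothetical reproducer: run a command COUNT times, INTERVAL seconds apart.
# usage: restart_loop INTERVAL COUNT [COMMAND...]
restart_loop() {
    interval="$1"
    count="$2"
    shift 2

    # Default to the command used in this thread.
    [ "$#" -gt 0 ] || set -- systemctl restart corosync

    i=0
    while [ "$i" -lt "$count" ]; do
        # Log when each restart happens.
        echo "$(date '+%F %T') restart #$((i + 1))"
        "$@"
        sleep "$interval"
        i=$((i + 1))
    done
}
```

For example, `restart_loop 60 120` on one node would restart corosync once a minute for two hours.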


this time it was blocked at the same time on a node in:

work {
...
   } elsif ($state eq 'active') {
      ....
        $self->update_lrm_status();


and another node in

        if ($fence_request) {
            $haenv->log('err', "node need to be fenced - releasing agent_lock\n");
            $self->set_local_status({ state => 'lost_agent_lock'});
        } elsif (!$self->get_protected_ha_agent_lock()) {
            $self->set_local_status({ state => 'lost_agent_lock'});
        } elsif ($self->{mode} eq 'maintenance') {
            $self->set_local_status({ state => 'maintenance'});
        }


----- Original Message -----
From: "Thomas Lamprecht" <t.lamprecht@proxmox.com>
To: "aderumier" <aderumier@odiso.com>
Cc: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Sent: Tuesday, 15 September 2020 16:32:52
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown

On 9/15/20 4:09 PM, Alexandre DERUMIER wrote:
>>> Can you try to give pmxcfs real time scheduling, e.g., by doing:
>>>
>>> # systemctl edit pve-cluster
>>>
>>> And then add snippet:
>>>
>>>
>>> [Service]
>>> CPUSchedulingPolicy=rr
>>> CPUSchedulingPriority=99
> yes, sure, I'll do it now
>
>
>> I'm currently digging the logs
>>> Is your simplest/most stable reproducer still a periodic restart of corosync in one node?
> yes, a simple "systemctl restart corosync" on 1 node each minute
>
>
>
> After 1 hour, it's still locked.
>
> on other nodes, I still have pmxcfs logs like:
>

I mean this is bad, but also great!
Can you do a coredump of the whole thing and upload it somewhere with the version info
used (for dbgsym package)? That could help a lot.


> manual "pmxcfs -d"
> https://gist.github.com/aderumier/4cd91d17e1f8847b93ea5f621f257c2e
>

Hmm, the fuse connection of the previous one got into a weird state (or something is still
running) but I'd rather say this is a side-effect not directly connected to the real bug.

>
> some interesting dmesg about "pvesr"
>
> [Tue Sep 15 14:45:34 2020] INFO: task pvesr:19038 blocked for more than 120 seconds.
> [Tue Sep 15 14:45:34 2020] Tainted: P O 5.4.60-1-pve #1
> [Tue Sep 15 14:45:34 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Tue Sep 15 14:45:34 2020] pvesr D 0 19038 1 0x00000080
> [Tue Sep 15 14:45:34 2020] Call Trace:
> [Tue Sep 15 14:45:34 2020] __schedule+0x2e6/0x6f0
> [Tue Sep 15 14:45:34 2020] ? filename_parentat.isra.57.part.58+0xf7/0x180
> [Tue Sep 15 14:45:34 2020] schedule+0x33/0xa0
> [Tue Sep 15 14:45:34 2020] rwsem_down_write_slowpath+0x2ed/0x4a0
> [Tue Sep 15 14:45:34 2020] down_write+0x3d/0x40
> [Tue Sep 15 14:45:34 2020] filename_create+0x8e/0x180
> [Tue Sep 15 14:45:34 2020] do_mkdirat+0x59/0x110
> [Tue Sep 15 14:45:34 2020] __x64_sys_mkdir+0x1b/0x20
> [Tue Sep 15 14:45:34 2020] do_syscall_64+0x57/0x190
> [Tue Sep 15 14:45:34 2020] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>

hmm, hangs in mkdir (cluster wide locking)
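
To illustrate why a mkdir shows up on a locking path: creating a directory either succeeds or fails atomically, so it can serve as a lock primitive. The sketch below shows that general pattern on a local filesystem only. It is not the pmxcfs or PVE implementation, and acquire_lock and release_lock are hypothetical names.

```shell
# Try to take a lock by creating a directory. mkdir fails if the directory
# already exists, so only one caller can win.
# usage: acquire_lock LOCKDIR [TIMEOUT_SECONDS]
acquire_lock() {
    lockdir="$1"
    timeout="${2:-10}"
    waited=0
    while ! mkdir "$lockdir" 2>/dev/null; do
        # Give up once the timeout has elapsed.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Release the lock by removing the directory again.
release_lock() {
    rmdir "$1"
}
```

Note that a retry loop like this only helps when mkdir returns an error. In the trace above the mkdir syscall itself never returns (the task sits in D state), so no userspace timeout would fire.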