From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <t.lamprecht@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 8D2CD62021
 for <pve-devel@lists.proxmox.com>; Tue, 15 Sep 2020 16:33:26 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 7170F1BAC0
 for <pve-devel@lists.proxmox.com>; Tue, 15 Sep 2020 16:32:56 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [212.186.127.180])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id B39081BAB1
 for <pve-devel@lists.proxmox.com>; Tue, 15 Sep 2020 16:32:54 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 7F64A44C30;
 Tue, 15 Sep 2020 16:32:54 +0200 (CEST)
To: Alexandre DERUMIER <aderumier@odiso.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <216436814.339545.1599142316781.JavaMail.zimbra@odiso.com>
 <1746620611.752896.1600159335616.JavaMail.zimbra@odiso.com>
 <1464606394.823230.1600162557186.JavaMail.zimbra@odiso.com>
 <98e79e8d-9001-db77-c032-bdfcdb3698a6@proxmox.com>
 <1282130277.831843.1600164947209.JavaMail.zimbra@odiso.com>
 <1732268946.834480.1600167871823.JavaMail.zimbra@odiso.com>
 <1800811328.836757.1600174194769.JavaMail.zimbra@odiso.com>
 <43250fdc-55ba-03d9-2507-a2b08c5945ce@proxmox.com>
 <1798333820.838842.1600178990068.JavaMail.zimbra@odiso.com>
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
Message-ID: <6b680921-12d0-006b-6d04-bbe1c4bb04f8@proxmox.com>
Date: Tue, 15 Sep 2020 16:32:52 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:81.0) Gecko/20100101
 Thunderbird/81.0
MIME-Version: 1.0
In-Reply-To: <1798333820.838842.1600178990068.JavaMail.zimbra@odiso.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.199 Adjusted score from AWL reputation of From: address
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 RCVD_IN_DNSWL_MED        -2.3 Sender listed at https://www.dnswl.org/,
 medium trust
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean
 shutdown
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 15 Sep 2020 14:33:26 -0000

On 9/15/20 4:09 PM, Alexandre DERUMIER wrote:
>>> Can you try to give pmxcfs real time scheduling, e.g., by doing: 
>>>
>>> # systemctl edit pve-cluster 
>>>
>>> And then add snippet: 
>>>
>>>
>>> [Service] 
>>> CPUSchedulingPolicy=rr 
>>> CPUSchedulingPriority=99 
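
(side note: after applying that and restarting pve-cluster, a quick

# chrt -p $(pidof pmxcfs)

should report SCHED_RR with priority 99, just to confirm the override actually took
effect; chrt comes with util-linux, so it should already be there)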
> yes, sure, I'll do it now
> 
> 
>> I'm currently digging the logs 
>>> Is your most simplest/stable reproducer still a periodic restart of corosync in one node? 
> yes, a simple "systemctl restart corosync" on 1 node each minute
> 
> 
> 
> After 1hour, it's still locked.
> 
> on other nodes, I still have pmxfs logs like:
> 

I mean this is bad, but also great!
Can you do a coredump of the whole thing and upload it somewhere, together with the
version info used (for the dbgsym package)? That could help a lot.
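
For reference, something like this should work to grab it while the thing is hung
(assuming gdb is installed, gcore ships with it):

# gcore -o /tmp/pmxcfs-hung $(pidof pmxcfs)
# pveversion -v > /tmp/pmxcfs-hung.versions

The second command is just so we can pull in the matching debug symbols for that
exact pve-cluster build.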


> manual "pmxcfs -d"
> https://gist.github.com/aderumier/4cd91d17e1f8847b93ea5f621f257c2e
> 

Hmm, the fuse connection of the previous one got into a weird state (or something is
still running), but I'd rather say this is a side effect and not directly connected to
the real bug.
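
If you want to double check that theory, plain standard tools should show whether the
old mount is still around and who is holding it, e.g.:

# findmnt /etc/pve
# fuser -vm /etc/pve

If it really is just a stale leftover, a lazy unmount (fusermount -uz /etc/pve) should
clear it, but as said, I'd treat that as a symptom rather than the actual bug.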

> 
> some interesting dmesg about "pvesr"
> 
> [Tue Sep 15 14:45:34 2020] INFO: task pvesr:19038 blocked for more than 120 seconds.
> [Tue Sep 15 14:45:34 2020]       Tainted: P           O      5.4.60-1-pve #1
> [Tue Sep 15 14:45:34 2020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Tue Sep 15 14:45:34 2020] pvesr           D    0 19038      1 0x00000080
> [Tue Sep 15 14:45:34 2020] Call Trace:
> [Tue Sep 15 14:45:34 2020]  __schedule+0x2e6/0x6f0
> [Tue Sep 15 14:45:34 2020]  ? filename_parentat.isra.57.part.58+0xf7/0x180
> [Tue Sep 15 14:45:34 2020]  schedule+0x33/0xa0
> [Tue Sep 15 14:45:34 2020]  rwsem_down_write_slowpath+0x2ed/0x4a0
> [Tue Sep 15 14:45:34 2020]  down_write+0x3d/0x40
> [Tue Sep 15 14:45:34 2020]  filename_create+0x8e/0x180
> [Tue Sep 15 14:45:34 2020]  do_mkdirat+0x59/0x110
> [Tue Sep 15 14:45:34 2020]  __x64_sys_mkdir+0x1b/0x20
> [Tue Sep 15 14:45:34 2020]  do_syscall_64+0x57/0x190
> [Tue Sep 15 14:45:34 2020]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 

hmm, it hangs in mkdir (cluster-wide locking)
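
That fits: pvesr takes its cluster-wide lock by creating a directory on /etc/pve
(IIRC under /etc/pve/priv/lock), so once pmxcfs stops answering, that mkdir just sits
in the kernel waiting for the fuse request to complete. If it happens again, the
kernel side can be double checked without touching /etc/pve (anything on that mount
would hang as well) via:

# cat /proc/19038/stack

with the PID taken from the hung-task message above. The interesting question is
still why pmxcfs stopped answering in the first place, though.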