From: Thomas Lamprecht
To: Alexandre DERUMIER, dietmar
Cc: Proxmox VE development discussion
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
Date: Tue, 15 Sep 2020 09:58:09 +0200
Message-ID: <94ccda38-3f20-3fd5-0e00-d0fd6ef1fc53@proxmox.com>
In-Reply-To: <295606419.745430.1600151269212.JavaMail.zimbra@odiso.com>

On 9/15/20 8:27 AM, Alexandre DERUMIER wrote:
>>> This is by intention - we do not want to stop pmxcfs only because the corosync service stops.
>
> Yes, but at shutdown, it could be great to stop pmxcfs before corosync ?
> I ask the question because the 2 times I had the problem, it was when shutting down a server.
> So maybe some strange behaviour occurs when both corosync && pmxcfs are stopped at the same time ?
>
>
> looking at the pve-cluster unit file,
> why do we have "Before=corosync.service" and not "After=corosync.service" ?

We may need to sync the cluster corosync.conf over to the local one, and that
can only happen before corosync starts.
Also, if we shut down pmxcfs before corosync, we may still get corosync events
(file writes, locking, ...) that the node no longer sees locally while it still
looks quorate to the others, which would not be good.

>
> I have tried to change this, but even with that, both are still shutting down in parallel.
>
> the only way I have found to get a clean shutdown is "Requires=corosync.service" + "After=corosync.service".
> But that means that if you restart corosync, it restarts pmxcfs first too.
>
> I have looked at the systemd doc, After= should be enough (as at shutdown it does the reverse order),
> but I don't know why corosync doesn't wait for pve-cluster ???
>
>
> (Also, I think that pmxcfs is also stopping after syslog, because I never see the pmxcfs "teardown filesystem" logs at shutdown)

Is that true for (persistent) systemd-journald too? IIRC syslog.target is
deprecated and only rsyslog provides it.

As the next Debian will enable the persistent journal by default, and we already
use it everywhere (IIRC) where we provide an interface to logs, we will
probably not enable rsyslog by default with PVE 7.x.

But if we can add some ordering to improve this, I'm open to it.
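
To make the ordering argument concrete, here is a toy model of how a "Before=" edge translates into start and stop order. It is only a sketch: the function names are made up for illustration, it is not how systemd is implemented, and it assumes both units are stopped as part of the same shutdown. It relies on the documented systemd behaviour that units with an ordering dependency are shut down in the inverse of their start-up order.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def start_order(before):
    """Return a start-up order that honours Before= edges.

    `before` maps a unit name to the units it must be started before.
    (Hypothetical helper, not a systemd API.)
    """
    ts = TopologicalSorter()
    for unit, later_units in before.items():
        ts.add(unit)
        for later in later_units:
            # `later` may only start once `unit` has been started
            ts.add(later, unit)
    return list(ts.static_order())


def stop_order(before):
    """Shutdown uses the inverse of the start-up order."""
    return list(reversed(start_order(before)))


# Ordering as described in this thread: pve-cluster has Before=corosync.service
before = {"pve-cluster.service": ["corosync.service"]}

print("start:", start_order(before))
print("stop: ", stop_order(before))
```

Under these assumptions pve-cluster (pmxcfs) starts first and corosync stops first, which matches the intent described above: pmxcfs is still running while corosync can deliver events.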