From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
"Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] applied: [PATCH cluster] pmxcfs sync: properly check for corosync error
Date: Fri, 25 Sep 2020 15:48:58 +0200
Message-ID: <23b09052-6b1f-ee28-0b91-7c3f629ee0c6@proxmox.com>
In-Reply-To: <954846404.464.1601041007631@webmail.proxmox.com>
On 25.09.20 15:36, Fabian Grünbichler wrote:
>
>> Thomas Lamprecht <t.lamprecht@proxmox.com> wrote on 25.09.2020 15:23:
>>
>>
>> On 25.09.20 14:53, Fabian Grünbichler wrote:
>>> dfsm_send_state_message_full always returns != 0, since it returns
>>> cs_error_t which starts with CS_OK at 1, with values >1 representing
>>> errors.
>>>
>>> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
>>> ---
>>> unfortunately not the cause of Alexandre's shutdown/restart issue, but
>>> might have caused some hangs as well since we would be stuck in
>>> START_SYNC in that case..
>>>
>>> data/src/dfsm.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>>
>>
>> applied, thanks! But as the old wrong code would have logged the critical error
>> "failed to send SYNC_START message" whenever the send worked, it either (almost)
>> never works here or this is not a probable case, else we'd have seen this earlier.
>>
>> (still a valid and appreciated fix, just noting)
>
> no, the old wrong code never triggered the error handling (log + leave), no matter whether the send worked or failed: the return value cannot be 0, so the condition is never true. if the send failed, the code assumed the state machine was now in START_SYNC mode and waited for STATE messages, which would never come since the other nodes hadn't switched to START_SYNC.
>
ah yeah, was confused about the CS_OK value for a moment
> the failure would still show up in the logs, since a cpg_mcast_joined failure is always logged verbosely, but I think it would not be obvious that it caused the state machine to take a wrong turn.
>
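
To illustrate the point discussed above, here is a minimal sketch of why a
"!ret" style check can never detect a failure when the success code is 1.
It is not the actual dfsm.c code: the enum is a local stand-in for corosync's
cs_error_t (only a few values are reproduced), and the two helper functions
are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for corosync's cs_error_t (assumption: only the values
 * needed for the illustration are listed here). */
typedef enum {
    SKETCH_CS_OK = 1,            /* success is 1, not 0 */
    SKETCH_CS_ERR_LIBRARY = 2,   /* values > 1 are errors */
    SKETCH_CS_ERR_TRY_AGAIN = 6
} sketch_cs_error_t;

/* The whole problem in one line: the success code is non-zero. */
static_assert(SKETCH_CS_OK != 0, "success code must be non-zero for this sketch");

/* Old-style check (hypothetical helper): treats a zero result as failure.
 * No value of the enum is 0, so this never reports a failure. */
bool old_check_detects_failure(sketch_cs_error_t res)
{
    return !res;
}

/* Fixed check (hypothetical helper): anything other than CS_OK is a failure. */
bool fixed_check_detects_failure(sketch_cs_error_t res)
{
    return res != SKETCH_CS_OK;
}
```

With the old-style check, both a successful and a failed send skip the
log-and-leave path; with the comparison against the success code, only a
failed send takes it.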