From: proxmox@elchaka.de
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Corosync and Cluster reboot
Date: Wed, 08 Jan 2025 13:53:20 +0100
Message-ID: <1E4CF56F-1B37-4C06-8687-20658EAFC431@elchaka.de>
In-Reply-To: <1be81920-ed5b-4b96-938a-4f35551b9ce5@elettra.eu>
Hello,
On 8 January 2025 11:12:02 CET, Iztok Gregori <iztok.gregori@elettra.eu> wrote:
>Hi!
>
>On 07/01/25 15:15, DERUMIER, Alexandre wrote:
>> Personally, I'd recommend disabling HA temporarily during the network change (mv /etc/pve/ha/resources.cfg to a tmp directory, stop all pve-ha-lrm, then stop all pve-ha-crm to stop the watchdog)
>>
>> Then, after the migration, check the corosync logs for 1 or 2 days, and after that, if no retransmits occur, re-enable HA.
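The log check suggested above could be sketched roughly like this. It is only a sketch: it assumes that corosync reports retransmits on lines containing "Retransmit List:", and the function name count_retransmits is made up for illustration.

```python
import re

# Assumption: a totem retransmit shows up in the corosync log as a line
# containing "Retransmit List:". Adjust the pattern if your logs differ.
RETRANSMIT = re.compile(r"Retransmit List:")


def count_retransmits(log_text: str) -> int:
    """Count the log lines that report a totem retransmit."""
    return sum(1 for line in log_text.splitlines() if RETRANSMIT.search(line))
```

You could feed it the text of the corosync journal for the observation window (for example the output of journalctl -u corosync) and only re-enable HA when the count stays at 0.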
>>
>
>Good advice. But with the pve-ha-* services down, the "HA VMs" cannot migrate from one node to another, because the migration is handled by HA (or at least that is how I remember it working some time ago). So I've (temporarily) removed all the resources (VMs) from HA, which tells "pve-ha-lrm" to disable the watchdog ("watchdog closed (disabled)"), so no reboot should occur.
>
>> It's quite possible that it's a corosync bug (I remember having this kind of error with PVE 7.x)
>
>I'm leaning toward a similar conclusion, but I still don't fully understand how corosync/watchdog is handled in Proxmox.
>
>For example, I still don't know what is updating the watchdog-mux service. Is it corosync (but no "watchdog_device" is set in corosync.conf, and according to the manual, "if unset, empty or "off", no watchdog is used"), or is it pve-ha-lrm?
As far as I can tell, it's the pve-ha... service.
If you don't use HA in your cluster, the watchdog isn't used.
That's how I understand it.
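If that understanding is right, one way to see whether a node still has anything under HA control is to look at the resource list. A minimal sketch, assuming that /etc/pve/ha/resources.cfg starts each resource with an unindented "vm: <id>" or "ct: <id>" line (the function name ha_resources is made up for illustration):

```python
import re

# Assumption: each HA resource begins with an unindented header line such as
# "vm: 100" or "ct: 101"; indented option lines below it are ignored here.
HEADER = re.compile(r"^(vm|ct):\s*(\d+)\s*$")


def ha_resources(cfg_text: str) -> list[str]:
    """Return the HA resource IDs found in the given resources.cfg text."""
    ids = []
    for line in cfg_text.splitlines():
        match = HEADER.match(line)
        if match:
            ids.append(f"{match.group(1)}:{match.group(2)}")
    return ids
```

An empty list would mean no resources are configured for HA, which is the state Iztok describes after removing the VMs from HA.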
HTH
>
>I think that, after the migration, my best shot is to upgrade the cluster, but I have to find out whether newer libcephfs client libraries support old Ceph clusters.
>
>> Also, for "big" clusters (20-30 nodes), I'm now using the SCTP protocol instead of UDP. For me, it's a lot more reliable when there is network saturation on one node.
>>
>> (I had a case of an internet UDP flood attack coming from outside on one of my nodes, lagging the whole corosync cluster.)
>>
>>
>> corosync.conf
>>
>> totem {
>>   cluster_name: ....
>>   ....
>>   interface {
>>     knet_transport: sctp
>>     linknumber: 0
>>   }
>>   ....
>>
>>
>> (This needs a full restart of corosync everywhere, and HA needs to be disabled beforehand, because UDP can't communicate with SCTP, so you'll lose quorum during the change.)
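Since UDP and SCTP links cannot talk to each other, it may help to confirm that every node's corosync.conf names the same transport before the restart. A rough sketch that reads the transport per link from the config text; the function name link_transports is made up, and falling back to link 0 when linknumber is missing is an assumption (udp is the default for knet_transport as far as I know):

```python
def link_transports(conf_text: str) -> dict[str, str]:
    """Map linknumber -> knet_transport for each interface block."""
    transports = {}
    current = None  # settings of the interface block being read, if any
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # skip comment lines
        if line.startswith("interface") and line.endswith("{"):
            current = {}
        elif line == "}" and current is not None:
            # Assumption: a block without linknumber describes link 0.
            link = current.get("linknumber", "0")
            transports[link] = current.get("knet_transport", "udp")
            current = None
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return transports
```

Comparing the result across all nodes should show the same transport for each link before corosync is started again.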
>
>I've read about it, and I think I'll follow your suggestion. In those bigger clusters, have you tinkered with corosync values such as "token" or "token_retransmits_before_loss_const"?
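On "token": as far as I understand corosync.conf(5), the effective token timeout already grows with cluster size when a nodelist with at least 3 nodes is present, as token + (number_of_nodes - 2) * token_coefficient, with token_coefficient defaulting to 650 ms. A small sketch of that arithmetic (the function name is made up, and the 3000 ms token in the example is just an input value, not a recommendation):

```python
def effective_token_ms(token_ms: int, nodes: int, token_coefficient_ms: int = 650) -> int:
    """Effective totem token timeout in milliseconds, per corosync.conf(5)."""
    if nodes < 3:
        # The coefficient only applies with at least 3 nodes in the nodelist.
        return token_ms
    return token_ms + (nodes - 2) * token_coefficient_ms


# Example: token 3000 ms in a 30-node cluster -> 3000 + 28 * 650 = 21200 ms
print(effective_token_ms(3000, 30))
```

So it may be worth working out the effective value for your node count before changing "token" by hand.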
>
>Thank you!
>
>Iztok
>
>
>_______________________________________________
>pve-user mailing list
>pve-user@lists.proxmox.com
>https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user