From: proxmox@elchaka.de
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Corosync and Cluster reboot
Date: Wed, 08 Jan 2025 13:53:20 +0100
Message-ID: <1E4CF56F-1B37-4C06-8687-20658EAFC431@elchaka.de>
In-Reply-To: <1be81920-ed5b-4b96-938a-4f35551b9ce5@elettra.eu>

Hello, 

On 8 January 2025 11:12:02 CET, Iztok Gregori <iztok.gregori@elettra.eu> wrote:
>Hi!
>
>On 07/01/25 15:15, DERUMIER, Alexandre wrote:
>> Personally, I'd recommend disabling HA temporarily during the network change (mv /etc/pve/ha/resources.cfg to a tmp directory, stop all pve-ha-lrm, then stop all pve-ha-crm to stop the watchdog).
>> 
>> Then, after the migration, check the corosync logs for 1 or 2 days, and after that, if no retransmits occur, re-enable HA.
>> 
>
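
The log check suggested above could be sketched roughly like this. It is only an illustration: it assumes corosync reports retransmits on lines containing "Retransmit List", and that the log text has already been exported from the journal. The function name count_retransmits and the sample lines are made up for this example.

```python
from typing import Iterable


def count_retransmits(lines: Iterable[str]) -> int:
    """Count log lines that look like corosync retransmit reports."""
    # Assumption: retransmits appear as "[TOTEM ] Retransmit List: ..." lines.
    return sum(1 for line in lines if "retransmit list" in line.lower())


# Made-up sample lines, only to exercise the function.
sample_log = [
    "Jan 08 10:00:01 node1 corosync[1234]:   [TOTEM ] Retransmit List: 1a 1b",
    "Jan 08 10:00:02 node1 corosync[1234]:   [QUORUM] Members[3]: 1 2 3",
    "Jan 08 10:00:03 node1 corosync[1234]:   [TOTEM ] Retransmit List: 1c",
]

print(count_retransmits(sample_log))
```

A count that stays at zero over the 1 or 2 days of observation would match the "no retransmits" condition for re-enabling HA.
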
>Good advice. But with the pve-ha-* services down, the "HA VMs" cannot migrate from one node to another, because the migration is handled by HA (or at least that is how I remember it working some time ago). So I've (temporarily) removed all the resources (VMs) from HA, which tells "pve-ha-lrm" to disable the watchdog ("watchdog closed (disabled)"), so no reboot should occur.
>
>> It's quite possible that it's a corosync bug (I remember having this kind of error with PVE 7.x).
>
>I'm leaning towards a similar conclusion, but I still lack an understanding of how corosync/watchdog is handled in Proxmox.
>
>For example, I still don't know what is updating the watchdog-mux service. Is it corosync (but no "watchdog_device" is set in corosync.conf, and according to the manual, "if unset, empty or "off", no watchdog is used") or is it pve-ha-lrm?

As far as I can tell, it's the pve-ha... service.
If you don't use HA in your cluster, the watchdog isn't used.
That's how I understand it.
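
If that understanding holds, a quick way to see whether anything could still arm the watchdog is to look for HA resources that remain configured. A minimal sketch, assuming /etc/pve/ha/resources.cfg lists each resource as a "vm: <id>" or "ct: <id>" header with indented properties below it; ha_resource_ids is a hypothetical helper, not a Proxmox tool:

```python
import re


def ha_resource_ids(resources_cfg: str) -> list[str]:
    """Return the HA resource IDs found in the text of a resources.cfg file."""
    # Assumption: each resource starts at column 0 as "vm: <id>" or "ct: <id>".
    pattern = re.compile(r"^(vm|ct):[ \t]*(\d+)[ \t]*$", re.MULTILINE)
    return [f"{kind}:{vmid}" for kind, vmid in pattern.findall(resources_cfg)]


# Made-up sample content for illustration.
sample_cfg = "vm: 100\n\tstate started\n\nct: 101\n\tstate stopped\n"

print(ha_resource_ids(sample_cfg))
print(ha_resource_ids(""))
```

An empty list would correspond to the situation described in this thread, where all resources were removed from HA and pve-ha-lrm logged "watchdog closed (disabled)".
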

HTH

>
>I think that, after the migration, my best shot is to upgrade the cluster, but I first need to find out whether newer libcephfs client libraries support old Ceph clusters.
>
>> Also, for "big" clusters (20-30 nodes), I'm now using the sctp protocol instead of udp. For me, it's a lot more reliable when you have network saturation on 1 node.
>> 
>> (I had a case of an internet UDP flood attack coming from outside on 1 of my nodes, lagging the whole corosync cluster.)
>> 
>> 
>> corosync.conf
>> 
>> totem {
>>     cluster_name: ....
>>     ....
>>    interface {
>>        knet_transport: sctp
>>        linknumber: 0
>>    }
>>    ....
>> 
>> 
>> (This needs a full restart of corosync everywhere, and HA needs to be disabled beforehand, because udp can't communicate with sctp, so you'll lose quorum during the change.)
>
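
Since udp and sctp links cannot talk to each other, it may help to confirm that every node's corosync.conf names the same transport before restarting. A rough sketch, assuming interface blocks contain no nested braces and treating a missing linknumber as 0; knet_transports is a hypothetical helper and the sample config is made up:

```python
import re


def knet_transports(corosync_conf: str) -> dict[int, str]:
    """Map each totem interface's linknumber to its knet_transport."""
    transports: dict[int, str] = {}
    for body in re.findall(r"interface\s*\{([^}]*)\}", corosync_conf):
        # Collect the "key: value" options inside this interface block.
        options = dict(
            re.findall(r"^[ \t]*(\w+)[ \t]*:[ \t]*(\S+)[ \t]*$", body, re.MULTILINE)
        )
        link = int(options.get("linknumber", 0))  # assumption: default link 0
        # corosync uses udp when knet_transport is not set.
        transports[link] = options.get("knet_transport", "udp")
    return transports


sample_conf = """
totem {
    cluster_name: example
    interface {
        knet_transport: sctp
        linknumber: 0
    }
    interface {
        linknumber: 1
    }
}
"""

print(knet_transports(sample_conf))
```

Comparing the returned mapping across all nodes would show any node still left on udp before the cluster-wide restart.
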
>I've read about it; I think I'll follow your suggestion. In those bigger clusters, have you tinkered with corosync values such as "token" or "token_retransmits_before_loss_const"?
>
>Thank you!
>
>Iztok
>
>
>_______________________________________________
>pve-user mailing list
>pve-user@lists.proxmox.com
>https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

