From: Daniel Herzig <d.herzig@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Friedrich Weber <f.weber@proxmox.com>
Subject: Re: [pve-devel] [PATCH docs] pvecm, network: add section on corosync over bonds
Date: Thu, 24 Jul 2025 10:22:30 +0200
Message-ID: <dda9e4ad-604d-4832-b872-354d95fa41e1@proxmox.com>
In-Reply-To: <20250721152734.230940-1-f.weber@proxmox.com>

Thanks for documenting this!

I'd even go one step further and also discourage the use of bonds in one
of the 'Cluster Network' sections of `pvecm.adoc`, ideally with a link
to the new 'Corosync over Bonds' section and its explanation. That would
make the advice harder to miss for hasty readers (which would be a
pity).

On 7/21/25 17:27, Friedrich Weber wrote:
> Testing has shown that running corosync (only) over a bond can be
> problematic in some failure scenarios and for certain bond modes. The
> documentation only discourages bonds for corosync because corosync can
> switch between available networks itself, but does not mention other
> caveats when using bonds for corosync.
>
> Hence, extend the documentation with recommendations and caveats
> regarding bonds for corosync.
>
> Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
> ---
>
> Notes:
>      Aaron suggested we could expose the bond-lacp-rate in the GUI to
>      make it easier to change the setting on the PVE side. I'd open a
>      feature report for this.
>
>   pve-network.adoc |  4 +++-
>   pvecm.adoc       | 42 +++++++++++++++++++++++++++++++++++++++---
>   2 files changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/pve-network.adoc b/pve-network.adoc
> index 2dec882..b361f97 100644
> --- a/pve-network.adoc
> +++ b/pve-network.adoc
> @@ -495,7 +495,9 @@ use the active-backup mode.
>   
>   For the cluster network (Corosync) we recommend configuring it with multiple
>   networks. Corosync does not need a bond for network redundancy as it can switch
> -between networks by itself, if one becomes unusable.
> +between networks by itself, if one becomes unusable. Some bond modes are known
> +to be problematic for Corosync, see
> +xref:pvecm_corosync_over_bonds[Corosync over Bonds].
>   
>   The following bond configuration can be used as distributed/shared
>   storage network. The benefit would be that you get more speed and the
> diff --git a/pvecm.adoc b/pvecm.adoc
> index 312a26f..1045abb 100644
> --- a/pvecm.adoc
> +++ b/pvecm.adoc
> @@ -90,15 +90,51 @@ another link on a different physical network. This enables Corosync to keep the
>   cluster communication alive should the dedicated network be down.
>   +
>   NOTE: A single link backed by a bond is not enough to provide Corosync
> -redundancy. When a bonded interface fails and Corosync cannot fall back to
> -another link, it can lead to  asymmetric communication in the cluster, which in
> -turn can lead to the cluster losing quorum.
> +redundancy. See xref:pvecm_corosync_over_bonds[Corosync over Bonds].
>   
>   * The root password of a cluster node is required for adding nodes.
>   
>   * Online migration of virtual machines is only supported when nodes have CPUs
>     from the same vendor. It might work otherwise, but this is never guaranteed.
>   
> +[[pvecm_corosync_over_bonds]]
> +Corosync over Bonds
> +~~~~~~~~~~~~~~~~~~~
> +
> +Using a xref:sysadmin_network_bond[bond] as the only Corosync link can be
> +problematic in certain failure scenarios. If one of the bonded interfaces fails
> +and stops transmitting packets, but its link state stays up, some bond modes
> +may cause a state of asymmetric connectivity where cluster nodes can only
> +communicate with different subsets of other nodes. In case of asymmetric
> +connectivity, Corosync may not be able to form a stable quorum in the cluster.
> +If this state persists and HA is enabled, nodes may fence themselves, even if
> +their respective bond is still fully functioning. In the worst case, the whole
> +cluster may fence itself.
> +
> +For this reason, our recommendations are as follows.
> +
> +* We recommend a dedicated physical NIC for the primary Corosync link. Bonds
> +  can be used as additional links for increased redundancy.
> +
> +* We *advise against* using bond modes *balance-rr*, *balance-xor*,
> +  *balance-tlb*, or *balance-alb* for Corosync traffic. As explained above,
> +  they can cause asymmetric connectivity in certain failure scenarios.
> +
> +* *IEEE 802.3ad (LACP)*: This bond mode can cause asymmetric connectivity in
> +  certain failure scenarios as explained above, but it can recover from this
> +  state, as each side can stop using a bonded interface if it has not received
> +  three LACPDUs in a row. However, with default settings, LACPDUs are only sent
> +  every 30 seconds, yielding a failover time of 90 seconds. This is too long,
> +  as nodes with HA resources will fence themselves already after roughly one
> +  minute without a stable quorum. If LACP bonds are used for corosync traffic,
> +  we recommend setting `bond-lacp-rate fast` *on the Proxmox VE node and the
> +  switch*! Setting this option on one side requests the other side to send an
> +  LACPDU every second, which reduces the failover time in the scenario above to
> +  3 seconds.
> +
> +* Bond mode *active-backup* will not cause asymmetric connectivity in the
> +  failure scenario described above, but the affected node may lose connection
> +  to the cluster and, if HA is enabled, fence itself.
>   
>   Preparing Nodes
>   ---------------
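
For readers of the archive, the LACP recommendation in the patch could
look roughly like the following `/etc/network/interfaces` fragment on
the Proxmox VE side. This is only a sketch: the interface names
(`eno1`, `eno2`), the address, and the miimon and hash-policy values are
placeholders and not taken from the patch. Only `bond-lacp-rate fast`
comes from the patch text itself.

```
# Hypothetical LACP bond used as an additional Corosync link.
# Interface names and the address are placeholders.
auto bond0
iface bond0 inet static
        address 10.10.20.11/24
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        bond-lacp-rate fast
```

As the patch notes, the fast rate should also be set on the switch side,
using whatever syntax the switch vendor provides. With one LACPDU per
second and a three-LACPDU timeout, the failover time drops from 90
seconds (3 x 30 s) to about 3 seconds (3 x 1 s).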


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

