public inbox for pve-devel@lists.proxmox.com
From: Friedrich Weber <f.weber@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH docs v2] pvecm: elaborate when and how to change the token coefficient
Date: Tue, 19 May 2026 19:38:58 +0200
Message-ID: <20260519173900.383644-1-f.weber@proxmox.com>

Since pve-cluster 9.1.0, more specifically commit a7b1c76 ("corosync
config: allow to override token coefficient and lower default"), new
corosync clusters are created with a token_coefficient of 125ms (the
default being 650ms), primarily to avoid issues with larger clusters
in combination with HA.

Already existing clusters may need manual adjustment of the
token_coefficient. Hence, expand the "Changing the Token Coefficient"
section and provide instructions on when and how to change the token
coefficient.

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---

Notes:
    v1 depended on Michael's patch series [1]; this v2 no longer does, so
    that it can be applied even if Michael's patch series is not
    applied.
    
    Changes since v1:
    
    - rebase on current master, squash in the explicit anchor added by
      Michael's patch [2]
    
    - drop sentence about pvecm showing warnings, as these would only
      be added by Michael's patch series
    
    - mention Proxmox VE 9.2 as the first version with the lower default
      token_coefficient. Although the change was rolled out before Proxmox
      VE 9.2 in pve-cluster 9.1.0, mentioning an explicit minor release
      seems more informative in the long run than mentioning the explicit
      pve-cluster package version.
    
    - reorder the suggested optimization / recommendation / strong
      recommendation
    
    - minor rewordings
    
    [1] https://lore.proxmox.com/pve-devel/20260427170548.307698-1-m.koeppl@proxmox.com/
    [2] https://lore.proxmox.com/pve-devel/20260427170548.307698-3-m.koeppl@proxmox.com/
    
    v1: https://lore.proxmox.com/all/20260505144946.234522-1-f.weber@proxmox.com/

 pvecm.adoc | 92 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 80 insertions(+), 12 deletions(-)

diff --git a/pvecm.adoc b/pvecm.adoc
index 250a154..f4dbd85 100644
--- a/pvecm.adoc
+++ b/pvecm.adoc
@@ -1393,22 +1393,90 @@ systemctl restart corosync
 
 On errors, check the troubleshooting section below.
 
+[[pvecm_changing_token_coefficient]]
 Changing the Token Coefficient
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The token coefficient can be configured in the `totem` section in
-`/etc/pve/corosync.conf`. corosync uses the token coefficient to calculate
-several timeouts in relation to the cluster size.footnote:[
-`token_coefficient` in the corosync manual page
-https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#token_coefficient]
-
-If the token coefficient is not explicitly set, it defaults to 650 milliseconds.
-New clusters are created with a lower token coefficient of 125 milliseconds that
-is explicitly set in `/etc/pve/corosync.conf`.
-
-You can change the token coefficient of an existing cluster by
-xref:pvecm_edit_corosync_conf[editing corosync.conf]. Corosync will then
-automatically adopt the new value for the cluster.
+`/etc/pve/corosync.conf`. If the token coefficient is not explicitly set, it
+defaults to 650 milliseconds. Since Proxmox VE 9.2, new clusters are created
+with a lower token coefficient of 125 milliseconds explicitly set in
+`/etc/pve/corosync.conf`. This reduces the time needed for reestablishing a
+new cluster membership after a node failure.
+
+Corosync uses the token coefficient to calculate
+several timeouts in relation to the cluster size
+footnote:[ `token_coefficient` in the corosync manual
+page
+https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#token_coefficient],
+most importantly the token timeout and consensus timeout.
+Corosync implements a token-passing protocol. The token timeout specifies how
+long a node waits for the token until it declares the token to be lost
+footnote:[ `token` in the corosync manual page
+https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#token]. The
+consensus timeout footnote:[ `consensus` in the corosync manual page
+https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#consensus]
+specifies the time nodes wait for a consensus on a new cluster membership. The
+sum of token and consensus timeouts defines the minimum time needed to
+reestablish a new cluster membership after a node goes offline.
+
+Keeping the sum of token and consensus timeouts below 30 seconds reduces the
+time needed for reestablishing a new cluster membership after a node failure.
+When HA is enabled, it is especially important that this time stays below 45
+seconds to ensure that a new cluster membership is formed before the
+xref:ha_manager_crm[watchdog timeout] of 60 seconds expires, which would
+trigger a node fence. The recommended mechanism for lowering the token and
+consensus timeouts is lowering the token coefficient as explained below.
+
+You can check the current token and consensus timeouts (in milliseconds) with
+the following command:
+
+[source,bash]
+----
+corosync-cmapctl | grep -Ew 'runtime.config.totem.token|runtime.config.totem.consensus'
+----
+For example:
+[source,bash]
+----
+runtime.config.totem.consensus (u32) = 5940
+runtime.config.totem.token (u32) = 4950
+----
+
+The sum of these two values (10.89 seconds in the example) defines the minimum
+time needed to reestablish a new cluster membership after a node goes offline.
+Lowering the token coefficient is
+
+* a suggested optimization if this value exceeds 30 seconds,
+* recommended if it exceeds 40 seconds,
+* strongly recommended if it exceeds 45 seconds.
+
+To lower the token coefficient, first make sure your setup adheres
+xref:pvecm_cluster_network_requirements[to the network requirements],
+especially with regard to latency. Then:
+
+* If the `token_coefficient` is not yet set explicitly to 125 milliseconds in
+  corosync.conf, xref:pvecm_edit_corosync_conf[edit corosync.conf] and add
+  `token_coefficient: 125` to the `totem` section. Do not forget to
+  xref:pvecm_edit_corosync_conf[increase the `config_version`].
+* If the `token_coefficient` is already set explicitly to 125 milliseconds,
+  select a `token_coefficient` with which the token and consensus timeouts sum
+  up to at most 45 seconds. By default, corosync computes the token and
+  consensus timeouts (in milliseconds) according to the following formula:
++
+----
+token = 3000 + (number_of_nodes - 2) * token_coefficient
+consensus = 1.2 * token
+----
++
+xref:pvecm_edit_corosync_conf[Edit corosync.conf] and set an appropriate
+`token_coefficient` option in the `totem` section. Do not forget to
+xref:pvecm_edit_corosync_conf[increase the `config_version`]. Test your setup
+thoroughly for stability!
+
+After adjusting the `token_coefficient` in `corosync.conf`, recent corosync
+versions will automatically adopt the new value for the cluster. For corosync
+versions below `3.1.10-pve1`, corosync needs to be restarted on all nodes for
+the change to take effect.
 
 Troubleshooting
 ~~~~~~~~~~~~~~~
-- 
2.47.3
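
As a reading aid (not part of the patch), the default formula quoted in the
new documentation text can be evaluated with plain shell arithmetic. This is
only a sketch: the function names `token_ms`, `consensus_ms`, `membership_ms`
and `max_coefficient` are made up for illustration. It assumes the defaults
stated in the patch (base token of 3000 ms, consensus = 1.2 * token), no
explicit `token` or `consensus` override in corosync.conf, and that results
are rounded down to whole milliseconds.

```shell
# Hypothetical helpers, not part of corosync or pvecm.
# Assumed formula (from the patch text):
#   token     = 3000 + (number_of_nodes - 2) * token_coefficient
#   consensus = 1.2 * token

# token_ms NODES COEFFICIENT -> token timeout in milliseconds
token_ms() {
    local nodes=$1 coeff=$2
    # Assumption: the coefficient only applies with at least 3 nodes.
    local extra=$(( nodes > 2 ? nodes - 2 : 0 ))
    echo $(( 3000 + extra * coeff ))
}

# consensus_ms NODES COEFFICIENT -> consensus timeout in milliseconds
consensus_ms() {
    local token
    token=$(token_ms "$1" "$2")
    # 1.2 * token in integer arithmetic (rounds down)
    echo $(( token * 12 / 10 ))
}

# membership_ms NODES COEFFICIENT -> token + consensus in milliseconds
membership_ms() {
    echo $(( $(token_ms "$1" "$2") + $(consensus_ms "$1" "$2") ))
}

# max_coefficient NODES BUDGET_MS -> largest coefficient for which
# token + consensus stays within BUDGET_MS
max_coefficient() {
    local nodes=$1 budget=$2
    if [ "$nodes" -le 2 ]; then
        echo "coefficient has no effect with fewer than 3 nodes" >&2
        return 1
    fi
    # token + 1.2 * token <= budget  =>  token <= budget / 2.2
    local token_max=$(( budget * 10 / 22 ))
    echo $(( (token_max - 3000) / (nodes - 2) ))
}
```

For instance, `token_ms 5 650` and `consensus_ms 5 650` give 4950 and 5940,
matching the example output in the patch (which would correspond to a
five-node cluster with the 650 ms default). `max_coefficient 30 45000`
gives the upper bound for a 30-node cluster and a 45-second budget; any
value chosen this way still needs to be tested for stability, as the patch
notes.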

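Similarly, the 30/40/45-second thresholds from the patch can be applied to
the `corosync-cmapctl` output mechanically. The sketch below is an
illustration only: `classify_timeouts` is a hypothetical helper name, and the
wording of the printed advice is shortened from the patch text. It assumes
the output format shown in the patch's example
(`runtime.config.totem.token (u32) = 4950`).

```shell
# Hypothetical helper, not part of corosync or pvecm.
# Reads `corosync-cmapctl` output on stdin, sums the runtime token and
# consensus timeouts, and prints the matching advice from the patch.
classify_timeouts() {
    awk '
        # Match the keys exactly so that e.g. token_retransmit is ignored.
        $1 == "runtime.config.totem.token"     { token = $NF }
        $1 == "runtime.config.totem.consensus" { consensus = $NF }
        END {
            if (token == "" || consensus == "") {
                print "token or consensus timeout not found" > "/dev/stderr"
                exit 1
            }
            sum = token + consensus
            if (sum > 45000)      advice = "strongly recommended"
            else if (sum > 40000) advice = "recommended"
            else if (sum > 30000) advice = "suggested optimization"
            else                  advice = "below all thresholds"
            printf "%d ms: %s\n", sum, advice
        }'
}
```

On a cluster node this would be used as `corosync-cmapctl | classify_timeouts`.
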
Thread overview: 2+ messages
2026-05-19 17:38 Friedrich Weber [this message]
2026-05-19 18:10 ` applied: [PATCH docs v2] pvecm: elaborate when and how to change the token coefficient Thomas Lamprecht