* [PATCH docs] pvecm: elaborate when and how to change the token coefficient
From: Friedrich Weber @ 2026-05-05 14:49 UTC
  To: pve-devel

Since pve-cluster 9.1.0, more specifically commit a7b1c76 ("corosync
config: allow to override token coefficient and lower default"), new
corosync clusters are created with a token_coefficient of 125ms (the
default being 650ms), primarily to avoid issues with larger clusters
in combination with HA.

Existing clusters may need manual adjustment of the token_coefficient.
Hence, expand the "Changing the Token Coefficient" section and provide
instructions on when and how to change the token coefficient.

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---

Notes:
    Based on the pve-docs patches from Michael's patch series [1]. It also
    refers to the warnings added by that series.
    
    Thanks @Michael for feedback on an initial draft of this!
    
    [1] https://lore.proxmox.com/all/20260427170548.307698-1-m.koeppl@proxmox.com/
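
As a sanity check of the numbers used in the new section, here is a small
Python sketch (not part of the patch) that evaluates the formula quoted in the
diff and applies the 30/40/45 second thresholds. The function names are
hypothetical. It assumes that `token` and `consensus` are not set explicitly
in corosync.conf, that the coefficient has no effect below three nodes, and
that the consensus timeout is truncated to whole milliseconds; the last two
points are assumptions, not something this patch states.

```python
TOKEN_BASE_MS = 3000  # base token timeout from the formula in the patch


def timeouts_ms(nodes, token_coefficient):
    """Return (token, consensus) in milliseconds per the patch's formula."""
    # Assumption: with fewer than 3 nodes the coefficient does not contribute.
    token = TOKEN_BASE_MS + max(nodes - 2, 0) * token_coefficient
    # consensus = 1.2 * token, in integer arithmetic.
    # Assumption: the result is truncated to whole milliseconds.
    consensus = token * 12 // 10
    return token, consensus


def recommendation(total_ms):
    """Map the token + consensus sum to the wording used in the docs."""
    if total_ms > 45000:
        return "strongly recommended"
    if total_ms > 40000:
        return "recommended"
    if total_ms > 30000:
        return "suggested optimization"
    return "none needed"


def max_coefficient(nodes, limit_ms=45000):
    """Largest integer coefficient keeping token + consensus <= limit_ms.

    Returns None when the coefficient has no effect (fewer than 3 nodes).
    """
    if nodes < 3:
        return None
    # token + 1.2 * token <= limit  =>  token <= limit / 2.2
    max_token = limit_ms * 10 // 22
    return max((max_token - TOKEN_BASE_MS) // (nodes - 2), 0)


if __name__ == "__main__":
    for nodes, coeff in [(5, 650), (40, 650), (40, 125)]:
        token, consensus = timeouts_ms(nodes, coeff)
        total = token + consensus
        print(nodes, coeff, token, consensus, total, recommendation(total))
    print("max coefficient for 40 nodes:", max_coefficient(40))
```

For 5 nodes and the default coefficient of 650 this reproduces the example
values in the diff (token 4950, consensus 5940, sum 10890 ms).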

 pvecm.adoc | 95 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 79 insertions(+), 16 deletions(-)

diff --git a/pvecm.adoc b/pvecm.adoc
index 3d65265..f3625c5 100644
--- a/pvecm.adoc
+++ b/pvecm.adoc
@@ -1386,23 +1386,86 @@ Changing the Token Coefficient
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 The token coefficient can be configured in the `totem` section in
+`/etc/pve/corosync.conf`. If the token coefficient is not explicitly set, it
+defaults to 650 milliseconds. New clusters are created with a lower token
+coefficient of 125 milliseconds that is explicitly set in
 `/etc/pve/corosync.conf`. corosync uses the token coefficient to calculate
-several timeouts in relation to the cluster size.footnote:[
-`token_coefficient` in the corosync manual page
-https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#token_coefficient]
-
-If the token coefficient is not explicitly set, it defaults to 650 milliseconds.
-New clusters are created with a lower token coefficient of 125 milliseconds that
-is explicitly set in `/etc/pve/corosync.conf`.
-
-You can change the token coefficient of an existing cluster by
-xref:pvecm_edit_corosync_conf[editing corosync.conf]. Corosync will then
-automatically adopt the new value for the cluster.
-
-Cluster commands may display a warning if the sum of the Corosync token and
-consensus timeouts is considered too high (e.g., "Changing the token coefficient
-is recommended"). To resolve this warning, it is recommended to lower the token
-coefficient.
+several timeouts in relation to the cluster size footnote:[ `token_coefficient`
+in the corosync manual page
+https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#token_coefficient],
+most importantly the token timeout and consensus timeout.
+
+Corosync implements a token-passing protocol. The token timeout specifies how
+long a node waits for the token until it declares the token to be lost
+footnote:[ `token` in the corosync manual page
+https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#token]. The
+consensus timeout footnote:[ `consensus` in the corosync manual page
+https://manpages.debian.org/stable/corosync/corosync.conf.5.en.html#consensus]
+specifies the time nodes wait for a consensus on a new cluster membership. The
+sum of token and consensus timeouts defines the minimum time needed to
+reestablish a new cluster membership after a node goes offline.
+
+Keeping the sum of token and consensus timeouts below 30 seconds reduces the
+time needed for reestablishing a new cluster membership after a node failure.
+When HA is enabled, it is especially important that this time stays below 45
+seconds to ensure that a new cluster membership is formed before the
+xref:ha_manager_crm[watchdog timeout] of 60 seconds expires, which would
+trigger a node fence. The recommended mechanism for lowering the token and
+consensus timeouts is lowering the token coefficient as explained below.
+
+You can check the current token and consensus timeouts (in milliseconds) with
+the following command:
+
+[source,bash]
+----
+corosync-cmapctl | grep -Ew 'runtime.config.totem.token|runtime.config.totem.consensus'
+----
+For example:
+[source,bash]
+----
+runtime.config.totem.consensus (u32) = 5940
+runtime.config.totem.token (u32) = 4950
+----
+
+The sum of these two values (10.89 seconds in the example) defines the minimum
+time needed to reestablish a new cluster membership after a node goes offline.
+Lowering the token coefficient is
+
+* strongly recommended if this value exceeds 45 seconds,
+* recommended if it exceeds 40 seconds,
+* a suggested optimization if it exceeds 30 seconds.
+
+Cluster commands like `pvecm status` will display corresponding warnings based
+on the sum of the token and consensus timeouts. When joining a new node into
+the cluster, the GUI will display a warning if adding the node would increase
+the timeouts above any of the recommended thresholds.
+
+To lower the token coefficient, first make sure your setup adheres
+xref:pvecm_cluster_network_requirements[to the network requirements]. Then:
+
+* If the `token_coefficient` is not yet set explicitly to 125 milliseconds in
+  corosync.conf, xref:pvecm_edit_corosync_conf[edit corosync.conf] and add
+  `token_coefficient: 125` to the `totem` section. Do not forget to
+  xref:pvecm_edit_corosync_conf[increase the `config_version`].
+* If the `token_coefficient` is already set explicitly to 125 milliseconds,
+  select a `token_coefficient` with which the token and consensus timeouts sum
+  up to at most 45 seconds. By default, corosync computes the token and
+  consensus timeouts (in milliseconds) according to the following formula:
++
+----
+token = 3000 + (number_of_nodes - 2) * token_coefficient
+consensus = 1.2 * token
+----
++
+xref:pvecm_edit_corosync_conf[Edit corosync.conf] and add a corresponding
+`token_coefficient` option to the `totem` section. Do not forget to
+xref:pvecm_edit_corosync_conf[increase the `config_version`]. Test your setup
+thoroughly for stability!
+
+After adjusting the `token_coefficient` in `corosync.conf`, recent corosync
+versions will automatically adopt the new value for the cluster. For corosync
+versions below `3.1.10-pve1`, corosync needs to be restarted on all nodes for
+the change to take effect.
 
 Troubleshooting
 ~~~~~~~~~~~~~~~
-- 
2.47.3