From: Maximiliano Sandoval
To: Friedrich Weber
CC: pve-devel@lists.proxmox.com
Subject: Re: [PATCH pve-cluster 2/2] api: cluster config: create new clusters with lower token coefficient
Date: Mon, 16 Feb 2026 17:00:29 +0100
In-Reply-To: <20260212115928.148999-3-f.weber@proxmox.com> (Friedrich Weber's message of "Thu, 12 Feb 2026 12:57:56 +0100")
References: <20260212115928.148999-1-f.weber@proxmox.com> <20260212115928.148999-3-f.weber@proxmox.com>
List-Id: Proxmox VE development discussion

Friedrich Weber writes:

A comment below

> corosync makes use of several timeouts, in particular the token and
> consensus timeouts. The sum of these two timeouts yields the minimum
> time a cluster needs to reestablish a membership after a token loss
> due to a complete node failure.
>
> By default, corosync sets the timeouts based on the cluster size [1]:
>
>     token timeout = token + (#nodes - 2) * token_coefficient
>     consensus timeout = 1.2 * token timeout
>
> token defaults to 3000ms, token_coefficient defaults to 650ms.
>
> With more than ~30 nodes in the default settings, the sum of token and
> consensus timeouts gets close to or exceeds 50-60s. As a result, after
> a token loss due to a complete node failure in an HA cluster, the
> watchdog may fence nodes because it takes too long to reestablish a
> new membership and quorum.
>
> One way to avoid this is to lower the sum of the token and consensus
> timeouts. The consensus timeout is intentionally slightly larger than
> the token timeout [2], so the definition of the consensus timeout in
> terms of the token timeout should be preserved. Since it does make
> sense to define both timeouts in terms of the cluster size, the most
> viable option to lower the timeouts appears to be to adjust the
> token_coefficient.
> Experiments suggest that the default 650ms is
> overly conservative considering the low-latency network requirements
> postulated in the admin guide [3].
>
> Hence, create new clusters with a default token coefficient of 125ms.
> This keeps the sum of token and consensus timeouts well below 50s for
> realistic cluster sizes. Users who prefer a larger token coefficient
> can manually override the token coefficient when creating a cluster
> via pvecm create. The token coefficient can also be changed for an
> existing cluster, this will be documented separately.
>
> Note that knet_ping_interval and knet_ping_timeout are derived from
> the token timeout, hence, a lower token coefficient will result in
> more frequent kronosnet pings and shorter ping timeouts.
>
> With this change, newly created clusters will always set an explicit
> token_coefficient in their corosync.conf.
>
> [1] https://manpages.debian.org/trixie/corosync/corosync.conf.5.en.html#token_coefficient
> [2] https://github.com/corosync/corosync/commit/b3e19b29058eafc3e808ded7f4c2440c3f957392
> [3] https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_cluster_network_requirements
>
> Signed-off-by: Friedrich Weber
> ---
>  src/PVE/API2/ClusterConfig.pm | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/PVE/API2/ClusterConfig.pm b/src/PVE/API2/ClusterConfig.pm
> index 1bc7bcf..8df257a 100644
> --- a/src/PVE/API2/ClusterConfig.pm
> +++ b/src/PVE/API2/ClusterConfig.pm
> @@ -111,12 +111,21 @@ __PACKAGE__->register_method({
>                  minimum => 1,
>                  optional => 1,
>              },
> +            'token-coefficient' => {
> +                type => 'integer',
> +                description => "Token coefficient to set in the corosync configuration.",
> +                default => 125,
> +                minimum => 0,

From man 5 corosync.conf's token_coefficient documentation: "This value
can be set to 0 resulting in effective removal of this feature.". If we
want to expose setting this to 0, I would document that it has a special
meaning and what this entails.
I would personally feel more comfortable setting `minimum => 1` for now
instead.

> +                optional => 1,
> +            },
>          }),
>      },
>      returns => { type => 'string' },
>      code => sub {
>          my ($param) = @_;
>
> +        $param->{'token-coefficient'} //= 125;
> +
>          die "cluster config '$clusterconf' already exists\n" if -f $clusterconf;
>
>          my $rpcenv = PVE::RPCEnvironment::get();

--
Maximiliano
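
For reference, the timeout arithmetic from the quoted commit message can be sketched as below. This is only an illustration of the two formulas as quoted, not corosync's actual code: the function name `timeouts_ms` is made up for this example, and clamping the node term at zero for clusters with fewer than two nodes is an assumption of the sketch. It shows what the special value 0 means in practice: the timeouts no longer scale with the cluster size.

```python
def timeouts_ms(nodes, token=3000, token_coefficient=650):
    """Return (token timeout, consensus timeout, their sum) in milliseconds.

    Uses the formulas quoted in the commit message:
        token timeout     = token + (#nodes - 2) * token_coefficient
        consensus timeout = 1.2 * token timeout
    """
    # Assumption of this sketch: the node term never goes negative.
    token_timeout = token + max(nodes - 2, 0) * token_coefficient
    # 1.2 * token timeout, kept in integer arithmetic to avoid float rounding.
    consensus_timeout = token_timeout * 12 // 10
    return token_timeout, consensus_timeout, token_timeout + consensus_timeout


if __name__ == "__main__":
    # Compare the corosync default, the proposed default, and the special value 0
    # for a 30-node cluster.
    for coefficient in (650, 125, 0):
        print(coefficient, timeouts_ms(30, token_coefficient=coefficient))
```

With 30 nodes this gives a sum of 46640 ms for the default coefficient of 650, 14300 ms for 125, and 6600 ms for 0. With a coefficient of 0 the sum stays at 6600 ms for any cluster size.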