From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maximiliano Sandoval <m.sandoval@proxmox.com>
To: Friedrich Weber
CC: pve-devel@lists.proxmox.com
Subject: Re: [PATCH pve-cluster 2/2] api: cluster config: create new clusters with lower token coefficient
In-Reply-To: <20260212115928.148999-3-f.weber@proxmox.com> (Friedrich Weber's message of "Thu, 12 Feb 2026 12:57:56 +0100")
References: <20260212115928.148999-1-f.weber@proxmox.com> <20260212115928.148999-3-f.weber@proxmox.com>
Date: Mon, 16 Feb 2026 17:09:20 +0100
List-Id: Proxmox VE development discussion
MIME-Version: 1.0
Content-Type: text/plain

Friedrich Weber writes:

> corosync makes use of several timeouts, in particular the token and
> consensus timeouts. The sum of these two timeouts yields the minimum
> time a cluster needs to reestablish a membership after a token loss
> due to a complete node failure.
>
> By default, corosync sets the timeouts based on the cluster size [1]:
>
>   token timeout = token + (#nodes - 2) * token_coefficient
>   consensus timeout = 1.2 * token timeout
>
> token defaults to 3000ms, token_coefficient defaults to 650ms.
>
> With more than ~30 nodes in the default settings, the sum of token and
> consensus timeouts gets close to or exceeds 50-60s. As a result, after
> a token loss due to a complete node failure in an HA cluster, the
> watchdog may fence nodes because it takes too long to reestablish a
> new membership and quorum.
>
> One way to avoid this is to lower the sum of the token and consensus
> timeouts. The consensus timeout is intentionally slightly larger than
> the token timeout [2], so the definition of the consensus timeout in
> terms of the token timeout should be preserved. Since it does make
> sense to define both timeouts in terms of the cluster size, the most
> viable option to lower the timeouts appears to be to adjust the
> token_coefficient.
> Experiments suggest that the default 650ms is
> overly conservative considering the low-latency network requirements
> postulated in the admin guide [3].
>
> Hence, create new clusters with a default token coefficient of 125ms.
> This keeps the sum of token and consensus timeouts well below 50s for
> realistic cluster sizes. Users who prefer a larger token coefficient
> can manually override the token coefficient when creating a cluster
> via pvecm create. The token coefficient can also be changed for an
> existing cluster, this will be documented separately.
>
> Note that knet_ping_interval and knet_ping_timeout are derived from
> the token timeout, hence, a lower token coefficient will result in
> more frequent kronosnet pings and shorter ping timeouts.
>
> With this change, newly created clusters will always set an explicit
> token_coefficient in their corosync.conf.
>
> [1] https://manpages.debian.org/trixie/corosync/corosync.conf.5.en.html#token_coefficient
> [2] https://github.com/corosync/corosync/commit/b3e19b29058eafc3e808ded7f4c2440c3f957392
> [3] https://pve.proxmox.com/pve-docs/chapter-pvecm.html#pvecm_cluster_network_requirements
>
> Signed-off-by: Friedrich Weber
> ---
>  src/PVE/API2/ClusterConfig.pm | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/PVE/API2/ClusterConfig.pm b/src/PVE/API2/ClusterConfig.pm
> index 1bc7bcf..8df257a 100644
> --- a/src/PVE/API2/ClusterConfig.pm
> +++ b/src/PVE/API2/ClusterConfig.pm
> @@ -111,12 +111,21 @@ __PACKAGE__->register_method({
>                  minimum => 1,
>                  optional => 1,
>              },
> +            'token-coefficient' => {
> +                type => 'integer',
> +                description => "Token coefficient to set in the corosync configuration.",

This description does not explain what the option does any better than
its name already does. It would perhaps be preferable to say something
along the lines of:

  "Coefficient used to determine Corosync's token timeout. See the
  corosync.conf(5) manual for more details."

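For anyone following along, the effect of the coefficient on the
timeouts from the commit message can be sketched in a few lines of
Python. This is only an illustration of the two quoted formulas, not
code from corosync or pve-cluster: the function names are made up, and
clamping the node term at zero for clusters with fewer than three nodes
is my assumption based on my reading of corosync.conf(5).

```python
# Illustrative sketch of the formulas quoted in the commit message.
# Function names are hypothetical; nothing here is corosync's actual code.

TOKEN_MS = 3000  # default `token` value stated in the commit message


def token_timeout_ms(nodes, coefficient_ms, token_ms=TOKEN_MS):
    """token timeout = token + (#nodes - 2) * token_coefficient."""
    # Assumption: the node term is clamped at 0 for clusters with < 3 nodes.
    return token_ms + max(nodes - 2, 0) * coefficient_ms


def consensus_timeout_ms(token_timeout):
    """consensus timeout = 1.2 * token timeout, in integer milliseconds."""
    return token_timeout * 12 // 10


def membership_floor_ms(nodes, coefficient_ms):
    """Sum of token and consensus timeouts: the minimum time to
    reestablish a membership after a complete node failure."""
    t = token_timeout_ms(nodes, coefficient_ms)
    return t + consensus_timeout_ms(t)


if __name__ == "__main__":
    # Compare the current default (650ms) with the proposed one (125ms).
    for nodes in (3, 16, 30, 32):
        print(nodes, membership_floor_ms(nodes, 650), membership_floor_ms(nodes, 125))
```

Under these formulas, a 30-node cluster ends up at 46640ms with the
650ms coefficient and at 14300ms with 125ms, which matches the "close
to 50s" observation in the commit message.
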
> +                default => 125,
> +                minimum => 0,
> +                optional => 1,
> +            },
>          }),
>      },
>      returns => { type => 'string' },
>      code => sub {
>          my ($param) = @_;
>
> +        $param->{'token-coefficient'} //= 125;
> +
>          die "cluster config '$clusterconf' already exists\n" if -f $clusterconf;
>
>          my $rpcenv = PVE::RPCEnvironment::get();

--
Maximiliano