From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: "Michael Köppl" <m.koeppl@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
Date: Mon, 18 May 2026 16:11:22 +0200
Message-ID: <1779112787.r3e8af5igi.astroid@yuna.none>
In-Reply-To: <20260427170548.307698-5-m.koeppl@proxmox.com>

On April 27, 2026 7:05 pm, Michael Köppl wrote:
> High token timeouts can lead to stability problems in clusters. To
> inform users about the timeout in their current setup (or expected
> timeouts when adding nodes) and give recommendations regarding the token
> coefficient setting, introduce function to calculate the timeout as well
> as determine the warning / recommendation levels.
>
> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
> ---
> src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
> index aef0d31..45a1f71 100644
> --- a/src/PVE/Corosync.pm
> +++ b/src/PVE/Corosync.pm
> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>      return $match_ip_and_version->($resolved_ip);
>  }
>
> +sub calculate_membership_recovery_timeout {
> +    my ($totemcfg, $node_count) = @_;
> +
> +    my $token_timeout = $totemcfg->{token} // 3000;
> +    my $token_coefficient = $totemcfg->{token_coefficient} // 650;
> +
> +    my $expected_token_timeout = $token_timeout;
> +    if ($node_count > 2) {
> +        $expected_token_timeout += ($node_count - 2) * $token_coefficient;
> +    }
> +
> +    my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
> +    return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;

We could also ask corosync (via corosync-cmapctl) for most of these
values, to avoid duplicating the calculations and defaults. The only
thing missing is the coefficient, though we could probably expose that
on the corosync side as well.

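To illustrate the idea, here is a rough sketch of reading the effective
values from corosync-cmapctl output instead of recomputing them. It
assumes the runtime keys are named runtime.config.totem.token and
runtime.config.totem.consensus and that each line has the form
"key (type) = value"; both should be verified on a running node. The
sub names are hypothetical, and the sample text stands in for the
actual command output.

```perl
use strict;
use warnings;

# Parse corosync-cmapctl style lines ("key (type) = value") into a hash.
# Lines that do not match this shape are skipped.
sub parse_cmapctl_output {
    my ($output) = @_;

    my $values = {};
    for my $line (split(/\n/, $output)) {
        if ($line =~ m/^(\S+)\s+\(\w+\)\s+=\s+(.*)$/) {
            $values->{$1} = $2;
        }
    }
    return $values;
}

# Sum of the effective token and consensus timeouts in seconds, or undef
# if either key is missing (for example because corosync is not running).
# The key names are an assumption, see the note above.
sub runtime_membership_recovery_timeout {
    my ($values) = @_;

    my $token = $values->{'runtime.config.totem.token'};
    my $consensus = $values->{'runtime.config.totem.consensus'};
    return undef if !defined($token) || !defined($consensus);

    return ($token + $consensus) / 1000.0;
}

# Sample input matching the patch's defaults for a 3-node cluster:
# token = 3000 + 650 ms, consensus = 1.2 * token.
my $sample = <<'EOF';
runtime.config.totem.token (u32) = 3650
runtime.config.totem.consensus (u32) = 4380
EOF

my $timeout = runtime_membership_recovery_timeout(parse_cmapctl_output($sample));
print "membership recovery timeout: ${timeout}s\n";
```

This only covers the running cluster. The "expected timeout when adding
nodes" case would still need the coefficient, which is the part that
would have to be exposed by corosync first.
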
> +}
> +
> +sub get_membership_recovery_timeout_warning_level {
> +    my ($total_timeout_secs) = @_;
> +
> +    if ($total_timeout_secs > 45) {
> +        return 'change-strongly-recommended';
> +    } elsif ($total_timeout_secs > 40) {
> +        return 'change-recommended';
> +    } elsif ($total_timeout_secs > 30) {
> +        return 'optimize';
> +    }
> +
> +    return undef;
> +}
> +
> +sub get_membership_recovery_timeout_warning {
> +    my ($total_timeout_secs) = @_;
> +
> +    my $level = get_membership_recovery_timeout_warning_level($total_timeout_secs);
> +    return undef if !defined($level);
> +
> +    my $level_msg;
> +    if ($level eq 'change-strongly-recommended') {
> +        $level_msg = "Lowering the token coefficient is strongly recommended";
> +    } elsif ($level eq 'change-recommended') {
> +        $level_msg = "Lowering the token coefficient is recommended";
> +    } elsif ($level eq 'optimize') {
> +        $level_msg = "The token coefficient can be optimized";
> +    }
> +
> +    return
> +        "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
> +        . "$level_msg. "
> +        . "See 'man pvecm' for details.";

This pretty much duplicates the frontend code. If we leave out the last
line, we could just return the warning message and call the field in the
API return value "totem_warning(s)", "health_warnings" or just
"warnings", and potentially add more information in the future. We could
still keep the level and return

    warnings = [
        {
            level => ...,
            msg => ...,
        },
    ]

but I don't currently see how we'd benefit from returning raw values and
constructing the warning message on both ends.

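As a rough sketch of what I mean (the sub name get_totem_warnings and
the exact return shape are only suggestions, not existing API), the
backend could build a list of warnings once and both pvecm and the UI
would simply display msg. The thresholds and wording are taken from the
patch above, minus the "See 'man pvecm'" line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an array ref of warnings, each carrying a
# machine-readable level and a ready-to-display message. An empty list
# means there is nothing to warn about.
sub get_totem_warnings {
    my ($total_timeout_secs) = @_;

    # Same thresholds and texts as in the quoted patch.
    my ($level, $hint);
    if ($total_timeout_secs > 45) {
        $level = 'change-strongly-recommended';
        $hint = "Lowering the token coefficient is strongly recommended";
    } elsif ($total_timeout_secs > 40) {
        $level = 'change-recommended';
        $hint = "Lowering the token coefficient is recommended";
    } elsif ($total_timeout_secs > 30) {
        $level = 'optimize';
        $hint = "The token coefficient can be optimized";
    } else {
        return [];
    }

    return [
        {
            level => $level,
            msg => "Sum of Corosync token and consensus timeout is "
                . "${total_timeout_secs}s. $hint.",
        },
    ];
}

# Usage: callers only iterate and print, no message construction needed.
for my $warning (@{ get_totem_warnings(42) }) {
    print "[$warning->{level}] $warning->{msg}\n";
}
```

Callers that want to append a pointer to the documentation (man page on
the CLI, help link in the UI) can still do so on their side.
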
> +}
> +
> 1;
> --
> 2.47.3
>
>
>
>
>
>