From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: "Michael Köppl" <m.koeppl@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
Date: Mon, 18 May 2026 16:11:22 +0200
Message-ID: <1779112787.r3e8af5igi.astroid@yuna.none>
In-Reply-To: <20260427170548.307698-5-m.koeppl@proxmox.com>
On April 27, 2026 7:05 pm, Michael Köppl wrote:
> High token timeouts can lead to stability problems in clusters. To
> inform users about the timeout in their current setup (or the expected
> timeout when adding nodes) and give recommendations regarding the token
> coefficient setting, introduce functions to calculate the timeout and
> to determine the warning / recommendation level.
>
> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
> ---
> src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
> index aef0d31..45a1f71 100644
> --- a/src/PVE/Corosync.pm
> +++ b/src/PVE/Corosync.pm
> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>      return $match_ip_and_version->($resolved_ip);
>  }
>
> +sub calculate_membership_recovery_timeout {
> +    my ($totemcfg, $node_count) = @_;
> +
> +    my $token_timeout = $totemcfg->{token} // 3000;
> +    my $token_coefficient = $totemcfg->{token_coefficient} // 650;
> +
> +    my $expected_token_timeout = $token_timeout;
> +    if ($node_count > 2) {
> +        $expected_token_timeout += ($node_count - 2) * $token_coefficient;
> +    }
> +
> +    my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
> +    return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;

we could also ask corosync (via corosync-cmapctl) about most of these,
to avoid duplicating the calculations/defaults. the only thing missing
is the coefficient, though we could probably expose that on the corosync
side as well.
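
a rough sketch of what asking corosync could look like. the helper names
are made up for illustration, and it assumes that the running corosync
exposes the effective values as `runtime.config.totem.token` and
`runtime.config.totem.consensus`, and that `corosync-cmapctl -g <key>`
prints a single `key (type) = value` line. it uses plain `open` only to
stay self-contained. note that this only covers the current membership,
not the "expected timeout when adding nodes" case, which would still
need the coefficient:

```perl
use strict;
use warnings;

# Parse one line of `corosync-cmapctl -g <key>` output, assumed to look like
#   runtime.config.totem.token (u32) = 12100
# Returns the value as a string, or undef if the line does not match.
sub parse_cmap_value {
    my ($line) = @_;
    return undef if !defined($line);
    if ($line =~ m/^\S+\s+\(\w+\)\s+=\s+(.*?)\s*$/) {
        return $1;
    }
    return undef;
}

# Hypothetical helper: ask the running corosync for the effective value of
# a totem setting. Returns undef if the command fails or prints no value.
sub get_runtime_totem_value {
    my ($name) = @_;
    my $key = "runtime.config.totem.$name"; # assumed key layout
    open(my $fh, '-|', 'corosync-cmapctl', '-g', $key) or return undef;
    my $line = <$fh>;
    close($fh);
    return parse_cmap_value($line);
}

# Sum of the effective token and consensus timeouts in seconds, or undef
# if either value could not be read.
sub get_runtime_membership_recovery_timeout {
    my $token = get_runtime_totem_value('token');
    my $consensus = get_runtime_totem_value('consensus');
    return undef if !defined($token) || !defined($consensus);
    return ($token + $consensus) / 1000.0;
}
```

keeping the parsing separate from the command invocation means the
parsing part can be tested without a running cluster.
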
> +}
> +
> +sub get_membership_recovery_timeout_warning_level {
> +    my ($total_timeout_secs) = @_;
> +
> +    if ($total_timeout_secs > 45) {
> +        return 'change-strongly-recommended';
> +    } elsif ($total_timeout_secs > 40) {
> +        return 'change-recommended';
> +    } elsif ($total_timeout_secs > 30) {
> +        return 'optimize';
> +    }
> +
> +    return undef;
> +}
> +
> +sub get_membership_recovery_timeout_warning {
> +    my ($total_timeout_secs) = @_;
> +
> +    my $level = get_membership_recovery_timeout_warning_level($total_timeout_secs);
> +    return undef if !defined($level);
> +
> +    my $level_msg;
> +    if ($level eq 'change-strongly-recommended') {
> +        $level_msg = "Lowering the token coefficient is strongly recommended";
> +    } elsif ($level eq 'change-recommended') {
> +        $level_msg = "Lowering the token coefficient is recommended";
> +    } elsif ($level eq 'optimize') {
> +        $level_msg = "The token coefficient can be optimized";
> +    }
> +
> +    return
> +        "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
> +        . "$level_msg. "
> +        . "See 'man pvecm' for details.";

this pretty much duplicates the frontend code - if we leave out the last
line, we could just return the warning message, and call the field in the
API return value "totem_warning(s)" or "health_warnings" or just
"warnings", and potentially add more information in the future? we could
still keep the level and return

    warnings = [
        {
            level => ...,
            msg => ...,
        },
    ]

but I don't currently see a reason why we'd benefit from returning raw
values and constructing the warning message on both ends?
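
as a minimal sketch of that shape: the function name
`get_totem_warnings` and the table layout are hypothetical, the
thresholds and texts are the ones from the quoted patch, and the
"See 'man pvecm'" line is left out so each caller can add its own
pointer to the docs:

```perl
use strict;
use warnings;

# Thresholds (in seconds), levels and texts as in the quoted patch,
# ordered from most to least severe.
my $levels = [
    [45, 'change-strongly-recommended', 'Lowering the token coefficient is strongly recommended'],
    [40, 'change-recommended', 'Lowering the token coefficient is recommended'],
    [30, 'optimize', 'The token coefficient can be optimized'],
];

# Hypothetical replacement for the two helpers: returns an array ref of
# { level, msg } hashes, which is empty if there is nothing to warn about.
sub get_totem_warnings {
    my ($total_timeout_secs) = @_;

    my $warnings = [];
    for my $entry (@$levels) {
        my ($threshold, $level, $text) = @$entry;
        next if $total_timeout_secs <= $threshold;
        push @$warnings, {
            level => $level,
            msg => "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. $text.",
        };
        last; # only report the most severe matching level
    }
    return $warnings;
}
```

with the defaults from the patch, a 16 node cluster ends up at a token
timeout of 3000 + 14 * 650 = 12100 ms and a consensus timeout of
12100 * 1.2 = 14520 ms, so 26.62 s in total and an empty list here.
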
> +}
> +
> 1;
> --
> 2.47.3