public inbox for pve-devel@lists.proxmox.com
From: Friedrich Weber <f.weber@proxmox.com>
To: "Michael Köppl" <m.koeppl@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH cluster 1/5] add functions to determine warning level for high token timeouts
Date: Fri, 17 Apr 2026 10:33:16 +0200
Message-ID: <a0d4baa9-8e7e-4cc5-8117-c8cf0bfc74b0@proxmox.com>
In-Reply-To: <20260330144321.321072-2-m.koeppl@proxmox.com>

On 30/03/2026 16:46, Michael Köppl wrote:
> High token timeouts can lead to stability problems in clusters. To
> inform users about the timeout in their current setup (or expected
> timeouts when adding nodes) and give recommendations regarding the token
> coefficient setting, introduce function to calculate the timeout as well
> as determine the warning / recommendation levels.
> 
> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
> ---
> The timeouts are chosen according to Friedrich's description in [0].
> 
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=7398
> 
>  src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
> 
> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
> index aef0d31..41d4c6f 100644
> --- a/src/PVE/Corosync.pm
> +++ b/src/PVE/Corosync.pm
> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>      return $match_ip_and_version->($resolved_ip);
>  }
>  
> +sub calculate_total_timeout {

I think "total timeout" is a little too vague, especially because it's
also user-facing. I don't think totem/corosync have a term for "sum of
token and consensus timeout", and "sum of token and consensus timeout"
is too long. Perhaps something like "recovery timeout" -- though that's
not perfect, because "Recovery" is a specific state in the totem state
machine. Maybe "membership convergence timeout" (though that's a bit
long and obscure)?

> +    my ($totemcfg, $node_count) = @_;
> +
> +    my $token_timeout = $totemcfg->{token} // 3000;
> +    my $token_coefficient = $totemcfg->{token_coefficient} // 650;
> +
> +    my $expected_token_timeout = $token_timeout;
> +    if ($node_count > 2) {
> +        $expected_token_timeout += ($node_count - 2) * $token_coefficient;
> +    }
> +
> +    my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
> +    return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
> +}
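
To get a feel for what this returns, here is a small standalone sketch
that mirrors the calculation quoted above. It is not part of the patch:
the helper name is made up for illustration, and the defaults (3000,
650, factor 1.2) are simply copied from the quoted code. With those
defaults, 20 nodes come out at roughly 32.3s and 30 nodes at roughly
46.6s.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; it mirrors the quoted
# calculation and uses the same defaults as the patch.
sub sum_of_token_and_consensus_secs {
    my ($totemcfg, $node_count) = @_;

    my $token = $totemcfg->{token} // 3000;
    my $coefficient = $totemcfg->{token_coefficient} // 650;

    # The coefficient only applies to nodes beyond the second one.
    $token += ($node_count - 2) * $coefficient if $node_count > 2;

    # An explicitly configured consensus timeout wins over 1.2 * token.
    my $consensus = $totemcfg->{consensus} // $token * 1.2;

    # Config values are in milliseconds, the result is in seconds.
    return ($token + $consensus) / 1000.0;
}

# Print the sum for a few cluster sizes using only the defaults.
for my $nodes (2, 10, 20, 30) {
    printf "%2d nodes: %.2fs\n", $nodes, sum_of_token_and_consensus_secs({}, $nodes);
}
```
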
> +
> +sub get_timeout_warning_level {
> +    my ($total_timeout_secs) = @_;
> +
> +    if ($total_timeout_secs > 50) {
> +        return 'change-strongly-recommended';

I realize I'm the source of these numbers :) But since >50 is actually
pretty bad already, if we phrase it as "strongly recommended" we can
probably go for a slightly lower number:

- > 45: change-strongly-recommended
- > 40: change-recommended
- > 30: optimize
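
As a rough sketch of how the lookup could look with these lowered
thresholds (the function name is hypothetical, the level strings are
the ones from the patch, and the exact numbers are still up for
discussion):

```perl
use strict;
use warnings;

# Sketch only: same levels as in the patch, but with the threshold for
# 'change-strongly-recommended' lowered from 50 to 45 seconds.
sub timeout_warning_level_sketch {
    my ($timeout_secs) = @_;

    return 'change-strongly-recommended' if $timeout_secs > 45;
    return 'change-recommended' if $timeout_secs > 40;
    return 'optimize' if $timeout_secs > 30;

    # No warning for 30 seconds or less.
    return undef;
}
```

All comparisons are strict, so a value of exactly 45 still maps to
'change-recommended' and exactly 30 yields no warning.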

> +    } elsif ($total_timeout_secs > 40) {
> +        return 'change-recommended';
> +    } elsif ($total_timeout_secs > 30) {
> +        return 'optimize';
> +    }
> +
> +    return undef;
> +}
> +
> +sub get_timeout_warning {
> +    my ($total_timeout_secs) = @_;
> +
> +    my $level = get_timeout_warning_level($total_timeout_secs);
> +    return undef if !defined($level);
> +
> +    my $level_msg;
> +    if ($level eq 'change-strongly-recommended') {
> +        $level_msg = "Changing the token coefficient is strongly recommended";
> +    } elsif ($level eq 'change-recommended') {
> +        $level_msg = "Changing the token coefficient is recommended";
> +    } elsif ($level eq 'optimize') {
> +        $level_msg = "Token coefficient can be optimized";
> +    }
> +
> +    return
> +        "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
> +        . "$level_msg. "
> +        . "See https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_changing_the_token_coefficient for details.";
> +}
> +
>  1;




