From: Friedrich Weber <f.weber@proxmox.com>
To: "Michael Köppl" <m.koeppl@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [PATCH cluster 1/5] add functions to determine warning level for high token timeouts
Date: Fri, 17 Apr 2026 10:33:16 +0200
Message-ID: <a0d4baa9-8e7e-4cc5-8117-c8cf0bfc74b0@proxmox.com>
In-Reply-To: <20260330144321.321072-2-m.koeppl@proxmox.com>

On 30/03/2026 16:46, Michael Köppl wrote:
> High token timeouts can lead to stability problems in clusters. To
> inform users about the timeout in their current setup (or expected
> timeouts when adding nodes) and give recommendations regarding the token
> coefficient setting, introduce function to calculate the timeout as well
> as determine the warning / recommendation levels.
> 
> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
> ---
> The timeouts are chosen according to Friedrich's description in [0].
> 
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=7398
> 
>  src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
> 
> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
> index aef0d31..41d4c6f 100644
> --- a/src/PVE/Corosync.pm
> +++ b/src/PVE/Corosync.pm
> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>      return $match_ip_and_version->($resolved_ip);
>  }
>  
> +sub calculate_total_timeout {

I think "total timeout" is a little too vague, especially because it's
also user-facing. I don't think totem/corosync have a term for "sum of
token and consensus timeout", and "sum of token and consensus timeout"
is too long. Perhaps something like "recovery timeout" -- though that's
not perfect because "Recovery" is a specific state in the totem state
machine. Maybe "membership convergence timeout" (though that's a bit
long and obscure)?

> +    my ($totemcfg, $node_count) = @_;
> +
> +    my $token_timeout = $totemcfg->{token} // 3000;
> +    my $token_coefficient = $totemcfg->{token_coefficient} // 650;
> +
> +    my $expected_token_timeout = $token_timeout;
> +    if ($node_count > 2) {
> +        $expected_token_timeout += ($node_count - 2) * $token_coefficient;
> +    }
> +
> +    my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
> +    return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
> +}
> +
> +sub get_timeout_warning_level {
> +    my ($total_timeout_secs) = @_;
> +
> +    if ($total_timeout_secs > 50) {
> +        return 'change-strongly-recommended';

I realize I'm the source of these numbers :) But since >50 is actually
pretty bad already, if we phrase it as "strongly recommended" we can
probably go for a slightly lower number:

- > 45: change-strongly-recommended
- > 40: change-recommended
- > 30: optimize
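
As a rough sketch of what the lowered thresholds could look like (the sub
name get_timeout_warning_level_proposed is made up here for illustration and
is not part of the patch; only the cut-off values 45 / 40 / 30 come from the
list above):

```perl
use strict;
use warnings;

# Hypothetical variant of get_timeout_warning_level() from the patch,
# using the lower thresholds suggested above (45 / 40 / 30 seconds).
sub get_timeout_warning_level_proposed {
    my ($timeout_secs) = @_;

    return 'change-strongly-recommended' if $timeout_secs > 45;
    return 'change-recommended' if $timeout_secs > 40;
    return 'optimize' if $timeout_secs > 30;

    # No warning for sums of 30 seconds or less.
    return undef;
}

# Print the level for a few sample sums (in seconds).
for my $secs (25, 35, 42, 47) {
    my $level = get_timeout_warning_level_proposed($secs) // 'none';
    print "${secs}s: $level\n";
}
```

The comparisons stay strict (>), as in the patch, so a sum of exactly 45s
would still fall into 'change-recommended'.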

> +    } elsif ($total_timeout_secs > 40) {
> +        return 'change-recommended';
> +    } elsif ($total_timeout_secs > 30) {
> +        return 'optimize';
> +    }
> +
> +    return undef;
> +}
> +
> +sub get_timeout_warning {
> +    my ($total_timeout_secs) = @_;
> +
> +    my $level = get_timeout_warning_level($total_timeout_secs);
> +    return undef if !defined($level);
> +
> +    my $level_msg;
> +    if ($level eq 'change-strongly-recommended') {
> +        $level_msg = "Changing the token coefficient is strongly recommended";
> +    } elsif ($level eq 'change-recommended') {
> +        $level_msg = "Changing the token coefficient is recommended";
> +    } elsif ($level eq 'optimize') {
> +        $level_msg = "Token coefficient can be optimized";
> +    }
> +
> +    return
> +        "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
> +        . "$level_msg. "
> +        . "See https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_changing_the_token_coefficient for details.";
> +}
> +
>  1;
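
To get a feeling for where such thresholds land in terms of cluster size,
here is a small standalone sketch. It copies the calculation from the quoted
calculate_total_timeout and assumes the defaults used in the patch (token
3000 ms, token_coefficient 650 ms, consensus 1.2 x token). The helper
first_node_count_above and the upper bound of 64 nodes are assumptions made
only for this example.

```perl
use strict;
use warnings;

# Same calculation as calculate_total_timeout in the quoted patch:
# sum of expected token and consensus timeout, in seconds.
sub calculate_total_timeout {
    my ($totemcfg, $node_count) = @_;

    my $token_timeout = $totemcfg->{token} // 3000;
    my $token_coefficient = $totemcfg->{token_coefficient} // 650;

    my $expected_token_timeout = $token_timeout;
    if ($node_count > 2) {
        $expected_token_timeout += ($node_count - 2) * $token_coefficient;
    }

    my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
    return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
}

# Hypothetical helper: smallest node count (up to $max_nodes) whose sum
# exceeds the given threshold in seconds, or undef if none does.
sub first_node_count_above {
    my ($totemcfg, $threshold_secs, $max_nodes) = @_;

    for my $n (2 .. $max_nodes) {
        return $n if calculate_total_timeout($totemcfg, $n) > $threshold_secs;
    }
    return undef;
}

# Empty config hash, so all defaults from the patch apply.
for my $threshold (30, 40, 45, 50) {
    my $n = first_node_count_above({}, $threshold, 64) // 'none up to 64';
    print "> ${threshold}s first reached at node count: $n\n";
}
```

Passing a totem config hash with a different token_coefficient shows how far
the same thresholds shift when the coefficient is lowered.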
