* [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters
@ 2026-04-27 17:05 Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 1/8] asciidoc-pve: allow linking sections with get_help_link Michael Köppl
` (8 more replies)
0 siblings, 9 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
This patch series introduces warnings informing users about high token
timeouts in their clusters. A recent change [0] lowered the token
coefficient for clusters and allowed adapting it. However, this change
only affects new clusters. As described in [1], users with existing
clusters should be informed about the high token timeouts in their
configurations and what they can do to alleviate this problem.
Thus, warnings are added to the `pvecm status` command as well as to the
cluster join info dialog in the web UI. The web UI warning describes the
effect that adding another node would have, so that users can make an
informed change before doing so.
changes since v2 (thanks to @Lukas and @Friedrich for the feedback):
- use "lowering" instead of "changing" for the warning strings
- replace leftover hard-coded PVE docs link with `man pvecm`
- update descriptions of timeout_warning_level and expected_timeout
params such that they say "membership recovery timeout", to more
clearly state which timeout this is about
- update function names for get_timeout_warning_level and
get_timeout_warning to
get_membership_recovery_timeout_warning_level and
get_membership_recovery_timeout_warning respectively
changes since v1 (thanks to @Friedrich for the feedback on v1):
- add pve-docs patch to allow using get_help_link to directly link to
an anchor in the local documentation (then used to link to the section
on changing the token coefficient)
- add pve-docs patch to define explicit anchor for the "Changing the
Token Coefficient" section
- add pve-docs patch extending the section for Changing the Token
Coefficient slightly, informing users of the potential warning
messages in the pvecm status output
- change the threshold for "strongly recommend" from 50s to 45s as
suggested by @Friedrich
- adapted the name of calculate_total_timeout to
calculate_membership_recovery_timeout
- adapted commit messages for preparatory pve-manager patch (no
functional changes intended)
- moved the warning message in pvecm from `pvecm nodes` to `pvecm status`
- replaced the URL in the pvecm warning message with a reference to the
pvecm man pages
- link to local documentation in ClusterEdit.js
pve-docs:
Michael Köppl (3):
asciidoc-pve: allow linking sections with get_help_link
pvecm: add explicit anchor for token coefficient section
pvecm: add info about warnings regarding token coefficient
pvecm.adoc | 6 ++++++
scripts/asciidoc-pve.in | 2 +-
2 files changed, 7 insertions(+), 1 deletion(-)
pve-cluster:
Michael Köppl (3):
add functions to determine warning level for high token timeouts
pvecm: warn users of high token timeouts when using status command
api: add token timeout and warning level to cluster join info
src/PVE/API2/ClusterConfig.pm | 24 +++++++++++++++++
src/PVE/CLI/pvecm.pm | 12 +++++++++
src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++
3 files changed, 86 insertions(+)
pve-manager:
Michael Köppl (2):
ui: cluster info: move initialization of items to initComponent
ui: cluster info: warn users of high token timeout in join info
www/manager6/dc/Cluster.js | 4 +
www/manager6/dc/ClusterEdit.js | 141 ++++++++++++++++++++++-----------
2 files changed, 97 insertions(+), 48 deletions(-)
Summary over all repositories:
7 files changed, 190 insertions(+), 49 deletions(-)
--
Generated by murpp 0.11.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH docs v3 1/8] asciidoc-pve: allow linking sections with get_help_link
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 2/8] pvecm: add explicit anchor for token coefficient section Michael Köppl
` (7 subsequent siblings)
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
scan_extjs_file only scanned for occurrences of onlineHelp. Linking to
specific sections of the documentation with get_help_link was not
possible without an accompanying onlineHelp entry. Therefore, also scan
for occurrences of get_help_link.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
I'm sure this can be done more elegantly, but I wanted to avoid breaking
the existing regex for onlineHelp. Perhaps someone more knowledgeable in
regexes can point me in the right direction.
scripts/asciidoc-pve.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/asciidoc-pve.in b/scripts/asciidoc-pve.in
index d42ddbe9..c4e72635 100644
--- a/scripts/asciidoc-pve.in
+++ b/scripts/asciidoc-pve.in
@@ -464,7 +464,7 @@ sub scan_extjs_file {
debug("scan-extjs $filename");
while (defined(my $line = <$fh>)) {
- if ($line =~ m/\s+onlineHelp:\s*[\'\"]([^{}\[\]\'\"]+)[\'\"]/) {
+ if ($line =~ m/(?|\s+onlineHelp:\s*[\'\"]([^{}\[\]\'\"]+)[\'\"]|\bget_help_link\(\s*[\'\"]([^{}\[\]\'\"]+)[\'\"]\s*\))/) {
my $blockid = $1;
my $link = $fileinfo->{blockid_target}->{default}->{$blockid};
if (!(defined($link) || defined($online_help_links->{$blockid}))) {
--
2.47.3
* [PATCH docs v3 2/8] pvecm: add explicit anchor for token coefficient section
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 1/8] asciidoc-pve: allow linking sections with get_help_link Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 3/8] pvecm: add info about warnings regarding token coefficient Michael Köppl
` (6 subsequent siblings)
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
Suggested-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
pvecm.adoc | 1 +
1 file changed, 1 insertion(+)
diff --git a/pvecm.adoc b/pvecm.adoc
index c09d19ff..2d14af11 100644
--- a/pvecm.adoc
+++ b/pvecm.adoc
@@ -1381,6 +1381,7 @@ systemctl restart corosync
On errors, check the troubleshooting section below.
+[[pvecm_changing_token_coefficient]]
Changing the Token Coefficient
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--
2.47.3
* [PATCH docs v3 3/8] pvecm: add info about warnings regarding token coefficient
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 1/8] asciidoc-pve: allow linking sections with get_help_link Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 2/8] pvecm: add explicit anchor for token coefficient section Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-04-27 17:05 ` [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts Michael Köppl
` (5 subsequent siblings)
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
`pvecm status` might warn users of a high sum of token and consensus
timeout, recommending lowering the token coefficient. To make users
aware that these warnings may occur and to allow users to search for
this warning in the docs and man pages, extend the section on lowering
the token coefficient.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
pvecm.adoc | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/pvecm.adoc b/pvecm.adoc
index 2d14af11..3d652652 100644
--- a/pvecm.adoc
+++ b/pvecm.adoc
@@ -1399,6 +1399,11 @@ You can change the token coefficient of an existing cluster by
xref:pvecm_edit_corosync_conf[editing corosync.conf]. Corosync will then
automatically adopt the new value for the cluster.
+Cluster commands may display a warning if the sum of the Corosync token and
+consensus timeouts is considered too high (e.g., "Lowering the token coefficient
+is recommended"). To resolve this warning, it is recommended to lower the token
+coefficient.
+
Troubleshooting
~~~~~~~~~~~~~~~
--
2.47.3
* [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
` (2 preceding siblings ...)
2026-04-27 17:05 ` [PATCH docs v3 3/8] pvecm: add info about warnings regarding token coefficient Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-05-18 14:11 ` Fabian Grünbichler
2026-04-27 17:05 ` [PATCH cluster v3 5/8] pvecm: warn users of high token timeouts when using status command Michael Köppl
` (4 subsequent siblings)
8 siblings, 1 reply; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
High token timeouts can lead to stability problems in clusters. To
inform users about the timeout in their current setup (or expected
timeouts when adding nodes) and give recommendations regarding the token
coefficient setting, introduce functions to calculate the timeout and to
determine the warning / recommendation levels.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
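As a rough illustration of the arithmetic (not part of the patch), the
following is a JavaScript port of the two Perl functions added below. The
JavaScript function names are made up for this sketch; the defaults (token
3000 ms, token coefficient 650 ms, consensus 1.2 * token) and the thresholds
are the ones used in the diff.

```javascript
// Illustrative port of calculate_membership_recovery_timeout; names are hypothetical.
function membershipRecoveryTimeoutSecs(totem, nodeCount) {
    const tokenMs = totem.token ?? 3000;
    const coefficientMs = totem.token_coefficient ?? 650;

    // The coefficient only applies to nodes beyond the second one.
    const extraNodes = Math.max(nodeCount - 2, 0);
    const expectedTokenMs = tokenMs + extraNodes * coefficientMs;

    // Without an explicit consensus value, 1.2 * token is assumed.
    const expectedConsensusMs = totem.consensus ?? expectedTokenMs * 1.2;
    return (expectedTokenMs + expectedConsensusMs) / 1000;
}

// Illustrative port of get_membership_recovery_timeout_warning_level.
function warningLevel(totalSecs) {
    if (totalSecs > 45) return 'change-strongly-recommended';
    if (totalSecs > 40) return 'change-recommended';
    if (totalSecs > 30) return 'optimize';
    return undefined;
}

for (const n of [2, 18, 19, 26, 29]) {
    const secs = membershipRecoveryTimeoutSecs({}, n);
    console.log(`${n} nodes: ${secs.toFixed(2)}s -> ${warningLevel(secs) ?? 'no warning'}`);
}
```

With the defaults and no explicit consensus value, a 2-node cluster ends up
at about 6.6s, and the three levels are first reached at 19, 26 and 29 nodes
respectively.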
src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
index aef0d31..45a1f71 100644
--- a/src/PVE/Corosync.pm
+++ b/src/PVE/Corosync.pm
@@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
return $match_ip_and_version->($resolved_ip);
}
+sub calculate_membership_recovery_timeout {
+ my ($totemcfg, $node_count) = @_;
+
+ my $token_timeout = $totemcfg->{token} // 3000;
+ my $token_coefficient = $totemcfg->{token_coefficient} // 650;
+
+ my $expected_token_timeout = $token_timeout;
+ if ($node_count > 2) {
+ $expected_token_timeout += ($node_count - 2) * $token_coefficient;
+ }
+
+ my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
+ return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
+}
+
+sub get_membership_recovery_timeout_warning_level {
+ my ($total_timeout_secs) = @_;
+
+ if ($total_timeout_secs > 45) {
+ return 'change-strongly-recommended';
+ } elsif ($total_timeout_secs > 40) {
+ return 'change-recommended';
+ } elsif ($total_timeout_secs > 30) {
+ return 'optimize';
+ }
+
+ return undef;
+}
+
+sub get_membership_recovery_timeout_warning {
+ my ($total_timeout_secs) = @_;
+
+ my $level = get_membership_recovery_timeout_warning_level($total_timeout_secs);
+ return undef if !defined($level);
+
+ my $level_msg;
+ if ($level eq 'change-strongly-recommended') {
+ $level_msg = "Lowering the token coefficient is strongly recommended";
+ } elsif ($level eq 'change-recommended') {
+ $level_msg = "Lowering the token coefficient is recommended";
+ } elsif ($level eq 'optimize') {
+ $level_msg = "The token coefficient can be optimized";
+ }
+
+ return
+ "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
+ . "$level_msg. "
+ . "See 'man pvecm' for details.";
+}
+
1;
--
2.47.3
* [PATCH cluster v3 5/8] pvecm: warn users of high token timeouts when using status command
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
` (3 preceding siblings ...)
2026-04-27 17:05 ` [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-04-27 17:05 ` [PATCH cluster v3 6/8] api: add token timeout and warning level to cluster join info Michael Köppl
` (3 subsequent siblings)
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
If the calculated token timeout is above certain thresholds, display a
warning for users when running `pvecm status` as part of the Cluster
Information block. Also point users to the documentation regarding
potential adaptations to their cluster configuration to alleviate the
problem.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
src/PVE/CLI/pvecm.pm | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/src/PVE/CLI/pvecm.pm b/src/PVE/CLI/pvecm.pm
index 7d393a8..64e1d94 100755
--- a/src/PVE/CLI/pvecm.pm
+++ b/src/PVE/CLI/pvecm.pm
@@ -561,6 +561,18 @@ __PACKAGE__->register_method({
$print_info->('Transport', 'transport', 'knet');
$print_info->('Secure auth', 'secauth', 'off');
printf "\n";
+
+ my $nodelist = PVE::Corosync::nodelist($conf);
+ my $total_timeout_secs = PVE::Corosync::calculate_membership_recovery_timeout(
+ $totem,
+ scalar(keys %$nodelist),
+ );
+ if (
+ my $msg =
+ PVE::Corosync::get_membership_recovery_timeout_warning($total_timeout_secs)
+ ) {
+ warn "$msg\n\n";
+ }
}
exec('corosync-quorumtool', '-siH');
--
2.47.3
* [PATCH cluster v3 6/8] api: add token timeout and warning level to cluster join info
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
` (4 preceding siblings ...)
2026-04-27 17:05 ` [PATCH cluster v3 5/8] pvecm: warn users of high token timeouts when using status command Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-04-27 17:05 ` [PATCH manager v3 7/8] ui: cluster info: move initialization of items to initComponent Michael Köppl
` (2 subsequent siblings)
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
Add the expected membership recovery timeout (in seconds) after adding
another node, along with the corresponding warning level, to the join
info. Together they tell users whether lowering the token coefficient is
recommended.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
src/PVE/API2/ClusterConfig.pm | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/src/PVE/API2/ClusterConfig.pm b/src/PVE/API2/ClusterConfig.pm
index bbed40e..e3a07cd 100644
--- a/src/PVE/API2/ClusterConfig.pm
+++ b/src/PVE/API2/ClusterConfig.pm
@@ -571,6 +571,19 @@ __PACKAGE__->register_method({
preferred_node => get_standard_option('pve-node'),
totem => { type => 'object' },
config_digest => { type => 'string' },
+ expected_timeout => {
+ type => 'number',
+ description =>
+ "Expected total membership recovery timeout (in seconds) if an additional node is added.",
+ optional => 1,
+ },
+ timeout_warning_level => {
+ type => 'string',
+ description =>
+ "Warning level for the expected total membership recovery timeout.",
+ optional => 1,
+ enum => ['optimize', 'change-recommended', 'change-strongly-recommended'],
+ },
},
},
code => sub {
@@ -599,12 +612,23 @@ __PACKAGE__->register_method({
$node->{pve_addr} = scalar(PVE::Cluster::remote_node_ip($name));
}
+ # Total timeout if additional node is added
+ my $total_timeout_secs = PVE::Corosync::calculate_membership_recovery_timeout(
+ $totem_cfg,
+ scalar(keys %$nodelist) + 1,
+ );
+
+ my $warning_level =
+ PVE::Corosync::get_membership_recovery_timeout_warning_level($total_timeout_secs);
+
my $res = {
nodelist => [values %$nodelist],
preferred_node => $nodename,
totem => $totem_cfg,
config_digest => $corosync_config_digest,
+ expected_timeout => $total_timeout_secs,
};
+ $res->{timeout_warning_level} = $warning_level if defined($warning_level);
return $res;
},
--
2.47.3
* [PATCH manager v3 7/8] ui: cluster info: move initialization of items to initComponent
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
` (5 preceding siblings ...)
2026-04-27 17:05 ` [PATCH cluster v3 6/8] api: add token timeout and warning level to cluster join info Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-04-27 17:05 ` [PATCH manager v3 8/8] ui: cluster info: warn users of high token timeout in join info Michael Köppl
2026-05-04 9:37 ` [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Lukas Sichert
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
This allows conditionally adding items to the form. No functional
changes intended.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
www/manager6/dc/ClusterEdit.js | 102 +++++++++++++++++----------------
1 file changed, 54 insertions(+), 48 deletions(-)
diff --git a/www/manager6/dc/ClusterEdit.js b/www/manager6/dc/ClusterEdit.js
index 109325855..aff1515ab 100644
--- a/www/manager6/dc/ClusterEdit.js
+++ b/www/manager6/dc/ClusterEdit.js
@@ -57,58 +57,64 @@ Ext.define('PVE.ClusterInfoWindow', {
totem: {},
},
- items: [
- {
- xtype: 'component',
- border: false,
- padding: '10 10 10 10',
- html: gettext('Copy the Join Information here and use it on the node you want to add.'),
- },
- {
- xtype: 'container',
- layout: 'form',
- border: false,
- padding: '0 10 10 10',
- items: [
- {
- xtype: 'textfield',
- fieldLabel: gettext('IP Address'),
- cbind: {
- value: '{joinInfo.ipAddress}',
- },
- editable: false,
- },
- {
- xtype: 'textfield',
- fieldLabel: gettext('Fingerprint'),
- cbind: {
- value: '{joinInfo.fingerprint}',
+ initComponent: function () {
+ var me = this;
+
+ var joinInfo = me.joinInfo;
+
+ me.items = [];
+
+ me.items.push(
+ {
+ xtype: 'component',
+ border: false,
+ padding: '10 10 10 10',
+ html: gettext(
+ 'Copy the Join Information here and use it on the node you want to add.',
+ ),
+ },
+ {
+ xtype: 'container',
+ layout: 'form',
+ border: false,
+ padding: '0 10 10 10',
+ items: [
+ {
+ xtype: 'textfield',
+ fieldLabel: gettext('IP Address'),
+ value: joinInfo.ipAddress,
+ editable: false,
},
- editable: false,
- },
- {
- xtype: 'textarea',
- inputId: 'pveSerializedClusterInfo',
- fieldLabel: gettext('Join Information'),
- grow: true,
- cbind: {
- joinInfo: '{joinInfo}',
+ {
+ xtype: 'textfield',
+ fieldLabel: gettext('Fingerprint'),
+ value: joinInfo.fingerprint,
+ editable: false,
},
- editable: false,
- listeners: {
- afterrender: function (field) {
- if (!field.joinInfo) {
- return;
- }
- var jsons = Ext.JSON.encode(field.joinInfo);
- var base64s = Ext.util.Base64.encode(jsons);
- field.setValue(base64s);
+ {
+ xtype: 'textarea',
+ inputId: 'pveSerializedClusterInfo',
+ fieldLabel: gettext('Join Information'),
+ grow: true,
+ joinInfo: joinInfo,
+ editable: false,
+ listeners: {
+ afterrender: function (field) {
+ if (!field.joinInfo) {
+ return;
+ }
+ var jsons = Ext.JSON.encode(field.joinInfo);
+ var base64s = Ext.util.Base64.encode(jsons);
+ field.setValue(base64s);
+ },
},
},
- },
- ],
- },
- ],
+ ],
+ },
+ );
+
+ me.callParent();
+ },
dockedItems: [
{
dock: 'bottom',
--
2.47.3
* [PATCH manager v3 8/8] ui: cluster info: warn users of high token timeout in join info
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
` (6 preceding siblings ...)
2026-04-27 17:05 ` [PATCH manager v3 7/8] ui: cluster info: move initialization of items to initComponent Michael Köppl
@ 2026-04-27 17:05 ` Michael Köppl
2026-05-04 9:37 ` [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Lukas Sichert
8 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-04-27 17:05 UTC (permalink / raw)
To: pve-devel
If another node would increase Corosync's token timeout to a level that
might affect the stability of the cluster, display a warning hint to
users, pointing them to the documentation section about changing the
token coefficient, allowing them to make an informed change before
adding another node.
Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
---
www/manager6/dc/Cluster.js | 4 ++++
www/manager6/dc/ClusterEdit.js | 39 ++++++++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)
diff --git a/www/manager6/dc/Cluster.js b/www/manager6/dc/Cluster.js
index 2ec5588c3..00138f328 100644
--- a/www/manager6/dc/Cluster.js
+++ b/www/manager6/dc/Cluster.js
@@ -91,6 +91,8 @@ Ext.define('PVE.ClusterAdministration', {
vm.set('totem', data.totem);
vm.set('isInCluster', !!data.totem.cluster_name);
vm.set('nodelist', data.nodelist);
+ vm.set('expected_timeout', data.expected_timeout);
+ vm.set('timeout_warning_level', data.timeout_warning_level);
let nodeinfo = data.nodelist.find((el) => el.name === data.preferred_node);
@@ -133,6 +135,8 @@ Ext.define('PVE.ClusterAdministration', {
peerLinks: vm.get('preferred_node.peerLinks'),
ring_addr: vm.get('preferred_node.ring_addr'),
totem: vm.get('totem'),
+ expected_timeout: vm.get('expected_timeout'),
+ timeout_warning_level: vm.get('timeout_warning_level'),
},
});
},
diff --git a/www/manager6/dc/ClusterEdit.js b/www/manager6/dc/ClusterEdit.js
index aff1515ab..800ae5dd4 100644
--- a/www/manager6/dc/ClusterEdit.js
+++ b/www/manager6/dc/ClusterEdit.js
@@ -55,6 +55,8 @@ Ext.define('PVE.ClusterInfoWindow', {
ipAddress: undefined,
fingerprint: undefined,
totem: {},
+ expected_timeout: undefined,
+ timeout_warning_level: undefined,
},
initComponent: function () {
@@ -113,6 +115,43 @@ Ext.define('PVE.ClusterInfoWindow', {
},
);
+ if (joinInfo.expected_timeout && joinInfo.timeout_warning_level) {
+ let level;
+ if (joinInfo.timeout_warning_level === 'change-strongly-recommended') {
+ level = gettext('Lowering the token coefficient is strongly recommended');
+ } else if (joinInfo.timeout_warning_level === 'change-recommended') {
+ level = gettext('Lowering the token coefficient is recommended');
+ } else if (joinInfo.timeout_warning_level === 'optimize') {
+ level = gettext('The token coefficient can be optimized');
+ }
+
+ let msg = Ext.String.format(
+ gettext(
+ "Adding another node will increase the sum of Corosync's token and consensus timeout to {0}s. {1}." +
+ ' See {2} for details.',
+ ),
+ joinInfo.expected_timeout,
+ level,
+ '<a target="_blank" href="' +
+ Proxmox.Utils.get_help_link('pvecm_changing_token_coefficient') +
+ '">the documentation</a>',
+ );
+
+ me.items.push({
+ xtype: 'container',
+ border: false,
+ padding: '0 10 10 10',
+ items: [
+ {
+ itemId: 'joinInfoWarningHint',
+ xtype: 'displayfield',
+ userCls: 'pmx-hint',
+ value: msg,
+ },
+ ],
+ });
+ }
+
me.callParent();
},
dockedItems: [
--
2.47.3
* Re: [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
` (7 preceding siblings ...)
2026-04-27 17:05 ` [PATCH manager v3 8/8] ui: cluster info: warn users of high token timeout in join info Michael Köppl
@ 2026-05-04 9:37 ` Lukas Sichert
8 siblings, 0 replies; 14+ messages in thread
From: Lukas Sichert @ 2026-05-04 9:37 UTC (permalink / raw)
To: Michael Köppl, pve-devel
What I have tested:
- a small selection of links using 'onlineHelp' still work
- the new link to documentation in the 'Join Information' field in the
ui using the 'get_help_link' works
- the correct messages are displayed in the cli and ui if the
`token_coefficient` is high enough
Tested-by: Lukas Sichert <l.sichert@proxmox.com>
* Re: [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
2026-04-27 17:05 ` [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts Michael Köppl
@ 2026-05-18 14:11 ` Fabian Grünbichler
2026-05-18 15:39 ` Michael Köppl
0 siblings, 1 reply; 14+ messages in thread
From: Fabian Grünbichler @ 2026-05-18 14:11 UTC (permalink / raw)
To: Michael Köppl, pve-devel
On April 27, 2026 7:05 pm, Michael Köppl wrote:
> High token timeouts can lead to stability problems in clusters. To
> inform users about the timeout in their current setup (or expected
> timeouts when adding nodes) and give recommendations regarding the token
> coefficient setting, introduce functions to calculate the timeout and to
> determine the warning / recommendation levels.
>
> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
> ---
> src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
> index aef0d31..45a1f71 100644
> --- a/src/PVE/Corosync.pm
> +++ b/src/PVE/Corosync.pm
> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
> return $match_ip_and_version->($resolved_ip);
> }
>
> +sub calculate_membership_recovery_timeout {
> + my ($totemcfg, $node_count) = @_;
> +
> + my $token_timeout = $totemcfg->{token} // 3000;
> + my $token_coefficient = $totemcfg->{token_coefficient} // 650;
> +
> + my $expected_token_timeout = $token_timeout;
> + if ($node_count > 2) {
> + $expected_token_timeout += ($node_count - 2) * $token_coefficient;
> + }
> +
> + my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
> + return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
we could also ask corosync (via corosync-cmapctl) about most of these,
to avoid duplicating the calculations/defaults. the only thing missing
is the coefficient, though we could probably expose that on the corosync
side as well.
> +}
> +
> +sub get_membership_recovery_timeout_warning_level {
> + my ($total_timeout_secs) = @_;
> +
> + if ($total_timeout_secs > 45) {
> + return 'change-strongly-recommended';
> + } elsif ($total_timeout_secs > 40) {
> + return 'change-recommended';
> + } elsif ($total_timeout_secs > 30) {
> + return 'optimize';
> + }
> +
> + return undef;
> +}
> +
> +sub get_membership_recovery_timeout_warning {
> + my ($total_timeout_secs) = @_;
> +
> + my $level = get_membership_recovery_timeout_warning_level($total_timeout_secs);
> + return undef if !defined($level);
> +
> + my $level_msg;
> + if ($level eq 'change-strongly-recommended') {
> + $level_msg = "Lowering the token coefficient is strongly recommended";
> + } elsif ($level eq 'change-recommended') {
> + $level_msg = "Lowering the token coefficient is recommended";
> + } elsif ($level eq 'optimize') {
> + $level_msg = "The token coefficient can be optimized";
> + }
> +
> + return
> + "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
> + . "$level_msg. "
> + . "See 'man pvecm' for details.";
this pretty much duplicates the frontend code - if we leave out the last
line we could just return the warning message, and call the field in the
API return value "totem_warning(s)" or "health_warnings" or just
"warnings" and potentially add more information in the future? we could
still keep the level and return
warnings = [
level => ...,
msg => ...,
]
but I don't currently see a reason why we'd benefit from returning raw
values and constructing the warning message on both ends?
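as a rough sketch of what that could look like on the frontend side (the
field names `warnings`, `level` and `msg` are just the placeholders from
above, and `renderWarnings` is a hypothetical helper, not existing code):

```javascript
// Hypothetical helper: turn backend-provided warnings into hint entries.
// Assumes the API returns e.g. { warnings: [{ level: '...', msg: '...' }] },
// with msg not containing the trailing "See ... for details." part.
function renderWarnings(joinInfo, helpLink) {
    const warnings = joinInfo.warnings ?? [];
    return warnings.map((w) => ({
        level: w.level,
        // Each consumer appends its own reference (docs link, man page, ...).
        text: helpLink ? `${w.msg} See ${helpLink} for details.` : w.msg,
    }));
}

const example = {
    warnings: [
        {
            level: 'change-recommended',
            msg: 'Lowering the token coefficient is recommended.',
        },
    ],
};
console.log(renderWarnings(example, 'the documentation'));
```

the CLI would then print the same `msg` and append its own pointer to the
man page, so the wording lives in one place only.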
> +}
> +
> 1;
> --
> 2.47.3
* Re: [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
2026-05-18 14:11 ` Fabian Grünbichler
@ 2026-05-18 15:39 ` Michael Köppl
2026-05-19 6:59 ` Fabian Grünbichler
0 siblings, 1 reply; 14+ messages in thread
From: Michael Köppl @ 2026-05-18 15:39 UTC (permalink / raw)
To: Fabian Grünbichler, Michael Köppl, pve-devel
On Mon May 18, 2026 at 4:11 PM CEST, Fabian Grünbichler wrote:
> On April 27, 2026 7:05 pm, Michael Köppl wrote:
>> High token timeouts can lead to stability problems in clusters. To
>> inform users about the timeout in their current setup (or expected
>> timeouts when adding nodes) and give recommendations regarding the token
>> coefficient setting, introduce functions to calculate the timeout and to
>> determine the warning / recommendation levels.
>>
>> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
>> ---
>> src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 50 insertions(+)
>>
>> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
>> index aef0d31..45a1f71 100644
>> --- a/src/PVE/Corosync.pm
>> +++ b/src/PVE/Corosync.pm
>> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>> return $match_ip_and_version->($resolved_ip);
>> }
>>
>> +sub calculate_membership_recovery_timeout {
>> + my ($totemcfg, $node_count) = @_;
>> +
>> + my $token_timeout = $totemcfg->{token} // 3000;
>> + my $token_coefficient = $totemcfg->{token_coefficient} // 650;
>> +
>> + my $expected_token_timeout = $token_timeout;
>> + if ($node_count > 2) {
>> + $expected_token_timeout += ($node_count - 2) * $token_coefficient;
>> + }
>> +
>> + my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
>> + return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
>
> we could also ask corosync (via corosync-cmapctl) about most of these,
> to avoid duplicating the calculations/defaults. the only thing missing
> is the coefficient, though we could probably expose that on the corosync
> side as well.
Thanks for having a look at this!
In the original implementation I used the values from cmap directly. The
reason I decided to implement it like this later on was that I wanted to
be able to calculate the timeout for an arbitrary number of nodes
(although n and n+1 would suffice) to be able to display a warning
before adding another node if the timeout would increase to a
"problematic" level. I suppose using the values from corosync-cmapctl
and then adding $node_delta * $token_coefficient to the token timeout
would work, but apart from avoiding duplication of the defaults, I'm
not sure this would improve the solution much? Or am I missing
something here?
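For illustration, the cmap-based variant could be sketched roughly as below.
This is only a sketch: project_membership_recovery_timeout and its
parameters are hypothetical names, and it assumes the cluster already has at
least two nodes, so that every added node contributes one full coefficient.
A consensus timeout that is set explicitly is assumed not to scale.

```perl
use strict;
use warnings;

# Hypothetical helper: project the membership recovery timeout (in seconds)
# after adding $node_delta nodes, starting from the token timeout that
# Corosync currently reports (in milliseconds).
sub project_membership_recovery_timeout {
    my ($current_token_ms, $token_coefficient_ms, $node_delta, $configured_consensus_ms) = @_;

    my $token_ms = $current_token_ms + $node_delta * $token_coefficient_ms;

    # As in the patch, fall back to 1.2 * token if consensus is not configured.
    my $consensus_ms = $configured_consensus_ms // $token_ms * 1.2;

    return ($token_ms + $consensus_ms) / 1000.0;
}

# Current state (delta 0) and the state after adding one node (delta 1).
printf "now: %.2fs, with one more node: %.2fs\n",
    project_membership_recovery_timeout(3650, 650, 0),
    project_membership_recovery_timeout(3650, 650, 1);
```

The 3650 ms used here is just the token timeout the patch formula yields
for three nodes with the defaults (3000 + 1 * 650).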
>
>> +}
>> +
>> +sub get_membership_recovery_timeout_warning_level {
>> + my ($total_timeout_secs) = @_;
>> +
[snip]
>> + my $level_msg;
>> + if ($level eq 'change-strongly-recommended') {
>> + $level_msg = "Lowering the token coefficient is strongly recommended";
>> + } elsif ($level eq 'change-recommended') {
>> + $level_msg = "Lowering the token coefficient is recommended";
>> + } elsif ($level eq 'optimize') {
>> + $level_msg = "The token coefficient can be optimized";
>> + }
>> +
>> + return
>> + "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
>> + . "$level_msg. "
>> + . "See 'man pvecm' for details.";
>
> this pretty much duplicates the frontend code - if we leave out the last
> line we could just return the warning message, and call the field in the
> API return value "totem_warning(s)" or "health_warnings" or just
> "warnings" and potentially add more information in the future? we could
> still keep the level and return
>
> warnings = [
> level => ...,
> msg => ...,
> ]
>
> but I don't currently see a reason why we'd benefit from returning raw
> values and constructing the warning message on both ends?
The messages themselves differ because one warning message is for the
current state, whereas the other is for what would happen if another
node was added to the cluster, but I agree that it's unnecessarily
duplicated. We could instead return the warning message as
totem_warnings, as you suggested, but offer different warning messages
depending on a $node_delta (the number of nodes added on top of the current
state, which will pretty much be 1 in all cases right now)?
>
>> +}
>> +
>> 1;
>> --
>> 2.47.3
>>
>>
>>
>>
>>
>>
* Re: [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
2026-05-18 15:39 ` Michael Köppl
@ 2026-05-19 6:59 ` Fabian Grünbichler
2026-05-19 11:40 ` Michael Köppl
0 siblings, 1 reply; 14+ messages in thread
From: Fabian Grünbichler @ 2026-05-19 6:59 UTC (permalink / raw)
To: Michael Köppl, pve-devel
On May 18, 2026 5:39 pm, Michael Köppl wrote:
> On Mon May 18, 2026 at 4:11 PM CEST, Fabian Grünbichler wrote:
>> On April 27, 2026 7:05 pm, Michael Köppl wrote:
>>> High token timeouts can lead to stability problems in clusters. To
>>> inform users about the timeout in their current setup (or expected
>>> timeouts when adding nodes) and give recommendations regarding the token
>>> coefficient setting, introduce function to calculate the timeout as well
>>> as determine the warning / recommendation levels.
>>>
>>> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
>>> ---
>>> src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 50 insertions(+)
>>>
>>> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
>>> index aef0d31..45a1f71 100644
>>> --- a/src/PVE/Corosync.pm
>>> +++ b/src/PVE/Corosync.pm
>>> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>>> return $match_ip_and_version->($resolved_ip);
>>> }
>>>
>>> +sub calculate_membership_recovery_timeout {
>>> + my ($totemcfg, $node_count) = @_;
>>> +
>>> + my $token_timeout = $totemcfg->{token} // 3000;
>>> + my $token_coefficient = $totemcfg->{token_coefficient} // 650;
>>> +
>>> + my $expected_token_timeout = $token_timeout;
>>> + if ($node_count > 2) {
>>> + $expected_token_timeout += ($node_count - 2) * $token_coefficient;
>>> + }
>>> +
>>> + my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
>>> + return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
>>
>> we could also ask corosync (via corosync-cmapctl) about most of these,
>> to avoid duplicating the calculations/defaults. the only thing missing
>> is the coefficient, though we could probably expose that on the corosync
>> side as well.
>
> Thanks for having a look at this!
>
> In the original implementation I used the values from cmap directly. The
> reason I decided to implement it like this later on was that I wanted to
> be able to calculate the timeout for an arbitrary number of nodes
> (although n and n+1 would suffice) to be able to display a warning
> before adding another node if the timeout would increase to a
> "problematic" level. I suppose using the values from corosync-cmapctl
> and then adding $node_delta * $token_coefficient to the token timeout
> would work, but apart from avoiding duplication of the defaults, I'm
> not sure this would improve the solution much? Or am I missing
> something here?
if corosync ever changes its calculation or defaults, the current
approach is bad ;)
of course, that also still applies if we get the current value and the
coefficient from corosync, in case it is the formula that changes..
>>> +}
>>> +
>>> +sub get_membership_recovery_timeout_warning_level {
>>> + my ($total_timeout_secs) = @_;
>>> +
>
> [snip]
>
>>> + my $level_msg;
>>> + if ($level eq 'change-strongly-recommended') {
>>> + $level_msg = "Lowering the token coefficient is strongly recommended";
>>> + } elsif ($level eq 'change-recommended') {
>>> + $level_msg = "Lowering the token coefficient is recommended";
>>> + } elsif ($level eq 'optimize') {
>>> + $level_msg = "The token coefficient can be optimized";
>>> + }
>>> +
>>> + return
>>> + "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
>>> + . "$level_msg. "
>>> + . "See 'man pvecm' for details.";
>>
>> this pretty much duplicates the frontend code - if we leave out the last
>> line we could just return the warning message, and call the field in the
>> API return value "totem_warning(s)" or "health_warnings" or just
>> "warnings" and potentially add more information in the future? we could
>> still keep the level and return
>>
>> warnings = [
>> level => ...,
>> msg => ...,
>> ]
>>
>> but I don't currently see a reason why we'd benefit from returning raw
>> values and constructing the warning message on both ends?
>
> The messages themselves differ because one warning message is for the
> current state, whereas the other is for what would happen if another
> node was added to the cluster, but I agree that it's unnecessarily
> duplicated. We could instead return the warning message as
> totem_warnings, as you suggested, but offer different warning messages
> depending on a $node_delta (the number of nodes added on top of the current
> state, which will pretty much be 1 in all cases right now)?
yeah, I also wondered whether we should just have a boolean flag to
determine whether we want the current value or the one for if one node
were added to the current setup.. but in the end it doesn't make that
much of a difference, unless the user for some reason set a very large
coefficient manually?
for small clusters, we should be below the thresholds anyway, and one
more node doesn't matter. for big clusters, a single node being added
with the default settings would add 0.65 * 2.2 = 1.43 seconds to the
total timeout. the gaps between the warning levels are way bigger than
that, so maybe just checking the current value is enough anyhow?
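To spell out where the 1.43 seconds come from, here is a tiny worked example.
It assumes the defaults used in the quoted patch (coefficient of 650 ms,
consensus derived as 1.2 * token); the variable names are only illustrative.

```perl
use strict;
use warnings;

my $token_coefficient_ms = 650;    # default assumed by the quoted patch
my $consensus_factor = 1.2;        # consensus defaults to 1.2 * token in the patch

# Each additional node raises the token timeout by the coefficient and the
# derived consensus timeout by 1.2 times that amount, so 2.2x in total.
my $per_node_increase_secs = $token_coefficient_ms * (1 + $consensus_factor) / 1000;

printf "each added node adds %.2fs to the total timeout\n", $per_node_increase_secs;
```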
if the user for some reason has a huge token timeout or coefficient or
consensus timeout configured manually, they will most likely already be
in a warning state anyway.. and with the default settings, joining would
at most bump them from slightly below a warning level into that warning
level, it's not like we can jump from "everything fine" to "strongly
recommended" with a single node addition..
it might make more sense to warn about custom values for each of those
three that are above certain thresholds, in addition to the total
timeout checks implemented by this series?
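Such a per-value check could be sketched as follows. This is hypothetical:
find_high_custom_totem_values is a made-up name, and the limits are passed
in by the caller because no concrete thresholds have been proposed in this
thread. Only explicitly configured values are looked at.

```perl
use strict;
use warnings;

# Hypothetical helper: report explicitly configured totem values that exceed
# a caller-supplied limit (all values in milliseconds).
sub find_high_custom_totem_values {
    my ($totemcfg, $limits) = @_;

    my @warnings;
    for my $key (qw(token token_coefficient consensus)) {
        my $value = $totemcfg->{$key};
        my $limit = $limits->{$key};

        # Skip values that are not set explicitly or have no limit defined.
        next if !defined($value) || !defined($limit);

        push @warnings, "totem.$key is set to ${value}ms, above ${limit}ms"
            if $value > $limit;
    }

    return \@warnings;
}
```

The result could be merged with the total-timeout warning, so that both the
CLI and the web UI print from the same list.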
>>
>>> +}
>>> +
>>> 1;
>>> --
>>> 2.47.3
>>>
>>>
>>>
>>>
>>>
>>>
>
>
* Re: [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts
2026-05-19 6:59 ` Fabian Grünbichler
@ 2026-05-19 11:40 ` Michael Köppl
0 siblings, 0 replies; 14+ messages in thread
From: Michael Köppl @ 2026-05-19 11:40 UTC (permalink / raw)
To: Fabian Grünbichler, Michael Köppl, pve-devel
On Tue May 19, 2026 at 8:59 AM CEST, Fabian Grünbichler wrote:
> On May 18, 2026 5:39 pm, Michael Köppl wrote:
>> On Mon May 18, 2026 at 4:11 PM CEST, Fabian Grünbichler wrote:
>>> On April 27, 2026 7:05 pm, Michael Köppl wrote:
>>>> High token timeouts can lead to stability problems in clusters. To
>>>> inform users about the timeout in their current setup (or expected
>>>> timeouts when adding nodes) and give recommendations regarding the token
>>>> coefficient setting, introduce function to calculate the timeout as well
>>>> as determine the warning / recommendation levels.
>>>>
>>>> Signed-off-by: Michael Köppl <m.koeppl@proxmox.com>
>>>> ---
>>>> src/PVE/Corosync.pm | 50 +++++++++++++++++++++++++++++++++++++++++++++
>>>> 1 file changed, 50 insertions(+)
>>>>
>>>> diff --git a/src/PVE/Corosync.pm b/src/PVE/Corosync.pm
>>>> index aef0d31..45a1f71 100644
>>>> --- a/src/PVE/Corosync.pm
>>>> +++ b/src/PVE/Corosync.pm
>>>> @@ -534,4 +534,54 @@ sub resolve_hostname_like_corosync {
>>>> return $match_ip_and_version->($resolved_ip);
>>>> }
>>>>
>>>> +sub calculate_membership_recovery_timeout {
>>>> + my ($totemcfg, $node_count) = @_;
>>>> +
>>>> + my $token_timeout = $totemcfg->{token} // 3000;
>>>> + my $token_coefficient = $totemcfg->{token_coefficient} // 650;
>>>> +
>>>> + my $expected_token_timeout = $token_timeout;
>>>> + if ($node_count > 2) {
>>>> + $expected_token_timeout += ($node_count - 2) * $token_coefficient;
>>>> + }
>>>> +
>>>> + my $expected_consensus_timeout = $totemcfg->{consensus} // $expected_token_timeout * 1.2;
>>>> + return ($expected_token_timeout + $expected_consensus_timeout) / 1000.0;
>>>
>>> we could also ask corosync (via corosync-cmapctl) about most of these,
>>> to avoid duplicating the calculations/defaults. the only thing missing
>>> is the coefficient, though we could probably expose that on the corosync
>>> side as well.
>>
>> Thanks for having a look at this!
>>
>> In the original implementation I used the values from cmap directly. The
>> reason I decided to implement it like this later on was that I wanted to
>> be able to calculate the timeout for an arbitrary number of nodes
>> (although n and n+1 would suffice) to be able to display a warning
>> before adding another node if the timeout would increase to a
>> "problematic" level. I suppose using the values from corosync-cmapctl
>> and then adding $node_delta * $token_coefficient to the token timeout
>> would work, but apart from avoiding duplication of the defaults, I'm
>> not sure this would improve the solution much? Or am I missing
>> something here?
>
> if corosync ever changes its calculation or defaults, the current
> approach is bad ;)
>
> of course, that also still applies if we get the current value and the
> coefficient from corosync, in case it is the formula that changes..
I agree that getting the values from Corosync directly makes more sense
to avoid future divergence between what our implementation looks like
and what Corosync does, at least if we only want to calculate the
current value of the timeout. Given you suggested below that it would
probably make more sense to have warnings only for the current value,
we could do something like:
```perl
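sub calculate_membership_recovery_timeout {
    # read_cmap is expected to return the parsed corosync-cmapctl output
    my $cmap = read_cmap();
    return undef if !$cmap;

    # both runtime values are in milliseconds
    my $token = $cmap->{'runtime.config.totem.token'};
    my $consensus = $cmap->{'runtime.config.totem.consensus'};
    return undef if !defined($token) || !defined($consensus);

    # total membership recovery timeout in seconds
    return ($token + $consensus) / 1000.0;
}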
```
with read_cmap parsing the output of corosync-cmapctl. This could still
be extended to calculate it for an additional node if we wanted to in
the future.
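The parsing part could look roughly like the sketch below. It is an
assumption-laden illustration: parse_cmap_output is a hypothetical name, the
code assumes that corosync-cmapctl prints one "<key> (<type>) = <value>" entry
per line, and the sample values are made up. Actually running the command is
left out.

```perl
use strict;
use warnings;

# Hypothetical parser for corosync-cmapctl output. Assumes each line has the
# form "<key> (<type>) = <value>"; lines that do not match are ignored.
sub parse_cmap_output {
    my ($raw) = @_;

    my $cmap = {};
    for my $line (split(/\n/, $raw)) {
        if ($line =~ m/^(\S+)\s+\(\S+\)\s+=\s+(.*)$/) {
            $cmap->{$1} = $2;
        }
    }

    return $cmap;
}

# Made-up sample output, only to exercise the parser.
my $sample = <<'EOF';
runtime.config.totem.token (u32) = 3650
runtime.config.totem.consensus (u32) = 4380
EOF

my $cmap = parse_cmap_output($sample);
printf "%.2fs\n",
    ($cmap->{'runtime.config.totem.token'} + $cmap->{'runtime.config.totem.consensus'}) / 1000.0;
```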
>
>>>> +}
>>>> +
>>>> +sub get_membership_recovery_timeout_warning_level {
>>>> + my ($total_timeout_secs) = @_;
>>>> +
>>
>> [snip]
>>
>>>> + my $level_msg;
>>>> + if ($level eq 'change-strongly-recommended') {
>>>> + $level_msg = "Lowering the token coefficient is strongly recommended";
>>>> + } elsif ($level eq 'change-recommended') {
>>>> + $level_msg = "Lowering the token coefficient is recommended";
>>>> + } elsif ($level eq 'optimize') {
>>>> + $level_msg = "The token coefficient can be optimized";
>>>> + }
>>>> +
>>>> + return
>>>> + "Sum of Corosync token and consensus timeout is ${total_timeout_secs}s. "
>>>> + . "$level_msg. "
>>>> + . "See 'man pvecm' for details.";
>>>
>>> this pretty much duplicates the frontend code - if we leave out the last
>>> line we could just return the warning message, and call the field in the
>>> API return value "totem_warning(s)" or "health_warnings" or just
>>> "warnings" and potentially add more information in the future? we could
>>> still keep the level and return
>>>
>>> warnings = [
>>> level => ...,
>>> msg => ...,
>>> ]
>>>
>>> but I don't currently see a reason why we'd benefit from returning raw
>>> values and constructing the warning message on both ends?
>>
>> The messages themselves differ because one warning message is for the
>> current state, whereas the other is for what would happen if another
>> node was added to the cluster, but I agree that it's unnecessarily
>> duplicated. We could instead return the warning message as
>> totem_warnings, as you suggested, but offer different warning messages
>> depending on a $node_delta (the number of nodes added on top of the current
>> state, which will pretty much be 1 in all cases right now)?
>
> yeah, I also wondered whether we should just have a boolean flag to
> determine whether we want the current value or the one for if one node
> were added to the current setup.. but in the end it doesn't make that
> much of a difference, unless the user for some reason set a very large
> coefficient manually?
>
> for small clusters, we should be below the thresholds anyway, and one
> more node doesn't matter. for big clusters, a single node being added
> with the default settings would add 0.65 * 2.2 = 1.43 seconds to the
> total timeout. the gaps between the warning levels are way bigger than
> that, so maybe just checking the current value is enough anyhow?
>
> if the user for some reason has a huge token timeout or coefficient or
> consensus timeout configured manually, they will most likely already be
> in a warning state anyway.. and with the default settings, joining would
> at most bump them from slightly below a warning level into that warning
> level, it's not like we can jump from "everything fine" to "strongly
> recommended" with a single node addition..
Agreed, would work for me to adapt it such that we only had the single
warning for the current value. Since @Friedrich and I discussed this
off-list initially and this was the primary reason why I implemented it
this way, maybe he has some input here as well, but I'll prepare a v4
using the values from corosync-cmapctl and with a single warning. Then
we could probably omit the warning level as discussed above and simply
return a `totem_warning` string as part of the response and print that,
appending the "See <documentation> for details" part separately for the
web UI and CLI.
>
> it might make more sense to warn about custom values for each of those
> three that are above certain thresholds, in addition to the total
> timeout checks implemented by this series?
[snip]
Thread overview: 14+ messages
2026-04-27 17:05 [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 1/8] asciidoc-pve: allow linking sections with get_help_link Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 2/8] pvecm: add explicit anchor for token coefficient section Michael Köppl
2026-04-27 17:05 ` [PATCH docs v3 3/8] pvecm: add info about warnings regarding token coefficient Michael Köppl
2026-04-27 17:05 ` [PATCH cluster v3 4/8] add functions to determine warning level for high token timeouts Michael Köppl
2026-05-18 14:11 ` Fabian Grünbichler
2026-05-18 15:39 ` Michael Köppl
2026-05-19 6:59 ` Fabian Grünbichler
2026-05-19 11:40 ` Michael Köppl
2026-04-27 17:05 ` [PATCH cluster v3 5/8] pvecm: warn users of high token timeouts when using status command Michael Köppl
2026-04-27 17:05 ` [PATCH cluster v3 6/8] api: add token timeout and warning level to cluster join info Michael Köppl
2026-04-27 17:05 ` [PATCH manager v3 7/8] ui: cluster info: move initialization of items to initComponent Michael Köppl
2026-04-27 17:05 ` [PATCH manager v3 8/8] ui: cluster info: warn users of high token timeout in join info Michael Köppl
2026-05-04 9:37 ` [PATCH cluster/docs/manager v3 0/8] add warning messages for high token timeouts in clusters Lukas Sichert