From: Fabian Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [RFC ha-manager] manage: handle edge case where a node gets stuck in 'fence' state
Date: Fri, 8 Oct 2021 14:52:26 +0200
Message-ID: <20211008125226.56551-1-f.ebner@proxmox.com>

If all services in 'fence' state are gone from a node (e.g. because
the services were removed) before fence_node() succeeded, the node
would get stuck in the 'fence' state. Avoid this by calling
fence_node() if the node is in 'fence' state, regardless of service
state.

Reported in the community forum:
https://forum.proxmox.com/threads/ha-migration-stuck-is-doing-nothing.94469/

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Not really sure if this is worth it, because it's a hard-to-reach edge
case, but AFAICT there is no good way to get out of being stuck. What
would work is either of:
* Manually correcting the node state.
* Adding a service to the stuck node and triggering a fence
  situation.

An alternative would be to keep services in 'fence' state in the
manager state, even if they were removed from the config. But the
approach from this patch seemed a bit more robust: for example, it
will fix an already existing stuck state, rather than just avoid
creating one.

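To make the difference concrete, here is a stand-alone toy model of one
manager pass. It is only a sketch: plain hashes stand in for the node
and service state, and toy_fence_node() and toy_manage_pass() are
hypothetical helpers, not the real PVE::HA code. It also assumes that
fencing always succeeds and that a fenced node ends up in 'unknown'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model only: none of this is the real PVE::HA::Manager or
# PVE::HA::NodeStatus code.

# Hypothetical stand-in for fence_node(). Assumption: fencing always
# succeeds and moves the node out of 'fence' into 'unknown'.
sub toy_fence_node {
    my ($nodes, $node) = @_;
    $nodes->{$node} = 'unknown';
    return 1;
}

# One simplified manager pass. With $check_nodes unset, only nodes that
# still have a service in 'fence' state get fenced.
sub toy_manage_pass {
    my ($nodes, $services, $check_nodes) = @_;
    my $fenced_nodes = {};

    # Service-driven fencing: reached only via services in 'fence' state.
    for my $sid (sort keys %$services) {
        my $sd = $services->{$sid};
        next if $sd->{state} ne 'fence';
        next if defined($fenced_nodes->{$sd->{node}});
        $fenced_nodes->{$sd->{node}} = toy_fence_node($nodes, $sd->{node}) || 0;
    }

    # Node-driven fencing: also covers nodes without any such service.
    if ($check_nodes) {
        for my $node (sort keys %$nodes) {
            next if $nodes->{$node} ne 'fence';
            next if defined($fenced_nodes->{$node});
            $fenced_nodes->{$node} = toy_fence_node($nodes, $node) || 0;
        }
    }

    return $fenced_nodes;
}

# node2 is in 'fence' state, but its only service was already removed.
my $services = { 'vm:100' => { node => 'node1', state => 'started' } };

my $nodes_old = { node1 => 'online', node2 => 'fence' };
toy_manage_pass($nodes_old, $services, 0);
print "without node check: node2 is '$nodes_old->{node2}'\n";

my $nodes_new = { node1 => 'online', node2 => 'fence' };
toy_manage_pass($nodes_new, $services, 1);
print "with node check: node2 is '$nodes_new->{node2}'\n";
```

In this model, the pass without the node check never looks at node2
again, which is the stuck situation described above.
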
src/PVE/HA/Manager.pm | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 1c66b43..fc445b1 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -472,6 +472,14 @@ sub manage {
                 $repeat = 1; # for faster execution
             }
 
+            # Avoid that a node without services in 'fence' state gets stuck in 'fence' state.
+            for my $node (sort keys $ns->{status}->%*) {
+                next if $ns->get_node_state($node) ne 'fence';
+                next if defined($fenced_nodes->{$node});
+
+                $fenced_nodes->{$node} = $ns->fence_node($node) || 0;
+            }
+
             last if !$repeat;
         }
 
--
2.30.2

Thread overview: 2+ messages
2021-10-08 12:52 Fabian Ebner [this message]
2022-01-19 13:36 ` [pve-devel] applied: " Thomas Lamprecht