From: "Daniel Kral" <d.kral@proxmox.com>
To: "Thomas Lamprecht" <t.lamprecht@proxmox.com>,
<pve-devel@lists.proxmox.com>
Subject: Re: [PATCH ha-manager v2 3/4] fix #2751: implement disarm-ha and arm-ha for safe cluster maintenance
Date: Thu, 26 Mar 2026 17:02:40 +0100 [thread overview]
Message-ID: <DHCU9R824P2D.38DIPJF7JXIZ9@proxmox.com> (raw)
In-Reply-To: <20260321234350.2158438-4-t.lamprecht@proxmox.com>
Nice work!
While rebasing the load balancing patches on this, I had a few thoughts
about the transient states, which I wanted to add below. Nothing really
urgent and I might have missed something so correct me if I'm wrong.
One small UX nit:
As the test case 'test-disarm-relocate1' documents, the 'disarm-ha'
command takes priority over the migrate/relocate command. It is more of
an edge case anyway that both a migration an disarm-ha command are
issued in the same HA Manager cycle, but of course could still happen.
Handling the disarming as an admin task is more important than a
user-initiated migration request. It would still be great that this
action is relayed back to the user for the migration request (which
might only see the hamigrate task but no following qmigrate task), but
this is more an UX thing and should be handled e.g. as part of #6220 [0]
so users can directly follow the HA request and the further actions.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=6220
On Sun Mar 22, 2026 at 12:42 AM CET, Thomas Lamprecht wrote:
> diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
> index b1dbe6a..aa29858 100644
> --- a/src/PVE/HA/Manager.pm
> +++ b/src/PVE/HA/Manager.pm
> @@ -631,6 +684,94 @@ sub try_persistent_group_migration {
> }
> }
>
> +sub handle_disarm {
> + my ($self, $disarm, $ss, $lrm_modes) = @_;
> +
> + my $haenv = $self->{haenv};
> + my $ns = $self->{ns};
> +
> + # defer disarm if any services are in a transient state that needs the state machine to resolve
> + for my $sid (sort keys %$ss) {
> + my $state = $ss->{$sid}->{state};
> + if ($state eq 'fence' || $state eq 'recovery') {
> + $haenv->log(
> + 'warn', "deferring disarm - service '$sid' is in '$state' state",
> + );
> + return 0; # let manage() continue so fence/recovery can progress
> + }
> + if ($state eq 'migrate' || $state eq 'relocate') {
> + $haenv->log(
> + 'info', "deferring disarm - service '$sid' is in '$state' state",
> + );
> + return 0; # let manage() continue so migration can complete
> + }
> + }
Here the HA disarming is deferred so that the HA Manager can continue
with processing the HA resources FSM if at least one of the HA resources
is in one of the 4 transient states.
This can have side-effects for other HA resources as well, which
currently aren't in one of these 4 transient states, but are implicitly
in a transient state.
For example:
There are two HA resources in the current service state.
{
...
"service_status": {
"vm:101": {
"state": "started",
"node": "node1",
"cmd": [ "migrate", "node2" ],
"uid": "...",
},
"vm:102": {
"state": "...",
"node": "node1",
"uid": "...",
},
}
}
If vm:102 is in state 'started', then the HA Manager will start to
disarm the HA stack right away and drop vm:101's migration request in
both disarm modes.
If vm:102 is in any of the 4 transient states from above though, then
the HA Manager will defer the disarm and process all HA resources, which
will cause 'vm:101' to be put in 'migrate' now.
So the transient state of one HA resource might cause other HA resources
to be put in a transient state as well, even though I would have
expected the deferring here to only resolve the existing transient
states.
This also means that the HA Manager will handle user requests for HA
resources in the same HA Manager cycle as a disarm request if there is
at least one HA resource in one of the 4 transient states. This
contradicts that disarming takes priority over the migrate/relocate
request.
Two other possible cases of 'implicitly transient' states might be:
- adding a HA rule, which makes a HA resource in 'started' state be put
in 'migrate' to another node when processing the select_service_node()
in next_state_started().
- the node of a HA resource is offline delayed in the same round as the
disarm request. If none of the HA resources are in a transient state
yet, the disarm request goes through, otherwise the affected HA
resources might be put in 'fence'.
I haven't thought this through fully, but an option might be that we
only allow the FSM processing of the HA resources, which are in one of
these 4 transient states and don't process the others.
E.g. breaking out the FSM transition loop into its own function and in
normal operation we iterate through all services in $ss, but for
deferred disarming we only iterate through the HA resources in transient
states, which should be resolved.
> +
> + my $mode = $disarm->{mode};
> +
> + # prune stale runtime data (failed_nodes, cmd, target, ...) so the state machine starts
> + # fresh on re-arm; preserve maintenance_node for correct return behavior
> + my %keep_keys = map { $_ => 1 } qw(state node uid maintenance_node);
> +
> + if ($mode eq 'freeze') {
> + for my $sid (sort keys %$ss) {
> + my $sd = $ss->{$sid};
> + my $state = $sd->{state};
> + next if $state eq 'freeze'; # already frozen
> + if (
> + $state eq 'started'
> + || $state eq 'stopped'
> + || $state eq 'request_stop'
> + || $state eq 'request_start'
> + || $state eq 'request_start_balance'
> + || $state eq 'error'
> + ) {
> + $haenv->log('info', "disarm: freezing service '$sid' (was '$state')");
> + delete $sd->{$_} for grep { !$keep_keys{$_} } keys %$sd;
> + $sd->{state} = 'freeze';
> + $sd->{uid} = compute_new_uuid('freeze');
> + }
> + }
> + } elsif ($mode eq 'ignore') {
> + # keep $ss intact; the disarm flag in $ms causes service loops and vm_is_ha_managed()
> + # to skip these services while disarmed
> + for my $sid (sort keys %$ss) {
> + my $sd = $ss->{$sid};
> + delete $sd->{$_} for grep { !$keep_keys{$_} } keys %$sd;
> + }
> + }
> +
> + # check if all online LRMs have entered disarm mode
> + my $all_disarmed = 1;
> + my $online_nodes = $ns->list_online_nodes();
> +
> + for my $node (@$online_nodes) {
> + my $lrm_mode = $lrm_modes->{$node} // 'unknown';
> + if ($lrm_mode ne 'disarm') {
> + $all_disarmed = 0;
> + last;
> + }
> + }
> +
> + if ($all_disarmed && $disarm->{state} ne 'disarmed') {
> + $haenv->log('info', "all LRMs disarmed, HA stack is now fully disarmed");
> + $disarm->{state} = 'disarmed';
> + }
> +
> + # once disarmed, stay disarmed - a returning node's LRM will catch up within one cycle
> + $self->{all_lrms_disarmed} = $disarm->{state} eq 'disarmed';
> +
> + $self->flush_master_status();
> +
> + return 1;
> +}
> +
> +sub is_fully_disarmed {
> + my ($self) = @_;
> +
> + return $self->{all_lrms_disarmed};
> +}
> +
> sub manage {
> my ($self) = @_;
>
> @@ -713,6 +857,15 @@ sub manage {
>
> $self->update_crm_commands();
>
> + if (my $disarm = $ms->{disarm}) {
> + if ($self->handle_disarm($disarm, $ss, $lrm_modes)) {
> + return; # disarm active and progressing, skip normal service state machine
> + }
> + # disarm deferred (e.g. due to active fencing) - fall through to let it complete
> + }
Regarding the load balancing patches, I would change this to
- }
+ } else {
+ # load balance only if disarm is disabled as during a deferred disarm
+ # the HA Manager should not introduce any new migrations
+ $self->load_balance();
+ }
there.
> +
> + $self->{all_lrms_disarmed} = 0;
> +
> for (;;) {
> my $repeat = 0;
next prev parent reply other threads:[~2026-03-26 16:02 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-21 23:42 [PATCH ha-manager v2 0/4] fix #2751: implement disarm/arm HA for safer " Thomas Lamprecht
2026-03-21 23:42 ` [PATCH ha-manager v2 1/4] sim: hardware: add manual-migrate command for ignored services Thomas Lamprecht
2026-03-21 23:42 ` [PATCH ha-manager v2 2/4] api: status: add fencing status entry with armed/standby state Thomas Lamprecht
2026-03-21 23:42 ` [PATCH ha-manager v2 3/4] fix #2751: implement disarm-ha and arm-ha for safe cluster maintenance Thomas Lamprecht
2026-03-23 13:04 ` Dominik Rusovac
2026-03-25 15:50 ` Fiona Ebner
2026-03-27 1:17 ` Thomas Lamprecht
2026-03-26 16:02 ` Daniel Kral [this message]
2026-03-26 23:15 ` Thomas Lamprecht
2026-03-27 10:21 ` Daniel Kral
2026-03-21 23:42 ` [PATCH ha-manager v2 4/4] api: status: add disarm-ha and arm-ha endpoints and CLI wiring Thomas Lamprecht
2026-03-23 13:05 ` [PATCH ha-manager v2 0/4] fix #2751: implement disarm/arm HA for safer cluster maintenance Dominik Rusovac
2026-03-25 12:06 ` applied: " Thomas Lamprecht
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DHCU9R824P2D.38DIPJF7JXIZ9@proxmox.com \
--to=d.kral@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
--cc=t.lamprecht@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.