From mboxrd@z Thu Jan  1 00:00:00 1970
Mime-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=UTF-8
Date: Thu, 26 Mar 2026 17:02:40 +0100
Message-Id: 
From: "Daniel Kral" 
To: "Thomas Lamprecht" ,
Subject: Re: [PATCH ha-manager v2 3/4] fix #2751: implement disarm-ha and arm-ha for safe cluster maintenance
X-Mailer: aerc 0.21.0-38-g7088c3642f2c-dirty
References: <20260321234350.2158438-1-t.lamprecht@proxmox.com> <20260321234350.2158438-4-t.lamprecht@proxmox.com>
In-Reply-To: <20260321234350.2158438-4-t.lamprecht@proxmox.com>
X-MailFrom: d.kral@proxmox.com
Precedence: list
List-Id: Proxmox VE development discussion

Nice work! While rebasing the load balancing patches on this, I had a few thoughts about the transient states, which I wanted to add below. Nothing really urgent, and I might have missed something, so correct me if I'm wrong.

One small UX nit: As the test case 'test-disarm-relocate1' documents, the 'disarm-ha' command takes priority over the migrate/relocate command. It is more of an edge case anyway that both a migration and a disarm-ha command are issued in the same HA Manager cycle, but of course it could still happen. Handling the disarming as an admin task is more important than a user-initiated migration request. It would still be great if this were relayed back to the user who made the migration request (who might only see the hamigrate task but no following qmigrate task), but this is more of a UX thing and should be handled e.g. as part of #6220 [0], so users can directly follow the HA request and the subsequent actions.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=6220

On Sun Mar 22, 2026 at 12:42 AM CET, Thomas Lamprecht wrote:
> diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
> index b1dbe6a..aa29858 100644
> --- a/src/PVE/HA/Manager.pm
> +++ b/src/PVE/HA/Manager.pm
> @@ -631,6 +684,94 @@ sub try_persistent_group_migration {
>      }
>  }
>
> +sub handle_disarm {
> +    my ($self, $disarm, $ss, $lrm_modes) = @_;
> +
> +    my $haenv = $self->{haenv};
> +    my $ns = $self->{ns};
> +
> +    # defer disarm if any services are in a transient state that needs the state machine to resolve
> +    for my $sid (sort keys %$ss) {
> +        my $state = $ss->{$sid}->{state};
> +        if ($state eq 'fence' || $state eq 'recovery') {
> +            $haenv->log(
> +                'warn', "deferring disarm - service '$sid' is in '$state' state",
> +            );
> +            return 0; # let manage() continue so fence/recovery can progress
> +        }
> +        if ($state eq 'migrate' || $state eq 'relocate') {
> +            $haenv->log(
> +                'info', "deferring disarm - service '$sid' is in '$state' state",
> +            );
> +            return 0; # let manage() continue so migration can complete
> +        }
> +    }

Here the HA disarming is deferred, so that the HA Manager can continue processing the HA resources' FSM, if at least one of the HA resources is in one of the 4 transient states. This can have side effects for other HA resources as well, which currently aren't in one of these 4 transient states, but are implicitly in a transient state.

For example, take two HA resources in the current service state:

{
    ...
    "service_status": {
        "vm:101": {
            "state": "started",
            "node": "node1",
            "cmd": [ "migrate", "node2" ],
            "uid": "...",
        },
        "vm:102": {
            "state": "...",
            "node": "node1",
            "uid": "...",
        },
    }
}

If vm:102 is in state 'started', then the HA Manager will start to disarm the HA stack right away and drop vm:101's migration request in both disarm modes.
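For reference, the deferral check can be exercised standalone with a $ss hash shaped like the status above. This is just a minimal self-contained sketch of the same logic; find_transient_service() is a made-up helper name and not actual ha-manager code:

```perl
use strict;
use warnings;

# the 4 transient states that cause handle_disarm() to defer
my @transient = qw(fence recovery migrate relocate);

# return the first service in a transient state, or undef if there is none
sub find_transient_service {
    my ($ss) = @_;
    for my $sid (sort keys %$ss) {
        my $state = $ss->{$sid}->{state};
        return $sid if grep { $state eq $_ } @transient;
    }
    return undef;
}

# vm:101 has a pending migrate *command* but is still in 'started' state,
# so it does not count as transient yet - the disarm would go through
my $ss = {
    'vm:101' => { state => 'started', node => 'node1', cmd => [ 'migrate', 'node2' ] },
    'vm:102' => { state => 'started', node => 'node1' },
};
printf "deferring for: %s\n", find_transient_service($ss) // '(none)';

# with vm:102 in 'recovery', the whole disarm is deferred
$ss->{'vm:102'}->{state} = 'recovery';
printf "deferring for: %s\n", find_transient_service($ss) // '(none)';
```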
If vm:102 is in any of the 4 transient states from above though, then the HA Manager will defer the disarm and process all HA resources, which will cause vm:101 to be put into 'migrate' now. So the transient state of one HA resource might cause other HA resources to be put into a transient state as well, even though I would have expected the deferral here to only resolve the existing transient states.

This also means that the HA Manager will handle user requests for HA resources in the same HA Manager cycle as a disarm request, if there is at least one HA resource in one of the 4 transient states. This contradicts the rule that disarming takes priority over migrate/relocate requests.

Two other possible cases of 'implicitly transient' states might be:

- adding a HA rule, which makes a HA resource in 'started' state be put into 'migrate' to another node when select_service_node() is processed in next_state_started().
- the node of a HA resource going offline (delayed) in the same round as the disarm request. If none of the HA resources are in a transient state yet, the disarm request goes through; otherwise the affected HA resources might be put into 'fence'.

I haven't thought this through fully, but an option might be to only allow FSM processing for the HA resources which are in one of these 4 transient states and not process the others. E.g. break the FSM transition loop out into its own function: in normal operation we iterate through all services in $ss, but for a deferred disarm we only iterate through the HA resources in transient states, which should be resolved.

> +
> +    my $mode = $disarm->{mode};
> +
> +    # prune stale runtime data (failed_nodes, cmd, target, ...) so the state machine starts
> +    # fresh on re-arm; preserve maintenance_node for correct return behavior
> +    my %keep_keys = map { $_ => 1 } qw(state node uid maintenance_node);
> +
> +    if ($mode eq 'freeze') {
> +        for my $sid (sort keys %$ss) {
> +            my $sd = $ss->{$sid};
> +            my $state = $sd->{state};
> +            next if $state eq 'freeze'; # already frozen
> +            if (
> +                $state eq 'started'
> +                || $state eq 'stopped'
> +                || $state eq 'request_stop'
> +                || $state eq 'request_start'
> +                || $state eq 'request_start_balance'
> +                || $state eq 'error'
> +            ) {
> +                $haenv->log('info', "disarm: freezing service '$sid' (was '$state')");
> +                delete $sd->{$_} for grep { !$keep_keys{$_} } keys %$sd;
> +                $sd->{state} = 'freeze';
> +                $sd->{uid} = compute_new_uuid('freeze');
> +            }
> +        }
> +    } elsif ($mode eq 'ignore') {
> +        # keep $ss intact; the disarm flag in $ms causes service loops and vm_is_ha_managed()
> +        # to skip these services while disarmed
> +        for my $sid (sort keys %$ss) {
> +            my $sd = $ss->{$sid};
> +            delete $sd->{$_} for grep { !$keep_keys{$_} } keys %$sd;
> +        }
> +    }
> +
> +    # check if all online LRMs have entered disarm mode
> +    my $all_disarmed = 1;
> +    my $online_nodes = $ns->list_online_nodes();
> +
> +    for my $node (@$online_nodes) {
> +        my $lrm_mode = $lrm_modes->{$node} // 'unknown';
> +        if ($lrm_mode ne 'disarm') {
> +            $all_disarmed = 0;
> +            last;
> +        }
> +    }
> +
> +    if ($all_disarmed && $disarm->{state} ne 'disarmed') {
> +        $haenv->log('info', "all LRMs disarmed, HA stack is now fully disarmed");
> +        $disarm->{state} = 'disarmed';
> +    }
> +
> +    # once disarmed, stay disarmed - a returning node's LRM will catch up within one cycle
> +    $self->{all_lrms_disarmed} = $disarm->{state} eq 'disarmed';
> +
> +    $self->flush_master_status();
> +
> +    return 1;
> +}
> +
> +sub is_fully_disarmed {
> +    my ($self) = @_;
> +
> +    return $self->{all_lrms_disarmed};
> +}
> +
>  sub manage {
>      my ($self) = @_;
>
> @@ -713,6 +857,15 @@ sub manage {
>
>      $self->update_crm_commands();
>
> +    if (my $disarm = $ms->{disarm}) {
> +        if ($self->handle_disarm($disarm, $ss, $lrm_modes)) {
> +            return; # disarm active and progressing, skip normal service state machine
> +        }
> +        # disarm deferred (e.g. due to active fencing) - fall through to let it complete
> +    }

Regarding the load balancing patches, I would change this to

-    }
+    } else {
+        # load balance only if disarm is disabled, as during a deferred disarm
+        # the HA Manager should not introduce any new migrations
+        $self->load_balance();
+    }

there.

> +
> +    $self->{all_lrms_disarmed} = 0;
> +
>      for (;;) {
>          my $repeat = 0;
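To illustrate the "only process the transient HA resources" idea from further up: the per-service FSM loop could take an explicit list of sids, so a deferred disarm can restrict processing to the services that still need resolving. A rough sketch only; select_sids_to_process() and its shape are made up and not actual ha-manager code:

```perl
use strict;
use warnings;

# the 4 transient states that should still be driven forward during a deferred disarm
my @transient_states = qw(fence recovery migrate relocate);

# hypothetical helper: which sids should the FSM loop iterate over?
sub select_sids_to_process {
    my ($ss, $disarm_deferred) = @_;
    my @sids = sort keys %$ss;
    return @sids if !$disarm_deferred; # normal operation: all services

    # deferred disarm: only resolve the services already in a transient state,
    # so no new migrations/fences are introduced for the others
    return grep {
        my $state = $ss->{$_}->{state};
        scalar grep { $state eq $_ } @transient_states;
    } @sids;
}

my $ss = {
    'vm:101' => { state => 'started' },
    'vm:102' => { state => 'recovery' },
};
print join(',', select_sids_to_process($ss, 0)), "\n"; # all services
print join(',', select_sids_to_process($ss, 1)), "\n"; # only the transient one
```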