Date: Tue, 31 Mar 2026 11:07:31 +0200
Subject: Re: [PATCH ha-manager v3 35/40] implement automatic rebalancing
From: Michael Köppl
To: "Daniel Kral"
In-Reply-To: <20260330144101.668747-36-d.kral@proxmox.com>

2 comments inline

On Mon Mar 30, 2026 at 4:30 PM CEST, Daniel Kral wrote:

[snip]

> +    my $candidates =
> +        $self->get_resource_migration_candidates();
> +
> +    my $result;
> +    if ($method eq 'bruteforce') {
> +        $result = $online_node_usage->select_best_balancing_migration($candidates);
> +    } elsif ($method eq 'topsis') {
> +        $result = $online_node_usage->select_best_balancing_migration_topsis($candidates);
> +    }
> +
> +    # happens if $candidates is empty or $method isn't handled above
> +    return if !$result;
> +
> +    my ($migration, $target_imbalance) = $result->@{qw(migration imbalance)};
> +
> +    my $relative_change = ($imbalance - $target_imbalance) / $imbalance;

Since you get $imbalance from a function that returns 0.0 for the case
that the cluster load is perfectly balanced (?), you could run into
division by 0 here, no?

> +    return if $relative_change < $margin;
> +
> +    my ($sid, $source, $target) = $migration->@{qw(sid source-node target-node)};
> +
> +    my (undef, $type, $id) = $haenv->parse_sid($sid);
> +    my $task = $type eq 'vm' ? "migrate" : "relocate";
> +    my $cmd = "$task $sid $target";
> +
> +    my $target_imbalance_str = int(100 * $target_imbalance + 0.5) / 100;
> +    $haenv->log(
> +        'info',
> +        "auto rebalance - $task $sid to $target (expected target imbalance: $target_imbalance_str)",
> +    );
> +
> +    $self->queue_resource_motion($cmd, $task, $sid, $target);
> +}
> +

[snip]

>      $self->{all_lrms_disarmed} = 0;
> diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
> index 43feb041..659ab30a 100644
> --- a/src/PVE/HA/Usage.pm
> +++ b/src/PVE/HA/Usage.pm
> @@ -60,6 +60,40 @@ sub remove_service_usage {
>      die "implement in subclass";
>  }
> 
> +sub calculate_node_imbalance {
> +    my ($self) = @_;
> +
> +    die "implement in subclass";
> +}
> +
> +sub score_best_balancing_migrations {
> +    my ($self, $migration_candidates, $limit) = @_;
> +
> +    die "implement in subclass";
> +}
> +
> +sub select_best_balancing_migration {
> +    my ($self, $migration_candidates) = @_;
> +
> +    my $migrations =
> +        $self->score_best_balancing_migrations($migration_candidates, 1);
> +
> +    return $migrations->[0];

If an error occurs in the following call in
score_best_balancing_migrations

    my $migrations = eval {
        $self->{scheduler}
            ->score_best_balancing_migration_candidates_topsis($migration_candidates, $limit);
    };

you'd return an undefined $migrations, which would result in a
dereference error here.

> +}
> +
> +sub score_best_balancing_migrations_topsis {
> +    my ($self, $migration_candidates, $limit) = @_;
> +
> +    die "implement in subclass";
> +}
> +
> +sub select_best_balancing_migration_topsis {
> +    my ($self, $migration_candidates) = @_;
> +
> +    my $migrations = $self->score_best_balancing_migrations_topsis($migration_candidates, 1);
> +
> +    return $migrations->[0];
> +}
> +
>  # Returns a hash with $nodename => $score pairs. A lower $score is better.
>  sub score_nodes_to_start_service {
>      my ($self, $sid) = @_;
> diff --git a/src/PVE/HA/Usage/Dynamic.pm b/src/PVE/HA/Usage/Dynamic.pm
> index 24c85a41..76d0feaa 100644
> --- a/src/PVE/HA/Usage/Dynamic.pm
> +++ b/src/PVE/HA/Usage/Dynamic.pm
> @@ -104,6 +104,39 @@ sub remove_service_usage {
>      $self->{haenv}->log('warning', "unable to remove service '$sid' usage - $@") if $@;
>  }
> 

[snip]
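
Both cases raised in the comments above could be guarded along these
lines. This is only a rough, untested sketch: the helper names
relative_imbalance_change and first_migration are made up for
illustration and are not part of the patch, and whether to skip the
rebalancing round silently or to log something in these cases is left
open.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the patch): relative improvement of the
# imbalance, or undef if the current imbalance is zero (or undefined),
# so that the caller never divides by zero.
sub relative_imbalance_change {
    my ($imbalance, $target_imbalance) = @_;

    return if !$imbalance; # perfectly balanced, nothing to gain

    return ($imbalance - $target_imbalance) / $imbalance;
}

# Hypothetical helper (not in the patch): first scored migration, or
# undef if scoring failed (undefined result) or found no candidates.
sub first_migration {
    my ($migrations) = @_;

    return if !defined($migrations) || !scalar(@$migrations);

    return $migrations->[0];
}
```

With helpers like these, the caller would check
`!defined($relative_change) || $relative_change < $margin` before
queueing a migration, and the existing `return if !$result;` would also
cover a failed scoring call.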