Date: Mon, 13 Apr 2026 18:11:33 +0200
From: Michael Köppl
To: "Daniel Kral"
Subject: Re: [PATCH docs 18/18] ha-manager: crs: add load balancer section
References: <20260409114224.323102-1-d.kral@proxmox.com> <20260409114224.323102-19-d.kral@proxmox.com>
In-Reply-To: <20260409114224.323102-19-d.kral@proxmox.com>
List-Id: Proxmox VE development discussion

2 nits inline

On Thu Apr 9, 2026 at 1:41 PM CEST, Daniel Kral wrote:
> For pve-ha-manager >= 5.2.0, the HA Manager features an automatic
> rebalancing system, which is available for the static and dynamic-load
> scheduler mode.
>
> Signed-off-by: Daniel Kral
> ---
> ha-manager.adoc | 51 ++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 48 insertions(+), 3 deletions(-)
>
> diff --git a/ha-manager.adoc b/ha-manager.adoc
> index 4162071..0ba00cd 100644
> --- a/ha-manager.adoc
> +++ b/ha-manager.adoc
> @@ -1435,9 +1435,6 @@ scheduling decisions. The CRS mode can be changed in the web interface
> is `basic`. The change will take effect starting with the next HA Manager
> round, which should take no longer than 10 seconds.
>
> -NOTE: There are plans to add modes for (static and dynamic) load-balancing in
> -the future.
> -
> [[_basic_scheduler]]
> Basic Scheduler
> ^^^^^^^^^^^^^^^
> @@ -1526,6 +1523,54 @@ functionality is still in technology preview. The more HA resources the more
> possible combinations there are, so it's currently not recommended to use it if
> you have thousands of HA resources.
>
> +CRS Load Balancer
> +~~~~~~~~~~~~~~~~~
> +
> +IMPORTANT: This feature is still in technology preview.
> +
> +The utilization of individual cluster nodes can vary greatly depending where

nit: depending *on* where

> +guests are currently running and how much load these exert at any moment. This
> +can cause the node utilizations to become imbalanced over time, where HA
> +resources on one node become bottlenecked due to resource contention, while
> +another node is barely utilized.
> +
> +The HA Manager's automatic load balancer can lower the overall cluster
> +imbalance by automatically issuing rebalancing migrations. Currently, these
> +migrations are issued only for HA resources and executed sequentially.
> +Furthermore, these migrations always obey the current HA affinity rules.
> +
> +The automatic load balancing system can be enabled and configured in the web
> +interface under `Datacenter` -> `Options` -> `Cluster Resource Scheduling`.
> +This allows fine-tuning the various parameters as described in the next
> +paragraph for different cluster setups and requirements. The load balancer
> +requires either the xref:_static_scheduler[static] or
> +xref:_dynamic_scheduler[dynamic-load scheduler mode] to be in use.
> +
> +In every HA Manager round, where each lasts around 10 seconds, the automatic
> +load balancer checks whether the cluster imbalance exceeds the _imbalance
> +threshold_. If exceeding the imbalance threshold is sustained for at least as
> +many consecutive HA Manager rounds as given by the _hold duration_, then the
> +load balancer will select the rebalancing migration for an HA resource, which
> +lowers the overall cluster imbalance the most. The method used to select the
> +best rebalancing migration can be set with the _rebalancing method_ option. At
> +last, if the selected migration reduces the current cluster imbalance by at
> +least the percentage given by the _imbalance improvement_ option, the load
> +balancer issues the selected migration.
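
For readers following along, the round-by-round procedure in the paragraph
quoted just above could be modeled roughly as follows. This is only a sketch
of how the documentation text reads, not the pve-ha-manager implementation:
the `Balancer` class, its field names, and the `candidates` mapping are all
hypothetical, and measuring the improvement relative to the current imbalance
is an assumption. What happens to the round counter after a migration is
issued is not described in the patch and is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Balancer:
    """Hypothetical model of the documented decision procedure."""

    threshold: float        # imbalance threshold, assumed to be >= 0
    hold_duration: int      # consecutive rounds above the threshold required
    min_improvement: float  # required relative improvement, e.g. 0.1 for 10%
    rounds_above: int = 0   # consecutive rounds seen above the threshold

    def round(self, imbalance: float, candidates: dict[str, float]) -> Optional[str]:
        """Run one HA Manager round.

        `candidates` maps a (hypothetical) migration name to the cluster
        imbalance expected after that migration. Returns the migration to
        issue, or None if nothing should be done in this round.
        """
        if imbalance <= self.threshold:
            # Threshold not exceeded: the streak of bad rounds is broken.
            self.rounds_above = 0
            return None
        self.rounds_above += 1
        if self.rounds_above < self.hold_duration:
            return None  # imbalance not sustained long enough yet
        if not candidates:
            return None
        # Select the migration that lowers the expected imbalance the most.
        best = min(candidates, key=candidates.get)
        # imbalance > threshold >= 0 here, so the division is safe.
        improvement = (imbalance - candidates[best]) / imbalance
        if improvement < self.min_improvement:
            return None  # best candidate does not improve things enough
        return best
```

With `hold_duration=3`, the first two rounds above the threshold return
nothing, and the third one returns the best candidate if it clears the
improvement filter.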
> +
> +The cluster imbalance is calculated as the ratio between the standard deviation
> +and mean of the individual node loads. Therefore, the _imbalance threshold_
> +controls how sensitive the load balancer should be triggered. If the

nit: s/sensitive/sensitively or "controls the sensitivity of the load
balancer"

> +_imbalance threshold_ is set to a value of `0.0`, it will try to select a
> +rebalancing migration every time the _hold duration_ is exceeded.
> +The _imbalance improvement_ percentage filters what constitutes as an
> +acceptable rebalancing migration. If it is set to `0.0`, then it will always
> +commit to a migration even though the expected imbalance does not change at
> +all. However, the value can never be negative, therefore it will never commit
> +to a migration with an expected imbalance worse than the current one.
> +
> +NOTE: It is planned that load balancing will also include non-HA resources in
> +the future.
> +
> ifdef::manvolnum[]
> include::pve-copyright.adoc[]
> endif::manvolnum[]
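
As a side note on the imbalance metric described in the quoted section: the
ratio of standard deviation to mean is the coefficient of variation, and a
small sketch may help when picking a threshold. The function name
`cluster_imbalance` is hypothetical, the use of the population standard
deviation (rather than the sample one) is an assumption, and treating an
all-idle cluster as balanced is an assumption as well; the patch does not
specify any of these.

```python
from statistics import fmean, pstdev


def cluster_imbalance(node_loads: list[float]) -> float:
    """Ratio of the standard deviation to the mean of the node loads.

    Assumes the population standard deviation; the actual implementation
    may differ.
    """
    mean = fmean(node_loads)
    if mean == 0:
        # Assumption: a cluster without any load counts as perfectly balanced.
        return 0.0
    return pstdev(node_loads) / mean


# One heavily loaded node and two mostly idle ones ...
before = cluster_imbalance([0.9, 0.1, 0.2])
# ... versus the same total load after moving part of it to the second node.
after = cluster_imbalance([0.5, 0.5, 0.2])
print(f"before={before:.3f} after={after:.3f}")
```

Because the value is normalized by the mean, the same threshold applies
regardless of how busy the cluster is overall; only the relative spread
between nodes matters.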