From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH docs 06/18] ha-manager: crs: replace service term with HA resource
Date: Thu, 9 Apr 2026 13:41:32 +0200
Message-ID: <20260409114224.323102-7-d.kral@proxmox.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20260409114224.323102-1-d.kral@proxmox.com>
References: <20260409114224.323102-1-d.kral@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Proxmox VE development discussion

'HA resource' is the preferred term for the HA-managed guests instead of
'service' and prefixing them with 'HA' makes the distinction between HA-managed and
HA-unmanaged resources clearer.

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
 ha-manager.adoc | 53 +++++++++++++++++++++++++------------------------
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/ha-manager.adoc b/ha-manager.adoc
index a1b2210..aa2d362 100644
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -1426,9 +1426,9 @@ crs: ha=static
 The change will be in effect starting with the next manager round (after a
 few seconds).
 
-For each service that needs to be recovered or migrated, the scheduler
+For each HA resource that needs to be recovered or migrated, the scheduler
 iteratively chooses the best node among the nodes that are available to
-the service according to their HA rules, if any.
+the HA resource according to their HA rules, if any.
 
 NOTE: There are plans to add modes for (static and dynamic) load-balancing in
 the future.
@@ -1436,18 +1436,18 @@ the future.
 
 Basic Scheduler
 ~~~~~~~~~~~~~~~
 
-The number of active HA services on each node is used to choose a recovery node.
-Non-HA-managed services are currently not counted.
+The number of active HA resources on each node is used to choose a recovery
+node. Non-HA-managed resources are currently not counted.
 
 Static-Load Scheduler
 ~~~~~~~~~~~~~~~~~~~~~
 
 IMPORTANT: The static mode is still a technology preview.
 
-Static usage information from HA services on each node is used to choose a
-recovery node. Usage of non-HA-managed services is currently not considered.
+Static usage information from HA resources on each node is used to choose a
+recovery node. Usage of non-HA-managed resources is currently not considered.
 
-For this selection, each node in turn is considered as if the service was
+For this selection, each node in turn is considered as if the HA resource was
 already running on it, using CPU and memory usage from the associated guest
 configuration.
 Then for each such alternative, CPU and memory usage of all nodes are
 considered, with memory being weighted much more, because it's a truly
@@ -1456,29 +1456,29 @@
 more, as ideally no node should be overcommitted) and average usage of all
 nodes (to still be able to distinguish in case there already is a more highly
 committed node) are considered.
 
-IMPORTANT: The more services the more possible combinations there are, so it's
-currently not recommended to use it if you have thousands of HA managed
-services.
+IMPORTANT: The more HA resources, the more possible combinations there are, so
+it's currently not recommended to use it if you have thousands of HA resources.
 
 CRS Scheduling Points
 ~~~~~~~~~~~~~~~~~~~~~
 
-The CRS algorithm is not applied for every service in every round, since this
-would mean a large number of constant migrations. Depending on the workload,
-this could put more strain on the cluster than could be avoided by constant
-balancing.
-That's why the {pve} HA manager favors keeping services on their current node.
+The CRS algorithm is not applied for every HA resource in every round, since
+this would mean a large number of constant migrations. Depending on the
+workload, this could put more strain on the cluster than could be avoided by
+constant balancing.
+That's why the {pve} HA manager favors keeping HA resources on their current
+node.
 
 The CRS is currently used at the following scheduling points:
 
-- Service recovery (always active). When a node with active HA services fails,
-  all its services need to be recovered to other nodes. The CRS algorithm will
-  be used here to balance that recovery over the remaining nodes.
+- HA resource recovery (always active). When a node with active HA resources
+  fails, all its HA resources need to be recovered to other nodes. The CRS
+  algorithm will be used here to balance that recovery over the remaining nodes.
 
 - HA group config changes (always active).
   If a node is removed from a group, or its priority is reduced, the HA stack
   will use the CRS algorithm to find a
-  new target node for the HA services in that group, matching the adapted
+  new target node for the HA resources in that group, matching the adapted
   priority constraints.
 
 - HA rule config changes (always active). If a rule emposes different
@@ -1501,13 +1501,14 @@ The CRS is currently used at the following scheduling points:
   rule, the HA stack will use the CRS algorithm to ensure that these HA
   resources are moved to separate nodes.
 
-- HA service stopped -> start transition (opt-in). Requesting that a stopped
-  service should be started is an good opportunity to check for the best suited
-  node as per the CRS algorithm, as moving stopped services is cheaper to do
-  than moving them started, especially if their disk volumes reside on shared
-  storage. You can enable this by setting the **`ha-rebalance-on-start`**
-  CRS option in the datacenter config. You can change that option also in the
-  Web UI, under `Datacenter` -> `Options` -> `Cluster Resource Scheduling`.
+- HA resource stopped -> start transition (opt-in). Requesting that a stopped
+  HA resource should be started is a good opportunity to check for the best
+  suited node as per the CRS algorithm, as moving stopped HA resources is
+  cheaper to do than moving them started, especially if their disk volumes
+  reside on shared storage. You can enable this by setting the
+  **`ha-rebalance-on-start`** CRS option in the datacenter config. You can
+  change that option also in the Web UI, under `Datacenter` -> `Options` ->
+  `Cluster Resource Scheduling`.
 
 ifdef::manvolnum[]
 include::pve-copyright.adoc[]
-- 
2.47.3
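For readers of the patched section: the `crs: ha=static` line shown in the diff
context and the `ha-rebalance-on-start` option described in the last bullet
both live in the datacenter config. A minimal example of the combined setting
(the exact values here are illustrative; check `datacenter.cfg(5)` for the
authoritative property-string syntax):

```
# /etc/pve/datacenter.cfg
# Use the static-load CRS scheduler and also rebalance HA resources on
# their stopped -> start transition (the opt-in scheduling point).
crs: ha=static,ha-rebalance-on-start=1
```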
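The static-load selection described in the patched text (consider each allowed
node as if the HA resource already ran there, weight memory much more than CPU,
then compare the highest and the average node usage) can be sketched roughly as
follows. This is a simplified model for illustration only, not the actual {pve}
implementation: the concrete weights, the `NodeUsage` structure, and the final
tie-break on the candidate's own resulting load are assumptions made for the
example.

```python
# Simplified sketch of static-load recovery-node selection.
# NOT the actual Proxmox implementation; weights and the final
# tie-break are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class NodeUsage:
    cpu: float  # fraction of the node's CPU in use (0.0 - 1.0)
    mem: float  # fraction of the node's memory in use (0.0 - 1.0)


CPU_WEIGHT = 1.0
MEM_WEIGHT = 100.0  # assumption: memory weighted much more (truly limited)


def weighted(cpu: float, mem: float) -> float:
    return CPU_WEIGHT * cpu + MEM_WEIGHT * mem


def score(nodes: dict[str, NodeUsage], candidate: str,
          res_cpu: float, res_mem: float) -> tuple[float, float, float]:
    """Score the cluster as if the HA resource already ran on `candidate`.

    Returns (highest node usage, average usage, candidate's own usage);
    lower tuples are better: avoid overcommitting any single node first,
    then prefer a lower overall average, then the least-loaded candidate.
    """
    loads = {}
    for name, usage in nodes.items():
        cpu, mem = usage.cpu, usage.mem
        if name == candidate:  # pretend the resource runs here
            cpu += res_cpu
            mem += res_mem
        loads[name] = weighted(cpu, mem)
    return (max(loads.values()),
            sum(loads.values()) / len(loads),
            loads[candidate])


def best_node(nodes: dict[str, NodeUsage], allowed: list[str],
              res_cpu: float, res_mem: float) -> str:
    """Pick the allowed node that minimizes the score tuple."""
    return min(allowed, key=lambda n: score(nodes, n, res_cpu, res_mem))
```

For example, with nodes at (cpu, mem) usage (0.25, 0.5), (0.125, 0.25) and
(0.5, 0.75), a resource needing (0.0625, 0.125) lands on the second, least
memory-loaded node; restricting `allowed` (as HA rules would) changes the pick.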