From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH ha-manager v2 13/26] usage: add information about a service's assigned nodes
Date: Fri, 20 Jun 2025 16:31:25 +0200
Message-ID: <20250620143148.218469-18-d.kral@proxmox.com>
In-Reply-To: <20250620143148.218469-1-d.kral@proxmox.com>

This will be used to retrieve the nodes that a service currently puts
load on, i.e. whose resources it uses, when handling colocation rules in
select_service_node(...). For example, a migrating service in a negative
colocation must prevent other negatively colocated services from
migrating to either its source or its target node.

This is implemented here because a service's usage of the nodes is
currently best encoded in recompute_online_node_usage(...) and the other
call sites of add_service_usage_to_node(...).

Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
changes since v1:
    - keep these in Usage, as placing them elsewhere would introduce a
      second/third source of truth for where the services are supposed
      to be; this should possibly be refactored in a follow-up (e.g.
      when making service state/node load changes more granular)
    - use the same `service-nodes` key in both Usage implementations
    - replace `pin_service_node(...)` with `set_service_node(...)`
    - introduce `add_service_node(...)` and allow multiple nodes per
      service in the `service-nodes` hash
    - make additions to the `service-nodes` hash explicit through direct
      calls to `{add,set}_service_node(...)` instead of doing them
      implicitly in `add_service_usage_to_node(...)`
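
For illustration, a minimal standalone sketch of how the `service-nodes`
hash behaves and how a caller might consume it. The three accessors
mirror the ones added in this patch; the `SketchUsage` package, the
`blocked_nodes` helper, and the service and node names are hypothetical
and only show the idea. The actual consumer is introduced in later
patches of this series and may look different.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the 'service-nodes' bookkeeping of the Usage plugins.
package SketchUsage;

sub new {
    my ($class) = @_;
    return bless { 'service-nodes' => {} }, $class;
}

# Returns an array ref of node names, or undef if nothing was recorded for $sid.
sub get_service_nodes {
    my ($self, $sid) = @_;
    return $self->{'service-nodes'}->{$sid};
}

# Overwrites any previous entry: the service is on exactly one node.
sub set_service_node {
    my ($self, $sid, $nodename) = @_;
    $self->{'service-nodes'}->{$sid} = [$nodename];
}

# Appends a node, e.g. for the source and target of a migration.
sub add_service_node {
    my ($self, $sid, $nodename) = @_;
    push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
}

package main;

# Hypothetical helper: collect every node used by the given services, as a
# negative colocation check might do before picking a node.
sub blocked_nodes {
    my ($usage, $other_sids) = @_;
    my %blocked;
    for my $sid (@$other_sids) {
        my $nodes = $usage->get_service_nodes($sid) // [];
        $blocked{$_} = 1 for @$nodes;
    }
    return [sort keys %blocked];
}

my $usage = SketchUsage->new();
$usage->set_service_node('vm:101', 'node1'); # started on node1
$usage->add_service_node('vm:102', 'node2'); # migrating: source
$usage->add_service_node('vm:102', 'node3'); # migrating: target

my $blocked = blocked_nodes($usage, ['vm:101', 'vm:102', 'vm:103']);
print join(',', @$blocked), "\n";
```

Note that `get_service_nodes(...)` returns undef for a service without a
recorded node (e.g. a stopped one), so callers in this sketch fall back
to an empty list.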

 src/PVE/HA/Manager.pm      | 17 +++++++++++++----
 src/PVE/HA/Usage.pm        | 18 ++++++++++++++++++
 src/PVE/HA/Usage/Basic.pm  | 19 +++++++++++++++++++
 src/PVE/HA/Usage/Static.pm | 19 +++++++++++++++++++
 4 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 00efc7c..4c7228e 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -258,6 +258,7 @@ sub recompute_online_node_usage {
         my $sd = $self->{ss}->{$sid};
         my $state = $sd->{state};
         my $target = $sd->{target}; # optional
+
         if ($online_node_usage->contains_node($sd->{node})) {
             if (
                 $state eq 'started'
@@ -268,6 +269,7 @@ sub recompute_online_node_usage {
                 || $state eq 'recovery'
             ) {
                 $online_node_usage->add_service_usage_to_node($sd->{node}, $sid, $sd->{node});
+                $online_node_usage->set_service_node($sid, $sd->{node});
             } elsif (
                 $state eq 'migrate'
                 || $state eq 'relocate'
@@ -275,10 +277,14 @@ sub recompute_online_node_usage {
             ) {
                 my $source = $sd->{node};
                 # count it for both, source and target as load is put on both
-                $online_node_usage->add_service_usage_to_node($source, $sid, $source, $target)
-                    if $state ne 'request_start_balance';
-                $online_node_usage->add_service_usage_to_node($target, $sid, $source, $target)
-                    if $online_node_usage->contains_node($target);
+                if ($state ne 'request_start_balance') {
+                    $online_node_usage->add_service_usage_to_node($source, $sid, $source, $target);
+                    $online_node_usage->add_service_node($sid, $source);
+                }
+                if ($online_node_usage->contains_node($target)) {
+                    $online_node_usage->add_service_usage_to_node($target, $sid, $source, $target);
+                    $online_node_usage->add_service_node($sid, $target);
+                }
             } elsif ($state eq 'stopped' || $state eq 'request_start') {
                 # do nothing
             } else {
@@ -290,6 +296,7 @@ sub recompute_online_node_usage {
                 # case a node dies, as we cannot really know if the to-be-aborted incoming migration
                 # has already cleaned up all used resources
                 $online_node_usage->add_service_usage_to_node($target, $sid, $sd->{node}, $target);
+                $online_node_usage->set_service_node($sid, $target);
             }
         }
     }
@@ -976,6 +983,7 @@ sub next_state_started {
 
             if ($node && ($sd->{node} ne $node)) {
                 $self->{online_node_usage}->add_service_usage_to_node($node, $sid, $sd->{node});
+                $self->{online_node_usage}->add_service_node($sid, $node);
 
                 if (defined(my $fallback = $sd->{maintenance_node})) {
                     if ($node eq $fallback) {
@@ -1104,6 +1112,7 @@ sub next_state_recovery {
 
         $haenv->steal_service($sid, $sd->{node}, $recovery_node);
         $self->{online_node_usage}->add_service_usage_to_node($recovery_node, $sid, $recovery_node);
+        $self->{online_node_usage}->add_service_node($sid, $recovery_node);
 
         # NOTE: $sd *is normally read-only*, fencing is the exception
         $cd->{node} = $sd->{node} = $recovery_node;
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 66d9572..7f4d9ca 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -27,6 +27,24 @@ sub list_nodes {
     die "implement in subclass";
 }
 
+sub get_service_nodes {
+    my ($self, $sid) = @_;
+
+    die "implement in subclass";
+}
+
+sub set_service_node {
+    my ($self, $sid, $nodename) = @_;
+
+    die "implement in subclass";
+}
+
+sub add_service_node {
+    my ($self, $sid, $nodename) = @_;
+
+    die "implement in subclass";
+}
+
 sub contains_node {
     my ($self, $nodename) = @_;
 
diff --git a/src/PVE/HA/Usage/Basic.pm b/src/PVE/HA/Usage/Basic.pm
index ead08c5..afe3733 100644
--- a/src/PVE/HA/Usage/Basic.pm
+++ b/src/PVE/HA/Usage/Basic.pm
@@ -11,6 +11,7 @@ sub new {
     return bless {
         nodes => {},
         haenv => $haenv,
+        'service-nodes' => {},
     }, $class;
 }
 
@@ -38,6 +39,24 @@ sub contains_node {
     return defined($self->{nodes}->{$nodename});
 }
 
+sub get_service_nodes {
+    my ($self, $sid) = @_;
+
+    return $self->{'service-nodes'}->{$sid};
+}
+
+sub set_service_node {
+    my ($self, $sid, $nodename) = @_;
+
+    $self->{'service-nodes'}->{$sid} = [$nodename];
+}
+
+sub add_service_node {
+    my ($self, $sid, $nodename) = @_;
+
+    push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
+}
+
 sub add_service_usage_to_node {
     my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
 
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 061e74a..6707a54 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -22,6 +22,7 @@ sub new {
         'service-stats' => {},
         haenv => $haenv,
         scheduler => $scheduler,
+        'service-nodes' => {},
         'service-counts' => {}, # Service count on each node. Fallback if scoring calculation fails.
     }, $class;
 }
@@ -86,6 +87,24 @@ my sub get_service_usage {
     return $service_stats;
 }
 
+sub get_service_nodes {
+    my ($self, $sid) = @_;
+
+    return $self->{'service-nodes'}->{$sid};
+}
+
+sub set_service_node {
+    my ($self, $sid, $nodename) = @_;
+
+    $self->{'service-nodes'}->{$sid} = [$nodename];
+}
+
+sub add_service_node {
+    my ($self, $sid, $nodename) = @_;
+
+    push @{ $self->{'service-nodes'}->{$sid} }, $nodename;
+}
+
 sub add_service_usage_to_node {
     my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

