From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH v2 ha-manager 06/15] usage: add Usage::Static plugin
Date: Thu, 17 Nov 2022 15:00:07 +0100
Message-ID: <20221117140018.105004-7-f.ebner@proxmox.com>
In-Reply-To: <20221117140018.105004-1-f.ebner@proxmox.com>

for calculating node usage of services based upon static CPU and
memory configuration as well as scoring the nodes with that
information to decide where to start a new or recovered service.

For getting the service stats, it's necessary to also consider the
migration target (if present), because the configuration file might
have already moved.

It's necessary to update the cluster filesystem upon stealing the
service to be able to always read the moved config right away when
adding the usage.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes from v1:
    * Pass haenv to resource's get_static_stats(), required by
      simulation env.
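
Illustration (not part of the patch): a minimal, self-contained sketch of
the lookup order described in the commit message, i.e. try the recorded
service node first and fall back to the migration target if the config has
already moved. The %config_on table and the two-argument get_static_stats()
below are hypothetical stand-ins for the resource plugin and $haenv, and
lookup_stats() is an illustrative name, not a function from the series.

```perl
use strict;
use warnings;

# Hypothetical stand-in data: the config for service 100 has already moved
# from 'node1' to 'node2' as part of a migration.
my %config_on = (node2 => { 100 => { maxcpu => 1.5, maxmem => 2 * 1024**3 } });

# Hypothetical stand-in for a resource plugin's get_static_stats().
sub get_static_stats {
    my ($id, $node) = @_;
    my $stats = $config_on{$node}->{$id}
        or die "no config for '$id' on node '$node'\n";
    return $stats;
}

# Try the recorded service node first, then the migration target (if any).
# If both lookups fail, report the error from the first attempt.
sub lookup_stats {
    my ($id, $service_node, $migration_target) = @_;

    my $stats = eval { get_static_stats($id, $service_node) };
    if (my $err = $@) {
        $stats = eval { get_static_stats($id, $migration_target) } if $migration_target;
        die "did not get static service usage information for '$id' - $err" if !$stats;
    }
    return $stats;
}

my $stats = lookup_stats(100, 'node1', 'node2');
printf "maxcpu=%.1f maxmem=%d\n", $stats->{maxcpu}, $stats->{maxmem};
```

Reporting the first error rather than the second keeps the message pointed
at the node the manager believed the service was on, which is usually the
more useful hint when neither node has the config.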

 debian/pve-ha-manager.install |   1 +
 src/PVE/HA/Env/PVE2.pm        |   4 ++
 src/PVE/HA/Usage.pm           |   1 +
 src/PVE/HA/Usage/Makefile     |   2 +-
 src/PVE/HA/Usage/Static.pm    | 114 ++++++++++++++++++++++++++++++++++
 5 files changed, 121 insertions(+), 1 deletion(-)
 create mode 100644 src/PVE/HA/Usage/Static.pm

diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index 87fb24c..a7598a9 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -35,5 +35,6 @@
 /usr/share/perl5/PVE/HA/Tools.pm
 /usr/share/perl5/PVE/HA/Usage.pm
 /usr/share/perl5/PVE/HA/Usage/Basic.pm
+/usr/share/perl5/PVE/HA/Usage/Static.pm
 /usr/share/perl5/PVE/Service/pve_ha_crm.pm
 /usr/share/perl5/PVE/Service/pve_ha_lrm.pm
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 7cecf35..7fac43c 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -176,6 +176,10 @@ sub steal_service {
     } else {
 	die "implement me";
     }
+
+    # Necessary for (at least) static usage plugin to always be able to read service config from new
+    # node right away.
+    $self->cluster_state_update();
 }
 
 sub read_group_config {
diff --git a/src/PVE/HA/Usage.pm b/src/PVE/HA/Usage.pm
index 4c723d1..66d9572 100644
--- a/src/PVE/HA/Usage.pm
+++ b/src/PVE/HA/Usage.pm
@@ -33,6 +33,7 @@ sub contains_node {
     die "implement in subclass";
 }
 
+# Logs a warning to $haenv upon failure, but does not die.
 sub add_service_usage_to_node {
     my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
 
diff --git a/src/PVE/HA/Usage/Makefile b/src/PVE/HA/Usage/Makefile
index ccf1282..5a51359 100644
--- a/src/PVE/HA/Usage/Makefile
+++ b/src/PVE/HA/Usage/Makefile
@@ -1,4 +1,4 @@
-SOURCES=Basic.pm
+SOURCES=Basic.pm Static.pm
 
 .PHONY: install
 install:
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
new file mode 100644
index 0000000..ce705eb
--- /dev/null
+++ b/src/PVE/HA/Usage/Static.pm
@@ -0,0 +1,114 @@
+package PVE::HA::Usage::Static;
+
+use strict;
+use warnings;
+
+use PVE::HA::Resources;
+use PVE::RS::ResourceScheduling::Static;
+
+use base qw(PVE::HA::Usage);
+
+sub new {
+    my ($class, $haenv) = @_;
+
+    my $node_stats = eval { $haenv->get_static_node_stats() };
+    die "did not get static node usage information - $@" if $@;
+
+    my $scheduler = eval { PVE::RS::ResourceScheduling::Static->new(); };
+    die "unable to initialize static scheduling - $@" if $@;
+
+    return bless {
+	'node-stats' => $node_stats,
+	'service-stats' => {},
+	haenv => $haenv,
+	scheduler => $scheduler,
+    }, $class;
+}
+
+sub add_node {
+    my ($self, $nodename) = @_;
+
+    my $stats = $self->{'node-stats'}->{$nodename}
+	or die "did not get static node usage information for '$nodename'\n";
+    die "static node usage information for '$nodename' missing cpu count\n" if !$stats->{cpus};
+    die "static node usage information for '$nodename' missing memory\n" if !$stats->{memory};
+
+    eval { $self->{scheduler}->add_node($nodename, int($stats->{cpus}), int($stats->{memory})); };
+    die "initializing static node usage for '$nodename' failed - $@" if $@;
+}
+
+sub remove_node {
+    my ($self, $nodename) = @_;
+
+    $self->{scheduler}->remove_node($nodename);
+}
+
+sub list_nodes {
+    my ($self) = @_;
+
+    return $self->{scheduler}->list_nodes()->@*;
+}
+
+sub contains_node {
+    my ($self, $nodename) = @_;
+
+    return $self->{scheduler}->contains_node($nodename);
+}
+
+my sub get_service_usage {
+    my ($self, $sid, $service_node, $migration_target) = @_;
+
+    return $self->{'service-stats'}->{$sid} if $self->{'service-stats'}->{$sid};
+
+    my (undef, $type, $id) = $self->{haenv}->parse_sid($sid);
+    my $plugin = PVE::HA::Resources->lookup($type);
+
+    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $id, $service_node); };
+    if (my $err = $@) {
+	# config might've already moved during a migration
+	$stats = eval { $plugin->get_static_stats($self->{haenv}, $id, $migration_target); } if $migration_target;
+	die "did not get static service usage information for '$sid' - $err\n" if !$stats;
+    }
+
+    my $service_stats = {
+	maxcpu => $stats->{maxcpu} + 0.0, # containers allow non-integer cpulimit
+	maxmem => int($stats->{maxmem}),
+    };
+
+    $self->{'service-stats'}->{$sid} = $service_stats;
+
+    return $service_stats;
+}
+
+sub add_service_usage_to_node {
+    my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
+
+    eval {
+	my $service_usage = get_service_usage($self, $sid, $service_node, $migration_target);
+	$self->{scheduler}->add_service_usage_to_node($nodename, $service_usage);
+    };
+    $self->{haenv}->log('warning', "unable to add service '$sid' usage to node '$nodename' - $@")
+	if $@;
+}
+
+sub score_nodes_to_start_service {
+    my ($self, $sid, $service_node) = @_;
+
+    my $score_list = eval {
+	my $service_usage = get_service_usage($self, $sid, $service_node);
+	$self->{scheduler}->score_nodes_to_start_service($service_usage);
+    };
+    if (my $err = $@) {
+	$self->{haenv}->log(
+	    'err',
+	    "unable to score nodes according to static usage for service '$sid' - $err",
+	);
+	# TODO maybe use service count as fallback?
+	return { map { $_ => 1 } $self->list_nodes() };
+    }
+
+    # Take minus the value, so that a lower score is better, which our caller(s) expect(s).
+    return { map { $_->[0] => -$_->[1] } $score_list->@* };
+}
+
+1;
-- 
2.30.2





Thread overview: 21+ messages
2022-11-17 14:00 [pve-devel] [PATCH-SERIES v2 ha-manager/docs] add static usage scheduler for HA manager Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 01/15] env: add get_static_node_stats() method Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 02/15] resources: add get_static_stats() method Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 03/15] add Usage base plugin and Usage::Basic plugin Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 04/15] manager: select service node: add $sid to parameters Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 05/15] manager: online node usage: switch to Usage::Basic plugin Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 06/15] usage: add Usage::Static plugin Fiona Ebner [this message]
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 07/15] env: rename get_ha_settings to get_datacenter_settings Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 08/15] env: datacenter config: include crs (cluster-resource-scheduling) setting Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 09/15] manager: set resource scheduler mode upon init Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 10/15] manager: use static resource scheduler when configured Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 11/15] manager: avoid scoring nodes if maintenance fallback node is valid Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 12/15] manager: avoid scoring nodes when not trying next and current " Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 13/15] usage: static: use service count on nodes as a fallback Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 14/15] test: add tests for static resource scheduling Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 ha-manager 15/15] resources: add missing PVE::Cluster use statements Fiona Ebner
2022-11-18  7:48   ` Fiona Ebner
2022-11-18 12:48     ` Thomas Lamprecht
2022-11-17 14:00 ` [pve-devel] [PATCH v2 docs 1/2] ha: add section about scheduler modes Fiona Ebner
2022-11-17 14:00 ` [pve-devel] [PATCH v2 docs 2/2] ha: add warning against using 'static' mode with many services Fiona Ebner
2022-11-18 13:23 ` [pve-devel] applied-series: [PATCH-SERIES v2 ha-manager/docs] add static usage scheduler for HA manager Thomas Lamprecht
