public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [RFC ha-manager] make static usage calculation faster
From: Fiona Ebner @ 2022-11-18 11:32 UTC
  To: pve-devel

This series makes the static usage calculation faster by avoiding the
overhead from load_config().

Benchmarked recompute_online_node_usage() again with ~300 HA services
(minimal containers) running on my virtual test cluster.

Timings with 'basic' were between 0.0004 and 0.002 seconds
(a bit worse today than last time).
Timings before these patches were between 0.007 and 0.016 seconds
(also a bit worse than last time).
Timings after these patches were between 0.0035 and 0.006 seconds.

So the result is only about twice as fast, unfortunately. Reducing the
number of recompute_online_node_usage() calls might be necessary after
all.

Probably not worth applying, as this didn't get much testing and is
not a huge improvement :/
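
For reference, min/max timings like the ones quoted above can be
collected with a small wrapper around Time::HiRes. This is only a
sketch and not the harness that was actually used; time_call() is a
hypothetical helper and the workload is a stand-in for a call to
recompute_online_node_usage().

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run $code $runs times and return the smallest
# and largest wall-clock duration in seconds.
sub time_call {
    my ($code, $runs) = @_;

    my ($min, $max);
    for (1 .. $runs) {
        my $t0 = [gettimeofday()];
        $code->();
        my $elapsed = tv_interval($t0); # time since $t0

        $min = $elapsed if !defined($min) || $elapsed < $min;
        $max = $elapsed if !defined($max) || $elapsed > $max;
    }

    return ($min, $max);
}

# Stand-in workload (assumption); a real measurement would wrap
# $manager->recompute_online_node_usage() instead.
my ($min, $max) = time_call(sub { my $x = 0; $x += $_ for 1 .. 1000; }, 20);
printf "between %.6f - %.6f seconds\n", $min, $max;
```

Reporting the range rather than a single value matches how the numbers
above are given, and makes run-to-run variation visible.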


Fiona Ebner (3):
  resources: get static stats: add cache parameter
  env: add get_static_guest_stats method
  manager/usage: cache static service stats to avoid overhead

 src/PVE/HA/Env.pm             |  6 ++++++
 src/PVE/HA/Env/PVE2.pm        | 15 +++++++++++++++
 src/PVE/HA/Manager.pm         |  1 +
 src/PVE/HA/Resources.pm       |  2 +-
 src/PVE/HA/Resources/PVECT.pm |  5 +++--
 src/PVE/HA/Resources/PVEVM.pm |  6 ++++--
 src/PVE/HA/Sim/Env.pm         |  7 +++++++
 src/PVE/HA/Sim/Resources.pm   |  2 +-
 src/PVE/HA/Usage/Static.pm    | 13 ++++++++++---
 9 files changed, 48 insertions(+), 9 deletions(-)

-- 
2.30.2






* [pve-devel] [RFC ha-manager 1/3] resources: get static stats: add cache parameter
From: Fiona Ebner @ 2022-11-18 11:32 UTC
  To: pve-devel

so callers can avoid the overhead from load_config() if they already
have the required information.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/HA/Resources.pm       | 2 +-
 src/PVE/HA/Resources/PVECT.pm | 4 ++--
 src/PVE/HA/Resources/PVEVM.pm | 5 +++--
 src/PVE/HA/Sim/Resources.pm   | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index 7ba90f6..ee8de52 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -162,7 +162,7 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id, $service_node, $cache) = @_;
 
     die "implement in subclass";
 }
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index e77d98c..c10d024 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -153,9 +153,9 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id, $service_node, $cache) = @_;
 
-    my $conf = PVE::LXC::Config->load_config($id, $service_node);
+    my $conf = $cache->{$id} ||= PVE::LXC::Config->load_config($id, $service_node);
 
     return {
 	maxcpu => $conf->{cpulimit} || $conf->{cores} || 0,
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index f405d86..ca7fbc4 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -176,9 +176,10 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id, $service_node, $cache) = @_;
+
+    my $conf = $cache->{$id} ||= PVE::QemuConfig->load_config($id, $service_node);
 
-    my $conf = PVE::QemuConfig->load_config($id, $service_node);
     my $defaults = PVE::QemuServer::load_defaults();
 
     my $cpus = ($conf->{sockets} || $defaults->{sockets}) * ($conf->{cores} || $defaults->{cores});
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index e6e1853..999a77a 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -140,7 +140,7 @@ sub remove_locks {
 }
 
 sub get_static_stats {
-    my ($class, $haenv, $id, $service_node) = @_;
+    my ($class, $haenv, $id, $service_node, $cache) = @_;
 
     my $sid = $class->type() . ":$id";
     my $hardware = $haenv->hardware();
-- 
2.30.2
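
The lookup used in the get_static_stats() hunks above can be tried in
isolation. The following is a minimal sketch under assumptions:
load_config_stub() is a hypothetical stand-in for load_config() that
only counts how often it is called, and get_conf() is a hypothetical
wrapper around the one-line lookup.

```perl
use strict;
use warnings;

my $loads = 0;

# Hypothetical stand-in for PVE::LXC::Config->load_config(); it only
# counts calls and returns a fixed config.
sub load_config_stub {
    my ($id) = @_;
    $loads++;
    return { cores => 2, memory => 512 };
}

sub get_conf {
    my ($id, $cache) = @_;

    # Use the cached entry if there is one, otherwise load the config
    # and store it. If no cache was passed, autovivification creates a
    # hash that is local to this call, so nothing is kept.
    return $cache->{$id} ||= load_config_stub($id);
}

my $cache = { 100 => { cores => 4, memory => 1024 } };

get_conf(100, $cache); # served from the cache
get_conf(101, $cache); # loaded, then stored in the cache
get_conf(101, $cache); # served from the cache
get_conf(102);         # no cache passed, always loads

print "load_config calls: $loads\n";
```

Callers that do not pass a cache keep the old behaviour, which is why
the extra parameter can be added without touching existing call sites.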






* [pve-devel] [RFC ha-manager 2/3] env: add get_static_guest_stats method
From: Fiona Ebner @ 2022-11-18 11:32 UTC
  To: pve-devel

which uses the efficient PVE::Cluster::get_guest_config_properties()
to retrieve the information and supports caching, so the information
can be passed around, avoiding the overhead from load_config().

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/HA/Env.pm      |  6 ++++++
 src/PVE/HA/Env/PVE2.pm | 15 +++++++++++++++
 src/PVE/HA/Sim/Env.pm  |  7 +++++++
 3 files changed, 28 insertions(+)

diff --git a/src/PVE/HA/Env.pm b/src/PVE/HA/Env.pm
index 16603ec..2b422eb 100644
--- a/src/PVE/HA/Env.pm
+++ b/src/PVE/HA/Env.pm
@@ -275,4 +275,10 @@ sub get_static_node_stats {
     return $self->{plug}->get_static_node_stats();
 }
 
+sub get_static_guest_stats {
+    my ($self, $cached) = @_;
+
+    return $self->{plug}->get_static_guest_stats($cached);
+}
+
 1;
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index f6ebfeb..02b981e 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -476,4 +476,19 @@ sub get_static_node_stats {
     return $stats;
 }
 
+sub get_static_guest_stats {
+    my ($self, $cached) = @_;
+
+    return $self->{'static-guest-stats'} if $self->{'static-guest-stats'} && $cached;
+
+    # NOTE see get_static_stats in Resources/*.pm for what properties are required.
+    my $properties = ['cores', 'cpulimit', 'memory', 'sockets', 'vcpus'];
+    my $stats = eval { PVE::Cluster::get_guest_config_properties($properties); };
+    $self->log('warning', "unable to initialize cache for static service stats - $@") if $@;
+
+    $self->{'static-guest-stats'} = $stats;
+
+    return $stats // {};
+}
+
 1;
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index c6ea73c..51690f5 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -442,4 +442,11 @@ sub get_static_node_stats {
     return $self->{hardware}->get_static_node_stats();
 }
 
+# FIXME actually return the stats here
+sub get_static_guest_stats {
+    my ($self, $cached) = @_;
+
+    return {};
+}
+
 1;
-- 
2.30.2
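
The behaviour of the $cached flag in the PVE2 hunk above can be shown
with a self-contained sketch. Sketch::Env and the fetch callback are
hypothetical; the callback stands in for
PVE::Cluster::get_guest_config_properties(), and warn replaces the
environment's log method.

```perl
use strict;
use warnings;

package Sketch::Env;

sub new {
    my ($class, $fetch) = @_;
    return bless { fetch => $fetch }, $class;
}

sub get_static_guest_stats {
    my ($self, $cached) = @_;

    # Only reuse the stored result if the caller asked for it.
    return $self->{'static-guest-stats'} if $self->{'static-guest-stats'} && $cached;

    my $stats = eval { $self->{fetch}->() };
    warn "unable to initialize cache for static service stats - $@" if $@;

    $self->{'static-guest-stats'} = $stats;

    # Fall back to an empty hash if fetching failed.
    return $stats // {};
}

package main;

my $calls = 0;
my $env = Sketch::Env->new(sub { $calls++; return { 100 => { cores => 2 } }; });

$env->get_static_guest_stats();  # fetches and stores
$env->get_static_guest_stats(1); # returns the stored result
$env->get_static_guest_stats();  # fetches again

print "fetches: $calls\n";
```

Calling the method without the flag therefore acts as a refresh, while
passing a true value reuses whatever the last refresh stored.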






* [pve-devel] [RFC ha-manager 3/3] manager/usage: cache static service stats to avoid overhead
From: Fiona Ebner @ 2022-11-18 11:32 UTC
  To: pve-devel

Benchmarked recompute_online_node_usage() again with ~300 HA services
(minimal containers) running on my virtual test cluster.

Timings before this patch were between 0.007 - 0.016 seconds
Timings after this patch were between 0.0035 - 0.006 seconds

So the result is only about twice as fast, unfortunately. Reducing the
number of recompute_online_node_usage() calls might be necessary after
all.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 src/PVE/HA/Manager.pm         |  1 +
 src/PVE/HA/Resources/PVECT.pm |  1 +
 src/PVE/HA/Resources/PVEVM.pm |  1 +
 src/PVE/HA/Usage/Static.pm    | 13 ++++++++++---
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 69bfbc3..c9a9f14 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -447,6 +447,7 @@ sub manage {
     for (;;) {
 	my $repeat = 0;
 
+	$haenv->get_static_guest_stats(); # to cache the info
 	$self->recompute_online_node_usage();
 
 	foreach my $sid (sort keys %$ss) {
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index c10d024..4c295f3 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -155,6 +155,7 @@ sub remove_locks {
 sub get_static_stats {
     my ($class, $haenv, $id, $service_node, $cache) = @_;
 
+    # NOTE that cache might not contain the full config
     my $conf = $cache->{$id} ||= PVE::LXC::Config->load_config($id, $service_node);
 
     return {
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index ca7fbc4..b6234be 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -178,6 +178,7 @@ sub remove_locks {
 sub get_static_stats {
     my ($class, $haenv, $id, $service_node, $cache) = @_;
 
+    # NOTE that cache might not contain the full config
     my $conf = $cache->{$id} ||= PVE::QemuConfig->load_config($id, $service_node);
 
     my $defaults = PVE::QemuServer::load_defaults();
diff --git a/src/PVE/HA/Usage/Static.pm b/src/PVE/HA/Usage/Static.pm
index 73ce836..05b876d 100644
--- a/src/PVE/HA/Usage/Static.pm
+++ b/src/PVE/HA/Usage/Static.pm
@@ -20,6 +20,7 @@ sub new {
     return bless {
 	'node-stats' => $node_stats,
 	'service-stats' => {},
+	'service-stats-cache' => $haenv->get_static_guest_stats(1),
 	haenv => $haenv,
 	scheduler => $scheduler,
 	'service-counts' => {}, # Service count on each node. Fallback if scoring calculation fails.
@@ -65,13 +66,19 @@ my sub get_service_usage {
 
     return $self->{'service-stats'}->{$sid} if $self->{'service-stats'}->{$sid};
 
-    my (undef, $type, $id) = $self->{haenv}->parse_sid($sid);
+    my $haenv = $self->{haenv};
+
+    my (undef, $type, $id) = $haenv->parse_sid($sid);
     my $plugin = PVE::HA::Resources->lookup($type);
 
-    my $stats = eval { $plugin->get_static_stats($self->{haenv}, $id, $service_node); };
+    my $stats = eval {
+	$plugin->get_static_stats($haenv, $id, $service_node, $self->{'service-stats-cache'});
+    };
     if (my $err = $@) {
 	# config might've already moved during a migration
-	$stats = eval { $plugin->get_static_stats($self->{haenv}, $id, $migration_target); } if $migration_target;
+	if ($migration_target) {
+	    $stats = eval { $plugin->get_static_stats($haenv, $id, $migration_target); };
+	}
 	die "did not get static service usage information for '$sid' - $err\n" if !$stats;
     }
 
-- 
2.30.2
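
The retry logic in the Usage/Static.pm hunk above (try the service
node first, then the migration target) can be sketched on its own.
The %configs table and get_stats_stub() are hypothetical stand-ins for
the real config lookup. Note that in the hunk above the retry is
called without the cache argument, so it still goes through
load_config().

```perl
use strict;
use warnings;

# Hypothetical data: the config of guest 100 has already moved to node2.
my %configs = (node2 => { 100 => { maxcpu => 2 } });

# Hypothetical stand-in for $plugin->get_static_stats(); dies if the
# config is not found on the given node.
sub get_stats_stub {
    my ($id, $node) = @_;
    my $conf = $configs{$node}{$id} or die "no config for $id on $node\n";
    return $conf;
}

sub get_service_usage {
    my ($id, $service_node, $migration_target) = @_;

    my $stats = eval { get_stats_stub($id, $service_node) };
    if (my $err = $@) {
        # config might have moved already during a migration
        if ($migration_target) {
            $stats = eval { get_stats_stub($id, $migration_target) };
        }
        die "did not get static service usage information for '$id' - $err" if !$stats;
    }

    return $stats;
}

my $stats = get_service_usage(100, 'node1', 'node2');
print "maxcpu: $stats->{maxcpu}\n";
```

Keeping the first error in $err means the final message reports the
failure on the service node, not the one from the retry.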





