From: Daniel Kral <d.kral@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH ha-manager v2 8/8] implement static service stats cache
Date: Mon, 20 Oct 2025 18:45:38 +0200
Message-ID: <20251020164540.517231-13-d.kral@proxmox.com>
In-Reply-To: <20251020164540.517231-1-d.kral@proxmox.com>

When the HA Manager builds the static load scheduler, it queries the
services' static usage by reading and parsing each guest config
individually. The time this takes grows with the number of HA resources
that are in an actively managed state.

PVE::Cluster exposes an efficient interface to gather a set of
properties from one or all guest configs [0]. This is used here to build
a rather short-lived cache on every (re)initialization of the static
load scheduler to avoid parsing guest configs individually.

[0] pve-cluster cf1b19d (add get_guest_config_property IPCC method)

Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
changes since v1:
 - populate static service cache with entries from
   PVE::Cluster::get_vmlist(...) to make a better distinction between
   "not cached" and "not specified in guest config"
 - improve interface to cache (remove {} fallback return value)

Should we add another cfs_update(...) for the get_vmlist(...) call to
make sure that vmlist contains the most recent values?
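
To illustrate the distinction between "not cached" and "not specified in
guest config" mentioned in the changelog above, here is a minimal,
standalone sketch of the lookup order that get_static_stats(...) follows
with this patch. The cache contents, the guest IDs, and the helpers
load_config_fallback(...) and lookup_conf(...) are hypothetical stand-ins
for illustration only and are not part of the patch:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the cache built by
# update_static_service_stats(...); IDs and values are made up.
my $cache = {
    100 => { cores => 4, memory => 8192 }, # properties set in guest config
    101 => {}, # guest exists, but specifies none of the properties
};

# Hypothetical stand-in for the per-guest load_config(...) fallback.
sub load_config_fallback {
    my ($id) = @_;

    return { cores => 1, memory => 512, source => 'parsed' };
}

# Take the cached entry if one is defined, otherwise parse the config.
# An undefined cache models a failed cache update.
sub lookup_conf {
    my ($cache, $id) = @_;

    my $conf = defined($cache) ? $cache->{$id} : undef;
    $conf = load_config_fallback($id) if !defined($conf);

    return $conf;
}

my $hit = lookup_conf($cache, 100); # cached with properties
my $empty = lookup_conf($cache, 101); # cached but empty, no fallback
my $miss = lookup_conf($cache, 102); # not cached, falls back
my $failed = lookup_conf(undef, 100); # no cache at all, falls back
```

In this sketch, an empty hash still counts as a cache hit, so the guest
config is only parsed for IDs missing from the cache or when no cache
could be built at all.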

 src/PVE/HA/Env.pm             | 12 ++++++++++++
 src/PVE/HA/Env/PVE2.pm        | 35 +++++++++++++++++++++++++++++++++++
 src/PVE/HA/Manager.pm         |  1 +
 src/PVE/HA/Resources/PVECT.pm |  3 ++-
 src/PVE/HA/Resources/PVEVM.pm |  3 ++-
 src/PVE/HA/Sim/Env.pm         | 12 ++++++++++++
 src/PVE/HA/Sim/Hardware.pm    | 31 +++++++++++++++++++++----------
 src/PVE/HA/Sim/Resources.pm   |  3 +--
 8 files changed, 86 insertions(+), 14 deletions(-)

diff --git a/src/PVE/HA/Env.pm b/src/PVE/HA/Env.pm
index e00272a0..4282d33f 100644
--- a/src/PVE/HA/Env.pm
+++ b/src/PVE/HA/Env.pm
@@ -300,6 +300,18 @@ sub get_datacenter_settings {
     return $self->{plug}->get_datacenter_settings();
 }
 
+sub get_static_service_stats {
+    my ($self, $id) = @_;
+
+    return $self->{plug}->get_static_service_stats($id);
+}
+
+sub update_static_service_stats {
+    my ($self) = @_;
+
+    return $self->{plug}->update_static_service_stats();
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Env/PVE2.pm b/src/PVE/HA/Env/PVE2.pm
index 2cec6f25..83ab88ab 100644
--- a/src/PVE/HA/Env/PVE2.pm
+++ b/src/PVE/HA/Env/PVE2.pm
@@ -49,6 +49,8 @@ sub new {
 
     $self->{nodename} = $nodename;
 
+    $self->{static_service_stats} = undef;
+
     return $self;
 }
 
@@ -502,6 +504,39 @@ sub get_datacenter_settings {
     };
 }
 
+sub get_static_service_stats {
+    my ($self, $id) = @_;
+
+    # undef if update_static_service_stats(...) failed before
+    return undef if !defined($self->{static_service_stats});
+
+    return $self->{static_service_stats}->{$id};
+}
+
+sub update_static_service_stats {
+    my ($self) = @_;
+
+    my $properties = ['cores', 'cpulimit', 'memory', 'sockets', 'vcpus'];
+    my $service_stats = eval {
+        my $stats = PVE::Cluster::get_guest_config_properties($properties);
+
+        # get_guest_config_properties(...) doesn't add guests which do not
+        # specify any of the given properties, but we need to make a distinction
+        # between "not cached" and "not specified" here
+        my $vmlist = PVE::Cluster::get_vmlist();
+        for my $id (keys %$vmlist) {
+            next if defined($stats->{$id});
+
+            $stats->{$id} = {};
+        }
+
+        return $stats;
+    };
+    $self->log('warning', "unable to update static service stats cache - $@") if $@;
+
+    $self->{static_service_stats} = $service_stats;
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 3bd6e1a6..83167075 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -253,6 +253,7 @@ sub recompute_online_node_usage {
                 $online_node_usage = eval {
                     my $scheduler = PVE::HA::Usage::Static->new($haenv);
                     $scheduler->add_node($_) for $online_nodes->@*;
+                    $haenv->update_static_service_stats();
                     return $scheduler;
                 };
             } else {
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index 44644d92..091249d7 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -156,7 +156,8 @@ sub remove_locks {
 sub get_static_stats {
     my ($class, $haenv, $id, $service_node) = @_;
 
-    my $conf = PVE::LXC::Config->load_config($id, $service_node);
+    my $conf = $haenv->get_static_service_stats($id);
+    $conf = PVE::LXC::Config->load_config($id, $service_node) if !defined($conf);
 
     return {
         maxcpu => PVE::LXC::Config->get_derived_property($conf, 'max-cpu'),
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index e634fe3c..d1bc3329 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -177,7 +177,8 @@ sub remove_locks {
 sub get_static_stats {
     my ($class, $haenv, $id, $service_node) = @_;
 
-    my $conf = PVE::QemuConfig->load_config($id, $service_node);
+    my $conf = $haenv->get_static_service_stats($id);
+    $conf = PVE::QemuConfig->load_config($id, $service_node) if !defined($conf);
 
     return {
         maxcpu => PVE::QemuConfig->get_derived_property($conf, 'max-cpu'),
diff --git a/src/PVE/HA/Sim/Env.pm b/src/PVE/HA/Sim/Env.pm
index 684e92f8..1d70026e 100644
--- a/src/PVE/HA/Sim/Env.pm
+++ b/src/PVE/HA/Sim/Env.pm
@@ -488,6 +488,18 @@ sub get_datacenter_settings {
     };
 }
 
+sub get_static_service_stats {
+    my ($self, $id) = @_;
+
+    return $self->{hardware}->get_static_service_stats($id);
+}
+
+sub update_static_service_stats {
+    my ($self) = @_;
+
+    return $self->{hardware}->update_static_service_stats();
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Sim/Hardware.pm b/src/PVE/HA/Sim/Hardware.pm
index 9e8c7995..fffc90e7 100644
--- a/src/PVE/HA/Sim/Hardware.pm
+++ b/src/PVE/HA/Sim/Hardware.pm
@@ -387,16 +387,6 @@ sub write_service_status {
     return $res;
 }
 
-sub read_static_service_stats {
-    my ($self) = @_;
-
-    my $filename = "$self->{statusdir}/static_service_stats";
-    my $stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
-    $self->log('error', "loading static service stats failed - $@") if $@;
-
-    return $stats;
-}
-
 sub new {
     my ($this, $testdir) = @_;
 
@@ -477,6 +467,8 @@ sub new {
 
     $self->{service_config} = $self->read_service_config();
 
+    $self->{static_service_stats} = undef;
+
     return $self;
 }
 
@@ -943,6 +935,25 @@ sub watchdog_update {
     return &$modify_watchog($self, $code);
 }
 
+sub get_static_service_stats {
+    my ($self, $id) = @_;
+
+    # undef if update_static_service_stats(...) failed before
+    return undef if !defined($self->{static_service_stats});
+
+    return $self->{static_service_stats}->{$id};
+}
+
+sub update_static_service_stats {
+    my ($self) = @_;
+
+    my $filename = "$self->{statusdir}/static_service_stats";
+    my $stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
+    $self->log('warning', "unable to update static service stats cache - $@") if $@;
+
+    $self->{static_service_stats} = $stats;
+}
+
 sub get_static_node_stats {
     my ($self) = @_;
 
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index 72623ee1..ed43373e 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -143,8 +143,7 @@ sub get_static_stats {
     my $sid = $class->type() . ":$id";
     my $hardware = $haenv->hardware();
 
-    my $stats = $hardware->read_static_service_stats();
-    if (my $service_stats = $stats->{$sid}) {
+    if (my $service_stats = $hardware->get_static_service_stats($sid)) {
         return $service_stats;
     } elsif ($id =~ /^(\d)(\d\d)/) {
         # auto assign usage calculated from ID for convenience
-- 
2.47.3



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

