From: Fiona Ebner <f.ebner@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
Daniel Kral <d.kral@proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager 1/9] implement static service stats cache
Date: Thu, 16 Oct 2025 13:12:56 +0200 [thread overview]
Message-ID: <e753b537-762e-4c5a-8f19-cd1cfb56a02b@proxmox.com> (raw)
In-Reply-To: <20250930142021.366529-5-d.kral@proxmox.com>
Am 30.09.25 um 4:21 PM schrieb Daniel Kral:
> As the HA Manager builds the static load scheduler, it queries the
> services' static usage by reading and parsing the static guest configs
> individually, which can take significant time with respect to how many
> times recompute_online_node_usage(...) is called.
>
> PVE::Cluster exposes an efficient interface to gather a set of
> properties from one or all guest configs [0]. This is used here to build
> a rather short-lived cache on every (re)initialization of the static
> load scheduler.
>
> The downside to this approach is if there are way more non-HA managed
> guests in the cluster than HA-managed guests, which causes this to load
> much more information than necessary. It also doesn't cache the default
> values for the environment-specific static service usage, which causes
> quite a bottleneck as well.
>
> [0] pve-cluster cf1b19d (add get_guest_config_property IPCC method)
>
> Suggested-by: Fiona Ebner <f.ebner@proxmox.com>
> Signed-off-by: Daniel Kral <d.kral@proxmox.com>
> ---
> If the above mentioned downside is too large for some setups, we could
> also just add a switch to enable/disable the static stats cache or
> automatically figure out if it brings any benefits.. But I think this
> will be good enough, especially with the latter patches making way fewer
> calls to get_service_usage(...).
Parsing the configs is much more costly than
PVE::Cluster::get_guest_config_properties(), so improving the situation
for the majority of use cases of the static scheduler still makes sense,
even if it hurts performance for some more exotic setups. And we would
like to consider usage of non-HA guests in the future too to make the
static scheduler more accurate even in such exotic setups. But with
PSI-based information that is already included in the node usage, no
extra preparation necessary there.
>
> src/PVE/HA/Env.pm | 12 ++++++++++++
> src/PVE/HA/Env/PVE2.pm | 21 +++++++++++++++++++++
> src/PVE/HA/Manager.pm | 1 +
Needs a rebase because of changes in Manager.pm
> @@ -497,6 +499,25 @@ sub get_datacenter_settings {
> };
> }
>
> +sub get_static_service_stats {
> + my ($self, $id) = @_;
> +
> + # undef if update_static_service_stats(...) failed before
> + return undef if !defined($self->{static_service_stats});
> +
> + return $self->{static_service_stats}->{$id} // {};
Can't returning '{}' when nothing is there lead to issues down the line?
If we return undef instead, it's consistent with not having anything
cached and the caller will fall back to loading the config.
> @@ -477,6 +467,8 @@ sub new {
>
> $self->{service_config} = $self->read_service_config();
>
> + $self->{static_service_stats} = undef;
> +
> return $self;
> }
>
> @@ -943,6 +935,25 @@ sub watchdog_update {
> return &$modify_watchog($self, $code);
> }
>
> +sub get_static_service_stats {
> + my ($self, $id) = @_;
> +
> + # undef if update_static_service_stats(...) failed before
> + return undef if !defined($self->{static_service_stats});
> +
> + return $self->{static_service_stats}->{$id} // {};
Same here.
> +}
> +
> +sub update_static_service_stats {
> + my ($self) = @_;
> +
> + my $filename = "$self->{statusdir}/static_service_stats";
> + my $stats = eval { PVE::HA::Tools::read_json_from_file($filename) };
> + $self->log('warning', "unable to update static service stats cache - $@") if $@;
> +
> + $self->{static_service_stats} = $stats;
> +}
> +
> sub get_static_node_stats {
> my ($self) = @_;
>
> diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
> index 72623ee1..7641b1a9 100644
> --- a/src/PVE/HA/Sim/Resources.pm
> +++ b/src/PVE/HA/Sim/Resources.pm
> @@ -143,8 +143,8 @@ sub get_static_stats {
> my $sid = $class->type() . ":$id";
> my $hardware = $haenv->hardware();
>
> - my $stats = $hardware->read_static_service_stats();
> - if (my $service_stats = $stats->{$sid}) {
> + my $service_stats = $hardware->get_static_service_stats($sid);
> + if (%$service_stats) {
Style nit: with the above change, this could be just $service_stats or a
definedness check for it. If really checking if there are any keys, I
prefer being explicit with scalar(keys ...)
> return $service_stats;
> } elsif ($id =~ /^(\d)(\d\d)/) {
> # auto assign usage calculated from ID for convenience
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2025-10-16 11:13 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-30 14:19 [pve-devel] [RFC ha-manager/perl-rs/proxmox/qemu-server 00/12] Granular online_node_usage accounting Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH qemu-server 1/1] config: only fetch necessary default values in get_derived_property helper Daniel Kral
2025-10-15 14:31 ` Fiona Ebner
2025-10-16 9:07 ` Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH proxmox 1/1] resource-scheduling: change score_nodes_to_start_service signature Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH perl-rs 1/1] pve-rs: resource_scheduling: allow granular usage changes Daniel Kral
2025-10-16 10:32 ` Fiona Ebner
2025-10-16 15:34 ` Daniel Kral
2025-10-17 10:55 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 1/9] implement static service stats cache Daniel Kral
2025-10-16 11:12 ` Fiona Ebner [this message]
2025-10-16 15:15 ` Daniel Kral
2025-10-17 10:02 ` Fiona Ebner
2025-10-17 10:08 ` Fiona Ebner
2025-10-17 16:18 ` Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 2/9] manager: remove redundant recompute_online_node_usage from next_state_recovery Daniel Kral
2025-10-16 11:25 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 3/9] manager: remove redundant add_service_usage_to_node " Daniel Kral
2025-10-16 11:33 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 4/9] manager: remove redundant add_service_usage_to_node from next_state_started Daniel Kral
2025-10-16 11:39 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 5/9] rules: resource affinity: decouple get_resource_affinity helper from Usage class Daniel Kral
2025-10-17 11:14 ` Fiona Ebner
2025-10-17 15:46 ` Daniel Kral
2025-10-20 15:18 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 6/9] manager: make recompute_online_node_usage use get_service_nodes helper Daniel Kral
2025-10-17 11:25 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 7/9] usage: allow granular changes to Usage implementations Daniel Kral
2025-10-17 11:57 ` Fiona Ebner
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 8/9] manager: make online node usage computation granular Daniel Kral
2025-10-17 12:32 ` Fiona Ebner
2025-10-17 16:07 ` Daniel Kral
2025-09-30 14:19 ` [pve-devel] [PATCH ha-manager 9/9] manager: make service node usage computation more granular Daniel Kral
2025-10-17 12:42 ` Fiona Ebner
2025-10-17 15:59 ` Daniel Kral
2025-10-20 16:50 ` [pve-devel] superseded: [RFC ha-manager/perl-rs/proxmox/qemu-server 00/12] Granular online_node_usage accounting Daniel Kral
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e753b537-762e-4c5a-8f19-cd1cfb56a02b@proxmox.com \
--to=f.ebner@proxmox.com \
--cc=d.kral@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox