From: "Daniel Kral" <d.kral@proxmox.com>
To: "Fiona Ebner" <f.ebner@proxmox.com>,
	"Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager 8/9] manager: make online node usage computation granular
Date: Fri, 17 Oct 2025 18:07:29 +0200
Message-ID: <DDKQ69YA6XYL.7CKSC2J62U4U@proxmox.com>
In-Reply-To: <cd768499-bde3-4c39-bf04-cdec71ed9464@proxmox.com>

On Fri Oct 17, 2025 at 2:32 PM CEST, Fiona Ebner wrote:
> On 30.09.25 at 4:20 PM, Daniel Kral wrote:
>> The HA Manager builds $online_node_usage in every FSM iteration in
>> manage(...) and at every HA resource state change in
>> change_service_state(...). This becomes quite costly with a high HA
>> resource count and a lot of state changes happening at once, e.g.
>> starting up multiple nodes with rebalance_on_request_start set or a
>> failover of a node with many configured HA resources.
>> 
>> To improve this situation, make the changes to the $online_node_usage
>> more granular by building $online_node_usage only once per call to
>> manage(...) and changing the nodes an HA resource uses individually on
>> every HA resource state transition.
>> 
>> The change in service usage "freshness" should be negligible here as the
>> static service usage data is cached anyway (except if the cache fails
>> for some reason).
>
> But the cache is refreshed on every recompute_online_node_usage(), which
> happened much more frequently before, so the fact that it's cached
> doesn't seem like a strong argument here?
>
> I /do/ think there is a real tradeoff being made, namely "the ability to
> manage much larger fleets of guests" versus "immediately incorporating
> every guest config change in decisions". Config changes that would lead
> to wildly different decisions would need to be timed very badly to cause
> actual issues and should be rare to begin with. With PSI-based
> information, things are also less "instant", so I don't see an issue
> with moving in the same direction.

Right, I'll change that to better reflect the tradeoff here!
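
To make the tradeoff concrete, here is a minimal Python sketch of the
idea (not the actual ha-manager Perl code). It assumes usage is a plain
per-node service count and that a service occupies its current node plus
an optional migration target. The helper names service_nodes,
recompute_usage and change_service_state only mirror the names used in
this thread; their signatures are hypothetical.

```python
def service_nodes(sd):
    """Nodes a service occupies: its current node plus an optional target.

    Assumption for this sketch: a migrating service counts on both nodes.
    """
    nodes = [sd["node"]]
    if sd.get("target"):
        nodes.append(sd["target"])
    return nodes


def recompute_usage(online_nodes, services):
    """Full rebuild: walks every service, so it costs O(#services)."""
    usage = {node: 0 for node in online_nodes}
    for sd in services.values():
        for node in service_nodes(sd):
            if node in usage:
                usage[node] += 1
    return usage


def change_service_state(usage, services, sid, **changes):
    """Granular update: only touch the nodes of the one changed service."""
    # Remove the service's old contribution.
    for node in service_nodes(services[sid]):
        if node in usage:
            usage[node] -= 1
    # Apply the state change, then add the new contribution.
    services[sid] = {**services[sid], **changes}
    for node in service_nodes(services[sid]):
        if node in usage:
            usage[node] += 1


online = ["node1", "node2", "node3"]
services = {
    "vm:100": {"node": "node1", "target": None},
    "vm:101": {"node": "node1", "target": None},
    "ct:200": {"node": "node2", "target": None},
}

# Built once per manage() call in this sketch ...
usage = recompute_usage(online, services)

# ... and then adjusted per state transition instead of being rebuilt.
change_service_state(usage, services, "vm:100", target="node3")
change_service_state(usage, services, "vm:100", node="node3", target=None)
```

In this sketch the granular path ends in the same state a full rebuild
would produce. What it no longer does is re-read any per-service data on
every transition, which is where the reduced "freshness" comes from.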

