From: Fiona Ebner <f.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager 09/11] manager: use static resource scheduler when configured
Date: Wed, 16 Nov 2022 10:37:18 +0100
Message-ID: <82c69808-1032-ff32-1d23-ceacdc0a11eb@proxmox.com>
In-Reply-To: <ff380260-8de5-57d8-321e-7a1e0b8893cf@proxmox.com>

On 16.11.22 at 08:14, Thomas Lamprecht wrote:
> On 11/11/2022 at 10:28, Fiona Ebner wrote:
>> On 10.11.22 at 15:37, Fiona Ebner wrote:
>>> @@ -206,11 +207,30 @@ my $valid_service_states = {
>>>  sub recompute_online_node_usage {
>> So I was a bit worried that recompute_online_node_usage() would become
>> too inefficient with the new add_service_usage_to_node() overhead from
>> needing to read the guest configs. I now tested it with ~300 HA services
>> (minimal containers) running on my virtual test cluster.
>>
>> Timings with 'basic' mode were between 0.0004 - 0.001 seconds
>> Timings with 'static' mode were between 0.007 - 0.012 seconds
>>
>> While about a 10-fold increase, it's not too dramatic at least. I guess
>> that's what the caching of cfs files is for :)
>>
>> Still, the function is currently not only called in the main loop in
>> manage(), but also in next_state_recovery() and change_service_state().
>>
>> With, say, 400 HA services each on 5 nodes, if a node fails there's
>> 400 calls from changing to freeze
> 
> huh, freeze should only happen on graceful shutdown of a node, not
> if it fails?

Sorry, I meant fence, not freeze.

> 
>> 400 calls from changing to recovery
>> 400 calls in next_state_recovery
>> 400 calls from changing to started
>> If we take a generous estimate that each call takes 0.1 seconds (there's
>> 2000 services in total), that's 40+80+40 seconds in 3 bursts during the
>> fencing and recovery period.
> 
> doesn't that lead to overly long run windows between watchdog updates?
> 
>>
>> Is that acceptable? Should I try to optimize how often the function is
>> called?
>>
> 
> hmm, a quick look wouldn't hurt, but not required for now IMO - if it can
> interfere with watchdog updates I'd sneak in updating it once in between
> though.
> 

Yes, from a quick look that might become a problem, exactly because the
delays happen in bursts (all services change state in a single manage()
run).
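
To make the burst sizes concrete, the estimate from the quoted mail can
be written out as below. This is only a Python sketch of the arithmetic
(ha-manager itself is Perl). The 0.1 seconds per call is the generous
guess from above, not a measurement, and grouping the four sets of calls
into three bursts is an assumption that matches the 40+80+40 split.

```python
# Rough estimate of the time spent in recompute_online_node_usage()
# during fencing and recovery. All numbers are assumptions taken from
# the quoted mail, not measurements.
SERVICES_ON_FAILED_NODE = 400
SECONDS_PER_CALL = 0.1  # generous estimate

# Calls per service in each burst. Assumed grouping: the change to
# recovery and next_state_recovery() land in the same burst.
CALLS_PER_BURST = {
    "change to fence": 1,
    "change to recovery + next_state_recovery": 2,
    "change to started": 1,
}


def burst_seconds(calls_per_service):
    """Seconds spent recomputing node usage within one burst."""
    return SERVICES_ON_FAILED_NODE * calls_per_service * SECONDS_PER_CALL


bursts = {name: burst_seconds(n) for name, n in CALLS_PER_BURST.items()}
total = sum(bursts.values())

for name, secs in bursts.items():
    print(f"{name}: {secs:.0f} s")
print(f"total: {total:.0f} s")
```

Each burst would have to be compared against the watchdog update
interval, which is what makes the 80 second one the most worrying.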

Not sure how you would trigger the update, because that would need to
happen in the CRM AFAIU?

There is a fixme comment in CRM.pm's work() to set an alert timer and
enforce working for at most $max_time seconds. That would of course help
here.
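
Capping the work per round could look roughly like the following. Note
that this is a cooperative deadline check between work items rather than
an actual alarm timer, it is written in Python rather than the Perl of
CRM.pm, and work_with_budget, tasks and clock are made-up names for
illustration only.

```python
import time


def work_with_budget(tasks, max_time, clock=time.monotonic):
    """Run callables from `tasks` in order until all are done or until
    `max_time` seconds have elapsed. Returns the tasks not yet started.

    The deadline is only checked between tasks, so a single slow task
    can still overrun the budget. A real alarm timer would be needed to
    interrupt a task that is already running.
    """
    deadline = clock() + max_time
    pending = list(tasks)
    while pending and clock() < deadline:
        task = pending.pop(0)
        task()
    return pending


if __name__ == "__main__":
    # Ten tasks of about 10 ms each with a 35 ms budget: some are left over.
    jobs = [lambda: time.sleep(0.01) for _ in range(10)]
    leftover = work_with_budget(jobs, max_time=0.035)
    print(f"{len(leftover)} tasks deferred to the next round")
```

The leftover tasks would be picked up in the next round, after the
watchdog has been updated.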

Getting rid of superfluous recompute_online_node_usage() calls should
also be feasible. We'd need to ensure that we add service usage (which
is already done in recovery and next_state_started) and remove service
usage (removing is not implemented right now) when changing nodes or
states. Then it'd be enough to call recompute_online_node_usage() once
per cycle, which would be a huge improvement compared to now.
Additionally, we could call it after iterating over a certain number of
services, just to be sure.
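
A minimal sketch of that bookkeeping, in Python rather than Perl and
with a plain service count standing in for the real usage plugins.
NodeUsage, recover_services and recompute_every are hypothetical names,
and picking the least-used node is only a placeholder for the actual
node selection.

```python
class NodeUsage:
    """Toy per-node usage counter, standing in for the Usage plugins."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.usage = {node: 0 for node in self.nodes}

    def recompute(self, placement):
        """Full rebuild from a {sid: node} mapping (the expensive path)."""
        self.usage = {node: 0 for node in self.nodes}
        for node in placement.values():
            self.add_service(node)

    def add_service(self, node):
        self.usage[node] += 1

    def remove_service(self, node):
        self.usage[node] -= 1


def recover_services(placement, failed_node, usage, recompute_every=100):
    """Move every service off `failed_node`, updating usage incrementally.

    Returns the number of full recomputes that were performed.
    """
    usage.recompute(placement)  # once at the start of the cycle
    recomputes = 1
    targets = [n for n in usage.nodes if n != failed_node]
    to_move = [sid for sid, node in placement.items() if node == failed_node]
    for count, sid in enumerate(to_move, start=1):
        target = min(targets, key=lambda n: usage.usage[n])
        usage.remove_service(failed_node)  # the currently missing half
        usage.add_service(target)
        placement[sid] = target
        if count % recompute_every == 0:
            usage.recompute(placement)  # safety net, "just to be sure"
            recomputes += 1
    return recomputes


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(1, 6)]
    placement = {f"ct:{n}-{i}": n for n in nodes for i in range(400)}
    usage = NodeUsage(nodes)
    n = recover_services(placement, "node1", usage)
    print(f"full recomputes: {n}, usage: {usage.usage}")
```

With 400 services on the failed node this does a handful of full
recomputes instead of one per state change.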

> 
> ps. maybe you can have some of that info/stats here in the commit message
> of this patch.

Sure.



