From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] [PATCH-SERIES proxmox-resource-scheduling/pve-ha-manager/etc] add static usage scheduler for HA manager
Date: Thu, 10 Nov 2022 15:37:39 +0100
Message-ID: <20221110143800.98047-1-f.ebner@proxmox.com>

Right now, the online node usage calculation for the HA manager only
considers the number of active services on each node. This patch
series allows switching to a 'static' scheduler mode, where static
usage information from the nodes and guest configurations is used
instead.

This also includes the remaining cgroup/cpuunits-related patches,
because the broadcasting of static information was done to include the
cgroup mode of the node.

With this version, the effect is limited to choosing nodes during
recovery, but the plan is to extend this.

As a next step, it would be nice to also have this for startup, but
AFAICT the issue is that the node selection only happens after the
state is already set to started, and I think select_service_node()
doesn't currently know if a service has been newly started. I haven't
looked into it in too much detail though.

One idea to get a balancer out of it is to:
1. (optionally) sort all services by badness (needs new backend function)
2. iterate scoring the nodes for each service, adding the usage to the
   chosen node after each iteration. The current node can be kept if the
   score compared to the best node doesn't differ too much.
3. record the chosen nodes and migrate the services accordingly.
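
The three steps above could be sketched roughly as follows. This is
only an illustration and not part of the series: the Service struct,
the single 'demand' number, the 'tolerance' parameter and the simple
relative-load score are all hypothetical stand-ins (the real scoring
would come from the scheduling crate), and sorting by demand is just a
placeholder for a real badness metric.

```rust
use std::collections::BTreeMap;

/// A service with a single, hypothetical usage number and its current node.
#[derive(Clone, Debug)]
struct Service {
    id: String,
    demand: f64,
    current_node: String,
}

/// Plan a placement for all services. `capacity` maps node name to capacity.
/// Returns (service id, chosen node) pairs; a chosen node that differs from
/// `current_node` means the service would have to be migrated.
fn plan_placement(
    capacity: &BTreeMap<String, f64>,
    services: &[Service],
    tolerance: f64,
) -> Vec<(String, String)> {
    // Step 1: order the services, here by descending demand as a stand-in
    // for a real badness metric (ties are broken by service id).
    let mut ordered: Vec<&Service> = services.iter().collect();
    ordered.sort_by(|a, b| b.demand.total_cmp(&a.demand).then_with(|| a.id.cmp(&b.id)));

    // Usage accumulated on each node by the services placed so far.
    let mut usage: BTreeMap<&str, f64> =
        capacity.keys().map(|n| (n.as_str(), 0.0)).collect();
    let mut plan = Vec::new();

    for service in ordered {
        // Step 2: score each node by the relative load it would have after
        // adding the service (lower is better).
        let mut best: Option<(&str, f64)> = None;
        let mut current_score: Option<f64> = None;
        for (node, cap) in capacity {
            let score = (usage[node.as_str()] + service.demand) / cap;
            if node == &service.current_node {
                current_score = Some(score);
            }
            if best.map_or(true, |(_, s)| score < s) {
                best = Some((node.as_str(), score));
            }
        }

        // Keep the current node if it is not much worse than the best one.
        let chosen = match (best, current_score) {
            (Some((_, best_score)), Some(cur)) if cur - best_score <= tolerance => {
                service.current_node.as_str()
            }
            (Some((node, _)), _) => node,
            (None, _) => continue, // no nodes available at all
        };

        // Add the usage to the chosen node and record the decision (step 3).
        *usage.get_mut(chosen).unwrap() += service.demand;
        plan.push((service.id.clone(), chosen.to_string()));
    }
    plan
}
```

Comparing each returned node with the service's current node then
yields the list of migrations to perform.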

Unit tests for ha-manager itself are also still missing.


Almost all of the series is preparatory infrastructure, but the hope
is that much of it can be re-used for balancers and dynamic
scheduling in the future.

The proxmox-resource-scheduling Rust crate implements the TOPSIS
algorithm first suggested by Alexandre. It also models the static node
and service usages in PVE and allows scoring nodes to decide where to
start a new or recovered service. This is done by simulating starting
the service on each node and comparing the alternatives with average
and highest CPU and memory usage as criteria. Memory is weighted much
more heavily, as it is a more limited resource than CPU.
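
For readers unfamiliar with TOPSIS, here is a minimal textbook-style
sketch. It is not the crate's actual code or API: the function name
and signature are made up for illustration, and all criteria are
assumed to be costs (lower is better), which fits usage values. Every
row of the matrix is one alternative and must have one value per
weight.

```rust
/// Euclidean distance between two points of equal dimension.
fn distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Score alternatives with TOPSIS, treating every criterion as a cost.
/// Returns one closeness score per alternative in [0, 1]; higher is better.
fn topsis_scores(matrix: &[Vec<f64>], weights: &[f64]) -> Vec<f64> {
    let n_criteria = weights.len();

    // Euclidean norm of every column, used for vector normalization.
    let norms: Vec<f64> = (0..n_criteria)
        .map(|j| matrix.iter().map(|row| row[j] * row[j]).sum::<f64>().sqrt())
        .collect();

    // Normalized and weighted decision matrix.
    let weighted: Vec<Vec<f64>> = matrix
        .iter()
        .map(|row| {
            (0..n_criteria)
                .map(|j| if norms[j] > 0.0 { weights[j] * row[j] / norms[j] } else { 0.0 })
                .collect()
        })
        .collect();

    // For cost criteria, the ideal value is the column minimum and the
    // anti-ideal value is the column maximum.
    let best: Vec<f64> = (0..n_criteria)
        .map(|j| weighted.iter().map(|r| r[j]).fold(f64::INFINITY, f64::min))
        .collect();
    let worst: Vec<f64> = (0..n_criteria)
        .map(|j| weighted.iter().map(|r| r[j]).fold(f64::NEG_INFINITY, f64::max))
        .collect();

    // Relative closeness to the ideal: distance to the anti-ideal divided
    // by the sum of both distances.
    weighted
        .iter()
        .map(|row| {
            let d_best = distance(row, &best);
            let d_worst = distance(row, &worst);
            if d_best + d_worst > 0.0 { d_worst / (d_best + d_worst) } else { 0.5 }
        })
        .collect()
}
```

In the setting described above, each row would hold the criteria
values (e.g. average and highest CPU and memory usage) that result
from simulating the start on one candidate node, and the node with the
highest score would be chosen.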

I did not implement the criteria weighting process from AHP (yet) (also
suggested by Alexandre), which computes averaged weights and a bias
score from a table of pairwise weights between criteria. The downside
is that one needs to guess n(n-1)/2 weights instead of n, and the
upside is that each guess only compares two criteria rather than one
criterion relative to all others. But this can still be done in the
future if we want.
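
To make the trade-off concrete, here is a rough sketch of how weights
could be derived from pairwise judgments, using the common
approximation of normalizing the columns of the comparison matrix and
averaging the rows. The function name and the flat upper-triangle
input layout are assumptions for illustration, and the bias score is
left out.

```rust
/// Derive criteria weights from pairwise judgments.
///
/// `pairwise` holds the n(n-1)/2 judgments of the upper triangle in
/// row-major order: how much more important criterion i is than
/// criterion j, for all i < j. Returns None if the number of judgments
/// does not match or a judgment is not positive.
fn ahp_weights(n: usize, pairwise: &[f64]) -> Option<Vec<f64>> {
    if n == 0 || pairwise.len() != n * (n - 1) / 2 {
        return None;
    }
    if pairwise.iter().any(|p| *p <= 0.0) {
        return None;
    }

    // Build the full reciprocal comparison matrix with ones on the diagonal.
    let mut m = vec![vec![1.0; n]; n];
    let mut idx = 0;
    for i in 0..n {
        for j in (i + 1)..n {
            m[i][j] = pairwise[idx];
            m[j][i] = 1.0 / pairwise[idx];
            idx += 1;
        }
    }

    // Sum of every column, used to normalize the matrix column-wise.
    let col_sums: Vec<f64> = (0..n)
        .map(|j| m.iter().map(|row| row[j]).sum::<f64>())
        .collect();

    // The weight of a criterion is the average of its normalized row.
    let weights: Vec<f64> = m
        .iter()
        .map(|row| row.iter().zip(&col_sums).map(|(v, s)| v / s).sum::<f64>() / n as f64)
        .collect();
    Some(weights)
}
```

For two criteria where the first is judged three times as important as
the second, a single judgment is needed and the resulting weights are
0.75 and 0.25. With four criteria, six judgments are already required.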

In proxmox-perl-rs, a class is provided for interfacing from Perl.

In pve-manager, the static node information is broadcast whenever
outdated. There also are the unrelated (but touching the same code)
cgroup/cpuunits patches.

In pve-cluster, a new crs (=cluster-resource-scheduler) option is
added, initially with a mode for HA.

In pve-ha-manager, the online node usage calculation is factored out
into a 'Usage' plugin system to ease adding the new static mode
without much clutter. If not all nodes provide static service
information, we fall back to the 'basic' mode. If only the scoring
fails (but that /should/ be rather unlikely), there is no real
fallback implemented currently (the '|| $a cmp $b' in
select_service_node() destroys the random hash keys order again ;)).
We could change it to stay random or, better, track the service count
in Usage::Static too and use that.
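
The fallback to 'basic' mode boils down to a check like the following
sketch. The actual implementation lives in the Perl code of
pve-ha-manager; the type and function names here are hypothetical and
only illustrate the decision.

```rust
use std::collections::BTreeMap;

/// Scheduler mode, as configured or as effectively used.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SchedulerMode {
    Basic,
    Static,
}

/// Hypothetical static information broadcast by a node.
struct StaticNodeStats {
    cpus: u32,
    memory: u64,
}

/// Decide which mode to use: 'static' is only possible if it is
/// configured and every node provides its static information.
fn effective_mode(
    configured: SchedulerMode,
    stats: &BTreeMap<String, Option<StaticNodeStats>>,
) -> SchedulerMode {
    match configured {
        SchedulerMode::Static if stats.values().all(|s| s.is_some()) => SchedulerMode::Static,
        _ => SchedulerMode::Basic,
    }
}
```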


Dependency bumps needed:
proxmox-perl-rs depends on proxmox-resource-scheduling
pve-ha-manager (build)depends on proxmox-perl-rs
The new feature is only usable with updated pve-manager and
pve-cluster of course, but there is no hard dependency.


proxmox-resource-scheduling:

Fiona Ebner (3):
  initial commit
  add pve_static module
  add Debian packaging


proxmox-perl-rs:

Fiona Ebner (2):
  pve-rs: add resource scheduling module
  add basic test for resource scheduling

 Makefile                                 |   1 +
 pve-rs/Cargo.toml                        |   1 +
 pve-rs/src/lib.rs                        |   1 +
 pve-rs/src/resource_scheduling/mod.rs    |   1 +
 pve-rs/src/resource_scheduling/static.rs | 116 +++++++++++++++++++++++
 pve-rs/test/Makefile                     |   4 +
 pve-rs/test/README                       |   2 +
 pve-rs/test/resource_scheduling.pl       |  70 ++++++++++++++
 8 files changed, 196 insertions(+)
 create mode 100644 pve-rs/src/resource_scheduling/mod.rs
 create mode 100644 pve-rs/src/resource_scheduling/static.rs
 create mode 100644 pve-rs/test/Makefile
 create mode 100644 pve-rs/test/README
 create mode 100755 pve-rs/test/resource_scheduling.pl


pve-manager:

Fiona Ebner (3):
  pvestatd: broadcast static node information
  cluster resources: add cgroup-mode to node properties
  ui: lxc/qemu: cpu edit: make cpuunits depend on node's cgroup version

 PVE/API2/Cluster.pm                | 13 +++++++++++++
 PVE/Service/pvestatd.pm            | 25 ++++++++++++++++++++++++
 www/manager6/lxc/CreateWizard.js   |  8 ++++++++
 www/manager6/lxc/ResourceEdit.js   | 31 +++++++++++++++++++++++++-----
 www/manager6/lxc/Resources.js      |  8 +++++++-
 www/manager6/qemu/CreateWizard.js  |  8 ++++++++
 www/manager6/qemu/HardwareView.js  |  8 +++++++-
 www/manager6/qemu/ProcessorEdit.js | 31 +++++++++++++++++++++++-------
 8 files changed, 118 insertions(+), 14 deletions(-)


pve-cluster:

Fiona Ebner (1):
  datacenter config: add cluster resource scheduling (crs) options

 data/PVE/DataCenterConfig.pm | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)


pve-ha-manager:

Fiona Ebner (11):
  env: add get_static_node_stats() method
  resources: add get_static_stats() method
  add Usage base plugin and Usage::Basic plugin
  manager: select service node: add $sid to parameters
  manager: online node usage: switch to Usage::Basic plugin
  usage: add Usage::Static plugin
  env: add get_crs_settings() method
  manager: set resource scheduler mode upon init
  manager: use static resource scheduler when configured
  manager: avoid scoring nodes if maintenance fallback node is valid
  manager: avoid scoring nodes when not trying next and current node is
    valid

 debian/pve-ha-manager.install |   3 +
 src/PVE/HA/Env.pm             |  13 ++++
 src/PVE/HA/Env/PVE2.pm        |  29 +++++++++
 src/PVE/HA/Makefile           |   3 +-
 src/PVE/HA/Manager.pm         |  77 ++++++++++++++---------
 src/PVE/HA/Resources.pm       |   5 ++
 src/PVE/HA/Resources/PVECT.pm |  11 ++++
 src/PVE/HA/Resources/PVEVM.pm |  14 +++++
 src/PVE/HA/Sim/Env.pm         |   9 +++
 src/PVE/HA/Sim/TestEnv.pm     |   6 ++
 src/PVE/HA/Usage.pm           |  50 +++++++++++++++
 src/PVE/HA/Usage/Basic.pm     |  52 ++++++++++++++++
 src/PVE/HA/Usage/Makefile     |   6 ++
 src/PVE/HA/Usage/Static.pm    | 114 ++++++++++++++++++++++++++++++++++
 src/test/test_failover1.pl    |  21 ++++---
 15 files changed, 374 insertions(+), 39 deletions(-)
 create mode 100644 src/PVE/HA/Usage.pm
 create mode 100644 src/PVE/HA/Usage/Basic.pm
 create mode 100644 src/PVE/HA/Usage/Makefile
 create mode 100644 src/PVE/HA/Usage/Static.pm


pve-docs:

Fiona Ebner (1):
  ha: add section about scheduler modes

 ha-manager.adoc | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

-- 
2.30.2





Thread overview: 41+ messages
2022-11-10 14:37 Fiona Ebner [this message]
2022-11-10 14:37 ` [pve-devel] [PATCH proxmox-resource-scheduling 1/3] initial commit Fiona Ebner
2022-11-15 10:15   ` [pve-devel] applied: " Wolfgang Bumiller
2022-11-15 15:39   ` [pve-devel] " DERUMIER, Alexandre
2022-11-16  9:09     ` Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH proxmox-resource-scheduling 2/3] add pve_static module Fiona Ebner
2022-11-16  9:18   ` Thomas Lamprecht
2022-11-10 14:37 ` [pve-devel] [PATCH proxmox-resource-scheduling 3/3] add Debian packaging Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH proxmox-perl-rs 1/2] pve-rs: add resource scheduling module Fiona Ebner
2022-11-15 10:16   ` [pve-devel] applied-series: " Wolfgang Bumiller
2022-11-10 14:37 ` [pve-devel] [PATCH proxmox-perl-rs 2/2] add basic test for resource scheduling Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH manager 1/3] pvestatd: broadcast static node information Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH v3 manager 2/3] cluster resources: add cgroup-mode to node properties Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH v2 manager 3/3] ui: lxc/qemu: cpu edit: make cpuunits depend on node's cgroup version Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH cluster 1/1] datacenter config: add cluster resource scheduling (crs) options Fiona Ebner
2022-11-17 11:52   ` [pve-devel] applied: " Thomas Lamprecht
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 01/11] env: add get_static_node_stats() method Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 02/11] resources: add get_static_stats() method Fiona Ebner
2022-11-15 13:28   ` Thomas Lamprecht
2022-11-16  8:46     ` Fiona Ebner
2022-11-16  8:59       ` Thomas Lamprecht
2022-11-16 12:38       ` DERUMIER, Alexandre
2022-11-16 12:52         ` Thomas Lamprecht
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 03/11] add Usage base plugin and Usage::Basic plugin Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 04/11] manager: select service node: add $sid to parameters Fiona Ebner
2022-11-16  7:17   ` Thomas Lamprecht
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 05/11] manager: online node usage: switch to Usage::Basic plugin Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 06/11] usage: add Usage::Static plugin Fiona Ebner
2022-11-15 15:55   ` DERUMIER, Alexandre
2022-11-16  9:10     ` Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 07/11] env: add get_crs_settings() method Fiona Ebner
2022-11-16  7:05   ` Thomas Lamprecht
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 08/11] manager: set resource scheduler mode upon init Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 09/11] manager: use static resource scheduler when configured Fiona Ebner
2022-11-11  9:28   ` Fiona Ebner
2022-11-16  7:14     ` Thomas Lamprecht
2022-11-16  9:37       ` Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 10/11] manager: avoid scoring nodes if maintenance fallback node is valid Fiona Ebner
2022-11-10 14:37 ` [pve-devel] [PATCH ha-manager 11/11] manager: avoid scoring nodes when not trying next and current " Fiona Ebner
2022-11-10 14:38 ` [pve-devel] [PATCH docs 1/1] ha: add section about scheduler modes Fiona Ebner
2022-11-15 13:12 ` [pve-devel] partially-applied: [PATCH-SERIES proxmox-resource-scheduling/pve-ha-manager/etc] add static usage scheduler for HA manager Thomas Lamprecht
