From: "Michael Köppl" <m.koeppl@proxmox.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>
Cc: "pve-devel" <pve-devel-bounces@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH ha-manager v3 00/21] HA rules fixes + performance improvements + cleanup
Date: Mon, 03 Nov 2025 18:43:08 +0100
Message-ID: <DDZ8URVCKQBZ.2KS7D0BJH473R@proxmox.com>
In-Reply-To: <20251103102118.153666-1-d.kral@proxmox.com>
Gave v3 another spin after having reviewed v1. I again repeated the
following scenarios:
- Checked behavior with ignored resources (i.e. that ignored resources
  are not shown as dependent resources when migrating).
- Checked that conflicts between positive resource affinity rules and
  node affinity rules are now detected correctly in the cases that
  weren't detected before.
- Checked various combinations of node affinity and resource affinity
  rules, also checking the failback flag, max. restart, and max.
  relocate params. Could not create any configurations that are
  problematic.
- Ran group migrations again (also making them fail on purpose and
  checking that the migration keeps being retried) to check that the
  changes to the counting of active services did not alter behavior.
Since this is now based on the granular accounting series [0], there is
a problem with 'ignored' resources, which I noted separately on that
patch series [1].
Other than that I did not notice anything off. Everything seems to work
as expected. Also had a look at the parts of the code that did not have
my R-b anymore. My comments on v1 have been addressed in v2 already and
the remaining changes in v3 lgtm. I think the addition of the benchmark
on 10/21 is nice!
Please consider this:
Reviewed-by: Michael Köppl <m.koeppl@proxmox.com>
Tested-by: Michael Köppl <m.koeppl@proxmox.com>
[0] https://lore.proxmox.com/pve-devel/20251027164513.542678-1-d.kral@proxmox.com/
[1] https://lore.proxmox.com/pve-devel/20251027164513.542678-1-d.kral@proxmox.com/t/#m8db5069cad93a9f81c3e3ec50af0c78681601527
On Mon Nov 3, 2025 at 11:19 AM CET, Daniel Kral wrote:
> v2: https://lore.proxmox.com/pve-devel/20250909083539.39675-1-d.kral@proxmox.com/
> v1: https://lore.proxmox.com/pve-devel/20250821143705.256562-1-d.kral@proxmox.com/
>
>
> This series is based on top of the granular accounting series [0]
>
> [0] https://lore.proxmox.com/pve-devel/20251027164513.542678-1-d.kral@proxmox.com/
>
>
> Changelog from v2 -> v3:
>
>   - rebased on top of master + granular accounting series [0], because
>     [0] is already closer to being merged and as both change some rules'
>     interfaces I had to choose one above the other
>
>   - fix and add clarifying notes to inter-rules feasibility checks
>
>   - remove usage of prototypes introduced by the original HA rules
>     series, which I had wrong assumptions about
>
>   - some minor code changes (see per-patch notes):
>       - use 5.36 for newly introduced module PVE::HA::Rules::Helpers
>       - use my sub name {} over my $name = sub {} for new private
>         subroutines / helpers
>       - change POD signature of get_node_affinity(...) too
>
>
> Ran a `git rebase master --exec 'make clean && make deb'` on the series
> and tested the changes to the Status API manually.
>
>
> I put the patches in decreasing priority:
>
> PATCH 1     fix output of get_resource_info() about dependent resources
> PATCH 2     fix retranslating ha rules on nodelist changes
> PATCH 3-5   fix wrong assumption about positive resource affinity checks
> PATCH 6-10  ha rules performance improvements + preparations
> PATCH 11-12 make test cases use to_json(...) and add compiled configs
> PATCH 13-21 various smaller cleanups related to HA rules
>
>
> Ad PATCH 1, 2, 3-5:
>
> The first few patches fix some wrong assumptions and bugs.
>
>
> Ad PATCH 3-5:
>
> During the initial HA rules implementation there were quite a few
> changes done to how rules were checked and translated, which moved the
> merging of positive resource affinity rules to the end of the pipeline.
>
> Because this was not noticed clearly enough, two checks make wrong
> assumptions about positive resource affinity rules: they assume that
> the resource sets of positive resource affinity rules are already
> disjoint from each other, even though that is only ensured at a later
> stage.
>
> As it would be rather cumbersome to interleave checking/pruning
> infeasible rules with rule transforms [0] (i.e. move the merging
> transform in between the previous resource affinity checks and the
> inter-consistency check; I tried that but it looked rather ugly), the
> best method IMO was to use the already existing helper to find these
> disjoint sets. This also provides these checks with the correct
> positive resource affinity rule ids to blame, without the extra logic
> that would have been needed for that if checks and transforms had been
> interleaved.
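
To illustrate what "finding the disjoint sets" of positive resource
affinity rules means, here is a minimal Python sketch. It is not the
Perl helper from the series: the function name, the input shape, and the
rule/resource ids are hypothetical and only show the general idea of
grouping rules whose resource sets overlap, directly or transitively,
while remembering which rule ids contributed to each merged set.

```python
def merge_positive_rules(rules):
    """Group rules whose resource sets overlap, directly or transitively.

    rules: dict mapping a (hypothetical) rule id to an iterable of resource ids.
    Returns a sorted list of (sorted rule ids, sorted resource ids) tuples,
    where the resource sets of different entries are disjoint.
    """
    merged = []  # invariant: resource sets in this list are pairwise disjoint
    for rule_id, resources in rules.items():
        ids, res = {rule_id}, set(resources)
        remaining = []
        for other_ids, other_res in merged:
            if res & other_res:
                # Overlap: absorb the existing group into the current one.
                ids |= other_ids
                res |= other_res
            else:
                remaining.append((other_ids, other_res))
        remaining.append((ids, res))
        merged = remaining
    return sorted((sorted(i), sorted(r)) for i, r in merged)


if __name__ == "__main__":
    example = {
        "rule-a": ["vm:101", "vm:102"],
        "rule-b": ["vm:102", "vm:103"],
        "rule-c": ["vm:201", "vm:202"],
    }
    for rule_ids, resources in merge_positive_rules(example):
        print(rule_ids, resources)
```

In this sketch, "rule-a" and "rule-b" share "vm:102" and end up in one
group, so a check working on the merged set can still name both rule ids
when reporting a conflict.
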
>
>
> Ad PATCH 6-10:
>
> These patches prepare and implement the compilation of HA rules when
> these are used in the HA Manager, which improves the performance of
> checking for HA rules (significant for overall performance) and of
> applying them (significant only for resources with rules).
>
>
> Ad PATCH 11-21:
>
> These patches are various smaller improvements related to the core HA
> rules feature, which are dependent on the changes above and/or are
> easier to apply in a single patch series (e.g. due to larger changes in
> a single source file):
>
> - make rules tests use to_json(...) instead of Data::Dumper
> - make rules tests also output compiled configs to document changes
> - synchronize how active LRMs are determined, which includes how it is
>   done for checking health for the HA groups migration
> - add notes about the why's and how's for feasibility checks
>
>
> Daniel Kral (21):
>   config: do not add ignored resources to dependent resources
>   manager: retranslate rules if nodes are added or removed
>   rules: factor out disjoint rules' resource set helper
>   rules: resource affinity: inter-consistency check with merged positive
>     rules
>   rules: add merged positive resource affinity info in global checks
>   rules: make rules sorting optional in foreach_rule helper
>   rename rule's canonicalize stage to transform stage
>   rules: make plugins register transformers instead of plugin_transform
>   rules: node affinity: decouple get_node_affinity helper from Usage
>     class
>   compile ha rules to a more compact representation
>   test: rules: use to_json instead of Data::Dumper for config output
>   test: rules: add compiled config output to rules config test cases
>   rules: node affinity: define node priority outside hash access
>   move minimum version check helper to ha tools
>   manager: move group migration cooldown variable into helper
>   api: status: sync active service counting with lrm's helper
>   manager: group migration: sync active service counting with lrm's
>     helper
>   factor out counting of active services into helper
>   tree-wide: remove misused function prototype declaractions
>   rules: fix documentation for inter-rules checker subroutines
>   rules: add documentation about current feasibility check
>     implementations
>
>  debian/pve-ha-manager.install                 |   1 +
>  src/PVE/API2/HA/Rules.pm                      |   1 +
>  src/PVE/API2/HA/Status.pm                     |  17 +-
>  src/PVE/HA/Config.pm                          |  18 +-
>  src/PVE/HA/HashTools.pm                       |   6 +-
>  src/PVE/HA/LRM.pm                             |  20 +-
>  src/PVE/HA/Manager.pm                         |  89 ++--
>  src/PVE/HA/NodeStatus.pm                      |  14 +
>  src/PVE/HA/Rules.pm                           | 215 +++++++---
>  src/PVE/HA/Rules/Helpers.pm                   |  77 ++++
>  src/PVE/HA/Rules/Makefile                     |   2 +-
>  src/PVE/HA/Rules/NodeAffinity.pm              |  90 ++--
>  src/PVE/HA/Rules/ResourceAffinity.pm          | 220 +++++-----
>  src/PVE/HA/Tools.pm                           |  52 +++
>  ...efaults-for-node-affinity-rules.cfg.expect | 145 ++++---
>  ...lts-for-resource-affinity-rules.cfg.expect |  87 ++--
>  ...onsistent-node-resource-affinity-rules.cfg |  19 +
>  ...nt-node-resource-affinity-rules.cfg.expect | 185 ++++++---
>  .../inconsistent-resource-affinity-rules.cfg  |  15 +
>  ...sistent-resource-affinity-rules.cfg.expect |  21 +-
>  ...egative-resource-affinity-rules.cfg.expect |  73 ++--
>  ...fective-resource-affinity-rules.cfg.expect |  18 +-
>  ...egative-resource-affinity-rules.cfg.expect | 342 +++++++++------
>  ...ositive-resource-affinity-rules.cfg.expect | 391 +++++++++++++-----
>  ...egative-resource-affinity-rules.cfg.expect | 186 +++++----
>  ...ositive-resource-affinity-rules.cfg.expect | 264 +++++++++---
>  ...ty-with-resource-affinity-rules.cfg.expect | 114 +++--
>  ...rce-refs-in-node-affinity-rules.cfg.expect | 200 ++++++---
>  src/test/test_failover1.pl                    |  15 +-
>  src/test/test_rules_config.pl                 |  14 +-
>  30 files changed, 1904 insertions(+), 1007 deletions(-)
>  create mode 100644 src/PVE/HA/Rules/Helpers.pm