From: "Michael Köppl" <m.koeppl@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Daniel Kral <d.kral@proxmox.com>
Subject: Re: [pve-devel] [PATCH docs/ha-manager/manager v4 00/25] HA Rules
Date: Wed, 30 Jul 2025 19:29:09 +0200
Message-ID: <25d3ef2c-01ff-4712-af81-f05d84d100f0@proxmox.com>
In-Reply-To: <20250729180107.428855-1-d.kral@proxmox.com>

Gave this version another spin today, focusing on the migration from
groups to rules. I tested this on 3-node and 5-node clusters and went
through the following scenarios:

1) At least one of the nodes in the cluster is not at the minimum
version required for migration to rules
2) At least one node offline during the attempt to migrate to rules

In both of the above cases, only the in-memory mapping of groups to
rules happens. Groups continue to work on the PVE 8 nodes and rules
continue to work on the PVE 9 nodes. It should be noted that the
nofailback flag is not inverted for the resources while the rules are
still in-memory. This "switch" from nofailback to failback only occurs
once the migration is persisted.
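
To illustrate the "switch" described above, here is a minimal Python
sketch. It is not the pve-ha-manager code (which is written in Perl).
The function name and the dict layout are hypothetical, and it only
assumes that the persisted failback value is the logical inverse of the
group's nofailback flag:

```python
def persist_failback(resource: dict, group: dict) -> dict:
    """Return a copy of `resource` with a `failback` property derived
    from the group's `nofailback` flag.

    Hypothetical helper for illustration only. Assumption: a group
    without an explicit `nofailback` flag behaves as nofailback=0.
    """
    migrated = dict(resource)  # leave the caller's dict untouched
    # The inversion only happens at this point, i.e. when the migration
    # is persisted, not while the rules exist in-memory only.
    migrated["failback"] = not group.get("nofailback", False)
    return migrated
```

A resource in a group with nofailback set would end up with failback
disabled, and a resource in a group without the flag would end up with
failback enabled.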

3) Updating the remaining PVE 8 nodes one after another

Persistent migration started soon after all nodes were upgraded to PVE 9
(there is a slight delay since the check whether groups need to be
migrated does not happen every round). It worked smoothly and I did not
notice any discrepancies in the rules.cfg generated from the groups.cfg.
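
The conditions observed in scenarios 1) to 3) can be summarized in a
small Python sketch. This is not the actual implementation. All names
and the version tuple are hypothetical. The interval of 10 rounds is
taken from the v4 changelog quoted below. The sketch assumes that the
persistent migration is attempted only when every node is online and at
least at the minimum version:

```python
from dataclasses import dataclass

# From the v4 changelog: a persistent upgrade is tried only every
# 10 HA manager rounds.
MIGRATION_CHECK_INTERVAL = 10


@dataclass
class Node:
    name: str
    online: bool
    version: tuple[int, int, int]  # hypothetical package version tuple


def can_persist_migration(nodes, min_version, round_no):
    """Return True if a persistent groups-to-rules migration may be
    attempted in this round (hypothetical helper for illustration)."""
    # Skip rounds in which the check is not scheduled at all.
    if round_no % MIGRATION_CHECK_INTERVAL != 0:
        return False
    # Every node must be online and new enough. Otherwise only the
    # in-memory mapping of groups to rules is used.
    return all(n.online and n.version >= min_version for n in nodes)
```

With one node offline or still below the minimum version, the function
returns False in every round, which matches the in-memory-only
behaviour seen in scenarios 1) and 2).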

4) Migration with non-existent groups in resources.cfg
5) Invalid properties in resources.cfg or groups.cfg
6) Partially upgrading the cluster, editing a rule on a PVE 9 node

In scenario 6, the edit will not persist. This is not unexpected, since
the rules exist only in-memory at this point, but users should probably
be warned about making any changes to rules mid-upgrade.

Dano already incorporated feedback from Hannes' and my tests and we also
tested updated versions that fix the problems that we noticed. I'm just
documenting it here for the sake of completeness. Overall, the migration
from groups to rules worked very well in the cases where migration was
already possible, and did not proceed (and provided informative errors
or warnings) where it was not.


On 7/29/25 20:03, Daniel Kral wrote:
> Here's a quick update on the core HA rules series. This cleans up the
> series so that all tests are running again and includes the missing ui
> patch that I didn't see missing last time.
> 
> The persistent migration path has been tested for at least four full
> upgrade runs now, always with one node being behind and checking that
> the group config is only removed as soon as all nodes are on the right
> version.
> 
> I'll wait for tomorrow if something comes up and will do some testing
> myself, so I'm anticipating to follow up on this tomorrow. I'll also
> want to get a more mature version of the HA resource affinity series
> ready for tomorrow on the mailing list.
> 
> For maintainers: ha-manager patch #19 should be updated to the correct
> pve-manager version that is dependent on the pve-ha-manager package
> which can interpret the HA rules config.
> 
> Changelog since v3
> ------------------
> 
> - rebased on newest available master
> 
> - included missing ui patch for web interface
> 
> - correction in failback property description (does not influence the ha
>   node affinity rules)
> 
> - migrated the groups configs in the test cases to node affinity rules
>   in rules configs (except two test cases for the persistent migration)
> 
> - improved persistent ha group migration process
> 
> - try a persistent upgrade only every 10 HA manager rounds
> 
> - various other minor touches
> 
> TODO
> ----
> 
> - More testing on edge cases for the HA Manager migration path
> 
> - Some more testing of the ha-manager CLI and adding a deprecation
>   warning on the HA Groups API and disallowing requests as soon as the
>   groups config is fully migrated





Thread overview: 28+ messages
2025-07-29 18:00 Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 01/19] tree-wide: make arguments for select_service_node explicit Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 02/19] manager: improve signature of select_service_node Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 03/19] introduce rules base plugin Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 04/19] rules: introduce node affinity rule plugin Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 05/19] config, env, hw: add rules read and parse methods Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 06/19] config: delete services from rules if services are deleted from config Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 07/19] manager: read and update rules config Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 08/19] test: ha tester: add test cases for future node affinity rules Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 09/19] resources: introduce failback property in ha resource config Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 10/19] manager: migrate ha groups to node affinity rules in-memory Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 11/19] manager: apply node affinity rules when selecting service nodes Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 12/19] test: add test cases for rules config Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 13/19] api: introduce ha rules api endpoints Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 14/19] cli: expose ha rules api endpoints to ha-manager cli Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 15/19] sim: do not create default groups for test cases Daniel Kral
2025-07-30 10:01   ` Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 16/19] test: ha tester: migrate groups to service and rules config Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 17/19] test: ha tester: replace any reference to groups with node affinity rules Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 18/19] env: add property delete for update_service_config Daniel Kral
2025-07-29 18:00 ` [pve-devel] [PATCH ha-manager v4 19/19] manager: persistently migrate ha groups to ha rules Daniel Kral
2025-07-29 18:01 ` [pve-devel] [PATCH docs v4 1/2] ha: add documentation about ha rules and ha node affinity rules Daniel Kral
2025-07-29 18:01 ` [pve-devel] [PATCH docs v4 2/2] ha: crs: add effects of ha node affinity rule on the crs scheduler Daniel Kral
2025-07-29 18:01 ` [pve-devel] [PATCH manager v4 1/4] api: ha: add ha rules api endpoints Daniel Kral
2025-07-29 18:01 ` [pve-devel] [PATCH manager v4 2/4] ui: ha: remove ha groups from ha resource components Daniel Kral
2025-07-29 18:01 ` [pve-devel] [PATCH manager v4 3/4] ui: ha: show failback flag in resources status view Daniel Kral
2025-07-29 18:01 ` [pve-devel] [PATCH manager v4 4/4] ui: ha: replace ha groups with ha node affinity rules Daniel Kral
2025-07-30 17:29 ` Michael Köppl [this message]