From: Daniel Kral <d.kral@proxmox.com>
To: Fiona Ebner <f.ebner@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
Date: Fri, 25 Apr 2025 15:25:42 +0200
Message-ID: <500c452d-581d-4fb7-81d2-fe0f46d29fd6@proxmox.com>
In-Reply-To: <d977b382-95ab-4f51-b1ce-8268630a5e24@proxmox.com>

On 4/25/25 14:25, Fiona Ebner wrote:
> Am 25.04.25 um 10:36 schrieb Daniel Kral:
>> On 4/24/25 12:12, Fiona Ebner wrote:
>> As suggested by @Lukas off-list, I'll also try to make the check
>> selective, e.g. the user has made an infeasible change to the config
>> manually by writing to the file and then wants to create another rule.
>> Here it should ignore the infeasible rules (as they'll be dropped
>> anyway) and only check if the added rule / changed rule is infeasible.
> 
> How will you select the rule to drop? Applying the rules one-by-one to
> find a first violation?

AFAICS we could use the same helpers to check whether the rules are
feasible, and only check whether the added / updated ruleid is one that
is causing these troubles. I guess this would be a reasonable option
without duplicating code, though it would still check against the whole
config. There's surely some optimization potential here, but then we
would have a larger problem with reloading the rule configuration for
the manager anyway. For the latter, I could check at what configuration
size this becomes an actual bottleneck.

For either adding or updating a rule, we would just make the change to
the configuration in-memory and run the helper. Depending on the
result, we'd store the config or error out to the API user.
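
Roughly, as a sketch (all helper names here are made up for
illustration, not the actual ha-manager API):

    my $rules = read_rules_config();       # current on-disk rules config
    $rules->{ids}->{$ruleid} = $new_rule;  # apply the change in-memory only

    # Reuse the same feasibility helpers, but only act on errors that
    # concern the rule being added/updated here.
    my $errors = check_feasibility($rules);
    if (my $err = $errors->{$ruleid}) {
        die "rule '$ruleid' is infeasible: $err\n";
    }

    write_rules_config($rules);            # persist only if feasible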

> 
>> But as you said, it must not change the user's configuration in the end
>> as that would be very confusing to the user.
> 
> Okay, so dropping dynamically. I guess we could also disable such rules
> explicitly/mark them as being in violation with other rules somehow:
> Tri-state enabled/disabled/conflict status? Explicit field?
> 
> Something like that would make such rules easily visible and have the
> configuration better reflect the actual status.
> 
> As discussed off-list now: we can try to re-enable conflicting rules
> next time the rules are loaded.

Hm, there are three options now:

- Allowing conflicts over the create / update API and auto-resolving the 
conflicts as soon as we're able to (e.g. on the load / save where the 
rule becomes feasible again).

- Not allowing conflicts over the create / update API, but setting the
state to 'conflict' if manual changes (or other circumstances) made the
rules conflict with one another.

- Having something like the SDN config, where there's a working 
configuration and a "draft" configuration that needs to be applied. So 
conflicts are allowed in drafts, but not in working configurations.

The SDN option seems like too much for me here, but I just noticed the
similarity.

I guess one of the first two makes more sense. If there are no
arguments against this, I'd choose the second option, as we can always
allow intentional conflicts later if there's user demand or we see
other reasons for it.
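
To illustrate the second option (the 'state' property and its value
are only a working idea here, and I'm abbreviating the rule syntax):
say a manual edit leaves two colocation rules that cannot both be
satisfied; the losing rule would then be marked instead of being
silently dropped:

colocation: keep-together
     services: vm:101,vm:102
     affinity: together
     strict: 1

colocation: keep-apart
     services: vm:101,vm:102
     affinity: separate
     strict: 1
     state: conflict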

> 
>>>> The only thing that I'm unsure about this, is how we would migrate the
>>>> `nofailback` option, since this operates on the group-level. If we keep
>>>> the `<node>(:<priority>)` syntax and restrict that each service can only
>>>> be part of one location rule, it'd be easy to have the same flag. If we
>>>> go with multiple location rules per service and each having a score or
>>>> weight (for the priority), then we wouldn't be able to have this flag
>>>> anymore. I think we could keep the semantic if we move this flag to the
>>>> service config, but I'm thankful for any comments on this.
>>> My gut feeling is that going for a more direct mapping, i.e. each
>>> location rule represents one HA group, is better. The nofailback flag
>>> can still apply to a given location rule I think? For a given service,
>>> if a higher-priority node is online for any location rule the service is
>>> part of, with nofailback=0, it will get migrated to that higher-priority
>>> node. It does make sense to have a given service be part of only one
>>> location rule then though, since node priorities can conflict between
>>> rules.
>>
>> Yeah, I think this is the reasonable option too.
>>
>> I briefly discussed this with @Fabian off-list and we also agreed that
>> it would be good to keep location rules as close to a 1:1 mapping of
>> HA groups as possible and keep the nofailback per location rule, as
>> the behavior of the HA group's nofailback could still be preserved -
>> at least if there's only a single location rule per service.
>>
>> ---
>>
>> On the other hand, I'll have to take a closer look at whether we can do
>> something about the blockers when creating multiple location rules where
>> e.g. one has nofailback enabled and the other does not. As you already
>> said, they could easily conflict between rules...
>>
>> My previous idea was to make location rules as flexible as possible, so
>> that it would theoretically not matter if one writes:
>>
>> location: rule1
>>      services: vm:101
>>      nodes: node1:2,node2:1
>>      strict: 1
>> or:
>>
>> location: rule1
>>      services: vm:101
>>      nodes: node1
>>      strict: 1
>>
>> location: rule2
>>      services: vm:101
>>      nodes: node2
>>      strict: 1
>>
>> The order of which one's more important could be encoded in the order
>> in which the rules are defined (if one configures this in the config
>> directly that's easy, and I'd add an API endpoint to realize this over
>> the API/WebGUI too), or, maybe even simpler to maintain: just another
>> property.
> 
> We cannot use just the order, because a user might want to give two
> nodes the same priority. I'd also like to avoid an implicit
> order-priority mapping.

Right, good point!

> 
>> But then, the
>> nofailback would have to be either moved to some other place...
> 
>> Or it is still allowed in location rules, but either the more detailed
>> rule wins (e.g. one rule has node1 without a priority and the other does
>> have node1 with a priority)
> 
> Maybe we should prohibit multiple rules with the same service-node pair?
> Otherwise, my intuition says that all rules should be considered and the
> rule with the highest node priority should win.

Yes, I think that would make the most sense, similar to how we disallow
users to put the same two or more services into multiple negative
colocation rules.
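
E.g. something like the following would then simply be rejected, as
both rules assign a priority to the same pair (vm:101 / node1):

location: rule1
     services: vm:101
     nodes: node1:2
     strict: 1

location: rule2
     services: vm:101
     nodes: node1:1,node2:1
     strict: 1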

> 
>> or the first location rule with a specific
>> node wins and the other is ignored. But this is already confusing when
>> writing it out here...
>>
>> I'd prefer users to write the former (and make this the dynamic
>> 'canonical' form when selecting nodes), but as with colocation rules it
>> could make sense to separate them for specific reasons / use cases.
> 
> Fair point.
> 
>> And another reason why it could still make sense to go that way is to
>> allow "negative" location rules at a later point, which makes sense in
>> larger environments, where it's easier to write opt-out rules than opt-
>> in rules, so I'd like to keep that path open for the future.
> 
> We also discussed this off list: Daniel convinced me that it would be
> cleaner if the nofailback property were associated with a given
> service rather than a given location rule. And if we later support pools
> as resources, the property should be associated with (certain or all)
> services in that pool and defined in the resource config for the pool.
> 
> To avoid the double-negation with nofailback=0, it could also be renamed
> to a positive property, below called "auto-elevate", just a working name.
> 
> A small concern of mine was that this makes it impossible to have a
> service that only "auto-elevates" to a specific node with a priority,
> but not others. This is already not possible right now, and honestly,
> that would be quite strange behavior and not supporting that is unlikely
> to hurt real use cases.
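
For illustration, in the resources config this could then look
something like the following ("auto-elevate" only being the working
name from above, and the exact syntax still to be worked out):

vm: 101
     state: started
     auto-elevate: 1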

