From: Daniel Kral <d.kral@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>, "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
Date: Tue, 1 Apr 2025 11:39:24 +0200
Message-ID: <498c09ec-662b-451b-a4a8-0aa51bb575df@proxmox.com>
In-Reply-To: <50c71b96d6cd509783b51c7ad87b94ff200ad78e.camel@groupe-cyllene.com>
References: <20250325151254.193177-1-d.kral@proxmox.com> <50c71b96d6cd509783b51c7ad87b94ff200ad78e.camel@groupe-cyllene.com>
Subject: Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules

On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> my 2cents, but everybody in the industry is calling this
> affinity/anti-affinity (vmware, nutanix, hyperv, openstack, ...).
> More precisely, vm affinity rules (vm<->vm) vs node affinity rules
> (vm->node, the current HA group)
>
> Personally I don't care, it's just a name ^_^ .
>
> But I have a lot of customers asking about "does proxmox support
> affinity/anti-affinity", and if they are doing their own research, they
> will think that it doesn't exist.
> (or at minimum, write somewhere in the doc something like "aka vm
> affinity" or in commercial presentation ^_^)

I see your point, and I also called it affinity/anti-affinity before.
But if we go for the HA Rules route here, it'd be really neat to have
"Location Rules" and "Colocation Rules" coexist in the end and clearly
show the distinction between them, as both are affinity rules, at least
to me.
I'd definitely make sure that the release notes and documentation make
clear that this adds the ability to assign affinity between services,
but let's wait for some other comments on this ;).

On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> More serious question: I haven't read all the code yet, but how does
> it play with the current topsis placement algorithm?

I currently implemented the colocation rules as a constraint on which
nodes the manager can select for the to-be-migrated service.

So if users use the static load scheduler (or the basic / service count
scheduler, for that matter), the colocation rules just make sure that no
recovery node is selected that contradicts the colocation rules. The
TOPSIS algorithm itself isn't changed at all.

There are two things that should/could be changed in the future (besides
the many future ideas that I already pointed out):

- (1) the schedulers still consider all online nodes, i.e. even though
  HA groups and/or colocation rules restrict the allowed nodes in the
  end, the calculation is done for all nodes, which could be significant
  for larger clusters, and

- (2) the services are (generally) recovered one-by-one in a best-fit
  fashion, i.e. there's no ordering by the services' needed resources,
  etc. There could be some edge cases (e.g. think about a failing node
  with a bunch of services to be kept together; these should now be
  migrated to the same node, if possible, or be put on the minimum
  number of nodes), where the algorithm could find better solutions if
  it orders the to-be-recovered services, and/or if the utilization
  scheduler knows about the 'keep together' colocations and considers
  these (and all subsets) as a single service.
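
To make the "constraint on the selectable nodes" idea above a bit more
concrete, here is a minimal Python sketch. It is not the ha-manager
code: the names ColocationRule, allowed_nodes, select_node and the score
callback are made up for illustration, every rule is treated as strict
(no MUST/SHOULD distinction), and "lower score is better" is an
assumption. The point is only that the rules shrink the candidate set
while the scoring function stays untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColocationRule:
    """Hypothetical rule: keep the listed services together or apart."""
    services: frozenset
    together: bool  # True = keep together, False = keep apart


def allowed_nodes(service, online_nodes, placement, rules):
    """Return the online nodes where `service` violates no rule.

    `placement` maps already-placed services to their current node.
    Peers that sit on a node that is not online are ignored.
    """
    online = set(online_nodes)
    allowed = set(online)
    for rule in rules:
        if service not in rule.services:
            continue
        peer_nodes = {
            placement[s]
            for s in rule.services
            if s != service and placement.get(s) in online
        }
        if rule.together:
            # Only restrict if at least one peer is already placed.
            if peer_nodes:
                allowed &= peer_nodes
        else:
            allowed -= peer_nodes
    return allowed


def select_node(service, online_nodes, placement, rules, score):
    """Pick the best-scoring allowed node, or None if none is allowed.

    `score` stands in for whatever the scheduler computes per node;
    it is not modified by the rules, only its inputs are narrowed.
    """
    candidates = allowed_nodes(service, online_nodes, placement, rules)
    if not candidates:
        return None
    # sorted() makes ties deterministic.
    return min(sorted(candidates), key=score)
```

With vm:101 kept apart from vm:102 (on node2) and kept together with
vm:103 (on node3), only node3 remains selectable for vm:101, even if
node2 would have had the better score.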

For the latter (making the scheduler aware of 'keep together'
colocations), the complexity explodes a bit and is harder to test, which
is why I've gone for the current implementation. It also reduces the
burden on users to think about what could happen with a specific set of
rules, and it already allows the notion of MUST/SHOULD. This gives
enough flexibility to improve the decision making of the scheduler in
the future.

On 4/1/25 03:50, DERUMIER, Alexandre wrote:
> Small feature request from students && customers: a lot of them are
> asking to be able to use vm tags in the colocation/affinity

Good idea! We were thinking about this too and I forgot to add it to the
list, thanks for bringing it up again!

Yes, the idea would be to make pools and tags available as selectors for
rules here, so that rules can be changed rather dynamically by just
adding a tag to a service.

The only thing we have to consider here is that HA rules go through a
verification phase, and invalid rules will be dropped or modified to
make them applicable. These external changes must also be detected
somehow in the HA stack, as I want to keep the number of runs through
the verification code to a minimum, i.e. only when the configuration is
changed by the user.

But that will be a discussion for another series ;).