From: Alexandre Derumier
To: pve-devel@lists.proxmox.com
Date: Mon, 13 Dec 2021 08:43:13 +0100
Message-Id: <20211213074316.2565139-1-aderumier@odiso.com>
Subject: [pve-devel] [PATCH pve-ha-manager 0/3] POC/RFC: ressource aware HA manager

Hi,

this is a proof of concept for a resource-aware HA manager.

The current implementation is really basic: it simply balances the
number of services on each node.

I have had real production cases where a node failed and the restarted
VMs impacted other nodes because of excessive CPU/RAM usage.

This new implementation uses a best-fit vector packing heuristic with
constraint support.

- We compute node memory/CPU stats and VM memory/CPU stats, averaged
  over the last 20 minutes.

For each resource:

- First, we order the services pending recovery by memory, then by CPU
  usage. Memory matters more here, because a VM can't start if the
  target node doesn't have enough memory.

- Then, we check the constraints of the possible target nodes (storage
  available, node has enough CPU/RAM, node has enough cores, ...).
  This could be extended with other constraints such as VM
  affinity/anti-affinity, CPU compatibility, ...

- Then, we compute a node weight as the Euclidean distance between the
  VM usage vector and the node's available resource vector (both
  CPU/RAM). We choose the first node with the lowest Euclidean
  distance.
  (Ex: if a VM uses 1 GB RAM/1% CPU, node1 has 2 GB RAM/2% CPU
  available and node2 has 4 GB RAM/4% CPU available, node1 is chosen
  because it is the nearest to the VM usage.)

- We add the recovered VM's CPU/RAM to the target node stats.
  (This is only a best-effort estimate, as the VM start is asynchronous
  on the target LRM and could fail, ...)

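The selection steps above can be sketched roughly as follows. This is only an illustration in Python, not the Perl code from the patches: every name (Usage, order_services, fits, select_node, place_services) is hypothetical, only the CPU/RAM constraint is checked, the descending sort order is an assumption (the text only says "by memory, then CPU"), and the raw GB/percent values are used without any normalization, as in the example from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Usage:
    """A (memory, cpu) vector; units are illustrative (GB, percent)."""
    mem: float
    cpu: float


def order_services(services):
    # Assumption: biggest memory consumers first, ties broken by CPU.
    return sorted(services.items(),
                  key=lambda item: (item[1].mem, item[1].cpu),
                  reverse=True)


def fits(free, need):
    # Only the CPU/RAM constraint is modelled here; storage, cores,
    # affinity and so on would be further checks of the same shape.
    return free.mem >= need.mem and free.cpu >= need.cpu


def select_node(free_by_node, need):
    """Return the first node with the lowest Euclidean distance, or None."""
    candidates = [
        (math.hypot(free.mem - need.mem, free.cpu - need.cpu), name)
        for name, free in free_by_node.items()
        if fits(free, need)
    ]
    if not candidates:
        return None
    # min() returns the first minimal element, so on equal distances the
    # node that comes first in the dict's iteration order wins.
    return min(candidates, key=lambda c: c[0])[1]


def place_services(free_by_node, services):
    """Place each service and charge its usage to the chosen node."""
    placement = {}
    for sid, need in order_services(services):
        node = select_node(free_by_node, need)
        placement[sid] = node
        if node is not None:
            free = free_by_node[node]
            # Best-effort estimate: the real start is asynchronous and may fail.
            free_by_node[node] = Usage(free.mem - need.mem, free.cpu - need.cpu)
    return placement
```

With the values from the example above, select_node({"node1": Usage(2, 2), "node2": Usage(4, 4)}, Usage(1, 1)) picks node1, since its distance (about 1.41) is smaller than node2's (about 4.24).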
I have kept the HA group node priority and the other ordering criteria,
so this doesn't break the current tests, and we can easily add a
datacenter option to enable/disable it.

It could be easy to later implement some kind of automatic VM migration
when a node uses too much CPU/RAM, reusing the same node selection
algorithm.

I have added a basic test; I'll add more tests later if this patch
series is OK for you.

Some good literature about heuristics:

Microsoft Hyper-V implementation:
 - http://kunaltalwar.org/papers/VBPacking.pdf
 - https://www.microsoft.com/en-us/research/wp-content/uploads/2011/01/virtualization.pdf

Variable size vector bin packing heuristics:
 - https://hal.archives-ouvertes.fr/hal-00868016v2/document

Alexandre Derumier (3):
  add ressource awareness manager
  tests: add support for ressources
  add test-basic0

 src/PVE/HA/Env.pm                    |  24 +++
 src/PVE/HA/Env/PVE2.pm               |  90 ++++++++++
 src/PVE/HA/Manager.pm                | 246 ++++++++++++++++++++++++++-
 src/PVE/HA/Sim/Hardware.pm           |  61 +++++++
 src/PVE/HA/Sim/TestEnv.pm            |  36 ++++
 src/test/test-basic0/README          |   1 +
 src/test/test-basic0/cmdlist         |   4 +
 src/test/test-basic0/hardware_status |   5 +
 src/test/test-basic0/log.expect      |  52 ++++++
 src/test/test-basic0/manager_status  |   1 +
 src/test/test-basic0/node_stats      |   5 +
 src/test/test-basic0/service_config  |   5 +
 src/test/test-basic0/service_stats   |   5 +
 13 files changed, 528 insertions(+), 7 deletions(-)
 create mode 100644 src/test/test-basic0/README
 create mode 100644 src/test/test-basic0/cmdlist
 create mode 100644 src/test/test-basic0/hardware_status
 create mode 100644 src/test/test-basic0/log.expect
 create mode 100644 src/test/test-basic0/manager_status
 create mode 100644 src/test/test-basic0/node_stats
 create mode 100644 src/test/test-basic0/service_config
 create mode 100644 src/test/test-basic0/service_stats

-- 
2.30.2