From: Alexandre Derumier
To: pve-devel@lists.proxmox.com
Date: Tue, 21 Dec 2021 16:13:29 +0100
Message-Id: <20211221151331.623760-1-aderumier@odiso.com>
Subject: [pve-devel] [PATCH V3 pve-ha-manager 0/2] POC/RFC: ressource aware HA manager
List-Id: Proxmox VE development discussion

Hi,

this is a proof of concept to implement resource-aware HA.

The current implementation is really basic: it simply balances the number
of services on each node.

I have had real production cases where a node failed and the restarted VMs
impacted other nodes because of too much CPU/RAM usage.

Changelog v2:

- merge main code and sim code in the same patch for now
  (I'll split them later)
- clean up following all of Thomas's review comments (thanks again)
- add more comments in code
- check storage for lxc too
- use maxmem for windows vms

Changelog v3:

- fix vm/ct config read (need to specify node)
- fix storage_availability_check params
- Classify nodes with low/medium/high threshold for better balancing.
  We try to fill nodes with lower usage first until the threshold is reached.

I still need to add the missing storage availability test.

Alexandre Derumier (2):
  add ressource awareness manager
  add test-basic0

 src/PVE/HA/Env.pm                    |  33 ++++
 src/PVE/HA/Env/PVE2.pm               | 177 +++++++++++++++++
 src/PVE/HA/Manager.pm                | 274 ++++++++++++++++++++++++++-
 src/PVE/HA/Sim/Hardware.pm           |  61 ++++++
 src/PVE/HA/Sim/TestEnv.pm            |  50 ++++-
 src/test/test-basic0/README          |   1 +
 src/test/test-basic0/cmdlist         |   4 +
 src/test/test-basic0/hardware_status |   5 +
 src/test/test-basic0/log.expect      |  52 +++++
 src/test/test-basic0/manager_status  |   1 +
 src/test/test-basic0/node_stats      |   5 +
 src/test/test-basic0/service_config  |   5 +
 src/test/test-basic0/service_stats   |   5 +
 13 files changed, 664 insertions(+), 9 deletions(-)
 create mode 100644 src/test/test-basic0/README
 create mode 100644 src/test/test-basic0/cmdlist
 create mode 100644 src/test/test-basic0/hardware_status
 create mode 100644 src/test/test-basic0/log.expect
 create mode 100644 src/test/test-basic0/manager_status
 create mode 100644 src/test/test-basic0/node_stats
 create mode 100644 src/test/test-basic0/service_config
 create mode 100644 src/test/test-basic0/service_stats

-- 
2.30.2
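
For illustration only, here is a minimal Python sketch of the low/medium/high
classification mentioned in the v3 changelog. It is not the Perl code from the
patch. The threshold values, the Node fields, the "most constrained resource"
usage metric, and the select_node helper are all assumptions made up for this
example; the only ideas taken from the text above are that nodes are grouped
by usage class, that lower classes are filled first, and that the existing
manager balances on service count (used here as a tie-breaker).

```python
from dataclasses import dataclass

GiB = 1024 ** 3

# Hypothetical thresholds; the cover letter does not state the real values.
LOW_MAX = 0.5
MEDIUM_MAX = 0.8
RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Node:
    name: str
    cpu: float       # fraction of CPU in use, 0.0 to 1.0
    mem_used: int    # bytes
    mem_total: int   # bytes
    services: int    # number of HA services currently on the node


def classify(usage: float) -> str:
    """Map a usage fraction to a low/medium/high class."""
    if usage < LOW_MAX:
        return "low"
    if usage < MEDIUM_MAX:
        return "medium"
    return "high"


def select_node(nodes, svc_cpu: float, svc_mem: int):
    """Pick a recovery target for a service, or None if nothing has room.

    Nodes are ranked by the class they would fall into after receiving the
    service, so a low-usage node keeps being chosen until it would cross
    the threshold. Ties are broken by service count, then by name.
    """
    candidates = []
    for n in nodes:
        # Assumption: a node is as loaded as its most constrained resource.
        projected = max(n.cpu + svc_cpu, (n.mem_used + svc_mem) / n.mem_total)
        if projected > 1.0:
            continue  # the service would not fit on this node
        candidates.append((RANK[classify(projected)], n.services, n.name))
    return min(candidates)[2] if candidates else None


if __name__ == "__main__":
    cluster = [
        Node("node1", 0.2, 4 * GiB, 16 * GiB, services=5),
        Node("node2", 0.6, 4 * GiB, 16 * GiB, services=1),
        Node("node3", 0.1, 15 * GiB, 16 * GiB, services=0),
    ]
    print(select_node(cluster, svc_cpu=0.1, svc_mem=2 * GiB))  # node1
```

In this example a pure service-count balancer would prefer node3 or node2,
but node3 has no memory left for the service and node2 would land in the
medium class, so the sketch picks node1 despite its higher service count.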