From: Daniel Kral
To: f.gruenbichler@proxmox.com
Date: Thu, 19 Dec 2024 16:59:43 +0100
Message-Id: <20241219155943.182322-1-d.kral@proxmox.com>
In-Reply-To: <20240416122054.733817-1-f.gruenbichler@proxmox.com>
References: <20240416122054.733817-1-f.gruenbichler@proxmox.com>
Subject: Re: [pve-devel] [PATCH v2 qemu-server/pve-container 0/19] pool resource limits
Cc: pve-devel@lists.proxmox.com

On 16/04/2024 14:20, Fabian Grünbichler wrote:
> high level description:
>
> VM/CT vmstatus returns new fields for configured and running "usage"
> values, these are then broadcasted by pvestatd on each node via KV.
>
> helpers in guest-common to check those limits
> pool API returns limits and usage, and allows setting the limits
>
> qemu-server/pve-container try to check actions against those limits.
>
> since the broadcast is async, there is always an opportunity to cheat by
> racing against the broadcast. this is unavoidable unless we want to
> serialize all usage affecting tasks across the cluster..
>
> changelog since v1/RFC:
> - GUI can edit limits now
> - incorporated most feedback from Wolfgang and Dominik
>
> potential follow-ups:
> - disk limits/usage
> - RRD/metrics support (or even switching entirely to RRD based
>   broadcasting instead of KV)
> - timeout of broadcasted data if node goes offline/stops broadcasting
> - better GUI
> - ... ?

I've tried my hand at this patch series. I've added a few notes around some patches and inlined them directly.
Hope the notes are helpful, as the feature seems very promising, especially when an administrator wants to be able to allocate specific resources to different users' guests.

---

I've tested this patch series by creating a few resource pools, adding/removing some VMs and containers, and trying out different ways to start them (start on boot, file-restore, normal start) to see whether these are correctly limited.

It worked fine for containers as far as I have tested, but I had some trouble with VMs because their memory does not seem to be calculated correctly; I've mentioned that inline. I've also tried the same thing with different cores/sockets values, and that worked as expected: when a start would have exceeded the pool's CPU count limit, the start failed as expected.

Another thing I took a closer look at was hotplugging memory and CPU cores, which also worked as expected, even though I wasn't sure how to hit all the new codepaths correctly, as I pointed out inline.

All in all, with the two bug fixes I mentioned inline, this patch series looks good to me so far!
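For readers following along, the check I was exercising (a start failing once the pool's configured CPU count would be exceeded) can be sketched roughly as follows. This is a hypothetical Python illustration of the logic only, not the series' actual code: the real helpers are Perl (in PVE::GuestHelpers), the usage values come from the async pvestatd KV broadcast, and all names below are made up for the sketch.

```python
# Hedged sketch: sum the configured usage of all other pool members,
# add the usage the guest would have after the action, and refuse the
# action if any configured pool limit would be exceeded.
# A limit of None means "unlimited" for that resource.

def check_pool_limits(members, limits, vmid, new_usage):
    """Raise RuntimeError if the action on `vmid` would exceed a pool limit.

    members:   {vmid: {"cpu": ..., "mem": ...}} usage currently counted per guest
    limits:    {"cpu": int | None, "mem": int | None} configured pool limits
    new_usage: the usage `vmid` would have after the start/update
    """
    for key, limit in limits.items():
        if limit is None:
            continue  # no limit configured for this resource
        # usage of all *other* members, plus this guest's prospective usage
        total = sum(u[key] for v, u in members.items() if v != vmid)
        total += new_usage[key]
        if total > limit:
            raise RuntimeError(
                f"pool limit exceeded: {key} usage {total} > limit {limit}"
            )

# Example: pool CPU limit of 8 cores, VM 100 already uses 4.
members = {100: {"cpu": 4, "mem": 8}, 101: {"cpu": 0, "mem": 0}}
limits = {"cpu": 8, "mem": None}

# Starting VM 101 with 2 sockets x 2 cores: 4 + 4 = 8 <= 8, allowed.
check_pool_limits(members, limits, 101, {"cpu": 4, "mem": 4})

# Starting it with 6 cores instead: 4 + 6 = 10 > 8, rejected.
try:
    check_pool_limits(members, limits, 101, {"cpu": 6, "mem": 4})
except RuntimeError as e:
    print(e)  # pool limit exceeded: cpu usage 10 > limit 8
```

Note that because the usage data is broadcast asynchronously, such a check can always race against a stale broadcast, as the cover letter points out.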
I've added some notes about applying the series below.

---

Since this patch series is a few months old, here's a quick overview of the base commits I applied it on. Everything went fine except a single conflict that was very easy to fix:

=== applying patches ===

01-01: pve-access-control
-------------------------

01 applied cleanly on 4465786 ("api, auth: fix two typos in user-visible description")

02-08: pve-container
--------------------

02 applied cleanly on a39c5b0 ("bump version to 5.2.2")

03 applied with fuzz:

Patch failed at 0002 status: add pool usage fields
patching file src/PVE/LXC.pm
Hunk #1 succeeded at 146 with fuzz 2 (offset 8 lines).
Hunk #2 succeeded at 208 (offset 36 lines).
Hunk #3 succeeded at 264 (offset 42 lines).
Hunk #4 succeeded at 272 (offset 42 lines).
Hunk #5 succeeded at 297 (offset 42 lines).

04-07 applied cleanly.

08 applied with fuzz:

Patch failed at 0007 update: handle pool limits
patching file src/PVE/API2/LXC/Config.pm
Hunk #1 succeeded at 208 with fuzz 2.

09-09: pve-guest-common
-----------------------

09 applied with fuzz:

Patch failed at 0001 helpers: add pool limit/usage helpers
patching file src/PVE/GuestHelpers.pm
Hunk #1 succeeded at 450 with fuzz 1 (offset 34 lines).

10-13: pve-manager
------------------

10 applied cleanly on ead665d5 ("ui: ceph pool: add columns for application")

11 applied partially with fuzz:

Patch failed at 0002 pvestatd: collect and broadcast pool usage
patching file PVE/Service/pvestatd.pm
Hunk #1 FAILED at 231.
Hunk #2 succeeded at 245 (offset 3 lines).
Hunk #3 succeeded at 283 with fuzz 1 (offset 5 lines).
Hunk #4 FAILED at 466.
Hunk #5 succeeded at 480 (offset 5 lines).
Hunk #6 succeeded at 592 with fuzz 1 (offset 11 lines).
Hunk #7 FAILED at 593.

Merged the rejected
changes of 11 to:

```gitdiff
diff --git a/PVE/Service/pvestatd.pm b/PVE/Service/pvestatd.pm
index c1d5bda9..fac2c37a 100755
--- a/PVE/Service/pvestatd.pm
+++ b/PVE/Service/pvestatd.pm
@@ -234,7 +234,7 @@ sub auto_balloning {
 }
 
 sub update_qemu_status {
-    my ($status_cfg, $pull_txn) = @_;
+    my ($status_cfg, $pull_txn, $pool_membership, $pool_usage) = @_;
 
     my $ctime = time();
     my $vmstatus = PVE::QemuServer::vmstatus(undef, 1);
@@ -471,7 +471,7 @@ sub rebalance_lxc_containers {
 }
 
 sub update_lxc_status {
-    my ($status_cfg, $pull_txn) = @_;
+    my ($status_cfg, $pull_txn, $pool_membership, $pool_usage) = @_;
 
     my $ctime = time();
     my $vmstatus = PVE::LXC::vmstatus();
@@ -604,17 +604,23 @@ sub update_status {
     syslog('err', "node status update error: $err") if $err;
 
     eval {
-	update_qemu_status($status_cfg, $pull_txn);
+	update_qemu_status($status_cfg, $pull_txn, $pool_membership, $pool_usage);
     };
     $err = $@;
     syslog('err', "qemu status update error: $err") if $err;
 
     eval {
-	update_lxc_status($status_cfg, $pull_txn);
+	update_lxc_status($status_cfg, $pull_txn, $pool_membership, $pool_usage);
     };
     $err = $@;
     syslog('err', "lxc status update error: $err") if $err;
 
+    eval {
+	update_pool_usage($pool_usage);
+    };
+    $err = $@;
+    syslog('err', "pool usage status update error: $err") if $err;
+
     eval {
 	rebalance_lxc_containers();
     };
```

12-13 applied cleanly.

14-19: qemu-server
------------------

14 applied cleanly on 45fedd47 ("fix #5980: importdisk: fix spurious warning")

15 applied with warning:

Applying: vmstatus: add usage values for pool limits
.git/rebase-apply/patch:58: trailing whitespace.

warning: 1 line adds whitespace errors.

16 applied cleanly.

17 applied with warning:

Applying: update/hotplug: handle pool limits
.git/rebase-apply/patch:19: space before tab in indent.
	my $old = $usage->{cpu};
warning: 1 line adds whitespace errors.

18-19 applied cleanly.

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel