From: Thomas Lamprecht
To: Dominik Csapak, Proxmox Backup Server development discussion, Wolfgang Bumiller
Date: Tue, 8 Nov 2022 10:19:20 +0100
Message-ID: <7d56a05f-be11-586c-f5af-f6fbd2cd2d4e@proxmox.com>
References: <20221031113953.3111599-1-d.csapak@proxmox.com> <20221031113953.3111599-4-d.csapak@proxmox.com> <20221107123538.i355vavyncl5edm7@casey.proxmox.com> <36968ac0-73c1-466a-e122-485a2598d14b@proxmox.com>
Subject: Re: [pbs-devel] applied: [RFC PATCH proxmox-backup 2/2] file-restore: dynamically increase memory of vm for zpools

On 08/11/2022 at 08:59, Dominik Csapak wrote:
> On 11/7/22 18:15, Thomas Lamprecht wrote:
>> On 07/11/2022 at 13:35, Wolfgang Bumiller wrote:
>>> applied
>>
>> meh, can we please get this opt-in and only enable it for root@pam or for users
>> with some powerful priv on /, as was the approach chosen to allow more memory the
>> last time this came up (off list IIRC)... I really do *not* want the memory DOS
>> potential increased by a lot just by opening some file-restore tabs; this actually
>> should get more restrictions (for common "non powerful" users), not fewer..
>
> understandable, so i can do that, but maybe it's time we rethink the file-restore
> mechanism as a whole, since it's currently rather unergonomic:

IMO the current approach is very ergonomic, as the user doesn't have to care at all
about details, and I'd like to keep it that way as much as possible.

While we may need to add something, every extra setting or extra workload for
users/admins should be kept to a minimum.
>
> * users don't know how many and which file-restore vms are running, they
>   may not even know it starts a vm at all

that isn't an issue; the user wants file restores, not to care about managing
the details.

> * regardless of with/without my patch, the only thing necessary to start a bunch
>   of vms is VM.Backup on the vm and Datastore.AllocateSpace on the storage
>   (which in turn is probably enough to create an arbitrary amount of backups)
> * having arbitrarily sized disks/fs inside, no fixed amount we give will always
>   be enough
>
> so here are some proposals on how to improve that (we won't implement all of them
> simultaneously, but maybe something from that list is usable)
> * make the list of running file-restore vms visible, and maybe add a manual 'shutdown'

the VM lives for a minute and can be spawned in milliseconds, so I don't really see
the benefit of that; a misguided/bad/... user can spawn such VMs much faster than
one can stop them, and the syslog already contains basic info.

> * limit the amount of restore vms per user (or per vm?)
>   - this would need the mechanism from above anyway, since otherwise either the user
>     cannot start the restore vm or we abort an older vm (with a possibly running
>     download operation)

which mechanism? The list of running VMs is only required in the backend, and we
can already find that info out there.

> * make the vm memory configurable (per user/vm/globally?)

meh, but this will be one of the better mechanisms, with the priv/unpriv split and
different limits enforced via a cgroup (see below)

> * limit the global memory usage for file restore vms
>   - again needs some control mechanism for stopping/seeing these vms

Not really, it just needs a cgroup limiting them all. Unpriv user A won't be able
to stop the file-restore VM of unpriv user B anyway, so this is not really gaining
anything - you cannot expect that an admin is around all the time.

> * throw away the automatic starting of vms, and make it explicit, i.e.

NACK, worsens UX a lot without any real benefit; the VM will still be started in
the end, and whatever limiting mechanism we choose could be applied without any
dialogue anyway..

>   make the user start/shutdown a vm manually
>   - we can have some 'configuration panel' before starting (like with restore)
>   - the user is aware it's starting
>   - still needs some mechanism to see them, but with a manual starting api call
>     it's easier to have e.g. a running worker that can be stopped
>

As said weeks ago, the simple stop gap for now is to just opt into more possible
memory once we see a sufficient privilege (e.g., VM.Allocate and/or Sys.Modify on /
as a starter) for a user; that does away with the basic woes for now (rough sketch
further below). Sure, a priv and an unpriv user may get to "open" the same restore
VM, and depending on the order it will then be able to use more or fewer resources,
but that won't matter much in practice, and the same is true for the debug flag.

As a small addition, with possibly nice effects in practice:
add a "Finish" button to the file-restore window that actively sends a signal to
the VM on press, which then reduces the idle shutdown timer to ~3s (to avoid
breaking other currently running restores or another open session). That way you
don't need extra lists and management, as most of the time it happens indirectly
just fine.
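
To illustrate how such a "Finish" signal could be handled in the in-guest restore
daemon, here is a minimal, self-contained sketch - the names, the 60s/3s timings
and the signal plumbing are assumptions for illustration, not the actual
proxmox-restore-daemon code:

// Sketch: keep an idle shutdown deadline and shrink it to a few seconds when
// the UI asks to finish, instead of killing the VM immediately.
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

// normal idle timeout (~1 minute) while a restore session is open (assumption)
static SHUTDOWN_AFTER_SECS: AtomicU64 = AtomicU64::new(60);

// called on every list/download request to re-arm the idle timer
fn watchdog_ping(last_activity: &mut Instant) {
    *last_activity = Instant::now();
}

// called when the hypothetical "Finish" signal arrives: keep the VM around
// just long enough for in-flight requests to complete
fn request_finish() {
    SHUTDOWN_AFTER_SECS.store(3, Ordering::Relaxed);
}

fn main() {
    let mut last_activity = Instant::now();
    watchdog_ping(&mut last_activity);

    request_finish(); // e.g. triggered by the "Finish" button in the GUI

    loop {
        let timeout = Duration::from_secs(SHUTDOWN_AFTER_SECS.load(Ordering::Relaxed));
        if last_activity.elapsed() >= timeout {
            println!("idle timeout reached, powering off restore VM");
            break; // the real daemon would trigger a clean poweroff here
        }
        std::thread::sleep(Duration::from_millis(500));
    }
}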
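
And for the privilege-based stop gap mentioned above, a rough sketch of how the
memory choice could look in the file-restore start path - the privilege bit
constants and the memory values are made-up placeholders, not the real
proxmox-backup ACL API or defaults:

// Pick the restore VM memory based on the caller's privileges on `/`:
// privileged users may opt into more memory, everyone else keeps the old
// fixed default. Values and bit layout are illustrative only.
const PRIV_SYS_MODIFY: u64 = 1 << 0; // hypothetical bit layout
const PRIV_VM_ALLOCATE: u64 = 1 << 1; // hypothetical bit layout

fn restore_vm_memory_mib(privs_on_root: u64, dynamic_memory_requested: bool) -> u64 {
    const DEFAULT_MIB: u64 = 128;    // placeholder for the old fixed value
    const PRIVILEGED_MIB: u64 = 512; // placeholder opt-in upper bound

    let privileged = privs_on_root & (PRIV_SYS_MODIFY | PRIV_VM_ALLOCATE) != 0;
    if dynamic_memory_requested && privileged {
        PRIVILEGED_MIB
    } else {
        DEFAULT_MIB
    }
}

fn main() {
    // unprivileged user: stays at the default even if more was requested
    assert_eq!(restore_vm_memory_mib(0, true), 128);
    // user with Sys.Modify on /: may opt into the higher limit
    assert_eq!(restore_vm_memory_mib(PRIV_SYS_MODIFY, true), 512);
}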
Additionally, we can put all these VMs in a cgroup with a max memory configured -
that really could be a new setting in the node or datacenter.cfg - and it can
easily be nested, as in:

- root restore CG with a total limit; restores started by priv'd users start here
  '- user A CG with a per-unpriv-user limit
  '- user B CG with a per-unpriv-user limit

(rough cgroup-v2 sketch at the end of this mail)

That way we would not need to limit the number of restores, which we don't care
about, but the actual resource we do care about. As a side benefit, the group
could get reduced CPU shares so that scheduling prefers the "real" workload.

Another possibility would be to also evaluate the MicroVM machine type to reduce
the basic impact of the OS - tbh. I'm not sure if Stefan checked that out.

And please note that this all is mostly an artificial problem anyway: ZFS isn't
that popular inside of guests, which are the prime target for file-restore, and
especially not in the form of a huge set of disks, as redundancy et al. is
normally covered more centrally on the host. If it's rather used for snapshots,
lvm-thin or btrfs may be the better options inside guests, as they're
in-kernel-tree and one doesn't pay such a (relatively) big memory tax.

IIRC the issue even only got reported from internal testing of PVE or the like,
iow. odd setups. So please let's keep this simple and avoid breaking UX for all
the actually realistic use cases.

So please, make this opt-in on the PBS side first, then opt-in for the respective
user class (even restricting it to just root@pam can be argued for if really
unsure), else I'll need to revert this for the next bump.
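
To make the nested cgroup layout above a bit more concrete, here is a minimal
cgroup-v2 sketch - the paths, limits and user names are example assumptions and
not an existing PVE/PBS setting; it needs root, a cgroup-v2 mount on
/sys/fs/cgroup and the memory/cpu controllers enabled at the top level:

// One top-level "file-restore" group with a total memory limit and reduced
// CPU weight, plus per-unpriv-user child groups with smaller limits.
use std::fs;
use std::io;
use std::path::Path;

const CG_ROOT: &str = "/sys/fs/cgroup/file-restore"; // example path

fn write_cg(dir: &Path, file: &str, value: &str) -> io::Result<()> {
    fs::write(dir.join(file), value)
}

fn setup_restore_cgroups(unpriv_users: &[&str]) -> io::Result<()> {
    let root = Path::new(CG_ROOT);
    fs::create_dir_all(root)?;
    write_cg(root, "memory.max", "4G")?; // total limit for all restore VMs
    write_cg(root, "cpu.weight", "50")?; // lower share than the "real" workload
    write_cg(root, "cgroup.subtree_control", "+memory")?; // children may set memory.max

    // priv'd restores only get the total limit, so their leaf group stays uncapped
    let privileged = root.join("privileged");
    fs::create_dir_all(&privileged)?;
    write_cg(&privileged, "memory.max", "max")?;

    // each unpriv user gets an own child group with a smaller limit
    for user in unpriv_users {
        let child = root.join(user);
        fs::create_dir_all(&child)?;
        write_cg(&child, "memory.max", "512M")?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    setup_restore_cgroups(&["userA@pbs", "userB@pbs"])
}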