Message-ID: <8edb8da5-04b4-ed25-c56f-9626a2ee259f@proxmox.com>
Date: Wed, 22 Feb 2023 16:19:55 +0100
From: Fiona Ebner <f.ebner@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Alexandre Derumier <aderumier@odiso.com>
References: <20230213120021.3783742-1-aderumier@odiso.com>
 <20230213120021.3783742-16-aderumier@odiso.com>
In-Reply-To: <20230213120021.3783742-16-aderumier@odiso.com>
Subject: Re: [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem :
 implement redispatch retry.

On 13.02.23 13:00, Alexandre Derumier wrote:
> If some memory can't be removed on a specific node,
> we try to rebalance again on other nodes
> 
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer/Memory.pm | 51 +++++++++++++++++++++++++++-------------
>  1 file changed, 35 insertions(+), 16 deletions(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index bf4e92a..f02b4e0 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -201,13 +201,28 @@ my sub get_virtiomem_total_current_size {
>      return $size;
>  }
>  
> +my sub get_virtiomem_total_errors_size {
> +    my ($mems) = @_;
> +
> +    my $size = 0;
> +    for my $mem (values %$mems) {
> +	next if !$mem->{error};
> +	$size += $mem->{current};
> +    }
> +    return $size;
> +}
> +
>  my sub balance_virtiomem {
>      my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
>  
> -    my $nb_virtiomem = scalar(keys %$virtiomems);
> +    my $nb_virtiomem = scalar(grep { !$_->{error} } values $virtiomems->%*);
>  
>      print"try to balance memory on $nb_virtiomem virtiomems\n";
>  
> +    die "No more available blocks in virtiomem to balance all requested memory\n"
> +	if $target_total < 0;

I feel like this message is a bit confusing. This can only happen on
unplug, right? Reading that "no more blocks are available" then sounds
like a paradox: it's rather that no more blocks can be unplugged.
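
Maybe something like this instead (the wording is just a rough
suggestion):

    die "Cannot unplug enough virtiomem blocks to reach the requested memory amount\n"
	if $target_total < 0;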

If we really want to, we could set a negative $target_total to 0 (best
to do that at the call site already) and still try to unplug everything
else? We won't reach the goal anymore, but we could still get closer to
it in some cases. That would need a bit more adaptation to avoid an
endless loop: we'd also need to stop once all devices reached their
current goal in a round (and no new errors appeared), e.g.
balance_virtiomem() could just return that info (see the rough sketch
after the example below).

Example:
> update VM 101: -memory 4100,max=65536,virtio=1
> try to balance memory on 2 virtiomems
> virtiomem0: set-requested-size : 0
> virtiomem1: set-requested-size : 4
> virtiomem1: last: 4 current: 4 target: 4
> virtiomem1: completed
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 0
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 1
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 2
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 3
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 4
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: too many retry. set error
> virtiomem0: increase retry: 5

Currently it stops here, but with $target_total set to 0, it continues...

> try to balance memory on 1 virtiomems
> virtiomem1: set-requested-size : 0
> virtiomem1: last: 4 current: 0 target: 0
> virtiomem1: completed

...and gets closer to the goal...

> try to balance memory on 1 virtiomems
> virtiomem1: set-requested-size : 0
> virtiomem1: last: 4 current: 0 target: 0
> virtiomem1: completed
> try to balance memory on 1 virtiomems
> virtiomem1: set-requested-size : 0
> virtiomem1: last: 4 current: 0 target: 0
> virtiomem1: completed

...but then it loops, because I didn't add the other stop condition yet
;). But I'm not sure, it's likely too much magic.
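
For reference, a rough sketch of the caller loop I have in mind. This
is hypothetical code, not the actual hunk: $requested_total and the
loop shape are made up for illustration, and balance_virtiomem() is
assumed to return whether the round finished cleanly:

    # clamp a negative target, so we still unplug whatever we can
    my $target_total = $requested_total - get_virtiomem_total_errors_size($virtiomems);
    $target_total = 0 if $target_total < 0;

    while (1) {
	# assumed return value: 1 if every non-errored device reached its
	# goal this round and no new errors appeared, 0 otherwise
	last if balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);

	# account for newly errored devices and clamp again
	$target_total = $requested_total - get_virtiomem_total_errors_size($virtiomems);
	$target_total = 0 if $target_total < 0;
    }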

> +    die "No more available virtiomem to balance the remaining memory\n" if $nb_virtiomem == 0;

"No more virtiomem devices left to try to ..." might be a bit clearer.
Technically, they are still available, we just ignore them because they
don't reach the target in time.
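
I.e. something like this (completing the elided part is just my
suggestion):

    die "No more virtiomem devices left to try to balance the remaining memory\n"
	if $nb_virtiomem == 0;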

> +
>      #if we can't share exactly the same amount, we add the remainder on last node
>      my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
>      my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));