public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Fiona Ebner <f.ebner@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Alexandre Derumier <aderumier@odiso.com>
Subject: Re: [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support
Date: Wed, 22 Feb 2023 16:19:49 +0100	[thread overview]
Message-ID: <9fff5f59-da9a-505c-9427-9320a02a72fb@proxmox.com> (raw)
In-Reply-To: <20230213120021.3783742-15-aderumier@odiso.com>

Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 1b1c99d..bf4e92a 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -3,6 +3,8 @@ package PVE::QemuServer::Memory;
>  use strict;
>  use warnings;
>  
> +use POSIX qw(ceil);
> +
>  use PVE::JSONSchema;
>  use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
>  use PVE::Exception qw(raise raise_param_exc);
> @@ -16,6 +18,7 @@ our @EXPORT_OK = qw(
>  get_current_memory
>  parse_memory
>  get_host_max_mem
> +get_virtiomem_block_size
>  );
>  
>  my $MAX_NUMA = 8;
> @@ -37,6 +40,12 @@ our $memory_fmt = {
>  	maximum => 4194304,
>  	format => 'pve-qm-memory-max',
>      },
> +    virtio => {
> +	description => "Enable virtio-mem memory (Experimental: Only works with Linux guest with kernel >= 5.10)",

Nit: How about "Use virtio-mem devices for hotplug (Experimental: ...)",
then people immediately know it's for hotplug.

> +	type => 'boolean',
> +	optional => 1,
> +	default => 0,
> +    },
>  };
>  
>  PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
> @@ -72,7 +81,9 @@ my sub get_static_mem {
>      my $static_memory = 0;
>      my $memory = parse_memory($conf->{memory});
>  
> -    if ($memory->{max}) {
> +    if ($memory->{virtio}) {
> +	$static_memory = 4096;
> +    } elsif ($memory->{max}) {
>  	my $dimm_size = $memory->{max} / $MAX_SLOTS;
>  	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
>  	$static_memory = $dimm_size * $sockets;
> @@ -161,6 +172,117 @@ sub get_current_memory {
>      return $memory->{current};
>  }
>  
> +sub get_virtiomem_block_size {
> +    my ($conf) = @_;
> +
> +    my $sockets = $conf->{sockets} || 1;
> +    my $MAX_MEM = get_max_mem($conf);
> +    my $static_memory = get_static_mem($conf, $sockets);

Nit: Not making a difference with the current implemenetation, but this
should pass 1 for hotplug (we only use the virtio-mem devices for hotplug).

> +    my $memory = get_current_memory($conf->{memory});
> +
> +    #virtiomem can map 32000 block size.
> +    #try to use lowest blocksize, lower = more chance to unplug memory.
> +    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
> +    #2MB is the minimum to be aligned with THP
> +    $blocksize = 2 if $blocksize < 2;
> +    $blocksize = 2**(ceil(log($blocksize)/log(2)));
> +    #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
> +    $blocksize = 4 if $blocksize < 4;
> +
> +    return $blocksize;
> +}
> +
> +my sub get_virtiomem_total_current_size {
> +    my ($mems) = @_;
> +    my $size = 0;
> +    for my $mem (values %$mems) {
> +	$size += $mem->{current};
> +    }
> +    return $size;
> +}
> +
> +my sub balance_virtiomem {
> +    my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
> +
> +    my $nb_virtiomem = scalar(keys %$virtiomems);
> +
> +    print"try to balance memory on $nb_virtiomem virtiomems\n";
> +
> +    #if we can't share exactly the same amount, we add the remainder on last node
> +    my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
> +    my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
> +
> +    my $i = 0;
> +    foreach my $id (sort keys %$virtiomems) {
> +	my $virtiomem = $virtiomems->{$id};
> +	$i++;
> +	my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
> +	$virtiomem->{completed} = 0;
> +	$virtiomem->{retry} = 0;
> +	$virtiomem->{target} = $virtiomem_target;
> +
> +	print "virtiomem$id: set-requested-size : $virtiomem_target\n";
> +	mon_cmd($vmid, 'qom-set', 
> +		path => "/machine/peripheral/virtiomem$id", 
> +		property => "requested-size", 
> +		value => $virtiomem_target * 1024 * 1024);

Style nit: trailing spaces and should really put each argument on its
own line, with mon_cmd( and the final ) on their own line too.

> +    }
> +
> +    my $total_finished = 0;
> +    my $error = undef;
> +
> +    while ($total_finished != $nb_virtiomem) {
> +
> +	sleep 1;
> +
> +	$total_finished = 0;
> +
> +	foreach my $id (keys %$virtiomems) {
> +
> +	    my $virtiomem = $virtiomems->{$id};
> +
> +	    if ($virtiomem->{error} || $virtiomem->{completed}) {
> +		$total_finished++;
> +		next;
> +	    }
> +
> +	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
> +	    $virtiomem->{current} = $size / 1024 / 1024;
> +	    print"virtiomem$id: last: $virtiomem->{last} current: $virtiomem->{current} target: $virtiomem->{target}\n";

[0] marker so I can reference this message below :)

> +
> +	    if($virtiomem->{current} == $virtiomem->{target}) {
> +		print"virtiomem$id: completed\n";
> +		$virtiomem->{completed} = 1;
> +		next;
> +	    }
> +
> +	    if($virtiomem->{current} != $virtiomem->{last}) {
> +		#if value has changed, but not yet completed
> +		print "virtiomem$id: changed but don't not reach target yet\n";

"don't not" is wrong. But do we really need this print? I feel like the
above[0] is already enough. It already contains the information about
last and current.

> +		$virtiomem->{retry} = 0;
> +		$virtiomem->{last} = $virtiomem->{current};
> +		next;
> +	    }
> +
> +	    if($virtiomem->{retry} >= 5) {
> +		print "virtiomem$id: too many retry. set error\n";

s/retry/retries/
But I'd also change the message to be a bit more telling to users, "set
error" could mean anything. Maybe something like: "virtiomem$id: target
memory still not reached, ignoring device from now on"?

> +		$virtiomem->{error} = 1;
> +		$error = 1;
> +		#as change is async, we don't want that value change after the api call
> +		eval {
> +		    mon_cmd($vmid, 'qom-set', 
> +			    path => "/machine/peripheral/virtiomem$id", 
> +			    property => "requested-size", 
> +			    value => $virtiomem->{current} * 1024 *1024);
> +		};
> +	    }
> +	    print"virtiomem$id: increase retry: $virtiomem->{retry}\n";

Maybe add output the retry counter in the message [0] to avoid output bloat?

> +	    $virtiomem->{retry}++;
> +	}
> +    }
> +    die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
> +}
> +
>  sub get_numa_node_list {
>      my ($conf) = @_;
>      my @numa_map;
> @@ -247,7 +369,39 @@ sub qemu_memory_hotplug {
>      my $MAX_MEM = get_max_mem($conf);
>      die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
>  
> -    if ($value > $memory) {
> +    my $confmem = parse_memory($conf->{memory});

This is already $oldmem, no need for this second variable.




  reply	other threads:[~2023-02-22 15:19 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 01/16] memory: extract some code to their own sub for mocking Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 02/16] tests: add memory tests Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 03/16] memory: refactor sockets Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 04/16] memory: remove calls to parse_hotplug_features Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 05/16] add memory parser Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 06/16] memory: add get_static_mem Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 07/16] memory: use static_memory in foreach_dimm Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option Alexandre Derumier
2023-02-22 15:18   ` Fiona Ebner
2023-02-23  7:35     ` DERUMIER, Alexandre
2023-02-23  7:44       ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max Alexandre Derumier
2023-02-22 15:19   ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
2023-02-22 15:19   ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
2023-02-22 15:19   ` Fiona Ebner
2023-02-23  8:38     ` DERUMIER, Alexandre
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 12/16] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 13/16] test: add memory-max tests Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support Alexandre Derumier
2023-02-22 15:19   ` Fiona Ebner [this message]
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry Alexandre Derumier
2023-02-22 15:19   ` Fiona Ebner
     [not found]     ` <00eab4f6356c760a55182497eb0ad0bac57bdcb4.camel@groupe-cyllene.com>
2023-02-24  7:12       ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 16/16] tests: add virtio-mem tests Alexandre Derumier
2023-02-15 13:42 ` [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Fiona Ebner
2023-02-16 12:35   ` Fiona Ebner
2023-02-27 14:04     ` Thomas Lamprecht
2023-02-28  7:35       ` Fiona Ebner
2023-02-22 15:25 ` [pve-devel] " Fiona Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9fff5f59-da9a-505c-9427-9320a02a72fb@proxmox.com \
    --to=f.ebner@proxmox.com \
    --cc=aderumier@odiso.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal