public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem
@ 2023-02-24 12:09 Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
                   ` (11 more replies)
  0 siblings, 12 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

This patch series rework the current memory hotplug + virtiomem.

memory option now have extra options:

memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
ex: memory: current=1024,max=131072,virtio=1


for classic memory hotplug, when maxmemory is defined,
we use 64 fixed size dimm.
The max option is a multiple of 64GB.

The virtio option enable new virtio-mem support,
instead of plugging dimm, it's add/removed block inside
big dimm.
virtio-mem can use 32000 blocks, the blocksize is compute from
max memory.


Changelog v2:

update differents patches based on Fiona comments.
(I have send 2 others mails with comments not yet addressed)

Biggest change is on virtio-mem, instead of trying to have same amount of memory
on each virtiomem (some block could be unmovable and break unplug),
we try to balance/dispatch remaining block on other available virtiomems.

Also, the minimum blocksize supported by linux guest os is 4MB currently,
even if virtiomem can use 2MB on qemu side.


Changelog v3:

- cleanup from last Fiona comments
- classic hotplug : fix memory unplug
- virtio-mem: split retry logic in a second optional patch 
  (feel free to apply it later if needed)

Changelog v4:
- cleanup from last Fiona comments

Changelog v5:
- move dimm hot-unplug patches on top of the serie
- cleanup from last Fiona comments

Alexandre Derumier (12):
  memory: rename qemu_dimm_list to qemu_memdevices_list
  memory: don't use foreach_reversedimm for unplug
  add memory parser
  memory: add get_static_mem
  memory: use static_memory in foreach_dimm
  config: memory: add 'max' option
  memory: get_max_mem: use config memory max
  memory: use 64 slots && static dimm size when max is defined
  test: add memory-max tests
  memory: add virtio-mem support
  memory: virtio-mem : implement redispatch retry.
  tests: add virtio-mem tests

 PVE/API2/Qemu.pm                              |  41 +-
 PVE/QemuConfig.pm                             |   4 +-
 PVE/QemuMigrate.pm                            |   6 +-
 PVE/QemuServer.pm                             |  37 +-
 PVE/QemuServer/Helpers.pm                     |   3 +-
 PVE/QemuServer/Memory.pm                      | 500 ++++++++++++++----
 PVE/QemuServer/PCI.pm                         |   8 +
 test/cfg2cmd/memory-max-128G.conf             |  11 +
 test/cfg2cmd/memory-max-128G.conf.cmd         |  86 +++
 test/cfg2cmd/memory-max-512G.conf             |  11 +
 test/cfg2cmd/memory-max-512G.conf.cmd         |  58 ++
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  |  12 +
 .../memory-virtio-hugepages-1G.conf.cmd       |  35 ++
 test/cfg2cmd/memory-virtio-max.conf           |  11 +
 test/cfg2cmd/memory-virtio-max.conf.cmd       |  35 ++
 test/cfg2cmd/memory-virtio.conf               |  11 +
 test/cfg2cmd/memory-virtio.conf.cmd           |  35 ++
 17 files changed, 770 insertions(+), 134 deletions(-)
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-03-17 13:38   ` [pve-devel] applied: " Fiona Ebner
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

current qemu_dimm_list can return any kind of memory devices.

make it more generic, with an optionnal type device

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 8984821..7ad8fcb 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -240,7 +240,7 @@ sub qemu_memory_hotplug {
 		while (1) {
 		    eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
 		    sleep 3;
-		    my $dimm_list = qemu_dimm_list($vmid);
+		    my $dimm_list = qemu_memdevices_list($vmid, 'dimm');
 		    last if !$dimm_list->{$name};
 		    raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
 		    $retry++;
@@ -255,14 +255,14 @@ sub qemu_memory_hotplug {
     }
 }
 
-sub qemu_dimm_list {
-    my ($vmid) = @_;
+sub qemu_memdevices_list {
+    my ($vmid, $type) = @_;
 
     my $dimmarray = mon_cmd($vmid, "query-memory-devices");
     my $dimms = {};
 
     foreach my $dimm (@$dimmarray) {
-
+        next if $type && $dimm->{data}->{id} !~ /^$type(\d+)$/;
         $dimms->{$dimm->{data}->{id}}->{id} = $dimm->{data}->{id};
         $dimms->{$dimm->{data}->{id}}->{node} = $dimm->{data}->{node};
         $dimms->{$dimm->{data}->{id}}->{addr} = $dimm->{data}->{addr};
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-03-17 13:39   ` [pve-devel] applied: " Fiona Ebner
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 03/12] add memory parser Alexandre Derumier
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

simple use dimm_list() returned by qemu

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 76 +++++++++++++---------------------------
 1 file changed, 25 insertions(+), 51 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 7ad8fcb..0f4229c 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -128,38 +128,6 @@ sub foreach_dimm{
     }
 }
 
-sub foreach_reverse_dimm {
-    my ($conf, $vmid, $memory, $sockets, $func) = @_;
-
-    my $dimm_id = 253;
-    my $current_size = 0;
-    my $dimm_size = 0;
-
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
-	$current_size = 8355840;
-	$dimm_size = 131072;
-    } else {
-	$current_size = 4177920;
-	$dimm_size = 65536;
-    }
-
-    return if $current_size == $memory;
-
-    my @numa_map = get_numa_node_list($conf);
-
-    for (my $j = 0; $j < 8; $j++) {
-	for (my $i = 0; $i < 32; $i++) {
-	    my $name = "dimm${dimm_id}";
-	    $dimm_id--;
-	    my $numanode = $numa_map[(31-$i) % @numa_map];
-	    $current_size -= $dimm_size;
-	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
-	    return  $current_size if $current_size <= $memory;
-	}
-	$dimm_size /= 2;
-    }
-}
-
 sub qemu_memory_hotplug {
     my ($vmid, $conf, $defaults, $value) = @_;
 
@@ -230,29 +198,35 @@ sub qemu_memory_hotplug {
 
     } else {
 
-	foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
-	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+	my $dimms = qemu_memdevices_list($vmid, 'dimm');
 
-		return if $current_size >= $conf->{memory};
-		print "try to unplug memory dimm $name\n";
-
-		my $retry = 0;
-		while (1) {
-		    eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
-		    sleep 3;
-		    my $dimm_list = qemu_memdevices_list($vmid, 'dimm');
-		    last if !$dimm_list->{$name};
-		    raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
-		    $retry++;
-		}
+	my $current_size = $memory;
+	for my $name (sort { $dimms->{$b}->{slot} <=> $dimms->{$a}->{slot} } keys %$dimms) {
 
-		#update conf after each succesful module unplug
-		$conf->{memory} = $current_size;
+	    my $dimm_size = $dimms->{$name}->{size} / 1024 / 1024;
 
-		eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
-		PVE::QemuConfig->write_config($vmid, $conf);
-	});
+	    last if $current_size <= $value;
+
+	    print "try to unplug memory dimm $name\n";
+
+	    my $retry = 0;
+	    while (1) {
+		eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
+		sleep 3;
+		my $dimm_list = qemu_memdevices_list($vmid, 'dimm');
+		last if !$dimm_list->{$name};
+		raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
+		$retry++;
+	    }
+	    $current_size -= $dimm_size;
+	    #update conf after each succesful module unplug
+	    $conf->{memory} = $current_size;
+
+	    eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
+	    PVE::QemuConfig->write_config($vmid, $conf);
+	}
     }
+    return $conf->{memory};
 }
 
 sub qemu_memdevices_list {
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 03/12] add memory parser
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 04/12] memory: add get_static_mem Alexandre Derumier
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm          |  7 ++--
 PVE/QemuConfig.pm         |  4 +--
 PVE/QemuMigrate.pm        |  6 ++--
 PVE/QemuServer.pm         | 27 ++++++++--------
 PVE/QemuServer/Helpers.pm |  3 +-
 PVE/QemuServer/Memory.pm  | 67 ++++++++++++++++++++++++++++++++-------
 6 files changed, 79 insertions(+), 35 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 587bb22..9b80da1 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -1489,8 +1490,6 @@ my $update_vm_api  = sub {
 
     my $storecfg = PVE::Storage::config();
 
-    my $defaults = PVE::QemuServer::load_defaults();
-
     &$resolve_cdrom_alias($param);
 
     # now try to verify all parameters
@@ -1608,7 +1607,9 @@ my $update_vm_api  = sub {
 	}
 
 	if ($param->{memory} || defined($param->{balloon})) {
-	    my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
+
+	    my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
+	    my $maxmem = get_current_memory($memory);
 	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
 
 	    die "balloon value too large (must be smaller than assigned memory)\n"
diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
index 051382c..999e658 100644
--- a/PVE/QemuConfig.pm
+++ b/PVE/QemuConfig.pm
@@ -12,6 +12,7 @@ use PVE::QemuServer::Helpers;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer;
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::Storage;
 use PVE::Tools;
 use PVE::Format qw(render_bytes render_duration);
@@ -208,8 +209,7 @@ sub __snapshot_save_vmstate {
 	$target = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
     }
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $mem_size = $conf->{memory} // $defaults->{memory};
+    my $mem_size = get_current_memory($conf->{memory});
     my $driver_state_size = 500; # assume 500MB is enough to safe all driver state;
     # our savevm-start does live-save of the memory until the space left in the
     # volume is just enough for the remaining memory content + internal state
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 09cc1d8..f86fdbe 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -26,6 +26,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::Helpers qw(min_version);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer;
 
 use PVE::AbstractMigrate;
@@ -1026,7 +1027,8 @@ sub phase2_start_remote_cluster {
     my $remote_vmid = $self->{opts}->{remote}->{vmid};
 
     # like regular start but with some overhead accounted for
-    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
+    my $memory = get_current_memory($self->{vmconf}->{memory});
+    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}, $memory) + 10;
 
     my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
 
@@ -1181,7 +1183,7 @@ sub phase2 {
     $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime);
 
     # set cachesize to 10% of the total memory
-    my $memory =  $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $cachesize = int($memory * 1048576 / 10);
     $cachesize = round_powerof2($cachesize);
 
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a1a8b93..64ae722 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -53,7 +53,7 @@ use PVE::QemuServer::CGroup;
 use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options);
 use PVE::QemuServer::Drive qw(is_valid_drivename drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::PCI qw(print_pci_addr print_pcie_addr print_pcie_root_port parse_hostpci);
 use PVE::QemuServer::USB qw(parse_usb_device);
@@ -341,11 +341,9 @@ my $confdesc = {
     },
     memory => {
 	optional => 1,
-	type => 'integer',
-	description => "Amount of RAM for the VM in MiB. This is the maximum available memory when"
-	    ." you use the balloon device.",
-	minimum => 16,
-	default => 512,
+	type => 'string',
+	description => "Memory properties.",
+	format => $PVE::QemuServer::Memory::memory_fmt
     },
     balloon => {
 	optional => 1,
@@ -2948,8 +2946,7 @@ sub vmstatus {
 	$d->{cpus} = $conf->{vcpus} if $conf->{vcpus};
 
 	$d->{name} = $conf->{name} || "VM $vmid";
-	$d->{maxmem} = $conf->{memory} ? $conf->{memory}*(1024*1024)
-	    : $defaults->{memory}*(1024*1024);
+	$d->{maxmem} = get_current_memory($conf->{memory})*(1024*1024);
 
 	if ($conf->{balloon}) {
 	    $d->{balloon_min} = $conf->{balloon}*(1024*1024);
@@ -3885,7 +3882,7 @@ sub config_to_command {
     }
 
     PVE::QemuServer::Memory::config(
-	$conf, $vmid, $sockets, $cores, $defaults, $hotplug_features->{memory}, $cmd);
+	$conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
 
     push @$cmd, '-S' if $conf->{freeze};
 
@@ -5049,7 +5046,7 @@ sub vmconfig_hotplug_pending {
 		# enable balloon device is not hotpluggable
 		die "skip\n" if defined($conf->{balloon}) && $conf->{balloon} == 0;
 		# here we reset the ballooning value to memory
-		my $balloon = $conf->{memory} || $defaults->{memory};
+		my $balloon = get_current_memory($conf->{memory});
 		mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 	    } elsif ($fast_plug_option->{$opt}) {
 		# do nothing
@@ -5062,7 +5059,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
 		die "skip\n" if !$hotplug_features->{memory};
-		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults);
+		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
 	    } elsif ($opt eq 'cpulimit') {
@@ -5117,7 +5114,8 @@ sub vmconfig_hotplug_pending {
 
 		# allow manual ballooning if shares is set to zero
 		if ((defined($conf->{shares}) && ($conf->{shares} == 0))) {
-		    my $balloon = $conf->{pending}->{balloon} || $conf->{memory} || $defaults->{memory};
+		    my $memory = get_current_memory($conf->{memory});
+		    my $balloon = $conf->{pending}->{balloon} || $memory;
 		    mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 		}
 	    } elsif ($opt =~ m/^net(\d+)$/) {
@@ -5137,7 +5135,7 @@ sub vmconfig_hotplug_pending {
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
 		die "skip\n" if !$hotplug_features->{memory};
-		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $value);
+		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
 		$cgroup->change_cpu_shares($new_cpuunits);
@@ -5815,7 +5813,8 @@ sub vm_start_nolock {
 	push @$cmd, '-S';
     }
 
-    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $resume);
+    my $memory = get_current_memory($conf->{memory});
+    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $memory, $resume);
 
     my $pci_devices = {}; # host pci devices
     for (my $i = 0; $i < $PVE::QemuServer::PCI::MAX_HOSTPCI_DEVICES; $i++)  {
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index e91f906..9115d50 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -143,8 +143,7 @@ sub version_cmp {
 }
 
 sub config_aware_timeout {
-    my ($config, $is_suspended) = @_;
-    my $memory = $config->{memory};
+    my ($config, $memory, $is_suspended) = @_;
     my $timeout = 30;
 
     # Based on user reported startup time for vm with 512GiB @ 4-5 minutes
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 0f4229c..e1b811f 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -8,10 +8,42 @@ use PVE::Exception qw(raise raise_param_exc);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use base qw(Exporter);
+
+our @EXPORT_OK = qw(
+get_current_memory
+);
 
 my $MAX_NUMA = 8;
 my $STATICMEM = 1024;
 
+our $memory_fmt = {
+    current => {
+	description => "Current amount of online RAM for the VM in MiB. This is the maximum available memory when"
+	    ." you use the balloon device.",
+	type => 'integer',
+	default_key => 1,
+	minimum => 16,
+	default => 512,
+    },
+};
+
+sub print_memory {
+    my $memory = shift;
+
+    return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
+}
+
+sub parse_memory {
+    my ($value) = @_;
+
+    return { current => $memory_fmt->{current}->{default} } if !defined($value);
+
+    my $res = PVE::JSONSchema::parse_property_string($memory_fmt, $value);
+
+    return $res;
+}
+
 my $_host_bits;
 sub get_host_phys_address_bits {
     return $_host_bits if defined($_host_bits);
@@ -63,6 +95,13 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+sub get_current_memory {
+    my ($value) = @_;
+
+    my $memory = parse_memory($value);
+    return $memory->{current};
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -129,16 +168,19 @@ sub foreach_dimm{
 }
 
 sub qemu_memory_hotplug {
-    my ($vmid, $conf, $defaults, $value) = @_;
+    my ($vmid, $conf, $value) = @_;
 
     return $value if !PVE::QemuServer::check_running($vmid);
 
-    my $sockets = $conf->{sockets} || 1;
+    my $oldmem = parse_memory($conf->{memory});
+    my $newmem = parse_memory($value);
+
+    return $value if $newmem->{current} == $oldmem->{current};
 
-    my $memory = $conf->{memory} || $defaults->{memory};
-    $value = $defaults->{memory} if !$value;
-    return $value if $value == $memory;
+    my $memory = $oldmem->{current};
+    $value = $newmem->{current};
 
+    my $sockets = $conf->{sockets} || 1;
     my $static_memory = $STATICMEM;
     $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
 
@@ -153,7 +195,7 @@ sub qemu_memory_hotplug {
 	foreach_dimm($conf, $vmid, $value, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-		return if $current_size <= $conf->{memory};
+		return if $current_size <= get_current_memory($conf->{memory});
 
 		if ($conf->{hugepages}) {
 		    $numa_hostmap = get_numa_guest_to_host_map($conf) if !$numa_hostmap;
@@ -192,7 +234,8 @@ sub qemu_memory_hotplug {
 		    die $err;
 		}
 		#update conf after each succesful module hotplug
-		$conf->{memory} = $current_size;
+		$newmem->{current} = $current_size;
+		$conf->{memory} = print_memory($newmem);
 		PVE::QemuConfig->write_config($vmid, $conf);
 	});
 
@@ -220,7 +263,8 @@ sub qemu_memory_hotplug {
 	    }
 	    $current_size -= $dimm_size;
 	    #update conf after each succesful module unplug
-	    $conf->{memory} = $current_size;
+            $newmem->{current} = $current_size;
+            $conf->{memory} = print_memory($newmem);
 
 	    eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
 	    PVE::QemuConfig->write_config($vmid, $conf);
@@ -247,9 +291,9 @@ sub qemu_memdevices_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
 
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $static_memory = 0;
 
     if ($hotplug) {
@@ -470,8 +514,7 @@ sub hugepages_topology {
 
     return if !$conf->{numa};
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $static_memory = 0;
     my $sockets = $conf->{sockets} || 1;
     my $numa_custom_topology = undef;
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 04/12] memory: add get_static_mem
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (2 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 03/12] add memory parser Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 05/12] memory: use static_memory in foreach_dimm Alexandre Derumier
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index e1b811f..99b7a21 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -15,7 +15,6 @@ get_current_memory
 );
 
 my $MAX_NUMA = 8;
-my $STATICMEM = 1024;
 
 our $memory_fmt = {
     current => {
@@ -44,6 +43,22 @@ sub parse_memory {
     return $res;
 }
 
+my sub get_static_mem {
+    my ($conf, $sockets, $hotplug) = @_;
+
+    my $static_memory = 0;
+    my $memory = parse_memory($conf->{memory});
+
+    if ($hotplug) {
+	$static_memory = 1024;
+	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    } else {
+	$static_memory = $memory->{current};
+    }
+
+    return $static_memory;
+}
+
 my $_host_bits;
 sub get_host_phys_address_bits {
     return $_host_bits if defined($_host_bits);
@@ -181,8 +196,7 @@ sub qemu_memory_hotplug {
     $value = $newmem->{current};
 
     my $sockets = $conf->{sockets} || 1;
-    my $static_memory = $STATICMEM;
-    $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    my $static_memory = get_static_mem($conf, $sockets, 1);
 
     die "memory can't be lower than $static_memory MB" if $value < $static_memory;
     my $MAX_MEM = get_max_mem($conf);
@@ -294,7 +308,7 @@ sub config {
     my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
 
     my $memory = get_current_memory($conf->{memory});
-    my $static_memory = 0;
+    my $static_memory = get_static_mem($conf, $sockets, $hotplug);
 
     if ($hotplug) {
 	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
@@ -306,17 +320,11 @@ sub config {
 		if $conf->{"numa$i"};
 	}
 
-	my $sockets = $conf->{sockets} || 1;
-
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
 
     } else {
 
-	$static_memory = $memory;
 	push @$cmd, '-m', $static_memory;
     }
 
@@ -515,17 +523,10 @@ sub hugepages_topology {
     return if !$conf->{numa};
 
     my $memory = get_current_memory($conf->{memory});
-    my $static_memory = 0;
     my $sockets = $conf->{sockets} || 1;
+    my $static_memory = get_static_mem($conf, $sockets, $hotplug);
     my $numa_custom_topology = undef;
 
-    if ($hotplug) {
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-    } else {
-	$static_memory = $memory;
-    }
-
     #custom numa topology
     for (my $i = 0; $i < $MAX_NUMA; $i++) {
 	next if !$conf->{"numa$i"};
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 05/12] memory: use static_memory in foreach_dimm
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (3 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 04/12] memory: add get_static_mem Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 06/12] config: memory: add 'max' option Alexandre Derumier
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 99b7a21..014d19d 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -151,17 +151,15 @@ sub get_numa_guest_to_host_map {
 }
 
 sub foreach_dimm{
-    my ($conf, $vmid, $memory, $sockets, $func) = @_;
+    my ($conf, $vmid, $memory, $static_memory, $func) = @_;
 
     my $dimm_id = 0;
-    my $current_size = 0;
+    my $current_size = $static_memory;
     my $dimm_size = 0;
 
     if($conf->{hugepages} && $conf->{hugepages} == 1024) {
-	$current_size = 1024 * $sockets;
 	$dimm_size = 1024;
     } else {
-	$current_size = 1024;
 	$dimm_size = 512;
     }
 
@@ -206,7 +204,7 @@ sub qemu_memory_hotplug {
 
 	my $numa_hostmap;
 
-	foreach_dimm($conf, $vmid, $value, $sockets, sub {
+	foreach_dimm($conf, $vmid, $value, $static_memory, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 		return if $current_size <= get_current_memory($conf->{memory});
@@ -393,7 +391,7 @@ sub config {
     }
 
     if ($hotplug) {
-	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
+	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
@@ -559,7 +557,7 @@ sub hugepages_topology {
     if ($hotplug) {
 	my $numa_hostmap = get_numa_guest_to_host_map($conf);
 
-	foreach_dimm($conf, undef, $memory, $sockets, sub {
+	foreach_dimm($conf, undef, $memory, $static_memory, sub {
 	    my ($conf, undef, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 	    $numanode = $numa_hostmap->{$numanode};
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 06/12] config: memory: add 'max' option
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (4 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 05/12] memory: use static_memory in foreach_dimm Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 07/12] memory: get_max_mem: use config memory max Alexandre Derumier
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

max can be multiple of 64GiB only,
The dimm size is compute from the max memory

we can have 64 slots:

64GiB = 64 slots x 1GiB
128GiB = 64 slots x 2GiB
..
4TiB = 64 slots x 64GiB

Also, with numa, we need to share slot between (up to 8) sockets.

64 is a multiple of 8,

64GiB = 8 sockets * 8 slots * 1GiB
128GiB = 8 sockets * 8 slots * 2GiB
...

and with virtio-mem,
we can have 32000 blocks of 2MiB minimum

64GB = 32000 * 2MiB

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        |  4 ++--
 PVE/QemuServer/Memory.pm | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 64ae722..0461845 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5058,7 +5058,7 @@ sub vmconfig_hotplug_pending {
 		vm_deviceunplug($vmid, $conf, $opt);
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
-		die "skip\n" if !$hotplug_features->{memory};
+		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf);
 		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
@@ -5134,7 +5134,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_update_disk($storecfg, $conf, $hotplug_features->{disk},
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
-		die "skip\n" if !$hotplug_features->{memory};
+		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf, $value);
 		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 014d19d..4d79503 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
 use strict;
 use warnings;
 
-use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 use PVE::Exception qw(raise raise_param_exc);
+use PVE::GuestHelpers qw(safe_num_ne);
+use PVE::JSONSchema;
+use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
@@ -25,8 +27,28 @@ our $memory_fmt = {
 	minimum => 16,
 	default => 512,
     },
+    max => {
+	description => "Maximum hotpluggable memory for the VM in MiB.",
+	type => 'integer',
+	optional => 1,
+	minimum => 65536,
+	maximum => 4194304,
+	format => 'pve-qm-memory-max',
+    },
 };
 
+PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
+sub verify_qm_memory_max {
+    my ($max, $noerr) = @_;
+
+    if ($max && $max % 65536 != 0) {
+	return undef if $noerr;
+	die "max memory need to be a multiple of 64GiB\n";
+    }
+
+    return $max;
+}
+
 sub print_memory {
     my $memory = shift;
 
@@ -285,6 +307,19 @@ sub qemu_memory_hotplug {
     return $conf->{memory};
 }
 
+sub can_hotplug {
+    my ($hotplug, $conf, $value) = @_;
+
+    return if !$hotplug;
+
+    my $oldmem = parse_memory($conf->{memory});
+    my $newmem = parse_memory($value);
+
+    return if safe_num_ne($newmem->{max}, $oldmem->{max});
+
+    return 1;
+}
+
 sub qemu_memdevices_list {
     my ($vmid, $type) = @_;
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 07/12] memory: get_max_mem: use config memory max
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (5 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 06/12] config: memory: add 'max' option Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 08/12] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

verify than defined vm memorymax is not bigger than
host cpu supported memory

Add add early check in update vm api

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm         | 35 +++++++++++++++++++++++++----------
 PVE/QemuServer/Memory.pm | 19 ++++++++++++++++++-
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 9b80da1..10e1940 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -476,6 +476,29 @@ my $create_disks = sub {
     return ($vollist, $res);
 };
 
+my $check_memory_param = sub {
+    my ($conf, $param) = @_;
+
+    my $mem = parse_memory($param->{memory});
+    my $host_max_mem = get_host_max_mem($conf);
+
+    die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+	if $mem->{max} && $mem->{max} > $host_max_mem;
+
+    die "current memory value too large (must be smaller than max memory)\n"
+	if $mem->{max} && $mem->{current} > $mem->{max};
+
+    if ($param->{memory} || defined($param->{balloon})) {
+
+	my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
+	my $current = get_current_memory($memory);
+	my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
+
+	die "balloon value too large (must be smaller than current assigned memory)\n"
+	    if $balloon && $balloon > $current;
+    }
+};
+
 my $check_cpu_model_access = sub {
     my ($rpcenv, $authuser, $new, $existing) = @_;
 
@@ -1606,15 +1629,7 @@ my $update_vm_api  = sub {
 	    }
 	}
 
-	if ($param->{memory} || defined($param->{balloon})) {
-
-	    my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
-	    my $maxmem = get_current_memory($memory);
-	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
-
-	    die "balloon value too large (must be smaller than assigned memory)\n"
-		if $balloon && $balloon > $maxmem;
-	}
+	&$check_memory_param($conf, $param);
 
 	PVE::Cluster::log_msg('info', $authuser, "update VM $vmid: " . join (' ', @paramarr));
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 4d79503..624ff5d 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -14,6 +14,8 @@ use base qw(Exporter);
 
 our @EXPORT_OK = qw(
 get_current_memory
+parse_memory
+get_host_max_mem
 );
 
 my $MAX_NUMA = 8;
@@ -98,7 +100,7 @@ sub get_host_phys_address_bits {
     return; # undef, cannot really do anything..
 }
 
-my sub get_max_mem {
+sub get_host_max_mem {
     my ($conf) = @_;
 
     my $cpu = {};
@@ -132,6 +134,21 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+my sub get_max_mem {
+    my ($conf) = @_;
+
+    my $host_max_mem = get_host_max_mem($conf);
+    my $mem = parse_memory($conf->{memory});
+
+    if ($mem->{max}) {
+	die "configured memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+	    if $mem->{max} > $host_max_mem;
+	return $mem->{max};
+    }
+
+    return $host_max_mem;
+}
+
 sub get_current_memory {
     my ($value) = @_;
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 08/12] memory: use 64 slots && static dimm size when max is defined
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (6 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 07/12] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 09/12] test: add memory-max tests Alexandre Derumier
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

default kernel vhost config only support 64 slots by default,
for performance since 2015.

Original memory hotplug code was done before, using qemu
max supported 255 slots.

To reach max mem (4TB), we used incremental dimm size.

Instead of dynamic memory size, use 1 static dimm size, compute
from max memory/64.

We also force a static memory to 4GB, as memory under 4G don't
works fine with unplug

Fix:
https://bugzilla.proxmox.com/show_bug.cgi?id=3446
https://bugzilla.proxmox.com/show_bug.cgi?id=1426
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 624ff5d..7f9d887 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -19,6 +19,7 @@ get_host_max_mem
 );
 
 my $MAX_NUMA = 8;
+my $MAX_SLOTS = 64;
 
 our $memory_fmt = {
     current => {
@@ -73,7 +74,13 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = parse_memory($conf->{memory});
 
-    if ($hotplug) {
+    if ($memory->{max}) {
+	my $dimm_size = $memory->{max} / $MAX_SLOTS;
+	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
+	$static_memory = $dimm_size * $sockets;
+	$static_memory = 4096 if $static_memory < 4096;
+    } elsif ($hotplug) {
+	#legacy
 	$static_memory = 1024;
 	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
     } else {
@@ -196,7 +203,10 @@ sub foreach_dimm{
     my $current_size = $static_memory;
     my $dimm_size = 0;
 
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
+    my $confmem = parse_memory($conf->{memory});
+    if ($confmem->{max}) {
+	$dimm_size = $confmem->{max} / $MAX_SLOTS;
+    } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
 	$dimm_size = 1024;
     } else {
 	$dimm_size = 512;
@@ -215,7 +225,7 @@ sub foreach_dimm{
 	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
 	    return  $current_size if $current_size >= $memory;
 	}
-	$dimm_size *= 2;
+	$dimm_size *= 2 if !$confmem->{max};
     }
 }
 
@@ -371,7 +381,9 @@ sub config {
 	}
 
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
-	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
+	my $confmem = parse_memory($conf->{memory});
+	my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
+	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
 
     } else {
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 09/12] test: add memory-max tests
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (7 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 08/12] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 10/12] memory: add virtio-mem support Alexandre Derumier
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-max-128G.conf     | 11 ++++
 test/cfg2cmd/memory-max-128G.conf.cmd | 86 +++++++++++++++++++++++++++
 test/cfg2cmd/memory-max-512G.conf     | 11 ++++
 test/cfg2cmd/memory-max-512G.conf.cmd | 58 ++++++++++++++++++
 4 files changed, 166 insertions(+)
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd

diff --git a/test/cfg2cmd/memory-max-128G.conf b/test/cfg2cmd/memory-max-128G.conf
new file mode 100644
index 0000000..8f0328d
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 64G with 2 socket, dimm size should be 1GB && static memory should be 4G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-128G.conf.cmd b/test/cfg2cmd/memory-max-128G.conf.cmd
new file mode 100644
index 0000000..5b22027
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf.cmd
@@ -0,0 +1,86 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,slots=64,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=1024M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=1024M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=1024M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=1024M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=1024M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=1024M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=1024M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=1024M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=1024M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=1024M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=1024M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=1024M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=1024M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=1024M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm14,size=1024M' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm15,size=1024M' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm16,size=1024M' \
+  -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm17,size=1024M' \
+  -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm18,size=1024M' \
+  -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm19,size=1024M' \
+  -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm20,size=1024M' \
+  -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm21,size=1024M' \
+  -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm22,size=1024M' \
+  -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm23,size=1024M' \
+  -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm24,size=1024M' \
+  -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm25,size=1024M' \
+  -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm26,size=1024M' \
+  -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm27,size=1024M' \
+  -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf b/test/cfg2cmd/memory-max-512G.conf
new file mode 100644
index 0000000..082f75e
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 512GB with 2 socket, dimm size should be 8GB && static memory should be 16G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 131072,max=524288
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf.cmd b/test/cfg2cmd/memory-max-512G.conf.cmd
new file mode 100644
index 0000000..3c04b70
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf.cmd
@@ -0,0 +1,58 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=16384,slots=64,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=8192M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=8192M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=8192M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=8192M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=8192M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=8192M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=8192M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=8192M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=8192M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=8192M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=8192M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=8192M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=8192M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=8192M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=8192M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=8192M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 10/12] memory: add virtio-mem support
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (8 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 09/12] test: add memory-max tests Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 11/12] memory: virtio-mem : implement redispatch retry Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 12/12] tests: add virtio-mem tests Alexandre Derumier
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

a 4GiB static memory is needed for DMA+boot memory, as this memory
is almost always un-unpluggeable.

1 virtio-mem pci device is setup for each numa node on pci.4 bridge

virtio-mem use a fixed blocksize with 32000 blocks
Blocksize is computed from the maxmemory-4096/32000 with a minimum of
2MiB to map THP.
(lower blocksize = more chance to unplug memory).

Note: Currently, linux only support 4MiB virtio blocksize, 2MiB support
is currently is progress.

For hotplug/unplug, we are try to allocate/unallocate same amount
of memory aligned to the blocksize on each numa node if possible.
If a node a not able to reach the target memory (could be an unmovable
page on unplug for example), we try again to redispatch memory the
remaining memory on others nodes.

About hugepages:

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=2949
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm         |   9 +-
 PVE/QemuServer.pm        |   8 +-
 PVE/QemuServer/Memory.pm | 219 ++++++++++++++++++++++++++++++++++++---
 PVE/QemuServer/PCI.pm    |   8 ++
 4 files changed, 226 insertions(+), 18 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 10e1940..3118d65 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem get_virtiomem_block_size);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -488,6 +488,13 @@ my $check_memory_param = sub {
     die "current memory value too large (must be smaller than max memory)\n"
 	if $mem->{max} && $mem->{current} > $mem->{max};
 
+    if ($mem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	die "memory need to be a multiple of $blocksize MiB when virtiomem is enabled\n"
+	    if $mem->{current} % $blocksize != 0;
+    }
+
     if ($param->{memory} || defined($param->{balloon})) {
 
 	my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 0461845..8626397 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3881,8 +3881,14 @@ sub config_to_command {
 	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
     }
 
+    my $mem_devices = {};
     PVE::QemuServer::Memory::config(
-	$conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
+	$conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd, $mem_devices);
+
+    foreach my $id (sort keys %$mem_devices) {
+	my $pciaddr = print_pci_addr($id, $bridges, $arch, $machine_type);
+	push @$devices, "-device", "$mem_devices->{$id}$pciaddr";
+    }
 
     push @$cmd, '-S' if $conf->{freeze};
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 7f9d887..3bcdd7c 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
 use strict;
 use warnings;
 
+use POSIX qw(ceil);
+
 use PVE::Exception qw(raise raise_param_exc);
-use PVE::GuestHelpers qw(safe_num_ne);
+use PVE::GuestHelpers qw(safe_boolean_ne safe_num_ne);
 use PVE::JSONSchema;
 use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 
@@ -16,6 +18,7 @@ our @EXPORT_OK = qw(
 get_current_memory
 parse_memory
 get_host_max_mem
+get_virtiomem_block_size
 );
 
 my $MAX_NUMA = 8;
@@ -38,6 +41,12 @@ our $memory_fmt = {
 	maximum => 4194304,
 	format => 'pve-qm-memory-max',
     },
+    virtio => {
+	description => "Use virtio-mem devices for hotplug (Experimental: Only works with Linux guest with kernel >= 5.10)",
+	type => 'boolean',
+	optional => 1,
+	default => 0,
+    },
 };
 
 PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
@@ -74,7 +83,9 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = parse_memory($conf->{memory});
 
-    if ($memory->{max}) {
+    if ($memory->{virtio}) {
+	$static_memory = 4096;
+    } elsif ($memory->{max}) {
 	my $dimm_size = $memory->{max} / $MAX_SLOTS;
 	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
 	$static_memory = $dimm_size * $sockets;
@@ -163,6 +174,121 @@ sub get_current_memory {
     return $memory->{current};
 }
 
+sub get_virtiomem_block_size {
+    my ($conf) = @_;
+
+    my $sockets = $conf->{sockets} || 1;
+    my $MAX_MEM = get_max_mem($conf);
+    my $static_memory = get_static_mem($conf, $sockets, 1);
+    my $memory = get_current_memory($conf->{memory});
+
+    #virtiomem can map 32000 block size.
+    #try to use lowest blocksize, lower = more chance to unplug memory.
+    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
+    #2MB is the minimum to be aligned with THP
+    $blocksize = 2 if $blocksize < 2;
+    $blocksize = 2**(ceil(log($blocksize)/log(2)));
+    #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
+    $blocksize = 4 if $blocksize < 4;
+
+    return $blocksize;
+}
+
+my sub get_virtiomem_total_current_size {
+    my ($mems) = @_;
+    my $size = 0;
+    for my $mem (values %$mems) {
+	$size += $mem->{current};
+    }
+    return $size;
+}
+
+my sub balance_virtiomem {
+    my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
+
+    my $nb_virtiomem = scalar(keys %$virtiomems);
+
+    print"try to balance memory on $nb_virtiomem virtiomems\n";
+
+    #if we can't share exactly the same amount, we add the remainder on last node
+    my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
+    my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
+
+    my $i = 0;
+    foreach my $id (sort keys %$virtiomems) {
+	my $virtiomem = $virtiomems->{$id};
+	$i++;
+	my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
+	$virtiomem->{completed} = 0;
+	$virtiomem->{retry} = 0;
+	$virtiomem->{target} = $virtiomem_target;
+
+	print "virtiomem$id: set-requested-size : $virtiomem_target\n";
+	mon_cmd(
+	    $vmid,
+	    'qom-set',
+	    path => "/machine/peripheral/virtiomem$id",
+	    property => "requested-size",
+	    value => $virtiomem_target * 1024 * 1024
+	);
+    }
+
+    my $total_finished = 0;
+    my $error = undef;
+
+    while ($total_finished != $nb_virtiomem) {
+
+	sleep 1;
+
+	$total_finished = 0;
+
+	foreach my $id (keys %$virtiomems) {
+
+	    my $virtiomem = $virtiomems->{$id};
+
+	    if ($virtiomem->{error} || $virtiomem->{completed}) {
+		$total_finished++;
+		next;
+	    }
+
+	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
+	    $virtiomem->{current} = $size / 1024 / 1024;
+	    print"virtiomem$id: last: $virtiomem->{last} current: $virtiomem->{current} target: $virtiomem->{target} retry: $virtiomem->{retry}\n";
+
+	    if($virtiomem->{current} == $virtiomem->{target}) {
+		print"virtiomem$id: completed\n";
+		$virtiomem->{completed} = 1;
+		next;
+	    }
+
+	    if($virtiomem->{current} != $virtiomem->{last}) {
+		#if value has changed, but not yet completed
+		$virtiomem->{retry} = 0;
+		$virtiomem->{last} = $virtiomem->{current};
+		next;
+	    }
+
+	    if($virtiomem->{retry} >= 5) {
+		print "virtiomem$id: target memory still not reached, ignoring device from now on\n";
+		$virtiomem->{error} = 1;
+		$error = 1;
+		#as change is async, we don't want that value change after the api call
+		eval {
+		    mon_cmd(
+			$vmid,
+			'qom-set',
+			path => "/machine/peripheral/virtiomem$id",
+			property => "requested-size",
+			value => $virtiomem->{current} * 1024 *1024
+		    );
+		};
+	    }
+	    $virtiomem->{retry}++;
+	}
+    }
+    die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -249,7 +375,37 @@ sub qemu_memory_hotplug {
     my $MAX_MEM = get_max_mem($conf);
     die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
 
-    if ($value > $memory) {
+    if ($oldmem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	my $virtiomems = {};
+
+	for (my $i = 0; $i < $sockets; $i++) {
+	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$i", property => "size");
+	    $size = $size / 1024 /1024;
+	    $virtiomems->{$i} = {
+		current => $size,
+		last => $size,
+		error => 0,
+		completed => 0,
+		retry => 0
+	    };
+	}
+
+	my $target_total = $value - $static_memory;
+	my $err;
+	eval {
+	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
+	};
+	$err = $@ if $@;
+
+	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+	$newmem->{current} = $current_memory;
+	$conf->{memory} = print_memory($newmem);
+	PVE::QemuConfig->write_config($vmid, $conf);
+	die $err if $err;
+
+    } elsif ($value > $memory) {
 
 	my $numa_hostmap;
 
@@ -342,8 +498,8 @@ sub can_hotplug {
     my $oldmem = parse_memory($conf->{memory});
     my $newmem = parse_memory($value);
 
-    return if safe_num_ne($newmem->{max}, $oldmem->{max});
-
+    return if (safe_num_ne($newmem->{max}, $oldmem->{max}) || 
+	       safe_boolean_ne($newmem->{virtio}, $oldmem->{virtio}));
     return 1;
 }
 
@@ -365,7 +521,7 @@ sub qemu_memdevices_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd, $mem_devices) = @_;
 
     my $memory = get_current_memory($conf->{memory});
     my $static_memory = get_static_mem($conf, $sockets, $hotplug);
@@ -383,7 +539,10 @@ sub config {
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	my $confmem = parse_memory($conf->{memory});
 	my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
-	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
+	my $cmdstr = "size=${static_memory}";
+	$cmdstr .= ",slots=$slots" if !$confmem->{'virtio'};
+	$cmdstr .= ",maxmem=${MAX_MEM}M";
+	push @$cmd, '-m', $cmdstr;
 
     } else {
 
@@ -455,17 +614,42 @@ sub config {
     }
 
     if ($hotplug) {
-	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
-	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+	my $confmem = parse_memory($conf->{memory});
 
-	    push @$cmd, "-object" , $mem_object;
-	    push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+	if ($confmem->{'virtio'}) {
 
-	    die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
-		if $current_size > $memory;
-	});
+	    my $MAX_MEM = get_max_mem($conf);
+	    my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
+	    my $node_mem = ($memory - $static_memory) / $sockets;
+	    my $blocksize = get_virtiomem_block_size($conf);
+
+	    die "memory need to be a multiple of $blocksize MiB with maxmemory $MAX_MEM MiB when virtiomem is enabled\n"
+		if $memory % $blocksize != 0;
+
+	    for (my $i = 0; $i < $sockets; $i++)  {
+
+		my $id = "virtiomem$i";
+		my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
+		push @$cmd, "-object" , "$mem_object,reserve=off";
+
+		my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i";
+		$mem_device .= ",prealloc=on" if $conf->{hugepages};
+		$mem_devices->{$id} = $mem_device;
+	    }
+	} else {
+	    foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
+		my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+
+		my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+
+		push @$cmd, "-object" , $mem_object;
+		push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+
+		die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
+		    if $current_size > $memory;
+	    });
+	}
     }
 }
 
@@ -476,8 +660,11 @@ sub print_mem_object {
 
 	my $hugepages_size = hugepages_size($conf, $size);
 	my $path = hugepages_mount_path($hugepages_size);
+	my $confmem = parse_memory($conf->{memory});
 
-	return "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on,prealloc=yes";
+	my $object = "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on";
+	$object .= ",prealloc=yes" if !$confmem->{virtio};
+	return $object;
     } else {
 	return "memory-backend-ram,id=$id,size=${size}M";
     }
diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index a18b974..0187c74 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -249,6 +249,14 @@ sub get_pci_addr_map {
 	'scsihw2' => { bus => 4, addr => 1 },
 	'scsihw3' => { bus => 4, addr => 2 },
 	'scsihw4' => { bus => 4, addr => 3 },
+	'virtiomem0' => { bus => 4, addr => 4 },
+	'virtiomem1' => { bus => 4, addr => 5 },
+	'virtiomem2' => { bus => 4, addr => 6 },
+	'virtiomem3' => { bus => 4, addr => 7 },
+	'virtiomem4' => { bus => 4, addr => 8 },
+	'virtiomem5' => { bus => 4, addr => 9 },
+	'virtiomem6' => { bus => 4, addr => 10 },
+	'virtiomem7' => { bus => 4, addr => 11 },
     } if !defined($pci_addr_map);
     return $pci_addr_map;
 }
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 11/12] memory: virtio-mem : implement redispatch retry.
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (9 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 10/12] memory: add virtio-mem support Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 12/12] tests: add virtio-mem tests Alexandre Derumier
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

If some memory can be removed on a specific node,
we try to rebalance again on other nodes

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 59 +++++++++++++++++++++++++++++-----------
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 3bcdd7c..271c55f 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -203,13 +203,32 @@ my sub get_virtiomem_total_current_size {
     return $size;
 }
 
+my sub get_virtiomem_total_errors_size {
+    my ($mems) = @_;
+
+    my $size = 0;
+    for my $mem (values %$mems) {
+	next if !$mem->{error};
+	$size += $mem->{current};
+    }
+    return $size;
+}
+
 my sub balance_virtiomem {
     my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
 
-    my $nb_virtiomem = scalar(keys %$virtiomems);
+    my $nb_virtiomem = scalar(grep { !$_->{error} } values $virtiomems->%*);
 
     print"try to balance memory on $nb_virtiomem virtiomems\n";
 
+    my $target_total_err = undef;
+    if($target_total < 0) {
+	$target_total = 0;
+	$target_total_err = 1;
+    }
+
+    die "No more virtiomem devices left to try to balance the remaining memory\n" if $nb_virtiomem == 0;
+
     #if we can't share exactly the same amount, we add the remainder on last node
     my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
     my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
@@ -217,6 +236,7 @@ my sub balance_virtiomem {
     my $i = 0;
     foreach my $id (sort keys %$virtiomems) {
 	my $virtiomem = $virtiomems->{$id};
+	next if $virtiomem->{error};
 	$i++;
 	my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
 	$virtiomem->{completed} = 0;
@@ -234,7 +254,6 @@ my sub balance_virtiomem {
     }
 
     my $total_finished = 0;
-    my $error = undef;
 
     while ($total_finished != $nb_virtiomem) {
 
@@ -258,6 +277,7 @@ my sub balance_virtiomem {
 	    if($virtiomem->{current} == $virtiomem->{target}) {
 		print"virtiomem$id: completed\n";
 		$virtiomem->{completed} = 1;
+		$virtiomem->{last} = $virtiomem->{current};
 		next;
 	    }
 
@@ -271,7 +291,6 @@ my sub balance_virtiomem {
 	    if($virtiomem->{retry} >= 5) {
 		print "virtiomem$id: target memory still not reached, ignoring device from now on\n";
 		$virtiomem->{error} = 1;
-		$error = 1;
 		#as change is async, we don't want that value change after the api call
 		eval {
 		    mon_cmd(
@@ -286,7 +305,9 @@ my sub balance_virtiomem {
 	    $virtiomem->{retry}++;
 	}
     }
-    die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
+
+    die "No more virtiomem devices left to try to balance the remaining memory\n"
+	if $target_total_err;
 }
 
 sub get_numa_node_list {
@@ -392,18 +413,24 @@ sub qemu_memory_hotplug {
 	    };
 	}
 
-	my $target_total = $value - $static_memory;
-	my $err;
-	eval {
-	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
-	};
-	$err = $@ if $@;
-
-	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
-	$newmem->{current} = $current_memory;
-	$conf->{memory} = print_memory($newmem);
-	PVE::QemuConfig->write_config($vmid, $conf);
-	die $err if $err;
+	while (1) {
+
+	    my $target_total = $value - $static_memory - get_virtiomem_total_errors_size($virtiomems);
+	    my $err;
+	    eval {
+		balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
+	    };
+	    $err = $@ if $@;
+
+	    my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+	    $newmem->{current} = $current_memory;
+	    $conf->{memory} = print_memory($newmem);
+	    PVE::QemuConfig->write_config($vmid, $conf);
+
+	    die $err if $err;
+	    last if $current_memory == $value;
+	}
+	return $conf->{memory};
 
     } elsif ($value > $memory) {
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] [PATCH v5 qemu-server 12/12] tests: add virtio-mem tests
  2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (10 preceding siblings ...)
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 11/12] memory: virtio-mem : implement redispatch retry Alexandre Derumier
@ 2023-02-24 12:09 ` Alexandre Derumier
  11 siblings, 0 replies; 18+ messages in thread
From: Alexandre Derumier @ 2023-02-24 12:09 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  | 12 +++++++
 .../memory-virtio-hugepages-1G.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio-max.conf           | 11 ++++++
 test/cfg2cmd/memory-virtio-max.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio.conf               | 11 ++++++
 test/cfg2cmd/memory-virtio.conf.cmd           | 35 +++++++++++++++++++
 6 files changed, 139 insertions(+)
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
new file mode 100644
index 0000000..bbe908c
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
@@ -0,0 +1,12 @@
+# TEST: virtio-mem without memorymax defined and hugepages
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
new file mode 100644
index 0000000..2e7de74
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-file,id=ram-node0,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-file,id=mem-virtiomem0,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -object 'memory-backend-file,id=mem-virtiomem1,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,prealloc=on,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,prealloc=on,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf b/test/cfg2cmd/memory-virtio-max.conf
new file mode 100644
index 0000000..59b8e94
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf
@@ -0,0 +1,11 @@
+# TEST: memory max 64G with 2 socket with virtio-mem
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf.cmd b/test/cfg2cmd/memory-virtio-max.conf.cmd
new file mode 100644
index 0000000..6bda268
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=30720M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=30720M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf b/test/cfg2cmd/memory-virtio.conf
new file mode 100644
index 0000000..60c0264
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf
@@ -0,0 +1,11 @@
+# TEST: virtio-mem with memorymax defined
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf.cmd b/test/cfg2cmd/memory-virtio.conf.cmd
new file mode 100644
index 0000000..f392636
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=260096M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=260096M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
-- 
2.30.2




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] applied: [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
@ 2023-03-17 13:38   ` Fiona Ebner
  0 siblings, 0 replies; 18+ messages in thread
From: Fiona Ebner @ 2023-03-17 13:38 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

Am 24.02.23 um 13:09 schrieb Alexandre Derumier:
> current qemu_dimm_list can return any kind of memory devices.
> 
> make it more generic, with an optionnal type device
> 
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer/Memory.pm | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 8984821..7ad8fcb 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -240,7 +240,7 @@ sub qemu_memory_hotplug {
>  		while (1) {
>  		    eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
>  		    sleep 3;
> -		    my $dimm_list = qemu_dimm_list($vmid);
> +		    my $dimm_list = qemu_memdevices_list($vmid, 'dimm');
>  		    last if !$dimm_list->{$name};
>  		    raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
>  		    $retry++;
> @@ -255,14 +255,14 @@ sub qemu_memory_hotplug {
>      }
>  }
>  
> -sub qemu_dimm_list {
> -    my ($vmid) = @_;
> +sub qemu_memdevices_list {
> +    my ($vmid, $type) = @_;
>  
>      my $dimmarray = mon_cmd($vmid, "query-memory-devices");
>      my $dimms = {};
>  
>      foreach my $dimm (@$dimmarray) {
> -
> +        next if $type && $dimm->{data}->{id} !~ /^$type(\d+)$/;
>          $dimms->{$dimm->{data}->{id}}->{id} = $dimm->{data}->{id};
>          $dimms->{$dimm->{data}->{id}}->{node} = $dimm->{data}->{node};
>          $dimms->{$dimm->{data}->{id}}->{addr} = $dimm->{data}->{addr};

applied, thanks!




^ permalink raw reply	[flat|nested] 18+ messages in thread

* [pve-devel] applied: [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug
  2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
@ 2023-03-17 13:39   ` Fiona Ebner
  2023-03-17 16:23     ` DERUMIER, Alexandre
  0 siblings, 1 reply; 18+ messages in thread
From: Fiona Ebner @ 2023-03-17 13:39 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

Am 24.02.23 um 13:09 schrieb Alexandre Derumier:
> simple use dimm_list() returned by qemu
> 
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>

Thanks! Applied with this follow-up:

>     memory: hotplug: sort by numerical ID rather than slot when unplugging
>     
>     While, usually, the slot should match the ID, it's not explicitly
>     guaranteed and relies on QEMU internals. Using the numerical ID is
>     more future-proof and more consistent with plugging, where no slot
>     information (except the maximum limit) is relied upon.
>     
>     Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  PVE/QemuServer/Memory.pm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 0f4229c5..0601dd6c 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -201,7 +201,7 @@ sub qemu_memory_hotplug {
>         my $dimms = qemu_memdevices_list($vmid, 'dimm');
>  
>         my $current_size = $memory;
> -       for my $name (sort { $dimms->{$b}->{slot} <=> $dimms->{$a}->{slot} } keys %$dimms) {
> +       for my $name (sort { ($b =~ /^dimm(\d+)$/)[0] <=> ($a =~ /^dimm(\d+)$/)[0] } keys %$dimms) {
>  
>             my $dimm_size = $dimms->{$name}->{size} / 1024 / 1024;
>  




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [pve-devel] applied: [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug
  2023-03-17 13:39   ` [pve-devel] applied: " Fiona Ebner
@ 2023-03-17 16:23     ` DERUMIER, Alexandre
  2023-03-20  8:36       ` Fiona Ebner
  0 siblings, 1 reply; 18+ messages in thread
From: DERUMIER, Alexandre @ 2023-03-17 16:23 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

Hi Fiona,

I don't have checked yet, but does I need to rebase the other virtio-
mem patches ?


Le vendredi 17 mars 2023 à 14:39 +0100, Fiona Ebner a écrit :
> Am 24.02.23 um 13:09 schrieb Alexandre Derumier:
> > simple use dimm_list() returned by qemu
> > 
> > Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> 
> Thanks! Applied with this follow-up:
> 
> >     memory: hotplug: sort by numerical ID rather than slot when
> > unplugging
> >     
> >     While, usually, the slot should match the ID, it's not
> > explicitly
> >     guaranteed and relies on QEMU internals. Using the numerical ID
> > is
> >     more future-proof and more consistent with plugging, where no
> > slot
> >     information (except the maximum limit) is relied upon.
> >     
> >     Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> > ---
> >  PVE/QemuServer/Memory.pm | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> > index 0f4229c5..0601dd6c 100644
> > --- a/PVE/QemuServer/Memory.pm
> > +++ b/PVE/QemuServer/Memory.pm
> > @@ -201,7 +201,7 @@ sub qemu_memory_hotplug {
> >         my $dimms = qemu_memdevices_list($vmid, 'dimm');
> >  
> >         my $current_size = $memory;
> > -       for my $name (sort { $dimms->{$b}->{slot} <=> $dimms->{$a}-
> > >{slot} } keys %$dimms) {
> > +       for my $name (sort { ($b =~ /^dimm(\d+)$/)[0] <=> ($a =~
> > /^dimm(\d+)$/)[0] } keys %$dimms) {
> >  
> >             my $dimm_size = $dimms->{$name}->{size} / 1024 / 1024;
> >  
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [pve-devel] applied: [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug
  2023-03-17 16:23     ` DERUMIER, Alexandre
@ 2023-03-20  8:36       ` Fiona Ebner
  2023-03-20  8:49         ` DERUMIER, Alexandre
  0 siblings, 1 reply; 18+ messages in thread
From: Fiona Ebner @ 2023-03-20  8:36 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, aderumier

Am 17.03.23 um 17:23 schrieb DERUMIER, Alexandre:
> Hi Fiona,
> 
> I don't have checked yet, but does I need to rebase the other virtio-
> mem patches ?
> 

No, they still apply for me. But this patch(series)[0] and a versioned
breaks is needed for HA manager, before the 'memory' config property can
be turned into a property string. And then we need to update the
implementation of QemuConfig.pm's get_derived_property() to make it
compatible with your changes here, but that can just be a follow-up.
Just for that, a v6 is not needed.

[0]: https://lists.proxmox.com/pipermail/pve-devel/2023-February/055947.html




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [pve-devel] applied: [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug
  2023-03-20  8:36       ` Fiona Ebner
@ 2023-03-20  8:49         ` DERUMIER, Alexandre
  0 siblings, 0 replies; 18+ messages in thread
From: DERUMIER, Alexandre @ 2023-03-20  8:49 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

Le lundi 20 mars 2023 à 09:36 +0100, Fiona Ebner a écrit :
> Am 17.03.23 um 17:23 schrieb DERUMIER, Alexandre:
> > Hi Fiona,
> > 
> > I don't have checked yet, but does I need to rebase the other
> > virtio-
> > mem patches ?
> > 
> 
> No, they still apply for me. But this patch(series)[0] and a
> versioned
> breaks is needed for HA manager, before the 'memory' config property
> can
> be turned into a property string. And then we need to update the
> implementation of QemuConfig.pm's get_derived_property() to make it
> compatible with your changes here, but that can just be a follow-up.
> Just for that, a v6 is not needed.

Ok thanks !


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-03-20  8:49 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-24 12:09 [pve-devel] [PATCH v5 qemu-server 00/12] rework memory hotplug + virtiomem Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 01/12] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
2023-03-17 13:38   ` [pve-devel] applied: " Fiona Ebner
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 02/12] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
2023-03-17 13:39   ` [pve-devel] applied: " Fiona Ebner
2023-03-17 16:23     ` DERUMIER, Alexandre
2023-03-20  8:36       ` Fiona Ebner
2023-03-20  8:49         ` DERUMIER, Alexandre
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 03/12] add memory parser Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 04/12] memory: add get_static_mem Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 05/12] memory: use static_memory in foreach_dimm Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 06/12] config: memory: add 'max' option Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 07/12] memory: get_max_mem: use config memory max Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 08/12] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 09/12] test: add memory-max tests Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 10/12] memory: add virtio-mem support Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 11/12] memory: virtio-mem : implement redispatch retry Alexandre Derumier
2023-02-24 12:09 ` [pve-devel] [PATCH v5 qemu-server 12/12] tests: add virtio-mem tests Alexandre Derumier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal