public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem
@ 2023-06-19  7:28 Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser Alexandre Derumier
                   ` (12 more replies)
  0 siblings, 13 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

This patch series reworks the current memory hotplug and adds virtio-mem support.

The memory option now accepts extra sub-options:

memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
ex: memory: current=1024,max=131072,virtio=1


For classic memory hotplug, when max memory is defined,
we use 64 fixed-size DIMMs.
The max option must be a multiple of 64 GiB.

The virtio option enables the new virtio-mem support:
instead of plugging DIMMs, blocks are added/removed inside
a big virtio-mem device (one per NUMA node).
virtio-mem can use up to 32000 blocks; the block size is computed
from the max memory.
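
As a rough stand-alone illustration of the new format (a hand-rolled
parse just for this mail, not the real PVE::JSONSchema-based parser
from patch 01/10):

use strict;
use warnings;

my $value = 'current=1024,max=131072,virtio=1';
my %mem = map { split(/=/, $_, 2) } split(/,/, $value);
# a bare integer like "memory: 1024" keeps working: 'current' is the default key
print "current=$mem{current} MiB max=$mem{max} MiB virtio=$mem{virtio}\n";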

Changelog v2:

Updated different patches based on Fiona's comments.
(I have sent 2 other mails with comments not yet addressed.)

The biggest change is in virtio-mem: instead of trying to keep the same
amount of memory on each virtio-mem device (some blocks could be unmovable
and break unplug), we try to balance/dispatch the remaining blocks across
the other available virtio-mem devices, as sketched below.
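
A minimal sketch of the dispatch rule (mirroring balance_virtiomem from
patch 08/10; the concrete values are just an example):

use strict;
use warnings;

# share $target_total MiB across $nb_virtiomem devices, aligned to
# $blocksize, putting any remainder on the last device
my ($target_total, $nb_virtiomem, $blocksize) = (10240, 3, 4);
my $target_aligned = int($target_total / $nb_virtiomem / $blocksize) * $blocksize;
my $target_remaining = $target_total - $target_aligned * ($nb_virtiomem - 1);
print "per device: $target_aligned MiB, last device: $target_remaining MiB\n";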

Also, the minimum block size supported by the Linux guest OS is currently
4 MiB, even though virtio-mem can use 2 MiB on the QEMU side.


Changelog v3:

- cleanup from last Fiona comments
- classic hotplug: fix memory unplug
- virtio-mem: split the retry logic into a second, optional patch
  (feel free to apply it later if needed)

Changelog v4:
- cleanup from last Fiona comments

Changelog v5:
- move the DIMM hot-unplug patches to the top of the series
- cleanup from last Fiona comments

Changelog v6:
- qemu-server: rebase on latest git


qemu-server:

Alexandre Derumier (10):
  add memory parser
  memory: add get_static_mem
  memory: use static_memory in foreach_dimm
  config: memory: add 'max' option
  memory: get_max_mem: use config memory max
  memory: use 64 slots && static dimm size when max is defined
  test: add memory-max tests
  memory: add virtio-mem support
  memory: virtio-mem : implement redispatch retry.
  tests: add virtio-mem tests

 PVE/API2/Qemu.pm                              |  41 +-
 PVE/QemuConfig.pm                             |   4 +-
 PVE/QemuMigrate.pm                            |   6 +-
 PVE/QemuServer.pm                             |  37 +-
 PVE/QemuServer/Helpers.pm                     |   3 +-
 PVE/QemuServer/Memory.pm                      | 422 +++++++++++++++---
 PVE/QemuServer/PCI.pm                         |   8 +
 test/cfg2cmd/memory-max-128G.conf             |  11 +
 test/cfg2cmd/memory-max-128G.conf.cmd         |  86 ++++
 test/cfg2cmd/memory-max-512G.conf             |  11 +
 test/cfg2cmd/memory-max-512G.conf.cmd         |  58 +++
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  |  12 +
 .../memory-virtio-hugepages-1G.conf.cmd       |  35 ++
 test/cfg2cmd/memory-virtio-max.conf           |  11 +
 test/cfg2cmd/memory-virtio-max.conf.cmd       |  35 ++
 test/cfg2cmd/memory-virtio.conf               |  11 +
 test/cfg2cmd/memory-virtio.conf.cmd           |  35 ++
 17 files changed, 744 insertions(+), 82 deletions(-)
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd


pve-manager:

Alexandre Derumier (2):
  ui: qemu: hardware: add new memory format support
  ui: qemu : memoryedit: add new max && virtio fields

 www/manager6/qemu/HardwareView.js | 17 ++++++++--
 www/manager6/qemu/MemoryEdit.js   | 52 +++++++++++++++++++++++++------
 2 files changed, 58 insertions(+), 11 deletions(-)

-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-09-01 10:23   ` Fiona Ebner
  2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 1/2] ui: qemu: hardware: add new memory format support Alexandre Derumier
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm          |  7 ++--
 PVE/QemuConfig.pm         |  4 +--
 PVE/QemuMigrate.pm        |  6 ++--
 PVE/QemuServer.pm         | 27 ++++++++--------
 PVE/QemuServer/Helpers.pm |  3 +-
 PVE/QemuServer/Memory.pm  | 67 ++++++++++++++++++++++++++++++++-------
 6 files changed, 79 insertions(+), 35 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index d0c199b..39b6b04 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::PCI;
 use PVE::QemuServer::USB;
 use PVE::QemuMigrate;
@@ -1559,8 +1560,6 @@ my $update_vm_api  = sub {
 
     my $storecfg = PVE::Storage::config();
 
-    my $defaults = PVE::QemuServer::load_defaults();
-
     &$resolve_cdrom_alias($param);
 
     # now try to verify all parameters
@@ -1680,7 +1679,9 @@ my $update_vm_api  = sub {
 	}
 
 	if ($param->{memory} || defined($param->{balloon})) {
-	    my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
+
+	    my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
+	    my $maxmem = get_current_memory($memory);
 	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
 
 	    die "balloon value too large (must be smaller than assigned memory)\n"
diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
index 10e6929..d423089 100644
--- a/PVE/QemuConfig.pm
+++ b/PVE/QemuConfig.pm
@@ -12,6 +12,7 @@ use PVE::QemuServer::Helpers;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer;
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::Storage;
 use PVE::Tools;
 use PVE::Format qw(render_bytes render_duration);
@@ -208,8 +209,7 @@ sub __snapshot_save_vmstate {
 	$target = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
     }
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $mem_size = $conf->{memory} // $defaults->{memory};
+    my $mem_size = get_current_memory($conf->{memory});
     my $driver_state_size = 500; # assume 500MB is enough to safe all driver state;
     # our savevm-start does live-save of the memory until the space left in the
     # volume is just enough for the remaining memory content + internal state
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 600eeb7..9e75e4b 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -26,6 +26,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::Helpers qw(min_version);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer;
 
 use PVE::AbstractMigrate;
@@ -1044,7 +1045,8 @@ sub phase2_start_remote_cluster {
     my $remote_vmid = $self->{opts}->{remote}->{vmid};
 
     # like regular start but with some overhead accounted for
-    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
+    my $memory = get_current_memory($self->{vmconf}->{memory});
+    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}, $memory) + 10;
 
     my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
 
@@ -1199,7 +1201,7 @@ sub phase2 {
     $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime);
 
     # set cachesize to 10% of the total memory
-    my $memory =  $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $cachesize = int($memory * 1048576 / 10);
     $cachesize = round_powerof2($cachesize);
 
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 940cdac..a751024 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -55,7 +55,7 @@ use PVE::QemuServer::CGroup;
 use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options);
 use PVE::QemuServer::Drive qw(is_valid_drivename drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::PCI qw(print_pci_addr print_pcie_addr print_pcie_root_port parse_hostpci);
 use PVE::QemuServer::USB;
@@ -349,11 +349,9 @@ my $confdesc = {
     },
     memory => {
 	optional => 1,
-	type => 'integer',
-	description => "Amount of RAM for the VM in MiB. This is the maximum available memory when"
-	    ." you use the balloon device.",
-	minimum => 16,
-	default => 512,
+	type => 'string',
+	description => "Memory properties.",
+	format => $PVE::QemuServer::Memory::memory_fmt
     },
     balloon => {
 	optional => 1,
@@ -2947,8 +2945,7 @@ sub vmstatus {
 	$d->{cpus} = $conf->{vcpus} if $conf->{vcpus};
 
 	$d->{name} = $conf->{name} || "VM $vmid";
-	$d->{maxmem} = $conf->{memory} ? $conf->{memory}*(1024*1024)
-	    : $defaults->{memory}*(1024*1024);
+	$d->{maxmem} = get_current_memory($conf->{memory})*(1024*1024);
 
 	if ($conf->{balloon}) {
 	    $d->{balloon_min} = $conf->{balloon}*(1024*1024);
@@ -3891,7 +3888,7 @@ sub config_to_command {
     }
 
     PVE::QemuServer::Memory::config(
-	$conf, $vmid, $sockets, $cores, $defaults, $hotplug_features->{memory}, $cmd);
+	$conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
 
     push @$cmd, '-S' if $conf->{freeze};
 
@@ -5053,7 +5050,7 @@ sub vmconfig_hotplug_pending {
 		# enable balloon device is not hotpluggable
 		die "skip\n" if defined($conf->{balloon}) && $conf->{balloon} == 0;
 		# here we reset the ballooning value to memory
-		my $balloon = $conf->{memory} || $defaults->{memory};
+		my $balloon = get_current_memory($conf->{memory});
 		mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 	    } elsif ($fast_plug_option->{$opt}) {
 		# do nothing
@@ -5066,7 +5063,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
 		die "skip\n" if !$hotplug_features->{memory};
-		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults);
+		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
 	    } elsif ($opt eq 'cpulimit') {
@@ -5121,7 +5118,8 @@ sub vmconfig_hotplug_pending {
 
 		# allow manual ballooning if shares is set to zero
 		if ((defined($conf->{shares}) && ($conf->{shares} == 0))) {
-		    my $balloon = $conf->{pending}->{balloon} || $conf->{memory} || $defaults->{memory};
+		    my $memory = get_current_memory($conf->{memory});
+		    my $balloon = $conf->{pending}->{balloon} || $memory;
 		    mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 		}
 	    } elsif ($opt =~ m/^net(\d+)$/) {
@@ -5141,7 +5139,7 @@ sub vmconfig_hotplug_pending {
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
 		die "skip\n" if !$hotplug_features->{memory};
-		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $value);
+		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
 		$cgroup->change_cpu_shares($new_cpuunits);
@@ -5820,7 +5818,8 @@ sub vm_start_nolock {
 	push @$cmd, '-S';
     }
 
-    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $resume);
+    my $memory = get_current_memory($conf->{memory});
+    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $memory, $resume);
 
     my $pci_reserve_list = [];
     for my $device (values $pci_devices->%*) {
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index ee3979d..bbd142f 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -143,8 +143,7 @@ sub version_cmp {
 }
 
 sub config_aware_timeout {
-    my ($config, $is_suspended) = @_;
-    my $memory = $config->{memory};
+    my ($config, $memory, $is_suspended) = @_;
     my $timeout = 30;
 
     # Based on user reported startup time for vm with 512GiB @ 4-5 minutes
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 0601dd6..aaa94cc 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -8,10 +8,42 @@ use PVE::Exception qw(raise raise_param_exc);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use base qw(Exporter);
+
+our @EXPORT_OK = qw(
+get_current_memory
+);
 
 my $MAX_NUMA = 8;
 my $STATICMEM = 1024;
 
+our $memory_fmt = {
+    current => {
+	description => "Current amount of online RAM for the VM in MiB. This is the maximum available memory when"
+	    ." you use the balloon device.",
+	type => 'integer',
+	default_key => 1,
+	minimum => 16,
+	default => 512,
+    },
+};
+
+sub print_memory {
+    my $memory = shift;
+
+    return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
+}
+
+sub parse_memory {
+    my ($value) = @_;
+
+    return { current => $memory_fmt->{current}->{default} } if !defined($value);
+
+    my $res = PVE::JSONSchema::parse_property_string($memory_fmt, $value);
+
+    return $res;
+}
+
 my $_host_bits;
 sub get_host_phys_address_bits {
     return $_host_bits if defined($_host_bits);
@@ -63,6 +95,13 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+sub get_current_memory {
+    my ($value) = @_;
+
+    my $memory = parse_memory($value);
+    return $memory->{current};
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -129,16 +168,19 @@ sub foreach_dimm{
 }
 
 sub qemu_memory_hotplug {
-    my ($vmid, $conf, $defaults, $value) = @_;
+    my ($vmid, $conf, $value) = @_;
 
     return $value if !PVE::QemuServer::check_running($vmid);
 
-    my $sockets = $conf->{sockets} || 1;
+    my $oldmem = parse_memory($conf->{memory});
+    my $newmem = parse_memory($value);
+
+    return $value if $newmem->{current} == $oldmem->{current};
 
-    my $memory = $conf->{memory} || $defaults->{memory};
-    $value = $defaults->{memory} if !$value;
-    return $value if $value == $memory;
+    my $memory = $oldmem->{current};
+    $value = $newmem->{current};
 
+    my $sockets = $conf->{sockets} || 1;
     my $static_memory = $STATICMEM;
     $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
 
@@ -153,7 +195,7 @@ sub qemu_memory_hotplug {
 	foreach_dimm($conf, $vmid, $value, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-		return if $current_size <= $conf->{memory};
+		return if $current_size <= get_current_memory($conf->{memory});
 
 		if ($conf->{hugepages}) {
 		    $numa_hostmap = get_numa_guest_to_host_map($conf) if !$numa_hostmap;
@@ -192,7 +234,8 @@ sub qemu_memory_hotplug {
 		    die $err;
 		}
 		#update conf after each succesful module hotplug
-		$conf->{memory} = $current_size;
+		$newmem->{current} = $current_size;
+		$conf->{memory} = print_memory($newmem);
 		PVE::QemuConfig->write_config($vmid, $conf);
 	});
 
@@ -220,7 +263,8 @@ sub qemu_memory_hotplug {
 	    }
 	    $current_size -= $dimm_size;
 	    #update conf after each succesful module unplug
-	    $conf->{memory} = $current_size;
+            $newmem->{current} = $current_size;
+            $conf->{memory} = print_memory($newmem);
 
 	    eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
 	    PVE::QemuConfig->write_config($vmid, $conf);
@@ -247,9 +291,9 @@ sub qemu_memdevices_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
 
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $static_memory = 0;
 
     if ($hotplug) {
@@ -470,8 +514,7 @@ sub hugepages_topology {
 
     return if !$conf->{numa};
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $static_memory = 0;
     my $sockets = $conf->{sockets} || 1;
     my $numa_custom_topology = undef;
-- 
2.39.2





* [pve-devel] [PATCH v2 pve-manager 1/2] ui: qemu: hardware: add new memory format support
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 02/10] memory: add get_static_mem Alexandre Derumier
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 www/manager6/qemu/HardwareView.js | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/www/manager6/qemu/HardwareView.js b/www/manager6/qemu/HardwareView.js
index af35a980..0b17ef0d 100644
--- a/www/manager6/qemu/HardwareView.js
+++ b/www/manager6/qemu/HardwareView.js
@@ -77,11 +77,17 @@ Ext.define('PVE.qemu.HardwareView', {
 		renderer: function(value, metaData, record, ri, ci, store, pending) {
 		    var res = '';
 
-		    var max = me.getObjectValue('memory', 512, pending);
+		    let memory = me.getObjectValue('memory', undefined, pending);
+		    let props = PVE.Parser.parsePropertyString(memory, 'current');
+		    let current = 512;
+		    if (props.current) {
+			current = props.current;
+		    }
+
 		    var balloon = me.getObjectValue('balloon', undefined, pending);
 		    var shares = me.getObjectValue('shares', undefined, pending);
 
-		    res = Proxmox.Utils.format_size(max*1024*1024);
+		    res = Proxmox.Utils.format_size(current*1024*1024);
 
 		    if (balloon !== undefined && balloon > 0) {
 			res = Proxmox.Utils.format_size(balloon*1024*1024) + "/" + res;
@@ -92,6 +98,13 @@ Ext.define('PVE.qemu.HardwareView', {
 		    } else if (balloon === 0) {
 			res += ' [balloon=0]';
 		    }
+		    if (props.max) {
+			res += ' [max=' + Proxmox.Utils.format_size(props.max*1024*1024) +']';
+		    }
+		    if (props.virtio) {
+			res += ' [virtio=' + props.virtio +']';
+		    }
+
 		    return res;
 		},
 	    },
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 02/10] memory: add get_static_mem
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 1/2] ui: qemu: hardware: add new memory format support Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields Alexandre Derumier
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index aaa94cc..12e6ee4 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -15,7 +15,6 @@ get_current_memory
 );
 
 my $MAX_NUMA = 8;
-my $STATICMEM = 1024;
 
 our $memory_fmt = {
     current => {
@@ -44,6 +43,22 @@ sub parse_memory {
     return $res;
 }
 
+my sub get_static_mem {
+    my ($conf, $sockets, $hotplug) = @_;
+
+    my $static_memory = 0;
+    my $memory = parse_memory($conf->{memory});
+
+    if ($hotplug) {
+	$static_memory = 1024;
+	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    } else {
+	$static_memory = $memory->{current};
+    }
+
+    return $static_memory;
+}
+
 my $_host_bits;
 sub get_host_phys_address_bits {
     return $_host_bits if defined($_host_bits);
@@ -181,8 +196,7 @@ sub qemu_memory_hotplug {
     $value = $newmem->{current};
 
     my $sockets = $conf->{sockets} || 1;
-    my $static_memory = $STATICMEM;
-    $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    my $static_memory = get_static_mem($conf, $sockets, 1);
 
     die "memory can't be lower than $static_memory MB" if $value < $static_memory;
     my $MAX_MEM = get_max_mem($conf);
@@ -294,7 +308,7 @@ sub config {
     my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
 
     my $memory = get_current_memory($conf->{memory});
-    my $static_memory = 0;
+    my $static_memory = get_static_mem($conf, $sockets, $hotplug);
 
     if ($hotplug) {
 	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
@@ -306,17 +320,11 @@ sub config {
 		if $conf->{"numa$i"};
 	}
 
-	my $sockets = $conf->{sockets} || 1;
-
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
 
     } else {
 
-	$static_memory = $memory;
 	push @$cmd, '-m', $static_memory;
     }
 
@@ -515,17 +523,10 @@ sub hugepages_topology {
     return if !$conf->{numa};
 
     my $memory = get_current_memory($conf->{memory});
-    my $static_memory = 0;
     my $sockets = $conf->{sockets} || 1;
+    my $static_memory = get_static_mem($conf, $sockets, $hotplug);
     my $numa_custom_topology = undef;
 
-    if ($hotplug) {
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-    } else {
-	$static_memory = $memory;
-    }
-
     #custom numa topology
     for (my $i = 0; $i < $MAX_NUMA; $i++) {
 	next if !$conf->{"numa$i"};
-- 
2.39.2





* [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (2 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 02/10] memory: add get_static_mem Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-09-01  9:48   ` Thomas Lamprecht
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 03/10] memory: use static_memory in foreach_dimm Alexandre Derumier
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 www/manager6/qemu/MemoryEdit.js | 52 +++++++++++++++++++++++++++------
 1 file changed, 43 insertions(+), 9 deletions(-)

diff --git a/www/manager6/qemu/MemoryEdit.js b/www/manager6/qemu/MemoryEdit.js
index 5e91dc9b..be7903a2 100644
--- a/www/manager6/qemu/MemoryEdit.js
+++ b/www/manager6/qemu/MemoryEdit.js
@@ -20,12 +20,12 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
 	    let me = this;
 	    let view = me.getView(), viewModel = me.getViewModel();
 	    if (view.insideWizard) {
-		let memory = view.down('pveMemoryField[name=memory]');
+		let current = view.down('pveMemoryField[name=current]');
 		// NOTE: we only set memory but that then sets balloon in its change handler
 		if (viewModel.get('current.ostype') === 'win11') {
-		    memory.setValue('4096');
+		    current.setValue('4096');
 		} else {
-		    memory.setValue('2048');
+		    current.setValue('2048');
 		}
 	    }
 	},
@@ -36,13 +36,23 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
 
 	var res = {};
 
-	res.memory = values.memory;
+	let memory = {};
+	memory.current = values.current;
+	if (values.max && values.max !== 0) {
+	    memory.max = values.max;
+	}
+	if (values.virtio) {
+	    memory.virtio = values.virtio;
+	}
+
+	res.memory = PVE.Parser.printPropertyString(memory, 'current');
+
 	res.balloon = values.balloon;
 
 	if (!values.ballooning) {
 	    res.balloon = 0;
 	    res.delete = 'shares';
-	} else if (values.memory === values.balloon) {
+	} else if (values.current === values.balloon) {
 	    delete res.balloon;
 	    res.delete = 'balloon,shares';
 	} else if (Ext.isDefined(values.shares) && values.shares !== "") {
@@ -63,7 +73,7 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
 		xtype: 'pveMemoryField',
 		labelWidth: labelWidth,
 		fieldLabel: gettext('Memory') + ' (MiB)',
-		name: 'memory',
+		name: 'current',
 		value: '512', // better defaults get set via the view controllers afterrender
 		minValue: 1,
 		step: 32,
@@ -96,12 +106,25 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
 		allowBlank: false,
 		listeners: {
 		    change: function(f, value) {
-			var memory = me.down('field[name=memory]').getValue();
+			var memory = me.down('field[name=current]').getValue();
 			var shares = me.down('field[name=shares]');
 			shares.setDisabled(value === memory);
 		    },
 		},
 	    },
+	    {
+		xtype: 'pveMemoryField',
+		name: 'max',
+		minValue: 65536,
+		maxValue: 4194304,
+		value: '',
+		step: 65536,
+		fieldLabel: gettext('Maximum memory') + ' (MiB)',
+		labelWidth: labelWidth,
+		allowBlank: true,
+		emptyText: gettext('Disabled'),
+		submitEmptyText: false,
+	    },
 	    {
 		xtype: 'proxmoxintegerfield',
 		name: 'shares',
@@ -132,6 +155,13 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
 		    },
 		},
 	    },
+	    {
+		xtype: 'proxmoxcheckbox',
+		labelWidth: labelWidth,
+		name: 'virtio',
+		defaultValue: 0,
+		fieldLabel: gettext('Virtiomem'),
+	    },
 	];
 
 	if (me.insideWizard) {
@@ -178,11 +208,15 @@ Ext.define('PVE.qemu.MemoryEdit', {
 	    success: function(response, options) {
 		var data = response.result.data;
 
+		let memory = PVE.Parser.parsePropertyString(data.memory, 'current');
+
 		var values = {
 		    ballooning: data.balloon === 0 ? '0' : '1',
 		    shares: data.shares,
-		    memory: data.memory || '512',
-		    balloon: data.balloon > 0 ? data.balloon : data.memory || '512',
+		    current: memory.current || '512',
+		    max: memory.max || '',
+		    virtio: memory.virtio,
+		    balloon: data.balloon > 0 ? data.balloon : memory.current || '512',
 		};
 
 		ipanel.setValues(values);
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 03/10] memory: use static_memory in foreach_dimm
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (3 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-09-01 11:39   ` [pve-devel] applied: " Fiona Ebner
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 04/10] config: memory: add 'max' option Alexandre Derumier
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 12e6ee4..ee73a0a 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -151,17 +151,15 @@ sub get_numa_guest_to_host_map {
 }
 
 sub foreach_dimm{
-    my ($conf, $vmid, $memory, $sockets, $func) = @_;
+    my ($conf, $vmid, $memory, $static_memory, $func) = @_;
 
     my $dimm_id = 0;
-    my $current_size = 0;
+    my $current_size = $static_memory;
     my $dimm_size = 0;
 
     if($conf->{hugepages} && $conf->{hugepages} == 1024) {
-	$current_size = 1024 * $sockets;
 	$dimm_size = 1024;
     } else {
-	$current_size = 1024;
 	$dimm_size = 512;
     }
 
@@ -206,7 +204,7 @@ sub qemu_memory_hotplug {
 
 	my $numa_hostmap;
 
-	foreach_dimm($conf, $vmid, $value, $sockets, sub {
+	foreach_dimm($conf, $vmid, $value, $static_memory, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 		return if $current_size <= get_current_memory($conf->{memory});
@@ -393,7 +391,7 @@ sub config {
     }
 
     if ($hotplug) {
-	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
+	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
@@ -559,7 +557,7 @@ sub hugepages_topology {
     if ($hotplug) {
 	my $numa_hostmap = get_numa_guest_to_host_map($conf);
 
-	foreach_dimm($conf, undef, $memory, $sockets, sub {
+	foreach_dimm($conf, undef, $memory, $static_memory, sub {
 	    my ($conf, undef, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 	    $numanode = $numa_hostmap->{$numanode};
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 04/10] config: memory: add 'max' option
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (4 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 03/10] memory: use static_memory in foreach_dimm Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

max can only be a multiple of 64 GiB;
the DIMM size is computed from the max memory.

We can have 64 slots:

64 GiB = 64 slots x 1 GiB
128 GiB = 64 slots x 2 GiB
...
4 TiB = 64 slots x 64 GiB

Also, with NUMA, we need to share the slots between (up to 8) sockets,
and 64 is a multiple of 8:

64 GiB = 8 sockets * 8 slots * 1 GiB
128 GiB = 8 sockets * 8 slots * 2 GiB
...

And with virtio-mem, we can have 32000 blocks of 2 MiB minimum:

64 GiB ≈ 32000 * 2 MiB
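
A quick stand-alone sketch of that arithmetic (not part of the patch;
the 64-slot split itself lands in patch 06/10):

use strict;
use warnings;

my $MAX_SLOTS = 64;
for my $max (65536, 131072, 4194304) { # MiB: 64 GiB, 128 GiB, 4 TiB
    die "max memory needs to be a multiple of 64 GiB\n" if $max % 65536;
    my $dimm_size = $max / $MAX_SLOTS;
    print "max=$max MiB -> $MAX_SLOTS slots x $dimm_size MiB per DIMM\n";
}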

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        |  4 ++--
 PVE/QemuServer/Memory.pm | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a751024..1257902 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5062,7 +5062,7 @@ sub vmconfig_hotplug_pending {
 		vm_deviceunplug($vmid, $conf, $opt);
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
-		die "skip\n" if !$hotplug_features->{memory};
+		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf);
 		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
@@ -5138,7 +5138,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_update_disk($storecfg, $conf, $hotplug_features->{disk},
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
-		die "skip\n" if !$hotplug_features->{memory};
+		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf, $value);
 		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index ee73a0a..d25c79c 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
 use strict;
 use warnings;
 
-use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 use PVE::Exception qw(raise raise_param_exc);
+use PVE::GuestHelpers qw(safe_num_ne);
+use PVE::JSONSchema;
+use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
@@ -25,8 +27,28 @@ our $memory_fmt = {
 	minimum => 16,
 	default => 512,
     },
+    max => {
+	description => "Maximum hotpluggable memory for the VM in MiB.",
+	type => 'integer',
+	optional => 1,
+	minimum => 65536,
+	maximum => 4194304,
+	format => 'pve-qm-memory-max',
+    },
 };
 
+PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
+sub verify_qm_memory_max {
+    my ($max, $noerr) = @_;
+
+    if ($max && $max % 65536 != 0) {
+	return undef if $noerr;
+	die "max memory need to be a multiple of 64GiB\n";
+    }
+
+    return $max;
+}
+
 sub print_memory {
     my $memory = shift;
 
@@ -285,6 +307,19 @@ sub qemu_memory_hotplug {
     return $conf->{memory};
 }
 
+sub can_hotplug {
+    my ($hotplug, $conf, $value) = @_;
+
+    return if !$hotplug;
+
+    my $oldmem = parse_memory($conf->{memory});
+    my $newmem = parse_memory($value);
+
+    return if safe_num_ne($newmem->{max}, $oldmem->{max});
+
+    return 1;
+}
+
 sub qemu_memdevices_list {
     my ($vmid, $type) = @_;
 
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 05/10] memory: get_max_mem: use config memory max
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (5 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 04/10] config: memory: add 'max' option Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 06/10] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Verify that the configured VM max memory is not bigger than the
memory supported by the host CPU.

Also add an early check in the update VM API.
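
For illustration, assuming a host CPU with 40 physical address bits
(1 TiB = 1048576 MiB addressable), an update like:

memory: current=4096,max=2097152

is now rejected early at the API level with "memory max can't be bigger
than supported cpu architecture 1048576 MiB", rather than only when the
VM is started.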

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm         | 35 +++++++++++++++++++++++++----------
 PVE/QemuServer/Memory.pm | 19 ++++++++++++++++++-
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 39b6b04..f0b0dda 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
 use PVE::QemuServer::PCI;
 use PVE::QemuServer::USB;
 use PVE::QemuMigrate;
@@ -478,6 +478,29 @@ my $create_disks = sub {
     return ($vollist, $res);
 };
 
+my $check_memory_param = sub {
+    my ($conf, $param) = @_;
+
+    my $mem = parse_memory($param->{memory});
+    my $host_max_mem = get_host_max_mem($conf);
+
+    die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+	if $mem->{max} && $mem->{max} > $host_max_mem;
+
+    die "current memory value too large (must be smaller than max memory)\n"
+	if $mem->{max} && $mem->{current} > $mem->{max};
+
+    if ($param->{memory} || defined($param->{balloon})) {
+
+	my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
+	my $current = get_current_memory($memory);
+	my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
+
+	die "balloon value too large (must be smaller than current assigned memory)\n"
+	    if $balloon && $balloon > $current;
+    }
+};
+
 my $check_cpu_model_access = sub {
     my ($rpcenv, $authuser, $new, $existing) = @_;
 
@@ -1678,15 +1701,7 @@ my $update_vm_api  = sub {
 	    }
 	}
 
-	if ($param->{memory} || defined($param->{balloon})) {
-
-	    my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
-	    my $maxmem = get_current_memory($memory);
-	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
-
-	    die "balloon value too large (must be smaller than assigned memory)\n"
-		if $balloon && $balloon > $maxmem;
-	}
+	&$check_memory_param($conf, $param);
 
 	PVE::Cluster::log_msg('info', $authuser, "update VM $vmid: " . join (' ', @paramarr));
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index d25c79c..cf489ea 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -14,6 +14,8 @@ use base qw(Exporter);
 
 our @EXPORT_OK = qw(
 get_current_memory
+parse_memory
+get_host_max_mem
 );
 
 my $MAX_NUMA = 8;
@@ -98,7 +100,7 @@ sub get_host_phys_address_bits {
     return; # undef, cannot really do anything..
 }
 
-my sub get_max_mem {
+sub get_host_max_mem {
     my ($conf) = @_;
 
     my $cpu = {};
@@ -132,6 +134,21 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+my sub get_max_mem {
+    my ($conf) = @_;
+
+    my $host_max_mem = get_host_max_mem($conf);
+    my $mem = parse_memory($conf->{memory});
+
+    if ($mem->{max}) {
+	die "configured memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+	    if $mem->{max} > $host_max_mem;
+	return $mem->{max};
+    }
+
+    return $host_max_mem;
+}
+
 sub get_current_memory {
     my ($value) = @_;
 
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 06/10] memory: use 64 slots && static dimm size when max is defined
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (6 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 07/10] test: add memory-max tests Alexandre Derumier
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

The default kernel vhost config has only supported 64 slots, for
performance reasons, since 2015.

The original memory hotplug code predates that and used QEMU's
maximum of 255 supported slots.

To reach the max memory (4 TiB), we used an incrementally doubling
DIMM size.

Instead of a dynamic DIMM size, use one static DIMM size, computed
from max memory / 64.

We also force the static memory to 4 GiB, as memory under 4 GiB
doesn't work well with unplug.
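
A minimal stand-alone sketch of the resulting sizing (just the rules
from get_static_mem in this patch, e.g. max=65536 MiB with 2 sockets):

use strict;
use warnings;

my $MAX_SLOTS = 64;
my ($max, $sockets) = (65536, 2);           # MiB / sockets
my $dimm_size = $max / $MAX_SLOTS;          # 1024 MiB
# static memory can't be lower than 4 GiB, nor than one DIMM per socket
my $static_memory = $dimm_size * $sockets;  # 2048 MiB here
$static_memory = 4096 if $static_memory < 4096;
print "-m size=$static_memory,slots=$MAX_SLOTS,maxmem=${max}M\n";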

Fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=3446
https://bugzilla.proxmox.com/show_bug.cgi?id=1426
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index cf489ea..0d64a5f 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -19,6 +19,7 @@ get_host_max_mem
 );
 
 my $MAX_NUMA = 8;
+my $MAX_SLOTS = 64;
 
 our $memory_fmt = {
     current => {
@@ -73,7 +74,13 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = parse_memory($conf->{memory});
 
-    if ($hotplug) {
+    if ($memory->{max}) {
+	my $dimm_size = $memory->{max} / $MAX_SLOTS;
+	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
+	$static_memory = $dimm_size * $sockets;
+	$static_memory = 4096 if $static_memory < 4096;
+    } elsif ($hotplug) {
+	#legacy
 	$static_memory = 1024;
 	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
     } else {
@@ -196,7 +203,10 @@ sub foreach_dimm{
     my $current_size = $static_memory;
     my $dimm_size = 0;
 
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
+    my $confmem = parse_memory($conf->{memory});
+    if ($confmem->{max}) {
+	$dimm_size = $confmem->{max} / $MAX_SLOTS;
+    } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
 	$dimm_size = 1024;
     } else {
 	$dimm_size = 512;
@@ -215,7 +225,7 @@ sub foreach_dimm{
 	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
 	    return  $current_size if $current_size >= $memory;
 	}
-	$dimm_size *= 2;
+	$dimm_size *= 2 if !$confmem->{max};
     }
 }
 
@@ -371,7 +381,9 @@ sub config {
 	}
 
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
-	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
+	my $confmem = parse_memory($conf->{memory});
+	my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
+	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
 
     } else {
 
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 07/10] test: add memory-max tests
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (7 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 06/10] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-max-128G.conf     | 11 ++++
 test/cfg2cmd/memory-max-128G.conf.cmd | 86 +++++++++++++++++++++++++++
 test/cfg2cmd/memory-max-512G.conf     | 11 ++++
 test/cfg2cmd/memory-max-512G.conf.cmd | 58 ++++++++++++++++++
 4 files changed, 166 insertions(+)
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd

diff --git a/test/cfg2cmd/memory-max-128G.conf b/test/cfg2cmd/memory-max-128G.conf
new file mode 100644
index 0000000..8f0328d
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 64G with 2 socket, dimm size should be 1GB && static memory should be 4G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-128G.conf.cmd b/test/cfg2cmd/memory-max-128G.conf.cmd
new file mode 100644
index 0000000..5b22027
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf.cmd
@@ -0,0 +1,86 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,slots=64,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=1024M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=1024M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=1024M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=1024M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=1024M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=1024M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=1024M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=1024M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=1024M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=1024M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=1024M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=1024M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=1024M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=1024M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm14,size=1024M' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm15,size=1024M' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm16,size=1024M' \
+  -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm17,size=1024M' \
+  -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm18,size=1024M' \
+  -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm19,size=1024M' \
+  -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm20,size=1024M' \
+  -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm21,size=1024M' \
+  -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm22,size=1024M' \
+  -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm23,size=1024M' \
+  -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm24,size=1024M' \
+  -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm25,size=1024M' \
+  -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm26,size=1024M' \
+  -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm27,size=1024M' \
+  -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf b/test/cfg2cmd/memory-max-512G.conf
new file mode 100644
index 0000000..082f75e
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 512GB with 2 socket, dimm size should be 8GB && static memory should be 16G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 131072,max=524288
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf.cmd b/test/cfg2cmd/memory-max-512G.conf.cmd
new file mode 100644
index 0000000..3c04b70
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf.cmd
@@ -0,0 +1,58 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=16384,slots=64,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=8192M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=8192M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=8192M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=8192M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=8192M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=8192M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=8192M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=8192M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=8192M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=8192M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=8192M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=8192M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=8192M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=8192M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=8192M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=8192M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
-- 
2.39.2





* [pve-devel] [PATCH v6 qemu-server 08/10] memory: add virtio-mem support
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (8 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 07/10] test: add memory-max tests Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 09/10] memory: virtio-mem : implement redispatch retry Alexandre Derumier
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

A 4 GiB static memory is needed for DMA + boot memory, as this memory
is almost always un-unpluggable.

One virtio-mem PCI device is set up for each NUMA node on the pci.4
bridge.

virtio-mem can use up to 32000 blocks with a fixed block size.
The block size is computed from (max memory - 4096) / 32000, with a
minimum of 2 MiB to be able to map THPs.
(A lower block size means a better chance to unplug memory.)

Note: Currently, Linux only supports a 4 MiB virtio-mem block size;
2 MiB support is in progress.
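
A stand-alone sketch of that computation (mirroring
get_virtiomem_block_size in the diff below):

use strict;
use warnings;
use POSIX qw(ceil);

for my $max_mem (65536, 524288, 4194304) { # MiB
    my $blocksize = ($max_mem - 4096) / 32000;
    $blocksize = 2 if $blocksize < 2;                  # at least 2 MiB for THP
    $blocksize = 2 ** ceil(log($blocksize) / log(2));  # round up to a power of two
    $blocksize = 4 if $blocksize < 4;                  # current Linux guest minimum
    print "max=$max_mem MiB -> blocksize=$blocksize MiB\n";
}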

For hotplug/unplug, we try to allocate/unallocate the same amount of
memory, aligned to the block size, on each NUMA node if possible.
If a node is not able to reach the target memory (e.g. because of an
unmovable page on unplug), we try again to redispatch the remaining
memory on the other nodes.

About hugepages:

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=2949
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm         |   9 +-
 PVE/QemuServer.pm        |   8 +-
 PVE/QemuServer/Memory.pm | 219 ++++++++++++++++++++++++++++++++++++---
 PVE/QemuServer/PCI.pm    |   8 ++
 4 files changed, 226 insertions(+), 18 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index f0b0dda..456ee97 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem get_virtiomem_block_size);
 use PVE::QemuServer::PCI;
 use PVE::QemuServer::USB;
 use PVE::QemuMigrate;
@@ -490,6 +490,13 @@ my $check_memory_param = sub {
     die "current memory value too large (must be smaller than max memory)\n"
 	if $mem->{max} && $mem->{current} > $mem->{max};
 
+    if ($mem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	die "memory need to be a multiple of $blocksize MiB when virtiomem is enabled\n"
+	    if $mem->{current} % $blocksize != 0;
+    }
+
     if ($param->{memory} || defined($param->{balloon})) {
 
 	my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 1257902..7e0ade2 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3887,8 +3887,14 @@ sub config_to_command {
 	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
     }
 
+    my $mem_devices = {};
     PVE::QemuServer::Memory::config(
-	$conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
+	$conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd, $mem_devices);
+
+    foreach my $id (sort keys %$mem_devices) {
+	my $pciaddr = print_pci_addr($id, $bridges, $arch, $machine_type);
+	push @$devices, "-device", "$mem_devices->{$id}$pciaddr";
+    }
 
     push @$cmd, '-S' if $conf->{freeze};
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 0d64a5f..dcdf318 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
 use strict;
 use warnings;
 
+use POSIX qw(ceil);
+
 use PVE::Exception qw(raise raise_param_exc);
-use PVE::GuestHelpers qw(safe_num_ne);
+use PVE::GuestHelpers qw(safe_boolean_ne safe_num_ne);
 use PVE::JSONSchema;
 use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 
@@ -16,6 +18,7 @@ our @EXPORT_OK = qw(
 get_current_memory
 parse_memory
 get_host_max_mem
+get_virtiomem_block_size
 );
 
 my $MAX_NUMA = 8;
@@ -38,6 +41,12 @@ our $memory_fmt = {
 	maximum => 4194304,
 	format => 'pve-qm-memory-max',
     },
+    virtio => {
+	description => "Use virtio-mem devices for hotplug (Experimental: Only works with Linux guest with kernel >= 5.10)",
+	type => 'boolean',
+	optional => 1,
+	default => 0,
+    },
 };
 
 PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
@@ -74,7 +83,9 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = parse_memory($conf->{memory});
 
-    if ($memory->{max}) {
+    if ($memory->{virtio}) {
+	$static_memory = 4096;
+    } elsif ($memory->{max}) {
 	my $dimm_size = $memory->{max} / $MAX_SLOTS;
 	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
 	$static_memory = $dimm_size * $sockets;
@@ -163,6 +174,121 @@ sub get_current_memory {
     return $memory->{current};
 }
 
+sub get_virtiomem_block_size {
+    my ($conf) = @_;
+
+    my $sockets = $conf->{sockets} || 1;
+    my $MAX_MEM = get_max_mem($conf);
+    my $static_memory = get_static_mem($conf, $sockets, 1);
+    my $memory = get_current_memory($conf->{memory});
+
+    #virtiomem can map 32000 block size.
+    #try to use lowest blocksize, lower = more chance to unplug memory.
+    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
+    #2MB is the minimum to be aligned with THP
+    $blocksize = 2 if $blocksize < 2;
+    $blocksize = 2**(ceil(log($blocksize)/log(2)));
+    #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
+    $blocksize = 4 if $blocksize < 4;
+
+    return $blocksize;
+}
+
+my sub get_virtiomem_total_current_size {
+    my ($mems) = @_;
+    my $size = 0;
+    for my $mem (values %$mems) {
+	$size += $mem->{current};
+    }
+    return $size;
+}
+
+my sub balance_virtiomem {
+    my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
+
+    my $nb_virtiomem = scalar(keys %$virtiomems);
+
+    print"try to balance memory on $nb_virtiomem virtiomems\n";
+
+    #if we can't share exactly the same amount, we add the remainder on last node
+    my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
+    my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
+
+    my $i = 0;
+    foreach my $id (sort keys %$virtiomems) {
+	my $virtiomem = $virtiomems->{$id};
+	$i++;
+	my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
+	$virtiomem->{completed} = 0;
+	$virtiomem->{retry} = 0;
+	$virtiomem->{target} = $virtiomem_target;
+
+	print "virtiomem$id: set-requested-size : $virtiomem_target\n";
+	mon_cmd(
+	    $vmid,
+	    'qom-set',
+	    path => "/machine/peripheral/virtiomem$id",
+	    property => "requested-size",
+	    value => $virtiomem_target * 1024 * 1024
+	);
+    }
+
+    my $total_finished = 0;
+    my $error = undef;
+
+    while ($total_finished != $nb_virtiomem) {
+
+	sleep 1;
+
+	$total_finished = 0;
+
+	foreach my $id (keys %$virtiomems) {
+
+	    my $virtiomem = $virtiomems->{$id};
+
+	    if ($virtiomem->{error} || $virtiomem->{completed}) {
+		$total_finished++;
+		next;
+	    }
+
+	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
+	    $virtiomem->{current} = $size / 1024 / 1024;
+	    print"virtiomem$id: last: $virtiomem->{last} current: $virtiomem->{current} target: $virtiomem->{target} retry: $virtiomem->{retry}\n";
+
+	    if($virtiomem->{current} == $virtiomem->{target}) {
+		print"virtiomem$id: completed\n";
+		$virtiomem->{completed} = 1;
+		next;
+	    }
+
+	    if($virtiomem->{current} != $virtiomem->{last}) {
+		#if value has changed, but not yet completed
+		$virtiomem->{retry} = 0;
+		$virtiomem->{last} = $virtiomem->{current};
+		next;
+	    }
+
+	    if($virtiomem->{retry} >= 5) {
+		print "virtiomem$id: target memory still not reached, ignoring device from now on\n";
+		$virtiomem->{error} = 1;
+		$error = 1;
+		#as change is async, we don't want that value change after the api call
+		eval {
+		    mon_cmd(
+			$vmid,
+			'qom-set',
+			path => "/machine/peripheral/virtiomem$id",
+			property => "requested-size",
+			value => $virtiomem->{current} * 1024 *1024
+		    );
+		};
+	    }
+	    $virtiomem->{retry}++;
+	}
+    }
+    die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -249,7 +375,37 @@ sub qemu_memory_hotplug {
     my $MAX_MEM = get_max_mem($conf);
     die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
 
-    if ($value > $memory) {
+    if ($oldmem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	my $virtiomems = {};
+
+	for (my $i = 0; $i < $sockets; $i++) {
+	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$i", property => "size");
+	    $size = $size / 1024 /1024;
+	    $virtiomems->{$i} = {
+		current => $size,
+		last => $size,
+		error => 0,
+		completed => 0,
+		retry => 0
+	    };
+	}
+
+	my $target_total = $value - $static_memory;
+	my $err;
+	eval {
+	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
+	};
+	$err = $@ if $@;
+
+	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+	$newmem->{current} = $current_memory;
+	$conf->{memory} = print_memory($newmem);
+	PVE::QemuConfig->write_config($vmid, $conf);
+	die $err if $err;
+
+    } elsif ($value > $memory) {
 
 	my $numa_hostmap;
 
@@ -342,8 +498,8 @@ sub can_hotplug {
     my $oldmem = parse_memory($conf->{memory});
     my $newmem = parse_memory($value);
 
-    return if safe_num_ne($newmem->{max}, $oldmem->{max});
-
+    return if (safe_num_ne($newmem->{max}, $oldmem->{max}) ||
+	       safe_boolean_ne($newmem->{virtio}, $oldmem->{virtio}));
     return 1;
 }
 
@@ -365,7 +521,7 @@ sub qemu_memdevices_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd, $mem_devices) = @_;
 
     my $memory = get_current_memory($conf->{memory});
     my $static_memory = get_static_mem($conf, $sockets, $hotplug);
@@ -383,7 +539,10 @@ sub config {
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	my $confmem = parse_memory($conf->{memory});
 	my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
-	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
+	my $cmdstr = "size=${static_memory}";
+	$cmdstr .= ",slots=$slots" if !$confmem->{'virtio'};
+	$cmdstr .= ",maxmem=${MAX_MEM}M";
+	push @$cmd, '-m', $cmdstr;
 
     } else {
 
@@ -455,17 +614,42 @@ sub config {
     }
 
     if ($hotplug) {
-	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
-	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+	my $confmem = parse_memory($conf->{memory});
 
-	    push @$cmd, "-object" , $mem_object;
-	    push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+	if ($confmem->{'virtio'}) {
 
-	    die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
-		if $current_size > $memory;
-	});
+	    my $MAX_MEM = get_max_mem($conf);
+	    my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
+	    my $node_mem = ($memory - $static_memory) / $sockets;
+	    my $blocksize = get_virtiomem_block_size($conf);
+
+	    die "memory need to be a multiple of $blocksize MiB with maxmemory $MAX_MEM MiB when virtiomem is enabled\n"
+		if $memory % $blocksize != 0;
+
+	    for (my $i = 0; $i < $sockets; $i++)  {
+
+		my $id = "virtiomem$i";
+		my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
+		push @$cmd, "-object" , "$mem_object,reserve=off";
+
+		my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i";
+		$mem_device .= ",prealloc=on" if $conf->{hugepages};
+		$mem_devices->{$id} = $mem_device;
+	    }
+	} else {
+	    foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
+		my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+
+		my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+
+		push @$cmd, "-object" , $mem_object;
+		push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+
+		die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
+		    if $current_size > $memory;
+	    });
+	}
     }
 }
 
@@ -476,8 +660,11 @@ sub print_mem_object {
 
 	my $hugepages_size = hugepages_size($conf, $size);
 	my $path = hugepages_mount_path($hugepages_size);
+	my $confmem = parse_memory($conf->{memory});
 
-	return "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on,prealloc=yes";
+	my $object = "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on";
+	$object .= ",prealloc=yes" if !$confmem->{virtio};
+	return $object;
     } else {
 	return "memory-backend-ram,id=$id,size=${size}M";
     }
diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index 1673041..722c56f 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -261,6 +261,14 @@ sub get_pci_addr_map {
 	'scsihw2' => { bus => 4, addr => 1 },
 	'scsihw3' => { bus => 4, addr => 2 },
 	'scsihw4' => { bus => 4, addr => 3 },
+	'virtiomem0' => { bus => 4, addr => 4 },
+	'virtiomem1' => { bus => 4, addr => 5 },
+	'virtiomem2' => { bus => 4, addr => 6 },
+	'virtiomem3' => { bus => 4, addr => 7 },
+	'virtiomem4' => { bus => 4, addr => 8 },
+	'virtiomem5' => { bus => 4, addr => 9 },
+	'virtiomem6' => { bus => 4, addr => 10 },
+	'virtiomem7' => { bus => 4, addr => 11 },
     } if !defined($pci_addr_map);
     return $pci_addr_map;
 }
-- 
2.39.2




^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 qemu-server 09/10] memory: virtio-mem : implement redispatch retry.
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (9 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 10/10] tests: add virtio-mem tests Alexandre Derumier
  2023-09-01 12:24 ` [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Fiona Ebner
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

If some memory can't be removed on a specific node (e.g. because of
unmovable pages), we try to rebalance the remaining amount on the
other nodes.
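
For example (illustrative numbers): when unplugging across two nodes,
if virtiomem0 cannot free its full share because of unmovable pages
and gets flagged as an error after 5 unchanged polls, its current size
is accounted for via the new get_virtiomem_total_errors_size() and
balance_virtiomem() is called again, spreading the leftover target
over the remaining device(s).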

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 59 +++++++++++++++++++++++++++++-----------
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index dcdf318..86ab16b 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -203,13 +203,32 @@ my sub get_virtiomem_total_current_size {
     return $size;
 }
 
+my sub get_virtiomem_total_errors_size {
+    my ($mems) = @_;
+
+    my $size = 0;
+    for my $mem (values %$mems) {
+	next if !$mem->{error};
+	$size += $mem->{current};
+    }
+    return $size;
+}
+
 my sub balance_virtiomem {
     my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
 
-    my $nb_virtiomem = scalar(keys %$virtiomems);
+    my $nb_virtiomem = scalar(grep { !$_->{error} } values $virtiomems->%*);
 
     print"try to balance memory on $nb_virtiomem virtiomems\n";
 
+    my $target_total_err = undef;
+    if($target_total < 0) {
+	$target_total = 0;
+	$target_total_err = 1;
+    }
+
+    die "No more virtiomem devices left to try to balance the remaining memory\n" if $nb_virtiomem == 0;
+
     #if we can't share exactly the same amount, we add the remainder on last node
     my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
     my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
@@ -217,6 +236,7 @@ my sub balance_virtiomem {
     my $i = 0;
     foreach my $id (sort keys %$virtiomems) {
 	my $virtiomem = $virtiomems->{$id};
+	next if $virtiomem->{error};
 	$i++;
 	my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
 	$virtiomem->{completed} = 0;
@@ -234,7 +254,6 @@ my sub balance_virtiomem {
     }
 
     my $total_finished = 0;
-    my $error = undef;
 
     while ($total_finished != $nb_virtiomem) {
 
@@ -258,6 +277,7 @@ my sub balance_virtiomem {
 	    if($virtiomem->{current} == $virtiomem->{target}) {
 		print"virtiomem$id: completed\n";
 		$virtiomem->{completed} = 1;
+		$virtiomem->{last} = $virtiomem->{current};
 		next;
 	    }
 
@@ -271,7 +291,6 @@ my sub balance_virtiomem {
 	    if($virtiomem->{retry} >= 5) {
 		print "virtiomem$id: target memory still not reached, ignoring device from now on\n";
 		$virtiomem->{error} = 1;
-		$error = 1;
 		#as change is async, we don't want that value change after the api call
 		eval {
 		    mon_cmd(
@@ -286,7 +305,9 @@ my sub balance_virtiomem {
 	    $virtiomem->{retry}++;
 	}
     }
-    die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
+
+    die "No more virtiomem devices left to try to balance the remaining memory\n"
+	if $target_total_err;
 }
 
 sub get_numa_node_list {
@@ -392,18 +413,24 @@ sub qemu_memory_hotplug {
 	    };
 	}
 
-	my $target_total = $value - $static_memory;
-	my $err;
-	eval {
-	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
-	};
-	$err = $@ if $@;
-
-	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
-	$newmem->{current} = $current_memory;
-	$conf->{memory} = print_memory($newmem);
-	PVE::QemuConfig->write_config($vmid, $conf);
-	die $err if $err;
+	while (1) {
+
+	    my $target_total = $value - $static_memory - get_virtiomem_total_errors_size($virtiomems);
+	    my $err;
+	    eval {
+		balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
+	    };
+	    $err = $@ if $@;
+
+	    my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+	    $newmem->{current} = $current_memory;
+	    $conf->{memory} = print_memory($newmem);
+	    PVE::QemuConfig->write_config($vmid, $conf);
+
+	    die $err if $err;
+	    last if $current_memory == $value;
+	}
+	return $conf->{memory};
 
     } elsif ($value > $memory) {
 
-- 
2.39.2




^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] [PATCH v6 qemu-server 10/10] tests: add virtio-mem tests
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (10 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 09/10] memory: virtio-mem : implement redispatch retry Alexandre Derumier
@ 2023-06-19  7:28 ` Alexandre Derumier
  2023-09-01 12:24 ` [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Fiona Ebner
  12 siblings, 0 replies; 29+ messages in thread
From: Alexandre Derumier @ 2023-06-19  7:28 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  | 12 +++++++
 .../memory-virtio-hugepages-1G.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio-max.conf           | 11 ++++++
 test/cfg2cmd/memory-virtio-max.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio.conf               | 11 ++++++
 test/cfg2cmd/memory-virtio.conf.cmd           | 35 +++++++++++++++++++
 6 files changed, 139 insertions(+)
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
new file mode 100644
index 0000000..bbe908c
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
@@ -0,0 +1,12 @@
+# TEST: virtio-mem without memorymax defined and hugepages
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
new file mode 100644
index 0000000..2e7de74
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-file,id=ram-node0,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-file,id=mem-virtiomem0,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -object 'memory-backend-file,id=mem-virtiomem1,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,prealloc=on,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,prealloc=on,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf b/test/cfg2cmd/memory-virtio-max.conf
new file mode 100644
index 0000000..59b8e94
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf
@@ -0,0 +1,11 @@
+# TEST: memory max 64G with 2 socket with virtio-mem
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf.cmd b/test/cfg2cmd/memory-virtio-max.conf.cmd
new file mode 100644
index 0000000..6bda268
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=30720M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=30720M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf b/test/cfg2cmd/memory-virtio.conf
new file mode 100644
index 0000000..60c0264
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf
@@ -0,0 +1,11 @@
+# TEST: virtio-mem with memorymax defined
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf.cmd b/test/cfg2cmd/memory-virtio.conf.cmd
new file mode 100644
index 0000000..f392636
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=260096M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=260096M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
-- 
2.39.2




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields Alexandre Derumier
@ 2023-09-01  9:48   ` Thomas Lamprecht
  2023-09-01 10:24     ` Fiona Ebner
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Lamprecht @ 2023-09-01  9:48 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

Am 19/06/2023 um 09:28 schrieb Alexandre Derumier:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  www/manager6/qemu/MemoryEdit.js | 52 +++++++++++++++++++++++++++------
>  1 file changed, 43 insertions(+), 9 deletions(-)
> 
> diff --git a/www/manager6/qemu/MemoryEdit.js b/www/manager6/qemu/MemoryEdit.js
> index 5e91dc9b..be7903a2 100644
> --- a/www/manager6/qemu/MemoryEdit.js
> +++ b/www/manager6/qemu/MemoryEdit.js

> @@ -63,7 +73,7 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
>  		xtype: 'pveMemoryField',
>  		labelWidth: labelWidth,
>  		fieldLabel: gettext('Memory') + ' (MiB)',
> -		name: 'memory',
> +		name: 'current',
>  		value: '512', // better defaults get set via the view controllers afterrender
>  		minValue: 1,
>  		step: 32,
> @@ -96,12 +106,25 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
>  		allowBlank: false,
>  		listeners: {
>  		    change: function(f, value) {
> -			var memory = me.down('field[name=memory]').getValue();
> +			var memory = me.down('field[name=current]').getValue();
>  			var shares = me.down('field[name=shares]');
>  			shares.setDisabled(value === memory);
>  		    },
>  		},
>  	    },
> +	    {
> +		xtype: 'pveMemoryField',
> +		name: 'max',
> +		minValue: 65536,
> +		maxValue: 4194304,
> +		value: '',
> +		step: 65536,
> +		fieldLabel: gettext('Maximum memory') + ' (MiB)',

This huge step size will be confusing to users; there should be a way
to have smaller steps (e.g., 1 GiB or even 128 MiB).

As even nowadays, with a huge amount of installed memory on a lot of servers,
deciding that a (potentially bad actor) VM can use up 64G or 128G is still
quite the difference on a lot of setups. Fiona is checking the backend here
to see if it might be done with a finer granularity, or what other options
we have here.

> +		labelWidth: labelWidth,
> +		allowBlank: true,
> +		emptyText: gettext('Disabled'),
> +		submitEmptyText: false,

I'd maybe show a warning if the guest has Memory hotplug disabled completely

Also, it would be great if we could check validity with other fields, i.e., if

> +	    },
>  	    {
>  		xtype: 'proxmoxintegerfield',
>  		name: 'shares',
> @@ -132,6 +155,13 @@ Ext.define('PVE.qemu.MemoryInputPanel', {
>  		    },
>  		},
>  	    },
> +	    {
> +		xtype: 'proxmoxcheckbox',
> +		labelWidth: labelWidth,
> +		name: 'virtio',
> +		defaultValue: 0,
> +		fieldLabel: gettext('Virtiomem'),
> +	    },

Can we change this to a radio field group with label "Hotplug Type" and options
"VirtIO-Mem" and "Static DIMMs" (or some wording like that) and show a warning if
the OS type isn't Linux (i.e., "l26").

As while there are some experimental patches for Windows [0], they are not
included in the VirtIO drivers and development seems to have stalled a bit.

[0]: https://github.com/virtio-win/kvm-guest-drivers-windows/pull/794




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser Alexandre Derumier
@ 2023-09-01 10:23   ` Fiona Ebner
  0 siblings, 0 replies; 29+ messages in thread
From: Fiona Ebner @ 2023-09-01 10:23 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

Am 19.06.23 um 09:28 schrieb Alexandre Derumier:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---

There's a new usage of get_derived_property() since this patch was
sent. Please add it to the next version:

> diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
> index d4230895..0e4b1eb8 100644
> --- a/PVE/QemuConfig.pm
> +++ b/PVE/QemuConfig.pm
> @@ -549,8 +549,8 @@ sub get_derived_property {
>         my $cpus =
>             ($conf->{sockets} || $defaults->{sockets}) * ($conf->{cores} || $defaults->{cores});
>         return $conf->{vcpus} || $cpus;
> -    } elsif ($name eq 'max-memory') {
> -       return ($conf->{memory} || $defaults->{memory}) * 1024 * 1024;
> +    } elsif ($name eq 'max-memory') { # current usage maximum, not maximum hotpluggable
> +       return (get_current_memory($conf->{memory}) || $defaults->{memory}) * 1024 * 1024;
>      } else {
>         die "unknown derived property - $name\n";
>      }
> 




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-01  9:48   ` Thomas Lamprecht
@ 2023-09-01 10:24     ` Fiona Ebner
  2023-09-02  6:18       ` DERUMIER, Alexandre
  0 siblings, 1 reply; 29+ messages in thread
From: Fiona Ebner @ 2023-09-01 10:24 UTC (permalink / raw)
  To: Proxmox VE development discussion, Thomas Lamprecht, Alexandre Derumier

Am 01.09.23 um 11:48 schrieb Thomas Lamprecht:
> Am 19/06/2023 um 09:28 schrieb Alexandre Derumier:
>> +		xtype: 'pveMemoryField',
>> +		name: 'max',
>> +		minValue: 65536,
>> +		maxValue: 4194304,
>> +		value: '',
>> +		step: 65536,
>> +		fieldLabel: gettext('Maximum memory') + ' (MiB)',
> 
> This huge step size will be confusing to users, there should be a way to have
> smaller steps (e.g., 1 GiB or even 128 MiB).
> 
> As even nowadays, with a huge amount of installed memory on a lot of servers,
> deciding that a (potentially bad actor) VM can use up 64G or 128G is still
> quite the difference on a lot of setups. Fiona is checking the backend here
> to see if it might be done with a finer granularity, or what other options
> we have here.
> 

At first glance, I think it should be possible. Even if we keep the
restriction "all memory devices should have the same size", which makes
the code easier:

For dimms, we have 64 slots, so I don't see a reason why we can't use 64
MiB granularity rather than 64 GiB.

For virtio-mem, we have one device per socket (up to 8, assuming a power
of 2), and the blocksize is 2 MiB, so we could have 16 MiB granularity.
Or is there an issue setting the 'size' for a virtio-mem-pci device to
such a fine-grained value? Even if there is, we can just create the
device with a bigger supported 'size' and have our API reject a request
to go beyond the maximum later.
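
(To spell the granularity out: with 64 slots, each 1 MiB of dimm size
corresponds to 64 MiB of reachable max memory; with one virtio-mem
device per socket, up to 8 devices at a 2 MiB block size give
8 x 2 MiB = 16 MiB.)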




^ permalink raw reply	[flat|nested] 29+ messages in thread

* [pve-devel] applied: [PATCH v6 qemu-server 03/10] memory: use static_memory in foreach_dimm
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 03/10] memory: use static_memory in foreach_dimm Alexandre Derumier
@ 2023-09-01 11:39   ` Fiona Ebner
  0 siblings, 0 replies; 29+ messages in thread
From: Fiona Ebner @ 2023-09-01 11:39 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

applied this one, thanks!




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem
  2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (11 preceding siblings ...)
  2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 10/10] tests: add virtio-mem tests Alexandre Derumier
@ 2023-09-01 12:24 ` Fiona Ebner
       [not found]   ` <CAOKSTBveZE6K6etnDESKXBt1_XpDYUMGpr12qQPyuv0beDRcQw@mail.gmail.com>
  2023-09-01 16:32   ` DERUMIER, Alexandre
  12 siblings, 2 replies; 29+ messages in thread
From: Fiona Ebner @ 2023-09-01 12:24 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

I'll send a few patches (likely on Monday) moving a bit of code around,
to get rid of the cyclic include
PVE::QemuServer <-> PVE::QemuServer::Memory

While it shouldn't affect your changes directly, I think it will affect
the patch context, so you might want to wait for that before doing the
next rebase.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem
       [not found]   ` <CAOKSTBveZE6K6etnDESKXBt1_XpDYUMGpr12qQPyuv0beDRcQw@mail.gmail.com>
@ 2023-09-01 16:30     ` DERUMIER, Alexandre
  0 siblings, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2023-09-01 16:30 UTC (permalink / raw)
  To: pve-devel, gilberto.nunes32

Hi Gilberto,

the idea with virtio-mem (but also classic hotplug) is to define a
max memory
(currently it's max=4TB by default, without any option).

Then for classic memory:

this max memory is divided across a number of memory slots (64 by
default),

so the hotplug increment will be max mem / 64 slots (with max=64GB you
can hotplug/unplug 1GB dimms up to 64GB; with max=256GB you can
hotplug/unplug 4GB dimms up to 256GB).


for virtio-mem (for guests with kernel >= ~5.15)

it's a little bit different: one big memory device is created, but you
can enable/disable small chunks (max memory / 32000).

so with max=64GB, you can hotplug/unplug in 2MB increments up to 64GB, ...
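
As a config example (values taken from the series' cfg2cmd tests), the
only difference is the virtio flag in the memory property:

    memory: 32768,max=65536           # classic: 64 x 1GB dimms
    memory: 32768,max=65536,virtio=1  # virtio-mem: small chunks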







Le vendredi 01 septembre 2023 à 09:36 -0300, Gilberto Ferreira a
écrit :
> Hi there.
> While I am a complete zero at programming, I like to follow these
> threads about Proxmox development.
> As always, it's a great piece of software.
> So I would like to ask if this memory hotplug will solve the problem
> we encounter when trying to grow memory in a VM, like from 32 up
> to 64, without downtime.
> Much appreciated.
> 
> Thanks
> ---
> Gilberto Nunes Ferreira
> (47) 99676-7530 - Whatsapp / Telegram
> 
> 
> 
> 
> 
> Em sex., 1 de set. de 2023 às 09:24, Fiona Ebner
> <f.ebner@proxmox.com> escreveu:
> > I'll send a few patches (likely on Monday) moving a bit of code
> > around,
> > to get rid of the cyclic include
> > PVE::QemuServer <-> PVE::QemuServer::Memory
> > 
> > While it shouldn't affect your changes directly, I think it will
> > affect
> > the patch context, so you might want to wait for that before doing
> > the
> > next rebase.
> > 
> > 
> > _______________________________________________
> > pve-devel mailing list
> > pve-devel@lists.proxmox.com
> > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> > 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem
  2023-09-01 12:24 ` [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Fiona Ebner
       [not found]   ` <CAOKSTBveZE6K6etnDESKXBt1_XpDYUMGpr12qQPyuv0beDRcQw@mail.gmail.com>
@ 2023-09-01 16:32   ` DERUMIER, Alexandre
  1 sibling, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2023-09-01 16:32 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

ok, thanks Fiona !


Le vendredi 01 septembre 2023 à 14:24 +0200, Fiona Ebner a écrit :
> I'll send a few patches (likely on Monday) moving a bit of code
> around,
> to get rid of the cyclic include
> PVE::QemuServer <-> PVE::QemuServer::Memory
> 
> While it shouldn't affect your changes directly, I think it will
> affect
> the patch context, so you might want to wait for that before doing
> the
> next rebase.
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-01 10:24     ` Fiona Ebner
@ 2023-09-02  6:18       ` DERUMIER, Alexandre
  2023-09-04 10:48         ` Fiona Ebner
  2023-09-04 11:40         ` Thomas Lamprecht
  0 siblings, 2 replies; 29+ messages in thread
From: DERUMIER, Alexandre @ 2023-09-02  6:18 UTC (permalink / raw)
  To: pve-devel, t.lamprecht, f.ebner, aderumier

Le vendredi 01 septembre 2023 à 12:24 +0200, Fiona Ebner a écrit :
> Am 01.09.23 um 11:48 schrieb Thomas Lamprecht:
> > Am 19/06/2023 um 09:28 schrieb Alexandre Derumier:
> > > +               xtype: 'pveMemoryField',
> > > +               name: 'max',
> > > +               minValue: 65536,
> > > +               maxValue: 4194304,
> > > +               value: '',
> > > +               step: 65536,
> > > +               fieldLabel: gettext('Maximum memory') + ' (MiB)',
> > 
> > This huge step size will be confusing to users, there should be a
> > way to have
> > smaller steps (e.g., 1 GiB or even 128 MiB).
> > 
> > As even nowadays, with a huge amount of installed memory on a lot
> > of servers,
> > deciding that a (potentially bad actor) VM can use up 64G or 128G
> > is still
> > quite the difference on a lot of setups. Fiona is checking the
> > backend here
> > to see if it might be done with a finer granularity, or what other
> > options
> > we have here.
> > 

I was not thinking about max size as a security feature, but more as a
way to define the minimum dimm size needed to reach this max value.
But indeed, it could be interesting.

The step of max should be, at minimum, the dimm size:

max > 4GB && <= 64GB : 1GB dimm size
max > 64GB && <= 128GB : 2GB dimm size
max > 128GB && <= 256GB : 4GB dimm size


and we start qemu with the real max mem of the range (max conf > 4GB
and <= 64GB -> qemu is started with 64GB maxmem).

Like this, the user could change the max value within the current max
range without needing to restart the VM.

> 
> From a first glance, I think it should be possible. Even if we keep
> the
> restriction "all memory devices should have the same size", which
> makes
> the code easier:
> 
> For dimms, we have 64 slots, so I don't see a reason why we can't use
> 64
> MiB granularity rather than 64 GiB.
> 
> 

Note that I think we shouldn't go under 128MiB for the dimm size, as
it's the minimum hotplug granularity on Linux:

https://docs.kernel.org/admin-guide/mm/memory-hotplug.html
"Memory Hot(Un)Plug Granularity
Memory hot(un)plug in Linux uses the SPARSEMEM memory model, which
divides the physical memory address space into chunks of the same size:
memory sections. The size of a memory section is architecture
dependent. For example, x86_64 uses 128 MiB and ppc64 uses 16 MiB."

Static mem is already set up at 4GB, so I really don't know if we want
a 128MiB dimm size in 2023?

I haven't really tested Windows and other OSes with dimm sizes under
1GiB.


If really needed, we could add (a small sketch follows this list):

max > 4GB && <= 8GB : 128MB dimm size
max > 8GB && <= 16GB : 256MB dimm size
max > 16GB && <= 32GB : 512MB dimm size
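
A minimal sketch of the ladder being discussed (purely illustrative,
not part of the series; the actual code derives the dimm size as
max / 64 slots):

    # hypothetical helper: smallest proposed dimm size (in MiB) whose
    # 64 slots cover the requested max (in MiB)
    sub dimm_size_for_max {
        my ($max) = @_;
        for my $dimm (128, 256, 512, 1024, 2048, 4096) {
            return $dimm if $max <= $dimm * 64;
        }
        die "max too large for this sketch\n";
    }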





> For virtio-mem, we have one device per socket (up to 8, assuming a
> power
> of 2), and the blocksize is 2 MiB, so we could have 16 MiB
> granularity.

Yes, it's not a problem to use 16MiB max step granularity.


> Or is there an issue setting the 'size' for a  virtio-mem-pci device
> to
> such a fine grained value? Even if there is, we can just create the
> device with a bigger supported 'size' and have our API reject a
> request
> to go beyond the maximum later.
> 

Yes, I was more in favor of the second proposal.

Like for classic mem, we could simply do something like:

max > 4GB && <= 64GB, with a max step of 16MiB:

--> start qemu with maxmem=64GB, 32000 chunks of 2MiB, but don't allow
the user to add more memory than the configured max via the API.

The max memory is not allocated anyway (until you use hugepages).

This would allow the user to change the max value within the current
max range without needing to restart the VM.
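
The API-side guard this relies on already exists in patch 08/10's
$check_memory_param; the same pattern would apply here:

    # reject requests beyond the configured max (from patch 08/10)
    die "current memory value too large (must be smaller than max memory)\n"
        if $mem->{max} && $mem->{current} > $mem->{max};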



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-02  6:18       ` DERUMIER, Alexandre
@ 2023-09-04 10:48         ` Fiona Ebner
  2023-09-04 11:40         ` Thomas Lamprecht
  1 sibling, 0 replies; 29+ messages in thread
From: Fiona Ebner @ 2023-09-04 10:48 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, t.lamprecht, aderumier

Am 02.09.23 um 08:18 schrieb DERUMIER, Alexandre:
> Le vendredi 01 septembre 2023 à 12:24 +0200, Fiona Ebner a écrit :
>> Am 01.09.23 um 11:48 schrieb Thomas Lamprecht:
>>> Am 19/06/2023 um 09:28 schrieb Alexandre Derumier:
>>>> +               xtype: 'pveMemoryField',
>>>> +               name: 'max',
>>>> +               minValue: 65536,
>>>> +               maxValue: 4194304,
>>>> +               value: '',
>>>> +               step: 65536,
>>>> +               fieldLabel: gettext('Maximum memory') + ' (MiB)',
>>>
>>> This huge step size will be confusing to users, there should be a
>>> way to have
>>> smaller steps (e.g., 1 GiB or even 128 MiB).
>>>
>>> As even nowadays, with a huge amount of installed memory on a lot
>>> of servers,
>>> deciding that a (potentially bad actor) VM can use up 64G or 128G
>>> is still
>>> quite the difference on a lot of setups. Fiona is checking the
>>> backend here
>>> to see if it might be done with a finer granularity, or what other
>>> options
>>> we have here.
>>>
> 
> I was not think about max size as a security feature, but more to
> define the min dimm size to reach this max value.
> But indeed, it could be interesting.
> 

Permission-wise there would need to be a difference between changing
'current' and changing 'max', but I'd say that's something for later.

>> From a first glance, I think it should be possible. Even if we keep
>> the
>> restriction "all memory devices should have the same size", which
>> makes
>> the code easier:
>>
>> For dimms, we have 64 slots, so I don't see a reason why we can't use
>> 64
>> MiB granularity rather than 64 GiB.
>>
>>
> 
> Note that I think we shouldn't go under 128Mib for dimmsize as it's the
> minimum hotplug granularity on linux
> 
> https://docs.kernel.org/admin-guide/mm/memory-hotplug.html
> "Memory Hot(Un)Plug Granularity
> Memory hot(un)plug in Linux uses the SPARSEMEM memory model, which
> divides the physical memory address space into chunks of the same size:
> memory sections. The size of a memory section is architecture
> dependent. For example, x86_64 uses 128 MiB and ppc64 uses 16 MiB."
> 

Okay, I see. Then let's go with the "create with support for more
initially and have API deny requests bigger than max"-approach.

> Static mem is already setup at 4GB, so I really don't known if we want
> 128mib dimm size in 2023 ?  
> 
> I'm really don't have tested windows and other OS under 1 gbit dimm
> size.
> 

The current implementation starts out with adding 512 MiB-sized dimms,
so at least those should be fine.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-02  6:18       ` DERUMIER, Alexandre
  2023-09-04 10:48         ` Fiona Ebner
@ 2023-09-04 11:40         ` Thomas Lamprecht
  2023-09-04 11:48           ` Fiona Ebner
  1 sibling, 1 reply; 29+ messages in thread
From: Thomas Lamprecht @ 2023-09-04 11:40 UTC (permalink / raw)
  To: Proxmox VE development discussion, DERUMIER, Alexandre, f.ebner,
	aderumier

Am 02/09/2023 um 08:18 schrieb DERUMIER, Alexandre:
> Le vendredi 01 septembre 2023 à 12:24 +0200, Fiona Ebner a écrit :
>> Am 01.09.23 um 11:48 schrieb Thomas Lamprecht:
>>> Am 19/06/2023 um 09:28 schrieb Alexandre Derumier:
>>>> +               xtype: 'pveMemoryField',
>>>> +               name: 'max',
>>>> +               minValue: 65536,
>>>> +               maxValue: 4194304,
>>>> +               value: '',
>>>> +               step: 65536,
>>>> +               fieldLabel: gettext('Maximum memory') + ' (MiB)',
>>> This huge step size will be confusing to users, there should be a
>>> way to have
>>> smaller steps (e.g., 1 GiB or even 128 MiB).
>>>
>>> As even nowadays, with a huge amount of installed memory on a lot
>>> of servers,
>>> deciding that a (potentially bad actor) VM can use up 64G or 128G
>>> is still
>>> quite the difference on a lot of setups. Fiona is checking the
>>> backend here
>>> to see if it might be done with a finer granularity, or what other
>>> options
>>> we have here.
>>>
> I was not think about max size as a security feature, but more to
> define the min dimm size to reach this max value.

Hmm, then I might find it easier to understand if this is named "DIMM
Size" or "Minimal DIMM Size"; for the UI we could show the resulting
max memory that one can achieve with each DIMM size selected.

The range could be from 128 MB to 64 GB (or higher?), and yeah, if we
have an actual maximum we could also auto-calculate it if not set
explicitly by the user.

But I'm currently not too deep into this topic, so take my suggestions
with a grain of salt.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-04 11:40         ` Thomas Lamprecht
@ 2023-09-04 11:48           ` Fiona Ebner
  2023-09-05 15:10             ` DERUMIER, Alexandre
  0 siblings, 1 reply; 29+ messages in thread
From: Fiona Ebner @ 2023-09-04 11:48 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox VE development discussion, DERUMIER,
	Alexandre, aderumier

Am 04.09.23 um 13:40 schrieb Thomas Lamprecht:
> Am 02/09/2023 um 08:18 schrieb DERUMIER, Alexandre:
>> Le vendredi 01 septembre 2023 à 12:24 +0200, Fiona Ebner a écrit :
>>> Am 01.09.23 um 11:48 schrieb Thomas Lamprecht:
>>>> Am 19/06/2023 um 09:28 schrieb Alexandre Derumier:
>>>>> +               xtype: 'pveMemoryField',
>>>>> +               name: 'max',
>>>>> +               minValue: 65536,
>>>>> +               maxValue: 4194304,
>>>>> +               value: '',
>>>>> +               step: 65536,
>>>>> +               fieldLabel: gettext('Maximum memory') + ' (MiB)',
>>>> This huge step size will be confusing to users, there should be a
>>>> way to have
>>>> smaller steps (e.g., 1 GiB or even 128 MiB).
>>>>
>>>> As even nowadays, with a huge amount of installed memory on a lot
>>>> of servers,
>>>> deciding that a (potentially bad actor) VM can use up 64G or 128G
>>>> is still
>>>> quite the difference on a lot of setups. Fiona is checking the
>>>> backend here
>>>> to see if it might be done with a finer granularity, or what other
>>>> options
>>>> we have here.
>>>>
>> I was not think about max size as a security feature, but more to
>> define the min dimm size to reach this max value.
> 
> Hmm, then I'd might it easier to understand if this is named "DIMM Size"
> or "Minimal DIMM-Size", for the UI we could show the resulting max-memory
> that one can achieve with each DIMM-Size selected.
> 
> The range could be from 128 MB to 64 GB (or higher?), and yeah if we have
> an actual maximum we could also auto-calculate it, if not set explicitly
> by the user.
> 
> But, I'm currently not to deep into this topic, so take my suggestions
> with a grain of salt.

The advantage with 'max' is that it can be used for both hotplug with
dimms and virtio-mem. Otherwise, we'd need two different sub-options
depending on the hotplug method.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-04 11:48           ` Fiona Ebner
@ 2023-09-05 15:10             ` DERUMIER, Alexandre
  2023-09-05 15:16               ` Thomas Lamprecht
  0 siblings, 1 reply; 29+ messages in thread
From: DERUMIER, Alexandre @ 2023-09-05 15:10 UTC (permalink / raw)
  To: pve-devel, t.lamprecht, f.ebner, aderumier

> 
> The advantage with 'max' is that it can be used for both, hotplug
> with
> dimms and virtio-mem. Otherwise, we'd need two different sub-options
> depending on hotplug method.
> 
yes, that's what I thought too; it would be great to have the same API
call with the same options, with or without virtio-mem.

(virtio-mem will be the default for new Linux distros, but for Windows
or older Linux distros, we still need to use the old dimm method)



My first idea for the GUI, for the max value, was a combobox displaying
a hint with the memory topology, something like:

max = 64GB : 64 x 1GB dimm
max = 128GB : 64 x 2GB dimm
...

(or maybe it could be a hint next to a simple integer field)





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-05 15:10             ` DERUMIER, Alexandre
@ 2023-09-05 15:16               ` Thomas Lamprecht
  2023-09-05 22:35                 ` DERUMIER, Alexandre
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Lamprecht @ 2023-09-05 15:16 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, f.ebner, aderumier

Am 05/09/2023 um 17:10 schrieb DERUMIER, Alexandre:
>>
>> The advantage with 'max' is that it can be used for both, hotplug
>> with
>> dimms and virtio-mem. Otherwise, we'd need two different sub-options
>> depending on hotplug method.
>>
> yes, that's what I thinked too, it could be great to have same api call
> with same options, with or without virtio-mem.
> 
> (virtio-mem will be the default for new linux distro, but for windows
> or older linux distro, we still need to use old dimm method)
> 
> 
> 
> My first idea for the gui, for the max value, was a combobox displaying
> an hint with mem topology, something like:
> 
> max = 64GB : 64 x 1GB dimm
> max = 128GB: 64 x 2GB dimm
> ...
> 
> (or maybe it could be a hint outside a simple integer field)
> 

We could still allow setting the DIMM size in the UI with a simple integer
field and a step size of 1 (GB) and then calculate the max from that?





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-05 15:16               ` Thomas Lamprecht
@ 2023-09-05 22:35                 ` DERUMIER, Alexandre
  2024-07-08 15:10                   ` Fiona Ebner
  0 siblings, 1 reply; 29+ messages in thread
From: DERUMIER, Alexandre @ 2023-09-05 22:35 UTC (permalink / raw)
  To: pve-devel, t.lamprecht, f.ebner, aderumier

Le mardi 05 septembre 2023 à 17:16 +0200, Thomas Lamprecht a écrit :
> Am 05/09/2023 um 17:10 schrieb DERUMIER, Alexandre:
> > > 
> > > The advantage with 'max' is that it can be used for both, hotplug
> > > with
> > > dimms and virtio-mem. Otherwise, we'd need two different sub-
> > > options
> > > depending on hotplug method.
> > > 
> > yes, that's what I thinked too, it could be great to have same api
> > call
> > with same options, with or without virtio-mem.
> > 
> > (virtio-mem will be the default for new linux distro, but for
> > windows
> > or older linux distro, we still need to use old dimm method)
> > 
> > 
> > 
> > My first idea for the gui, for the max value, was a combobox
> > displaying
> > an hint with mem topology, something like:
> > 
> > max = 64GB : 64 x 1GB dimm
> > max = 128GB: 64 x 2GB dimm
> > ...
> > 
> > (or maybe it could be a hint outside a simple integer field)
> > 
> 
> We could still allow setting the DIMM size in the UI with a simple
> integer
> field and a step size of 1 (GB) and then calculate the max from that?
> 
> 
yes, that could work too. Maybe a dimm size field that changes the max
value, and at the same time, changing the max value changes the dimm
size field?


and for virtio-mem, the dimm size can be replaced by the chunk size.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2023-09-05 22:35                 ` DERUMIER, Alexandre
@ 2024-07-08 15:10                   ` Fiona Ebner
  2024-07-09  9:38                     ` DERUMIER, Alexandre via pve-devel
  0 siblings, 1 reply; 29+ messages in thread
From: Fiona Ebner @ 2024-07-08 15:10 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, t.lamprecht, aderumier

Am 06.09.23 um 00:35 schrieb DERUMIER, Alexandre:
> Le mardi 05 septembre 2023 à 17:16 +0200, Thomas Lamprecht a écrit :
>> Am 05/09/2023 um 17:10 schrieb DERUMIER, Alexandre:
>>>>
>>>> The advantage with 'max' is that it can be used for both, hotplug
>>>> with
>>>> dimms and virtio-mem. Otherwise, we'd need two different sub-
>>>> options
>>>> depending on hotplug method.
>>>>
>>> yes, that's what I thinked too, it could be great to have same api
>>> call
>>> with same options, with or without virtio-mem.
>>>
>>> (virtio-mem will be the default for new linux distro, but for
>>> windows
>>> or older linux distro, we still need to use old dimm method)
>>>
>>>
>>>
>>> My first idea for the gui, for the max value, was a combobox
>>> displaying
>>> an hint with mem topology, something like:
>>>
>>> max = 64GB : 64 x 1GB dimm
>>> max = 128GB: 64 x 2GB dimm
>>> ...
>>>
>>> (or maybe it could be a hint outside a simple integer field)
>>>
>>
>> We could still allow setting the DIMM size in the UI with a simple
>> integer
>> field and a step size of 1 (GB) and then calculate the max from that?
>>
>>
> yes, it could work too. Maybe a dimm size field, changing the max value
> , and at the same time, changing the max value is changing the dimm
> size field ?
> 
> 
> and for virtio-mem, dimmsize can be replace by chunk size
Sorry about the very late response!

When calculating from the DIMM size with a 1 GiB step size, we can only
get to max values of N * 64 GiB. We could still have a separate max
field with a smaller step size, e.g. a max value of 100 GiB would use a
DIMM size of 2 GiB and reject any API request trying to plug more
memory (so a total of 50 DIMMs can be plugged). I'm just a bit worried
the auto-update might get a bit confusing, e.g. a user might just want
to change the DIMM size without expecting that this will override the
max value. Should DIMM size maybe be a separate setting?

Maybe it's easier to just start with max and wait for use cases/requests
where changing DIMM size is actually required?

@Alexandre: I came back to this since
https://bugzilla.proxmox.com/show_bug.cgi?id=5585 came in and wanted to
ask if you'd still like to send out a new version with the improved max
granularity or if this should be picked up by a developer at Proxmox
instead?


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields
  2024-07-08 15:10                   ` Fiona Ebner
@ 2024-07-09  9:38                     ` DERUMIER, Alexandre via pve-devel
  0 siblings, 0 replies; 29+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2024-07-09  9:38 UTC (permalink / raw)
  To: pve-devel, t.lamprecht, f.ebner, aderumier; +Cc: DERUMIER, Alexandre

> 
> and for virtio-mem, the DIMM size can be replaced by the chunk size
>>Sorry about the very late response!
>>
>>When calculating from the DIMM size with 1 GiB step size, we can only
>>get to max values of N * 64 GiB. We could still have a separate max
>>field with a smaller step size, e.g. a max value of 100 GiB would use
>>a DIMM size of 2 GiB and reject any API request trying to plug more
>>memory (so a total of 50 DIMMs can be plugged). I'm just a bit worried
>>the auto-update might get confusing, e.g. a user might just want to
>>change the DIMM size without expecting that this will override the max
>>value. Should DIMM size maybe be a separate setting?
>>
>>Maybe it's easier to just start with max and wait for use
>>cases/requests where changing DIMM size is actually required?

>>@Alexandre: I came back to this since
>>https://bugzilla.proxmox.com/show_bug.cgi?id=5585 came in and wanted
>>to ask if you'd still like to send out a new version with the improved
>>max granularity or if this should be picked up by a developer at
>>Proxmox instead?


Hi Fiona! I'm currently very busy working on shared LVM SAN snapshot
&& thin provisioning (I'll submit an RFC soon).

I don't think I'll have time to rework this until September/October
(with the holidays coming). So, if a Proxmox developer could pick this
up again, that would be great :)

(BTW, there is also the new hv-balloon for Windows ^_^)




end of thread, other threads:[~2024-07-09  9:38 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
2023-06-19  7:28 [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 01/10] add memory parser Alexandre Derumier
2023-09-01 10:23   ` Fiona Ebner
2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 1/2] ui: qemu: hardware: add new memory format support Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 02/10] memory: add get_static_mem Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v2 pve-manager 2/2] ui: qemu : memoryedit: add new max && virtio fields Alexandre Derumier
2023-09-01  9:48   ` Thomas Lamprecht
2023-09-01 10:24     ` Fiona Ebner
2023-09-02  6:18       ` DERUMIER, Alexandre
2023-09-04 10:48         ` Fiona Ebner
2023-09-04 11:40         ` Thomas Lamprecht
2023-09-04 11:48           ` Fiona Ebner
2023-09-05 15:10             ` DERUMIER, Alexandre
2023-09-05 15:16               ` Thomas Lamprecht
2023-09-05 22:35                 ` DERUMIER, Alexandre
2024-07-08 15:10                   ` Fiona Ebner
2024-07-09  9:38                     ` DERUMIER, Alexandre via pve-devel
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 03/10] memory: use static_memory in foreach_dimm Alexandre Derumier
2023-09-01 11:39   ` [pve-devel] applied: " Fiona Ebner
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 04/10] config: memory: add 'max' option Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 06/10] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 07/10] test: add memory-max tests Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 09/10] memory: virtio-mem : implement redispatch retry Alexandre Derumier
2023-06-19  7:28 ` [pve-devel] [PATCH v6 qemu-server 10/10] tests: add virtio-mem tests Alexandre Derumier
2023-09-01 12:24 ` [pve-devel] [PATCH-SERIE v6 qemu-server/pve-manager] rework memory hotplug + virtiomem Fiona Ebner
     [not found]   ` <CAOKSTBveZE6K6etnDESKXBt1_XpDYUMGpr12qQPyuv0beDRcQw@mail.gmail.com>
2023-09-01 16:30     ` DERUMIER, Alexandre
2023-09-01 16:32   ` DERUMIER, Alexandre
