public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem
@ 2023-02-02 11:03 Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking Alexandre Derumier
                   ` (12 more replies)
  0 siblings, 13 replies; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

This patch series rework the current memory hotplug + virtiomem.

memory option now have extra options:

memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
ex: memory: current=1024,max=131072,virtio=1


patches 1-4: add a memory parser

patches 5-10: add the max option with 64 static dimm hotplug

for classic memory hotplug, when maxmemory is defined,
we use 64 fixed size dimm.
The max option is a multiple of 64GB.

patches 11-13: add virtio-mem

The virtio option enable new virtio-mem support,
instead of plugging dimm, it's add/removed block inside
big dimm.
virtio-mem can use 32000 blocks, the blocksize is compute from
max memory.


Changelog v2:

update differents patches based on Fiona comments.
(I have send 2 others mails with comments not yet addressed)

Biggest change is on virtio-mem, instead of trying to have same amount of memory
on each virtiomem (some block could be unmovable and break unplug),
we try to balance/dispatch remaining block on other available virtiomems.

Also, the minimum blocksize supported by linux guest os is 4MB currently,
even if virtiomem can use 2MB on qemu side.

Patch10 with hotplug fix has be merged in others patches.

Changelog v3:

- cleanup from last Fiona comments
- classic hotplug : fix memory unplug
- virtio-mem: split retry logic in a second optional patch 
  (feel free to apply it later if needed)

Alexandre Derumier (13):
  memory: extract some code to their own sub for mocking
  tests: add memory tests
  qemu_memory_hotplug: remove unused $opt arg
  add memory parser
  memory: add get_static_mem && remove parse_hotplug_features
  config: memory: add 'max' option
  memory: get_max_mem: use config memory max
  memory: don't use foreach_reversedimm for unplug
  memory: use 64 slots && static dimm size when max is defined
  test: add memory-max tests
  memory: add virtio-mem support
  memory: virtio-mem : implement redispatch retry.
  tests: add virtio-mem tests

 PVE/API2/Qemu.pm                              |  45 +-
 PVE/QemuConfig.pm                             |   4 +-
 PVE/QemuMigrate.pm                            |   6 +-
 PVE/QemuServer.pm                             |  39 +-
 PVE/QemuServer/Helpers.pm                     |   3 +-
 PVE/QemuServer/Memory.pm                      | 508 ++++++++++++++----
 PVE/QemuServer/PCI.pm                         |   8 +
 test/cfg2cmd/memory-hotplug-hugepages.conf    |  12 +
 .../cfg2cmd/memory-hotplug-hugepages.conf.cmd |  62 +++
 test/cfg2cmd/memory-hotplug.conf              |  11 +
 test/cfg2cmd/memory-hotplug.conf.cmd          | 174 ++++++
 test/cfg2cmd/memory-hugepages-1g.conf         |  11 +
 test/cfg2cmd/memory-hugepages-1g.conf.cmd     |  30 ++
 test/cfg2cmd/memory-hugepages-2m.conf         |  11 +
 test/cfg2cmd/memory-hugepages-2m.conf.cmd     |  30 ++
 test/cfg2cmd/memory-max-128G.conf             |  11 +
 test/cfg2cmd/memory-max-128G.conf.cmd         |  86 +++
 test/cfg2cmd/memory-max-512G.conf             |  11 +
 test/cfg2cmd/memory-max-512G.conf.cmd         |  58 ++
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  |  12 +
 .../memory-virtio-hugepages-1G.conf.cmd       |  35 ++
 test/cfg2cmd/memory-virtio-max.conf           |  11 +
 test/cfg2cmd/memory-virtio-max.conf.cmd       |  35 ++
 test/cfg2cmd/memory-virtio.conf               |  11 +
 test/cfg2cmd/memory-virtio.conf.cmd           |  35 ++
 test/run_config2command_tests.pl              |  15 +
 26 files changed, 1125 insertions(+), 149 deletions(-)
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hotplug.conf
 create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:44   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 02/13] tests: add memory tests Alexandre Derumier
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 3eb1c7c..5a7c1b7 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -348,7 +348,7 @@ sub config {
 	    my $numa_memory = ($static_memory / $sockets);
 
 	    for (my $i = 0; $i < $sockets; $i++)  {
-		die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/" && $conf->{hugepages};
+		die "host NUMA node$i doesn't exist\n" if !host_numanode_exist($i) && $conf->{hugepages};
 
 		my $mem_object = print_mem_object($conf, "ram-node$i", $numa_memory);
 		push @$cmd, '-object', $mem_object;
@@ -391,6 +391,12 @@ sub print_mem_object {
 
 }
 
+sub host_numanode_exist {
+    my ($id) = @_;
+
+    return -d "/sys/devices/system/node/node$id/";
+}
+
 sub print_numa_hostnodes {
     my ($hostnodelists) = @_;
 
@@ -402,7 +408,7 @@ sub print_numa_hostnodes {
 	$hostnodes .= "-$end" if defined($end);
 	$end //= $start;
 	for (my $i = $start; $i <= $end; ++$i ) {
-	    die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/";
+	    die "host NUMA node$i doesn't exist\n" if !host_numanode_exist($i);
 	}
     }
     return $hostnodes;
@@ -445,21 +451,26 @@ sub hugepages_nr {
   return $size / $hugepages_size;
 }
 
+my sub page_chunk {
+    my ($size) = @_;
+
+    return -d "/sys/kernel/mm/hugepages/hugepages-". ($size * 1024) ."kB";
+}
+
 sub hugepages_size {
     my ($conf, $size) = @_;
     die "hugepages option is not enabled" if !$conf->{hugepages};
     die "memory size '$size' is not a positive even integer; cannot use for hugepages\n"
 	if $size <= 0 || $size & 1;
 
-    my $page_chunk = sub { -d  "/sys/kernel/mm/hugepages/hugepages-". ($_[0] * 1024) ."kB" };
-    die "your system doesn't support hugepages\n" if !$page_chunk->(2) && !$page_chunk->(1024);
+    die "your system doesn't support hugepages\n" if !page_chunk(2) && !page_chunk(1024);
 
     if ($conf->{hugepages} eq 'any') {
 
 	# try to use 1GB if available && memory size is matching
-	if ($page_chunk->(1024) && ($size & 1023) == 0) {
+	if (page_chunk(1024) && ($size & 1023) == 0) {
 	    return 1024;
-	} elsif ($page_chunk->(2)) {
+	} elsif (page_chunk(2)) {
 	    return 2;
 	} else {
 	    die "host only supports 1024 GB hugepages, but requested size '$size' is not a multiple of 1024 MB\n"
@@ -468,7 +479,7 @@ sub hugepages_size {
 
 	my $hugepagesize = $conf->{hugepages};
 
-	if (!$page_chunk->($hugepagesize)) {
+	if (!page_chunk($hugepagesize)) {
 	    die "your system doesn't support hugepages of $hugepagesize MB\n";
 	} elsif (($size % $hugepagesize) != 0) {
 	    die "Memory size $size is not a multiple of the requested hugepages size $hugepagesize\n";
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 02/13] tests: add memory tests
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:44   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 03/13] qemu_memory_hotplug: remove unused $opt arg Alexandre Derumier
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-hotplug-hugepages.conf    |  12 ++
 .../cfg2cmd/memory-hotplug-hugepages.conf.cmd |  62 +++++++
 test/cfg2cmd/memory-hotplug.conf              |  11 ++
 test/cfg2cmd/memory-hotplug.conf.cmd          | 174 ++++++++++++++++++
 test/cfg2cmd/memory-hugepages-1g.conf         |  11 ++
 test/cfg2cmd/memory-hugepages-1g.conf.cmd     |  30 +++
 test/cfg2cmd/memory-hugepages-2m.conf         |  11 ++
 test/cfg2cmd/memory-hugepages-2m.conf.cmd     |  30 +++
 test/run_config2command_tests.pl              |  15 ++
 9 files changed, 356 insertions(+)
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hotplug.conf
 create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd

diff --git a/test/cfg2cmd/memory-hotplug-hugepages.conf b/test/cfg2cmd/memory-hotplug-hugepages.conf
new file mode 100644
index 0000000..6cba31e
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug-hugepages.conf
@@ -0,0 +1,12 @@
+# TEST: memory hotplug with 1GB hugepage
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 18432
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd b/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
new file mode 100644
index 0000000..6d4d8e8
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
@@ -0,0 +1,62 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=2048,slots=255,maxmem=524288M' \
+  -object 'memory-backend-file,id=ram-node0,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-file,id=mem-dimm0,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-file,id=mem-dimm1,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-file,id=mem-dimm2,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-file,id=mem-dimm3,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-file,id=mem-dimm4,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-file,id=mem-dimm5,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-file,id=mem-dimm6,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-file,id=mem-dimm7,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-file,id=mem-dimm8,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-file,id=mem-dimm9,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-file,id=mem-dimm10,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-file,id=mem-dimm11,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-file,id=mem-dimm12,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-file,id=mem-dimm13,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-file,id=mem-dimm14,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-file,id=mem-dimm15,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hotplug.conf b/test/cfg2cmd/memory-hotplug.conf
new file mode 100644
index 0000000..386e61f
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug.conf
@@ -0,0 +1,11 @@
+# TEST: basic memory hotplug
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 66560
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
diff --git a/test/cfg2cmd/memory-hotplug.conf.cmd b/test/cfg2cmd/memory-hotplug.conf.cmd
new file mode 100644
index 0000000..2da7955
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug.conf.cmd
@@ -0,0 +1,174 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=1024,slots=255,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=512M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=512M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=512M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=512M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=512M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=512M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=512M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=512M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=512M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=512M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=512M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=512M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=512M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=512M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=512M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=512M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm14,size=512M' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm15,size=512M' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm16,size=512M' \
+  -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm17,size=512M' \
+  -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm18,size=512M' \
+  -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm19,size=512M' \
+  -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm20,size=512M' \
+  -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm21,size=512M' \
+  -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm22,size=512M' \
+  -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm23,size=512M' \
+  -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm24,size=512M' \
+  -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm25,size=512M' \
+  -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm26,size=512M' \
+  -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm27,size=512M' \
+  -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm28,size=512M' \
+  -device 'pc-dimm,id=dimm28,memdev=mem-dimm28,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm29,size=512M' \
+  -device 'pc-dimm,id=dimm29,memdev=mem-dimm29,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm30,size=512M' \
+  -device 'pc-dimm,id=dimm30,memdev=mem-dimm30,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm31,size=512M' \
+  -device 'pc-dimm,id=dimm31,memdev=mem-dimm31,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm32,size=1024M' \
+  -device 'pc-dimm,id=dimm32,memdev=mem-dimm32,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm33,size=1024M' \
+  -device 'pc-dimm,id=dimm33,memdev=mem-dimm33,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm34,size=1024M' \
+  -device 'pc-dimm,id=dimm34,memdev=mem-dimm34,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm35,size=1024M' \
+  -device 'pc-dimm,id=dimm35,memdev=mem-dimm35,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm36,size=1024M' \
+  -device 'pc-dimm,id=dimm36,memdev=mem-dimm36,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm37,size=1024M' \
+  -device 'pc-dimm,id=dimm37,memdev=mem-dimm37,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm38,size=1024M' \
+  -device 'pc-dimm,id=dimm38,memdev=mem-dimm38,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm39,size=1024M' \
+  -device 'pc-dimm,id=dimm39,memdev=mem-dimm39,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm40,size=1024M' \
+  -device 'pc-dimm,id=dimm40,memdev=mem-dimm40,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm41,size=1024M' \
+  -device 'pc-dimm,id=dimm41,memdev=mem-dimm41,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm42,size=1024M' \
+  -device 'pc-dimm,id=dimm42,memdev=mem-dimm42,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm43,size=1024M' \
+  -device 'pc-dimm,id=dimm43,memdev=mem-dimm43,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm44,size=1024M' \
+  -device 'pc-dimm,id=dimm44,memdev=mem-dimm44,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm45,size=1024M' \
+  -device 'pc-dimm,id=dimm45,memdev=mem-dimm45,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm46,size=1024M' \
+  -device 'pc-dimm,id=dimm46,memdev=mem-dimm46,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm47,size=1024M' \
+  -device 'pc-dimm,id=dimm47,memdev=mem-dimm47,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm48,size=1024M' \
+  -device 'pc-dimm,id=dimm48,memdev=mem-dimm48,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm49,size=1024M' \
+  -device 'pc-dimm,id=dimm49,memdev=mem-dimm49,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm50,size=1024M' \
+  -device 'pc-dimm,id=dimm50,memdev=mem-dimm50,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm51,size=1024M' \
+  -device 'pc-dimm,id=dimm51,memdev=mem-dimm51,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm52,size=1024M' \
+  -device 'pc-dimm,id=dimm52,memdev=mem-dimm52,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm53,size=1024M' \
+  -device 'pc-dimm,id=dimm53,memdev=mem-dimm53,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm54,size=1024M' \
+  -device 'pc-dimm,id=dimm54,memdev=mem-dimm54,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm55,size=1024M' \
+  -device 'pc-dimm,id=dimm55,memdev=mem-dimm55,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm56,size=1024M' \
+  -device 'pc-dimm,id=dimm56,memdev=mem-dimm56,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm57,size=1024M' \
+  -device 'pc-dimm,id=dimm57,memdev=mem-dimm57,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm58,size=1024M' \
+  -device 'pc-dimm,id=dimm58,memdev=mem-dimm58,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm59,size=1024M' \
+  -device 'pc-dimm,id=dimm59,memdev=mem-dimm59,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm60,size=1024M' \
+  -device 'pc-dimm,id=dimm60,memdev=mem-dimm60,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm61,size=1024M' \
+  -device 'pc-dimm,id=dimm61,memdev=mem-dimm61,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm62,size=1024M' \
+  -device 'pc-dimm,id=dimm62,memdev=mem-dimm62,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm63,size=1024M' \
+  -device 'pc-dimm,id=dimm63,memdev=mem-dimm63,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm64,size=2048M' \
+  -device 'pc-dimm,id=dimm64,memdev=mem-dimm64,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm65,size=2048M' \
+  -device 'pc-dimm,id=dimm65,memdev=mem-dimm65,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm66,size=2048M' \
+  -device 'pc-dimm,id=dimm66,memdev=mem-dimm66,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm67,size=2048M' \
+  -device 'pc-dimm,id=dimm67,memdev=mem-dimm67,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm68,size=2048M' \
+  -device 'pc-dimm,id=dimm68,memdev=mem-dimm68,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm69,size=2048M' \
+  -device 'pc-dimm,id=dimm69,memdev=mem-dimm69,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm70,size=2048M' \
+  -device 'pc-dimm,id=dimm70,memdev=mem-dimm70,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm71,size=2048M' \
+  -device 'pc-dimm,id=dimm71,memdev=mem-dimm71,node=1' \
+   -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hugepages-1g.conf b/test/cfg2cmd/memory-hugepages-1g.conf
new file mode 100644
index 0000000..8db2cca
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-1g.conf
@@ -0,0 +1,11 @@
+# TEST: memory wih 1gb hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 8192
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hugepages-1g.conf.cmd b/test/cfg2cmd/memory-hugepages-1g.conf.cmd
new file mode 100644
index 0000000..63792d2
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-1g.conf.cmd
@@ -0,0 +1,30 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 8192 \
+  -object 'memory-backend-file,id=ram-node0,size=4096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=4096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hugepages-2m.conf b/test/cfg2cmd/memory-hugepages-2m.conf
new file mode 100644
index 0000000..f0d65fb
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-2m.conf
@@ -0,0 +1,11 @@
+# TEST: memory wih 2mb hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 8192
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 2
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hugepages-2m.conf.cmd b/test/cfg2cmd/memory-hugepages-2m.conf.cmd
new file mode 100644
index 0000000..287c0ed
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-2m.conf.cmd
@@ -0,0 +1,30 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 8192 \
+  -object 'memory-backend-file,id=ram-node0,size=4096M,mem-path=/run/hugepages/kvm/2048kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=4096M,mem-path=/run/hugepages/kvm/2048kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl
index f097811..dafc3b8 100755
--- a/test/run_config2command_tests.pl
+++ b/test/run_config2command_tests.pl
@@ -178,6 +178,21 @@ $qemu_server_config->mock(
     },
 );
 
+my $qemu_server_memory;
+$qemu_server_memory = Test::MockModule->new('PVE::QemuServer::Memory');
+$qemu_server_memory->mock(
+    page_chunk => sub {
+	return 1;
+    },
+    host_numanode_exist => sub {
+	my ($id) = @_;
+	return 1;
+    },
+    get_host_phys_address_bits => sub {
+	return 46;
+    }
+);
+
 my $pve_common_tools;
 $pve_common_tools = Test::MockModule->new('PVE::Tools');
 $pve_common_tools->mock(
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 03/13] qemu_memory_hotplug: remove unused $opt arg
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 02/13] tests: add memory tests Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:56   ` [pve-devel] applied: " Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 04/13] add memory parser Alexandre Derumier
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        | 4 ++--
 PVE/QemuServer/Memory.pm | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index e4d1a70..a0e16dc 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5046,7 +5046,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
 		die "skip\n" if !$hotplug_features->{memory};
-		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt);
+		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
 	    } elsif ($opt eq 'cpulimit') {
@@ -5121,7 +5121,7 @@ sub vmconfig_hotplug_pending {
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
 		die "skip\n" if !$hotplug_features->{memory};
-		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt, $value);
+		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
 		$cgroup->change_cpu_shares($new_cpuunits);
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 5a7c1b7..6bb931e 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -155,7 +155,7 @@ sub foreach_reverse_dimm {
 }
 
 sub qemu_memory_hotplug {
-    my ($vmid, $conf, $defaults, $opt, $value) = @_;
+    my ($vmid, $conf, $defaults, $value) = @_;
 
     return $value if !PVE::QemuServer::check_running($vmid);
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 04/13] add memory parser
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (2 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 03/13] qemu_memory_hotplug: remove unused $opt arg Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:44   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 05/13] memory: add get_static_mem && remove parse_hotplug_features Alexandre Derumier
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm          | 14 +++++--
 PVE/QemuConfig.pm         |  4 +-
 PVE/QemuMigrate.pm        |  6 ++-
 PVE/QemuServer.pm         | 27 +++++++-------
 PVE/QemuServer/Helpers.pm |  3 +-
 PVE/QemuServer/Memory.pm  | 78 ++++++++++++++++++++++++++++++---------
 6 files changed, 92 insertions(+), 40 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 587bb22..6c08280 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -1489,8 +1490,6 @@ my $update_vm_api  = sub {
 
     my $storecfg = PVE::Storage::config();
 
-    my $defaults = PVE::QemuServer::load_defaults();
-
     &$resolve_cdrom_alias($param);
 
     # now try to verify all parameters
@@ -1608,7 +1607,16 @@ my $update_vm_api  = sub {
 	}
 
 	if ($param->{memory} || defined($param->{balloon})) {
-	    my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
+
+	    my $maxmem = undef;
+	    if ($param->{memory}) {
+	        $maxmem = get_current_memory($param->{memory});
+	    } elsif ($conf->{pending}->{memory}) {
+	        $maxmem = get_current_memory($conf->{pending}->{memory});
+	    } else {
+	        $maxmem = get_current_memory($conf->{memory});
+	    }
+
 	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
 
 	    die "balloon value too large (must be smaller than assigned memory)\n"
diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
index 051382c..999e658 100644
--- a/PVE/QemuConfig.pm
+++ b/PVE/QemuConfig.pm
@@ -12,6 +12,7 @@ use PVE::QemuServer::Helpers;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer;
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::Storage;
 use PVE::Tools;
 use PVE::Format qw(render_bytes render_duration);
@@ -208,8 +209,7 @@ sub __snapshot_save_vmstate {
 	$target = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
     }
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $mem_size = $conf->{memory} // $defaults->{memory};
+    my $mem_size = get_current_memory($conf->{memory});
     my $driver_state_size = 500; # assume 500MB is enough to safe all driver state;
     # our savevm-start does live-save of the memory until the space left in the
     # volume is just enough for the remaining memory content + internal state
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 09cc1d8..f86fdbe 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -26,6 +26,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::Helpers qw(min_version);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer;
 
 use PVE::AbstractMigrate;
@@ -1026,7 +1027,8 @@ sub phase2_start_remote_cluster {
     my $remote_vmid = $self->{opts}->{remote}->{vmid};
 
     # like regular start but with some overhead accounted for
-    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
+    my $memory = get_current_memory($self->{vmconf}->{memory});
+    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}, $memory) + 10;
 
     my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
 
@@ -1181,7 +1183,7 @@ sub phase2 {
     $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime);
 
     # set cachesize to 10% of the total memory
-    my $memory =  $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
     my $cachesize = int($memory * 1048576 / 10);
     $cachesize = round_powerof2($cachesize);
 
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a0e16dc..17ecd63 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -53,7 +53,7 @@ use PVE::QemuServer::CGroup;
 use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options);
 use PVE::QemuServer::Drive qw(is_valid_drivename drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::PCI qw(print_pci_addr print_pcie_addr print_pcie_root_port parse_hostpci);
 use PVE::QemuServer::USB qw(parse_usb_device);
@@ -341,11 +341,9 @@ my $confdesc = {
     },
     memory => {
 	optional => 1,
-	type => 'integer',
-	description => "Amount of RAM for the VM in MiB. This is the maximum available memory when"
-	    ." you use the balloon device.",
-	minimum => 16,
-	default => 512,
+	type => 'string',
+	description => "Memory properties.",
+	format => $PVE::QemuServer::Memory::memory_fmt
     },
     balloon => {
 	optional => 1,
@@ -2933,8 +2931,7 @@ sub vmstatus {
 	$d->{cpus} = $conf->{vcpus} if $conf->{vcpus};
 
 	$d->{name} = $conf->{name} || "VM $vmid";
-	$d->{maxmem} = $conf->{memory} ? $conf->{memory}*(1024*1024)
-	    : $defaults->{memory}*(1024*1024);
+	$d->{maxmem} = get_current_memory($conf->{memory})*(1024*1024);
 
 	if ($conf->{balloon}) {
 	    $d->{balloon_min} = $conf->{balloon}*(1024*1024);
@@ -3869,7 +3866,7 @@ sub config_to_command {
 	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
     }
 
-    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd);
+    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features, $cmd);
 
     push @$cmd, '-S' if $conf->{freeze};
 
@@ -5033,7 +5030,7 @@ sub vmconfig_hotplug_pending {
 		# enable balloon device is not hotpluggable
 		die "skip\n" if defined($conf->{balloon}) && $conf->{balloon} == 0;
 		# here we reset the ballooning value to memory
-		my $balloon = $conf->{memory} || $defaults->{memory};
+		my $balloon = get_current_memory($conf->{memory});
 		mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 	    } elsif ($fast_plug_option->{$opt}) {
 		# do nothing
@@ -5046,7 +5043,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
 		die "skip\n" if !$hotplug_features->{memory};
-		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults);
+		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
 	    } elsif ($opt eq 'cpulimit') {
@@ -5101,7 +5098,8 @@ sub vmconfig_hotplug_pending {
 
 		# allow manual ballooning if shares is set to zero
 		if ((defined($conf->{shares}) && ($conf->{shares} == 0))) {
-		    my $balloon = $conf->{pending}->{balloon} || $conf->{memory} || $defaults->{memory};
+		    my $memory = get_current_memory($conf->{memory});
+		    my $balloon = $conf->{pending}->{balloon} || $memory;
 		    mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 		}
 	    } elsif ($opt =~ m/^net(\d+)$/) {
@@ -5121,7 +5119,7 @@ sub vmconfig_hotplug_pending {
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
 		die "skip\n" if !$hotplug_features->{memory};
-		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $value);
+		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
 		$cgroup->change_cpu_shares($new_cpuunits);
@@ -5798,7 +5796,8 @@ sub vm_start_nolock {
 	push @$cmd, '-S';
     }
 
-    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $resume);
+    my $memory = get_current_memory($conf->{memory});
+    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $memory, $resume);
 
     my $pci_devices = {}; # host pci devices
     for (my $i = 0; $i < $PVE::QemuServer::PCI::MAX_HOSTPCI_DEVICES; $i++)  {
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index e91f906..9115d50 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -143,8 +143,7 @@ sub version_cmp {
 }
 
 sub config_aware_timeout {
-    my ($config, $is_suspended) = @_;
-    my $memory = $config->{memory};
+    my ($config, $memory, $is_suspended) = @_;
     my $timeout = 30;
 
     # Based on user reported startup time for vm with 512GiB @ 4-5 minutes
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 6bb931e..0704944 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -8,10 +8,42 @@ use PVE::Exception qw(raise raise_param_exc);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use base qw(Exporter);
+
+our @EXPORT_OK = qw(
+get_current_memory
+);
 
 my $MAX_NUMA = 8;
 my $STATICMEM = 1024;
 
+our $memory_fmt = {
+    current => {
+	description => "Current amount of online RAM for the VM in MiB. This is the maximum available memory when"
+	    ." you use the balloon device.",
+	type => 'integer',
+	default_key => 1,
+	minimum => 16,
+	default => 512,
+    }
+};
+
+sub print_memory {
+    my $memory = shift;
+
+    return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
+}
+
+sub parse_memory {
+    my ($value) = @_;
+
+    return { current => $memory_fmt->{current}->{default} } if !defined($value);
+
+    my $res = PVE::JSONSchema::parse_property_string($memory_fmt, $value)
+	or die "Cannot parse memory properties: $value\n";
+    return $res;
+}
+
 my $_host_bits;
 my sub get_host_phys_address_bits {
     return $_host_bits if defined($_host_bits);
@@ -63,6 +95,13 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+sub get_current_memory {
+    my ($value) = @_;
+
+    my $memory = parse_memory($value);
+    return $memory->{current};
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -155,16 +194,19 @@ sub foreach_reverse_dimm {
 }
 
 sub qemu_memory_hotplug {
-    my ($vmid, $conf, $defaults, $value) = @_;
+    my ($vmid, $conf, $value) = @_;
 
     return $value if !PVE::QemuServer::check_running($vmid);
 
-    my $sockets = 1;
-    $sockets = $conf->{sockets} if $conf->{sockets};
+    my $oldmem = parse_memory($conf->{memory});
+    my $newmem = parse_memory($value);
+
+    return $value if $newmem->{current} == $oldmem->{current};
 
-    my $memory = $conf->{memory} || $defaults->{memory};
-    $value = $defaults->{memory} if !$value;
-    return $value if $value == $memory;
+    my $memory = $oldmem->{current};
+    $value = $newmem->{current};
+
+    my $sockets = $conf->{sockets} || 1;
 
     my $static_memory = $STATICMEM;
     $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
@@ -180,7 +222,7 @@ sub qemu_memory_hotplug {
 	foreach_dimm($conf, $vmid, $value, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-		return if $current_size <= $conf->{memory};
+		return if $current_size <= get_current_memory($conf->{memory});
 
 		if ($conf->{hugepages}) {
 		    $numa_hostmap = get_numa_guest_to_host_map($conf) if !$numa_hostmap;
@@ -219,7 +261,8 @@ sub qemu_memory_hotplug {
 		    die $err;
 		}
 		#update conf after each succesful module hotplug
-		$conf->{memory} = $current_size;
+		$newmem->{current} = $current_size;
+		$conf->{memory} = print_memory($newmem);
 		PVE::QemuConfig->write_config($vmid, $conf);
 	});
 
@@ -228,7 +271,8 @@ sub qemu_memory_hotplug {
 	foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-		return if $current_size >= $conf->{memory};
+		return if $current_size >= get_current_memory($conf->{memory});
+
 		print "try to unplug memory dimm $name\n";
 
 		my $retry = 0;
@@ -242,12 +286,14 @@ sub qemu_memory_hotplug {
 		}
 
 		#update conf after each succesful module unplug
-		$conf->{memory} = $current_size;
+		$newmem->{current} = $current_size;
+		$conf->{memory} = print_memory($newmem);
 
 		eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
 		PVE::QemuConfig->write_config($vmid, $conf);
 	});
     }
+    return $conf->{memory};
 }
 
 sub qemu_dimm_list {
@@ -268,9 +314,10 @@ sub qemu_dimm_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug_features, $cmd) = @_;
+
+    my $memory = get_current_memory($conf->{memory});
 
-    my $memory = $conf->{memory} || $defaults->{memory};
     my $static_memory = 0;
 
     if ($hotplug_features->{memory}) {
@@ -496,12 +543,9 @@ sub hugepages_topology {
 
     return if !$conf->{numa};
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf->{memory});
+    my $sockets = $conf->{sockets} || 1;
     my $static_memory = 0;
-    my $sockets = 1;
-    $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer iused
-    $sockets = $conf->{sockets} if $conf->{sockets};
     my $numa_custom_topology = undef;
     my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 05/13] memory: add get_static_mem && remove parse_hotplug_features
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (3 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 04/13] add memory parser Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:44   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 06/13] config: memory: add 'max' option Alexandre Derumier
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        |  5 ++--
 PVE/QemuServer/Memory.pm | 51 ++++++++++++++++++++--------------------
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 17ecd63..9db1d8e 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3866,7 +3866,7 @@ sub config_to_command {
 	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
     }
 
-    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features, $cmd);
+    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
 
     push @$cmd, '-S' if $conf->{freeze};
 
@@ -5901,7 +5901,8 @@ sub vm_start_nolock {
     if ($conf->{hugepages}) {
 
 	my $code = sub {
-	    my $hugepages_topology = PVE::QemuServer::Memory::hugepages_topology($conf);
+	    my $hotplug_features = parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
+	    my $hugepages_topology = PVE::QemuServer::Memory::hugepages_topology($conf, $hotplug_features->{memory});
 	    my $hugepages_host_topology = PVE::QemuServer::Memory::hugepages_host_topology();
 
 	    PVE::QemuServer::Memory::hugepages_mount();
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 0704944..7079907 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -15,7 +15,6 @@ get_current_memory
 );
 
 my $MAX_NUMA = 8;
-my $STATICMEM = 1024;
 
 our $memory_fmt = {
     current => {
@@ -44,6 +43,23 @@ sub parse_memory {
     return $res;
 }
 
+my sub get_static_mem {
+    my ($conf, $sockets, $hotplug) = @_;
+
+    my $static_memory = 0;
+    my $memory = parse_memory($conf->{memory});
+
+    if ($hotplug) {
+	#legacy
+	$static_memory = 1024;
+	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    } else {
+	$static_memory = $memory->{current};
+    }
+
+    return $static_memory;
+}
+
 my $_host_bits;
 my sub get_host_phys_address_bits {
     return $_host_bits if defined($_host_bits);
@@ -208,8 +224,7 @@ sub qemu_memory_hotplug {
 
     my $sockets = $conf->{sockets} || 1;
 
-    my $static_memory = $STATICMEM;
-    $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    my $static_memory = get_static_mem($conf, $sockets, 1);
 
     die "memory can't be lower than $static_memory MB" if $value < $static_memory;
     my $MAX_MEM = get_max_mem($conf);
@@ -314,13 +329,13 @@ sub qemu_dimm_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $hotplug_features, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
 
     my $memory = get_current_memory($conf->{memory});
 
-    my $static_memory = 0;
+    my $static_memory = get_static_mem($conf, $sockets, $hotplug);
 
-    if ($hotplug_features->{memory}) {
+    if ($hotplug) {
 	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
 	my $MAX_MEM = get_max_mem($conf);
 	die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory > $MAX_MEM;
@@ -330,18 +345,10 @@ sub config {
 		if $conf->{"numa$i"};
 	}
 
-	my $sockets = 1;
-	$sockets = $conf->{sockets} if $conf->{sockets};
-
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
 
     } else {
-
-	$static_memory = $memory;
 	push @$cmd, '-m', $static_memory;
     }
 
@@ -408,7 +415,7 @@ sub config {
 	}
     }
 
-    if ($hotplug_features->{memory}) {
+    if ($hotplug) {
 	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
@@ -537,7 +544,7 @@ sub hugepages_size {
 }
 
 sub hugepages_topology {
-    my ($conf) = @_;
+    my ($conf, $hotplug) = @_;
 
     my $hugepages_topology = {};
 
@@ -545,16 +552,8 @@ sub hugepages_topology {
 
     my $memory = get_current_memory($conf->{memory});
     my $sockets = $conf->{sockets} || 1;
-    my $static_memory = 0;
+    my $static_memory = get_static_mem($conf, $sockets, $hotplug);
     my $numa_custom_topology = undef;
-    my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
-
-    if ($hotplug_features->{memory}) {
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-    } else {
-	$static_memory = $memory;
-    }
 
     #custom numa topology
     for (my $i = 0; $i < $MAX_NUMA; $i++) {
@@ -585,7 +584,7 @@ sub hugepages_topology {
 	}
     }
 
-    if ($hotplug_features->{memory}) {
+    if ($hotplug) {
 	my $numa_hostmap = get_numa_guest_to_host_map($conf);
 
 	foreach_dimm($conf, undef, $memory, $sockets, sub {
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 06/13] config: memory: add 'max' option
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (4 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 05/13] memory: add get_static_mem && remove parse_hotplug_features Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:44   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 07/13] memory: get_max_mem: use config memory max Alexandre Derumier
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

max can be multiple of 64GiB only,
The dimm size is compute from the max memory

we can have 64 slots:

64GiB = 64 slots x 1GiB
128GiB = 64 slots x 2GiB
..
4TiB = 64 slots x 64GiB

Also, with numa, we need to share slot between (up to 8) sockets.

64 is a multiple of 8,

64GiB = 8 sockets * 8 slots * 1GiB
128GiB = 8 sockets * 8 slots * 2GiB
...

and with virtio-mem,
we can have 32000 blocks of 2MiB minimum

64GB = 32000 * 2MiB

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        |  4 ++--
 PVE/QemuServer/Memory.pm | 29 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 9db1d8e..1f73605 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5042,7 +5042,7 @@ sub vmconfig_hotplug_pending {
 		vm_deviceunplug($vmid, $conf, $opt);
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
-		die "skip\n" if !$hotplug_features->{memory};
+		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf);
 		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
@@ -5118,7 +5118,7 @@ sub vmconfig_hotplug_pending {
 		vmconfig_update_disk($storecfg, $conf, $hotplug_features->{disk},
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
-		die "skip\n" if !$hotplug_features->{memory};
+		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf, $value);
 		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 7079907..a46b504 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
 use strict;
 use warnings;
 
+use PVE::JSONSchema;
 use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 use PVE::Exception qw(raise raise_param_exc);
+use PVE::GuestHelpers qw(safe_string_ne safe_num_ne safe_boolean_ne);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
@@ -24,9 +26,23 @@ our $memory_fmt = {
 	default_key => 1,
 	minimum => 16,
 	default => 512,
+    },
+    max => {
+	type => 'integer',
+	optional => 1,
+	minimum => 65536,
+	maximum => 4194304,
+	format => 'pve-qm-memory-max',
     }
 };
 
+PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
+sub verify_qm_memory_max {
+    my ($max, $noerr) = @_;
+
+    die "max memory need to be a multiple of 64GiB\n" if $max && $max % 65536 != 0;
+}
+
 sub print_memory {
     my $memory = shift;
 
@@ -311,6 +327,19 @@ sub qemu_memory_hotplug {
     return $conf->{memory};
 }
 
+sub can_hotplug {
+    my ($hotplug, $conf, $value) = @_;
+
+    return if !$hotplug || !$value;
+
+    my $oldmem = parse_memory($conf->{memory});
+    my $newmem = parse_memory($value);
+
+    return if safe_num_ne($newmem->{max}, $oldmem->{max});
+
+    return 1;
+}
+
 sub qemu_dimm_list {
     my ($vmid) = @_;
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 07/13] memory: get_max_mem: use config memory max
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (5 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 06/13] config: memory: add 'max' option Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 08/13] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

verify than defined vm memorymax is not bigger than
host cpu supported memory

Add add early check in update vm api

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm         | 46 +++++++++++++++++++++++++---------------
 PVE/QemuServer/Memory.pm | 19 ++++++++++++++++-
 2 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 6c08280..f94c92b 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -476,6 +476,33 @@ my $create_disks = sub {
     return ($vollist, $res);
 };
 
+my $check_memory_param = sub {
+    my ($conf, $param) = @_;
+
+    my $mem = parse_memory($param->{memory});
+    my $host_max_mem = get_host_max_mem($conf);
+
+    die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+	if $mem->{max} && $mem->{max} > $host_max_mem;
+
+    if ($param->{memory} || defined($param->{balloon})) {
+
+	my $maxmem = undef;
+	if ($param->{memory}) {
+	    $maxmem = get_current_memory($param->{memory});
+	} elsif ($conf->{pending}->{memory}) {
+	    $maxmem = get_current_memory($conf->{pending}->{memory});
+	} else {
+	    $maxmem = get_current_memory($conf->{memory});
+	}
+
+	my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
+
+	die "balloon value too large (must be smaller than assigned memory)\n"
+	    if $balloon && $balloon > $maxmem;
+    }
+};
+
 my $check_cpu_model_access = sub {
     my ($rpcenv, $authuser, $new, $existing) = @_;
 
@@ -1606,22 +1633,7 @@ my $update_vm_api  = sub {
 	    }
 	}
 
-	if ($param->{memory} || defined($param->{balloon})) {
-
-	    my $maxmem = undef;
-	    if ($param->{memory}) {
-	        $maxmem = get_current_memory($param->{memory});
-	    } elsif ($conf->{pending}->{memory}) {
-	        $maxmem = get_current_memory($conf->{pending}->{memory});
-	    } else {
-	        $maxmem = get_current_memory($conf->{memory});
-	    }
-
-	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
-
-	    die "balloon value too large (must be smaller than assigned memory)\n"
-		if $balloon && $balloon > $maxmem;
-	}
+	&$check_memory_param($conf, $param);
 
 	PVE::Cluster::log_msg('info', $authuser, "update VM $vmid: " . join (' ', @paramarr));
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index a46b504..b5f1076 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -14,6 +14,8 @@ use base qw(Exporter);
 
 our @EXPORT_OK = qw(
 get_current_memory
+parse_memory
+get_host_max_mem
 );
 
 my $MAX_NUMA = 8;
@@ -93,7 +95,7 @@ my sub get_host_phys_address_bits {
     return; # undef, cannot really do anything..
 }
 
-my sub get_max_mem {
+sub get_host_max_mem {
     my ($conf) = @_;
 
     my $cpu = {};
@@ -127,6 +129,21 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+my sub get_max_mem {
+    my ($conf) = @_;
+
+    my $host_max_mem = get_host_max_mem($conf);
+    my $mem = parse_memory($conf->{memory});
+
+    if ($mem->{max}) {
+	die "configured memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+	    if $mem->{max} > $host_max_mem;
+	return $mem->{max};
+    }
+
+    return $host_max_mem;
+}
+
 sub get_current_memory {
     my ($value) = @_;
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 08/13] memory: don't use foreach_reversedimm for unplug
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (6 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 07/13] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:45   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 09/13] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

simple use dimm_list() returned by qemu

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 73 ++++++++++++----------------------------
 1 file changed, 22 insertions(+), 51 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index b5f1076..bcf5f95 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -210,38 +210,6 @@ sub foreach_dimm{
     }
 }
 
-sub foreach_reverse_dimm {
-    my ($conf, $vmid, $memory, $sockets, $func) = @_;
-
-    my $dimm_id = 253;
-    my $current_size = 0;
-    my $dimm_size = 0;
-
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
-	$current_size = 8355840;
-	$dimm_size = 131072;
-    } else {
-	$current_size = 4177920;
-	$dimm_size = 65536;
-    }
-
-    return if $current_size == $memory;
-
-    my @numa_map = get_numa_node_list($conf);
-
-    for (my $j = 0; $j < 8; $j++) {
-	for (my $i = 0; $i < 32; $i++) {
-	    my $name = "dimm${dimm_id}";
-	    $dimm_id--;
-	    my $numanode = $numa_map[(31-$i) % @numa_map];
-	    $current_size -= $dimm_size;
-	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
-	    return  $current_size if $current_size <= $memory;
-	}
-	$dimm_size /= 2;
-    }
-}
-
 sub qemu_memory_hotplug {
     my ($vmid, $conf, $value) = @_;
 
@@ -316,30 +284,33 @@ sub qemu_memory_hotplug {
 
     } else {
 
-	foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
-	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+	my $dimms = qemu_dimm_list($vmid);
+	my $current_size = $memory;
+	for my $name (sort { $dimms->{$b}{slot} <=> $dimms->{$a}{slot} } keys %$dimms) {
 
-		return if $current_size >= get_current_memory($conf->{memory});
+	    my $dimm_size = $dimms->{$name}->{size} / 1024 / 1024;
 
-		print "try to unplug memory dimm $name\n";
+	    last if $current_size <= $value || $current_size <= $static_memory;
 
-		my $retry = 0;
-		while (1) {
-		    eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
-		    sleep 3;
-		    my $dimm_list = qemu_dimm_list($vmid);
-		    last if !$dimm_list->{$name};
-		    raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
-		    $retry++;
-		}
+	    print "try to unplug memory dimm $name\n";
 
-		#update conf after each succesful module unplug
-		$newmem->{current} = $current_size;
-		$conf->{memory} = print_memory($newmem);
+	    my $retry = 0;
+	    while (1) {
+		eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
+		sleep 3;
+		my $dimm_list = qemu_dimm_list($vmid);
+		last if !$dimm_list->{$name};
+		raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
+		$retry++;
+	    }
+	    $current_size -= $dimm_size;
+	    #update conf after each succesful module unplug
+	    $newmem->{current} = $current_size;
+	    $conf->{memory} = print_memory($newmem);
 
-		eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
-		PVE::QemuConfig->write_config($vmid, $conf);
-	});
+	    eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
+	    PVE::QemuConfig->write_config($vmid, $conf);
+	}
     }
     return $conf->{memory};
 }
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 09/13] memory: use 64 slots && static dimm size when max is defined
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (7 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 08/13] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:45   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 10/13] test: add memory-max tests Alexandre Derumier
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

default kernel vhost config only support 64 slots by default,
for performance since 2015.

Original memory hotplug code was done before, using qemu
max supported 255 slots.

To reach max mem (4TB), we used incremental dimm size.

Instead of dynamic memory size, use 1 static dimm size, compute
from max memory/64.

We also force a static memory to 4GB, as memory under 4G don't
works fine with unplug

Fix:
https://bugzilla.proxmox.com/show_bug.cgi?id=3446
https://bugzilla.proxmox.com/show_bug.cgi?id=1426
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index bcf5f95..3dfa479 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -19,6 +19,7 @@ get_host_max_mem
 );
 
 my $MAX_NUMA = 8;
+my $MAX_SLOTS = 64;
 
 our $memory_fmt = {
     current => {
@@ -67,7 +68,12 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = parse_memory($conf->{memory});
 
-    if ($hotplug) {
+    if ($memory->{max}) {
+	my $dimm_size = $memory->{max} / $MAX_SLOTS;
+	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
+	$static_memory = $dimm_size * $sockets;
+	$static_memory = 4096 if $static_memory < 4096;
+    } elsif ($hotplug) {
 	#legacy
 	$static_memory = 1024;
 	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
@@ -179,17 +185,17 @@ sub get_numa_guest_to_host_map {
 }
 
 sub foreach_dimm{
-    my ($conf, $vmid, $memory, $sockets, $func) = @_;
+    my ($conf, $vmid, $memory, $current_size, $func) = @_;
 
     my $dimm_id = 0;
-    my $current_size = 0;
     my $dimm_size = 0;
 
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
-	$current_size = 1024 * $sockets;
+    my $confmem = parse_memory($conf->{memory});
+    if ($confmem->{max}) {
+	$dimm_size = $confmem->{max} / $MAX_SLOTS;
+    } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
 	$dimm_size = 1024;
     } else {
-	$current_size = 1024;
 	$dimm_size = 512;
     }
 
@@ -206,7 +212,7 @@ sub foreach_dimm{
 	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
 	    return  $current_size if $current_size >= $memory;
 	}
-	$dimm_size *= 2;
+	$dimm_size *= 2 if !$confmem->{max};
     }
 }
 
@@ -235,7 +241,7 @@ sub qemu_memory_hotplug {
 
 	my $numa_hostmap;
 
-	foreach_dimm($conf, $vmid, $value, $sockets, sub {
+	foreach_dimm($conf, $vmid, $value, $static_memory, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 		return if $current_size <= get_current_memory($conf->{memory});
@@ -363,7 +369,9 @@ sub config {
 	}
 
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
-	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
+	my $confmem = parse_memory($conf->{memory});
+	my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
+	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
 
     } else {
 	push @$cmd, '-m', $static_memory;
@@ -433,7 +441,7 @@ sub config {
     }
 
     if ($hotplug) {
-	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
+	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
@@ -604,7 +612,7 @@ sub hugepages_topology {
     if ($hotplug) {
 	my $numa_hostmap = get_numa_guest_to_host_map($conf);
 
-	foreach_dimm($conf, undef, $memory, $sockets, sub {
+	foreach_dimm($conf, undef, $memory, $static_memory, sub {
 	    my ($conf, undef, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
 	    $numanode = $numa_hostmap->{$numanode};
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 10/13] test: add memory-max tests
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (8 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 09/13] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support Alexandre Derumier
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-max-128G.conf     | 11 ++++
 test/cfg2cmd/memory-max-128G.conf.cmd | 86 +++++++++++++++++++++++++++
 test/cfg2cmd/memory-max-512G.conf     | 11 ++++
 test/cfg2cmd/memory-max-512G.conf.cmd | 58 ++++++++++++++++++
 4 files changed, 166 insertions(+)
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd

diff --git a/test/cfg2cmd/memory-max-128G.conf b/test/cfg2cmd/memory-max-128G.conf
new file mode 100644
index 0000000..8f0328d
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 64G with 2 socket, dimm size should be 1GB && static memory should be 4G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-128G.conf.cmd b/test/cfg2cmd/memory-max-128G.conf.cmd
new file mode 100644
index 0000000..5b22027
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf.cmd
@@ -0,0 +1,86 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,slots=64,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=1024M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=1024M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=1024M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=1024M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=1024M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=1024M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=1024M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=1024M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=1024M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=1024M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=1024M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=1024M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=1024M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=1024M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm14,size=1024M' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm15,size=1024M' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm16,size=1024M' \
+  -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm17,size=1024M' \
+  -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm18,size=1024M' \
+  -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm19,size=1024M' \
+  -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm20,size=1024M' \
+  -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm21,size=1024M' \
+  -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm22,size=1024M' \
+  -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm23,size=1024M' \
+  -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm24,size=1024M' \
+  -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm25,size=1024M' \
+  -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm26,size=1024M' \
+  -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm27,size=1024M' \
+  -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf b/test/cfg2cmd/memory-max-512G.conf
new file mode 100644
index 0000000..082f75e
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 512GB with 2 socket, dimm size should be 8GB && static memory should be 16G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 131072,max=524288
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf.cmd b/test/cfg2cmd/memory-max-512G.conf.cmd
new file mode 100644
index 0000000..3c04b70
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf.cmd
@@ -0,0 +1,58 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=16384,slots=64,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=8192M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=8192M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=8192M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=8192M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=8192M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=8192M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=8192M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=8192M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=8192M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=8192M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=8192M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=8192M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=8192M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=8192M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=8192M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=8192M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (9 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 10/13] test: add memory-max tests Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-03 13:46   ` Fiona Ebner
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 12/13] memory: virtio-mem : implement redispatch retry Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 13/13] tests: add virtio-mem tests Alexandre Derumier
  12 siblings, 1 reply; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

a 4GiB static memory is needed for DMA+boot memory, as this memory
is almost always un-unpluggeable.

1 virtio-mem pci device is setup for each numa node on pci.4 bridge

virtio-mem use a fixed blocksize with 32000 blocks
Blocksize is computed from the maxmemory-4096/32000 with a minimum of
2MiB to map THP.
(lower blocksize = more chance to unplug memory).

Note: Currently, linux only support 4MiB virtio blocksize, 2MiB support
is currently is progress.

For hotplug/unplug, we are try to allocate/unallocate same amount
of memory aligned to the blocksize on each numa node if possible.
If a node a not able to reach the target memory (could be an unmovable
page on unplug for example), we try again to redispatch memory the
remaining memory on others nodes.

About hugepages:

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=2949
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm         |   9 +-
 PVE/QemuServer.pm        |   7 +-
 PVE/QemuServer/Memory.pm | 207 ++++++++++++++++++++++++++++++++++++---
 PVE/QemuServer/PCI.pm    |   8 ++
 4 files changed, 214 insertions(+), 17 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index f94c92b..1686ed5 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem get_virtiomem_block_size);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -485,6 +485,13 @@ my $check_memory_param = sub {
     die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
 	if $mem->{max} && $mem->{max} > $host_max_mem;
 
+    if ($mem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	die "memory need to be a multiple of $blocksize MiB when virtiomem is enabled\n"
+	    if $mem->{current} % $blocksize != 0;
+    }
+
     if ($param->{memory} || defined($param->{balloon})) {
 
 	my $maxmem = undef;
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 1f73605..4e4bb0d 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3866,7 +3866,12 @@ sub config_to_command {
 	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
     }
 
-    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
+    my $mem_devices = {};
+    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd, $mem_devices);
+    foreach my $id (sort keys %$mem_devices) {
+	my $pciaddr = print_pci_addr($id, $bridges, $arch, $machine_type);
+	push @$devices, "-device", "$mem_devices->{$id}$pciaddr";
+    }
 
     push @$cmd, '-S' if $conf->{freeze};
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 3dfa479..9e81719 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,6 +3,8 @@ package PVE::QemuServer::Memory;
 use strict;
 use warnings;
 
+use POSIX qw(ceil);
+
 use PVE::JSONSchema;
 use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
 use PVE::Exception qw(raise raise_param_exc);
@@ -16,6 +18,7 @@ our @EXPORT_OK = qw(
 get_current_memory
 parse_memory
 get_host_max_mem
+get_virtiomem_block_size
 );
 
 my $MAX_NUMA = 8;
@@ -36,6 +39,12 @@ our $memory_fmt = {
 	minimum => 65536,
 	maximum => 4194304,
 	format => 'pve-qm-memory-max',
+    },
+    virtio => {
+	description => "Enable virtio-mem memory (Experimental: Only works with Linux guest with kernel >= 5.10)",
+	type => 'boolean',
+	optional => 1,
+	default => 0,
     }
 };
 
@@ -68,7 +77,9 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = parse_memory($conf->{memory});
 
-    if ($memory->{max}) {
+    if ($memory->{virtio}) {
+	$static_memory = 4096;
+    } elsif ($memory->{max}) {
 	my $dimm_size = $memory->{max} / $MAX_SLOTS;
 	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
 	$static_memory = $dimm_size * $sockets;
@@ -157,6 +168,109 @@ sub get_current_memory {
     return $memory->{current};
 }
 
+sub get_virtiomem_block_size {
+    my ($conf) = @_;
+
+    my $sockets = $conf->{sockets} || 1;
+    my $MAX_MEM = get_max_mem($conf);
+    my $static_memory = get_static_mem($conf, $sockets);
+    my $memory = get_current_memory($conf->{memory});
+
+    #virtiomem can map 32000 block size.
+    #try to use lowest blocksize, lower = more chance to unplug memory.
+    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
+    #2MB is the minimum to be aligned with THP
+    $blocksize = 2 if $blocksize < 2;
+    $blocksize = 2**(ceil(log($blocksize)/log(2)));
+    #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
+    $blocksize = 4 if $blocksize < 4;
+
+    return $blocksize;
+}
+
+my sub get_virtiomem_total_current_size {
+    my ($mems) = @_;
+    my $size = 0;
+    for my $mem (values %$mems) {
+	$size += $mem->{current};
+    }
+    return $size;
+}
+
+my sub balance_virtiomem {
+    my ($vmid, $virtiomems, $blocksize, $target_virtiomem_total) = @_;
+
+    my $nb_virtiomem = keys %$virtiomems;
+
+    print"try to balance memory on $nb_virtiomem virtiomems\n";
+    die "error. no more available blocks in virtiomem to balance the remaining memory" if $target_virtiomem_total < 0;
+
+    #if we can't share exactly the same amount, we add the remainder on last node
+    my $virtiomem_target_aligned = int( $target_virtiomem_total / $nb_virtiomem / $blocksize) * $blocksize;
+    my $virtiomem_target_remaining = $target_virtiomem_total - ($virtiomem_target_aligned * ($nb_virtiomem-1));
+
+    my $i = 0;
+    foreach my $id (sort keys %$virtiomems) {
+	my $virtiomem = $virtiomems->{$id};
+	$i++;
+	my $virtiomem_target = $i == $nb_virtiomem ? $virtiomem_target_remaining : $virtiomem_target_aligned;
+	$virtiomem->{completed} = 0;
+	$virtiomem->{retry} = 0;
+	$virtiomem->{target} = $virtiomem_target;
+
+	print "virtiomem$id: set-requested-size : $virtiomem_target\n";
+	mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/virtiomem$id", property => "requested-size", value => $virtiomem_target * 1024 * 1024);
+    }
+
+    while (1) {
+
+	sleep 1;
+	my $total_finished = 0;
+
+	foreach my $id (keys %$virtiomems) {
+
+	    my $virtiomem = $virtiomems->{$id};
+
+	    if ($virtiomem->{error} || $virtiomem->{completed}) {
+		$total_finished++;
+		next;
+	    }
+
+	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
+	    $virtiomem->{current} = $size / 1024 / 1024;
+	    print"virtiomem$id: virtiomem->last: $virtiomem->{last} virtiomem->current: $virtiomem->{current} virtio_mem_target:$virtiomem->{target}\n";
+
+	    if($virtiomem->{current} == $virtiomem->{target}) {
+		print"virtiomem$id: completed\n";
+		$virtiomem->{completed} = 1;
+		next;
+	    }
+
+	    if($virtiomem->{current} != $virtiomem->{last}) {
+		#if value has changed, but not yet completed
+		print "virtiomem$id: changed but don't not reach target yet\n";
+		$virtiomem->{retry} = 0;
+		$virtiomem->{last} = $virtiomem->{current};
+		next;
+	    }
+
+	    if($virtiomem->{retry} >= 5) {
+		print "virtiomem$id: too many retry. set error\n";
+		$virtiomem->{error} = 1;
+		#as change is async, we don't want that value change after the api call
+		eval {
+		    mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/virtiomem$id", property => "requested-size", value => $virtiomem->{current} * 1024 *1024);
+		};
+	    }
+	    print"virtiomem$id: increase retry: $virtiomem->{retry}\n";
+	    $virtiomem->{retry}++;
+	}
+
+	my $nb_virtiomem = keys %$virtiomems;
+	return if $total_finished == $nb_virtiomem;
+    }
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -237,7 +351,39 @@ sub qemu_memory_hotplug {
     my $MAX_MEM = get_max_mem($conf);
     die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
 
-    if ($value > $memory) {
+    my $confmem = parse_memory($conf->{memory});
+
+    if ($confmem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	my $virtiomems = {};
+
+	for (my $i = 0; $i < $sockets; $i++) {
+	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$i", property => "size");
+	    $size = $size / 1024 /1024;
+	    $virtiomems->{$i} = {
+		current => $size,
+		last => $size,
+		error => 0,
+		completed => 0,
+		retry => 0
+	    };
+	}
+
+	my $target_virtiomem_total = $value - $static_memory;
+	my $err;
+	eval {
+	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_virtiomem_total);
+	};
+	$err = $@ if $@;
+
+	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+	$newmem->{current} = $current_memory;
+	$conf->{memory} = print_memory($newmem);
+	PVE::QemuConfig->write_config($vmid, $conf);
+	die $err if $err;
+
+    } elsif ($value > $memory) {
 
 	my $numa_hostmap;
 
@@ -329,8 +475,8 @@ sub can_hotplug {
     my $oldmem = parse_memory($conf->{memory});
     my $newmem = parse_memory($value);
 
-    return if safe_num_ne($newmem->{max}, $oldmem->{max});
-
+    return if (safe_num_ne($newmem->{max}, $oldmem->{max}) || 
+	       safe_boolean_ne($newmem->{virtio}, $oldmem->{virtio}));
     return 1;
 }
 
@@ -352,7 +498,7 @@ sub qemu_dimm_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd, $mem_devices) = @_;
 
     my $memory = get_current_memory($conf->{memory});
 
@@ -371,7 +517,10 @@ sub config {
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	my $confmem = parse_memory($conf->{memory});
 	my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
-	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
+	my $cmdstr = "size=${static_memory}";
+	$cmdstr .= ",slots=$slots" if !$confmem->{'virtio'};
+	$cmdstr .= ",maxmem=${MAX_MEM}M";
+	push @$cmd, '-m', $cmdstr;
 
     } else {
 	push @$cmd, '-m', $static_memory;
@@ -441,17 +590,42 @@ sub config {
     }
 
     if ($hotplug) {
-	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
-	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+	my $confmem = parse_memory($conf->{memory});
+
+	if ($confmem->{'virtio'}) {
 
-	    push @$cmd, "-object" , $mem_object;
-	    push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+	    my $MAX_MEM = get_max_mem($conf);
+	    my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
+	    my $node_mem = ($memory - $static_memory) / $sockets;
+	    my $blocksize = get_virtiomem_block_size($conf);
 
-	    die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
-		if $current_size > $memory;
-	});
+	    die "memory need to be a multiple of $blocksize MiB with maxmemory $MAX_MEM MiB when virtiomem is enabled\n"
+		if $memory % $blocksize != 0;
+
+	    for (my $i = 0; $i < $sockets; $i++)  {
+
+		my $id = "virtiomem$i";
+		my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
+		push @$cmd, "-object" , "$mem_object,reserve=off";
+
+		my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i";
+		$mem_device .= ",prealloc=on" if $conf->{hugepages};
+		$mem_devices->{$id} = $mem_device;
+	    }
+	} else {
+	    foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
+		my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+
+		my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+
+		push @$cmd, "-object" , $mem_object;
+		push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+
+		die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
+		    if $current_size > $memory;
+	    });
+	}
     }
 }
 
@@ -462,8 +636,11 @@ sub print_mem_object {
 
 	my $hugepages_size = hugepages_size($conf, $size);
 	my $path = hugepages_mount_path($hugepages_size);
+	my $confmem = parse_memory($conf->{memory});
 
-	return "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on,prealloc=yes";
+	my $object = "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on";
+	$object .= ",prealloc=yes" if !$confmem->{virtio};
+	return $object;
     } else {
 	return "memory-backend-ram,id=$id,size=${size}M";
     }
diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index a18b974..0187c74 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -249,6 +249,14 @@ sub get_pci_addr_map {
 	'scsihw2' => { bus => 4, addr => 1 },
 	'scsihw3' => { bus => 4, addr => 2 },
 	'scsihw4' => { bus => 4, addr => 3 },
+	'virtiomem0' => { bus => 4, addr => 4 },
+	'virtiomem1' => { bus => 4, addr => 5 },
+	'virtiomem2' => { bus => 4, addr => 6 },
+	'virtiomem3' => { bus => 4, addr => 7 },
+	'virtiomem4' => { bus => 4, addr => 8 },
+	'virtiomem5' => { bus => 4, addr => 9 },
+	'virtiomem6' => { bus => 4, addr => 10 },
+	'virtiomem7' => { bus => 4, addr => 11 },
     } if !defined($pci_addr_map);
     return $pci_addr_map;
 }
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 12/13] memory: virtio-mem : implement redispatch retry.
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (10 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 13/13] tests: add virtio-mem tests Alexandre Derumier
  12 siblings, 0 replies; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

If some memory can be removed on a specific node,
we try to rebalance again on other nodes

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 54 ++++++++++++++++++++++++++--------------
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 9e81719..a16bdb2 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -197,23 +197,35 @@ my sub get_virtiomem_total_current_size {
     return $size;
 }
 
+my sub get_virtiomem_total_errors_size {
+    my ($mems) = @_;
+
+    my $size = 0;
+    for my $mem (values %$mems) {
+	next if !$mem->{error};
+	$size += $mem->{current};
+    }
+    return $size;
+}
+
 my sub balance_virtiomem {
     my ($vmid, $virtiomems, $blocksize, $target_virtiomem_total) = @_;
 
-    my $nb_virtiomem = keys %$virtiomems;
+    my $virtiomem_total_noerror = scalar(grep { !$_->{error} } values $virtiomems->%*);
 
-    print"try to balance memory on $nb_virtiomem virtiomems\n";
+    print"try to balance memory on $virtiomem_total_noerror remaining virtiomems\n";
     die "error. no more available blocks in virtiomem to balance the remaining memory" if $target_virtiomem_total < 0;
+    die "error. No more available virtiomem to balance the remaining memory\n" if $virtiomem_total_noerror == 0;
 
-    #if we can't share exactly the same amount, we add the remainder on last node
-    my $virtiomem_target_aligned = int( $target_virtiomem_total / $nb_virtiomem / $blocksize) * $blocksize;
-    my $virtiomem_target_remaining = $target_virtiomem_total - ($virtiomem_target_aligned * ($nb_virtiomem-1));
+    my $virtiomem_target_aligned = int( $target_virtiomem_total / $virtiomem_total_noerror / $blocksize) * $blocksize;
+    my $virtiomem_target_remaining = $target_virtiomem_total - ($virtiomem_target_aligned * ($virtiomem_total_noerror-1));
 
     my $i = 0;
     foreach my $id (sort keys %$virtiomems) {
 	my $virtiomem = $virtiomems->{$id};
+	next if $virtiomem->{error};
 	$i++;
-	my $virtiomem_target = $i == $nb_virtiomem ? $virtiomem_target_remaining : $virtiomem_target_aligned;
+	my $virtiomem_target = $i == $virtiomem_total_noerror ? $virtiomem_target_remaining : $virtiomem_target_aligned;
 	$virtiomem->{completed} = 0;
 	$virtiomem->{retry} = 0;
 	$virtiomem->{target} = $virtiomem_target;
@@ -370,18 +382,24 @@ sub qemu_memory_hotplug {
 	    };
 	}
 
-	my $target_virtiomem_total = $value - $static_memory;
-	my $err;
-	eval {
-	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_virtiomem_total);
-	};
-	$err = $@ if $@;
-
-	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
-	$newmem->{current} = $current_memory;
-	$conf->{memory} = print_memory($newmem);
-	PVE::QemuConfig->write_config($vmid, $conf);
-	die $err if $err;
+	while (1) {
+
+	    my $target_virtiomem_total = $value - $static_memory - get_virtiomem_total_errors_size($virtiomems);
+	    my $err;
+	    eval {
+		balance_virtiomem($vmid, $virtiomems, $blocksize, $target_virtiomem_total);
+	    };
+	    $err = $@ if $@;
+
+	    my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+	    $newmem->{current} = $current_memory;
+	    $conf->{memory} = print_memory($newmem);
+	    PVE::QemuConfig->write_config($vmid, $conf);
+
+	    die $err if $err;
+	    last if $current_memory == $value;
+	}
+	return $conf->{memory};
 
     } elsif ($value > $memory) {
 
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] [PATCH v3 qemu-server 13/13] tests: add virtio-mem tests
  2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (11 preceding siblings ...)
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 12/13] memory: virtio-mem : implement redispatch retry Alexandre Derumier
@ 2023-02-02 11:03 ` Alexandre Derumier
  12 siblings, 0 replies; 24+ messages in thread
From: Alexandre Derumier @ 2023-02-02 11:03 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  | 12 +++++++
 .../memory-virtio-hugepages-1G.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio-max.conf           | 11 ++++++
 test/cfg2cmd/memory-virtio-max.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio.conf               | 11 ++++++
 test/cfg2cmd/memory-virtio.conf.cmd           | 35 +++++++++++++++++++
 6 files changed, 139 insertions(+)
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
new file mode 100644
index 0000000..bbe908c
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
@@ -0,0 +1,12 @@
+# TEST: virtio-mem without memorymax defined and hugepages
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
new file mode 100644
index 0000000..2e7de74
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-file,id=ram-node0,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-file,id=mem-virtiomem0,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -object 'memory-backend-file,id=mem-virtiomem1,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,prealloc=on,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,prealloc=on,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf b/test/cfg2cmd/memory-virtio-max.conf
new file mode 100644
index 0000000..59b8e94
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf
@@ -0,0 +1,11 @@
+# TEST: memory max 64G with 2 socket with virtio-mem
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf.cmd b/test/cfg2cmd/memory-virtio-max.conf.cmd
new file mode 100644
index 0000000..6bda268
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=30720M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=30720M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf b/test/cfg2cmd/memory-virtio.conf
new file mode 100644
index 0000000..60c0264
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf
@@ -0,0 +1,11 @@
+# TEST: virtio-mem with memorymax defined
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf.cmd b/test/cfg2cmd/memory-virtio.conf.cmd
new file mode 100644
index 0000000..f392636
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=260096M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=260096M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc+pve0'
\ No newline at end of file
-- 
2.30.2




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking Alexandre Derumier
@ 2023-02-03 13:44   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:44 UTC (permalink / raw)
  To: pve-devel, aderumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer/Memory.pm | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 3eb1c7c..5a7c1b7 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -348,7 +348,7 @@ sub config {
>  	    my $numa_memory = ($static_memory / $sockets);
>  
>  	    for (my $i = 0; $i < $sockets; $i++)  {
> -		die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/" && $conf->{hugepages};
> +		die "host NUMA node$i doesn't exist\n" if !host_numanode_exist($i) && $conf->{hugepages};
>  
>  		my $mem_object = print_mem_object($conf, "ram-node$i", $numa_memory);
>  		push @$cmd, '-object', $mem_object;
> @@ -391,6 +391,12 @@ sub print_mem_object {
>  
>  }
>  
> +sub host_numanode_exist {

Nit: host_numanode_exists sounds more grammatically correct and the
function could be moved above its first usage

> +    my ($id) = @_;
> +
> +    return -d "/sys/devices/system/node/node$id/";
> +}
> +
>  sub print_numa_hostnodes {
>      my ($hostnodelists) = @_;
>  
> @@ -402,7 +408,7 @@ sub print_numa_hostnodes {
>  	$hostnodes .= "-$end" if defined($end);
>  	$end //= $start;
>  	for (my $i = $start; $i <= $end; ++$i ) {
> -	    die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/";
> +	    die "host NUMA node$i doesn't exist\n" if !host_numanode_exist($i);
>  	}
>      }
>      return $hostnodes;
> @@ -445,21 +451,26 @@ sub hugepages_nr {
>    return $size / $hugepages_size;
>  }
>  
> +my sub page_chunk {

This needs a more descriptive name. How about
hugepages_chunk_size_supported?

> +    my ($size) = @_;
> +
> +    return -d "/sys/kernel/mm/hugepages/hugepages-". ($size * 1024) ."kB";
> +}
> +
>  sub hugepages_size {
>      my ($conf, $size) = @_;
>      die "hugepages option is not enabled" if !$conf->{hugepages};
>      die "memory size '$size' is not a positive even integer; cannot use for hugepages\n"
>  	if $size <= 0 || $size & 1;
>  
> -    my $page_chunk = sub { -d  "/sys/kernel/mm/hugepages/hugepages-". ($_[0] * 1024) ."kB" };
> -    die "your system doesn't support hugepages\n" if !$page_chunk->(2) && !$page_chunk->(1024);
> +    die "your system doesn't support hugepages\n" if !page_chunk(2) && !page_chunk(1024);
>  
>      if ($conf->{hugepages} eq 'any') {
>  
>  	# try to use 1GB if available && memory size is matching
> -	if ($page_chunk->(1024) && ($size & 1023) == 0) {
> +	if (page_chunk(1024) && ($size & 1023) == 0) {
>  	    return 1024;
> -	} elsif ($page_chunk->(2)) {
> +	} elsif (page_chunk(2)) {
>  	    return 2;
>  	} else {
>  	    die "host only supports 1024 GB hugepages, but requested size '$size' is not a multiple of 1024 MB\n"
> @@ -468,7 +479,7 @@ sub hugepages_size {
>  
>  	my $hugepagesize = $conf->{hugepages};
>  
> -	if (!$page_chunk->($hugepagesize)) {
> +	if (!page_chunk($hugepagesize)) {
>  	    die "your system doesn't support hugepages of $hugepagesize MB\n";
>  	} elsif (($size % $hugepagesize) != 0) {
>  	    die "Memory size $size is not a multiple of the requested hugepages size $hugepagesize\n";




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 02/13] tests: add memory tests
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 02/13] tests: add memory tests Alexandre Derumier
@ 2023-02-03 13:44   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:44 UTC (permalink / raw)
  To: pve-devel, aderumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl
> index f097811..dafc3b8 100755
> --- a/test/run_config2command_tests.pl
> +++ b/test/run_config2command_tests.pl
> @@ -178,6 +178,21 @@ $qemu_server_config->mock(
>      },
>  );
>  
> +my $qemu_server_memory;
> +$qemu_server_memory = Test::MockModule->new('PVE::QemuServer::Memory');
> +$qemu_server_memory->mock(
> +    page_chunk => sub {

Turns out you mocking things declared as 'my sub' doesn't work :( So you
need to change the function in Memory.pm into a 'sub'. Sorry I missed
that until now.

> +	return 1;
> +    },
> +    host_numanode_exist => sub {
> +	my ($id) = @_;
> +	return 1;
> +    },
> +    get_host_phys_address_bits => sub {

Same here.

> +	return 46;
> +    }
> +);
> +
>  my $pve_common_tools;
>  $pve_common_tools = Test::MockModule->new('PVE::Tools');
>  $pve_common_tools->mock(




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 04/13] add memory parser
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 04/13] add memory parser Alexandre Derumier
@ 2023-02-03 13:44   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:44 UTC (permalink / raw)
  To: pve-devel, aderumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/API2/Qemu.pm          | 14 +++++--
>  PVE/QemuConfig.pm         |  4 +-
>  PVE/QemuMigrate.pm        |  6 ++-
>  PVE/QemuServer.pm         | 27 +++++++-------
>  PVE/QemuServer/Helpers.pm |  3 +-
>  PVE/QemuServer/Memory.pm  | 78 ++++++++++++++++++++++++++++++---------
>  6 files changed, 92 insertions(+), 40 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 587bb22..6c08280 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
>  use PVE::QemuServer::ImportDisk;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer::Machine;
> +use PVE::QemuServer::Memory qw(get_current_memory);
>  use PVE::QemuMigrate;
>  use PVE::RPCEnvironment;
>  use PVE::AccessControl;
> @@ -1489,8 +1490,6 @@ my $update_vm_api  = sub {
>  
>      my $storecfg = PVE::Storage::config();
>  
> -    my $defaults = PVE::QemuServer::load_defaults();
> -
>      &$resolve_cdrom_alias($param);
>  
>      # now try to verify all parameters
> @@ -1608,7 +1607,16 @@ my $update_vm_api  = sub {
>  	}
>  
>  	if ($param->{memory} || defined($param->{balloon})) {
> -	    my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
> +
> +	    my $maxmem = undef;
> +	    if ($param->{memory}) {
> +	        $maxmem = get_current_memory($param->{memory});
> +	    } elsif ($conf->{pending}->{memory}) {
> +	        $maxmem = get_current_memory($conf->{pending}->{memory});
> +	    } else {
> +	        $maxmem = get_current_memory($conf->{memory});
> +	    }

Style nit: could be a one-liner
my $maxmem = get_current_memory($param->{memory} ||
$conf->{pending}->{memory} || $conf->{memory});

> +
>  	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
>  
>  	    die "balloon value too large (must be smaller than assigned memory)\n"

> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 6bb931e..0704944 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -8,10 +8,42 @@ use PVE::Exception qw(raise raise_param_exc);
>  
>  use PVE::QemuServer;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
> +use base qw(Exporter);
> +
> +our @EXPORT_OK = qw(
> +get_current_memory
> +);
>  
>  my $MAX_NUMA = 8;
>  my $STATICMEM = 1024;
>  
> +our $memory_fmt = {
> +    current => {
> +	description => "Current amount of online RAM for the VM in MiB. This is the maximum available memory when"
> +	    ." you use the balloon device.",
> +	type => 'integer',
> +	default_key => 1,
> +	minimum => 16,
> +	default => 512,
> +    }

Style nit: missing comma

> +};
> +
> +sub print_memory {
> +    my $memory = shift;
> +
> +    return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
> +}
> +
> +sub parse_memory {
> +    my ($value) = @_;
> +
> +    return { current => $memory_fmt->{current}->{default} } if !defined($value);
> +
> +    my $res = PVE::JSONSchema::parse_property_string($memory_fmt, $value)
> +	or die "Cannot parse memory properties: $value\n";

Nit: parse_property_string will only return something falsy if the
validator attached to the format does (which also shouldn't), so I don't
think the "or die" is required here.

> +    return $res;
> +}
> +
>  my $_host_bits;
>  my sub get_host_phys_address_bits {
>      return $_host_bits if defined($_host_bits);
> @@ -63,6 +95,13 @@ my sub get_max_mem {
>      return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
>  }
>  
> +sub get_current_memory {
> +    my ($value) = @_;
> +
> +    my $memory = parse_memory($value);
> +    return $memory->{current};
> +}
> +
>  sub get_numa_node_list {
>      my ($conf) = @_;
>      my @numa_map;
> @@ -155,16 +194,19 @@ sub foreach_reverse_dimm {
>  }
>  
>  sub qemu_memory_hotplug {
> -    my ($vmid, $conf, $defaults, $value) = @_;
> +    my ($vmid, $conf, $value) = @_;
>  
>      return $value if !PVE::QemuServer::check_running($vmid);
>  
> -    my $sockets = 1;
> -    $sockets = $conf->{sockets} if $conf->{sockets};
> +    my $oldmem = parse_memory($conf->{memory});
> +    my $newmem = parse_memory($value);
> +
> +    return $value if $newmem->{current} == $oldmem->{current};
>  
> -    my $memory = $conf->{memory} || $defaults->{memory};
> -    $value = $defaults->{memory} if !$value;
> -    return $value if $value == $memory;
> +    my $memory = $oldmem->{current};
> +    $value = $newmem->{current};
> +
> +    my $sockets = $conf->{sockets} || 1;

The refactoring of $sockets and below should be its own patch. While it
is small, it has nothing to do with the rest of the patch and the patch
already has quite a bit going on.

>  
>      my $static_memory = $STATICMEM;
>      $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
> @@ -496,12 +543,9 @@ sub hugepages_topology {
>  
>      return if !$conf->{numa};
>  
> -    my $defaults = PVE::QemuServer::load_defaults();
> -    my $memory = $conf->{memory} || $defaults->{memory};
> +    my $memory = get_current_memory($conf->{memory});
> +    my $sockets = $conf->{sockets} || 1;
>      my $static_memory = 0;
> -    my $sockets = 1;
> -    $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer iused
> -    $sockets = $conf->{sockets} if $conf->{sockets};

below here

>      my $numa_custom_topology = undef;
>      my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
>  




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 05/13] memory: add get_static_mem && remove parse_hotplug_features
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 05/13] memory: add get_static_mem && remove parse_hotplug_features Alexandre Derumier
@ 2023-02-03 13:44   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:44 UTC (permalink / raw)
  To: pve-devel, aderumier

As the commit title suggests, this patch is doing two things and could
be split up.

Nit: "remove calls to parse_hotplug_features" is more accurate

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> @@ -44,6 +43,23 @@ sub parse_memory {
>      return $res;
>  }
>  
> +my sub get_static_mem {
> +    my ($conf, $sockets, $hotplug) = @_;
> +
> +    my $static_memory = 0;
> +    my $memory = parse_memory($conf->{memory});
> +
> +    if ($hotplug) {
> +	#legacy

Nit: it's not legacy yet, but only after patch 09/13 ;)

> +	$static_memory = 1024;
> +	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);

Style nit: useless parentheses.

> +    } else {
> +	$static_memory = $memory->{current};
> +    }
> +
> +    return $static_memory;
> +}
> +
>  my $_host_bits;
>  my sub get_host_phys_address_bits {
>      return $_host_bits if defined($_host_bits);




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 06/13] config: memory: add 'max' option
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 06/13] config: memory: add 'max' option Alexandre Derumier
@ 2023-02-03 13:44   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:44 UTC (permalink / raw)
  To: pve-devel, aderumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 9db1d8e..1f73605 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -5042,7 +5042,7 @@ sub vmconfig_hotplug_pending {
>  		vm_deviceunplug($vmid, $conf, $opt);
>  		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
>  	    } elsif ($opt =~ m/^memory$/) {
> -		die "skip\n" if !$hotplug_features->{memory};
> +		die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf);
>  		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
>  	    } elsif ($opt eq 'cpuunits') {
>  		$cgroup->change_cpu_shares(undef);

(...)

> @@ -24,9 +26,23 @@ our $memory_fmt = {
>  	default_key => 1,
>  	minimum => 16,
>  	default => 512,
> +    },
> +    max => {
> +	type => 'integer',
> +	optional => 1,
> +	minimum => 65536,
> +	maximum => 4194304,
> +	format => 'pve-qm-memory-max',
>      }
>  };
>  
> +PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
> +sub verify_qm_memory_max {
> +    my ($max, $noerr) = @_;
> +

If $noerr is set and the check fails, you need to return undef and not die.

> +    die "max memory need to be a multiple of 64GiB\n" if $max && $max % 65536 != 0;

return $max;

> +}
> +
>  sub print_memory {
>      my $memory = shift;
>  
> @@ -311,6 +327,19 @@ sub qemu_memory_hotplug {
>      return $conf->{memory};
>  }
>  
> +sub can_hotplug {
> +    my ($hotplug, $conf, $value) = @_;
> +
> +    return if !$hotplug || !$value;

We can hotplug if there is no value. Calling qemu_memory_hotplug()
without value is legal. That's the call in vmconfig_hotplug_pending()
when the 'memory' is deleted from the config. While it currently errors
out later (because the default of 512 MiB is less than the static memory
of 1024 MiB), that can change if the default value changes. So we should
still return 1 in this case.

> +
> +    my $oldmem = parse_memory($conf->{memory});
> +    my $newmem = parse_memory($value);
> +
> +    return if safe_num_ne($newmem->{max}, $oldmem->{max});
> +
> +    return 1;
> +}
> +
>  sub qemu_dimm_list {
>      my ($vmid) = @_;
>  




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 08/13] memory: don't use foreach_reversedimm for unplug
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 08/13] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
@ 2023-02-03 13:45   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:45 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

Thanks for looking into this! The new way is more straightforward for
sure :)

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> @@ -316,30 +284,33 @@ sub qemu_memory_hotplug {
>  
>      } else {
>  
> -	foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
> -	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
> +	my $dimms = qemu_dimm_list($vmid);

Looking at qemu_dimm_list(), I feel like we should start filtering to
only return dimms (based on the id or is there something better?), since
we now also iterate over them and not only check for existence.
Currently, qemu_dimm_list() returns all memory objects which would
include the virtiomem devices. The function is not called if virtiomem
is used, but this would make it future-proof (and avoid breakage if
people have manually attached non-dimm memory devices, although I'd say
that's not supported from our perspective ;)).

> +	my $current_size = $memory;
> +	for my $name (sort { $dimms->{$b}{slot} <=> $dimms->{$a}{slot} } keys %$dimms) {

Style nit: please use $dimms->{$b}->{slot}

>  
> -		return if $current_size >= get_current_memory($conf->{memory});
> +	    my $dimm_size = $dimms->{$name}->{size} / 1024 / 1024;
>  
> -		print "try to unplug memory dimm $name\n";
> +	    last if $current_size <= $value || $current_size <= $static_memory;

Nit: the second half of the condition is not really needed. We already
assert at the start of qemu_memory_hotplug() that $value >=
$static_memory and if we didn't somehow end up with values not matching
reality in the config, we should reach $static_memory only after
unplugging everything, or not?

>  
> -		my $retry = 0;
> -		while (1) {
> -		    eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
> -		    sleep 3;
> -		    my $dimm_list = qemu_dimm_list($vmid);
> -		    last if !$dimm_list->{$name};
> -		    raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
> -		    $retry++;
> -		}
> +	    print "try to unplug memory dimm $name\n";
>  
> -		#update conf after each succesful module unplug
> -		$newmem->{current} = $current_size;
> -		$conf->{memory} = print_memory($newmem);
> +	    my $retry = 0;
> +	    while (1) {
> +		eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
> +		sleep 3;
> +		my $dimm_list = qemu_dimm_list($vmid);
> +		last if !$dimm_list->{$name};
> +		raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
> +		$retry++;
> +	    }
> +	    $current_size -= $dimm_size;
> +	    #update conf after each succesful module unplug
> +	    $newmem->{current} = $current_size;
> +	    $conf->{memory} = print_memory($newmem);
>  
> -		eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
> -		PVE::QemuConfig->write_config($vmid, $conf);
> -	});
> +	    eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
> +	    PVE::QemuConfig->write_config($vmid, $conf);
> +	}
>      }
>      return $conf->{memory};
>  }




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 09/13] memory: use 64 slots && static dimm size when max is defined
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 09/13] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
@ 2023-02-03 13:45   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:45 UTC (permalink / raw)
  To: pve-devel, aderumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> @@ -179,17 +185,17 @@ sub get_numa_guest_to_host_map {
>  }
>  
>  sub foreach_dimm{
> -    my ($conf, $vmid, $memory, $sockets, $func) = @_;
> +    my ($conf, $vmid, $memory, $current_size, $func) = @_;

The parameter should not be called $current_size if you pass in
$static_memory (they're only the same value if nothing is yet
hotplugged). It's also not nice to use parameters as local variables.
Renaming the parameter to $static_memory and doing
my $current_size = $static_memory;
is cleaner.

I know I'm saying this often, but changing the argument here should be
its own patch. It has nothing to do with 'max'.

>  
>      my $dimm_id = 0;
> -    my $current_size = 0;
>      my $dimm_size = 0;
>  
> -    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
> -	$current_size = 1024 * $sockets;
> +    my $confmem = parse_memory($conf->{memory});
> +    if ($confmem->{max}) {
> +	$dimm_size = $confmem->{max} / $MAX_SLOTS;
> +    } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
>  	$dimm_size = 1024;
>      } else {
> -	$current_size = 1024;
>  	$dimm_size = 512;
>      }
>  




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support Alexandre Derumier
@ 2023-02-03 13:46   ` Fiona Ebner
  2023-02-03 15:48     ` DERUMIER, Alexandre
  0 siblings, 1 reply; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:46 UTC (permalink / raw)
  To: pve-devel, aderumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> @@ -157,6 +168,109 @@ sub get_current_memory {
>      return $memory->{current};
>  }
>  
> +sub get_virtiomem_block_size {
> +    my ($conf) = @_;
> +
> +    my $sockets = $conf->{sockets} || 1;
> +    my $MAX_MEM = get_max_mem($conf);
> +    my $static_memory = get_static_mem($conf, $sockets);
> +    my $memory = get_current_memory($conf->{memory});
> +
> +    #virtiomem can map 32000 block size.
> +    #try to use lowest blocksize, lower = more chance to unplug memory.
> +    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
> +    #2MB is the minimum to be aligned with THP
> +    $blocksize = 2 if $blocksize < 2;
> +    $blocksize = 2**(ceil(log($blocksize)/log(2)));
> +    #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
> +    $blocksize = 4 if $blocksize < 4;
> +
> +    return $blocksize;
> +}
> +
> +my sub get_virtiomem_total_current_size {
> +    my ($mems) = @_;
> +    my $size = 0;
> +    for my $mem (values %$mems) {
> +	$size += $mem->{current};
> +    }
> +    return $size;
> +}
> +
> +my sub balance_virtiomem {

The function should die if not all memory can be plugged.

> +    my ($vmid, $virtiomems, $blocksize, $target_virtiomem_total) = @_;
> +
> +    my $nb_virtiomem = keys %$virtiomems;

Style nit: explicit scalar() would be nice.

> +
> +    print"try to balance memory on $nb_virtiomem virtiomems\n";
> +    die "error. no more available blocks in virtiomem to balance the remaining memory" if $target_virtiomem_total < 0;
> +
> +    #if we can't share exactly the same amount, we add the remainder on last node
> +    my $virtiomem_target_aligned = int( $target_virtiomem_total / $nb_virtiomem / $blocksize) * $blocksize;
> +    my $virtiomem_target_remaining = $target_virtiomem_total - ($virtiomem_target_aligned * ($nb_virtiomem-1));
> +
> +    my $i = 0;
> +    foreach my $id (sort keys %$virtiomems) {
> +	my $virtiomem = $virtiomems->{$id};
> +	$i++;
> +	my $virtiomem_target = $i == $nb_virtiomem ? $virtiomem_target_remaining : $virtiomem_target_aligned;
> +	$virtiomem->{completed} = 0;
> +	$virtiomem->{retry} = 0;
> +	$virtiomem->{target} = $virtiomem_target;
> +
> +	print "virtiomem$id: set-requested-size : $virtiomem_target\n";
> +	mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/virtiomem$id", property => "requested-size", value => $virtiomem_target * 1024 * 1024);

Style nit: some lines are over 100 chars and some quite a bit

> +    }
> +
> +    while (1) {
> +
> +	sleep 1;
> +	my $total_finished = 0;
> +
> +	foreach my $id (keys %$virtiomems) {
> +
> +	    my $virtiomem = $virtiomems->{$id};
> +
> +	    if ($virtiomem->{error} || $virtiomem->{completed}) {
> +		$total_finished++;
> +		next;
> +	    }
> +
> +	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
> +	    $virtiomem->{current} = $size / 1024 / 1024;
> +	    print"virtiomem$id: virtiomem->last: $virtiomem->{last} virtiomem->current: $virtiomem->{current} virtio_mem_target:$virtiomem->{target}\n";
> +
> +	    if($virtiomem->{current} == $virtiomem->{target}) {
> +		print"virtiomem$id: completed\n";
> +		$virtiomem->{completed} = 1;
> +		next;
> +	    }
> +
> +	    if($virtiomem->{current} != $virtiomem->{last}) {
> +		#if value has changed, but not yet completed
> +		print "virtiomem$id: changed but don't not reach target yet\n";
> +		$virtiomem->{retry} = 0;
> +		$virtiomem->{last} = $virtiomem->{current};
> +		next;
> +	    }
> +
> +	    if($virtiomem->{retry} >= 5) {
> +		print "virtiomem$id: too many retry. set error\n";
> +		$virtiomem->{error} = 1;
> +		#as change is async, we don't want that value change after the api call
> +		eval {
> +		    mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/virtiomem$id", property => "requested-size", value => $virtiomem->{current} * 1024 *1024);
> +		};
> +	    }
> +	    print"virtiomem$id: increase retry: $virtiomem->{retry}\n";
> +	    $virtiomem->{retry}++;
> +	}
> +
> +	my $nb_virtiomem = keys %$virtiomems;

Redeclares variable. The number can't change from before, or am I
missing something?

> +	return if $total_finished == $nb_virtiomem;

Style nit: could also use
while ($total_finished != $nb_virtiomem)
at the beginning of the loop.

> +    }
> +}
> +
>  sub get_numa_node_list {
>      my ($conf) = @_;
>      my @numa_map;
> @@ -237,7 +351,39 @@ sub qemu_memory_hotplug {
>      my $MAX_MEM = get_max_mem($conf);
>      die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
>  
> -    if ($value > $memory) {
> +    my $confmem = parse_memory($conf->{memory});

You can just re-use the existing $oldmem?

> +
> +    if ($confmem->{virtio}) {
> +	my $blocksize = get_virtiomem_block_size($conf);
> +
> +	my $virtiomems = {};
> +
> +	for (my $i = 0; $i < $sockets; $i++) {
> +	    my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$i", property => "size");
> +	    $size = $size / 1024 /1024;
> +	    $virtiomems->{$i} = {
> +		current => $size,
> +		last => $size,
> +		error => 0,
> +		completed => 0,
> +		retry => 0
> +	    };
> +	}
> +
> +	my $target_virtiomem_total = $value - $static_memory;
> +	my $err;
> +	eval {
> +	    balance_virtiomem($vmid, $virtiomems, $blocksize, $target_virtiomem_total);
> +	};
> +	$err = $@ if $@;
> +
> +	my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
> +	$newmem->{current} = $current_memory;
> +	$conf->{memory} = print_memory($newmem);
> +	PVE::QemuConfig->write_config($vmid, $conf);
> +	die $err if $err;
> +
> +    } elsif ($value > $memory) {
>  
>  	my $numa_hostmap;
>  
> @@ -441,17 +590,42 @@ sub config {
>      }
>  
>      if ($hotplug) {
> -	foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
> -	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
>  
> -	    my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
> +	my $confmem = parse_memory($conf->{memory});
> +
> +	if ($confmem->{'virtio'}) {
>  
> -	    push @$cmd, "-object" , $mem_object;
> -	    push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
> +	    my $MAX_MEM = get_max_mem($conf);
> +	    my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
> +	    my $node_mem = ($memory - $static_memory) / $sockets;

Nit: If the number of $sockets is not a power of 2, I think this breaks.
But I guess we already don't support it. Running current version without
your patches (for a VM with memory hotplug):
root@pve701 ~ # qm set 131 --sockets 3
update VM 131: -sockets 3
root@pve701 ~ # qm set 131 -memory 8192
update VM 131: -memory 8192
root@pve701 ~ # qm start 131
kvm: total memory for NUMA nodes (0x3fffffff) should equal RAM size
(0x40000000)
start failed: QEMU exited with code 1

I guess we can just fix it up together with the existing rounding issue
if/when somebody complains about it ;)

> +	    my $blocksize = get_virtiomem_block_size($conf);
>  
> -	    die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
> -		if $current_size > $memory;
> -	});
> +	    die "memory need to be a multiple of $blocksize MiB with maxmemory $MAX_MEM MiB when virtiomem is enabled\n"
> +		if $memory % $blocksize != 0;
> +
> +	    for (my $i = 0; $i < $sockets; $i++)  {
> +
> +		my $id = "virtiomem$i";
> +		my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
> +		push @$cmd, "-object" , "$mem_object,reserve=off";
> +
> +		my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i";
> +		$mem_device .= ",prealloc=on" if $conf->{hugepages};
> +		$mem_devices->{$id} = $mem_device;
> +	    }
> +	} else {
> +	    foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
> +		my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
> +
> +		my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
> +
> +		push @$cmd, "-object" , $mem_object;
> +		push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
> +
> +		die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
> +		    if $current_size > $memory;
> +	    });
> +	}
>      }
>  }
>  




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [pve-devel] applied: [PATCH v3 qemu-server 03/13] qemu_memory_hotplug: remove unused $opt arg
  2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 03/13] qemu_memory_hotplug: remove unused $opt arg Alexandre Derumier
@ 2023-02-03 13:56   ` Fiona Ebner
  0 siblings, 0 replies; 24+ messages in thread
From: Fiona Ebner @ 2023-02-03 13:56 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

Am 02.02.23 um 12:03 schrieb Alexandre Derumier:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer.pm        | 4 ++--
>  PVE/QemuServer/Memory.pm | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index e4d1a70..a0e16dc 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -5046,7 +5046,7 @@ sub vmconfig_hotplug_pending {
>  		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
>  	    } elsif ($opt =~ m/^memory$/) {
>  		die "skip\n" if !$hotplug_features->{memory};
> -		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt);
> +		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults);
>  	    } elsif ($opt eq 'cpuunits') {
>  		$cgroup->change_cpu_shares(undef);
>  	    } elsif ($opt eq 'cpulimit') {
> @@ -5121,7 +5121,7 @@ sub vmconfig_hotplug_pending {
>  				     $vmid, $opt, $value, $arch, $machine_type);
>  	    } elsif ($opt =~ m/^memory$/) { #dimms
>  		die "skip\n" if !$hotplug_features->{memory};
> -		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt, $value);
> +		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $value);
>  	    } elsif ($opt eq 'cpuunits') {
>  		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
>  		$cgroup->change_cpu_shares($new_cpuunits);
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 5a7c1b7..6bb931e 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -155,7 +155,7 @@ sub foreach_reverse_dimm {
>  }
>  
>  sub qemu_memory_hotplug {
> -    my ($vmid, $conf, $defaults, $opt, $value) = @_;
> +    my ($vmid, $conf, $defaults, $value) = @_;
>  
>      return $value if !PVE::QemuServer::check_running($vmid);
>  

applied this one, thanks!




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support
  2023-02-03 13:46   ` Fiona Ebner
@ 2023-02-03 15:48     ` DERUMIER, Alexandre
  0 siblings, 0 replies; 24+ messages in thread
From: DERUMIER, Alexandre @ 2023-02-03 15:48 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

Le vendredi 03 février 2023 à 14:46 +0100, Fiona Ebner a écrit :
> Nit: If the number of $sockets is not a power of 2, I think this
> breaks.
> But I guess we already don't support it. Running current version
> without
> your patches (for a VM with memory hotplug):
> root@pve701 ~ # qm set 131 --sockets 3
> update VM 131: -sockets 3
> root@pve701 ~ # qm set 131 -memory 8192
> update VM 131: -memory 8192
> root@pve701 ~ # qm start 131
> kvm: total memory for NUMA nodes (0x3fffffff) should equal RAM size
> (0x40000000)
> start failed: QEMU exited with code 1
> 
> I guess we can just fix it up together with the existing rounding
> issue
> if/when somebody complains about it ;)

Yes, indeed, we never have supported it with numa, as we try to match
real hardware numa. (and real hardware are always power of 2).


with auto-numabalancing in kernel, the host kernel should try to
balance guest numa node to physical numanode.

Personnaly, I'll prefer to add a check somewhere and forbid it.

(I really don't known the behaviour of guest kernel scheduler with non-
power of 2 numa)




I'll rework the patches series for a v4 for end the next week.


(BTW, I'm going to fosdem tomorrow, so if some proxmox devs are there,
I'll happy to drink a beer with you guys ;)


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2023-02-03 15:49 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-02 11:03 [pve-devel] [PATCH v3 qemu-server 00/13] rework memory hotplug + virtiomem Alexandre Derumier
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 01/13] memory: extract some code to their own sub for mocking Alexandre Derumier
2023-02-03 13:44   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 02/13] tests: add memory tests Alexandre Derumier
2023-02-03 13:44   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 03/13] qemu_memory_hotplug: remove unused $opt arg Alexandre Derumier
2023-02-03 13:56   ` [pve-devel] applied: " Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 04/13] add memory parser Alexandre Derumier
2023-02-03 13:44   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 05/13] memory: add get_static_mem && remove parse_hotplug_features Alexandre Derumier
2023-02-03 13:44   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 06/13] config: memory: add 'max' option Alexandre Derumier
2023-02-03 13:44   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 07/13] memory: get_max_mem: use config memory max Alexandre Derumier
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 08/13] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
2023-02-03 13:45   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 09/13] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
2023-02-03 13:45   ` Fiona Ebner
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 10/13] test: add memory-max tests Alexandre Derumier
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 11/13] memory: add virtio-mem support Alexandre Derumier
2023-02-03 13:46   ` Fiona Ebner
2023-02-03 15:48     ` DERUMIER, Alexandre
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 12/13] memory: virtio-mem : implement redispatch retry Alexandre Derumier
2023-02-02 11:03 ` [pve-devel] [PATCH v3 qemu-server 13/13] tests: add virtio-mem tests Alexandre Derumier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal