* [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem
@ 2023-02-13 12:00 Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 01/16] memory: extract some code to their own sub for mocking Alexandre Derumier
` (17 more replies)
0 siblings, 18 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
This patch series reworks the current memory hotplug and adds virtio-mem support.
The memory option now takes extra sub-options:
memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
Example: memory: current=1024,max=131072,virtio=1
For classic memory hotplug, when the max option is defined,
we use 64 fixed-size DIMMs.
The max option must be a multiple of 64 GiB.
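A rough sketch of the resulting DIMM size (illustrative only; the function name and exact rules are my assumptions, not the actual qemu-server code):

```python
def static_dimm_size_mb(max_mem_mb):
    """With 'max' set, hotpluggable memory is spread over 64 fixed-size
    DIMMs, so each DIMM is max/64 (sketch, not the real implementation)."""
    assert max_mem_mb % (64 * 1024) == 0, "max must be a multiple of 64 GiB"
    return max_mem_mb // 64

print(static_dimm_size_mb(131072))  # 128 GiB max -> 2048 MiB per DIMM
```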
The virtio option enables the new virtio-mem support:
instead of plugging DIMMs, blocks are added/removed inside
a big virtio-mem device.
virtio-mem can use up to 32000 blocks; the block size is
computed from the max memory.
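The block-size computation can be sketched like this (hedged: this only illustrates the "at most 32000 blocks, 4 MiB Linux guest minimum" constraint described in this cover letter, not the exact code in the patches):

```python
def virtiomem_block_size_mb(max_mem_mb):
    """Smallest power-of-two block size, starting at the 4 MiB Linux
    guest minimum, such that max memory fits in at most 32000 blocks."""
    block = 4  # MiB
    while max_mem_mb / block > 32000:
        block *= 2
    return block

print(virtiomem_block_size_mb(131072))  # 128 GiB -> 8 MiB blocks
print(virtiomem_block_size_mb(524288))  # 512 GiB -> 32 MiB blocks
```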
Changelog v2:
Updated various patches based on Fiona's comments.
(I have sent 2 other mails with comments not yet addressed.)
The biggest change is in virtio-mem: instead of trying to keep the same
amount of memory on each virtio-mem device (some blocks could be
unmovable and break unplug), we now try to balance/dispatch the
remaining blocks over the other available virtio-mem devices.
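The dispatch idea can be sketched as a round-robin over the devices (a hypothetical helper, not the patch's actual code): a device that can no longer accept blocks, e.g. because of unmovable blocks on unplug, simply drops out and the remainder goes to the others.

```python
def dispatch_blocks(target_blocks, per_device_capacity):
    """Round-robin 'target_blocks' over the virtio-mem devices; a device
    whose capacity is exhausted stops accepting, and the remainder is
    dispatched to the remaining devices (illustrative sketch)."""
    alloc = [0] * len(per_device_capacity)
    remaining = target_blocks
    while remaining:
        progress = False
        for i, cap in enumerate(per_device_capacity):
            if remaining and alloc[i] < cap:
                alloc[i] += 1
                remaining -= 1
                progress = True
        if not progress:
            break  # no device can take more blocks
    return alloc, remaining

print(dispatch_blocks(10, [4, 8]))  # ([4, 6], 0)
```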
Also, the minimum block size currently supported by Linux guests is
4 MiB, even though virtio-mem can use 2 MiB on the QEMU side.
Changelog v3:
- cleanup based on Fiona's latest comments
- classic hotplug: fix memory unplug
- virtio-mem: split the retry logic into a second, optional patch
  (feel free to apply it later if needed)
Changelog v4:
- cleanup based on Fiona's latest comments
Alexandre Derumier (16):
memory: extract some code to their own sub for mocking
tests: add memory tests
memory: refactor sockets
memory: remove calls to parse_hotplug_features
add memory parser
memory: add get_static_mem
memory: use static_memory in foreach_dimm
config: memory: add 'max' option
memory: get_max_mem: use config memory max
memory: rename qemu_dimm_list to qemu_memdevices_list
memory: don't use foreach_reversedimm for unplug
memory: use 64 slots && static dimm size when max is defined
test: add memory-max tests
memory: add virtio-mem support
memory: virtio-mem : implement redispatch retry.
tests: add virtio-mem tests
PVE/API2/Qemu.pm | 38 +-
PVE/QemuConfig.pm | 4 +-
PVE/QemuMigrate.pm | 6 +-
PVE/QemuServer.pm | 40 +-
PVE/QemuServer/Helpers.pm | 3 +-
PVE/QemuServer/Memory.pm | 527 ++++++++++++++----
PVE/QemuServer/PCI.pm | 8 +
test/cfg2cmd/memory-hotplug-hugepages.conf | 12 +
.../cfg2cmd/memory-hotplug-hugepages.conf.cmd | 62 +++
test/cfg2cmd/memory-hotplug.conf | 11 +
test/cfg2cmd/memory-hotplug.conf.cmd | 174 ++++++
test/cfg2cmd/memory-hugepages-1g.conf | 11 +
test/cfg2cmd/memory-hugepages-1g.conf.cmd | 30 +
test/cfg2cmd/memory-hugepages-2m.conf | 11 +
test/cfg2cmd/memory-hugepages-2m.conf.cmd | 30 +
test/cfg2cmd/memory-max-128G.conf | 11 +
test/cfg2cmd/memory-max-128G.conf.cmd | 86 +++
test/cfg2cmd/memory-max-512G.conf | 11 +
test/cfg2cmd/memory-max-512G.conf.cmd | 58 ++
test/cfg2cmd/memory-virtio-hugepages-1G.conf | 12 +
.../memory-virtio-hugepages-1G.conf.cmd | 35 ++
test/cfg2cmd/memory-virtio-max.conf | 11 +
test/cfg2cmd/memory-virtio-max.conf.cmd | 35 ++
test/cfg2cmd/memory-virtio.conf | 11 +
test/cfg2cmd/memory-virtio.conf.cmd | 35 ++
test/run_config2command_tests.pl | 15 +
26 files changed, 1135 insertions(+), 152 deletions(-)
create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
create mode 100644 test/cfg2cmd/memory-hotplug.conf
create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd
create mode 100644 test/cfg2cmd/memory-max-128G.conf
create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
create mode 100644 test/cfg2cmd/memory-max-512G.conf
create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd
create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
create mode 100644 test/cfg2cmd/memory-virtio-max.conf
create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
create mode 100644 test/cfg2cmd/memory-virtio.conf
create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 01/16] memory: extract some code to their own sub for mocking
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 02/16] tests: add memory tests Alexandre Derumier
` (16 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 6fac468..e10c0b7 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -13,7 +13,7 @@ my $MAX_NUMA = 8;
my $STATICMEM = 1024;
my $_host_bits;
-my sub get_host_phys_address_bits {
+sub get_host_phys_address_bits {
return $_host_bits if defined($_host_bits);
my $fh = IO::File->new ('/proc/cpuinfo', "r") or return;
@@ -76,6 +76,12 @@ sub get_numa_node_list {
return (0..($sockets-1));
}
+sub host_numanode_exists {
+ my ($id) = @_;
+
+ return -d "/sys/devices/system/node/node$id/";
+}
+
# only valid when numa nodes map to a single host node
sub get_numa_guest_to_host_map {
my ($conf) = @_;
@@ -348,7 +354,7 @@ sub config {
my $numa_memory = ($static_memory / $sockets);
for (my $i = 0; $i < $sockets; $i++) {
- die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/" && $conf->{hugepages};
+ die "host NUMA node$i doesn't exist\n" if !host_numanode_exists($i) && $conf->{hugepages};
my $mem_object = print_mem_object($conf, "ram-node$i", $numa_memory);
push @$cmd, '-object', $mem_object;
@@ -402,7 +408,7 @@ sub print_numa_hostnodes {
$hostnodes .= "-$end" if defined($end);
$end //= $start;
for (my $i = $start; $i <= $end; ++$i ) {
- die "host NUMA node$i doesn't exist\n" if ! -d "/sys/devices/system/node/node$i/";
+ die "host NUMA node$i doesn't exist\n" if !host_numanode_exists($i);
}
}
return $hostnodes;
@@ -445,21 +451,26 @@ sub hugepages_nr {
return $size / $hugepages_size;
}
+sub hugepages_chunk_size_supported {
+ my ($size) = @_;
+
+ return -d "/sys/kernel/mm/hugepages/hugepages-". ($size * 1024) ."kB";
+}
+
sub hugepages_size {
my ($conf, $size) = @_;
die "hugepages option is not enabled" if !$conf->{hugepages};
die "memory size '$size' is not a positive even integer; cannot use for hugepages\n"
if $size <= 0 || $size & 1;
- my $page_chunk = sub { -d "/sys/kernel/mm/hugepages/hugepages-". ($_[0] * 1024) ."kB" };
- die "your system doesn't support hugepages\n" if !$page_chunk->(2) && !$page_chunk->(1024);
+ die "your system doesn't support hugepages\n" if !hugepages_chunk_size_supported(2) && !hugepages_chunk_size_supported(1024);
if ($conf->{hugepages} eq 'any') {
# try to use 1GB if available && memory size is matching
- if ($page_chunk->(1024) && ($size & 1023) == 0) {
+ if (hugepages_chunk_size_supported(1024) && ($size & 1023) == 0) {
return 1024;
- } elsif ($page_chunk->(2)) {
+ } elsif (hugepages_chunk_size_supported(2)) {
return 2;
} else {
die "host only supports 1024 GB hugepages, but requested size '$size' is not a multiple of 1024 MB\n"
@@ -468,7 +479,7 @@ sub hugepages_size {
my $hugepagesize = $conf->{hugepages};
- if (!$page_chunk->($hugepagesize)) {
+ if (!hugepages_chunk_size_supported($hugepagesize)) {
die "your system doesn't support hugepages of $hugepagesize MB\n";
} elsif (($size % $hugepagesize) != 0) {
die "Memory size $size is not a multiple of the requested hugepages size $hugepagesize\n";
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 02/16] tests: add memory tests
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 01/16] memory: extract some code to their own sub for mocking Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 03/16] memory: refactor sockets Alexandre Derumier
` (15 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
test/cfg2cmd/memory-hotplug-hugepages.conf | 12 ++
.../cfg2cmd/memory-hotplug-hugepages.conf.cmd | 62 +++++++
test/cfg2cmd/memory-hotplug.conf | 11 ++
test/cfg2cmd/memory-hotplug.conf.cmd | 174 ++++++++++++++++++
test/cfg2cmd/memory-hugepages-1g.conf | 11 ++
test/cfg2cmd/memory-hugepages-1g.conf.cmd | 30 +++
test/cfg2cmd/memory-hugepages-2m.conf | 11 ++
test/cfg2cmd/memory-hugepages-2m.conf.cmd | 30 +++
test/run_config2command_tests.pl | 15 ++
9 files changed, 356 insertions(+)
create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
create mode 100644 test/cfg2cmd/memory-hotplug.conf
create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd
diff --git a/test/cfg2cmd/memory-hotplug-hugepages.conf b/test/cfg2cmd/memory-hotplug-hugepages.conf
new file mode 100644
index 0000000..6cba31e
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug-hugepages.conf
@@ -0,0 +1,12 @@
+# TEST: memory hotplug with 1GB hugepage
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 18432
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd b/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
new file mode 100644
index 0000000..6d4d8e8
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
@@ -0,0 +1,62 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=2048,slots=255,maxmem=524288M' \
+ -object 'memory-backend-file,id=ram-node0,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-file,id=ram-node1,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-file,id=mem-dimm0,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+ -object 'memory-backend-file,id=mem-dimm1,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+ -object 'memory-backend-file,id=mem-dimm2,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+ -object 'memory-backend-file,id=mem-dimm3,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+ -object 'memory-backend-file,id=mem-dimm4,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+ -object 'memory-backend-file,id=mem-dimm5,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+ -object 'memory-backend-file,id=mem-dimm6,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+ -object 'memory-backend-file,id=mem-dimm7,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+ -object 'memory-backend-file,id=mem-dimm8,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+ -object 'memory-backend-file,id=mem-dimm9,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+ -object 'memory-backend-file,id=mem-dimm10,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+ -object 'memory-backend-file,id=mem-dimm11,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+ -object 'memory-backend-file,id=mem-dimm12,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+ -object 'memory-backend-file,id=mem-dimm13,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+ -object 'memory-backend-file,id=mem-dimm14,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+ -object 'memory-backend-file,id=mem-dimm15,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hotplug.conf b/test/cfg2cmd/memory-hotplug.conf
new file mode 100644
index 0000000..386e61f
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug.conf
@@ -0,0 +1,11 @@
+# TEST: basic memory hotplug
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 66560
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
diff --git a/test/cfg2cmd/memory-hotplug.conf.cmd b/test/cfg2cmd/memory-hotplug.conf.cmd
new file mode 100644
index 0000000..2da7955
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug.conf.cmd
@@ -0,0 +1,174 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=1024,slots=255,maxmem=524288M' \
+ -object 'memory-backend-ram,id=ram-node0,size=512M' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-ram,id=ram-node1,size=512M' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-ram,id=mem-dimm0,size=512M' \
+ -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm1,size=512M' \
+ -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm2,size=512M' \
+ -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm3,size=512M' \
+ -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm4,size=512M' \
+ -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm5,size=512M' \
+ -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm6,size=512M' \
+ -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm7,size=512M' \
+ -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm8,size=512M' \
+ -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm9,size=512M' \
+ -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm10,size=512M' \
+ -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm11,size=512M' \
+ -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm12,size=512M' \
+ -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm13,size=512M' \
+ -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm14,size=512M' \
+ -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm15,size=512M' \
+ -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm16,size=512M' \
+ -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm17,size=512M' \
+ -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm18,size=512M' \
+ -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm19,size=512M' \
+ -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm20,size=512M' \
+ -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm21,size=512M' \
+ -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm22,size=512M' \
+ -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm23,size=512M' \
+ -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm24,size=512M' \
+ -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm25,size=512M' \
+ -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm26,size=512M' \
+ -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm27,size=512M' \
+ -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm28,size=512M' \
+ -device 'pc-dimm,id=dimm28,memdev=mem-dimm28,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm29,size=512M' \
+ -device 'pc-dimm,id=dimm29,memdev=mem-dimm29,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm30,size=512M' \
+ -device 'pc-dimm,id=dimm30,memdev=mem-dimm30,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm31,size=512M' \
+ -device 'pc-dimm,id=dimm31,memdev=mem-dimm31,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm32,size=1024M' \
+ -device 'pc-dimm,id=dimm32,memdev=mem-dimm32,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm33,size=1024M' \
+ -device 'pc-dimm,id=dimm33,memdev=mem-dimm33,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm34,size=1024M' \
+ -device 'pc-dimm,id=dimm34,memdev=mem-dimm34,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm35,size=1024M' \
+ -device 'pc-dimm,id=dimm35,memdev=mem-dimm35,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm36,size=1024M' \
+ -device 'pc-dimm,id=dimm36,memdev=mem-dimm36,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm37,size=1024M' \
+ -device 'pc-dimm,id=dimm37,memdev=mem-dimm37,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm38,size=1024M' \
+ -device 'pc-dimm,id=dimm38,memdev=mem-dimm38,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm39,size=1024M' \
+ -device 'pc-dimm,id=dimm39,memdev=mem-dimm39,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm40,size=1024M' \
+ -device 'pc-dimm,id=dimm40,memdev=mem-dimm40,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm41,size=1024M' \
+ -device 'pc-dimm,id=dimm41,memdev=mem-dimm41,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm42,size=1024M' \
+ -device 'pc-dimm,id=dimm42,memdev=mem-dimm42,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm43,size=1024M' \
+ -device 'pc-dimm,id=dimm43,memdev=mem-dimm43,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm44,size=1024M' \
+ -device 'pc-dimm,id=dimm44,memdev=mem-dimm44,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm45,size=1024M' \
+ -device 'pc-dimm,id=dimm45,memdev=mem-dimm45,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm46,size=1024M' \
+ -device 'pc-dimm,id=dimm46,memdev=mem-dimm46,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm47,size=1024M' \
+ -device 'pc-dimm,id=dimm47,memdev=mem-dimm47,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm48,size=1024M' \
+ -device 'pc-dimm,id=dimm48,memdev=mem-dimm48,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm49,size=1024M' \
+ -device 'pc-dimm,id=dimm49,memdev=mem-dimm49,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm50,size=1024M' \
+ -device 'pc-dimm,id=dimm50,memdev=mem-dimm50,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm51,size=1024M' \
+ -device 'pc-dimm,id=dimm51,memdev=mem-dimm51,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm52,size=1024M' \
+ -device 'pc-dimm,id=dimm52,memdev=mem-dimm52,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm53,size=1024M' \
+ -device 'pc-dimm,id=dimm53,memdev=mem-dimm53,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm54,size=1024M' \
+ -device 'pc-dimm,id=dimm54,memdev=mem-dimm54,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm55,size=1024M' \
+ -device 'pc-dimm,id=dimm55,memdev=mem-dimm55,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm56,size=1024M' \
+ -device 'pc-dimm,id=dimm56,memdev=mem-dimm56,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm57,size=1024M' \
+ -device 'pc-dimm,id=dimm57,memdev=mem-dimm57,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm58,size=1024M' \
+ -device 'pc-dimm,id=dimm58,memdev=mem-dimm58,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm59,size=1024M' \
+ -device 'pc-dimm,id=dimm59,memdev=mem-dimm59,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm60,size=1024M' \
+ -device 'pc-dimm,id=dimm60,memdev=mem-dimm60,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm61,size=1024M' \
+ -device 'pc-dimm,id=dimm61,memdev=mem-dimm61,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm62,size=1024M' \
+ -device 'pc-dimm,id=dimm62,memdev=mem-dimm62,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm63,size=1024M' \
+ -device 'pc-dimm,id=dimm63,memdev=mem-dimm63,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm64,size=2048M' \
+ -device 'pc-dimm,id=dimm64,memdev=mem-dimm64,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm65,size=2048M' \
+ -device 'pc-dimm,id=dimm65,memdev=mem-dimm65,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm66,size=2048M' \
+ -device 'pc-dimm,id=dimm66,memdev=mem-dimm66,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm67,size=2048M' \
+ -device 'pc-dimm,id=dimm67,memdev=mem-dimm67,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm68,size=2048M' \
+ -device 'pc-dimm,id=dimm68,memdev=mem-dimm68,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm69,size=2048M' \
+ -device 'pc-dimm,id=dimm69,memdev=mem-dimm69,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm70,size=2048M' \
+ -device 'pc-dimm,id=dimm70,memdev=mem-dimm70,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm71,size=2048M' \
+ -device 'pc-dimm,id=dimm71,memdev=mem-dimm71,node=1' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hugepages-1g.conf b/test/cfg2cmd/memory-hugepages-1g.conf
new file mode 100644
index 0000000..8db2cca
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-1g.conf
@@ -0,0 +1,11 @@
+# TEST: memory with 1GB hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 8192
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hugepages-1g.conf.cmd b/test/cfg2cmd/memory-hugepages-1g.conf.cmd
new file mode 100644
index 0000000..63792d2
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-1g.conf.cmd
@@ -0,0 +1,30 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 8192 \
+ -object 'memory-backend-file,id=ram-node0,size=4096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-file,id=ram-node1,size=4096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hugepages-2m.conf b/test/cfg2cmd/memory-hugepages-2m.conf
new file mode 100644
index 0000000..f0d65fb
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-2m.conf
@@ -0,0 +1,11 @@
+# TEST: memory with 2MB hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 8192
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 2
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hugepages-2m.conf.cmd b/test/cfg2cmd/memory-hugepages-2m.conf.cmd
new file mode 100644
index 0000000..287c0ed
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-2m.conf.cmd
@@ -0,0 +1,30 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 8192 \
+ -object 'memory-backend-file,id=ram-node0,size=4096M,mem-path=/run/hugepages/kvm/2048kB,share=on,prealloc=yes' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-file,id=ram-node1,size=4096M,mem-path=/run/hugepages/kvm/2048kB,share=on,prealloc=yes' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc'
diff --git a/test/run_config2command_tests.pl b/test/run_config2command_tests.pl
index f097811..c496690 100755
--- a/test/run_config2command_tests.pl
+++ b/test/run_config2command_tests.pl
@@ -178,6 +178,21 @@ $qemu_server_config->mock(
},
);
+my $qemu_server_memory;
+$qemu_server_memory = Test::MockModule->new('PVE::QemuServer::Memory');
+$qemu_server_memory->mock(
+ hugepages_chunk_size_supported => sub {
+ return 1;
+ },
+ host_numanode_exists => sub {
+ my ($id) = @_;
+ return 1;
+ },
+ get_host_phys_address_bits => sub {
+ return 46;
+ }
+);
+
my $pve_common_tools;
$pve_common_tools = Test::MockModule->new('PVE::Tools');
$pve_common_tools->mock(
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 03/16] memory: refactor sockets
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 01/16] memory: extract some code to their own sub for mocking Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 02/16] tests: add memory tests Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 04/16] memory: remove calls to parse_hotplug_features Alexandre Derumier
` (14 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index e10c0b7..3899917 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -165,8 +165,7 @@ sub qemu_memory_hotplug {
return $value if !PVE::QemuServer::check_running($vmid);
- my $sockets = 1;
- $sockets = $conf->{sockets} if $conf->{sockets};
+ my $sockets = $conf->{sockets} || 1;
my $memory = $conf->{memory} || $defaults->{memory};
$value = $defaults->{memory} if !$value;
@@ -289,8 +288,7 @@ sub config {
if $conf->{"numa$i"};
}
- my $sockets = 1;
- $sockets = $conf->{sockets} if $conf->{sockets};
+ my $sockets = $conf->{sockets} || 1;
$static_memory = $STATICMEM;
$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
@@ -499,9 +497,7 @@ sub hugepages_topology {
my $defaults = PVE::QemuServer::load_defaults();
my $memory = $conf->{memory} || $defaults->{memory};
my $static_memory = 0;
- my $sockets = 1;
- $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer iused
- $sockets = $conf->{sockets} if $conf->{sockets};
+ my $sockets = $conf->{sockets} || 1;
my $numa_custom_topology = undef;
my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 04/16] memory: remove calls to parse_hotplug_features
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (2 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 03/16] memory: refactor sockets Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 05/16] add memory parser Alexandre Derumier
` (13 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer.pm | 6 ++++--
PVE/QemuServer/Memory.pm | 13 ++++++-------
2 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a0e16dc..97185e1 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3869,7 +3869,7 @@ sub config_to_command {
push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
}
- PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd);
+ PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features->{memory}, $cmd);
push @$cmd, '-S' if $conf->{freeze};
@@ -5902,7 +5902,9 @@ sub vm_start_nolock {
if ($conf->{hugepages}) {
my $code = sub {
- my $hugepages_topology = PVE::QemuServer::Memory::hugepages_topology($conf);
+ my $hotplug_features = parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
+ my $hugepages_topology = PVE::QemuServer::Memory::hugepages_topology($conf, $hotplug_features->{memory});
+
my $hugepages_host_topology = PVE::QemuServer::Memory::hugepages_host_topology();
PVE::QemuServer::Memory::hugepages_mount();
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 3899917..4812a9e 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -273,12 +273,12 @@ sub qemu_dimm_list {
}
sub config {
- my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd) = @_;
+ my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug, $cmd) = @_;
my $memory = $conf->{memory} || $defaults->{memory};
my $static_memory = 0;
- if ($hotplug_features->{memory}) {
+ if ($hotplug) {
die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
my $MAX_MEM = get_max_mem($conf);
die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory > $MAX_MEM;
@@ -365,7 +365,7 @@ sub config {
}
}
- if ($hotplug_features->{memory}) {
+ if ($hotplug) {
foreach_dimm($conf, $vmid, $memory, $sockets, sub {
my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
@@ -488,7 +488,7 @@ sub hugepages_size {
}
sub hugepages_topology {
- my ($conf) = @_;
+ my ($conf, $hotplug) = @_;
my $hugepages_topology = {};
@@ -499,9 +499,8 @@ sub hugepages_topology {
my $static_memory = 0;
my $sockets = $conf->{sockets} || 1;
my $numa_custom_topology = undef;
- my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
- if ($hotplug_features->{memory}) {
+ if ($hotplug) {
$static_memory = $STATICMEM;
$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
} else {
@@ -537,7 +536,7 @@ sub hugepages_topology {
}
}
- if ($hotplug_features->{memory}) {
+ if ($hotplug) {
my $numa_hostmap = get_numa_guest_to_host_map($conf);
foreach_dimm($conf, undef, $memory, $sockets, sub {
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 05/16] add memory parser
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (3 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 04/16] memory: remove calls to parse_hotplug_features Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 06/16] memory: add get_static_mem Alexandre Derumier
` (12 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/API2/Qemu.pm | 7 ++--
PVE/QemuConfig.pm | 4 +--
PVE/QemuMigrate.pm | 6 ++--
PVE/QemuServer.pm | 27 +++++++--------
PVE/QemuServer/Helpers.pm | 3 +-
PVE/QemuServer/Memory.pm | 71 ++++++++++++++++++++++++++++++++-------
6 files changed, 82 insertions(+), 36 deletions(-)
diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 587bb22..9b80da1 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
use PVE::QemuServer::ImportDisk;
use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
use PVE::QemuMigrate;
use PVE::RPCEnvironment;
use PVE::AccessControl;
@@ -1489,8 +1490,6 @@ my $update_vm_api = sub {
my $storecfg = PVE::Storage::config();
- my $defaults = PVE::QemuServer::load_defaults();
-
&$resolve_cdrom_alias($param);
# now try to verify all parameters
@@ -1608,7 +1607,9 @@ my $update_vm_api = sub {
}
if ($param->{memory} || defined($param->{balloon})) {
- my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
+
+ my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
+ my $maxmem = get_current_memory($memory);
my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
die "balloon value too large (must be smaller than assigned memory)\n"
diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
index 051382c..999e658 100644
--- a/PVE/QemuConfig.pm
+++ b/PVE/QemuConfig.pm
@@ -12,6 +12,7 @@ use PVE::QemuServer::Helpers;
use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer;
use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
use PVE::Storage;
use PVE::Tools;
use PVE::Format qw(render_bytes render_duration);
@@ -208,8 +209,7 @@ sub __snapshot_save_vmstate {
$target = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
}
- my $defaults = PVE::QemuServer::load_defaults();
- my $mem_size = $conf->{memory} // $defaults->{memory};
+ my $mem_size = get_current_memory($conf->{memory});
my $driver_state_size = 500; # assume 500MB is enough to safe all driver state;
# our savevm-start does live-save of the memory until the space left in the
# volume is just enough for the remaining memory content + internal state
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 09cc1d8..f86fdbe 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -26,6 +26,7 @@ use PVE::QemuServer::Drive;
use PVE::QemuServer::Helpers qw(min_version);
use PVE::QemuServer::Machine;
use PVE::QemuServer::Monitor qw(mon_cmd);
+use PVE::QemuServer::Memory qw(get_current_memory);
use PVE::QemuServer;
use PVE::AbstractMigrate;
@@ -1026,7 +1027,8 @@ sub phase2_start_remote_cluster {
my $remote_vmid = $self->{opts}->{remote}->{vmid};
# like regular start but with some overhead accounted for
- my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
+ my $memory = get_current_memory($self->{vmconf}->{memory});
+ my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}, $memory) + 10;
my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
@@ -1181,7 +1183,7 @@ sub phase2 {
$qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime);
# set cachesize to 10% of the total memory
- my $memory = $conf->{memory} || $defaults->{memory};
+ my $memory = get_current_memory($conf->{memory});
my $cachesize = int($memory * 1048576 / 10);
$cachesize = round_powerof2($cachesize);
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 97185e1..f1330f1 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -53,7 +53,7 @@ use PVE::QemuServer::CGroup;
use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options);
use PVE::QemuServer::Drive qw(is_valid_drivename drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory;
+use PVE::QemuServer::Memory qw(get_current_memory);
use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer::PCI qw(print_pci_addr print_pcie_addr print_pcie_root_port parse_hostpci);
use PVE::QemuServer::USB qw(parse_usb_device);
@@ -341,11 +341,9 @@ my $confdesc = {
},
memory => {
optional => 1,
- type => 'integer',
- description => "Amount of RAM for the VM in MiB. This is the maximum available memory when"
- ." you use the balloon device.",
- minimum => 16,
- default => 512,
+ type => 'string',
+ description => "Memory properties.",
+ format => $PVE::QemuServer::Memory::memory_fmt
},
balloon => {
optional => 1,
@@ -2933,8 +2931,7 @@ sub vmstatus {
$d->{cpus} = $conf->{vcpus} if $conf->{vcpus};
$d->{name} = $conf->{name} || "VM $vmid";
- $d->{maxmem} = $conf->{memory} ? $conf->{memory}*(1024*1024)
- : $defaults->{memory}*(1024*1024);
+ $d->{maxmem} = get_current_memory($conf->{memory})*(1024*1024);
if ($conf->{balloon}) {
$d->{balloon_min} = $conf->{balloon}*(1024*1024);
@@ -3869,7 +3866,7 @@ sub config_to_command {
push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
}
- PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features->{memory}, $cmd);
+ PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
push @$cmd, '-S' if $conf->{freeze};
@@ -5033,7 +5030,7 @@ sub vmconfig_hotplug_pending {
# enable balloon device is not hotpluggable
die "skip\n" if defined($conf->{balloon}) && $conf->{balloon} == 0;
# here we reset the ballooning value to memory
- my $balloon = $conf->{memory} || $defaults->{memory};
+ my $balloon = get_current_memory($conf->{memory});
mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
} elsif ($fast_plug_option->{$opt}) {
# do nothing
@@ -5046,7 +5043,7 @@ sub vmconfig_hotplug_pending {
vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
} elsif ($opt =~ m/^memory$/) {
die "skip\n" if !$hotplug_features->{memory};
- PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults);
+ PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
} elsif ($opt eq 'cpuunits') {
$cgroup->change_cpu_shares(undef);
} elsif ($opt eq 'cpulimit') {
@@ -5101,7 +5098,8 @@ sub vmconfig_hotplug_pending {
# allow manual ballooning if shares is set to zero
if ((defined($conf->{shares}) && ($conf->{shares} == 0))) {
- my $balloon = $conf->{pending}->{balloon} || $conf->{memory} || $defaults->{memory};
+ my $memory = get_current_memory($conf->{memory});
+ my $balloon = $conf->{pending}->{balloon} || $memory;
mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
}
} elsif ($opt =~ m/^net(\d+)$/) {
@@ -5121,7 +5119,7 @@ sub vmconfig_hotplug_pending {
$vmid, $opt, $value, $arch, $machine_type);
} elsif ($opt =~ m/^memory$/) { #dimms
die "skip\n" if !$hotplug_features->{memory};
- $value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $value);
+ $value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
} elsif ($opt eq 'cpuunits') {
my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
$cgroup->change_cpu_shares($new_cpuunits);
@@ -5798,7 +5796,8 @@ sub vm_start_nolock {
push @$cmd, '-S';
}
- my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $resume);
+ my $memory = get_current_memory($conf->{memory});
+ my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $memory, $resume);
my $pci_devices = {}; # host pci devices
for (my $i = 0; $i < $PVE::QemuServer::PCI::MAX_HOSTPCI_DEVICES; $i++) {
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index e91f906..9115d50 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -143,8 +143,7 @@ sub version_cmp {
}
sub config_aware_timeout {
- my ($config, $is_suspended) = @_;
- my $memory = $config->{memory};
+ my ($config, $memory, $is_suspended) = @_;
my $timeout = 30;
# Based on user reported startup time for vm with 512GiB @ 4-5 minutes
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 4812a9e..09fa25c 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -8,10 +8,42 @@ use PVE::Exception qw(raise raise_param_exc);
use PVE::QemuServer;
use PVE::QemuServer::Monitor qw(mon_cmd);
+use base qw(Exporter);
+
+our @EXPORT_OK = qw(
+get_current_memory
+);
my $MAX_NUMA = 8;
my $STATICMEM = 1024;
+our $memory_fmt = {
+ current => {
+ description => "Current amount of online RAM for the VM in MiB. This is the maximum available memory when"
+ ." you use the balloon device.",
+ type => 'integer',
+ default_key => 1,
+ minimum => 16,
+ default => 512,
+ },
+};
+
+sub print_memory {
+ my $memory = shift;
+
+ return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
+}
+
+sub parse_memory {
+ my ($value) = @_;
+
+ return { current => $memory_fmt->{current}->{default} } if !defined($value);
+
+ my $res = PVE::JSONSchema::parse_property_string($memory_fmt, $value);
+
+ return $res;
+}
+
my $_host_bits;
sub get_host_phys_address_bits {
return $_host_bits if defined($_host_bits);
@@ -63,6 +95,13 @@ my sub get_max_mem {
return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
}
+sub get_current_memory {
+ my ($value) = @_;
+
+ my $memory = parse_memory($value);
+ return $memory->{current};
+}
+
sub get_numa_node_list {
my ($conf) = @_;
my @numa_map;
@@ -161,16 +200,19 @@ sub foreach_reverse_dimm {
}
sub qemu_memory_hotplug {
- my ($vmid, $conf, $defaults, $value) = @_;
+ my ($vmid, $conf, $value) = @_;
return $value if !PVE::QemuServer::check_running($vmid);
- my $sockets = $conf->{sockets} || 1;
+ my $oldmem = parse_memory($conf->{memory});
+ my $newmem = parse_memory($value);
+
+ return $value if $newmem->{current} == $oldmem->{current};
- my $memory = $conf->{memory} || $defaults->{memory};
- $value = $defaults->{memory} if !$value;
- return $value if $value == $memory;
+ my $memory = $oldmem->{current};
+ $value = $newmem->{current};
+ my $sockets = $conf->{sockets} || 1;
my $static_memory = $STATICMEM;
$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
@@ -185,7 +227,7 @@ sub qemu_memory_hotplug {
foreach_dimm($conf, $vmid, $value, $sockets, sub {
my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
- return if $current_size <= $conf->{memory};
+ return if $current_size <= get_current_memory($conf->{memory});
if ($conf->{hugepages}) {
$numa_hostmap = get_numa_guest_to_host_map($conf) if !$numa_hostmap;
@@ -224,7 +266,8 @@ sub qemu_memory_hotplug {
die $err;
}
#update conf after each succesful module hotplug
- $conf->{memory} = $current_size;
+ $newmem->{current} = $current_size;
+ $conf->{memory} = print_memory($newmem);
PVE::QemuConfig->write_config($vmid, $conf);
});
@@ -233,7 +276,8 @@ sub qemu_memory_hotplug {
foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
- return if $current_size >= $conf->{memory};
+ return if $current_size >= get_current_memory($conf->{memory});
+
print "try to unplug memory dimm $name\n";
my $retry = 0;
@@ -247,12 +291,14 @@ sub qemu_memory_hotplug {
}
#update conf after each succesful module unplug
- $conf->{memory} = $current_size;
+ $newmem->{current} = $current_size;
+ $conf->{memory} = print_memory($newmem);
eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
PVE::QemuConfig->write_config($vmid, $conf);
});
}
+ return $conf->{memory};
}
sub qemu_dimm_list {
@@ -273,9 +319,9 @@ sub qemu_dimm_list {
}
sub config {
- my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug, $cmd) = @_;
+ my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
- my $memory = $conf->{memory} || $defaults->{memory};
+ my $memory = get_current_memory($conf->{memory});
my $static_memory = 0;
if ($hotplug) {
@@ -494,8 +540,7 @@ sub hugepages_topology {
return if !$conf->{numa};
- my $defaults = PVE::QemuServer::load_defaults();
- my $memory = $conf->{memory} || $defaults->{memory};
+ my $memory = get_current_memory($conf->{memory});
my $static_memory = 0;
my $sockets = $conf->{sockets} || 1;
my $numa_custom_topology = undef;
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 06/16] memory: add get_static_mem
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (4 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 05/16] add memory parser Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 07/16] memory: use static_memory in foreach_dimm Alexandre Derumier
` (11 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 37 +++++++++++++++++++------------------
1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 09fa25c..5dc2cb8 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -15,7 +15,6 @@ get_current_memory
);
my $MAX_NUMA = 8;
-my $STATICMEM = 1024;
our $memory_fmt = {
current => {
@@ -44,6 +43,22 @@ sub parse_memory {
return $res;
}
+my sub get_static_mem {
+ my ($conf, $sockets, $hotplug) = @_;
+
+ my $static_memory = 0;
+ my $memory = parse_memory($conf->{memory});
+
+ if ($hotplug) {
+ $static_memory = 1024;
+ $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+ } else {
+ $static_memory = $memory->{current};
+ }
+
+ return $static_memory;
+}
+
my $_host_bits;
sub get_host_phys_address_bits {
return $_host_bits if defined($_host_bits);
@@ -213,8 +228,7 @@ sub qemu_memory_hotplug {
$value = $newmem->{current};
my $sockets = $conf->{sockets} || 1;
- my $static_memory = $STATICMEM;
- $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+ my $static_memory = get_static_mem($conf, $sockets, 1);
die "memory can't be lower than $static_memory MB" if $value < $static_memory;
my $MAX_MEM = get_max_mem($conf);
@@ -322,7 +336,7 @@ sub config {
my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
my $memory = get_current_memory($conf->{memory});
- my $static_memory = 0;
+ my $static_memory = get_static_mem($conf, $sockets, $hotplug);
if ($hotplug) {
die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
@@ -334,17 +348,11 @@ sub config {
if $conf->{"numa$i"};
}
- my $sockets = $conf->{sockets} || 1;
-
- $static_memory = $STATICMEM;
- $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-
die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
} else {
- $static_memory = $memory;
push @$cmd, '-m', $static_memory;
}
@@ -541,17 +549,10 @@ sub hugepages_topology {
return if !$conf->{numa};
my $memory = get_current_memory($conf->{memory});
- my $static_memory = 0;
my $sockets = $conf->{sockets} || 1;
+ my $static_memory = get_static_mem($conf, $sockets, $hotplug);
my $numa_custom_topology = undef;
- if ($hotplug) {
- $static_memory = $STATICMEM;
- $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
- } else {
- $static_memory = $memory;
- }
-
#custom numa topology
for (my $i = 0; $i < $MAX_NUMA; $i++) {
next if !$conf->{"numa$i"};
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 07/16] memory: use static_memory in foreach_dimm
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (5 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 06/16] memory: add get_static_mem Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option Alexandre Derumier
` (10 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 5dc2cb8..32fbdc5 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -151,17 +151,15 @@ sub get_numa_guest_to_host_map {
}
sub foreach_dimm{
- my ($conf, $vmid, $memory, $sockets, $func) = @_;
+ my ($conf, $vmid, $memory, $static_memory, $func) = @_;
my $dimm_id = 0;
- my $current_size = 0;
+ my $current_size = $static_memory;
my $dimm_size = 0;
if($conf->{hugepages} && $conf->{hugepages} == 1024) {
- $current_size = 1024 * $sockets;
$dimm_size = 1024;
} else {
- $current_size = 1024;
$dimm_size = 512;
}
@@ -238,7 +236,7 @@ sub qemu_memory_hotplug {
my $numa_hostmap;
- foreach_dimm($conf, $vmid, $value, $sockets, sub {
+ foreach_dimm($conf, $vmid, $value, $static_memory, sub {
my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
return if $current_size <= get_current_memory($conf->{memory});
@@ -420,7 +418,7 @@ sub config {
}
if ($hotplug) {
- foreach_dimm($conf, $vmid, $memory, $sockets, sub {
+ foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
@@ -585,7 +583,7 @@ sub hugepages_topology {
if ($hotplug) {
my $numa_hostmap = get_numa_guest_to_host_map($conf);
- foreach_dimm($conf, undef, $memory, $sockets, sub {
+ foreach_dimm($conf, undef, $memory, $static_memory, sub {
my ($conf, undef, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
$numanode = $numa_hostmap->{$numanode};
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (6 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 07/16] memory: use static_memory in foreach_dimm Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-22 15:18 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max Alexandre Derumier
` (9 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
max can only be a multiple of 64GiB.
The dimm size is computed from the max memory,
as we can have 64 slots:
64GiB = 64 slots x 1GiB
128GiB = 64 slots x 2GiB
...
4TiB = 64 slots x 64GiB
Also, with NUMA, we need to share the slots between (up to 8) sockets,
and 64 is a multiple of 8:
64GiB = 8 sockets * 8 slots * 1GiB
128GiB = 8 sockets * 8 slots * 2GiB
...
And with virtio-mem, we can have 32000 blocks of 2MiB minimum:
64GB = 32000 * 2MiB
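The slot arithmetic above can be sketched as follows. This is an illustrative Python rendition, not code from the patch; the function name `dimm_size_mib` is invented for the example, and values are in MiB as in the config:

```python
GIB_MIB = 1024  # 1 GiB expressed in MiB

def dimm_size_mib(max_mib):
    """Derive the fixed dimm size for a 64-slot layout from 'max' (in MiB)."""
    if max_mib % (64 * GIB_MIB) != 0:
        raise ValueError("max memory must be a multiple of 64 GiB")
    # 64 fixed-size slots: max 64 GiB -> 1 GiB dimms, 128 GiB -> 2 GiB, ...
    return max_mib // 64
```

For example, `max=131072` (128 GiB) yields 2048 MiB dimms, matching the second line of the table above.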
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer.pm | 4 ++--
PVE/QemuServer/Memory.pm | 33 +++++++++++++++++++++++++++++++++
2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index f1330f1..3bd86da 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5042,7 +5042,7 @@ sub vmconfig_hotplug_pending {
vm_deviceunplug($vmid, $conf, $opt);
vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
} elsif ($opt =~ m/^memory$/) {
- die "skip\n" if !$hotplug_features->{memory};
+ die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf);
PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf);
} elsif ($opt eq 'cpuunits') {
$cgroup->change_cpu_shares(undef);
@@ -5118,7 +5118,7 @@ sub vmconfig_hotplug_pending {
vmconfig_update_disk($storecfg, $conf, $hotplug_features->{disk},
$vmid, $opt, $value, $arch, $machine_type);
} elsif ($opt =~ m/^memory$/) { #dimms
- die "skip\n" if !$hotplug_features->{memory};
+ die "skip\n" if !PVE::QemuServer::Memory::can_hotplug($hotplug_features->{memory}, $conf, $value);
$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $value);
} elsif ($opt eq 'cpuunits') {
my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 32fbdc5..deeb88f 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
use strict;
use warnings;
+use PVE::JSONSchema;
use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
use PVE::Exception qw(raise raise_param_exc);
+use PVE::GuestHelpers qw(safe_string_ne safe_num_ne safe_boolean_ne);
use PVE::QemuServer;
use PVE::QemuServer::Monitor qw(mon_cmd);
@@ -25,8 +27,26 @@ our $memory_fmt = {
minimum => 16,
default => 512,
},
+ max => {
+ type => 'integer',
+ optional => 1,
+ minimum => 65536,
+ maximum => 4194304,
+ format => 'pve-qm-memory-max',
+ },
};
+PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
+sub verify_qm_memory_max {
+ my ($max, $noerr) = @_;
+
+ return if $noerr;
+
+ die "max memory need to be a multiple of 64GiB\n" if $max && $max % 65536 != 0;
+
+ return $max
+}
+
sub print_memory {
my $memory = shift;
@@ -313,6 +333,19 @@ sub qemu_memory_hotplug {
return $conf->{memory};
}
+sub can_hotplug {
+ my ($hotplug, $conf, $value) = @_;
+
+ return if !$hotplug;
+
+ my $oldmem = parse_memory($conf->{memory});
+ my $newmem = parse_memory($value);
+
+ return if safe_num_ne($newmem->{max}, $oldmem->{max});
+
+ return 1;
+}
+
sub qemu_dimm_list {
my ($vmid) = @_;
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (7 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
` (8 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Verify that the defined VM memory max is not bigger than the
memory supported by the host CPU.
Also add an early check in the update VM API.
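The resolution order introduced here can be summarized like this. This is a hedged Python sketch of the logic (the patch itself is Perl, in get_max_mem/get_host_max_mem); the name `resolve_max_mem` is invented for illustration:

```python
def resolve_max_mem(host_max_mib, configured_max_mib=None):
    """A configured 'max' wins, but may never exceed what the host
    CPU's physical address bits allow (host_max_mib)."""
    if configured_max_mib is not None:
        if configured_max_mib > host_max_mib:
            raise ValueError(
                "configured memory max can't be bigger than supported "
                f"cpu architecture {host_max_mib} MiB")
        return configured_max_mib
    return host_max_mib
```

Running the same check early in the update API means an invalid `max` is rejected at config time instead of failing later at VM start.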
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/API2/Qemu.pm | 32 ++++++++++++++++++++++----------
PVE/QemuServer/Memory.pm | 19 ++++++++++++++++++-
2 files changed, 40 insertions(+), 11 deletions(-)
diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 9b80da1..6627910 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
use PVE::QemuServer::ImportDisk;
use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
use PVE::QemuMigrate;
use PVE::RPCEnvironment;
use PVE::AccessControl;
@@ -476,6 +476,26 @@ my $create_disks = sub {
return ($vollist, $res);
};
+my $check_memory_param = sub {
+ my ($conf, $param) = @_;
+
+ my $mem = parse_memory($param->{memory});
+ my $host_max_mem = get_host_max_mem($conf);
+
+ die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+ if $mem->{max} && $mem->{max} > $host_max_mem;
+
+ if ($param->{memory} || defined($param->{balloon})) {
+
+ my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
+ my $maxmem = get_current_memory($memory);
+ my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
+
+ die "balloon value too large (must be smaller than assigned memory)\n"
+ if $balloon && $balloon > $maxmem;
+ }
+};
+
my $check_cpu_model_access = sub {
my ($rpcenv, $authuser, $new, $existing) = @_;
@@ -1606,15 +1626,7 @@ my $update_vm_api = sub {
}
}
- if ($param->{memory} || defined($param->{balloon})) {
-
- my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
- my $maxmem = get_current_memory($memory);
- my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
-
- die "balloon value too large (must be smaller than assigned memory)\n"
- if $balloon && $balloon > $maxmem;
- }
+ &$check_memory_param($conf, $param);
PVE::Cluster::log_msg('info', $authuser, "update VM $vmid: " . join (' ', @paramarr));
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index deeb88f..c912e4f 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -14,6 +14,8 @@ use base qw(Exporter);
our @EXPORT_OK = qw(
get_current_memory
+parse_memory
+get_host_max_mem
);
my $MAX_NUMA = 8;
@@ -96,7 +98,7 @@ sub get_host_phys_address_bits {
return; # undef, cannot really do anything..
}
-my sub get_max_mem {
+sub get_host_max_mem {
my ($conf) = @_;
my $cpu = {};
@@ -130,6 +132,21 @@ my sub get_max_mem {
return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
}
+my sub get_max_mem {
+ my ($conf) = @_;
+
+ my $host_max_mem = get_host_max_mem($conf);
+ my $mem = parse_memory($conf->{memory});
+
+ if ($mem->{max}) {
+ die "configured memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
+ if $mem->{max} > $host_max_mem;
+ return $mem->{max};
+ }
+
+ return $host_max_mem;
+}
+
sub get_current_memory {
my ($value) = @_;
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (8 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
` (7 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Currently, qemu_dimm_list can return any kind of memory device.
Make it more generic, with a regex filter on the device id to choose
the kind of device.
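The filtering idea can be sketched as below. This is a toy Python illustration, not the Perl code from the patch; `memdevices_list` and the sample device dicts mimic the shape of QEMU's `query-memory-devices` reply as used by `qemu_memdevices_list`:

```python
import re

def memdevices_list(devices, dev_filter=None):
    """Index memory devices by id, optionally keeping only ids
    matching the given regex (e.g. r'^dimm(\d+)$')."""
    out = {}
    for dev in devices:
        dev_id = dev["data"]["id"]
        if dev_filter and not re.search(dev_filter, dev_id):
            continue
        out[dev_id] = dev["data"]
    return out
```

With no filter the behavior matches the old qemu_dimm_list (all devices); with `r'^dimm(\d+)$'` only classic dimms are returned, leaving room for a future virtio-mem filter.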
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index c912e4f..a13b3a1 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -333,7 +333,7 @@ sub qemu_memory_hotplug {
while (1) {
eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
sleep 3;
- my $dimm_list = qemu_dimm_list($vmid);
+ my $dimm_list = qemu_memdevices_list($vmid, '^dimm(\d+)$');
last if !$dimm_list->{$name};
raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
$retry++;
@@ -363,14 +363,14 @@ sub can_hotplug {
return 1;
}
-sub qemu_dimm_list {
- my ($vmid) = @_;
+sub qemu_memdevices_list {
+ my ($vmid, $filter) = @_;
my $dimmarray = mon_cmd($vmid, "query-memory-devices");
my $dimms = {};
foreach my $dimm (@$dimmarray) {
-
+ next if $filter && $dimm->{data}->{id} !~ /$filter/;
$dimms->{$dimm->{data}->{id}}->{id} = $dimm->{data}->{id};
$dimms->{$dimm->{data}->{id}}->{node} = $dimm->{data}->{node};
$dimms->{$dimm->{data}->{id}}->{addr} = $dimm->{data}->{addr};
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (9 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 12/16] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
` (6 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Simply use the dimm list returned by QEMU instead of recomputing it.
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 73 ++++++++++++----------------------------
1 file changed, 22 insertions(+), 51 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index a13b3a1..cf1ddb9 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -217,38 +217,6 @@ sub foreach_dimm{
}
}
-sub foreach_reverse_dimm {
- my ($conf, $vmid, $memory, $sockets, $func) = @_;
-
- my $dimm_id = 253;
- my $current_size = 0;
- my $dimm_size = 0;
-
- if($conf->{hugepages} && $conf->{hugepages} == 1024) {
- $current_size = 8355840;
- $dimm_size = 131072;
- } else {
- $current_size = 4177920;
- $dimm_size = 65536;
- }
-
- return if $current_size == $memory;
-
- my @numa_map = get_numa_node_list($conf);
-
- for (my $j = 0; $j < 8; $j++) {
- for (my $i = 0; $i < 32; $i++) {
- my $name = "dimm${dimm_id}";
- $dimm_id--;
- my $numanode = $numa_map[(31-$i) % @numa_map];
- $current_size -= $dimm_size;
- &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
- return $current_size if $current_size <= $memory;
- }
- $dimm_size /= 2;
- }
-}
-
sub qemu_memory_hotplug {
my ($vmid, $conf, $value) = @_;
@@ -322,30 +290,33 @@ sub qemu_memory_hotplug {
} else {
- foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
- my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+ my $dimms = qemu_dimm_list($vmid);
+ my $current_size = $memory;
+ for my $name (sort { $dimms->{$b}{slot} <=> $dimms->{$a}{slot} } keys %$dimms) {
- return if $current_size >= get_current_memory($conf->{memory});
+ my $dimm_size = $dimms->{$name}->{size} / 1024 / 1024;
- print "try to unplug memory dimm $name\n";
+ last if $current_size <= $value;
- my $retry = 0;
- while (1) {
- eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
- sleep 3;
- my $dimm_list = qemu_memdevices_list($vmid, '^dimm(\d+)$');
- last if !$dimm_list->{$name};
- raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
- $retry++;
- }
+ print "try to unplug memory dimm $name\n";
- #update conf after each succesful module unplug
- $newmem->{current} = $current_size;
- $conf->{memory} = print_memory($newmem);
+ my $retry = 0;
+ while (1) {
+ eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
+ sleep 3;
+ my $dimm_list = qemu_memdevices_list($vmid, '^dimm(\d+)$');
+ last if !$dimm_list->{$name};
+ raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
+ $retry++;
+ }
+ $current_size -= $dimm_size;
+ #update conf after each succesful module unplug
+ $newmem->{current} = $current_size;
+ $conf->{memory} = print_memory($newmem);
- eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
- PVE::QemuConfig->write_config($vmid, $conf);
- });
+ eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
+ PVE::QemuConfig->write_config($vmid, $conf);
+ }
}
return $conf->{memory};
}
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 12/16] memory: use 64 slots && static dimm size when max is defined
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (10 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 13/16] test: add memory-max tests Alexandre Derumier
` (5 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
The default kernel vhost config has only supported 64 slots since 2015, for
performance reasons.
The original memory hotplug code predates this and used QEMU's maximum of
255 slots.
To reach the maximum memory (4TB), it used incrementally growing dimm sizes.
Instead of dynamic dimm sizes, use a single static dimm size computed from
max memory / 64.
We also force static memory to 4GB, as memory below 4G does not work
reliably with unplug.
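The sizing rule above can be sketched in Python (function name is illustrative; the expected values match the memory-max test cases added later in this series):

```python
MAX_SLOTS = 64  # default kernel vhost slot limit

def static_mem_mib(max_mib, sockets):
    # One fixed dimm size per slot; static memory is one dimm per
    # socket, floored at 4 GiB so unplug keeps working.
    dimm_size = max_mib // MAX_SLOTS
    return max(dimm_size * sockets, 4096)

print(static_mem_mib(65536, 2))   # max=64G, 2 sockets -> 4096 (1G dimms)
print(static_mem_mib(524288, 2))  # max=512G, 2 sockets -> 16384 (8G dimms)
```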
Fix:
https://bugzilla.proxmox.com/show_bug.cgi?id=3446
https://bugzilla.proxmox.com/show_bug.cgi?id=1426
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index cf1ddb9..1b1c99d 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -19,6 +19,7 @@ get_host_max_mem
);
my $MAX_NUMA = 8;
+my $MAX_SLOTS = 64;
our $memory_fmt = {
current => {
@@ -71,7 +72,13 @@ my sub get_static_mem {
my $static_memory = 0;
my $memory = parse_memory($conf->{memory});
- if ($hotplug) {
+ if ($memory->{max}) {
+ my $dimm_size = $memory->{max} / $MAX_SLOTS;
+ #static mem can't be lower than 4G and lower than 1 dimmsize by socket
+ $static_memory = $dimm_size * $sockets;
+ $static_memory = 4096 if $static_memory < 4096;
+ } elsif ($hotplug) {
+ #legacy
$static_memory = 1024;
$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
} else {
@@ -194,7 +201,10 @@ sub foreach_dimm{
my $current_size = $static_memory;
my $dimm_size = 0;
- if($conf->{hugepages} && $conf->{hugepages} == 1024) {
+ my $confmem = parse_memory($conf->{memory});
+ if ($confmem->{max}) {
+ $dimm_size = $confmem->{max} / $MAX_SLOTS;
+ } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
$dimm_size = 1024;
} else {
$dimm_size = 512;
@@ -213,7 +223,7 @@ sub foreach_dimm{
&$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
return $current_size if $current_size >= $memory;
}
- $dimm_size *= 2;
+ $dimm_size *= 2 if !$confmem->{max};
}
}
@@ -368,7 +378,9 @@ sub config {
}
die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
- push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
+ my $confmem = parse_memory($conf->{memory});
+ my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
+ push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
} else {
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 13/16] test: add memory-max tests
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (11 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 12/16] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support Alexandre Derumier
` (4 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
test/cfg2cmd/memory-max-128G.conf | 11 ++++
test/cfg2cmd/memory-max-128G.conf.cmd | 86 +++++++++++++++++++++++++++
test/cfg2cmd/memory-max-512G.conf | 11 ++++
test/cfg2cmd/memory-max-512G.conf.cmd | 58 ++++++++++++++++++
4 files changed, 166 insertions(+)
create mode 100644 test/cfg2cmd/memory-max-128G.conf
create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
create mode 100644 test/cfg2cmd/memory-max-512G.conf
create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd
diff --git a/test/cfg2cmd/memory-max-128G.conf b/test/cfg2cmd/memory-max-128G.conf
new file mode 100644
index 0000000..8f0328d
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 64G with 2 socket, dimm size should be 1GB && static memory should be 4G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-128G.conf.cmd b/test/cfg2cmd/memory-max-128G.conf.cmd
new file mode 100644
index 0000000..5b22027
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf.cmd
@@ -0,0 +1,86 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=4096,slots=64,maxmem=65536M' \
+ -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-ram,id=mem-dimm0,size=1024M' \
+ -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm1,size=1024M' \
+ -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm2,size=1024M' \
+ -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm3,size=1024M' \
+ -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm4,size=1024M' \
+ -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm5,size=1024M' \
+ -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm6,size=1024M' \
+ -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm7,size=1024M' \
+ -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm8,size=1024M' \
+ -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm9,size=1024M' \
+ -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm10,size=1024M' \
+ -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm11,size=1024M' \
+ -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm12,size=1024M' \
+ -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm13,size=1024M' \
+ -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm14,size=1024M' \
+ -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm15,size=1024M' \
+ -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm16,size=1024M' \
+ -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm17,size=1024M' \
+ -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm18,size=1024M' \
+ -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm19,size=1024M' \
+ -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm20,size=1024M' \
+ -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm21,size=1024M' \
+ -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm22,size=1024M' \
+ -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm23,size=1024M' \
+ -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm24,size=1024M' \
+ -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm25,size=1024M' \
+ -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm26,size=1024M' \
+ -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm27,size=1024M' \
+ -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf b/test/cfg2cmd/memory-max-512G.conf
new file mode 100644
index 0000000..082f75e
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf
@@ -0,0 +1,11 @@
+# TEST: memorymax 512GB with 2 socket, dimm size should be 8GB && static memory should be 16G
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 131072,max=524288
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf.cmd b/test/cfg2cmd/memory-max-512G.conf.cmd
new file mode 100644
index 0000000..3c04b70
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf.cmd
@@ -0,0 +1,58 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=16384,slots=64,maxmem=524288M' \
+ -object 'memory-backend-ram,id=ram-node0,size=8192M' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-ram,id=ram-node1,size=8192M' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-ram,id=mem-dimm0,size=8192M' \
+ -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm1,size=8192M' \
+ -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm2,size=8192M' \
+ -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm3,size=8192M' \
+ -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm4,size=8192M' \
+ -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm5,size=8192M' \
+ -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm6,size=8192M' \
+ -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm7,size=8192M' \
+ -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm8,size=8192M' \
+ -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm9,size=8192M' \
+ -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm10,size=8192M' \
+ -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm11,size=8192M' \
+ -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+ -object 'memory-backend-ram,id=mem-dimm12,size=8192M' \
+ -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+ -object 'memory-backend-ram,id=mem-dimm13,size=8192M' \
+ -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc+pve0'
\ No newline at end of file
--
2.30.2
* [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (12 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 13/16] test: add memory-max tests Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry Alexandre Derumier
` (3 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
A 4GiB static memory region is needed for DMA + boot memory, as this memory
can almost never be unplugged.
One virtio-mem PCI device is set up for each NUMA node on the pci.4 bridge.
virtio-mem uses a fixed number of 32000 blocks; the block size is computed
from (max memory - 4096) / 32000, with a minimum of 2MiB to map THP.
(A lower block size means a better chance to unplug memory.)
Note: currently, Linux guests only support a 4MiB virtio-mem block size;
2MiB support is in progress.
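The block-size computation can be sketched in Python (an illustrative translation of the get_virtiomem_block_size() logic in this patch, sizes in MiB):

```python
import math

def virtiomem_block_size_mib(max_mem_mib, static_mib=4096):
    # virtio-mem addresses at most 32000 blocks, so take the smallest
    # power-of-two block size covering (max - static), with a 2 MiB THP
    # floor, then clamp to the 4 MiB minimum current Linux guests support.
    blocksize = (max_mem_mib - static_mib) / 32000
    blocksize = max(blocksize, 2)
    blocksize = 2 ** math.ceil(math.log2(blocksize))
    return max(blocksize, 4)

print(virtiomem_block_size_mib(64 * 1024))    # max=64 GiB  -> 4 MiB blocks
print(virtiomem_block_size_mib(1024 * 1024))  # max=1 TiB   -> 64 MiB blocks
```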
For hotplug/unplug, we try to allocate/deallocate the same amount of memory,
aligned to the block size, on each NUMA node if possible.
If a node is not able to reach its target memory (e.g. because of an
unmovable page on unplug), we try again to redispatch the remaining memory
on the other nodes.
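The even, block-aligned split performed per node can be sketched as (hypothetical helper; the patch's balance_virtiomem() additionally polls progress and tracks per-node errors):

```python
def split_target(target_total_mib, nodes, blocksize_mib):
    # Every node gets the same blocksize-aligned share; the last node
    # absorbs whatever remainder cannot be shared evenly.
    aligned = target_total_mib // nodes // blocksize_mib * blocksize_mib
    return [aligned] * (nodes - 1) + [target_total_mib - aligned * (nodes - 1)]

print(split_target(12288, 2, 4))  # -> [6144, 6144]
print(split_target(10000, 3, 4))  # -> [3332, 3332, 3336]
```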
About hugepages:
For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.
Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.
fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=2949
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/API2/Qemu.pm | 9 +-
PVE/QemuServer.pm | 7 +-
PVE/QemuServer/Memory.pm | 215 ++++++++++++++++++++++++++++++++++++---
PVE/QemuServer/PCI.pm | 8 ++
4 files changed, 222 insertions(+), 17 deletions(-)
diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 6627910..7a2128a 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
use PVE::QemuServer::ImportDisk;
use PVE::QemuServer::Monitor qw(mon_cmd);
use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
+use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem get_virtiomem_block_size);
use PVE::QemuMigrate;
use PVE::RPCEnvironment;
use PVE::AccessControl;
@@ -485,6 +485,13 @@ my $check_memory_param = sub {
die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
if $mem->{max} && $mem->{max} > $host_max_mem;
+ if ($mem->{virtio}) {
+ my $blocksize = get_virtiomem_block_size($conf);
+
+ die "memory need to be a multiple of $blocksize MiB when virtiomem is enabled\n"
+ if $mem->{current} % $blocksize != 0;
+ }
+
if ($param->{memory} || defined($param->{balloon})) {
my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 3bd86da..a34ff3e 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3866,7 +3866,12 @@ sub config_to_command {
push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
}
- PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd);
+ my $mem_devices = {};
+ PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $hotplug_features->{memory}, $cmd, $mem_devices);
+ foreach my $id (sort keys %$mem_devices) {
+ my $pciaddr = print_pci_addr($id, $bridges, $arch, $machine_type);
+ push @$devices, "-device", "$mem_devices->{$id}$pciaddr";
+ }
push @$cmd, '-S' if $conf->{freeze};
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 1b1c99d..bf4e92a 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -3,6 +3,8 @@ package PVE::QemuServer::Memory;
use strict;
use warnings;
+use POSIX qw(ceil);
+
use PVE::JSONSchema;
use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
use PVE::Exception qw(raise raise_param_exc);
@@ -16,6 +18,7 @@ our @EXPORT_OK = qw(
get_current_memory
parse_memory
get_host_max_mem
+get_virtiomem_block_size
);
my $MAX_NUMA = 8;
@@ -37,6 +40,12 @@ our $memory_fmt = {
maximum => 4194304,
format => 'pve-qm-memory-max',
},
+ virtio => {
+ description => "Enable virtio-mem memory (Experimental: Only works with Linux guest with kernel >= 5.10)",
+ type => 'boolean',
+ optional => 1,
+ default => 0,
+ },
};
PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
@@ -72,7 +81,9 @@ my sub get_static_mem {
my $static_memory = 0;
my $memory = parse_memory($conf->{memory});
- if ($memory->{max}) {
+ if ($memory->{virtio}) {
+ $static_memory = 4096;
+ } elsif ($memory->{max}) {
my $dimm_size = $memory->{max} / $MAX_SLOTS;
#static mem can't be lower than 4G and lower than 1 dimmsize by socket
$static_memory = $dimm_size * $sockets;
@@ -161,6 +172,117 @@ sub get_current_memory {
return $memory->{current};
}
+sub get_virtiomem_block_size {
+ my ($conf) = @_;
+
+ my $sockets = $conf->{sockets} || 1;
+ my $MAX_MEM = get_max_mem($conf);
+ my $static_memory = get_static_mem($conf, $sockets);
+ my $memory = get_current_memory($conf->{memory});
+
+ #virtiomem can map 32000 block size.
+ #try to use lowest blocksize, lower = more chance to unplug memory.
+ my $blocksize = ($MAX_MEM - $static_memory) / 32000;
+ #2MB is the minimum to be aligned with THP
+ $blocksize = 2 if $blocksize < 2;
+ $blocksize = 2**(ceil(log($blocksize)/log(2)));
+ #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
+ $blocksize = 4 if $blocksize < 4;
+
+ return $blocksize;
+}
+
+my sub get_virtiomem_total_current_size {
+ my ($mems) = @_;
+ my $size = 0;
+ for my $mem (values %$mems) {
+ $size += $mem->{current};
+ }
+ return $size;
+}
+
+my sub balance_virtiomem {
+ my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
+
+ my $nb_virtiomem = scalar(keys %$virtiomems);
+
+ print"try to balance memory on $nb_virtiomem virtiomems\n";
+
+ #if we can't share exactly the same amount, we add the remainder on last node
+ my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
+ my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
+
+ my $i = 0;
+ foreach my $id (sort keys %$virtiomems) {
+ my $virtiomem = $virtiomems->{$id};
+ $i++;
+ my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
+ $virtiomem->{completed} = 0;
+ $virtiomem->{retry} = 0;
+ $virtiomem->{target} = $virtiomem_target;
+
+ print "virtiomem$id: set-requested-size : $virtiomem_target\n";
+ mon_cmd($vmid, 'qom-set',
+ path => "/machine/peripheral/virtiomem$id",
+ property => "requested-size",
+ value => $virtiomem_target * 1024 * 1024);
+ }
+
+ my $total_finished = 0;
+ my $error = undef;
+
+ while ($total_finished != $nb_virtiomem) {
+
+ sleep 1;
+
+ $total_finished = 0;
+
+ foreach my $id (keys %$virtiomems) {
+
+ my $virtiomem = $virtiomems->{$id};
+
+ if ($virtiomem->{error} || $virtiomem->{completed}) {
+ $total_finished++;
+ next;
+ }
+
+ my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
+ $virtiomem->{current} = $size / 1024 / 1024;
+ print"virtiomem$id: last: $virtiomem->{last} current: $virtiomem->{current} target: $virtiomem->{target}\n";
+
+ if($virtiomem->{current} == $virtiomem->{target}) {
+ print"virtiomem$id: completed\n";
+ $virtiomem->{completed} = 1;
+ next;
+ }
+
+ if($virtiomem->{current} != $virtiomem->{last}) {
+ #if value has changed, but not yet completed
+ print "virtiomem$id: changed but don't not reach target yet\n";
+ $virtiomem->{retry} = 0;
+ $virtiomem->{last} = $virtiomem->{current};
+ next;
+ }
+
+ if($virtiomem->{retry} >= 5) {
+ print "virtiomem$id: too many retry. set error\n";
+ $virtiomem->{error} = 1;
+ $error = 1;
+ #as change is async, we don't want that value change after the api call
+ eval {
+ mon_cmd($vmid, 'qom-set',
+ path => "/machine/peripheral/virtiomem$id",
+ property => "requested-size",
+ value => $virtiomem->{current} * 1024 *1024);
+ };
+ }
+ print"virtiomem$id: increase retry: $virtiomem->{retry}\n";
+ $virtiomem->{retry}++;
+ }
+ }
+ die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
+}
+
sub get_numa_node_list {
my ($conf) = @_;
my @numa_map;
@@ -247,7 +369,39 @@ sub qemu_memory_hotplug {
my $MAX_MEM = get_max_mem($conf);
die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
- if ($value > $memory) {
+ my $confmem = parse_memory($conf->{memory});
+
+ if ($confmem->{virtio}) {
+ my $blocksize = get_virtiomem_block_size($conf);
+
+ my $virtiomems = {};
+
+ for (my $i = 0; $i < $sockets; $i++) {
+ my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$i", property => "size");
+ $size = $size / 1024 /1024;
+ $virtiomems->{$i} = {
+ current => $size,
+ last => $size,
+ error => 0,
+ completed => 0,
+ retry => 0
+ };
+ }
+
+ my $target_total = $value - $static_memory;
+ my $err;
+ eval {
+ balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
+ };
+ $err = $@ if $@;
+
+ my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+ $newmem->{current} = $current_memory;
+ $conf->{memory} = print_memory($newmem);
+ PVE::QemuConfig->write_config($vmid, $conf);
+ die $err if $err;
+
+ } elsif ($value > $memory) {
my $numa_hostmap;
@@ -339,8 +493,8 @@ sub can_hotplug {
my $oldmem = parse_memory($conf->{memory});
my $newmem = parse_memory($value);
- return if safe_num_ne($newmem->{max}, $oldmem->{max});
-
+ return if (safe_num_ne($newmem->{max}, $oldmem->{max}) ||
+ safe_boolean_ne($newmem->{virtio}, $oldmem->{virtio}));
return 1;
}
@@ -362,7 +516,7 @@ sub qemu_memdevices_list {
}
sub config {
- my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd) = @_;
+ my ($conf, $vmid, $sockets, $cores, $hotplug, $cmd, $mem_devices) = @_;
my $memory = get_current_memory($conf->{memory});
my $static_memory = get_static_mem($conf, $sockets, $hotplug);
@@ -380,7 +534,10 @@ sub config {
die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
my $confmem = parse_memory($conf->{memory});
my $slots = $confmem->{max} ? $MAX_SLOTS : 255;
- push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
+ my $cmdstr = "size=${static_memory}";
+ $cmdstr .= ",slots=$slots" if !$confmem->{'virtio'};
+ $cmdstr .= ",maxmem=${MAX_MEM}M";
+ push @$cmd, '-m', $cmdstr;
} else {
@@ -451,17 +608,42 @@ sub config {
}
if ($hotplug) {
- foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
- my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
- my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+ my $confmem = parse_memory($conf->{memory});
+
+ if ($confmem->{'virtio'}) {
- push @$cmd, "-object" , $mem_object;
- push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+ my $MAX_MEM = get_max_mem($conf);
+ my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
+ my $node_mem = ($memory - $static_memory) / $sockets;
+ my $blocksize = get_virtiomem_block_size($conf);
- die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
- if $current_size > $memory;
- });
+ die "memory need to be a multiple of $blocksize MiB with maxmemory $MAX_MEM MiB when virtiomem is enabled\n"
+ if $memory % $blocksize != 0;
+
+ for (my $i = 0; $i < $sockets; $i++) {
+
+ my $id = "virtiomem$i";
+ my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
+ push @$cmd, "-object" , "$mem_object,reserve=off";
+
+ my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i";
+ $mem_device .= ",prealloc=on" if $conf->{hugepages};
+ $mem_devices->{$id} = $mem_device;
+ }
+ } else {
+ foreach_dimm($conf, $vmid, $memory, $static_memory, sub {
+ my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
+
+ my $mem_object = print_mem_object($conf, "mem-$name", $dimm_size);
+
+ push @$cmd, "-object" , $mem_object;
+ push @$cmd, "-device", "pc-dimm,id=$name,memdev=mem-$name,node=$numanode";
+
+ die "memory size ($memory) must be aligned to $dimm_size for hotplugging\n"
+ if $current_size > $memory;
+ });
+ }
}
}
@@ -472,8 +654,11 @@ sub print_mem_object {
my $hugepages_size = hugepages_size($conf, $size);
my $path = hugepages_mount_path($hugepages_size);
+ my $confmem = parse_memory($conf->{memory});
- return "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on,prealloc=yes";
+ my $object = "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on";
+ $object .= ",prealloc=yes" if !$confmem->{virtio};
+ return $object;
} else {
return "memory-backend-ram,id=$id,size=${size}M";
}
diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index a18b974..0187c74 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -249,6 +249,14 @@ sub get_pci_addr_map {
'scsihw2' => { bus => 4, addr => 1 },
'scsihw3' => { bus => 4, addr => 2 },
'scsihw4' => { bus => 4, addr => 3 },
+ 'virtiomem0' => { bus => 4, addr => 4 },
+ 'virtiomem1' => { bus => 4, addr => 5 },
+ 'virtiomem2' => { bus => 4, addr => 6 },
+ 'virtiomem3' => { bus => 4, addr => 7 },
+ 'virtiomem4' => { bus => 4, addr => 8 },
+ 'virtiomem5' => { bus => 4, addr => 9 },
+ 'virtiomem6' => { bus => 4, addr => 10 },
+ 'virtiomem7' => { bus => 4, addr => 11 },
} if !defined($pci_addr_map);
return $pci_addr_map;
}
--
2.30.2
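The block-size selection used for the virtio-mem devices above (at most 32000 blocks per device, power-of-two size, THP-aligned, and at least 4 MiB for current Linux guests) can be sketched as follows. This is a Python rendering of the Perl logic for illustration only; function and parameter names are mine:

```python
import math

def virtiomem_block_size(max_mem_mib: int, static_mem_mib: int) -> int:
    """Pick the virtio-mem block size in MiB (sketch of get_virtiomem_block_size).

    A virtio-mem device can manage at most 32000 blocks, so the block size
    is derived from the hotpluggable range; a smaller block size gives a
    better chance to unplug memory later.
    """
    blocksize = (max_mem_mib - static_mem_mib) / 32000
    blocksize = max(blocksize, 2)                      # >= 2 MiB to stay THP-aligned
    blocksize = 2 ** math.ceil(math.log2(blocksize))   # round up to a power of two
    return max(blocksize, 4)                           # Linux guests (kernel <= 6.2) need >= 4 MiB

# e.g. the default maxmem of 524288 MiB with 4096 MiB static memory:
# (524288 - 4096) / 32000 ~= 16.3  ->  32 MiB blocks
```

This matches the test expectations later in the series: `maxmem=524288M` yields `block-size=32M`, while `maxmem=65536M` yields the 4 MiB minimum.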
* [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry.
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (13 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 16/16] tests: add virtio-mem tests Alexandre Derumier
` (2 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
If some memory can't be removed on a specific node,
we try to rebalance the remainder on the other nodes
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
PVE/QemuServer/Memory.pm | 51 +++++++++++++++++++++++++++-------------
1 file changed, 35 insertions(+), 16 deletions(-)
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index bf4e92a..f02b4e0 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -201,13 +201,28 @@ my sub get_virtiomem_total_current_size {
return $size;
}
+my sub get_virtiomem_total_errors_size {
+ my ($mems) = @_;
+
+ my $size = 0;
+ for my $mem (values %$mems) {
+ next if !$mem->{error};
+ $size += $mem->{current};
+ }
+ return $size;
+}
+
my sub balance_virtiomem {
my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
- my $nb_virtiomem = scalar(keys %$virtiomems);
+ my $nb_virtiomem = scalar(grep { !$_->{error} } values $virtiomems->%*);
print"try to balance memory on $nb_virtiomem virtiomems\n";
+ die "No more available blocks in virtiomem to balance all requested memory\n"
+ if $target_total < 0;
+ die "No more available virtiomem to balance the remaining memory\n" if $nb_virtiomem == 0;
+
#if we can't share exactly the same amount, we add the remainder on last node
my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
@@ -215,6 +230,7 @@ my sub balance_virtiomem {
my $i = 0;
foreach my $id (sort keys %$virtiomems) {
my $virtiomem = $virtiomems->{$id};
+ next if $virtiomem->{error};
$i++;
my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
$virtiomem->{completed} = 0;
@@ -229,7 +245,6 @@ my sub balance_virtiomem {
}
my $total_finished = 0;
- my $error = undef;
while ($total_finished != $nb_virtiomem) {
@@ -267,7 +282,6 @@ my sub balance_virtiomem {
if($virtiomem->{retry} >= 5) {
print "virtiomem$id: too many retry. set error\n";
$virtiomem->{error} = 1;
- $error = 1;
#as change is async, we don't want that value change after the api call
eval {
mon_cmd($vmid, 'qom-set',
@@ -280,7 +294,6 @@ my sub balance_virtiomem {
$virtiomem->{retry}++;
}
}
- die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
}
sub get_numa_node_list {
@@ -388,18 +401,24 @@ sub qemu_memory_hotplug {
};
}
- my $target_total = $value - $static_memory;
- my $err;
- eval {
- balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
- };
- $err = $@ if $@;
-
- my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
- $newmem->{current} = $current_memory;
- $conf->{memory} = print_memory($newmem);
- PVE::QemuConfig->write_config($vmid, $conf);
- die $err if $err;
+ while (1) {
+
+ my $target_total = $value - $static_memory - get_virtiomem_total_errors_size($virtiomems);
+ my $err;
+ eval {
+ balance_virtiomem($vmid, $virtiomems, $blocksize, $target_total);
+ };
+ $err = $@ if $@;
+
+ my $current_memory = $static_memory + get_virtiomem_total_current_size($virtiomems);
+ $newmem->{current} = $current_memory;
+ $conf->{memory} = print_memory($newmem);
+ PVE::QemuConfig->write_config($vmid, $conf);
+
+ die $err if $err;
+ last if $current_memory == $value;
+ }
+ return $conf->{memory};
} elsif ($value > $memory) {
--
2.30.2
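The rebalancing step this patch retries — splitting the remaining hotplug target across the virtio-mem devices that have not errored out, with the unsplittable remainder added to the last device — can be sketched like this (Python for illustration; names and the exception messages mirror the patch but are not the actual implementation):

```python
def balance_targets(target_total: int, blocksize: int, n_devices: int) -> list[int]:
    """Split target_total MiB across n_devices virtio-mem devices,
    block-aligned, mirroring balance_virtiomem()."""
    if target_total < 0:
        raise RuntimeError("No more available blocks in virtiomem to balance all requested memory")
    if n_devices == 0:
        raise RuntimeError("No more available virtiomem to balance the remaining memory")
    aligned = target_total // n_devices // blocksize * blocksize
    # any remainder that cannot be split evenly goes to the last device
    return [aligned] * (n_devices - 1) + [target_total - aligned * (n_devices - 1)]

# 32768 MiB total with 4096 MiB static memory over 2 devices, 32 MiB blocks:
# balance_targets(28672, 32, 2) -> [14336, 14336]
```

The outer `while (1)` loop in `qemu_memory_hotplug` then recomputes `target_total` minus the memory stuck on errored devices and calls the balancer again until the requested size is reached or no device remains.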
* [pve-devel] [PATCH v4 qemu-server 16/16] tests: add virtio-mem tests
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (14 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry Alexandre Derumier
@ 2023-02-13 12:00 ` Alexandre Derumier
2023-02-15 13:42 ` [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Fiona Ebner
2023-02-22 15:25 ` [pve-devel] " Fiona Ebner
17 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2023-02-13 12:00 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
test/cfg2cmd/memory-virtio-hugepages-1G.conf | 12 +++++++
.../memory-virtio-hugepages-1G.conf.cmd | 35 +++++++++++++++++++
test/cfg2cmd/memory-virtio-max.conf | 11 ++++++
test/cfg2cmd/memory-virtio-max.conf.cmd | 35 +++++++++++++++++++
test/cfg2cmd/memory-virtio.conf | 11 ++++++
test/cfg2cmd/memory-virtio.conf.cmd | 35 +++++++++++++++++++
6 files changed, 139 insertions(+)
create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
create mode 100644 test/cfg2cmd/memory-virtio-max.conf
create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
create mode 100644 test/cfg2cmd/memory-virtio.conf
create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd
diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
new file mode 100644
index 0000000..bbe908c
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
@@ -0,0 +1,12 @@
+# TEST: virtio-mem without memorymax defined and hugepages
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
new file mode 100644
index 0000000..2e7de74
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=4096,maxmem=524288M' \
+ -object 'memory-backend-file,id=ram-node0,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-file,id=ram-node1,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-file,id=mem-virtiomem0,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+ -object 'memory-backend-file,id=mem-virtiomem1,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,prealloc=on,bus=pci.4,addr=0x4' \
+ -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,prealloc=on,bus=pci.4,addr=0x5' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf b/test/cfg2cmd/memory-virtio-max.conf
new file mode 100644
index 0000000..59b8e94
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf
@@ -0,0 +1,11 @@
+# TEST: memory max 64G with 2 socket with virtio-mem
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,max=65536,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf.cmd b/test/cfg2cmd/memory-virtio-max.conf.cmd
new file mode 100644
index 0000000..6bda268
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=4096,maxmem=65536M' \
+ -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-ram,id=mem-virtiomem0,size=30720M,reserve=off' \
+ -object 'memory-backend-ram,id=mem-virtiomem1,size=30720M,reserve=off' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+ -device 'virtio-mem-pci,block-size=4M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc+pve0'
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf b/test/cfg2cmd/memory-virtio.conf
new file mode 100644
index 0000000..60c0264
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf
@@ -0,0 +1,11 @@
+# TEST: virtio-mem with memorymax defined
+# QEMU_VERSION: 7.2
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf.cmd b/test/cfg2cmd/memory-virtio.conf.cmd
new file mode 100644
index 0000000..f392636
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+ -id 8006 \
+ -name 'simple,debug-threads=on' \
+ -no-shutdown \
+ -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+ -mon 'chardev=qmp,mode=control' \
+ -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+ -mon 'chardev=qmp-event,mode=control' \
+ -pidfile /var/run/qemu-server/8006.pid \
+ -daemonize \
+ -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+ -smp '4,sockets=2,cores=2,maxcpus=4' \
+ -nodefaults \
+ -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+ -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+ -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+ -m 'size=4096,maxmem=524288M' \
+ -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+ -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+ -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+ -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+ -object 'memory-backend-ram,id=mem-virtiomem0,size=260096M,reserve=off' \
+ -object 'memory-backend-ram,id=mem-virtiomem1,size=260096M,reserve=off' \
+ -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+ -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+ -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+ -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+ -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+ -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+ -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+ -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+ -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+ -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+ -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+ -machine 'type=pc+pve0'
\ No newline at end of file
--
2.30.2
* [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (15 preceding siblings ...)
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 16/16] tests: add virtio-mem tests Alexandre Derumier
@ 2023-02-15 13:42 ` Fiona Ebner
2023-02-16 12:35 ` Fiona Ebner
2023-02-22 15:25 ` [pve-devel] " Fiona Ebner
17 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2023-02-15 13:42 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> This patch series rework the current memory hotplug + virtiomem.
>
> memory option now have extra options:
>
> memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
> ex: memory: current=1024,max=131072,virtio=1
>
>
> for classic memory hotplug, when maxmemory is defined,
> we use 64 fixed size dimm.
> The max option is a multiple of 64GB.
>
> The virtio option enable new virtio-mem support,
> instead of plugging dimm, it's add/removed block inside
> big dimm.
> virtio-mem can use 32000 blocks, the blocksize is compute from
> max memory.
>
>
Thanks! Applied the first 4 patches, will take a look at the others
tomorrow. We still need to adapt HA before memory can be turned into a
format string here ;)
* Re: [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem
2023-02-15 13:42 ` [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Fiona Ebner
@ 2023-02-16 12:35 ` Fiona Ebner
2023-02-27 14:04 ` Thomas Lamprecht
0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2023-02-16 12:35 UTC (permalink / raw)
To: pve-devel, aderumier, Thomas Lamprecht
Am 15.02.23 um 14:42 schrieb Fiona Ebner:
> Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
>> This patch series rework the current memory hotplug + virtiomem.
>>
>> memory option now have extra options:
>>
>> memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
>> ex: memory: current=1024,max=131072,virtio=1
>>
>>
>> for classic memory hotplug, when maxmemory is defined,
>> we use 64 fixed size dimm.
>> The max option is a multiple of 64GB.
>>
>> The virtio option enable new virtio-mem support,
>> instead of plugging dimm, it's add/removed block inside
>> big dimm.
>> virtio-mem can use 32000 blocks, the blocksize is compute from
>> max memory.
>>
>>
> Thanks! Applied the first 4 patches, will take a look at the others
> tomorrow. We still need to adapt HA before memory can be turned into a
> format string here ;)
I'll try and work out patches to make HA independent of config details.
@Thomas: based on our brief discussion off-list, would the following be
okay?
Add a get_derived_property() method to AbstractConfig.pm, taking a
(partial) config and the name for the derived property. The first two
will be 'maxcpu' and 'maxmem' (should I use 'max-cpu' or different
names?), because that's what HA manager needs.
The partial config needs to contain all actual properties needed to
compute the derived property, e.g. 'vcpus', 'sockets' and 'cores' for
'maxcpu' in case of VMs.
If we switch to get_guest_config_properties(), there needs to be a
second method to request which properties are actually needed for the
computation of the derived property beforehand.
Whenever something in the actual properties changes, the calculation
needs to be adapted to still yield the same result for the derived
property, but this can be done entirely in the VM/CT plugin implementation.
Whenever HA manager needs something new, a new derived property is
implemented in the plugins and dependency bumps for
qemu-server/pve-container are done.
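The proposed interface could look roughly like the following. This is a hypothetical sketch of the idea in the mail, in Python for illustration — the method name comes from the proposal, but the property names and formulas are assumptions, not a final implementation:

```python
def get_derived_property(conf: dict, name: str) -> int:
    """Compute a derived property from a (partial) guest config,
    hiding config details from consumers like the HA manager."""
    if name == "maxcpu":
        # vcpus caps the hotpluggable count; otherwise all sockets * cores
        return conf.get("vcpus") or conf.get("sockets", 1) * conf.get("cores", 1)
    if name == "maxmem":
        # with the new format string, 'current' is what the guest gets now
        for part in str(conf.get("memory", "512")).split(","):
            key, _, val = part.partition("=")
            if key == "current" or "=" not in part:
                return int(val or key)
        return 512
    raise KeyError(f"unknown derived property '{name}'")
```

Consumers would only depend on the derived names; when the underlying config properties change, only the plugin implementation needs to keep the result stable.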
* Re: [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option Alexandre Derumier
@ 2023-02-22 15:18 ` Fiona Ebner
2023-02-23 7:35 ` DERUMIER, Alexandre
0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:18 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 32fbdc5..deeb88f 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -3,8 +3,10 @@ package PVE::QemuServer::Memory;
> use strict;
> use warnings;
>
> +use PVE::JSONSchema;
> use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
> use PVE::Exception qw(raise raise_param_exc);
> +use PVE::GuestHelpers qw(safe_string_ne safe_num_ne safe_boolean_ne);
Style nit: pre-existing, but now it's even less sorted alphabetically.
Also, you don't use safe_string_ne so no need to import (and
safe_boolean_ne should be imported by the patch that uses it).
>
> use PVE::QemuServer;
> use PVE::QemuServer::Monitor qw(mon_cmd);
> @@ -25,8 +27,26 @@ our $memory_fmt = {
> minimum => 16,
> default => 512,
> },
> + max => {
Really could use a description, so that users know that it's used for
hotplug.
> + type => 'integer',
> + optional => 1,
> + minimum => 65536,
> + maximum => 4194304,
> + format => 'pve-qm-memory-max',
> + },
> };
>
> +PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
> +sub verify_qm_memory_max {
> + my ($max, $noerr) = @_;
> +
> + return if $noerr;
$noerr only switches if the function should die upon error or return
undef upon error. But if there is no error, you always need to return
the verified value, i.e. $max.
> +
> + die "max memory need to be a multiple of 64GiB\n" if $max && $max % 65536 != 0;
> +
> + return $max
> +}
> +
> sub print_memory {
> my $memory = shift;
>
* Re: [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2023-02-22 15:19 ` Fiona Ebner
0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:19 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 9b80da1..6627910 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -32,7 +32,7 @@ use PVE::QemuServer::Drive;
> use PVE::QemuServer::ImportDisk;
> use PVE::QemuServer::Monitor qw(mon_cmd);
> use PVE::QemuServer::Machine;
> -use PVE::QemuServer::Memory qw(get_current_memory);
> +use PVE::QemuServer::Memory qw(get_current_memory parse_memory get_host_max_mem);
> use PVE::QemuMigrate;
> use PVE::RPCEnvironment;
> use PVE::AccessControl;
> @@ -476,6 +476,26 @@ my $create_disks = sub {
> return ($vollist, $res);
> };
>
> +my $check_memory_param = sub {
> + my ($conf, $param) = @_;
> +
Nit: Could also die if current > max.
> + my $mem = parse_memory($param->{memory});
> + my $host_max_mem = get_host_max_mem($conf);
> +
> + die "memory max can't be bigger than supported cpu architecture $host_max_mem MiB\n"
> + if $mem->{max} && $mem->{max} > $host_max_mem;
> +
> + if ($param->{memory} || defined($param->{balloon})) {
> +
> + my $memory = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory};
> + my $maxmem = get_current_memory($memory);
> + my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
> +
> + die "balloon value too large (must be smaller than assigned memory)\n"
> + if $balloon && $balloon > $maxmem;
> + }
> +};
> +
> my $check_cpu_model_access = sub {
> my ($rpcenv, $authuser, $new, $existing) = @_;
>
* Re: [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
@ 2023-02-22 15:19 ` Fiona Ebner
0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:19 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> current qemu_dimm_list can return any kind of memory devices.
>
> make it more generic, with a regex filter to choose kind of device
> from id.
>
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
> PVE/QemuServer/Memory.pm | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index c912e4f..a13b3a1 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -333,7 +333,7 @@ sub qemu_memory_hotplug {
> while (1) {
> eval { PVE::QemuServer::qemu_devicedel($vmid, $name) };
> sleep 3;
> - my $dimm_list = qemu_dimm_list($vmid);
> + my $dimm_list = qemu_memdevices_list($vmid, '^dimm(\d+)$');
> last if !$dimm_list->{$name};
> raise_param_exc({ $name => "error unplug memory module" }) if $retry > 5;
> $retry++;
> @@ -363,14 +363,14 @@ sub can_hotplug {
> return 1;
> }
>
> -sub qemu_dimm_list {
> - my ($vmid) = @_;
> +sub qemu_memdevices_list {
> + my ($vmid, $filter) = @_;
Nit: not sure if we really need/want a full-blown filter here. I'd just
start out with a $type and we can make sure to only use IDs of the form
$type\d+. If we ever need more, we can always change it to a filter.
>
> my $dimmarray = mon_cmd($vmid, "query-memory-devices");
> my $dimms = {};
>
> foreach my $dimm (@$dimmarray) {
> -
> + next if $filter && $dimm->{data}->{id} !~ /$filter/;
> $dimms->{$dimm->{data}->{id}}->{id} = $dimm->{data}->{id};
> $dimms->{$dimm->{data}->{id}}->{node} = $dimm->{data}->{node};
> $dimms->{$dimm->{data}->{id}}->{addr} = $dimm->{data}->{addr};
* Re: [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
@ 2023-02-22 15:19 ` Fiona Ebner
2023-02-23 8:38 ` DERUMIER, Alexandre
0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:19 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> @@ -322,30 +290,33 @@ sub qemu_memory_hotplug {
>
> } else {
>
> - foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
> - my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
> + my $dimms = qemu_dimm_list($vmid);
Has been renamed in the last patch ;) Ideally, this and the last patch
would be ordered first in the series. Then I would've applied them right
away (fixing up the renamed call), because they fix an existing bug, see
below.
> + my $current_size = $memory;
> + for my $name (sort { $dimms->{$b}{slot} <=> $dimms->{$a}{slot} } keys %$dimms) {
Style nit: Please use $dimms->{$b}->{slot}
Turns out the current code with foreach_reverse_dimm() is actually buggy
(not only when reaching $dimm_id==254). With qemu-server 7.3-3:
> root@pve701 ~ # grep memory: /etc/pve/qemu-server/135.conf
> memory: 1536
> root@pve701 ~ # perl memory-qmp.pm 135
> dimm0 (node 0): 536870912
> $VAR1 = {
> 'base-memory' => 1073741824,
> 'plugged-memory' => 536870912
> };
> root@pve701 ~ # qm set 135 --memory 20480
> update VM 135: -memory 20480
> root@pve701 ~ # grep memory: /etc/pve/qemu-server/135.conf
> memory: 20480
> root@pve701 ~ # perl memory-qmp.pm 135
> dimm0 (node 0): 536870912
> dimm1 (node 1): 536870912
> ...
> dimm33 (node 1): 1073741824
> dimm34 (node 0): 1073741824
> $VAR1 = {
> 'base-memory' => 1073741824,
> 'plugged-memory' => '20401094656'
> };
> root@pve701 ~ # qm set 135 --memory 1536
> update VM 135: -memory 1536
> try to unplug memory dimm dimm33
> try to unplug memory dimm dimm32
> ...
> try to unplug memory dimm dimm2
> try to unplug memory dimm dimm1
> root@pve701 ~ # grep memory: /etc/pve/qemu-server/135.conf
> memory: 1536
> root@pve701 ~ # perl memory-qmp.pm 135
> dimm0 (node 0): 536870912
> dimm34 (node 0): 1073741824
> $VAR1 = {
> 'base-memory' => 1073741824,
> 'plugged-memory' => 1610612736
> };
And this patch fixes that :)
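The fix boils down to ordering unplug candidates by the actual device state instead of a recomputed layout. A minimal Python sketch of the idea (names assumed, for illustration):

```python
def unplug_order(dimms: dict) -> list:
    """Order DIMMs for unplug from the queried QEMU device list.

    foreach_reverse_dimm() recomputed the DIMM layout from the target
    memory size, which can desynchronise from what is actually plugged
    (hence the leftover dimm34 above).  Sorting the real device list by
    slot, highest first, unplugs the most recently added DIMMs first.
    """
    return sorted(dimms, key=lambda name: dimms[name]["slot"], reverse=True)

# unplug_order({"dimm0": {"slot": 0}, "dimm34": {"slot": 34}, "dimm1": {"slot": 1}})
# -> ['dimm34', 'dimm1', 'dimm0']
```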
* Re: [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support Alexandre Derumier
@ 2023-02-22 15:19 ` Fiona Ebner
0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:19 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 1b1c99d..bf4e92a 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -3,6 +3,8 @@ package PVE::QemuServer::Memory;
> use strict;
> use warnings;
>
> +use POSIX qw(ceil);
> +
> use PVE::JSONSchema;
> use PVE::Tools qw(run_command lock_file lock_file_full file_read_firstline dir_glob_foreach);
> use PVE::Exception qw(raise raise_param_exc);
> @@ -16,6 +18,7 @@ our @EXPORT_OK = qw(
> get_current_memory
> parse_memory
> get_host_max_mem
> +get_virtiomem_block_size
> );
>
> my $MAX_NUMA = 8;
> @@ -37,6 +40,12 @@ our $memory_fmt = {
> maximum => 4194304,
> format => 'pve-qm-memory-max',
> },
> + virtio => {
> + description => "Enable virtio-mem memory (Experimental: Only works with Linux guest with kernel >= 5.10)",
Nit: How about "Use virtio-mem devices for hotplug (Experimental: ...)",
then people immediately know it's for hotplug.
> + type => 'boolean',
> + optional => 1,
> + default => 0,
> + },
> };
>
> PVE::JSONSchema::register_format('pve-qm-memory-max', \&verify_qm_memory_max);
> @@ -72,7 +81,9 @@ my sub get_static_mem {
> my $static_memory = 0;
> my $memory = parse_memory($conf->{memory});
>
> - if ($memory->{max}) {
> + if ($memory->{virtio}) {
> + $static_memory = 4096;
> + } elsif ($memory->{max}) {
> my $dimm_size = $memory->{max} / $MAX_SLOTS;
> #static mem can't be lower than 4G and lower than 1 dimmsize by socket
> $static_memory = $dimm_size * $sockets;
> @@ -161,6 +172,117 @@ sub get_current_memory {
> return $memory->{current};
> }
>
> +sub get_virtiomem_block_size {
> + my ($conf) = @_;
> +
> + my $sockets = $conf->{sockets} || 1;
> + my $MAX_MEM = get_max_mem($conf);
> + my $static_memory = get_static_mem($conf, $sockets);
Nit: Not making a difference with the current implementation, but this
should pass 1 for hotplug (we only use the virtio-mem devices for hotplug).
> + my $memory = get_current_memory($conf->{memory});
> +
> + #virtiomem can map 32000 block size.
> + #try to use lowest blocksize, lower = more chance to unplug memory.
> + my $blocksize = ($MAX_MEM - $static_memory) / 32000;
> + #2MB is the minimum to be aligned with THP
> + $blocksize = 2 if $blocksize < 2;
> + $blocksize = 2**(ceil(log($blocksize)/log(2)));
> + #Linux guest kernel only support 4MiB block currently (kernel <= 6.2)
> + $blocksize = 4 if $blocksize < 4;
> +
> + return $blocksize;
> +}
> +
> +my sub get_virtiomem_total_current_size {
> + my ($mems) = @_;
> + my $size = 0;
> + for my $mem (values %$mems) {
> + $size += $mem->{current};
> + }
> + return $size;
> +}
> +
> +my sub balance_virtiomem {
> + my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
> +
> + my $nb_virtiomem = scalar(keys %$virtiomems);
> +
> + print"try to balance memory on $nb_virtiomem virtiomems\n";
> +
> + #if we can't share exactly the same amount, we add the remainder on last node
> + my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
> + my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
> +
> + my $i = 0;
> + foreach my $id (sort keys %$virtiomems) {
> + my $virtiomem = $virtiomems->{$id};
> + $i++;
> + my $virtiomem_target = $i == $nb_virtiomem ? $target_remaining : $target_aligned;
> + $virtiomem->{completed} = 0;
> + $virtiomem->{retry} = 0;
> + $virtiomem->{target} = $virtiomem_target;
> +
> + print "virtiomem$id: set-requested-size : $virtiomem_target\n";
> + mon_cmd($vmid, 'qom-set',
> + path => "/machine/peripheral/virtiomem$id",
> + property => "requested-size",
> + value => $virtiomem_target * 1024 * 1024);
Style nit: trailing spaces and should really put each argument on its
own line, with mon_cmd( and the final ) on their own line too.
> + }
> +
> + my $total_finished = 0;
> + my $error = undef;
> +
> + while ($total_finished != $nb_virtiomem) {
> +
> + sleep 1;
> +
> + $total_finished = 0;
> +
> + foreach my $id (keys %$virtiomems) {
> +
> + my $virtiomem = $virtiomems->{$id};
> +
> + if ($virtiomem->{error} || $virtiomem->{completed}) {
> + $total_finished++;
> + next;
> + }
> +
> + my $size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/virtiomem$id", property => "size");
> + $virtiomem->{current} = $size / 1024 / 1024;
> + print"virtiomem$id: last: $virtiomem->{last} current: $virtiomem->{current} target: $virtiomem->{target}\n";
[0] marker so I can reference this message below :)
> +
> + if($virtiomem->{current} == $virtiomem->{target}) {
> + print"virtiomem$id: completed\n";
> + $virtiomem->{completed} = 1;
> + next;
> + }
> +
> + if($virtiomem->{current} != $virtiomem->{last}) {
> + #if value has changed, but not yet completed
> + print "virtiomem$id: changed but don't not reach target yet\n";
"don't not" is wrong. But do we really need this print? I feel like the
above[0] is already enough. It already contains the information about
last and current.
> + $virtiomem->{retry} = 0;
> + $virtiomem->{last} = $virtiomem->{current};
> + next;
> + }
> +
> + if($virtiomem->{retry} >= 5) {
> + print "virtiomem$id: too many retry. set error\n";
s/retry/retries/
But I'd also change the message to be a bit more telling to users, "set
error" could mean anything. Maybe something like: "virtiomem$id: target
memory still not reached, ignoring device from now on"?
> + $virtiomem->{error} = 1;
> + $error = 1;
> + #as change is async, we don't want that value change after the api call
> + eval {
> + mon_cmd($vmid, 'qom-set',
> + path => "/machine/peripheral/virtiomem$id",
> + property => "requested-size",
> + value => $virtiomem->{current} * 1024 *1024);
> + };
> + }
> + print"virtiomem$id: increase retry: $virtiomem->{retry}\n";
Maybe output the retry counter in message [0] instead, to avoid output bloat?
> + $virtiomem->{retry}++;
> + }
> + }
> + die "No more available blocks in virtiomem to balance all requested memory\n" if $error;
> +}
> +
> sub get_numa_node_list {
> my ($conf) = @_;
> my @numa_map;
> @@ -247,7 +369,39 @@ sub qemu_memory_hotplug {
> my $MAX_MEM = get_max_mem($conf);
> die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
>
> - if ($value > $memory) {
> + my $confmem = parse_memory($conf->{memory});
This is already $oldmem, no need for this second variable.
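For reference, the remainder-on-last-node split from the quoted hunk can be
sketched like this (an illustrative Python translation of the Perl
computation, not the patch itself; it assumes $target_total is itself
block-aligned, as the MiB values in the series are):

```python
def split_target(target_total, nb_virtiomem, blocksize):
    """Split target_total MiB across nb_virtiomem devices in
    blocksize-aligned shares; whatever cannot be shared evenly
    is added to the last device, mirroring the quoted Perl."""
    target_aligned = target_total // nb_virtiomem // blocksize * blocksize
    # every device but the last gets the aligned share
    target_remaining = target_total - target_aligned * (nb_virtiomem - 1)
    return target_aligned, target_remaining
```

With the 4100 MiB / 2 devices / 4 MiB blocksize example from later in the
thread, this yields 2048 for the first device and 2052 for the last.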
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry.
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry Alexandre Derumier
@ 2023-02-22 15:19 ` Fiona Ebner
[not found] ` <00eab4f6356c760a55182497eb0ad0bac57bdcb4.camel@groupe-cyllene.com>
0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:19 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> If some memory can be removed on a specific node,
> we try to rebalance again on other nodes
>
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
> PVE/QemuServer/Memory.pm | 51 +++++++++++++++++++++++++++-------------
> 1 file changed, 35 insertions(+), 16 deletions(-)
>
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index bf4e92a..f02b4e0 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -201,13 +201,28 @@ my sub get_virtiomem_total_current_size {
> return $size;
> }
>
> +my sub get_virtiomem_total_errors_size {
> + my ($mems) = @_;
> +
> + my $size = 0;
> + for my $mem (values %$mems) {
> + next if !$mem->{error};
> + $size += $mem->{current};
> + }
> + return $size;
> +}
> +
> my sub balance_virtiomem {
> my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
>
> - my $nb_virtiomem = scalar(keys %$virtiomems);
> + my $nb_virtiomem = scalar(grep { !$_->{error} } values $virtiomems->%*);
>
> print"try to balance memory on $nb_virtiomem virtiomems\n";
>
> + die "No more available blocks in virtiomem to balance all requested memory\n"
> + if $target_total < 0;
I feel like this message is a bit confusing. This can only happen on
unplug, right? And reading that "no more blocks are available" sounds
like a paradox then. It's rather that no more blocks can be unplugged.
If we really want to, if the $target_total is negative, we could set it
to 0 (best to do it at the call-side already) and try to unplug
everything else? We won't reach the goal anymore, but we could still get
closer to it in some cases. Would need a bit more adaptation to avoid an
endless loop: we also need to stop if all devices reached their current
goal this round (and no new errors appeared), e.g. balance_virtiomem()
could just have that info as its return value.
Example:
> update VM 101: -memory 4100,max=65536,virtio=1
> try to balance memory on 2 virtiomems
> virtiomem0: set-requested-size : 0
> virtiomem1: set-requested-size : 4
> virtiomem1: last: 4 current: 4 target: 4
> virtiomem1: completed
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 0
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 1
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 2
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 3
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: increase retry: 4
> virtiomem0: last: 16 current: 16 target: 0
> virtiomem0: too many retry. set error
> virtiomem0: increase retry: 5
Currently it stops here, but with setting $target_total = 0 it continues...
> try to balance memory on 1 virtiomems
> virtiomem1: set-requested-size : 0
> virtiomem1: last: 4 current: 0 target: 0
> virtiomem1: completed
...and gets closer to the goal...
> try to balance memory on 1 virtiomems
> virtiomem1: set-requested-size : 0
> virtiomem1: last: 4 current: 0 target: 0
> virtiomem1: completed
> try to balance memory on 1 virtiomems
> virtiomem1: set-requested-size : 0
> virtiomem1: last: 4 current: 0 target: 0
> virtiomem1: completed
...but then it loops, because I didn't add the other stop condition yet
;). But not sure, likely too much magic.
> + die "No more available virtiomem to balance the remaining memory\n" if $nb_virtiomem == 0;
"No more virtiomem devices left to try to ..." might be a bit clearer.
Technically, they are still available, we just ignore them because they
don't reach the target in time.
> +
> #if we can't share exactly the same amount, we add the remainder on last node
> my $target_aligned = int( $target_total / $nb_virtiomem / $blocksize) * $blocksize;
> my $target_remaining = $target_total - ($target_aligned * ($nb_virtiomem-1));
* Re: [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
` (16 preceding siblings ...)
2023-02-15 13:42 ` [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Fiona Ebner
@ 2023-02-22 15:25 ` Fiona Ebner
17 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-22 15:25 UTC (permalink / raw)
To: Proxmox VE development discussion, Alexandre Derumier
Am 13.02.23 um 13:00 schrieb Alexandre Derumier:
> This patch series rework the current memory hotplug + virtiomem.
>
> memory option now have extra options:
>
> memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
> ex: memory: current=1024,max=131072,virtio=1
>
>
> for classic memory hotplug, when maxmemory is defined,
> we use 64 fixed size dimm.
> The max option is a multiple of 64GB.
>
> The virtio option enable new virtio-mem support,
> instead of plugging dimm, it's add/removed block inside
> big dimm.
> virtio-mem can use 32000 blocks, the blocksize is compute from
> max memory.
>
>
Looking good :) I just have a few suggestions for further improvements
and there's a wrong function call in 11/16. Consider everything with no
answer (or with just nits if you don't change the patch):
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Still, would be nice if you could address (some of) the nits too.
* Re: [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option
2023-02-22 15:18 ` Fiona Ebner
@ 2023-02-23 7:35 ` DERUMIER, Alexandre
2023-02-23 7:44 ` Fiona Ebner
0 siblings, 1 reply; 32+ messages in thread
From: DERUMIER, Alexandre @ 2023-02-23 7:35 UTC (permalink / raw)
To: pve-devel, aderumier, f.ebner
Hi,
> > +PVE::JSONSchema::register_format('pve-qm-memory-max',
> > \&verify_qm_memory_max);
> > +sub verify_qm_memory_max {
> > + my ($max, $noerr) = @_;
> > +
> > + return if $noerr;
>
> $noerr only switches if the function should die upon error or return
> undef upon error. But if there is no error, you always need to return
> the verified value, i.e. $max.
>
I'm a bit lost here. I have looked at other verify subs, and I haven't
found a single one returning a value when noerr is set.
All other verify subs have something like:
if ($max % 65536 != 0) {
return if $noerr;
die "max memory need to be a multiple of 64GiB\n";
}
Could you provide me a code sample of what you want exactly here ?
* Re: [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option
2023-02-23 7:35 ` DERUMIER, Alexandre
@ 2023-02-23 7:44 ` Fiona Ebner
0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-23 7:44 UTC (permalink / raw)
To: DERUMIER, Alexandre, pve-devel, aderumier
Am 23.02.23 um 08:35 schrieb DERUMIER, Alexandre:
> Hi,
>
>>> +PVE::JSONSchema::register_format('pve-qm-memory-max',
>>> \&verify_qm_memory_max);
>>> +sub verify_qm_memory_max {
>>> + my ($max, $noerr) = @_;
>>> +
>>> + return if $noerr;
>>
>> $noerr only switches if the function should die upon error or return
>> undef upon error. But if there is no error, you always need to return
>> the verified value, i.e. $max.
>>
>
> I'm a bit lost here. I have looked at other verify subs, and I haven't
> found a single one returning a value when noerr is set.
If there is no error and $noerr is set, they do.
>
> All other verify subs have something like:
>
> if ($max % 65536 != 0) {
> return if $noerr;
> die "max memory need to be a multiple of 64GiB\n";
> }
>
Yes, this would be correct. Note that it behaves differently from the
current patch. The current patch always executes
return if $noerr;
but the new version only executes it when $max % 65536 != 0. If $max %
65536 == 0, you don't enter the if and then return $max even if $noerr
is set.
* Re: [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug
2023-02-22 15:19 ` Fiona Ebner
@ 2023-02-23 8:38 ` DERUMIER, Alexandre
0 siblings, 0 replies; 32+ messages in thread
From: DERUMIER, Alexandre @ 2023-02-23 8:38 UTC (permalink / raw)
To: pve-devel, aderumier, f.ebner
Le mercredi 22 février 2023 à 16:19 +0100, Fiona Ebner a écrit :
> > dimm0 (node 0): 536870912
> > dimm34 (node 0): 1073741824
> > $VAR1 = {
> > 'base-memory' => 1073741824,
> > 'plugged-memory' => 1610612736
> > };
>
> And this patch fixes that :)
Oh, yes. I was suspicious about that too, but never had time to debug it.
For years I've had problems unplugging the last dimm; I thought it was
memory set as unmovable by the guest :)
I'll put the 2 patches at the beginning of the patch series.
* Re: [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry.
[not found] ` <00eab4f6356c760a55182497eb0ad0bac57bdcb4.camel@groupe-cyllene.com>
@ 2023-02-24 7:12 ` Fiona Ebner
0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-24 7:12 UTC (permalink / raw)
To: DERUMIER, Alexandre, Proxmox VE development discussion
Am 23.02.23 um 16:01 schrieb DERUMIER, Alexandre:
>
>>> + die "No more available blocks in virtiomem to balance all
>>> requested memory\n"
>>> + if $target_total < 0;
>>
>> I feel like this message is a bit confusing. This can only happen on
>> unplug, right?
> Yes, unplug only (when the guest OS sets a memory block as unmovable).
> I don't see a case where it doesn't work on hotplug.
>
>> And reading that "no more blocks are available" sounds
>> like a paradox then. It's rather that no more blocks can be
>> unplugged.
>>
>> If we really want to, if the $target_total is negative, we could set
>> it
>> to 0 (best to do it at the call-side already) and try to unplug
>> everything else? We won't reach the goal anymore, but we could still
>> get
>> closer to it in some cases.
>
> yes, good idea
>
>> Would need a bit more adaptation to avoid an
>> endless loop: we also need to stop if all devices reached their
>> current
>> goal this round (and no new errors appeared), e.g.
>> balance_virtiomem()
>> could just have that info as its return value.
>>
>
> I think we should still die with an error message in this case (maybe
> at the end of balance_virtiomem). We still have some memory not removed
> in any case where $target_total was < 0;
>
Sure, we still should die at the end.
> something like:
>
>
> my sub balance_virtiomem {
> my ($vmid, $virtiomems, $blocksize, $target_total) = @_;
>
> my $nb_virtiomem = scalar(grep { !$_->{error} } values $virtiomems-
>> %*);
>
> print"try to balance memory on $nb_virtiomem virtiomems\n";
>
> my $target_total_err = undef;
> if($target_total < 0) {
> $target_total = 0;
> $target_total_err = 1;
> }
>
> ...
> while ($total_finished != $nb_virtiomem) {
> ...
> }
> ...
>
> die "No more virtiomem devices left to try to balance the remaining
> memory\n"
> if $target_total_err;
> }
>
>
Yes, like this we make one final round, which is good enough IMHO.
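The control flow agreed on here — clamp a negative target to 0 so one final
best-effort round still runs, then die at the end if the full goal was
unreachable — can be sketched as follows. This is an illustrative Python
model, not the Perl implementation: FakeVirtioMem and its poll() stand in
for the qom-set/qom-get monitor calls, and a "stuck" device models blocks
the guest made unmovable.

```python
class FakeVirtioMem:
    """Stand-in for one virtio-mem device (sizes in MiB)."""
    def __init__(self, current, stuck=False):
        self.current = current
        self.requested = current
        self.stuck = stuck      # models unmovable guest memory
        self.error = False
    def poll(self):
        # a real implementation would qom-get 'size'; here we
        # converge instantly unless the device is stuck
        if not self.stuck:
            self.current = self.requested
        return self.current

def balance(devices, target_total, max_retries=5):
    clamped = target_total < 0
    if clamped:
        target_total = 0  # best-effort final round, as discussed
    live = [d for d in devices if not d.error]
    if not live:
        raise RuntimeError("no virtio-mem devices left to balance with")
    share = target_total // len(live)
    for d in live:
        d.requested = share
        d.retry = 0
    done = set()
    while len(done) < len(live):
        for d in live:
            if d in done:
                continue
            if d.poll() == d.requested:
                done.add(d)
            elif d.retry >= max_retries:
                d.error = True
                d.requested = d.current  # freeze the async resize
                done.add(d)
            else:
                d.retry += 1
    if clamped or any(d.error for d in live):
        raise RuntimeError("could not unplug all requested memory")
```

A stuck device burns its retries, gets flagged, and the function still dies
at the end, matching the "one final round, then die" behaviour above.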
* Re: [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem
2023-02-16 12:35 ` Fiona Ebner
@ 2023-02-27 14:04 ` Thomas Lamprecht
2023-02-28 7:35 ` Fiona Ebner
0 siblings, 1 reply; 32+ messages in thread
From: Thomas Lamprecht @ 2023-02-27 14:04 UTC (permalink / raw)
To: Proxmox VE development discussion, Fiona Ebner, aderumier
Am 16/02/2023 um 13:35 schrieb Fiona Ebner:
> Add a get_derived_property() method to AbstractConfig.pm, taking a
> (partial) config and the name for the derived property. The first two
> will be 'maxcpu' and 'maxmem' (should I use 'max-cpu' or different
> names?), because that's what HA manager needs.
Yeah, would prefer having *some* sort of casing, as none is IMO needlessly
hard to read. kebab-case would work fine for me.
>
> The partial config needs to contain all actual properties needed to
> compute the derived property, e.g. 'vcpus', 'sockets' and 'cores' for
> 'maxcpu' in case of VMs.
>
> If we switch to get_guest_config_properties(), there needs to be a
do you mean `get_derived_property` here?
> second method to request which properties are actually needed for the
> computation of the derived property beforehand.
I.e., something like a `get_derived_property_inputs` sub?
>
> Whenever something in the actual properties changes, the calculation
> needs to be adapted to still yield the same result for the derived
> property, but this can be done entirely in the VM/CT plugin implementation.
>
> Whenever HA manager needs something new, a new derived property is
> implemented in the plugins and dependency bumps for
> qemu-server/pve-container are done.
It has some complexity, but I don't have a really better idea, so fine
by me.
* Re: [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem
2023-02-27 14:04 ` Thomas Lamprecht
@ 2023-02-28 7:35 ` Fiona Ebner
0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-02-28 7:35 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
Am 27.02.23 um 15:04 schrieb Thomas Lamprecht:
> Am 16/02/2023 um 13:35 schrieb Fiona Ebner:
>> Add a get_derived_property() method to AbstractConfig.pm, taking a
>> (partial) config and the name for the derived property. The first two
>> will be 'maxcpu' and 'maxmem' (should I use 'max-cpu' or different
>> names?), because that's what HA manager needs.
>
> Yeah, would prefer having *some* sort of casing, as none is IMO needlessly
> hard to read. kebab-case would work fine for me.
>
Okay!
>
>>
>> The partial config needs to contain all actual properties needed to
>> compute the derived property, e.g. 'vcpus', 'sockets' and 'cores' for
>> 'maxcpu' in case of VMs.
>>
>> If we switch to get_guest_config_properties(), there needs to be a
>
> do you mean `get_derived_property` here?
Sorry, should've been clearer. I mean
PVE::Cluster::get_guest_config_properties() where it's only possible to
query specific properties, hence the need for the second method.
>
>> second method to request which properties are actually needed for the
>> computation of the derived property beforehand.
>
> I.e., something like a `get_derived_property_inputs` sub?
>
Yes.
>>
>> Whenever something in the actual properties changes, the calculation
>> needs to be adapted to still yield the same result for the derived
>> property, but this can be done entirely in the VM/CT plugin implementation.
>>
>> Whenever HA manager needs something new, a new derived property is
>> implemented in the plugins and dependency bumps for
>> qemu-server/pve-container are done.
>
> It has some complexity, but I don't have a really better idea, so fine
> by me.
Thread overview: 32+ messages
2023-02-13 12:00 [pve-devel] [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 01/16] memory: extract some code to their own sub for mocking Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 02/16] tests: add memory tests Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 03/16] memory: refactor sockets Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 04/16] memory: remove calls to parse_hotplug_features Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 05/16] add memory parser Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 06/16] memory: add get_static_mem Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 07/16] memory: use static_memory in foreach_dimm Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 08/16] config: memory: add 'max' option Alexandre Derumier
2023-02-22 15:18 ` Fiona Ebner
2023-02-23 7:35 ` DERUMIER, Alexandre
2023-02-23 7:44 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 09/16] memory: get_max_mem: use config memory max Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 10/16] memory: rename qemu_dimm_list to qemu_memdevices_list Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 11/16] memory: don't use foreach_reversedimm for unplug Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-23 8:38 ` DERUMIER, Alexandre
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 12/16] memory: use 64 slots && static dimm size when max is defined Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 13/16] test: add memory-max tests Alexandre Derumier
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 14/16] memory: add virtio-mem support Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 15/16] memory: virtio-mem : implement redispatch retry Alexandre Derumier
2023-02-22 15:19 ` Fiona Ebner
[not found] ` <00eab4f6356c760a55182497eb0ad0bac57bdcb4.camel@groupe-cyllene.com>
2023-02-24 7:12 ` Fiona Ebner
2023-02-13 12:00 ` [pve-devel] [PATCH v4 qemu-server 16/16] tests: add virtio-mem tests Alexandre Derumier
2023-02-15 13:42 ` [pve-devel] partially-applied: [PATCH v4 qemu-server 00/16] rework memory hotplug + virtiomem Fiona Ebner
2023-02-16 12:35 ` Fiona Ebner
2023-02-27 14:04 ` Thomas Lamprecht
2023-02-28 7:35 ` Fiona Ebner
2023-02-22 15:25 ` [pve-devel] " Fiona Ebner