public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem
@ 2022-12-09 19:27 Alexandre Derumier
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 01/10] test: add memory tests Alexandre Derumier
                   ` (10 more replies)
  0 siblings, 11 replies; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

This patch series reworks the current memory hotplug and adds virtio-mem support.

The memory option now takes extra sub-options:

memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
e.g.: memory: current=1024,max=131072,virtio=1
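
For readers skimming the diffs, here is a minimal usage sketch of the
parser added in patch 02 (behaviour inferred from the hunks quoted
below; 'current' is the default_key, so a bare integer still parses):

    my $mem = PVE::QemuServer::parse_memory('current=1024,max=131072');
    # -> { current => 1024, max => 131072 }
    my $fallback = PVE::QemuServer::parse_memory(undef);
    # -> { current => 512 }    (the schema default)

    # PVE::QemuServer::Memory exports a wrapper for a full $conf:
    my $current = PVE::QemuServer::Memory::get_current_memory(
        { memory => '1024,max=131072' });    # -> 1024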


patches 1-2: add a memory parser

patches 3-7: add the max option, with hotplug based on 64 static-size dimms

For classic memory hotplug, when max memory is defined,
we use 64 fixed-size dimms.
The max option must be a multiple of 64GiB.
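
Concretely, a sketch of the sizing rule as implemented in patches 4
and 6 ($max, $memory and $sockets stand in for the config values):

    my $dimm_size     = $max / 64;             # e.g. max=65536 -> 1024 MiB dimms
    my $static_memory = $dimm_size * $sockets; # one dimm size per socket,
    $static_memory = 4096 if $static_memory < 4096;    # but never below 4 GiB
    # the remaining ($memory - $static_memory) MiB is hotplugged in
    # fixed $dimm_size steps, using at most 64 slots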

patches 8-9: add virtio-mem

The virtio option enables the new virtio-mem support:
instead of plugging and unplugging dimms, blocks are added
to or removed from a large virtio-mem device.
virtio-mem can use up to 32000 blocks; the block size is
computed from the max memory.
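
As an illustration only (the exact derivation lives in patch 08, which
is not quoted in this cover letter), assuming the block size is the
smallest power of two, at least virtio-mem's 2 MiB minimum, that keeps
the block count within the 32000 limit:

    my $blocksize = 2;    # MiB
    $blocksize *= 2 while ($max / $blocksize) > 32000;
    # e.g. max=65536  (64 GiB)  -> 4 MiB blocks
    # e.g. max=524288 (512 GiB) -> 32 MiB blocks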

patch 10: hotplug fix

Alexandre Derumier (10):
  test: add memory tests
  add memory parser
  config: memory: add 'max' option
  memory: add get_static_mem
  memory: get_max_mem: use config memory max
  memory: use 64 slots && static dimm size with max is defined
  test: add memory-max tests
  memory: add virtio-mem support
  tests: add virtio-mem tests
  memory: fix hotplug with virtiomem && maxmem

 PVE/API2/Qemu.pm                              |  12 +-
 PVE/QemuConfig.pm                             |   4 +-
 PVE/QemuMigrate.pm                            |   6 +-
 PVE/QemuServer.pm                             |  73 ++++--
 PVE/QemuServer/Helpers.pm                     |   3 +-
 PVE/QemuServer/Memory.pm                      | 229 ++++++++++++++----
 PVE/QemuServer/PCI.pm                         |   8 +
 test/cfg2cmd/memory-hotplug-hugepages.conf    |  12 +
 .../cfg2cmd/memory-hotplug-hugepages.conf.cmd |  62 +++++
 test/cfg2cmd/memory-hotplug.conf              |  11 +
 test/cfg2cmd/memory-hotplug.conf.cmd          | 174 +++++++++++++
 test/cfg2cmd/memory-hugepages-1g.conf         |  11 +
 test/cfg2cmd/memory-hugepages-1g.conf.cmd     |  30 +++
 test/cfg2cmd/memory-hugepages-2m.conf         |  11 +
 test/cfg2cmd/memory-hugepages-2m.conf.cmd     |  30 +++
 test/cfg2cmd/memory-max-128G.conf             |  10 +
 test/cfg2cmd/memory-max-128G.conf.cmd         |  86 +++++++
 test/cfg2cmd/memory-max-512G.conf             |  10 +
 test/cfg2cmd/memory-max-512G.conf.cmd         |  58 +++++
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  |  11 +
 .../memory-virtio-hugepages-1G.conf.cmd       |  35 +++
 test/cfg2cmd/memory-virtio-max.conf           |  10 +
 test/cfg2cmd/memory-virtio-max.conf.cmd       |  35 +++
 test/cfg2cmd/memory-virtio.conf               |  10 +
 test/cfg2cmd/memory-virtio.conf.cmd           |  35 +++
 25 files changed, 909 insertions(+), 67 deletions(-)
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hotplug.conf
 create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

-- 
2.30.2





* [pve-devel] [PATCH qemu-server 01/10] test: add memory tests
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:38   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 02/10] add memory parser Alexandre Derumier
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-hotplug-hugepages.conf    |  12 ++
 .../cfg2cmd/memory-hotplug-hugepages.conf.cmd |  62 +++++++
 test/cfg2cmd/memory-hotplug.conf              |  11 ++
 test/cfg2cmd/memory-hotplug.conf.cmd          | 174 ++++++++++++++++++
 test/cfg2cmd/memory-hugepages-1g.conf         |  11 ++
 test/cfg2cmd/memory-hugepages-1g.conf.cmd     |  30 +++
 test/cfg2cmd/memory-hugepages-2m.conf         |  11 ++
 test/cfg2cmd/memory-hugepages-2m.conf.cmd     |  30 +++
 8 files changed, 341 insertions(+)
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
 create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hotplug.conf
 create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
 create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd

diff --git a/test/cfg2cmd/memory-hotplug-hugepages.conf b/test/cfg2cmd/memory-hotplug-hugepages.conf
new file mode 100644
index 0000000..6cba31e
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug-hugepages.conf
@@ -0,0 +1,12 @@
+# TEST: memory hotplug with 1GB hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 18432
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd b/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
new file mode 100644
index 0000000..6d4d8e8
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
@@ -0,0 +1,62 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=2048,slots=255,maxmem=524288M' \
+  -object 'memory-backend-file,id=ram-node0,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-file,id=mem-dimm0,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-file,id=mem-dimm1,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-file,id=mem-dimm2,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-file,id=mem-dimm3,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-file,id=mem-dimm4,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-file,id=mem-dimm5,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-file,id=mem-dimm6,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-file,id=mem-dimm7,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-file,id=mem-dimm8,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-file,id=mem-dimm9,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-file,id=mem-dimm10,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-file,id=mem-dimm11,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-file,id=mem-dimm12,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-file,id=mem-dimm13,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-file,id=mem-dimm14,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-file,id=mem-dimm15,size=1024M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hotplug.conf b/test/cfg2cmd/memory-hotplug.conf
new file mode 100644
index 0000000..386e61f
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug.conf
@@ -0,0 +1,11 @@
+# TEST: basic memory hotplug
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 66560
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hotplug: memory
diff --git a/test/cfg2cmd/memory-hotplug.conf.cmd b/test/cfg2cmd/memory-hotplug.conf.cmd
new file mode 100644
index 0000000..2da7955
--- /dev/null
+++ b/test/cfg2cmd/memory-hotplug.conf.cmd
@@ -0,0 +1,174 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=1024,slots=255,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=512M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=512M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=512M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=512M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=512M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=512M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=512M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=512M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=512M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=512M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=512M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=512M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=512M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=512M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=512M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=512M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm14,size=512M' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm15,size=512M' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm16,size=512M' \
+  -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm17,size=512M' \
+  -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm18,size=512M' \
+  -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm19,size=512M' \
+  -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm20,size=512M' \
+  -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm21,size=512M' \
+  -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm22,size=512M' \
+  -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm23,size=512M' \
+  -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm24,size=512M' \
+  -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm25,size=512M' \
+  -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm26,size=512M' \
+  -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm27,size=512M' \
+  -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm28,size=512M' \
+  -device 'pc-dimm,id=dimm28,memdev=mem-dimm28,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm29,size=512M' \
+  -device 'pc-dimm,id=dimm29,memdev=mem-dimm29,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm30,size=512M' \
+  -device 'pc-dimm,id=dimm30,memdev=mem-dimm30,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm31,size=512M' \
+  -device 'pc-dimm,id=dimm31,memdev=mem-dimm31,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm32,size=1024M' \
+  -device 'pc-dimm,id=dimm32,memdev=mem-dimm32,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm33,size=1024M' \
+  -device 'pc-dimm,id=dimm33,memdev=mem-dimm33,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm34,size=1024M' \
+  -device 'pc-dimm,id=dimm34,memdev=mem-dimm34,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm35,size=1024M' \
+  -device 'pc-dimm,id=dimm35,memdev=mem-dimm35,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm36,size=1024M' \
+  -device 'pc-dimm,id=dimm36,memdev=mem-dimm36,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm37,size=1024M' \
+  -device 'pc-dimm,id=dimm37,memdev=mem-dimm37,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm38,size=1024M' \
+  -device 'pc-dimm,id=dimm38,memdev=mem-dimm38,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm39,size=1024M' \
+  -device 'pc-dimm,id=dimm39,memdev=mem-dimm39,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm40,size=1024M' \
+  -device 'pc-dimm,id=dimm40,memdev=mem-dimm40,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm41,size=1024M' \
+  -device 'pc-dimm,id=dimm41,memdev=mem-dimm41,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm42,size=1024M' \
+  -device 'pc-dimm,id=dimm42,memdev=mem-dimm42,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm43,size=1024M' \
+  -device 'pc-dimm,id=dimm43,memdev=mem-dimm43,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm44,size=1024M' \
+  -device 'pc-dimm,id=dimm44,memdev=mem-dimm44,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm45,size=1024M' \
+  -device 'pc-dimm,id=dimm45,memdev=mem-dimm45,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm46,size=1024M' \
+  -device 'pc-dimm,id=dimm46,memdev=mem-dimm46,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm47,size=1024M' \
+  -device 'pc-dimm,id=dimm47,memdev=mem-dimm47,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm48,size=1024M' \
+  -device 'pc-dimm,id=dimm48,memdev=mem-dimm48,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm49,size=1024M' \
+  -device 'pc-dimm,id=dimm49,memdev=mem-dimm49,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm50,size=1024M' \
+  -device 'pc-dimm,id=dimm50,memdev=mem-dimm50,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm51,size=1024M' \
+  -device 'pc-dimm,id=dimm51,memdev=mem-dimm51,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm52,size=1024M' \
+  -device 'pc-dimm,id=dimm52,memdev=mem-dimm52,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm53,size=1024M' \
+  -device 'pc-dimm,id=dimm53,memdev=mem-dimm53,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm54,size=1024M' \
+  -device 'pc-dimm,id=dimm54,memdev=mem-dimm54,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm55,size=1024M' \
+  -device 'pc-dimm,id=dimm55,memdev=mem-dimm55,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm56,size=1024M' \
+  -device 'pc-dimm,id=dimm56,memdev=mem-dimm56,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm57,size=1024M' \
+  -device 'pc-dimm,id=dimm57,memdev=mem-dimm57,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm58,size=1024M' \
+  -device 'pc-dimm,id=dimm58,memdev=mem-dimm58,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm59,size=1024M' \
+  -device 'pc-dimm,id=dimm59,memdev=mem-dimm59,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm60,size=1024M' \
+  -device 'pc-dimm,id=dimm60,memdev=mem-dimm60,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm61,size=1024M' \
+  -device 'pc-dimm,id=dimm61,memdev=mem-dimm61,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm62,size=1024M' \
+  -device 'pc-dimm,id=dimm62,memdev=mem-dimm62,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm63,size=1024M' \
+  -device 'pc-dimm,id=dimm63,memdev=mem-dimm63,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm64,size=2048M' \
+  -device 'pc-dimm,id=dimm64,memdev=mem-dimm64,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm65,size=2048M' \
+  -device 'pc-dimm,id=dimm65,memdev=mem-dimm65,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm66,size=2048M' \
+  -device 'pc-dimm,id=dimm66,memdev=mem-dimm66,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm67,size=2048M' \
+  -device 'pc-dimm,id=dimm67,memdev=mem-dimm67,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm68,size=2048M' \
+  -device 'pc-dimm,id=dimm68,memdev=mem-dimm68,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm69,size=2048M' \
+  -device 'pc-dimm,id=dimm69,memdev=mem-dimm69,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm70,size=2048M' \
+  -device 'pc-dimm,id=dimm70,memdev=mem-dimm70,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm71,size=2048M' \
+  -device 'pc-dimm,id=dimm71,memdev=mem-dimm71,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hugepages-1g.conf b/test/cfg2cmd/memory-hugepages-1g.conf
new file mode 100644
index 0000000..8db2cca
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-1g.conf
@@ -0,0 +1,11 @@
+# TEST: memory with 1GB hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 8192
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hugepages-1g.conf.cmd b/test/cfg2cmd/memory-hugepages-1g.conf.cmd
new file mode 100644
index 0000000..63792d2
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-1g.conf.cmd
@@ -0,0 +1,30 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 8192 \
+  -object 'memory-backend-file,id=ram-node0,size=4096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=4096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-hugepages-2m.conf b/test/cfg2cmd/memory-hugepages-2m.conf
new file mode 100644
index 0000000..f0d65fb
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-2m.conf
@@ -0,0 +1,11 @@
+# TEST: memory with 2MB hugepages
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 8192
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 2
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-hugepages-2m.conf.cmd b/test/cfg2cmd/memory-hugepages-2m.conf.cmd
new file mode 100644
index 0000000..287c0ed
--- /dev/null
+++ b/test/cfg2cmd/memory-hugepages-2m.conf.cmd
@@ -0,0 +1,30 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 8192 \
+  -object 'memory-backend-file,id=ram-node0,size=4096M,mem-path=/run/hugepages/kvm/2048kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=4096M,mem-path=/run/hugepages/kvm/2048kB,share=on,prealloc=yes' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
-- 
2.30.2





* [pve-devel] [PATCH qemu-server 02/10] add memory parser
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 01/10] test: add memory tests Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:38   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 03/10] config: memory: add 'max' option Alexandre Derumier
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/API2/Qemu.pm          | 12 ++++++++-
 PVE/QemuConfig.pm         |  4 +--
 PVE/QemuMigrate.pm        |  6 +++--
 PVE/QemuServer.pm         | 56 ++++++++++++++++++++++++++++++---------
 PVE/QemuServer/Helpers.pm |  3 +--
 PVE/QemuServer/Memory.pm  | 35 +++++++++++++++++-------
 6 files changed, 88 insertions(+), 28 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index badfc37..ba893d2 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
 use PVE::AccessControl;
@@ -1608,7 +1609,16 @@ my $update_vm_api  = sub {
 	}
 
 	if ($param->{memory} || defined($param->{balloon})) {
-	    my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
+
+	    my $maxmem = undef;
+	    if($param->{memory}) {
+	        $maxmem = get_current_memory($param);
+	    } elsif ($conf->{pending}->{memory}) {
+	        $maxmem = get_current_memory($conf->{pending});
+	    } elsif ($conf->{memory}) {
+	        $maxmem = get_current_memory($conf);
+	    }
+
 	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
 
 	    die "balloon value too large (must be smaller than assigned memory)\n"
diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
index 051382c..6fc4170 100644
--- a/PVE/QemuConfig.pm
+++ b/PVE/QemuConfig.pm
@@ -12,6 +12,7 @@ use PVE::QemuServer::Helpers;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer;
 use PVE::QemuServer::Machine;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::Storage;
 use PVE::Tools;
 use PVE::Format qw(render_bytes render_duration);
@@ -208,8 +209,7 @@ sub __snapshot_save_vmstate {
 	$target = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
     }
 
-    my $defaults = PVE::QemuServer::load_defaults();
-    my $mem_size = $conf->{memory} // $defaults->{memory};
+    my $mem_size = get_current_memory($conf);
     my $driver_state_size = 500; # assume 500MB is enough to safe all driver state;
     # our savevm-start does live-save of the memory until the space left in the
     # volume is just enough for the remaining memory content + internal state
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 5e466d9..121b7e2 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -26,6 +26,7 @@ use PVE::QemuServer::Drive;
 use PVE::QemuServer::Helpers qw(min_version);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer;
 
 use PVE::AbstractMigrate;
@@ -1024,7 +1025,8 @@ sub phase2_start_remote_cluster {
     my $remote_vmid = $self->{opts}->{remote}->{vmid};
 
     # like regular start but with some overhead accounted for
-    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
+    my $memory = get_current_memory($self->{vmconf}); 
+    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}, $memory) + 10;
 
     my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
 
@@ -1179,7 +1181,7 @@ sub phase2 {
     $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime);
 
     # set cachesize to 10% of the total memory
-    my $memory =  $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf);
     my $cachesize = int($memory * 1048576 / 10);
     $cachesize = round_powerof2($cachesize);
 
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a52a883..ad69b76 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -52,7 +52,7 @@ use PVE::QemuServer::CGroup;
 use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options);
 use PVE::QemuServer::Drive qw(is_valid_drivename drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
 use PVE::QemuServer::Machine;
-use PVE::QemuServer::Memory;
+use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::PCI qw(print_pci_addr print_pcie_addr print_pcie_root_port parse_hostpci);
 use PVE::QemuServer::USB qw(parse_usb_device);
@@ -267,6 +267,18 @@ my $rng_fmt = {
     },
 };
 
+my $memory_fmt = {
+    current => {
+        description => "Current amount of online RAM for the VM in MB. This is the maximum available memory when"
+            ." you use the balloon device.",
+        type => 'integer',
+	default_key => 1,
+        optional => 1,
+        minimum => 16,
+        default => 512,
+    }
+};
+
 my $meta_info_fmt = {
     'ctime' => {
 	type => 'integer',
@@ -340,11 +352,8 @@ my $confdesc = {
     },
     memory => {
 	optional => 1,
-	type => 'integer',
-	description => "Amount of RAM for the VM in MB. This is the maximum available memory when"
-	    ." you use the balloon device.",
-	minimum => 16,
-	default => 512,
+	type => 'string',
+	format => $memory_fmt,
     },
     balloon => {
 	optional => 1,
@@ -2168,6 +2177,24 @@ sub parse_rng {
     return $res;
 }
 
+sub print_memory {
+    my $memory = shift;
+
+    return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
+}
+
+sub parse_memory {
+    my ($value) = @_;
+
+    my $current_default = $memory_fmt->{current}->{default};
+    my $res = { current => $current_default };
+    return $res if !defined($value);
+
+    $res = eval { parse_property_string($memory_fmt, $value) };
+    warn $@ if $@;
+    return $res;
+}
+
 sub parse_meta_info {
     my ($value) = @_;
 
@@ -2928,8 +2955,7 @@ sub vmstatus {
 	$d->{cpus} = $conf->{vcpus} if $conf->{vcpus};
 
 	$d->{name} = $conf->{name} || "VM $vmid";
-	$d->{maxmem} = $conf->{memory} ? $conf->{memory}*(1024*1024)
-	    : $defaults->{memory}*(1024*1024);
+	$d->{maxmem} = get_current_memory($conf)*(1024*1024);
 
 	if ($conf->{balloon}) {
 	    $d->{balloon_min} = $conf->{balloon}*(1024*1024);
@@ -5028,7 +5054,7 @@ sub vmconfig_hotplug_pending {
 		# enable balloon device is not hotpluggable
 		die "skip\n" if defined($conf->{balloon}) && $conf->{balloon} == 0;
 		# here we reset the ballooning value to memory
-		my $balloon = $conf->{memory} || $defaults->{memory};
+		my $balloon = get_current_memory($conf);
 		mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 	    } elsif ($fast_plug_option->{$opt}) {
 		# do nothing
@@ -5096,7 +5122,8 @@ sub vmconfig_hotplug_pending {
 
 		# allow manual ballooning if shares is set to zero
 		if ((defined($conf->{shares}) && ($conf->{shares} == 0))) {
-		    my $balloon = $conf->{pending}->{balloon} || $conf->{memory} || $defaults->{memory};
+		    my $memory = get_current_memory($conf);
+		    my $balloon = $conf->{pending}->{balloon} || $memory;
 		    mon_cmd($vmid, "balloon", value => $balloon*1024*1024);
 		}
 	    } elsif ($opt =~ m/^net(\d+)$/) {
@@ -5116,7 +5143,11 @@ sub vmconfig_hotplug_pending {
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
 		die "skip\n" if !$hotplug_features->{memory};
-		$value = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt, $value);
+		my $memory = get_current_memory($conf->{pending});
+		my $result = {};
+		$result->{current} = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt, $memory);
+		$value = PVE::QemuServer::print_memory($result);
+
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
 		$cgroup->change_cpu_shares($new_cpuunits);
@@ -5793,7 +5824,8 @@ sub vm_start_nolock {
 	push @$cmd, '-S';
     }
 
-    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $resume);
+    my $memory = get_current_memory($conf);
+    my $start_timeout = $params->{timeout} // config_aware_timeout($conf, $memory, $resume);
 
     my $pci_devices = {}; # host pci devices
     for (my $i = 0; $i < $PVE::QemuServer::PCI::MAX_HOSTPCI_DEVICES; $i++)  {
diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
index e91f906..9115d50 100644
--- a/PVE/QemuServer/Helpers.pm
+++ b/PVE/QemuServer/Helpers.pm
@@ -143,8 +143,7 @@ sub version_cmp {
 }
 
 sub config_aware_timeout {
-    my ($config, $is_suspended) = @_;
-    my $memory = $config->{memory};
+    my ($config, $memory, $is_suspended) = @_;
     my $timeout = 30;
 
     # Based on user reported startup time for vm with 512GiB @ 4-5 minutes
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index f8fc534..668508b 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -8,6 +8,11 @@ use PVE::Exception qw(raise raise_param_exc);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use base qw(Exporter);
+
+our @EXPORT_OK = qw(
+get_current_memory
+);
 
 my $MAX_NUMA = 8;
 my $STATICMEM = 1024;
@@ -63,6 +68,13 @@ my sub get_max_mem {
     return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
 }
 
+sub get_current_memory{
+    my ($conf) = @_;
+
+    my $memory = PVE::QemuServer::parse_memory($conf->{memory});
+    return $memory->{current};
+}
+
 sub get_numa_node_list {
     my ($conf) = @_;
     my @numa_map;
@@ -162,8 +174,7 @@ sub qemu_memory_hotplug {
     my $sockets = 1;
     $sockets = $conf->{sockets} if $conf->{sockets};
 
-    my $memory = $conf->{memory} || $defaults->{memory};
-    $value = $defaults->{memory} if !$value;
+    my $memory = get_current_memory($conf);
     return $value if $value == $memory;
 
     my $static_memory = $STATICMEM;
@@ -171,7 +182,7 @@ sub qemu_memory_hotplug {
 
     die "memory can't be lower than $static_memory MB" if $value < $static_memory;
     my $MAX_MEM = get_max_mem($conf);
-    die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $memory > $MAX_MEM;
+    die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
 
     if ($value > $memory) {
 
@@ -180,7 +191,7 @@ sub qemu_memory_hotplug {
 	foreach_dimm($conf, $vmid, $value, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-		return if $current_size <= $conf->{memory};
+		return if $current_size <= get_current_memory($conf);
 
 		if ($conf->{hugepages}) {
 		    $numa_hostmap = get_numa_guest_to_host_map($conf) if !$numa_hostmap;
@@ -219,7 +230,9 @@ sub qemu_memory_hotplug {
 		    die $err;
 		}
 		#update conf after each succesful module hotplug
-		$conf->{memory} = $current_size;
+		my $mem = {};
+		$mem->{current} = $current_size;
+		$conf->{memory} = PVE::QemuServer::print_memory($mem);
 		PVE::QemuConfig->write_config($vmid, $conf);
 	});
 
@@ -228,7 +241,8 @@ sub qemu_memory_hotplug {
 	foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-		return if $current_size >= $conf->{memory};
+		return if $current_size >= get_current_memory($conf);
+
 		print "try to unplug memory dimm $name\n";
 
 		my $retry = 0;
@@ -242,7 +256,9 @@ sub qemu_memory_hotplug {
 		}
 
 		#update conf after each succesful module unplug
-		$conf->{memory} = $current_size;
+		my $mem = {};
+		$mem->{current} = $current_size;
+		$conf->{memory} = PVE::QemuServer::print_memory($mem);
 
 		eval { PVE::QemuServer::qemu_objectdel($vmid, "mem-$name"); };
 		PVE::QemuConfig->write_config($vmid, $conf);
@@ -270,7 +286,8 @@ sub qemu_dimm_list {
 sub config {
     my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd) = @_;
 
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf);
+
     my $static_memory = 0;
 
     if ($hotplug_features->{memory}) {
@@ -486,7 +503,7 @@ sub hugepages_topology {
     return if !$conf->{numa};
 
     my $defaults = PVE::QemuServer::load_defaults();
-    my $memory = $conf->{memory} || $defaults->{memory};
+    my $memory = get_current_memory($conf);
     my $static_memory = 0;
     my $sockets = 1;
     $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer iused
-- 
2.30.2





* [pve-devel] [PATCH qemu-server 03/10] config: memory: add 'max' option
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 01/10] test: add memory tests Alexandre Derumier
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 02/10] add memory parser Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:38   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 04/10] memory: add get_static_mem Alexandre Derumier
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

max can only be a multiple of 64GiB, i.e. one of 65536, 131072, ..., 4194304 (values in MiB).

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index ad69b76..0d5b550 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -267,6 +267,9 @@ my $rng_fmt = {
     },
 };
 
+
+my @max_memory_list = map 65536*$_, 1..64;
+
 my $memory_fmt = {
     current => {
         description => "Current amount of online RAM for the VM in MB. This is the maximum available memory when"
@@ -276,7 +279,12 @@ my $memory_fmt = {
         optional => 1,
         minimum => 16,
         default => 512,
-    }
+    },
+    max => {
+        type => 'integer',
+        optional => 1,
+	enum => [@max_memory_list],
+    },
 };
 
 my $meta_info_fmt = {
-- 
2.30.2





* [pve-devel] [PATCH qemu-server 04/10] memory: add get_static_mem
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (2 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 03/10] config: memory: add 'max' option Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:38   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 50 ++++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 20 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 668508b..90e355b 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -15,7 +15,33 @@ get_current_memory
 );
 
 my $MAX_NUMA = 8;
-my $STATICMEM = 1024;
+
+my sub get_static_mem {
+    my ($conf) = @_;
+
+    my $sockets = 1;
+    $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer used
+    $sockets = $conf->{sockets} if $conf->{sockets};
+    my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
+
+    my $static_memory = 0;
+    my $memory = PVE::QemuServer::parse_memory($conf->{memory});
+
+    if($memory->{max}) {
+	my $dimm_size = $memory->{max} / 64;
+	# static memory must be at least one dimm size per socket and at least 4G
+	$static_memory = $dimm_size * $sockets;
+	$static_memory = 4096 if $static_memory < 4096;
+    } elsif ($hotplug_features->{memory}) {
+	#legacy
+	$static_memory = 1024;
+	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    } else {
+       $static_memory = $memory->{current};
+    }
+
+    return $static_memory;
+}
 
 my $_host_bits;
 my sub get_host_phys_address_bits {
@@ -177,8 +203,7 @@ sub qemu_memory_hotplug {
     my $memory = get_current_memory($conf);
     return $value if $value == $memory;
 
-    my $static_memory = $STATICMEM;
-    $static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
+    my $static_memory = get_static_mem($conf);
 
     die "memory can't be lower than $static_memory MB" if $value < $static_memory;
     my $MAX_MEM = get_max_mem($conf);
@@ -288,7 +313,7 @@ sub config {
 
     my $memory = get_current_memory($conf);
 
-    my $static_memory = 0;
+    my $static_memory = get_static_mem($conf);
 
     if ($hotplug_features->{memory}) {
 	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
@@ -300,18 +325,10 @@ sub config {
 		if $conf->{"numa$i"};
 	}
 
-	my $sockets = 1;
-	$sockets = $conf->{sockets} if $conf->{sockets};
-
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
 	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
 
     } else {
-
-	$static_memory = $memory;
 	push @$cmd, '-m', $static_memory;
     }
 
@@ -504,20 +521,13 @@ sub hugepages_topology {
 
     my $defaults = PVE::QemuServer::load_defaults();
     my $memory = get_current_memory($conf);
-    my $static_memory = 0;
+    my $static_memory = get_static_mem($conf);
     my $sockets = 1;
     $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer iused
     $sockets = $conf->{sockets} if $conf->{sockets};
     my $numa_custom_topology = undef;
     my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
 
-    if ($hotplug_features->{memory}) {
-	$static_memory = $STATICMEM;
-	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
-    } else {
-	$static_memory = $memory;
-    }
-
     #custom numa topology
     for (my $i = 0; $i < $MAX_NUMA; $i++) {
 	next if !$conf->{"numa$i"};
-- 
2.30.2





* [pve-devel] [PATCH qemu-server 05/10] memory: get_max_mem: use config memory max
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (3 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 04/10] memory: add get_static_mem Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:39   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined Alexandre Derumier
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 90e355b..b847742 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -91,7 +91,15 @@ my sub get_max_mem {
     # heuristic: remove 20 bits to get MB and half that as QEMU needs some overhead
     my $bits_to_max_mem = int(1<<($bits - 21));
 
-    return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
+    my $cpu_max_mem = $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
+
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
+    if($confmem->{max}) {
+       die "configured memory max can't be bigger than the CPU architecture limit of $cpu_max_mem MB" if $confmem->{max} > $cpu_max_mem;
+       return $confmem->{max};
+    }
+
+    return $cpu_max_mem;
 }
 
 sub get_current_memory{
-- 
2.30.2





* [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (4 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:39   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 07/10] test: add memory-max tests Alexandre Derumier
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

The default kernel vhost config has only supported 64 memory slots
by default since 2015, for performance reasons.

The original memory hotplug code predates this and used QEMU's
maximum of 255 slots.

To reach the maximum memory (4TiB), we used incrementally growing
dimm sizes.

Instead of dynamic dimm sizes, use a single static dimm size,
computed as max memory / 64.

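Worked example, matching the memory-max-512G test added in patch 07
(memory: current=131072,max=524288 with 2 sockets):

    dimm_size     = 524288 / 64      = 8192 MiB  (fixed for every slot)
    static_memory = 8192 * 2 sockets = 16384 MiB
    hotplugged    = 131072 - 16384   = 114688 MiB = 14 dimms of 8192 MiB
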
Fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=3446
https://bugzilla.proxmox.com/show_bug.cgi?id=1426
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer/Memory.pm | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index b847742..8bbbf07 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -140,14 +140,15 @@ sub foreach_dimm{
     my ($conf, $vmid, $memory, $sockets, $func) = @_;
 
     my $dimm_id = 0;
-    my $current_size = 0;
+    my $current_size = get_static_mem($conf);
     my $dimm_size = 0;
 
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
-	$current_size = 1024 * $sockets;
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
+    if ($confmem->{max}) {
+       $dimm_size = $confmem->{max}/64;
+    } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
 	$dimm_size = 1024;
     } else {
-	$current_size = 1024;
 	$dimm_size = 512;
     }
 
@@ -164,7 +165,7 @@ sub foreach_dimm{
 	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
 	    return  $current_size if $current_size >= $memory;
 	}
-	$dimm_size *= 2;
+	$dimm_size *= 2 if !$confmem->{max};
     }
 }
 
@@ -175,7 +176,12 @@ sub foreach_reverse_dimm {
     my $current_size = 0;
     my $dimm_size = 0;
 
-    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
+    if ($confmem->{max}) {
+       $dimm_id = 63;
+       $current_size = $confmem->{max};
+       $dimm_size = $confmem->{max}/64;
+    } elsif ($conf->{hugepages} && $conf->{hugepages} == 1024) {
 	$current_size = 8355840;
 	$dimm_size = 131072;
     } else {
@@ -197,6 +203,6 @@ sub foreach_reverse_dimm {
 	    return  $current_size if $current_size <= $memory;
 	}
-	$dimm_size /= 2;
+	$dimm_size /= 2 if !$confmem->{max};
     }
 }
 
@@ -263,7 +270,7 @@ sub qemu_memory_hotplug {
 		    die $err;
 		}
 		#update conf after each succesful module hotplug
-		my $mem = {};
+		my $mem = { max => $MAX_MEM };
 		$mem->{current} = $current_size;
 		$conf->{memory} = PVE::QemuServer::print_memory($mem);
 		PVE::QemuConfig->write_config($vmid, $conf);
@@ -289,7 +296,7 @@ sub qemu_memory_hotplug {
 		}
 
 		#update conf after each succesful module unplug
-		my $mem = {};
+		my $mem = { max => $MAX_MEM };
 		$mem->{current} = $current_size;
 		$conf->{memory} = PVE::QemuServer::print_memory($mem);
 
@@ -322,8 +329,9 @@ sub config {
     my $memory = get_current_memory($conf);
 
     my $static_memory = get_static_mem($conf);
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
 
-    if ($hotplug_features->{memory}) {
+    if ($hotplug_features->{memory} || defined($confmem->{max})) {
 	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
 	my $MAX_MEM = get_max_mem($conf);
 	die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory > $MAX_MEM;
@@ -334,7 +342,8 @@ sub config {
 	}
 
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
-	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
+	my $slots = $confmem->{max} ? 64 : 255;
+	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
 
     } else {
 	push @$cmd, '-m', $static_memory;
@@ -403,7 +412,7 @@ sub config {
 	}
     }
 
-    if ($hotplug_features->{memory}) {
+    if ($hotplug_features->{memory} || $confmem->{max}) {
 	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
-- 
2.30.2





* [pve-devel] [PATCH qemu-server 07/10] test: add memory-max tests
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (5 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-max-128G.conf     | 10 ++++
 test/cfg2cmd/memory-max-128G.conf.cmd | 86 +++++++++++++++++++++++++++
 test/cfg2cmd/memory-max-512G.conf     | 10 ++++
 test/cfg2cmd/memory-max-512G.conf.cmd | 58 ++++++++++++++++++
 4 files changed, 164 insertions(+)
 create mode 100644 test/cfg2cmd/memory-max-128G.conf
 create mode 100644 test/cfg2cmd/memory-max-128G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-max-512G.conf
 create mode 100644 test/cfg2cmd/memory-max-512G.conf.cmd

diff --git a/test/cfg2cmd/memory-max-128G.conf b/test/cfg2cmd/memory-max-128G.conf
new file mode 100644
index 0000000..c023282
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf
@@ -0,0 +1,10 @@
+# TEST: memory max 64G with 2 sockets, dimm size should be 1GB && static memory should be 4G
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 32768,max=65536
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-128G.conf.cmd b/test/cfg2cmd/memory-max-128G.conf.cmd
new file mode 100644
index 0000000..4477faf
--- /dev/null
+++ b/test/cfg2cmd/memory-max-128G.conf.cmd
@@ -0,0 +1,86 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,slots=64,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=1024M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=1024M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=1024M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=1024M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=1024M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=1024M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=1024M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=1024M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=1024M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=1024M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=1024M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=1024M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=1024M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=1024M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm14,size=1024M' \
+  -device 'pc-dimm,id=dimm14,memdev=mem-dimm14,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm15,size=1024M' \
+  -device 'pc-dimm,id=dimm15,memdev=mem-dimm15,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm16,size=1024M' \
+  -device 'pc-dimm,id=dimm16,memdev=mem-dimm16,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm17,size=1024M' \
+  -device 'pc-dimm,id=dimm17,memdev=mem-dimm17,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm18,size=1024M' \
+  -device 'pc-dimm,id=dimm18,memdev=mem-dimm18,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm19,size=1024M' \
+  -device 'pc-dimm,id=dimm19,memdev=mem-dimm19,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm20,size=1024M' \
+  -device 'pc-dimm,id=dimm20,memdev=mem-dimm20,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm21,size=1024M' \
+  -device 'pc-dimm,id=dimm21,memdev=mem-dimm21,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm22,size=1024M' \
+  -device 'pc-dimm,id=dimm22,memdev=mem-dimm22,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm23,size=1024M' \
+  -device 'pc-dimm,id=dimm23,memdev=mem-dimm23,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm24,size=1024M' \
+  -device 'pc-dimm,id=dimm24,memdev=mem-dimm24,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm25,size=1024M' \
+  -device 'pc-dimm,id=dimm25,memdev=mem-dimm25,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm26,size=1024M' \
+  -device 'pc-dimm,id=dimm26,memdev=mem-dimm26,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm27,size=1024M' \
+  -device 'pc-dimm,id=dimm27,memdev=mem-dimm27,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-max-512G.conf b/test/cfg2cmd/memory-max-512G.conf
new file mode 100644
index 0000000..5882ae4
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf
@@ -0,0 +1,10 @@
+# TEST: memory max 512GB with 2 sockets, dimm size should be 8GB && static memory should be 16GB
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 131072,max=524288
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-max-512G.conf.cmd b/test/cfg2cmd/memory-max-512G.conf.cmd
new file mode 100644
index 0000000..ba40298
--- /dev/null
+++ b/test/cfg2cmd/memory-max-512G.conf.cmd
@@ -0,0 +1,58 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=16384,slots=64,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=8192M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=8192M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-dimm0,size=8192M' \
+  -device 'pc-dimm,id=dimm0,memdev=mem-dimm0,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm1,size=8192M' \
+  -device 'pc-dimm,id=dimm1,memdev=mem-dimm1,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm2,size=8192M' \
+  -device 'pc-dimm,id=dimm2,memdev=mem-dimm2,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm3,size=8192M' \
+  -device 'pc-dimm,id=dimm3,memdev=mem-dimm3,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm4,size=8192M' \
+  -device 'pc-dimm,id=dimm4,memdev=mem-dimm4,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm5,size=8192M' \
+  -device 'pc-dimm,id=dimm5,memdev=mem-dimm5,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm6,size=8192M' \
+  -device 'pc-dimm,id=dimm6,memdev=mem-dimm6,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm7,size=8192M' \
+  -device 'pc-dimm,id=dimm7,memdev=mem-dimm7,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm8,size=8192M' \
+  -device 'pc-dimm,id=dimm8,memdev=mem-dimm8,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm9,size=8192M' \
+  -device 'pc-dimm,id=dimm9,memdev=mem-dimm9,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm10,size=8192M' \
+  -device 'pc-dimm,id=dimm10,memdev=mem-dimm10,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm11,size=8192M' \
+  -device 'pc-dimm,id=dimm11,memdev=mem-dimm11,node=1' \
+  -object 'memory-backend-ram,id=mem-dimm12,size=8192M' \
+  -device 'pc-dimm,id=dimm12,memdev=mem-dimm12,node=0' \
+  -object 'memory-backend-ram,id=mem-dimm13,size=8192M' \
+  -device 'pc-dimm,id=dimm13,memdev=mem-dimm13,node=1' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
-- 
2.30.2


* [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (6 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 07/10] test: add memory-max tests Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:42   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests Alexandre Derumier
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

A 4GB static memory region is needed for DMA and boot memory, as this
memory is almost always not unpluggable.

One virtio-mem PCI device is set up for each NUMA node on the pci.4 bridge.

virtio-mem uses a fixed layout of 32000 blocks; the block size is
computed as (max memory - 4096) / 32000, rounded up to the next power of
two, with a minimum of 2MB to stay aligned with THP.
(A lower block size means a better chance to unplug memory.)
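For example, a standalone sketch of the computation (sizes in MB; the two
calls match the virtio test cases added later in this series):

use strict;
use warnings;

sub virtiomem_block_size {
    my ($max_mem, $static_mem) = @_;
    my $blocksize = ($max_mem - $static_mem) / 32000;
    $blocksize = 2**(int(log($blocksize)/log(2))+1); # round to next power of 2
    $blocksize = 2 if $blocksize < 2; # 2MB minimum, to stay THP-aligned
    return $blocksize;
}

print virtiomem_block_size(65536, 4096), "\n";  # 2  -> block-size=2M with max=65536
print virtiomem_block_size(524288, 4096), "\n"; # 32 -> block-size=32M with the default max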

fixes:
https://bugzilla.proxmox.com/show_bug.cgi?id=931
https://bugzilla.proxmox.com/show_bug.cgi?id=2949
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        |  8 +++-
 PVE/QemuServer/Memory.pm | 98 +++++++++++++++++++++++++++++++++++++---
 PVE/QemuServer/PCI.pm    |  8 ++++
 3 files changed, 106 insertions(+), 8 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 0d5b550..43fab29 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -285,6 +285,12 @@ my $memory_fmt = {
         optional => 1,
 	enum => [@max_memory_list],
     },
+    virtio => {
+	description => "enable virtio-mem memory",
+	type => 'boolean',
+	optional => 1,
+	default => 0,
+    },
 };
 
 my $meta_info_fmt = {
@@ -3898,7 +3904,7 @@ sub config_to_command {
 	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
     }
 
-    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd);
+    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd, $devices, $bridges, $arch, $machine_type);
 
     push @$cmd, '-S' if $conf->{freeze};
 
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 8bbbf07..70ab65a 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -8,6 +8,8 @@ use PVE::Exception qw(raise raise_param_exc);
 
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
+use PVE::QemuServer::PCI qw(print_pci_addr);
+
 use base qw(Exporter);
 
 our @EXPORT_OK = qw(
@@ -27,7 +29,9 @@ my sub get_static_mem {
     my $static_memory = 0;
     my $memory = PVE::QemuServer::parse_memory($conf->{memory});
 
-    if($memory->{max}) {
+    if ($memory->{virtio}) {
+	$static_memory = 4096;
+    } elsif ($memory->{max}) {
 	my $dimm_size = $memory->{max} / 64;
 	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
 	$static_memory = $dimm_size * $sockets;
@@ -102,6 +106,24 @@ my sub get_max_mem {
     return $cpu_max_mem;
 }
 
+my sub get_virtiomem_block_size {
+    my ($conf) = @_;
+
+    my $MAX_MEM = get_max_mem($conf);
+    my $static_memory = get_static_mem($conf);
+    my $memory = get_current_memory($conf);
+    #virtiomem can map 32000 block size. try to use lowerst blocksize, lower = more chance to unplug memory.
+    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
+    #round next power of 2
+    $blocksize = 2**(int(log($blocksize)/log(2))+1);
+    #2MB is the minimum to be aligned with THP
+    $blocksize = 2 if $blocksize < 2;
+
+    die "memory size need to be multiple of $blocksize MB when virtio-mem is enabled" if ($memory % $blocksize != 0);
+
+    return $blocksize;
+}
+
 sub get_current_memory{
     my ($conf) = @_;
 
@@ -224,7 +246,41 @@ sub qemu_memory_hotplug {
     my $MAX_MEM = get_max_mem($conf);
     die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
 
-    if ($value > $memory) {
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
+
+    if ($confmem->{virtio}) {
+	my $blocksize = get_virtiomem_block_size($conf);
+	my $requested_size = ($value - $static_memory) / $sockets * 1024 * 1024;
+	my $totalsize = $static_memory;
+	my $err = undef;
+
+	for (my $i = 0; $i < $sockets; $i++)  {
+
+	    my $id = "virtiomem$i";
+	    my $retry = 0;
+	    mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/$id", property => "requested-size", value => int($requested_size));
+
+	    my $size = 0;
+	    while (1) {
+		sleep 1;
+		$size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/$id", property => "size");
+		$err = 1 if $retry > 5;
+		last if $size eq $requested_size || $retry > 5;
+		$retry++;
+	    }
+	    $totalsize += ($size / 1024 / 1024 );
+	}
+	#update conf after each succesfull change
+	if($err) {
+	    my $mem = { max => $MAX_MEM, virtio => 1};
+	    $mem->{current} = $totalsize;
+	    $conf->{memory} = PVE::QemuServer::print_memory($mem);
+	    PVE::QemuConfig->write_config($vmid, $conf);
+	    raise_param_exc({ 'memory' => "error modify virtio memory" }) if $err;
+	}
+	return $totalsize;
+
+    } elsif ($value > $memory) {
 
 	my $numa_hostmap;
 
@@ -324,14 +380,15 @@ sub qemu_dimm_list {
 }
 
 sub config {
-    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd) = @_;
+    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd, $devices, $bridges, $arch, $machine_type) = @_;
 
     my $memory = get_current_memory($conf);
 
     my $static_memory = get_static_mem($conf);
+
     my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
 
-    if ($hotplug_features->{memory} || defined($confmem->{max})) {
+    if ($hotplug_features->{memory} || defined($confmem->{max}) || defined($confmem->{virtio})) {
 	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
 	my $MAX_MEM = get_max_mem($conf);
 	die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory > $MAX_MEM;
@@ -342,8 +399,12 @@ sub config {
 	}
 
 	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
+
+	my $cmdstr = "size=${static_memory}";
 	my $slots = $confmem->{max} ? 64 : 255;
-	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
+	$cmdstr .= ",slots=$slots" if !$confmem->{'virtio'};
+	$cmdstr .= ",maxmem=${MAX_MEM}M";
+	push @$cmd, '-m', $cmdstr;
 
     } else {
 	push @$cmd, '-m', $static_memory;
@@ -412,7 +473,26 @@ sub config {
 	}
     }
 
-    if ($hotplug_features->{memory} || $confmem->{max}) {
+    if ($confmem->{'virtio'}) {
+	my $MAX_MEM = get_max_mem($conf);
+	my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
+	my $node_mem = ($memory - $static_memory) / $sockets;
+	my $blocksize = get_virtiomem_block_size($conf);
+
+	for (my $i = 0; $i < $sockets; $i++)  {
+
+	    my $id = "virtiomem$i";
+	    my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
+	    push @$cmd, "-object" , "$mem_object,reserve=off";
+
+	    my $pciaddr = print_pci_addr($id, $bridges, $arch, $machine_type);
+	    my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i$pciaddr";
+	    $mem_device .= ",prealloc=on" if $conf->{hugepages};
+	    push @$devices, "-device", $mem_device;
+	}
+
+    } elsif ($hotplug_features->{memory} || $confmem->{max}) {
+
 	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
 	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
 
@@ -430,12 +510,16 @@ sub config {
 sub print_mem_object {
     my ($conf, $id, $size) = @_;
 
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
+
     if ($conf->{hugepages}) {
 
 	my $hugepages_size = hugepages_size($conf, $size);
 	my $path = hugepages_mount_path($hugepages_size);
 
-	return "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on,prealloc=yes";
+	my $object = "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on";
+	$object .= ",prealloc=yes" if !$confmem->{virtio};
+	return $object;
     } else {
 	return "memory-backend-ram,id=$id,size=${size}M";
     }
diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
index a18b974..0187c74 100644
--- a/PVE/QemuServer/PCI.pm
+++ b/PVE/QemuServer/PCI.pm
@@ -249,6 +249,14 @@ sub get_pci_addr_map {
 	'scsihw2' => { bus => 4, addr => 1 },
 	'scsihw3' => { bus => 4, addr => 2 },
 	'scsihw4' => { bus => 4, addr => 3 },
+	'virtiomem0' => { bus => 4, addr => 4 },
+	'virtiomem1' => { bus => 4, addr => 5 },
+	'virtiomem2' => { bus => 4, addr => 6 },
+	'virtiomem3' => { bus => 4, addr => 7 },
+	'virtiomem4' => { bus => 4, addr => 8 },
+	'virtiomem5' => { bus => 4, addr => 9 },
+	'virtiomem6' => { bus => 4, addr => 10 },
+	'virtiomem7' => { bus => 4, addr => 11 },
     } if !defined($pci_addr_map);
     return $pci_addr_map;
 }
-- 
2.30.2


* [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (7 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:42   ` Fiona Ebner
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 10/10] memory: fix hotplug with virtiomem && maxmem Alexandre Derumier
  2022-12-16 13:38 ` [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Fiona Ebner
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/memory-virtio-hugepages-1G.conf  | 11 ++++++
 .../memory-virtio-hugepages-1G.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio-max.conf           | 10 ++++++
 test/cfg2cmd/memory-virtio-max.conf.cmd       | 35 +++++++++++++++++++
 test/cfg2cmd/memory-virtio.conf               | 10 ++++++
 test/cfg2cmd/memory-virtio.conf.cmd           | 35 +++++++++++++++++++
 6 files changed, 136 insertions(+)
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
 create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf
 create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
 create mode 100644 test/cfg2cmd/memory-virtio.conf
 create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd

diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
new file mode 100644
index 0000000..616b98e
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
@@ -0,0 +1,11 @@
+# TEST: virtio-mem with 1GB hugepages, without max defined
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
+hugepages: 1024
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
new file mode 100644
index 0000000..1592004
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-file,id=ram-node0,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-file,id=ram-node1,size=2048M,mem-path=/run/hugepages/kvm/1048576kB,share=on' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-file,id=mem-virtiomem0,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -object 'memory-backend-file,id=mem-virtiomem1,size=260096M,mem-path=/run/hugepages/kvm/1048576kB,share=on,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4,prealloc=on' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5,prealloc=on' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-virtio-max.conf b/test/cfg2cmd/memory-virtio-max.conf
new file mode 100644
index 0000000..fe24a36
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf
@@ -0,0 +1,10 @@
+# TEST: memory max 64GB with 2 sockets and virtio-mem
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 32768,max=65536,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio-max.conf.cmd b/test/cfg2cmd/memory-virtio-max.conf.cmd
new file mode 100644
index 0000000..5b03e0d
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio-max.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=65536M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=30720M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=30720M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=2M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=2M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
diff --git a/test/cfg2cmd/memory-virtio.conf b/test/cfg2cmd/memory-virtio.conf
new file mode 100644
index 0000000..f88f6f5
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf
@@ -0,0 +1,10 @@
+# TEST: virtio-mem without max defined
+# QEMU_VERSION: 3.0
+cores: 2
+memory: 32768,virtio=1
+name: simple
+numa: 1
+ostype: l26
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 2
+vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
\ No newline at end of file
diff --git a/test/cfg2cmd/memory-virtio.conf.cmd b/test/cfg2cmd/memory-virtio.conf.cmd
new file mode 100644
index 0000000..283c87b
--- /dev/null
+++ b/test/cfg2cmd/memory-virtio.conf.cmd
@@ -0,0 +1,35 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name 'simple,debug-threads=on' \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '4,sockets=2,cores=2,maxcpus=4' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 'size=4096,maxmem=524288M' \
+  -object 'memory-backend-ram,id=ram-node0,size=2048M' \
+  -numa 'node,nodeid=0,cpus=0-1,memdev=ram-node0' \
+  -object 'memory-backend-ram,id=ram-node1,size=2048M' \
+  -numa 'node,nodeid=1,cpus=2-3,memdev=ram-node1' \
+  -object 'memory-backend-ram,id=mem-virtiomem0,size=260096M,reserve=off' \
+  -object 'memory-backend-ram,id=mem-virtiomem1,size=260096M,reserve=off' \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'pci-bridge,id=pci.4,chassis_nr=4,bus=pci.1,addr=0x1c' \
+  -device 'vmgenid,guid=c773c261-d800-4348-9f5d-167fadd53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem0,memdev=mem-virtiomem0,node=0,bus=pci.4,addr=0x4' \
+  -device 'virtio-mem-pci,block-size=32M,requested-size=14336M,id=virtiomem1,memdev=mem-virtiomem1,node=1,bus=pci.4,addr=0x5' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -machine 'type=pc'
-- 
2.30.2


* [pve-devel] [PATCH qemu-server 10/10] memory: fix hotplug with virtiomem && maxmem
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (8 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests Alexandre Derumier
@ 2022-12-09 19:27 ` Alexandre Derumier
  2022-12-16 13:42   ` Fiona Ebner
  2022-12-16 13:38 ` [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Fiona Ebner
  10 siblings, 1 reply; 32+ messages in thread
From: Alexandre Derumier @ 2022-12-09 19:27 UTC (permalink / raw)
  To: pve-devel, t.lamprecht

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm        |  9 ++++-----
 PVE/QemuServer/Memory.pm | 23 ++++++++++++++++-------
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 43fab29..549e448 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -5081,7 +5081,8 @@ sub vmconfig_hotplug_pending {
 		vmconfig_delete_or_detach_drive($vmid, $storecfg, $conf, $opt, $force);
 	    } elsif ($opt =~ m/^memory$/) {
 		die "skip\n" if !$hotplug_features->{memory};
-		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt);
+		my $new_mem = PVE::QemuServer::parse_memory();
+		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $new_mem);
 	    } elsif ($opt eq 'cpuunits') {
 		$cgroup->change_cpu_shares(undef);
 	    } elsif ($opt eq 'cpulimit') {
@@ -5157,10 +5158,8 @@ sub vmconfig_hotplug_pending {
 				     $vmid, $opt, $value, $arch, $machine_type);
 	    } elsif ($opt =~ m/^memory$/) { #dimms
 		die "skip\n" if !$hotplug_features->{memory};
-		my $memory = get_current_memory($conf->{pending});
-		my $result = {};
-		$result->{current} = PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $defaults, $opt, $memory);
-		$value = PVE::QemuServer::print_memory($result);
+		my $new_mem = PVE::QemuServer::parse_memory($conf->{pending}->{memory});
+		PVE::QemuServer::Memory::qemu_memory_hotplug($vmid, $conf, $new_mem);
 
 	    } elsif ($opt eq 'cpuunits') {
 		my $new_cpuunits = PVE::CGroup::clamp_cpu_shares($conf->{pending}->{$opt}); #clamp
diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
index 70ab65a..84a9126 100644
--- a/PVE/QemuServer/Memory.pm
+++ b/PVE/QemuServer/Memory.pm
@@ -9,6 +9,7 @@ use PVE::Exception qw(raise raise_param_exc);
 use PVE::QemuServer;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::PCI qw(print_pci_addr);
+use PVE::GuestHelpers qw(safe_string_ne safe_num_ne safe_boolean_ne);
 
 use base qw(Exporter);
 
@@ -230,24 +231,32 @@ sub foreach_reverse_dimm {
 }
 
 sub qemu_memory_hotplug {
-    my ($vmid, $conf, $defaults, $opt, $value) = @_;
+    my ($vmid, $conf, $new_mem) = @_;
 
-    return $value if !PVE::QemuServer::check_running($vmid);
+    return if !PVE::QemuServer::check_running($vmid);
 
-    my $sockets = 1;
-    $sockets = $conf->{sockets} if $conf->{sockets};
+    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
+
+    # skip non hotpluggable value
+    if (safe_string_ne($new_mem->{max}, $confmem->{max}) ||
+	safe_boolean_ne($new_mem->{virtio}, $confmem->{virtio})) {
+	die "skip\n";
+    }
+
+    my $value = $new_mem->{current};
+    my $memory = $confmem->{current};
 
-    my $memory = get_current_memory($conf);
     return $value if $value == $memory;
 
+    my $sockets = 1;
+    $sockets = $conf->{sockets} if $conf->{sockets};
+
     my $static_memory = get_static_mem($conf);
 
     die "memory can't be lower than $static_memory MB" if $value < $static_memory;
     my $MAX_MEM = get_max_mem($conf);
     die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
 
-    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
-
     if ($confmem->{virtio}) {
 	my $blocksize = get_virtiomem_block_size($conf);
 	my $requested_size = ($value - $static_memory) / $sockets * 1024 * 1024;
-- 
2.30.2


* Re: [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem
  2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
                   ` (9 preceding siblings ...)
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 10/10] memory: fix hotplug with virtiomem && maxmem Alexandre Derumier
@ 2022-12-16 13:38 ` Fiona Ebner
  2022-12-19 11:31   ` DERUMIER, Alexandre
  10 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:38 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> This patch series rework the current memory hotplug + virtiomem.
> 
> memory option now have extra options:
> 
> memory: [[current=]<integer>] [,max=<enum>] [,virtio=<1|0>]
> ex: memory: current=1024,max=131072,virtio=1
> 
> 
> patches 1-2: add a memory parser
> 
> patches 3-7: add the max option with 64 static dimm hotplug
> 
> for classic memory hotplug, when maxmemory is defined,
> we use 64 fixed size dimm.
> The max option is a multiple of 64GB.
> 
> patches 8-9: add virtio-mem
> 
> The virtio option enable new virtio-mem support,
> instead of plugging dimm, it's add/removed block inside
> big dimm.
> virtio-mem can use 32000 blocks, the blocksize is compute from
> max memory.
> 
> patch10: hotplug fix
> 
> Alexandre Derumier (10):
>   test: add memory tests
>   add memory parser
>   config: memory: add 'max' option
>   memory: add get_static_mem
>   memory: get_max_mem: use config memory max
>   memory: use 64 slots && static dimm size with max is defined
>   test: add memory-max tests
>   memory: add virtio-mem support
>   tests: add virtio-mem tests
>   memory: fix hotplug with virtiomem && maxmem
> 
The general idea looks fine to me, and my basic testing seemed fine too,
but most patches could use a few improvements. The biggest issue is that
getting the static information in the HA manager is not adapted to the change.

The virtio feature should be marked as a technology preview, and it
should be mentioned that it only works for guests running Linux >= 5.8 [0].

IMHO, it'd be nicer to add the property string handling/registering in
the Memory.pm module directly, so that everything is contained in one place.

I also think that using a validator for the format might be worth it.

Also, the tests should be made more host-independent, but for the NUMA
node issue 01/10 that might require a bit of rework, to be able to mock
the relevant parts.

Should the virtio feature be QEMU-version-guarded? It's explicitly
opt-in and a technology preview, so maybe not really needed?

See the individual replies on patches for details.

[0]: https://virtio-mem.gitlab.io/user-guide/


* Re: [pve-devel] [PATCH qemu-server 01/10] test: add memory tests
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 01/10] test: add memory tests Alexandre Derumier
@ 2022-12-16 13:38   ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:38 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  test/cfg2cmd/memory-hotplug-hugepages.conf    |  12 ++
>  .../cfg2cmd/memory-hotplug-hugepages.conf.cmd |  62 +++++++
>  test/cfg2cmd/memory-hotplug.conf              |  11 ++
>  test/cfg2cmd/memory-hotplug.conf.cmd          | 174 ++++++++++++++++++
>  test/cfg2cmd/memory-hugepages-1g.conf         |  11 ++
>  test/cfg2cmd/memory-hugepages-1g.conf.cmd     |  30 +++
>  test/cfg2cmd/memory-hugepages-2m.conf         |  11 ++
>  test/cfg2cmd/memory-hugepages-2m.conf.cmd     |  30 +++
>  8 files changed, 341 insertions(+)
>  create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf
>  create mode 100644 test/cfg2cmd/memory-hotplug-hugepages.conf.cmd
>  create mode 100644 test/cfg2cmd/memory-hotplug.conf
>  create mode 100644 test/cfg2cmd/memory-hotplug.conf.cmd
>  create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf
>  create mode 100644 test/cfg2cmd/memory-hugepages-1g.conf.cmd
>  create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf
>  create mode 100644 test/cfg2cmd/memory-hugepages-2m.conf.cmd

Some of these didn't work on my development VM:
> got unexpected error: host NUMA node1 doesn't exist

I had to enable NUMA for the VM to make it work, but ideally such tests
should work on all setups.
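One possible direction in the cfg2cmd test harness, assuming the
node-existence check gets factored into a mockable helper
(host_numanode_exists is a hypothetical name):

my $memory_module = Test::MockModule->new("PVE::QemuServer::Memory");
# pretend every host NUMA node exists, so the tests don't depend on the host
$memory_module->mock(host_numanode_exists => sub { return 1; });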


* Re: [pve-devel] [PATCH qemu-server 02/10] add memory parser
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 02/10] add memory parser Alexandre Derumier
@ 2022-12-16 13:38   ` Fiona Ebner
  2023-01-02 10:50     ` DERUMIER, Alexandre
  2023-01-02 11:23     ` DERUMIER, Alexandre
  0 siblings, 2 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:38 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/API2/Qemu.pm          | 12 ++++++++-
>  PVE/QemuConfig.pm         |  4 +--
>  PVE/QemuMigrate.pm        |  6 +++--
>  PVE/QemuServer.pm         | 56 ++++++++++++++++++++++++++++++---------
>  PVE/QemuServer/Helpers.pm |  3 +--
>  PVE/QemuServer/Memory.pm  | 35 +++++++++++++++++-------
>  6 files changed, 88 insertions(+), 28 deletions(-)
> 

From a modularity standpoint, it would be nice to move the format
description, parsing+printing to PVE/QemuServer/Memory.pm, similar to
how it is done in PVE/QemuServer/PCI.pm.

Since $confdesc->{memory}->{default} is now gone, load_defaults() will
not return a default for 'memory' anymore. And $conf->{memory} needs to
be parsed everywhere. These two things will break getting static usage
in HA manager, which uses load_defaults() and relies on $conf->{memory}
to be a single value right now. We can switch there to use
get_current_memory() too of course, but it'd require a versioned
Breaks+Depends.

Alternatively, we could add a function in qemu-server for calculating
the static stats and call that from HA. Requires a versioned
Breaks+Depends too, but then we'd be safe in the future and also catch
such changes more easily. OTOH, it'd make the coupling go in the other
direction: if HA manager wants to change what it uses for static
consideration, then qemu-server would need to adapt. So not sure.
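A rough shape for such a helper, purely illustrative (names and fields
are not settled):

sub get_static_stats {
    my ($conf) = @_;

    my $defaults = PVE::QemuServer::load_defaults();
    my $sockets = $conf->{sockets} // $defaults->{sockets};
    my $cores = $conf->{cores} // $defaults->{cores};

    return {
        maxcpu => $sockets * $cores,
        maxmem => get_current_memory($conf) * 1024 * 1024, # in bytes
    };
}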

> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index badfc37..ba893d2 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -32,6 +32,7 @@ use PVE::QemuServer::Drive;
>  use PVE::QemuServer::ImportDisk;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer::Machine;
> +use PVE::QemuServer::Memory qw(get_current_memory);
>  use PVE::QemuMigrate;
>  use PVE::RPCEnvironment;
>  use PVE::AccessControl;
> @@ -1608,7 +1609,16 @@ my $update_vm_api  = sub {
>  	}
>  
>  	if ($param->{memory} || defined($param->{balloon})) {
> -	    my $maxmem = $param->{memory} || $conf->{pending}->{memory} || $conf->{memory} || $defaults->{memory};
> +
> +	    my $maxmem = undef;
> +	    if($param->{memory}) {

Style nit: missing space after if

> +	        $maxmem = get_current_memory($param);
> +	    } elsif ($conf->{pending}->{memory}) {
> +	        $maxmem = get_current_memory($conf->{pending});
> +	    } elsif ($conf->{memory}) {

Should be 'else'. Otherwise, you don't get the default when no branch is
taken.
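I.e., something like (sketch):

my $maxmem;
if ($param->{memory}) {
    $maxmem = get_current_memory($param);
} elsif ($conf->{pending}->{memory}) {
    $maxmem = get_current_memory($conf->{pending});
} else {
    $maxmem = get_current_memory($conf); # also yields the default when unset
}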

> +	        $maxmem = get_current_memory($conf);
> +	    }
> +
>  	    my $balloon = defined($param->{balloon}) ? $param->{balloon} : $conf->{pending}->{balloon} || $conf->{balloon};
>  
>  	    die "balloon value too large (must be smaller than assigned memory)\n"
> diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
> index 051382c..6fc4170 100644
> --- a/PVE/QemuConfig.pm
> +++ b/PVE/QemuConfig.pm
> @@ -12,6 +12,7 @@ use PVE::QemuServer::Helpers;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer;
>  use PVE::QemuServer::Machine;
> +use PVE::QemuServer::Memory qw(get_current_memory);
>  use PVE::Storage;
>  use PVE::Tools;
>  use PVE::Format qw(render_bytes render_duration);
> @@ -208,8 +209,7 @@ sub __snapshot_save_vmstate {
>  	$target = PVE::QemuServer::find_vmstate_storage($conf, $storecfg);
>      }
>  
> -    my $defaults = PVE::QemuServer::load_defaults();
> -    my $mem_size = $conf->{memory} // $defaults->{memory};
> +    my $mem_size = get_current_memory($conf);
>      my $driver_state_size = 500; # assume 500MB is enough to safe all driver state;
>      # our savevm-start does live-save of the memory until the space left in the
>      # volume is just enough for the remaining memory content + internal state
> diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
> index 5e466d9..121b7e2 100644
> --- a/PVE/QemuMigrate.pm
> +++ b/PVE/QemuMigrate.pm
> @@ -26,6 +26,7 @@ use PVE::QemuServer::Drive;
>  use PVE::QemuServer::Helpers qw(min_version);
>  use PVE::QemuServer::Machine;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
> +use PVE::QemuServer::Memory qw(get_current_memory);
>  use PVE::QemuServer;
>  
>  use PVE::AbstractMigrate;
> @@ -1024,7 +1025,8 @@ sub phase2_start_remote_cluster {
>      my $remote_vmid = $self->{opts}->{remote}->{vmid};
>  
>      # like regular start but with some overhead accounted for
> -    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}) + 10;
> +    my $memory = get_current_memory($self->{vmconf}); 

Style nit: trailing space

> +    my $timeout = PVE::QemuServer::Helpers::config_aware_timeout($self->{vmconf}, $memory) + 10;
>  
>      my $res = PVE::Tunnel::write_tunnel($self->{tunnel}, $timeout, "start", $params);
>  
> @@ -1179,7 +1181,7 @@ sub phase2 {
>      $qemu_migrate_params->{'downtime-limit'} = int($migrate_downtime);
>  
>      # set cachesize to 10% of the total memory
> -    my $memory =  $conf->{memory} || $defaults->{memory};
> +    my $memory = get_current_memory($conf);
>      my $cachesize = int($memory * 1048576 / 10);
>      $cachesize = round_powerof2($cachesize);
>  
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index a52a883..ad69b76 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -52,7 +52,7 @@ use PVE::QemuServer::CGroup;
>  use PVE::QemuServer::CPUConfig qw(print_cpu_device get_cpu_options);
>  use PVE::QemuServer::Drive qw(is_valid_drivename drive_is_cloudinit drive_is_cdrom drive_is_read_only parse_drive print_drive);
>  use PVE::QemuServer::Machine;
> -use PVE::QemuServer::Memory;
> +use PVE::QemuServer::Memory qw(get_current_memory);
>  use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer::PCI qw(print_pci_addr print_pcie_addr print_pcie_root_port parse_hostpci);
>  use PVE::QemuServer::USB qw(parse_usb_device);
> @@ -267,6 +267,18 @@ my $rng_fmt = {
>      },
>  };
>  
> +my $memory_fmt = {
> +    current => {
> +        description => "Current amount of online RAM for the VM in MB. This is the maximum available memory when"
> +            ." you use the balloon device.",
> +        type => 'integer',
> +	default_key => 1,
> +        optional => 1,
> +        minimum => 16,
> +        default => 512,
> +    }

Style nit: whitespace errors

> +};
> +
>  my $meta_info_fmt = {
>      'ctime' => {
>  	type => 'integer',
> @@ -340,11 +352,8 @@ my $confdesc = {
>      },
>      memory => {
>  	optional => 1,
> -	type => 'integer',
> -	description => "Amount of RAM for the VM in MB. This is the maximum available memory when"
> -	    ." you use the balloon device.",
> -	minimum => 16,
> -	default => 512,
> +	type => 'string',
> +	format => $memory_fmt,
>      },
>      balloon => {
>  	optional => 1,
> @@ -2168,6 +2177,24 @@ sub parse_rng {
>      return $res;
>  }
>  
> +sub print_memory {
> +    my $memory = shift;
> +
> +    return PVE::JSONSchema::print_property_string($memory, $memory_fmt);
> +}
> +
> +sub parse_memory {
> +    my ($value) = @_;
> +
> +    my $current_default = $memory_fmt->{current}->{default};
> +    my $res = { current => $current_default };
> +    return $res if !defined($value);
> +
> +    $res = eval { parse_property_string($memory_fmt, $value) };
> +    warn $@ if $@;

Why not die? The configuration is wrong after all.
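E.g. (sketch):

my $res = eval { parse_property_string($memory_fmt, $value) };
die $@ if $@; # a broken 'memory' value should be a hard error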

> +    return $res;
> +}
> +
>  sub parse_meta_info {
>      my ($value) = @_;
>  

(...)

> diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
> index e91f906..9115d50 100644
> --- a/PVE/QemuServer/Helpers.pm
> +++ b/PVE/QemuServer/Helpers.pm
> @@ -143,8 +143,7 @@ sub version_cmp {
>  }
>  
>  sub config_aware_timeout {
> -    my ($config, $is_suspended) = @_;
> -    my $memory = $config->{memory};
> +    my ($config, $memory, $is_suspended) = @_;

Why do you add this? Also, when you adapt the callers, you only pass in
$config->{memory} which is already part of $config.

>      my $timeout = 30;
>  
>      # Based on user reported startup time for vm with 512GiB @ 4-5 minutes
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index f8fc534..668508b 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -8,6 +8,11 @@ use PVE::Exception qw(raise raise_param_exc);
>  
>  use PVE::QemuServer;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
> +use base qw(Exporter);
> +
> +our @EXPORT_OK = qw(
> +get_current_memory
> +);
>  
>  my $MAX_NUMA = 8;
>  my $STATICMEM = 1024;
> @@ -63,6 +68,13 @@ my sub get_max_mem {
>      return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
>  }
>  
> +sub get_current_memory{

Style nit: missing space

> +    my ($conf) = @_;

Could take the memory property string as a parameter rather than a whole
config.
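I.e. (sketch):

sub get_current_memory {
    my ($value) = @_;

    my $memory = PVE::QemuServer::parse_memory($value);
    return $memory->{current};
}

and callers then pass the property string, e.g.
get_current_memory($conf->{memory}).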

> +
> +    my $memory = PVE::QemuServer::parse_memory($conf->{memory});
> +    return $memory->{current};
> +}
> +
>  sub get_numa_node_list {
>      my ($conf) = @_;
>      my @numa_map;
> @@ -162,8 +174,7 @@ sub qemu_memory_hotplug {
>      my $sockets = 1;
>      $sockets = $conf->{sockets} if $conf->{sockets};
>  
> -    my $memory = $conf->{memory} || $defaults->{memory};
> -    $value = $defaults->{memory} if !$value;

Now there is no fallback for $value anymore.

> +    my $memory = get_current_memory($conf);
>      return $value if $value == $memory;
>  
>      my $static_memory = $STATICMEM;
> @@ -171,7 +182,7 @@ sub qemu_memory_hotplug {
>  
>      die "memory can't be lower than $static_memory MB" if $value < $static_memory;
>      my $MAX_MEM = get_max_mem($conf);
> -    die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $memory > $MAX_MEM;
> +    die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;

So this check was wrong previously? This should be its own patch.

>  
>      if ($value > $memory) {
>  
> @@ -180,7 +191,7 @@ sub qemu_memory_hotplug {
>  	foreach_dimm($conf, $vmid, $value, $sockets, sub {
>  	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
>  
> -		return if $current_size <= $conf->{memory};
> +		return if $current_size <= get_current_memory($conf);
>  
>  		if ($conf->{hugepages}) {
>  		    $numa_hostmap = get_numa_guest_to_host_map($conf) if !$numa_hostmap;
> @@ -219,7 +230,9 @@ sub qemu_memory_hotplug {
>  		    die $err;
>  		}
>  		#update conf after each succesful module hotplug
> -		$conf->{memory} = $current_size;
> +		my $mem = {};
> +		$mem->{current} = $current_size;
> +		$conf->{memory} = PVE::QemuServer::print_memory($mem);

Future properties other than current are dropped here. You do fix it in
10/10, but ideally it would be handled correctly from the start.

>  		PVE::QemuConfig->write_config($vmid, $conf);
>  	});
>  
> @@ -228,7 +241,8 @@ sub qemu_memory_hotplug {
>  	foreach_reverse_dimm($conf, $vmid, $value, $sockets, sub {
>  	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
>  
> -		return if $current_size >= $conf->{memory};
> +		return if $current_size >= get_current_memory($conf);
> +
>  		print "try to unplug memory dimm $name\n";
>  
>  		my $retry = 0;
> @@ -242,7 +256,9 @@ sub qemu_memory_hotplug {
>  		}
>  
>  		#update conf after each succesful module unplug
> -		$conf->{memory} = $current_size;
> +		my $mem = {};
> +		$mem->{current} = $current_size;
> +		$conf->{memory} = PVE::QemuServer::print_memory($mem);

Same here.


* Re: [pve-devel] [PATCH qemu-server 03/10] config: memory: add 'max' option
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 03/10] config: memory: add 'max' option Alexandre Derumier
@ 2022-12-16 13:38   ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:38 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> max can be multiple of 64GB only.
> 

A quick rationale as to why would be nice to have.

> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer.pm | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index ad69b76..0d5b550 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -267,6 +267,9 @@ my $rng_fmt = {
>      },
>  };
>  
> +
> +my @max_memory_list = map 65536*$_, 1..64;
> +
>  my $memory_fmt = {
>      current => {
>          description => "Current amount of online RAM for the VM in MB. This is the maximum available memory when"
> @@ -276,7 +279,12 @@ my $memory_fmt = {
>          optional => 1,
>          minimum => 16,
>          default => 512,
> -    }
> +    },
> +    max => {
> +        type => 'integer',
> +        optional => 1,

Style nit: wrong indentations

Missing description. Even if it's just something like "Maximum amount of
memory that can be hotplugged", it's better to have something than nothing.

> +	enum => [@max_memory_list],

This feels like an abuse of enum to me. Instead, you could attach a
verifier function to the format and die there with an appropriate error
message.

Or it will be this ;)

memory.max: value '1' does not have a value in the enumeration '65536,
131072, 196608, 262144, 327680, 393216, 458752, 524288, 589824, 655360,
720896, 786432, 851968, 917504, 983040, 1048576, 1114112, 1179648,
1245184, 1310720, 1376256, 1441792, 1507328, 1572864, 1638400, 1703936,
1769472, 1835008, 1900544, 1966080, 2031616, 2097152, 2162688, 2228224,
2293760, 2359296, 2424832, 2490368, 2555904, 2621440, 2686976, 2752512,
2818048, 2883584, 2949120, 3014656, 3080192, 3145728, 3211264, 3276800,
3342336, 3407872, 3473408, 3538944, 3604480, 3670016, 3735552, 3801088,
3866624, 3932160, 3997696, 4063232, 4128768, 4194304'


* Re: [pve-devel] [PATCH qemu-server 04/10] memory: add get_static_mem
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 04/10] memory: add get_static_mem Alexandre Derumier
@ 2022-12-16 13:38   ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:38 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---

Ideally, this would be two patches:
One for introducing the helper
One for handling the new 'max' property (the patches that handle 'max'
could also be squashed together with the one introducing the property,
but it's fine either way).

>  PVE/QemuServer/Memory.pm | 50 ++++++++++++++++++++++++----------------
>  1 file changed, 30 insertions(+), 20 deletions(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 668508b..90e355b 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -15,7 +15,33 @@ get_current_memory
>  );
>  
>  my $MAX_NUMA = 8;
> -my $STATICMEM = 1024;
> +
> +my sub get_static_mem {
> +    my ($conf) = @_;
> +
> +    my $sockets = 1;

Could get the default via load_defaults().
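E.g. (sketch):

my $defaults = PVE::QemuServer::load_defaults();
my $sockets = $conf->{sockets} // $defaults->{sockets};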

> +    $sockets = $conf->{smp} if $conf->{smp}; # old style - no longer iused

Can be dropped. AFAICT, smp was not written to configs since the
beginning of git history.

> +    $sockets = $conf->{sockets} if $conf->{sockets};
> +    my $hotplug_features = PVE::QemuServer::parse_hotplug_features(defined($conf->{hotplug}) ? $conf->{hotplug} : '1');
> +
> +    my $static_memory = 0;
> +    my $memory = PVE::QemuServer::parse_memory($conf->{memory});
> +
> +    if($memory->{max}) {

Style nit: missing space after if

> +	my $dimm_size = $memory->{max} / 64;
> +	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
> +	$static_memory = $dimm_size * $sockets;
> +	$static_memory = 4096 if $static_memory < 4096;
> +    } elsif ($hotplug_features->{memory}) {
> +	#legacy
> +	$static_memory = 1024;
> +	$static_memory = $static_memory * $sockets if ($conf->{hugepages} && $conf->{hugepages} == 1024);
> +    } else {
> +       $static_memory = $memory->{current};

Style nit: wrong indentation


* Re: [pve-devel] [PATCH qemu-server 05/10] memory: get_max_mem: use config memory max
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
@ 2022-12-16 13:39   ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:39 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer/Memory.pm | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 90e355b..b847742 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -91,7 +91,15 @@ my sub get_max_mem {
>      # heuristic: remove 20 bits to get MB and half that as QEMU needs some overhead
>      my $bits_to_max_mem = int(1<<($bits - 21));
>  
> -    return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
> +    my $cpu_max_mem = $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
> +
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> +    if($confmem->{max}) {

Style nit: missing space after if

> +       die "configured memory max can't be bigger than supported cpu architecture $cpu_max_mem MB" if $confmem->{max} > $cpu_max_mem;

Missing newline for the error message.

If we had a validator for the 'memory' format string, we could repeat
this check early, i.e. already when setting the 'max' property. Probably
makes sense to factor out the $cpu_max_mem calculation into its own
helper, so that we can get that value easily for that check.
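For instance (sketch, reusing the existing computation; $bits being the
physical address bit count already determined above):

my sub get_cpu_max_mem {
    my ($bits) = @_;
    # heuristic: remove 20 bits to get MB and half that as QEMU needs some overhead
    my $bits_to_max_mem = int(1<<($bits - 21));
    return $bits_to_max_mem > 4*1024*1024 ? 4*1024*1024 : $bits_to_max_mem;
}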

Style nit: please move the post-if to its own line

> +       return $confmem->{max};
> +    }
> +
> +    return $cpu_max_mem;
>  }
>  
>  sub get_current_memory{


* Re: [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined Alexandre Derumier
@ 2022-12-16 13:39   ` Fiona Ebner
  2022-12-19 12:05     ` DERUMIER, Alexandre
  0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:39 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 20:27, Alexandre Derumier wrote:
> default kernel vhost config only support 64 slots by default,
> for performance since 2015.
> 
> Original memory hotplug code was done before, using qemu
> max supported 255 slots.
> 
> To reach max mem (4TB), we used incremental dimm size.
> 
> Instead of dynamic memory size, use 1 static dimm size, compute
> from max memory/64.
> 
> Fix:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3446
> https://bugzilla.proxmox.com/show_bug.cgi?id=1426
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer/Memory.pm | 31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index b847742..8bbbf07 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -140,14 +140,15 @@ sub foreach_dimm{
>      my ($conf, $vmid, $memory, $sockets, $func) = @_;
>  
>      my $dimm_id = 0;
> -    my $current_size = 0;
> +    my $current_size = get_static_mem($conf);
>      my $dimm_size = 0;
>  
> -    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
> -	$current_size = 1024 * $sockets;
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> +    if ($confmem->{max}) {
> +       $dimm_size = $confmem->{max}/64;

Style nit: missing spaces around division operator

> +    } elsif($conf->{hugepages} && $conf->{hugepages} == 1024) {
>  	$dimm_size = 1024;
>      } else {
> -	$current_size = 1024;
>  	$dimm_size = 512;
>      }
>  
> @@ -164,7 +165,7 @@ sub foreach_dimm{
>  	    &$func($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory);
>  	    return  $current_size if $current_size >= $memory;
>  	}
> -	$dimm_size *= 2;
> +	$dimm_size *= 2 if !$confmem->{max};
>      }
>  }
>  
> @@ -175,7 +176,12 @@ sub foreach_reverse_dimm {
>      my $current_size = 0;
>      my $dimm_size = 0;
>  
> -    if($conf->{hugepages} && $conf->{hugepages} == 1024) {
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> +    if ($confmem->{max}) {
> +       $dimm_id = 63;
> +       $current_size = $confmem->{max};
> +       $dimm_size = $confmem->{max}/64;
> +    } elsif ($conf->{hugepages} && $conf->{hugepages} == 1024) {
>  	$current_size = 8355840;
>  	$dimm_size = 131072;
>      } else {
> @@ -197,6 +203,7 @@ sub foreach_reverse_dimm {
>  	    return  $current_size if $current_size <= $memory;
>  	}
>  	$dimm_size /= 2;

This line should be removed?

> +	$dimm_size /= 2 if !$confmem->{max};
>      }
>  }
>  
> @@ -263,7 +270,7 @@ sub qemu_memory_hotplug {
>  		    die $err;
>  		}
>  		#update conf after each succesful module hotplug
> -		my $mem = {};
> +		my $mem = { max => $MAX_MEM };
>  		$mem->{current} = $current_size;
>  		$conf->{memory} = PVE::QemuServer::print_memory($mem);
>  		PVE::QemuConfig->write_config($vmid, $conf);
> @@ -289,7 +296,7 @@ sub qemu_memory_hotplug {
>  		}
>  
>  		#update conf after each succesful module unplug
> -		my $mem = {};
> +		my $mem = { max => $MAX_MEM };
>  		$mem->{current} = $current_size;
>  		$conf->{memory} = PVE::QemuServer::print_memory($mem);
>  
> @@ -322,8 +329,9 @@ sub config {
>      my $memory = get_current_memory($conf);
>  
>      my $static_memory = get_static_mem($conf);
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
>  
> -    if ($hotplug_features->{memory}) {
> +    if ($hotplug_features->{memory} || defined($confmem->{max})) {

So setting 'max' auto-attaches the dimms[0], but without memory hotplug
enabled, those are useless? Or am I missing something? I feel like we
can just keep the conditional here and below[0] as-is.

>  	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
>  	my $MAX_MEM = get_max_mem($conf);
>  	die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory > $MAX_MEM;
> @@ -334,7 +342,8 @@ sub config {
>  	}
>  
>  	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
> -	push @$cmd, '-m', "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
> +	my $slots = $confmem->{max} ? 64 : 255;
> +	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
>  
>      } else {
>  	push @$cmd, '-m', $static_memory;
> @@ -403,7 +412,7 @@ sub config {
>  	}
>      }
>  
> -    if ($hotplug_features->{memory}) {
> +    if ($hotplug_features->{memory} || $confmem->{max}) {

[0]: here

>  	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
>  	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
>  


* Re: [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
@ 2022-12-16 13:42   ` Fiona Ebner
       [not found]     ` <7b9306c429440304fb37601ece5ffdbad0b90e5f.camel@groupe-cyllene.com>
  0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:42 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 at 20:27, Alexandre Derumier wrote:
> a 4GB static memory is needed for DMA+boot memory, as this memory
> is almost always un-unpluggeable.
> 
> 1 virtio-mem pci device is setup for each numa node on pci.4 bridge
> 
> virtio-mem use a fixed blocksize with 32000 blocks
> Blocksize is computed from the maxmemory-4096/32000 with a minimum of
> 2MB to map THP.
> (lower blocksize = more chance to unplug memory).
> 
> fixes:
> https://bugzilla.proxmox.com/show_bug.cgi?id=931

Comment 7 talks about Windows, and virtio-mem is not supported there at
the moment, so I don't think we should consider it fixed ;)

> https://bugzilla.proxmox.com/show_bug.cgi?id=2949
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer.pm        |  8 +++-
>  PVE/QemuServer/Memory.pm | 98 +++++++++++++++++++++++++++++++++++++---
>  PVE/QemuServer/PCI.pm    |  8 ++++
>  3 files changed, 106 insertions(+), 8 deletions(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 0d5b550..43fab29 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -285,6 +285,12 @@ my $memory_fmt = {
>          optional => 1,
>  	enum => [@max_memory_list],
>      },
> +    virtio => {
> +	description => "enable virtio-mem memory",

We really should mention that it's a technology preview, and that it
only works for Linux >=5.8 guests currently:
https://virtio-mem.gitlab.io/user-guide/#important-current-limitations
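A possible wording for the property, as a sketch (exact phrasing up for
debate):

virtio => {
    description => "Use virtio-mem devices for memory hotplug (technology preview)."
	." Only works for guests running Linux >= 5.8 currently.",
    type => 'boolean',
    optional => 1,
    default => 0,
},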

> +	type => 'boolean',
> +	optional => 1,
> +	default => 0,
> +    },
>  };
>  
>  my $meta_info_fmt = {
> @@ -3898,7 +3904,7 @@ sub config_to_command {
>  	push @$cmd, get_cpu_options($conf, $arch, $kvm, $kvm_off, $machine_version, $winversion, $gpu_passthrough);
>      }
>  
> -    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd);
> +    PVE::QemuServer::Memory::config($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd, $devices, $bridges, $arch, $machine_type);
>  
>      push @$cmd, '-S' if $conf->{freeze};
>  
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 8bbbf07..70ab65a 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -8,6 +8,8 @@ use PVE::Exception qw(raise raise_param_exc);
>  
>  use PVE::QemuServer;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
> +use PVE::QemuServer::PCI qw(print_pci_addr);
> +
>  use base qw(Exporter);
>  
>  our @EXPORT_OK = qw(
> @@ -27,7 +29,9 @@ my sub get_static_mem {
>      my $static_memory = 0;
>      my $memory = PVE::QemuServer::parse_memory($conf->{memory});
>  
> -    if($memory->{max}) {
> +    if ($memory->{virtio}) {
> +	$static_memory = 4096;
> +    } elsif ($memory->{max}) {
>  	my $dimm_size = $memory->{max} / 64;
>  	#static mem can't be lower than 4G and lower than 1 dimmsize by socket
>  	$static_memory = $dimm_size * $sockets;
> @@ -102,6 +106,24 @@ my sub get_max_mem {
>      return $cpu_max_mem;
>  }
>  
> +my sub get_virtiomem_block_size {
> +    my ($conf) = @_;
> +
> +    my $MAX_MEM = get_max_mem($conf);
> +    my $static_memory = get_static_mem($conf);
> +    my $memory = get_current_memory($conf);
> +    #virtiomem can map 32000 block size. try to use lowerst blocksize, lower = more chance to unplug memory.

Style nit: line too long

> +    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
> +    #round next power of 2
> +    $blocksize = 2**(int(log($blocksize)/log(2))+1);

What if log($blocksize)/log(2) is exactly an integer? Do we still want
to add 1 then? If not, please use ceil(...) instead of int(...)+1. Well,
I guess it can't happen in practice by what values are possible for
$MAX_MEM and $static_memory, but still.
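For illustration: with max=512G and the 4096 MB of static memory,
($MAX_MEM - $static_memory) / 32000 = (524288 - 4096) / 32000 = 16.256,
so int(log(16.256)/log(2)) + 1 = 5 and the blocksize becomes 2**5 = 32 MB.
If the quotient were exactly 16, int(...)+1 would still give 32 MB,
although 16 MB * 32000 blocks would already cover the range; ceil(...)
would pick 16 MB there.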

> +    #2MB is the minimum to be aligned with THP
> +    $blocksize = 2 if $blocksize < 2;

Nit: $blocksize is at least 2**1 after the previous calculation, so this
isn't really needed.

> +
> +    die "memory size need to be multiple of $blocksize MB when virtio-mem is enabled" if ($memory % $blocksize != 0);

Missing newline in error message.

Style nit: line too long

> +
> +    return $blocksize;
> +}
> +
>  sub get_current_memory{
>      my ($conf) = @_;
>  
> @@ -224,7 +246,41 @@ sub qemu_memory_hotplug {
>      my $MAX_MEM = get_max_mem($conf);
>      die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
>  
> -    if ($value > $memory) {
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> +
> +    if ($confmem->{virtio}) {
> +	my $blocksize = get_virtiomem_block_size($conf);
> +	my $requested_size = ($value - $static_memory) / $sockets * 1024 * 1024;
> +	my $totalsize = $static_memory;
> +	my $err = undef;
> +
> +	for (my $i = 0; $i < $sockets; $i++)  {
> +
> +	    my $id = "virtiomem$i";
> +	    my $retry = 0;
> +	    mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/$id", property => "requested-size", value => int($requested_size));

I'd eval the mon_cmd's and also catch errors there.

> +
> +	    my $size = 0;
> +	    while (1) {
> +		sleep 1;

If there really is no good alternative to this querying loop, I'd rather
issue the qom-set command for all virtio-mem devices first, and then do
the loop. Maybe also move the sleep to the end of the loop.
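A rough sketch of that ordering, reusing $vmid, $sockets and
int($requested_size) from the patch (the retry limit and error wording
are just for illustration):

# first request the new size on every virtio-mem device ...
for my $i (0 .. $sockets - 1) {
    eval {
	mon_cmd($vmid, 'qom-set', path => "/machine/peripheral/virtiomem$i",
	    property => "requested-size", value => int($requested_size));
    };
    die "qom-set on virtiomem$i failed - $@" if $@;
}
# ... then poll them all together, sleeping at the end of each round
my $retry = 0;
while (1) {
    my @pending = grep {
	my $size = eval { mon_cmd($vmid, 'qom-get',
	    path => "/machine/peripheral/virtiomem$_", property => "size") };
	!defined($size) || $size != int($requested_size);
    } (0 .. $sockets - 1);
    last if !@pending;
    die "timeout waiting for virtio-mem devices to reach the requested size\n"
	if $retry++ > 5;
    sleep 1;
}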

> +		$size = mon_cmd($vmid, 'qom-get', path => "/machine/peripheral/$id", property => "size");
> +		$err = 1 if $retry > 5;
> +		last if $size eq $requested_size || $retry > 5;

I think ($value - $static_memory) doesn't have to be a multiple of
$sockets, so $requested_size can be fractional, and this should rather
be int($requested_size), which is what you set above.

> +		$retry++;
> +	    }
> +	    $totalsize += ($size / 1024 / 1024 );
> +	}
> +	#update conf after each succesfull change
> +	if($err) {

But this is only done in the error case, not after each successful change.

> +	    my $mem = { max => $MAX_MEM, virtio => 1};
> +	    $mem->{current} = $totalsize;

Nit: int($totalsize) just to be sure?

> +	    $conf->{memory} = PVE::QemuServer::print_memory($mem);
> +	    PVE::QemuConfig->write_config($vmid, $conf);
> +	    raise_param_exc({ 'memory' => "error modify virtio memory" }) if $err;

It's not necessarily a parameter issue, please use die instead.
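E.g. something of this shape (wording is mine):

die "virtio-mem: failed to set the requested memory size on all devices\n" if $err;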

> +	}
> +	return $totalsize;

The other branches don't (explicitly) return anything.

> +
> +    } elsif ($value > $memory) {
>  
>  	my $numa_hostmap;
>  
> @@ -324,14 +380,15 @@ sub qemu_dimm_list {
>  }
>  
>  sub config {
> -    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd) = @_;
> +    my ($conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd, $devices, $bridges, $arch, $machine_type) = @_;
>  
>      my $memory = get_current_memory($conf);
>  
>      my $static_memory = get_static_mem($conf);
> +
>      my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
>  
> -    if ($hotplug_features->{memory} || defined($confmem->{max})) {
> +    if ($hotplug_features->{memory} || defined($confmem->{max}) || defined($confmem->{virtio})) {

Again, should we even bother attaching the devices if hotplug is not
enabled?

>  	die "NUMA needs to be enabled for memory hotplug\n" if !$conf->{numa};
>  	my $MAX_MEM = get_max_mem($conf);
>  	die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory > $MAX_MEM;
> @@ -342,8 +399,12 @@ sub config {
>  	}
>  
>  	die "minimum memory must be ${static_memory}MB\n" if($memory < $static_memory);
> +
> +	my $cmdstr = "size=${static_memory}";
>  	my $slots = $confmem->{max} ? 64 : 255;
> -	push @$cmd, '-m', "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
> +	$cmdstr .= ",slots=$slots" if !$confmem->{'virtio'};
> +	$cmdstr .= ",maxmem=${MAX_MEM}M";
> +	push @$cmd, '-m', $cmdstr;
>  
>      } else {
>  	push @$cmd, '-m', $static_memory;
> @@ -412,7 +473,26 @@ sub config {
>  	}
>      }
>  
> -    if ($hotplug_features->{memory} || $confmem->{max}) {
> +    if ($confmem->{'virtio'}) {
> +	my $MAX_MEM = get_max_mem($conf);
> +	my $node_maxmem = ($MAX_MEM - $static_memory) / $sockets;
> +	my $node_mem = ($memory - $static_memory) / $sockets;
> +	my $blocksize = get_virtiomem_block_size($conf);
> +
> +	for (my $i = 0; $i < $sockets; $i++)  {
> +
> +	    my $id = "virtiomem$i";
> +	    my $mem_object = print_mem_object($conf, "mem-$id", $node_maxmem);
> +	    push @$cmd, "-object" , "$mem_object,reserve=off";
> +
> +	    my $pciaddr = print_pci_addr($id, $bridges, $arch, $machine_type);

Can we rather handle the PCI address printing in config_to_command() and
only pass in the addresses here? That would also avoid the PCI module
usage and the new "one-time usage" parameters passed to Memory::config().

Maybe have a small helper in here, that just returns the needed $ids
depending on the config. Then in config_to_command(), call that helper,
print the PCI addresses, then call Memory::config(..., { $id1 =>
$address1, $id2 => $address2, ... }).

Might also not be the nicest, but at least be a little less cluttering
IMHO. But feel free to come up with something better or keep it as-is if
you really want to ;)
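A minimal sketch of that split (virtio_mem_dev_ids() is a hypothetical
helper name, and the extra Memory::config() parameter is an assumption):

# in PVE/QemuServer/Memory.pm - hypothetical helper
sub virtio_mem_dev_ids {
    my ($conf) = @_;
    my $sockets = $conf->{sockets} // 1;
    return [ map { "virtiomem$_" } (0 .. $sockets - 1) ];
}

# in config_to_command(): print the PCI addresses and hand them over
my $virtiomem_pci_addrs = {};
for my $id (@{ PVE::QemuServer::Memory::virtio_mem_dev_ids($conf) }) {
    $virtiomem_pci_addrs->{$id} = print_pci_addr($id, $bridges, $arch, $machine_type);
}
PVE::QemuServer::Memory::config(
    $conf, $vmid, $sockets, $cores, $defaults, $hotplug_features, $cmd,
    $virtiomem_pci_addrs);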

> +	    my $mem_device = "virtio-mem-pci,block-size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-$id,node=$i$pciaddr";
> +	    $mem_device .= ",prealloc=on" if $conf->{hugepages};

So prealloc=on for the device, but not prealloc=yes for the object
below[0]. Would you mind explaining it to me? I just found the part
mentioned for v7.0 here
https://virtio-mem.gitlab.io/user-guide/user-guide-qemu.html#updates

> +	    push @$devices, "-device", $mem_device;

The dimm devices in the other branch are not pushed onto $devices, so
this feels inconsistent. Why not add it onto $cmd too?

> +	}
> +
> +    } elsif ($hotplug_features->{memory} || $confmem->{max}) {
> +
>  	foreach_dimm($conf, $vmid, $memory, $sockets, sub {
>  	    my ($conf, $vmid, $name, $dimm_size, $numanode, $current_size, $memory) = @_;
>  
> @@ -430,12 +510,16 @@ sub config {
>  sub print_mem_object {
>      my ($conf, $id, $size) = @_;
>  
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> +
>      if ($conf->{hugepages}) {
>  
>  	my $hugepages_size = hugepages_size($conf, $size);
>  	my $path = hugepages_mount_path($hugepages_size);
>  
> -	return "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on,prealloc=yes";
> +	my $object = "memory-backend-file,id=$id,size=${size}M,mem-path=$path,share=on";
> +	$object .= ",prealloc=yes" if !$confmem->{virtio};

[0]

> +	return $object;
>      } else {
>  	return "memory-backend-ram,id=$id,size=${size}M";
>      }
> diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
> index a18b974..0187c74 100644
> --- a/PVE/QemuServer/PCI.pm
> +++ b/PVE/QemuServer/PCI.pm
> @@ -249,6 +249,14 @@ sub get_pci_addr_map {
>  	'scsihw2' => { bus => 4, addr => 1 },
>  	'scsihw3' => { bus => 4, addr => 2 },
>  	'scsihw4' => { bus => 4, addr => 3 },
> +	'virtiomem0' => { bus => 4, addr => 4 },
> +	'virtiomem1' => { bus => 4, addr => 5 },
> +	'virtiomem2' => { bus => 4, addr => 6 },
> +	'virtiomem3' => { bus => 4, addr => 7 },
> +	'virtiomem4' => { bus => 4, addr => 8 },
> +	'virtiomem5' => { bus => 4, addr => 9 },
> +	'virtiomem6' => { bus => 4, addr => 10 },
> +	'virtiomem7' => { bus => 4, addr => 11 },

What if $conf->{sockets} > 8? Maybe mention the limitation in the
description of the 'virtio' property in the 'memory' string. Is the plan
to just add on more virtiomem PCI devices in the future?

>      } if !defined($pci_addr_map);
>      return $pci_addr_map;
>  }




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests Alexandre Derumier
@ 2022-12-16 13:42   ` Fiona Ebner
  2022-12-19 14:48     ` Thomas Lamprecht
  0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:42 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 at 20:27, Alexandre Derumier wrote:
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  test/cfg2cmd/memory-virtio-hugepages-1G.conf  | 11 ++++++
>  .../memory-virtio-hugepages-1G.conf.cmd       | 35 +++++++++++++++++++
>  test/cfg2cmd/memory-virtio-max.conf           | 10 ++++++
>  test/cfg2cmd/memory-virtio-max.conf.cmd       | 35 +++++++++++++++++++
>  test/cfg2cmd/memory-virtio.conf               | 10 ++++++
>  test/cfg2cmd/memory-virtio.conf.cmd           | 35 +++++++++++++++++++
>  6 files changed, 136 insertions(+)
>  create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf
>  create mode 100644 test/cfg2cmd/memory-virtio-hugepages-1G.conf.cmd
>  create mode 100644 test/cfg2cmd/memory-virtio-max.conf
>  create mode 100644 test/cfg2cmd/memory-virtio-max.conf.cmd
>  create mode 100644 test/cfg2cmd/memory-virtio.conf
>  create mode 100644 test/cfg2cmd/memory-virtio.conf.cmd
> 
> diff --git a/test/cfg2cmd/memory-virtio-hugepages-1G.conf b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
> new file mode 100644
> index 0000000..616b98e
> --- /dev/null
> +++ b/test/cfg2cmd/memory-virtio-hugepages-1G.conf
> @@ -0,0 +1,11 @@
> +# TEST: virtio-mem with memorymax defined

memorymax not defined. I think get_max_mem() and thus the test then
depends on get_host_phys_address_bits(), i.e. on which host it is run?

> +# QEMU_VERSION: 3.0

It wasn't supported on QEMU 3.0 ;) But it raises the question of whether
we should introduce version guards for the feature. Not sure what our
current policy is when it comes to actively-opt-in features. Especially
if it's still a technology preview.

> +cores: 2
> +memory: 32768,virtio=1
> +name: simple
> +numa: 1
> +ostype: l26
> +smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
> +sockets: 2
> +vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
> +hugepages: 1024
> \ No newline at end of file

(...)

> diff --git a/test/cfg2cmd/memory-virtio.conf b/test/cfg2cmd/memory-virtio.conf
> new file mode 100644
> index 0000000..f88f6f5
> --- /dev/null
> +++ b/test/cfg2cmd/memory-virtio.conf
> @@ -0,0 +1,10 @@
> +# TEST: virtio-mem with memorymax defined

same

> +# QEMU_VERSION: 3.0

same

> +cores: 2
> +memory: 32768,virtio=1
> +name: simple
> +numa: 1
> +ostype: l26
> +smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
> +sockets: 2
> +vmgenid: c773c261-d800-4348-9f5d-167fadd53cf8
> \ No newline at end of file




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 10/10] memory: fix hotplug with virtiomem && maxmem
  2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 10/10] memory: fix hotplug with virtiomem && maxmem Alexandre Derumier
@ 2022-12-16 13:42   ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-16 13:42 UTC (permalink / raw)
  To: pve-devel, aderumier

On 09.12.22 at 20:27, Alexandre Derumier wrote:
> diff --git a/PVE/QemuServer/Memory.pm b/PVE/QemuServer/Memory.pm
> index 70ab65a..84a9126 100644
> --- a/PVE/QemuServer/Memory.pm
> +++ b/PVE/QemuServer/Memory.pm
> @@ -9,6 +9,7 @@ use PVE::Exception qw(raise raise_param_exc);
>  use PVE::QemuServer;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer::PCI qw(print_pci_addr);
> +use PVE::GuestHelpers qw(safe_string_ne safe_num_ne safe_boolean_ne);
>  
>  use base qw(Exporter);
>  
> @@ -230,24 +231,32 @@ sub foreach_reverse_dimm {
>  }
>  
>  sub qemu_memory_hotplug {
> -    my ($vmid, $conf, $defaults, $opt, $value) = @_;
> +    my ($vmid, $conf, $new_mem) = @_;
>  
> -    return $value if !PVE::QemuServer::check_running($vmid);
> +    return if !PVE::QemuServer::check_running($vmid);

Unrelated change not belonging in this patch

>  
> -    my $sockets = 1;
> -    $sockets = $conf->{sockets} if $conf->{sockets};
> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> +
> +    # skip non hotpluggable value
> +    if (safe_string_ne($new_mem->{max}, $confmem->{max}) ||
> +	safe_boolean_ne($new_mem->{virtio}, $confmem->{virtio})) {
> +	die "skip\n";

This special error "return" value should really not be used here. It's
just too opaque in the long run. This should be done by the
callers and actually belongs in the patches introducing 'virtio' and
'max' (handling) respectively.

> +    }
> +
> +    my $value = $new_mem->{current};
> +    my $memory = $confmem->{current};
>  
> -    my $memory = get_current_memory($conf);
>      return $value if $value == $memory;
>  
> +    my $sockets = 1;
> +    $sockets = $conf->{sockets} if $conf->{sockets};
> +
>      my $static_memory = get_static_mem($conf);
>  
>      die "memory can't be lower than $static_memory MB" if $value < $static_memory;
>      my $MAX_MEM = get_max_mem($conf);
>      die "you cannot add more memory than max mem $MAX_MEM MB!\n" if $value > $MAX_MEM;
>  
> -    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> -
>      if ($confmem->{virtio}) {
>  	my $blocksize = get_virtiomem_block_size($conf);
>  	my $requested_size = ($value - $static_memory) / $sockets * 1024 * 1024;




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem
  2022-12-16 13:38 ` [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Fiona Ebner
@ 2022-12-19 11:31   ` DERUMIER, Alexandre
  0 siblings, 0 replies; 32+ messages in thread
From: DERUMIER, Alexandre @ 2022-12-19 11:31 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

> > 

Hi Fiona, thanks for the review

> > 
> The general ideas looks fine to me and my basic testing seemed fine
> too,
> but most patches could use a few improvements. The biggest issue is
> that
> getting static information in HA manager is not adapted to the
> change.
> 
yes, I known, I wanted a review first.


> The virtio feature should be marked as technology preview and
> mentioned
> that it only works for guests running Linux >= 5.8 [0].
> 
yes. (and from my test, it's more 5.13~5.15 to have something working
correctly)

> IMHO, it'd be nicer to add the property string handling/registering
> in
> the Memory.pm module directly, so that all is contained in one place.

ok, I'll move it,no problem.

> 
> I also think that using a validator for the format might be worth it.
> 
> Also, the tests should be made more host-independent, but for the
> NUMA
> node issue 01/10 that might require a bit of rework, to be able to
> mock
> the relevant parts.
> 
I'll rework the tests, I didn't known about the host numa detection, as
I'm always working on a numa server ;) .I'll mock host max memory
detection too.
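Something like this with Test::MockModule should do, assuming the helper
Fiona mentioned (get_host_phys_address_bits()) lives in
PVE::QemuServer::Memory (the mocked value is arbitrary):

use Test::MockModule;

my $memory_module = Test::MockModule->new('PVE::QemuServer::Memory');
# pretend every test host exposes 46 physical address bits (64 TiB)
$memory_module->mock(get_host_phys_address_bits => sub { return 46; });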


> Should the virtio feature be QEMU-version-guarded? It's explicitly
> opt-in and technology preview, so maybe not really needed?
> 
I think it's ok, it's an opt-in option.


> See the individual replies on patches for details.
> 
I'll reply in the patches

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined
  2022-12-16 13:39   ` Fiona Ebner
@ 2022-12-19 12:05     ` DERUMIER, Alexandre
  2022-12-19 12:28       ` Fiona Ebner
  0 siblings, 1 reply; 32+ messages in thread
From: DERUMIER, Alexandre @ 2022-12-19 12:05 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner


> >      my $static_memory = get_static_mem($conf);
> > +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
> >  
> > -    if ($hotplug_features->{memory}) {
> > +    if ($hotplug_features->{memory} || defined($confmem->{max})) {
> 
> So setting 'max' auto-attaches the dimms[0], but without memory
> hotplug
> enabled, those are useless? Or am I missing something? I feel like we
> can just keep the conditional here and below[0] as-is.
> 

yes, I was not sure here (same for virtio-mem).

The virtiomem && maxmem values only make sense if memory hotplug is
enabled, but for example, for cpu hotplug, we can define a specific
vcpu number without the hotplug cpu option enabled.

So I don't know if we should conditionally change the hardware topology
depending on a hotplug option.
(I mean, only the hotplug action itself should be blocked/allowed by
the option)



> >         die "NUMA needs to be enabled for memory hotplug\n" if
> > !$conf->{numa};
> >         my $MAX_MEM = get_max_mem($conf);
> >         die "Total memory is bigger than ${MAX_MEM}MB\n" if $memory
> > > $MAX_MEM;
> > @@ -334,7 +342,8 @@ sub config {
> >         }
> >  
> >         die "minimum memory must be ${static_memory}MB\n"
> > if($memory < $static_memory);
> > -       push @$cmd, '-m',
> > "size=${static_memory},slots=255,maxmem=${MAX_MEM}M";
> > +       my $slots = $confmem->{max} ? 64 : 255;
> > +       push @$cmd, '-m',
> > "size=${static_memory},slots=$slots,maxmem=${MAX_MEM}M";
> >  
> >      } else {
> >         push @$cmd, '-m', $static_memory;
> > @@ -403,7 +412,7 @@ sub config {
> >         }
> >      }
> >  
> > -    if ($hotplug_features->{memory}) {
> > +    if ($hotplug_features->{memory} || $confmem->{max}) {
> 
> [0]: here
> 
> >         foreach_dimm($conf, $vmid, $memory, $sockets, sub {
> >             my ($conf, $vmid, $name, $dimm_size, $numanode,
> > $current_size, $memory) = @_;
> >  
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined
  2022-12-19 12:05     ` DERUMIER, Alexandre
@ 2022-12-19 12:28       ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2022-12-19 12:28 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, aderumier

On 19.12.22 at 13:05, DERUMIER, Alexandre wrote:
> 
>>>      my $static_memory = get_static_mem($conf);
>>> +    my $confmem = PVE::QemuServer::parse_memory($conf->{memory});
>>>  
>>> -    if ($hotplug_features->{memory}) {
>>> +    if ($hotplug_features->{memory} || defined($confmem->{max})) {
>>
>> So setting 'max' auto-attaches the dimms[0], but without memory
>> hotplug
>> enabled, those are useless? Or am I missing something? I feel like we
>> can just keep the conditional here and below[0] as-is.
>>
> 
> yes, I was not sure here (same for virtio-mem).
> 
> The virtiomem && maxmem values only make sense if memory hotplug is
> enabled, but for example, for cpu hotplug, we can define a specific
> vcpu number without the hotplug cpu option enabled.
> 
> So I don't know if we should conditionally change the hardware topology
> depending on a hotplug option.
> (I mean, only the hotplug action itself should be blocked/allowed by
> the option)

It's just that the 'max' setting doesn't explicitly describe the
hardware topology, but sure, the setting only makes sense for hotplug
after all and then a different hardware topology is needed. And with the
virtio setting, one can argue that it does describe the hardware
topology. So feel free to keep the conditionals as-is in your patches :)




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests
  2022-12-16 13:42   ` Fiona Ebner
@ 2022-12-19 14:48     ` Thomas Lamprecht
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Lamprecht @ 2022-12-19 14:48 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner, aderumier

On 16/12/2022 14:42, Fiona Ebner wrote:
> It wasn't supported on QEMU 3.0 😉 But it raises the question of whether
> we should introduce version guards for the feature. Not sure what our
> current policy is when it comes to actively-opt-in features. Especially
> if it's still a technology preview.

Yeah, with the whole thing being opt-in it isn't that bad. Encoding them
can still help though, especially to guard against some dev or admin
mishaps, e.g., on migration to an older node. So IMO not a must, but
quite probably the cleaner and better approach, so I would only not do
it if it's a big PITA to handle (it shouldn't really be).
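A sketch of such a guard, assuming the usual min_version() helper from
PVE::QemuServer::Helpers and picking QEMU 7.0 as the (assumed) minimum
version:

die "virtio-mem is only supported with QEMU machine version >= 7.0\n"
    if $confmem->{virtio} && !min_version($machine_version, 7, 0);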




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support
       [not found]     ` <7b9306c429440304fb37601ece5ffdbad0b90e5f.camel@groupe-cyllene.com>
@ 2022-12-20 10:26       ` Fiona Ebner
       [not found]         ` <b354aab5e4791e7c862b15470ca24c273b8030be.camel@groupe-cyllene.com>
  0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2022-12-20 10:26 UTC (permalink / raw)
  To: DERUMIER, Alexandre, Proxmox VE development discussion

On 20.12.22 at 10:50, DERUMIER, Alexandre wrote:
>>
>>> +    my $blocksize = ($MAX_MEM - $static_memory) / 32000;
>>> +    #round next power of 2
>>> +    $blocksize = 2**(int(log($blocksize)/log(2))+1);
>>
>> What if log($blocksize)/log(2) is exactly an integer? Do we still
>> want
>> to add 1 then? If not, please use ceil(...) instead of int(...)+1.
>> Well,
>> I guess it can't happen in practice by what values are possible for
>> $MAX_MEM and $static_memory, but still.
>>
> The math seems fine for getting a power of 2 as blocksize:
> 
> log(1)/log(2):0  blocksize:2
> log(2)/log(2):1  blocksize:4

Yes, but here you'd use 4, when 2 would also be good. If that doesn't
matter, feel free to keep it as-is; I was just wondering. It's likely
not even relevant in practice, because with the values $MAX_MEM and
$static_memory can take, I'm not sure we can even get integers for the
argument to the first log()?

> with a ceil, the blocksize doesn't start at 2:
> 
> log(1)/log(2):0  blocksize:1
> log(2)/log(2):1  blocksize:2

Isn't ($MAX_MEM - $static_memory) / 32000 always strictly greater than
1? And if it could get smaller than 1, we also might have issues with
the int()+1 approach, because the result of the first log() will become
negative.

To be on the safe side we could just move the minimum check up:

my $blocksize = ($MAX_MEM - $static_memory) / 32000;
$blocksize = 2 if $blocksize < 2;
$blocksize = 2**(ceil(log($blocksize)/log(2)));
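(Note: ceil() is not a Perl builtin, so this additionally assumes a
use POSIX qw(ceil); import in the module.)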

>>> +    #2MB is the minimum to be aligned with THP
>>> +    $blocksize = 2 if $blocksize < 2;
>>
>> Nit: $blocksize is at least 2**1 after the previous caluclation, so
>> this
>> isn't really needed.
> yes,indeed.
> 
>>
>>> +           my $mem_device = "virtio-mem-pci,block-
>>> size=${blocksize}M,requested-size=${node_mem}M,id=$id,memdev=mem-
>>> $id,node=$i$pciaddr";
>>> +           $mem_device .= ",prealloc=on" if $conf->{hugepages};
>>
>> So prealloc=on for the device, but not prealloc=yes for the object
>> below[0]. Would you mind explaining it to me? I just found the part
>> mentioned for v7.0 here
>>
> oh, yes, sorry.
> Just look here for details:
> https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg00902.html
> "A common use case for hugetlb will be using "reserve=off,prealloc=off"
> for
> the memory backend and "prealloc=on" for the virtio-mem device. This
> way, no huge pages will be reserved for the process, but we can recover
> if there are no actual huge pages when plugging memory. Libvirt is
> already prepared for this.
> "
> 

Thanks! Could you please add it to the commit message?

>>> diff --git a/PVE/QemuServer/PCI.pm b/PVE/QemuServer/PCI.pm
>>> index a18b974..0187c74 100644
>>> --- a/PVE/QemuServer/PCI.pm
>>> +++ b/PVE/QemuServer/PCI.pm
>>> @@ -249,6 +249,14 @@ sub get_pci_addr_map {
>>>         'scsihw2' => { bus => 4, addr => 1 },
>>>         'scsihw3' => { bus => 4, addr => 2 },
>>>         'scsihw4' => { bus => 4, addr => 3 },
>>> +       'virtiomem0' => { bus => 4, addr => 4 },
>>> +       'virtiomem1' => { bus => 4, addr => 5 },
>>> +       'virtiomem2' => { bus => 4, addr => 6 },
>>> +       'virtiomem3' => { bus => 4, addr => 7 },
>>> +       'virtiomem4' => { bus => 4, addr => 8 },
>>> +       'virtiomem5' => { bus => 4, addr => 9 },
>>> +       'virtiomem6' => { bus => 4, addr => 10 },
>>> +       'virtiomem7' => { bus => 4, addr => 11 },
>>
>> What if $conf->{sockets} > 8? Maybe mention the limitation in the
>> description of the 'virtio' property in the 'memory' string. Is the
>> plan
>> to just add on more virtiomem PCI devices in the future?
>>
> Well, do we really support more than 8 sockets?
> I'm not sure the current numa implementation works with more than
> 8 sockets, and in the real world I have never seen servers with more
> than 8 sockets. (I think super-expensive hardware with 16 - 32 sockets
> is the max)
> AFAIK, the main use case of sockets/numa nodes is auto numabalancing,
> to try to map the qemu numa nodes to physical numa nodes.
> 
> If needed, we could add a dedicated pci bridge (like for virtioscsi),
> with 32 devices,
> and use something like virtiomem0 = addr=1, virtiomem1 = addr=2
> 
> 

Okay, makes sense. You never know what the future brings, but if it'll
take some time until we need to add more (and not that much more) I
guess it's fine.




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support
       [not found]         ` <b354aab5e4791e7c862b15470ca24c273b8030be.camel@groupe-cyllene.com>
@ 2022-12-20 12:31           ` DERUMIER, Alexandre
  0 siblings, 0 replies; 32+ messages in thread
From: DERUMIER, Alexandre @ 2022-12-20 12:31 UTC (permalink / raw)
  To: pve-devel, f.ebner

On Tuesday 20 December 2022 at 13:16 +0100, alexandre derumier wrote:
> I totally forgot that mem was in bytes,
> 
> so the minimum blocksize is 2048

oh, sorry, it's in MB in the configuration, not bytes.
(didn't sleep enough last night ^_^ )


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 02/10] add memory parser
  2022-12-16 13:38   ` Fiona Ebner
@ 2023-01-02 10:50     ` DERUMIER, Alexandre
  2023-01-05 12:47       ` Fiona Ebner
  2023-01-02 11:23     ` DERUMIER, Alexandre
  1 sibling, 1 reply; 32+ messages in thread
From: DERUMIER, Alexandre @ 2023-01-02 10:50 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

Hi Fiona,


I'm beginning to rework the patch series.

If it's ok for you, I'll split it into different patch series: one to add
the parser, one for maxmem and one for virtio.

About this comment:

On Friday 16 December 2022 at 14:38 +0100, Fiona Ebner wrote:
> > diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
> > index e91f906..9115d50 100644
> > --- a/PVE/QemuServer/Helpers.pm
> > +++ b/PVE/QemuServer/Helpers.pm
> > @@ -143,8 +143,7 @@ sub version_cmp {
> >   }
> >   
> >   sub config_aware_timeout {
> > -    my ($config, $is_suspended) = @_;
> > -    my $memory = $config->{memory};
> > +    my ($config, $memory, $is_suspended) = @_;
> 
> Why do you add this? Also, when you adapt the callers, you only pass
> in
> $config->{memory} which is already part of $config.

When I try to 

use PVE::QemuServer::Memory qw(get_current_memory);

in Helpers.pm, to parse $conf->{memory},

I'm getting errors in the tests:

# error does not match expected error: 'Undefined subroutine
&PVE::QemuServer::windows_version called at ../PVE/QemuServer.pm line
3562.

That's why I'm parsing the memory in QemuServer and passing it as a
param to config_aware_timeout.


Do you know a better way to do it?




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 02/10] add memory parser
  2022-12-16 13:38   ` Fiona Ebner
  2023-01-02 10:50     ` DERUMIER, Alexandre
@ 2023-01-02 11:23     ` DERUMIER, Alexandre
  2023-01-05 12:48       ` Fiona Ebner
  1 sibling, 1 reply; 32+ messages in thread
From: DERUMIER, Alexandre @ 2023-01-02 11:23 UTC (permalink / raw)
  To: pve-devel, aderumier, f.ebner

On Friday 16 December 2022 at 14:38 +0100, Fiona Ebner wrote:
> From a modularity standpoint, it would be nice to move the format
> description, parsing+printing to PVE/QemuServer/Memory.pm, similar to
> how it is done in PVE/QemuServer/PCI.pm
> 
> Since $confdesc->{memory}->{default} is now gone, load_defaults()
> will
> not return a default for 'memory' anymore. And $conf->{memory} needs
> to
> be parsed everywhere. These two things will break getting static
> usage
> in HA manager, which uses load_defaults() and relies on $conf-
> >{memory}
> to be a single value right now. We can switch there to use
> get_current_memory() too of course, but it'd require a versioned
> Breaks+Depends.
> 
> Alternatively, we could add a function in qemu-server for calculating
> the static stats and call that from HA. Requires a versioned
> Breaks+Depends too, but then we'd be safe in the future and also
> catch
> such changes more easily. OTOH, it'd make the coupling go in the
> other
> direction: if HA manager wants to change what it uses for static
> consideration, then qemu-server would need to adapt. So not sure.

I was thinking about this:
when the dynamic scheduler is implemented, you'll need to use values
streamed from pvestatd.
So why can't we do the same for static values? (maxmem && maxcpus are
already sent by pvestatd.)

This should avoid the need to parse the vm config,
and maybe avoid using load_config?
(from your initial commit,
https://git.proxmox.com/?p=pve-ha-manager.git;a=commit;h=561e7f4bfb235fcdca5b0bbb8422ce742a5da75f,
it seems to be slow)



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 02/10] add memory parser
  2023-01-02 10:50     ` DERUMIER, Alexandre
@ 2023-01-05 12:47       ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-01-05 12:47 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, aderumier

On 02.01.23 at 11:50, DERUMIER, Alexandre wrote:
> Hi Fiona,
> 
> 
> I'm beginning to rework the patch serie 
> 
> If it's ok for you, I'll split it in differents patch series, 1 to add
> the parser, 1 for maxmem and 1 for virtio.
> 
> About this comment:
> 
> On Friday 16 December 2022 at 14:38 +0100, Fiona Ebner wrote:
>>> diff --git a/PVE/QemuServer/Helpers.pm b/PVE/QemuServer/Helpers.pm
>>> index e91f906..9115d50 100644
>>> --- a/PVE/QemuServer/Helpers.pm
>>> +++ b/PVE/QemuServer/Helpers.pm
>>> @@ -143,8 +143,7 @@ sub version_cmp {
>>>   }
>>>   
>>>   sub config_aware_timeout {
>>> -    my ($config, $is_suspended) = @_;
>>> -    my $memory = $config->{memory};
>>> +    my ($config, $memory, $is_suspended) = @_;
>>
>> Why do you add this? Also, when you adapt the callers, you only pass
>> in
>> $config->{memory} which is already part of $config.
> 
> When I try to 
> 
> use PVE::QemuServer::Memory qw(get_current_memory);
> 
> in the Helpers.pm, to parse the $conf->{memory},
> 
> I'm getting errors in tests:
> 
> # error does not match expected error: 'Undefined subroutine
> &PVE::QemuServer::windows_version called at ../PVE/QemuServer.pm line
> 3562.
> 
> That's why I'm parsing the memory in QemuServer and send it as param
> config_aware_timeout.
> 

I suspect it's because PVE::QemuServer::Memory does a
use PVE::QemuServer;
resulting in a circular include and that seems to break the export?

The same error also shows up if I do
use PVE::QemuServer;
in PVE::QemuServer::Helpers directly.

But it's strange, because your series turns PVE::QemuServer::Memory into
an exporter and PVE::QemuServer uses it, and there it doesn't break. But
I guess we just got lucky there :/


I tried to track it down and it seems to be because
run_config2command_tests.pl does a
use PVE::QMPClient;
which in turn does a
use PVE::QemuServer::Helpers;

It seems the "early" include of Helpers invalidates the later one with qw().


A small reproducer is:

febner@pve7-dev ~/playground (git)-[master] % cat Helpers.pm
package Helpers;

use strict;
use warnings;

use Server;

use base qw(Exporter);
our @EXPORT_OK = qw(foo);
sub foo { print "foo\n"; }

1;
febner@pve7-dev ~/playground (git)-[master] % cat Server.pm
package Server;

use strict;
use warnings;

use Helpers qw(foo);

sub bar {
    foo();
}

1;
febner@pve7-dev ~/playground (git)-[master] % cat test.pl
#!/usr/bin/perl

use strict;
use warnings;

use lib qw(.);

#use Helpers;
use Server;

Server::bar();
febner@pve7-dev ~/playground (git)-[master] % ./test.pl
foo
febner@pve7-dev ~/playground (git)-[master] % vi test.pl
febner@pve7-dev ~/playground (git)-[master] % cat test.pl
#!/usr/bin/perl

use strict;
use warnings;

use lib qw(.);

use Helpers;
use Server;

Server::bar();
febner@pve7-dev ~/playground (git)-[master] % ./test.pl
Undefined subroutine &Server::foo called at Server.pm line 9.

> 
> Do you known a better way to do it ?
> 

I guess we should avoid circular use statements whenever possible, so
the current way can be fine.




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 02/10] add memory parser
  2023-01-02 11:23     ` DERUMIER, Alexandre
@ 2023-01-05 12:48       ` Fiona Ebner
       [not found]         ` <4ba723fb986517054761eb65f38812fac86a895b.camel@groupe-cyllene.com>
  0 siblings, 1 reply; 32+ messages in thread
From: Fiona Ebner @ 2023-01-05 12:48 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, aderumier

On 02.01.23 at 12:23, DERUMIER, Alexandre wrote:
> On Friday 16 December 2022 at 14:38 +0100, Fiona Ebner wrote:
>> From a modularity standpoint, it would be nice to move the format
>> description, parsing+printing to PVE/QemuServer/Memory.pm, similar to
>> how it is done in PVE/QemuServer/PCI.pm
>>
>> Since $confdesc->{memory}->{default} is now gone, load_defaults()
>> will
>> not return a default for 'memory' anymore. And $conf->{memory} needs
>> to
>> be parsed everywhere. These two things will break getting static
>> usage
>> in HA manager, which uses load_defaults() and relies on $conf-
>>> {memory}
>> to be a single value right now. We can switch there to use
>> get_current_memory() too of course, but it'd require a versioned
>> Breaks+Depends.
>>
>> Alternatively, we could add a function in qemu-server for calculating
>> the static stats and call that from HA. Requires a versioned
>> Breaks+Depends too, but then we'd be safe in the future and also
>> catch
>> such changes more easily. OTOH, it'd make the coupling go in the
>> other
>> direction: if HA manager wants to change what it uses for static
>> consideration, then qemu-server would need to adapt. So not sure.
> 
> I was thinking about this:
> when the dynamic scheduler is implemented, you'll need to use values
> streamed from pvestatd.
> So why can't we do the same for static values? (maxmem && maxcpus are
> already sent by pvestatd.)
> 
> This should avoid the need to parse the vm config,
> and maybe avoid using load_config?
> (from your initial commit,
> https://git.proxmox.com/?p=pve-ha-manager.git;a=commit;h=561e7f4bfb235fcdca5b0bbb8422ce742a5da75f,
> it seems to be slow)
> 
> 

The information is already readily available on the cluster file system,
so sending it around via pvestatd additionally isn't ideal IMHO. maxmem
and maxcpus are only one per node and were not available before.

The load_config() call is not really problematic, because the result
from cfs_read_file() is cached. The real issue is that
recompute_online_node_usage() and thus getting the static info is called
very often currently. There was an RFC [0] to get the information using
PVE::Cluster::get_guest_config_properties(), but it's only twice as
fast. Optimizing how often we call recompute_online_node_usage() can
give us much more.

In any case, HA manager needs to be adapted before the memory setting
can be turned into a property string.

[0]: https://lists.proxmox.com/pipermail/pve-devel/2022-November/054956.html




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 02/10] add memory parser
       [not found]         ` <4ba723fb986517054761eb65f38812fac86a895b.camel@groupe-cyllene.com>
@ 2023-01-09 14:35           ` Fiona Ebner
  0 siblings, 0 replies; 32+ messages in thread
From: Fiona Ebner @ 2023-01-09 14:35 UTC (permalink / raw)
  To: DERUMIER, Alexandre, Proxmox VE development discussion

On 05.01.23 at 15:21, DERUMIER, Alexandre wrote:
> On Thursday 5 January 2023 at 13:48 +0100, Fiona Ebner wrote:
>> On 02.01.23 at 12:23, DERUMIER, Alexandre wrote:
>>> On Friday 16 December 2022 at 14:38 +0100, Fiona Ebner wrote:
>>>> From a modularity standpoint, it would be nice to move the format
>>>> description, parsing+printing to PVE/QemuServer/Memory.pm,
>>>> similar to
>>>> how it is done in PVE/QemuServer/PCI.pm
>>>>
>>>> Since $confdesc->{memory}->{default} is now gone, load_defaults()
>>>> will
>>>> not return a default for 'memory' anymore. And $conf->{memory}
>>>> needs
>>>> to
>>>> be parsed everywhere. These two things will break getting static
>>>> usage
>>>> in HA manager, which uses load_defaults() and relies on $conf-
>>>>> {memory}
>>>> to be a single value right now. We can switch there to use
>>>> get_current_memory() too of course, but it'd require a versioned
>>>> Breaks+Depends.
>>>>
>>>> Alternatively, we could add a function in qemu-server for
>>>> calculating
>>>> the static stats and call that from HA. Requires a versioned
>>>> Breaks+Depends too, but then we'd be safe in the future and also
>>>> catch
>>>> such changes more easily. OTOH, it'd make the coupling go in the
>>>> other
>>>> direction: if HA manager wants to change what it uses for static
>>>> consideration, then qemu-server would need to adapt. So not sure.
>>>
>>> I was thinking about this:
>>> when the dynamic scheduler is implemented, you'll need to use values
>>> streamed from pvestatd.
>>> So why can't we do the same for static values? (maxmem && maxcpus are
>>> already sent by pvestatd.)
>>>
>>> This should avoid the need to parse the vm config,
>>> and maybe avoid using load_config?
>>> (from your initial commit,
>>> https://git.proxmox.com/?p=pve-ha-manager.git;a=commit;h=561e7f4bfb235fcdca5b0bbb8422ce742a5da75f,
>>> it seems to be slow)
>>>
>>>
>>
>> The information is already readily available on the cluster file
>> system,
>> so sending it around via pvestatd additionally isn't ideal IMHO.
>> maxmem
>> and maxcpus are only one per node and were not available before.
>>
>> The load_config() call is not really problematic, because the result
>> from cfs_read_file() is cached. The real issue is that
>> recompute_online_node_usage() and thus getting the static info is
>> called
>> very often currently. There was an RFC [0] to get the information
>> using
>> PVE::Cluster::get_guest_config_properties(), but it's only twice as
>> fast. Optimizing how often we call recompute_online_node_usage() can
>> give us much more.
>>
> Ok, thanks for the details. I'll try to work on this after virtiomem,
> as I have big clusters (1000 VMs minimum), and I really need it.

I can also do it if you want, requires remove_service_usage(_to_node)
functions in the Rust backend and replacing
recompute_online_node_usage() calls with more fine-grained add/remove
usage calls.

> Maybe only recompute incrementally, adding/removing usage on the nodes
> the vms are migrated from/to. (At least when we iterate through the
> main loop over the services.)

Yes, the idea is to introduce a remove_service_usage_to_node() as an
inverse to add_service_usage_to_node(). Adding or removing usage in
PVE/HA/Manager.pm's change_service_state() depending on old and new
state seems to be a natural way to do it. Currently, we do a full
recompute every time in change_service_state(). During migration we
count usage on both source and target; since the VM impacts both, that
should be fine already.
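As a rough sketch of that direction (hypothetical, the real method names
and Manager.pm internals may differ):

# inside change_service_state(), instead of recompute_online_node_usage():
my $usage = $self->{online_node_usage};
$usage->remove_service_usage_to_node($sd->{node}, $sid)
    if $old_state && $old_state ne 'stopped';
$usage->add_service_usage_to_node($sd->{node}, $sid)
    if $new_state ne 'stopped';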

> 
> And do a full recompute on each new CRM run, i.e. every 10s.

Yes. We could also do a full recompute after iterating every N services
just to be sure.

> 
>> In any case, HA manager needs to be adapted before the memory setting
>> can be turned into a property string.
>>
> 
> I was thinking about pvestatd, because the config is already parsed on
> the pvestatd side, and we would only have to use the raw metrics.
> But no problem, I can use QemuServer::Memory::get_current_memory() in
> pve-ha-manager. Still not sure it's the best way. Or a special
> QemuServer::get_ha_stats() we could reuse later to add other stats?

In both cases, a versioned Breaks+Depends is needed. The second approach
would likely avoid the need for versioned Breaks for similar changes in
the future, but it tightens the coupling with the qemu-server repo:
qemu-server then needs to know what information HA manager wants and
needs to adapt when that changes. IMHO both approaches can be fine.

Yes, we could then add the dynamic stats to such a method, but again,
it's a bit more coupling. Not sure if it's not just better to handle
those directly in HA manager.




^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread

Thread overview: 32+ messages
2022-12-09 19:27 [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Alexandre Derumier
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 01/10] test: add memory tests Alexandre Derumier
2022-12-16 13:38   ` Fiona Ebner
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 02/10] add memory parser Alexandre Derumier
2022-12-16 13:38   ` Fiona Ebner
2023-01-02 10:50     ` DERUMIER, Alexandre
2023-01-05 12:47       ` Fiona Ebner
2023-01-02 11:23     ` DERUMIER, Alexandre
2023-01-05 12:48       ` Fiona Ebner
     [not found]         ` <4ba723fb986517054761eb65f38812fac86a895b.camel@groupe-cyllene.com>
2023-01-09 14:35           ` Fiona Ebner
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 03/10] config: memory: add 'max' option Alexandre Derumier
2022-12-16 13:38   ` Fiona Ebner
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 04/10] memory: add get_static_mem Alexandre Derumier
2022-12-16 13:38   ` Fiona Ebner
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 05/10] memory: get_max_mem: use config memory max Alexandre Derumier
2022-12-16 13:39   ` Fiona Ebner
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 06/10] memory: use 64 slots && static dimm size with max is defined Alexandre Derumier
2022-12-16 13:39   ` Fiona Ebner
2022-12-19 12:05     ` DERUMIER, Alexandre
2022-12-19 12:28       ` Fiona Ebner
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 07/10] test: add memory-max tests Alexandre Derumier
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 08/10] memory: add virtio-mem support Alexandre Derumier
2022-12-16 13:42   ` Fiona Ebner
     [not found]     ` <7b9306c429440304fb37601ece5ffdbad0b90e5f.camel@groupe-cyllene.com>
2022-12-20 10:26       ` Fiona Ebner
     [not found]         ` <b354aab5e4791e7c862b15470ca24c273b8030be.camel@groupe-cyllene.com>
2022-12-20 12:31           ` DERUMIER, Alexandre
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 09/10] tests: add virtio-mem tests Alexandre Derumier
2022-12-16 13:42   ` Fiona Ebner
2022-12-19 14:48     ` Thomas Lamprecht
2022-12-09 19:27 ` [pve-devel] [PATCH qemu-server 10/10] memory: fix hotplug with virtiomem && maxmem Alexandre Derumier
2022-12-16 13:42   ` Fiona Ebner
2022-12-16 13:38 ` [pve-devel] [PATCH qemu-server 00/10] rework memory hotplug + virtiomem Fiona Ebner
2022-12-19 11:31   ` DERUMIER, Alexandre
