public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH V2 qemu-server 0/2] enable balloon free-page-reporting
From: Alexandre Derumier @ 2022-03-06 12:46 UTC (permalink / raw)
  To: pve-devel

Hi,

Currently, if a guest VM allocates a memory page and later frees it inside the guest,
the memory is not freed on the host side.

Since QEMU 5.1, the balloon device has a new option, "free-page-reporting" (it also needs host kernel 5.7).

https://events19.linuxfoundation.org/wp-content/uploads/2017/12/KVMForum2018.pdf
https://lwn.net/Articles/759413/

This works like the discard option for disks: memory is freed asynchronously on the host when the VM frees it.

I have been running it in production for 1 month without any problem. With a lot of VMs and spiky workloads, the amount
of memory freed is really huge.

Here is an example of a host going from 650GB + 200GB KSM down to 250GB of memory:
https://mutulin1.odiso.net/ballon-size.png
(around 400 VMs with 2GB max memory, previously always allocated)


This patch enables it by default for machine version >= 6.2.


changelog v2:

- enable it only for machine version >= 6.2
- add test

Alexandre Derumier (2):
  enable balloon free-page-reporting
  add test for virtio-balloon free-page-reporting=on. (qemu 6.2)

 PVE/QemuServer.pm                             |  4 ++-
 test/cfg2cmd/q35-simple-7.0.conf.cmd          |  2 +-
 .../simple-balloon-free-page-reporting.conf   | 15 +++++++++
 ...imple-balloon-free-page-reporting.conf.cmd | 33 +++++++++++++++++++
 4 files changed, 52 insertions(+), 2 deletions(-)
 create mode 100644 test/cfg2cmd/simple-balloon-free-page-reporting.conf
 create mode 100644 test/cfg2cmd/simple-balloon-free-page-reporting.conf.cmd

-- 
2.30.2





* [pve-devel] [PATCH V2 qemu-server 1/2] enable balloon free-page-reporting
From: Alexandre Derumier @ 2022-03-06 12:46 UTC (permalink / raw)
  To: pve-devel

Allow the balloon device driver to report hints about free guest pages to the host,
for automatic memory reclaim.

https://lwn.net/Articles/759413/
https://events19.linuxfoundation.org/wp-content/uploads/2017/12/KVMForum2018.pdf
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 PVE/QemuServer.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 42f0fbd..a9e86b3 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3846,7 +3846,9 @@ sub config_to_command {
     # enable balloon by default, unless explicitly disabled
     if (!defined($conf->{balloon}) || $conf->{balloon}) {
 	my $pciaddr = print_pci_addr("balloon0", $bridges, $arch, $machine_type);
-	push @$devices, '-device', "virtio-balloon-pci,id=balloon0$pciaddr";
+	my $ballooncmd = "virtio-balloon-pci,id=balloon0$pciaddr";
+	$ballooncmd .= ",free-page-reporting=on" if min_version($machine_version, 6, 2);
+	push @$devices, '-device', $ballooncmd;
     }
 
     if ($conf->{watchdog}) {
-- 
2.30.2
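
The version guard in the patch above can be tried in isolation with a minimal sketch. It assumes a simplified stand-in, `min_version_sketch`, in place of qemu-server's real `min_version()` helper (whose exact implementation is not shown in this thread), and a hypothetical wrapper `balloon_device_arg`; only the string handling mirrors the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for qemu-server's min_version() helper: true when
# $version (e.g. "6.2" or "7.0") is at least $major.$minor.
sub min_version_sketch {
    my ($version, $major, $minor) = @_;
    my ($vmajor, $vminor) = $version =~ m/^(\d+)\.(\d+)/
        or return 0;
    return 1 if $vmajor > $major;
    return 0 if $vmajor < $major;
    return $vminor >= $minor ? 1 : 0;
}

# Hypothetical wrapper: build the -device argument the same way the patch does.
sub balloon_device_arg {
    my ($machine_version, $pciaddr) = @_;
    my $ballooncmd = "virtio-balloon-pci,id=balloon0$pciaddr";
    $ballooncmd .= ",free-page-reporting=on"
        if min_version_sketch($machine_version, 6, 2);
    return $ballooncmd;
}

print balloon_device_arg("6.1", ",bus=pci.0,addr=0x3"), "\n";
print balloon_device_arg("6.2", ",bus=pci.0,addr=0x3"), "\n";
```

With a 6.1 machine version the device string stays as before; from 6.2 on, `,free-page-reporting=on` is appended.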





* [pve-devel] [PATCH V2 qemu-server 2/2] add test for virtio-balloon free-page-reporting=on. (qemu 6.2)
From: Alexandre Derumier @ 2022-03-06 12:46 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
---
 test/cfg2cmd/q35-simple-7.0.conf.cmd          |  2 +-
 .../simple-balloon-free-page-reporting.conf   | 15 +++++++++
 ...imple-balloon-free-page-reporting.conf.cmd | 33 +++++++++++++++++++
 3 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 test/cfg2cmd/simple-balloon-free-page-reporting.conf
 create mode 100644 test/cfg2cmd/simple-balloon-free-page-reporting.conf.cmd

diff --git a/test/cfg2cmd/q35-simple-7.0.conf.cmd b/test/cfg2cmd/q35-simple-7.0.conf.cmd
index 5045caf..be7a36c 100644
--- a/test/cfg2cmd/q35-simple-7.0.conf.cmd
+++ b/test/cfg2cmd/q35-simple-7.0.conf.cmd
@@ -21,7 +21,7 @@
   -device 'vmgenid,guid=54d1c06c-8f5b-440f-b5b2-6eab1380e13d' \
   -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' \
   -device 'VGA,id=vga,bus=pcie.0,addr=0x1' \
-  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
   -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
   -device 'virtio-net-pci,mac=2E:01:68:F9:9C:87,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
diff --git a/test/cfg2cmd/simple-balloon-free-page-reporting.conf b/test/cfg2cmd/simple-balloon-free-page-reporting.conf
new file mode 100644
index 0000000..e7cd1e4
--- /dev/null
+++ b/test/cfg2cmd/simple-balloon-free-page-reporting.conf
@@ -0,0 +1,15 @@
+# TEST: Simple test for balloon free page reporting enabled by default on 6.2
+# QEMU_VERSION: 6.2
+bootdisk: scsi0
+cores: 3
+ide2: none,media=cdrom
+memory: 768
+name: simple
+net0: virtio=A2:C0:43:77:08:A0,bridge=vmbr0
+numa: 0
+ostype: l26
+scsi0: local:8006/vm-8006-disk-0.qcow2,discard=on,size=104858K
+scsihw: virtio-scsi-pci
+smbios1: uuid=7b10d7af-b932-4c66-b2c3-3996152ec465
+sockets: 1
+vmgenid: c773c261-d800-4348-1010-1010add53cf8
diff --git a/test/cfg2cmd/simple-balloon-free-page-reporting.conf.cmd b/test/cfg2cmd/simple-balloon-free-page-reporting.conf.cmd
new file mode 100644
index 0000000..232e348
--- /dev/null
+++ b/test/cfg2cmd/simple-balloon-free-page-reporting.conf.cmd
@@ -0,0 +1,33 @@
+/usr/bin/kvm \
+  -id 8006 \
+  -name simple \
+  -no-shutdown \
+  -chardev 'socket,id=qmp,path=/var/run/qemu-server/8006.qmp,server=on,wait=off' \
+  -mon 'chardev=qmp,mode=control' \
+  -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' \
+  -mon 'chardev=qmp-event,mode=control' \
+  -pidfile /var/run/qemu-server/8006.pid \
+  -daemonize \
+  -smbios 'type=1,uuid=7b10d7af-b932-4c66-b2c3-3996152ec465' \
+  -smp '3,sockets=1,cores=3,maxcpus=3' \
+  -nodefaults \
+  -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' \
+  -vnc 'unix:/var/run/qemu-server/8006.vnc,password=on' \
+  -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep \
+  -m 768 \
+  -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' \
+  -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' \
+  -device 'vmgenid,guid=c773c261-d800-4348-1010-1010add53cf8' \
+  -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' \
+  -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' \
+  -device 'VGA,id=vga,bus=pci.0,addr=0x2' \
+  -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
+  -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
+  -drive 'if=none,id=drive-ide2,media=cdrom,aio=io_uring' \
+  -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' \
+  -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' \
+  -drive 'file=/var/lib/vz/images/8006/vm-8006-disk-0.qcow2,if=none,id=drive-scsi0,discard=on,format=qcow2,cache=none,aio=io_uring,detect-zeroes=unmap' \
+  -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' \
+  -netdev 'type=tap,id=net0,ifname=tap8006i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' \
+  -device 'virtio-net-pci,mac=A2:C0:43:77:08:A0,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' \
+  -machine 'type=pc+pve0'
-- 
2.30.2





* Re: [pve-devel] [PATCH V2 qemu-server 1/2] enable balloon free-page-reporting
From: Thomas Lamprecht @ 2022-03-16 17:48 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier

On 06.03.22 13:46, Alexandre Derumier wrote:
> Allow the balloon device driver to report hints about free guest pages to the host,
> for automatic memory reclaim.
> 
> https://lwn.net/Articles/759413/
> https://events19.linuxfoundation.org/wp-content/uploads/2017/12/KVMForum2018.pdf
> Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> ---
>  PVE/QemuServer.pm | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 42f0fbd..a9e86b3 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -3846,7 +3846,9 @@ sub config_to_command {
>      # enable balloon by default, unless explicitly disabled
>      if (!defined($conf->{balloon}) || $conf->{balloon}) {
>  	my $pciaddr = print_pci_addr("balloon0", $bridges, $arch, $machine_type);
> -	push @$devices, '-device', "virtio-balloon-pci,id=balloon0$pciaddr";
> +	my $ballooncmd = "virtio-balloon-pci,id=balloon0$pciaddr";
> +	$ballooncmd .= ",free-page-reporting=on" if min_version($machine_version, 6, 2);

Do we even need to guard this behind the 6.2 machine version? I tried adding it
on a running host, and migrations in both directions went just fine with a Windows
10 VM.

Asking mostly because we already have QEMU 6.2 publicly available on pvetest and
also use it for some of our infrastructure, so if it really were breaking,
we'd need to use our separate, QEMU-version-independent machine bump mechanism
(+pve1).

But it seems that it's not required, or did you find that it can indeed break live
migration? FWIW, for us only forward migration, from a VM without reporting
enabled to a VM with reporting enabled, is really relevant.

> +	push @$devices, '-device', $ballooncmd;
>      }
>  
>      if ($conf->{watchdog}) {






* Re: [pve-devel] [PATCH V2 qemu-server 1/2] enable balloon free-page-reporting
From: DERUMIER, Alexandre @ 2022-03-16 19:32 UTC (permalink / raw)
  To: pve-devel, t.lamprecht, aderumier

On Wednesday, 16 March 2022 at 18:48 +0100, Thomas Lamprecht wrote:
> On 06.03.22 13:46, Alexandre Derumier wrote:
> > Allow the balloon device driver to report hints about free guest pages to
> > the host, for automatic memory reclaim.
> > 
> > https://lwn.net/Articles/759413/
> > https://events19.linuxfoundation.org/wp-content/uploads/2017/12/KVMForum2018.pdf
> > Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
> > ---
> >  PVE/QemuServer.pm | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> > index 42f0fbd..a9e86b3 100644
> > --- a/PVE/QemuServer.pm
> > +++ b/PVE/QemuServer.pm
> > @@ -3846,7 +3846,9 @@ sub config_to_command {
> >      # enable balloon by default, unless explicitly disabled
> >      if (!defined($conf->{balloon}) || $conf->{balloon}) {
> >         my $pciaddr = print_pci_addr("balloon0", $bridges, $arch, $machine_type);
> > -       push @$devices, '-device', "virtio-balloon-pci,id=balloon0$pciaddr";
> > +       my $ballooncmd = "virtio-balloon-pci,id=balloon0$pciaddr";
> > +       $ballooncmd .= ",free-page-reporting=on" if min_version($machine_version, 6, 2);
> 
> Do we even need to guard this behind the 6.2 machine version? I tried adding it
> on a running host, and migrations in both directions went just fine with a Windows
> 10 VM.
> 
> Asking mostly because we already have QEMU 6.2 publicly available on pvetest and
> also use it for some of our infrastructure, so if it really were breaking,
> we'd need to use our separate, QEMU-version-independent machine bump mechanism
> (+pve1).
> 
> But it seems that it's not required, or did you find that it can indeed break live
> migration? FWIW, for us only forward migration, from a VM without reporting
> enabled to a VM with reporting enabled, is really relevant.
> 
> > +       push @$devices, '-device', $ballooncmd;
> >      }
> >  
> >      if ($conf->{watchdog}) {
> 
> 


Oh, sorry, I thought that 6.2 was not yet publicly available.

From my tests:
an already booted VM without the balloon free-page-reporting option enabled --->
migrating to a new VM with the option enabled: works

Then failing it back to the previous node: works


But starting a new VM with the option enabled and then migrating it to a new
VM without the option:

the migration dies on resume.

2022-03-16 20:28:30 average migration speed: 1.5 GiB/s - downtime 30 ms
2022-03-16 20:28:30 migration status: completed
2022-03-16 20:28:30 ERROR: tunnel replied 'ERR: resume failed - VM 104 not running' to command 'resume 104'
2022-03-16 20:28:39 ERROR: migration finished with problems (duration 00:00:18)
TASK ERROR: migration problems


I think this is because the guest kernel balloon driver enables it only at
boot.
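
The test results above can be summarized in a small sketch. It is only a model of the hypothesis just stated (the guest driver decides at boot), not of QEMU's actual migration code; the function name and its two flags are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical model of the test results reported above.
# $guest_driver_active: the guest kernel's balloon driver enabled free page
#   reporting at boot (the VM was started with the option and the guest
#   supports it).
# $target_has_option: the target QEMU instance is started with
#   free-page-reporting=on.
sub migration_expected_to_resume {
    my ($guest_driver_active, $target_has_option) = @_;
    # The only failing case seen in the tests: the guest already uses the
    # feature, but the target device does not offer it.
    return 0 if $guest_driver_active && !$target_has_option;
    return 1;
}

for my $case (
    [0, 1, "booted without option -> target with option"],
    [0, 0, "fail back to the previous node"],
    [1, 0, "booted with option -> target without option"],
) {
    my ($active, $target, $label) = @$case;
    printf "%-45s %s\n", $label,
        migration_expected_to_resume($active, $target) ? "works" : "fails on resume";
}
```

A guest whose driver never enables the feature falls into the first two rows regardless of the QEMU option.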


Note that I don't think the current Windows drivers support it yet
(I have looked at the source code), so even if the option is enabled at the
QEMU level, it does nothing inside Windows.
So I think that migration will work in both directions with Windows
VMs.






* Re: [pve-devel] [PATCH V2 qemu-server 1/2] enable balloon free-page-reporting
From: Thomas Lamprecht @ 2022-03-17  8:35 UTC (permalink / raw)
  To: DERUMIER, Alexandre, pve-devel, aderumier

On 16.03.22 20:32, DERUMIER, Alexandre wrote:
> From my tests:
> an already booted VM without the balloon free-page-reporting option enabled --->
> migrating to a new VM with the option enabled: works
> 
> Then failing it back to the previous node: works
> 
> 
> But starting a new VM with the option enabled and then migrating it to a new
> VM without the option:
> 
> the migration dies on resume.
> 
> 2022-03-16 20:28:30 average migration speed: 1.5 GiB/s - downtime 30 ms
> 2022-03-16 20:28:30 migration status: completed
> 2022-03-16 20:28:30 ERROR: tunnel replied 'ERR: resume failed - VM 104 not running' to command 'resume 104'
> 2022-03-16 20:28:39 ERROR: migration finished with problems (duration 00:00:18)
> TASK ERROR: migration problems
> 

I have now tested with a Debian Testing (bookworm) VM and ensured that it actually logs
"Free page reporting enabled" in the kernel log, and you're right, it fails there.

But that's only relevant for migrating a VM that got freshly started (!) on
a new (upgraded) host and then gets migrated to an old one, and that's something
we never supported anyway (besides some very light best-effort).
As the VM doesn't even crash then, and just fails to migrate (the source VM stays
running), it's not even problematic for that case; the admin can just upgrade
qemu-server on the other node and continue.

The same holds for live snapshots: only a snapshot from a freshly started VM on an
already upgraded host would cause issues on rollback, but *only* on an old host,
so the same as any QEMU update would cause anyway.

So, it seems we can just enable it with 6.2 as the machine guard and be done.
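
The guest-side check mentioned above (looking for "Free page reporting enabled" in the kernel log) could be scripted roughly as follows. This is a minimal sketch: the function name is hypothetical, and the sample input is illustrative only; real lines would come from the guest's kernel log.

```perl
use strict;
use warnings;

# Returns 1 if any of the given kernel log lines contains the message the
# guest kernel prints when free page reporting is active, 0 otherwise.
sub free_page_reporting_logged {
    my (@log_lines) = @_;
    my $hits = grep { /Free page reporting enabled/ } @log_lines;
    return $hits ? 1 : 0;
}

# Illustrative input only, not a real log excerpt.
my @sample = ("unrelated message", "Free page reporting enabled");
print free_page_reporting_logged(@sample) ? "enabled\n" : "not enabled\n";
```

A guest that logs this message is one where migrating back to a node without the option is expected to fail, as in the test above.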

> 
> I think this is because the guest kernel balloon driver enables it only at
> boot.
> 

yeah seems so.

> 
> Note that I don't think the current Windows drivers support it yet
> (I have looked at the source code), so even if the option is enabled at the
> QEMU level, it does nothing inside Windows.
> So I think that migration will work in both directions with Windows
> VMs.






* Re: [pve-devel] [PATCH V2 qemu-server 1/2] enable balloon free-page-reporting
From: DERUMIER, Alexandre @ 2022-03-28 10:06 UTC (permalink / raw)
  To: pve-devel, t.lamprecht, aderumier

Hi Thomas,

Any news about this one?

If you want to enable it by default, I do that in v1 of the
patch:
https://lists.proxmox.com/pipermail/pve-devel/2022-March/051881.html


Also, I have sent a v2 for the bridge no-learning feature:

pve-network : fix bridge-disable-mac-learning
https://lists.proxmox.com/pipermail/pve-devel/2022-March/052205.html

[PATCH V2 pve-container 0/1] add disable bridge learning feature 
https://lists.proxmox.com/pipermail/pve-devel/2022-March/052206.html

[PATCH V2 qemu-server 0/3] add disable bridge learning feature  
https://lists.proxmox.com/pipermail/pve-devel/2022-March/052210.html



Not related to no-learning, but rebased:

[PATCH V2 pve-common 0/1] network: tap_plug: fix mtu bugs 
https://lists.proxmox.com/pipermail/pve-devel/2022-March/052213.html


Also some other pending patches:

[pve-devel] [PATCH pve-docs 0/1] bgp/evpn improvements  
https://lists.proxmox.com/pipermail/pve-devel/2022-February/051710.html
---> updated docs with the latest SDN changes (already in the GUI and pve-network)


close #2949: add virtio-mem support
https://lists.proxmox.com/pipermail/pve-devel/2022-March/051955.html
---> needs opinions about the implementation




(BTW, I'm still working on pveha balancing; I'll do a big rework with
new algorithms and send patches for review next month.)







* [pve-devel] applied: [PATCH V2 qemu-server 0/2] enable balloon free-page-reporting
From: Thomas Lamprecht @ 2022-04-27  9:23 UTC (permalink / raw)
  To: Proxmox VE development discussion, Alexandre Derumier




Applied series, thanks! I had to fix up the existing unversioned tests, though.



