* [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
@ 2025-05-22 13:53 Alexandre Derumier via pve-devel
2025-05-27 8:49 ` Fiona Ebner
0 siblings, 1 reply; 6+ messages in thread
From: Alexandre Derumier via pve-devel @ 2025-05-22 13:53 UTC (permalink / raw)
To: pve-devel; +Cc: Alexandre Derumier
[-- Attachment #1: Type: message/rfc822, Size: 6869 bytes --]
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
Date: Thu, 22 May 2025 15:53:02 +0200
Message-ID: <20250522135304.2513284-1-alexandre.derumier@groupe-cyllene.com>
This is part of my work on qcow2 external snapshots, but it could also improve current qcow2 linked clones.
This patch series moves qemu_img_create to common helpers,
and enables preallocation on backed images to increase performance.
This requires l2_extended=on on the backed image.
I have done some benchmarks on a local SSD with a 100GB qcow2 image; 4k randwrite performance is 5x faster.
Some presentations of l2_extended=on are available here:
https://www.youtube.com/watch?v=zJetcfDVFNw
https://www.youtube.com/watch?v=NfgLCdtkRus
I have not enabled it for base images, as I think Fabian saw a performance regression some months ago,
but I don't see a performance difference in my benchmarks. (Could you test again on your side?)
It could help reduce the qcow2 overhead on disk,
and allow keeping more metadata in memory for bigger images, as QEMU's default l2_cache_size is 1MB:
https://www.ibm.com/products/tutorials/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
Maybe more tests with bigger images (>1TB) could be done too, to see if it helps.
bench on 100G qcow2 file:
fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --name=test
fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --name=test
base image:
randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 20215
randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 22219
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20217
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 21742
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21599
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 22037
linked clone image with backing file:
randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 3912
randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 21476
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20563
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 22265
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 18016
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21611
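For reference, a minimal sketch of the qemu-img invocation such a backed image would need (hedged: the file names and the 100G size are illustrative, not from the patches; the option names are qemu-img's standard qcow2 creation options):

```shell
# Illustrative only: assemble the qemu-img command line for a linked clone
# with subcluster allocation and metadata preallocation enabled.
backing="base.qcow2"            # hypothetical base image
overlay="vm-101-disk-0.qcow2"   # hypothetical linked clone
opts="extended_l2=on,cluster_size=128k,preallocation=metadata"
cmd="qemu-img create -f qcow2 -b $backing -F qcow2 -o $opts $overlay 100G"
echo "$cmd"
```

Without preallocation=metadata, writes to a backed qcow2 image must allocate L2 metadata on demand, which is where the randwrite gap above comes from.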
Update:
I have done some tests with subcluster allocation and a base image without
a backing file; indeed, I'm seeing a small performance degradation on a big
1TB image.
With a 30GB image, I'm around 22000 iops 4k randwrite/randread (with
or without l2_extended=on).
With a 1TB image, the result is different:
fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32
--ioengine=libaio --name=test
default l2-cache-size (32MB) , extended_l2=off, cluster_size=64k : 2700 iops
default l2-cache-size (32MB) , extended_l2=on, cluster_size=128k: 1500 iops
I have also played with QEMU's l2-cache-size drive option (the default value
is 32MB, which is not enough for a 1TB image to keep all metadata in
memory):
https://github.com/qemu/qemu/commit/80668d0fb735f0839a46278a7d42116089b82816
l2-cache-size=8MB , extended_l2=off, cluster_size=64k: 2900 iops
l2-cache-size=64MB , extended_l2=off, cluster_size=64k: 5100 iops
l2-cache-size=128MB , extended_l2=off, cluster_size=64k : 22000 iops
l2-cache-size=8MB , extended_l2=on, cluster_size=128k: 2000 iops
l2-cache-size=64MB , extended_l2=on, cluster_size=128k: 4500 iops
l2-cache-size=128MB , extended_l2=on, cluster_size=128k: 22000 iops
So there is no difference in needed memory, with or without extended_l2.
But l2-cache-size tuning is really something we should add in
another patch, I think, for general qcow2 performance.
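The 128MB figure above can be cross-checked with a rough calculation (hedged: this assumes the usual rule of thumb that the L2 cache must hold one 8-byte entry per cluster to cover the whole image):

```shell
# L2 cache needed to cover a full 1TB image with 64k clusters:
# one 8-byte L2 entry maps one cluster.
disk=$((1 << 40))        # 1TB image
cluster=$((64 * 1024))   # 64k cluster size
cache=$((disk / cluster * 8))
echo "$((cache >> 20)) MB"
```

This yields 128 MB, which lines up with the benchmark: full randwrite performance only returns once l2-cache-size reaches 128MB.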
Alexandre Derumier (2):
common: add qemu_img_create and preallocation_cmd_option
common: qemu_img_create: add backing_file support
src/PVE/Storage/Common.pm | 62 ++++++++++++++++++++++++++++++
src/PVE/Storage/GlusterfsPlugin.pm | 2 +-
src/PVE/Storage/Plugin.pm | 52 +------------------------
3 files changed, 65 insertions(+), 51 deletions(-)
--
2.39.5
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
2025-05-22 13:53 [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images Alexandre Derumier via pve-devel
@ 2025-05-27 8:49 ` Fiona Ebner
2025-05-27 8:59 ` DERUMIER, Alexandre via pve-devel
0 siblings, 1 reply; 6+ messages in thread
From: Fiona Ebner @ 2025-05-27 8:49 UTC (permalink / raw)
To: Proxmox VE development discussion
On 22.05.25 at 15:53, Alexandre Derumier via pve-devel wrote:
> This is part of my work on qcow2 external snapshots, but it could also improve current qcow2 linked clones
>
> This patch series moves qemu_img_create to common helpers,
> and enables preallocation on backed images to increase performance
>
> This requires l2_extended=on on the backed image
>
> I have not enabled it for base images, as I think Fabian saw a performance regression some months ago,
> but I don't see a performance difference in my benchmarks. (Could you test again on your side?)
>
> It could help reduce the qcow2 overhead on disk,
> and allow keeping more metadata in memory for bigger images, as QEMU's default l2_cache_size is 1MB:
> https://www.ibm.com/products/tutorials/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
> Maybe more tests with bigger images (>1TB) could be done too, to see if it helps
>
> I have done some tests with subcluster allocation and a base image without
> a backing file; indeed, I'm seeing a small performance degradation on a big
> 1TB image.
>
> With a 30GB image, I'm around 22000 iops 4k randwrite/randread (with
> or without l2_extended=on)
>
> With a 1TB image, the result is different:
>
>
> fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32
> --ioengine=libaio --name=test
>
> default l2-cache-size (32MB) , extended_l2=off, cluster_size=64k : 2700 iops
> default l2-cache-size (32MB) , extended_l2=on, cluster_size=128k: 1500 iops
It was not Fabian but me who reported the regression regarding read
performance and performance for initial allocation back then:
https://lore.proxmox.com/pve-devel/d5e11d01-f54e-4dd9-b1c0-a02077a0c65f@proxmox.com/
However, the space usage on the underlying storage is greatly improved.
> I have also play with qemu l2-cache-size option of drive (default value
> is 32MB, and it's not enough for a 1TB image to keep all metadatas in
> memory)
> https://github.com/qemu/qemu/commit/80668d0fb735f0839a46278a7d42116089b82816
>
>
> l2-cache-size=8MB , extended_l2=off, cluster_size=64k: 2900 iops
> l2-cache-size=64MB , extended_l2=off, cluster_size=64k: 5100 iops
> l2-cache-size=128MB , extended_l2=off, cluster_size=64k : 22000 iops
>
> l2-cache-size=8MB , extended_l2=on, cluster_size=128k: 2000 iops
> l2-cache-size=64MB , extended_l2=on, cluster_size=128k: 4500 iops
> l2-cache-size=128MB , extended_l2=on, cluster_size=128k: 22000 iops
>
>
> So there is no difference in needed memory, with or without extended_l2.
>
> But l2-cache-size tuning is really something we should add in
> another patch, I think, for general qcow2 performance.
If we want to enable extended_l2=on, cluster_size=128k by default for
all new qcow2 images, I think we should do it together with an increased
l2-cache-size then. But yes, it should be its own patch. The above results
sound promising, but we'll need to test a bigger variety of workloads.
If we don't find settings that improve most workloads, we can still make
it configurable.
* Re: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
2025-05-27 8:49 ` Fiona Ebner
@ 2025-05-27 8:59 ` DERUMIER, Alexandre via pve-devel
0 siblings, 0 replies; 6+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-05-27 8:59 UTC (permalink / raw)
To: pve-devel, f.ebner; +Cc: DERUMIER, Alexandre
[-- Attachment #1: Type: message/rfc822, Size: 13337 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "f.ebner@proxmox.com" <f.ebner@proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
Date: Tue, 27 May 2025 08:59:20 +0000
Message-ID: <195b55efe9e7c9257a867abb69b7a0909377d8a4.camel@groupe-cyllene.com>
>
>
> So there is no difference in needed memory, with or without extended_l2.
>
> But l2-cache-size tuning is really something we should add in
> another patch, I think, for general qcow2 performance.
>>If we want to enable extended_l2=on, cluster_size=128k by default for
>>all new qcow2 images, I think we should do it together with an
>>increased l2-cache-size then.
Yes, QEMU bumped the default to 32MB 3 years ago (so it can handle
a 256GB qcow2 image).
I expected extended_l2=on to reduce the number of L2 entries and thus
the needed memory, but that doesn't seem to be the case. (I have
written a message to the original author to ask whether this is normal.)
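A quick back-of-the-envelope check (hedged: assuming 8-byte standard L2 entries, and 16-byte extended entries as in the qcow2 on-disk format) suggests why the memory need stays the same: doubling the entry size while doubling the cluster size cancels out.

```shell
disk=$((1 << 40))                   # 1TB image
std=$((disk / (64 * 1024) * 8))     # 8-byte entries, 64k clusters
ext=$((disk / (128 * 1024) * 16))   # 16-byte extended entries, 128k clusters
echo "standard: $((std >> 20)) MB, extended_l2: $((ext >> 20)) MB"
```

Both work out to the same cache size for a given image, matching the measurements above.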
>>But yes, it should be its own patch. The above results
>>sound promising, but we'll need to test a bigger variety of workloads.
>>If we don't find settings that improve most workloads, we can still
>>make it configurable.
Yes, I have only tested with a local disk and 4k read/write, so I think
it should be tested more deeply, also on NFS and with different block sizes.
* Re: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
2025-05-19 10:23 Alexandre Derumier via pve-devel
2025-05-22 12:44 ` DERUMIER, Alexandre via pve-devel
@ 2025-05-22 13:37 ` Fabian Grünbichler
1 sibling, 0 replies; 6+ messages in thread
From: Fabian Grünbichler @ 2025-05-22 13:37 UTC (permalink / raw)
To: Proxmox VE development discussion
> Alexandre Derumier via pve-devel <pve-devel@lists.proxmox.com> wrote on 19.05.2025 12:23 CEST:
> This is part of my work on qcow2 external snapshots, but it could also improve current qcow2 linked clones
>
> This patch series moves qemu_img_create to common helpers,
> and enables preallocation on backed images to increase performance
did you maybe forget to send the actual patches? I can't find them
in my inbox, and they aren't on lore either:
https://lore.proxmox.com/pve-devel/mailman.556.1747917864.394.pve-devel@lists.proxmox.com/t/#u
>
> This requires l2_extended=on on the backed image
>
> I have done some benchmarks on a local SSD with a 100GB qcow2 image; 4k randwrite performance is 5x faster
>
> Some presentations of l2_extended=on are available here:
> https://www.youtube.com/watch?v=zJetcfDVFNw
> https://www.youtube.com/watch?v=NfgLCdtkRus
>
> I have not enabled it for base images, as I think Fabian saw a performance regression some months ago,
> but I don't see a performance difference in my benchmarks. (Could you test again on your side?)
>
> It could help reduce the qcow2 overhead on disk,
> and allow keeping more metadata in memory for bigger images, as QEMU's default l2_cache_size is 1MB:
> https://www.ibm.com/products/tutorials/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
> Maybe more tests with bigger images (>1TB) could be done too, to see if it helps
>
>
>
> bench on 100G qcow2 file:
>
> fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --name=test
> fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --name=test
>
> base image:
>
> randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 20215
> randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 22219
> randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20217
> randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 21742
> randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21599
> randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 22037
>
> linked clone image with backing file:
>
> randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 3912
> randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 21476
> randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20563
> randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 22265
> randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 18016
> randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21611
>
> Alexandre Derumier (2):
> common: add qemu_img_create and preallocation_cmd_option
> common: qemu_img_create: add backing_file support
>
> src/PVE/Storage/Common.pm | 62 ++++++++++++++++++++++++++++++
> src/PVE/Storage/GlusterfsPlugin.pm | 2 +-
> src/PVE/Storage/Plugin.pm | 52 +------------------------
> 3 files changed, 65 insertions(+), 51 deletions(-)
>
> --
> 2.39.5
* Re: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
2025-05-19 10:23 Alexandre Derumier via pve-devel
@ 2025-05-22 12:44 ` DERUMIER, Alexandre via pve-devel
2025-05-22 13:37 ` Fabian Grünbichler
1 sibling, 0 replies; 6+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-05-22 12:44 UTC (permalink / raw)
To: pve-devel; +Cc: DERUMIER, Alexandre
[-- Attachment #1: Type: message/rfc822, Size: 14509 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
Date: Thu, 22 May 2025 12:44:12 +0000
Message-ID: <b4eed2f961b3e093cd4bf0263d5b09a1e9b00aa0.camel@groupe-cyllene.com>
I have done some tests with subcluster allocation and a base image without
a backing file; indeed, I'm seeing a small performance degradation on a big
1TB image.
With a 30GB image, I'm around 22000 iops 4k randwrite/randread (with
or without l2_extended=on).
With a 1TB image, the result is different:
fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32
--ioengine=libaio --name=test
default l2-cache-size , extended_l2=off, cluster_size=64k : 2700 iops
default l2-cache-size , extended_l2=on, cluster_size=128k: 1500 iops
I have also played with QEMU's l2-cache-size drive option (the default value
is 1MB, which is not enough for a 1TB image to keep all metadata in
memory):
l2-cache-size=8MB , extended_l2=off, cluster_size=64k: 2900 iops
l2-cache-size=64MB , extended_l2=off, cluster_size=64k: 5100 iops
l2-cache-size=128MB , extended_l2=off, cluster_size=64k : 22000 iops
l2-cache-size=8MB , extended_l2=on, cluster_size=128k: 2000 iops
l2-cache-size=64MB , extended_l2=on, cluster_size=128k: 4500 iops
l2-cache-size=128MB , extended_l2=on, cluster_size=128k: 22000 iops
So there is no difference in needed memory, with or without extended_l2.
But l2-cache-size tuning is really something we should add in
another patch, I think, for general qcow2 performance.
* [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
@ 2025-05-19 10:23 Alexandre Derumier via pve-devel
2025-05-22 12:44 ` DERUMIER, Alexandre via pve-devel
2025-05-22 13:37 ` Fabian Grünbichler
0 siblings, 2 replies; 6+ messages in thread
From: Alexandre Derumier via pve-devel @ 2025-05-19 10:23 UTC (permalink / raw)
To: pve-devel; +Cc: Alexandre Derumier
[-- Attachment #1: Type: message/rfc822, Size: 5432 bytes --]
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
Date: Mon, 19 May 2025 12:23:08 +0200
Message-ID: <20250519102310.911326-1-alexandre.derumier@groupe-cyllene.com>
This is part of my work on qcow2 external snapshots, but it could also improve current qcow2 linked clones.
This patch series moves qemu_img_create to common helpers,
and enables preallocation on backed images to increase performance.
This requires l2_extended=on on the backed image.
I have done some benchmarks on a local SSD with a 100GB qcow2 image; 4k randwrite performance is 5x faster.
Some presentations of l2_extended=on are available here:
https://www.youtube.com/watch?v=zJetcfDVFNw
https://www.youtube.com/watch?v=NfgLCdtkRus
I have not enabled it for base images, as I think Fabian saw a performance regression some months ago,
but I don't see a performance difference in my benchmarks. (Could you test again on your side?)
It could help reduce the qcow2 overhead on disk,
and allow keeping more metadata in memory for bigger images, as QEMU's default l2_cache_size is 1MB:
https://www.ibm.com/products/tutorials/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
Maybe more tests with bigger images (>1TB) could be done too, to see if it helps.
bench on 100G qcow2 file:
fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --name=test
fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --name=test
base image:
randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 20215
randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 22219
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20217
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 21742
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21599
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 22037
linked clone image with backing file:
randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 3912
randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 21476
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20563
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 22265
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 18016
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21611
Alexandre Derumier (2):
common: add qemu_img_create and preallocation_cmd_option
common: qemu_img_create: add backing_file support
src/PVE/Storage/Common.pm | 62 ++++++++++++++++++++++++++++++
src/PVE/Storage/GlusterfsPlugin.pm | 2 +-
src/PVE/Storage/Plugin.pm | 52 +------------------------
3 files changed, 65 insertions(+), 51 deletions(-)
--
2.39.5
end of thread, other threads:[~2025-05-27 8:59 UTC | newest]
Thread overview: 6+ messages
-- links below jump to the message on this page --
2025-05-22 13:53 [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images Alexandre Derumier via pve-devel
2025-05-27 8:49 ` Fiona Ebner
2025-05-27 8:59 ` DERUMIER, Alexandre via pve-devel
-- strict thread matches above, loose matches on Subject: below --
2025-05-19 10:23 Alexandre Derumier via pve-devel
2025-05-22 12:44 ` DERUMIER, Alexandre via pve-devel
2025-05-22 13:37 ` Fabian Grünbichler