From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	by lore.proxmox.com (Postfix) with ESMTPS id 88E4A1FF165
	for <inbox@lore.proxmox.com>; Thu, 22 May 2025 15:53:47 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 6FD1E369D8;
	Thu, 22 May 2025 15:53:50 +0200 (CEST)
To: pve-devel@lists.proxmox.com
Date: Thu, 22 May 2025 15:53:02 +0200
MIME-Version: 1.0
Message-ID: <mailman.561.1747922029.394.pve-devel@lists.proxmox.com>
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
From: Alexandre Derumier via pve-devel <pve-devel@lists.proxmox.com>
Precedence: list
Cc: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
X-Mailman-Version: 2.1.29
X-BeenThere: pve-devel@lists.proxmox.com
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
Subject: [pve-devel] [PATCH pve-storage 0/2] move qemu_img_create to common
 helpers and enable preallocation on backed images
Content-Type: multipart/mixed; boundary="===============7398678095171015554=="
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

--===============7398678095171015554==
Content-Type: message/rfc822
Content-Disposition: inline

Return-Path: <root@formationkvm1.odiso.net>
X-Original-To: pve-devel@lists.proxmox.com
Delivered-To: pve-devel@lists.proxmox.com
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by lists.proxmox.com (Postfix) with ESMTPS id 47BCDD06A2
	for <pve-devel@lists.proxmox.com>; Thu, 22 May 2025 15:53:48 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 29F153677F
	for <pve-devel@lists.proxmox.com>; Thu, 22 May 2025 15:53:18 +0200 (CEST)
Received: from bastiontest.odiso.net (unknown [IPv6:2a0a:1580:2000:6700::14])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by firstgate.proxmox.com (Proxmox) with ESMTPS
	for <pve-devel@lists.proxmox.com>; Thu, 22 May 2025 15:53:16 +0200 (CEST)
Received: from formationkvm1.odiso.net (unknown [10.11.201.57])
	by bastiontest.odiso.net (Postfix) with ESMTP id 35AD4860944;
	Thu, 22 May 2025 15:53:09 +0200 (CEST)
Received: by formationkvm1.odiso.net (Postfix, from userid 0)
	id B95C711F6B04; Thu, 22 May 2025 15:53:05 +0200 (CEST)
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH pve-storage 0/2] move qemu_img_create to common helpers and enable preallocation on backed images
Date: Thu, 22 May 2025 15:53:02 +0200
Message-Id: <20250522135304.2513284-1-alexandre.derumier@groupe-cyllene.com>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
	AWL                     0.099 Adjusted score from AWL reputation of From: address
	BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
	DMARC_NONE                0.1 DMARC none policy
	HEADER_FROM_DIFFERENT_DOMAINS  0.001 From and EnvelopeFrom 2nd level mail domains are different
	KAM_DMARC_NONE           0.25 DKIM has Failed or SPF has failed on the message and the domain has no DMARC policy
	KAM_DMARC_STATUS         0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
	KAM_LAZY_DOMAIN_SECURITY      1 Sending domain does not have any anti-forgery methods
	RDNS_NONE               0.793 Delivered to internal network by a host with no rDNS
	SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
	SPF_NONE                0.001 SPF: sender does not publish an SPF Record
	URIBL_BLOCKED           0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked.  See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [common.pm,glusterfsplugin.pm,plugin.pm]

This is part of my work on qcow2 external snapshots, but it could also improve the current qcow2 linked clones.

This patch series moves qemu_img_create to the common helpers,
and enables preallocation on backed images to increase performance.

This requires l2_extended=on on the backed image.
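For illustration, such a backed image could be created like this (file names and the size are just examples, and the actual invocation is built by the patched helper; qemu-img only accepts preallocation together with a backing file when extended_l2=on):

```shell
# Create a qcow2 overlay on top of base.qcow2 with subcluster allocation
# (extended_l2=on) and metadata preallocation. File names are examples.
qemu-img create -f qcow2 \
    -b base.qcow2 -F qcow2 \
    -o extended_l2=on,cluster_size=128k,preallocation=metadata \
    overlay.qcow2 100G
```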

I have run some benchmarks on a local SSD with a 100 GB qcow2 image; randwrite 4k performance is 5x faster.

Some presentations of l2_extended=on are available here:
https://www.youtube.com/watch?v=zJetcfDVFNw
https://www.youtube.com/watch?v=NfgLCdtkRus

I have not enabled it for base images, as I think Fabian saw a performance regression some months ago,
but I don't see a performance difference in my benchmarks. (Could you test again on your side?)

It could also help reduce qcow2 overhead on disk,
and allow keeping more metadata in memory for bigger images (the QEMU default l2-cache-size is 32 MB):
https://www.ibm.com/products/tutorials/how-to-tune-qemu-l2-cache-size-and-qcow2-cluster-size
Maybe more tests with bigger images (>1 TB) could be done too, to see if it helps.



Bench on a 100 GB qcow2 file:

fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --name=test
fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --name=test

base image:

randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 20215 iops
randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 22219 iops
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20217 iops
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 21742 iops
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21599 iops
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 22037 iops

linked clone image with backing file:

randwrite 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 3912 iops
randread 4k: prealloc=metadata, l2_extended=off, cluster_size=64k: 21476 iops
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 20563 iops
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=64k: 22265 iops
randwrite 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 18016 iops
randread 4k: prealloc=metadata, l2_extended=on, cluster_size=128k: 21611 iops



Update:

I have done some tests with subcluster allocation and a base image without
a backing file; indeed, I'm seeing a small performance degradation on a big
1 TB image.


With a 30 GB image, I'm around 22000 iops for 4k randwrite/randread (with
or without extended_l2=on).

With a 1 TB image, the results are different:


fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio --name=test

default l2-cache-size (32 MB), extended_l2=off, cluster_size=64k: 2700 iops
default l2-cache-size (32 MB), extended_l2=on, cluster_size=128k: 1500 iops


I have also played with the QEMU l2-cache-size drive option (the default
value is 32 MB, which is not enough for a 1 TB image to keep all metadata
in memory):
https://github.com/qemu/qemu/commit/80668d0fb735f0839a46278a7d42116089b82816
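For anyone wanting to reproduce these numbers, the cache size can be set per drive on the QEMU command line (the file path is just an example):

```shell
# Raise the qcow2 L2 cache so a 1 TB image's metadata fits in memory.
# The path is illustrative; l2-cache-size is a qcow2 runtime -drive option.
qemu-system-x86_64 \
    -drive file=/var/lib/vz/images/100/vm-100-disk-0.qcow2,format=qcow2,if=virtio,l2-cache-size=128M
```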


l2-cache-size=8MB, extended_l2=off, cluster_size=64k: 2900 iops
l2-cache-size=64MB, extended_l2=off, cluster_size=64k: 5100 iops
l2-cache-size=128MB, extended_l2=off, cluster_size=64k: 22000 iops

l2-cache-size=8MB, extended_l2=on, cluster_size=128k: 2000 iops
l2-cache-size=64MB, extended_l2=on, cluster_size=128k: 4500 iops
l2-cache-size=128MB, extended_l2=on, cluster_size=128k: 22000 iops


So there is no difference in needed memory, with or without extended_l2.
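That matches the qcow2 layout arithmetic: a standard L2 entry is 8 bytes per 64k cluster, while extended_l2 doubles the entry to 16 bytes but is here paired with 128k clusters, so the cache needed to map a whole image stays the same. A quick sketch:

```shell
# Bytes of L2 cache needed to map a whole image:
#   (disk_size / cluster_size) L2 entries, 8 bytes each
#   (16 bytes each with extended_l2, covering 32 subclusters per entry)
disk=$((1024 * 1024 * 1024 * 1024))   # 1 TB image

std=$((disk / (64 * 1024) * 8))       # extended_l2=off, 64k clusters
ext=$((disk / (128 * 1024) * 16))     # extended_l2=on, 128k clusters

echo "standard: $((std / 1024 / 1024)) MB, extended: $((ext / 1024 / 1024)) MB"
# prints: standard: 128 MB, extended: 128 MB
```

which also explains why the 128 MB cache above is the first size that restores full iops on a 1 TB image.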

But l2-cache-size tuning is really something we should add in another
patch, I think, for general qcow2 performance.



Alexandre Derumier (2):
  common: add qemu_img_create and preallocation_cmd_option
  common: qemu_img_create: add backing_file support

 src/PVE/Storage/Common.pm          | 62 ++++++++++++++++++++++++++++++
 src/PVE/Storage/GlusterfsPlugin.pm |  2 +-
 src/PVE/Storage/Plugin.pm          | 52 +------------------------
 3 files changed, 65 insertions(+), 51 deletions(-)

-- 
2.39.5



--===============7398678095171015554==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

--===============7398678095171015554==--