* [pve-devel] [PATCH pve-storage] qcow2 format: enable subcluster allocation by default
@ 2024-07-03 14:24 Alexandre Derumier via pve-devel
2024-09-11 11:44 ` Fiona Ebner
From: Alexandre Derumier via pve-devel @ 2024-07-03 14:24 UTC (permalink / raw)
To: pve-devel; +Cc: Alexandre Derumier
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH pve-storage] qcow2 format: enable subcluster allocation by default
Date: Wed, 3 Jul 2024 16:24:47 +0200
Message-ID: <20240703142447.602210-1-alexandre.derumier@groupe-cyllene.com>
extended_l2 is an optimisation to reduce write amplification.
Currently, without it, when a VM writes 4k, a full 64k cluster
needs to be written.
When enabled, the cluster is split into 32 subclusters.
We use a 128k cluster by default, to have 32 * 4k subclusters.
https://blogs.igalia.com/berto/2020/12/03/subcluster-allocation-for-qcow2-images/
https://static.sched.com/hosted_files/kvmforum2020/d9/qcow2-subcluster-allocation.pdf
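The subcluster arithmetic described above can be sketched in a few lines of Perl. This is a simplified model, not code from pve-storage: it assumes an aligned write into a previously unallocated region and ignores metadata updates. The helper names allocation_unit and bytes_touched are hypothetical.

```perl
use strict;
use warnings;

# With extended_l2=on, qcow2 divides every cluster into 32 subclusters.
my $SUBCLUSTERS_PER_CLUSTER = 32;

# Hypothetical helper: smallest unit (in bytes) that has to be allocated.
sub allocation_unit {
    my ($cluster_size, $extended_l2) = @_;
    return $extended_l2
        ? $cluster_size / $SUBCLUSTERS_PER_CLUSTER
        : $cluster_size;
}

# Hypothetical helper: bytes touched by an aligned write into unallocated
# space, rounded up to whole allocation units (simplified model).
sub bytes_touched {
    my ($write_size, $cluster_size, $extended_l2) = @_;
    my $unit = allocation_unit($cluster_size, $extended_l2);
    my $units = int(($write_size + $unit - 1) / $unit);
    return $units * $unit;
}

my $KiB = 1024;
for my $case ([64, 0], [128, 0], [128, 1]) {
    my ($cluster_kib, $ext) = @$case;
    my $touched = bytes_touched(4 * $KiB, $cluster_kib * $KiB, $ext);
    printf "cluster=%dk extended_l2=%s: a 4k write touches %dk\n",
        $cluster_kib, ($ext ? 'on' : 'off'), $touched / $KiB;
}
```

Under this model, a 128k cluster with extended_l2=on gives a 4k allocation unit, which matches the 4k guest write size used in the benchmark.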
some stats for 4k randwrite benchmark
Cluster size   Without subclusters   With subclusters
16 KB          5859 IOPS             8063 IOPS
32 KB          5674 IOPS             11107 IOPS
64 KB          2527 IOPS             12731 IOPS
128 KB         1576 IOPS             11808 IOPS
256 KB         976 IOPS              9195 IOPS
512 KB         510 IOPS              7079 IOPS
1 MB           448 IOPS              3306 IOPS
2 MB           262 IOPS              2269 IOPS
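To make the table easier to compare, the relative gain per cluster size can be derived from the figures above. The following is only a small sketch: the IOPS values are copied from the table, and the helper name speedup is hypothetical.

```perl
use strict;
use warnings;

# IOPS figures copied from the table: [cluster size, without, with subclusters].
my @results = (
    ['16 KB',  5859, 8063],
    ['32 KB',  5674, 11107],
    ['64 KB',  2527, 12731],
    ['128 KB', 1576, 11808],
    ['256 KB', 976,  9195],
    ['512 KB', 510,  7079],
    ['1 MB',   448,  3306],
    ['2 MB',   262,  2269],
);

# Hypothetical helper: ratio of IOPS with subclusters to IOPS without.
sub speedup {
    my ($without, $with) = @_;
    return $with / $without;
}

for my $row (@results) {
    my ($size, $without, $with) = @$row;
    printf "%-7s %.2fx\n", $size, speedup($without, $with);
}
```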
Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
---
src/PVE/Storage/Plugin.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 6444390..31b20fe 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -561,7 +561,7 @@ sub preallocation_cmd_option {
         die "preallocation mode '$prealloc' not supported by format '$fmt'\n"
             if !$QCOW2_PREALLOCATION->{$prealloc};
 
-        return "preallocation=$prealloc";
+        return "preallocation=$prealloc,extended_l2=on,cluster_size=128k";
     } elsif ($fmt eq 'raw') {
         $prealloc = $prealloc // 'off';
         $prealloc = 'off' if $prealloc eq 'metadata';
--
2.39.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
* Re: [pve-devel] [PATCH pve-storage] qcow2 format: enable subcluster allocation by default
2024-07-03 14:24 [pve-devel] [PATCH pve-storage] qcow2 format: enable subcluster allocation by default Alexandre Derumier via pve-devel
@ 2024-09-11 11:44 ` Fiona Ebner
From: Fiona Ebner @ 2024-09-11 11:44 UTC (permalink / raw)
To: Proxmox VE development discussion
On 03.07.24 at 16:24, Alexandre Derumier via pve-devel wrote:
>
>
> extended_l2 is an optimisation to reduce write amplification.
> Currently,without it, when a vm write 4k, a full 64k cluster
s/write/writes/
> need to be writen.
needs to be written.
>
> When enabled, the cluster is splitted in 32 subclusters.
s/splitted/split/
>
> We use a 128k cluster by default, to have 32 * 4k subclusters
>
> https://blogs.igalia.com/berto/2020/12/03/subcluster-allocation-for-qcow2-images/
> https://static.sched.com/hosted_files/kvmforum2020/d9/qcow2-subcluster-allocation.pdf
>
> some stats for 4k randwrite benchmark
Can you please share the exact command you used? What kind of underlying
disks do you have?
>
> Cluster size Without subclusters With subclusters
> 16 KB 5859 IOPS 8063 IOPS
> 32 KB 5674 IOPS 11107 IOPS
> 64 KB 2527 IOPS 12731 IOPS
> 128 KB 1576 IOPS 11808 IOPS
> 256 KB 976 IOPS 9195 IOPS
> 512 KB 510 IOPS 7079 IOPS
> 1 MB 448 IOPS 3306 IOPS
> 2 MB 262 IOPS 2269 IOPS
>
How does read performance compare for you (with 128 KiB cluster size)?
I don't see any noticeable difference in my testing with an ext4
directory storage on an SSD, attaching the qcow2 images as SCSI disks to
the VM, neither for reading nor writing. I only tested without your
change and with your change using 4k (rand)read and (rand)write.
I'm not sure we should enable this for everybody; added complexity
always carries a risk of breaking things. Maybe it's better to have a
storage configuration option that people can opt in to, e.g.
qcow2-create-opts extended_l2=on,cluster_size=128k
If we get enough positive feedback, we can still change the default in a
future (major) release.
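One possible shape for such an opt-in, as a minimal sketch only: the 'qcow2-create-opts' key is the name proposed above and does not exist in pve-storage, and the helper qcow2_create_opts is hypothetical. The idea is to append the extra options next to the preallocation option string instead of hardcoding them.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of pve-storage: joins the existing
# preallocation option string with opt-in extra qcow2 creation options
# taken from the storage configuration.
sub qcow2_create_opts {
    my ($scfg, $prealloc_opt) = @_;

    my @opts;
    push @opts, $prealloc_opt if defined($prealloc_opt);

    # 'qcow2-create-opts' is the option name proposed in this mail (assumption).
    my $extra = $scfg->{'qcow2-create-opts'};
    push @opts, $extra if defined($extra) && length($extra);

    return join(',', @opts);
}

# Storage that opted in:
my $scfg = { 'qcow2-create-opts' => 'extended_l2=on,cluster_size=128k' };
print qcow2_create_opts($scfg, 'preallocation=metadata'), "\n";

# Storage without the option keeps the current behaviour:
print qcow2_create_opts({}, 'preallocation=metadata'), "\n";
```

With this split, storages that do not set the option get exactly the same creation options as before.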
> Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> ---
> src/PVE/Storage/Plugin.pm | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
> index 6444390..31b20fe 100644
> --- a/src/PVE/Storage/Plugin.pm
> +++ b/src/PVE/Storage/Plugin.pm
> @@ -561,7 +561,7 @@ sub preallocation_cmd_option {
> die "preallocation mode '$prealloc' not supported by format '$fmt'\n"
> if !$QCOW2_PREALLOCATION->{$prealloc};
>
> - return "preallocation=$prealloc";
> + return "preallocation=$prealloc,extended_l2=on,cluster_size=128k";
Also, it doesn't really fit here in the preallocation helper as the
helper is specific to that setting.
> } elsif ($fmt eq 'raw') {
> $prealloc = $prealloc // 'off';
> $prealloc = 'off' if $prealloc eq 'metadata';
> --
> 2.39.2
>
>