From: Fiona Ebner
To: pve-devel@lists.proxmox.com
Date: Fri, 25 Jul 2025 12:50:53 +0200
Subject: [pve-devel] [PATCH qemu-server v2 6/8] fix #6543: use qcow2 'discard-no-unref' option when using snapshot-as-volume-chain
Message-ID: <20250725105109.54093-7-f.ebner@proxmox.com>
In-Reply-To: <20250725105109.54093-1-f.ebner@proxmox.com>

Without the 'discard-no-unref' option, a qcow2 file can grow beyond what
'qemu-img measure' reports, because of fragmentation. This can lead to
IO errors with qcow2 on top of LVM storages, where the containing LV is
allocated with that size.

Guard enabling the option behind 'snapshot-as-volume-chain' being set in
the storage configuration for now. Enabling it unconditionally should be
evaluated a bit more and tested on different storages. It is a
runtime-only option that only affects how cluster references are handled
during discard in qcow2, so it is also fine for existing images and
migration streams.

While 'snapshot-as-volume-chain' is not a perfect proxy, as it is not
only for LVM, it is an experimental feature that covers the LVM case,
and it seems like a nice fit for trying out the new option on file-based
storages too.

Suggested-by: Alexandre Derumier
Signed-off-by: Fiona Ebner
---

Changes in v2:
* add missing qcow2 format check for qemu_img
* fix mocking File::stat, expected test output was wrong

 src/PVE/QemuServer/Blockdev.pm                |  7 ++++++
 src/PVE/QemuServer/QemuImage.pm               | 20 +++++++++++++++++
 src/test/cfg2cmd/simple-backingchain.conf.cmd |  2 +-
 src/test/run_qemu_img_convert_tests.pl        | 22 ++++++++++++++-----
 4 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/src/PVE/QemuServer/Blockdev.pm b/src/PVE/QemuServer/Blockdev.pm
index 8528a587..1487bc99 100644
--- a/src/PVE/QemuServer/Blockdev.pm
+++ b/src/PVE/QemuServer/Blockdev.pm
@@ -372,6 +372,13 @@ my sub generate_format_blockdev {
         $blockdev->{size} = int($options->{size});
     }
 
+    # see bug #6543: without this option, fragmentation can lead to the qcow2 file growing larger
+    # than what qemu-img measure reports, which is problematic for qcow2-on-top-of-LVM
+    # TODO test and consider enabling this in general
+    if ($scfg && $scfg->{'snapshot-as-volume-chain'}) {
+        $blockdev->{'discard-no-unref'} = JSON::true if $format eq 'qcow2';
+    }
+
     return $blockdev;
 }
 
diff --git a/src/PVE/QemuServer/QemuImage.pm b/src/PVE/QemuServer/QemuImage.pm
index 026c24e9..804cbc8e 100644
--- a/src/PVE/QemuServer/QemuImage.pm
+++ b/src/PVE/QemuServer/QemuImage.pm
@@ -3,6 +3,9 @@ package PVE::QemuServer::QemuImage;
 use strict;
 use warnings;
 
+use Fcntl qw(S_ISBLK);
+use File::stat;
+
 use PVE::Format qw(render_bytes);
 use PVE::Storage;
 use PVE::Tools;
@@ -27,6 +30,18 @@ sub convert_iscsi_path {
     die "cannot convert iscsi path '$path', unknown format\n";
 }
 
+my sub qcow2_target_image_opts {
+    my ($path, @qcow2_opts) = @_;
+
+    my $st = File::stat::stat($path) or die "stat for '$path' failed - $!\n";
+
+    my $driver = S_ISBLK($st->mode) ? 'host_device' : 'file';
+
+    my $qcow2_opts_str = ',' . join(',', @qcow2_opts);
+
+    return "driver=qcow2$qcow2_opts_str,file.driver=$driver,file.filename=$path";
+}
+
 # The possible options are:
 # bwlimit - The bandwidth limit in KiB/s.
 # is-zero-initialized - If the destination image is zero-initialized.
@@ -71,6 +86,8 @@ sub convert {
     my $dst_format = checked_volume_format($storecfg, $dst_volid);
     my $dst_path = PVE::Storage::path($storecfg, $dst_volid);
     my $dst_is_iscsi = ($dst_path =~ m|^iscsi://|);
+    my $dst_needs_discard_no_unref =
+        $dst_scfg->{'snapshot-as-volume-chain'} && $dst_format eq 'qcow2';
 
     my $support_qemu_snapshots = PVE::Storage::volume_qemu_snapshot_method($storecfg, $src_volid);
     my $cmd = [];
@@ -94,6 +111,9 @@ sub convert {
     if ($dst_is_iscsi) {
         push @$cmd, '--target-image-opts';
         $dst_path = convert_iscsi_path($dst_path);
+    } elsif ($dst_needs_discard_no_unref) {
+        push @$cmd, '--target-image-opts';
+        $dst_path = qcow2_target_image_opts($dst_path, 'discard-no-unref=true');
     } else {
         push @$cmd, '-O', $dst_format;
     }
diff --git a/src/test/cfg2cmd/simple-backingchain.conf.cmd b/src/test/cfg2cmd/simple-backingchain.conf.cmd
index cae2ad4b..4ac24b93 100644
--- a/src/test/cfg2cmd/simple-backingchain.conf.cmd
+++ b/src/test/cfg2cmd/simple-backingchain.conf.cmd
@@ -26,7 +26,7 @@
   -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on' \
   -iscsi 'initiator-name=iqn.1993-08.org.debian:01:aabbccddeeff' \
   -device 'lsi,id=scsihw0,bus=pci.0,addr=0x5' \
-  -blockdev '{"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"backing":{"backing":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vzsnapext/images/8006/snap1-vm-8006-disk-0.qcow2","node-name":"ea91a385a49a008a4735c0aec5c6749","read-only":false},"node-name":"fa91a385a49a008a4735c0aec5c6749","read-only":false},"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vzsnapext/images/8006/snap2-vm-8006-disk-0.qcow2","node-name":"ec0289317073959d450248d8cd7a480","read-only":false},"node-name":"fc0289317073959d450248d8cd7a480","read-only":false},"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vzsnapext/images/8006/vm-8006-disk-0.qcow2","node-name":"e74f4959037afb46eddc7313c43dfdd","read-only":false},"node-name":"f74f4959037afb46eddc7313c43dfdd","read-only":false},"node-name":"drive-scsi0","read-only":false,"throttle-group":"throttle-drive-scsi0"}' \
+  -blockdev '{"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"backing":{"backing":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","discard-no-unref":true,"driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vzsnapext/images/8006/snap1-vm-8006-disk-0.qcow2","node-name":"ea91a385a49a008a4735c0aec5c6749","read-only":false},"node-name":"fa91a385a49a008a4735c0aec5c6749","read-only":false},"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","discard-no-unref":true,"driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vzsnapext/images/8006/snap2-vm-8006-disk-0.qcow2","node-name":"ec0289317073959d450248d8cd7a480","read-only":false},"node-name":"fc0289317073959d450248d8cd7a480","read-only":false},"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","discard-no-unref":true,"driver":"qcow2","file":{"aio":"io_uring","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"file","filename":"/var/lib/vzsnapext/images/8006/vm-8006-disk-0.qcow2","node-name":"e74f4959037afb46eddc7313c43dfdd","read-only":false},"node-name":"f74f4959037afb46eddc7313c43dfdd","read-only":false},"node-name":"drive-scsi0","read-only":false,"throttle-group":"throttle-drive-scsi0"}' \
   -device 'scsi-hd,bus=scsihw0.0,scsi-id=0,drive=drive-scsi0,id=scsi0,device_id=drive-scsi0,write-cache=on' \
   -blockdev '{"detect-zeroes":"on","discard":"ignore","driver":"throttle","file":{"backing":{"backing":{"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"native","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"host_device","filename":"/dev/veegee/snap1-vm-8006-disk-0.qcow2","node-name":"e25f58d3e6e11f2065ad41253988915","read-only":false},"node-name":"f25f58d3e6e11f2065ad41253988915","read-only":false},"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"native","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"host_device","filename":"/dev/veegee/snap2-vm-8006-disk-0.qcow2","node-name":"e9415bb5e484c1e25d25063b01686fe","read-only":false},"node-name":"f9415bb5e484c1e25d25063b01686fe","read-only":false},"cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"qcow2","file":{"aio":"native","cache":{"direct":true,"no-flush":false},"detect-zeroes":"on","discard":"ignore","driver":"host_device","filename":"/dev/veegee/vm-8006-disk-0.qcow2","node-name":"e87358a470ca311f94d5cc61d1eb428","read-only":false},"node-name":"f87358a470ca311f94d5cc61d1eb428","read-only":false},"node-name":"drive-scsi1","read-only":false,"throttle-group":"throttle-drive-scsi1"}' \
   -device 'scsi-hd,bus=scsihw0.0,scsi-id=1,drive=drive-scsi1,id=scsi1,device_id=drive-scsi1,write-cache=on' \
diff --git a/src/test/run_qemu_img_convert_tests.pl b/src/test/run_qemu_img_convert_tests.pl
index 64c98327..3c8f09f0 100755
--- a/src/test/run_qemu_img_convert_tests.pl
+++ b/src/test/run_qemu_img_convert_tests.pl
@@ -542,10 +542,10 @@ my $tests = [
             "-n",
             "-f",
             "raw",
-            "-O",
-            "qcow2",
+            "--target-image-opts",
             "/var/lib/vz/images/$vmid/vm-$vmid-disk-0.raw",
-            "/var/lib/vzsnapext/images/$vmid/vm-$vmid-disk-0.qcow2",
+            "driver=qcow2,discard-no-unref=true,file.driver=file,"
+                . "file.filename=/var/lib/vzsnapext/images/$vmid/vm-$vmid-disk-0.qcow2",
         ],
     },
     {
@@ -560,16 +560,26 @@ my $tests = [
             "-n",
             "-f",
             "raw",
-            "-O",
-            "qcow2",
+            "--target-image-opts",
             "/var/lib/vz/images/$vmid/vm-$vmid-disk-0.raw",
-            "/dev/pve/vm-$vmid-disk-0.qcow2",
+            "driver=qcow2,discard-no-unref=true,file.driver=host_device,"
+                . "file.filename=/dev/pve/vm-$vmid-disk-0.qcow2",
         ],
     },
 ];
 
 my $command;
 
+my $file_stat_module = Test::MockModule->new("File::stat");
+$file_stat_module->mock(
+    stat => sub {
+        my ($path) = @_;
+        my $st = $file_stat_module->original('stat')->('./run_qemu_img_convert_tests.pl');
+        $st->[2] = 25008 if $path =~ m!/dev/!; # block device
+        return $st;
+    },
+);
+
 my $storage_module = Test::MockModule->new("PVE::Storage");
 $storage_module->mock(
     config => sub {
-- 
2.47.2