* [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive [not found] <20251021140054.3916467-1-alexandre.derumier@groupe-cyllene.com> @ 2025-10-21 14:00 ` Alexandre Derumier via pve-devel 2025-10-22 8:22 ` Fiona Ebner 2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete Alexandre Derumier via pve-devel 1 sibling, 1 reply; 7+ messages in thread From: Alexandre Derumier via pve-devel @ 2025-10-21 14:00 UTC (permalink / raw) To: pve-devel; +Cc: Alexandre Derumier [-- Attachment #1: Type: message/rfc822, Size: 12253 bytes --] From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com> To: pve-devel@lists.proxmox.com Subject: [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Date: Tue, 21 Oct 2025 16:00:53 +0200 Message-ID: <20251021140054.3916467-2-alexandre.derumier@groupe-cyllene.com> Current cstream implementation is pretty slow, even without throttling. use blkdiscard --zeroout instead when storage support it, which is a few magnitudes faster. Another benefit is that blkdiscard is skipping already zeroed block, so for empty temp images like snapshot, is pretty fast. blkdiscard don't have throttling like cstream, but we can tune the step size of zeroes pushed to the storage. I'm using 32MB stepsize by default , like ovirt, where it seem to be the best balance between speed and load. https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986 but it can be reduce with "saferemove_stepsize" option. stepsize is also autoreduce to sysfs write_zeroes_max_bytes, which is the maximum zeroing batch supported by the storage Also adding an option "saferemove_discard", to use discard instead zeroing. test with a 100G volume (empty): time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576 13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s 26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s 38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s 50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s 63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s 76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s 89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s 101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s 107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s write: No space left on device real 1m24.394s user 0m0.171s sys 1m24.052s time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v /dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0 real 0m3.641s user 0m0.001s sys 0m3.433s test with a 100G volume with random data: time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v /dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0 /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344 /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392 /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600 /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808 /dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016 /dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656 /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840 /dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752 /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048 /dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096 /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576 /dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624 /dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488 /dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648 /dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784 /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512 /dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424 /dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264 /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264 /dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312 /dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904 /dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560 /dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016 real 0m23.969s user 0m0.030s sys 0m0.144s Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com> --- src/PVE/Storage/LVMPlugin.pm | 94 +++++++++++++++++++++++++++++++----- 1 file changed, 83 insertions(+), 11 deletions(-) diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm index 0416c9e..1c633a3 100644 --- a/src/PVE/Storage/LVMPlugin.pm +++ b/src/PVE/Storage/LVMPlugin.pm @@ -3,10 +3,11 @@ package PVE::Storage::LVMPlugin; use strict; use warnings; +use Cwd 'abs_path'; use File::Basename; use IO::File; -use PVE::Tools qw(run_command trim); +use PVE::Tools qw(run_command file_read_firstline trim); use PVE::Storage::Plugin; use PVE::JSONSchema qw(get_standard_option); @@ -284,23 +285,40 @@ my sub free_lvm_volumes { my $vg = $scfg->{vgname}; - # we need to zero out LVM data for security reasons - # and to allow thin provisioning - my $zero_out_worker = sub { - # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput - my $throughput = '-10485760'; - if ($scfg->{saferemove_throughput}) { - $throughput = $scfg->{saferemove_throughput}; + my $secure_delete_cmd = sub { + my ($lvmpath) = @_; + + my $stepsize = $scfg->{'saferemove-stepsize'} // 32; + $stepsize = $stepsize * 1024 * 1024; + + my $bdev = abs_path($lvmpath); + + my $sysdir = undef; + if ($bdev && $bdev =~ m|^/dev/(dm-\d+)|) { + $sysdir = "/sys/block/$1"; + } else { + warn "skip secure delete. lvm volume don't seem to be activated\n"; + return; } - for my $name (@$volnames) { - print "zero-out data on image $name (/dev/$vg/del-$name)\n"; + + my $write_zeroes_max_bytes = + file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0; + ($write_zeroes_max_bytes) = $write_zeroes_max_bytes =~ m/^(\d+)$/; #untaint + + if ($write_zeroes_max_bytes == 0) { + # if storage don't support write zeroes, we fallback to cstream + # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput + my $throughput = '-10485760'; + if ($scfg->{saferemove_throughput}) { + $throughput = $scfg->{saferemove_throughput}; + } my $cmd = [ '/usr/bin/cstream', '-i', '/dev/zero', '-o', - "/dev/$vg/del-$name", + $lvmpath, '-T', '10', '-v', @@ -318,6 +336,47 @@ my sub free_lvm_volumes { }; warn $@ if $@; + } else { + + # if the storage support write_zeroes but stepsize is too big, + # reduce the stepsize to the max possible + if ($write_zeroes_max_bytes > 0 && $stepsize > $write_zeroes_max_bytes) { + warn "reduce stepsize to the maximum supported by the storage:" + . "$write_zeroes_max_bytes bytes\n"; + + $stepsize = $write_zeroes_max_bytes; + } + + my $discard_enabled = undef; + + if ($scfg->{'saferemove-discard'}) { + my $discard_zeroes_data = + file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0; + + if ($discard_zeroes_data == 0) { + warn "Discard zeroes not supported. Fallback to zeroing.\n"; + } else { + $discard_enabled = 1; + } + } + + my $cmd = ['/usr/sbin/blkdiscard', $lvmpath, '-v', '--step', "${stepsize}"]; + push @$cmd, '--zeroout' if !$discard_enabled; + + eval { run_command($cmd); }; + warn $@ if $@; + } + }; + + # we need to zero out LVM data for security reasons + # and to allow thin provisioning + my $zero_out_worker = sub { + for my $name (@$volnames) { + my $lvmpath = "/dev/$vg/del-$name"; + print "zero-out data on image $name ($lvmpath)\n"; + + $secure_delete_cmd->($lvmpath); + $class->cluster_lock_storage( $storeid, $scfg->{shared}, @@ -376,6 +435,17 @@ sub properties { description => "Zero-out data when removing LVs.", type => 'boolean', }, + 'saferemove-discard' => { + description => "Wipe with discard instead of zeroing.", + type => 'boolean', + default => 0, + }, + 'saferemove-stepsize' => { + description => "Wipe step size (default 32MB)." + . "It will be capped to the maximum storage support.", + enum => [qw(1 2 4 8 16 32)], + type => 'integer', + }, saferemove_throughput => { description => "Wipe throughput (cstream -t parameter value).", type => 'string', @@ -394,6 +464,8 @@ sub options { shared => { optional => 1 }, disable => { optional => 1 }, saferemove => { optional => 1 }, + 'saferemove-discard' => { optional => 1 }, + 'saferemove-stepsize' => { optional => 1 }, saferemove_throughput => { optional => 1 }, content => { optional => 1 }, base => { fixed => 1, optional => 1 }, -- 2.47.3 [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive 2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Alexandre Derumier via pve-devel @ 2025-10-22 8:22 ` Fiona Ebner 2025-10-22 9:35 ` DERUMIER, Alexandre via pve-devel [not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com> 0 siblings, 2 replies; 7+ messages in thread From: Fiona Ebner @ 2025-10-22 8:22 UTC (permalink / raw) To: Proxmox VE development discussion It's v4 not v2. Am 21.10.25 um 4:01 PM schrieb Alexandre Derumier via pve-devel: > diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm > index 0416c9e..1c633a3 100644 > --- a/src/PVE/Storage/LVMPlugin.pm > +++ b/src/PVE/Storage/LVMPlugin.pm > @@ -3,10 +3,11 @@ package PVE::Storage::LVMPlugin; > use strict; > use warnings; > > +use Cwd 'abs_path'; > use File::Basename; > use IO::File; > > -use PVE::Tools qw(run_command trim); > +use PVE::Tools qw(run_command file_read_firstline trim); > use PVE::Storage::Plugin; > use PVE::JSONSchema qw(get_standard_option); > > @@ -284,23 +285,40 @@ my sub free_lvm_volumes { > > my $vg = $scfg->{vgname}; > > - # we need to zero out LVM data for security reasons > - # and to allow thin provisioning > - my $zero_out_worker = sub { > - # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput > - my $throughput = '-10485760'; > - if ($scfg->{saferemove_throughput}) { > - $throughput = $scfg->{saferemove_throughput}; > + my $secure_delete_cmd = sub { > + my ($lvmpath) = @_; > + > + my $stepsize = $scfg->{'saferemove-stepsize'} // 32; > + $stepsize = $stepsize * 1024 * 1024; > + > + my $bdev = abs_path($lvmpath); > + > + my $sysdir = undef; > + if ($bdev && $bdev =~ m|^/dev/(dm-\d+)|) { > + $sysdir = "/sys/block/$1"; > + } else { > + warn "skip secure delete. lvm volume don't seem to be activated\n"; > + return; > } > - for my $name (@$volnames) { > - print "zero-out data on image $name (/dev/$vg/del-$name)\n"; > + > + my $write_zeroes_max_bytes = > + file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0; > + ($write_zeroes_max_bytes) = $write_zeroes_max_bytes =~ m/^(\d+)$/; #untaint > + > + if ($write_zeroes_max_bytes == 0) { What if a storage supports discard, but this value is zero? Then we'd fall back to the slow method even if we don't need to. > + # if storage don't support write zeroes, we fallback to cstream > + # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput > + my $throughput = '-10485760'; > + if ($scfg->{saferemove_throughput}) { > + $throughput = $scfg->{saferemove_throughput}; > + } > > my $cmd = [ > '/usr/bin/cstream', > '-i', > '/dev/zero', > '-o', > - "/dev/$vg/del-$name", > + $lvmpath, > '-T', > '10', > '-v', > @@ -318,6 +336,47 @@ my sub free_lvm_volumes { > }; > warn $@ if $@; > > + } else { > + > + # if the storage support write_zeroes but stepsize is too big, > + # reduce the stepsize to the max possible > + if ($write_zeroes_max_bytes > 0 && $stepsize > $write_zeroes_max_bytes) { > + warn "reduce stepsize to the maximum supported by the storage:" > + . "$write_zeroes_max_bytes bytes\n"; > + > + $stepsize = $write_zeroes_max_bytes; Similar here, we'd also reduce the step size even if later using discard. > + } > + > + my $discard_enabled = undef; > + > + if ($scfg->{'saferemove-discard'}) { > + my $discard_zeroes_data = > + file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0; Are you sure this works? See: https://www.kernel.org/doc/html/v6.17/admin-guide/abi-stable.html#abi-sys-block-disk-queue-discard-zeroes-data "[RO] Will always return 0. Don’t rely on any specific behavior for discards, and don’t read this file." Isn't discard_max_hw_bytes the correct one, which also can be used to determine the step size: https://www.kernel.org/doc/html/v6.17/admin-guide/abi-stable.html#abi-sys-block-disk-queue-discard-max-hw-bytes "[RO] Devices that support discard functionality may have internal limits on the number of bytes that can be trimmed or unmapped in a single operation. The discard_max_hw_bytes parameter is set by the device driver to the maximum number of bytes that can be discarded in a single operation. Discard requests issued to the device must not exceed this limit. A discard_max_hw_bytes value of 0 means that the device does not support discard functionality." And I'm not sure a limit of 32 MiB makes sense then. If the hardware supports much more, it should be fine to use that, or? Do we even want to consider saferemove-stepsize if discard is supported? Of course depending on what we decide on the description in the schema needs to be adapted. > + > + if ($discard_zeroes_data == 0) { > + warn "Discard zeroes not supported. Fallback to zeroing.\n"; > + } else { > + $discard_enabled = 1; > + } > + } > + > + my $cmd = ['/usr/sbin/blkdiscard', $lvmpath, '-v', '--step', "${stepsize}"]; > + push @$cmd, '--zeroout' if !$discard_enabled; > + > + eval { run_command($cmd); }; > + warn $@ if $@; > + } > + }; > + > + # we need to zero out LVM data for security reasons > + # and to allow thin provisioning > + my $zero_out_worker = sub { > + for my $name (@$volnames) { > + my $lvmpath = "/dev/$vg/del-$name"; > + print "zero-out data on image $name ($lvmpath)\n"; > + > + $secure_delete_cmd->($lvmpath); > + > $class->cluster_lock_storage( > $storeid, > $scfg->{shared}, > @@ -376,6 +435,17 @@ sub properties { > description => "Zero-out data when removing LVs.", > type => 'boolean', > }, > + 'saferemove-discard' => { > + description => "Wipe with discard instead of zeroing.", > + type => 'boolean', > + default => 0, > + }, > + 'saferemove-stepsize' => { > + description => "Wipe step size (default 32MB)." I'd put "Wipe step size in MiB." in the description and put a "default => 32" via the dedicated key for defaults in the schema. > + . "It will be capped to the maximum storage support.", > + enum => [qw(1 2 4 8 16 32)], > + type => 'integer', > + }, > saferemove_throughput => { > description => "Wipe throughput (cstream -t parameter value).", > type => 'string', > @@ -394,6 +464,8 @@ sub options { > shared => { optional => 1 }, > disable => { optional => 1 }, > saferemove => { optional => 1 }, > + 'saferemove-discard' => { optional => 1 }, > + 'saferemove-stepsize' => { optional => 1 }, > saferemove_throughput => { optional => 1 }, > content => { optional => 1 }, > base => { fixed => 1, optional => 1 }, > -- > 2.47.3 > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive 2025-10-22 8:22 ` Fiona Ebner @ 2025-10-22 9:35 ` DERUMIER, Alexandre via pve-devel [not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com> 1 sibling, 0 replies; 7+ messages in thread From: DERUMIER, Alexandre via pve-devel @ 2025-10-22 9:35 UTC (permalink / raw) To: pve-devel, f.ebner; +Cc: DERUMIER, Alexandre [-- Attachment #1: Type: message/rfc822, Size: 17464 bytes --] From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com> To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "f.ebner@proxmox.com" <f.ebner@proxmox.com> Subject: Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Date: Wed, 22 Oct 2025 09:35:32 +0000 Message-ID: <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com> > + > file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0; > + ($write_zeroes_max_bytes) = $write_zeroes_max_bytes =~ > m/^(\d+)$/; #untaint > + > + if ($write_zeroes_max_bytes == 0) { >>What if a storage supports discard, but this value is zero? Then we'd >>fall back to the slow method even if we don't need to. Do you think that they exist storages supporting discard but not write zeroes ? but ok , no problem, I'll change this to really use cstream as fallback for both (but see my other comments, I think that we can't use on discard anyway) > + } > + > + my $discard_enabled = undef; > + > + if ($scfg->{'saferemove-discard'}) { > + my $discard_zeroes_data = > + > file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0; >>Are you sure this works? See: >>https://www.kernel.org/doc/html/v6.17/admin-guide/abi- >>stable.html#abi-sys-block-disk-queue-discard-zeroes-data >> >>"[RO] Will always return 0. Don’t rely on any specific behavior for >>discards, and don’t read this file." ah, you are right, is has been removed some year ago https://git.zx2c4.com/linux-rng/commit/?h=jd/vdso-test-harness&id=48920ff2a5a940cd07d12cc79e4a2c75f1185aee >> from what I understand, it was a hack because REQ_OP_WRITE_ZEROES was not implemented, so it was using discard with zeroing (when it was possible). >>Isn't discard_max_hw_bytes the correct one, which also can be used to >>determine the step size: >>https://www.kernel.org/doc/html/v6.17/admin-guide/abi- >>stable.html#abi-sys-block-disk-queue-discard-max-hw-bytes >>"[RO] Devices that support discard functionality may have internal >>limits on the number of bytes that can be trimmed or unmapped in a >>single operation. The discard_max_hw_bytes parameter is set by the >>device driver to the maximum number of bytes that can be discarded in >>a >>single operation. Discard requests issued to the device must not >>exceed >>this limit. A discard_max_hw_bytes value of 0 means that the device >>does >>not support discard functionality." mmm, it's not that simple because the queue/discard_zeroes_data was a flag to known if the discarded block on the storage are also really zero filled at same time (RZAT TRIM (return zero after trim).) I known this was quite buggy because of storage implementation bug, maybe this is why redhat have removed discard support later https://access.redhat.com/errata/RHBA-2018:0135 "The kernel no longer supports the /sys/block/dm- X/queue/discard_zeroes_data file in sysfs. It is therefore no longer possible to determine whether discarded blocks from a block device returns zeros or the actual data. Therefore, the virtual machine disk properties "Wipe After Delete" and "Enable Discard" are no longer supported at the same time. (BZ#1529305)" So, maybe we don't need to use discard at all, REQ_OP_WRITE_ZEROES is enough. (From my test, it's quite fast) >>And I'm not sure a limit of 32 MiB makes sense then. If the hardware >>supports much more, it should be fine to use that, or? The default of 32MB for zeroing was from redhat benchmark on real san hardware, balance between zeroing speed and latency for the running vm workload. >>Do we even >>want >>to consider saferemove-stepsize if discard is supported? Of course >>depending on what we decide on the description in the schema needs to >>be >>adapted. it could still be possible to use the step-size under disk-queue- discard-max-hw-bytes. It's really here to avoid pushing too much load at once. [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com>]
* Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive [not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com> @ 2025-10-22 9:43 ` Fiona Ebner 0 siblings, 0 replies; 7+ messages in thread From: Fiona Ebner @ 2025-10-22 9:43 UTC (permalink / raw) To: DERUMIER, Alexandre, pve-devel Am 22.10.25 um 11:35 AM schrieb DERUMIER, Alexandre: >> + } >> + >> + my $discard_enabled = undef; >> + >> + if ($scfg->{'saferemove-discard'}) { >> + my $discard_zeroes_data = >> + >> file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0; > >>> Are you sure this works? See: >>> https://www.kernel.org/doc/html/v6.17/admin-guide/abi- >>> stable.html#abi-sys-block-disk-queue-discard-zeroes-data >>> >>> "[RO] Will always return 0. Don’t rely on any specific behavior for >>> discards, and don’t read this file." > > ah, you are right, is has been removed some year ago > https://git.zx2c4.com/linux-rng/commit/?h=jd/vdso-test-harness&id=48920ff2a5a940cd07d12cc79e4a2c75f1185aee >>> > from what I understand, it was a hack because REQ_OP_WRITE_ZEROES was > not implemented, so it was using discard with zeroing (when it was > possible). > > >>> Isn't discard_max_hw_bytes the correct one, which also can be used to >>> determine the step size: >>> https://www.kernel.org/doc/html/v6.17/admin-guide/abi- >>> stable.html#abi-sys-block-disk-queue-discard-max-hw-bytes > >>> "[RO] Devices that support discard functionality may have internal >>> limits on the number of bytes that can be trimmed or unmapped in a >>> single operation. The discard_max_hw_bytes parameter is set by the >>> device driver to the maximum number of bytes that can be discarded in >>> a >>> single operation. Discard requests issued to the device must not >>> exceed >>> this limit. A discard_max_hw_bytes value of 0 means that the device >>> does >>> not support discard functionality." > > mmm, it's not that simple because the queue/discard_zeroes_data was a > flag to known if the discarded block on the storage are also really > zero filled at same time (RZAT TRIM (return zero after trim).) > > I known this was quite buggy because of storage implementation bug, > maybe this is why redhat have removed discard support later > https://access.redhat.com/errata/RHBA-2018:0135 > "The kernel no longer supports the /sys/block/dm- > X/queue/discard_zeroes_data file in sysfs. It is therefore no longer > possible to determine whether discarded blocks from a block device > returns zeros or the actual data. Therefore, the virtual machine disk > properties "Wipe After Delete" and "Enable Discard" are no longer > supported at the same time. (BZ#1529305)" > > > So, maybe we don't need to use discard at all, REQ_OP_WRITE_ZEROES is > enough. (From my test, it's quite fast) Oh sorry, of course discard by itself is not good enough. Yes, let's just go with always using --zeroout (or the fallback if REQ_OP_WRITE_ZEROES is not supported). >>> And I'm not sure a limit of 32 MiB makes sense then. If the hardware >>> supports much more, it should be fine to use that, or? > > The default of 32MB for zeroing was from redhat benchmark on real san > hardware, balance between zeroing speed and latency for the running vm > workload. For zeroing, the default is perfectly sensible, I was only talking about discard here. _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
* [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete [not found] <20251021140054.3916467-1-alexandre.derumier@groupe-cyllene.com> 2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Alexandre Derumier via pve-devel @ 2025-10-21 14:00 ` Alexandre Derumier via pve-devel 2025-10-22 9:12 ` Fiona Ebner 1 sibling, 1 reply; 7+ messages in thread From: Alexandre Derumier via pve-devel @ 2025-10-21 14:00 UTC (permalink / raw) To: pve-devel; +Cc: Alexandre Derumier [-- Attachment #1: Type: message/rfc822, Size: 5452 bytes --] From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com> To: pve-devel@lists.proxmox.com Subject: [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete Date: Tue, 21 Oct 2025 16:00:54 +0200 Message-ID: <20251021140054.3916467-3-alexandre.derumier@groupe-cyllene.com> It was lost in the lvm qcow2 patches Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com> --- src/PVE/Storage/LVMPlugin.pm | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm index 1c633a3..9716854 100644 --- a/src/PVE/Storage/LVMPlugin.pm +++ b/src/PVE/Storage/LVMPlugin.pm @@ -375,6 +375,17 @@ my sub free_lvm_volumes { my $lvmpath = "/dev/$vg/del-$name"; print "zero-out data on image $name ($lvmpath)\n"; + my $cmd_activate = ['/sbin/lvchange', '-aly', $lvmpath]; + run_command( + $cmd_activate, + errmsg => "can't activate LV '$lvmpath' to zero-out its data", + ); + $cmd_activate = ['/sbin/lvchange', '--refresh', $lvmpath]; + run_command( + $cmd_activate, + errmsg => "can't refresh LV '$lvmpath' to zero-out its data", + ); + $secure_delete_cmd->($lvmpath); $class->cluster_lock_storage( @@ -755,13 +766,6 @@ my sub alloc_snap_image { my sub free_snap_image { my ($class, $storeid, $scfg, $volname, $snap) = @_; - #activate only the snapshot volume - my $path = $class->path($scfg, $volname, $storeid, $snap); - my $cmd = ['/sbin/lvchange', '-aly', $path]; - run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data"); - $cmd = ['/sbin/lvchange', '--refresh', $path]; - run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data"); - my $snap_volname = get_snap_name($class, $volname, $snap); return free_lvm_volumes($class, $scfg, $storeid, [$snap_volname]); } @@ -777,11 +781,7 @@ sub free_image { #activate volumes && snapshot volumes my $path = $class->path($scfg, $volname, $storeid); $path = "\@pve-$name" if $format && $format eq 'qcow2'; - my $cmd = ['/sbin/lvchange', '-aly', $path]; - run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data"); - $cmd = ['/sbin/lvchange', '--refresh', $path]; - run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data"); - + $class->activate_volume($storeid, $scfg, $volname); my $snapshots = $class->volume_snapshot_info($scfg, $storeid, $volname); for my $snapid ( sort { $snapshots->{$a}->{order} <=> $snapshots->{$b}->{order} } -- 2.47.3 [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete 2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete Alexandre Derumier via pve-devel @ 2025-10-22 9:12 ` Fiona Ebner 2025-10-22 9:35 ` Fiona Ebner 0 siblings, 1 reply; 7+ messages in thread From: Fiona Ebner @ 2025-10-22 9:12 UTC (permalink / raw) To: Proxmox VE development discussion Am 21.10.25 um 4:01 PM schrieb Alexandre Derumier via pve-devel: > It was lost in the lvm qcow2 patches This commit message is very unspecific. Should at least mention what the issue is/that this affects raw volumes. When referring to an older commit, please include the short commit ID and title. > Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com> I like the approach of moving activation closer to where actually needed :) Some comments inline: > --- > src/PVE/Storage/LVMPlugin.pm | 24 ++++++++++++------------ > 1 file changed, 12 insertions(+), 12 deletions(-) > > diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm > index 1c633a3..9716854 100644 > --- a/src/PVE/Storage/LVMPlugin.pm > +++ b/src/PVE/Storage/LVMPlugin.pm > @@ -375,6 +375,17 @@ my sub free_lvm_volumes { > my $lvmpath = "/dev/$vg/del-$name"; > print "zero-out data on image $name ($lvmpath)\n"; > > + my $cmd_activate = ['/sbin/lvchange', '-aly', $lvmpath]; > + run_command( > + $cmd_activate, > + errmsg => "can't activate LV '$lvmpath' to zero-out its data", > + ); > + $cmd_activate = ['/sbin/lvchange', '--refresh', $lvmpath]; > + run_command( > + $cmd_activate, > + errmsg => "can't refresh LV '$lvmpath' to zero-out its data", > + ); > + > $secure_delete_cmd->($lvmpath); > > $class->cluster_lock_storage( > @@ -755,13 +766,6 @@ my sub alloc_snap_image { > my sub free_snap_image { > my ($class, $storeid, $scfg, $volname, $snap) = @_; > > - #activate only the snapshot volume > - my $path = $class->path($scfg, $volname, $storeid, $snap); > - my $cmd = ['/sbin/lvchange', '-aly', $path]; > - run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data"); > - $cmd = ['/sbin/lvchange', '--refresh', $path]; > - run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data"); > - > my $snap_volname = get_snap_name($class, $volname, $snap); > return free_lvm_volumes($class, $scfg, $storeid, [$snap_volname]); > } > @@ -777,11 +781,7 @@ sub free_image { > #activate volumes && snapshot volumes This comment is misleading now. It should note that snapshots are activated later in free_lvm_volumes() if needed for zeroing. > my $path = $class->path($scfg, $volname, $storeid); > $path = "\@pve-$name" if $format && $format eq 'qcow2'; The $path variable is not used anymore and can be dropped. On another note, the way of using tags like "@pve-vm-105-disk-2.qcow2" is not quite correct, because there might be multiple LVM storages with qcow2 and volumes with the same name. And those then should not be tagged the same, but currently are. While it's probably very rare to come across such a setup, it's not impossible and could lead to hard-to-debug issues down the line. But it's out of scope for the current series. > - my $cmd = ['/sbin/lvchange', '-aly', $path]; > - run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data"); > - $cmd = ['/sbin/lvchange', '--refresh', $path]; > - run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data"); > - > + $class->activate_volume($storeid, $scfg, $volname); > my $snapshots = $class->volume_snapshot_info($scfg, $storeid, $volname); > for my $snapid ( > sort { $snapshots->{$a}->{order} <=> $snapshots->{$b}->{order} } > -- > 2.47.3 > > _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete 2025-10-22 9:12 ` Fiona Ebner @ 2025-10-22 9:35 ` Fiona Ebner 0 siblings, 0 replies; 7+ messages in thread From: Fiona Ebner @ 2025-10-22 9:35 UTC (permalink / raw) To: Proxmox VE development discussion Am 22.10.25 um 11:12 AM schrieb Fiona Ebner: > Am 21.10.25 um 4:01 PM schrieb Alexandre Derumier via pve-devel: >> @@ -777,11 +781,7 @@ sub free_image { >> #activate volumes && snapshot volumes > > This comment is misleading now. It should note that snapshots are > activated later in free_lvm_volumes() if needed for zeroing. > >> my $path = $class->path($scfg, $volname, $storeid); >> $path = "\@pve-$name" if $format && $format eq 'qcow2'; > > The $path variable is not used anymore and can be dropped. > > On another note, the way of using tags like "@pve-vm-105-disk-2.qcow2" > is not quite correct, because there might be multiple LVM storages with > qcow2 and volumes with the same name. And those then should not be > tagged the same, but currently are. While it's probably very rare to > come across such a setup, it's not impossible and could lead to > hard-to-debug issues down the line. But it's out of scope for the > current series. Addendum: an approach would be including the storage ID in the tag, but then renaming a storage would cause breakage and it wouldn't help for existing setups. Most robust would be to use a list of explicit LVs when issuing lvchange commands. _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-10-22 9:42 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20251021140054.3916467-1-alexandre.derumier@groupe-cyllene.com> 2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Alexandre Derumier via pve-devel 2025-10-22 8:22 ` Fiona Ebner 2025-10-22 9:35 ` DERUMIER, Alexandre via pve-devel [not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com> 2025-10-22 9:43 ` Fiona Ebner 2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete Alexandre Derumier via pve-devel 2025-10-22 9:12 ` Fiona Ebner 2025-10-22 9:35 ` Fiona Ebner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox