* [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive
[not found] <20251021140054.3916467-1-alexandre.derumier@groupe-cyllene.com>
@ 2025-10-21 14:00 ` Alexandre Derumier via pve-devel
2025-10-22 8:22 ` Fiona Ebner
2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete Alexandre Derumier via pve-devel
1 sibling, 1 reply; 7+ messages in thread
From: Alexandre Derumier via pve-devel @ 2025-10-21 14:00 UTC (permalink / raw)
To: pve-devel; +Cc: Alexandre Derumier
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive
Date: Tue, 21 Oct 2025 16:00:53 +0200
Message-ID: <20251021140054.3916467-2-alexandre.derumier@groupe-cyllene.com>
The current cstream implementation is pretty slow, even without throttling.
Use blkdiscard --zeroout instead when the storage supports it, which is a
few orders of magnitude faster.
Another benefit is that blkdiscard skips already zeroed blocks, so it is
pretty fast for empty temporary images such as snapshots.
blkdiscard does not offer throttling like cstream, but we can tune the step
size of the zeroes pushed to the storage.
I'm using a 32 MB step size by default, like oVirt, where it seems to be the
best balance between speed and load:
https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986
but it can be reduced with the "saferemove-stepsize" option.
The step size is also automatically reduced to the sysfs
write_zeroes_max_bytes value, which is the maximum zeroing batch supported
by the storage.
Also add a "saferemove-discard" option to use discard instead of zeroing.
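For reference, a condensed sketch of the step-size capping described above
(assuming $sysdir points to the /sys/block/dm-N directory of the activated
LV, as set up in the diff below):

    use PVE::Tools qw(file_read_firstline);

    my $stepsize = 32 * 1024 * 1024; # 32 MiB default, like oVirt
    # 0 means the device does not support write zeroes at all
    my $limit = file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0;
    # never exceed the maximum zeroing batch the device advertises
    $stepsize = $limit if $limit > 0 && $stepsize > $limit;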
test with a 100G volume (empty):
time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576
13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s
26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s
38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s
50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s
63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s
76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s
89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s
101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s
107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s
write: No space left on device
real 1m24.394s
user 0m0.171s
sys 1m24.052s
time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
/dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0
real 0m3.641s
user 0m0.001s
sys 0m3.433s
test with a 100G volume with random data:
time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
/dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344
/dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392
/dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600
/dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808
/dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016
/dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656
/dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840
/dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048
/dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576
/dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624
/dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488
/dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648
/dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784
/dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512
/dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424
/dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264
/dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312
/dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904
/dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560
/dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016
real 0m23.969s
user 0m0.030s
sys 0m0.144s
Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
---
src/PVE/Storage/LVMPlugin.pm | 94 +++++++++++++++++++++++++++++++-----
1 file changed, 83 insertions(+), 11 deletions(-)
diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
index 0416c9e..1c633a3 100644
--- a/src/PVE/Storage/LVMPlugin.pm
+++ b/src/PVE/Storage/LVMPlugin.pm
@@ -3,10 +3,11 @@ package PVE::Storage::LVMPlugin;
use strict;
use warnings;
+use Cwd 'abs_path';
use File::Basename;
use IO::File;
-use PVE::Tools qw(run_command trim);
+use PVE::Tools qw(run_command file_read_firstline trim);
use PVE::Storage::Plugin;
use PVE::JSONSchema qw(get_standard_option);
@@ -284,23 +285,40 @@ my sub free_lvm_volumes {
my $vg = $scfg->{vgname};
- # we need to zero out LVM data for security reasons
- # and to allow thin provisioning
- my $zero_out_worker = sub {
- # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
- my $throughput = '-10485760';
- if ($scfg->{saferemove_throughput}) {
- $throughput = $scfg->{saferemove_throughput};
+ my $secure_delete_cmd = sub {
+ my ($lvmpath) = @_;
+
+ my $stepsize = $scfg->{'saferemove-stepsize'} // 32;
+ $stepsize = $stepsize * 1024 * 1024;
+
+ my $bdev = abs_path($lvmpath);
+
+ my $sysdir = undef;
+ if ($bdev && $bdev =~ m|^/dev/(dm-\d+)|) {
+ $sysdir = "/sys/block/$1";
+ } else {
+ warn "skip secure delete. lvm volume don't seem to be activated\n";
+ return;
}
- for my $name (@$volnames) {
- print "zero-out data on image $name (/dev/$vg/del-$name)\n";
+
+ my $write_zeroes_max_bytes =
+ file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0;
+ ($write_zeroes_max_bytes) = $write_zeroes_max_bytes =~ m/^(\d+)$/; #untaint
+
+ if ($write_zeroes_max_bytes == 0) {
+ # if storage don't support write zeroes, we fallback to cstream
+ # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
+ my $throughput = '-10485760';
+ if ($scfg->{saferemove_throughput}) {
+ $throughput = $scfg->{saferemove_throughput};
+ }
my $cmd = [
'/usr/bin/cstream',
'-i',
'/dev/zero',
'-o',
- "/dev/$vg/del-$name",
+ $lvmpath,
'-T',
'10',
'-v',
@@ -318,6 +336,47 @@ my sub free_lvm_volumes {
};
warn $@ if $@;
+ } else {
+
+ # if the storage support write_zeroes but stepsize is too big,
+ # reduce the stepsize to the max possible
+ if ($write_zeroes_max_bytes > 0 && $stepsize > $write_zeroes_max_bytes) {
+ warn "reduce stepsize to the maximum supported by the storage:"
+ . "$write_zeroes_max_bytes bytes\n";
+
+ $stepsize = $write_zeroes_max_bytes;
+ }
+
+ my $discard_enabled = undef;
+
+ if ($scfg->{'saferemove-discard'}) {
+ my $discard_zeroes_data =
+ file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0;
+
+ if ($discard_zeroes_data == 0) {
+ warn "Discard zeroes not supported. Fallback to zeroing.\n";
+ } else {
+ $discard_enabled = 1;
+ }
+ }
+
+ my $cmd = ['/usr/sbin/blkdiscard', $lvmpath, '-v', '--step', "${stepsize}"];
+ push @$cmd, '--zeroout' if !$discard_enabled;
+
+ eval { run_command($cmd); };
+ warn $@ if $@;
+ }
+ };
+
+ # we need to zero out LVM data for security reasons
+ # and to allow thin provisioning
+ my $zero_out_worker = sub {
+ for my $name (@$volnames) {
+ my $lvmpath = "/dev/$vg/del-$name";
+ print "zero-out data on image $name ($lvmpath)\n";
+
+ $secure_delete_cmd->($lvmpath);
+
$class->cluster_lock_storage(
$storeid,
$scfg->{shared},
@@ -376,6 +435,17 @@ sub properties {
description => "Zero-out data when removing LVs.",
type => 'boolean',
},
+ 'saferemove-discard' => {
+ description => "Wipe with discard instead of zeroing.",
+ type => 'boolean',
+ default => 0,
+ },
+ 'saferemove-stepsize' => {
+ description => "Wipe step size (default 32MB)."
+ . "It will be capped to the maximum storage support.",
+ enum => [qw(1 2 4 8 16 32)],
+ type => 'integer',
+ },
saferemove_throughput => {
description => "Wipe throughput (cstream -t parameter value).",
type => 'string',
@@ -394,6 +464,8 @@ sub options {
shared => { optional => 1 },
disable => { optional => 1 },
saferemove => { optional => 1 },
+ 'saferemove-discard' => { optional => 1 },
+ 'saferemove-stepsize' => { optional => 1 },
saferemove_throughput => { optional => 1 },
content => { optional => 1 },
base => { fixed => 1, optional => 1 },
--
2.47.3
* [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete
[not found] <20251021140054.3916467-1-alexandre.derumier@groupe-cyllene.com>
2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Alexandre Derumier via pve-devel
@ 2025-10-21 14:00 ` Alexandre Derumier via pve-devel
2025-10-22 9:12 ` Fiona Ebner
1 sibling, 1 reply; 7+ messages in thread
From: Alexandre Derumier via pve-devel @ 2025-10-21 14:00 UTC (permalink / raw)
To: pve-devel; +Cc: Alexandre Derumier
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete
Date: Tue, 21 Oct 2025 16:00:54 +0200
Message-ID: <20251021140054.3916467-3-alexandre.derumier@groupe-cyllene.com>
It was lost in the lvm qcow2 patches
Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
---
src/PVE/Storage/LVMPlugin.pm | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
index 1c633a3..9716854 100644
--- a/src/PVE/Storage/LVMPlugin.pm
+++ b/src/PVE/Storage/LVMPlugin.pm
@@ -375,6 +375,17 @@ my sub free_lvm_volumes {
my $lvmpath = "/dev/$vg/del-$name";
print "zero-out data on image $name ($lvmpath)\n";
+ my $cmd_activate = ['/sbin/lvchange', '-aly', $lvmpath];
+ run_command(
+ $cmd_activate,
+ errmsg => "can't activate LV '$lvmpath' to zero-out its data",
+ );
+ $cmd_activate = ['/sbin/lvchange', '--refresh', $lvmpath];
+ run_command(
+ $cmd_activate,
+ errmsg => "can't refresh LV '$lvmpath' to zero-out its data",
+ );
+
$secure_delete_cmd->($lvmpath);
$class->cluster_lock_storage(
@@ -755,13 +766,6 @@ my sub alloc_snap_image {
my sub free_snap_image {
my ($class, $storeid, $scfg, $volname, $snap) = @_;
- #activate only the snapshot volume
- my $path = $class->path($scfg, $volname, $storeid, $snap);
- my $cmd = ['/sbin/lvchange', '-aly', $path];
- run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data");
- $cmd = ['/sbin/lvchange', '--refresh', $path];
- run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data");
-
my $snap_volname = get_snap_name($class, $volname, $snap);
return free_lvm_volumes($class, $scfg, $storeid, [$snap_volname]);
}
@@ -777,11 +781,7 @@ sub free_image {
#activate volumes && snapshot volumes
my $path = $class->path($scfg, $volname, $storeid);
$path = "\@pve-$name" if $format && $format eq 'qcow2';
- my $cmd = ['/sbin/lvchange', '-aly', $path];
- run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data");
- $cmd = ['/sbin/lvchange', '--refresh', $path];
- run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data");
-
+ $class->activate_volume($storeid, $scfg, $volname);
my $snapshots = $class->volume_snapshot_info($scfg, $storeid, $volname);
for my $snapid (
sort { $snapshots->{$a}->{order} <=> $snapshots->{$b}->{order} }
--
2.47.3
* Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive
2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive Alexandre Derumier via pve-devel
@ 2025-10-22 8:22 ` Fiona Ebner
2025-10-22 9:35 ` DERUMIER, Alexandre via pve-devel
[not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com>
0 siblings, 2 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-22 8:22 UTC (permalink / raw)
To: Proxmox VE development discussion
It's v4, not v2.
On 21.10.25 at 4:01 PM, Alexandre Derumier via pve-devel wrote:
> diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
> index 0416c9e..1c633a3 100644
> --- a/src/PVE/Storage/LVMPlugin.pm
> +++ b/src/PVE/Storage/LVMPlugin.pm
> @@ -3,10 +3,11 @@ package PVE::Storage::LVMPlugin;
> use strict;
> use warnings;
>
> +use Cwd 'abs_path';
> use File::Basename;
> use IO::File;
>
> -use PVE::Tools qw(run_command trim);
> +use PVE::Tools qw(run_command file_read_firstline trim);
> use PVE::Storage::Plugin;
> use PVE::JSONSchema qw(get_standard_option);
>
> @@ -284,23 +285,40 @@ my sub free_lvm_volumes {
>
> my $vg = $scfg->{vgname};
>
> - # we need to zero out LVM data for security reasons
> - # and to allow thin provisioning
> - my $zero_out_worker = sub {
> - # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
> - my $throughput = '-10485760';
> - if ($scfg->{saferemove_throughput}) {
> - $throughput = $scfg->{saferemove_throughput};
> + my $secure_delete_cmd = sub {
> + my ($lvmpath) = @_;
> +
> + my $stepsize = $scfg->{'saferemove-stepsize'} // 32;
> + $stepsize = $stepsize * 1024 * 1024;
> +
> + my $bdev = abs_path($lvmpath);
> +
> + my $sysdir = undef;
> + if ($bdev && $bdev =~ m|^/dev/(dm-\d+)|) {
> + $sysdir = "/sys/block/$1";
> + } else {
> + warn "skip secure delete. lvm volume don't seem to be activated\n";
> + return;
> }
> - for my $name (@$volnames) {
> - print "zero-out data on image $name (/dev/$vg/del-$name)\n";
> +
> + my $write_zeroes_max_bytes =
> + file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0;
> + ($write_zeroes_max_bytes) = $write_zeroes_max_bytes =~ m/^(\d+)$/; #untaint
> +
> + if ($write_zeroes_max_bytes == 0) {
What if a storage supports discard, but this value is zero? Then we'd
fall back to the slow method even if we don't need to.
> + # if storage don't support write zeroes, we fallback to cstream
> + # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
> + my $throughput = '-10485760';
> + if ($scfg->{saferemove_throughput}) {
> + $throughput = $scfg->{saferemove_throughput};
> + }
>
> my $cmd = [
> '/usr/bin/cstream',
> '-i',
> '/dev/zero',
> '-o',
> - "/dev/$vg/del-$name",
> + $lvmpath,
> '-T',
> '10',
> '-v',
> @@ -318,6 +336,47 @@ my sub free_lvm_volumes {
> };
> warn $@ if $@;
>
> + } else {
> +
> + # if the storage support write_zeroes but stepsize is too big,
> + # reduce the stepsize to the max possible
> + if ($write_zeroes_max_bytes > 0 && $stepsize > $write_zeroes_max_bytes) {
> + warn "reduce stepsize to the maximum supported by the storage:"
> + . "$write_zeroes_max_bytes bytes\n";
> +
> + $stepsize = $write_zeroes_max_bytes;
Similarly, here we'd also reduce the step size even if discard is used later.
> + }
> +
> + my $discard_enabled = undef;
> +
> + if ($scfg->{'saferemove-discard'}) {
> + my $discard_zeroes_data =
> + file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0;
Are you sure this works? See:
https://www.kernel.org/doc/html/v6.17/admin-guide/abi-stable.html#abi-sys-block-disk-queue-discard-zeroes-data
"[RO] Will always return 0. Don’t rely on any specific behavior for
discards, and don’t read this file."
Isn't discard_max_hw_bytes the correct one, which also can be used to
determine the step size:
https://www.kernel.org/doc/html/v6.17/admin-guide/abi-stable.html#abi-sys-block-disk-queue-discard-max-hw-bytes
"[RO] Devices that support discard functionality may have internal
limits on the number of bytes that can be trimmed or unmapped in a
single operation. The discard_max_hw_bytes parameter is set by the
device driver to the maximum number of bytes that can be discarded in a
single operation. Discard requests issued to the device must not exceed
this limit. A discard_max_hw_bytes value of 0 means that the device does
not support discard functionality."
And I'm not sure a limit of 32 MiB makes sense then. If the hardware
supports much more, it should be fine to use that, or? Do we even want
to consider saferemove-stepsize if discard is supported? Of course,
depending on what we decide, the description in the schema needs to be
adapted.
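A rough sketch of what reading that attribute could look like (untested,
just to illustrate; whether and how discard should be used at all is still
open):

    my $val = file_read_firstline("$sysdir/queue/discard_max_hw_bytes") // '0';
    my ($discard_max_hw_bytes) = $val =~ m/^(\d+)$/; # untaint
    $discard_max_hw_bytes //= 0;

    if ($discard_max_hw_bytes > 0) {
        # device supports discard; single requests must not exceed this limit
        $stepsize = $discard_max_hw_bytes if $stepsize > $discard_max_hw_bytes;
    } else {
        # 0 means no discard support at all -> stick with zeroing
    }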
> +
> + if ($discard_zeroes_data == 0) {
> + warn "Discard zeroes not supported. Fallback to zeroing.\n";
> + } else {
> + $discard_enabled = 1;
> + }
> + }
> +
> + my $cmd = ['/usr/sbin/blkdiscard', $lvmpath, '-v', '--step', "${stepsize}"];
> + push @$cmd, '--zeroout' if !$discard_enabled;
> +
> + eval { run_command($cmd); };
> + warn $@ if $@;
> + }
> + };
> +
> + # we need to zero out LVM data for security reasons
> + # and to allow thin provisioning
> + my $zero_out_worker = sub {
> + for my $name (@$volnames) {
> + my $lvmpath = "/dev/$vg/del-$name";
> + print "zero-out data on image $name ($lvmpath)\n";
> +
> + $secure_delete_cmd->($lvmpath);
> +
> $class->cluster_lock_storage(
> $storeid,
> $scfg->{shared},
> @@ -376,6 +435,17 @@ sub properties {
> description => "Zero-out data when removing LVs.",
> type => 'boolean',
> },
> + 'saferemove-discard' => {
> + description => "Wipe with discard instead of zeroing.",
> + type => 'boolean',
> + default => 0,
> + },
> + 'saferemove-stepsize' => {
> + description => "Wipe step size (default 32MB)."
I'd put "Wipe step size in MiB." in the description and put a
"default => 32" via the dedicated key for defaults in the schema.
> + . "It will be capped to the maximum storage support.",
> + enum => [qw(1 2 4 8 16 32)],
> + type => 'integer',
> + },
> saferemove_throughput => {
> description => "Wipe throughput (cstream -t parameter value).",
> type => 'string',
> @@ -394,6 +464,8 @@ sub options {
> shared => { optional => 1 },
> disable => { optional => 1 },
> saferemove => { optional => 1 },
> + 'saferemove-discard' => { optional => 1 },
> + 'saferemove-stepsize' => { optional => 1 },
> saferemove_throughput => { optional => 1 },
> content => { optional => 1 },
> base => { fixed => 1, optional => 1 },
> --
> 2.47.3
>
>
* Re: [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete
2025-10-21 14:00 ` [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete Alexandre Derumier via pve-devel
@ 2025-10-22 9:12 ` Fiona Ebner
2025-10-22 9:35 ` Fiona Ebner
0 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2025-10-22 9:12 UTC (permalink / raw)
To: Proxmox VE development discussion
On 21.10.25 at 4:01 PM, Alexandre Derumier via pve-devel wrote:
> It was lost in the lvm qcow2 patches
This commit message is very unspecific. Should at least mention what the
issue is/that this affects raw volumes. When referring to an older
commit, please include the short commit ID and title.
> Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
I like the approach of moving activation closer to where it's actually
needed :) Some comments inline:
> ---
> src/PVE/Storage/LVMPlugin.pm | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
> index 1c633a3..9716854 100644
> --- a/src/PVE/Storage/LVMPlugin.pm
> +++ b/src/PVE/Storage/LVMPlugin.pm
> @@ -375,6 +375,17 @@ my sub free_lvm_volumes {
> my $lvmpath = "/dev/$vg/del-$name";
> print "zero-out data on image $name ($lvmpath)\n";
>
> + my $cmd_activate = ['/sbin/lvchange', '-aly', $lvmpath];
> + run_command(
> + $cmd_activate,
> + errmsg => "can't activate LV '$lvmpath' to zero-out its data",
> + );
> + $cmd_activate = ['/sbin/lvchange', '--refresh', $lvmpath];
> + run_command(
> + $cmd_activate,
> + errmsg => "can't refresh LV '$lvmpath' to zero-out its data",
> + );
> +
> $secure_delete_cmd->($lvmpath);
>
> $class->cluster_lock_storage(
> @@ -755,13 +766,6 @@ my sub alloc_snap_image {
> my sub free_snap_image {
> my ($class, $storeid, $scfg, $volname, $snap) = @_;
>
> - #activate only the snapshot volume
> - my $path = $class->path($scfg, $volname, $storeid, $snap);
> - my $cmd = ['/sbin/lvchange', '-aly', $path];
> - run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data");
> - $cmd = ['/sbin/lvchange', '--refresh', $path];
> - run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data");
> -
> my $snap_volname = get_snap_name($class, $volname, $snap);
> return free_lvm_volumes($class, $scfg, $storeid, [$snap_volname]);
> }
> @@ -777,11 +781,7 @@ sub free_image {
> #activate volumes && snapshot volumes
This comment is misleading now. It should note that snapshots are
activated later in free_lvm_volumes() if needed for zeroing.
> my $path = $class->path($scfg, $volname, $storeid);
> $path = "\@pve-$name" if $format && $format eq 'qcow2';
The $path variable is not used anymore and can be dropped.
On another note, the way of using tags like "@pve-vm-105-disk-2.qcow2"
is not quite correct, because there might be multiple LVM storages with
qcow2 and volumes with the same name. And those then should not be
tagged the same, but currently are. While it's probably very rare to
come across such a setup, it's not impossible and could lead to
hard-to-debug issues down the line. But it's out of scope for the
current series.
> - my $cmd = ['/sbin/lvchange', '-aly', $path];
> - run_command($cmd, errmsg => "can't activate LV '$path' to zero-out its data");
> - $cmd = ['/sbin/lvchange', '--refresh', $path];
> - run_command($cmd, errmsg => "can't refresh LV '$path' to zero-out its data");
> -
> + $class->activate_volume($storeid, $scfg, $volname);
> my $snapshots = $class->volume_snapshot_info($scfg, $storeid, $volname);
> for my $snapid (
> sort { $snapshots->{$a}->{order} <=> $snapshots->{$b}->{order} }
> --
> 2.47.3
>
>
* Re: [pve-devel] [PATCH v2 pve-storage 2/2] fix #6941 : lvmplugin : fix activation on secure delete
2025-10-22 9:12 ` Fiona Ebner
@ 2025-10-22 9:35 ` Fiona Ebner
0 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-22 9:35 UTC (permalink / raw)
To: Proxmox VE development discussion
On 22.10.25 at 11:12 AM, Fiona Ebner wrote:
> On 21.10.25 at 4:01 PM, Alexandre Derumier via pve-devel wrote:
>> @@ -777,11 +781,7 @@ sub free_image {
>> #activate volumes && snapshot volumes
>
> This comment is misleading now. It should note that snapshots are
> activated later in free_lvm_volumes() if needed for zeroing.
>
>> my $path = $class->path($scfg, $volname, $storeid);
>> $path = "\@pve-$name" if $format && $format eq 'qcow2';
>
> The $path variable is not used anymore and can be dropped.
>
> On another note, the way of using tags like "@pve-vm-105-disk-2.qcow2"
> is not quite correct, because there might be multiple LVM storages with
> qcow2 and volumes with the same name. And those then should not be
> tagged the same, but currently are. While it's probably very rare to
> come across such a setup, it's not impossible and could lead to
> hard-to-debug issues down the line. But it's out of scope for the
> current series.
Addendum: an approach would be including the storage ID in the tag, but
then renaming a storage would cause breakage and it wouldn't help for
existing setups. Most robust would be to use a list of explicit LVs when
issuing lvchange commands.
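Rough sketch of what that could look like ($vg and @$lv_names are
hypothetical here, untested):

    # activate/refresh an explicit list of LVs instead of a shared tag
    my @lv_paths = map { "/dev/$vg/$_" } @$lv_names; # volume plus its snapshot LVs
    run_command(
        ['/sbin/lvchange', '-aly', @lv_paths],
        errmsg => "can't activate LVs " . join(', ', @lv_paths),
    );
    run_command(
        ['/sbin/lvchange', '--refresh', @lv_paths],
        errmsg => "can't refresh LVs " . join(', ', @lv_paths),
    );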
* Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive
2025-10-22 8:22 ` Fiona Ebner
@ 2025-10-22 9:35 ` DERUMIER, Alexandre via pve-devel
[not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com>
1 sibling, 0 replies; 7+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-10-22 9:35 UTC (permalink / raw)
To: pve-devel, f.ebner; +Cc: DERUMIER, Alexandre
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "f.ebner@proxmox.com" <f.ebner@proxmox.com>
Subject: Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive
Date: Wed, 22 Oct 2025 09:35:32 +0000
Message-ID: <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com>
> +
> file_read_firstline("$sysdir/queue/write_zeroes_max_bytes") // 0;
> + ($write_zeroes_max_bytes) = $write_zeroes_max_bytes =~
> m/^(\d+)$/; #untaint
> +
> + if ($write_zeroes_max_bytes == 0) {
>>What if a storage supports discard, but this value is zero? Then we'd
>>fall back to the slow method even if we don't need to.
Do you think there exist storages that support discard but not write
zeroes?
But ok, no problem, I'll change this to use cstream as a fallback for both
(but see my other comments: I think we can't rely on discard anyway).
> + }
> +
> + my $discard_enabled = undef;
> +
> + if ($scfg->{'saferemove-discard'}) {
> + my $discard_zeroes_data =
> +
> file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0;
>>Are you sure this works? See:
>>https://www.kernel.org/doc/html/v6.17/admin-guide/abi-
>>stable.html#abi-sys-block-disk-queue-discard-zeroes-data
>>
>>"[RO] Will always return 0. Don’t rely on any specific behavior for
>>discards, and don’t read this file."
Ah, you are right, it was removed some years ago:
https://git.zx2c4.com/linux-rng/commit/?h=jd/vdso-test-harness&id=48920ff2a5a940cd07d12cc79e4a2c75f1185aee
>>
from what I understand, it was a hack because REQ_OP_WRITE_ZEROES was
not implemented, so it was using discard with zeroing (when it was
possible).
>>Isn't discard_max_hw_bytes the correct one, which also can be used to
>>determine the step size:
>>https://www.kernel.org/doc/html/v6.17/admin-guide/abi-
>>stable.html#abi-sys-block-disk-queue-discard-max-hw-bytes
>>"[RO] Devices that support discard functionality may have internal
>>limits on the number of bytes that can be trimmed or unmapped in a
>>single operation. The discard_max_hw_bytes parameter is set by the
>>device driver to the maximum number of bytes that can be discarded in
>>a
>>single operation. Discard requests issued to the device must not
>>exceed
>>this limit. A discard_max_hw_bytes value of 0 means that the device
>>does
>>not support discard functionality."
Mmm, it's not that simple, because queue/discard_zeroes_data was a flag
to know whether the discarded blocks on the storage are also really
zero-filled at the same time (RZAT, return zero after TRIM).
I know this was quite buggy because of storage implementation bugs;
maybe this is why Red Hat removed discard support later:
https://access.redhat.com/errata/RHBA-2018:0135
"The kernel no longer supports the /sys/block/dm-
X/queue/discard_zeroes_data file in sysfs. It is therefore no longer
possible to determine whether discarded blocks from a block device
returns zeros or the actual data. Therefore, the virtual machine disk
properties "Wipe After Delete" and "Enable Discard" are no longer
supported at the same time. (BZ#1529305)"
So, maybe we don't need to use discard at all, REQ_OP_WRITE_ZEROES is
enough. (From my test, it's quite fast)
>>And I'm not sure a limit of 32 MiB makes sense then. If the hardware
>>supports much more, it should be fine to use that, or?
The default of 32 MB for zeroing comes from Red Hat benchmarks on real SAN
hardware, as a balance between zeroing speed and latency for the running VM
workload.
>> Do we even want to consider saferemove-stepsize if discard is supported?
>> Of course, depending on what we decide, the description in the schema
>> needs to be adapted.
It could still be possible to use a step size below discard_max_hw_bytes.
The step size is really there to avoid pushing too much load at once.
* Re: [pve-devel] [PATCH v2 pve-storage 1/2] lvmplugin: use blkdiscard when supported instead cstream to saferemove drive
[not found] ` <414f3fba72876b4138458b625275892e059378af.camel@groupe-cyllene.com>
@ 2025-10-22 9:43 ` Fiona Ebner
0 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2025-10-22 9:43 UTC (permalink / raw)
To: DERUMIER, Alexandre, pve-devel
On 22.10.25 at 11:35 AM, DERUMIER, Alexandre wrote:
>> + }
>> +
>> + my $discard_enabled = undef;
>> +
>> + if ($scfg->{'saferemove-discard'}) {
>> + my $discard_zeroes_data =
>> +
>> file_read_firstline("$sysdir/queue/discard_zeroes_data") // 0;
>
>>> Are you sure this works? See:
>>> https://www.kernel.org/doc/html/v6.17/admin-guide/abi-
>>> stable.html#abi-sys-block-disk-queue-discard-zeroes-data
>>>
>>> "[RO] Will always return 0. Don’t rely on any specific behavior for
>>> discards, and don’t read this file."
>
> Ah, you are right, it was removed some years ago:
> https://git.zx2c4.com/linux-rng/commit/?h=jd/vdso-test-harness&id=48920ff2a5a940cd07d12cc79e4a2c75f1185aee
>>>
> from what I understand, it was a hack because REQ_OP_WRITE_ZEROES was
> not implemented, so it was using discard with zeroing (when it was
> possible).
>
>
>>> Isn't discard_max_hw_bytes the correct one, which also can be used to
>>> determine the step size:
>>> https://www.kernel.org/doc/html/v6.17/admin-guide/abi-
>>> stable.html#abi-sys-block-disk-queue-discard-max-hw-bytes
>
>>> "[RO] Devices that support discard functionality may have internal
>>> limits on the number of bytes that can be trimmed or unmapped in a
>>> single operation. The discard_max_hw_bytes parameter is set by the
>>> device driver to the maximum number of bytes that can be discarded in
>>> a
>>> single operation. Discard requests issued to the device must not
>>> exceed
>>> this limit. A discard_max_hw_bytes value of 0 means that the device
>>> does
>>> not support discard functionality."
>
> Mmm, it's not that simple, because queue/discard_zeroes_data was a flag
> to know whether the discarded blocks on the storage are also really
> zero-filled at the same time (RZAT, return zero after TRIM).
>
> I know this was quite buggy because of storage implementation bugs;
> maybe this is why Red Hat removed discard support later:
> https://access.redhat.com/errata/RHBA-2018:0135
> "The kernel no longer supports the /sys/block/dm-
> X/queue/discard_zeroes_data file in sysfs. It is therefore no longer
> possible to determine whether discarded blocks from a block device
> returns zeros or the actual data. Therefore, the virtual machine disk
> properties "Wipe After Delete" and "Enable Discard" are no longer
> supported at the same time. (BZ#1529305)"
>
>
> So, maybe we don't need to use discard at all, REQ_OP_WRITE_ZEROES is
> enough. (From my test, it's quite fast)
Oh sorry, of course discard by itself is not good enough. Yes, let's
just go with always using --zeroout (or the fallback if
REQ_OP_WRITE_ZEROES is not supported).
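So the selection would boil down to something like this (sketch, reusing
$write_zeroes_max_bytes, $stepsize and $lvmpath from your patch):

    if ($write_zeroes_max_bytes > 0) {
        # storage supports write zeroes -> fast path via blkdiscard
        run_command(['/usr/sbin/blkdiscard', $lvmpath, '-v', '--step', $stepsize, '--zeroout']);
    } else {
        # no write-zeroes support -> keep the throttled cstream fallback
    }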
>>> And I'm not sure a limit of 32 MiB makes sense then. If the hardware
>>> supports much more, it should be fine to use that, or?
>
> The default of 32 MB for zeroing comes from Red Hat benchmarks on real SAN
> hardware, as a balance between zeroing speed and latency for the running VM
> workload.
For zeroing, the default is perfectly sensible, I was only talking about
discard here.