* [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
@ 2025-08-14 9:24 Alexandre Derumier via pve-devel
2025-08-14 9:37 ` Wolfgang Bumiller
0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Derumier via pve-devel @ 2025-08-14 9:24 UTC (permalink / raw)
To: pve-devel; +Cc: Alexandre Derumier
[-- Attachment #1: Type: message/rfc822, Size: 9876 bytes --]
From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
To: pve-devel@lists.proxmox.com
Subject: [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
Date: Thu, 14 Aug 2025 11:24:48 +0200
Message-ID: <20250814092448.919114-1-alexandre.derumier@groupe-cyllene.com>
The current cstream implementation is very slow, even without throttling.
Use blkdiscard --zeroout instead, which is orders of magnitude faster.
Another benefit is that blkdiscard skips blocks that are already zeroed, so wiping empty
temp images like snapshots is pretty fast.
blkdiscard doesn't have throttling like cstream, but we can tune the step size
of the zeroing requests pushed to the storage.
I'm using a 32 MB step size by default, like oVirt, where it seems to be the best
balance between speed and load.
https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986
It can be reduced with the "saferemove_stepsize" option.
Also add an option "saferemove_discard" to use discard instead of zeroing.
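For illustration, the new options would look like this in /etc/pve/storage.cfg (example
values only; the storage id is made up and saferemove_discard defaults to off):

    lvm: mysan
        vgname test
        content images
        saferemove 1
        saferemove_stepsize 16
        saferemove_discard 0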
test with a 100G volume (empty):
time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576
13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s
26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s
38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s
50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s
63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s
76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s
89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s
101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s
107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s
write: No space left on device
real 1m24.394s
user 0m0.171s
sys 1m24.052s
time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
/dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0
real 0m3.641s
user 0m0.001s
sys 0m3.433s
test with a 100G volume with random data:
time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
/dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344
/dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392
/dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600
/dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808
/dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016
/dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656
/dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840
/dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048
/dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576
/dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624
/dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488
/dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648
/dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784
/dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512
/dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424
/dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264
/dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264
/dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312
/dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904
/dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560
/dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016
real 0m23.969s
user 0m0.030s
sys 0m0.144s
Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
---
src/PVE/Storage/LVMPlugin.pm | 43 ++++++++++++------------------------
1 file changed, 14 insertions(+), 29 deletions(-)
diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
index 0416c9e..5ea9d8b 100644
--- a/src/PVE/Storage/LVMPlugin.pm
+++ b/src/PVE/Storage/LVMPlugin.pm
@@ -287,35 +287,15 @@ my sub free_lvm_volumes {
# we need to zero out LVM data for security reasons
# and to allow thin provisioning
my $zero_out_worker = sub {
- # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
- my $throughput = '-10485760';
- if ($scfg->{saferemove_throughput}) {
- $throughput = $scfg->{saferemove_throughput};
- }
for my $name (@$volnames) {
print "zero-out data on image $name (/dev/$vg/del-$name)\n";
-
+ my $stepsize = $scfg->{saferemove_stepsize} // 32;
my $cmd = [
- '/usr/bin/cstream',
- '-i',
- '/dev/zero',
- '-o',
- "/dev/$vg/del-$name",
- '-T',
- '10',
- '-v',
- '1',
- '-b',
- '1048576',
- '-t',
- "$throughput",
+ '/usr/sbin/blkdiscard', "/dev/$vg/del-$name", '-v', '--step', "${stepsize}M",
];
- eval {
- run_command(
- $cmd,
- errmsg => "zero out finished (note: 'No space left on device' is ok here)",
- );
- };
+ push @$cmd, '--zeroout' if !$scfg->{saferemove_discard};
+
+ eval { run_command($cmd); };
warn $@ if $@;
$class->cluster_lock_storage(
@@ -376,9 +356,13 @@ sub properties {
description => "Zero-out data when removing LVs.",
type => 'boolean',
},
- saferemove_throughput => {
- description => "Wipe throughput (cstream -t parameter value).",
- type => 'string',
+ saferemove_stepsize => {
+ description => "Wipe step size (default 32MB).",
+ enum => [qw(1 2 4 8 16 32)],
+ },
+ saferemove_discard => {
+ description => "Wipe with discard instead zeroing.",
+ type => 'boolean',
},
tagged_only => {
description => "Only use logical volumes tagged with 'pve-vm-ID'.",
@@ -394,7 +378,8 @@ sub options {
shared => { optional => 1 },
disable => { optional => 1 },
saferemove => { optional => 1 },
- saferemove_throughput => { optional => 1 },
+ saferemove_discard => { optional => 1 },
+ saferemove_stepsize => { optional => 1 },
content => { optional => 1 },
base => { fixed => 1, optional => 1 },
tagged_only => { optional => 1 },
--
2.47.2
* Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
2025-08-14 9:24 [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive Alexandre Derumier via pve-devel
@ 2025-08-14 9:37 ` Wolfgang Bumiller
2025-08-14 10:07 ` DERUMIER, Alexandre via pve-devel
[not found] ` <9d5353263ee0aa3ee67d8d331f674f8a00044b1e.camel@groupe-cyllene.com>
0 siblings, 2 replies; 7+ messages in thread
From: Wolfgang Bumiller @ 2025-08-14 9:37 UTC (permalink / raw)
To: Proxmox VE development discussion
On Thu, Aug 14, 2025 at 11:24:48AM +0200, Alexandre Derumier via pve-devel wrote:
> From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> To: pve-devel@lists.proxmox.com
> Subject: [PATCH pve-storage] lvm: use blkdiscard instead cstream to
> saferemove drive
> Date: Thu, 14 Aug 2025 11:24:48 +0200
> Message-ID: <20250814092448.919114-1-alexandre.derumier@groupe-cyllene.com>
> X-Mailer: git-send-email 2.47.2
>
> The current cstream implementation is very slow, even without throttling.
>
> Use blkdiscard --zeroout instead, which is orders of magnitude faster.
Makes sense, given that it probably uses dedicated zero-out `ioctls()`.
>
> Another benefit is that blkdiscard skips blocks that are already zeroed, so wiping empty
> temp images like snapshots is pretty fast.
>
> blkdiscard doesn't have throttling like cstream, but we can tune the step size
> of the zeroing requests pushed to the storage.
> I'm using a 32 MB step size by default, like oVirt, where it seems to be the best
> balance between speed and load.
> https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986
>
> It can be reduced with the "saferemove_stepsize" option.
>
> Also add an option "saferemove_discard" to use discard instead of zeroing.
>
> test with a 100G volume (empty):
>
> time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576
>
> 13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s
> 26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s
> 38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s
> 50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s
> 63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s
> 76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s
> 89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s
> 101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s
> 107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s
> write: No space left on device
>
> real 1m24.394s
> user 0m0.171s
> sys 1m24.052s
>
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
> /dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0
>
> real 0m3.641s
> user 0m0.001s
> sys 0m3.433s
>
> test with a 100G volume with random data:
>
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
>
> /dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808
> /dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016
> /dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840
> /dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048
> /dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576
> /dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624
> /dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488
> /dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648
> /dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512
> /dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424
> /dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264
> /dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312
> /dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904
> /dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560
> /dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016
>
> real 0m23.969s
> user 0m0.030s
> sys 0m0.144s
>
> Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> ---
> src/PVE/Storage/LVMPlugin.pm | 43 ++++++++++++------------------------
> 1 file changed, 14 insertions(+), 29 deletions(-)
>
> diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
> index 0416c9e..5ea9d8b 100644
> --- a/src/PVE/Storage/LVMPlugin.pm
> +++ b/src/PVE/Storage/LVMPlugin.pm
> @@ -287,35 +287,15 @@ my sub free_lvm_volumes {
> # we need to zero out LVM data for security reasons
> # and to allow thin provisioning
> my $zero_out_worker = sub {
> - # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
> - my $throughput = '-10485760';
> - if ($scfg->{saferemove_throughput}) {
> - $throughput = $scfg->{saferemove_throughput};
> - }
> for my $name (@$volnames) {
> print "zero-out data on image $name (/dev/$vg/del-$name)\n";
> -
> + my $stepsize = $scfg->{saferemove_stepsize} // 32;
> my $cmd = [
> - '/usr/bin/cstream',
> - '-i',
> - '/dev/zero',
> - '-o',
> - "/dev/$vg/del-$name",
> - '-T',
> - '10',
> - '-v',
> - '1',
> - '-b',
> - '1048576',
> - '-t',
> - "$throughput",
> + '/usr/sbin/blkdiscard', "/dev/$vg/del-$name", '-v', '--step', "${stepsize}M",
> ];
> - eval {
> - run_command(
> - $cmd,
> - errmsg => "zero out finished (note: 'No space left on device' is ok here)",
> - );
> - };
> + push @$cmd, '--zeroout' if !$scfg->{saferemove_discard};
> +
> + eval { run_command($cmd); };
> warn $@ if $@;
>
> $class->cluster_lock_storage(
> @@ -376,9 +356,13 @@ sub properties {
> description => "Zero-out data when removing LVs.",
> type => 'boolean',
> },
> - saferemove_throughput => {
> - description => "Wipe throughput (cstream -t parameter value).",
> - type => 'string',
We do still need to keep the old option around to not break existing
configs. We should document its deprecation and add a warning if it is
set.
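Roughly something like this (untested sketch), so existing configs keep parsing:

    # keep the old option, purely for backwards compatibility
    saferemove_throughput => {
        description => "Wipe throughput (deprecated and ignored, cstream is no longer used).",
        type => 'string',
    },

and in the zero-out worker:

    warn "ignoring deprecated 'saferemove_throughput' option\n"
        if defined($scfg->{saferemove_throughput});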
> + saferemove_stepsize => {
> + description => "Wipe step size (default 32MB).",
> + enum => [qw(1 2 4 8 16 32)],
> + },
> + saferemove_discard => {
> + description => "Wipe with discard instead zeroing.",
> + type => 'boolean',
Not sure we need this (not sure when this is actually useful), but it's
cheap enough to have around. Should add a `default => 0` for
documentation purposes, though.
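I.e. something along these lines (sketch):

    saferemove_discard => {
        description => "Wipe with discard instead of zeroing.",
        type => 'boolean',
        default => 0,
    },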
> },
> tagged_only => {
> description => "Only use logical volumes tagged with 'pve-vm-ID'.",
> @@ -394,7 +378,8 @@ sub options {
> shared => { optional => 1 },
> disable => { optional => 1 },
> saferemove => { optional => 1 },
> - saferemove_throughput => { optional => 1 },
> + saferemove_discard => { optional => 1 },
> + saferemove_stepsize => { optional => 1 },
> content => { optional => 1 },
> base => { fixed => 1, optional => 1 },
> tagged_only => { optional => 1 },
> --
> 2.47.2
* Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
2025-08-14 9:37 ` Wolfgang Bumiller
@ 2025-08-14 10:07 ` DERUMIER, Alexandre via pve-devel
[not found] ` <9d5353263ee0aa3ee67d8d331f674f8a00044b1e.camel@groupe-cyllene.com>
1 sibling, 0 replies; 7+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-08-14 10:07 UTC (permalink / raw)
To: w.bumiller, pve-devel; +Cc: DERUMIER, Alexandre
[-- Attachment #1: Type: message/rfc822, Size: 12883 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "w.bumiller@proxmox.com" <w.bumiller@proxmox.com>, "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
Date: Thu, 14 Aug 2025 10:07:59 +0000
Message-ID: <9d5353263ee0aa3ee67d8d331f674f8a00044b1e.camel@groupe-cyllene.com>
> + saferemove_discard => {
> + description => "Wipe with discard instead zeroing.",
> + type => 'boolean',
>> Not sure we need this (not sure when this is actually useful), but it's
>> cheap enough to have around. Should add a `default => 0` for
>> documentation purposes, though.
Some storages allow overprovisioning (creating a LUN bigger than the real
storage size), so it can be interesting to discard instead of zeroing, to
free space on the storage side.
As snapshots currently use an LVM volume with the same size as the main
volume, overprovisioning can be interesting there.
BTW, the Blockbridge guys wrote a nice article about this here:
https://kb.blockbridge.com/technote/proxmox-qcow-snapshots-on-lvm/
[parent not found: <9d5353263ee0aa3ee67d8d331f674f8a00044b1e.camel@groupe-cyllene.com>]
* Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
[not found] ` <9d5353263ee0aa3ee67d8d331f674f8a00044b1e.camel@groupe-cyllene.com>
@ 2025-08-14 12:05 ` Wolfgang Bumiller
2025-08-14 13:11 ` DERUMIER, Alexandre via pve-devel
2025-08-15 5:10 ` DERUMIER, Alexandre via pve-devel
0 siblings, 2 replies; 7+ messages in thread
From: Wolfgang Bumiller @ 2025-08-14 12:05 UTC (permalink / raw)
To: DERUMIER, Alexandre; +Cc: pve-devel
On Thu, Aug 14, 2025 at 10:07:59AM +0000, DERUMIER, Alexandre wrote:
> > + saferemove_discard => {
> > + description => "Wipe with discard instead zeroing.",
> > + type => 'boolean',
>
> >>Not sure we need this (not sure when this is actually useful), but
> >>it's
> >>cheap enough to have around. Should add a `default => 0` for
> >>documentation purposes, though.
>
> Some storage allow overprovisioning. (create a lun bigger than the real
> storage size), it can be interesting to discard instead zeroing (to
> free space on storage side).
> as snapshots currently use lvm volume with same size than the main
> volume, it can be interesting to have overprovisioning.
But this does not guarantee that the data is actually erased/zeroed, in
which case I'd just disable the thing altogether (but then again I use
`issue_discards` in lvm.conf to cause `lvremove` to discard the whole
thing... ;-)
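For reference, that is this global switch in /etc/lvm/lvm.conf:

    devices {
        issue_discards = 1
    }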
* Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
2025-08-14 12:05 ` Wolfgang Bumiller
@ 2025-08-14 13:11 ` DERUMIER, Alexandre via pve-devel
2025-08-15 5:10 ` DERUMIER, Alexandre via pve-devel
1 sibling, 0 replies; 7+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-08-14 13:11 UTC (permalink / raw)
To: w.bumiller; +Cc: DERUMIER, Alexandre, pve-devel
[-- Attachment #1: Type: message/rfc822, Size: 15476 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "w.bumiller@proxmox.com" <w.bumiller@proxmox.com>
Cc: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
Date: Thu, 14 Aug 2025 13:11:00 +0000
Message-ID: <89d59912c82b9b98e1422786aad88a1606108c96.camel@groupe-cyllene.com>
-------- Original message --------
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
Cc: pve-devel@lists.proxmox.com <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead
cstream to saferemove drive
Date: 14/08/2025 14:05:51
On Thu, Aug 14, 2025 at 10:07:59AM +0000, DERUMIER, Alexandre wrote:
> > + saferemove_discard => {
> > + description => "Wipe with discard instead zeroing.",
> > + type => 'boolean',
>
> > > Not sure we need this (not sure when this is actually useful), but it's
> > > cheap enough to have around. Should add a `default => 0` for
> > > documentation purposes, though.
>
> Some storage allow overprovisioning. (create a lun bigger than the
> real
> storage size), it can be interesting to discard instead zeroing (to
> free space on storage side).
> as snapshots currently use lvm volume with same size than the main
> volume, it can be interesting to have overprovisioning.
>>But this does not guarantee that the data is actually erased/zeroed,
Some SAN arrays are able to zero on discard (but also, some SAN arrays
have buggy discard implementations, even if they expose the feature).
>> in which case I'd just disable the thing altogether (but then again I use
>> `issue_discards` in lvm.conf to cause `lvremove` to discard the whole
>> thing... ;-)
Yes, this is mostly to avoid the need to tune issue_discards: if you have
multiple storages, AFAIK it's not easy to enable issue_discards for one
storage and not another.
I was talking about it with the Blockbridge guy, I think he's better than
me on this subject:
https://forum.proxmox.com/threads/inside-proxmox-ve-9-san-snapshot-support.169675/
"The main issue we see is ensuring the raw device is fully zeroed
before layering a QCOW on top. QCOW doesn't serialize metadata and data
writes, and it lacks a journal. If the backing device doesn't read
zeros, a power loss or process termination can lead to data corruption.
Addressing this is tricky. Zeroing semantics for unmap/discard vary by
device and implementation, but they can usually be checked via the
device's VPD pages. Offloaded zeroing is worth exploring, though older
systems might not support it, and multiple device vendors have buggy
implementations that the kernel disabled via quirks."
* Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
2025-08-14 12:05 ` Wolfgang Bumiller
2025-08-14 13:11 ` DERUMIER, Alexandre via pve-devel
@ 2025-08-15 5:10 ` DERUMIER, Alexandre via pve-devel
2025-08-15 5:47 ` DERUMIER, Alexandre via pve-devel
1 sibling, 1 reply; 7+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-08-15 5:10 UTC (permalink / raw)
To: w.bumiller; +Cc: DERUMIER, Alexandre, pve-devel
[-- Attachment #1: Type: message/rfc822, Size: 15986 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "w.bumiller@proxmox.com" <w.bumiller@proxmox.com>
Cc: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
Date: Fri, 15 Aug 2025 05:10:58 +0000
Message-ID: <f7d67adf5349e0e58669611eb94960a7f9fee006.camel@groupe-cyllene.com>
I found some documentation about blkdiscard in Red Hat oVirt:
https://www.ovirt.org/develop/release-management/features/storage/wipe-volumes-using-blkdiscard.html
"Let lunX be a device and dm-X be its corresponding dm device for a
natural number X. Then lunX is considered to support write same iff the
value of /sys/block/dm-X/queue/write_same_max_bytes is bigger than 0.
A device that supports write same is a device that allows to write a
single data block to a range of several contiguous blocks in the
storage.
That means that instead of writing a 1MB block of zeros 1024 times to
zero a volume of 1GB (as vdsm does with dd today), a single request to
write that 1MB block of zeros to the right range is enough, and the
rest is done by the storage array.
When calling blkdiscard -z <block_device>:
If the block device supports write same, then the kernel quickly zeroes
it using write same.
Else, the kernel zeroes it by writing pages of zeros.
"
(My test was without write-same support, so even without it, it's still a
lot faster than cstream.)
And for discard with zeroes:
"
Then lunX supports the property that discard zeroes the data iff the
value of /sys/block/dm-X/queue/discard_zeroes_data is 1"
(But here, some SANs can have bad implementations, so we can't auto-enable
it.)
Maybe we should at minimum check that discard_zeroes_data=1 if the user
enables saferemove_discard, and fall back to classic zeroing if not?
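A rough sketch of what such a check could look like in LVMPlugin.pm (untested; assumes
the sysfs attribute is present and meaningful on the target kernel, and uses a
simplified dm resolution):

    # hypothetical helper: does discard guarantee zeroed data on this device?
    my sub discard_zeroes_data {
        my ($device) = @_;
        # /dev/<vg>/<lv> is a symlink to ../dm-N
        my $dm = (split('/', readlink($device) // ''))[-1];
        return 0 if !$dm;
        my $val = PVE::Tools::file_read_firstline("/sys/block/$dm/queue/discard_zeroes_data");
        return defined($val) && $val == 1;
    }

    # in the zero-out worker: only drop --zeroout if discard really zeroes
    my $use_discard = $scfg->{saferemove_discard}
        && discard_zeroes_data("/dev/$vg/del-$name");
    push @$cmd, '--zeroout' if !$use_discard;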
-------- Original message --------
From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
Cc: pve-devel@lists.proxmox.com <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead
cstream to saferemove drive
Date: 14/08/2025 14:05:51
On Thu, Aug 14, 2025 at 10:07:59AM +0000, DERUMIER, Alexandre wrote:
> > + saferemove_discard => {
> > + description => "Wipe with discard instead zeroing.",
> > + type => 'boolean',
>
> > > Not sure we need this (not sure when this is actually useful), but it's
> > > cheap enough to have around. Should add a `default => 0` for
> > > documentation purposes, though.
>
> Some storage allow overprovisioning. (create a lun bigger than the
> real
> storage size), it can be interesting to discard instead zeroing (to
> free space on storage side).
> as snapshots currently use lvm volume with same size than the main
> volume, it can be interesting to have overprovisioning.
But this does not guarantee that the data is actually erased/zeroed, in
which case I'd just disable the thing altogether (but then again I use
`issue_discards` in lvm.conf to cause `lvremove` to discard the whole
thing... ;-)
* Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
2025-08-15 5:10 ` DERUMIER, Alexandre via pve-devel
@ 2025-08-15 5:47 ` DERUMIER, Alexandre via pve-devel
0 siblings, 0 replies; 7+ messages in thread
From: DERUMIER, Alexandre via pve-devel @ 2025-08-15 5:47 UTC (permalink / raw)
To: pve-devel, w.bumiller; +Cc: DERUMIER, Alexandre
[-- Attachment #1: Type: message/rfc822, Size: 13428 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "w.bumiller@proxmox.com" <w.bumiller@proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
Date: Fri, 15 Aug 2025 05:47:03 +0000
Message-ID: <b2661e0b39c65aaf6e3ba5859fc932080cbb6464.camel@groupe-cyllene.com>
The doc from oVirt was a little bit old:
REQ_OP_WRITE_SAME is not used anymore in recent kernels,
REQ_OP_WRITE_ZEROES has been used since 2017, and
/sys/block/dm-25/queue/write_zeroes_max_bytes
shows whether the feature is available.
(It was enabled on my test machine, which is why blkdiscard is a lot
faster than cstream.)
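So a check could look at that attribute instead, e.g. (sketch, same simplified dm
resolution as in my previous mail):

    my $dm = (split('/', readlink("/dev/$vg/del-$name") // ''))[-1];
    my $max = PVE::Tools::file_read_firstline("/sys/block/$dm/queue/write_zeroes_max_bytes") // 0;
    print "write-zeroes offload available\n" if $max > 0;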