From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-storage] lvm: use blkdiscard instead cstream to saferemove drive
Date: Thu, 14 Aug 2025 11:37:44 +0200 [thread overview]
Message-ID: <j5teuxtburvuysuyh6k7mukh5bmfi2e3ddwzyarlehtkpjt7sr@uh4pelztnqwy> (raw)
In-Reply-To: <mailman.70.1755163524.385.pve-devel@lists.proxmox.com>
On Thu, Aug 14, 2025 at 11:24:48AM +0200, Alexandre Derumier via pve-devel wrote:
> From: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> To: pve-devel@lists.proxmox.com
> Subject: [PATCH pve-storage] lvm: use blkdiscard instead cstream to
> saferemove drive
> Date: Thu, 14 Aug 2025 11:24:48 +0200
> Message-ID: <20250814092448.919114-1-alexandre.derumier@groupe-cyllene.com>
> X-Mailer: git-send-email 2.47.2
>
> Current cstream implementation is pretty slow as hell, even without throttling.
>
> use blkdiscard --zeroout instead, which is lot of magnetude faster
Makes sense, given that it probably uses dedicated zero-out `ioctls()`.
>
> Another benefit is that blkdiscard is skipping already zeroed block, so for empty
> temp images like snapshot, is pretty fast.
>
> blkdiscard don't have throttling like cstream, but we can tune the step size
> of zeroes pushed to the storage.
> I'm using 32MB stepsize by default , like ovirt, where it seem to be the best
> balance between speed and load.
> https://github.com/oVirt/vdsm/commit/79f1d79058aad863ca4b6672d4a5ce2be8e48986
>
> but it can be reduce with "saferemove_stepsize" option.
>
> Also adding an option "saferemove_discard", to use discard instead zeroing.
>
> test with a 100G volume (empty):
>
> time /usr/bin/cstream -i /dev/zero -o /dev/test/vm-100-disk-0.qcow2 -T 10 -v 1 -b 1048576
>
> 13561233408 B 12.6 GB 10.00 s 1356062979 B/s 1.26 GB/s
> 26021462016 B 24.2 GB 20.00 s 1301029969 B/s 1.21 GB/s
> 38585499648 B 35.9 GB 30.00 s 1286135343 B/s 1.20 GB/s
> 50998542336 B 47.5 GB 40.00 s 1274925312 B/s 1.19 GB/s
> 63702765568 B 59.3 GB 50.00 s 1274009877 B/s 1.19 GB/s
> 76721885184 B 71.5 GB 60.00 s 1278640698 B/s 1.19 GB/s
> 89126539264 B 83.0 GB 70.00 s 1273178488 B/s 1.19 GB/s
> 101666459648 B 94.7 GB 80.00 s 1270779024 B/s 1.18 GB/s
> 107390959616 B 100.0 GB 84.39 s 1272531142 B/s 1.19 GB/s
> write: No space left on device
>
> real 1m24.394s
> user 0m0.171s
> sys 1m24.052s
>
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
> /dev/test/vm-100-disk-0.qcow2: Zero-filled 107390959616 bytes from the offset 0
>
> real 0m3.641s
> user 0m0.001s
> sys 0m3.433s
>
> test with a 100G volume with random data:
>
> time blkdiscard --zeroout /dev/test/vm-100-disk-0.qcow2 -v
>
> /dev/test/vm-112-disk-1: Zero-filled 4764729344 bytes from the offset 0
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 4764729344
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 9428795392
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 14260633600
> /dev/test/vm-112-disk-1: Zero-filled 4831838208 bytes from the offset 19092471808
> /dev/test/vm-112-disk-1: Zero-filled 4865392640 bytes from the offset 23924310016
> /dev/test/vm-112-disk-1: Zero-filled 4596957184 bytes from the offset 28789702656
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 33386659840
> /dev/test/vm-112-disk-1: Zero-filled 4294967296 bytes from the offset 38117834752
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 42412802048
> /dev/test/vm-112-disk-1: Zero-filled 4697620480 bytes from the offset 47076868096
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 51774488576
> /dev/test/vm-112-disk-1: Zero-filled 4261412864 bytes from the offset 56438554624
> /dev/test/vm-112-disk-1: Zero-filled 4362076160 bytes from the offset 60699967488
> /dev/test/vm-112-disk-1: Zero-filled 4127195136 bytes from the offset 65062043648
> /dev/test/vm-112-disk-1: Zero-filled 4328521728 bytes from the offset 69189238784
> /dev/test/vm-112-disk-1: Zero-filled 4731174912 bytes from the offset 73517760512
> /dev/test/vm-112-disk-1: Zero-filled 4026531840 bytes from the offset 78248935424
> /dev/test/vm-112-disk-1: Zero-filled 4194304000 bytes from the offset 82275467264
> /dev/test/vm-112-disk-1: Zero-filled 4664066048 bytes from the offset 86469771264
> /dev/test/vm-112-disk-1: Zero-filled 4395630592 bytes from the offset 91133837312
> /dev/test/vm-112-disk-1: Zero-filled 3623878656 bytes from the offset 95529467904
> /dev/test/vm-112-disk-1: Zero-filled 4462739456 bytes from the offset 99153346560
> /dev/test/vm-112-disk-1: Zero-filled 3758096384 bytes from the offset 103616086016
>
> real 0m23.969s
> user 0m0.030s
> sys 0m0.144s
>
> Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> ---
> src/PVE/Storage/LVMPlugin.pm | 43 ++++++++++++------------------------
> 1 file changed, 14 insertions(+), 29 deletions(-)
>
> diff --git a/src/PVE/Storage/LVMPlugin.pm b/src/PVE/Storage/LVMPlugin.pm
> index 0416c9e..5ea9d8b 100644
> --- a/src/PVE/Storage/LVMPlugin.pm
> +++ b/src/PVE/Storage/LVMPlugin.pm
> @@ -287,35 +287,15 @@ my sub free_lvm_volumes {
> # we need to zero out LVM data for security reasons
> # and to allow thin provisioning
> my $zero_out_worker = sub {
> - # wipe throughput up to 10MB/s by default; may be overwritten with saferemove_throughput
> - my $throughput = '-10485760';
> - if ($scfg->{saferemove_throughput}) {
> - $throughput = $scfg->{saferemove_throughput};
> - }
> for my $name (@$volnames) {
> print "zero-out data on image $name (/dev/$vg/del-$name)\n";
> -
> + my $stepsize = $scfg->{saferemove_stepsize} // 32;
> my $cmd = [
> - '/usr/bin/cstream',
> - '-i',
> - '/dev/zero',
> - '-o',
> - "/dev/$vg/del-$name",
> - '-T',
> - '10',
> - '-v',
> - '1',
> - '-b',
> - '1048576',
> - '-t',
> - "$throughput",
> + '/usr/sbin/blkdiscard', "/dev/$vg/del-$name", '-v', '--step', "${stepsize}M",
> ];
> - eval {
> - run_command(
> - $cmd,
> - errmsg => "zero out finished (note: 'No space left on device' is ok here)",
> - );
> - };
> + push @$cmd, '--zeroout' if !$scfg->{saferemove_discard};
> +
> + eval { run_command($cmd); };
> warn $@ if $@;
>
> $class->cluster_lock_storage(
> @@ -376,9 +356,13 @@ sub properties {
> description => "Zero-out data when removing LVs.",
> type => 'boolean',
> },
> - saferemove_throughput => {
> - description => "Wipe throughput (cstream -t parameter value).",
> - type => 'string',
We do still need to keep the old option around to not break existing
configs. We should document its deprecation and add a warning if it is
set.
> + saferemove_stepsize => {
> + description => "Wipe step size (default 32MB).",
> + enum => [qw(1 2 4 8 16 32)],
> + },
> + saferemove_discard => {
> + description => "Wipe with discard instead zeroing.",
> + type => 'boolean',
Not sure we need this (not sure when this is actually useful), but it's
cheap enough to have around. Should add a `default => 0` for
documentation purposes, though.
> },
> tagged_only => {
> description => "Only use logical volumes tagged with 'pve-vm-ID'.",
> @@ -394,7 +378,8 @@ sub options {
> shared => { optional => 1 },
> disable => { optional => 1 },
> saferemove => { optional => 1 },
> - saferemove_throughput => { optional => 1 },
> + saferemove_discard => { optional => 1 },
> + saferemove_stepsize => { optional => 1 },
> content => { optional => 1 },
> base => { fixed => 1, optional => 1 },
> tagged_only => { optional => 1 },
> --
> 2.47.2
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2025-08-14 9:36 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-14 9:24 Alexandre Derumier via pve-devel
2025-08-14 9:37 ` Wolfgang Bumiller [this message]
2025-08-14 10:07 ` DERUMIER, Alexandre via pve-devel
[not found] ` <9d5353263ee0aa3ee67d8d331f674f8a00044b1e.camel@groupe-cyllene.com>
2025-08-14 12:05 ` Wolfgang Bumiller
2025-08-14 13:11 ` DERUMIER, Alexandre via pve-devel
2025-08-15 5:10 ` DERUMIER, Alexandre via pve-devel
2025-08-15 5:47 ` DERUMIER, Alexandre via pve-devel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=j5teuxtburvuysuyh6k7mukh5bmfi2e3ddwzyarlehtkpjt7sr@uh4pelztnqwy \
--to=w.bumiller@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.