* [pbs-devel] [PATCH proxmox-backup] backup/verify: improve speed by sorting chunks by inode
From: Dominik Csapak @ 2021-04-13 14:35 UTC
To: pbs-devel
before reading the chunks from disk in the order of the index file,
stat them first and sort them by inode number.
this can have a very positive impact on read speed on spinning disks,
even with the additional stat'ing of the chunks.
memory footprint should be tolerable; for 1_000_000 chunks
we need ~16MiB of memory (a Vec of 64bit position + 64bit inode)
(assuming 4MiB chunks, such an index would reference 4TiB of data)
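as a rough sanity check of that figure (illustrative only, not part of the patch):

fn main() {
    // one list entry per chunk: 64bit position + 64bit inode
    let entry_size = std::mem::size_of::<(u64, u64)>(); // 16 bytes
    let chunks: u64 = 1_000_000;
    let total = chunks * entry_size as u64;
    // prints "1000000 chunks -> ~15 MiB"
    println!("{} chunks -> ~{} MiB", chunks, total / (1024 * 1024));
    // at 4MiB per chunk such an index covers ~4TiB of data
}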
two small benchmarks (single spinner, ext4) here showed an improvement from
~430 seconds to ~330 seconds for a 32GiB fixed index
and from
~160 seconds to ~120 seconds for a 10GiB dynamic index
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
it would be great if other people could also benchmark this patch on a few
different setups (in addition to my own tests), to verify or disprove my results
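for reviewers who just want the gist before reading the diff, the idea boils down
to roughly this standalone sketch (names are illustrative, not the actual code):

use std::fs;
use std::os::unix::fs::MetadataExt; // for .ino()
use std::path::PathBuf;

fn read_sorted_by_inode(paths: &[PathBuf]) -> std::io::Result<()> {
    // stat every file first and remember (original position, inode)
    let mut order: Vec<(usize, u64)> = Vec::with_capacity(paths.len());
    for (idx, path) in paths.iter().enumerate() {
        let meta = fs::metadata(path)?;
        order.push((idx, meta.ino()));
    }

    // sort by inode so reads roughly follow the on-disk layout
    order.sort_unstable_by_key(|&(_, ino)| ino);

    // read in inode order instead of index order
    for (idx, _ino) in order {
        let _data = fs::read(&paths[idx])?;
        // ... verify/process the chunk data here ...
    }
    Ok(())
}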
src/backup/datastore.rs | 5 +++++
src/backup/verify.rs | 32 +++++++++++++++++++++++++++++---
2 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/src/backup/datastore.rs b/src/backup/datastore.rs
index 28dda7e7..8162c269 100644
--- a/src/backup/datastore.rs
+++ b/src/backup/datastore.rs
@@ -686,6 +686,11 @@ impl DataStore {
}
+ pub fn stat_chunk(&self, digest: &[u8; 32]) -> Result<std::fs::Metadata, Error> {
+ let (chunk_path, _digest_str) = self.chunk_store.chunk_path(digest);
+ std::fs::metadata(chunk_path).map_err(Error::from)
+ }
+
pub fn load_chunk(&self, digest: &[u8; 32]) -> Result<DataBlob, Error> {
let (chunk_path, digest_str) = self.chunk_store.chunk_path(digest);
diff --git a/src/backup/verify.rs b/src/backup/verify.rs
index ac4a6c29..9173bd9d 100644
--- a/src/backup/verify.rs
+++ b/src/backup/verify.rs
@@ -159,13 +159,16 @@ fn verify_index_chunks(
}
);
- for pos in 0..index.index_count() {
+ let index_count = index.index_count();
+ let mut chunk_list = Vec::with_capacity(index_count);
+ use std::os::unix::fs::MetadataExt;
+
+ for pos in 0..index_count {
verify_worker.worker.check_abort()?;
crate::tools::fail_on_shutdown()?;
let info = index.chunk_info(pos).unwrap();
- let size = info.size();
if verify_worker.verified_chunks.lock().unwrap().contains(&info.digest) {
continue; // already verified
@@ -178,15 +181,38 @@ fn verify_index_chunks(
continue;
}
+ match verify_worker.datastore.stat_chunk(&info.digest) {
+ Err(err) => {
+ verify_worker.corrupt_chunks.lock().unwrap().insert(info.digest);
+ task_log!(verify_worker.worker, "can't verify chunk, stat failed - {}", err);
+ errors.fetch_add(1, Ordering::SeqCst);
+ rename_corrupted_chunk(verify_worker.datastore.clone(), &info.digest, &verify_worker.worker);
+ }
+ Ok(metadata) => {
+ chunk_list.push((pos, metadata.ino()));
+ }
+ }
+ }
+
+ chunk_list.sort_unstable_by(|(_, ino_a), (_, ino_b)| {
+ ino_a.cmp(&ino_b)
+ });
+
+ for (pos, _) in chunk_list {
+ verify_worker.worker.check_abort()?;
+ crate::tools::fail_on_shutdown()?;
+
+ let info = index.chunk_info(pos).unwrap();
+
match verify_worker.datastore.load_chunk(&info.digest) {
Err(err) => {
verify_worker.corrupt_chunks.lock().unwrap().insert(info.digest);
task_log!(verify_worker.worker, "can't verify chunk, load failed - {}", err);
errors.fetch_add(1, Ordering::SeqCst);
rename_corrupted_chunk(verify_worker.datastore.clone(), &info.digest, &verify_worker.worker);
- continue;
}
Ok(chunk) => {
+ let size = info.size();
read_bytes += chunk.raw_size();
decoder_pool.send((chunk, info.digest, size))?;
decoded_bytes += size;
--
2.20.1
* Re: [pbs-devel] [PATCH proxmox-backup] backup/verify: improve speed by sorting chunks by inode
From: Fabian Grünbichler @ 2021-04-14 13:24 UTC
To: Proxmox Backup Server development discussion
On April 13, 2021 4:35 pm, Dominik Csapak wrote:
> before reading the chunks from disk in the order of the index file,
> stat them first and sort them by inode number.
>
> this can have a very positive impact on read speed on spinning disks,
> even with the additional stat'ing of the chunks.
>
> memory footprint should be tolerable, for 1_000_000 chunks
> we need about ~16MiB of memory (Vec of 64bit position + 64bit inode)
> (assuming 4MiB Chunks, such an index would reference 4TiB of data)
>
> two small benchmarks (single spinner, ext4) here showed an improvement from
> ~430 seconds to ~330 seconds for a 32GiB fixed index
> and from
> ~160 seconds to ~120 seconds for a 10GiB dynamic index
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> it would be great if other people could also benchmark this patch on
> different setups a little (in addition to me), to verify or disprove my results
zfs with single spinner + fast special device, with a (not counted ;))
warmup run and everything fitting into cache:
Benchmark #1: stock
Time (mean ± σ): 21.407 s ± 0.819 s [User: 20.1 ms, System: 15.2 ms]
Range (min … max): 21.070 s … 23.078 s 6 runs
Benchmark #2: patched
Time (mean ± σ): 47.119 s ± 0.018 s [User: 29.5 ms, System: 15.1 ms]
Range (min … max): 47.107 s … 47.154 s 6 runs
Summary
'stock' ran
2.20 ± 0.08 times faster than 'patched'
same setup, but ARC reduced so that verified data > ARC and we start
bottle-necking on the spinner:
Benchmark #1: stock
Time (mean ± σ): 367.821 s ± 0.801 s [User: 195.9 ms, System: 80.0 ms]
Range (min … max): 366.840 s … 368.802 s 4 runs
Benchmark #2: patched
Time (mean ± σ): 406.391 s ± 1.304 s [User: 188.3 ms, System: 100.8 ms]
Range (min … max): 404.891 s … 407.919 s 4 runs
Summary
'stock' ran
1.10 ± 0.00 times faster than 'patched'
both benchmarks for verifying a datastore with ~12G of on-disk chunk
data.
* [pbs-devel] applied: [PATCH proxmox-backup] backup/verify: improve speed by sorting chunks by inode
From: Thomas Lamprecht @ 2021-04-14 15:42 UTC
To: Proxmox Backup Server development discussion, Dominik Csapak
On 13.04.21 16:35, Dominik Csapak wrote:
> before reading the chunks from disk in the order of the index file,
> stat them first and sort them by inode number.
>
> this can have a very positive impact on read speed on spinning disks,
> even with the additional stat'ing of the chunks.
>
> memory footprint should be tolerable, for 1_000_000 chunks
> we need about ~16MiB of memory (Vec of 64bit position + 64bit inode)
> (assuming 4MiB Chunks, such an index would reference 4TiB of data)
>
> two small benchmarks (single spinner, ext4) here showed an improvement from
> ~430 seconds to ~330 seconds for a 32GiB fixed index
> and from
> ~160 seconds to ~120 seconds for a 10GiB dynamic index
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> it would be great if other people could also benchmark this patch on
> different setups a little (in addition to me), to verify or disprove my results
>
> src/backup/datastore.rs | 5 +++++
> src/backup/verify.rs | 32 +++++++++++++++++++++++++++++---
> 2 files changed, 34 insertions(+), 3 deletions(-)
>
>
applied this for now, I did so already before Fabian's feedback.
Actually I saw a slight regression here too, but not as bad as the one Fabian
reported, and also on a plain SSD-backed ext4, where I expected that the overhead
of getting the inodes outweighs the advantages for storage which is already good
at random IO.
I booted an older test server with lots of data and a more complex spinner
setup; let's see what that one reports.
Anyhow, we could, and probably should, make this a switch quite easily, either as a
datastore option, or by checking the underlying storage - the latter is easy for
single-disk storage (just check the rotational flag in /sys/block/...) but quickly
gets ugly with zfs/btrfs/... and the special devices they support.
If we add such optimizations for sync (to remote and tape) as well, those would also
fall under that option switch. Admins like to tune, and this would give them a knob
to check what's best for their setup.
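For the trivial single-disk case that check could look roughly like this (sketch
only, the device name is hard-coded for illustration; mapping a datastore path to
its block device and handling zfs/btrfs layouts is the actual work):

use std::fs;

// returns Some(true) if /sys reports the device as rotational (i.e. a spinner)
// `dev` is the bare device name, e.g. "sda"
fn is_rotational(dev: &str) -> Option<bool> {
    let path = format!("/sys/block/{}/queue/rotational", dev);
    let val = fs::read_to_string(path).ok()?;
    Some(val.trim() == "1")
}

fn main() {
    match is_rotational("sda") {
        Some(true) => println!("rotational - inode sorting likely helps"),
        Some(false) => println!("non-rotational - sorting may not pay off"),
        None => println!("could not read the rotational flag"),
    }
}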
* Re: [pbs-devel] [PATCH proxmox-backup] backup/verify: improve speed by sorting chunks by inode
From: Dietmar Maurer @ 2021-04-14 7:17 UTC
To: Proxmox Backup Server development discussion, Dominik Csapak
> two small benchmarks (single spinner, ext4) here showed an improvement from
> ~430 seconds to ~330 seconds for a 32GiB fixed index
> and from
> ~160 seconds to ~120 seconds for a 10GiB dynamic index
On my single disk XFS datastore, I get a speedup of 1.6 (from 742 down to 466 seconds).
* Re: [pbs-devel] [PATCH proxmox-backup] backup/verify: improve speed by sorting chunks by inode
From: Dietmar Maurer @ 2021-04-14 16:44 UTC
To: Proxmox Backup Server development discussion, Fabian Grünbichler
> zfs with single spinner + fast special device, with a (not counted ;))
> warmup run and everything fitting into cache:
>
> Benchmark #1: stock
> Time (mean ± σ): 21.407 s ± 0.819 s [User: 20.1 ms, System: 15.2 ms]
> Range (min … max): 21.070 s … 23.078 s 6 runs
>
> Benchmark #2: patched
> Time (mean ± σ): 47.119 s ± 0.018 s [User: 29.5 ms, System: 15.1 ms]
> Range (min … max): 47.107 s … 47.154 s 6 runs
>
> Summary
> 'stock' ran
> 2.20 ± 0.08 times faster than 'patched'
I assume you have about 3000 chunks? On XFS, I can do 3000 stat() calls per second (after the cache is cleared).
So why the hell are stat calls that slow on ZFS with a special device? Or what causes that delay (stat or sort)?
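Something like this quick sketch could help isolate the stat() cost (the path and
directory layout are assumptions; it just stats every chunk file and prints the rate):

use std::fs;
use std::time::Instant;

fn main() -> std::io::Result<()> {
    // illustrative path - point this at a datastore's .chunks directory
    let base = "/path/to/datastore/.chunks";

    let start = Instant::now();
    let mut count: u64 = 0;
    for entry in fs::read_dir(base)? {
        let subdir = entry?.path();
        if !subdir.is_dir() {
            continue;
        }
        for chunk in fs::read_dir(&subdir)? {
            // DirEntry::metadata() stats the entry without reading any chunk data
            let _meta = chunk?.metadata()?;
            count += 1;
        }
    }
    let secs = start.elapsed().as_secs_f64();
    println!("stat'ed {} chunks in {:.2}s ({:.0} stat/s)", count, secs, count as f64 / secs);
    Ok(())
}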
* Re: [pbs-devel] [PATCH proxmox-backup] backup/verify: improve speed by sorting chunks by inode
From: Fabian Grünbichler @ 2021-04-15 7:19 UTC
To: Dietmar Maurer, Proxmox Backup Server development discussion
On April 14, 2021 6:44 pm, Dietmar Maurer wrote:
>> zfs with single spinner + fast special device, with a (not counted ;))
>> warmup run and everything fitting into cache:
>>
>> Benchmark #1: stock
>> Time (mean ± σ): 21.407 s ± 0.819 s [User: 20.1 ms, System: 15.2 ms]
>> Range (min … max): 21.070 s … 23.078 s 6 runs
>>
>> Benchmark #2: patched
>> Time (mean ± σ): 47.119 s ± 0.018 s [User: 29.5 ms, System: 15.1 ms]
>> Range (min … max): 47.107 s … 47.154 s 6 runs
>>
>> Summary
>> 'stock' ran
>> 2.20 ± 0.08 times faster than 'patched'
>
> I assume you have about 3000 chunks? On XFS, I can do 3000 stat() calls per second (after cache cleared).
> So why the hell are stat calls that slow on ZFS with special device? Or what causes that delay (stat or sort)?
13432 chunks. the run you quoted is with everything in ARC, so this is
not even hitting disk (neither special nor regular vdev). I haven't
profiled what exactly is the cause of the slowdown.