From: Stefan Reiter <s.reiter@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
pbs-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH v2 proxmox-backup-qemu 05/11] access: use bigger cache and LRU chunk reader
Date: Wed, 17 Mar 2021 14:37:44 +0100 [thread overview]
Message-ID: <570fbf9f-988c-c3a7-1475-ff0406ca590e@proxmox.com> (raw)
In-Reply-To: <f3df01a9-71a6-9b20-dafa-3cdda78f2e72@proxmox.com>
On 16/03/2021 21:17, Thomas Lamprecht wrote:
> On 03.03.21 10:56, Stefan Reiter wrote:
>> Values chosen by fair dice roll, seems to be a good sweet spot on my
>> machine where any less causes performance degradation but any more
>> doesn't really make it go any faster.
>>
>> Keep in mind that those values are per drive in an actual restore.
>>
>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>> ---
>>
>> Depends on new proxmox-backup.
>>
>> v2:
>> * unchanged
>>
>> src/restore.rs | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/restore.rs b/src/restore.rs
>> index 0790d7f..a1acce4 100644
>> --- a/src/restore.rs
>> +++ b/src/restore.rs
>> @@ -218,15 +218,16 @@ impl RestoreTask {
>>
>> let index = client.download_fixed_index(&manifest, &archive_name).await?;
>> let archive_size = index.index_bytes();
>> - let most_used = index.find_most_used_chunks(8);
>> + let most_used = index.find_most_used_chunks(16); // 64 MB most used cache
>
>
>
>>
>> let file_info = manifest.lookup_file_info(&archive_name)?;
>>
>> - let chunk_reader = RemoteChunkReader::new(
>> + let chunk_reader = RemoteChunkReader::new_lru_cached(
>> Arc::clone(&client),
>> self.crypt_config.clone(),
>> file_info.chunk_crypt_mode(),
>> most_used,
>> + 64, // 256 MB LRU cache
>
> how does this work with low(er) memory situations? Lots of people do not over
> dimension their memory that much, and especially the need for mass-recovery could
> seem to correlate with reduced resource availability (a node failed, now I need
> to restore X backups on my <test/old/other-already-in-use> node, so multiple
> restore jobs may run in parallel, and they all may have even multiple disks,
> so tens of GiB of memory just for the cache are not that unlikely.
This is a separate function from the regular restore, so it currently
only affects live-restore. That is not an operation you would usually do
under memory constraints anyway, and the regular restore is unaffected
if you just want the data.
It will apply to the upcoming single-file restore too though, I
suppose, where the concern might make more sense...
>
> How is the behavior, hard failure if memory is not available? Also, some archives
> may be smaller than 256 MiB (EFI disk??) so there it'd be weird to have 256 cache
> and get 64 of most used chunks if that's all/more than it would actually need to
> be..
Yes, if memory is unavailable it is a hard error. The memory should not
be pre-allocated however, so restoring this way will only ever use at
most as much memory as the disk itself occupies (not accounting for
overhead).
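To put rough numbers on it (assuming the default 4 MiB fixed-index chunk
size, which is what the "64 MB"/"256 MB" comments in the patch refer
to): 16 most-used chunks plus 64 LRU entries cap out at about 320 MiB
per drive, and only if the restore actually touches that many distinct
chunks. The caching works roughly like the sketch below - this is not
the actual RemoteChunkReader code, just a minimal capacity-bounded map
to illustrate that entries are only allocated once a chunk is fetched:

use std::collections::{HashMap, VecDeque};

// Minimal illustration of a capacity-bounded chunk cache: memory is only
// allocated when a chunk is actually downloaded and inserted, never up front.
struct ChunkCache {
    capacity: usize,
    map: HashMap<[u8; 32], Vec<u8>>, // chunk digest -> decoded chunk data
    order: VecDeque<[u8; 32]>,       // least recently used digest at the front
}

impl ChunkCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, digest: &[u8; 32]) -> Option<&Vec<u8>> {
        if self.map.contains_key(digest) {
            // refresh the LRU position on every hit
            self.order.retain(|d| d != digest);
            self.order.push_back(*digest);
        }
        self.map.get(digest)
    }

    fn insert(&mut self, digest: [u8; 32], data: Vec<u8>) {
        if self.map.len() >= self.capacity {
            // evict the least recently used entry to stay within the cap
            if let Some(old) = self.order.pop_front() {
                self.map.remove(&old);
            }
        }
        self.order.push_back(digest);
        self.map.insert(digest, data);
    }
}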
>
> There may be the reversed situation too, beefy fast node with lots of memory
> and restore is used as recovery or migration but network bw/latency to PBS is not
> that good - so bigger cache could be wanted.
The reason I chose the numbers I did was that I couldn't see any real
performance benefit from going higher, though I didn't specifically test
with slow networking.
I don't believe more cache would improve the situation there either;
this is mostly meant to keep random access from the guest and the linear
access from the block-stream operation from interfering with each other,
and to allow multiple smaller guest reads within the same chunk to be
served quickly.
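For illustration, the read path this is meant to help looks roughly
like the sketch below (simplified; the real code goes through
AsyncIndexReader and RemoteChunkReader). It reuses the ChunkCache
sketch from above, and the index/fetch closures stand in for the fixed
index lookup and the download from the PBS server:

const CHUNK_SIZE: u64 = 4 * 1024 * 1024; // assumed default fixed chunk size

// Both the guest's small random reads and the block-stream job's linear
// reads go through the same cache, so a chunk downloaded for one side can
// serve the other without a second round trip.
fn read_at<I, F>(cache: &mut ChunkCache, index: I, fetch: F, offset: u64, buf: &mut [u8])
where
    I: Fn(u64) -> [u8; 32],      // chunk number -> digest (fixed index lookup)
    F: Fn(&[u8; 32]) -> Vec<u8>, // digest -> decoded chunk (remote fetch)
{
    let digest = index(offset / CHUNK_SIZE);     // chunk covering this offset
    let inside = (offset % CHUNK_SIZE) as usize; // offset within that chunk
    if cache.get(&digest).is_none() {
        // only a miss causes a download; repeated small reads within the
        // same chunk are then served from memory
        cache.insert(digest, fetch(&digest));
    }
    let data = cache.get(&digest).unwrap();
    // assumes the read does not cross a chunk boundary, for brevity
    buf.copy_from_slice(&data[inside..inside + buf.len()]);
}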
>
> Maybe we could get the available memory and use that as hint, I mean as memory
> usage can be highly dynamic it will never be perfect, but better than just ignoring
> it..
If anything, I'd make it user-configurable - I don't think a heuristic
would be a good choice here.
That way we could also set it smaller for single-file restore, for
example - on the other hand, it adds another parameter to the already
somewhat cluttered QEMU<->Rust interface.
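Just to sketch what I mean (the names and defaults below are made up
for illustration and are not part of the existing interface): a single
small settings struct at setup time would keep it to one addition
rather than several loose parameters.

// Hypothetical settings struct; field names and defaults are illustrative
// only and do not exist in the current proxmox-backup-qemu interface.
struct RestoreSettings {
    most_used_chunks: usize,  // "hot" chunks pre-fetched up front
    lru_cache_entries: usize, // per-drive LRU cache capacity
}

impl Default for RestoreSettings {
    fn default() -> Self {
        // matches the values from this patch: 16 * 4 MiB and 64 * 4 MiB
        Self { most_used_chunks: 16, lru_cache_entries: 64 }
    }
}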
>
>> );
>>
>> let reader = AsyncIndexReader::new(index, chunk_reader);
>>
>