From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Stefan Reiter <s.reiter@proxmox.com>,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	pbs-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH v2 proxmox-backup-qemu 05/11] access: use bigger cache and LRU chunk reader
Date: Wed, 17 Mar 2021 14:59:05 +0100	[thread overview]
Message-ID: <9177e016-7e42-cee7-f948-887af087311c@proxmox.com> (raw)
In-Reply-To: <570fbf9f-988c-c3a7-1475-ff0406ca590e@proxmox.com>

On 17.03.21 14:37, Stefan Reiter wrote:
> On 16/03/2021 21:17, Thomas Lamprecht wrote:
>> On 03.03.21 10:56, Stefan Reiter wrote:
>>> Values chosen by fair dice roll, seems to be a good sweet spot on my
>>> machine where any less causes performance degradation but any more
>>> doesn't really make it go any faster.
>>>
>>> Keep in mind that those values are per drive in an actual restore.
>>>
>>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>>> ---
>>>
>>> Depends on new proxmox-backup.
>>>
>>> v2:
>>> * unchanged
>>>
>>>   src/restore.rs | 5 +++--
>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/restore.rs b/src/restore.rs
>>> index 0790d7f..a1acce4 100644
>>> --- a/src/restore.rs
>>> +++ b/src/restore.rs
>>> @@ -218,15 +218,16 @@ impl RestoreTask {
>>>             let index = client.download_fixed_index(&manifest, &archive_name).await?;
>>>           let archive_size = index.index_bytes();
>>> -        let most_used = index.find_most_used_chunks(8);
>>> +        let most_used = index.find_most_used_chunks(16); // 64 MB most used cache
>>
>>
>>
>>>             let file_info = manifest.lookup_file_info(&archive_name)?;
>>> -        let chunk_reader = RemoteChunkReader::new(
>>> +        let chunk_reader = RemoteChunkReader::new_lru_cached(
>>>               Arc::clone(&client),
>>>               self.crypt_config.clone(),
>>>               file_info.chunk_crypt_mode(),
>>>               most_used,
>>> +            64, // 256 MB LRU cache
>>
>> how does this work with low(er) memory situations? Lots of people do not over
>> dimension their memory that much, and especially the need for mass-recovery could
>> seem to correlate with reduced resource availability (a node failed, now I need
>> to restore X backups on my <test/old/other-already-in-use> node, so multiple
>> restore jobs may run in parallel, and they all may have even multiple disks,
>> so tens of GiB of memory just for the cache are not that unlikely.
> 
> This is a separate function from the regular restore, so it currently only affects live-restore. This is not an operation you would usually do under memory constraints anyway, and regular restore is unaffected if you just want the data.

And how exactly do you figure that users won't use it if it's easily
available? Users *will* use this in a memory-constrained environment, as it
gets their guests up again faster; cue mass restore on a node without many
resources left.
 
> Upcoming single-file restore too though, I suppose, where it might make more sense...
> 
>>
>> How is the behavior, hard failure if memory is not available? Also, some archives
>> may be smaller than 256 MiB (EFI disk??) so there it'd be weird to have 256 cache
>> and get 64 of most used chunks if that's all/more than it would actually need to
>> be..
> 
> Yes, if memory is unavailable it is a hard error. Memory should not be pre-allocated however, so restoring this way will only ever use as much memory as the disk size (not accounting for overhead).

So basically RSS is increased in chunk-sized blocks. But an alloc error does
not have to be a hard error for the total operation here; couldn't we catch
it and continue with the LRU size we actually managed to allocate?
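
A rough sketch of what I mean; the helper and its types are made up here for
illustration, only Vec::try_reserve (fallible allocation) is real std API:

    // Sketch: treat allocation failure as "shrink the cache", not as a
    // fatal error for the whole restore.
    fn cache_chunk(cache: &mut Vec<Vec<u8>>, max_entries: &mut usize, data: &[u8]) {
        let mut buf = Vec::new();
        if buf.try_reserve(data.len()).is_err() {
            // Could not allocate: evict half the cache and lower the cap,
            // then keep going with the size we actually got.
            cache.truncate(cache.len() / 2);
            *max_entries = cache.len().max(1);
            return;
        }
        buf.extend_from_slice(data);
        if cache.len() >= *max_entries {
            cache.remove(0); // evict oldest; real code would follow LRU order
        }
        cache.push(buf);
    }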

> 
>>
>> There may be the reversed situation too, beefy fast node with lots of memory
>> and restore is used as recovery or migration but network bw/latency to
>> PBS is not
>> that good - so bigger cache could be wanted.
> 
> The reason I chose the numbers I did was that I couldn't see any real performance benefits by going higher, though I didn't specifically test with slow networking.
> 
> I don't believe more cache would improve the situation there though; this is mostly to avoid the random access from the guest and the linear access from the block-stream operation interfering with each other, and to allow multiple smaller guest reads within the same chunk to be served quickly.

What are the workloads you tested to be so sure about this?

From the above statement I'd think it would help for any workload with a
working set bigger than 256 MiB? So basically any production DB load (albeit
that should be handled by the DB's memory caching, so maybe not the best
example).

I'm just thinking that exposing this as a knob could help; it doesn't need
to be prominently placed, but it would be nice to have.
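
Even just an environment variable would be enough for a start; name and the
current hard-coded default here are only illustrative, not an existing knob:

    // Hypothetical override, falling back to the hard-coded 64 entries:
    let lru_entries: usize = std::env::var("PBS_RESTORE_LRU_CHUNKS")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(64);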

> 
>>
>> Maybe we could get the available memory and use that as hint, I mean as memory
>> usage can be highly dynamic it will never be perfect, but better than just ignoring
>> it..
> 
> If anything, I'd make it user-configurable - I don't think a heuristic would be a good choice here.

Yeah, a heuristic is not a good option, as we cannot know what the system
memory situation will be in the future.

> 
> This way we could also set it smaller for single-file restore for example - on the other hand, that adds another parameter to the already somewhat cluttered QEMU<->Rust interface.

cue versioned structs incoming ;)
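
I.e., something along these lines, so new tunables don't need a new entry
point every time (field names invented for illustration, not the actual
interface):

    // Sketch of a versioned parameter struct for the C interface; the
    // caller sets `version`, the Rust side only reads the fields it knows
    // exist for that version. Fields are append-only.
    #[repr(C)]
    pub struct ProxmoxRestoreParams {
        pub version: u32,           // bumped whenever fields are appended
        pub lru_cache_entries: u32, // read if version >= 1
    }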

> 
>>
>>>           );
>>>             let reader = AsyncIndexReader::new(index, chunk_reader);
>>>
>>
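
For reference, the sizes in the comments above only add up if one assumes
the 4 MiB fixed chunk size that PBS uses for VM images:

    const CHUNK_SIZE: usize = 4 * 1024 * 1024; // PBS fixed chunk size for images
    const MOST_USED: usize = 16;  // 16 * 4 MiB =  64 MiB pinned most-used cache
    const LRU_ENTRIES: usize = 64; // 64 * 4 MiB = 256 MiB LRU cache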





Thread overview: 25+ messages
2021-03-03  9:56 [pve-devel] [PATCH v2 00/11] live-restore for PBS snapshots Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 pve-qemu 01/11] clean up pve/ patches by merging Stefan Reiter
2021-03-03 16:32   ` [pve-devel] applied: " Thomas Lamprecht
2021-03-03  9:56 ` [pve-devel] [PATCH v2 pve-qemu 02/11] move bitmap-mirror patches to seperate folder Stefan Reiter
2021-03-03 16:32   ` [pve-devel] applied: " Thomas Lamprecht
2021-03-03  9:56 ` [pve-devel] [PATCH v2 pve-qemu 03/11] add alloc-track block driver patch Stefan Reiter
2021-03-15 14:14   ` Wolfgang Bumiller
2021-03-15 15:41     ` [pve-devel] [PATCH pve-qemu v3] " Stefan Reiter
2021-03-16 19:57       ` [pve-devel] applied: " Thomas Lamprecht
2021-03-03  9:56 ` [pve-devel] [PATCH v2 proxmox-backup 04/11] RemoteChunkReader: add LRU cached variant Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 proxmox-backup-qemu 05/11] access: use bigger cache and LRU chunk reader Stefan Reiter
2021-03-16 20:17   ` Thomas Lamprecht
2021-03-17 13:37     ` Stefan Reiter
2021-03-17 13:59       ` Thomas Lamprecht [this message]
2021-03-17 16:03         ` [pve-devel] [pbs-devel] " Dietmar Maurer
2021-03-03  9:56 ` [pve-devel] [PATCH v2 qemu-server 06/11] make qemu_drive_mirror_monitor more generic Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 qemu-server 07/11] cfg2cmd: allow PBS snapshots as backing files for drives Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 qemu-server 08/11] enable live-restore for PBS Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 qemu-server 09/11] extract register_qmeventd_handle to QemuServer.pm Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 qemu-server 10/11] live-restore: register qmeventd handle Stefan Reiter
2021-03-03  9:56 ` [pve-devel] [PATCH v2 manager 11/11] ui: restore: add live-restore checkbox Stefan Reiter
2021-04-15 18:34   ` [pve-devel] applied: " Thomas Lamprecht
2021-03-22 11:08 ` [pve-devel] [PATCH v2 00/11] live-restore for PBS snapshots Dominic Jäger
2021-04-06 19:09 ` [pve-devel] partially-applied: " Thomas Lamprecht
2021-04-15 18:35 ` [pve-devel] " Thomas Lamprecht
