Subject: Re: [pbs-devel] [pve-devel] [PATCH v2 proxmox-backup-qemu 05/11] access: use bigger cache and LRU chunk reader
From: Stefan Reiter
To: Thomas Lamprecht, Proxmox VE development discussion, pbs-devel@lists.proxmox.com
Date: Wed, 17 Mar 2021 14:37:44 +0100
Message-ID: <570fbf9f-988c-c3a7-1475-ff0406ca590e@proxmox.com>
References: <20210303095612.7475-1-s.reiter@proxmox.com> <20210303095612.7475-6-s.reiter@proxmox.com>

On 16/03/2021 21:17, Thomas Lamprecht wrote:
> On 03.03.21 10:56, Stefan Reiter wrote:
>> Values chosen by fair dice roll; this seems to be a good sweet spot on
>> my machine, where any less causes performance degradation, but any
>> more doesn't really make it go any faster.
>>
>> Keep in mind that those values are per drive in an actual restore.
>>
>> Signed-off-by: Stefan Reiter
>> ---
>>
>> Depends on new proxmox-backup.
>>
>> v2:
>> * unchanged
>>
>>  src/restore.rs | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/restore.rs b/src/restore.rs
>> index 0790d7f..a1acce4 100644
>> --- a/src/restore.rs
>> +++ b/src/restore.rs
>> @@ -218,15 +218,16 @@ impl RestoreTask {
>>
>>          let index = client.download_fixed_index(&manifest, &archive_name).await?;
>>          let archive_size = index.index_bytes();
>> -        let most_used = index.find_most_used_chunks(8);
>> +        let most_used = index.find_most_used_chunks(16); // 64 MB most used cache
>>
>>          let file_info = manifest.lookup_file_info(&archive_name)?;
>>
>> -        let chunk_reader = RemoteChunkReader::new(
>> +        let chunk_reader = RemoteChunkReader::new_lru_cached(
>>              Arc::clone(&client),
>>              self.crypt_config.clone(),
>>              file_info.chunk_crypt_mode(),
>>              most_used,
>> +            64, // 256 MB LRU cache
>
> How does this work with low(er) memory situations? Lots of people do
> not over-dimension their memory that much, and especially the need for
> mass-recovery could well correlate with reduced resource availability
> (a node failed, now I need to restore X backups on my node), so
> multiple restore jobs may run in parallel, and each of them may have
> multiple disks - tens of GiB of memory just for the cache are not that
> unlikely.

This is a separate function from the regular restore, so it currently
only affects live-restore. That is not an operation you would usually do
under memory constraints anyway, and the regular restore path is
unaffected if you just want the data. It will also apply to the upcoming
single-file restore though, I suppose, where a limit might make more
sense...

> What is the behavior - a hard failure if memory is not available? Also,
> some archives may be smaller than 256 MiB (EFI disk??), so there it
> would be weird to have a 256 MiB cache and fetch 64 most-used chunks if
> that is all, or more than, the archive would actually need..

Yes, if memory is unavailable it is a hard error. Memory should not be
pre-allocated, however, so restoring this way will only ever use as much
memory as the disk size requires (not accounting for overhead) - see the
first sketch at the end of this mail for the behavior I mean.

> There may be the reversed situation too: a beefy, fast node with lots
> of memory, where restore is used for recovery or migration, but network
> bandwidth/latency to the PBS is not that good - so a bigger cache could
> be wanted.

The reason I chose these numbers is that I couldn't see any real
performance benefit from going higher, though I didn't specifically test
with slow networking. I don't believe more cache would improve the
situation there either: the cache is mostly there to keep random access
from the guest and the linear access from the block-stream operation
from interfering with each other, and to allow multiple smaller guest
reads within the same chunk to be served quickly.

> Maybe we could get the available memory and use that as a hint. I mean,
> as memory usage can be highly dynamic it will never be perfect, but
> better than just ignoring it..

If anything, I'd make it user-configurable - I don't think a heuristic
would be a good choice here. That way we could also set it smaller for
single-file restore, for example (see the second sketch at the end of
this mail). On the other hand, that adds another parameter to the
already somewhat cluttered QEMU<->Rust interface.

>
>>          );
>>
>>          let reader = AsyncIndexReader::new(index, chunk_reader);
>>
>
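
For reference, the numbers above come from the 4 MiB fixed chunk size
used for VM image archives: 16 most-used chunks make 64 MiB, and 64 LRU
entries make 256 MiB, per drive. Below is a minimal sketch of the
bounded, lazily filled behavior I am relying on - this is not the actual
RemoteChunkReader code, just an illustration (a real implementation
would also avoid the O(n) recency updates):

use std::collections::{HashMap, VecDeque};

/// PBS chunk digests are 32-byte SHA-256 hashes.
type ChunkDigest = [u8; 32];

/// Bounded chunk cache: with 4 MiB fixed chunks, capacity 64 caps the
/// memory use at 64 * 4 MiB = 256 MiB per drive. Chunks are inserted
/// only after they have been downloaded, so nothing is allocated up
/// front.
struct LruChunkCache {
    capacity: usize,
    chunks: HashMap<ChunkDigest, Vec<u8>>,
    /// Front = least recently used, back = most recently used.
    order: VecDeque<ChunkDigest>,
}

impl LruChunkCache {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            chunks: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// A hit marks the chunk as most recently used.
    fn get(&mut self, digest: &ChunkDigest) -> Option<&[u8]> {
        if self.chunks.contains_key(digest) {
            self.order.retain(|d| d != digest);
            self.order.push_back(*digest);
            self.chunks.get(digest).map(Vec::as_slice)
        } else {
            None
        }
    }

    /// Inserting past capacity evicts the least recently used chunk,
    /// so memory use never exceeds capacity * chunk size.
    fn insert(&mut self, digest: ChunkDigest, data: Vec<u8>) {
        if !self.chunks.contains_key(&digest) && self.chunks.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.chunks.remove(&oldest);
            }
        }
        self.order.retain(|d| *d != digest);
        self.order.push_back(digest);
        self.chunks.insert(digest, data);
    }
}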
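
And to make the user-configurable idea a bit more concrete: a helper
along these lines could clamp the capacity to what the archive actually
contains, which would also cover the EFI disk case. All names here (the
user_setting parameter, effective_cache_size, DEFAULT_CACHE_SIZE) are
hypothetical - nothing like this exists in the interface yet:

/// 4 MiB fixed chunk size used for VM image archives.
const CHUNK_SIZE: u64 = 4 * 1024 * 1024;

/// The currently hard-coded value: 64 chunks = 256 MiB.
const DEFAULT_CACHE_SIZE: usize = 64;

/// Pick the LRU capacity (in chunks) for one drive: take the user's
/// setting if given, otherwise the default, but never more entries than
/// the archive has chunks. A small EFI disk then gets a cache of a few
/// entries instead of room for 256 MiB.
fn effective_cache_size(user_setting: Option<usize>, archive_size: u64) -> usize {
    let chunks_in_archive = ((archive_size + CHUNK_SIZE - 1) / CHUNK_SIZE) as usize;
    user_setting
        .unwrap_or(DEFAULT_CACHE_SIZE)
        .min(chunks_in_archive.max(1))
}

A single-file restore caller could then simply pass a smaller
user_setting (or a smaller default) without adding much more surface to
the QEMU<->Rust interface.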