From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <t.lamprecht@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 791556B51A;
 Tue, 16 Mar 2021 21:18:22 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 66905271D5;
 Tue, 16 Mar 2021 21:17:52 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [212.186.127.180])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 3A5B4271C5;
 Tue, 16 Mar 2021 21:17:51 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id F14D6427C5;
 Tue, 16 Mar 2021 21:17:50 +0100 (CET)
Message-ID: <f3df01a9-71a6-9b20-dafa-3cdda78f2e72@proxmox.com>
Date: Tue, 16 Mar 2021 21:17:49 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101
 Thunderbird/87.0
Content-Language: en-US
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Stefan Reiter <s.reiter@proxmox.com>, pbs-devel@lists.proxmox.com
References: <20210303095612.7475-1-s.reiter@proxmox.com>
 <20210303095612.7475-6-s.reiter@proxmox.com>
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
In-Reply-To: <20210303095612.7475-6-s.reiter@proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.046 Adjusted score from AWL reputation of From: address
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 RCVD_IN_DNSWL_MED        -2.3 Sender listed at https://www.dnswl.org/,
 medium trust
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [restore.rs]
Subject: Re: [pve-devel] [PATCH v2 proxmox-backup-qemu 05/11] access: use
 bigger cache and LRU chunk reader
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 16 Mar 2021 20:18:22 -0000

On 03.03.21 10:56, Stefan Reiter wrote:
> Values chosen by fair dice roll, seems to be a good sweet spot on my
> machine where any less causes performance degradation but any more
> doesn't really make it go any faster.
> 
> Keep in mind that those values are per drive in an actual restore.
> 
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
> 
> Depends on new proxmox-backup.
> 
> v2:
> * unchanged
> 
>  src/restore.rs | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/restore.rs b/src/restore.rs
> index 0790d7f..a1acce4 100644
> --- a/src/restore.rs
> +++ b/src/restore.rs
> @@ -218,15 +218,16 @@ impl RestoreTask {
>  
>          let index = client.download_fixed_index(&manifest, &archive_name).await?;
>          let archive_size = index.index_bytes();
> -        let most_used = index.find_most_used_chunks(8);
> +        let most_used = index.find_most_used_chunks(16); // 64 MB most used cache



>  
>          let file_info = manifest.lookup_file_info(&archive_name)?;
>  
> -        let chunk_reader = RemoteChunkReader::new(
> +        let chunk_reader = RemoteChunkReader::new_lru_cached(
>              Arc::clone(&client),
>              self.crypt_config.clone(),
>              file_info.chunk_crypt_mode(),
>              most_used,
> +            64, // 256 MB LRU cache

How does this work in low(er)-memory situations? Lots of people do not over-dimension
their memory that much, and the need for mass recovery tends to correlate with reduced
resource availability: a node failed, now I need to restore X backups on my
<test/old/other-already-in-use> node, so multiple restore jobs may run in parallel,
and each of them may even have multiple disks, so tens of GiB of memory just for the
caches are not that unlikely.

What is the behavior then, a hard failure if the memory is not available? Also, some
archives may be smaller than 256 MiB (EFI disk?), so there it'd be weird to allocate a
256 MiB cache plus 64 MiB of most-used chunks when that is as much as, or more than,
the whole archive actually needs..

There may be the reverse situation too: a beefy, fast node with lots of memory, where
restore is used for recovery or migration but network bandwidth/latency to the PBS is
not that good - so an even bigger cache could be wanted there.

Maybe we could query the available memory and use that as a hint. As memory usage can
be highly dynamic it will never be perfect, but it would still be better than just
ignoring it..
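Just to illustrate what I mean, a rough sketch: read MemAvailable, give the cache some
fraction of it, and clamp between the current defaults and a ceiling. Note this is only
an illustration, not a patch - the fraction, the bounds, and the helper names
(suggest_lru_cache_chunks, mem_available_mb) are all made up, only the 4 MiB fixed
chunk size is from PBS:

```rust
use std::fs;

const CHUNK_SIZE_MB: u64 = 4; // PBS fixed-index chunks are 4 MiB
const MIN_CACHE_CHUNKS: u64 = 16; //  64 MiB floor (illustrative)
const MAX_CACHE_CHUNKS: u64 = 256; //   1 GiB ceiling (illustrative)

/// Suggest an LRU cache size in chunks from available memory (in MiB),
/// budgeting at most 1/16 of it for the cache (arbitrary fraction).
fn suggest_lru_cache_chunks(available_mb: u64) -> u64 {
    let budget_chunks = (available_mb / 16) / CHUNK_SIZE_MB;
    budget_chunks.clamp(MIN_CACHE_CHUNKS, MAX_CACHE_CHUNKS)
}

/// Read MemAvailable from /proc/meminfo, in MiB (Linux only).
fn mem_available_mb() -> Option<u64> {
    let meminfo = fs::read_to_string("/proc/meminfo").ok()?;
    let line = meminfo.lines().find(|l| l.starts_with("MemAvailable:"))?;
    let kb: u64 = line.split_whitespace().nth(1)?.parse().ok()?;
    Some(kb / 1024)
}

fn main() {
    // Fall back to assuming 4 GiB if /proc/meminfo is unreadable.
    let avail = mem_available_mb().unwrap_or(4096);
    println!("suggested LRU cache chunks: {}", suggest_lru_cache_chunks(avail));
}
```

One would still have to decide whether the hint is taken per restore job or per drive,
since the values in this patch are per drive.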

>          );
>  
>          let reader = AsyncIndexReader::new(index, chunk_reader);
> 




