Date: Wed, 08 May 2024 15:37:16 +0200
From: "Max Carrara"
To: "Dominik Csapak", "Proxmox Backup Server development discussion"
Subject: Re: [pbs-devel] [PATCH proxmox-backup v2 1/3] tape: introduce a tape backup read thread tuning option
References: <20240507072955.364206-1-d.csapak@proxmox.com>
 <20240507072955.364206-2-d.csapak@proxmox.com>
 <2d13fe7c-25be-4eff-9cc6-85b65e829c36@proxmox.com>
In-Reply-To: <2d13fe7c-25be-4eff-9cc6-85b65e829c36@proxmox.com>

On Wed May 8, 2024 at 8:56 AM CEST, Dominik Csapak wrote:
> On 5/7/24 17:10, Max Carrara wrote:
> > On Tue May 7, 2024 at 9:29 AM CEST, Dominik Csapak wrote:
> >> using a single thread for reading is not optimal in some cases, e.g.
> >> when the underlying storage can handle more reads in parallel than with
> >> a single thread.
> >>
> >
> > Got only a few tiny notes that are otherwise unremarkable:
> >
> > Would prefer to reword this as:
> >
> > Using a single thread for reading is not optimal in some cases, e.g.
> > when the underlying storage can handle reads from multiple threads in
> > parallel.
> >
> >> We use the ParallelHandler to handle the actual reads. Make the
> >> sync_channel buffer size depending on the number of threads so we have
> >> space for two chunks per thread.
> >
> > Would s/depending/depend/ for the above paragraph.
> >
>
> ok to both comments, will update in a v3 (if necessary)
> or if preferred the one committing could fixup the commit message when applying
>
> >>
> >> How this impacts the backup speed largely depends on the underlying
> >> storage and how the backup is laid out on it.
> >>
> >> I benchmarked the following setups:
> >>
> >> * Setup A: relatively spread out backup on a virtualized pbs on single HDDs
> >> * Setup B: mostly sequential chunks on a virtualized pbs on single HDDs
> >> * Setup C: backup on virtualized pbs on a fast NVME
> >> * Setup D: backup on bare metal pbs with ZFS in a RAID10 with 6 HDDs
> >>   and 2 fast special devices in a mirror
> >>
> >> (values are reported in MB/s as seen in the task log, caches were
> >> cleared between runs, backups were bigger than the memory available)
> >>
> >> setup  1 thread  2 threads  4 threads  8 threads
> >> A      55        70         80         95
> >> B      110       89         100        108
> >> C      294       294        294        294
> >> D      118       180        300        300
> >>
> >> So there are cases where multiple read threads speed up the tape backup
> >> (dramatically). On the other hand there are situations where reading
> >> from a single thread is actually faster, probably because we can read
> >> from the HDD sequentially.
> >>
> >> I used a new default value of '4' here since that gave good performance
> >> on all setups (not necessarily the best) and we also have a default
> >> value of '4' for verification threads.
> >>
> >> We use a struct here for the datastore since we want to introduce other
> >> thread tuning options too.
> >>
> >> Signed-off-by: Dominik Csapak
> >
> > The patch is pretty straightforward; there's one question inline, but
> > otherwise this looks fine to me. Solid work!
> >
> > Wasn't able to test this yet as setting up a VTL on Bookworm failed for
> > me, unfortunately. Will try to test this tomorrow if possible.
>
> it's probably obvious, but just to be sure:
>
> make sure the target storage for the vtl is faster than your
> source storage, otherwise you'll always be limited by that...

Thanks for the tip! Will consider it. Am in the process of spinning up
another VM.

>
> >
> > For now:
> >
> > Reviewed-by: Max Carrara
> >
> [snip]
> >> diff --git a/src/tape/pool_writer/new_chunks_iterator.rs b/src/tape/pool_writer/new_chunks_iterator.rs
> >> index 1454b33d2..a2a8091f6 100644
> >> --- a/src/tape/pool_writer/new_chunks_iterator.rs
> >> +++ b/src/tape/pool_writer/new_chunks_iterator.rs
> >> @@ -6,8 +6,9 @@ use anyhow::{format_err, Error};
> >>  use pbs_datastore::{DataBlob, DataStore, SnapshotReader};
> >>
> >>  use crate::tape::CatalogSet;
> >> +use crate::tools::parallel_handler::ParallelHandler;
> >>
> >> -/// Chunk iterator which use a separate thread to read chunks
> >> +/// Chunk iterator which uses separate threads to read chunks
> >>  ///
> >>  /// The iterator skips duplicate chunks and chunks already in the
> >>  /// catalog.
> >> @@ -25,7 +26,11 @@ impl NewChunksIterator {
> >>          snapshot_reader: Arc<Mutex<SnapshotReader>>,
> >>          catalog_set: Arc<Mutex<CatalogSet>>,
> >>      ) -> Result<(std::thread::JoinHandle<()>, Self), Error> {
> >> -        let (tx, rx) = std::sync::mpsc::sync_channel(3);
> >> +        let read_threads = datastore
> >> +            .get_thread_configuration()
> >> +            .tape_backup_read_threads;
> >> +
> >> +        let (tx, rx) = std::sync::mpsc::sync_channel(read_threads * 2);
> >
> > Is there any reason you're using `* 2` here? For example, is the
> > throughput unaffected if you use a larger value, like `* 8`?
> >
> > If the constant has an effect like that it should IMO be documented, but
> > if not, then it can just stay like it is.
>
> i did not benchmark that much here (would have increased my benchmark time
> since that's another dimension in the testing with many possible values)
>
> but my thought process was this:
>
> if the source storage is much slower than what we can write to tape, that
> buffer will be empty most of the time
>
> if the source storage is much faster than what we can write to tape, it
> will be full most of the time
>
> in both cases the buffer size won't matter much (as long as there is a bit
> of buffer)
>
> the sweet spot is when the storage is about as fast as the tape writes,
> and in that case you want to have just a bit more buffer than threads
> reading from the storage, so there is always a read going on (so the
> buffer never empties, since that slows down the tape write)
>
> so 1 * the number of threads seemed too low, but I think anything bigger
> than 2 * is just a waste of memory
>
> does that make sense?
>
> if we really want, I can make additional benchmarks, but my guess is that
> it won't make much difference in practice

Okay, that makes a lot of sense; that has cleared everything up for me.
Thanks! I don't think this needs to be changed at all, then.

LGTM
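
For anyone following along, here is a minimal, self-contained Rust sketch of
the pattern being discussed: several reader threads feed a bounded
sync_channel sized at two slots per thread, and a single consumer stands in
for the tape writer. This is not the actual NewChunksIterator/ParallelHandler
code; the thread count, sleep durations, and chunk payloads are made up
purely to illustrate the backpressure argument above.

// Illustrative sketch only -- not the PBS implementation. Reader threads
// stand in for the workers fetching chunks from the datastore; the single
// consumer stands in for the tape writer.
use std::sync::mpsc::sync_channel;
use std::thread;
use std::time::Duration;

fn main() {
    // Hypothetical tuning value, analogous to the read-threads option.
    let read_threads: usize = 4;

    // Two buffered chunks per reader thread, mirroring `read_threads * 2`.
    let (tx, rx) = sync_channel::<Vec<u8>>(read_threads * 2);

    // Spawn the reader threads; each simulates reading chunks from storage.
    let readers: Vec<_> = (0..read_threads)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                for _ in 0..8 {
                    thread::sleep(Duration::from_millis(10)); // simulated storage read
                    let chunk = vec![id as u8; 4]; // stand-in for a real chunk
                    // `send` blocks while the bounded buffer is full -- this
                    // is the backpressure that keeps memory usage capped.
                    tx.send(chunk).expect("receiver hung up");
                }
            })
        })
        .collect();
    // Drop the original sender so the channel closes once all readers finish.
    drop(tx);

    // Single consumer in the role of the tape writer: with the buffer sized
    // at two chunks per reader, it should rarely find the channel empty.
    let mut written = 0usize;
    for chunk in rx {
        thread::sleep(Duration::from_millis(5)); // simulated tape write
        written += chunk.len();
    }

    for reader in readers {
        reader.join().expect("reader thread panicked");
    }
    println!("wrote {written} bytes");
}

Because `send` blocks once the bounded buffer is full, a slow tape write caps
how far ahead the readers can run, while two slots per reader usually leave a
chunk ready whenever the writer asks for one -- which matches the "sweet
spot" reasoning quoted above.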