From: Dominik Csapak <d.csapak@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
 Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
Date: Fri, 21 Mar 2025 09:31:40 +0100
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 1/1] tape: introduce a tape backup job worker thread option
Message-ID: <b4cdb6f9-7276-45e3-b386-758164831eb0@proxmox.com>
In-Reply-To: <e5a2dcac-630e-4797-bbbf-f38bc260c2ca@proxmox.com>
References: <20250221150631.3791658-1-d.csapak@proxmox.com>
 <20250221150631.3791658-3-d.csapak@proxmox.com>
 <e5a2dcac-630e-4797-bbbf-f38bc260c2ca@proxmox.com>

On 3/20/25 17:30, Thomas Lamprecht wrote:
> On 21.02.25 at 16:06, Dominik Csapak wrote:
>> Using a single thread for reading is not optimal in some cases, e.g.
>> when the underlying storage can handle reads from multiple threads in
>> parallel.
>>
>> We use the ParallelHandler to handle the actual reads. Make the
>> sync_channel buffer size depend on the number of threads so we have
>> space for two chunks per thread. (But keep the minimum at 3, like
>> before.)
>>
>> How this impacts the backup speed largely depends on the underlying
>> storage and how the backup is laid out on it.
>
> And the amount of tasks going on at the same time, which can wildly
> influence the result and is not fully under the admin's control, as the
> duration of tasks is not exactly fixed.
>
> FWIW, I think this is an OK stop-gap as it's really simple, and if the
> admin is somewhat careful with configuring this and the task schedules
> it might help them quite a bit already, as your benchmark shows. And
> again, it _really_ is simple, and the whole tape subsystem is a bit more
> specific and contained.
>
> That said, in the long term this would probably be better replaced with
> a global scheduling approach that respects this and other tasks' workloads
> and resource usage. That is certainly not an easy thing to do, as there
> are many aspects one needs to think through, and schedulers are not really
> a done thing in academic research either, especially not general ones.
> I'm mostly mentioning this to avoid proliferation of such a mechanism to
> other tasks, as that would result in a configuration hell for admins where
> they can hardly tune for their workloads sanely anymore.
>

I fully agree with you here. I sent it this way again because we have users
running into this now, and they don't want to / cannot wait for the proper
fix with a scheduler.

> Code looks OK to me, one tiny nit about a comment inline though.
>
>> diff --git a/src/tape/pool_writer/new_chunks_iterator.rs b/src/tape/pool_writer/new_chunks_iterator.rs
>> index 1454b33d2..de847b3c9 100644
>> --- a/src/tape/pool_writer/new_chunks_iterator.rs
>> +++ b/src/tape/pool_writer/new_chunks_iterator.rs
>> @@ -6,8 +6,9 @@ use anyhow::{format_err, Error};
>>  use pbs_datastore::{DataBlob, DataStore, SnapshotReader};
>>
>>  use crate::tape::CatalogSet;
>> +use crate::tools::parallel_handler::ParallelHandler;
>>
>> -/// Chunk iterator which use a separate thread to read chunks
>> +/// Chunk iterator which uses separate threads to read chunks
>>  ///
>>  /// The iterator skips duplicate chunks and chunks already in the
>>  /// catalog.
>> @@ -24,8 +25,11 @@ impl NewChunksIterator {
>>          datastore: Arc<DataStore>,
>>          snapshot_reader: Arc<Mutex<SnapshotReader>>,
>>          catalog_set: Arc<Mutex<CatalogSet>>,
>> +        read_threads: usize,
>>      ) -> Result<(std::thread::JoinHandle<()>, Self), Error> {
>> -        let (tx, rx) = std::sync::mpsc::sync_channel(3);
>> +        // use twice the threadcount for the channel, so the read thread can already send another
>> +        // one when the previous one was not consumed yet, but keep the minimum at 3
>
> this reads a bit confusing, like the channel could go up to 2 x thread
> count; while that does not make much sense to anyone well acquainted
> with the matter at hand, it'd IMO still be nicer to clarify it for
> others stumbling into this, maybe something like:
>
> // set the buffer size of the channel queues to twice the number of threads or 3, whichever
> // is greater, to reduce the chance of a reader thread (producer) being blocked.
>
> Can be fixed up on applying though, if you agree (or propose something better).

Sounds fine to me.

>
>> +        let (tx, rx) = std::sync::mpsc::sync_channel((read_threads * 2).max(3));
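
For anyone reading along who is less familiar with bounded channels, here is a
minimal, standalone sketch of the producer/consumer pattern being discussed:
several reader threads feed a sync_channel whose capacity is
(read_threads * 2).max(3), and a single consumer drains it. This is
illustrative only, not the actual NewChunksIterator/ParallelHandler code;
read_chunk, the chunk type and the counts are made up for the example.

use std::sync::mpsc::sync_channel;
use std::thread;

// Hypothetical stand-in for reading one chunk from the datastore.
fn read_chunk(id: usize) -> Vec<u8> {
    vec![id as u8; 16]
}

fn main() {
    let read_threads: usize = 4;

    // Buffer size of the channel queue: twice the number of threads or 3,
    // whichever is greater, to reduce the chance of a producer being blocked.
    let (tx, rx) = sync_channel::<(usize, Vec<u8>)>((read_threads * 2).max(3));

    let mut handles = Vec::new();
    for t in 0..read_threads {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // Each thread reads a disjoint subset of the chunk ids.
            for id in (t..20).step_by(read_threads) {
                // send() blocks once the bounded buffer is full.
                tx.send((id, read_chunk(id))).expect("receiver dropped");
            }
        }));
    }
    // Drop the original sender so the receive loop ends when all threads finish.
    drop(tx);

    // Single consumer (the tape writer in the real code) drains the channel.
    for (id, data) in rx {
        println!("got chunk {id} ({} bytes)", data.len());
    }

    for h in handles {
        h.join().unwrap();
    }
}

The bounded capacity gives backpressure: if the consumer falls behind, the
producers block in send() instead of buffering chunks without limit.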