From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christian Ebner
To: pbs-devel@lists.proxmox.com
Date: Thu, 25 Jul 2024 12:19:18 +0200
Message-Id: <20240725101922.231053-1-c.ebner@proxmox.com>
Subject: [pbs-devel] [RFC proxmox-backup 0/4] concurrent group pull support for sync jobs
List-Id: Proxmox Backup Server development discussion

Pulling contents from a remote source via a sync job suffers from low
throughput on high-latency networks because of limitations of the
HTTP/2 connection, as described in [0]. As a workaround, pulling
multiple groups in parallel by establishing multiple reader instances
has been suggested.

This patch series therefore adds a configuration property
`group-sync-tasks` to sync jobs, which allows defining the number of
concurrent group pull tasks for each job. The property is currently not
exposed in the UI.
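To illustrate the idea of a bounded number of concurrent group pull
tasks, here is a minimal sketch using only the Rust standard library.
It is not the code from this series: `pull_group` and
`pull_groups_concurrently` are hypothetical names, the pull itself is a
placeholder, and plain threads are an assumption made only to keep the
example self-contained.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for pulling a single backup group.
/// Returns a dummy "bytes transferred" value (the length of the name).
fn pull_group(group: &str) -> Result<usize, String> {
    // A real implementation would use its own reader instance here.
    Ok(group.len())
}

/// Pull all `groups` using at most `tasks` worker threads.
/// Returns one (group, result) entry per group, in completion order.
fn pull_groups_concurrently(
    groups: Vec<String>,
    tasks: usize,
) -> Vec<(String, Result<usize, String>)> {
    // Treat 0 like 1 so that there is always at least one worker.
    let tasks = tasks.max(1);
    let queue = Arc::new(Mutex::new(VecDeque::from(groups)));
    let results = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let results = Arc::clone(&results);
            thread::spawn(move || loop {
                // Take the next group; the queue lock is released at the
                // end of this statement, before the pull starts.
                let next = queue.lock().unwrap().pop_front();
                let Some(group) = next else { break };
                let outcome = pull_group(&group);
                results.lock().unwrap().push((group, outcome));
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // All workers are joined, so this is the only remaining reference.
    Arc::try_unwrap(results).unwrap().into_inner().unwrap()
}
```

Calling `pull_groups_concurrently(groups, 4)` would correspond to a job
configured with `group-sync-tasks 4` in this sketch: no more than four
groups are pulled at the same time, and each worker picks up the next
pending group as soon as it finishes its current one.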
A valid config would look like this:

```
sync: s-d2755441-cca9
	ns
	owner root@pam
	group-sync-tasks 4
	remote pbs-remote-source
	remote-ns
	remote-store store
	remove-vanished false
	schedule daily
	store pullstore
```

This brings improvements, as roughly tested by artificially increasing
the latency on the bridge of the pull target host to 150ms via
`tc qdisc add dev vmbr0 root netem delay 150ms` and verifying that the
latency was applied by pinging the remote source host. Pulling with 2
concurrent tasks reduced the task runtime by about 25% compared to a
single task; 4 configured tasks reduced the runtime by about 30%.

The current approach, however, interferes with the status logging of a
sync job, as the order of log lines is no longer guaranteed. Therefore,
the logs are buffered instead and only shown after the corresponding
group pull task has run to completion.

Sending this as RFC as I am not too happy with how logging is handled;
maybe somebody has a better idea.

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=4182

Christian Ebner (4):
  api: config/sync: add optional group-sync-tasks property
  server: pull: factor out group pull task into helper
  fix #4182: server: sync: allow pulling groups concurrently
  server: pull: conditionally buffer parallel tasks log output

 pbs-api-types/src/jobs.rs           |  14 ++
 pbs-datastore/src/store_progress.rs |   2 +-
 src/api2/config/sync.rs             |  10 +
 src/api2/pull.rs                    |  13 +-
 src/server/pull.rs                  | 311 +++++++++++++++++++++-------
 5 files changed, 274 insertions(+), 76 deletions(-)

-- 
2.39.2
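
For discussion, the log buffering described in this cover letter could
look roughly like the following sketch. It is an illustration under
assumptions, not the code from patch 4/4: `BufferedLog` and
`pull_with_buffered_logs` are hypothetical names, the log messages are
made up, and the shared task log is modelled as a plain `Vec<String>`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical per-task log buffer: lines are collected locally and
/// written to the shared log in one piece once the task has finished.
struct BufferedLog {
    lines: Vec<String>,
    sink: Arc<Mutex<Vec<String>>>,
}

impl BufferedLog {
    fn new(sink: Arc<Mutex<Vec<String>>>) -> Self {
        Self { lines: Vec::new(), sink }
    }

    /// Buffer a line instead of writing it to the shared log directly.
    fn log(&mut self, line: impl Into<String>) {
        self.lines.push(line.into());
    }

    /// Append all buffered lines while holding the sink lock, so the
    /// output of concurrently running tasks cannot interleave.
    fn flush(self) {
        self.sink.lock().unwrap().extend(self.lines);
    }
}

/// Run one (placeholder) pull task per group and return the shared log.
fn pull_with_buffered_logs(groups: Vec<String>) -> Vec<String> {
    let sink = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = groups
        .into_iter()
        .map(|group| {
            let mut log = BufferedLog::new(Arc::clone(&sink));
            thread::spawn(move || {
                log.log(format!("{group}: sync start"));
                // The actual group pull would happen here.
                log.log(format!("{group}: sync done"));
                log.flush();
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("pull task panicked");
    }

    let lines = sink.lock().unwrap().clone();
    lines
}
```

With this shape, the lines of one group always appear as a contiguous
block in the shared log, but the order of the blocks depends on which
task finishes first, and nothing is visible for a group until its task
has completed. That is the trade-off raised above.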