public inbox for pbs-devel@lists.proxmox.com
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [RFC proxmox-backup 0/4] concurrent group pull support for sync jobs
Date: Thu, 25 Jul 2024 12:19:18 +0200
Message-ID: <20240725101922.231053-1-c.ebner@proxmox.com>

Pulling contents from a remote source via a sync job suffers from low
throughput on high-latency networks because of limitations of the
HTTP/2 connection, as described in [0]. As a workaround, it has been
suggested to pull multiple groups in parallel by establishing multiple
reader instances.

This patch series therefore adds a configuration property
`group-sync-tasks` to sync jobs, which sets the number of concurrent
group pull tasks for each job. The property is currently not exposed
in the UI. A valid config would look like this:

```
sync: s-d2755441-cca9
	ns
	owner root@pam
	group-sync-tasks 4
	remote pbs-remote-source
	remote-ns
	remote-store store
	remove-vanished false
	schedule daily
	store pullstore
```
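
To illustrate what a task limit such as `group-sync-tasks 4` means in
practice, here is a minimal sketch of bounded concurrency. It is not the
code from this series: `pull_group` and `pull_groups_concurrently` are
hypothetical names, the pull itself is a stub, and plain OS threads are
used only to keep the example self-contained.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for pulling one backup group; returns a status line.
fn pull_group(group: &str) -> String {
    format!("synced {group}")
}

/// Pull all `groups` using at most `tasks` worker threads (at least one).
/// Results are returned in completion order, which is not deterministic
/// once more than one worker is running.
fn pull_groups_concurrently(groups: Vec<String>, tasks: usize) -> Vec<String> {
    let queue = Arc::new(Mutex::new(VecDeque::from(groups)));
    let results = Arc::new(Mutex::new(Vec::new()));

    let workers: Vec<_> = (0..tasks.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let results = Arc::clone(&results);
            thread::spawn(move || loop {
                // Take the next group; the queue lock is released before the pull starts.
                let next = queue.lock().unwrap().pop_front();
                match next {
                    Some(group) => {
                        let status = pull_group(&group);
                        results.lock().unwrap().push(status);
                    }
                    None => break, // queue drained, worker exits
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().expect("worker thread panicked");
    }

    let mut guard = results.lock().unwrap();
    std::mem::take(&mut *guard)
}

fn main() {
    let groups = vec![
        "vm/100".to_string(),
        "vm/101".to_string(),
        "ct/200".to_string(),
    ];
    let mut done = pull_groups_concurrently(groups, 4);
    done.sort(); // sort only to make the printed order stable
    for line in done {
        println!("{line}");
    }
}
```

Each worker takes the next group from a shared queue, so no more than
`tasks` pulls are in flight at any time, regardless of how many groups
the job covers.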

Rough testing shows an improvement. For the test, the latency on the
bridge of the pull target host was artificially increased to 150ms via

`tc qdisc add dev vmbr0 root netem delay 150ms`

and pinging the remote source host confirmed that the added latency
was in effect.

Pulling with 2 concurrent tasks reduced the task runtime by about 25%
compared to a single task; 4 configured tasks reduced the runtime by
about 30%.

The current approach, however, interferes with the status logging of a
sync job, as the order of log lines is no longer guaranteed. Therefore,
the logs are buffered instead and only shown after the corresponding
group pull task has run to completion.
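
The buffering idea can be sketched as follows. This is only an
illustration under assumptions, not the implementation in patch 4/4:
`pull_group_buffered` and `run_with_buffered_logs` are hypothetical
names, the log lines are made up, and threads with a channel stand in
for whatever task mechanism the series actually uses.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical group pull that records its log lines instead of printing them.
fn pull_group_buffered(group: &str) -> Vec<String> {
    vec![
        format!("{group}: sync started"),
        format!("{group}: sync finished"),
    ]
}

/// Run one thread per group and return the log blocks in the order the
/// tasks completed. Each block holds the lines of exactly one group.
fn run_with_buffered_logs(groups: &[&str]) -> Vec<Vec<String>> {
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = groups
        .iter()
        .map(|group| {
            let tx = tx.clone();
            let group = group.to_string();
            thread::spawn(move || {
                // Collect all lines locally, then hand over the whole buffer at once.
                let buffer = pull_group_buffered(&group);
                tx.send(buffer).expect("receiver dropped");
            })
        })
        .collect();

    // Drop the original sender so the receiver loop ends once all workers are done.
    drop(tx);

    let blocks: Vec<Vec<String>> = rx.iter().collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    blocks
}

fn main() {
    for block in run_with_buffered_logs(&["vm/100", "vm/101", "ct/200"]) {
        // Lines of one group are printed together, never interleaved with another group.
        for line in block {
            println!("{line}");
        }
    }
}
```

The trade-off is visible in the sketch: the output of one group stays
contiguous, but nothing is shown for a group until its task has
finished.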

Sending this as RFC as I am not too happy with how logging is handled;
maybe somebody has a better idea.

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=4182

Christian Ebner (4):
  api: config/sync: add optional group-sync-tasks property
  server: pull: factor out group pull task into helper
  fix #4182: server: sync: allow pulling groups concurrently
  server: pull: conditionally buffer parallel tasks log output

 pbs-api-types/src/jobs.rs           |  14 ++
 pbs-datastore/src/store_progress.rs |   2 +-
 src/api2/config/sync.rs             |  10 +
 src/api2/pull.rs                    |  13 +-
 src/server/pull.rs                  | 311 +++++++++++++++++++++-------
 5 files changed, 274 insertions(+), 76 deletions(-)

-- 
2.39.2




