From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [RFC proxmox-backup 0/4] concurrent group pull support for sync jobs
Date: Thu, 25 Jul 2024 12:19:18 +0200
Message-ID: <20240725101922.231053-1-c.ebner@proxmox.com>
Pulling contents from a remote source via a sync job suffers from low
throughput on high-latency networks because of limitations of the
HTTP/2 connection, as described in [0]. As a workaround, pulling
multiple groups in parallel by establishing multiple reader instances
has been suggested.
This patch series therefore adds a configuration property
`group-sync-tasks` to sync jobs, which allows defining the number of
concurrent group pull tasks for each job. This is currently not
exposed in the UI. A valid config would look like this:
```
sync: s-d2755441-cca9
	ns
	owner root@pam
	group-sync-tasks 4
	remote pbs-remote-source
	remote-ns
	remote-store store
	remove-vanished false
	schedule daily
	store pullstore
```
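
To illustrate what a setting such as `group-sync-tasks 4` means in
practice, here is a minimal sketch of pulling groups with a bounded
number of workers. It uses plain `std` threads and a shared queue for
brevity and does not reflect the actual code in this series; the names
`pull_group` and `pull_groups_concurrently` are hypothetical
stand-ins.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for pulling a single backup group.
/// It only returns a status line instead of transferring data.
fn pull_group(group: &str) -> String {
    format!("pulled {group}")
}

/// Pull all `groups` using at most `tasks` worker threads.
/// Returns the status lines sorted, since completion order is not
/// deterministic once several workers run at the same time.
fn pull_groups_concurrently(groups: Vec<String>, tasks: usize) -> Vec<String> {
    let queue = Arc::new(Mutex::new(VecDeque::from(groups)));
    let results = Arc::new(Mutex::new(Vec::new()));

    // Always start at least one worker, even if `tasks` is 0.
    let workers: Vec<_> = (0..tasks.max(1))
        .map(|_| {
            let queue = Arc::clone(&queue);
            let results = Arc::clone(&results);
            thread::spawn(move || loop {
                // The queue lock is released at the end of this statement,
                // so other workers can fetch work while this one pulls.
                let next = queue.lock().unwrap().pop_front();
                match next {
                    Some(group) => {
                        let line = pull_group(&group);
                        results.lock().unwrap().push(line);
                    }
                    // Queue is empty: this worker is done.
                    None => break,
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().unwrap();
    }

    let mut lines = results.lock().unwrap().clone();
    lines.sort();
    lines
}
```

Calling `pull_groups_concurrently(groups, 4)` corresponds to the
example config above: no more than four groups are processed at any
time, and each worker fetches the next group as soon as it finishes
its current one.
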
This brings improvements, as roughly tested by artificially increasing
the latency on the bridge of the pull target host to 150ms via
`tc qdisc add dev vmbr0 root netem delay 150ms`
and verifying that the latency was applied by pinging the remote
source host.
Pulling with 2 concurrent tasks reduced the task runtime by about
25% compared to a single task; 4 configured tasks reduced the
runtime by about 30%.
The current approach, however, interferes with the status logging of a
sync job, as the order of log lines is no longer guaranteed. Therefore,
the logs are buffered instead and only shown after the corresponding
group pull task has run to completion.
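
The buffering idea can be sketched as follows, purely for discussion.
This is a simplified illustration with `std` threads and a channel,
not the code from patch 4/4; `pull_group_buffered` and
`run_with_buffered_logs` are hypothetical names, and one thread per
group is spawned only to keep the example short.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical group pull that collects its log lines in a local
/// buffer instead of writing them to the task log directly.
fn pull_group_buffered(group: &str) -> Vec<String> {
    let mut lines = Vec::new();
    lines.push(format!("sync group {group}"));
    // ... the actual snapshot transfer would happen here ...
    lines.push(format!("sync group {group} done"));
    lines
}

/// Run one pull per group in parallel and emit each group's log only
/// once that group has finished, so lines of different groups never
/// interleave in the returned output.
fn run_with_buffered_logs(groups: &[&str]) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<Vec<String>>();

    let handles: Vec<_> = groups
        .iter()
        .map(|group| {
            let tx = tx.clone();
            let group = group.to_string();
            thread::spawn(move || {
                let buffer = pull_group_buffered(&group);
                // Hand over the complete buffer in a single message.
                tx.send(buffer).unwrap();
            })
        })
        .collect();

    // Drop the original sender so the receive loop ends once all
    // worker threads have sent their buffer.
    drop(tx);

    let mut output = Vec::new();
    for buffer in rx {
        output.extend(buffer);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    output
}
```

The obvious drawback, and the reason for sending this as RFC, is that
nothing is shown for a group while its pull is still running.
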
Sending this as RFC as I am not too happy with how logging is handled;
maybe somebody has a better idea.
[0] https://bugzilla.proxmox.com/show_bug.cgi?id=4182
Christian Ebner (4):
  api: config/sync: add optional group-sync-tasks property
  server: pull: factor out group pull task into helper
  fix #4182: server: sync: allow pulling groups concurrently
  server: pull: conditionally buffer parallel tasks log output

 pbs-api-types/src/jobs.rs           |  14 ++
 pbs-datastore/src/store_progress.rs |   2 +-
 src/api2/config/sync.rs             |  10 +
 src/api2/pull.rs                    |  13 +-
 src/server/pull.rs                  | 311 +++++++++++++++++++++-------
 5 files changed, 274 insertions(+), 76 deletions(-)
--
2.39.2