public inbox for pbs-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox{,-backup} v5 00/12] fix #4182: concurrent group pull/push support for sync jobs
Date: Mon, 23 Mar 2026 13:37:25 +0100
Message-ID: <1774263381.bngcrer2th.astroid@yuna.none>
In-Reply-To: <20260309162050.1047341-1-c.ebner@proxmox.com>

On March 9, 2026 5:20 pm, Christian Ebner wrote:
> Syncing contents from/to a remote source via a sync job suffers from
> low throughput on high latency networks because of limitations by the
> HTTP/2 connection, as described in [0]. To improve, syncing multiple
> groups in parallel by establishing multiple reader instances has been
> suggested.
> 
> This patch series implements the functionality by adding the sync job
> configuration property `worker-threads`, which defines the number of
> group pull/push tokio tasks executed in parallel on the runtime
> during each job.
> 
> Example configuration:
> ```
> sync: s-8764c440-3a6c
>         ...
>         store datastore
>         sync-direction push
>         worker-threads 4
> ```
> 
> Since log messages are now also written concurrently, prefix logs
> related to groups, snapshots and archives with their respective
> context and add context to error messages.
> 
> Further, improve logging especially for sync jobs in push direction,
> which only displayed limited information so far.

I think we need to re-think this part here, because right now I get log
output like this:

2026-03-23T11:54:43+01:00: Starting datastore sync job 'local:test:tank:foobar:s-bc01cba6-805a'
2026-03-23T11:54:43+01:00: sync datastore 'tank' from 'local/test'
2026-03-23T11:54:43+01:00: ----
2026-03-23T11:54:43+01:00: Syncing datastore 'test', root namespace into datastore 'tank', namespace 'foobar'
2026-03-23T11:54:43+01:00: Found 0 groups to sync (out of 0 total)
2026-03-23T11:54:43+01:00: Finished syncing root namespace, current progress: 0 groups, 0 snapshots
2026-03-23T11:54:43+01:00: ----
2026-03-23T11:54:43+01:00: Syncing datastore 'test', namespace 'test' into datastore 'tank', namespace 'foobar/test'
2026-03-23T11:54:43+01:00: Found 19 groups to sync (out of 19 total)
2026-03-23T11:54:43+01:00: Group host/fourmeg: skipped: 1 snapshot(s) (2023-06-28T11:14:35Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/onemeg/2023-06-28T11:13:51Z
2026-03-23T11:54:43+01:00: Group host/format-v2-test: skipped: 2 snapshot(s) (2024-06-07T12:08:37Z .. 2024-06-07T12:08:47Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/symlink: skipped: 1 snapshot(s) (2023-06-07T06:48:26Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Snapshot host/onemeg/2023-06-28T11:13:51Z: no data changes
2026-03-23T11:54:43+01:00: Group host/onemeg: percentage done: 10.53% (2/19 groups)
2026-03-23T11:54:43+01:00: Group host/foobar: skipped: 1 snapshot(s) (2024-06-10T07:48:51Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/incrementaltest2: skipped: 1 snapshot(s) (2023-09-25T13:43:25Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/exclusion-test: skipped: 1 snapshot(s) (2024-04-23T06:53:44Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/fourmeg/2023-06-28T12:37:16Z
2026-03-23T11:54:43+01:00: Group host/inctest: skipped: 2 snapshot(s) (2023-11-13T13:22:57Z .. 2023-11-13T13:24:17Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/format-v2-test/2024-06-10T11:12:56Z
2026-03-23T11:54:43+01:00: re-sync snapshot host/symlink/2023-06-07T06:49:43Z
2026-03-23T11:54:43+01:00: skipping snapshot host/test-another-mail/2024-03-14T08:44:25Z - in-progress backup
2026-03-23T11:54:43+01:00: re-sync snapshot host/logtest/2024-01-30T10:52:34Z
2026-03-23T11:54:43+01:00: Group host/format-v3-test: skipped: 3 snapshot(s) (2024-06-03T10:34:11Z .. 2024-06-04T11:27:03Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/foobar/2024-06-10T07:48:57Z
2026-03-23T11:54:43+01:00: Snapshot host/format-v2-test/2024-06-10T11:12:56Z: no data changes
2026-03-23T11:54:43+01:00: Snapshot host/fourmeg/2023-06-28T12:37:16Z: no data changes
2026-03-23T11:54:43+01:00: re-sync snapshot host/inctest2/2023-11-13T13:53:27Z
2026-03-23T11:54:43+01:00: Group host/format-v2-test: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Group host/fourmeg: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/exclusion-test/2024-04-23T06:54:30Z
2026-03-23T11:54:43+01:00: Snapshot host/symlink/2023-06-07T06:49:43Z: no data changes
2026-03-23T11:54:43+01:00: Snapshot host/logtest/2024-01-30T10:52:34Z: no data changes
2026-03-23T11:54:43+01:00: Group host/test: skipped: 10 snapshot(s) (2024-07-10T08:18:51Z .. 2025-10-02T12:58:20Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/symlink: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/bookworm/2023-01-18T11:10:34Z
2026-03-23T11:54:43+01:00: Group host/logtest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/foobar/2024-06-10T07:48:57Z: no data changes
2026-03-23T11:54:43+01:00: Group host/foobar: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/incrementaltest2/2023-09-25T13:43:35Z
2026-03-23T11:54:43+01:00: Snapshot host/inctest2/2023-11-13T13:53:27Z: no data changes
2026-03-23T11:54:43+01:00: re-sync snapshot ct/999/2023-03-15T08:00:13Z
2026-03-23T11:54:43+01:00: Group host/inctest2: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/inctest/2023-11-13T13:24:26Z
2026-03-23T11:54:43+01:00: Snapshot host/exclusion-test/2024-04-23T06:54:30Z: no data changes
2026-03-23T11:54:43+01:00: Snapshot host/bookworm/2023-01-18T11:10:34Z: no data changes
2026-03-23T11:54:43+01:00: re-sync snapshot host/linuxtesttest/2024-07-17T13:38:31Z
2026-03-23T11:54:43+01:00: Group host/exclusion-test: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Group host/bookworm: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot ct/999/2023-03-15T08:00:13Z: no data changes
2026-03-23T11:54:43+01:00: Group ct/999: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/incrementaltest2/2023-09-25T13:43:35Z: no data changes
2026-03-23T11:54:43+01:00: Group host/incrementaltest2: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/linuxtesttest/2024-07-17T13:38:31Z: no data changes
2026-03-23T11:54:43+01:00: Group host/linuxtesttest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/inctest/2023-11-13T13:24:26Z: no data changes
2026-03-23T11:54:43+01:00: Group host/inctest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/format-v3-test/2024-06-04T09:13:52Z due to corruption
2026-03-23T11:54:43+01:00: re-sync snapshot host/acltest/2024-10-07T11:17:58Z
2026-03-23T11:54:43+01:00: Snapshot host/format-v3-test/2024-06-04T09:13:52Z: sync archive linux.ppxar.didx
2026-03-23T11:54:43+01:00: Snapshot host/acltest/2024-10-07T11:17:58Z: no data changes
2026-03-23T11:54:43+01:00: Group host/acltest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Group host/format-v3-test: percentage done: 18.42% (3/19 groups, 1/2 snapshots in group #4)
2026-03-23T11:54:43+01:00: sync group host/format-v3-test failed - Index and chunk CryptMode don't match.
2026-03-23T11:54:43+01:00: Finished syncing namespace test, current progress: 18 groups, 0 snapshots
2026-03-23T11:54:43+01:00: TASK ERROR: sync failed with some errors.

for a no-change sync, which is completely broken? 

but even for longer-running transfers, it might be beneficial to buffer
log lines that arrive in short succession and emit them grouped by
backup group?
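
something along these lines is what I have in mind - just a rough
sketch with made-up names, not tied to our actual logging
infrastructure:

```rust
use std::collections::HashMap;

// Hypothetical per-group log buffer: lines from concurrent workers are
// collected per backup group and only emitted as one contiguous block,
// instead of interleaving line-by-line in the task log.
struct GroupLogBuffer {
    buffers: HashMap<String, Vec<String>>,
}

impl GroupLogBuffer {
    fn new() -> Self {
        Self { buffers: HashMap::new() }
    }

    // queue a line under its group instead of printing it immediately
    fn log(&mut self, group: &str, line: &str) {
        self.buffers
            .entry(group.to_string())
            .or_default()
            .push(line.to_string());
    }

    // drain and return all buffered lines of one group, to be printed
    // as a contiguous block when the group's sync task completes
    fn flush_group(&mut self, group: &str) -> Vec<String> {
        self.buffers.remove(group).unwrap_or_default()
    }
}

fn main() {
    let mut buf = GroupLogBuffer::new();
    // lines from two concurrently synced groups arrive interleaved ...
    buf.log("host/onemeg", "re-sync snapshot 2023-06-28T11:13:51Z");
    buf.log("host/fourmeg", "re-sync snapshot 2023-06-28T12:37:16Z");
    buf.log("host/onemeg", "no data changes");
    // ... but get printed per group once a group is done
    for line in buf.flush_group("host/onemeg") {
        println!("host/onemeg: {line}");
    }
}
```

flushing on group completion (or after a short interval, for
long-running groups) would already avoid the interleaving above.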

and also maybe try a different prefix? it's very noisy atm, something
like

{group}: [{snapshot timestamp}: [{archive name}:]]

might work better flow-wise, since it allows scanning for the group
right at the start of the line. I also think a lot of the error paths
are lacking the prefix and would need to be adapted as well.
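
for illustration, building such a prefix could look like this (sketch
only, helper name and signature invented):

```rust
// Hypothetical helper for the suggested prefix scheme: the group always
// leads the line, snapshot timestamp and archive name are only appended
// when that context is available.
fn log_prefix(group: &str, snapshot: Option<&str>, archive: Option<&str>) -> String {
    let mut prefix = format!("{group}: ");
    if let Some(ts) = snapshot {
        prefix.push_str(ts);
        prefix.push_str(": ");
        // archive context only makes sense within a snapshot
        if let Some(name) = archive {
            prefix.push_str(name);
            prefix.push_str(": ");
        }
    }
    prefix
}

fn main() {
    // group-level message
    println!("{}skipped", log_prefix("host/fourmeg", None, None));
    // archive-level message
    println!(
        "{}sync archive",
        log_prefix(
            "host/format-v3-test",
            Some("2024-06-04T09:13:52Z"),
            Some("linux.ppxar.didx"),
        )
    );
}
```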


> 
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=4182
> 
> Change since version 4 (thanks @Max for review):
> - Use dedicated tokio tasks to run in parallel on different runtime threads,
>   not just multiple concurrent futures on the same thread.
> - Rework store progress accounting logic to avoid mutex locks when possible,
>   use atomic counters instead.
> - Expose setting also in the sync job edit window, not just the config.
> 
> 
> proxmox:
> 
> Christian Ebner (1):
>   pbs api types: add `worker-threads` to sync job config
> 
>  pbs-api-types/src/jobs.rs | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> 
> proxmox-backup:
> 
> Christian Ebner (11):
>   client: backup writer: fix upload stats size and rate for push sync
>   api: config/sync: add optional `worker-threads` property
>   sync: pull: revert avoiding reinstantiation for encountered chunks map
>   sync: pull: factor out backup group locking and owner check
>   sync: pull: prepare pull parameters to be shared across parallel tasks
>   fix #4182: server: sync: allow pulling backup groups in parallel
>   server: pull: prefix log messages and add error context
>   sync: push: prepare push parameters to be shared across parallel tasks
>   server: sync: allow pushing groups concurrently
>   server: push: prefix log messages and add additional logging
>   ui: expose group worker setting in sync job edit window
> 
>  pbs-client/src/backup_stats.rs  |  20 +--
>  pbs-client/src/backup_writer.rs |   4 +-
>  src/api2/config/sync.rs         |  10 ++
>  src/api2/pull.rs                |   9 +-
>  src/api2/push.rs                |   8 +-
>  src/server/pull.rs              | 246 +++++++++++++++++++-------------
>  src/server/push.rs              | 178 +++++++++++++++++------
>  src/server/sync.rs              |  90 +++++++++++-
>  www/window/SyncJobEdit.js       |  11 ++
>  9 files changed, 411 insertions(+), 165 deletions(-)
> 
> 
> Summary over all repositories:
>   10 files changed, 422 insertions(+), 165 deletions(-)
> 
> -- 
> Generated by murpp 0.9.0
> 



