From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox{,-backup} v5 00/12] fix #4182: concurrent group pull/push support for sync jobs
Date: Mon, 23 Mar 2026 13:37:25 +0100
Message-ID: <1774263381.bngcrer2th.astroid@yuna.none>
In-Reply-To: <20260309162050.1047341-1-c.ebner@proxmox.com>

On March 9, 2026 5:20 pm, Christian Ebner wrote:
> Syncing contents from/to a remote source via a sync job suffers from
> low throughput on high latency networks because of limitations by the
> HTTP/2 connection, as described in [0]. To improve, syncing multiple
> groups in parallel by establishing multiple reader instances has been
> suggested.
> 
> This patch series implements this by adding the sync job
> configuration property `worker-threads`, which defines the number of
> group pull/push tokio tasks to be executed in parallel on the
> runtime during each job.
> 
> Example configuration:
> ```
> sync: s-8764c440-3a6c
> 	...
> 	store datastore
> 	sync-direction push
> 	worker-threads 4
> ```
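The bounded group parallelism described above can be sketched with plain std threads (the series itself schedules tokio tasks on the runtime); `sync_group` and `sync_all` are hypothetical stand-ins, not code from the patches:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for syncing one backup group; the real
// implementation pulls/pushes snapshots over HTTP/2.
fn sync_group(group: &str) -> String {
    format!("synced {group}")
}

// Run at most `worker_threads` group syncs in parallel, mirroring the
// `worker-threads` sync job property described above.
fn sync_all(groups: Vec<String>, worker_threads: usize) -> Vec<String> {
    let queue = Arc::new(Mutex::new(groups));
    let results = Arc::new(Mutex::new(Vec::new()));
    let mut handles = Vec::new();
    for _ in 0..worker_threads {
        let queue = Arc::clone(&queue);
        let results = Arc::clone(&results);
        handles.push(thread::spawn(move || {
            // Each worker pops groups until the shared queue is empty,
            // so at most `worker_threads` groups are in flight at once.
            loop {
                let group = match queue.lock().unwrap().pop() {
                    Some(g) => g,
                    None => break,
                };
                let res = sync_group(&group);
                results.lock().unwrap().push(res);
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    Arc::try_unwrap(results).unwrap().into_inner().unwrap()
}
```

Note that completion order is nondeterministic, which is exactly why the log output below interleaves.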
> 
> Since log messages are now written concurrently, prefix log lines
> related to groups, snapshots and archives with their respective
> context and add context to error messages.
> 
> Further, improve logging especially for sync jobs in push direction,
> which displayed only limited information so far.

I think we need to re-think this part here, because right now I get log
output like this:

2026-03-23T11:54:43+01:00: Starting datastore sync job 'local:test:tank:foobar:s-bc01cba6-805a'
2026-03-23T11:54:43+01:00: sync datastore 'tank' from 'local/test'
2026-03-23T11:54:43+01:00: ----
2026-03-23T11:54:43+01:00: Syncing datastore 'test', root namespace into datastore 'tank', namespace 'foobar'
2026-03-23T11:54:43+01:00: Found 0 groups to sync (out of 0 total)
2026-03-23T11:54:43+01:00: Finished syncing root namespace, current progress: 0 groups, 0 snapshots
2026-03-23T11:54:43+01:00: ----
2026-03-23T11:54:43+01:00: Syncing datastore 'test', namespace 'test' into datastore 'tank', namespace 'foobar/test'
2026-03-23T11:54:43+01:00: Found 19 groups to sync (out of 19 total)
2026-03-23T11:54:43+01:00: Group host/fourmeg: skipped: 1 snapshot(s) (2023-06-28T11:14:35Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/onemeg/2023-06-28T11:13:51Z
2026-03-23T11:54:43+01:00: Group host/format-v2-test: skipped: 2 snapshot(s) (2024-06-07T12:08:37Z .. 2024-06-07T12:08:47Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/symlink: skipped: 1 snapshot(s) (2023-06-07T06:48:26Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Snapshot host/onemeg/2023-06-28T11:13:51Z: no data changes
2026-03-23T11:54:43+01:00: Group host/onemeg: percentage done: 10.53% (2/19 groups)
2026-03-23T11:54:43+01:00: Group host/foobar: skipped: 1 snapshot(s) (2024-06-10T07:48:51Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/incrementaltest2: skipped: 1 snapshot(s) (2023-09-25T13:43:25Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/exclusion-test: skipped: 1 snapshot(s) (2024-04-23T06:53:44Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/fourmeg/2023-06-28T12:37:16Z
2026-03-23T11:54:43+01:00: Group host/inctest: skipped: 2 snapshot(s) (2023-11-13T13:22:57Z .. 2023-11-13T13:24:17Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/format-v2-test/2024-06-10T11:12:56Z
2026-03-23T11:54:43+01:00: re-sync snapshot host/symlink/2023-06-07T06:49:43Z
2026-03-23T11:54:43+01:00: skipping snapshot host/test-another-mail/2024-03-14T08:44:25Z - in-progress backup
2026-03-23T11:54:43+01:00: re-sync snapshot host/logtest/2024-01-30T10:52:34Z
2026-03-23T11:54:43+01:00: Group host/format-v3-test: skipped: 3 snapshot(s) (2024-06-03T10:34:11Z .. 2024-06-04T11:27:03Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: re-sync snapshot host/foobar/2024-06-10T07:48:57Z
2026-03-23T11:54:43+01:00: Snapshot host/format-v2-test/2024-06-10T11:12:56Z: no data changes
2026-03-23T11:54:43+01:00: Snapshot host/fourmeg/2023-06-28T12:37:16Z: no data changes
2026-03-23T11:54:43+01:00: re-sync snapshot host/inctest2/2023-11-13T13:53:27Z
2026-03-23T11:54:43+01:00: Group host/format-v2-test: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Group host/fourmeg: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/exclusion-test/2024-04-23T06:54:30Z
2026-03-23T11:54:43+01:00: Snapshot host/symlink/2023-06-07T06:49:43Z: no data changes
2026-03-23T11:54:43+01:00: Snapshot host/logtest/2024-01-30T10:52:34Z: no data changes
2026-03-23T11:54:43+01:00: Group host/test: skipped: 10 snapshot(s) (2024-07-10T08:18:51Z .. 2025-10-02T12:58:20Z) - older than the newest snapshot present on sync target
2026-03-23T11:54:43+01:00: Group host/symlink: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/bookworm/2023-01-18T11:10:34Z
2026-03-23T11:54:43+01:00: Group host/logtest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/foobar/2024-06-10T07:48:57Z: no data changes
2026-03-23T11:54:43+01:00: Group host/foobar: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/incrementaltest2/2023-09-25T13:43:35Z
2026-03-23T11:54:43+01:00: Snapshot host/inctest2/2023-11-13T13:53:27Z: no data changes
2026-03-23T11:54:43+01:00: re-sync snapshot ct/999/2023-03-15T08:00:13Z
2026-03-23T11:54:43+01:00: Group host/inctest2: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/inctest/2023-11-13T13:24:26Z
2026-03-23T11:54:43+01:00: Snapshot host/exclusion-test/2024-04-23T06:54:30Z: no data changes
2026-03-23T11:54:43+01:00: Snapshot host/bookworm/2023-01-18T11:10:34Z: no data changes
2026-03-23T11:54:43+01:00: re-sync snapshot host/linuxtesttest/2024-07-17T13:38:31Z
2026-03-23T11:54:43+01:00: Group host/exclusion-test: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Group host/bookworm: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot ct/999/2023-03-15T08:00:13Z: no data changes
2026-03-23T11:54:43+01:00: Group ct/999: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/incrementaltest2/2023-09-25T13:43:35Z: no data changes
2026-03-23T11:54:43+01:00: Group host/incrementaltest2: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/linuxtesttest/2024-07-17T13:38:31Z: no data changes
2026-03-23T11:54:43+01:00: Group host/linuxtesttest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Snapshot host/inctest/2023-11-13T13:24:26Z: no data changes
2026-03-23T11:54:43+01:00: Group host/inctest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: re-sync snapshot host/format-v3-test/2024-06-04T09:13:52Z due to corruption
2026-03-23T11:54:43+01:00: re-sync snapshot host/acltest/2024-10-07T11:17:58Z
2026-03-23T11:54:43+01:00: Snapshot host/format-v3-test/2024-06-04T09:13:52Z: sync archive linux.ppxar.didx
2026-03-23T11:54:43+01:00: Snapshot host/acltest/2024-10-07T11:17:58Z: no data changes
2026-03-23T11:54:43+01:00: Group host/acltest: percentage done: 21.05% (4/19 groups)
2026-03-23T11:54:43+01:00: Group host/format-v3-test: percentage done: 18.42% (3/19 groups, 1/2 snapshots in group #4)
2026-03-23T11:54:43+01:00: sync group host/format-v3-test failed - Index and chunk CryptMode don't match.
2026-03-23T11:54:43+01:00: Finished syncing namespace test, current progress: 18 groups, 0 snapshots
2026-03-23T11:54:43+01:00: TASK ERROR: sync failed with some errors.

for a no-change sync, where the interleaved output is completely broken.

But even for longer-running transfers, it might be beneficial to buffer
log lines that arrive in short succession and group them together by
backup group?
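Such buffering could be sketched roughly like this; `flush_grouped` is a hypothetical helper (not code from the series) that reorders an already-buffered batch of `"<group>: <message>"` lines so each group's lines end up adjacent:

```rust
use std::collections::BTreeMap;

// Regroup buffered log lines of the form "<group>: <message>" by backup
// group, so lines belonging to one group that arrived in short
// succession end up next to each other in the task log.
fn flush_grouped(lines: &[&str]) -> Vec<String> {
    let mut buckets: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &line in lines {
        // Fall back to the whole line if there is no "<group>: " prefix.
        let (group, _) = line.split_once(": ").unwrap_or((line, ""));
        buckets.entry(group).or_default().push(line);
    }
    // Emit one group's buffered lines after another, preserving the
    // original order within each group.
    buckets.into_values().flatten().map(str::to_string).collect()
}
```

A real implementation would flush on a timer or on buffer pressure rather than only at the end, to keep the task log live.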

It might also be worth trying a different prefix? The current one is
very noisy; something like

{group}: [{snapshot timestamp}: [{archive name}:]]

might work better flow-wise, since it allows scanning for the group
right at the start of the line. I also think a lot of the error paths
are lacking the prefix and would need to be adapted as well.
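A helper producing that prefix could look roughly like this; `log_prefix` is a hypothetical name, with the nesting mirroring the optional snapshot/archive parts of the suggested format:

```rust
// Build the suggested log prefix "{group}: [{snapshot}: [{archive}: ]]",
// so the group can always be scanned at the start of the line. The
// snapshot part is only emitted with a group, the archive part only
// with a snapshot.
fn log_prefix(group: &str, snapshot: Option<&str>, archive: Option<&str>) -> String {
    let mut prefix = format!("{group}: ");
    if let Some(snap) = snapshot {
        prefix.push_str(&format!("{snap}: "));
        if let Some(archive) = archive {
            prefix.push_str(&format!("{archive}: "));
        }
    }
    prefix
}
```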


> 
> [0] https://bugzilla.proxmox.com/show_bug.cgi?id=4182
> 
> Changes since version 4 (thanks @Max for the review):
> - Use dedicated tokio tasks to run in parallel on different runtime threads,
>   not just multiple concurrent futures on the same thread.
> - Rework store progress accounting logic to avoid mutex locks when possible,
>   use atomic counters instead.
> - Expose setting also in the sync job edit window, not just the config.
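The atomic progress accounting mentioned in the changelog could be sketched like this; `StoreProgress` and its fields are illustrative here, not the actual struct from the series:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Lock-free progress accounting shared by reference across the parallel
// group sync tasks, avoiding a mutex around simple counters.
#[derive(Default)]
struct StoreProgress {
    done_groups: AtomicUsize,
    done_snapshots: AtomicUsize,
}

impl StoreProgress {
    // Relaxed ordering suffices for monotonically increasing counters
    // that are only read for progress reporting, never for
    // synchronization between tasks.
    fn group_done(&self) -> usize {
        self.done_groups.fetch_add(1, Ordering::Relaxed) + 1
    }

    fn snapshot_done(&self) -> usize {
        self.done_snapshots.fetch_add(1, Ordering::Relaxed) + 1
    }
}
```

Each method returns the new count, so a task can log "percentage done" without taking any lock.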
> 
> 
> proxmox:
> 
> Christian Ebner (1):
>   pbs api types: add `worker-threads` to sync job config
> 
>  pbs-api-types/src/jobs.rs | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> 
> proxmox-backup:
> 
> Christian Ebner (11):
>   client: backup writer: fix upload stats size and rate for push sync
>   api: config/sync: add optional `worker-threads` property
>   sync: pull: revert avoiding reinstantiation for encountered chunks map
>   sync: pull: factor out backup group locking and owner check
>   sync: pull: prepare pull parameters to be shared across parallel tasks
>   fix #4182: server: sync: allow pulling backup groups in parallel
>   server: pull: prefix log messages and add error context
>   sync: push: prepare push parameters to be shared across parallel tasks
>   server: sync: allow pushing groups concurrently
>   server: push: prefix log messages and add additional logging
>   ui: expose group worker setting in sync job edit window
> 
>  pbs-client/src/backup_stats.rs  |  20 +--
>  pbs-client/src/backup_writer.rs |   4 +-
>  src/api2/config/sync.rs         |  10 ++
>  src/api2/pull.rs                |   9 +-
>  src/api2/push.rs                |   8 +-
>  src/server/pull.rs              | 246 +++++++++++++++++++-------------
>  src/server/push.rs              | 178 +++++++++++++++++------
>  src/server/sync.rs              |  90 +++++++++++-
>  www/window/SyncJobEdit.js       |  11 ++
>  9 files changed, 411 insertions(+), 165 deletions(-)
> 
> 
> Summary over all repositories:
>   10 files changed, 422 insertions(+), 165 deletions(-)
> 
> -- 
> Generated by murpp 0.9.0