From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup v6 11/15] server: pull: prefix log messages and add error context
Date: Tue, 21 Apr 2026 10:00:26 +0200 [thread overview]
Message-ID: <1776757625.a5d7yahnh6.astroid@yuna.none> (raw)
In-Reply-To: <7f8852e8-c092-4e2d-a031-192c087fb364@proxmox.com>
On April 21, 2026 9:42 am, Christian Ebner wrote:
> On 4/21/26 9:21 AM, Christian Ebner wrote:
>> On 4/20/26 1:54 PM, Fabian Grünbichler wrote:
>>> On April 17, 2026 11:26 am, Christian Ebner wrote:
>>>> Pulling groups and therefore also snapshots in parallel leads to
>>>> unordered log outputs, making it mostly impossible to relate a log
>>>> message to a backup snapshot/group.
>>>>
>>>> Therefore, prefix pull job log messages by the corresponding group or
>>>> snapshot and set the error context accordingly.
>>>>
>>>> Also, reword some messages, inline variables in format strings and
>>>> start log lines with capital letters to get consistent output.
>>>>
>>>> By using the buffered logger implementation and buffer up to 5 lines
>>>> with a timeout of 1 second, subsequent log lines arriving in fast
>>>> succession are kept together, reducing the mixing of lines.
>>>>
>>>> Example output for a sequential pull job:
>>>> ```
>>>> ...
>>>> [ct/100]: 2025-11-17T10:11:42Z: start sync
>>>> [ct/100]: 2025-11-17T10:11:42Z: pct.conf.blob: sync archive
>>>> [ct/100]: 2025-11-17T10:11:42Z: root.ppxar.didx: sync archive
>>>> [ct/100]: 2025-11-17T10:11:42Z: root.ppxar.didx: downloaded 16.785
>>>> MiB (280.791 MiB/s)
>>>> [ct/100]: 2025-11-17T10:11:42Z: root.mpxar.didx: sync archive
>>>> [ct/100]: 2025-11-17T10:11:42Z: root.mpxar.didx: downloaded 65.703
>>>> KiB (29.1 MiB/s)
>>>> [ct/100]: 2025-11-17T10:11:42Z: sync done
>>>> [ct/100]: percentage done: 9.09% (1/11 groups)
>>>> [ct/101]: 2026-03-31T12:20:16Z: start sync
>>>> [ct/101]: 2026-03-31T12:20:16Z: pct.conf.blob: sync archive
>>>> [ct/101]: 2026-03-31T12:20:16Z: root.pxar.didx: sync archive
>>>> [ct/101]: 2026-03-31T12:20:16Z: root.pxar.didx: downloaded 199.806 MiB (311.91 MiB/s)
>>>> [ct/101]: 2026-03-31T12:20:16Z: catalog.pcat1.didx: sync archive
>>>> [ct/101]: 2026-03-31T12:20:16Z: catalog.pcat1.didx: downloaded 180.379 KiB (22.748 MiB/s)
>>>> [ct/101]: 2026-03-31T12:20:16Z: sync done
>>>
>>> this is probably the wrong patch for this comment, but since you have
>>> the sample output here ;)
>>>
>>> should we pad the labels? the buffered logger knows all currently active
>>> labels and could adapt it? otherwise especially for host backups the
>>> logs are again not really scannable by humans, because there will be
>>> weird jumps in alignment.. or (.. continued below)
>>
>> That will not work though? While the logger knows about the labels when
>> receiving them, it does not know at all about future ones that might
>> still come in. So it can happen that the padding does not fit anymore,
>> and one gets even uglier formatting issues.
>>
>> I would prefer to keep them as is for the time being. For vm/ct it is
>> not that bad, one could maybe foresee the ID to be a 4-digit number and
>> define a minimum label length to be padded if not reached.
ack, let's leave this as it is for now, can always be bikeshedded if
needed later on!
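For reference, the minimum-width idea mentioned above could be sketched
roughly like this (a hypothetical helper, not code from this series; the
width value is an assumption):

```rust
// Assumed minimum label width: wide enough for the common "ct/NNNN" /
// "vm/NNNN" case, chosen arbitrarily here for illustration.
const MIN_WIDTH: usize = 10;

// Pad short labels (typical vm/ct IDs) to a fixed minimum width so the
// log columns line up; longer labels (host backups, explicit labels)
// are printed in full, which is exactly the limitation discussed above.
fn padded_prefix(label: &str) -> String {
    // `{:<width$}` left-aligns the label and pads with spaces up to
    // `width`; labels longer than `width` are not truncated.
    format!("[{:<width$}]:", label, width = MIN_WIDTH)
}
```

So `[ct/100]` would line up with `[ct/1005]`, while a long host label
would still break the alignment.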
>>
>> Host backups or backups with explicit label
>>
cut off?
>>
>>>> ...
>>>> ```
>>>>
>>>> Example output for a parallel pull job:
>>>> ```
>>>> ...
>>>> [ct/107]: 2025-07-16T09:14:01Z: start sync
>>>> [ct/107]: 2025-07-16T09:14:01Z: pct.conf.blob: sync archive
>>>> [ct/107]: 2025-07-16T09:14:01Z: root.ppxar.didx: sync archive
>>>> [vm/108]: 2025-09-19T07:37:19Z: start sync
>>>> [vm/108]: 2025-09-19T07:37:19Z: qemu-server.conf.blob: sync archive
>>>> [vm/108]: 2025-09-19T07:37:19Z: drive-scsi0.img.fidx: sync archive
>>>> [ct/107]: 2025-07-16T09:14:01Z: root.ppxar.didx: downloaded 609.233 MiB (112.628 MiB/s)
>>>> [ct/107]: 2025-07-16T09:14:01Z: root.mpxar.didx: sync archive
>>>> [ct/107]: 2025-07-16T09:14:01Z: root.mpxar.didx: downloaded 1.172 MiB (17.838 MiB/s)
>>>> [ct/107]: 2025-07-16T09:14:01Z: sync done
>>>> [ct/107]: percentage done: 72.73% (8/11 groups)
>>>
>>> the way the prefix and snapshot are formatted could also be interpreted
>>> at first glance as a timestamp of the log line.. why not just prepend
>>> the prefix on the logger side, and leave it up to the caller to do the
>>> formatting? then we could use "{type}/{id}/" as prefix here? or add
>>> 'snapshot' in those lines to make it clear? granted, this is more of an
>>> issue when viewing the log via `proxmox-backup-manager`, as in the UI we
>>> have the log timestamps up front..
>>
>> I simply followed along the lines of your suggestion here. In prior
>> versions I had exactly that, but you rejected it as too verbose? :)
>>
>> https://lore.proxmox.com/pbs-devel/1774263381.bngcrer2th.astroid@yuna.none/
>>
my suggestions (for just the ct/107 lines above) would have been:
ct/107: 2025-07-16T09:14:01Z: start sync
ct/107: 2025-07-16T09:14:01Z pct.conf.blob: sync archive
ct/107: 2025-07-16T09:14:01Z root.ppxar.didx: sync archive
ct/107: 2025-07-16T09:14:01Z root.ppxar.didx: downloaded 609.233 MiB (112.628 MiB/s)
ct/107: 2025-07-16T09:14:01Z root.mpxar.didx: sync archive
ct/107: 2025-07-16T09:14:01Z root.mpxar.didx: downloaded 1.172 MiB (17.838 MiB/s)
ct/107: 2025-07-16T09:14:01Z: sync done
ct/107: percentage done: 72.73% (8/11 groups)
though it might make sense to group that even more visually
ct/107: 2025-07-16T09:14:01Z: start sync
ct/107: 2025-07-16T09:14:01Z/pct.conf.blob: sync archive
ct/107: 2025-07-16T09:14:01Z/root.ppxar.didx: sync archive
ct/107: 2025-07-16T09:14:01Z/root.ppxar.didx: downloaded 609.233 MiB (112.628 MiB/s)
ct/107: 2025-07-16T09:14:01Z/root.mpxar.didx: sync archive
ct/107: 2025-07-16T09:14:01Z/root.mpxar.didx: downloaded 1.172 MiB (17.838 MiB/s)
ct/107: 2025-07-16T09:14:01Z: sync done
ct/107: percentage done: 72.73% (8/11 groups)
or
[ct/107 2025-07-16T09:14:01Z] start sync
[ct/107 2025-07-16T09:14:01Z pct.conf.blob] sync archive
[ct/107 2025-07-16T09:14:01Z root.ppxar.didx] sync archive
[ct/107 2025-07-16T09:14:01Z root.ppxar.didx] downloaded 609.233 MiB (112.628 MiB/s)
[ct/107 2025-07-16T09:14:01Z root.mpxar.didx] sync archive
[ct/107 2025-07-16T09:14:01Z root.mpxar.didx] downloaded 1.172 MiB (17.838 MiB/s)
[ct/107 2025-07-16T09:14:01Z] sync done
[ct/107] percentage done: 72.73% (8/11 groups)
but that is probably a matter of taste, so let's postpone this aspect
for now.
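Just to make the third (bracketed) variant above concrete, the prefix
could be built by a small helper along these lines (names and signature
are illustrative only, not the actual pull code):

```rust
// Build a bracketed log prefix from the group label plus optional
// snapshot timestamp and archive name, e.g.
//   [ct/107 2025-07-16T09:14:01Z root.ppxar.didx]
// Components are joined with spaces inside one pair of brackets so the
// prefix cannot be mistaken for a log-line timestamp.
fn log_prefix(group: &str, snapshot: Option<&str>, archive: Option<&str>) -> String {
    let mut parts = vec![group.to_string()];
    if let Some(snap) = snapshot {
        parts.push(snap.to_string());
    }
    if let Some(arch) = archive {
        parts.push(arch.to_string());
    }
    format!("[{}]", parts.join(" "))
}
```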
>>>
>>> and maybe(?) log the progress line using a different prefix? because
>>> right now the information that the group [ct/107] is finished is not
>>> really clear from the output, IMHO.
>>
>> That I can add, yes.
thanks!
>>
>>>
>>> the progress logging is also still broken (this is for a sync that takes
>>> a while, this is not log messages being buffered and re-ordered!):
>>>
>>> $ proxmox-backup-manager task log 'UPID:yuna:00070656:001E420F:00000002:69E610EB:syncjob:local\x3atest\x3atank\x3a\x3as\x2dbc01cba6\x2d805a:root@pam:' | grep -e namespace -e 'percentage done'
>>> Syncing datastore 'test', root namespace into datastore 'tank', root namespace
>>> Finished syncing root namespace, current progress: 0 groups, 0 snapshots
>>> Syncing datastore 'test', namespace 'test' into datastore 'tank', namespace 'test'
>>> [host/exclusion-test]: percentage done: 5.26% (1/19 groups)
>>> [host/acltest]: percentage done: 5.26% (1/19 groups)
>>> [host/logtest]: percentage done: 5.26% (1/19 groups)
>>> [host/onemeg]: percentage done: 5.26% (1/19 groups)
>>> [host/fourmeg]: percentage done: 2.63% (0/19 groups, 1/2 snapshots in group #1)
>>> [host/symlink]: percentage done: 2.63% (0/19 groups, 1/2 snapshots in group #1)
>>> [host/symlink]: percentage done: 5.26% (1/19 groups)
>>> [host/fourmeg]: percentage done: 5.26% (1/19 groups)
>>> [host/format-v2-test]: percentage done: 1.75% (0/19 groups, 1/3 snapshots in group #1)
>>> [host/format-v2-test]: percentage done: 3.51% (0/19 groups, 2/3 snapshots in group #1)
>>> [host/format-v2-test]: percentage done: 5.26% (1/19 groups)
>>> [host/incrementaltest2]: percentage done: 2.63% (0/19 groups, 1/2 snapshots in group #1)
>>> [host/incrementaltest2]: percentage done: 5.26% (1/19 groups)
>>
>> There will always be cases where it does not work I guess, not sure what
>> I can do to make you happy here.
>
> Oh, sorry, might have been too quick on this one. You are saying this
> happens for a non-parallel sync?
no, this was a parallel sync, but the progress should be correct for
parallel syncs as well, shouldn't it?
here's the same log but less truncated:
[..]
2026-04-20T13:41:31+02:00: ----
2026-04-20T13:41:31+02:00: Syncing datastore 'test', namespace 'test' into datastore 'tank', namespace 'test'
2026-04-20T13:41:31+02:00: Found 19 groups to sync (out of 19 total)
2026-04-20T13:41:31+02:00: [ct/106]: sync group ct/106 failed - owner check failed (test@pbs != root@pam)
2026-04-20T13:41:31+02:00: [ct/999]: sync group ct/999 failed - owner check failed (test@pbs != root@pam)
2026-04-20T13:41:31+02:00: skipping snapshot host/test-another-mail/2024-03-14T08:44:25Z - in-progress backup
2026-04-20T13:41:31+02:00: [host/exclusion-test]: host/exclusion-test: skipped: 1 snapshot(s) (2024-04-23T06:53:44Z) - older than the newest snapshot present on sync target
2026-04-20T13:41:31+02:00: [host/exclusion-test]: re-sync snapshot 2024-04-23T06:54:30Z
2026-04-20T13:41:31+02:00: [host/exclusion-test]: 2024-04-23T06:54:30Z: no data changes
2026-04-20T13:41:31+02:00: [host/exclusion-test]: percentage done: 5.26% (1/19 groups)
2026-04-20T13:41:31+02:00: [host/acltest]: re-sync snapshot 2024-10-07T11:17:58Z
2026-04-20T13:41:31+02:00: [host/acltest]: 2024-10-07T11:17:58Z: no data changes
2026-04-20T13:41:31+02:00: [host/acltest]: percentage done: 5.26% (1/19 groups)
2026-04-20T13:41:32+02:00: [host/logtest]: 2024-01-30T10:52:34Z: start sync
2026-04-20T13:41:32+02:00: [host/logtest]: 2024-01-30T10:52:34Z: test.pxar.didx: sync archive
2026-04-20T13:41:32+02:00: [host/logtest]: 2024-01-30T10:52:34Z: test.pxar.didx: downloaded 13.169 KiB (168.332 KiB/s)
2026-04-20T13:41:32+02:00: [host/logtest]: 2024-01-30T10:52:34Z: catalog.pcat1.didx: sync archive
2026-04-20T13:41:32+02:00: [host/logtest]: 2024-01-30T10:52:34Z: catalog.pcat1.didx: downloaded 2.157 KiB (10.2 KiB/s)
2026-04-20T13:41:32+02:00: [host/logtest]: 2024-01-30T10:52:34Z: rsa-encrypted.key.blob: sync archive
2026-04-20T13:41:33+02:00: [host/logtest]: 2024-01-30T10:52:34Z: sync done
2026-04-20T13:41:33+02:00: [host/logtest]: percentage done: 5.26% (1/19 groups)
2026-04-20T13:41:33+02:00: [host/onemeg]: 2023-06-28T11:13:51Z: start sync
2026-04-20T13:41:33+02:00: [host/onemeg]: 2023-06-28T11:13:51Z: test.img.fidx: sync archive
2026-04-20T13:41:33+02:00: [host/onemeg]: 2023-06-28T11:13:51Z: test.img.fidx: downloaded 2.012 KiB (5.704 KiB/s)
2026-04-20T13:41:33+02:00: [host/onemeg]: 2023-06-28T11:13:51Z: sync done
2026-04-20T13:41:33+02:00: [host/onemeg]: percentage done: 5.26% (1/19 groups)
2026-04-20T13:41:33+02:00: [host/fourmeg]: 2023-06-28T11:14:35Z: start sync
2026-04-20T13:41:33+02:00: [host/fourmeg]: 2023-06-28T11:14:35Z: test.img.fidx: sync archive
2026-04-20T13:41:33+02:00: [host/fourmeg]: 2023-06-28T11:14:35Z: test.img.fidx: downloaded 2.012 KiB (5.885 KiB/s)
2026-04-20T13:41:33+02:00: [host/fourmeg]: 2023-06-28T11:14:35Z: sync done
2026-04-20T13:41:33+02:00: [host/fourmeg]: percentage done: 2.63% (0/19 groups, 1/2 snapshots in group #1)
2026-04-20T13:41:33+02:00: [host/fourmeg]: 2023-06-28T12:37:16Z: start sync
2026-04-20T13:41:34+02:00: [host/symlink]: 2023-06-07T06:48:26Z: start sync
2026-04-20T13:41:34+02:00: [host/symlink]: 2023-06-07T06:48:26Z: symlink.pxar.didx: sync archive
2026-04-20T13:41:34+02:00: [host/symlink]: 2023-06-07T06:48:26Z: symlink.pxar.didx: downloaded 145 B (802.587 B/s)
2026-04-20T13:41:34+02:00: [host/symlink]: 2023-06-07T06:48:26Z: catalog.pcat1.didx: sync archive
2026-04-20T13:41:34+02:00: [host/symlink]: 2023-06-07T06:48:26Z: catalog.pcat1.didx: downloaded 57 B (279.566 B/s)
2026-04-20T13:41:34+02:00: [host/symlink]: 2023-06-07T06:48:26Z: sync done
2026-04-20T13:41:37+02:00: [host/incrementaltest2]: 2023-09-25T13:43:25Z: start sync
2026-04-20T13:41:37+02:00: [host/incrementaltest2]: 2023-09-25T13:43:25Z: test.pxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/format-v3-test]: 2024-06-03T10:34:11Z: start sync
2026-04-20T13:41:37+02:00: [host/format-v3-test]: 2024-06-03T10:34:11Z: linux.ppxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/fourmeg]: 2023-06-28T12:37:16Z: test.img.fidx: sync archive
2026-04-20T13:41:37+02:00: [host/linuxtesttest]: 2024-07-17T13:38:31Z: start sync
2026-04-20T13:41:37+02:00: [host/linuxtesttest]: 2024-07-17T13:38:31Z: linux.pxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/inctest]: 2023-11-13T13:22:57Z: start sync
2026-04-20T13:41:37+02:00: [host/inctest]: 2023-11-13T13:22:57Z: test.pxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/inctest2]: 2023-11-13T13:53:27Z: start sync
2026-04-20T13:41:37+02:00: [host/inctest2]: 2023-11-13T13:53:27Z: test.pxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/format-v2-test]: 2024-06-07T12:08:37Z: start sync
2026-04-20T13:41:37+02:00: [host/format-v2-test]: 2024-06-07T12:08:37Z: linux.ppxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/foobar]: re-sync snapshot 2024-06-10T07:48:51Z due to corruption
2026-04-20T13:41:37+02:00: [host/foobar]: 2024-06-10T07:48:51Z: foobar.ppxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/symlink]: percentage done: 2.63% (0/19 groups, 1/2 snapshots in group #1)
2026-04-20T13:41:37+02:00: [host/symlink]: 2023-06-07T06:49:43Z: start sync
2026-04-20T13:41:37+02:00: [host/symlink]: 2023-06-07T06:49:43Z: foo.pxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/symlink]: 2023-06-07T06:49:43Z: foo.pxar.didx: downloaded 0 B (0 B/s)
2026-04-20T13:41:37+02:00: [host/symlink]: 2023-06-07T06:49:43Z: catalog.pcat1.didx: sync archive
2026-04-20T13:41:37+02:00: [host/bookworm]: re-sync snapshot 2023-01-18T11:10:34Z due to corruption
2026-04-20T13:41:37+02:00: [host/bookworm]: 2023-01-18T11:10:34Z: repo.pxar.didx: sync archive
2026-04-20T13:41:37+02:00: [host/test]: 2024-07-10T08:18:51Z: start sync
2026-04-20T13:41:37+02:00: [host/test]: 2024-07-10T08:18:51Z: test.img.fidx: sync archive
2026-04-20T13:41:37+02:00: [host/symlink]: 2023-06-07T06:49:43Z: catalog.pcat1.didx: downloaded 53 B (216.705 B/s)
2026-04-20T13:41:37+02:00: [host/symlink]: 2023-06-07T06:49:43Z: sync done
2026-04-20T13:41:37+02:00: [host/symlink]: percentage done: 5.26% (1/19 groups)
2026-04-20T13:42:55+02:00: [host/fourmeg]: 2023-06-28T12:37:16Z: test.img.fidx: downloaded 4 MiB (51.191 KiB/s)
2026-04-20T13:42:55+02:00: [host/fourmeg]: 2023-06-28T12:37:16Z: sync done
2026-04-20T13:42:55+02:00: [host/fourmeg]: percentage done: 5.26% (1/19 groups)
those percentage lines are just completely bogus?? and there are at
least two "breaks" of over a second where all buffered log lines must
have been flushed, so it's also not that the lines with lower
percentage were re-ordered somehow..
it goes on and on and the percentage always stays at 5.26% at most,
since the finished group count never goes higher than 1.. though I have
to admit I stopped it before it finished, so maybe it would have gone
higher later on? I can re-run it to completion and check what it says
before it's done?
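For what it's worth, output like this would be explained if each
parallel task reported a task-local finished-group count instead of a
shared one. A shared counter would look roughly like the sketch below
(this is a guess at the cause, with illustrative names, not an analysis
of the actual pull code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Shared progress state for parallel group pulls: every task bumps the
// same counter when its group finishes and reports the global total,
// so "percentage done" grows monotonically across all tasks instead of
// each task getting stuck at "1/19 groups".
struct SyncProgress {
    done_groups: AtomicUsize,
    total_groups: usize,
}

impl SyncProgress {
    // Mark one group as finished; returns the new global count and the
    // resulting overall percentage.
    fn finish_group(&self) -> (usize, f64) {
        let done = self.done_groups.fetch_add(1, Ordering::SeqCst) + 1;
        let percent = done as f64 / self.total_groups as f64 * 100.0;
        (done, percent)
    }
}
```

With 19 groups, the first finished group reports 5.26%, the second
10.53%, and so on, regardless of which task finishes it.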
2026-04-17 9:26 [PATCH proxmox{,-backup} v6 00/15] fix #4182: concurrent group pull/push support for sync jobs Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox v6 01/15] pbs api types: add `worker-threads` to sync job config Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 02/15] tools: group and sort module imports Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 03/15] tools: implement buffered logger for concurrent log messages Christian Ebner
2026-04-20 10:57 ` Fabian Grünbichler
2026-04-20 17:15 ` Christian Ebner
2026-04-21 6:49 ` Fabian Grünbichler
2026-04-17 9:26 ` [PATCH proxmox-backup v6 04/15] tools: add bounded join set to run concurrent tasks bound by limit Christian Ebner
2026-04-20 11:15 ` Fabian Grünbichler
2026-04-17 9:26 ` [PATCH proxmox-backup v6 05/15] client: backup writer: fix upload stats size and rate for push sync Christian Ebner
2026-04-20 12:29 ` Fabian Grünbichler
2026-04-17 9:26 ` [PATCH proxmox-backup v6 06/15] api: config/sync: add optional `worker-threads` property Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 07/15] sync: pull: revert avoiding reinstantiation for encountered chunks map Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 08/15] sync: pull: factor out backup group locking and owner check Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 09/15] sync: pull: prepare pull parameters to be shared across parallel tasks Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 10/15] fix #4182: server: sync: allow pulling backup groups in parallel Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 11/15] server: pull: prefix log messages and add error context Christian Ebner
2026-04-20 11:56 ` Fabian Grünbichler
2026-04-21 7:21 ` Christian Ebner
2026-04-21 7:42 ` Christian Ebner
2026-04-21 8:00 ` Fabian Grünbichler [this message]
2026-04-21 8:04 ` Christian Ebner
2026-04-21 12:57 ` Thomas Lamprecht
2026-04-17 9:26 ` [PATCH proxmox-backup v6 12/15] sync: push: prepare push parameters to be shared across parallel tasks Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 13/15] server: sync: allow pushing groups concurrently Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 14/15] server: push: prefix log messages and add additional logging Christian Ebner
2026-04-17 9:26 ` [PATCH proxmox-backup v6 15/15] ui: expose group worker setting in sync job edit window Christian Ebner
2026-04-20 12:33 ` [PATCH proxmox{,-backup} v6 00/15] fix #4182: concurrent group pull/push support for sync jobs Fabian Grünbichler
2026-04-21 10:28 ` superseded: " Christian Ebner