From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>, pbs-devel@lists.proxmox.com
Subject: Re: [PATCH proxmox-backup v6 03/15] tools: implement buffered logger for concurrent log messages
Date: Tue, 21 Apr 2026 08:49:42 +0200
Message-ID: <1776753720.36yeg85o88.astroid@yuna.none>
In-Reply-To: <bda178d7-f666-49e7-9859-29f972bae36b@proxmox.com>
On April 20, 2026 7:15 pm, Christian Ebner wrote:
> On 4/20/26 12:56 PM, Fabian Grünbichler wrote:
>> On April 17, 2026 11:26 am, Christian Ebner wrote:
>>> Implements a buffered logger instance which collects messages sent
>>> from different sender instances via an async tokio channel and
>>> buffers them. Senders identify themselves by a label and provide a
>>> log level for each log line to be buffered and flushed.
>>>
>>> On collection, log lines are grouped by label and buffered in
>>> sequence of arrival per label, up to the configured maximum number
>>> of lines per group, or flushed periodically at the configured
>>> interval. The interval timeout is reset whenever contents are
>>> flushed. In addition, senders can request flushing at any point.
>>>
>>> When the timeout derived from the interval is reached, the log
>>> buffers of all labels are flushed. There is no guarantee on the
>>> order of labels when flushing.
>>>
>>> Log output is written based on the provided log line level and
>>> prefixed with the label.
>>>
>>> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> [..]
>>> + /// Starts the collection loop spawned on a new tokio task
>>> + /// Finishes when all senders belonging to the channel have been dropped.
>>> + pub fn run_log_collection(mut self) {
>>> + let future = async move {
>>> + loop {
>>> + let deadline = Instant::now() + self.max_aggregation_time;
>>> + match time::timeout_at(deadline, self.receive_log_line()).await {
>>
>> why manually calculate the deadline, wouldn't using `time::timeout` work
>> as well? the only difference from a quick glance is that that one does a
>> checked_add for now + delay..
>
> No specific reason for using timeout_at() here, I was primed by having
> based this on the s3 client timeout.
I guess it depends on how we tackle the issue below; for some approaches,
timeout_at with a deadline calculation that is not reset each iteration
would also be appropriate..
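i.e. something like this (rough sketch, untested - same loop body as in
the patch, just without the manual deadline calculation):

loop {
    match time::timeout(self.max_aggregation_time, self.receive_log_line()).await {
        Ok(finished) => {
            if finished {
                break;
            }
        }
        // timeout elapsed without any new message, flush everything
        Err(_timeout) => self.flush_all_buffered(),
    }
}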
>> but also, isn't this kind of broken in any case? let's say I have two
>> labels A and B:
>>
>> 0.99 A1
>> 1.98 A2
>> 2.97 A3
>> 3.96 A4
>> 4.95 A5 (now A is at capacity)
>> 5.94 B1
>> 9.90 B5 (now B is at capacity as well)
>>
>> either
>>
>> 10.90 timeout elapses, everything is flushed
>>
>> or
>>
>> 10.89 A6 (A gets flushed and can start over - but B hasn't been flushed)
>> 11.88 A7
>> 12.87 A8
>> 13.86 A9
>> 14.85 A10 (A has 5 buffered messages again)
>> ..
>>
>> this means that any label that doesn't log a 6th message can stall for
>> quite a long time, as long as other labels make progress (and it isn't
>> flushed explicitly)?
>
> Yes, this is true, but that is not really avoidable unless there is a
> timeout per label. Or would you suggest to simply flush all buffered
> lines at periodic intervals, without resetting at all?
yeah, I think we want to have a hard limit for the delay per label as
well, not just one for the delay-if-no-activity-at-all.
because the log timestamp is only added when emitting the log line to
the real logger, and if we delay too long the log doesn't reflect
reality anymore..
basically what we want to achieve here is that
- a single no-change sync should be emitted as one block of log lines,
unless it takes unusually long
- a small burst of back-to-back log messages of the same group (e.g.,
two warning lines) is emitted as one block, unless timing was really
bad
- no individual log line should be delayed for more than N where N is
rather small
that does mean we need to do some extra checking when handling the
messages coming in over the channel, because otherwise the channel
traffic could overwhelm the flushing logic and violate the third
property.
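e.g. something along these lines (very rough, untested sketch - the
helpers oldest_buffered_instant(), handle_request() and
flush_expired_or_full() are invented names, the rest assumes the fields
from the patch):

loop {
    // deadline derived from the oldest buffered line, so no line waits
    // longer than max_aggregation_time; fall back to now + interval if
    // nothing is buffered at all
    let deadline = self
        .oldest_buffered_instant()
        .map(|oldest| oldest + self.max_aggregation_time)
        .unwrap_or_else(|| Instant::now() + self.max_aggregation_time);

    match time::timeout_at(deadline, self.receiver.recv()).await {
        Ok(Some(request)) => {
            self.handle_request(request);
            // check after every message, so heavy channel traffic cannot
            // keep postponing the flush of other labels
            self.flush_expired_or_full(Instant::now());
        }
        // all senders have been dropped, flush the rest and finish
        Ok(None) => {
            self.flush_all_buffered();
            break;
        }
        Err(_timeout) => self.flush_expired_or_full(Instant::now()),
    }
}

with flush_expired_or_full() flushing every label that either hit the
line limit or whose oldest buffered line is older than
max_aggregation_time.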
>>> + Ok(finished) => {
>>> + if finished {
>>> + break;
>>> + }
>>> + }
>>> + Err(_timeout) => self.flush_all_buffered(),
>>> + }
>>> + }
>>> + };
>>> + match LogContext::current() {
>>> + None => tokio::spawn(future),
>>> + Some(context) => tokio::spawn(context.scope(future)),
>>> + };
>>> + }
>>> +
>>> + /// Collects new log lines, buffers and flushes them if max lines limit exceeded.
>>> + ///
>>> + /// Returns `true` if all the senders have been dropped and the task should no
>>> + /// longer wait for new messages and finish.
>>> + async fn receive_log_line(&mut self) -> bool {
>>> + if let Some(request) = self.receiver.recv().await {
>>> + match request {
>>> + SenderRequest::Flush(label) => {
>>> + if let Some(log_lines) = self.buffer_map.get_mut(&label) {
>>> + Self::log_with_label(&label, log_lines);
>>> + log_lines.clear();
>>> + }
>>> + }
>>> + SenderRequest::Message(log_line) => {
>>
>> if this would be Message((label, level, line)) or Message((label,
>> level_and_line)) the label would not need to be stored in the buffer
>> keys and values..
>
> Yes, already adapted based on the above mention.
>
>>
>>> + if self.max_buffered_lines == 0
>>> + || self.max_aggregation_time < Duration::from_secs(0)
>>
>> the timeout can never be below zero, as that is the minimum duration
>> (duration is unsigned)?
>
> This is a typo, the intention was to check for durations below 1 second,
> but since the granularity is seconds, this should check for 0 instead.
in that case, you can just call self.max_aggregation_time.is_zero() :)
though if we go with the trait/non-buffering implementation, this whole
check could go, since a buffered logger would never be instantiated with
buffering disabled?
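i.e. something like this very rough sketch (trait and type names are
invented, log level handling omitted, just to illustrate the direction):

/// shared interface for the buffered and the plain logger
trait SyncLogger {
    /// emit (or buffer) a log line for the given label
    fn log(&self, label: &str, line: &str);
    /// flush anything buffered for the given label (no-op when unbuffered)
    fn flush(&self, label: &str);
}

/// plain implementation: logs directly, no channel or timeout machinery
struct PlainLogger;

impl SyncLogger for PlainLogger {
    fn log(&self, label: &str, line: &str) {
        tracing::info!("{label}: {line}");
    }

    fn flush(&self, _label: &str) {}
}

the buffered implementation would then wrap the sender side of the
channel and keep the collection task as is, and callers only ever see
the trait - which would also cover the "let the sender log directly"
idea quoted below.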
>>> + {
>>> + // shortcut if no buffering should happen
>>> + Self::log_by_level(&log_line.label, &log_line);
>>
>> shouldn't we rather handle this by not using the buffered logger in the
>> first place? e.g., have this and a simple not-buffering logger implement
>> a shared logging trait, or something similar?
>
> Hmm, that might be better, yes. Will add a trait with 2 implementations
> based on which logger is required.
>
>>
>> one simple approach would be to just make the LogLineSender log directly
>> in this case, and not send anything at all?
>>
>> because if we don't want buffering, sending all log messages through a
>> channel and setting up the timeout machinery can be avoided completely..
>>
> [..]