public inbox for pbs-devel@lists.proxmox.com
From: Christian Ebner <c.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v2 proxmox-backup] partial fix #5560: client: periodically show backup progress
Date: Thu, 10 Oct 2024 17:25:50 +0200
Message-ID: <8c0826e5-8829-469d-9b88-2059aa36fc7a@proxmox.com>
In-Reply-To: <43470f91-ba2e-4090-a525-c6b64369ad0a@proxmox.com>

On 10/10/24 16:45, Thomas Lamprecht wrote:
> On 09/10/2024 at 11:20, Christian Ebner wrote:
>> Spawn a new tokio task which about every minute displays the
>> cumulative progress of the backup for pxar, ppxar or img archive
>> streams. Catalog and metadata archive streams are excluded from the
>> output for better readability, and because the catalog upload lives
>> for the whole upload time, leading to possible temporal
>> misalignments in the output. The actual payload data is written via
>> the other streams anyway.
>>
>> Add accounting for uploaded chunks, to distinguish from chunks queued
>> for upload, but not actually uploaded yet.
>>
>> Example output in the backup task log:
>> ```
>> ...
>> INFO:  root.pxar: elapsed 60.00 s, new: 191.446 MiB, reused: 0 B, total: 191.446 MiB, uploaded: 13.021 MiB (compressed 5.327 MiB, average: 222.221 KiB/s)
>> INFO:  root.pxar: elapsed 120.00 s, new: 191.446 MiB, reused: 0 B, total: 191.446 MiB, uploaded: 27.068 MiB (compressed 11.583 MiB, average: 230.977 KiB/s)
>> INFO:  root.pxar: elapsed 180.00 s, new: 191.446 MiB, reused: 0 B, total: 191.446 MiB, uploaded: 36.138 MiB (compressed 14.987 MiB, average: 205.58 KiB/s)
>> ```
> 
> Thx for tackling this, but I'm rather nitpicky with the formatting of
> progress reports, so quite a bit commentary w.r.t. that:

Fine by me. I included most of the information based on what seemed
of interest to me, but it might be a bit too much information at once,
agreed.

> I'm not a total fan of those averaged bandwidth indicators, as they often
> suggest a slow tool (or uplink) if not much new data has to be sent.
> If anything, it might make a bit more sense to print the bandwidth of the
> total processed data?

Well, this is the average over the cumulative upload, not the progress 
of each individual time frame. So maybe calling this `total bandwidth` 
might be more fitting?
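
To illustrate what "average over the cumulative upload" means, here is a
minimal sketch. The function name `average_rate` is hypothetical and not
taken from the patch. The quoted log lines are consistent with this
reading: 13.021 MiB uploaded after 60 s gives roughly 222 KiB/s.

```rust
use std::time::Duration;

/// Average upload rate in bytes per second over the whole upload so far.
/// Hypothetical helper for illustration, not the code from the patch.
fn average_rate(uploaded_bytes: u64, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs == 0.0 {
        // Avoid a division by zero right after the upload started.
        return 0.0;
    }
    uploaded_bytes as f64 / secs
}
```

Because both inputs are cumulative, the value says nothing about the
rate within the last reporting interval.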

> Printing the elapsed time just in seconds can be rather unwieldy for longer
> running operations, e.g. "elapsed 32280 s, ..." for "8 h 58 m" is not so
> easy to parse. A HumanDuration which renders to something like, for example,
> "1w 2d 3h 4m 5.67s" could be nicer here (parts that are 0 simply omitted),
> but even just a local fn that handles this up to hour range would be a lot
> better.

Yeah, since the output is generated roughly once a minute, that should
help readability. I will try to improve this in a new version.
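
A local helper along these lines could cover the range up to hours. This
is only a sketch under my own assumptions: the name `format_elapsed` is
hypothetical, fractional seconds are dropped, and parts that are zero
are omitted as suggested.

```rust
use std::time::Duration;

/// Render a duration as e.g. "8h 58m", omitting parts that are zero.
/// Hypothetical helper: whole seconds only, hours are the largest unit.
fn format_elapsed(elapsed: Duration) -> String {
    let total_secs = elapsed.as_secs();
    let hours = total_secs / 3600;
    let minutes = (total_secs % 3600) / 60;
    let seconds = total_secs % 60;

    let mut parts = Vec::new();
    if hours > 0 {
        parts.push(format!("{}h", hours));
    }
    if minutes > 0 {
        parts.push(format!("{}m", minutes));
    }
    // Print seconds if non-zero, or if nothing else was printed ("0s").
    if seconds > 0 || parts.is_empty() {
        parts.push(format!("{}s", seconds));
    }
    parts.join(" ")
}
```

With this, the 32280 s from the example above would render as "8h 58m".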

> And I see some confusion potential with "new" as in "is it new since last
> status report output or total new data compared to previous snapshot"

All values are cumulative; none of them is calculated as a difference
between time frames. It seems this needs to be emphasized more in the
log output somehow.

Above, "new" means the total bytes of data located in new chunks that
are to be (or already have been) uploaded to the server, while "reused"
is the total number of bytes in reused chunks, which are therefore not
re-uploaded.

> Is "total" the amount of read data here? As that might be one of the better
> indicators, i.e. if I (roughly) know that directory I back up holds
> 10 GB of data and the client reports it read 8.7 GB it would be helpful
> for me even if it's naturally also not guaranteed to progress linearly.

The total is the number of bytes after chunking (so the stream
length), meaning the sum of the bytes in new and reused chunks.
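
The relation between the reported values could be sketched like this.
The struct and field names are hypothetical and only mirror the labels
in the log output. Deriving the queued amount as `new - uploaded` is my
assumption for illustration, based on both being counted in
uncompressed bytes.

```rust
/// Cumulative counters for one archive stream (hypothetical names).
#[derive(Debug, Default, Clone, Copy)]
struct UploadCounters {
    /// Bytes in new chunks, uploaded already or still queued.
    new_bytes: u64,
    /// Bytes in reused chunks, which are not re-uploaded.
    reused_bytes: u64,
    /// Uncompressed bytes of chunks that were actually uploaded.
    uploaded_bytes: u64,
    /// Compressed size of the uploaded chunks, i.e. bytes on the wire.
    compressed_bytes: u64,
}

impl UploadCounters {
    /// Stream length after chunking: new plus reused bytes.
    fn total_bytes(&self) -> u64 {
        self.new_bytes + self.reused_bytes
    }

    /// Assumption: new bytes that are not uploaded yet are still queued.
    fn queued_bytes(&self) -> u64 {
        self.new_bytes.saturating_sub(self.uploaded_bytes)
    }
}
```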

> Potentially also just report the compressed amount for "uploaded", as
> that's what really got uploaded?

The compressed value is exactly that: the total number of compressed
bytes uploaded. That was the intention of placing it in brackets next
to the uploaded value.

> As is, the format seems to benefit devs and technical users the most;
> for the ordinary user it might be a bit much.
> 
> Maybe reduce this to something like:
> 
> processed X data in T (optionally: processing-rate) uploaded Y

Okay, that should be good enough as a starting point. The main
motivation for this was anyhow to give some feedback to the user that
the backup is still making progress.

> where X is totally processed data, T is the elapsed time and Y is the
> amount of data that actually had to be sent over the network link.
> Just as a more actionable idea, there might be better variants.
> 
> FWIW: We could still add a more detailed reporting mode enabled through
> some CLI option later.

Okay, thanks for your input. I will try to improve this in a new version
of the patches based on your comments and suggestions.
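
For reference, the reduced "processed X in T uploaded Y" format
suggested above could look roughly like the following sketch. Both
`human_bytes` and `progress_line` are hypothetical helpers written for
this mail, not existing functions in the code base, and the exact
wording and number formatting are assumptions.

```rust
/// Format a byte count with binary units (hypothetical helper).
fn human_bytes(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        // Plain byte counts are printed without a fractional part.
        format!("{} B", bytes)
    } else {
        format!("{:.3} {}", value, UNITS[unit])
    }
}

/// Build one progress line: X processed, T elapsed, Y uploaded.
/// `processed` is the total of new and reused bytes, `uploaded` the
/// amount of data that actually had to be sent over the network.
fn progress_line(processed: u64, elapsed_secs: u64, uploaded: u64) -> String {
    format!(
        "processed {} in {}s, uploaded {}",
        human_bytes(processed),
        elapsed_secs,
        human_bytes(uploaded)
    )
}
```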


_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel


Thread overview: 5+ messages
2024-10-09  9:20 Christian Ebner
2024-10-10 10:50 ` Gabriel Goller
2024-10-10 14:45 ` Thomas Lamprecht
2024-10-10 15:25   ` Christian Ebner [this message]
2024-10-11  9:41 ` Christian Ebner
