public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>,
	Dominik Csapak <d.csapak@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] client/backup_writer: clarify backup and upload size
Date: Wed, 24 Mar 2021 15:39:04 +0100	[thread overview]
Message-ID: <e8c4d3eb-1988-251d-0359-77f3fe9995fc@proxmox.com> (raw)
In-Reply-To: <20210324105904.28126-1-d.csapak@proxmox.com>

On 24.03.21 11:59, Dominik Csapak wrote:
> The text 'had to upload [KMG]iB' implies that this is the size we
> actually had to send to the server, while in reality it is the
> raw data size before compression.
> 
> Count the size of the compressed chunks and print it separately.
> Split the average speed into its own line so they do not get too long.
> 

looks Ok code-wise, so I can only complain about a few naming and a format nits
and that we now def. need to fix that mess of a return signature tuple monster
Fabian also noted off-list.

> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
>  src/client/backup_writer.rs | 35 ++++++++++++++++++++++-------------
>  1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/src/client/backup_writer.rs b/src/client/backup_writer.rs
> index cef7edef..1be66707 100644
> --- a/src/client/backup_writer.rs
> +++ b/src/client/backup_writer.rs
> @@ -1,6 +1,6 @@
>  use std::collections::HashSet;
>  use std::os::unix::fs::OpenOptionsExt;
> -use std::sync::atomic::{AtomicUsize, Ordering};
> +use std::sync::atomic::{AtomicUsize, AtomicU64, Ordering};
>  use std::sync::{Arc, Mutex};
>  
>  use anyhow::{bail, format_err, Error};
> @@ -256,7 +256,7 @@ impl BackupWriter {
>  
>          let wid = self.h2.post(&index_path, Some(param)).await?.as_u64().unwrap();
>  
> -        let (chunk_count, chunk_reused, size, size_reused, duration, csum) =
> +        let (chunk_count, chunk_reused, size, size_reused, uploaded, duration, csum) =

why not compressed_size ? it may not necessarily be the total upload amount (protocol
and tls overhead, ...?)

>              Self::upload_chunk_info_stream(
>                  self.h2.clone(),
>                  wid,
> @@ -269,7 +269,7 @@ impl BackupWriter {
>              )
>              .await?;
>  
> -        let uploaded = size - size_reused;
> +        let saved = size - size_reused;

hmm, that name change is not really improves understanding for me, `saved` is rather
general and a bit confusing (as we effectively saved all of `size`, not only this
amount). Maybe `dirty_size`?

>          let vsize_h: HumanByte = size.into();
>          let archive = if self.verbose {
>              archive_name.to_string()
> @@ -277,9 +277,11 @@ impl BackupWriter {
>              crate::tools::format::strip_server_file_extension(archive_name)
>          };
>          if archive_name != CATALOG_NAME {
> -            let speed: HumanByte = ((uploaded * 1_000_000) / (duration.as_micros() as usize)).into();
> +            let speed: HumanByte = ((saved * 1_000_000) / (duration.as_micros() as usize)).into();
> +            let saved: HumanByte = saved.into();
>              let uploaded: HumanByte = uploaded.into();
> -            println!("{}: had to upload {} of {} in {:.2}s, average speed {}/s).", archive, uploaded, vsize_h, duration.as_secs_f64(), speed);
> +            println!("{}: had to backup {} of {} (compressed {}) in {:.2}s", archive, saved, vsize_h, uploaded, duration.as_secs_f64());
> +            println!("Average backup speed: {}.", speed);

we normally put the archive in front for archive-specific messages, and
there's no point at the end of the sentence in surrounding prints.

>          } else {
>              println!("Uploaded backup catalog ({})", vsize_h);
>          }
> @@ -521,7 +523,7 @@ impl BackupWriter {
>          crypt_config: Option<Arc<CryptConfig>>,
>          compress: bool,
>          verbose: bool,
> -    ) -> impl Future<Output = Result<(usize, usize, usize, usize, std::time::Duration, [u8; 32]), Error>> {
> +    ) -> impl Future<Output = Result<(usize, usize, usize, usize, usize, std::time::Duration, [u8; 32]), Error>> {

a WriterStats (just first name popping into my head) struct would be really
nicer here.

>  
>          let total_chunks = Arc::new(AtomicUsize::new(0));
>          let total_chunks2 = total_chunks.clone();
> @@ -530,6 +532,8 @@ impl BackupWriter {
>  
>          let stream_len = Arc::new(AtomicUsize::new(0));
>          let stream_len2 = stream_len.clone();
> +        let compressed_stream_len = Arc::new(AtomicU64::new(0));
> +        let compressed_stream_len2 = compressed_stream_len.clone();
>          let reused_len = Arc::new(AtomicUsize::new(0));
>          let reused_len2 = reused_len.clone();
>  
> @@ -572,6 +576,7 @@ impl BackupWriter {
>                  csum.update(digest);
>  
>                  let chunk_is_known = known_chunks.contains(digest);
> +                let compressed_stream_len2 = compressed_stream_len.clone();
>                  if chunk_is_known {
>                      known_chunk_count.fetch_add(1, Ordering::SeqCst);
>                      reused_len.fetch_add(chunk_len, Ordering::SeqCst);
> @@ -580,12 +585,15 @@ impl BackupWriter {
>                      known_chunks.insert(*digest);
>                      future::ready(chunk_builder
>                          .build()
> -                        .map(move |(chunk, digest)| MergedChunkInfo::New(ChunkInfo {
> -                            chunk,
> -                            digest,
> -                            chunk_len: chunk_len as u64,
> -                            offset,
> -                        }))
> +                        .map(move |(chunk, digest)| {
> +                            compressed_stream_len2.fetch_add(chunk.raw_size(), Ordering::SeqCst);
> +                            MergedChunkInfo::New(ChunkInfo {
> +                                chunk,
> +                                digest,
> +                                chunk_len: chunk_len as u64,
> +                                offset,
> +                            })
> +                        })
>                      )
>                  }
>              })
> @@ -645,12 +653,13 @@ impl BackupWriter {
>                  let total_chunks = total_chunks2.load(Ordering::SeqCst);
>                  let known_chunk_count = known_chunk_count2.load(Ordering::SeqCst);
>                  let stream_len = stream_len2.load(Ordering::SeqCst);
> +                let upload_len = compressed_stream_len2.load(Ordering::SeqCst) as usize;
>                  let reused_len = reused_len2.load(Ordering::SeqCst);
>  
>                  let mut guard = index_csum_2.lock().unwrap();
>                  let csum = guard.take().unwrap().finish();
>  
> -                futures::future::ok((total_chunks, known_chunk_count, stream_len, reused_len, duration, csum))
> +                futures::future::ok((total_chunks, known_chunk_count, stream_len, reused_len, upload_len, duration, csum))
>              })
>      }
>  
> 





      reply	other threads:[~2021-03-24 14:39 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-24 10:59 Dominik Csapak
2021-03-24 14:39 ` Thomas Lamprecht [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e8c4d3eb-1988-251d-0359-77f3fe9995fc@proxmox.com \
    --to=t.lamprecht@proxmox.com \
    --cc=d.csapak@proxmox.com \
    --cc=pbs-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal