public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output
@ 2022-03-24 15:44 Daniel Tschlatscher
  2022-03-25  8:29 ` Thomas Lamprecht
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Tschlatscher @ 2022-03-24 15:44 UTC (permalink / raw)
  To: pve-devel

The backup restore dialogue now displays size information and duration in a format more easily understandable for humans. The output was adapted to match the output of the backup restore dialogue where possible.
Added 2 helper methods for displaying bytes and time in human readable format.

Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
---
 ...VE-Backup-add-vma-backup-format-code.patch | 117 ++++++++++++++----
 1 file changed, 96 insertions(+), 21 deletions(-)

diff --git a/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch b/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch
index c4ed5bb..2dc1bd8 100644
--- a/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch
+++ b/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch
@@ -9,11 +9,11 @@ Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
 ---
  block/meson.build |   2 +
  meson.build       |   5 +
- vma-reader.c      | 857 ++++++++++++++++++++++++++++++++++++++++++++++
- vma-writer.c      | 790 ++++++++++++++++++++++++++++++++++++++++++
- vma.c             | 851 +++++++++++++++++++++++++++++++++++++++++++++
- vma.h             | 150 ++++++++
- 6 files changed, 2655 insertions(+)
+ vma-reader.c      | 922 ++++++++++++++++++++++++++++++++++++++++++++++
+ vma-writer.c      | 790 +++++++++++++++++++++++++++++++++++++++
+ vma.c             | 857 ++++++++++++++++++++++++++++++++++++++++++
+ vma.h             | 153 ++++++++
+ 6 files changed, 2729 insertions(+)
  create mode 100644 vma-reader.c
  create mode 100644 vma-writer.c
  create mode 100644 vma.c
@@ -57,10 +57,10 @@ index 96de1a6ef9..54c23b9567 100644
    subdir('contrib/elf2dmp')
 diff --git a/vma-reader.c b/vma-reader.c
 new file mode 100644
-index 0000000000..2b1d1cdab3
+index 0000000000..907759b295
 --- /dev/null
 +++ b/vma-reader.c
-@@ -0,0 +1,857 @@
+@@ -0,0 +1,922 @@
 +/*
 + * VMA: Virtual Machine Archive
 + *
@@ -77,6 +77,7 @@ index 0000000000..2b1d1cdab3
 +#include "qemu/osdep.h"
 +#include <glib.h>
 +#include <uuid/uuid.h>
++#include <math.h>
 +
 +#include "qemu-common.h"
 +#include "qemu/timer.h"
@@ -87,6 +88,9 @@ index 0000000000..2b1d1cdab3
 +
 +static unsigned char zero_vma_block[VMA_BLOCK_SIZE];
 +
++static time_t last_time = 0;
++static int64_t last_size = 0;
++
 +typedef struct VmaRestoreState {
 +    BlockBackend *target;
 +    bool write_zeroes;
@@ -649,13 +653,31 @@ index 0000000000..2b1d1cdab3
 +
 +        if (verbose) {
 +            time_t duration = time(NULL) - vmar->start_time;
-+            int percent = (vmar->clusters_read*100)/vmar->cluster_count;
-+            if (percent != vmar->clusters_read_per) {
-+                printf("progress %d%% (read %zd bytes, duration %zd sec)\n",
-+                       percent, vmar->clusters_read*VMA_CLUSTER_SIZE,
-+                       duration);
++            int percent = (vmar->clusters_read*100) / vmar->cluster_count;
++
++            /* Dont spam so many progress prints, but still show the 100% message*/
++            if ((duration - last_time) > 2 || percent == 100) {
++                int delta = duration - last_time;
++                int64_t bps = vmar->clusters_read*VMA_CLUSTER_SIZE - last_size;
++
++                if (delta != 0)
++                    bps /= delta;
++
++                printf("Progress %d%% (", percent);
++                print_human_readable_byte_count(vmar->clusters_read*VMA_CLUSTER_SIZE);
++                printf(" of ");
++                print_human_readable_byte_count(vmar->devinfo[dev_id].size);
++                printf(") in ");
++                print_human_readable_time(duration);
++                printf(" - ");
++                print_human_readable_byte_count(bps);
++                printf("/s\n");
++
 +                fflush(stdout);
++
 +                vmar->clusters_read_per = percent;
++                last_time = duration;
++                last_size = vmar->clusters_read*VMA_CLUSTER_SIZE;
 +            }
 +        }
 +
@@ -881,11 +903,17 @@ index 0000000000..2b1d1cdab3
 +
 +    if (verbose) {
 +        if (vmar->clusters_read) {
-+            printf("total bytes read %zd, sparse bytes %zd (%.3g%%)\n",
-+                   vmar->clusters_read*VMA_CLUSTER_SIZE,
-+                   vmar->zero_cluster_data,
-+                   (double)(100.0*vmar->zero_cluster_data)/
-+                   (vmar->clusters_read*VMA_CLUSTER_SIZE));
++            double sparse_percent = (double)(100.0*vmar->zero_cluster_data) /
++                   (vmar->clusters_read*VMA_CLUSTER_SIZE);
++            time_t duration = time(NULL) - vmar->start_time;
++
++            printf("Finished restoring ");
++            print_human_readable_byte_count(vmar->clusters_read*VMA_CLUSTER_SIZE);
++            printf(" bytes in ");
++	    print_human_readable_time(duration);
++	    printf(" with ");
++	    print_human_readable_byte_count(vmar->zero_cluster_data);
++	    printf(" of sparse data. (%.3g%%)\n", sparse_percent);
 +
 +            int64_t datasize = vmar->clusters_read*VMA_CLUSTER_SIZE-vmar->zero_cluster_data;
 +            if (datasize) { // this does not make sense for empty files
@@ -918,6 +946,44 @@ index 0000000000..2b1d1cdab3
 +    return vma_reader_restore_full(vmar, -1, verbose, true, errp);
 +}
 +
++void print_human_readable_time(int seconds) {
++	int days, hours, mins;
++
++	days = seconds / 86400;
++	seconds = seconds % 86400;
++
++	hours = seconds / 3600;
++	seconds = seconds % 3600;
++
++	mins = seconds / 60;
++	seconds = seconds % 60;
++
++	if (days)
++		printf("%d d ", days);
++	if (hours)
++		printf("%d h ", hours);
++	if (mins)
++		printf("%d m ", mins);
++	printf("%d s", seconds);
++}
++
++/* This should correctly display values up to 9,2 Ebibytes*/
++void print_human_readable_byte_count(int64_t value) {
++	double calculated = (double)value;
++	const char* units = "KMGTPE";
++	char unit;
++
++	int maxUnit = 0;
++	if (value > 1023) {
++		maxUnit = (int)(log(value)/log(1024));
++		calculated = value / pow(1024, maxUnit);
++		unit = units[maxUnit - 1];
++		printf("%.2f %ciB", calculated, unit);
++	} else {
++		printf("%zd B", (int64_t)calculated);
++	}
++}
+\ No newline at end of file
 diff --git a/vma-writer.c b/vma-writer.c
 new file mode 100644
 index 0000000000..11d8321ffd
@@ -1716,10 +1782,10 @@ index 0000000000..11d8321ffd
 +}
 diff --git a/vma.c b/vma.c
 new file mode 100644
-index 0000000000..df542b7732
+index 0000000000..781b5bf700
 --- /dev/null
 +++ b/vma.c
-@@ -0,0 +1,851 @@
+@@ -0,0 +1,857 @@
 +/*
 + * VMA: Virtual Machine Archive
 + *
@@ -1802,8 +1868,14 @@ index 0000000000..df542b7732
 +            if (strcmp(di->devname, "vmstate") == 0) {
 +                printf("VMSTATE: dev_id=%d memory: %zd\n", i, di->size);
 +            } else {
++                /* Information that is needed in qemu-server (PVE::QemuServer.pm)
++                   Change only if you know what you are doing */
 +                printf("DEV: dev_id=%d size: %zd devname: %s\n",
 +                       i, di->size, di->devname);
++
++                printf("Info: dev_id=%d size: ", i);
++                print_human_readable_byte_count(di->size);
++                printf(" devname: %s\n", di->devname);
 +            }
 +        }
 +    }
@@ -2573,10 +2645,10 @@ index 0000000000..df542b7732
 +}
 diff --git a/vma.h b/vma.h
 new file mode 100644
-index 0000000000..c895c97f6d
+index 0000000000..c4867b8584
 --- /dev/null
 +++ b/vma.h
-@@ -0,0 +1,150 @@
+@@ -0,0 +1,153 @@
 +/*
 + * VMA: Virtual Machine Archive
 + *
@@ -2726,4 +2798,7 @@ index 0000000000..c895c97f6d
 +                       Error **errp);
 +int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp);
 +
++void print_human_readable_time(int);
++void print_human_readable_byte_count(int64_t);
++
 +#endif /* BACKUP_VMA_H */
-- 
2.30.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output
  2022-03-24 15:44 [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output Daniel Tschlatscher
@ 2022-03-25  8:29 ` Thomas Lamprecht
       [not found]   ` <8af5712b-d33d-ffb7-ccd5-5dda8973822a@proxmox.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2022-03-25  8:29 UTC (permalink / raw)
  To: Proxmox VE development discussion, Daniel Tschlatscher

Thanks for the patch, some comments inline.

On 24.03.22 16:44, Daniel Tschlatscher wrote:
> The backup restore dialogue now displays size information and duration in a format more easily understandable for humans. The output was adapted to match the output of the backup restore dialogue where possible.

But this is the backup restore, so do you mean it was matched to how its
outputted for a special case there already or how its outputted for the backup
(i.e., the other direction) one?

> Added 2 helper methods for displaying bytes and time in human readable format.

please format the commit message according our submission rules:
https://pve.proxmox.com/wiki/Developer_Documentation#Commits_and_Commit_Messages

an output example (maybe even an before/after) excerpt would be nice to have here.

> 
> Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
> ---
>  ...VE-Backup-add-vma-backup-format-code.patch | 117 ++++++++++++++----
>  1 file changed, 96 insertions(+), 21 deletions(-)
> 
> diff --git a/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch b/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch
> index c4ed5bb..2dc1bd8 100644
> --- a/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch
> +++ b/debian/patches/pve/0026-PVE-Backup-add-vma-backup-format-code.patch
> @@ -9,11 +9,11 @@ Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>  ---
>   block/meson.build |   2 +
>   meson.build       |   5 +
> - vma-reader.c      | 857 ++++++++++++++++++++++++++++++++++++++++++++++
> - vma-writer.c      | 790 ++++++++++++++++++++++++++++++++++++++++++
> - vma.c             | 851 +++++++++++++++++++++++++++++++++++++++++++++
> - vma.h             | 150 ++++++++
> - 6 files changed, 2655 insertions(+)
> + vma-reader.c      | 922 ++++++++++++++++++++++++++++++++++++++++++++++
> + vma-writer.c      | 790 +++++++++++++++++++++++++++++++++++++++
> + vma.c             | 857 ++++++++++++++++++++++++++++++++++++++++++
> + vma.h             | 153 ++++++++
> + 6 files changed, 2729 insertions(+)
>   create mode 100644 vma-reader.c
>   create mode 100644 vma-writer.c
>   create mode 100644 vma.c
> @@ -57,10 +57,10 @@ index 96de1a6ef9..54c23b9567 100644
>     subdir('contrib/elf2dmp')
>  diff --git a/vma-reader.c b/vma-reader.c
>  new file mode 100644
> -index 0000000000..2b1d1cdab3
> +index 0000000000..907759b295
>  --- /dev/null
>  +++ b/vma-reader.c
> -@@ -0,0 +1,857 @@
> +@@ -0,0 +1,922 @@
>  +/*
>  + * VMA: Virtual Machine Archive
>  + *
> @@ -77,6 +77,7 @@ index 0000000000..2b1d1cdab3
>  +#include "qemu/osdep.h"
>  +#include <glib.h>
>  +#include <uuid/uuid.h>
> ++#include <math.h>
>  +
>  +#include "qemu-common.h"
>  +#include "qemu/timer.h"
> @@ -87,6 +88,9 @@ index 0000000000..2b1d1cdab3
>  +
>  +static unsigned char zero_vma_block[VMA_BLOCK_SIZE];
>  +
> ++static time_t last_time = 0;
> ++static int64_t last_size = 0;
> ++
>  +typedef struct VmaRestoreState {
>  +    BlockBackend *target;
>  +    bool write_zeroes;
> @@ -649,13 +653,31 @@ index 0000000000..2b1d1cdab3
>  +
>  +        if (verbose) {
>  +            time_t duration = time(NULL) - vmar->start_time;

The resulting accuracy for time in seconds could be not that good, did you make any
thoughts or comparison regarding that.

> -+            int percent = (vmar->clusters_read*100)/vmar->cluster_count;
> -+            if (percent != vmar->clusters_read_per) {
> -+                printf("progress %d%% (read %zd bytes, duration %zd sec)\n",
> -+                       percent, vmar->clusters_read*VMA_CLUSTER_SIZE,
> -+                       duration);
> ++            int percent = (vmar->clusters_read*100) / vmar->cluster_count;

if we already touch this then please also add the missing spaces around `*`

> ++
> ++            /* Dont spam so many progress prints, but still show the 100% message*/
> ++            if ((duration - last_time) > 2 || percent == 100) {
> ++                int delta = duration - last_time;
> ++                int64_t bps = vmar->clusters_read*VMA_CLUSTER_SIZE - last_size;

missing spaces around `*`, shortening such units can be confusing, I'd rather suggest:
(may need be written more nicely if it fails the 100 characters width formatting rule)

int64_t byte_throughput = delta > 0 ? (vmar->clusters_read * VMA_CLUSTER_SIZE - last_size) / delta : 0;

> ++
> ++                if (delta != 0)
> ++                    bps /= delta;

currently, if delta would be null you'd print a size unit as throughput unit?

> ++
> ++                printf("Progress %d%% (", percent);
> ++                print_human_readable_byte_count(vmar->clusters_read*VMA_CLUSTER_SIZE);
> ++                printf(" of ");
> ++                print_human_readable_byte_count(vmar->devinfo[dev_id].size);
> ++                printf(") in ");
> ++                print_human_readable_time(duration);
> ++                printf(" - ");
> ++                print_human_readable_byte_count(bps);
> ++                printf("/s\n");
> ++
>  +                fflush(stdout);

would IMO be worth it to factor above lines out in a static local helper to avoid
crowding this function to much, e.g. (types may be adapted if reasonable):

print_restore_progress(uint64_t total_byte, uint64_t restored_byte, uint64_t duration_ms);

I would handle the last_X statics in there and also re-calculate the percentage as float
in there, so that we can print a similar amount of digits after the decimal place like
we do in backup.

> ++
>  +                vmar->clusters_read_per = percent;
> ++                last_time = duration;
> ++                last_size = vmar->clusters_read*VMA_CLUSTER_SIZE;
>  +            }
>  +        }
>  +
> @@ -881,11 +903,17 @@ index 0000000000..2b1d1cdab3
>  +
>  +    if (verbose) {
>  +        if (vmar->clusters_read) {
> -+            printf("total bytes read %zd, sparse bytes %zd (%.3g%%)\n",
> -+                   vmar->clusters_read*VMA_CLUSTER_SIZE,
> -+                   vmar->zero_cluster_data,
> -+                   (double)(100.0*vmar->zero_cluster_data)/
> -+                   (vmar->clusters_read*VMA_CLUSTER_SIZE));
> ++            double sparse_percent = (double)(100.0*vmar->zero_cluster_data) /
> ++                   (vmar->clusters_read*VMA_CLUSTER_SIZE);
> ++            time_t duration = time(NULL) - vmar->start_time;
> ++
> ++            printf("Finished restoring ");
> ++            print_human_readable_byte_count(vmar->clusters_read*VMA_CLUSTER_SIZE);
> ++            printf(" bytes in ");
> ++	    print_human_readable_time(duration);
> ++	    printf(" with ");
> ++	    print_human_readable_byte_count(vmar->zero_cluster_data);
> ++	    printf(" of sparse data. (%.3g%%)\n", sparse_percent);
>  +
>  +            int64_t datasize = vmar->clusters_read*VMA_CLUSTER_SIZE-vmar->zero_cluster_data;
>  +            if (datasize) { // this does not make sense for empty files
> @@ -918,6 +946,44 @@ index 0000000000..2b1d1cdab3
>  +    return vma_reader_restore_full(vmar, -1, verbose, true, errp);
>  +}
>  +
> ++void print_human_readable_time(int seconds) {
> ++	int days, hours, mins;
> ++
> ++	days = seconds / 86400;
> ++	seconds = seconds % 86400;

fwiw, modulo is always expensive, maybe try:

seconds -= days * 24 * 3600;


> ++
> ++	hours = seconds / 3600;
> ++	seconds = seconds % 3600;
> ++
> ++	mins = seconds / 60;
> ++	seconds = seconds % 60;
> ++
> ++	if (days)
> ++		printf("%d d ", days);
> ++	if (hours)
> ++		printf("%d h ", hours);
> ++	if (mins)> ++		printf("%d m ", mins);
> ++	printf("%d s", seconds);
> ++}
> ++
> ++/* This should correctly display values up to 9,2 Ebibytes*/
> ++void print_human_readable_byte_count(int64_t value) {
> ++	double calculated = (double)value;
> ++	const char* units = "KMGTPE";
> ++	char unit;
> ++
> ++	int maxUnit = 0;
> ++	if (value > 1023) {
> ++		maxUnit = (int)(log(value)/log(1024));

log is quite expensive, I'd rather go for a similar approach like we do in JS:

https://git.proxmox.com/?p=proxmox-widget-toolkit.git;a=blob;f=src/Utils.js;h=6a03057a704943539b90ed58dfb3d0b05ecf7883;hb=HEAD#l656

The units char array can stay, and if we just want base-2 units we can loop over
the value, shift it right by 10 each round until < 1024, then do a final /1024.0
division to float so that we get the digits after the decimal place.

> ++		calculated = value / pow(1024, maxUnit);
> ++		unit = units[maxUnit - 1];
> ++		printf("%.2f %ciB", calculated, unit);
> ++	} else {
> ++		printf("%zd B", (int64_t)calculated);
> ++	}
> ++}
> +\ No newline at end of file
>  diff --git a/vma-writer.c b/vma-writer.c
>  new file mode 100644
>  index 0000000000..11d8321ffd
> @@ -1716,10 +1782,10 @@ index 0000000000..11d8321ffd
>  +}
>  diff --git a/vma.c b/vma.c
>  new file mode 100644
> -index 0000000000..df542b7732
> +index 0000000000..781b5bf700
>  --- /dev/null
>  +++ b/vma.c
> -@@ -0,0 +1,851 @@
> +@@ -0,0 +1,857 @@
>  +/*
>  + * VMA: Virtual Machine Archive
>  + *
> @@ -1802,8 +1868,14 @@ index 0000000000..df542b7732
>  +            if (strcmp(di->devname, "vmstate") == 0) {
>  +                printf("VMSTATE: dev_id=%d memory: %zd\n", i, di->size);
>  +            } else {
> ++                /* Information that is needed in qemu-server (PVE::QemuServer.pm)
> ++                   Change only if you know what you are doing */
>  +                printf("DEV: dev_id=%d size: %zd devname: %s\n",
>  +                       i, di->size, di->devname);
> ++
> ++                printf("Info: dev_id=%d size: ", i);
> ++                print_human_readable_byte_count(di->size);
> ++                printf(" devname: %s\n", di->devname);

is this directly related to the swap restore-output to human readable?

>  +            }
>  +        }
>  +    }
> @@ -2573,10 +2645,10 @@ index 0000000000..df542b7732
>  +}
>  diff --git a/vma.h b/vma.h
>  new file mode 100644
> -index 0000000000..c895c97f6d
> +index 0000000000..c4867b8584
>  --- /dev/null
>  +++ b/vma.h
> -@@ -0,0 +1,150 @@
> +@@ -0,0 +1,153 @@
>  +/*
>  + * VMA: Virtual Machine Archive
>  + *
> @@ -2726,4 +2798,7 @@ index 0000000000..c895c97f6d
>  +                       Error **errp);
>  +int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp);
>  +
> ++void print_human_readable_time(int);
> ++void print_human_readable_byte_count(int64_t);
> ++
>  +#endif /* BACKUP_VMA_H */





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output
       [not found]   ` <8af5712b-d33d-ffb7-ccd5-5dda8973822a@proxmox.com>
@ 2022-03-25 11:05     ` Thomas Lamprecht
  2022-03-25 12:39       ` Daniel Tschlatscher
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2022-03-25 11:05 UTC (permalink / raw)
  To: Daniel Tschlatscher, PVE development discussion

(re-sending, seems you forgot to hit Reply-All and only sent it to me directly).

On 25.03.22 10:41, Daniel Tschlatscher wrote:
> On 3/25/22 09:29, Thomas Lamprecht wrote:
>> Thanks for the patch, some comments inline.
>> On 24.03.22 16:44, Daniel Tschlatscher wrote:

>> please format the commit message according our submission rules:
>> https://pve.proxmox.com/wiki/Developer_Documentation#Commits_and_Commit_Messages
>>
>> an output example (maybe even an before/after) excerpt would be nice to have here.
>
> Just leave out the ":" after "fix", right? Or is there something else I should change?

I actually meant the per-line character length, quoting:
> Make sure the line length of the commit's message is not longer than 70 characters.
> Note, HTTPS links are an exception and should not be split.

but yeah, now that you say it, it really should be `fix #3865: ...`


>>>   +typedef struct VmaRestoreState {
>>>   +    BlockBackend *target;
>>>   +    bool write_zeroes;
>>> @@ -649,13 +653,31 @@ index 0000000000..2b1d1cdab3
>>>   +
>>>   +        if (verbose) {
>>>   +            time_t duration = time(NULL) - vmar->start_time;
>> The resulting accuracy for time in seconds could be not that good, did you make any
>> thoughts or comparison regarding that.
> I "built" upon what was here. This is the way that seconds passed were measured in the original code and works well enough in this case. time_t has some functionality for tracking milliseconds though but I don't think that much fine control over the time elapsed has any real benefits here.

sorry but what? Example we have a byte delta of 100 MiB since last report, and nothing
guarantees that we're currently very close to a second boundary, so lets see
how the error compares comparing second resolution to some millisecond ones.

Using second granularity anything between 1 <= t < 2 gets you 100 MiB/s
On the other hand:

                    "actual"     "1s cut off"   "error"
100 MiB / 1.000s = 100.0 MiB/s   100.0 MiB/s      0.0
100 MiB / 1.050s =  95.2 MiB/s   100.0 MiB/s      4.8
100 MiB / 1.123s =  89.0 MiB/s   100.0 MiB/s     11.0
100 MiB / 1.130s =  88.5 MiB/s   100.0 MiB/s     11.5
100 MiB / 1.500s =  66.7 MiB/s   100.0 MiB/s     33.3
100 MiB / 1.750s =  57.1 MiB/s   100.0 MiB/s     42.9
100 MiB / 1.999s =  50.0 MiB/s   100.0 MiB/s     50.0

So even just 7 ms (1.123s vs 1.13s) make a 0.5 MiB/s difference, IMO already relevant.

Wouldn't dare to call that "working well enough", or are we talking about
different things?

>>> ++
>>> ++                if (delta != 0)
>>> ++                    bps /= delta;
>> currently, if delta would be null you'd print a size unit as throughput unit?
>>
> This is due to the underlying code. The function that is wrapping this part of the code is called multiple times every second (something like ~20 times / sec) and in the original code a print statement would only occur after the percentage of bytes transferred increased by at least 1%.> 
> The time elapsed between an increase in percent could be less than a second. In this case the output simply prints how much bytes were transferred since this second began. This means the value for throughput would at the time of the print be always lower than the real world B/s and only the last print in that second would represent the actual B/s.

This just doesn't makes sense to me, the if clauses enforces that its called at
max every 2s (or 100% edge case) already, and using wrong units is using wrong
units, you cannot really argue basic math/physics away..

>
> Nonetheless, I later adapted this code to behave like the backup create function does, which updates every 3 seconds, eliminating this problem entirely. This is therefore dead code anyway.
>

later where?

Actual improvement, track duration in milliseconds or less and still catch the edge
case for when delta is zero and just set throughput to zero then (as no data can
transfer in zero time).


>> would IMO be worth it to factor above lines out in a static local helper to avoid
>> crowding this function to much, e.g. (types may be adapted if reasonable):
>>
>> print_restore_progress(uint64_t total_byte, uint64_t restored_byte, uint64_t duration_ms);
>>
>> I would handle the last_X statics in there and also re-calculate the percentage as float
>> in there, so that we can print a similar amount of digits after the decimal place like
>> we do in backup.
>
> A function sounds good, though I am not quite sure what you mean by "print a similar amount of digits after the decimal place like we do in backup"

Hmm, I thought percentage isn't a integer there, so meant something like 5.1%,
but apparently it is a plain integer (may just have been in an experiment done
once I rewrote that whole thing a bit ago), so for now you can ignore that.





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output
  2022-03-25 11:05     ` Thomas Lamprecht
@ 2022-03-25 12:39       ` Daniel Tschlatscher
  2022-03-25 17:59         ` Thomas Lamprecht
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Tschlatscher @ 2022-03-25 12:39 UTC (permalink / raw)
  To: Thomas Lamprecht, PVE development discussion


On 3/25/22 12:05, Thomas Lamprecht wrote:
> (re-sending, seems you forgot to hit Reply-All and only sent it to me directly).
>
> On 25.03.22 10:41, Daniel Tschlatscher wrote:
>> On 3/25/22 09:29, Thomas Lamprecht wrote:
>>> Thanks for the patch, some comments inline.
>>> On 24.03.22 16:44, Daniel Tschlatscher wrote:
>>> please format the commit message according our submission rules:
>>> https://pve.proxmox.com/wiki/Developer_Documentation#Commits_and_Commit_Messages
>>>
>>> an output example (maybe even an before/after) excerpt would be nice to have here.
>> Just leave out the ":" after "fix", right? Or is there something else I should change?
> I actually meant the per-line character length, quoting:
>> Make sure the line length of the commit's message is not longer than 70 characters.
>> Note, HTTPS links are an exception and should not be split.
> but yeah, now that you say it, it really should be `fix #3865: ...`
>
>
>>>>    +typedef struct VmaRestoreState {
>>>>    +    BlockBackend *target;
>>>>    +    bool write_zeroes;
>>>> @@ -649,13 +653,31 @@ index 0000000000..2b1d1cdab3
>>>>    +
>>>>    +        if (verbose) {
>>>>    +            time_t duration = time(NULL) - vmar->start_time;
>>> The resulting accuracy for time in seconds could be not that good, did you make any
>>> thoughts or comparison regarding that.
>> I "built" upon what was here. This is the way that seconds passed were measured in the original code and works well enough in this case. time_t has some functionality for tracking milliseconds though but I don't think that much fine control over the time elapsed has any real benefits here.
> sorry but what? Example we have a byte delta of 100 MiB since last report, and nothing
> guarantees that we're currently very close to a second boundary, so lets see
> how the error compares comparing second resolution to some millisecond ones.
>
> Using second granularity anything between 1 <= t < 2 gets you 100 MiB/s
> On the other hand:
>
>                      "actual"     "1s cut off"   "error"
> 100 MiB / 1.000s = 100.0 MiB/s   100.0 MiB/s      0.0
> 100 MiB / 1.050s =  95.2 MiB/s   100.0 MiB/s      4.8
> 100 MiB / 1.123s =  89.0 MiB/s   100.0 MiB/s     11.0
> 100 MiB / 1.130s =  88.5 MiB/s   100.0 MiB/s     11.5
> 100 MiB / 1.500s =  66.7 MiB/s   100.0 MiB/s     33.3
> 100 MiB / 1.750s =  57.1 MiB/s   100.0 MiB/s     42.9
> 100 MiB / 1.999s =  50.0 MiB/s   100.0 MiB/s     50.0
>
> So even just 7 ms (1.123s vs 1.13s) make a 0.5 MiB/s difference, IMO already relevant.
>
> Wouldn't dare to call that "working well enough", or are we talking about
> different things?
>
>>>> ++
>>>> ++                if (delta != 0)
>>>> ++                    bps /= delta;
>>> currently, if delta would be null you'd print a size unit as throughput unit?
>>>
>> This is due to the underlying code. The function that is wrapping this part of the code is called multiple times every second (something like ~20 times / sec) and in the original code a print statement would only occur after the percentage of bytes transferred increased by at least 1%.>
>> The time elapsed between an increase in percent could be less than a second. In this case the output simply prints how much bytes were transferred since this second began. This means the value for throughput would at the time of the print be always lower than the real world B/s and only the last print in that second would represent the actual B/s.
> This just doesn't makes sense to me, the if clauses enforces that its called at
> max every 2s (or 100% edge case) already, and using wrong units is using wrong
> units, you cannot really argue basic math/physics away..
>
The 100% edge case is the problem for the most part.

It comes down to the inaccuracy of keeping time in seconds rather than 
something more granular. While it is true that in 0 time no actual data 
could be transferred, 0 in this case means somewhere from 0 => 0.999 
seconds. Delta could be zero while the transferred bytes for example 
count 20MB.

Just printing 0 would be confusing because *some* data was in fact 
transferred -> reflected in the bps variable. It could also give the 
user a false impression that their disk is slow or something similar if 
they see a lot of "0 B/s" entries (which was the actual reasoning here).
The current implementation means, if delta is 0, bps = 20MB and in 
reality 0.33 seconds have elapsed that the program prints "20MB/s". In 
actuality the expected rate of transfer would be "60 MB/s" though.

My thinking not using milliseconds here was: Even though I could 
calculate how much the transfer rate per second is, during the time of 
the print, it is still an approximation. Imagine following scenario:
In the first 0.33 secs 20MB/s are  transferred (=> print goes out with 
60 MB/s)
In the next 0.33 secs only 10MB/s are transferred
In the last 0.33 secs only 0.5MB/s are transferred
In this case "60MB/s" was printed but 30.5MB have been transferred.

The current implementation might be severely understating the actual 
throughput, but will never show a value for B/s that is higher than the 
actual throughput per second. The unaccounted for bytes here would be 
then reflected in the following output.
The millisecond approach would have both the same and opposite problem, 
e.g. if delta is only a few milliseconds and the throughput in that time 
a lot higher than average, then the printed B/s value could be highly 
exaggerated.

I let the current implementation be because, in practice, at the end of 
the backup restore there are usually a lot of sparse bytes (I hope I am 
using this term correctly). This means that the restore would go from 
56% to 100% in less than a second.  Simply writing 0 here would probably 
confuse the user more than anything because the value suddenly jumped by 
e.g.  6GiB with 0B/s.

Still, I do concede that using milliseconds just makes more sense 
because it's the more "proper" way to implement something like this and 
especially because of the unit. In the current implementation it would 
be something like B/last-output.
Also, it seems to me that the rate of throughput plummeting or 
skyrocketing in the span of less than a second is rather unlikely, the 
millisecond method would statistically just be more correct.
I will change it in v3.

>> Nonetheless, I later adapted this code to behave like the backup create function does, which updates every 3 seconds, eliminating this problem entirely. This is therefore dead code anyway.
>>
> later where?
>
> Actual improvement, track duration in milliseconds or less and still catch the edge
> case for when delta is zero and just set throughput to zero then (as no data can
> transfer in zero time).
>
Later in the sense of time. I rewrote the if statement above to only 
execute if delta is greater than 2 seconds (before it updated with every 
percent increase).
_>> __"eliminating this problem entirely"_ is, I retract, not true 
though because I forgot about the check whether progress is at 100%, in 
which case delta could still be 0.

>>> would IMO be worth it to factor above lines out in a static local helper to avoid
>>> crowding this function to much, e.g. (types may be adapted if reasonable):
>>>
>>> print_restore_progress(uint64_t total_byte, uint64_t restored_byte, uint64_t duration_ms);
>>>
>>> I would handle the last_X statics in there and also re-calculate the percentage as float
>>> in there, so that we can print a similar amount of digits after the decimal place like
>>> we do in backup.
>> A function sounds good, though I am not quite sure what you mean by "print a similar amount of digits after the decimal place like we do in backup"
> Hmm, I thought percentage isn't a integer there, so meant something like 5.1%,
> but apparently it is a plain integer (may just have been in an experiment done
> once I rewrote that whole thing a bit ago), so for now you can ignore that.
>
From a.lauterer@proxmox.com  Fri Mar 25 15:00:22 2022
Return-Path: <a.lauterer@proxmox.com>
X-Original-To: pve-devel@lists.proxmox.com
Delivered-To: pve-devel@lists.proxmox.com
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 6D38569E28
 for <pve-devel@lists.proxmox.com>; Fri, 25 Mar 2022 15:00:22 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 5968B200C
 for <pve-devel@lists.proxmox.com>; Fri, 25 Mar 2022 15:00:22 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 6B4DF1EDE
 for <pve-devel@lists.proxmox.com>; Fri, 25 Mar 2022 15:00:21 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 36D7846FBB
 for <pve-devel@lists.proxmox.com>; Fri, 25 Mar 2022 15:00:21 +0100 (CET)
From: Aaron Lauterer <a.lauterer@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Fri, 25 Mar 2022 15:00:13 +0100
Message-Id: <20220325140017.336307-2-a.lauterer@proxmox.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220325140017.336307-1-a.lauterer@proxmox.com>
References: <20220325140017.336307-1-a.lauterer@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.024 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
Subject: [pve-devel] [PATCH v5 manager 1/5] ui: utils: refactor mps to mp
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Fri, 25 Mar 2022 14:00:22 -0000

Using the actual config key instead of the pluralization, makes it
easier in the situations where we need to match against it.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
changes since:
v4: moved at the start of the series to avoid unnecessary steps

 www/manager6/Utils.js           | 4 ++--
 www/manager6/lxc/MPEdit.js      | 2 +-
 www/manager6/lxc/MultiMPEdit.js | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/www/manager6/Utils.js b/www/manager6/Utils.js
index aafe359a..34cf49b3 100644
--- a/www/manager6/Utils.js
+++ b/www/manager6/Utils.js
@@ -1578,12 +1578,12 @@ Ext.define('PVE.Utils', {
     },
 
     mp_counts: {
-	mps: 256,
+	mp: 256,
 	unused: 256,
     },
 
     forEachMP: function(func, includeUnused) {
-	for (let i = 0; i < PVE.Utils.mp_counts.mps; i++) {
+	for (let i = 0; i < PVE.Utils.mp_counts.mp; i++) {
 	    let cont = func('mp', i);
 	    if (!cont && cont !== undefined) {
 		return;
diff --git a/www/manager6/lxc/MPEdit.js b/www/manager6/lxc/MPEdit.js
index b8c6b6cb..609447ef 100644
--- a/www/manager6/lxc/MPEdit.js
+++ b/www/manager6/lxc/MPEdit.js
@@ -194,7 +194,7 @@ Ext.define('PVE.lxc.MountPointInputPanel', {
 	    name: 'mpid',
 	    fieldLabel: gettext('Mount Point ID'),
 	    minValue: 0,
-	    maxValue: PVE.Utils.mp_counts.mps - 1,
+	    maxValue: PVE.Utils.mp_counts.mp - 1,
 	    hidden: true,
 	    allowBlank: false,
 	    disabled: true,
diff --git a/www/manager6/lxc/MultiMPEdit.js b/www/manager6/lxc/MultiMPEdit.js
index 709dacb1..36cee4ff 100644
--- a/www/manager6/lxc/MultiMPEdit.js
+++ b/www/manager6/lxc/MultiMPEdit.js
@@ -8,7 +8,7 @@ Ext.define('PVE.lxc.MultiMPPanel', {
 	xclass: 'Ext.app.ViewController',
 
 	// count of mps + rootfs
-	maxCount: PVE.Utils.mp_counts.mps + 1,
+	maxCount: PVE.Utils.mp_counts.mp + 1,
 
 	getNextFreeDisk: function(vmconfig) {
 	    let nextFreeDisk;
@@ -17,7 +17,7 @@ Ext.define('PVE.lxc.MultiMPPanel', {
 		    confid: 'rootfs',
 		};
 	    } else {
-		for (let i = 0; i < PVE.Utils.mp_counts.mps; i++) {
+		for (let i = 0; i < PVE.Utils.mp_counts.mp; i++) {
 		    let confid = `mp${i}`;
 		    if (!vmconfig[confid]) {
 			nextFreeDisk = {
-- 
2.30.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output
  2022-03-25 12:39       ` Daniel Tschlatscher
@ 2022-03-25 17:59         ` Thomas Lamprecht
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Lamprecht @ 2022-03-25 17:59 UTC (permalink / raw)
  To: Daniel Tschlatscher, PVE development discussion

On 25.03.22 13:39, Daniel Tschlatscher wrote:
> On 3/25/22 12:05, Thomas Lamprecht wrote:
>>>>> ++
>>>>> ++                if (delta != 0)
>>>>> ++                    bps /= delta;
>>>> currently, if delta would be null you'd print a size unit as throughput unit?
>>>>
>>> This is due to the underlying code. The function that is wrapping this part of the code is called multiple times every second (something like ~20 times / sec) and in the original code a print statement would only occur after the percentage of bytes transferred increased by at least 1%.>
>>> The time elapsed between an increase in percent could be less than a second. In this case the output simply prints how much bytes were transferred since this second began. This means the value for throughput would at the time of the print be always lower than the real world B/s and only the last print in that second would represent the actual B/s.
>> This just doesn't makes sense to me, the if clauses enforces that its called at
>> max every 2s (or 100% edge case) already, and using wrong units is using wrong
>> units, you cannot really argue basic math/physics away..
>>
> The 100% edge case is the problem for the most part.

You could also ignore that in loop but print a 100% done after, rethinking this
makes the current approach to handle that look rather odd anyhow.

> 
> It comes down to the inaccuracy of keeping time in seconds rather than something more granular. While it is true that in 0 time no actual data could be transferred, 0 in this case means somewhere from 0 => 0.999 seconds. Delta could be zero while the transferred bytes for example count 20MB.

yes, as mentioned using something more granular improves a lot of things..

> 
> Just printing 0 would be confusing because *some* data was in fact transferred -> reflected in the bps variable. It could also give the user a false impression that their disk is slow or something similar if they see a lot of "0 B/s" entries (which was the actual reasoning here).

it isn't really reflect in bps as it's just wrong to print a size as throughput,
and with finer granularity or just not spamming the log at the 100% stage would
be a nicer, more practical solution.

The delta 0 -> throughput 0 is then still the only valid way to handle it, the
minimal time resolution the program handles needs to be used as metric, not the
physical (almost) infinitesimal one, and once that's ms or µs its only going to
happen when throughput would be rounded down to 0 anyway.

> The current implementation means, if delta is 0, bps = 20MB and in reality 0.33 seconds have elapsed that the program prints "20MB/s". In actuality the expected rate of transfer would be "60 MB/s" though.

bps (byte per second) cannot be MB, you really need to get your units straigt,
as else its just confusing to me, but I assume you meant a delta of 20 MiB in
bytes restored since last time, and then 20 / 0.33 = 60 so not seeing the issue.

> 
> My thinking not using milliseconds here was: Even though I could calculate how much the transfer rate per second is, during the time of the print, it is still an approximation. Imagine following scenario:

So, you're arguing that because a much more accurate solution is still not 100%
accurate (which is impossible to do) lets go for the much less accurate one?
Sorry, but not really following that logic and really not getting why you still
argue for it?! It should be clear that using millisecond (or lower) resolution
gives much more accurate results, and can *never* be worse than more granular ones.

> In the first 0.33 secs 20MB/s are  transferred (=> print goes out with 60 MB/s)

Do you meant to write "In the first 0.33s 20 MiB are transferred", as else we talk
second order derivative, i.e., throughput acceleration?


> In the next 0.33 secs only 10MB/s are transferred
> In the last 0.33 secs only 0.5MB/s are transferred
> In this case "60MB/s" was printed but 30.5MB have been transferred.

I don't get the point/example...

> 
> The current implementation might be severely understating the actual throughput, but will never show a value for B/s that is higher than the actual throughput per second. The unaccounted for bytes here would be then reflected in the following output.

exactly, but making the interval, that the error "leaks" into, 1000, or even 1
million times smaller reduces the difference from reality vs. actual output quite
a lot.

> The millisecond approach would have both the same and opposite problem, e.g. if delta is only a few milliseconds and the throughput in that time a lot higher than average, then the printed B/s value could be highly exaggerated.

no, they do not have the opposite problem, that cannot really be as its the
same thing but just more accurate, it's still not perfect, and for very very
high bandwidths (100s of GiB/s) the error may be even still get relevant with
a per ms sampling, but then it'd be still 1000 times better on average.

> 
> I let the current implementation be because, in practice, at the end of the backup restore there are usually a lot of sparse bytes (I hope I am using this term correctly). This means that the restore would go from 56% to 100% in less than a second.  Simply writing 0 here would probably confuse the user more than anything because the value suddenly jumped by e.g.  6GiB with 0B/s.

This cannot be generalized in practice, on a long running production VM holes due
to sparseness can be there or not, and may not be accumulated all at the end.
Different OS and the filesystems they use also have very different impacts.

> 
> Still, I do concede that using milliseconds just makes more sense because it's the more "proper" way to implement something like this and especially because of the unit. In the current implementation it would be something like B/last-output.
> Also, it seems to me that the rate of throughput plummeting or skyrocketing in the span of less than a second is rather unlikely, the millisecond method would statistically just be more correct.

Hows that unlikely? Modern HW and their caches are really not deterministic at all,
e.g., QLC SSDs with a few tens of GiB of SLC and some GiB of DDR4 cache can
show very stark jumps in throughput, from 6 Gbps (maxing out SATA link speed) to 100
Mbps (maxing out QLC). In addition to that, the IO schedule handling multiple
processes submitting IO concurrently doesn't makes it anymore stable, then add effects
of filesystems or possible ZFS RAID, ... on top. Also, simply that another process is
done with IO can make a lot of capacity available in mere micro seconds.
There's a reason some counters/timing bits of the IO paths adds to the kernel's RNG
entropy pool.

Further, note that the base assumption of why I went into this wasn't even erratic IO,
but rather CPU scheduling leading into very different actual duration intervals.
And that could even make a stable IO throughput flow display as completely erratic,
even if it actually wasn't. I.e., using such a low accuracy just adds way to much
distortion in any direction.

> I will change it in v3.
> 
>>> Nonetheless, I later adapted this code to behave like the backup create function does, which updates every 3 seconds, eliminating this problem entirely. This is therefore dead code anyway.
>>>
>> later where?
>>
>> Actual improvement, track duration in milliseconds or less and still catch the edge
>> case for when delta is zero and just set throughput to zero then (as no data can
>> transfer in zero time).
>>
> Later in the sense of time. I rewrote the if statement above to only execute if delta is greater than 2 seconds (before it updated with every percent increase).

Ah ok, in a future patch - I encountered "later" quite often like in "Later in this patch
or series", so yeah can be clear enough & probably just confusion on my side.

> _>> __"eliminating this problem entirely"_ is, I retract, not true though because I forgot about the check whether progress is at 100%, in which case delta could still be 0.

In that case you either skip reporting this round and/or move the 100% reporting out of
the loop entirely, which could be the nicer approach in general anyway, and with the
reporting factored out in its own function also not much code overhead.





^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-03-25 17:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-24 15:44 [pve-devel] [PATCH pve-qemu] fix: 3865: backup restore human readable output Daniel Tschlatscher
2022-03-25  8:29 ` Thomas Lamprecht
     [not found]   ` <8af5712b-d33d-ffb7-ccd5-5dda8973822a@proxmox.com>
2022-03-25 11:05     ` Thomas Lamprecht
2022-03-25 12:39       ` Daniel Tschlatscher
2022-03-25 17:59         ` Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal