public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH v2 common] tools: file_set_contents: use syswrite instead of print
@ 2024-09-30 11:40 Filip Schauer
  2024-10-14  8:42 ` [pve-devel] applied: " Thomas Lamprecht
  2024-10-14  9:26 ` [pve-devel] " Thomas Lamprecht
  0 siblings, 2 replies; 5+ messages in thread
From: Filip Schauer @ 2024-09-30 11:40 UTC (permalink / raw)
  To: pve-devel

The use of `print` can be inefficient for writing larger files due to
its default buffering in 8 KiB blocks.

This is especially problematic on `pmxcfs` where files are written in
4 KiB blocks due to the defaults of `libfuse2`. This leads to
significant write amplification on files larger than 4 KiB.

Patch (fix #5728: pmxcfs: allow bigger writes than 4k for fuse) [1]
addresses this by enabling `big_writes`, allowing up to 128 KiB blocks.
But due to the use of `print` in `file_set_contents`, writes are still
only buffered in 8 KiB blocks.

To further address this, this commit switches to using `syswrite`
instead of `print` to mitigate the block size limit imposed by `print`.
Combined with patch [1], file writes to `/etc/pve/` are now buffered in
128 KiB blocks.

The table below illustrates the drastic reduction in write
amplification when writing files of different sizes to `/etc/pve/` using
`file_set_contents`:

           print                big_writes+print     big_writes+syswrite
file size  written     amplif.  written     amplif.  written    amplif.
    1 KiB      48 KiB     48.0      45 KiB     45.0     41 KiB     41.0
    2 KiB      48 KiB     24.0      45 KiB     22.5     62 KiB     31.0
    4 KiB      82 KiB     20.5      80 KiB     20.0     73 KiB     18.3
    8 KiB     121 KiB     15.1      90 KiB     11.3     89 KiB     11.1
   16 KiB     217 KiB     13.6     146 KiB      9.1    113 KiB      7.1
   32 KiB     506 KiB     15.8     314 KiB      9.8    158 KiB      4.9
   64 KiB    1472 KiB     23.0     826 KiB     12.9    259 KiB      4.0
  128 KiB    5585 KiB     43.6    3765 KiB     29.4    452 KiB      3.5
  256 KiB   20424 KiB     79.8   10743 KiB     42.0   2351 KiB      9.2
  512 KiB   86715 KiB    169.4   43650 KiB     85.3   3204 KiB      6.3
 1024 KiB  369568 KiB    360.9  187496 KiB    183.1  15845 KiB     15.5

Since `file_set_contents` also performs a `rename` after writing, the
following table shows the results when the file is written without
renaming it afterwards:

           print                big_writes+print     big_writes+syswrite
file size  written     amplif.  written     amplif.  written     amplif.
    1 KiB      29 KiB     29.0      29 KiB     29.0     25 KiB      25.0
    2 KiB      29 KiB     14.5      30 KiB     15.0     25 KiB      12.5
    4 KiB      37 KiB      9.3      44 KiB     11.0     41 KiB      10.3
    8 KiB      61 KiB      7.6      45 KiB      5.6     45 KiB       5.6
   16 KiB     143 KiB      8.9      86 KiB      5.4     57 KiB       3.6
   32 KiB     396 KiB     12.4     225 KiB      7.0     69 KiB       2.2
   64 KiB    1281 KiB     20.0     673 KiB     10.5    105 KiB       1.6
  128 KiB    4789 KiB     37.4    3478 KiB     27.2    169 KiB       1.3
  256 KiB   18868 KiB     73.7    9976 KiB     39.0    572 KiB       2.2
  512 KiB   79304 KiB    154.9   42714 KiB     83.4   2150 KiB       4.2
 1024 KiB  347929 KiB    339.8  182483 KiB    178.2  11133 KiB      10.9

[1] https://lists.proxmox.com/pipermail/pve-devel/2024-September/065396.html

Signed-off-by: Filip Schauer <f.schauer@proxmox.com>
---
Changes since v1:
* Add benchmark results without rename to commit message
* Fix "Wide character in syswrite" error by first encoding $data with print

 src/PVE/Tools.pm | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/PVE/Tools.pm b/src/PVE/Tools.pm
index bd305bd..e4ff7f9 100644
--- a/src/PVE/Tools.pm
+++ b/src/PVE/Tools.pm
@@ -285,10 +285,25 @@ sub file_set_contents {
 	}
 	die "unable to open file '$tmpname' - $!\n" if !$fh;
 
-	binmode($fh, ":encoding(UTF-8)") if $force_utf8;
+	if ($force_utf8) {
+	    $data = encode("utf8", $data);
+	} else {
+	    # Encode wide characters with print before passing them to syswrite
+	    my $unencoded_data = $data;
+	    open my $datafh, '>', \$data;
+	    print $datafh $unencoded_data;
+	    close $datafh;
+	}
+
+	my $offset = 0;
+	my $len = length($data);
+
+	while ($offset < $len) {
+	    $offset += syswrite($fh, $data, $len - $offset, $offset)
+		or die "unable to write '$tmpname' - $!\n";
+	}
 
-	die "unable to write '$tmpname' - $!\n" unless print $fh $data;
-	die "closing file '$tmpname' failed - $!\n" unless close $fh;
+	close $fh or die "closing file '$tmpname' failed - $!\n";
     };
     my $err = $@;
 
-- 
2.39.5
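
The encode-then-write approach described in the commit message can be tried in isolation. The following is only a minimal sketch, not the code from `PVE::Tools`: the helper name `write_all`, the sample string, and the temporary file are assumptions made for illustration, and it uses the strict `UTF-8` encoding name rather than the `utf8` one from the patch. It encodes a character string to bytes first, then writes the bytes with `syswrite` in a loop that checks the return value of each call separately and resumes after short writes.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Hypothetical helper: write all of $bytes to $fh, resuming after short writes.
sub write_all {
    my ($fh, $bytes) = @_;
    my $len = length($bytes);
    my $offset = 0;
    while ($offset < $len) {
        my $written = syswrite($fh, $bytes, $len - $offset, $offset);
        # syswrite returns undef on error, so test that before adding it up
        die "unable to write - $!\n" if !defined($written);
        $offset += $written;
    }
    return $offset;
}

# 7 characters, one of them above U+00FF; syswrite cannot take this as-is
my $text = "caf\x{e9} \x{263a}\n";
# encode to a byte string first: 10 bytes in UTF-8
my $bytes = encode('UTF-8', $text);

my ($fh, $filename) = tempfile(UNLINK => 1);
my $total = write_all($fh, $bytes);
close($fh) or die "closing file '$filename' failed - $!\n";

printf "%d characters, %d bytes written\n", length($text), $total;
```

Since the whole buffer is handed to `syswrite` in one call, the size of each write request is no longer limited by the buffering that `print` applies; how large the resulting requests to the file system are still depends on the kernel and the file system involved.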



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



* [pve-devel] applied: [PATCH v2 common] tools: file_set_contents: use syswrite instead of print
  2024-09-30 11:40 [pve-devel] [PATCH v2 common] tools: file_set_contents: use syswrite instead of print Filip Schauer
@ 2024-10-14  8:42 ` Thomas Lamprecht
  2024-10-14  9:07   ` Dominik Csapak
  2024-10-14  9:26 ` [pve-devel] " Thomas Lamprecht
  1 sibling, 1 reply; 5+ messages in thread
From: Thomas Lamprecht @ 2024-10-14  8:42 UTC (permalink / raw)
  To: Proxmox VE development discussion, Filip Schauer

On 30/09/2024 at 13:40, Filip Schauer wrote:

applied, and many thanks for the detailed benchmarks!



* Re: [pve-devel] applied: [PATCH v2 common] tools: file_set_contents: use syswrite instead of print
  2024-10-14  8:42 ` [pve-devel] applied: " Thomas Lamprecht
@ 2024-10-14  9:07   ` Dominik Csapak
  2024-10-14  9:22     ` Thomas Lamprecht
  0 siblings, 1 reply; 5+ messages in thread
From: Dominik Csapak @ 2024-10-14  9:07 UTC (permalink / raw)
  To: Proxmox VE development discussion, Thomas Lamprecht

On 10/14/24 10:42, Thomas Lamprecht wrote:
> applied, and many thanks for the detailed benchmarks!
> 

Hi, since you applied this, we probably also want to apply the pmxcfs patch?
(since without that, this patch does not really have an effect)

Should I send a new version (as non-RFC) with a better commit message for that,
or is that not necessary from your POV?



* Re: [pve-devel] applied: [PATCH v2 common] tools: file_set_contents: use syswrite instead of print
  2024-10-14  9:07   ` Dominik Csapak
@ 2024-10-14  9:22     ` Thomas Lamprecht
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Lamprecht @ 2024-10-14  9:22 UTC (permalink / raw)
  To: Dominik Csapak, Proxmox VE development discussion

On 14/10/2024 at 11:07, Dominik Csapak wrote:
> Hi, since you applied this, we probably also want to apply the pmxcfs patch?

This is already an improvement on its own, independent of where the destination
resides, but yes, the pmxcfs one would have been the next one to (re-)look at
more closely.

> (since without that, this patch does not really have an effect)
> 
> Should I send a new version (as non-RFC) with a better commit message for that,
> or is that not necessary from your POV?

Yes, please. It would be great if you could include some more of the background
that we have learned since the original submission, and maybe some benchmarks
directly, or at least point to the ones from Filip's commit.
But, as always, I'm a fan of keeping the essentials in the commit message
directly to make it more self-contained.





* Re: [pve-devel] [PATCH v2 common] tools: file_set_contents: use syswrite instead of print
  2024-09-30 11:40 [pve-devel] [PATCH v2 common] tools: file_set_contents: use syswrite instead of print Filip Schauer
  2024-10-14  8:42 ` [pve-devel] applied: " Thomas Lamprecht
@ 2024-10-14  9:26 ` Thomas Lamprecht
  1 sibling, 0 replies; 5+ messages in thread
From: Thomas Lamprecht @ 2024-10-14  9:26 UTC (permalink / raw)
  To: Proxmox VE development discussion, Filip Schauer

On 30/09/2024 at 13:40, Filip Schauer wrote:
> +	    $offset += syswrite($fh, $data, $len - $offset, $offset)
> +		or die "unable to write '$tmpname' - $!\n";

FYI: Wolfgang noticed something nasty with the subtle difference between
the `or` and `||` operator [0] that introduces a bug w.r.t. error handling
here.

Basically, `or` has lower precedence than `+=`, and thus makes the code act
like:

($offset += syswrite($fh, $data, $len - $offset, $offset))
    or die "unable to write '$tmpname' - $!\n";

Thus, the error path is never taken once $offset has been increased to
something non-zero, i.e. after the first round. See my follow-up commit [1] for
more details.
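
The precedence effect described above can be reproduced without touching any file. The following is only a minimal sketch and not the code from the follow-up commit: `fake_write` is a hypothetical stand-in for `syswrite` that returns a byte count on success and undef on failure, and the two `step_*` helpers exist purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for syswrite: byte count on success, undef on failure.
sub fake_write {
    my ($ok, $n) = @_;
    return $ok ? $n : undef;
}

# Pattern from the patch: `or` tests the result of the whole `+=` assignment,
# i.e. the new value of $offset, not the return value of the write.
sub step_with_or {
    my ($offset, $ok) = @_;
    no warnings 'uninitialized'; # adding undef would otherwise warn
    my $died = !eval { $offset += fake_write($ok, 4) or die "write failed\n"; 1 };
    return $died ? 'died' : 'continued';
}

# Checking the return value before adding it catches the failure in any round.
sub step_checked {
    my ($offset, $ok) = @_;
    my $died = !eval {
        my $written = fake_write($ok, 4);
        die "write failed\n" if !defined($written);
        $offset += $written;
        1;
    };
    return $died ? 'died' : 'continued';
}

printf "%-32s %s\n", 'or, failure in first round:', step_with_or(0, 0);
printf "%-32s %s\n", 'or, failure in later round:', step_with_or(8, 0);
printf "%-32s %s\n", 'checked, failure in later round:', step_checked(8, 0);
```

With `or`, a failure in the first round still dies because $offset stays at 0, but a failure in a later round leaves the non-zero $offset unchanged and the loop would simply continue. Checking the return value on its own line avoids depending on operator precedence at all.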


[0]: https://perldoc.perl.org/perlop#Logical-or-and-Exclusive-Or
[1]: https://git.proxmox.com/?p=pve-common.git;a=commitdiff;h=f1fe7a0733570e84343f152e2409b22782feb2d3

