* [PATCH qemu 0/1] Extend qga file-read with chunked access for large files
@ 2026-02-23 20:16 Markus Ebner
  2026-02-23 20:16 ` [PATCH container] close #7342: " Markus Ebner
  0 siblings, 1 reply; 3+ messages in thread
From: Markus Ebner @ 2026-02-23 20:16 UTC
  To: pve-devel; +Cc: Markus Ebner

The file-read command of the QEMU guest agent previously had several
practical limitations. It always read a fixed 16 MiB block starting at
offset 0, making it impossible to retrieve larger files in multiple
chunks. On busy or resource‑constrained hosts, requests for large files
often timed out because the agent attempted to read and JSON‑encode the
entire 16 MiB block at once.

Binary data was also returned as raw JSON strings with extensive
escaping, which inflated payload size and caused compatibility issues
with some JSON parsers.

This patch extends the file-read method with three new parameters:

- decode — Controls whether the base64‑encoded data returned by the
  guest agent should be decoded before being sent back through the API.
  When disabled, the base64 string is passed through unchanged, which is
  ideal for binary data and mirrors the existing encode parameter of
  file-write.

- offset — Allows reading from an arbitrary byte offset within the
  file.

- count — Allows requesting a smaller number of bytes than the
  internal 16 MiB limit, avoiding unnecessary overhead and reducing
  timeout risk.
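
A note on combining decode=0 with reads that span several internal chunks, given as a hedged sketch rather than a description of the patch's final behavior. It assumes that each chunk arrives as an independently padded base64 string and that the chunks are simply concatenated. Under that assumption, '=' padding ends up in the middle of the result whenever the chunk size is not a multiple of 3 bytes, and 1 MiB (1048576 bytes) leaves a remainder of 1. The helper concat_encoded_chunks is hypothetical and exists only for this illustration:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: encode $data in chunks of $chunk_size bytes and
# concatenate the per-chunk base64 strings, as a plain pass-through would.
sub concat_encoded_chunks {
    my ($data, $chunk_size) = @_;
    my $out = '';
    for (my $pos = 0; $pos < length($data); $pos += $chunk_size) {
        # The second argument '' suppresses the trailing newline.
        $out .= encode_base64(substr($data, $pos, $chunk_size), '');
    }
    return $out;
}

my $data = 'abcdefgh';

# Chunk size 4 (4 % 3 == 1): every full chunk ends in '==' padding.
my $unaligned = concat_encoded_chunks($data, 4);

# Chunk size 6 (a multiple of 3): only the last chunk can carry padding.
my $aligned = concat_encoded_chunks($data, 6);

print "unaligned: $unaligned\n";
print "aligned:   $aligned\n";
```

Decoders differ in how they treat padding inside a string; MIME::Base64, for example, documents that characters after a '=' padding character are never decoded. Choosing an internal chunk size that is a multiple of 3 would sidestep the question.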

With these additions, the behavior mirrors standard file operations
(fopen, fseek, fread). The offset may be any non-negative position and
is not bounds-checked; reading at or beyond EOF returns zero bytes.
This allows conveniently reading an entire file in a robust way:
while(truncated && content.length != 0) {}
and also enables things like tailing a changing file.
This makes the file-read command significantly more flexible.

All parameter additions were done in a backwards-compatible fashion.
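
The chunked read loop described above can be sketched from the client side as follows. This is a minimal illustration under stated assumptions, not the author's client code: $fetch is a hypothetical callback standing in for one file-read API call with offset and count (decode left at its default), and make_fake_endpoint is an in-memory stand-in so the example runs without a VM.

```perl
use strict;
use warnings;

# Hypothetical client-side loop. $fetch->($offset, $count) stands in for one
# call to the file-read endpoint and must return a hash ref with 'content',
# 'bytes-read' and, if more data may follow, 'truncated'.
sub read_whole_file {
    my ($fetch, $chunk_size) = @_;
    my ($offset, $data) = (0, '');
    while (1) {
        my $res = $fetch->($offset, $chunk_size);
        $data .= $res->{content};
        $offset += $res->{'bytes-read'};
        # Stop at EOF. The zero-byte check only guards against looping
        # forever should a response ever be truncated but empty.
        last if !$res->{truncated} || $res->{'bytes-read'} == 0;
    }
    return $data;
}

# In-memory stand-in for the endpoint, for demonstration only.
sub make_fake_endpoint {
    my ($file) = @_;
    return sub {
        my ($offset, $count) = @_;
        my $chunk = $offset < length($file) ? substr($file, $offset, $count) : '';
        my $res = { content => $chunk, 'bytes-read' => length($chunk) };
        $res->{truncated} = 1 if $offset + length($chunk) < length($file);
        return $res;
    };
}

my $file = join('', map { "line $_\n" } 1 .. 100);
my $copy = read_whole_file(make_fake_endpoint($file), 64);
print "copied ", length($copy), " bytes\n";
```

Advancing the offset by 'bytes-read' rather than by the requested count keeps the loop correct when a call returns fewer bytes than asked for.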

Markus Ebner (1):
  close #7342: Extend qga file-read with chunked access for large files

 src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 8 deletions(-)

-- 
2.53.0





* [PATCH container] close #7342: Extend qga file-read with chunked access for large files
  2026-02-23 20:16 [PATCH qemu 0/1] Extend qga file-read with chunked access for large files Markus Ebner
@ 2026-02-23 20:16 ` Markus Ebner
  2026-02-24 11:08   ` Fiona Ebner
  0 siblings, 1 reply; 3+ messages in thread
From: Markus Ebner @ 2026-02-23 20:16 UTC
  To: pve-devel; +Cc: Markus Ebner

The file-read command of the QEMU guest agent previously had several
practical limitations. It always read a fixed 16 MiB block starting at
offset 0, making it impossible to retrieve larger files in multiple
chunks. On busy or resource‑constrained hosts, requests for large files
often timed out because the agent attempted to read and JSON‑encode the
entire 16 MiB block at once.

Binary data was also returned as raw JSON strings with extensive
escaping, which inflated payload size and caused compatibility issues
with some JSON parsers.

This patch extends the file-read method with three new parameters:

- decode — Controls whether the base64‑encoded data returned by the
  guest agent should be decoded before being sent back through the API.
  When disabled, the base64 string is passed through unchanged, which is
  ideal for binary data and mirrors the existing encode parameter of
  file-write.

- offset — Allows reading from an arbitrary byte offset within the
  file.

- count — Allows requesting a smaller number of bytes than the
  internal 16 MiB limit, avoiding unnecessary overhead and reducing
  timeout risk.

With these additions, the behavior mirrors standard file operations
(fopen, fseek, fread). The offset may be any non-negative position and
is not bounds-checked; reading at or beyond EOF returns zero bytes.
This allows conveniently reading an entire file in a robust way:
while(truncated && content.length != 0) {}
and also enables things like tailing a changing file.
This makes the file-read command significantly more flexible.

All parameter additions were done in a backwards-compatible fashion.

Signed-off-by: Markus Ebner <info@ebner-markus.de>
---
 src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/src/PVE/API2/Qemu/Agent.pm b/src/PVE/API2/Qemu/Agent.pm
index de36ce1e..ccd1dca2 100644
--- a/src/PVE/API2/Qemu/Agent.pm
+++ b/src/PVE/API2/Qemu/Agent.pm
@@ -464,6 +464,28 @@ __PACKAGE__->register_method({
                 'pve-vmid',
                 { completion => \&PVE::QemuServer::complete_vmid_running },
             ),
+            decode => {
+                type => 'boolean',
+                optional => 1,
+                default => 1,
+                description =>
+                    "Data received from the QEMU Guest-Agent is base64 encoded. If this is set to true, the data is decoded."
+                    . "Otherwise the content is forwarded with base64 encoding - defaults to true.",
+            },
+            offset => {
+                type => 'integer',
+                optional => 1,
+                default => 0,
+                description => "Offset to start reading at",
+            },
+            count => {
+                type => 'integer',
+                optional => 1,
+                minimum => 0,
+                maximum => $MAX_READ_SIZE,
+                default => $MAX_READ_SIZE,
+                description => "Number of bytes to read.",
+            },
             file => {
                 type => 'string',
                 description => 'The path to the file',
@@ -487,6 +509,9 @@ __PACKAGE__->register_method({
     },
     code => sub {
         my ($param) = @_;
+        my $param_offset = int($param->{offset} // 0);
+        my $param_decode = $param->{decode} // 1;
+        my $param_count = int($param->{count} // $MAX_READ_SIZE);
 
         my $vmid = $param->{vmid};
         my $conf = PVE::QemuConfig->load_config($vmid);
@@ -494,18 +519,33 @@ __PACKAGE__->register_method({
         my $qgafh =
             agent_cmd($vmid, $conf, "file-open", { path => $param->{file} }, "can't open file");
 
-        my $bytes_left = $MAX_READ_SIZE;
+        if ($param_offset > 0) {
+            my $seek = mon_cmd(
+                $vmid, "guest-file-seek",
+                handle => $qgafh,
+                offset => $param_offset,
+                whence => 'set',
+            );
+            check_agent_error($seek, "can't seek to offset position");
+        }
+
+        my $bytes_read = 0;
         my $eof = 0;
         my $read_size = 1024 * 1024;
         my $content = "";
 
-        while ($bytes_left > 0 && !$eof) {
+        while ($bytes_read < $param_count && !$eof) {
+            my $bytes_left = $param_count - $bytes_read;
+            my $chunk_size = $bytes_left < $read_size ? $bytes_left : $read_size;
             my $read =
-                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($read_size));
+                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($chunk_size));
             check_agent_error($read, "can't read from file");
 
-            $content .= decode_base64($read->{'buf-b64'});
-            $bytes_left -= $read->{count};
+            my $chunk = $read->{'buf-b64'};
+            $chunk = decode_base64($chunk) if $param_decode;
+            $content .= $chunk;
+
+            $bytes_read += $read->{count};
             $eof = $read->{eof} // 0;
         }
 
@@ -514,12 +554,10 @@ __PACKAGE__->register_method({
 
         my $result = {
             content => $content,
-            'bytes-read' => ($MAX_READ_SIZE - $bytes_left),
+            'bytes-read' => $bytes_read,
         };
 
         if (!$eof) {
-            warn
-                "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes. output might be truncated.\n";
             $result->{truncated} = 1;
         }
 
-- 
2.53.0





* Re: [PATCH container] close #7342: Extend qga file-read with chunked access for large files
  2026-02-23 20:16 ` [PATCH container] close #7342: " Markus Ebner
@ 2026-02-24 11:08   ` Fiona Ebner
  0 siblings, 0 replies; 3+ messages in thread
From: Fiona Ebner @ 2026-02-24 11:08 UTC
  To: Markus Ebner, pve-devel

As you already noted yourself, the prefix is wrong. Should be
'qemu-server' rather than 'container' or 'qemu'.

On 24.02.26 at 9:47 AM, Markus Ebner wrote:
> The file-read command of the QEMU guest agent previously had several
> practical limitations.

The limitations are in the file-read API endpoint, not in the guest
agent itself.

> It always read a fixed 16 MiB block starting at
> offset 0, making it impossible to retrieve larger files in multiple
> chunks. On busy or resource‑constrained hosts, requests for large files
> often timed out because the agent attempted to read and JSON‑encode the
> entire 16 MiB block at once.

Okay, 16 MiB does not seem like that much, but if you ran into the
issue, it makes sense to be more flexible.

> Binary data was also returned as raw JSON strings with extensive
> escaping, which inflated payload size and caused compatibility issues
> with some JSON parsers.

Could you give a concrete example here? What JSON parser/what
compatibility issue? I'd add that the 'decode' parameter is there for
improving this.

> This patch extends the file-read method with three new parameters:
> 
> - decode — Controls whether the base64‑encoded data returned by the
>   guest agent should be decoded before being sent back through the API.
>   When disabled, the base64 string is passed through unchanged, which is
>   ideal for binary data and mirrors the existing encode parameter of
>   file-write.
> 
> - offset — Allows reading from an arbitrary byte offset within the
>   file.
> 
> - count — Allows requesting a smaller number of bytes than the
>   internal 16 MiB limit, avoiding unnecessary overhead and reducing
>   timeout risk.

I'd prefer if there were three patches, one for each new parameter.

> 
> With these additions, the behavior mirrors standard file operations
> (fopen, fseek, fread). The offset may be any non-negative position and
> is not bounds-checked; reading at or beyond EOF returns zero bytes.
> This allows conveniently reading an entire file in a robust way:
> while(truncated && content.length != 0) {}

It should be enough to only check the truncated flag, or what additional
info does length being non-zero give?

> and also enables things like tailing a changing file.
> This makes the file-read command significantly more flexible.
> 
> All parameter additions were done in a backwards-compatible fashion.
> 
> Signed-off-by: Markus Ebner <info@ebner-markus.de>

Thank you for your contribution! A few comments below, but it's looking
quite nice already :)

> ---
>  src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 8 deletions(-)
> 
> diff --git a/src/PVE/API2/Qemu/Agent.pm b/src/PVE/API2/Qemu/Agent.pm
> index de36ce1e..ccd1dca2 100644
> --- a/src/PVE/API2/Qemu/Agent.pm
> +++ b/src/PVE/API2/Qemu/Agent.pm
> @@ -464,6 +464,28 @@ __PACKAGE__->register_method({
>                  'pve-vmid',
>                  { completion => \&PVE::QemuServer::complete_vmid_running },
>              ),
> +            decode => {
> +                type => 'boolean',
> +                optional => 1,
> +                default => 1,
> +                description =>
> +                    "Data received from the QEMU Guest-Agent is base64 encoded. If this is set to true, the data is decoded."

Style nit: line is longer than 100 columns

> +                    . "Otherwise the content is forwarded with base64 encoding - defaults to true.",

Nit: missing space at the beginning, since the strings are joined.

> +            },
> +            offset => {
> +                type => 'integer',
> +                optional => 1,
> +                default => 0,
> +                description => "Offset to start reading at",
> +            },
> +            count => {
> +                type => 'integer',
> +                optional => 1,
> +                minimum => 0,
> +                maximum => $MAX_READ_SIZE,
> +                default => $MAX_READ_SIZE,
> +                description => "Number of bytes to read.",
> +            },
>              file => {
>                  type => 'string',
>                  description => 'The path to the file',
> @@ -487,6 +509,9 @@ __PACKAGE__->register_method({
>      },
>      code => sub {
>          my ($param) = @_;
> +        my $param_offset = int($param->{offset} // 0);
> +        my $param_decode = $param->{decode} // 1;
> +        my $param_count = int($param->{count} // $MAX_READ_SIZE);

Nit: I'd drop the $param_ prefix

>  
>          my $vmid = $param->{vmid};
>          my $conf = PVE::QemuConfig->load_config($vmid);
> @@ -494,18 +519,33 @@ __PACKAGE__->register_method({
>          my $qgafh =
>              agent_cmd($vmid, $conf, "file-open", { path => $param->{file} }, "can't open file");
>  
> -        my $bytes_left = $MAX_READ_SIZE;
> +        if ($param_offset > 0) {
> +            my $seek = mon_cmd(
> +                $vmid, "guest-file-seek",
> +                handle => $qgafh,
> +                offset => $param_offset,
> +                whence => 'set',
> +            );
> +            check_agent_error($seek, "can't seek to offset position");

We should check the result to see if the seek position is as expected.
I'd rather tell the user "you tried to seek to an invalid position" than
implicitly return 0 bytes. If we seek exactly to EOF, it can still be
fine to return 0 bytes, I guess. We can set $eof=1 early then.

What do you think?
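
One possible shape for such a check, as a hedged sketch only. It assumes the guest-file-seek reply carries 'position' and 'eof' fields; check_seek_position is a hypothetical helper, not an existing function in qemu-server. Whether the agent reports a position past EOF as an error or as a successful seek depends on the guest platform and is not verified here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch. $seek is the reply of
# guest-file-seek, assumed to carry 'position' and 'eof' fields.
# Dies if the agent did not end up at the requested offset; otherwise
# returns 1 if the position is already at EOF, else 0.
sub check_seek_position {
    my ($seek, $offset) = @_;
    my $pos = $seek->{position};
    die "seek failed: no position reported\n" if !defined($pos);
    die "seek to offset $offset failed: position is $pos\n" if $pos != $offset;
    return $seek->{eof} ? 1 : 0;
}

# Matching position: returns the EOF flag so the read loop can be skipped.
my $eof = check_seek_position({ position => 4096, eof => 0 }, 4096);
print "eof after seek: $eof\n";

# Mismatching position: the helper dies with a descriptive message.
my $ok = eval { check_seek_position({ position => 100, eof => 1 }, 4096); 1 };
print "mismatch rejected: $@" if !$ok;
```

Returning the EOF flag lets the caller set $eof early and return zero bytes without issuing a guest-file-read.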

> +        }
> +
> +        my $bytes_read = 0;
>          my $eof = 0;
>          my $read_size = 1024 * 1024;
>          my $content = "";
>  
> -        while ($bytes_left > 0 && !$eof) {
> +        while ($bytes_read < $param_count && !$eof) {
> +            my $bytes_left = $param_count - $bytes_read;
> +            my $chunk_size = $bytes_left < $read_size ? $bytes_left : $read_size;
>              my $read =
> -                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($read_size));
> +                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($chunk_size));
>              check_agent_error($read, "can't read from file");
>  
> -            $content .= decode_base64($read->{'buf-b64'});
> -            $bytes_left -= $read->{count};
> +            my $chunk = $read->{'buf-b64'};
> +            $chunk = decode_base64($chunk) if $param_decode;
> +            $content .= $chunk;
> +
> +            $bytes_read += $read->{count};
>              $eof = $read->{eof} // 0;
>          }
>  
> @@ -514,12 +554,10 @@ __PACKAGE__->register_method({
>  
>          my $result = {
>              content => $content,
> -            'bytes-read' => ($MAX_READ_SIZE - $bytes_left),
> +            'bytes-read' => $bytes_read,
>          };
>  
>          if (!$eof) {
> -            warn
> -                "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes. output might be truncated.\n";

I think we should still emit this warning if no read size was explicitly
specified.
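
A sketch of how that could look, under stated assumptions: build_result is a hypothetical helper used only to make the example self-contained, and $MAX_READ_SIZE is set to the 16 MiB mentioned in the commit message. The warning is kept for callers that did not pass 'count', while the truncated flag is set in both cases.

```perl
use strict;
use warnings;

# Assumed value, taken from the 16 MiB limit named in the commit message.
my $MAX_READ_SIZE = 16 * 1024 * 1024;

# Hypothetical helper: build the API result and warn about truncation only
# when the caller did not pass an explicit 'count'.
sub build_result {
    my ($param, $content, $bytes_read, $eof) = @_;
    my $result = { content => $content, 'bytes-read' => $bytes_read };
    if (!$eof) {
        warn "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes."
            . " output might be truncated.\n"
            if !defined($param->{count});
        $result->{truncated} = 1;
    }
    return $result;
}

# Explicit count: truncated is flagged, but no warning is emitted.
my $res = build_result({ count => 4096 }, 'x' x 4096, 4096, 0);
print "truncated: ", ($res->{truncated} // 0), "\n";
```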

>              $result->{truncated} = 1;
>          }
>  





