* [PATCH qemu 0/1] Extend qga file-read with chunked access for large files
@ 2026-02-23 20:16 Markus Ebner
2026-02-23 20:16 ` [PATCH container] close #7342: " Markus Ebner
0 siblings, 1 reply; 3+ messages in thread
From: Markus Ebner @ 2026-02-23 20:16 UTC (permalink / raw)
To: pve-devel; +Cc: Markus Ebner
The file-read command of the QEMU guest agent previously had several
practical limitations. It always read a fixed 16 MiB block starting at
offset 0, making it impossible to retrieve larger files in multiple
chunks. On busy or resource‑constrained hosts, requests for large files
often timed out because the agent attempted to read and JSON‑encode the
entire 16 MiB block at once.
Binary data was also returned as raw JSON strings with extensive
escaping, which inflated payload size and caused compatibility issues
with some JSON parsers.
This patch extends the file-read method with three new parameters:
- decode — Controls whether the base64‑encoded data returned by the
guest agent should be decoded before being sent back through the API.
When disabled, the base64 string is passed through unchanged, which is
ideal for binary data and mirrors the existing encode parameter of
file-write.
- offset — Allows reading from an arbitrary byte offset within the
file.
- count — Allows requesting a smaller number of bytes than the
internal 16 MiB limit, avoiding unnecessary overhead and reducing
timeout risk.
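
As a rough illustration of what the decode parameter changes for a client, here is a minimal Python sketch. `api_content` is a hypothetical helper, not part of the patch, and Python's `json` module only stands in for whatever JSON encoder the API uses, so the size difference it shows is illustrative rather than a measurement.

```python
import base64
import json


def api_content(raw: bytes, decode: bool = True) -> str:
    """Model the 'content' field of a file-read reply (hypothetical helper)."""
    # The guest agent hands the data over base64 encoded.
    b64 = base64.b64encode(raw).decode("ascii")
    if decode:
        # Decoded bytes still have to travel as a JSON string; latin-1 keeps
        # one character per byte for the purpose of this illustration.
        return base64.b64decode(b64).decode("latin-1")
    # decode disabled: the base64 text is passed through unchanged.
    return b64


raw = bytes(range(256))  # every possible byte value once
escaped_len = len(json.dumps({"content": api_content(raw, decode=True)}))
passthrough_len = len(json.dumps({"content": api_content(raw, decode=False)}))

# With decode disabled, the client recovers the bytes itself.
recovered = base64.b64decode(api_content(raw, decode=False))
```

In this toy setup the base64 pass-through serializes to fewer characters than the escaped string, and the client gets the original bytes back with a single `b64decode` call.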
With these additions, the behavior now mirrors standard file operations
(fopen, fseek, fread). Seek accepts any non-negative position without
bounds checking, and reading at or beyond EOF returns zero bytes.
This allows conveniently reading an entire file in a robust way, by
requesting the next chunk for as long as the previous reply was
truncated and non-empty:
    while(truncated && content.length != 0) {}
It also enables things like tailing a changing file.
This makes the file-read command significantly more flexible.
All parameter additions were done in a backwards-compatible fashion.
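
The read-everything loop described above could look roughly like the following Python sketch on the client side. `read_whole_file` and `make_fake_endpoint` are hypothetical names; the fake endpoint is an in-memory stand-in for the real API call and only imitates the `content`, `bytes-read` and `truncated` fields of the reply.

```python
from typing import Callable


def read_whole_file(read_chunk: Callable[[int, int], dict],
                    count: int = 1024 * 1024) -> bytes:
    """Fetch a file piece by piece via a caller-supplied read_chunk(offset, count)."""
    parts = []
    offset = 0
    while True:
        reply = read_chunk(offset, count)
        content = reply["content"]
        parts.append(content)
        offset += reply["bytes-read"]
        # Stop once the reply is no longer truncated, or if a call made no
        # progress (guards against looping forever on an empty reply).
        if not reply.get("truncated") or len(content) == 0:
            break
    return b"".join(parts)


def make_fake_endpoint(data: bytes) -> Callable[[int, int], dict]:
    """In-memory stand-in for the API call (assumption, for illustration only)."""
    def read_chunk(offset: int, count: int) -> dict:
        chunk = data[offset:offset + count]
        reply = {"content": chunk, "bytes-read": len(chunk)}
        if offset + len(chunk) < len(data):
            reply["truncated"] = 1
        return reply
    return read_chunk
```

Passing the transport in as a callable keeps the loop independent of how the request is actually sent; a real client would wrap its HTTP call in a function with the same shape.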
Markus Ebner (1):
close #7342: Extend qga file-read with chunked access for large files
src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
1 file changed, 46 insertions(+), 8 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH container] close #7342: Extend qga file-read with chunked access for large files
2026-02-23 20:16 [PATCH qemu 0/1] Extend qga file-read with chunked access for large files Markus Ebner
@ 2026-02-23 20:16 ` Markus Ebner
2026-02-24 11:08 ` Fiona Ebner
0 siblings, 1 reply; 3+ messages in thread
From: Markus Ebner @ 2026-02-23 20:16 UTC (permalink / raw)
To: pve-devel; +Cc: Markus Ebner

The file-read command of the QEMU guest agent previously had several
practical limitations. It always read a fixed 16 MiB block starting at
offset 0, making it impossible to retrieve larger files in multiple
chunks. On busy or resource‑constrained hosts, requests for large files
often timed out because the agent attempted to read and JSON‑encode the
entire 16 MiB block at once.
Binary data was also returned as raw JSON strings with extensive
escaping, which inflated payload size and caused compatibility issues
with some JSON parsers.

This patch extends the file-read method with three new parameters:

- decode — Controls whether the base64‑encoded data returned by the
  guest agent should be decoded before being sent back through the API.
  When disabled, the base64 string is passed through unchanged, which is
  ideal for binary data and mirrors the existing encode parameter of
  file-write.

- offset — Allows reading from an arbitrary byte offset within the
  file.

- count — Allows requesting a smaller number of bytes than the
  internal 16 MiB limit, avoiding unnecessary overhead and reducing
  timeout risk.

With these additions, the behavior now mirrors standard file operations
(fopen, fseek, fread). Reading beyond EOF returns zero bytes.
Seek can choose any non-negative position within the file, without
bounds checking. Reading out of bounds returns 0 bytes.
This allows conveniently reading an entire file in a robust way:
    while(truncated && content.length != 0) {}
and also enables things like tailing a changing file.
This makes the file-read command significantly more flexible.

All parameter additions were done in a backwards-compatible fashion.

Signed-off-by: Markus Ebner <info@ebner-markus.de>
---
 src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/src/PVE/API2/Qemu/Agent.pm b/src/PVE/API2/Qemu/Agent.pm
index de36ce1e..ccd1dca2 100644
--- a/src/PVE/API2/Qemu/Agent.pm
+++ b/src/PVE/API2/Qemu/Agent.pm
@@ -464,6 +464,28 @@ __PACKAGE__->register_method({
                 'pve-vmid',
                 { completion => \&PVE::QemuServer::complete_vmid_running },
             ),
+            decode => {
+                type => 'boolean',
+                optional => 1,
+                default => 1,
+                description =>
+                    "Data received from the QEMU Guest-Agent is base64 encoded. If this is set to true, the data is decoded."
+                    . "Otherwise the content is forwarded with base64 encoding - defaults to true.",
+            },
+            offset => {
+                type => 'integer',
+                optional => 1,
+                default => 0,
+                description => "Offset to start reading at",
+            },
+            count => {
+                type => 'integer',
+                optional => 1,
+                minimum => 0,
+                maximum => $MAX_READ_SIZE,
+                default => $MAX_READ_SIZE,
+                description => "Number of bytes to read.",
+            },
             file => {
                 type => 'string',
                 description => 'The path to the file',
@@ -487,6 +509,9 @@ __PACKAGE__->register_method({
     },
     code => sub {
         my ($param) = @_;
+        my $param_offset = int($param->{offset} // 0);
+        my $param_decode = $param->{decode} // 1;
+        my $param_count = int($param->{count} // $MAX_READ_SIZE);
 
         my $vmid = $param->{vmid};
         my $conf = PVE::QemuConfig->load_config($vmid);
@@ -494,18 +519,33 @@ __PACKAGE__->register_method({
         my $qgafh =
             agent_cmd($vmid, $conf, "file-open", { path => $param->{file} }, "can't open file");
 
-        my $bytes_left = $MAX_READ_SIZE;
+        if ($param_offset > 0) {
+            my $seek = mon_cmd(
+                $vmid, "guest-file-seek",
+                handle => $qgafh,
+                offset => $param_offset,
+                whence => 'set',
+            );
+            check_agent_error($seek, "can't seek to offset position");
+        }
+
+        my $bytes_read = 0;
         my $eof = 0;
         my $read_size = 1024 * 1024;
         my $content = "";
 
-        while ($bytes_left > 0 && !$eof) {
+        while ($bytes_read < $param_count && !$eof) {
+            my $bytes_left = $param_count - $bytes_read;
+            my $chunk_size = $bytes_left < $read_size ? $bytes_left : $read_size;
             my $read =
-                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($read_size));
+                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($chunk_size));
             check_agent_error($read, "can't read from file");
 
-            $content .= decode_base64($read->{'buf-b64'});
-            $bytes_left -= $read->{count};
+            my $chunk = $read->{'buf-b64'};
+            $chunk = decode_base64($chunk) if $param_decode;
+            $content .= $chunk;
+
+            $bytes_read += $read->{count};
             $eof = $read->{eof} // 0;
         }
 
@@ -514,12 +554,10 @@ __PACKAGE__->register_method({
 
         my $result = {
             content => $content,
-            'bytes-read' => ($MAX_READ_SIZE - $bytes_left),
+            'bytes-read' => $bytes_read,
         };
 
         if (!$eof) {
-            warn
-                "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes. output might be truncated.\n";
             $result->{truncated} = 1;
         }
 
--
2.53.0

^ permalink raw reply [flat|nested] 3+ messages in thread
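
The accounting in the patched read loop can be modelled outside of Perl with a short Python sketch. `read_in_chunks` is a hypothetical function operating on an in-memory byte string, and it assumes that end of file is only reported by a read that comes up short, in the manner of fread(); the guest agent's actual eof reporting may differ.

```python
def read_in_chunks(data: bytes, offset: int, count: int,
                   read_size: int = 1024 * 1024) -> dict:
    """Model of the chunked loop: read up to `count` bytes in `read_size` pieces."""
    pos = offset
    bytes_read = 0
    eof = False
    content = b""
    while bytes_read < count and not eof:
        bytes_left = count - bytes_read
        # Never ask for more than what is still wanted.
        chunk_size = min(bytes_left, read_size)
        chunk = data[pos:pos + chunk_size]
        # Assumption: EOF is only signalled by a short read.
        eof = len(chunk) < chunk_size
        pos += len(chunk)
        content += chunk
        bytes_read += len(chunk)
    result = {"content": content, "bytes-read": bytes_read}
    if not eof:
        # The loop stopped because `count` was reached, not because of EOF.
        result["truncated"] = 1
    return result
```

Under these assumptions a call with a count of zero never enters the loop and therefore comes back empty but flagged as truncated, which is one case where a client that checks only the flag would not terminate.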
* Re: [PATCH container] close #7342: Extend qga file-read with chunked access for large files
2026-02-23 20:16 ` [PATCH container] close #7342: " Markus Ebner
@ 2026-02-24 11:08 ` Fiona Ebner
0 siblings, 0 replies; 3+ messages in thread
From: Fiona Ebner @ 2026-02-24 11:08 UTC (permalink / raw)
To: Markus Ebner, pve-devel

As you already noted yourself, the prefix is wrong. Should be
'qemu-server' rather than 'container' or 'qemu'.

On 24.02.26 at 9:47 AM, Markus Ebner wrote:
> The file-read command of the QEMU guest agent previously had several
> practical limitations.

The limitations are in the file-read API endpoint, not in the guest
agent itself.

> It always read a fixed 16 MiB block starting at
> offset 0, making it impossible to retrieve larger files in multiple
> chunks. On busy or resource‑constrained hosts, requests for large files
> often timed out because the agent attempted to read and JSON‑encode the
> entire 16 MiB block at once.

Okay, 16 MiB does not seem like that much, but if you ran into the
issue, it makes sense to be more flexible.

> Binary data was also returned as raw JSON strings with extensive
> escaping, which inflated payload size and caused compatibility issues
> with some JSON parsers.

Could you give a concrete example here? What JSON parser/what
compatibility issue? I'd add that the 'decode' parameter is there for
improving this.

> This patch extends the file-read method with three new parameters:
>
> - decode — Controls whether the base64‑encoded data returned by the
> guest agent should be decoded before being sent back through the API.
> When disabled, the base64 string is passed through unchanged, which is
> ideal for binary data and mirrors the existing encode parameter of
> file-write.
>
> - offset — Allows reading from an arbitrary byte offset within the
> file.
>
> - count — Allows requesting a smaller number of bytes than the
> internal 16 MiB limit, avoiding unnecessary overhead and reducing
> timeout risk.

I'd prefer if there were three patches, one for each new parameter.

>
> With these additions, the behavior now mirrors standard file operations
> (fopen, fseek, fread). Reading beyond EOF returns zero bytes.
> Seek can choose any non-negative position within the file, without
> bounds checking. Reading out of bounds returns 0 bytes.
> This allows conveniently reading an entire file in a robust way:
> while(truncated && content.length != 0) {}

It should be enough to only check the truncated flag, or what
additional info does length being non-zero give?

> and also enables things like tailing a changing file.
> This makes the file-read command significantly more flexible.
>
> All parameter additions were done in a backwards-compatible fashion.
>
> Signed-off-by: Markus Ebner <info@ebner-markus.de>

Thank you for your contribution! A few comments below, but it's looking
quite nice already :)

> ---
>  src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 8 deletions(-)
>
> diff --git a/src/PVE/API2/Qemu/Agent.pm b/src/PVE/API2/Qemu/Agent.pm
> index de36ce1e..ccd1dca2 100644
> --- a/src/PVE/API2/Qemu/Agent.pm
> +++ b/src/PVE/API2/Qemu/Agent.pm
> @@ -464,6 +464,28 @@ __PACKAGE__->register_method({
>                  'pve-vmid',
>                  { completion => \&PVE::QemuServer::complete_vmid_running },
>              ),
> +            decode => {
> +                type => 'boolean',
> +                optional => 1,
> +                default => 1,
> +                description =>
> +                    "Data received from the QEMU Guest-Agent is base64 encoded. If this is set to true, the data is decoded."

Style nit: line is longer than 100 columns

> +                    . "Otherwise the content is forwarded with base64 encoding - defaults to true.",

Nit: missing space at the beginning, since the strings are joined.

> +            },
> +            offset => {
> +                type => 'integer',
> +                optional => 1,
> +                default => 0,
> +                description => "Offset to start reading at",
> +            },
> +            count => {
> +                type => 'integer',
> +                optional => 1,
> +                minimum => 0,
> +                maximum => $MAX_READ_SIZE,
> +                default => $MAX_READ_SIZE,
> +                description => "Number of bytes to read.",
> +            },
>              file => {
>                  type => 'string',
>                  description => 'The path to the file',
> @@ -487,6 +509,9 @@ __PACKAGE__->register_method({
>      },
>      code => sub {
>          my ($param) = @_;
> +        my $param_offset = int($param->{offset} // 0);
> +        my $param_decode = $param->{decode} // 1;
> +        my $param_count = int($param->{count} // $MAX_READ_SIZE);

Nit: I'd drop the $param_ prefix

>
>          my $vmid = $param->{vmid};
>          my $conf = PVE::QemuConfig->load_config($vmid);
> @@ -494,18 +519,33 @@ __PACKAGE__->register_method({
>          my $qgafh =
>              agent_cmd($vmid, $conf, "file-open", { path => $param->{file} }, "can't open file");
>
> -        my $bytes_left = $MAX_READ_SIZE;
> +        if ($param_offset > 0) {
> +            my $seek = mon_cmd(
> +                $vmid, "guest-file-seek",
> +                handle => $qgafh,
> +                offset => $param_offset,
> +                whence => 'set',
> +            );
> +            check_agent_error($seek, "can't seek to offset position");

We should check the result to see if the seek position is as expected.
I'd rather tell the user "you searched to an invalid position" than
implicitly return 0 bytes. If we seek exactly to the EOF it can still be
fine to return 0 bytes I guess. We can set $eof=1 early then. What do
you think?

> +        }
> +
> +        my $bytes_read = 0;
>          my $eof = 0;
>          my $read_size = 1024 * 1024;
>          my $content = "";
>
> -        while ($bytes_left > 0 && !$eof) {
> +        while ($bytes_read < $param_count && !$eof) {
> +            my $bytes_left = $param_count - $bytes_read;
> +            my $chunk_size = $bytes_left < $read_size ? $bytes_left : $read_size;
>              my $read =
> -                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($read_size));
> +                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($chunk_size));
>              check_agent_error($read, "can't read from file");
>
> -            $content .= decode_base64($read->{'buf-b64'});
> -            $bytes_left -= $read->{count};
> +            my $chunk = $read->{'buf-b64'};
> +            $chunk = decode_base64($chunk) if $param_decode;
> +            $content .= $chunk;
> +
> +            $bytes_read += $read->{count};
>              $eof = $read->{eof} // 0;
>          }
>
> @@ -514,12 +554,10 @@ __PACKAGE__->register_method({
>
>          my $result = {
>              content => $content,
> -            'bytes-read' => ($MAX_READ_SIZE - $bytes_left),
> +            'bytes-read' => $bytes_read,
>          };
>
>          if (!$eof) {
> -            warn
> -                "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes. output might be truncated.\n";

I think we should still warn this if no read size was explicitly specified.

>              $result->{truncated} = 1;
>          }
>

^ permalink raw reply [flat|nested] 3+ messages in thread
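
Two of the review suggestions above, validating the seek result and warning about truncation only when no count was given, could be expressed as the following Python sketch of the decision logic. All names here are hypothetical, and it assumes the seek reply carries a `position` and an `eof` field; it is not the code that was eventually applied.

```python
from typing import Optional


class InvalidSeek(Exception):
    """Raised when the guest did not end up at the requested offset."""


def check_seek(requested: int, position: int, eof: bool) -> bool:
    """Validate a seek reply; return True if reading can be skipped (already at EOF)."""
    if position != requested:
        raise InvalidSeek(f"seek to {requested} ended at {position}")
    # Seeking exactly to EOF is accepted and simply yields zero bytes.
    return bool(eof)


def should_warn_truncated(eof: bool, count: Optional[int]) -> bool:
    """Warn only if the output was cut off and the caller passed no explicit count."""
    return not eof and count is None
```

Keeping both checks as small pure functions makes the intended behaviour easy to discuss and test independently of the agent round trip.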