From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id DD9F21FF139 for ; Tue, 24 Feb 2026 12:08:21 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 2D12157CB; Tue, 24 Feb 2026 12:09:14 +0100 (CET) Message-ID: <97d497d1-067f-4da3-a3f9-f0ea87a27e54@proxmox.com> Date: Tue, 24 Feb 2026 12:08:38 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH container] close #7342: Extend qga file-read with chunked access for large files To: Markus Ebner , pve-devel@lists.proxmox.com References: <20260223201648.297620-1-info@ebner-markus.de> <20260223201648.297620-2-info@ebner-markus.de> Content-Language: en-US From: Fiona Ebner In-Reply-To: <20260223201648.297620-2-info@ebner-markus.de> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1771931302926 X-SPAM-LEVEL: Spam detection results: 0 AWL -1.078 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 1.179 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_RPBL_BLOCKED 0.717 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. RCVD_IN_VALIDITY_SAFE_BLOCKED 0.236 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information. SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: 6BJGO567J5OJRFUWUIJTQ4PITGN4ZFU2 X-Message-ID-Hash: 6BJGO567J5OJRFUWUIJTQ4PITGN4ZFU2 X-MailFrom: f.ebner@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox VE development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: As you already noted yourself, the prefix is wrong. Should be 'qemu-server' rather than 'container' or 'qemu'. Am 24.02.26 um 9:47 AM schrieb Markus Ebner: > The file-read command of the QEMU guest agent previously had several > practical limitations. The limitations are in the file-read API endpoint, not in the guest agent itself. > It always read a fixed 16 MiB block starting at > offset 0, making it impossible to retrieve larger files in multiple > chunks. On busy or resource‑constrained hosts, requests for large files > often timed out because the agent attempted to read and JSON‑encode the > entire 16 MiB block at once. Okay, 16 MiB does not seem like that much, but if you ran into the issue, it makes sense to be more flexible. > Binary data was also returned as raw JSON strings with extensive > escaping, which inflated payload size and caused compatibility issues > with some JSON parsers. Could you give a concrete example here? What JSON parser/what compatibility issue? I'd add that the 'decode' parameter is there for improving this. > This patch extends the file-read method with three new parameters: > > - decode — Controls whether the base64‑encoded data returned by the > guest agent should be decoded before being sent back through the API. > When disabled, the base64 string is passed through unchanged, which is > ideal for binary data and mirrors the existing encode parameter of > file-write. > > - offset — Allows reading from an arbitrary byte offset within the > file. > > - count — Allows requesting a smaller number of bytes than the > internal 16 MiB limit, avoiding unnecessary overhead and reducing > timeout risk. I'd prefer if there were three patches, one for each new parameter. > > With these additions, the behavior now mirrors standard file operations > (fopen, fseek, fread). Reading beyond EOF returns zero bytes. > Seek can choose any non-negative position within the file, without > bounds checking. Reading out of bounds returns 0 bytes. > This allows conveniently reading an entire file in a robust way: > while(truncated && content.length != 0) {} It should be enough to only check the truncated flag, or what additional info does length being non-zero give? > and also enables things like tailing a changing file. > This makes the file-read command significantly more flexible. > > All parameter additions were done in a backwards-compatible fashion. > > Signed-off-by: Markus Ebner Thank you for your contribution! A few comments below, but it's looking quite nice already :) > --- > src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------ > 1 file changed, 46 insertions(+), 8 deletions(-) > > diff --git a/src/PVE/API2/Qemu/Agent.pm b/src/PVE/API2/Qemu/Agent.pm > index de36ce1e..ccd1dca2 100644 > --- a/src/PVE/API2/Qemu/Agent.pm > +++ b/src/PVE/API2/Qemu/Agent.pm > @@ -464,6 +464,28 @@ __PACKAGE__->register_method({ > 'pve-vmid', > { completion => \&PVE::QemuServer::complete_vmid_running }, > ), > + decode => { > + type => 'boolean', > + optional => 1, > + default => 1, > + description => > + "Data received from the QEMU Guest-Agent is base64 encoded. If this is set to true, the data is decoded." Style nit: line is longer than 100 columns > + . "Otherwise the content is forwarded with base64 encoding - defaults to true.", Nit: missing space at the beginning, since the strings are joined. > + }, > + offset => { > + type => 'integer', > + optional => 1, > + default => 0, > + description => "Offset to start reading at", > + }, > + count => { > + type => 'integer', > + optional => 1, > + minimum => 0, > + maximum => $MAX_READ_SIZE, > + default => $MAX_READ_SIZE, > + description => "Number of bytes to read.", > + }, > file => { > type => 'string', > description => 'The path to the file', > @@ -487,6 +509,9 @@ __PACKAGE__->register_method({ > }, > code => sub { > my ($param) = @_; > + my $param_offset = int($param->{offset} // 0); > + my $param_decode = $param->{decode} // 1; > + my $param_count = int($param->{count} // $MAX_READ_SIZE); Nit: I'd drop the $param_ prefix > > my $vmid = $param->{vmid}; > my $conf = PVE::QemuConfig->load_config($vmid); > @@ -494,18 +519,33 @@ __PACKAGE__->register_method({ > my $qgafh = > agent_cmd($vmid, $conf, "file-open", { path => $param->{file} }, "can't open file"); > > - my $bytes_left = $MAX_READ_SIZE; > + if ($param_offset > 0) { > + my $seek = mon_cmd( > + $vmid, "guest-file-seek", > + handle => $qgafh, > + offset => $param_offset, > + whence => 'set', > + ); > + check_agent_error($seek, "can't seek to offset position"); We should check the result to see if the seek position is as expected. I'd rather tell the user "you searched to an invalid position" than implicitly return 0 bytes. If we seek exactly to the EOF it can still be fine to return 0 bytes I guess. We can set $eof=1 early then. What do you think? > + } > + > + my $bytes_read = 0; > my $eof = 0; > my $read_size = 1024 * 1024; > my $content = ""; > > - while ($bytes_left > 0 && !$eof) { > + while ($bytes_read < $param_count && !$eof) { > + my $bytes_left = $param_count - $bytes_read; > + my $chunk_size = $bytes_left < $read_size ? $bytes_left : $read_size; > my $read = > - mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($read_size)); > + mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($chunk_size)); > check_agent_error($read, "can't read from file"); > > - $content .= decode_base64($read->{'buf-b64'}); > - $bytes_left -= $read->{count}; > + my $chunk = $read->{'buf-b64'}; > + $chunk = decode_base64($chunk) if $param_decode; > + $content .= $chunk; > + > + $bytes_read += $read->{count}; > $eof = $read->{eof} // 0; > } > > @@ -514,12 +554,10 @@ __PACKAGE__->register_method({ > > my $result = { > content => $content, > - 'bytes-read' => ($MAX_READ_SIZE - $bytes_left), > + 'bytes-read' => $bytes_read, > }; > > if (!$eof) { > - warn > - "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes. output might be truncated.\n"; I think we should still warn this if no read size was explicitly specified. > $result->{truncated} = 1; > } >