From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id DD9F21FF139
	for <inbox@lore.proxmox.com>; Tue, 24 Feb 2026 12:08:21 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 2D12157CB;
	Tue, 24 Feb 2026 12:09:14 +0100 (CET)
Message-ID: <97d497d1-067f-4da3-a3f9-f0ea87a27e54@proxmox.com>
Date: Tue, 24 Feb 2026 12:08:38 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH container] close #7342: Extend qga file-read with chunked
 access for large files
To: Markus Ebner <info@ebner-markus.de>, pve-devel@lists.proxmox.com
References: <20260223201648.297620-1-info@ebner-markus.de>
 <20260223201648.297620-2-info@ebner-markus.de>
Content-Language: en-US
From: Fiona Ebner <f.ebner@proxmox.com>
In-Reply-To: <20260223201648.297620-2-info@ebner-markus.de>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2
X-Bm-Transport-Timestamp: 1771931302926
X-SPAM-LEVEL: Spam detection results:  0
	AWL                    -1.078 Adjusted score from AWL reputation of From:
 address
	BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
	DMARC_MISSING             0.1 Missing DMARC policy
	KAM_DMARC_STATUS         0.01 Test Rule for DKIM or SPF Failure with Strict
 Alignment
	RCVD_IN_VALIDITY_CERTIFIED_BLOCKED  1.179 ADMINISTRATOR NOTICE: The query to
 Validity was blocked.  See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
	RCVD_IN_VALIDITY_RPBL_BLOCKED  0.717 ADMINISTRATOR NOTICE: The query to
 Validity was blocked.  See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
	RCVD_IN_VALIDITY_SAFE_BLOCKED  0.236 ADMINISTRATOR NOTICE: The query to
 Validity was blocked.  See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
	SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
	SPF_PASS               -0.001 SPF: sender matches SPF record
Message-ID-Hash: 6BJGO567J5OJRFUWUIJTQ4PITGN4ZFU2
X-Message-ID-Hash: 6BJGO567J5OJRFUWUIJTQ4PITGN4ZFU2
X-MailFrom: f.ebner@proxmox.com
X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop;
 banned-address; emergency; member-moderation; nonmember-moderation;
 administrivia; implicit-dest; max-recipients; max-size; news-moderation;
 no-subject; digests; suspicious-header
X-Mailman-Version: 3.3.10
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Owner: <mailto:pve-devel-owner@lists.proxmox.com>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Subscribe: <mailto:pve-devel-join@lists.proxmox.com>
List-Unsubscribe: <mailto:pve-devel-leave@lists.proxmox.com>

As you already noted yourself, the prefix is wrong. Should be
'qemu-server' rather than 'container' or 'qemu'.

Am 24.02.26 um 9:47 AM schrieb Markus Ebner:
> The file-read command of the QEMU guest agent previously had several
> practical limitations.

The limitations are in the file-read API endpoint, not in the guest
agent itself.

> It always read a fixed 16 MiB block starting at
> offset 0, making it impossible to retrieve larger files in multiple
> chunks. On busy or resource‑constrained hosts, requests for large files
> often timed out because the agent attempted to read and JSON‑encode the
> entire 16 MiB block at once.

Okay, 16 MiB does not seem like that much, but if you ran into the
issue, it makes sense to be more flexible.

> Binary data was also returned as raw JSON strings with extensive
> escaping, which inflated payload size and caused compatibility issues
> with some JSON parsers.

Could you give a concrete example here? What JSON parser/what
compatibility issue? I'd add that the 'decode' parameter is there for
improving this.

> This patch extends the file-read method with three new parameters:
> 
> - decode — Controls whether the base64‑encoded data returned by the
>   guest agent should be decoded before being sent back through the API.
>   When disabled, the base64 string is passed through unchanged, which is
>   ideal for binary data and mirrors the existing encode parameter of
>   file-write.
> 
> - offset — Allows reading from an arbitrary byte offset within the
>   file.
> 
> - count — Allows requesting a smaller number of bytes than the
>   internal 16 MiB limit, avoiding unnecessary overhead and reducing
>   timeout risk.

I'd prefer if there were three patches, one for each new parameter.

> 
> With these additions, the behavior now mirrors standard file operations
> (fopen, fseek, fread). Reading beyond EOF returns zero bytes.
> Seek can choose any non-negative position within the file, without
> bounds checking. Reading out of bounds returns 0 bytes.
> This allows conveniently reading an entire file in a robust way:
> while(truncated && content.length != 0) {}

It should be enough to only check the truncated flag, or what additional
info does length being non-zero give?

> and also enables things like tailing a changing file.
> This makes the file-read command significantly more flexible.
> 
> All parameter additions were done in a backwards-compatible fashion.
> 
> Signed-off-by: Markus Ebner <info@ebner-markus.de>

Thank you for your contribution! A few comments below, but it's looking
quite nice already :)

> ---
>  src/PVE/API2/Qemu/Agent.pm | 54 ++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 8 deletions(-)
> 
> diff --git a/src/PVE/API2/Qemu/Agent.pm b/src/PVE/API2/Qemu/Agent.pm
> index de36ce1e..ccd1dca2 100644
> --- a/src/PVE/API2/Qemu/Agent.pm
> +++ b/src/PVE/API2/Qemu/Agent.pm
> @@ -464,6 +464,28 @@ __PACKAGE__->register_method({
>                  'pve-vmid',
>                  { completion => \&PVE::QemuServer::complete_vmid_running },
>              ),
> +            decode => {
> +                type => 'boolean',
> +                optional => 1,
> +                default => 1,
> +                description =>
> +                    "Data received from the QEMU Guest-Agent is base64 encoded. If this is set to true, the data is decoded."

Style nit: line is longer than 100 columns

> +                    . "Otherwise the content is forwarded with base64 encoding - defaults to true.",

Nit: missing space at the beginning, since the strings are joined.

> +            },
> +            offset => {
> +                type => 'integer',
> +                optional => 1,
> +                default => 0,
> +                description => "Offset to start reading at",
> +            },
> +            count => {
> +                type => 'integer',
> +                optional => 1,
> +                minimum => 0,
> +                maximum => $MAX_READ_SIZE,
> +                default => $MAX_READ_SIZE,
> +                description => "Number of bytes to read.",
> +            },
>              file => {
>                  type => 'string',
>                  description => 'The path to the file',
> @@ -487,6 +509,9 @@ __PACKAGE__->register_method({
>      },
>      code => sub {
>          my ($param) = @_;
> +        my $param_offset = int($param->{offset} // 0);
> +        my $param_decode = $param->{decode} // 1;
> +        my $param_count = int($param->{count} // $MAX_READ_SIZE);

Nit: I'd drop the $param_ prefix

>  
>          my $vmid = $param->{vmid};
>          my $conf = PVE::QemuConfig->load_config($vmid);
> @@ -494,18 +519,33 @@ __PACKAGE__->register_method({
>          my $qgafh =
>              agent_cmd($vmid, $conf, "file-open", { path => $param->{file} }, "can't open file");
>  
> -        my $bytes_left = $MAX_READ_SIZE;
> +        if ($param_offset > 0) {
> +            my $seek = mon_cmd(
> +                $vmid, "guest-file-seek",
> +                handle => $qgafh,
> +                offset => $param_offset,
> +                whence => 'set',
> +            );
> +            check_agent_error($seek, "can't seek to offset position");

We should check the result to see if the seek position is as expected.
I'd rather tell the user "you searched to an invalid position" than
implicitly return 0 bytes. If we seek exactly to the EOF it can still be
fine to return 0 bytes I guess. We can set $eof=1 early then.

What do you think?

> +        }
> +
> +        my $bytes_read = 0;
>          my $eof = 0;
>          my $read_size = 1024 * 1024;
>          my $content = "";
>  
> -        while ($bytes_left > 0 && !$eof) {
> +        while ($bytes_read < $param_count && !$eof) {
> +            my $bytes_left = $param_count - $bytes_read;
> +            my $chunk_size = $bytes_left < $read_size ? $bytes_left : $read_size;
>              my $read =
> -                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($read_size));
> +                mon_cmd($vmid, "guest-file-read", handle => $qgafh, count => int($chunk_size));
>              check_agent_error($read, "can't read from file");
>  
> -            $content .= decode_base64($read->{'buf-b64'});
> -            $bytes_left -= $read->{count};
> +            my $chunk = $read->{'buf-b64'};
> +            $chunk = decode_base64($chunk) if $param_decode;
> +            $content .= $chunk;
> +
> +            $bytes_read += $read->{count};
>              $eof = $read->{eof} // 0;
>          }
>  
> @@ -514,12 +554,10 @@ __PACKAGE__->register_method({
>  
>          my $result = {
>              content => $content,
> -            'bytes-read' => ($MAX_READ_SIZE - $bytes_left),
> +            'bytes-read' => $bytes_read,
>          };
>  
>          if (!$eof) {
> -            warn
> -                "agent file-read: reached maximum read size: $MAX_READ_SIZE bytes. output might be truncated.\n";

I think we should still warn this if no read size was explicitly specified.

>              $result->{truncated} = 1;
>          }
>