From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id F2E1B61A77 for ; Wed, 26 Jul 2023 14:31:53 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id DB02F62BD for ; Wed, 26 Jul 2023 14:31:53 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Wed, 26 Jul 2023 14:31:53 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id E1BA04538F for ; Wed, 26 Jul 2023 14:31:52 +0200 (CEST) Date: Wed, 26 Jul 2023 14:31:45 +0200 From: Fabian =?iso-8859-1?q?Gr=FCnbichler?= To: Proxmox VE development discussion References: <20230725143720.620577-1-p.hufnagl@proxmox.com> <20230725143720.620577-2-p.hufnagl@proxmox.com> In-Reply-To: <20230725143720.620577-2-p.hufnagl@proxmox.com> MIME-Version: 1.0 User-Agent: astroid/0.16.0 (https://github.com/astroidmail/astroid) Message-Id: <1690365209.m36aguo1vk.astroid@yuna.none> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-SPAM-LEVEL: Spam detection results: 0 AWL 0.070 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - Subject: Re: [pve-devel] [PATCH pve-storage 1/2] fix #4849: download-url: allow download and decompression of compressed ISOs X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Jul 2023 12:31:54 -0000 On July 25, 2023 4:37 pm, Philipp Hufnagl wrote: > Signed-off-by: Philipp Hufnagl > --- > src/PVE/API2/Storage/Status.pm | 20 ++++++++++++++++++-- > src/PVE/Storage.pm | 22 ++++++++++++++++++++++ > 2 files changed, 40 insertions(+), 2 deletions(-) >=20 > diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status= .pm > index e4ce698..9ac4660 100644 > --- a/src/PVE/API2/Storage/Status.pm > +++ b/src/PVE/API2/Storage/Status.pm > @@ -578,6 +578,11 @@ __PACKAGE__->register_method({ > requires =3D> 'checksum-algorithm', > optional =3D> 1, > }, > + compression =3D> { > + description =3D> "The compression algorithm used to compress", > + type =3D> 'string', should probably be restricted via an enum schema (containing 'zstd', 'gz' and 'lzop' for now). > + optional =3D> 1, > + }, nit: I would at least the change the description to indicate that setting this means the downloaded file will be decompressed using this algorithm > 'checksum-algorithm' =3D> { > description =3D> "The algorithm to calculate the checksum of the file.= ", > type =3D> 'string', > @@ -642,14 +647,25 @@ __PACKAGE__->register_method({ > http_proxy =3D> $dccfg->{http_proxy}, > }; > =20 > - my ($checksum, $checksum_algorithm) =3D $param->@{'checksum', 'checksum= -algorithm'}; > + my ($checksum, $checksum_algorithm, $compression) =3D $param->@{'checks= um', 'checksum-algorithm', 'compression'}; > if ($checksum) { > $opts->{"${checksum_algorithm}sum"} =3D $checksum; > $opts->{hash_required} =3D 1; > } compression should only be allowed for isos (for now), since templates are *always* compressed at the moment, and uncompressing them while downloading would actually make them unusable AFAICT? > =20 > my $worker =3D sub { > - PVE::Tools::download_file_from_url("$path/$filename", $url, $opts); > + my $save_to =3D "$path/$filename"; > + die "refusing to override existing file $save_to \n" if -e $save_to= ; > + $save_to .=3D ".$compression" if $compression; > + PVE::Tools::download_file_from_url($save_to, $url, $opts); > + if($compression) > + { > + my $decrypton_error =3D PVE::Storage::decompress_iso($compression, $sa= ve_to); > + print $decrypton_error if $decrypton_error; > + unlink $save_to; > + > + > + } the decompression here could (or probably should) be moved into download_file_from_url (passing in the decompression command via $opts). this would also make the handling of the tmpfile easier, since now we have - a tempfile for the download - renamed to a tempfile for the decompression - uncompressed to final destination - intermediate tempfile needs manual cleanup if the decompression is moved into the helper (including any required cleanups), we are not concerned at all about the tmpfiles here, which would be nice(r) IMHO. other than that, there is a few things that could be improved: - code style (positioning of {, blank lines) - $decrypton -> there is no encryption happening here :) - error handling: -- errors are normally not returned, but passed up the stack via "die" -- the error should be propagated up so that the task fails and the user knows the download failed and why, see below for more details > }; > =20 > my $worker_id =3D PVE::Tools::encode_text($filename); # must not pass := or the like as w-ID > diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm > index b99ed35..0c62cc8 100755 > --- a/src/PVE/Storage.pm > +++ b/src/PVE/Storage.pm > @@ -1532,6 +1532,11 @@ sub decompressor_info { > lzo =3D> ['lzop', '-d', '-c'], > zst =3D> ['zstd', '-q', '-d', '-c'], > }, > + iso =3D> { > + gz =3D> ['zstd', '-q', '-d'], this might warrant a comment ;) > + zst =3D> ['zstd', '-q', '-d'], > + lzo =3D> ['lzop', '-q', '-d'], > + }, we did discuss this a bit already, but also posting here on list in case someone else has an opinion: while we possibly lose a bit of performance (although I doubt it really matters much for regular iso files), aligning the iso and vma commands here would simplify the filename handling - if the returned command simply decompresses the next pushed argument to stdout, we don't have to account for the peculiarities of each command with regards to automatic extension removal. we could also save writing the data to disk twice (compressed, and then uncompressed) if we add calculating the digest to the pipe, or at least do that for the non-verifying case. isos are commonly stored on NFS/CIFS shares, where this is even more expensive cause of the network round trips. > }; > =20 > die "ERROR: archive format not defined\n" > @@ -1611,6 +1616,23 @@ sub archive_auxiliaries_remove { > } > } > =20 > +sub decompress_iso > +{ nit: style again > + my ($compression, $file) =3D @_; > + > + my $raw =3D ''; > + my $out =3D sub { > + my $output =3D shift; > + $raw .=3D "$output\n"; > + }; > + > + my $info =3D decompressor_info('iso', $compression); > + my $decompressor =3D $info->{decompressor}; > + =20 > + run_command([@$decompressor, $file], outfunc =3D> $out); outfunc only captures STDOUT, not STDERR, so I don't think this part does what you want it to do (based on the naming of the '$decrypton_error' variable above).. there also is a 'logfunc' (capturing both) and an 'errfunc' (capturing only STDERR). also, not all output is necessarily an error, even in the face of `-q`. I think it should be safe to assume that any decompression tool we call here will fail run_command if the decompression fails for whatever reason, in which case decompress_iso here will die, but that is *not* handled at its call site, so no cleanup will happen and the compressed, downloaded file will remain... note that by default, run_command will just forward the command's output (on both FDs) to whatever those point to outside, so a plain run_command should do the right thing in most circumstances, and outfunc and friends are only needed if you actually need to do something (parse, filter, ..) with the output. > + return wantarray ? ($raw, undef) : $raw; why? we sometimes use wantarray, but only if it actually makes sense to either return one or multiple values, depending on call-site context.. I don't think it serves any purpose here ;) > +} > + > sub extract_vzdump_config_tar { > my ($archive, $conf_re) =3D @_; > =20 > --=20 > 2.39.2 >=20 >=20 >=20 > _______________________________________________ > pve-devel mailing list > pve-devel@lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel >=20 >=20 >=20