From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id B301E72444 for ; Tue, 15 Jun 2021 14:25:48 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id A74202E3EA for ; Tue, 15 Jun 2021 14:25:48 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 913BA2E3DC for ; Tue, 15 Jun 2021 14:25:47 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 6519743FEB for ; Tue, 15 Jun 2021 14:25:47 +0200 (CEST) Message-ID: <29c62c46-8790-99ef-f089-3736377ef0f5@proxmox.com> Date: Tue, 15 Jun 2021 14:25:40 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:90.0) Gecko/20100101 Thunderbird/90.0 Content-Language: en-US To: Proxmox VE development discussion , Lorenz Stechauner References: <20210614090557.33455-1-l.stechauner@proxmox.com> <20210614090557.33455-2-l.stechauner@proxmox.com> From: Thomas Lamprecht In-Reply-To: <20210614090557.33455-2-l.stechauner@proxmox.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-SPAM-LEVEL: Spam detection results: 0 AWL 0.923 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [proxmox.com, tools.pm] Subject: [pve-devel] applied: [PATCH v7 common 1/1] tools: add download_file_from_url X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jun 2021 12:25:48 -0000 On 14.06.21 11:05, Lorenz Stechauner wrote: > code is based on > manager:PVE/API2/Nodes.pm:aplinfo > applied, with a slightly adapted commit message you send afterwards and some followups. I'm a bit sorry to not check on this more closely again earlier as I found quite some issues when finally wanting to apply this, I fixed up most of them in a followup but the remaining part of the series needs to adapt. > Signed-off-by: Lorenz Stechauner > --- > src/PVE/Tools.pm | 124 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 124 insertions(+) > > diff --git a/src/PVE/Tools.pm b/src/PVE/Tools.pm > index 16ae3d2..7b82e00 100644 > --- a/src/PVE/Tools.pm > +++ b/src/PVE/Tools.pm > @@ -1829,4 +1829,128 @@ sub safe_compare { > return $cmp->($left, $right); > } > > + > +# opts > +# -> hash_required > +# if 1, at least one checksum has to be specified otherwise an error will be thrown > +# -> http_proxy > +# -> https_proxy > +# -> verify_certificates > +# -> sha(1|224|256|384|512)sum > +# -> md5sum > +sub download_file_from_url { > + my ($dest, $url, $opts) = @_; > + > + my $tmpdest = "$dest.tmp.$$"; If we'd kept the fork_worker here (see below) this wouldn't be ideal, as now it's using the pveproxy worker pid, which can be shared by multiple in-flight requests, I'd have moved it down into the worker sub which is actually it's own process and has a more unique pid.. > + > + my $algorithm; > + my $expected; > + prefixed above two variables with "checksum_" to avoid over general names in longer methods. > + for ('sha512', 'sha384', 'sha256', 'sha224', 'sha1', 'md5') { > + if (defined($opts->{"${_}sum"})) { > + $algorithm = $_; > + $expected = $opts->{"${_}sum"}; > + last; > + } > + } > + > + die "checksum required but not specified\n" if ($opts->{hash_required} && !$algorithm); > + > + my $worker = sub { > + my $upid = shift; > + > + print "downloading $url to $dest\n"; > + > + eval { > + if (-f $dest && $algorithm) { > + print "calculating checksum of existing file...\n"; > + my $correct = check_file_hash($algorithm, $expected, $dest); > + > + if ($correct) { > + print "file already exists, no need to download\n"; > + return; > + } else { > + print "mismatch, downloading\n"; made a die "abort" out of above, IMO this is not really safe, admin should delete the existing file first explicitly (we may relax this in the future if actually requested by users, but as default I'd like to start out with safer behavior). > + } > + } > + > + my @cmd = ('/usr/bin/wget', '--progress=dot:mega', '-O', $tmpdest, $url); Changed to a array ref and dropped the /usr/bin/, which may be wrong (in theory) on some systems. > + > + local %ENV; > + if ($opts->{http_proxy}) { > + $ENV{http_proxy} = $opts->{http_proxy}; > + } > + if ($opts->{https_proxy}) { > + $ENV{https_proxy} = $opts->{https_proxy}; > + } > + > + my $verify = $opts->{verify_certificates} // 1; > + if (!$verify) { > + push @cmd, '--no-check-certificate'; > + } > + > + if (run_command([[@cmd]]) != 0) { > + die "download failed: $!\n"; > + } why double array refs? I changed above three lines to: run_command($cmd, errmsg => "download failed"); > + > + if ($algorithm) { > + print "calculating checksum...\n"; > + > + my $correct = check_file_hash($algorithm, $expected, $tmpdest); > + > + if ($correct) { > + print "checksum verified\n"; not telling what was wrong is never good, i.e., the calculated checksum should be printed here. I made 'check_file_hash' to 'get_file_hash' and let it just return the computed digest, that way we do not need to pass expected either and just check ourself. > + } else { > + die "checksum mismatch\n"; > + } > + } else { > + print "no checksum for verification specified\n"; nit: just noise, makes the task log unnecessarily longer > + } > + > + if (!rename($tmpdest, $dest)) { > + die "unable to save file: $!\n"; nit: it's already saved, so this error is misleading, I changed it to "unable to rename temporary file: $!" to better reflect where the actual issues happened. > + } > + }; > + my $err = $@; > + > + unlink $tmpdest; this can fail too, and we should warn the admin about that > + > + if ($err) { > + print "\n"; > + die $err; > + } > + > + print "download finished\n"; > + }; > + > + my $rpcenv = PVE::RPCEnvironment::get(); you use PVE::RPCEnvironment here but are missing an use statement, it works only as other modules in pveproxy/pvedaemon include it and perl has no usage-hygiene. But it cannot be "fixed" by just using this here, there's a reason that we have no other fork_worker/RPCEnv use in PVE::Tools, PVE::RPCEnvironment is in pve-access-control, which is a consumer of pve-common, so adding this here would create a circular dependency which we want to avoid at all costs (makes bootstrapping a huge PITA without any benefit). So I dropped the whole worker stuff, that needs to happen at the caller side. > + my $user = $rpcenv->get_user(); > + > + (my $filename = $dest) =~ s!.*/([^/]*)$!$1!; not really safe, would break quite a bit if the filename contains a colon : or white-space, this needs to be encoded > + > + return $rpcenv->fork_worker('download', $filename, $user, $worker); > +} > + > +sub check_file_hash { > + my ($algorithm, $expected, $filename) = @_; > + > + my $algorithm_map = { > + 'md5' => sub { Digest::MD5->new }, > + 'sha1' => sub { Digest::SHA->new(1) }, > + 'sha224' => sub { Digest::SHA->new(224) }, > + 'sha256' => sub { Digest::SHA->new(256) }, > + 'sha384' => sub { Digest::SHA->new(384) }, > + 'sha512' => sub { Digest::SHA->new(512) }, You use the Digest:: modules but never add an import use statement for that, meaning again that it's up to luck (i.e., some other module imports it) if this works.. E.g., the following minimal test does not work even if the worker stuff is dropped: perl -we 'use PVE::Tools; PVE::Tools::download_file_from_url("/tmp/foo", "http://download.proxmox.com/iso/proxmox-ve_3.4-102d4547-6.iso", {md5sum => "37efacd45b70d5d720b11b468f26136b"});' > + }; > + > + my $digester = $algorithm_map->{$algorithm}->() or die "unknown algorithm '$algorithm'\n"; > + > + open(my $fh, '<', $filename) or die "unable to open '$filename': $!\n"; > + binmode($fh); > + > + my $digest = $digester->addfile($fh)->hexdigest; > + > + return lc($digest) eq lc($expected); > +} > + > 1; >