From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id E28C179359 for ; Tue, 4 May 2021 11:32:09 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id CFE82268F5 for ; Tue, 4 May 2021 11:31:39 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 19D5A268E4 for ; Tue, 4 May 2021 11:31:38 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id E070F42133 for ; Tue, 4 May 2021 11:31:37 +0200 (CEST) Message-ID: <6cd04728-b25d-0456-4f34-24536ae4d8ce@proxmox.com> Date: Tue, 4 May 2021 11:31:36 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:89.0) Gecko/20100101 Thunderbird/89.0 Content-Language: en-US To: Proxmox VE development discussion , Lorenz Stechauner References: <20210504085530.36961-1-l.stechauner@proxmox.com> <20210504085614.37162-1-l.stechauner@proxmox.com> From: Thomas Lamprecht In-Reply-To: <20210504085614.37162-1-l.stechauner@proxmox.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-SPAM-LEVEL: Spam detection results: 0 AWL 0.005 Adjusted score from AWL reputation of From: address KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment NICE_REPLY_A -0.001 Looks like a legit reply (A) SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [status.pm, storage.pm] Subject: Re: [pve-devel] [PATCH v3 storage 1/2] fix #1710: add retrieve method for storage X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 May 2021 09:32:09 -0000 On 04.05.21 10:56, Lorenz Stechauner wrote: > Users are now able to download/retrieve any .iso/... file onto their > storages and verify file integrity with checksums. > > Signed-off-by: Lorenz Stechauner > --- > PVE/API2/Storage/Status.pm | 158 +++++++++++++++++++++++++++++++++++-- > PVE/Storage.pm | 10 +++ > 2 files changed, 163 insertions(+), 5 deletions(-) > > diff --git a/PVE/API2/Storage/Status.pm b/PVE/API2/Storage/Status.pm > index 897b4a7..8388714 100644 > --- a/PVE/API2/Storage/Status.pm > +++ b/PVE/API2/Storage/Status.pm > @@ -5,6 +5,8 @@ use warnings; > > use File::Path; > use File::Basename; > +use HTTP::Request; > +use LWP::UserAgent; > use PVE::Tools; > use PVE::INotify; > use PVE::Cluster; > @@ -412,11 +414,7 @@ __PACKAGE__->register_method ({ > my $size = -s $tmpfilename; > die "temporary file '$tmpfilename' does not exist\n" if !defined($size); > > - my $filename = $param->{filename}; > - > - chomp $filename; > - $filename =~ s/^.*[\/\\]//; > - $filename =~ s/[^-a-zA-Z0-9_.]/_/g; > + my $filename = PVE::Storage::normalize_content_filename($param->{filename}); > > my $path; > > @@ -497,4 +495,154 @@ __PACKAGE__->register_method ({ > return $upid; > }}); > > +__PACKAGE__->register_method({ > + name => 'retrieve', > + path => '{storage}/retrieve', over general path/method name, I'd either use "retrieve-url" or use "download-url" > + method => 'POST', > + description => "Download templates and ISO images by using an URL.", > + proxyto => 'node', > + permissions => { > + check => [ 'and', > + ['perm', '/storage/{storage}', [ 'Datastore.AllocateTemplate' ]], > + ['perm', '/', [ 'Sys.Audit', 'Sys.Modify' ]], > + ], > + }, > + protected => 1, > + parameters => { > + additionalProperties => 0, > + properties => { > + node => get_standard_option('pve-node'), > + storage => get_standard_option('pve-storage-id'), > + url => { > + description => "The URL to retrieve the file from.", > + type => 'string', > + }, > + content => { > + description => "Content type.", auto-detect could be done in general? At least for a few whitelisted types (e.g., .iso), but no hard feelings here, we can always make this more flexible in the future > + type => 'string', format => 'pve-storage-content', > + }, > + filename => { > + description => "The name of the file to create. Alternatively the file name given by the server will be used.", You mention "alternatively", but this parameter is not optional? > + type => 'string', > + }, > + checksum => { > + description => "The expected checksum of the file.", > + type => 'string', > + requires => 'checksumalg', > + optional => 1, > + }, > + checksumalg => { weird name? I really do not get the need to abbreviate everything in some custom way. APIs parameter should speak for them self, at least somewhat, just use "checksum-algorithm" > + description => "The algorithm to claculate the checksum of the file.", typo s/claculate/calculate/ > + type => 'string', enum => ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512'], The enum goes in its own line > + requires => 'checksum', > + optional => 1, > + }, > + insecure => { > + description => "Allow TLS certificates to be invalid.", Rather use "verify-certificates" and default to 1 (true). > + type => 'boolean', > + optional => 1, > + } > + }, > + }, > + returns => { type => "string" }, use returns => { type => "string", }, > + code => sub { > + my ($param) = @_; > + > + my $rpcenv = PVE::RPCEnvironment::get(); > + unnecessary extra newline > + my $user = $rpcenv->get_user(); > + > + my $cfg = PVE::Storage::config(); > + > + my $node = $param->{node}; > + my $storage = $param->{storage}; > + my $scfg = PVE::Storage::storage_check_enabled($cfg, $storage, $node); > + > + die "can't upload to storage type '$scfg->{type}'" 1. Without a \n at then perl will add the line + module to that error message, not nice for stuff which is not internal (i.e., assert-like). 2. normally its considered good to add some reason/explanation to an error, so that user can know what needs to actually change to make it work. Maybe: "cannot upload to storage '$storage' (type '$scfg->{type}'), not a file based storage!\n" > + if !defined($scfg->{path}); > + > + my $content = $param->{content}; > + > + my $url = $param->{url}; optimizer nit, above three lines could be: my ($content, $url) = $param->@{'content', 'url'}; > + > + die "invalid https or http url" > + if $url !~ qr!^https?://!; we normally place short post-if's into the same line as the statement it guards > + > + my $checksum = $param->{checksum}; > + my $hash_alg = $param->{checksumalg}; consistency please, if the param is named checksum*something then the variable should be named similar (hash vs. checksum here). Also, in general I'd like to see some locality between variable declaration and use > + > + my $filename = PVE::Storage::normalize_content_filename($param->{filename}); > + > + my $path; > + > + # MIME type is checked in front end only > + # this check is omitted here intentionally and replaced by file extension check > + if ($content eq 'iso') { > + if ($filename !~ m![^/]+$PVE::Storage::iso_extension_re$!) { > + raise_param_exc({ filename => "missing '.iso' or '.img' extension" }); > + } > + $path = PVE::Storage::get_iso_dir($cfg, $storage); > + } elsif ($content eq 'vztmpl') { > + if ($filename !~ m![^/]+\.tar\.[gx]z$!) { > + raise_param_exc({ filename => "missing '.tar.gz' or '.tar.xz' extension" }); > + } > + $path = PVE::Storage::get_vztmpl_dir($cfg, $storage); > + } else { > + raise_param_exc({ content => "upload content type '$content' not allowed" }); > + } > + > + die "storage '$storage' does not support '$content' content" > + if !$scfg->{content}->{$content}; > + > + my $dest = "$path/$filename"; > + > + PVE::Storage::activate_storage($cfg, $storage); > + File::Path::make_path($path); > + > + # -L follows redirects > + # -f silent fail on error why silently failing? wouldn't the error be interesting, should not have to big implications now that this call requires elevated privs, or? > + my @curlcmd = ('curl', '-L', '-o', $dest, '-f'); we use wget with --progress=dot:mega for the appliances, so why curl here? In general I ask myself if there would be some common helper possible for both, appliances and this here, to avoid code duplication, did you looked into that. > + push @curlcmd, '--insecure' > + if $param->{insecure}; > + > + my $cmd = [@curlcmd, $url]; $cmd is overly general > + > + my $cmd_check = [ [ 'echo', $checksum, $dest ], [ "${hash_alg}sum", '-c' ] ]; > + my $cmd_check_flat = [ 'echo', $checksum, $dest, '|', "${hash_alg}sum", '-c' ]; # only used for logging yeah, no, we don't do that, the log command is at risk to get outdated without noticing. Either auto generated it from the $cmd_check or omit in in general (i.e., by just logging the checksum algorithm used. Also why do we rely on the external tools if we have usage of the perl Digest::{MD5,SHA256,..} quite some times already? > + > + my $worker = sub { > + my $upid = shift; > + > + print "starting file download from: $url\n"; > + print "target node: $node\n"; omit above node info, it's already in the task UPID > + print "target storage: $storage\n"; isn't that to in the task UPID? > + print "target file: $dest\n"; > + print "command: " . join(' ', @$cmd) . "\n"; if use PVE::Tools shellquote or the like, but actually I'd omit that... The five prints could be print "download '$url' to $storage as $filename\n"; further advantage, that does not leaks that much info, e.g., the full path may not be known even for those with sys.audit. > + > + eval { PVE::Tools::run_command($cmd, errmsg => 'download failed'); }; > + if (my $err = $@) { > + unlink $dest; check the unlink, for a single file you can use a simple or: unlink $dest or warn "could not cleanup '$dest' - $!\n" > + die $err; > + } > + print "finished file download successfully\n"; > + > + if ($checksum) { > + print "validating checksum...\n"; > + print "expected $hash_alg checksum is: $checksum\n"; quite chatty, above could go into one line. Information is good to have in a task log, but it should be concise to avoid drowning the relevant in noise. > + print cmd_check print "checksum validation command: " . join(' ', @$cmd_check_flat) . "\n"; > + > + eval { PVE::Tools::run_command($cmd_check, errmsg => 'checksum mismatch'); }; this all is racy, i.e., if something aborts/kills the task or above hangs we have a time window where the downloaded file already shows up as valid (but not yet checked!) one at the storage. Use a temporary file ending (we have some use of those in various places, basically something like .tmp.$$) and do a rename only at the end. > + if (my $err = $@) { > + unlink $dest; > + die $err; > + } > + print "validated checksum successfully\n"; > + } > + }; > + > + my $upid = $rpcenv->fork_worker('imgdownload', $storage, $user, $worker); > + > + return $upid; > + }}); > + > 1; > diff --git a/PVE/Storage.pm b/PVE/Storage.pm > index 122c3e9..d57fd43 100755 > --- a/PVE/Storage.pm > +++ b/PVE/Storage.pm > @@ -1931,4 +1931,14 @@ sub assert_sid_unused { > return undef; > } > > +sub normalize_content_filename { > + my ($filename) = @_; > + > + chomp $filename; > + $filename =~ s/^.*[\/\\]//; > + $filename =~ s/[^-a-zA-Z0-9_.]/_/g; > + > + return $filename; > +} > + > 1; >