From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <t.lamprecht@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id DE0EC74140
 for <pve-devel@lists.proxmox.com>; Mon, 21 Jun 2021 09:30:38 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id CB7DF187C0
 for <pve-devel@lists.proxmox.com>; Mon, 21 Jun 2021 09:30:08 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 40454187B0
 for <pve-devel@lists.proxmox.com>; Mon, 21 Jun 2021 09:30:07 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 0D27540C0A
 for <pve-devel@lists.proxmox.com>; Mon, 21 Jun 2021 09:30:07 +0200 (CEST)
Message-ID: <9a39a8ae-a865-57ba-a2bd-c0e8a1ce1ac6@proxmox.com>
Date: Mon, 21 Jun 2021 09:29:57 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:90.0) Gecko/20100101
 Thunderbird/90.0
Content-Language: en-US
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Lorenz Stechauner <l.stechauner@proxmox.com>
References: <20210616093604.33668-1-l.stechauner@proxmox.com>
 <20210616093604.33668-4-l.stechauner@proxmox.com>
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
In-Reply-To: <20210616093604.33668-4-l.stechauner@proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.761 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pve-devel] [PATCH v9 storage 1/1] status: add download_url
 method
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Mon, 21 Jun 2021 07:30:38 -0000

On 16.06.21 11:35, Lorenz Stechauner wrote:
> uses common function PVE::Tools::download_file_from_url to download
> iso files.
> 
> Only users with permissions `Sys.Audit` and `Sys.Modify` on `/` are
> permitted to perform this action. This restriction is due to the
> fact, that the download function is able to download files from
> internal networks (which are not visible/accessible from outside).
> Users with these permissions anyway have the means to alter node
> (network) config, so this does not create any further security risk.
> 
> Signed-off-by: Lorenz Stechauner <l.stechauner@proxmox.com>
> ---
>  PVE/API2/Storage/Status.pm | 128 +++++++++++++++++++++++++++++++++++--
>  PVE/Storage.pm             |  10 +++
>  2 files changed, 133 insertions(+), 5 deletions(-)
> 
> diff --git a/PVE/API2/Storage/Status.pm b/PVE/API2/Storage/Status.pm
> index 897b4a7..584ba5d 100644
> --- a/PVE/API2/Storage/Status.pm
> +++ b/PVE/API2/Storage/Status.pm
> @@ -412,11 +412,7 @@ __PACKAGE__->register_method ({
>  	my $size = -s $tmpfilename;
>  	die "temporary file '$tmpfilename' does not exist\n" if !defined($size);
>  
> -	my $filename = $param->{filename};
> -
> -	chomp $filename;
> -	$filename =~ s/^.*[\/\\]//;
> -	$filename =~ s/[^-a-zA-Z0-9_.]/_/g;
> +	my $filename = PVE::Storage::normalize_content_filename($param->{filename});

nit/no hard feelings: factoring out normalize_content_filename could have been it's own
patch. IMO for re-using existing code that is nicer, when adding new code they can be
one patch too, if so small like here, as then it can be seen as part of the "single thing"
that patch does.

>  
>  	my $path;
>  
> @@ -497,4 +493,126 @@ __PACKAGE__->register_method ({
>  	return $upid;
>     }});
>  
> +__PACKAGE__->register_method({
> +    name => 'download_url',
> +    path => '{storage}/download-url',

you forgot to add this path to the index endpoint, making it basically a hidden
method..

We probably should assert this in the schema verifier to avoid, as the file-restore
from Stefan also forgot this.

> +    method => 'POST',
> +    description => "Download templates and ISO images by using an URL.",
> +    proxyto => 'node',
> +    permissions => {
> +	check => [ 'and',
> +	    ['perm', '/storage/{storage}', [ 'Datastore.AllocateTemplate' ]],
> +	    ['perm', '/', [ 'Sys.Audit', 'Sys.Modify' ]],
> +	],
> +    },
> +    protected => 1,
> +    parameters => {
> +	additionalProperties => 0,
> +	properties => {
> +	    node => get_standard_option('pve-node'),
> +	    storage => get_standard_option('pve-storage-id'),
> +	    url => {
> +		description => "The URL to download the file from.",
> +		type => 'string',
> +		pattern => 'https?://.*',
> +	    },
> +	    content => {
> +		description => "Content type.",
> +		type => 'string', format => 'pve-storage-content',
> +	    },
> +	    filename => {
> +		description => "The name of the file to create.",

does not mention that it will be normalized, i.e., possibly altered. IMO this is important
as else the user may not find their file again if looking for the original name.

I'd also rather limit this to what we can actually return in the content API call, I mean you
already enforce a implicit limit in the code, so why not encode the union of the allowed
file endings below? ideally using factored out, shared regexes for the ending.


> +		type => 'string',
> +	    },
> +	    checksum => {
> +		description => "The expected checksum of the file.",
> +		type => 'string',
> +		requires => 'checksum-algorithm',
> +		optional => 1,
> +	    },
> +	    'checksum-algorithm' => {
> +		description => "The algorithm to calculate the checksum of the file.",
> +		type => 'string',
> +		enum => ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512'],
> +		requires => 'checksum',
> +		optional => 1,
> +	    },
> +	    'verify-certificates' => {
> +		description => "If false, no SSL/TLS certificates will be verified.",
> +		type => 'boolean',
> +		optional => 1,
> +		default => 1,
> +	    }
> +	},
> +    },
> +    returns => {
> +	type => "string"
> +    },
> +    code => sub {
> +	my ($param) = @_;
> +
> +	my $cfg = PVE::Storage::config();
> +
> +	my ($node, $storage) = $param->@{'node', 'storage'};
> +	my $scfg = PVE::Storage::storage_check_enabled($cfg, $storage, $node);
> +
> +	die "can't upload to storage type '$scfg->{type}, not a file based storage!'\n" if !defined($scfg->{path});
> +
> +	my ($content, $url) = $param->@{'content', 'url'};
> +
> +	my $filename = PVE::Storage::normalize_content_filename($param->{filename});
> +	my $path;
> +
> +	# MIME type is checked in front end only
> +	# this check is omitted here intentionally and replaced by file extension check
> +	if ($content eq 'iso') {
> +	    if ($filename !~ m![^/]+$PVE::Storage::iso_extension_re$!) {
> +		raise_param_exc({ filename => "missing '.iso' or '.img' extension" });
> +	    }
> +	    $path = PVE::Storage::get_iso_dir($cfg, $storage);
> +	} elsif ($content eq 'vztmpl') {
> +	    if ($filename !~ m![^/]+\.tar\.[gx]z$!) {

we npwadays also support .zst (zstd), and .vma is missing too, please just avoid reinventing
regexes, especially if not really closly checking what it should cover, rather use an existing
one, which sometimes needs to be factored out first, `path_to_volume_id`

> +		raise_param_exc({ filename => "missing '.tar.gz' or '.tar.xz' extension" });
> +	    }
> +	    $path = PVE::Storage::get_vztmpl_dir($cfg, $storage);
> +	} else {
> +	    raise_param_exc({ content => "upload content type '$content' not allowed" });
> +	}
> +
> +	die "storage '$storage' does not support '$content' content\n" if !$scfg->{content}->{$content};

"does not support" is not necesarrily true, the storage could support it but an admin
could have disabled it, or forgot to enable it, so rather something along the lines of:

"storage '$storage' is not configured for content-type '$content'\n" if !$scfg->{content}->{$content};

> +
> +	PVE::Storage::activate_storage($cfg, $storage);
> +	File::Path::make_path($path);
> +
> +	my $dest = "$path/$filename";
> +
> +	my $opts = {
> +	    hash_required => 0,
> +	    verify_certificates => $param->{'verify-certificates'} // 1,
> +	};
> +
> +	my ($checksum, $checksum_algorithm) = $param->@{'checksum', 'checksum-algorithm'};
> +	if ($checksum) {
> +	    $opts->{"${checksum_algorithm}sum"} = $checksum;
> +	    $opts->{hash_required} = 1;
> +	}
> +
> +	my $dccfg = PVE::Cluster::cfs_read_file('datacenter.cfg');
> +	if ($dccfg->{http_proxy}) {
> +	    $opts->{http_proxy} = $dccfg->{http_proxy};
> +	}
> +
> +	my $worker = sub {
> +	    my $upid = shift;

unused variable.

> +	    PVE::Tools::download_file_from_url($dest, $url, $opts);
> +	};
> +
> +	my $rpcenv = PVE::RPCEnvironment::get();
> +	my $user = $rpcenv->get_user();

nit and not too hard feelings, but basically all existing API methods have above two
lines rather at the start of the  API endpoint's code definition.

> +
> +	my $upid = $rpcenv->fork_worker('download', $filename, $user, $worker);

> +
> +	return $upid;
> +    }});
> +
>  1;
> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
> index aa36bad..f9a3d49 100755
> --- a/PVE/Storage.pm
> +++ b/PVE/Storage.pm
> @@ -1929,4 +1929,14 @@ sub assert_sid_unused {
>      return undef;
>  }
>  

the method below is small, but a small comment about what normalize means here
could still be nice as regex can change their meaning a lot by subtle changes.

> +sub normalize_content_filename {
> +    my ($filename) = @_;
> +
> +    chomp $filename;
> +    $filename =~ s/^.*[\/\\]//;
> +    $filename =~ s/[^-a-zA-Z0-9_.]/_/g;
> +
> +    return $filename;
> +}
> +
>  1;
>