From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id E1F1C1FF37F for ; Wed, 17 Apr 2024 12:53:12 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 5FE713608; Wed, 17 Apr 2024 12:53:11 +0200 (CEST) Message-ID: Date: Wed, 17 Apr 2024 12:52:36 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird To: Proxmox VE development discussion , Dominik Csapak References: <20240416131909.2867605-1-d.csapak@proxmox.com> <20240416131909.2867605-4-d.csapak@proxmox.com> Content-Language: en-US From: Fiona Ebner In-Reply-To: <20240416131909.2867605-4-d.csapak@proxmox.com> X-SPAM-LEVEL: Spam detection results: 0 AWL -0.070 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [storage.pm, dmtf.org, plugin.pm, ovf.pm, status.pm, dirplugin.pm] Subject: Re: [pve-devel] [PATCH storage 3/9] plugin: dir: handle ova files for import X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Proxmox VE development discussion Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: pve-devel-bounces@lists.proxmox.com Sender: "pve-devel" Am 16.04.24 um 15:18 schrieb Dominik Csapak: > since we want to handle ova files (which are only ovf+vmdks bundled in a > tar file) for import, add code that handles that. > > we introduce a valid volname for files contained in ovas like this: > > storage:import/archive.ova/disk-1.vmdk > > by basically treating the last part of the path as the name for the > contained disk we want. > > we then provide 3 functions to use for that: > > * copy_needs_extraction: determines from the given volid (like above) if > that needs extraction to copy it, currently only 'import' vtype + > defined format returns true here (if we have more options in the > future, we can of course easily extend that) > > * extract_disk_from_import_file: this actually extracts the file from > the archive. Currently only ova is supported, so the extraction with > 'tar' is hardcoded, but again we can easily extend/modify that should > we need to. > > we currently extract into the import storage in a directory named: > `.tmp__` which should not clash with concurrent > operations (though we do extract it multiple times then) > Could we do "extract upon upload", "tar upon download" instead? Sure some people surely want to drop the ova manually, but we could tell them they need to extract it first too. Depending on the amount of headache this would save us, it might be worth it. > alternatively we could implement either a 'tmpstorage' parameter, > or use e.g. '/var/tmp/' or similar, but re-using the current storage > seemed ok. > > * cleanup_extracted_image: intended to cleanup the extracted images from > above, including the surrounding temporary directory > > we have to modify the `parse_ovf` a bit to handle the missing disk > images, and we parse the size out of the ovf part (since this is > informal only, it should be no problem if we cannot parse it sometimes) > > Signed-off-by: Dominik Csapak > --- > src/PVE/API2/Storage/Status.pm | 1 + > src/PVE/Storage.pm | 59 ++++++++++++++++++++++++++++++++++ > src/PVE/Storage/DirPlugin.pm | 13 +++++++- > src/PVE/Storage/OVF.pm | 53 ++++++++++++++++++++++++++---- > src/PVE/Storage/Plugin.pm | 5 +++ > 5 files changed, 123 insertions(+), 8 deletions(-) > > diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status.pm > index f7e324f..77ed57c 100644 > --- a/src/PVE/API2/Storage/Status.pm > +++ b/src/PVE/API2/Storage/Status.pm > @@ -749,6 +749,7 @@ __PACKAGE__->register_method({ > 'efi-state-lost', > 'guest-is-running', > 'nvme-unsupported', > + 'ova-needs-extracting', > 'ovmf-with-lsi-unsupported', > 'serial-port-socket-only', > ], > diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm > index f8ea93d..bc073ef 100755 > --- a/src/PVE/Storage.pm > +++ b/src/PVE/Storage.pm > @@ -2189,4 +2189,63 @@ sub get_import_metadata { > return $plugin->get_import_metadata($scfg, $volname, $storeid); > } > Shouldn't the following three functions call into plugin methods instead? That'd seem much more future-proof to me. > +sub copy_needs_extraction { > + my ($volid) = @_; > + my ($storeid, $volname) = parse_volume_id($volid); > + my $cfg = config(); > + my $scfg = storage_config($cfg, $storeid); > + my $plugin = PVE::Storage::Plugin->lookup($scfg->{type}); > + > + my ($vtype, $name, $vmid, $basename, $basevmid, $isBase, $file_format) = > + $plugin->parse_volname($volname); > + > + return $vtype eq 'import' && defined($file_format); E.g this seems rather hacky, and puts a weird coupling on a future import plugin's parse_volname() function (presence of $file_format). > +} > + > +sub extract_disk_from_import_file { > + my ($volid, $vmid) = @_; > + > + my ($storeid, $volname) = parse_volume_id($volid); > + my $cfg = config(); > + my $scfg = storage_config($cfg, $storeid); > + my $plugin = PVE::Storage::Plugin->lookup($scfg->{type}); > + > + my ($vtype, $name, undef, undef, undef, undef, $file_format) = > + $plugin->parse_volname($volname); > + > + die "only files with content type 'import' can be extracted\n" > + if $vtype ne 'import' || !defined($file_format); > + > + # extract the inner file from the name > + if ($volid =~ m!${name}/([^/]+)$!) { > + $name = $1; > + } > + > + my ($source_file) = $plugin->path($scfg, $volname, $storeid); > + > + my $destdir = $plugin->get_subdir($scfg, 'import'); > + my $pid = $$; > + $destdir .= "/.tmp_${pid}_${vmid}"; > + mkdir $destdir; > + > + ($source_file) = $source_file =~ m|^(/.*)|; # untaint > + > + run_command(['tar', '-x', '-C', $destdir, '-f', $source_file, $name]); > + > + return "$destdir/$name"; > +} > + > +sub cleanup_extracted_image { > + my ($source) = @_; > + > + if ($source =~ m|^(/.+/\.tmp_[0-9]+_[0-9]+)/[^/]+$|) { > + my $tmpdir = $1; > + > + unlink $source or $! == ENOENT or die "removing image $source failed: $!\n"; > + rmdir $tmpdir or $! == ENOENT or die "removing tmpdir $tmpdir failed: $!\n"; > + } else { > + die "invalid extraced image path '$source'\n"; > + } > +} > + > 1; > diff --git a/src/PVE/Storage/DirPlugin.pm b/src/PVE/Storage/DirPlugin.pm > index 4dc7708..50ceab7 100644 > --- a/src/PVE/Storage/DirPlugin.pm > +++ b/src/PVE/Storage/DirPlugin.pm > @@ -260,14 +260,25 @@ sub get_import_metadata { > # NOTE: all types must be added to the return schema of the import-metadata API endpoint > my $warnings = []; > > + my $isOva = 0; > + if ($path =~ m!\.ova!) { Would be nicer if parse_volname() would return the $file_format and we chould check for that. Also missing the $ in the regex, so you'd mismatch a weird filename like ABC.ovaXYZ.ovf or? > + $isOva = 1; > + push @$warnings, { type => 'ova-needs-extracting' }; > + } > my $res = PVE::Storage::OVF::parse_ovf($path, $isOva); > my $disks = {}; > for my $disk ($res->{disks}->@*) { > my $id = $disk->{disk_address}; > my $size = $disk->{virtual_size}; > my $path = $disk->{relative_path}; > + my $volid; > + if ($isOva) { > + $volid = "$storeid:$volname/$path"; > + } else { > + $volid = "$storeid:import/$path", > + } > $disks->{$id} = { > - volid => "$storeid:import/$path", > + volid => $volid, > defined($size) ? (size => $size) : (), > }; > } > diff --git a/src/PVE/Storage/OVF.pm b/src/PVE/Storage/OVF.pm > index 4a322b9..fb850a8 100644 > --- a/src/PVE/Storage/OVF.pm > +++ b/src/PVE/Storage/OVF.pm > @@ -85,11 +85,37 @@ sub id_to_pve { > } > } > > +# technically defined in DSP0004 (https://www.dmtf.org/dsp/DSP0004) as an ABNF > +# but realistically this always takes the form of 'bytes * base^exponent' The comment wrongly says 'bytes' instead of 'byte' (your test examples confirm this). > +sub try_parse_capacity_unit { > + my ($unit_text) = @_; > + > + if ($unit_text =~ m/^\s*byte\s*\*\s*([0-9]+)\s*\^\s*([0-9]+)\s*$/) { Fun regex :P > + my $base = $1; > + my $exp = $2; > + return $base ** $exp; > + } > + > + return undef; > +} > + (...) > diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm > index deaf8b2..ea069ab 100644 > --- a/src/PVE/Storage/Plugin.pm > +++ b/src/PVE/Storage/Plugin.pm > @@ -654,6 +654,11 @@ sub parse_volname { > return ('backup', $fn); > } elsif ($volname =~ m!^snippets/([^/]+)$!) { > return ('snippets', $1); > + } elsif ($volname =~ m!^import/([^/]+\.ova)\/([^/]+)$!) { > + my $archive = $1; > + my $file = $2; > + my (undef, $format, undef) = parse_name_dir($file); > + return ('import', $archive, 0, undef, undef, undef, $format); So we return the same $name for different things here? Not super happy with that either. If we were to get creative we could say the archive is the "base" of the image, but surely also comes with caveats. > } elsif ($volname =~ m!^import/([^/]+$PVE::Storage::IMPORT_EXT_RE_1)$!) { > return ('import', $1); > } elsif ($volname =~ m!^import/([^/]+\.(raw|vmdk|qcow2))$!) { _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel