From: Dominik Csapak <d.csapak@proxmox.com>
To: "Proxmox VE development discussion" <pve-devel@lists.proxmox.com>,
"Fabian Grünbichler" <f.gruenbichler@proxmox.com>
Subject: Re: [pve-devel] [PATCH storage 3/9] plugin: dir: handle ova files for import
Date: Wed, 17 Apr 2024 15:10:01 +0200 [thread overview]
Message-ID: <3797837f-2411-4472-857c-120dd2242b06@proxmox.com> (raw)
In-Reply-To: <1713350290.jfv7fnie9d.astroid@yuna.none>
On 4/17/24 14:45, Fabian Grünbichler wrote:
> On April 16, 2024 3:18 pm, Dominik Csapak wrote:
>> since we want to handle ova files (which are only ovf+vmdks bundled in a
>> tar file) for import, add code that handles that.
>>
>> we introduce a valid volname for files contained in ovas like this:
>>
>> storage:import/archive.ova/disk-1.vmdk
>>
>> by basically treating the last part of the path as the name for the
>> contained disk we want.
>>
>> we then provide 3 functions to use for that:
>>
>> * copy_needs_extraction: determines from the given volid (like above) if
>> that needs extraction to copy it, currently only 'import' vtype +
>> defined format returns true here (if we have more options in the
>> future, we can of course easily extend that)
>>
>> * extract_disk_from_import_file: this actually extracts the file from
>> the archive. Currently only ova is supported, so the extraction with
>> 'tar' is hardcoded, but again we can easily extend/modify that should
>> we need to.
>>
>> we currently extract into the import storage in a directory named:
>> `.tmp_<pid>_<targetvmid>` which should not clash with concurrent
>> operations (though we do extract it multiple times then)
>>
>> alternatively we could implement either a 'tmpstorage' parameter,
>> or use e.g. '/var/tmp/' or similar, but re-using the current storage
>> seemed ok.
>>
>> * cleanup_extracted_image: intended to cleanup the extracted images from
>> above, including the surrounding temporary directory
>
> the helpers could also all live in qemu-server for now, which would also
> make extending it to use a different storage, or direct importing via a
> pipe easier? see below ;)
>
>>
>> we have to modify the `parse_ovf` a bit to handle the missing disk
>> images, and we parse the size out of the ovf part (since this is
>> informal only, it should be no problem if we cannot parse it sometimes)
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> src/PVE/API2/Storage/Status.pm | 1 +
>> src/PVE/Storage.pm | 59 ++++++++++++++++++++++++++++++++++
>> src/PVE/Storage/DirPlugin.pm | 13 +++++++-
>> src/PVE/Storage/OVF.pm | 53 ++++++++++++++++++++++++++----
>> src/PVE/Storage/Plugin.pm | 5 +++
>> 5 files changed, 123 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status.pm
>> index f7e324f..77ed57c 100644
>> --- a/src/PVE/API2/Storage/Status.pm
>> +++ b/src/PVE/API2/Storage/Status.pm
>> @@ -749,6 +749,7 @@ __PACKAGE__->register_method({
>> 'efi-state-lost',
>> 'guest-is-running',
>> 'nvme-unsupported',
>> + 'ova-needs-extracting',
>> 'ovmf-with-lsi-unsupported',
>> 'serial-port-socket-only',
>> ],
>> diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
>> index f8ea93d..bc073ef 100755
>> --- a/src/PVE/Storage.pm
>> +++ b/src/PVE/Storage.pm
>> @@ -2189,4 +2189,63 @@ sub get_import_metadata {
>> return $plugin->get_import_metadata($scfg, $volname, $storeid);
>> }
>>
>> +sub copy_needs_extraction {
>> + my ($volid) = @_;
>> + my ($storeid, $volname) = parse_volume_id($volid);
>> + my $cfg = config();
>> + my $scfg = storage_config($cfg, $storeid);
>> + my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
>> +
>> + my ($vtype, $name, $vmid, $basename, $basevmid, $isBase, $file_format) =
>> + $plugin->parse_volname($volname);
>> +
>> + return $vtype eq 'import' && defined($file_format);
>> +}
>
> not sure this one is needed? it could also just be a call to
> PVE::Storage::parse_volname in qemu-server?
>
>> +
>> +sub extract_disk_from_import_file {
>
> similarly, this is basically PVE::Storage::get_import_dir + the
> run_command call, and could live in qemu-server?
>
>> + my ($volid, $vmid) = @_;
>> +
>> + my ($storeid, $volname) = parse_volume_id($volid);
>> + my $cfg = config();
>> + my $scfg = storage_config($cfg, $storeid);
>> + my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
>> +
>> + my ($vtype, $name, undef, undef, undef, undef, $file_format) =
>> + $plugin->parse_volname($volname);
>> +
>> + die "only files with content type 'import' can be extracted\n"
>> + if $vtype ne 'import' || !defined($file_format);
>> +
>> + # extract the inner file from the name
>> + if ($volid =~ m!${name}/([^/]+)$!) {
>> + $name = $1;
>
> we should probably be very conservative here and only allow [-_a-z0-9]
> as a start - or something similar rather restrictive..
>
>> + }
>> +
>> + my ($source_file) = $plugin->path($scfg, $volname, $storeid);
>> +
>> + my $destdir = $plugin->get_subdir($scfg, 'import');
>> + my $pid = $$;
>> + $destdir .= "/.tmp_${pid}_${vmid}";
>> + mkdir $destdir;
>> +
>> + ($source_file) = $source_file =~ m|^(/.*)|; # untaint
>
> again a rather interesting untaint ;)
>
>> +
>> + run_command(['tar', '-x', '-C', $destdir, '-f', $source_file, $name]);
>
> if $name was a symlink in the archive, you've now created a symlink
> pointing wherever..
>
>> +
>> + return "$destdir/$name";
>
> and this returns an absolute path to it, and now we are in trouble land
> ;) we should check that the file is a real file here..
>
>> +}
>> +
>> +sub cleanup_extracted_image {
>
> same for this?
>
>> + my ($source) = @_;
>> +
>> + if ($source =~ m|^(/.+/\.tmp_[0-9]+_[0-9]+)/[^/]+$|) {
>> + my $tmpdir = $1;
>> +
>> + unlink $source or $! == ENOENT or die "removing image $source failed: $!\n";
>> + rmdir $tmpdir or $! == ENOENT or die "removing tmpdir $tmpdir failed: $!\n";
>> + } else {
>> + die "invalid extraced image path '$source'\n";
>
> nit: typo
>
> these are also not discoverable if the error handling in qemu-server
> failed for some reason.. might be a source of unwanted space
> consumption..
any suggestions for better handling that cleanup?
we could put it at the beginning of each cleanup step, that should
at least make sure we cleaned up the temporary images
>
>> + }
>> +}
>> +
>> 1;
>> diff --git a/src/PVE/Storage/DirPlugin.pm b/src/PVE/Storage/DirPlugin.pm
>> index 4dc7708..50ceab7 100644
>> --- a/src/PVE/Storage/DirPlugin.pm
>> +++ b/src/PVE/Storage/DirPlugin.pm
>> @@ -260,14 +260,25 @@ sub get_import_metadata {
>> # NOTE: all types must be added to the return schema of the import-metadata API endpoint
>> my $warnings = [];
>>
>> + my $isOva = 0;
>> + if ($path =~ m!\.ova!) {
>> + $isOva = 1;
>> + push @$warnings, { type => 'ova-needs-extracting' };
>> + }
>> my $res = PVE::Storage::OVF::parse_ovf($path, $isOva);
>> my $disks = {};
>> for my $disk ($res->{disks}->@*) {
>> my $id = $disk->{disk_address};
>> my $size = $disk->{virtual_size};
>> my $path = $disk->{relative_path};
>> + my $volid;
>> + if ($isOva) {
>> + $volid = "$storeid:$volname/$path";
>> + } else {
>> + $volid = "$storeid:import/$path",
>> + }
>> $disks->{$id} = {
>> - volid => "$storeid:import/$path",
>> + volid => $volid,
>> defined($size) ? (size => $size) : (),
>> };
>> }
>> diff --git a/src/PVE/Storage/OVF.pm b/src/PVE/Storage/OVF.pm
>> index 4a322b9..fb850a8 100644
>> --- a/src/PVE/Storage/OVF.pm
>> +++ b/src/PVE/Storage/OVF.pm
>> @@ -85,11 +85,37 @@ sub id_to_pve {
>> }
>> }
>>
>> +# technically defined in DSP0004 (https://www.dmtf.org/dsp/DSP0004) as an ABNF
>> +# but realistically this always takes the form of 'bytes * base^exponent'
>> +sub try_parse_capacity_unit {
>> + my ($unit_text) = @_;
>> +
>> + if ($unit_text =~ m/^\s*byte\s*\*\s*([0-9]+)\s*\^\s*([0-9]+)\s*$/) {
>> + my $base = $1;
>> + my $exp = $2;
>> + return $base ** $exp;
>> + }
>> +
>> + return undef;
>> +}
>> +
>> # returns two references, $qm which holds qm.conf style key/values, and \@disks
>> sub parse_ovf {
>> - my ($ovf, $debug) = @_;
>> + my ($ovf, $isOva, $debug) = @_;
>> +
>> + # we have to ignore missing disk images for ova
>> + my $dom;
>> + if ($isOva) {
>> + my $raw = "";
>> + PVE::Tools::run_command(['tar', '-xO', '--wildcards', '--occurrence=1', '-f', $ovf, '*.ovf'], outfunc => sub {
>> + my $line = shift;
>> + $raw .= $line;
>> + });
>> + $dom = XML::LibXML->load_xml(string => $raw, no_blanks => 1);
>> + } else {
>> + $dom = XML::LibXML->load_xml(location => $ovf, no_blanks => 1);
>> + }
>>
>> - my $dom = XML::LibXML->load_xml(location => $ovf, no_blanks => 1);
>>
>> # register the xml namespaces in a xpath context object
>> # 'ovf' is the default namespace so it will prepended to each xml element
>> @@ -177,7 +203,17 @@ sub parse_ovf {
>> # @ needs to be escaped to prevent Perl double quote interpolation
>> my $xpath_find_fileref = sprintf("/ovf:Envelope/ovf:DiskSection/\
>> ovf:Disk[\@ovf:diskId='%s']/\@ovf:fileRef", $disk_id);
>> + my $xpath_find_capacity = sprintf("/ovf:Envelope/ovf:DiskSection/\
>> +ovf:Disk[\@ovf:diskId='%s']/\@ovf:capacity", $disk_id);
>> + my $xpath_find_capacity_unit = sprintf("/ovf:Envelope/ovf:DiskSection/\
>> +ovf:Disk[\@ovf:diskId='%s']/\@ovf:capacityAllocationUnits", $disk_id);
>> my $fileref = $xpc->findvalue($xpath_find_fileref);
>> + my $capacity = $xpc->findvalue($xpath_find_capacity);
>> + my $capacity_unit = $xpc->findvalue($xpath_find_capacity_unit);
>> + my $virtual_size;
>> + if (my $factor = try_parse_capacity_unit($capacity_unit)) {
>> + $virtual_size = $capacity * $factor;
>> + }
>>
>> my $valid_url_chars = qr@${valid_uripath_chars}|/@;
>> if (!$fileref || $fileref !~ m/^${valid_url_chars}+$/) {
>> @@ -217,23 +253,26 @@ ovf:Item[rasd:InstanceID='%s']/rasd:ResourceType", $controller_id);
>> die "error parsing $filepath, are you using a symlink ?\n";
>> }
>>
>> - if (!-e $backing_file_path) {
>> + if (!-e $backing_file_path && !$isOva) {
>
> this is actually not enough, the realpath call above can already fail if
> $filepath points to a file in a subdir (note that realpath will only
> check the path components, not the file itself).
>
> e.g.:
>
> error parsing foo/bar/chr-6.49.13-disk1.vmdk, are you using a symlink ? (500)
>
> we could also tighten what we allow as filepath here, in addition to the
> extraction code.
>
>> die "error parsing $filepath, file seems not to exist at $backing_file_path\n";
>> }
>>
>> ($backing_file_path) = $backing_file_path =~ m|^(/.*)|; # untaint
>> ($filepath) = $filepath =~ m|^(.*)|; # untaint
>>
>> - my $virtual_size = PVE::Storage::file_size_info($backing_file_path);
>> - die "error parsing $backing_file_path, cannot determine file size\n"
>> - if !$virtual_size;
>> + if (!$isOva) {
>> + my $size = PVE::Storage::file_size_info($backing_file_path);
>> + die "error parsing $backing_file_path, cannot determine file size\n"
>> + if !$size;
>>
>> + $virtual_size = $size;
>> + }
>> $pve_disk = {
>> disk_address => $pve_disk_address,
>> backing_file => $backing_file_path,
>> - virtual_size => $virtual_size
>> relative_path => $filepath,
>> };
>> + $pve_disk->{virtual_size} = $virtual_size if defined($virtual_size);
>> push @disks, $pve_disk;
>>
>> }
>> diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
>> index deaf8b2..ea069ab 100644
>> --- a/src/PVE/Storage/Plugin.pm
>> +++ b/src/PVE/Storage/Plugin.pm
>> @@ -654,6 +654,11 @@ sub parse_volname {
>> return ('backup', $fn);
>> } elsif ($volname =~ m!^snippets/([^/]+)$!) {
>> return ('snippets', $1);
>> + } elsif ($volname =~ m!^import/([^/]+\.ova)\/([^/]+)$!) {
>> + my $archive = $1;
>> + my $file = $2;
>> + my (undef, $format, undef) = parse_name_dir($file);
>> + return ('import', $archive, 0, undef, undef, undef, $format);
>> } elsif ($volname =~ m!^import/([^/]+$PVE::Storage::IMPORT_EXT_RE_1)$!) {
>> return ('import', $1);
>> } elsif ($volname =~ m!^import/([^/]+\.(raw|vmdk|qcow2))$!) {
>> --
>> 2.39.2
>>
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
>>
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2024-04-17 13:10 UTC|newest]
Thread overview: 67+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-16 13:18 [pve-devel] [PATCH storage/qemu-server/pve-manager] implement ova/ovf import for directory type storages Dominik Csapak
2024-04-16 13:18 ` [pve-devel] [PATCH storage 1/9] copy OVF.pm from qemu-server Dominik Csapak
2024-04-16 15:02 ` Thomas Lamprecht
2024-04-17 9:19 ` Fiona Ebner
2024-04-17 9:26 ` Thomas Lamprecht
2024-04-16 13:18 ` [pve-devel] [PATCH storage 2/9] plugin: dir: implement import content type Dominik Csapak
2024-04-17 10:07 ` Fiona Ebner
2024-04-17 10:07 ` Fiona Ebner
2024-04-17 13:13 ` Dominik Csapak
2024-04-17 12:46 ` Fabian Grünbichler
2024-04-16 13:18 ` [pve-devel] [PATCH storage 3/9] plugin: dir: handle ova files for import Dominik Csapak
2024-04-17 10:52 ` Fiona Ebner
2024-04-17 13:07 ` Dominik Csapak
2024-04-17 13:39 ` Fabian Grünbichler
2024-04-18 7:22 ` Fiona Ebner
2024-04-18 7:25 ` Fiona Ebner
2024-04-18 8:55 ` Fabian Grünbichler
2024-04-17 12:45 ` Fabian Grünbichler
2024-04-17 13:10 ` Dominik Csapak [this message]
2024-04-17 13:52 ` Fabian Grünbichler
2024-04-17 14:07 ` Dominik Csapak
2024-04-18 6:46 ` Fabian Grünbichler
2024-04-16 13:18 ` [pve-devel] [PATCH storage 4/9] ovf: implement parsing the ostype Dominik Csapak
2024-04-17 11:32 ` Fiona Ebner
2024-04-17 13:14 ` Dominik Csapak
2024-04-18 7:31 ` Fiona Ebner
2024-04-16 13:18 ` [pve-devel] [PATCH storage 5/9] ovf: implement parsing out firmware type Dominik Csapak
2024-04-17 11:43 ` Fiona Ebner
2024-04-16 13:18 ` [pve-devel] [PATCH storage 6/9] ovf: implement rudimentary boot order Dominik Csapak
2024-04-17 11:54 ` Fiona Ebner
2024-04-17 13:15 ` Dominik Csapak
2024-04-16 13:19 ` [pve-devel] [PATCH storage 7/9] ovf: implement parsing nics Dominik Csapak
2024-04-17 12:09 ` Fiona Ebner
2024-04-17 13:16 ` Dominik Csapak
2024-04-18 8:22 ` Fiona Ebner
2024-04-16 13:19 ` [pve-devel] [PATCH storage 8/9] api: allow ova upload/download Dominik Csapak
2024-04-18 8:05 ` Fiona Ebner
2024-04-16 13:19 ` [pve-devel] [PATCH storage 9/9] plugin: enable import for nfs/btfs/cifs/cephfs Dominik Csapak
2024-04-18 8:43 ` Fiona Ebner
2024-04-16 13:19 ` [pve-devel] [PATCH qemu-server 1/3] api: delete unused OVF.pm Dominik Csapak
2024-04-18 8:52 ` Fiona Ebner
2024-04-18 8:57 ` Dominik Csapak
2024-04-18 9:03 ` Fiona Ebner
2024-04-16 13:19 ` [pve-devel] [PATCH qemu-server 2/3] use OVF from Storage Dominik Csapak
2024-04-18 9:07 ` Fiona Ebner
2024-04-16 13:19 ` [pve-devel] [PATCH qemu-server 3/3] api: create: implement extracting disks when needed for import-from Dominik Csapak
2024-04-18 9:41 ` Fiona Ebner
2024-04-18 9:48 ` Dominik Csapak
2024-04-18 9:55 ` Fiona Ebner
2024-04-18 9:58 ` Dominik Csapak
2024-04-18 10:01 ` Fiona Ebner
2024-04-16 13:19 ` [pve-devel] [PATCH manager 1/4] ui: fix special 'import' icon for non-esxi storages Dominik Csapak
2024-04-16 13:19 ` [pve-devel] [PATCH manager 2/4] ui: guest import: add ova-needs-extracting warning text Dominik Csapak
2024-04-16 13:19 ` [pve-devel] [PATCH manager 3/4] ui: enable import content type for relevant storages Dominik Csapak
2024-04-16 13:19 ` [pve-devel] [PATCH manager 4/4] ui: enable upload/download buttons for 'import' type storages Dominik Csapak
2024-04-17 12:37 ` Fabian Grünbichler
2024-04-18 11:20 ` Fiona Ebner
2024-04-18 11:23 ` Dominik Csapak
2024-04-18 11:26 ` Fiona Ebner
2024-04-17 13:11 ` [pve-devel] [PATCH storage/qemu-server/pve-manager] implement ova/ovf import for directory " Fabian Grünbichler
2024-04-17 13:19 ` Dominik Csapak
2024-04-18 6:40 ` Fabian Grünbichler
2024-04-18 9:27 ` Dominik Csapak
2024-04-18 10:35 ` Fiona Ebner
2024-04-18 11:10 ` Dominik Csapak
2024-04-18 11:13 ` Fiona Ebner
2024-04-18 11:17 ` Fabian Grünbichler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3797837f-2411-4472-857c-120dd2242b06@proxmox.com \
--to=d.csapak@proxmox.com \
--cc=f.gruenbichler@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox