From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) by lore.proxmox.com (Postfix) with ESMTPS id 6EA531FF13B for ; Wed, 22 Apr 2026 13:16:38 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id B41A6182E0; Wed, 22 Apr 2026 13:14:35 +0200 (CEST) From: "Max R. Carrara" To: pve-devel@lists.proxmox.com Subject: [PATCH pve-storage v1 22/54] tree-wide: introduce parsing module and replace usages of ISO_EXT_RE_0 Date: Wed, 22 Apr 2026 13:12:48 +0200 Message-ID: <20260422111322.257380-23-m.carrara@proxmox.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20260422111322.257380-1-m.carrara@proxmox.com> References: <20260422111322.257380-1-m.carrara@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2 X-Bm-Transport-Timestamp: 1776856338837 X-SPAM-LEVEL: Spam detection results: 0 AWL -0.291 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_ADVERT4 0.75 This is probably an unwanted commercial email... KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Message-ID-Hash: VZHCYQ6T3K6VUDXDJK7F3I3AO23OR4O2 X-Message-ID-Hash: VZHCYQ6T3K6VUDXDJK7F3I3AO23OR4O2 X-MailFrom: m.carrara@proxmox.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; loop; banned-address; emergency; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header X-Mailman-Version: 3.3.10 Precedence: list List-Id: Proxmox VE development discussion List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: Introduce the `PVE::Storage::Common::Parse` module with the following subroutines: * `parse_path_as_volname_parts($path, $vtype)` Parses a given file path into several smaller parts that constitute a volume name for the given volume type. * `parse_path_as_volname($path, $vtype)` Parses a given file path directly into a volume name. * `parse_volname_as_parts($volname)` Parses an existing volume name into its constituent parts. * `parse_path_as_volid_parts($storeid, $scfg, $path, $vtype)` Parses a given file path into several smaller parts that constitute a volume ID for the given storage ID, its configuration and volume type. * `parse_path_as_volid($storeid, $scfg, $path, $vtype)` Like `parse_path_as_volid_parts()`, but parses the given path directly into a volume ID. * `parse_volid_as_parts($volid)` Parses an existing volume ID into its storage ID and volume name parts. As of this commit, these parsing functions only support the 'iso' volume type, with the exception of `parse_volid_as_parts()`, which does not depend on knowing the volume type. Using the newly introduced parsing helpers, replace all occurrences of the `PVE::Storage::ISO_EXT_RE_0` regex across the repository. Keep the `ISO_EXT_RE_0` regex around for now, since its visibility is public, which also means that it is part of the storage API (whether that was intended or not). On an APIVER + APIAGE bump, this regex should be marked for removal. Additional Notes Regarding the new Parsers ========================================== Support for other volume types will be added individually in future commits. The new *private* regex used for 'iso' vtype parsing is still matching file extensions case-insensitively, but also matches the entire path and file name portions of 'iso' file paths and volume names. These parts are extracted using named regex groups, as that is much easier to handle and keep track of mentally, even with smaller regexes. These named groups are the "constituent parts" that are returned by some of the new parser subroutines. However, one important difference here is that named regex groups do not support dashes `-` in their names, only underscores `_`. To keep things consistent with our style (using dashes instead of underscores in hash keys and the API), these named groups are formatted before being returned—underscores are simply substituted with dashes. Finally, all parser subroutines check whether a parent directory reference (`..` or "double dots") is contained in the passed or extracted file path, and return early if there is. This is an additional safety measure that is intentionally introduced in this commit to guard against any mishaps in the future as the parsers gain more functionality, such as supporting nested directories inside different volume types' subdirectories. Signed-off-by: Max R. Carrara --- src/PVE/API2/Storage/Status.pm | 9 +- src/PVE/Storage.pm | 7 +- src/PVE/Storage/Common.pm | 2 + src/PVE/Storage/Common/Makefile | 1 + src/PVE/Storage/Common/Parse.pm | 330 ++++++++++++++++++++++++++++++++ src/PVE/Storage/Plugin.pm | 19 +- 6 files changed, 358 insertions(+), 10 deletions(-) create mode 100644 src/PVE/Storage/Common/Parse.pm diff --git a/src/PVE/API2/Storage/Status.pm b/src/PVE/API2/Storage/Status.pm index 2b0b121a..cf60c148 100644 --- a/src/PVE/API2/Storage/Status.pm +++ b/src/PVE/API2/Storage/Status.pm @@ -23,6 +23,9 @@ use PVE::Storage; use PVE::Storage::Common qw( plugin_get_vtype_subdir ); +use PVE::Storage::Common::Parse qw( + parse_path_as_volname_parts +); use base qw(PVE::RESTHandler); @@ -71,11 +74,13 @@ my sub parse_transferred_file_path_extension : prototype($$) { my ($path, $vtype) = @_; if ($vtype eq 'iso') { - if ($path !~ m![^/]+$PVE::Storage::ISO_EXT_RE_0$!) { + my $parts = parse_path_as_volname_parts($path, $vtype); + + if (!defined($parts)) { raise_param_exc({ filename => "wrong file extension" }); } - my $ext = $1; + my $ext = $parts->{ext}; return $ext; } diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm index 1b0569dd..d7d683ad 100755 --- a/src/PVE/Storage.pm +++ b/src/PVE/Storage.pm @@ -26,6 +26,9 @@ use PVE::Storage::Common qw( plugin_get_default_vtype_subdirs plugin_get_vtype_subdir ); +use PVE::Storage::Common::Parse qw( + parse_path_as_volid +); use PVE::RESTEnvironment qw(log_warn); use PVE::Storage::Plugin; @@ -754,9 +757,7 @@ sub path_to_volume_id { } if ($vtype eq 'iso') { - return if $filename !~ m!/([^/]+$ISO_EXT_RE_0)$!; - my $name = $1; - return "$sid:iso/$name"; + return parse_path_as_volid($sid, $scfg, $path, $vtype); } if ($vtype eq 'vztmpl') { diff --git a/src/PVE/Storage/Common.pm b/src/PVE/Storage/Common.pm index abccf52b..49defb98 100644 --- a/src/PVE/Storage/Common.pm +++ b/src/PVE/Storage/Common.pm @@ -55,6 +55,8 @@ be grouped in a submodule can also be found here. =over +=item * C> + =back =head1 STANDARD OPTIONS FOR JSON SCHEMA diff --git a/src/PVE/Storage/Common/Makefile b/src/PVE/Storage/Common/Makefile index 0c4bba5b..0d9b1be1 100644 --- a/src/PVE/Storage/Common/Makefile +++ b/src/PVE/Storage/Common/Makefile @@ -1,4 +1,5 @@ SOURCES = \ + Parse.pm \ .PHONY: install diff --git a/src/PVE/Storage/Common/Parse.pm b/src/PVE/Storage/Common/Parse.pm new file mode 100644 index 00000000..bcc6f9fc --- /dev/null +++ b/src/PVE/Storage/Common/Parse.pm @@ -0,0 +1,330 @@ +package PVE::Storage::Common::Parse; + +use v5.36; + +use PVE::Storage::Common qw( + plugin_get_vtype_subdir +); + +use Exporter qw(import); + +our @EXPORT_OK = qw( + parse_path_as_volname_parts + parse_path_as_volname + parse_volname_as_parts + + parse_path_as_volid_parts + parse_path_as_volid + parse_volid_as_parts +); + +=head1 NAME + +C - Storage-related Parsing Functions + +=head1 DESCRIPTION + +This module contains various parsing functions for use within C>, +its submodules (including storage plugins) and other related modules. + +Parsing functions are categorized by their main purpose / area of application +and may be further subdivided depending on what kind of data type they are +primarily related to. + +=cut + +my $RE_PARENT_DIR = quotemeta('..'); +my $RE_CONTAINS_PARENT_DIR = qr! + ( ^$RE_PARENT_DIR/ ) # ../ --> Beginning of path + | + ( /$RE_PARENT_DIR/ ) # /../ --> Between two path components + | + ( /$RE_PARENT_DIR$ ) # /.. --> End of path +!xn; + +my $RE_ISO_FILE_PATH = qr! + (? + (? [^/]+ \. (? (?i: iso|img) ) ) + ) +!xn; + +my $RE_FILE_PATH_FOR_VTYPE = { + iso => qr/^$RE_ISO_FILE_PATH$/, +}; + +my $RE_VOLNAME_FOR_VTYPE = { + iso => qr/^$RE_ISO_FILE_PATH$/, +}; + +my sub contains_parent_dir($path) { + return $path =~ $RE_CONTAINS_PARENT_DIR; +} + +my sub strip_leading_path_separators($path) { + return $path =~ s!^/+!!r; +} + +my sub strip_trailing_path_separators($path) { + return $path =~ s!/+$!!r; +} + +my sub format_named_groups(%groups) { + my $result = {}; + + for my $old_key (keys %groups) { + my $new_key = $old_key =~ s/_/-/gr; + $result->{$new_key} = $groups{$old_key}; + } + + my @disk_path_components = (); + + if (defined($result->{file})) { + $result->{file} = strip_leading_path_separators($result->{file}); + push(@disk_path_components, $result->{file}); + } + + if (scalar(@disk_path_components)) { + $result->{'disk-path'} = join('/', @disk_path_components); + } + + return $result; +} + +my sub split_leading_dir_from_path($path, $directory) { + $directory = strip_trailing_path_separators($directory); + + if ($path =~ m!^(? \Q$directory\E ) / (? .* ) $ !xn) { + return { + 'leading-dir' => strip_trailing_path_separators($+{leading_dir}), + remainder => strip_leading_path_separators($+{remainder}), + }; + } + + return; +} + +=head1 PARSERS RELATED TO VOLUMES + +The parsing functions in this section primarily deal with parsing data related +to storage volumes, primarily C>s and C>s. + +=head2 VOLUME NAMES + +=cut + +=head3 parse_path_as_volname_parts + +Parses the given relative file path C<$path> according to the given C<$vtype> +and returns a hashref containing the individually extracted parts that make up +a C on success. + +On failure or when the provided C<$vtype> is not supported or does not exist, +returns C instead. + +B This function assumes that C<$path> is already relative to the +directory that corresponds to the given C<$vtype>. + +For example, the path C points to an +ISO file on the default C directory storage. In this case, the storage +path is C, the subdirectory for the C<$vtype = "iso"> is +C