public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
@ 2024-12-20 18:51 Max Carrara
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
                   ` (12 more replies)
  0 siblings, 13 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:51 UTC (permalink / raw)
  To: pve-devel

Introduce and Package PVE::Path & PVE::Filesystem - v2
======================================================

Notable Changes Since v1
------------------------

Incorporate Thomas's feedback [1] as much as possible.

I was probably too eager to push this out yesterday and did notice many
spots where the amount of test cases / tests being performed can be
reduced significantly. The original commit that added all the tests to
begin with is also split into multiple smaller commits to hopefully make
reviewing this easier. The total amount of tests performed has therefore
gone down from 1050 to 475 if I counted correctly (yikes).

The only things that are still tested more elaborately are the file name
manipulation and path comparison functions. There is some potential to
slim those down as well, but I wanted to get a v2 out before Christmas
for some additional feedback.

Additionally, there was one slight discrepancy I noticed between how I
documented path_parent and it's actual behaviour -- the behaviour was
adapted accordingly. The behaviour of the boolean functions
path_is_absolute and path_is_relative is also improved. See patch 01 for
all details.

Closing Remarks
---------------

As always, any feedback is welcome -- if possible, it would be nice if
somebody could give these two modules a spin and tell me how they feel.
If there's anything that is or feels unexpected, surprising,
inconsistent, etc. please let me know!

References
----------

[1]: https://lore.proxmox.com/pve-devel/4990a4b4-5027-4db2-9909-d36af8a54a07@proxmox.com/

Older Versions
--------------

v1: https://lore.proxmox.com/pve-devel/20241219183143.526267-1-m.carrara@proxmox.com/

Summary of Changes
------------------

Max Carrara (12):
  introduce PVE::Path
  test: add directory for tests of PVE::Path module
  test: add tests for path_is_absolute and path_is_relative of PVE::Path
  test: add tests for path_components of PVE::Path
  test: add tests for path_join of PVE::Path
  test: add tests for path_push of PVE::Path
  test: add tests for path_parent of PVE::Path
  test: add tests for path_starts_with, path_ends_with, path_equals
  test: add test for file path operation functions of PVE::Path
  test: add tests for path_normalize of PVE::Path
  introduce PVE::Filesystem
  debian: introduce package libproxmox-fs-path-utils-perl

 debian/control                               |    6 +
 debian/libproxmox-fs-path-utils-perl.install |    2 +
 debian/libpve-common-perl.install            |   29 +
 src/Makefile                                 |    2 +
 src/PVE/Filesystem.pm                        |   78 ++
 src/PVE/Path.pm                              |  987 ++++++++++++++
 test/Makefile                                |    5 +-
 test/Path/Makefile                           |   24 +
 test/Path/path_comparison_tests.pl           |  851 ++++++++++++
 test/Path/path_components_tests.pl           |  159 +++
 test/Path/path_file_ops_tests.pl             | 1220 ++++++++++++++++++
 test/Path/path_is_absolute_relative_tests.pl |  122 ++
 test/Path/path_join_tests.pl                 |  310 +++++
 test/Path/path_normalize_tests.pl            |  176 +++
 test/Path/path_parent_tests.pl               |  153 +++
 test/Path/path_push_tests.pl                 |  159 +++
 16 files changed, 4282 insertions(+), 1 deletion(-)
 create mode 100644 debian/libproxmox-fs-path-utils-perl.install
 create mode 100644 debian/libpve-common-perl.install
 create mode 100644 src/PVE/Filesystem.pm
 create mode 100644 src/PVE/Path.pm
 create mode 100644 test/Path/Makefile
 create mode 100755 test/Path/path_comparison_tests.pl
 create mode 100755 test/Path/path_components_tests.pl
 create mode 100755 test/Path/path_file_ops_tests.pl
 create mode 100755 test/Path/path_is_absolute_relative_tests.pl
 create mode 100755 test/Path/path_join_tests.pl
 create mode 100755 test/Path/path_normalize_tests.pl
 create mode 100755 test/Path/path_parent_tests.pl
 create mode 100755 test/Path/path_push_tests.pl

-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
@ 2024-12-20 18:51 ` Max Carrara
  2025-01-08 14:05   ` Wolfgang Bumiller
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 02/12] test: add directory for tests of PVE::Path module Max Carrara
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:51 UTC (permalink / raw)
  To: pve-devel

The PVE::Path module concerns itself with file / directory path
operations, like getting the parent directory of a path, extracting
the file name of a path, splitting a path into its individual
components, joining path components together, comparing paths, and so
on.

This module is added here in order to address the shortcomings of the
Perl core modules (such as lacking a lot the aforementioned
functionalities) without relying on some kind of object-based
abstraction.

PVE::Path strictly only contains functions that manipulate file paths
without meddling with the filesystem. The reasoning here is to be able
to just import any function when its necessary without having to
change the surrounding code in order to adapt to some kind of
abstraction.

The other motivation for this module is that path manipulation
operations have been getting more and more common recently, especially
in regards to storage.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * Improve some wording in the docstring of path_components
  * Simplify some logic in path_parent and remove an unnecessary sanity
    check
  * Actually treat "foo" as "./foo" in path_parent as mentioned in the
    docstring -- This means that path_parent("foo") now returns "."
    instead of "".
  * Adapt the path_with_file_* functions to the above accordingly, so
    that e.g. path_with_file_name("foo", "bar") returns "bar" instead of
    "./bar".
  * Improve the "boolean" behaviour of path_is_absolute and
    path_is_absolute and return 1 when true, but use an empty return
    when false.
    - An empty return means "undef" in scalar context and an empty list
      in list context, so those functions will always return something
      that's correctly truthy or falsy for Perl, regardless of context

 src/Makefile    |   1 +
 src/PVE/Path.pm | 987 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 988 insertions(+)
 create mode 100644 src/PVE/Path.pm

diff --git a/src/Makefile b/src/Makefile
index 2d8bdc4..25bc490 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -23,6 +23,7 @@ LIB_SOURCES = \
 	LDAP.pm \
 	Network.pm \
 	OTP.pm \
+	Path.pm \
 	PBSClient.pm \
 	PTY.pm \
 	ProcFSTools.pm \
diff --git a/src/PVE/Path.pm b/src/PVE/Path.pm
new file mode 100644
index 0000000..221e662
--- /dev/null
+++ b/src/PVE/Path.pm
@@ -0,0 +1,987 @@
+=head1 NAME
+
+C<PVE::Path> - Utilities related to handling file and directory paths
+
+=head1 DESCRIPTION
+
+This module provides functions concerned with file and directory path
+manipulation.
+
+None of the functions provided alter the filesystem in any way.
+
+The reason for this module's existence is to address a couple shortcomings:
+
+=over
+
+=item 1. The Perl core modules lack most of what is required for manipulating
+paths, for example getting the parent directory of a path, extracting the
+prefix of a file name (the "stem"), extracting the suffixes of a file name (the
+"endings" or "extensions"), checking whether two paths are the same, and so on.
+
+=item 2. If the Perl core modules provide something in that regard, it's usually
+provided in a not very ergonomic manner (L<C<File::Basename>>).
+
+=item 3. Additionally, the path utilities of the core modules are scattered
+across multiple modules, making them hard to discover.
+
+=item 4. Third-party libraries on CPAN mostly provide objects representing
+paths. Using any of these would require fundamental changes on how file paths
+are handled throughout our code, for almost no benefit.
+
+=back
+
+C<L<PVE::Path>> instead does without objects and strictly provides functions
+for path manipulation only. Any operation that is needed can simply be
+performed ad hoc by importing the corresponding function and doesn't require
+the surrounding code to conform to an abstraction like a path object.
+
+Additionally, some of the core modules' functionality is re-exported or
+re-implemented for ergonomic or logical purposes. The goal is to provide
+functions that don't come with any surprises and just behave like one assumes
+they would.
+
+This module takes inspiration from Rust's C<std::path> and Python's C<pathlib>,
+which are more modern path manipulation libraries.
+
+=head1 LIMITATIONS
+
+This module is limited to manipulating Unix-like / Linux file paths.
+
+=cut
+
+package PVE::Path;
+
+use strict;
+use warnings;
+
+use Carp qw(carp croak confess);
+use File::Spec ();
+use List::Util qw(any zip_shortest zip_longest);
+
+use Exporter qw(import);
+
+our @EXPORT_OK = qw(
+    path_is_absolute
+    path_is_relative
+
+    path_components
+    path_join
+
+    path_normalize
+
+    path_parent
+    path_push
+    path_pop
+
+    path_file_name
+    path_with_file_name
+
+    path_file_prefix
+    path_with_file_prefix
+
+    path_file_suffixes
+    path_with_file_suffixes
+
+    path_file_suffix
+    path_with_file_suffix
+
+    path_file_parts
+
+    path_starts_with
+    path_ends_with
+    path_equals
+);
+
+=head2 FUNCTIONS
+
+=cut
+
+=head3 path_is_absolute($path)
+
+Returns C<1> if C<$path> is absolute (starts with a C</>).
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_is_absolute : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    if ($path =~ m#^/#) {
+	return 1;
+    }
+
+    return;
+}
+
+=head3 path_is_relative($path)
+
+Returns C<1> if C<$path> is relative (doesn't start with a C</>).
+
+The opposite of C<L<< path_is_absolute()|/"path_is_absolute($path)" >>>.
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_is_relative : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    if ($path !~ m#^/#) {
+	return 1;
+    }
+
+    return;
+}
+
+=head3 path_components($path)
+
+Returns a list of the given C<$path>'s individual components.
+
+In scalar context, returns a reference to a list.
+
+The C<$path> is normalized a little during the parse:
+
+=over
+
+=item Repeated occurrences of C</> are removed, so C<foo/bar> and C<foo//bar>
+both have C<foo> and C<bar> as components.
+
+=item Trailing slashes C</> are removed.
+
+=item Occurrences of C<.> are normalized away, except the first C<.> at
+beginning of a path. This means that C<foo/bar>, C<foo/./bar>, C<foo/bar/.>,
+C<foo/././bar/./.>, etc. all have C<foo> and C<bar> as components, while
+C<./foo/bar>, C<./././foo/bar>, C<./foo/./bar/.> have C<.>, C<foo> and C<bar>
+as components.
+
+=back
+
+No other normalization is performed to account for the possibility of symlinks
+existing. This means that C<foo/baz> and C<foo/bar/../baz> are distinct (because
+C<bar> could be a symlink and thus C<foo> isn't its parent).
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_components : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $is_abs = path_is_absolute($path);
+    my $has_cur_dir = $path =~ m#^\.$|^\./#;
+
+    my @components = split('/', $path);
+    my @normalized_components = ();
+
+    for my $component (@components) {
+	next if $component eq '' || $component eq '.';
+
+	push(@normalized_components, $component);
+    }
+
+    unshift(@normalized_components, '/') if $is_abs;
+    unshift(@normalized_components, '.') if $has_cur_dir;
+
+    return @normalized_components if wantarray;
+    return \@normalized_components;
+}
+
+
+=head3 path_join(@paths)
+
+Joins multiple paths together. All kinds of paths are supported.
+
+Does not perform any C<L<< normalization|/"path_normalize($path)" >>>.
+
+Throws an exception if any of the passed C<@paths> is C<undef>.
+
+=cut
+
+sub path_join : prototype(@) {
+    my (@paths) = @_;
+
+    if (!scalar(@paths)) {
+	return '';
+    }
+
+    croak "one of the provided paths is undef" if any { !defined($_) } @paths;
+
+    # Find the last occurrence of a root directory and start conjoining the
+    # components from there onwards
+    my $index = scalar(@paths) - 1;
+    while ($index > 0) {
+	last if $paths[$index] =~ m#^/#;
+	$index--;
+    }
+
+    @paths = @paths[$index .. (scalar(@paths) - 1)];
+
+    my $resulting_path = shift @paths;
+
+    for my $path (@paths) {
+	$resulting_path = path_push($resulting_path, $path);
+    }
+
+    return $resulting_path;
+}
+
+=head3 path_normalize($path)
+
+Wrapper for L<C<File::Spec/canonpath>>. Performs a logical cleanup of the given
+C<$path>.
+
+This removes unnecessary components of a path that can be safely
+removed, such as C<.>, trailing C</> or repeated occurrences of C</>.
+
+For example, C<foo/./bar/baz/.> and C<foo////bar//baz//> will both become
+C<foo/bar/baz>.
+
+B<Note:> This will I<not> remove components referencing the parent directory,
+i.e. C<..>. For example, C<foo/bar/../baz> and C<foo/bar/baz/..> will therefore
+remain as they are. However, the parent directory of C</> is C</>, so
+C</../../foo> will be normalized to C</foo>.
+
+Throws an exception if C<$path> is C<undef> or the call to C<canonpath> failed.
+
+=cut
+
+sub path_normalize : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $cleaned_path = eval {
+	File::Spec->canonpath($path);
+    };
+
+    croak "failed to clean up path: $@" if $@;
+
+    return $cleaned_path;
+}
+
+=head3 path_parent($path)
+
+Returns the given C<$path> without its final component, if there is one.
+
+Trailing and repeated occurrences of C</> and C<.> are normalized on the fly
+when needed. This means that e.g. C<foo////bar///.//> becomes C<foo>, but
+C<foo/.//bar//./baz> becomes C<foo/.//bar>.
+
+This function's behaviour is almost identical to Rust's
+L<< Path::parent|https://doc.rust-lang.org/std/path/struct.Path.html#method.parent >>,
+with a few adaptations made wherever Perl treats things differently:
+
+=over
+
+=item * C</foo/bar> becomes C</foo>, C<foo/bar> becomes C<foo>.
+
+=item * C</foo> becomes C</>.
+
+=item * C<foo/bar/..> becomes C<foo/bar>.
+
+=item * C<foo/../bar> becomes C<foo/..>.
+
+=item * C<foo> is interpreted as C<./foo> and becomes C<.>. This is because Perl's
+C<L<File::Spec/canonpath>> interprets C<./foo> and C<foo> as the same thing.
+
+=item * C</> and an I<empty string> result in C<undef> being returned.
+
+=item * C<.> results in an empty string.
+
+=back
+
+Will raise an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_parent : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    if ($path eq '') {
+	return;
+    }
+
+    # A limit of -1 retains empty components at the end
+    my @components = split('/', $path, -1);
+
+    # Trim off needless extra components until actual final component is encountered, e.g.
+    # foo////bar////baz//// -> foo////bar////baz
+    # /// -> /
+    # ././//.///./ -> .
+    while (scalar(@components) > 1 && ($components[-1] eq '' || $components[-1] eq '.')) {
+	pop(@components);
+    }
+
+    my $final_component = pop(@components);
+
+    # We had a root dir with needless extra components, e.g. "//" or "////" or "//.///./" etc.
+    return if $final_component eq '';
+
+    # We had a current dir reference with needless extra components, e.g.
+    # "././" or ".///////" or "./././//./././//" etc.
+    return '' if $final_component eq '.';
+
+    # We had some other kind of single component like "foo", "bar" or "..",
+    # and because File::Spec->canonpath treats "foo" and "./foo" the same,
+    # return a single current dir reference
+    return '.' if !scalar(@components);
+
+    # Trim off needless extra components until actual parent component is encountered, like above
+    while (scalar(@components) > 1 && ($components[-1] eq '' || $components[-1] eq '.')) {
+	pop(@components);
+    }
+
+    # Handle lone root dir (@components with only one empty string)
+    if (scalar(@components) == 1 && $components[0] eq '') {
+	return '/';
+    }
+
+    return join('/', @components);
+}
+
+=head3 path_push($path, $other)
+
+Extends C<$path> with C<$other>, returning a new path.
+
+If C<$other> is absolute, it will be returned instead.
+
+Throws an exception if any of the arguments is C<undef>.
+
+=cut
+
+sub path_push : prototype($$) {
+    my ($path, $other) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "\$other is undef" if !defined($other);
+
+    return $other if path_is_absolute($other);
+    return $path if $other eq '';
+
+    my $need_sep = $path ne '' && $path !~ m#/$#;
+
+    $path .= "/" if $need_sep;
+    $path .= $other;
+
+    return $path;
+}
+
+=head3 path_pop($path)
+
+Alias for C<L<< path_parent()|/"path_parent($path)" >>>.
+
+=cut
+
+sub path_pop : prototype($) {
+    my ($path) = @_;
+    return path_parent($path);
+}
+
+=head3 path_file_name($path)
+
+Returns the last component of the given C<$path>, if it is a legal file name,
+or C<undef> otherwise.
+
+If C<$path> is an empty string, C</>, C<.> or ends with a C<..> component,
+there is no valid file name.
+
+B<Note:> This does not check whether the given C<$path> actually points to a
+file or a directory etc.
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_file_name : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my @components = path_components($path);
+
+    if (!scalar(@components)) {
+	return;
+    }
+
+    if (
+	scalar(@components) == 1
+	&& ($components[0] eq '/' || $components[0] eq '.')
+    ) {
+	return;
+    }
+
+    if ($components[-1] eq '..') {
+	return;
+    }
+
+    return $components[-1];
+}
+
+=head3 path_with_file_name($path, $file_name)
+
+Returns C<$path> with C<$file_name> as the new last component.
+
+This is essentially like calling C<L<< path_parent()|/"path_parent($path)" >>>
+and using C<L<< path_push()|/"path_push($path, $other)" >>> to append the new
+file name, but handles a few extra cases:
+
+=over
+
+=item * If C<$path> is C</>, appends C<$file_name>.
+
+=item * If C<$path> is an empty string, appends C<$file_name>.
+
+=item * If C<$path> ends with a parent directory reference (C<..>), replaces it
+with C<$file_name>.
+
+=back
+
+Throws an exception if any of the arguments is C<undef> or if C<$file_name>
+contains a path separator (C</>).
+
+=cut
+
+sub path_with_file_name : prototype($$) {
+    my ($path, $file_name) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "\$file_name is undef" if !defined($file_name);
+    croak "\$file_name contains a path separator: $file_name" if $file_name =~ m|/|;
+
+    my $parent = path_parent($path);
+
+    # undef means that $path was either "" or "/", so we can just append to it
+    return ($path . $file_name) if !defined($parent);
+
+    # Because the parent of "foo" is ".", return $file_name to stay consistent.
+    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
+    if ($parent eq '.' && $path !~ m|/|) {
+	return $file_name;
+    }
+
+    return path_push($parent, $file_name);
+}
+
+my sub _path_file_prefix : prototype($) {
+    my ($file_name) = @_;
+
+    confess "\$file_name is undef" if !defined($file_name);
+
+    $file_name =~ s|^(\.*[^\.]*)||;
+    my $prefix = $1;
+
+    # sanity check
+    confess "\$prefix not matched" if !defined($prefix);
+
+    return ($prefix, $file_name);
+}
+
+=head3 path_file_prefix($path)
+
+Returns the prefix of the file name of the given C<$path>. If the C<$path> does
+not have a valid file name and thus no prefix, C<undef> is returned instead.
+
+The prefix of a file name is the part before any extensions (suffixes).
+
+    my $prefix = path_file_prefix("/etc/resolv.conf");
+    # resolv
+
+    my $prefix = path_file_prefix("/tmp/archive.tar.zst");
+    # archive
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_file_prefix : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $file_name = path_file_name($path);
+    return undef if !defined($file_name);
+
+    my ($prefix, undef) = _path_file_prefix($file_name);
+    return $prefix;
+}
+
+=head3 path_with_file_prefix($path, $prefix)
+
+Returns C<$path> with a new C<$prefix>. This is similar to
+C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
+except that the file prefix is replaced or appended.
+
+If C<$path> does not have a file name or if C<$prefix> is an empty string,
+C<undef> is returned.
+
+    my $new_path = path_with_file_prefix("/tmp/archive.tar.zst", "backup");
+    # /tmp/backup.tar.zst
+
+    my $new_path = path_with_file_prefix("/etc/pve", "ceph");
+    # /etc/ceph
+
+Throws an exception if any of the arguments is C<undef>, or if C<$prefix>
+contains a path separator (C</>), ends with C<.>, or is an empty string.
+
+=cut
+
+sub path_with_file_prefix : prototype($$) {
+    my ($path, $prefix) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "\$prefix is undef" if !defined($prefix);
+    croak "\$prefix contains a path separator" if $prefix =~ m|/|;
+    croak "\$prefix ends with a dot" if $prefix =~ m|\.$|;
+
+    return undef if $prefix eq '';
+    return undef if !defined(path_file_name($path));
+
+    return $path if $prefix eq '';
+
+    my $parent = path_parent($path);
+
+    # sanity check -- should not happen because we checked for file name,
+    # and the existence of a file name implies there's a parent
+    confess "parent of \$path is undef" if !defined($parent);
+
+    my @suffixes = path_file_suffixes($path);
+
+    my $file_name = join(".", $prefix, @suffixes);
+
+    # Because the parent of "foo" is ".", return $file_name to stay consistent.
+    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
+    # (Done also in path_with_file_name)
+    if ($parent eq '.' && $path !~ m|/|) {
+	return $file_name;
+    }
+
+    return path_push($parent, $file_name);
+}
+
+my sub _path_file_suffixes : prototype($) {
+    my ($file_name_no_prefix) = @_;
+
+    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
+
+    # Suffixes are extracted "manually" because join()ing the result of split()
+    # results in a different file name than the original. Let's say you have a
+    # file named "foo.bar.". The correct suffixes would be ("bar", "").
+    # With split, you get the following:
+    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
+    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
+    my @suffixes = ();
+    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
+	my $suffix = $1;
+	$suffix =~ s|^\.||;
+	push(@suffixes, $suffix);
+    }
+
+    return @suffixes;
+}
+
+=head3 path_file_suffixes($path)
+
+Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
+not have a valid file name, an empty list is returned instead.
+
+In scalar context, returns a reference to a list.
+
+The suffixes of a path are essentially the file name's extensions, the parts
+that come after the L<< prefix|/"path_file_prefix($path)" >>.
+
+    my @suffixes = path_file_suffixes("/etc/resolv.conf");
+    # ("conf")
+
+    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
+    # ["tar", "zst"]
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_file_suffixes : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $file_name = path_file_name($path);
+    if (!defined($file_name)) {
+	return wantarray ? () : [];
+    }
+
+    (undef, $file_name) = _path_file_prefix($file_name);
+
+    my @suffixes = _path_file_suffixes($file_name);
+
+    return wantarray ? @suffixes : \@suffixes;
+}
+
+=head3 path_with_file_suffixes($path, @suffixes)
+
+Returns C<$path> with new C<@suffixes>. This is similar to
+C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
+except that the suffixes of the file name are replaced.
+
+If the C<$path> does not have a file name, C<undef> is returned.
+
+    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
+    # /tmp/archive.pxar.gz
+
+    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
+    # /tmp/archive.gz
+
+If the file name has no suffixes, the C<@suffixes> are appended instead:
+
+    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
+    # /etc/resolv.conf
+
+    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
+    # /etc/resolv.conf.zst
+
+If there are no C<@suffixes> provided, the file name's suffixes will
+be removed (if there are any):
+
+    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
+    # /tmp/archive
+
+Note that an empty string is still a valid suffix (an "empty" file ending):
+
+    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
+    # /tmp/archive....zst
+
+Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
+if any suffix contains a path separator (C</>) or a C<.>.
+
+=cut
+
+sub path_with_file_suffixes : prototype($@) {
+    my ($path, @suffixes) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "one of the provided suffixes is undef"
+	if any { !defined($_) } @suffixes;
+    croak "one of the provided suffixes contains a path separator"
+	if any { $_ =~ m|/| } @suffixes;
+    croak "one of the provided suffixes contains a dot"
+	if any { $_ =~ m|\.| } @suffixes;
+
+    return undef if !defined(path_file_name($path));
+
+    my $parent = path_parent($path);
+
+    # sanity check -- should not happen because we checked for file name,
+    # and the existence of a file name implies there's a parent
+    confess "parent of \$path is undef" if !defined($parent);
+
+    # Don't modify $path if there are no suffixes to be removed
+    my @existing_suffixes = path_file_suffixes($path);
+    return $path if !scalar(@suffixes) && !scalar(@existing_suffixes);
+
+    my $prefix = path_file_prefix($path);
+
+    # sanity check
+    confess "\$prefix is undef" if !defined($prefix);
+
+    my $file_name = join(".", $prefix, @suffixes);
+
+    # Because the parent of "foo" is ".", return $file_name to stay consistent.
+    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
+    # (Done also in path_with_file_name)
+    if ($parent eq '.' && $path !~ m|/|) {
+	return $file_name;
+    }
+
+    return path_push($parent, $file_name);
+}
+
+=head3 path_file_suffix($path)
+
+Returns the suffix of the C<$path>'s file name. If the C<$path> does not have a
+valid file name or if the file name has no suffix, C<undef> is returned
+instead.
+
+The suffix of a file name is essentially its extension, e.g.
+C</etc/resolv.conf> has the suffix C<conf>. If there are multiple suffixes,
+only the last will be returned; e.g. C</tmp/archive.tar.gz> has the suffix C<gz>.
+
+B<Note:> Files like e.g. C</tmp/foo.> have an empty string as suffix.
+
+For getting all suffixes of a path, see C<L<< path_file_suffixes()|/"path_file_suffixes($path)" >>>.
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_file_suffix : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $file_name = path_file_name($path);
+    return undef if !defined($file_name);
+
+    (undef, $file_name) = _path_file_prefix($file_name);
+
+    my @suffixes = _path_file_suffixes($file_name);
+
+    return pop(@suffixes);
+}
+
+=head3 path_with_file_suffix($path, $suffix)
+
+Returns C<$path> with a new C<$suffix>. This is similar to
+C<L<< path_with_file_suffixes()|/"path_with_file_suffixes($path, @suffixes)" >>>,
+except that only the last suffix of the file name is replaced.
+
+If the C<$path> does not have a file name, C<undef> is returned.
+
+    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", "gz");
+    # /tmp/archive.tar.gz
+
+If the file name has no suffixes, the C<$suffix> is appended instead:
+
+    my $new_path = path_with_file_suffix("/etc/resolv", "conf");
+    # /etc/resolv.conf
+
+If C<$suffix> is C<undef>, the file name's (last) suffix will be removed (if
+there is one):
+
+    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", undef);
+    # /tmp/archive.tar
+
+    my $new_path = path_with_file_suffix("/etc/resolv", undef);
+    # /etc/resolv
+
+Note that an empty string is still a valid suffix (an "empty" file ending):
+
+    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", "");
+    # /tmp/archive.tar.
+
+    my $new_path = path_with_file_suffix("/etc/resolv", "");
+    # /etc/resolv.
+
+Throws an exception if any of the arguments is C<undef>, or if C<$suffix>
+contains a path separator (C</>) or a C<.>.
+
+=cut
+
+sub path_with_file_suffix : prototype($$) {
+    my ($path, $suffix) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    if (defined($suffix)) {
+	croak "\$suffix contains a path separator" if $suffix =~ m|/|;
+	croak "\$suffix contains a dot" if $suffix =~ m|\.|;
+    }
+
+    return undef if !defined(path_file_name($path));
+
+    my $parent = path_parent($path);
+
+    # sanity check -- should not happen because we checked for file name,
+    # and the existence of a file name implies there's a parent
+    confess "parent of \$path is undef" if !defined($parent);
+
+    my @suffixes = path_file_suffixes($path);
+
+    # Don't modify $path if there is no suffix to be removed
+    return $path if !scalar(@suffixes) && !defined($suffix);
+
+    pop(@suffixes);
+    push(@suffixes, $suffix) if defined($suffix);
+
+    my $prefix = path_file_prefix($path);
+
+    # sanity check
+    confess "\$prefix is undef" if !defined($prefix);
+
+    my $file_name = join(".", $prefix, @suffixes);
+
+    # Because the parent of "foo" is ".", return $file_name to stay consistent.
+    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
+    # (Done also in path_with_file_name)
+    if ($parent eq '.' && $path !~ m|/|) {
+	return $file_name;
+    }
+
+    return path_push($parent, $file_name);
+}
+
+=head3 path_file_parts($path)
+
+Returns the parts that constitute the file name (prefix and suffixes) of a
+C<$path> as a list. If the C<$path> does not have a valid file name, an empty
+list is returned instead.
+
+In scalar context, returns a reference to a list.
+
+These parts are split in such a way that allows them to be C<join>ed together,
+resulting in the original file name of the given C<$path> again.
+
+    my $file_parts = path_file_parts("/etc/pve/firewall/cluster.fw");
+    # ("cluster", "fw")
+    my $file_name = join(".", $file_parts->@*);
+
+    my @file_parts = path_file_parts("/tmp/archive.tar.gz");
+    # ("archive", "tar", "gz")
+    my $file_name = join(".", @file_parts);
+
+Throws an exception if C<$path> is C<undef>.
+
+=cut
+
+sub path_file_parts : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $file_name = path_file_name($path);
+    if (!defined($file_name)) {
+	return wantarray ? () : [];
+    }
+
+    my $prefix;
+    ($prefix, $file_name) = _path_file_prefix($file_name);
+
+    my @suffixes = _path_file_suffixes($file_name);
+
+    my @file_parts = ($prefix, @suffixes);
+
+    return wantarray ? @file_parts : \@file_parts;
+}
+
+=head3 path_starts_with($path, $other_path)
+
+Checks whether a C<$path> starts with the components of C<$other_path>.
+
+    my $starts_with = path_starts_with("/etc/pve/firewall/cluster.fw", "/etc/pve");
+    # 1
+
+Throws an exception if any of the arguments is C<undef>.
+
+=cut
+
+sub path_starts_with : prototype($$) {
+    my ($path, $other_path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "\$other_path if undef" if !defined($other_path);
+
+    # Nothing starts with nothing
+    return 1 if ($path eq '' && $other_path eq '');
+
+    # Nothing cannot start with something
+    # Something cannot start with nothing
+    return if ($path eq '' || $other_path eq '');
+
+    my @components = path_components($path);
+    my @other_components = path_components($other_path);
+
+    my @pairs = zip_shortest(\@components, \@other_components);
+
+    # for my ($comp, $other_comp) (@pairs) is experimental
+    for my $pair (@pairs) {
+	my ($comp, $other_comp) = $pair->@*;
+
+	if ($comp ne $other_comp) {
+	    return;
+	}
+    }
+
+    return 1;
+}
+
+=head3 path_ends_with($path, $other_path)
+
+Checks whether a C<$path> ends with the components of C<$other_path>.
+
+    my $ends_with = path_ends_with("/etc/pve/firewall/cluster.fw", "firewall/cluster.fw");
+    # 1
+
+Throws an exception if any of the arguments is C<undef>.
+
+=cut
+
+sub path_ends_with : prototype($$) {
+    my ($path, $other_path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "\$other_path if undef" if !defined($other_path);
+
+    # Nothing ends with nothing
+    return 1 if ($path eq '' && $other_path eq '');
+
+    # Nothing cannot end with something
+    # Something cannot end with nothing
+    return if ($path eq '' || $other_path eq '');
+
+    my @components_rev = reverse(path_components($path));
+    my @other_components_rev = reverse(path_components($other_path));
+
+    my @pairs_rev = zip_shortest(\@components_rev, \@other_components_rev);
+
+    # for my ($comp, $other_comp) (@pairs_rev) is experimental
+    for my $pair (@pairs_rev) {
+	my ($comp, $other_comp) = $pair->@*;
+
+	if ($comp ne $other_comp) {
+	    return;
+	}
+    }
+
+    return 1;
+}
+
+=head3 path_equals($path, $other_path)
+
+Checks whether C<$path> equals C<$other_path>. The paths are compared
+by their components, meaning that it's not necessary to
+L<< normalize|/"path_normalize($path)" >> them beforehand.
+
+=cut
+
+sub path_equals : prototype($$) {
+    my ($path, $other_path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+    croak "\$other_path if undef" if !defined($other_path);
+
+    # Nothing is nothing
+    return 1 if ($path eq '' && $other_path eq '');
+
+    # Nothing is not something
+    # Something is not nothing
+    return if ($path eq '' || $other_path eq '');
+
+    my @components = path_components($path);
+    my @other_components = path_components($other_path);
+
+    return if scalar(@components) != scalar(@other_components);
+
+    my @pairs = zip_longest(\@components, \@other_components);
+
+    # for my ($comp, $other_comp) (@pairs_rev) is experimental
+    for my $pair (@pairs) {
+	my ($comp, $other_comp) = $pair->@*;
+
+	return if !defined($comp) || !defined($other_comp);
+
+	if ($comp ne $other_comp) {
+	    return;
+	}
+    }
+
+    return 1;
+}
+
+1;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 02/12] test: add directory for tests of PVE::Path module
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
@ 2024-12-20 18:51 ` Max Carrara
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 03/12] test: add tests for path_is_absolute and path_is_relative of PVE::Path Max Carrara
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:51 UTC (permalink / raw)
  To: pve-devel

Add the test/Path directory as well as test/Path/Makefile, containing
the scaffolding for tests regarding PVE::Path.

Include test/Path as subdirectory in test/Makefile, so
test/Path/Makefile runs when the other tests run, too.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Makefile      |  5 ++++-
 test/Path/Makefile | 16 ++++++++++++++++
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 test/Path/Makefile

diff --git a/test/Makefile b/test/Makefile
index 4e25a46..5c8f157 100644
--- a/test/Makefile
+++ b/test/Makefile
@@ -1,4 +1,7 @@
-SUBDIRS = etc_network_interfaces
+SUBDIRS = etc_network_interfaces	\
+	  Path				\
+
+
 TESTS = lock_file.test			\
 	calendar_event_test.test	\
 	convert_size_test.test		\
diff --git a/test/Path/Makefile b/test/Path/Makefile
new file mode 100644
index 0000000..d091036
--- /dev/null
+++ b/test/Path/Makefile
@@ -0,0 +1,16 @@
+TESTS = \
+
+
+TEST_TARGETS = $(addsuffix .t,$(basename ${TESTS}))
+
+all:
+
+.PHONY: check
+
+check: ${TEST_TARGETS}
+
+%.t: %.pl
+	./$<
+
+distclean: clean
+clean:
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 03/12] test: add tests for path_is_absolute and path_is_relative of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 02/12] test: add directory for tests of PVE::Path module Max Carrara
@ 2024-12-20 18:51 ` Max Carrara
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 04/12] test: add tests for path_components " Max Carrara
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:51 UTC (permalink / raw)
  To: pve-devel

Add cases for both functions, as they're each other's opposite.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile                           |   1 +
 test/Path/path_is_absolute_relative_tests.pl | 122 +++++++++++++++++++
 2 files changed, 123 insertions(+)
 create mode 100755 test/Path/path_is_absolute_relative_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index d091036..5998c19 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -1,4 +1,5 @@
 TESTS = \
+	path_is_absolute_relative_tests.pl			\
 
 
 TEST_TARGETS = $(addsuffix .t,$(basename ${TESTS}))
diff --git a/test/Path/path_is_absolute_relative_tests.pl b/test/Path/path_is_absolute_relative_tests.pl
new file mode 100755
index 0000000..ad36c11
--- /dev/null
+++ b/test/Path/path_is_absolute_relative_tests.pl
@@ -0,0 +1,122 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $path_is_absolute_relative_cases = [
+    {
+	name => "empty path",
+	path => "",
+	is_absolute => undef,
+	is_relative => 1,
+    },
+    {
+	name => "root",
+	path => "/",
+	is_absolute => 1,
+	is_relative => undef,
+    },
+    {
+	name => "single component, relative",
+	path => "foo",
+	is_absolute => undef,
+	is_relative => 1,
+    },
+    {
+	name => "single component, absolute",
+	path => "/foo",
+	is_absolute => 1,
+	is_relative => undef,
+    },
+    {
+	name => "path is undef",
+	path => undef,
+	is_absolute => undef,
+	is_relative => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_is_absolute : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_is_absolute: " . $case->{name};
+
+    my $is_absolute = eval { PVE::Path::path_is_absolute($case->{path}); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to determine whether path is absolute:\n$@");
+	return;
+    }
+
+    # Note: `!is()` isn't the same as `isnt()` -- we want extra output here
+    # if the check fails; can't do that with `isnt()`
+    if (!is($is_absolute, $case->{is_absolute}, $name)) {
+	diag("path = " . $case->{path});
+    }
+
+    return;
+}
+
+sub test_path_is_relative : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_is_relative: " . $case->{name};
+
+    my $is_relative = eval { PVE::Path::path_is_relative($case->{path}); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to determine whether path is relative:\n$@");
+	return;
+    }
+
+    # Note: `!is()` isn't the same as `isnt()` -- we want extra output here
+    # if the check fails; can't do that with `isnt()`
+    if (!is($is_relative, $case->{is_relative}, $name)) {
+	diag("path = " . $case->{path});
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    my $test_subs = [
+	\&test_path_is_absolute,
+	\&test_path_is_relative,
+    ];
+
+    plan(tests => scalar($path_is_absolute_relative_cases->@*) * scalar($test_subs->@*));
+
+    for my $case ($path_is_absolute_relative_cases->@*) {
+	for my $test_sub ($test_subs->@*) {
+	    eval {
+		# suppress warnings here to make output less noisy for certain tests if necessary
+		# local $SIG{__WARN__} = sub {};
+		$test_sub->($case);
+	    };
+	    warn "$@\n" if $@;
+	}
+    }
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 04/12] test: add tests for path_components of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (2 preceding siblings ...)
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 03/12] test: add tests for path_is_absolute and path_is_relative of PVE::Path Max Carrara
@ 2024-12-20 18:51 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 05/12] test: add tests for path_join " Max Carrara
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:51 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile                 |   1 +
 test/Path/path_components_tests.pl | 159 +++++++++++++++++++++++++++++
 2 files changed, 160 insertions(+)
 create mode 100755 test/Path/path_components_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index 5998c19..3f48a38 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -1,4 +1,5 @@
 TESTS = \
+	path_components_tests.pl				\
 	path_is_absolute_relative_tests.pl			\
 
 
diff --git a/test/Path/path_components_tests.pl b/test/Path/path_components_tests.pl
new file mode 100755
index 0000000..0fabda5
--- /dev/null
+++ b/test/Path/path_components_tests.pl
@@ -0,0 +1,159 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $path_components_cases = [
+    {
+	name => "empty path",
+	path => "",
+	components => [],
+    },
+    {
+	name => "root",
+	path => "/",
+	components => ["/"],
+    },
+    {
+	name => "current path reference",
+	path => ".",
+	components => ["."],
+    },
+    {
+	name => "parent directory reference",
+	path => "..",
+	components => [".."],
+    },
+    {
+	name => "single component, relative",
+	path => "foo",
+	components => ["foo"],
+    },
+    {
+	name => "single component, absolute",
+	path => "/foo",
+	components => ["/", "foo"],
+    },
+    {
+	name => "multiple components, relative",
+	path => "foo/bar/baz/quo/qux",
+	components => ["foo", "bar", "baz", "quo", "qux"],
+    },
+    {
+	name => "multiple components, absolute",
+	path => "/foo/bar/baz/quo/qux",
+	components => ["/", "foo", "bar", "baz", "quo", "qux"],
+    },
+    {
+	name => "multiple components,"
+	    . " starting with current path reference,"
+	    . " with redundant current path references",
+	path => "././foo/./bar/././baz/./././quo/././././qux/./.",
+	components => [".", "foo", "bar", "baz", "quo", "qux"],
+    },
+    {
+	name => "multiple components, with parent directory references",
+	path => "../../foo/../bar/../../baz/../../../quo/../../../../qux/../..",
+	components => [
+	    "..", "..", "foo",
+	    "..", "bar",
+	    "..", "..", "baz",
+	    "..", "..", "..", "quo",
+	    "..", "..", "..", "..", "qux",
+	    "..", "..",
+	],
+    },
+    {
+	name => "multiple components, with redundant path separators",
+	path => "foo//bar///baz////quo/////qux//////",
+	components => ["foo", "bar", "baz", "quo", "qux"],
+    },
+    {
+	name => "root path, with redundant path separators",
+	path => "//////////",
+	components => ["/"],
+    },
+    {
+	name => "root path, with redundant current path references",
+	path => "/./././././././.",
+	components => ["/"],
+    },
+    {
+	name => "current path reference, with redundant path separators",
+	path => ".//////////",
+	components => ["."],
+    },
+    {
+	name => "current path reference, with redundant current path references",
+	path => "././././././././.",
+	components => ["."],
+    },
+    {
+	name => "multiple components,"
+	    . " absolute,"
+	    . " with redundant path separators,"
+	    . " with redundant current path references,"
+	    . " with parent directory references",
+	path => "///././//foo//.//bar////././//.///baz//..///.././quo/./.././/qux//././/",
+	components => ["/", "foo", "bar", "baz", "..", "..", "quo", "..", "qux"],
+    },
+    {
+	name => "path is undef",
+	path => undef,
+	components => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_components : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_components: " . $case->{name};
+
+    my $components = eval { PVE::Path::path_components($case->{path}); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to get components of path:\n$@");
+	return;
+    }
+
+    if (!is_deeply($components, $case->{components}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{components}));
+	diag("=== Got ===");
+	diag(explain($components));
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    plan(tests => scalar($path_components_cases->@*));
+
+    for my $case ($path_components_cases->@*) {
+	eval {
+	    # suppress warnings here to make output less noisy for certain tests if necessary
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_components($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 05/12] test: add tests for path_join of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (3 preceding siblings ...)
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 04/12] test: add tests for path_components " Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 06/12] test: add tests for path_push " Max Carrara
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

Add not just tests for the path_join function alone, but also a
separate test subroutine that checks whether a joined path remains the
same after splitting (with path_components) and joining it a second
time.

Because the path doesn't strictly have to be the *same string* after
it is split and joined a second time (it only has to be *logically*),
normalize both the re-joined and the expected path before comparison.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile           |   1 +
 test/Path/path_join_tests.pl | 310 +++++++++++++++++++++++++++++++++++
 2 files changed, 311 insertions(+)
 create mode 100755 test/Path/path_join_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index 3f48a38..08d34ac 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -1,6 +1,7 @@
 TESTS = \
 	path_components_tests.pl				\
 	path_is_absolute_relative_tests.pl			\
+	path_join_tests.pl					\
 
 
 TEST_TARGETS = $(addsuffix .t,$(basename ${TESTS}))
diff --git a/test/Path/path_join_tests.pl b/test/Path/path_join_tests.pl
new file mode 100755
index 0000000..1a2eb72
--- /dev/null
+++ b/test/Path/path_join_tests.pl
@@ -0,0 +1,310 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $cases = [
+    {
+	name => "no components",
+	components => [],
+	joined => "",
+    },
+    {
+	name => "one component, relative",
+	components => ["foo"],
+	joined => "foo",
+    },
+    {
+	name => "one component, with root",
+	components => ["/", "foo"],
+	joined => "/foo",
+    },
+    {
+	name => "current path reference",
+	components => ["."],
+	joined => ".",
+    },
+    {
+	name => "multiple components, relative",
+	components => ["foo", "bar", "baz"],
+	joined => "foo/bar/baz",
+    },
+    {
+	name => "multiple components, with root",
+	components => ["/", "foo", "bar", "baz"],
+	joined => "/foo/bar/baz",
+    },
+    {
+	name => "multiple components, root inbetween",
+	components => ["foo", "bar", "/", "baz", "quo"],
+	joined => "/baz/quo",
+    },
+    {
+	name => "multiple components, with root, root inbetween",
+	components => ["/", "foo", "bar", "/", "baz", "quo"],
+	joined => "/baz/quo",
+    },
+    {
+	name => "multiple components, root at end",
+	components => ["foo", "bar", "baz", "/"],
+	joined => "/",
+    },
+    {
+	name => "multiple components, with root, root at end",
+	components => ["/", "foo", "bar", "baz", "/"],
+	joined => "/",
+    },
+    {
+	name => "multiple components, current path references inbetween",
+	components => ["foo", ".", "bar", ".", ".", "baz"],
+	joined => "foo/./bar/././baz",
+    },
+    {
+	name => "multiple components, with root, current path references inbetween",
+	components => ["/", "foo", ".", "bar", ".", ".", "baz"],
+	joined => "/foo/./bar/././baz",
+    },
+    {
+	name => "multiple components, current path references at end",
+	components => ["foo", "bar", ".", "."],
+	joined => "foo/bar/./.",
+    },
+    {
+	name => "multiple components, with root, current path references at end",
+	components => ["/", "foo", "bar", ".", "."],
+	joined => "/foo/bar/./.",
+    },
+    {
+	name => "multiple components, current path reference at start",
+	components => [".", "foo", "bar"],
+	joined => "./foo/bar",
+    },
+    {
+	name => "multiple components, parent path references inbetween",
+	components => ["foo", "..", "bar", "..", "..", "baz"],
+	joined => "foo/../bar/../../baz",
+    },
+    {
+	name => "multiple components, with root, parent path references inbetween",
+	components => ["/", "foo", "..", "bar", "..", "..", "baz"],
+	joined => "/foo/../bar/../../baz",
+    },
+    {
+	name => "multiple components, parent path references at end",
+	components => ["foo", "bar", "..", ".."],
+	joined => "foo/bar/../..",
+    },
+    {
+	name => "multiple components, with root, parent path references at end",
+	components => ["/", "foo", "bar", "..", ".."],
+	joined => "/foo/bar/../..",
+    },
+    {
+	name => "multiple components, parent path reference at start",
+	components => ["..", "foo", "bar"],
+	joined => "../foo/bar",
+    },
+    {
+	name => "relative paths (2)",
+	components => ["foo/bar", "baz/quo"],
+	joined => "foo/bar/baz/quo",
+    },
+    {
+	name => "relative paths (3)",
+	components => ["foo/bar", "baz/quo", "one/two/three"],
+	joined => "foo/bar/baz/quo/one/two/three",
+    },
+    {
+	name => "relative paths (2) with root inbetween",
+	components => ["foo/bar", "/","baz/quo"],
+	joined => "/baz/quo",
+    },
+    {
+	name => "relative paths (3) with root inbetween",
+	components => ["foo/bar", "/","baz/quo", "/", "one/two/three"],
+	joined => "/one/two/three",
+    },
+    {
+	name => "absolute paths (2)",
+	components => ["/foo/bar", "/baz/quo"],
+	joined => "/baz/quo",
+    },
+    {
+	name => "relative paths (2, not normalized)",
+	components => ["foo/.///.//.///bar", "baz/.////./quo"],
+	joined => "foo/.///.//.///bar/baz/.////./quo",
+    },
+    {
+	name => "relative paths (3, not normalized)",
+	components => ["foo/.///.//.///bar", "baz/.////./quo", "one/two//three///"],
+	joined => "foo/.///.//.///bar/baz/.////./quo/one/two//three///",
+    },
+    {
+	name => "relative paths (2), trailing slashes",
+	components => ["foo/bar/", "baz/quo/"],
+	joined => "foo/bar/baz/quo/",
+    },
+    {
+	name => "relative paths (3), trailing slashes",
+	components => ["foo/bar/", "baz/quo", "one/two/three/"],
+	joined => "foo/bar/baz/quo/one/two/three/",
+    },
+    {
+	name => "relative path and empty path at end",
+	components => ["foo/bar", ""],
+	joined => "foo/bar",
+    },
+    {
+	name => "relative path and empty paths at end (3)",
+	components => ["foo/bar", "", "", ""],
+	joined => "foo/bar",
+    },
+    {
+	name => "relative path and empty path at start",
+	components => ["", "foo/bar"],
+	joined => "foo/bar",
+    },
+    {
+	name => "relative path and empty paths at start (3)",
+	components => ["", "", "", "foo/bar"],
+	joined => "foo/bar",
+    },
+    {
+	name => "relative paths (2) and empty paths at start, middle, end (2)",
+	components => ["", "", "foo/bar", "", "", "baz/quo", "", ""],
+	joined => "foo/bar/baz/quo",
+    },
+    {
+	name => "relative paths (2) and empty paths at start, middle, end (2), with root at start",
+	components => ["/", "", "", "foo/bar", "", "", "baz/quo", "", ""],
+	joined => "/foo/bar/baz/quo",
+    },
+    {
+	name => "relative paths (2) and empty paths at start, middle, end (2), with root in middle",
+	components => ["", "", "foo/bar", "", "/", "", "baz/quo", "", ""],
+	joined => "/baz/quo",
+    },
+    {
+	name => "undef among paths",
+	components => ["foo", "bar/baz", undef, "quo", "qux"],
+	joined => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_join : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_join: " . $case->{name};
+
+    my $joined = eval { PVE::Path::path_join($case->{components}->@*); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to join components of path:\n$@");
+	return;
+    }
+
+    if (!is($joined, $case->{joined}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{joined}));
+	diag("=== Got ===");
+	diag(explain($joined));
+    }
+
+    return;
+}
+
+# This is basically the same as above, but checks whether the joined path
+# is still the same when normalized after splitting and joining it again.
+sub test_path_join_consistent : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_join (consistency): " . $case->{name};
+
+    my $joined = eval { PVE::Path::path_join($case->{components}->@*); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to join components of path:\n$@");
+	return;
+    }
+
+    my $joined_again = eval {
+	my @components = PVE::Path::path_components($joined);
+	PVE::Path::path_join(@components);
+    };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to re-join previously joined path:\n$@");
+	return;
+    }
+
+    my $normalized = eval { PVE::Path::path_normalize($joined_again); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to normalize re-joined path:\n$@");
+	return;
+    }
+
+    my $expected_normalized = eval { PVE::Path::path_normalize($case->{joined}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to normalize expected path:\n$@");
+	return;
+    }
+
+    if (!is($normalized, $expected_normalized, $name)) {
+	diag("=== Expected ===");
+	diag(explain($expected_normalized));
+	diag("=== Got ===");
+	diag(explain($normalized));
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    my $test_subs = [
+	\&test_path_join,
+	\&test_path_join_consistent,
+    ];
+
+    plan(tests => scalar($cases->@*) * scalar($test_subs->@*));
+
+    for my $case ($cases->@*) {
+	for my $test_sub ($test_subs->@*) {
+	    eval {
+		# suppress warnings here to make output less noisy for certain tests if necessary
+		# local $SIG{__WARN__} = sub {};
+		$test_sub->($case);
+	    };
+	    warn "$@\n" if $@;
+	}
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 06/12] test: add tests for path_push of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (4 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 05/12] test: add tests for path_join " Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 07/12] test: add tests for path_parent " Max Carrara
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile           |   1 +
 test/Path/path_push_tests.pl | 159 +++++++++++++++++++++++++++++++++++
 2 files changed, 160 insertions(+)
 create mode 100755 test/Path/path_push_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index 08d34ac..9dd95f1 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -2,6 +2,7 @@ TESTS = \
 	path_components_tests.pl				\
 	path_is_absolute_relative_tests.pl			\
 	path_join_tests.pl					\
+	path_push_tests.pl					\
 
 
 TEST_TARGETS = $(addsuffix .t,$(basename ${TESTS}))
diff --git a/test/Path/path_push_tests.pl b/test/Path/path_push_tests.pl
new file mode 100755
index 0000000..3b006a0
--- /dev/null
+++ b/test/Path/path_push_tests.pl
@@ -0,0 +1,159 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $cases = [
+    {
+	name => "push empty onto empty path",
+	path => "",
+	to_push => "",
+	pushed => "",
+    },
+    {
+	name => "push empty onto root",
+	path => "/",
+	to_push => "",
+	pushed => "/",
+    },
+    {
+	name => "push single component onto empty path",
+	path => "",
+	to_push => "foo",
+	pushed => "foo",
+    },
+    {
+	name => "push single component onto root",
+	path => "/",
+	to_push => "foo",
+	pushed => "/foo",
+    },
+    {
+	name => "push single component onto single component",
+	path => "foo",
+	to_push => "bar",
+	pushed => "foo/bar",
+    },
+    {
+	name => "push single component onto single component with trailing slash",
+	path => "foo/",
+	to_push => "bar",
+	pushed => "foo/bar",
+    },
+    {
+	name => "push single component with trailing slath onto single component",
+	path => "foo",
+	to_push => "bar/",
+	pushed => "foo/bar/",
+    },
+    {
+	name => "push single component with trailing slash"
+	    . " onto single component with trailing slash",
+	path => "foo/",
+	to_push => "bar/",
+	pushed => "foo/bar/",
+    },
+    {
+	name => "push relative path onto relative path",
+	path => "foo/bar",
+	to_push => "baz/quo",
+	pushed => "foo/bar/baz/quo",
+    },
+    {
+	name => "push relative path onto relative path with trailing slash",
+	path => "foo/bar/",
+	to_push => "baz/quo",
+	pushed => "foo/bar/baz/quo",
+    },
+    {
+	name => "push relative path with trailing slash onto relative path",
+	path => "foo/bar",
+	to_push => "baz/quo/",
+	pushed => "foo/bar/baz/quo/",
+    },
+    {
+	name => "push relative path with trailing slash onto relative path with trailing slash",
+	path => "foo/bar/",
+	to_push => "baz/quo/",
+	pushed => "foo/bar/baz/quo/",
+    },
+    {
+	name => "push root onto relative path",
+	path => "foo/bar",
+	to_push => "/",
+	pushed => "/",
+    },
+    {
+	name => "push root onto absolute path",
+	path => "/foo/bar",
+	to_push => "/",
+	pushed => "/",
+    },
+    {
+	name => "push absolute path onto relative path",
+	path => "foo/bar",
+	to_push => "/baz/quo",
+	pushed => "/baz/quo",
+    },
+    {
+	name => "push absolute path onto absolute path",
+	path => "/foo/bar",
+	to_push => "/baz/quo",
+	pushed => "/baz/quo",
+    },
+];
+
+sub test_path_push : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_push: " . $case->{name};
+
+    my $pushed = eval { PVE::Path::path_push($case->{path}, $case->{to_push}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to push onto path:\n$@");
+	return;
+    }
+
+    if (!is($pushed, $case->{pushed}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{pushed}));
+	diag("=== Got ===");
+	diag(explain($pushed));
+    }
+
+    return;
+}
+
+
+sub main : prototype() {
+    my $test_subs = [
+	\&test_path_push,
+    ];
+
+    plan(tests => scalar($cases->@*) * scalar($test_subs->@*));
+
+    for my $case ($cases->@*) {
+	for my $test_sub ($test_subs->@*) {
+	    eval {
+		# suppress warnings here to make output less noisy for certain tests if necessary
+		# local $SIG{__WARN__} = sub {};
+		$test_sub->($case);
+	    };
+	    warn "$@\n" if $@;
+	}
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 07/12] test: add tests for path_parent of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (5 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 06/12] test: add tests for path_push " Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 08/12] test: add tests for path_starts_with, path_ends_with, path_equals Max Carrara
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

What's important to note is that among the cases added are also some
that specifically account for Perl's File::Spec->canonpath treating
paths consisting of a single component that is neither "/" or "." as
the same. This means that e.g. "foo" and "./foo" are both the same
(canonpath returns "foo").

Because canonpath also returns "" instead of "." for the empty path
"", the behaviour of path_parent should be consistent with that of
Perl's canonpath (and in turn therefore also for path_normalize, which
wraps canonpath).

The path_parent function should otherwise work the same as Rust's
std::path::Path::parent [1]. The discrepancies as mentioned above are
checked for by the tests introduced here.

[1]: https://doc.rust-lang.org/std/path/struct.Path.html#method.parent

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile             |   1 +
 test/Path/path_parent_tests.pl | 153 +++++++++++++++++++++++++++++++++
 2 files changed, 154 insertions(+)
 create mode 100755 test/Path/path_parent_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index 9dd95f1..a2b5bb1 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -2,6 +2,7 @@ TESTS = \
 	path_components_tests.pl				\
 	path_is_absolute_relative_tests.pl			\
 	path_join_tests.pl					\
+	path_parent_tests.pl					\
 	path_push_tests.pl					\
 
 
diff --git a/test/Path/path_parent_tests.pl b/test/Path/path_parent_tests.pl
new file mode 100755
index 0000000..9dfa301
--- /dev/null
+++ b/test/Path/path_parent_tests.pl
@@ -0,0 +1,153 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $path_parent_cases = [
+    {
+	name => "empty path",
+	path => "",
+	parent => undef,
+    },
+    {
+	name => "root",
+	path => "/",
+	parent => undef,
+    },
+    {
+	name => "current path reference",
+	path => ".",
+	parent => "",
+    },
+    # Why? Because File::Spec->canonpath interprets "./foo" and "foo"
+    # as the same thing, and the parent of "./foo" is "."
+    {
+	name => "single component, relative",
+	path => "foo",
+	parent => ".",
+    },
+    # Same as above!
+    {
+	name => "parent directory reference",
+	path => "..",
+	parent => ".",
+    },
+    {
+	name => "single component, absolute",
+	path => "/foo",
+	parent => "/",
+    },
+    {
+	name => "multiple components, relative",
+	path => "foo/bar/baz/quo/qux",
+	parent => "foo/bar/baz/quo",
+    },
+    {
+	name => "multiple components, absolute",
+	path => "/foo/bar/baz/quo/qux",
+	parent => "/foo/bar/baz/quo",
+    },
+    {
+	name => "multiple components, with redundant current path references",
+	path => "foo/bar/./baz/././quo/./././qux/././.",
+	parent => "foo/bar/./baz/././quo",
+    },
+    {
+	name => "multiple components, with parent directory references",
+	path => "foo/bar/../baz/../../quo/../../../qux/../../..",
+	parent => "foo/bar/../baz/../../quo/../../../qux/../..",
+    },
+    {
+	name => "multiple components, with redundant path separators",
+	path => "foo//bar///baz////quo/////qux//////",
+	parent => "foo//bar///baz////quo",
+    },
+    {
+	name => "root path, with redundant path separators",
+	path => "///////////",
+	parent => undef,
+    },
+    {
+	name => "root path, with redundant current path references",
+	path => "/././././././././././.",
+	parent => undef,
+    },
+    {
+	name => "current path reference, with redundant path separators",
+	path => ".///////////",
+	parent => "",
+    },
+    {
+	name => "current path reference, with redundant current path references",
+	path => "././././././././././.",
+	parent => "",
+    },
+    {
+	name => "multiple components,"
+	    . " with redundant path separators,"
+	    . " with redundant current path references",
+	path => "foo/.///./bar//././////././/baz//././///././././///",
+	parent => "foo/.///./bar",
+    },
+    {
+	name => "path is undef",
+	path => undef,
+	parent => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_parent : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_parent: " . $case->{name};
+
+    my $parent = eval { PVE::Path::path_parent($case->{path}); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to get parent from path:\n$@");
+	return;
+    }
+
+    # Note: `!is()` isn't the same as `isnt()` -- we want extra output here
+    # if the check fails; can't do that with `isnt()`
+    if (!is($parent, $case->{parent}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{parent}));
+	diag("=== Got ===");
+	diag(explain($parent));
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    plan(tests => scalar($path_parent_cases->@*));
+
+    for my $case ($path_parent_cases->@*) {
+	eval {
+	    # suppress warnings here to make output less noisy for certain tests if necessary
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_parent($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 08/12] test: add tests for path_starts_with, path_ends_with, path_equals
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (6 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 07/12] test: add tests for path_parent " Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 09/12] test: add tests for file path ops functions of PVE::Path Max Carrara
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

.. of PVE::Path.

Each function has its test cases defined separately to avoid running
unnecessary tests / repeating equivalent tests for certain things.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile                 |   1 +
 test/Path/path_comparison_tests.pl | 851 +++++++++++++++++++++++++++++
 2 files changed, 852 insertions(+)
 create mode 100755 test/Path/path_comparison_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index a2b5bb1..627dc09 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -1,4 +1,5 @@
 TESTS = \
+	path_comparison_tests.pl				\
 	path_components_tests.pl				\
 	path_is_absolute_relative_tests.pl			\
 	path_join_tests.pl					\
diff --git a/test/Path/path_comparison_tests.pl b/test/Path/path_comparison_tests.pl
new file mode 100755
index 0000000..928edc1
--- /dev/null
+++ b/test/Path/path_comparison_tests.pl
@@ -0,0 +1,851 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $path_starts_with_cases = [
+    {
+	name => "empty path starts with empty path",
+	path => "",
+	other_path => "",
+	expected => 1,
+    },
+    {
+	name => "root starts with empty path",
+	path => "/",
+	other_path => "",
+	expected => undef,
+    },
+    {
+	name => "empty path starts with root",
+	path => "",
+	other_path => "/",
+	expected => undef,
+    },
+    {
+	name => "foo starts with foo",
+	path => "foo",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo/ starts with foo",
+	path => "foo/",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo starts with foo/",
+	path => "foo",
+	other_path => "foo/",
+	expected => 1,
+    },
+    {
+	name => "foo/ starts with foo/",
+	path => "foo/",
+	other_path => "foo/",
+	expected => 1,
+    },
+    {
+	name => "foo starts with bar",
+	path => "foo",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo/ starts with bar",
+	path => "foo/",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo starts with bar/",
+	path => "foo",
+	other_path => "bar/",
+	expected => undef,
+    },
+    {
+	name => "foo/ starts with bar/",
+	path => "foo/",
+	other_path => "bar/",
+	expected => undef,
+    },
+    {
+	name => "/foo starts with /",
+	path => "/foo",
+	other_path => "/",
+	expected => 1,
+    },
+    {
+	name => "/foo starts with /foo",
+	path => "/foo",
+	other_path => "/foo",
+	expected => 1,
+    },
+    {
+	name => "/foo starts with foo",
+	path => "/foo",
+	other_path => "foo",
+	expected => undef,
+    },
+    {
+	name => "foo/bar starts with foo",
+	path => "foo/bar",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo/bar starts with foo/bar",
+	path => "foo/bar",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar starts with foo/bar/",
+	path => "foo/bar",
+	other_path => "foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/ starts with foo/bar",
+	path => "foo/bar/",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/ starts with foo/bar/",
+	path => "foo/bar/",
+	other_path => "foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar starts with /foo",
+	path => "/foo/bar",
+	other_path => "/foo",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar starts with /foo/bar",
+	path => "/foo/bar",
+	other_path => "/foo/bar",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/ starts with /foo/bar",
+	path => "/foo/bar/",
+	other_path => "/foo/bar",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar starts with /foo/bar/",
+	path => "/foo/bar",
+	other_path => "/foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/ starts with /foo/bar/",
+	path => "/foo/bar/",
+	other_path => "/foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/baz/quo/qux starts with foo/bar/baz/quo",
+	path => "foo/bar/baz/quo/qux",
+	other_path => "foo/bar/baz/quo",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux starts with /foo/bar/baz/quo",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "/foo/bar/baz/quo",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/baz/quo/qux starts with one/two/three",
+	path => "foo/bar/baz/quo/qux",
+	other_path => "one/two/three",
+	expected => undef,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux starts with /one/two/three",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "/one/two/three",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " starts with /etc/pve",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " starts with /etc/pve/firewall/cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve/firewall/cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " starts with"
+	    . " ///etc/////././././////pve//./././firewall/.//././././././././///cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "///etc/////././././////pve//./././firewall/.//././././././././///cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " starts with /etc/ceph",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/ceph",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " starts with /etc/pve/firewall/cluster.fw.gz",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve/firewall/cluster.fw.gz",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " starts with"
+	    . " ///etc/////././././////pve/oh/no/./././firewall/.//././././././././///cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "///etc/////././././////pve/oh/no/./././firewall/.//././././././././///cluster.fw",
+	expected => undef,
+    },
+    {
+	name => "foo/../bar starts with foo/..",
+	path => "foo/../bar",
+	other_path => "foo/..",
+	expected => 1,
+    },
+    {
+	name => "foo/./bar starts with foo/bar",
+	path => "foo/./bar",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+];
+
+sub test_path_starts_with : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_starts_with: " . $case->{name};
+
+    my $result = eval {
+	PVE::Path::path_starts_with($case->{path}, $case->{other_path});
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Encountered exception while running path_starts_with():\n$@");
+	return;
+    }
+
+    if (!is($result, $case->{expected}, $name)) {
+	diag("path       = " . $case->{path});
+	diag("             (" . join(", ", PVE::Path::path_components($case->{path})) . ")");
+	diag("other_path = " . $case->{other_path});
+	diag("             (" . join(", ", PVE::Path::path_components($case->{other_path})) . ")");
+    }
+
+    return;
+}
+
+my $path_ends_with_cases = [
+    {
+	name => "empty path ends with empty path",
+	path => "",
+	other_path => "",
+	expected => 1,
+    },
+    {
+	name => "root ends with empty path",
+	path => "/",
+	other_path => "",
+	expected => undef,
+    },
+    {
+	name => "empty path ends with root",
+	path => "",
+	other_path => "/",
+	expected => undef,
+    },
+    {
+	name => "foo ends with foo",
+	path => "foo",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo/ ends with foo",
+	path => "foo/",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo ends with foo/",
+	path => "foo",
+	other_path => "foo/",
+	expected => 1,
+    },
+    {
+	name => "foo/ ends with foo/",
+	path => "foo/",
+	other_path => "foo/",
+	expected => 1,
+    },
+    {
+	name => "foo ends with bar",
+	path => "foo",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo/ ends with bar",
+	path => "foo/",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo ends with bar/",
+	path => "foo",
+	other_path => "bar/",
+	expected => undef,
+    },
+    {
+	name => "foo/ ends with bar/",
+	path => "foo/",
+	other_path => "bar/",
+	expected => undef,
+    },
+    {
+	name => "/foo ends with /",
+	path => "/foo",
+	other_path => "/",
+	expected => undef,
+    },
+    {
+	name => "/foo ends with /foo",
+	path => "/foo",
+	other_path => "/foo",
+	expected => 1,
+    },
+    {
+	name => "/foo ends with foo",
+	path => "/foo",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo/bar ends with foo",
+	path => "foo/bar",
+	other_path => "foo",
+	expected => undef,
+    },
+    {
+	name => "foo/bar ends with bar",
+	path => "foo/bar",
+	other_path => "bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar ends with foo/bar",
+	path => "foo/bar",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar ends with foo/bar/",
+	path => "foo/bar",
+	other_path => "foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/ ends with foo/bar",
+	path => "foo/bar/",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/ ends with foo/bar/",
+	path => "foo/bar/",
+	other_path => "foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar ends with /foo",
+	path => "/foo/bar",
+	other_path => "/foo",
+	expected => undef,
+    },
+    {
+	name => "/foo/bar ends with /foo/bar",
+	path => "/foo/bar",
+	other_path => "/foo/bar",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/ ends with /foo/bar",
+	path => "/foo/bar/",
+	other_path => "/foo/bar",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar ends with /foo/bar/",
+	path => "/foo/bar",
+	other_path => "/foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/ ends with /foo/bar/",
+	path => "/foo/bar/",
+	other_path => "/foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/baz/quo/qux ends with bar/baz/quo/qux",
+	path => "foo/bar/baz/quo/qux",
+	other_path => "bar/baz/quo/qux",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux ends with bar/baz/quo/qux",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "bar/baz/quo/qux",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/baz/quo/qux ends with one/two/three",
+	path => "foo/bar/baz/quo/qux",
+	other_path => "one/two/three",
+	expected => undef,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux ends with /one/two/three",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "/one/two/three",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " ends with firewall/cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "firewall/cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " ends with /etc/pve/firewall/cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve/firewall/cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " ends with"
+	    . " ///etc/////././././////pve//./././firewall/.//././././././././///cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "///etc/////././././////pve//./././firewall/.//././././././././///cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " ends with firewall/cluster.fw.gz",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "firewall/cluster.fw.gz",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " ends with /etc/pve/firewall/cluster.fw.gz",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve/firewall/cluster.fw.gz",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " ends with"
+	    . " ///etc/////././././////pve/oh/no/./././firewall/.//././././././././///cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "///etc/////././././////pve/oh/no/./././firewall/.//././././././././///cluster.fw",
+	expected => undef,
+    },
+    {
+	name => "foo/../bar ends with ../bar",
+	path => "foo/../bar",
+	other_path => "../bar",
+	expected => 1,
+    },
+    {
+	name => "foo/./bar ends with foo/bar",
+	path => "foo/./bar",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+];
+
+sub test_path_ends_with : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_ends_with: " . $case->{name};
+
+    my $result = eval {
+	PVE::Path::path_ends_with($case->{path}, $case->{other_path});
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Encountered exception while running path_ends_with():\n$@");
+	return;
+    }
+
+    if (!is($result, $case->{expected}, $name)) {
+	diag("path       = " . $case->{path});
+	diag("             (" . join(", ", PVE::Path::path_components($case->{path})) . ")");
+	diag("other_path = " . $case->{other_path});
+	diag("             (" . join(", ", PVE::Path::path_components($case->{other_path})) . ")");
+    }
+
+    return;
+}
+
+my $path_equals_cases = [
+    {
+	name => "empty path equals empty path",
+	path => "",
+	other_path => "",
+	expected => 1,
+    },
+    {
+	name => "root equals empty path",
+	path => "/",
+	other_path => "",
+	expected => undef,
+    },
+    {
+	name => "empty path equals root",
+	path => "",
+	other_path => "/",
+	expected => undef,
+    },
+    {
+	name => "foo equals foo",
+	path => "foo",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo/ equals foo",
+	path => "foo/",
+	other_path => "foo",
+	expected => 1,
+    },
+    {
+	name => "foo equals foo/",
+	path => "foo",
+	other_path => "foo/",
+	expected => 1,
+    },
+    {
+	name => "foo/ equals foo/",
+	path => "foo/",
+	other_path => "foo/",
+	expected => 1,
+    },
+    {
+	name => "foo equals bar",
+	path => "foo",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo/ equals bar",
+	path => "foo/",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo equals bar/",
+	path => "foo",
+	other_path => "bar/",
+	expected => undef,
+    },
+    {
+	name => "foo/ equals bar/",
+	path => "foo/",
+	other_path => "bar/",
+	expected => undef,
+    },
+    {
+	name => "/foo equals /",
+	path => "/foo",
+	other_path => "/",
+	expected => undef,
+    },
+    {
+	name => "/foo equals /foo",
+	path => "/foo",
+	other_path => "/foo",
+	expected => 1,
+    },
+    {
+	name => "/foo equals foo",
+	path => "/foo",
+	other_path => "foo",
+	expected => undef,
+    },
+    {
+	name => "foo/bar equals foo",
+	path => "foo/bar",
+	other_path => "foo",
+	expected => undef,
+    },
+    {
+	name => "foo/bar equals bar",
+	path => "foo/bar",
+	other_path => "bar",
+	expected => undef,
+    },
+    {
+	name => "foo/bar equals foo/bar",
+	path => "foo/bar",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar equals foo/bar/",
+	path => "foo/bar",
+	other_path => "foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/ equals foo/bar",
+	path => "foo/bar/",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/ equals foo/bar/",
+	path => "foo/bar/",
+	other_path => "foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar equals /foo",
+	path => "/foo/bar",
+	other_path => "/foo",
+	expected => undef,
+    },
+    {
+	name => "/foo/bar equals /foo/bar",
+	path => "/foo/bar",
+	other_path => "/foo/bar",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/ equals /foo/bar",
+	path => "/foo/bar/",
+	other_path => "/foo/bar",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar equals /foo/bar/",
+	path => "/foo/bar",
+	other_path => "/foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/ equals /foo/bar/",
+	path => "/foo/bar/",
+	other_path => "/foo/bar/",
+	expected => 1,
+    },
+    {
+	name => "foo/bar/baz/quo/qux equals foo/bar/baz/quo/qux",
+	path => "foo/bar/baz/quo/qux",
+	other_path => "foo/bar/baz/quo/qux",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux equals /foo/bar/baz/quo/qux",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "/foo/bar/baz/quo/qux",
+	expected => 1,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux equals foo/bar/baz/quo/qux",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "foo/bar/baz/quo/qux",
+	expected => undef,
+    },
+    {
+	name => "foo/bar/baz/quo/qux equals one/two/three",
+	path => "foo/bar/baz/quo/qux",
+	other_path => "one/two/three",
+	expected => undef,
+    },
+    {
+	name => "/foo/bar/baz/quo/qux equals /one/two/three",
+	path => "/foo/bar/baz/quo/qux",
+	other_path => "/one/two/three",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals /etc/pve",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals firewall/cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "firewall/cluster.fw",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals /etc/pve/firewall/cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve/firewall/cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals"
+	    . " ///etc/////././././////pve//./././firewall/.//././././././././///cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "///etc/////././././////pve//./././firewall/.//././././././././///cluster.fw",
+	expected => 1,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals firewall/cluster.fw.gz",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "firewall/cluster.fw.gz",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals /etc/pve/firewall/cluster.fw.gz",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "/etc/pve/firewall/cluster.fw.gz",
+	expected => undef,
+    },
+    {
+	name => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw"
+	    . " equals"
+	    . " ///etc/////././././////pve/oh/no/./././firewall/.//././././././././///cluster.fw",
+	path => "//./././///././//etc/.//./pve/././//.//firewall/././cluster.fw",
+	other_path => "///etc/////././././////pve/oh/no/./././firewall/.//././././././././///cluster.fw",
+	expected => undef,
+    },
+    {
+	name => "foo/../bar equals foo/..",
+	path => "foo/../bar",
+	other_path => "foo/..",
+	expected => undef,
+    },
+    {
+	name => "foo/../bar equals ../bar",
+	path => "foo/../bar",
+	other_path => "../bar",
+	expected => undef,
+    },
+    {
+	name => "foo/./bar equals foo/bar",
+	path => "foo/./bar",
+	other_path => "foo/bar",
+	expected => 1,
+    },
+];
+
+sub test_path_equals : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_equals: " . $case->{name};
+
+    my $result = eval {
+	PVE::Path::path_equals($case->{path}, $case->{other_path});
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Encountered exception while running path_equals():\n$@");
+	return;
+    }
+
+    if (!is($result, $case->{expected}, $name)) {
+	diag("path       = " . $case->{path});
+	diag("             (" . join(", ", PVE::Path::path_components($case->{path})) . ")");
+	diag("other_path = " . $case->{other_path});
+	diag("             (" . join(", ", PVE::Path::path_components($case->{other_path})) . ")");
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    plan(
+	tests => scalar($path_starts_with_cases->@*)
+	    + scalar($path_ends_with_cases->@*)
+	    + scalar($path_equals_cases->@*)
+    );
+
+    for my $case ($path_starts_with_cases->@*) {
+	eval {
+	    # suppress warnings here to make output less noisy for certain tests if necessary
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_starts_with($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    for my $case ($path_ends_with_cases->@*) {
+	eval {
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_ends_with($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    for my $case ($path_equals_cases->@*) {
+	eval {
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_equals($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 09/12] test: add tests for file path ops functions of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (7 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 08/12] test: add tests for path_starts_with, path_ends_with, path_equals Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 10/12] test: add tests for path_normalize " Max Carrara
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

Add tests for the functions path_file_name, path_file_prefix,
path_file_suffix and path_file_suffixes as well as their path_with_*
counterparts.

The cases defined here are a bit more elaborate than the others,
because manipulating file names specifically is more insidious as one
might think at first.

For example, getting the suffix of a file like /etc/resolv.conf is
rather trivial, as one can e.g. just take the last component and split
it at the dot, but for files like /foo/bar/...oh...no.. (yes, this has
a valid file name) it's much more tricker and can't actually be
performed via Perl's inbuilt split function.

That's why some of the cases added here account for weird file names,
files with more than two suffixes, etc. The functions above must of
course always work in a consistent manner, even if a file starts with
leading dots or has an arbitrarily high number of suffixes.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile               |    1 +
 test/Path/path_file_ops_tests.pl | 1220 ++++++++++++++++++++++++++++++
 2 files changed, 1221 insertions(+)
 create mode 100755 test/Path/path_file_ops_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index 627dc09..9dcb878 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -1,6 +1,7 @@
 TESTS = \
 	path_comparison_tests.pl				\
 	path_components_tests.pl				\
+	path_file_ops_tests.pl					\
 	path_is_absolute_relative_tests.pl			\
 	path_join_tests.pl					\
 	path_parent_tests.pl					\
diff --git a/test/Path/path_file_ops_tests.pl b/test/Path/path_file_ops_tests.pl
new file mode 100755
index 0000000..ee32307
--- /dev/null
+++ b/test/Path/path_file_ops_tests.pl
@@ -0,0 +1,1220 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+my $path_file_part_cases = [
+    {
+	name => "empty path",
+	path => "",
+	file_name => undef,
+	file_prefix => undef,
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => [],
+    },
+    {
+	name => "root",
+	path => "/",
+	file_name => undef,
+	file_prefix => undef,
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => [],
+    },
+    {
+	name => "file without suffixes",
+	path => "foo",
+	file_name => "foo",
+	file_prefix => "foo",
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => ["foo"],
+    },
+    {
+	name => "file without suffixes, with root",
+	path => "/foo",
+	file_name => "foo",
+	file_prefix => "foo",
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => ["foo"],
+    },
+    {
+	name => "file with suffixes (1)",
+	path => "foo.txt",
+	file_name => "foo.txt",
+	file_prefix => "foo",
+	file_suffix => "txt",
+	file_suffixes => ["txt"],
+	file_parts => ["foo", "txt"],
+    },
+    {
+	name => "file with suffixes (3)",
+	path => "foo.txt.zip.zst",
+	file_name => "foo.txt.zip.zst",
+	file_prefix => "foo",
+	file_suffix => "zst",
+	file_suffixes => ["txt", "zip", "zst"],
+	file_parts => ["foo", "txt", "zip", "zst"],
+    },
+    {
+	name => "file with suffixes (1), with root",
+	path => "/foo.txt",
+	file_name => "foo.txt",
+	file_prefix => "foo",
+	file_suffix => "txt",
+	file_suffixes => ["txt"],
+	file_parts => ["foo", "txt"],
+    },
+    {
+	name => "file with suffixes (3), with root",
+	path => "/foo.txt.zip.zst",
+	file_name => "foo.txt.zip.zst",
+	file_prefix => "foo",
+	file_suffix => "zst",
+	file_suffixes => ["txt", "zip", "zst"],
+	file_parts => ["foo", "txt", "zip", "zst"],
+    },
+    {
+	name => "/etc/resolv.conf - simple file with single dir",
+	path => "/etc/resolv.conf",
+	file_name => "resolv.conf",
+	file_prefix => "resolv",
+	file_suffix => "conf",
+	file_suffixes => ["conf"],
+	file_parts => ["resolv", "conf"],
+    },
+    {
+	name => "/etc/pve/firewall/cluster.fw - long path",
+	path => "/etc/pve/firewall/cluster.fw",
+	file_name => "cluster.fw",
+	file_prefix => "cluster",
+	file_suffix => "fw",
+	file_suffixes => ["fw"],
+	file_parts => ["cluster", "fw"],
+    },
+    {
+	name => "/tmp/archive.tar.gz - file with two suffixes",
+	path => "/tmp/archive.tar.gz",
+	file_name => "archive.tar.gz",
+	file_prefix => "archive",
+	file_suffix => "gz",
+	file_suffixes => ["tar", "gz"],
+	file_parts => ["archive", "tar", "gz"],
+    },
+    {
+	name => "/home/bob/.bash_history - hidden file",
+	path => "/home/bob/.bash_history",
+	file_name => ".bash_history",
+	file_prefix => ".bash_history",
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => [".bash_history"],
+    },
+    {
+	name => "/home/bob/..foobar - file prefixed with double dot",
+	path => "/home/bob/..foobar",
+	file_name => "..foobar",
+	file_prefix => "..foobar",
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => ["..foobar"],
+    },
+    {
+	name => "/home/bob/...foo...bar...baz... - wacky but legal file name",
+	path => "/home/bob/...foo...bar...baz...",
+	file_name => "...foo...bar...baz...",
+	file_prefix => "...foo",
+	file_suffix => "",
+	file_suffixes => ["", "", "bar", "", "", "baz", "", "", ""],
+	file_parts => ["...foo", "", "", "bar", "", "", "baz", "", "", ""],
+    },
+    {
+	name => "/home/bob/...... - file name consisting solely of dots",
+	path => "/home/bob/......",
+	file_name => "......",
+	file_prefix => "......",
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => ["......"],
+    },
+    {
+	name => "/home/bob/. - current path reference",
+	path => "/home/bob/.",
+	file_name => "bob",
+	file_prefix => "bob",
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => ["bob"],
+    },
+    {
+	name => "/home/bob/.. - parent path reference",
+	path => "/home/bob/..",
+	file_name => undef,
+	file_prefix => undef,
+	file_suffix => undef,
+	file_suffixes => [],
+	file_parts => [],
+    },
+];
+
+sub test_path_file_name : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_file_name: " . $case->{name};
+
+    my $file_name = eval { PVE::Path::path_file_name($case->{path}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to get file name of path:\n$@");
+	return;
+    }
+
+    if (!is($file_name, $case->{file_name}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{file_name}));
+	diag("=== Got ===");
+	diag(explain($file_name));
+    }
+
+    return;
+}
+
+sub test_path_file_prefix : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_file_prefix: " . $case->{name};
+
+    my $file_prefix = eval { PVE::Path::path_file_prefix($case->{path}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to get file prefix of path:\n$@");
+	return;
+    }
+
+    if (!is($file_prefix, $case->{file_prefix}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{file_prefix}));
+	diag("=== Got ===");
+	diag(explain($file_prefix));
+    }
+
+    return;
+}
+
+sub test_path_file_suffix : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_file_suffix: " . $case->{name};
+
+    my $file_suffix = eval { PVE::Path::path_file_suffix($case->{path}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to get file suffix of path:\n$@");
+	return;
+    }
+
+    if (!is_deeply($file_suffix, $case->{file_suffix}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{file_suffix}));
+	diag("=== Got ===");
+	diag(explain($file_suffix));
+    }
+
+    return;
+}
+
+sub test_path_file_suffixes : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_file_suffixes: " . $case->{name};
+
+    my $file_suffixes = eval { PVE::Path::path_file_suffixes($case->{path}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to get file suffixes of path:\n$@");
+	return;
+    }
+
+    if (!is_deeply($file_suffixes, $case->{file_suffixes}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{file_suffixes}));
+	diag("=== Got ===");
+	diag(explain($file_suffixes));
+    }
+
+    return;
+}
+
+sub test_path_file_parts : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_file_parts: " . $case->{name};
+
+    my $file_parts = eval { PVE::Path::path_file_parts($case->{path}); };
+
+    if ($@) {
+	fail($name);
+	diag("Failed to get file parts of path:\n$@");
+	return;
+    }
+
+    if (!is_deeply($file_parts, $case->{file_parts}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{file_parts}));
+	diag("=== Got ===");
+	diag(explain($file_parts));
+    }
+
+    return;
+}
+
+my $path_with_file_name_cases = [
+    {
+	name => "no path, no file name",
+	path => "",
+	file_name => "",
+	expected => "",
+    },
+    {
+	name => "root, no file name",
+	path => "/",
+	file_name => "",
+	expected => "/",
+    },
+    {
+	name => "no path, file name",
+	path => "",
+	file_name => "foo",
+	expected => "foo",
+    },
+    {
+	name => "root, file name",
+	path => "/",
+	file_name => "foo",
+	expected => "/foo",
+    },
+    {
+	name => "single path component, no file name",
+	path => "foo",
+	file_name => "",
+	expected => "",
+    },
+    {
+	name => "single path component, absolute, no file name",
+	path => "/foo",
+	file_name => "",
+	expected => "/",
+    },
+    {
+	name => "single path component, file name",
+	path => "foo",
+	file_name => "bar",
+	expected => "bar",
+    },
+    {
+	name => "single path component, absolute, file name",
+	path => "/foo",
+	file_name => "bar",
+	expected => "/bar",
+    },
+    {
+	name => "multiple path components, no file name",
+	path => "foo/bar/baz",
+	file_name => "",
+	expected => "foo/bar",
+    },
+    {
+	name => "multiple path components, absolute, no file name",
+	path => "/foo/bar/baz",
+	file_name => "",
+	expected => "/foo/bar",
+    },
+    {
+	name => "multiple path components, file name",
+	path => "foo/bar/baz",
+	file_name => "qux",
+	expected => "foo/bar/qux",
+    },
+    {
+	name => "multiple path components, absolute, file name",
+	path => "/foo/bar/baz",
+	file_name => "qux",
+	expected => "/foo/bar/qux",
+    },
+    {
+	name => "multiple path components with current path reference, no file name",
+	path => "foo/bar/baz/.",
+	file_name => "",
+	expected => "foo/bar",
+    },
+    {
+	name => "multiple path components with current path reference, file name",
+	path => "foo/bar/baz/.",
+	file_name => "qux",
+	expected => "foo/bar/qux",
+    },
+    {
+	name => "multiple path components with parent path reference, no file name",
+	path => "foo/bar/baz/..",
+	file_name => "",
+	expected => "foo/bar/baz",
+    },
+    {
+	name => "multiple path components with parent path reference, file name",
+	path => "foo/bar/baz/..",
+	file_name => "qux",
+	expected => "foo/bar/baz/qux",
+    },
+    {
+	name => "/home/bob/foo.txt --> /home/bob/bar.txt",
+	path => "/home/bob/foo.txt",
+	file_name => "bar.txt",
+	expected => "/home/bob/bar.txt",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/backup.tar.zst",
+	path => "/tmp/archive.tar.gz",
+	file_name => "backup.tar.zst",
+	expected => "/tmp/backup.tar.zst",
+    },
+    {
+	name => "/home/bob/...foo.txt --> /home/bob/...bar.csv",
+	path => "/home/bob/...foo.txt",
+	file_name => "...bar.csv",
+	expected => "/home/bob/...bar.csv",
+    },
+    {
+	name => "file name with path separator",
+	path => "foo/bar/baz",
+	file_name => "quo/qux",
+	expected => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_with_file_name : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_with_file_name: " . $case->{name};
+
+    my $new_path = eval {
+	PVE::Path::path_with_file_name($case->{path}, $case->{file_name});
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to replace file name of path:\n$@");
+	return;
+    }
+
+    if (!is($new_path, $case->{expected}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{expected}));
+	diag("=== Got ===");
+	diag(explain($new_path));
+    }
+
+    return;
+}
+
+my $path_with_file_prefix_cases = [
+    {
+	name => "no path, no prefix",
+	path => "",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "root, no prefix",
+	path => "/",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "no path, prefix",
+	path => "",
+	prefix => "foo",
+	expected => undef,
+    },
+    {
+	name => "root, prefix",
+	path => "/",
+	prefix => "foo",
+	expected => undef,
+    },
+    {
+	name => "single path component, no prefix",
+	path => "foo",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "single path component, absolute, no prefix",
+	path => "/foo",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "single path component, prefix",
+	path => "foo",
+	prefix => "bar",
+	expected => "bar",
+    },
+    {
+	name => "single path component, absolute, prefix",
+	path => "/foo",
+	prefix => "bar",
+	expected => "/bar",
+    },
+    {
+	name => "multiple path components, no prefix",
+	path => "foo/bar/baz",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "multiple path components, absolute, no prefix",
+	path => "/foo/bar/baz",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "multiple path components, prefix",
+	path => "foo/bar/baz",
+	prefix => "qux",
+	expected => "foo/bar/qux",
+    },
+    {
+	name => "multiple path components, absolute, prefix",
+	path => "/foo/bar/baz",
+	prefix => "qux",
+	expected => "/foo/bar/qux",
+    },
+    {
+	name => "multiple path components with current path reference, no prefix",
+	path => "foo/bar/baz/.",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "multiple path components with current path reference, prefix",
+	path => "foo/bar/baz/.",
+	prefix => "qux",
+	expected => "foo/bar/qux",
+    },
+    {
+	name => "multiple path components with parent path reference, no prefix",
+	path => "foo/bar/baz/..",
+	prefix => "",
+	expected => undef,
+    },
+    {
+	name => "multiple path components with parent path reference, prefix",
+	path => "foo/bar/baz/..",
+	prefix => "qux",
+	expected => undef,
+    },
+    {
+	name => "/home/bob/foo.txt --> /home/bob/bar.txt",
+	path => "/home/bob/foo.txt",
+	prefix => "bar",
+	expected => "/home/bob/bar.txt",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/backup.tar.gz",
+	path => "/tmp/archive.tar.gz",
+	prefix => "backup",
+	expected => "/tmp/backup.tar.gz",
+    },
+    {
+	name => "/home/bob/...foo.txt --> /home/bob/...bar.txt",
+	path => "/home/bob/...foo.txt",
+	prefix => "...bar",
+	expected => "/home/bob/...bar.txt",
+    },
+    {
+	name => "prefix with path separator",
+	path => "foo/bar/baz",
+	prefix => "quo/qux",
+	expected => undef,
+	should_throw => 1,
+    },
+    {
+	name => "prefix ends with dot",
+	path => "foo/bar/baz",
+	prefix => "quo.",
+	expected => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_with_file_prefix : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_with_file_prefix: " . $case->{name};
+
+    my $new_path = eval {
+	PVE::Path::path_with_file_prefix($case->{path}, $case->{prefix});
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to replace file prefix of path:\n$@");
+	return;
+    }
+
+    if (!is($new_path, $case->{expected}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{expected}));
+	diag("=== Got ===");
+	diag(explain($new_path));
+    }
+
+    return;
+}
+
+my $path_with_file_suffix_cases = [
+    {
+	name => "no path, empty suffix",
+	path => "",
+	suffix => undef,
+	expected => undef,
+    },
+    {
+	name => "root, empty suffix",
+	path => "/",
+	suffix => undef,
+	expected => undef,
+    },
+    {
+	name => "no path, suffix",
+	path => "",
+	suffix => "foo",
+	expected => undef,
+    },
+    {
+	name => "root, suffix",
+	path => "/",
+	suffix => "foo",
+	expected => undef,
+    },
+    {
+	name => "no path, undef suffix",
+	path => "",
+	suffix => undef,
+	expected => undef,
+    },
+    {
+	name => "root, undef suffix",
+	path => "/",
+	suffix => undef,
+	expected => undef,
+    },
+    {
+	name => "single path component, empty suffix",
+	path => "foo",
+	suffix => "",
+	expected => "foo.",
+    },
+    {
+	name => "single path component, absolute, empty suffix",
+	path => "/foo",
+	suffix => "",
+	expected => "/foo.",
+    },
+    {
+	name => "single path component, suffix",
+	path => "foo",
+	suffix => "bar",
+	expected => "foo.bar",
+    },
+    {
+	name => "single path component, absolute, suffix",
+	path => "/foo",
+	suffix => "bar",
+	expected => "/foo.bar",
+    },
+    {
+	name => "single path component, undef suffix",
+	path => "foo",
+	suffix => undef,
+	expected => "foo",
+    },
+    {
+	name => "single path component, absolute, undef suffix",
+	path => "/foo",
+	suffix => undef,
+	expected => "/foo",
+    },
+    {
+	name => "multiple path components, empty suffix",
+	path => "foo/bar/baz",
+	suffix => "",
+	expected => "foo/bar/baz.",
+    },
+    {
+	name => "multiple path components, absolute, empty suffix",
+	path => "/foo/bar/baz",
+	suffix => "",
+	expected => "/foo/bar/baz.",
+    },
+    {
+	name => "multiple path components, suffix",
+	path => "foo/bar/baz",
+	suffix => "qux",
+	expected => "foo/bar/baz.qux",
+    },
+    {
+	name => "multiple path components, absolute, suffix",
+	path => "/foo/bar/baz",
+	suffix => "qux",
+	expected => "/foo/bar/baz.qux",
+    },
+    {
+	name => "multiple path components, undef suffix",
+	path => "foo/bar/baz",
+	suffix => undef,
+	expected => "foo/bar/baz",
+    },
+    {
+	name => "multiple path components, absolute, undef suffix",
+	path => "/foo/bar/baz",
+	suffix => undef,
+	expected => "/foo/bar/baz",
+    },
+    {
+	name => "multiple path components with current path reference, empty suffix",
+	path => "foo/bar/baz/.",
+	suffix => "",
+	expected => "foo/bar/baz.",
+    },
+    {
+	name => "multiple path components with current path reference, suffix",
+	path => "foo/bar/baz/.",
+	suffix => "qux",
+	expected => "foo/bar/baz.qux",
+    },
+    {
+	name => "multiple path components with current path reference, undef suffix",
+	path => "foo/bar/baz/.",
+	suffix => undef,
+	expected => "foo/bar/baz/.",
+    },
+    {
+	name => "multiple path components with parent path reference, empty suffix",
+	path => "foo/bar/baz/..",
+	suffix => "",
+	expected => undef,
+    },
+    {
+	name => "multiple path components with parent path reference, suffix",
+	path => "foo/bar/baz/..",
+	suffix => "qux",
+	expected => undef,
+    },
+    {
+	name => "multiple path components with parent path reference, undef suffix",
+	path => "foo/bar/baz/..",
+	suffix => "qux",
+	expected => undef,
+    },
+    {
+	name => "/home/bob/foo.txt --> /home/bob/foo.mp4",
+	path => "/home/bob/foo.txt",
+	suffix => "mp4",
+	expected => "/home/bob/foo.mp4",
+    },
+    {
+	name => "/home/bob/foo.txt --> /home/bob/foo.",
+	path => "/home/bob/foo.txt",
+	suffix => "",
+	expected => "/home/bob/foo.",
+    },
+    {
+	name => "/home/bob/foo.txt --> /home/bob/foo",
+	path => "/home/bob/foo",
+	suffix => undef,
+	expected => "/home/bob/foo",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive.tar.zst",
+	path => "/tmp/archive.tar.gz",
+	suffix => "zst",
+	expected => "/tmp/archive.tar.zst",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive.tar.",
+	path => "/tmp/archive.tar.",
+	suffix => "",
+	expected => "/tmp/archive.tar.",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive.tar",
+	path => "/tmp/archive.tar.gz",
+	suffix => undef,
+	expected => "/tmp/archive.tar",
+    },
+    {
+	name => "/home/bob/...foo.txt --> /home/bob/...foo.csv",
+	path => "/home/bob/...foo.txt",
+	suffix => "csv",
+	expected => "/home/bob/...foo.csv",
+    },
+    {
+	name => "/home/bob/...foo.txt --> /home/bob/...foo.",
+	path => "/home/bob/...foo.txt",
+	suffix => "",
+	expected => "/home/bob/...foo.",
+    },
+    {
+	name => "/home/bob/...foo.txt --> /home/bob/...foo",
+	path => "/home/bob/...foo.txt",
+	suffix => undef,
+	expected => "/home/bob/...foo",
+    },
+    {
+	name => "/home/bob/...foo --> /home/bob/...foo.txt",
+	path => "/home/bob/...foo",
+	suffix => "txt",
+	expected => "/home/bob/...foo.txt",
+    },
+    {
+	name => "/home/bob/...foo --> /home/bob/...foo.",
+	path => "/home/bob/...foo",
+	suffix => "",
+	expected => "/home/bob/...foo.",
+    },
+    {
+	name => "/home/bob/...foo --> /home/bob/...foo",
+	path => "/home/bob/...foo",
+	suffix => undef,
+	expected => "/home/bob/...foo",
+    },
+    {
+	name => "/home/bob/...foo. --> /home/bob/...foo.",
+	path => "/home/bob/...foo.",
+	suffix => "",
+	expected => "/home/bob/...foo.",
+    },
+    {
+	name => "/home/bob/...foo. --> /home/bob/...foo.txt",
+	path => "/home/bob/...foo.",
+	suffix => "txt",
+	expected => "/home/bob/...foo.txt",
+    },
+    {
+	name => "/home/bob/...foo. --> /home/bob/...foo",
+	path => "/home/bob/...foo.",
+	suffix => undef,
+	expected => "/home/bob/...foo",
+    },
+    {
+	name => "suffix with path separator",
+	path => "foo/bar/baz",
+	suffix => "quo/qux",
+	expected => undef,
+	should_throw => 1,
+    },
+    {
+	name => "suffix contains dot",
+	path => "foo/bar/baz",
+	suffix => "quo.qux",
+	expected => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_with_file_suffix : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_with_file_suffix: " . $case->{name};
+
+    my $new_path = eval {
+	PVE::Path::path_with_file_suffix($case->{path}, $case->{suffix});
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to replace file suffix of path:\n$@");
+	return;
+    }
+
+    if (!is($new_path, $case->{expected}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{expected}));
+	diag("=== Got ===");
+	diag(explain($new_path));
+    }
+
+    return;
+}
+
+my $path_with_file_suffixes_cases = [
+    {
+	name => "no path, no suffixes",
+	path => "",
+	suffixes => [],
+	expected => undef,
+    },
+    {
+	name => "root, no suffixes",
+	path => "/",
+	suffixes => [],
+	expected => undef,
+    },
+    {
+	name => "no path, suffixes (1)",
+	path => "",
+	suffixes => ["tar"],
+	expected => undef,
+    },
+    {
+	name => "root, suffixes (1)",
+	path => "/",
+	suffixes => ["tar"],
+	expected => undef,
+    },
+    {
+	name => "single path component, no suffixes",
+	path => "foo",
+	suffixes => [],
+	expected => "foo",
+    },
+    {
+	name => "single path component, absolute, no suffixes",
+	path => "/foo",
+	suffixes => [],
+	expected => "/foo",
+    },
+    {
+	name => "single path component, suffixes (1)",
+	path => "foo",
+	suffixes => ["tar"],
+	expected => "foo.tar",
+    },
+    {
+	name => "single path component, absolute, suffixes (1)",
+	path => "/foo",
+	suffixes => ["tar"],
+	expected => "/foo.tar",
+    },
+    {
+	name => "single path component, suffixes (3)",
+	path => "foo",
+	suffixes => ["tar", "zst", "bak"],
+	expected => "foo.tar.zst.bak",
+    },
+    {
+	name => "single path component, absolute, suffixes (3)",
+	path => "/foo",
+	suffixes => ["tar", "zst", "bak"],
+	expected => "/foo.tar.zst.bak",
+    },
+    {
+	name => "single path component, suffixes (10)",
+	path => "foo",
+	suffixes => ["tar", "zst", "bak", "gz", "zip", "xz", "lz4", "rar", "br", "lzma"],
+	expected => "foo.tar.zst.bak.gz.zip.xz.lz4.rar.br.lzma",
+    },
+    {
+	name => "single path component, absolute, suffixes (10)",
+	path => "/foo",
+	suffixes => ["tar", "zst", "bak", "gz", "zip", "xz", "lz4", "rar", "br", "lzma"],
+	expected => "/foo.tar.zst.bak.gz.zip.xz.lz4.rar.br.lzma",
+    },
+    {
+	name => "multiple path components, no suffixes",
+	path => "foo/bar/baz",
+	suffixes => [],
+	expected => "foo/bar/baz",
+    },
+    {
+	name => "multiple path components, absolute, no suffixes",
+	path => "/foo/bar/baz",
+	suffixes => [],
+	expected => "/foo/bar/baz",
+    },
+    {
+	name => "multiple path components, suffixes (1)",
+	path => "foo/bar/baz",
+	suffixes => ["tar"],
+	expected => "foo/bar/baz.tar",
+    },
+    {
+	name => "multiple path components, absolute, suffixes (1)",
+	path => "/foo/bar/baz",
+	suffixes => ["tar"],
+	expected => "/foo/bar/baz.tar",
+    },
+    {
+	name => "multiple path components, suffixes (3)",
+	path => "foo/bar/baz",
+	suffixes => ["tar", "zst", "bak"],
+	expected => "foo/bar/baz.tar.zst.bak",
+    },
+    {
+	name => "multiple path components, absolute, suffixes (3)",
+	path => "/foo/bar/baz",
+	suffixes => ["tar", "zst", "bak"],
+	expected => "/foo/bar/baz.tar.zst.bak",
+    },
+    {
+	name => "multiple path components, suffixes (10)",
+	path => "foo/bar/baz",
+	suffixes => ["tar", "zst", "bak", "gz", "zip", "xz", "lz4", "rar", "br", "lzma"],
+	expected => "foo/bar/baz.tar.zst.bak.gz.zip.xz.lz4.rar.br.lzma",
+    },
+    {
+	name => "multiple path components, absolute, suffixes (10)",
+	path => "/foo/bar/baz",
+	suffixes => ["tar", "zst", "bak", "gz", "zip", "xz", "lz4", "rar", "br", "lzma"],
+	expected => "/foo/bar/baz.tar.zst.bak.gz.zip.xz.lz4.rar.br.lzma",
+    },
+    {
+	name => "multiple path components with current path reference, no suffixes",
+	path => "foo/bar/baz/.",
+	suffixes => [],
+	expected => "foo/bar/baz/.",
+    },
+    {
+	name => "multiple path components with current path reference, absolute, no suffixes",
+	path => "/foo/bar/baz/.",
+	suffixes => [],
+	expected => "/foo/bar/baz/.",
+    },
+    {
+	name => "multiple path components with current path reference, suffixes (3)",
+	path => "foo/bar/baz/.",
+	suffixes => ["tar", "zst", "bak"],
+	expected => "foo/bar/baz.tar.zst.bak",
+    },
+    {
+	name => "multiple path components with current path reference, absolute, suffixes (3)",
+	path => "/foo/bar/baz/.",
+	suffixes => ["tar", "zst", "bak"],
+	expected => "/foo/bar/baz.tar.zst.bak",
+    },
+    {
+	name => "multiple path components with parent directory reference, no suffixes",
+	path => "foo/bar/baz/..",
+	suffixes => [],
+	expected => undef,
+    },
+    {
+	name => "multiple path components with parent directory reference, absolute, no suffixes",
+	path => "/foo/bar/baz/..",
+	suffixes => [],
+	expected => undef,
+    },
+    {
+	name => "multiple path components with parent directory reference, suffixes (3)",
+	path => "foo/bar/baz/..",
+	suffixes => ["tar", "zst", "bak"],
+	expected => undef,
+    },
+    {
+	name => "multiple path components with parent directory reference, absolute, suffixes (3)",
+	path => "/foo/bar/baz/..",
+	suffixes => ["tar", "zst", "bak"],
+	expected => undef,
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive.tar.zst",
+	path => "/tmp/archive.tar.gz",
+	suffixes => ["tar", "zst"],
+	expected => "/tmp/archive.tar.zst",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive.tar",
+	path => "/tmp/archive.tar.gz",
+	suffixes => ["tar"],
+	expected => "/tmp/archive.tar",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive.tar.",
+	path => "/tmp/archive.tar.gz",
+	suffixes => ["tar", ""],
+	expected => "/tmp/archive.tar.",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive..",
+	path => "/tmp/archive.tar.gz",
+	suffixes => ["", ""],
+	expected => "/tmp/archive..",
+    },
+    {
+	name => "/tmp/archive.tar --> /tmp/archive.tar.gz",
+	path => "/tmp/archive.tar",
+	suffixes => ["tar", "gz"],
+	expected => "/tmp/archive.tar.gz",
+    },
+    {
+	name => "/tmp/archive --> /tmp/archive.tar.gz",
+	path => "/tmp/archive",
+	suffixes => ["tar", "gz"],
+	expected => "/tmp/archive.tar.gz",
+    },
+    {
+	name => "/tmp/archive.tar.gz --> /tmp/archive",
+	path => "/tmp/archive.tar.gz",
+	suffixes => [],
+	expected => "/tmp/archive",
+    },
+    {
+	name => "/tmp/archive --> /tmp/archive",
+	path => "/tmp/archive",
+	suffixes => [],
+	expected => "/tmp/archive",
+    },
+    {
+	name => "/home/bob/...one...two...three --> /home/bob/...one...foo...bar",
+	path => "/home/bob/...one...two...three",
+	suffixes => ["", "", "foo", "", "", "bar"],
+	expected => "/home/bob/...one...foo...bar",
+    },
+    {
+	name => "suffixes contain a path separator",
+	path => "foo/bar/baz",
+	suffixes => ["tar", "oh/no", "zst"],
+	expected => undef,
+	should_throw => 1,
+    },
+    {
+	name => "suffixes contain a dot",
+	path => "foo/bar/baz",
+	suffixes => ["tar", "oh.no", "zst"],
+	expected => undef,
+	should_throw => 1,
+    },
+    {
+	name => "suffixes contain undef",
+	path => "foo/bar/baz",
+	suffixes => ["tar", undef, "zst"],
+	expected => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_with_file_suffixes : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_with_file_suffixes: " . $case->{name};
+
+    my $new_path = eval {
+	PVE::Path::path_with_file_suffixes($case->{path}, $case->{suffixes}->@*);
+    };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to replace file suffixes of path:\n$@");
+	return;
+    }
+
+    if (!is($new_path, $case->{expected}, $name)) {
+	diag("=== Expected ===");
+	diag(explain($case->{expected}));
+	diag("=== Got ===");
+	diag(explain($new_path));
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    my $file_part_test_subs = [
+	\&test_path_file_name,
+	\&test_path_file_prefix,
+	\&test_path_file_suffix,
+	\&test_path_file_suffixes,
+	\&test_path_file_parts,
+    ];
+
+    plan(
+	tests => scalar($path_file_part_cases->@*) * scalar($file_part_test_subs->@*)
+	    + scalar($path_with_file_name_cases->@*)
+	    + scalar($path_with_file_prefix_cases->@*)
+	    + scalar($path_with_file_suffix_cases->@*)
+	    + scalar($path_with_file_suffixes_cases->@*)
+    );
+
+    for my $case ($path_file_part_cases->@*) {
+	for my $test_sub ($file_part_test_subs->@*) {
+	    eval {
+		# suppress warnings here to make output less noisy for certain tests if necessary
+		# local $SIG{__WARN__} = sub {};
+		$test_sub->($case);
+	    };
+	    warn "$@\n" if $@;
+	}
+    }
+
+    for my $case ($path_with_file_name_cases->@*) {
+	eval {
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_with_file_name($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    for my $case ($path_with_file_prefix_cases->@*) {
+	eval {
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_with_file_prefix($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    for my $case ($path_with_file_suffix_cases->@*) {
+	eval {
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_with_file_suffix($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    for my $case ($path_with_file_suffixes_cases->@*) {
+	eval {
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_with_file_suffixes($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 10/12] test: add tests for path_normalize of PVE::Path
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (8 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 09/12] test: add tests for file path ops functions of PVE::Path Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 11/12] introduce PVE::Filesystem Max Carrara
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

Add these tests solely to ensure that the behaviour of path_normalize
stays consistent / stable in case we ever decide to provide our own
implementation instead of wrapping File::Spec->canonpath().

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * NEW: Split from patch 02

 test/Path/Makefile                |   1 +
 test/Path/path_normalize_tests.pl | 176 ++++++++++++++++++++++++++++++
 2 files changed, 177 insertions(+)
 create mode 100755 test/Path/path_normalize_tests.pl

diff --git a/test/Path/Makefile b/test/Path/Makefile
index 9dcb878..8afed17 100644
--- a/test/Path/Makefile
+++ b/test/Path/Makefile
@@ -4,6 +4,7 @@ TESTS = \
 	path_file_ops_tests.pl					\
 	path_is_absolute_relative_tests.pl			\
 	path_join_tests.pl					\
+	path_normalize_tests.pl					\
 	path_parent_tests.pl					\
 	path_push_tests.pl					\
 
diff --git a/test/Path/path_normalize_tests.pl b/test/Path/path_normalize_tests.pl
new file mode 100755
index 0000000..4d1de24
--- /dev/null
+++ b/test/Path/path_normalize_tests.pl
@@ -0,0 +1,176 @@
+#!/usr/bin/env perl
+
+use lib '../../src';
+
+use strict;
+use warnings;
+
+use Test::More;
+
+use PVE::Path;
+
+# Note: These tests exist solely to ensure that the behaviour of path_normalize
+# stays consistent in case we ever decide to provide our own implementation
+# instead of wrapping File::Spec->canonpath().
+
+my $path_normalize_cases = [
+    {
+	name => "empty path",
+	path => "",
+	normalized => "",
+    },
+    {
+	name => "root",
+	path => "/",
+	normalized => "/",
+    },
+    {
+	name => "current path reference",
+	path => ".",
+	normalized => ".",
+    },
+    {
+	name => "parent directory reference",
+	path => "..",
+	normalized => "..",
+    },
+    {
+	name => "single path component",
+	path => "foo",
+	normalized => "foo",
+    },
+    {
+	name => "single path component, absolute",
+	path => "/foo",
+	normalized => "/foo",
+    },
+    {
+	name => "single path component, starting with current path reference",
+	path => "./foo",
+	normalized => "foo",
+    },
+    {
+	name => "parent directory reference, starting with current path reference",
+	path => "./..",
+	normalized => "..",
+    },
+    {
+	name => "multiple components, with redundant path separators",
+	path => "foo//bar///baz////quo/////qux//////",
+	normalized => "foo/bar/baz/quo/qux",
+    },
+    {
+	name => "multiple components, with redundant current path references",
+	path => "foo/./bar/././baz/./././quo/././././qux/././././.",
+	normalized => "foo/bar/baz/quo/qux",
+    },
+    {
+	name => "multiple components, with parent directory references",
+	path => "foo/../bar/../../baz/../../../quo/../../../../qux/../../../../..",
+	normalized => "foo/../bar/../../baz/../../../quo/../../../../qux/../../../../..",
+    },
+    {
+	name => "root path, with redundant path separators",
+	path => "///////////",
+	normalized => "/",
+    },
+    {
+	name => "root path, with redundant current path references",
+	path => "/./././././././././.",
+	normalized => "/",
+    },
+    {
+	name => "root with parent directory reference",
+	path => "/..",
+	normalized => "/",
+    },
+    {
+	name => "root with multiple parent directory references",
+	path => "/../../../../../../../../..",
+	normalized => "/",
+    },
+    {
+	name => "current path reference, with redundant path separators",
+	path => ".///////////",
+	normalized => ".",
+    },
+    {
+	name => "current path reference, with redundant current path references",
+	path => "./././././././././.",
+	normalized => ".",
+    },
+    {
+	name => "current path reference, with parent directory references",
+	path => "./../../..",
+	normalized => "../../..",
+    },
+    {
+	name => "multiple components,"
+	    . " with redundant path separators,"
+	    . " with redundant current path references,"
+	    . " with parent directory references",
+	path => "foo//././//bar///./././//.//baz/.././..///quo/..////../qux/././//..",
+	normalized => "foo/bar/baz/../../quo/../../qux/..",
+    },
+    {
+	name => "multiple components, with odd component names",
+	path => ".../  \t/\t/\\/........../.~.^./.+\$={}[]()<>.../!/\"/'",
+	normalized => ".../  \t/\t/\\/........../.~.^./.+\$={}[]()<>.../!/\"/'",
+    },
+    # Diverging from File::Spec->canonpath() here -- canonpath() doesn't throw
+    # if it gets undef, but path_normalize() does in order to stay consistent
+    # with all the other functions of PVE::Path.
+    {
+	name => "path is undef",
+	path => undef,
+	normalized => undef,
+	should_throw => 1,
+    },
+];
+
+sub test_path_normalize : prototype($) {
+    my ($case) = @_;
+
+    my $name = "path_normalize: " . $case->{name};
+
+    my $normalized = eval { PVE::Path::path_normalize($case->{path}); };
+
+    if ($@) {
+	if ($case->{should_throw}) {
+	    pass($name);
+	    return;
+	}
+
+	fail($name);
+	diag("Failed to normalize path:\n$@");
+	return;
+    }
+
+    # Note: `!is()` isn't the same as `isnt()` -- we want extra output here
+    # if the check fails; can't do that with `isnt()`
+    if (!is($normalized, $case->{normalized}, $name)) {
+	diag("path       = " . $case->{path});
+	diag("normalized = " . $case->{normalized});
+    }
+
+    return;
+}
+
+sub main : prototype() {
+    plan(tests => scalar($path_normalize_cases->@*));
+
+    for my $case ($path_normalize_cases->@*) {
+	eval {
+	    # suppress warnings here to make output less noisy for certain tests if necessary
+	    # local $SIG{__WARN__} = sub {};
+	    test_path_normalize($case);
+	};
+	warn "$@\n" if $@;
+    }
+
+    done_testing();
+
+    return;
+}
+
+main();
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 11/12] introduce PVE::Filesystem
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (9 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 10/12] test: add tests for path_normalize " Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 12/12] debian: introduce package libproxmox-fs-path-utils-perl Max Carrara
  2025-01-02 13:46 ` [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Fiona Ebner
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

The PVE::Filesystem module implements filesystem manipulation
operations. Currently, this is limited to two wrapper functions, but
in the future anything that concernts itself with altering the
filesystem should be added to it.

This module can be seen as the "complement" to PVE::Path -- while
PVE::Path doesn't ever alter the filesystem, PVE::Filesystem does, as
the name implies. This is done in order to clearly separate concerns.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * None.

 src/Makefile          |  1 +
 src/PVE/Filesystem.pm | 78 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+)
 create mode 100644 src/PVE/Filesystem.pm

diff --git a/src/Makefile b/src/Makefile
index 25bc490..20a0988 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -16,6 +16,7 @@ LIB_SOURCES = \
 	CpuSet.pm \
 	Daemon.pm \
 	Exception.pm \
+	Filesystem.pm \
 	Format.pm \
 	INotify.pm \
 	JSONSchema.pm \
diff --git a/src/PVE/Filesystem.pm b/src/PVE/Filesystem.pm
new file mode 100644
index 0000000..3b06634
--- /dev/null
+++ b/src/PVE/Filesystem.pm
@@ -0,0 +1,78 @@
+=head1 NAME
+
+C<PVE::Filesystem> - Utilities related to filesystem manipulations
+
+=head1 DESCRIPTION
+
+This module implements utilities for manipulating the filesystem. Like
+L<C<PVE::Path>>, this module exists to address certain shortcomings of the Perl
+core modules without relying on third-party solutions / abstractions.
+
+=cut
+
+package PVE::Filesystem;
+
+use strict;
+use warnings;
+
+use Carp qw(carp croak confess);
+use Cwd ();
+
+use Exporter qw(import);
+
+our @EXPORT_OK = qw(
+    fs_cwd
+    fs_canonicalize
+);
+
+=head2 FUNCTIONS
+
+=cut
+
+=head3 fs_getcwd()
+
+Wrapper for C<L<< getcwd()|Cwd/getcwd >>>.
+
+Returns the absolute form of the current working directory.
+
+Unlike the original C<L<< getcwd()|Cwd/getcwd >>>, an exception is thrown if an
+error occurs instead of setting C<$!>.
+
+=cut
+
+sub fs_getcwd : prototype() {
+    my $cwd = Cwd::getcwd();
+
+    croak "failed to get current working directory: $!" if !defined($cwd);
+
+    return $cwd;
+}
+
+=head3 fs_canonicalize($path)
+
+Wrapper for C<L<< abs_path()|Cwd/abs_path >>>.
+
+Returns the canonical, absolute form of the given path with all logical
+components normalized and symlinks resolved.
+
+B<Note:> This requires the path to exist on the filesystem. If you want to
+avoid that, use C<L<< path_normalize()|PVE::Path/"path_normalize($path)" >>>
+instead.
+
+Unlike the original C<L<< abs_path()|Cwd/abs_path >>>, an exception is thrown
+if an error occurs instead of setting C<$!>.
+
+=cut
+
+sub fs_canonicalize : prototype($) {
+    my ($path) = @_;
+
+    croak "\$path is undef" if !defined($path);
+
+    my $canonicalized_path = Cwd::abs_path($path);
+
+    croak "failed to canonicalize path: $!" if !defined($canonicalized_path);
+
+    return $canonicalized_path;
+}
+
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [pve-devel] [PATCH v2 pve-common 12/12] debian: introduce package libproxmox-fs-path-utils-perl
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (10 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 11/12] introduce PVE::Filesystem Max Carrara
@ 2024-12-20 18:52 ` Max Carrara
  2025-01-02 13:46 ` [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Fiona Ebner
  12 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2024-12-20 18:52 UTC (permalink / raw)
  To: pve-devel

This package contains both PVE::Path and PVE::Filesystem and is added
in order to split off those two modules from the rest of
libpve-common-perl from the get-go.

Both modules are provided in one package because chances are, if one
needs stuff from one module, they'll need stuff from the other module
too. Their functionality goes hand-in-hand.

Signed-off-by: Max Carrara <m.carrara@proxmox.com>
---
Changes v1 --> v2:
  * None.

 debian/control                               |  6 ++++
 debian/libproxmox-fs-path-utils-perl.install |  2 ++
 debian/libpve-common-perl.install            | 29 ++++++++++++++++++++
 3 files changed, 37 insertions(+)
 create mode 100644 debian/libproxmox-fs-path-utils-perl.install
 create mode 100644 debian/libpve-common-perl.install

diff --git a/debian/control b/debian/control
index ac4cd66..39dadb5 100644
--- a/debian/control
+++ b/debian/control
@@ -52,3 +52,9 @@ Breaks: ifupdown2 (<< 2.0.1-1+pve5),
         qemu-server (<< 8.0.1),
 Description: Proxmox VE base library
  This package contains the base library used by other Proxmox VE components.
+
+Package: libproxmox-fs-path-utils-perl
+Architecture: all
+Depends: ${perl:Depends}
+Description: Proxmox Filesystem & Path Utilities
+ Utils for filesystem manipulation and file path operations.
diff --git a/debian/libproxmox-fs-path-utils-perl.install b/debian/libproxmox-fs-path-utils-perl.install
new file mode 100644
index 0000000..2a1a8ac
--- /dev/null
+++ b/debian/libproxmox-fs-path-utils-perl.install
@@ -0,0 +1,2 @@
+/usr/share/perl5/PVE/Filesystem.pm
+/usr/share/perl5/PVE/Path.pm
diff --git a/debian/libpve-common-perl.install b/debian/libpve-common-perl.install
new file mode 100644
index 0000000..2cd8b59
--- /dev/null
+++ b/debian/libpve-common-perl.install
@@ -0,0 +1,29 @@
+/usr/share/perl5/PVE/AtomicFile.pm
+/usr/share/perl5/PVE/CalendarEvent.pm
+/usr/share/perl5/PVE/Certificate.pm
+/usr/share/perl5/PVE/CGroup.pm
+/usr/share/perl5/PVE/CLIFormatter.pm
+/usr/share/perl5/PVE/CLIHandler.pm
+/usr/share/perl5/PVE/CpuSet.pm
+/usr/share/perl5/PVE/Daemon.pm
+/usr/share/perl5/PVE/Exception.pm
+/usr/share/perl5/PVE/Format.pm
+/usr/share/perl5/PVE/INotify.pm
+/usr/share/perl5/PVE/Job
+/usr/share/perl5/PVE/Job/Registry.pm
+/usr/share/perl5/PVE/JSONSchema.pm
+/usr/share/perl5/PVE/LDAP.pm
+/usr/share/perl5/PVE/Network.pm
+/usr/share/perl5/PVE/OTP.pm
+/usr/share/perl5/PVE/PBSClient.pm
+/usr/share/perl5/PVE/ProcFSTools.pm
+/usr/share/perl5/PVE/PTY.pm
+/usr/share/perl5/PVE/RESTEnvironment.pm
+/usr/share/perl5/PVE/RESTHandler.pm
+/usr/share/perl5/PVE/SafeSyslog.pm
+/usr/share/perl5/PVE/SectionConfig.pm
+/usr/share/perl5/PVE/Syscall.pm
+/usr/share/perl5/PVE/SysFSTools.pm
+/usr/share/perl5/PVE/Systemd.pm
+/usr/share/perl5/PVE/Ticket.pm
+/usr/share/perl5/PVE/Tools.pm
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
  2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
                   ` (11 preceding siblings ...)
  2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 12/12] debian: introduce package libproxmox-fs-path-utils-perl Max Carrara
@ 2025-01-02 13:46 ` Fiona Ebner
  2025-01-02 13:53   ` Fiona Ebner
  12 siblings, 1 reply; 23+ messages in thread
From: Fiona Ebner @ 2025-01-02 13:46 UTC (permalink / raw)
  To: Proxmox VE development discussion, Max Carrara

Am 20.12.24 um 19:51 schrieb Max Carrara:
> Introduce and Package PVE::Path & PVE::Filesystem - v2
> ======================================================

Just an idea, but I'd like to have a discussion about it: Instead of
using Perl for such new general helper modules, would it be nicer to use
Rust+perlmod?

If our long-term goal is to migrate the Proxmox VE Perl code to Rust,
then we need to switch these modules over at some point in any case (or
drop them after switching over all users of the modules). Are there good
reasons not to start out with Rust+perlmod already?

You state that you (also) took inspiration from Rust's `std::path` so
could we just use that itself, wrapping via perlmod? Or would the
wrapping be too ugly here or lead to performance issues?


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
  2025-01-02 13:46 ` [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Fiona Ebner
@ 2025-01-02 13:53   ` Fiona Ebner
  2025-01-02 15:54     ` Max Carrara
  0 siblings, 1 reply; 23+ messages in thread
From: Fiona Ebner @ 2025-01-02 13:53 UTC (permalink / raw)
  To: Proxmox VE development discussion, Max Carrara

Am 02.01.25 um 14:46 schrieb Fiona Ebner:
> Am 20.12.24 um 19:51 schrieb Max Carrara:
>> Introduce and Package PVE::Path & PVE::Filesystem - v2
>> ======================================================
> 
> Just an idea, but I'd like to have a discussion about it: Instead of
> using Perl for such new general helper modules, would it be nicer to use
> Rust+perlmod?
> 
> If our long-term goal is to migrate the Proxmox VE Perl code to Rust,
> then we need to switch these modules over at some point in any case (or
> drop them after switching over all users of the modules). Are there good
> reasons not to start out with Rust+perlmod already?
> 
> You state that you (also) took inspiration from Rust's `std::path` so
> could we just use that itself, wrapping via perlmod? Or would the
> wrapping be too ugly here or lead to performance issues?

Or depending on whether it's nicer, also wrapping helpers from
proxmox-sys and friends where we already have similar functionality in
our Rust code.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
  2025-01-02 13:53   ` Fiona Ebner
@ 2025-01-02 15:54     ` Max Carrara
  2025-01-03  9:49       ` Fiona Ebner
  0 siblings, 1 reply; 23+ messages in thread
From: Max Carrara @ 2025-01-02 15:54 UTC (permalink / raw)
  To: Proxmox VE development discussion

On Thu Jan 2, 2025 at 2:53 PM CET, Fiona Ebner wrote:
> Am 02.01.25 um 14:46 schrieb Fiona Ebner:
> > Am 20.12.24 um 19:51 schrieb Max Carrara:
> >> Introduce and Package PVE::Path & PVE::Filesystem - v2
> >> ======================================================
> > 
> > Just an idea, but I'd like to have a discussion about it: Instead of
> > using Perl for such new general helper modules, would it be nicer to use
> > Rust+perlmod?
> > 
> > If our long-term goal is to migrate the Proxmox VE Perl code to Rust,
> > then we need to switch these modules over at some point in any case (or
> > drop them after switching over all users of the modules). Are there good
> > reasons not to start out with Rust+perlmod already?
> > 

Depends on what you mean with nicer: I was reluctant to use perlmod
here for a couple reasons:

1. We appear to have everything in the pve-rs crate right now
(libpve-rs-perl), so I had assumed that if I wanted to use perlmod here,
then I'd have to put my implementations into that crate as well.

This in turn would mean that for a simple path op library I'd need to
pull in pve-rs as dependency, which also contains a bunch of different
things that aren't concerned with path op stuff.

Perhaps I'm misunderstanding the purpose of the pve-rs crate, but I
decided against using perlmod here solely because I didn't want to add
any additional dependencies to this library unless otherwise necessary.

Right now as of this series, no additional dependencies besides some
Perl core modules are needed; the library can exist on its own.

2. I'm uncertain whether we actually want to have multiple repositories
or packages using perlmod (instead of having just pve-rs).

If we can use perlmod for individual modules, as in, add perlmod *alone*
as a dependency for packages like this one, then implement features and
add dependencies selectively, I'd be open to it.

Perhaps as an example, what I'd ideally prefer is something like
Python's cryptography is using PyO3 -- there's a Rust part and then
there's a Python part that's using the things implemented in Rust; only
whatever's necessary is pulled in [1].

3. Related to 1. and 2., there isn't any clear indication / guide / rule
of thumb / etc. on how perlmod ought to be used and in which contexts it
should be used.

4. Should we decide to use perlmod here eventually, individual functions
can still be implemented in Rust separately. Right now, there wasn't
really a need to use Rust, because PVE::Path works at most with strings
and a couple arrays here and there; there are no complex data structures
that need to be made typesafe.

> > You state that you (also) took inspiration from Rust's `std::path` so
> > could we just use that itself, wrapping via perlmod? Or would the
> > wrapping be too ugly here or lead to performance issues?

5. I'm not sure about the performance overhead, but it would certainly
be somewhat ugly, because all which PVE::Path essentially does consists
of string and array operations. If we used perlmod here hypothetically,
all that we'd be doing is give the Rust side a string or an array,
convert that to a PathBuf / Path or an iterable, perform the requested
operation and give the result back to Perl. It just seems a little
unnecessary to me.

6. I reckon that the places in which those two little libraries here
will be used will most likely be replaced by a pure Rust implementation
as a whole -- IMO there's no need to use perlmod for every single
smaller library if the Perl code using them gets replaced by Rust.

In other words, IMO a top-down approach such as replacing higher-level
subroutines or entire API calls would probably yield better results
rather than a bottom-up approach. (I believe there's a pattern for this
-- strangler pattern? I'd have to look it up tbh)

>
> Or depending on whether it's nicer, also wrapping helpers from
> proxmox-sys and friends where we already have similar functionality in
> our Rust code.

7. While I'm a big fan of re-using existing code, I don't think it
applies here -- I think it's fine to keep *certain* things separate and
decoupled from one another until we actually find that there's a lot of
common functionality between two or more things (speaking generally
here). For PVE::Path and PVE::Filesystem in particular, we can always
bridge over to Rust via perlmod for individual functions if needed (4.)
nevertheless, if that even ends up being necessary (6.).

With all that being said, I hope I could convey my reasoning here and
shine some light on my design decisions -- please let me know what you
think! And thanks for having a look :)

[1]: https://github.com/pyca/cryptography/tree/7fd5f95354e33d9ca90ba854e9cbda958968043a/src


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
  2025-01-02 15:54     ` Max Carrara
@ 2025-01-03  9:49       ` Fiona Ebner
  2025-01-03 10:41         ` Thomas Lamprecht
  0 siblings, 1 reply; 23+ messages in thread
From: Fiona Ebner @ 2025-01-03  9:49 UTC (permalink / raw)
  To: Proxmox VE development discussion, Max Carrara

Am 02.01.25 um 16:54 schrieb Max Carrara:
> On Thu Jan 2, 2025 at 2:53 PM CET, Fiona Ebner wrote:
>> Am 02.01.25 um 14:46 schrieb Fiona Ebner:
>>> Am 20.12.24 um 19:51 schrieb Max Carrara:
>>>> Introduce and Package PVE::Path & PVE::Filesystem - v2
>>>> ======================================================
>>>
>>> Just an idea, but I'd like to have a discussion about it: Instead of
>>> using Perl for such new general helper modules, would it be nicer to use
>>> Rust+perlmod?
>>>
>>> If our long-term goal is to migrate the Proxmox VE Perl code to Rust,
>>> then we need to switch these modules over at some point in any case (or
>>> drop them after switching over all users of the modules). Are there good
>>> reasons not to start out with Rust+perlmod already?
>>>
> 
> Depends on what you mean with nicer: I was reluctant to use perlmod
> here for a couple reasons:
> 
> 1. We appear to have everything in the pve-rs crate right now
> (libpve-rs-perl), so I had assumed that if I wanted to use perlmod here,
> then I'd have to put my implementations into that crate as well.
> 
> This in turn would mean that for a simple path op library I'd need to
> pull in pve-rs as dependency, which also contains a bunch of different
> things that aren't concerned with path op stuff.
> 
> Perhaps I'm misunderstanding the purpose of the pve-rs crate, but I
> decided against using perlmod here solely because I didn't want to add
> any additional dependencies to this library unless otherwise necessary.
> 
> Right now as of this series, no additional dependencies besides some
> Perl core modules are needed; the library can exist on its own.
> 
> 2. I'm uncertain whether we actually want to have multiple repositories
> or packages using perlmod (instead of having just pve-rs).
> 
> If we can use perlmod for individual modules, as in, add perlmod *alone*
> as a dependency for packages like this one, then implement features and
> add dependencies selectively, I'd be open to it.
> 
> Perhaps as an example, what I'd ideally prefer is something like
> Python's cryptography is using PyO3 -- there's a Rust part and then
> there's a Python part that's using the things implemented in Rust; only
> whatever's necessary is pulled in [1].


That is a good point. I agree this is also worth discussing. Do we want
to put all bindings for PVE there or do we want to have multiple
libraries for binding? Moving forward, more and more Perl packages will
need to depend on pve-rs or the other binding libraries in any case, so
I wouldn't consider that to be a blocker for the path library here.

> 3. Related to 1. and 2., there isn't any clear indication / guide / rule
> of thumb / etc. on how perlmod ought to be used and in which contexts it
> should be used.

Yes, that's why I wanted to have a discussion :)

> 4. Should we decide to use perlmod here eventually, individual functions
> can still be implemented in Rust separately. Right now, there wasn't
> really a need to use Rust, because PVE::Path works at most with strings
> and a couple arrays here and there; there are no complex data structures
> that need to be made typesafe.

The motivation for my suggestion is not about type-safety, but about
doing the groundwork for the future. Since we can call Rust from Perl
but not the other way around, we need to start with the low-level
modules to use Rust. If we go for it, I'd rather have all functions in
the new libraries be wrappers and as transparent as possible, so that a
module that uses the path functions can be converted to Rust more easily
in the future.

>>> You state that you (also) took inspiration from Rust's `std::path` so
>>> could we just use that itself, wrapping via perlmod? Or would the
>>> wrapping be too ugly here or lead to performance issues?
> 
> 5. I'm not sure about the performance overhead, but it would certainly
> be somewhat ugly, because all which PVE::Path essentially does consists
> of string and array operations. If we used perlmod here hypothetically,
> all that we'd be doing is give the Rust side a string or an array,
> convert that to a PathBuf / Path or an iterable, perform the requested
> operation and give the result back to Perl. It just seems a little
> unnecessary to me.
>
> 6. I reckon that the places in which those two little libraries here
> will be used will most likely be replaced by a pure Rust implementation
> as a whole -- IMO there's no need to use perlmod for every single
> smaller library if the Perl code using them gets replaced by Rust.

The question is how much easier does converting modules using these get
if we start out with wrappers? Because sure, right now it is limited in
scope, but if we start out with Perl, things will get added on top and
this might result in more work in the future.

> In other words, IMO a top-down approach such as replacing higher-level
> subroutines or entire API calls would probably yield better results
> rather than a bottom-up approach. (I believe there's a pattern for this
> -- strangler pattern? I'd have to look it up tbh)

See above, we can't too easily do that. We need to have all the parts an
API call needs already in Rust or we won't be able to implement it. Of
course path modification could be converted to Rust "on-the-fly", but if
we already have transparent wrappers it could even be trivial.

>> Or depending on whether it's nicer, also wrapping helpers from
>> proxmox-sys and friends where we already have similar functionality in
>> our Rust code.
> 
> 7. While I'm a big fan of re-using existing code, I don't think it
> applies here -- I think it's fine to keep *certain* things separate and
> decoupled from one another until we actually find that there's a lot of
> common functionality between two or more things (speaking generally
> here). For PVE::Path and PVE::Filesystem in particular, we can always
> bridge over to Rust via perlmod for individual functions if needed (4.)
> nevertheless, if that even ends up being necessary (6.).
> 
> With all that being said, I hope I could convey my reasoning here and
> shine some light on my design decisions -- please let me know what you
> think! And thanks for having a look :)
> 
> [1]: https://github.com/pyca/cryptography/tree/7fd5f95354e33d9ca90ba854e9cbda958968043a/src

All that said, yes, it might be a too small library in this case. OTOH,
it might be a good place to start.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
  2025-01-03  9:49       ` Fiona Ebner
@ 2025-01-03 10:41         ` Thomas Lamprecht
  2025-01-03 12:37           ` Fiona Ebner
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Lamprecht @ 2025-01-03 10:41 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner, Max Carrara

On 03/01/2025 10:49, Fiona Ebner wrote:
> Am 02.01.25 um 16:54 schrieb Max Carrara:
> That is a good point. I agree this is also worth discussing. Do we want
> to put all bindings for PVE there or do we want to have multiple
> libraries for binding? Moving forward, more and more Perl packages will
> need to depend on pve-rs or the other binding libraries in any case, so
> I wouldn't consider that to be a blocker for the path library here.

That we'll depending solely on pve-rs as central dependency is IMO not set
in stone, having one huge library, even if just temporarily for a transition
period, is IME a rather big PITA fast, besides, the period won't be short
but rather a few years I think, so just because it's temporary doesn't mean
we should ignore costs or potential tech debt.

>> 4. Should we decide to use perlmod here eventually, individual functions
>> can still be implemented in Rust separately. Right now, there wasn't
>> really a need to use Rust, because PVE::Path works at most with strings
>> and a couple arrays here and there; there are no complex data structures
>> that need to be made typesafe.
> 
> The motivation for my suggestion is not about type-safety, but about
> doing the groundwork for the future. Since we can call Rust from Perl
> but not the other way around, we need to start with the low-level
> modules to use Rust. If we go for it, I'd rather have all functions in
> the new libraries be wrappers and as transparent as possible, so that a
> module that uses the path functions can be converted to Rust more easily
> in the future.

I'm not sure if replacing from low-level helper upwards is the best choice,
as for one those helpers exist since a long time and are for most pretty
stable nowadays, replacing that means a lot of churn for almost no benefit.
Especially as you add extra cost now with a new perlmod backed interface
for code that won't then be used in the future when we actually do not
require 

This is not thought out 100% yet, but as of now I'd favor rewriting those
parts that actually provide value, and that can be plugged in in various
ways into the existing perl code:

- reusing our existing proxying for the API to route to a new fully rust
  based daemon. As of now that could already hosts a lot of the shared API
  and implementations parts that we got for PBS and PDM already anyway;
  albeit, PVE is sometimes a bit special (longer grown history) and might
  need more care to ensure backward compat, but that's details.
  This is basically a replacement for perlmod for whole API endpoints or
  even directories; it certainly is a bit bigger w.r.t. initial work
  required to get permission handling and locking right, but as most of
  that uses file locks or pmxcfs and works already for multiple processes,
  not just multiple threads, it should be doable.

- using IPC to call from perl into rust, through, e.g., UNIX sockets or
  more elaborate existing RPC methods. The storage system would be a prime
  candidate here. As there we will require rather permanent backwards
  compat for the perl based plugins, so we require a system that can handle
  that already; a storage daemon that calls into perl plugins via a wrapper
  executable and allows rewriting ours one by one would be nice, and it
  would provide one of the internal core APIs in rust, which would unlock
  rewriting code where rust could bring actually a big benefit by having
  modern language features and rust guarantees. And once the storage client
  system (not the disk management system that has been wrongly mixed into
  the same code basis) has a ABI available from rust we unlock a lot of
  things to rewrite in modules, one by one.

- perlmod. Yes, for some things it will be still fine, but IMO it should
  be rather used to plug in a smaller set of methods for bigger pieces of
  rust code/rewrites, not dozens of wrappers for trivial (helper, utility)
  methods; at the moment I just do not see much value in there.

With that we are much more modular in terms of switching out stuff compared
to preparing some low-level stuff first. Having code with side-effects
shared can be nice to ensure implementations match though, but that's mostly
locking and permissions and owner on file creation, but lots of that is
handled by pmxcfs already on PVE side and the rest really should be testable,
as it just has to be implemented correctly once.

>>>> You state that you (also) took inspiration from Rust's `std::path` so
>>>> could we just use that itself, wrapping via perlmod? Or would the
>>>> wrapping be too ugly here or lead to performance issues?
>>
>> 5. I'm not sure about the performance overhead, but it would certainly
>> be somewhat ugly, because all which PVE::Path essentially does consists
>> of string and array operations. If we used perlmod here hypothetically,
>> all that we'd be doing is give the Rust side a string or an array,
>> convert that to a PathBuf / Path or an iterable, perform the requested
>> operation and give the result back to Perl. It just seems a little
>> unnecessary to me.
>>
>> 6. I reckon that the places in which those two little libraries here
>> will be used will most likely be replaced by a pure Rust implementation
>> as a whole -- IMO there's no need to use perlmod for every single
>> smaller library if the Perl code using them gets replaced by Rust.
> 
> The question is how much easier does converting modules using these get
> if we start out with wrappers? Because sure, right now it is limited in
> scope, but if we start out with Perl, things will get added on top and
> this might result in more work in the future.

IMO not much _most_ of the time. Stefan's firewall approach is a good
example where it would have hurt much more to go that way, albeit that
was not only switching to rust but also switching to nft, so might not
be applicable in general.

Besides that, Perl is still our prime language for the PVE backend and
will quite probably stay so for at least a year, probably the whole
future PVE 9 release, not unlikely even the 10 one. During that time it
definitively must not become a second class citizen, it must be OK to
continue feature development there during transition period; forcing a
language switch to undergoing development is far from ideal. That's also
why I think doing _some_ clean-up and refactoring work on perl now is
still worth it now (stares at qemu-server) 
>> In other words, IMO a top-down approach such as replacing higher-level
>> subroutines or entire API calls would probably yield better results
>> rather than a bottom-up approach. (I believe there's a pattern for this
>> -- strangler pattern? I'd have to look it up tbh)
I agree with above.
> See above, we can't too easily do that. We need to have all the parts an
> API call needs already in Rust or we won't be able to implement it. Of
> course path modification could be converted to Rust "on-the-fly", but if
> we already have transparent wrappers it could even be trivial.
It might not be that easy than throwing in perlmod now, but it would
IMO bring more benefits and less (intermediate/transitional) churn.

>>> Or depending on whether it's nicer, also wrapping helpers from
>>> proxmox-sys and friends where we already have similar functionality in
>>> our Rust code.
>>
>> 7. While I'm a big fan of re-using existing code, I don't think it
>> applies here -- I think it's fine to keep *certain* things separate and
>> decoupled from one another until we actually find that there's a lot of
>> common functionality between two or more things (speaking generally
>> here). For PVE::Path and PVE::Filesystem in particular, we can always
>> bridge over to Rust via perlmod for individual functions if needed (4.)
>> nevertheless, if that even ends up being necessary (6.).

+1

>> With all that being said, I hope I could convey my reasoning here and
>> shine some light on my design decisions -- please let me know what you
>> think! And thanks for having a look :)
>>
>> [1]: https://github.com/pyca/cryptography/tree/7fd5f95354e33d9ca90ba854e9cbda958968043a/src
> 
> All that said, yes, it might be a too small library in this case. OTOH,
> it might be a good place to start.

IMO not really, a good place to start is to split up and rework pve-common's
into more modules with a clearer scope and also into more binary packages
and with some tests added.

That should happen in perl first, and would already bring us a lot of gains
there for the next few years where the transition is going on.
Also other stuff like making the guest configs "real" section configs,
which the would (or should) be in rust then too.

Then we might even switch some more complicated or finicky helpers over to
rust, but not something I'd start out with now.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem
  2025-01-03 10:41         ` Thomas Lamprecht
@ 2025-01-03 12:37           ` Fiona Ebner
  0 siblings, 0 replies; 23+ messages in thread
From: Fiona Ebner @ 2025-01-03 12:37 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox VE development discussion, Max Carrara

Thank you for the detailed answer and sketching out the plans for the
transition to Rust and when perlmod should be used. That clarifies a lot
for me!

Just wanted to ask about more details for the qemu-server example. Let's
say we get to a point where it is nicely split up into dedicated modules
and we can use storage functionality from Rust already. How would we go
about translating qemu-server to Rust then? Start by translating the
lower-level modules (e.g. QMP, memory, CPU, ...) using perlmod to
preserve the functionality for existing callers and then in a second
step, switch over the API endpoints one-by-one? Or do you have a
different way in mind?


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
  2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
@ 2025-01-08 14:05   ` Wolfgang Bumiller
  2025-01-09  9:56     ` Max Carrara
  0 siblings, 1 reply; 23+ messages in thread
From: Wolfgang Bumiller @ 2025-01-08 14:05 UTC (permalink / raw)
  To: Max Carrara; +Cc: pve-devel

On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> The PVE::Path module concerns itself with file / directory path
> operations, like getting the parent directory of a path, extracting
> the file name of a path, splitting a path into its individual
> components, joining path components together, comparing paths, and so
> on.
> 
> This module is added here in order to address the shortcomings of the
> Perl core modules (such as lacking a lot the aforementioned
> functionalities) without relying on some kind of object-based
> abstraction.
> 
> PVE::Path strictly only contains functions that manipulate file paths
> without meddling with the filesystem. The reasoning here is to be able
> to just import any function when its necessary without having to
> change the surrounding code in order to adapt to some kind of
> abstraction.
> 
> The other motivation for this module is that path manipulation
> operations have been getting more and more common recently, especially
> in regards to storage.
> 
> Signed-off-by: Max Carrara <m.carrara@proxmox.com>
> ---
> Changes v1 --> v2:
>   * Improve some wording in the docstring of path_components
>   * Simplify some logic in path_parent and remove an unnecessary sanity
>     check
>   * Actually treat "foo" as "./foo" in path_parent as mentioned in the
>     docstring -- This means that path_parent("foo") now returns "."
>     instead of "".
>   * Adapt the path_with_file_* functions to the above accordingly, so
>     that e.g. path_with_file_name("foo", "bar") returns "bar" instead of
>     "./bar".
>   * Improve the "boolean" behaviour of path_is_absolute and
>     path_is_absolute and return 1 when true, but use an empty return
>     when false.
>     - An empty return means "undef" in scalar context and an empty list
>       in list context, so those functions will always return something
>       that's correctly truthy or falsy for Perl, regardless of context
> 
>  src/Makefile    |   1 +
>  src/PVE/Path.pm | 987 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 988 insertions(+)
>  create mode 100644 src/PVE/Path.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index 2d8bdc4..25bc490 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -23,6 +23,7 @@ LIB_SOURCES = \
>  	LDAP.pm \
>  	Network.pm \
>  	OTP.pm \
> +	Path.pm \
>  	PBSClient.pm \
>  	PTY.pm \
>  	ProcFSTools.pm \
> diff --git a/src/PVE/Path.pm b/src/PVE/Path.pm
> new file mode 100644
> index 0000000..221e662
> --- /dev/null
> +++ b/src/PVE/Path.pm
> @@ -0,0 +1,987 @@
> +=head1 NAME
> +
> +C<PVE::Path> - Utilities related to handling file and directory paths
> +
> +=head1 DESCRIPTION
> +
> +This module provides functions concerned with file and directory path
> +manipulation.
> +
> +None of the functions provided alter the filesystem in any way.
> +
> +The reason for this module's existence is to address a couple shortcomings:
> +
> +=over
> +
> +=item 1. The Perl core modules lack most of what is required for manipulating
> +paths, for example getting the parent directory of a path, extracting the
> +prefix of a file name (the "stem"), extracting the suffixes of a file name (the
> +"endings" or "extensions"), checking whether two paths are the same, and so on.
> +
> +=item 2. If the Perl core modules provide something in that regard, it's usually
> +provided in a not very ergonomic manner (L<C<File::Basename>>).
> +
> +=item 3. Additionally, the path utilities of the core modules are scattered
> +across multiple modules, making them hard to discover.
> +
> +=item 4. Third-party libraries on CPAN mostly provide objects representing
> +paths. Using any of these would require fundamental changes on how file paths
> +are handled throughout our code, for almost no benefit.
> +
> +=back
> +
> +C<L<PVE::Path>> instead does without objects and strictly provides functions
> +for path manipulation only. Any operation that is needed can simply be
> +performed ad hoc by importing the corresponding function and doesn't require
> +the surrounding code to conform to an abstraction like a path object.
> +
> +Additionally, some of the core modules' functionality is re-exported or
> +re-implemented for ergonomic or logical purposes. The goal is to provide
> +functions that don't come with any surprises and just behave like one assumes
> +they would.
> +
> +This module takes inspiration from Rust's C<std::path> and Python's C<pathlib>,
> +which are more modern path manipulation libraries.
> +
> +=head1 LIMITATIONS
> +
> +This module is limited to manipulating Unix-like / Linux file paths.
> +
> +=cut
> +
> +package PVE::Path;
> +
> +use strict;
> +use warnings;
> +
> +use Carp qw(carp croak confess);
> +use File::Spec ();
> +use List::Util qw(any zip_shortest zip_longest);
> +
> +use Exporter qw(import);
> +
> +our @EXPORT_OK = qw(
> +    path_is_absolute
> +    path_is_relative
> +
> +    path_components
> +    path_join
> +
> +    path_normalize
> +
> +    path_parent
> +    path_push
> +    path_pop
> +
> +    path_file_name
> +    path_with_file_name
> +
> +    path_file_prefix
> +    path_with_file_prefix
> +
> +    path_file_suffixes
> +    path_with_file_suffixes
> +
> +    path_file_suffix
> +    path_with_file_suffix
> +
> +    path_file_parts
> +
> +    path_starts_with
> +    path_ends_with
> +    path_equals
> +);
> +
> +=head2 FUNCTIONS
> +
> +=cut
> +
> +=head3 path_is_absolute($path)
> +
> +Returns C<1> if C<$path> is absolute (starts with a C</>).
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_is_absolute : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    if ($path =~ m#^/#) {
> +	return 1;
> +    }
> +
> +    return;
> +}
> +
> +=head3 path_is_relative($path)
> +
> +Returns C<1> if C<$path> is relative (doesn't start with a C</>).
> +
> +The opposite of C<L<< path_is_absolute()|/"path_is_absolute($path)" >>>.
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_is_relative : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    if ($path !~ m#^/#) {
> +	return 1;
> +    }
> +
> +    return;
> +}
> +
> +=head3 path_components($path)
> +
> +Returns a list of the given C<$path>'s individual components.
> +
> +In scalar context, returns a reference to a list.
> +
> +The C<$path> is normalized a little during the parse:
> +
> +=over
> +
> +=item Repeated occurrences of C</> are removed, so C<foo/bar> and C<foo//bar>
> +both have C<foo> and C<bar> as components.
> +
> +=item Trailing slashes C</> are removed.
> +
> +=item Occurrences of C<.> are normalized away, except the first C<.> at
> +beginning of a path. This means that C<foo/bar>, C<foo/./bar>, C<foo/bar/.>,
> +C<foo/././bar/./.>, etc. all have C<foo> and C<bar> as components, while
> +C<./foo/bar>, C<./././foo/bar>, C<./foo/./bar/.> have C<.>, C<foo> and C<bar>
> +as components.
> +
> +=back
> +
> +No other normalization is performed to account for the possibility of symlinks
> +existing. This means that C<foo/baz> and C<foo/bar/../baz> are distinct (because
> +C<bar> could be a symlink and thus C<foo> isn't its parent).
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_components : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my $is_abs = path_is_absolute($path);
> +    my $has_cur_dir = $path =~ m#^\.$|^\./#;
> +
> +    my @components = split('/', $path);
> +    my @normalized_components = ();
> +
> +    for my $component (@components) {
> +	next if $component eq '' || $component eq '.';
> +
> +	push(@normalized_components, $component);
> +    }
> +
> +    unshift(@normalized_components, '/') if $is_abs;

^ This case should probably also be explicitly mentioned in the docs.
(`path_file_name()` also relies on it)

> +    unshift(@normalized_components, '.') if $has_cur_dir;
> +
> +    return @normalized_components if wantarray;
> +    return \@normalized_components;
> +}
> +
> +
> +=head3 path_join(@paths)
> +
> +Joins multiple paths together. All kinds of paths are supported.
> +
> +Does not perform any C<L<< normalization|/"path_normalize($path)" >>>.
> +
> +Throws an exception if any of the passed C<@paths> is C<undef>.

This should probably point to the `path_push` documentation mentioning
the absolute path special case.

> +
> +=cut
> +
> +sub path_join : prototype(@) {
> +    my (@paths) = @_;
> +
> +    if (!scalar(@paths)) {
> +	return '';
> +    }
> +
> +    croak "one of the provided paths is undef" if any { !defined($_) } @paths;
> +
> +    # Find the last occurrence of a root directory and start conjoining the
> +    # components from there onwards
> +    my $index = scalar(@paths) - 1;
> +    while ($index > 0) {
> +	last if $paths[$index] =~ m#^/#;
> +	$index--;
> +    }
> +
> +    @paths = @paths[$index .. (scalar(@paths) - 1)];
> +
> +    my $resulting_path = shift @paths;
> +
> +    for my $path (@paths) {
> +	$resulting_path = path_push($resulting_path, $path);
> +    }
> +
> +    return $resulting_path;
> +}
> +
> +=head3 path_normalize($path)
> +
> +Wrapper for L<C<File::Spec/canonpath>>. Performs a logical cleanup of the given
> +C<$path>.
> +
> +This removes unnecessary components of a path that can be safely
> +removed, such as C<.>, trailing C</> or repeated occurrences of C</>.
> +
> +For example, C<foo/./bar/baz/.> and C<foo////bar//baz//> will both become
> +C<foo/bar/baz>.
> +
> +B<Note:> This will I<not> remove components referencing the parent directory,
> +i.e. C<..>. For example, C<foo/bar/../baz> and C<foo/bar/baz/..> will therefore
> +remain as they are. However, the parent directory of C</> is C</>, so
> +C</../../foo> will be normalized to C</foo>.
> +
> +Throws an exception if C<$path> is C<undef> or the call to C<canonpath> failed.
> +
> +=cut
> +
> +sub path_normalize : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my $cleaned_path = eval {
> +	File::Spec->canonpath($path);
> +    };
> +
> +    croak "failed to clean up path: $@" if $@;
> +
> +    return $cleaned_path;
> +}
> +
> +=head3 path_parent($path)
> +
> +Returns the given C<$path> without its final component, if there is one.
> +
> +Trailing and repeated occurrences of C</> and C<.> are normalized on the fly
> +when needed. This means that e.g. C<foo////bar///.//> becomes C<foo>, but
> +C<foo/.//bar//./baz> becomes C<foo/.//bar>.
> +
> +This function's behaviour is almost identical to Rust's
> +L<< Path::parent|https://doc.rust-lang.org/std/path/struct.Path.html#method.parent >>,
> +with a few adaptations made wherever Perl treats things differently:
> +
> +=over
> +
> +=item * C</foo/bar> becomes C</foo>, C<foo/bar> becomes C<foo>.
> +
> +=item * C</foo> becomes C</>.
> +
> +=item * C<foo/bar/..> becomes C<foo/bar>.
> +
> +=item * C<foo/../bar> becomes C<foo/..>.
> +
> +=item * C<foo> is interpreted as C<./foo> and becomes C<.>. This is because Perl's
> +C<L<File::Spec/canonpath>> interprets C<./foo> and C<foo> as the same thing.
> +
> +=item * C</> and an I<empty string> result in C<undef> being returned.
> +
> +=item * C<.> results in an empty string.

^ Like in `path_components()`, this should mention symlinks, since the
parent of `/foo/..` is not necessarily `/foo` in the file system, but
will be here.

> +
> +=back
> +
> +Will raise an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_parent : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    if ($path eq '') {
> +	return;
> +    }
> +
> +    # A limit of -1 retains empty components at the end
> +    my @components = split('/', $path, -1);

^ Meh, if we didn't mind normalizing all multi-slash occurrances we could just split on `m@/+@` :S

> +
> +    # Trim off needless extra components until actual final component is encountered, e.g.
> +    # foo////bar////baz//// -> foo////bar////baz
> +    # /// -> /
> +    # ././//.///./ -> .
> +    while (scalar(@components) > 1 && ($components[-1] eq '' || $components[-1] eq '.')) {
> +	pop(@components);
> +    }
> +
> +    my $final_component = pop(@components);
> +
> +    # We had a root dir with needless extra components, e.g. "//" or "////" or "//.///./" etc.
> +    return if $final_component eq '';
> +
> +    # We had a current dir reference with needless extra components, e.g.
> +    # "././" or ".///////" or "./././//./././//" etc.
> +    return '' if $final_component eq '.';
> +
> +    # We had some other kind of single component like "foo", "bar" or "..",
> +    # and because File::Spec->canonpath treats "foo" and "./foo" the same,
> +    # return a single current dir reference
> +    return '.' if !scalar(@components);
> +
> +    # Trim off needless extra components until actual parent component is encountered, like above
> +    while (scalar(@components) > 1 && ($components[-1] eq '' || $components[-1] eq '.')) {
> +	pop(@components);
> +    }
> +
> +    # Handle lone root dir (@components with only one empty string)
> +    if (scalar(@components) == 1 && $components[0] eq '') {
> +	return '/';
> +    }
> +
> +    return join('/', @components);
> +}
> +
> +=head3 path_push($path, $other)
> +
> +Extends C<$path> with C<$other>, returning a new path.
> +
> +If C<$other> is absolute, it will be returned instead.

I was never a fan of this. But I guess we cannot change that in rust, so
for the sake of consistency...

> +
> +Throws an exception if any of the arguments is C<undef>.
> +
> +=cut
> +
> +sub path_push : prototype($$) {
> +    my ($path, $other) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "\$other is undef" if !defined($other);
> +
> +    return $other if path_is_absolute($other);
> +    return $path if $other eq '';

(^ Could swap the above 2 checks)

> +
> +    my $need_sep = $path ne '' && $path !~ m#/$#;
> +
> +    $path .= "/" if $need_sep;
> +    $path .= $other;
> +
> +    return $path;
> +}
> +
> +=head3 path_pop($path)
> +
> +Alias for C<L<< path_parent()|/"path_parent($path)" >>>.
> +
> +=cut
> +
> +sub path_pop : prototype($) {
> +    my ($path) = @_;
> +    return path_parent($path);
> +}
> +
> +=head3 path_file_name($path)
> +
> +Returns the last component of the given C<$path>, if it is a legal file name,
> +or C<undef> otherwise.
> +
> +If C<$path> is an empty string, C</>, C<.> or ends with a C<..> component,
> +there is no valid file name.
> +
> +B<Note:> This does not check whether the given C<$path> actually points to a
> +file or a directory etc.
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_file_name : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my @components = path_components($path);
> +
> +    if (!scalar(@components)) {
> +	return;
> +    }
> +
> +    if (
> +	scalar(@components) == 1
> +	&& ($components[0] eq '/' || $components[0] eq '.')
> +    ) {
> +	return;
> +    }
> +
> +    if ($components[-1] eq '..') {
> +	return;
> +    }
> +
> +    return $components[-1];
> +}
> +
> +=head3 path_with_file_name($path, $file_name)
> +
> +Returns C<$path> with C<$file_name> as the new last component.
> +
> +This is essentially like calling C<L<< path_parent()|/"path_parent($path)" >>>
> +and using C<L<< path_push()|/"path_push($path, $other)" >>> to append the new
> +file name, but handles a few extra cases:
> +
> +=over
> +
> +=item * If C<$path> is C</>, appends C<$file_name>.
> +
> +=item * If C<$path> is an empty string, appends C<$file_name>.
> +
> +=item * If C<$path> ends with a parent directory reference (C<..>), replaces it
> +with C<$file_name>.
> +
> +=back
> +
> +Throws an exception if any of the arguments is C<undef> or if C<$file_name>
> +contains a path separator (C</>).
> +
> +=cut
> +
> +sub path_with_file_name : prototype($$) {
> +    my ($path, $file_name) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "\$file_name is undef" if !defined($file_name);
> +    croak "\$file_name contains a path separator: $file_name" if $file_name =~ m|/|;
> +
> +    my $parent = path_parent($path);
> +
> +    # undef means that $path was either "" or "/", so we can just append to it
> +    return ($path . $file_name) if !defined($parent);
> +
> +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> +    if ($parent eq '.' && $path !~ m|/|) {
> +	return $file_name;
> +    }
> +
> +    return path_push($parent, $file_name);
> +}
> +
> +my sub _path_file_prefix : prototype($) {
> +    my ($file_name) = @_;
> +
> +    confess "\$file_name is undef" if !defined($file_name);
> +
> +    $file_name =~ s|^(\.*[^\.]*)||;
> +    my $prefix = $1;
> +
> +    # sanity check
> +    confess "\$prefix not matched" if !defined($prefix);

^ For the above 4 lines you should be able to just test the `=~`
directly:

    confess "..." if $file_name !~ s|^(\.*[^\.]*)||;

(note =~ -> !~)

Would it make sense to instead *match* those parts instead of s// and
return the extension as a 3rd value? I think `path_with_file_prefix()`
could use this instead of building and joining a suffix list?

No hard feelings, though.

> +
> +    return ($prefix, $file_name);
> +}
> +
> +=head3 path_file_prefix($path)
> +
> +Returns the prefix of the file name of the given C<$path>. If the C<$path> does
> +not have a valid file name and thus no prefix, C<undef> is returned instead.
> +
> +The prefix of a file name is the part before any extensions (suffixes).
> +
> +    my $prefix = path_file_prefix("/etc/resolv.conf");
> +    # resolv
> +
> +    my $prefix = path_file_prefix("/tmp/archive.tar.zst");
> +    # archive
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_file_prefix : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my $file_name = path_file_name($path);
> +    return undef if !defined($file_name);
> +
> +    my ($prefix, undef) = _path_file_prefix($file_name);
> +    return $prefix;
> +}
> +
> +=head3 path_with_file_prefix($path, $prefix)
> +
> +Returns C<$path> with a new C<$prefix>. This is similar to
> +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> +except that the file prefix is replaced or appended.
> +
> +If C<$path> does not have a file name or if C<$prefix> is an empty string,
> +C<undef> is returned.
> +
> +    my $new_path = path_with_file_prefix("/tmp/archive.tar.zst", "backup");
> +    # /tmp/backup.tar.zst
> +
> +    my $new_path = path_with_file_prefix("/etc/pve", "ceph");
> +    # /etc/ceph
> +
> +Throws an exception if any of the arguments is C<undef>, or if C<$prefix>
> +contains a path separator (C</>), ends with C<.>, or is an empty string.
> +
> +=cut
> +
> +sub path_with_file_prefix : prototype($$) {
> +    my ($path, $prefix) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "\$prefix is undef" if !defined($prefix);
> +    croak "\$prefix contains a path separator" if $prefix =~ m|/|;
> +    croak "\$prefix ends with a dot" if $prefix =~ m|\.$|;
> +
> +    return undef if $prefix eq '';
> +    return undef if !defined(path_file_name($path));
> +
> +    return $path if $prefix eq '';
> +
> +    my $parent = path_parent($path);
> +
> +    # sanity check -- should not happen because we checked for file name,
> +    # and the existence of a file name implies there's a parent
> +    confess "parent of \$path is undef" if !defined($parent);
> +
> +    my @suffixes = path_file_suffixes($path);
> +
> +    my $file_name = join(".", $prefix, @suffixes);
> +
> +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> +    # (Done also in path_with_file_name)
> +    if ($parent eq '.' && $path !~ m|/|) {
> +	return $file_name;
> +    }
> +
> +    return path_push($parent, $file_name);
> +}
> +
> +my sub _path_file_suffixes : prototype($) {
> +    my ($file_name_no_prefix) = @_;
> +
> +    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> +
> +    # Suffixes are extracted "manually" because join()ing the result of split()
> +    # results in a different file name than the original. Let's say you have a
> +    # file named "foo.bar.". The correct suffixes would be ("bar", "").
> +    # With split, you get the following:
> +    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
> +    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> +    my @suffixes = ();
> +    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> +	my $suffix = $1;
> +	$suffix =~ s|^\.||;
> +	push(@suffixes, $suffix);
> +    }
> +
> +    return @suffixes;
> +}
> +
> +=head3 path_file_suffixes($path)
> +
> +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> +not have a valid file name, an empty list is returned instead.
> +
> +In scalar context, returns a reference to a list.

(^ Isn't this a bit awkward?)

> +
> +The suffixes of a path are essentially the file name's extensions, the parts
> +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> +
> +    my @suffixes = path_file_suffixes("/etc/resolv.conf");
> +    # ("conf")
> +
> +    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> +    # ["tar", "zst"]
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_file_suffixes : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my $file_name = path_file_name($path);
> +    if (!defined($file_name)) {
> +	return wantarray ? () : [];
> +    }
> +
> +    (undef, $file_name) = _path_file_prefix($file_name);
> +
> +    my @suffixes = _path_file_suffixes($file_name);
> +
> +    return wantarray ? @suffixes : \@suffixes;
> +}
> +
> +=head3 path_with_file_suffixes($path, @suffixes)
> +
> +Returns C<$path> with new C<@suffixes>. This is similar to
> +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> +except that the suffixes of the file name are replaced.
> +
> +If the C<$path> does not have a file name, C<undef> is returned.
> +
> +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> +    # /tmp/archive.pxar.gz
> +
> +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> +    # /tmp/archive.gz
> +
> +If the file name has no suffixes, the C<@suffixes> are appended instead:
> +
> +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> +    # /etc/resolv.conf
> +
> +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> +    # /etc/resolv.conf.zst
> +
> +If there are no C<@suffixes> provided, the file name's suffixes will
> +be removed (if there are any):
> +
> +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> +    # /tmp/archive
> +
> +Note that an empty string is still a valid suffix (an "empty" file ending):
> +
> +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> +    # /tmp/archive....zst
> +
> +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> +if any suffix contains a path separator (C</>) or a C<.>.
> +
> +=cut
> +
> +sub path_with_file_suffixes : prototype($@) {
> +    my ($path, @suffixes) = @_;

I am questioning a bit the sanity of having "suffixes" throughout this
module instead of simply an "extension" that covers them all and can be
split if needed...

Do we have/anticipate particular use cases where this is more
convenient?

> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "one of the provided suffixes is undef"
> +	if any { !defined($_) } @suffixes;
> +    croak "one of the provided suffixes contains a path separator"
> +	if any { $_ =~ m|/| } @suffixes;
> +    croak "one of the provided suffixes contains a dot"
> +	if any { $_ =~ m|\.| } @suffixes;
> +
> +    return undef if !defined(path_file_name($path));
> +
> +    my $parent = path_parent($path);
> +
> +    # sanity check -- should not happen because we checked for file name,
> +    # and the existence of a file name implies there's a parent
> +    confess "parent of \$path is undef" if !defined($parent);
> +
> +    # Don't modify $path if there are no suffixes to be removed
> +    my @existing_suffixes = path_file_suffixes($path);
> +    return $path if !scalar(@suffixes) && !scalar(@existing_suffixes);
> +
> +    my $prefix = path_file_prefix($path);
> +
> +    # sanity check
> +    confess "\$prefix is undef" if !defined($prefix);
> +
> +    my $file_name = join(".", $prefix, @suffixes);
> +
> +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> +    # (Done also in path_with_file_name)
> +    if ($parent eq '.' && $path !~ m|/|) {
> +	return $file_name;
> +    }
> +
> +    return path_push($parent, $file_name);
> +}
> +
> +=head3 path_file_suffix($path)
> +
> +Returns the suffix of the C<$path>'s file name. If the C<$path> does not have a
> +valid file name or if the file name has no suffix, C<undef> is returned
> +instead.
> +
> +The suffix of a file name is essentially its extension, e.g.
> +C</etc/resolv.conf> has the suffix C<conf>. If there are multiple suffixes,
> +only the last will be returned; e.g. C</tmp/archive.tar.gz> has the suffix C<gz>.
> +
> +B<Note:> Files like e.g. C</tmp/foo.> have an empty string as suffix.
> +
> +For getting all suffixes of a path, see C<L<< path_file_suffixes()|/"path_file_suffixes($path)" >>>.
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_file_suffix : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my $file_name = path_file_name($path);
> +    return undef if !defined($file_name);
> +
> +    (undef, $file_name) = _path_file_prefix($file_name);
> +
> +    my @suffixes = _path_file_suffixes($file_name);
> +
> +    return pop(@suffixes);
> +}
> +
> +=head3 path_with_file_suffix($path, $suffix)
> +
> +Returns C<$path> with a new C<$suffix>. This is similar to
> +C<L<< path_with_file_suffixes()|/"path_with_file_suffixes($path, @suffixes)" >>>,
> +except that only the last suffix of the file name is replaced.
> +
> +If the C<$path> does not have a file name, C<undef> is returned.
> +
> +    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", "gz");
> +    # /tmp/archive.tar.gz
> +
> +If the file name has no suffixes, the C<$suffix> is appended instead:
> +
> +    my $new_path = path_with_file_suffix("/etc/resolv", "conf");
> +    # /etc/resolv.conf
> +
> +If C<$suffix> is C<undef>, the file name's (last) suffix will be removed (if
> +there is one):
> +
> +    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", undef);
> +    # /tmp/archive.tar
> +
> +    my $new_path = path_with_file_suffix("/etc/resolv", undef);
> +    # /etc/resolv
> +
> +Note that an empty string is still a valid suffix (an "empty" file ending):
> +
> +    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", "");
> +    # /tmp/archive.tar.
> +
> +    my $new_path = path_with_file_suffix("/etc/resolv", "");
> +    # /etc/resolv.
> +
> +Throws an exception if any of the arguments is C<undef>, or if C<$suffix>
> +contains a path separator (C</>) or a C<.>.
> +
> +=cut
> +
> +sub path_with_file_suffix : prototype($$) {
> +    my ($path, $suffix) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    if (defined($suffix)) {
> +	croak "\$suffix contains a path separator" if $suffix =~ m|/|;
> +	croak "\$suffix contains a dot" if $suffix =~ m|\.|;
> +    }
> +
> +    return undef if !defined(path_file_name($path));
> +
> +    my $parent = path_parent($path);
> +
> +    # sanity check -- should not happen because we checked for file name,
> +    # and the existence of a file name implies there's a parent
> +    confess "parent of \$path is undef" if !defined($parent);
> +
> +    my @suffixes = path_file_suffixes($path);
> +
> +    # Don't modify $path if there is no suffix to be removed
> +    return $path if !scalar(@suffixes) && !defined($suffix);
> +
> +    pop(@suffixes);
> +    push(@suffixes, $suffix) if defined($suffix);
> +
> +    my $prefix = path_file_prefix($path);
> +
> +    # sanity check
> +    confess "\$prefix is undef" if !defined($prefix);
> +
> +    my $file_name = join(".", $prefix, @suffixes);
> +
> +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> +    # (Done also in path_with_file_name)
> +    if ($parent eq '.' && $path !~ m|/|) {
> +	return $file_name;
> +    }
> +
> +    return path_push($parent, $file_name);
> +}
> +
> +=head3 path_file_parts($path)
> +
> +Returns the parts that constitute the file name (prefix and suffixes) of a
> +C<$path> as a list. If the C<$path> does not have a valid file name, an empty
> +list is returned instead.
> +
> +In scalar context, returns a reference to a list.
> +
> +These parts are split in such a way that allows them to be C<join>ed together,
> +resulting in the original file name of the given C<$path> again.
> +
> +    my $file_parts = path_file_parts("/etc/pve/firewall/cluster.fw");
> +    # ("cluster", "fw")
> +    my $file_name = join(".", $file_parts->@*);
> +
> +    my @file_parts = path_file_parts("/tmp/archive.tar.gz");
> +    # ("archive", "tar", "gz")
> +    my $file_name = join(".", @file_parts);
> +
> +Throws an exception if C<$path> is C<undef>.
> +
> +=cut
> +
> +sub path_file_parts : prototype($) {
> +    my ($path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +
> +    my $file_name = path_file_name($path);
> +    if (!defined($file_name)) {
> +	return wantarray ? () : [];
> +    }
> +
> +    my $prefix;
> +    ($prefix, $file_name) = _path_file_prefix($file_name);
> +
> +    my @suffixes = _path_file_suffixes($file_name);
> +
> +    my @file_parts = ($prefix, @suffixes);
> +
> +    return wantarray ? @file_parts : \@file_parts;
> +}
> +
> +=head3 path_starts_with($path, $other_path)
> +
> +Checks whether a C<$path> starts with the components of C<$other_path>.
> +
> +    my $starts_with = path_starts_with("/etc/pve/firewall/cluster.fw", "/etc/pve");
> +    # 1
> +
> +Throws an exception if any of the arguments is C<undef>.
> +
> +=cut
> +
> +sub path_starts_with : prototype($$) {
> +    my ($path, $other_path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "\$other_path if undef" if !defined($other_path);
> +
> +    # Nothing starts with nothing
> +    return 1 if ($path eq '' && $other_path eq '');
> +
> +    # Nothing cannot start with something
> +    # Something cannot start with nothing
> +    return if ($path eq '' || $other_path eq '');
> +
> +    my @components = path_components($path);
> +    my @other_components = path_components($other_path);
> +
> +    my @pairs = zip_shortest(\@components, \@other_components);
> +
> +    # for my ($comp, $other_comp) (@pairs) is experimental
> +    for my $pair (@pairs) {
> +	my ($comp, $other_comp) = $pair->@*;
> +
> +	if ($comp ne $other_comp) {
> +	    return;
> +	}
> +    }
> +
> +    return 1;
> +}
> +
> +=head3 path_ends_with($path, $other_path)
> +
> +Checks whether a C<$path> ends with the components of C<$other_path>.
> +
> +    my $ends_with = path_ends_with("/etc/pve/firewall/cluster.fw", "firewall/cluster.fw");
> +    # 1
> +
> +Throws an exception if any of the arguments is C<undef>.
> +
> +=cut
> +
> +sub path_ends_with : prototype($$) {
> +    my ($path, $other_path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "\$other_path if undef" if !defined($other_path);
> +
> +    # Nothing ends with nothing
> +    return 1 if ($path eq '' && $other_path eq '');
> +
> +    # Nothing cannot end with something
> +    # Something cannot end with nothing
> +    return if ($path eq '' || $other_path eq '');
> +
> +    my @components_rev = reverse(path_components($path));
> +    my @other_components_rev = reverse(path_components($other_path));
> +
> +    my @pairs_rev = zip_shortest(\@components_rev, \@other_components_rev);
> +
> +    # for my ($comp, $other_comp) (@pairs_rev) is experimental
> +    for my $pair (@pairs_rev) {
> +	my ($comp, $other_comp) = $pair->@*;
> +
> +	if ($comp ne $other_comp) {
> +	    return;
> +	}
> +    }
> +
> +    return 1;
> +}
> +
> +=head3 path_equals($path, $other_path)
> +
> +Checks whether C<$path> equals C<$other_path>. The paths are compared
> +by their components, meaning that it's not necessary to
> +L<< normalize|/"path_normalize($path)" >> them beforehand.
> +
> +=cut
> +
> +sub path_equals : prototype($$) {
> +    my ($path, $other_path) = @_;
> +
> +    croak "\$path is undef" if !defined($path);
> +    croak "\$other_path if undef" if !defined($other_path);
> +
> +    # Nothing is nothing
> +    return 1 if ($path eq '' && $other_path eq '');
> +
> +    # Nothing is not something
> +    # Something is not nothing
> +    return if ($path eq '' || $other_path eq '');
> +
> +    my @components = path_components($path);
> +    my @other_components = path_components($other_path);
> +
> +    return if scalar(@components) != scalar(@other_components);
> +
> +    my @pairs = zip_longest(\@components, \@other_components);
> +
> +    # for my ($comp, $other_comp) (@pairs_rev) is experimental
> +    for my $pair (@pairs) {
> +	my ($comp, $other_comp) = $pair->@*;
> +
> +	return if !defined($comp) || !defined($other_comp);
> +
> +	if ($comp ne $other_comp) {
> +	    return;
> +	}
> +    }
> +
> +    return 1;
> +}
> +
> +1;
> -- 
> 2.39.5


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
  2025-01-08 14:05   ` Wolfgang Bumiller
@ 2025-01-09  9:56     ` Max Carrara
  2025-01-09 11:06       ` Wolfgang Bumiller
  0 siblings, 1 reply; 23+ messages in thread
From: Max Carrara @ 2025-01-09  9:56 UTC (permalink / raw)
  To: Wolfgang Bumiller; +Cc: pve-devel

On Wed Jan 8, 2025 at 3:05 PM CET, Wolfgang Bumiller wrote:
> On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> > The PVE::Path module concerns itself with file / directory path
> > operations, like getting the parent directory of a path, extracting
> > the file name of a path, splitting a path into its individual
> > components, joining path components together, comparing paths, and so
> > on.
> > 
> > This module is added here in order to address the shortcomings of the
> > Perl core modules (such as lacking a lot the aforementioned
> > functionalities) without relying on some kind of object-based
> > abstraction.
> > 
> > PVE::Path strictly only contains functions that manipulate file paths
> > without meddling with the filesystem. The reasoning here is to be able
> > to just import any function when its necessary without having to
> > change the surrounding code in order to adapt to some kind of
> > abstraction.
> > 
> > The other motivation for this module is that path manipulation
> > operations have been getting more and more common recently, especially
> > in regards to storage.
> > 
> > Signed-off-by: Max Carrara <m.carrara@proxmox.com>
> > ---
> > Changes v1 --> v2:
> >   * Improve some wording in the docstring of path_components
> >   * Simplify some logic in path_parent and remove an unnecessary sanity
> >     check
> >   * Actually treat "foo" as "./foo" in path_parent as mentioned in the
> >     docstring -- This means that path_parent("foo") now returns "."
> >     instead of "".
> >   * Adapt the path_with_file_* functions to the above accordingly, so
> >     that e.g. path_with_file_name("foo", "bar") returns "bar" instead of
> >     "./bar".
> >   * Improve the "boolean" behaviour of path_is_absolute and
> >     path_is_absolute and return 1 when true, but use an empty return
> >     when false.
> >     - An empty return means "undef" in scalar context and an empty list
> >       in list context, so those functions will always return something
> >       that's correctly truthy or falsy for Perl, regardless of context
> > 
> >  src/Makefile    |   1 +
> >  src/PVE/Path.pm | 987 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 988 insertions(+)
> >  create mode 100644 src/PVE/Path.pm
> > 
> > diff --git a/src/Makefile b/src/Makefile
> > index 2d8bdc4..25bc490 100644
> > --- a/src/Makefile
> > +++ b/src/Makefile
> > @@ -23,6 +23,7 @@ LIB_SOURCES = \
> >  	LDAP.pm \
> >  	Network.pm \
> >  	OTP.pm \
> > +	Path.pm \
> >  	PBSClient.pm \
> >  	PTY.pm \
> >  	ProcFSTools.pm \
> > diff --git a/src/PVE/Path.pm b/src/PVE/Path.pm
> > new file mode 100644
> > index 0000000..221e662
> > --- /dev/null
> > +++ b/src/PVE/Path.pm
> > @@ -0,0 +1,987 @@
> > +=head1 NAME
> > +
> > +C<PVE::Path> - Utilities related to handling file and directory paths
> > +
> > +=head1 DESCRIPTION
> > +
> > +This module provides functions concerned with file and directory path
> > +manipulation.
> > +
> > +None of the functions provided alter the filesystem in any way.
> > +
> > +The reason for this module's existence is to address a couple shortcomings:
> > +
> > +=over
> > +
> > +=item 1. The Perl core modules lack most of what is required for manipulating
> > +paths, for example getting the parent directory of a path, extracting the
> > +prefix of a file name (the "stem"), extracting the suffixes of a file name (the
> > +"endings" or "extensions"), checking whether two paths are the same, and so on.
> > +
> > +=item 2. If the Perl core modules provide something in that regard, it's usually
> > +provided in a not very ergonomic manner (L<C<File::Basename>>).
> > +
> > +=item 3. Additionally, the path utilities of the core modules are scattered
> > +across multiple modules, making them hard to discover.
> > +
> > +=item 4. Third-party libraries on CPAN mostly provide objects representing
> > +paths. Using any of these would require fundamental changes on how file paths
> > +are handled throughout our code, for almost no benefit.
> > +
> > +=back
> > +
> > +C<L<PVE::Path>> instead does without objects and strictly provides functions
> > +for path manipulation only. Any operation that is needed can simply be
> > +performed ad hoc by importing the corresponding function and doesn't require
> > +the surrounding code to conform to an abstraction like a path object.
> > +
> > +Additionally, some of the core modules' functionality is re-exported or
> > +re-implemented for ergonomic or logical purposes. The goal is to provide
> > +functions that don't come with any surprises and just behave like one assumes
> > +they would.
> > +
> > +This module takes inspiration from Rust's C<std::path> and Python's C<pathlib>,
> > +which are more modern path manipulation libraries.
> > +
> > +=head1 LIMITATIONS
> > +
> > +This module is limited to manipulating Unix-like / Linux file paths.
> > +
> > +=cut
> > +
> > +package PVE::Path;
> > +
> > +use strict;
> > +use warnings;
> > +
> > +use Carp qw(carp croak confess);
> > +use File::Spec ();
> > +use List::Util qw(any zip_shortest zip_longest);
> > +
> > +use Exporter qw(import);
> > +
> > +our @EXPORT_OK = qw(
> > +    path_is_absolute
> > +    path_is_relative
> > +
> > +    path_components
> > +    path_join
> > +
> > +    path_normalize
> > +
> > +    path_parent
> > +    path_push
> > +    path_pop
> > +
> > +    path_file_name
> > +    path_with_file_name
> > +
> > +    path_file_prefix
> > +    path_with_file_prefix
> > +
> > +    path_file_suffixes
> > +    path_with_file_suffixes
> > +
> > +    path_file_suffix
> > +    path_with_file_suffix
> > +
> > +    path_file_parts
> > +
> > +    path_starts_with
> > +    path_ends_with
> > +    path_equals
> > +);
> > +
> > +=head2 FUNCTIONS
> > +
> > +=cut
> > +
> > +=head3 path_is_absolute($path)
> > +
> > +Returns C<1> if C<$path> is absolute (starts with a C</>).
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_is_absolute : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    if ($path =~ m#^/#) {
> > +	return 1;
> > +    }
> > +
> > +    return;
> > +}
> > +
> > +=head3 path_is_relative($path)
> > +
> > +Returns C<1> if C<$path> is relative (doesn't start with a C</>).
> > +
> > +The opposite of C<L<< path_is_absolute()|/"path_is_absolute($path)" >>>.
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_is_relative : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    if ($path !~ m#^/#) {
> > +	return 1;
> > +    }
> > +
> > +    return;
> > +}
> > +
> > +=head3 path_components($path)
> > +
> > +Returns a list of the given C<$path>'s individual components.
> > +
> > +In scalar context, returns a reference to a list.
> > +
> > +The C<$path> is normalized a little during the parse:
> > +
> > +=over
> > +
> > +=item Repeated occurrences of C</> are removed, so C<foo/bar> and C<foo//bar>
> > +both have C<foo> and C<bar> as components.
> > +
> > +=item Trailing slashes C</> are removed.
> > +
> > +=item Occurrences of C<.> are normalized away, except the first C<.> at
> > +beginning of a path. This means that C<foo/bar>, C<foo/./bar>, C<foo/bar/.>,
> > +C<foo/././bar/./.>, etc. all have C<foo> and C<bar> as components, while
> > +C<./foo/bar>, C<./././foo/bar>, C<./foo/./bar/.> have C<.>, C<foo> and C<bar>
> > +as components.
> > +
> > +=back
> > +
> > +No other normalization is performed to account for the possibility of symlinks
> > +existing. This means that C<foo/baz> and C<foo/bar/../baz> are distinct (because
> > +C<bar> could be a symlink and thus C<foo> isn't its parent).
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_components : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my $is_abs = path_is_absolute($path);
> > +    my $has_cur_dir = $path =~ m#^\.$|^\./#;
> > +
> > +    my @components = split('/', $path);
> > +    my @normalized_components = ();
> > +
> > +    for my $component (@components) {
> > +	next if $component eq '' || $component eq '.';
> > +
> > +	push(@normalized_components, $component);
> > +    }
> > +
> > +    unshift(@normalized_components, '/') if $is_abs;
>
> ^ This case should probably also be explicitly mentioned in the docs.
> (`path_file_name()` also relies on it)

Good catch! Will add that.

>
> > +    unshift(@normalized_components, '.') if $has_cur_dir;
> > +
> > +    return @normalized_components if wantarray;
> > +    return \@normalized_components;
> > +}
> > +
> > +
> > +=head3 path_join(@paths)
> > +
> > +Joins multiple paths together. All kinds of paths are supported.
> > +
> > +Does not perform any C<L<< normalization|/"path_normalize($path)" >>>.
> > +
> > +Throws an exception if any of the passed C<@paths> is C<undef>.
>
> This should probably point to the `path_push` documentation mentioning
> the absolute path special case.

Another good catch; I agree.

>
> > +
> > +=cut
> > +
> > +sub path_join : prototype(@) {
> > +    my (@paths) = @_;
> > +
> > +    if (!scalar(@paths)) {
> > +	return '';
> > +    }
> > +
> > +    croak "one of the provided paths is undef" if any { !defined($_) } @paths;
> > +
> > +    # Find the last occurrence of a root directory and start conjoining the
> > +    # components from there onwards
> > +    my $index = scalar(@paths) - 1;
> > +    while ($index > 0) {
> > +	last if $paths[$index] =~ m#^/#;
> > +	$index--;
> > +    }
> > +
> > +    @paths = @paths[$index .. (scalar(@paths) - 1)];
> > +
> > +    my $resulting_path = shift @paths;
> > +
> > +    for my $path (@paths) {
> > +	$resulting_path = path_push($resulting_path, $path);
> > +    }
> > +
> > +    return $resulting_path;
> > +}
> > +
> > +=head3 path_normalize($path)
> > +
> > +Wrapper for L<C<File::Spec/canonpath>>. Performs a logical cleanup of the given
> > +C<$path>.
> > +
> > +This removes unnecessary components of a path that can be safely
> > +removed, such as C<.>, trailing C</> or repeated occurrences of C</>.
> > +
> > +For example, C<foo/./bar/baz/.> and C<foo////bar//baz//> will both become
> > +C<foo/bar/baz>.
> > +
> > +B<Note:> This will I<not> remove components referencing the parent directory,
> > +i.e. C<..>. For example, C<foo/bar/../baz> and C<foo/bar/baz/..> will therefore
> > +remain as they are. However, the parent directory of C</> is C</>, so
> > +C</../../foo> will be normalized to C</foo>.
> > +
> > +Throws an exception if C<$path> is C<undef> or the call to C<canonpath> failed.
> > +
> > +=cut
> > +
> > +sub path_normalize : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my $cleaned_path = eval {
> > +	File::Spec->canonpath($path);
> > +    };
> > +
> > +    croak "failed to clean up path: $@" if $@;
> > +
> > +    return $cleaned_path;
> > +}
> > +
> > +=head3 path_parent($path)
> > +
> > +Returns the given C<$path> without its final component, if there is one.
> > +
> > +Trailing and repeated occurrences of C</> and C<.> are normalized on the fly
> > +when needed. This means that e.g. C<foo////bar///.//> becomes C<foo>, but
> > +C<foo/.//bar//./baz> becomes C<foo/.//bar>.
> > +
> > +This function's behaviour is almost identical to Rust's
> > +L<< Path::parent|https://doc.rust-lang.org/std/path/struct.Path.html#method.parent >>,
> > +with a few adaptations made wherever Perl treats things differently:
> > +
> > +=over
> > +
> > +=item * C</foo/bar> becomes C</foo>, C<foo/bar> becomes C<foo>.
> > +
> > +=item * C</foo> becomes C</>.
> > +
> > +=item * C<foo/bar/..> becomes C<foo/bar>.
> > +
> > +=item * C<foo/../bar> becomes C<foo/..>.
> > +
> > +=item * C<foo> is interpreted as C<./foo> and becomes C<.>. This is because Perl's
> > +C<L<File::Spec/canonpath>> interprets C<./foo> and C<foo> as the same thing.
> > +
> > +=item * C</> and an I<empty string> result in C<undef> being returned.
> > +
> > +=item * C<.> results in an empty string.
>
> ^ Like in `path_components()`, this should mention symlinks, since the
> parent of `/foo/..` is not necessarily `/foo` in the file system, but
> will be here.

I agree with this as well; will add a note to the docs.

>
> > +
> > +=back
> > +
> > +Will raise an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_parent : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    if ($path eq '') {
> > +	return;
> > +    }
> > +
> > +    # A limit of -1 retains empty components at the end
> > +    my @components = split('/', $path, -1);
>
> ^ Meh, if we didn't mind normalizing all multi-slash occurrances we could just split on `m@/+@` :S

We could, but I'd rather keep it as it is; one can always use
path_normalize to get rid of those.

>
> > +
> > +    # Trim off needless extra components until actual final component is encountered, e.g.
> > +    # foo////bar////baz//// -> foo////bar////baz
> > +    # /// -> /
> > +    # ././//.///./ -> .
> > +    while (scalar(@components) > 1 && ($components[-1] eq '' || $components[-1] eq '.')) {
> > +	pop(@components);
> > +    }
> > +
> > +    my $final_component = pop(@components);
> > +
> > +    # We had a root dir with needless extra components, e.g. "//" or "////" or "//.///./" etc.
> > +    return if $final_component eq '';
> > +
> > +    # We had a current dir reference with needless extra components, e.g.
> > +    # "././" or ".///////" or "./././//./././//" etc.
> > +    return '' if $final_component eq '.';
> > +
> > +    # We had some other kind of single component like "foo", "bar" or "..",
> > +    # and because File::Spec->canonpath treats "foo" and "./foo" the same,
> > +    # return a single current dir reference
> > +    return '.' if !scalar(@components);
> > +
> > +    # Trim off needless extra components until actual parent component is encountered, like above
> > +    while (scalar(@components) > 1 && ($components[-1] eq '' || $components[-1] eq '.')) {
> > +	pop(@components);
> > +    }
> > +
> > +    # Handle lone root dir (@components with only one empty string)
> > +    if (scalar(@components) == 1 && $components[0] eq '') {
> > +	return '/';
> > +    }
> > +
> > +    return join('/', @components);
> > +}
> > +
> > +=head3 path_push($path, $other)
> > +
> > +Extends C<$path> with C<$other>, returning a new path.
> > +
> > +If C<$other> is absolute, it will be returned instead.
>
> I was never a fan of this. But I guess we cannot change that in rust, so
> for the sake of consistency...
>
> > +
> > +Throws an exception if any of the arguments is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_push : prototype($$) {
> > +    my ($path, $other) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "\$other is undef" if !defined($other);
> > +
> > +    return $other if path_is_absolute($other);
> > +    return $path if $other eq '';
>
> (^ Could swap the above 2 checks)

ACK

>
> > +
> > +    my $need_sep = $path ne '' && $path !~ m#/$#;
> > +
> > +    $path .= "/" if $need_sep;
> > +    $path .= $other;
> > +
> > +    return $path;
> > +}
> > +
> > +=head3 path_pop($path)
> > +
> > +Alias for C<L<< path_parent()|/"path_parent($path)" >>>.
> > +
> > +=cut
> > +
> > +sub path_pop : prototype($) {
> > +    my ($path) = @_;
> > +    return path_parent($path);
> > +}
> > +
> > +=head3 path_file_name($path)
> > +
> > +Returns the last component of the given C<$path>, if it is a legal file name,
> > +or C<undef> otherwise.
> > +
> > +If C<$path> is an empty string, C</>, C<.> or ends with a C<..> component,
> > +there is no valid file name.
> > +
> > +B<Note:> This does not check whether the given C<$path> actually points to a
> > +file or a directory etc.
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_file_name : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my @components = path_components($path);
> > +
> > +    if (!scalar(@components)) {
> > +	return;
> > +    }
> > +
> > +    if (
> > +	scalar(@components) == 1
> > +	&& ($components[0] eq '/' || $components[0] eq '.')
> > +    ) {
> > +	return;
> > +    }
> > +
> > +    if ($components[-1] eq '..') {
> > +	return;
> > +    }
> > +
> > +    return $components[-1];
> > +}
> > +
> > +=head3 path_with_file_name($path, $file_name)
> > +
> > +Returns C<$path> with C<$file_name> as the new last component.
> > +
> > +This is essentially like calling C<L<< path_parent()|/"path_parent($path)" >>>
> > +and using C<L<< path_push()|/"path_push($path, $other)" >>> to append the new
> > +file name, but handles a few extra cases:
> > +
> > +=over
> > +
> > +=item * If C<$path> is C</>, appends C<$file_name>.
> > +
> > +=item * If C<$path> is an empty string, appends C<$file_name>.
> > +
> > +=item * If C<$path> ends with a parent directory reference (C<..>), replaces it
> > +with C<$file_name>.
> > +
> > +=back
> > +
> > +Throws an exception if any of the arguments is C<undef> or if C<$file_name>
> > +contains a path separator (C</>).
> > +
> > +=cut
> > +
> > +sub path_with_file_name : prototype($$) {
> > +    my ($path, $file_name) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "\$file_name is undef" if !defined($file_name);
> > +    croak "\$file_name contains a path separator: $file_name" if $file_name =~ m|/|;
> > +
> > +    my $parent = path_parent($path);
> > +
> > +    # undef means that $path was either "" or "/", so we can just append to it
> > +    return ($path . $file_name) if !defined($parent);
> > +
> > +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> > +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> > +    if ($parent eq '.' && $path !~ m|/|) {
> > +	return $file_name;
> > +    }
> > +
> > +    return path_push($parent, $file_name);
> > +}
> > +
> > +my sub _path_file_prefix : prototype($) {
> > +    my ($file_name) = @_;
> > +
> > +    confess "\$file_name is undef" if !defined($file_name);
> > +
> > +    $file_name =~ s|^(\.*[^\.]*)||;
> > +    my $prefix = $1;
> > +
> > +    # sanity check
> > +    confess "\$prefix not matched" if !defined($prefix);
>
> ^ For the above 4 lines you should be able to just test the `=~`
> directly:
>
>     confess "..." if $file_name !~ s|^(\.*[^\.]*)||;
>
> (note =~ -> !~)

Ah, another good catch! Will incorporate this in v3 also.

>
> Would it make sense to instead *match* those parts instead of s// and
> return the extension as a 3rd value? I think `path_with_file_prefix()`
> could use this instead of building and joining a suffix list?
>
> No hard feelings, though.

Hmm, I'll give it a shot.

>
> > +
> > +    return ($prefix, $file_name);
> > +}
> > +
> > +=head3 path_file_prefix($path)
> > +
> > +Returns the prefix of the file name of the given C<$path>. If the C<$path> does
> > +not have a valid file name and thus no prefix, C<undef> is returned instead.
> > +
> > +The prefix of a file name is the part before any extensions (suffixes).
> > +
> > +    my $prefix = path_file_prefix("/etc/resolv.conf");
> > +    # resolv
> > +
> > +    my $prefix = path_file_prefix("/tmp/archive.tar.zst");
> > +    # archive
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_file_prefix : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my $file_name = path_file_name($path);
> > +    return undef if !defined($file_name);
> > +
> > +    my ($prefix, undef) = _path_file_prefix($file_name);
> > +    return $prefix;
> > +}
> > +
> > +=head3 path_with_file_prefix($path, $prefix)
> > +
> > +Returns C<$path> with a new C<$prefix>. This is similar to
> > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > +except that the file prefix is replaced or appended.
> > +
> > +If C<$path> does not have a file name or if C<$prefix> is an empty string,
> > +C<undef> is returned.
> > +
> > +    my $new_path = path_with_file_prefix("/tmp/archive.tar.zst", "backup");
> > +    # /tmp/backup.tar.zst
> > +
> > +    my $new_path = path_with_file_prefix("/etc/pve", "ceph");
> > +    # /etc/ceph
> > +
> > +Throws an exception if any of the arguments is C<undef>, or if C<$prefix>
> > +contains a path separator (C</>), ends with C<.>, or is an empty string.
> > +
> > +=cut
> > +
> > +sub path_with_file_prefix : prototype($$) {
> > +    my ($path, $prefix) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "\$prefix is undef" if !defined($prefix);
> > +    croak "\$prefix contains a path separator" if $prefix =~ m|/|;
> > +    croak "\$prefix ends with a dot" if $prefix =~ m|\.$|;
> > +
> > +    return undef if $prefix eq '';
> > +    return undef if !defined(path_file_name($path));
> > +
> > +    return $path if $prefix eq '';
> > +
> > +    my $parent = path_parent($path);
> > +
> > +    # sanity check -- should not happen because we checked for file name,
> > +    # and the existence of a file name implies there's a parent
> > +    confess "parent of \$path is undef" if !defined($parent);
> > +
> > +    my @suffixes = path_file_suffixes($path);
> > +
> > +    my $file_name = join(".", $prefix, @suffixes);
> > +
> > +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> > +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> > +    # (Done also in path_with_file_name)
> > +    if ($parent eq '.' && $path !~ m|/|) {
> > +	return $file_name;
> > +    }
> > +
> > +    return path_push($parent, $file_name);
> > +}
> > +
> > +my sub _path_file_suffixes : prototype($) {
> > +    my ($file_name_no_prefix) = @_;
> > +
> > +    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> > +
> > +    # Suffixes are extracted "manually" because join()ing the result of split()
> > +    # results in a different file name than the original. Let's say you have a
> > +    # file named "foo.bar.". The correct suffixes would be ("bar", "").
> > +    # With split, you get the following:
> > +    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
> > +    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> > +    my @suffixes = ();
> > +    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> > +	my $suffix = $1;
> > +	$suffix =~ s|^\.||;
> > +	push(@suffixes, $suffix);
> > +    }
> > +
> > +    return @suffixes;
> > +}
> > +
> > +=head3 path_file_suffixes($path)
> > +
> > +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> > +not have a valid file name, an empty list is returned instead.
> > +
> > +In scalar context, returns a reference to a list.
>
> (^ Isn't this a bit awkward?)

Well, do you think it is? I've been using that kind of "return style"
here and there and have been liking it, but if it's weird to others, I
can adapt it. :P

>
> > +
> > +The suffixes of a path are essentially the file name's extensions, the parts
> > +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> > +
> > +    my @suffixes = path_file_suffixes("/etc/resolv.conf");
> > +    # ("conf")
> > +
> > +    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> > +    # ["tar", "zst"]
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_file_suffixes : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my $file_name = path_file_name($path);
> > +    if (!defined($file_name)) {
> > +	return wantarray ? () : [];
> > +    }
> > +
> > +    (undef, $file_name) = _path_file_prefix($file_name);
> > +
> > +    my @suffixes = _path_file_suffixes($file_name);
> > +
> > +    return wantarray ? @suffixes : \@suffixes;
> > +}
> > +
> > +=head3 path_with_file_suffixes($path, @suffixes)
> > +
> > +Returns C<$path> with new C<@suffixes>. This is similar to
> > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > +except that the suffixes of the file name are replaced.
> > +
> > +If the C<$path> does not have a file name, C<undef> is returned.
> > +
> > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> > +    # /tmp/archive.pxar.gz
> > +
> > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> > +    # /tmp/archive.gz
> > +
> > +If the file name has no suffixes, the C<@suffixes> are appended instead:
> > +
> > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> > +    # /etc/resolv.conf
> > +
> > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> > +    # /etc/resolv.conf.zst
> > +
> > +If there are no C<@suffixes> provided, the file name's suffixes will
> > +be removed (if there are any):
> > +
> > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> > +    # /tmp/archive
> > +
> > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > +
> > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> > +    # /tmp/archive....zst
> > +
> > +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> > +if any suffix contains a path separator (C</>) or a C<.>.
> > +
> > +=cut
> > +
> > +sub path_with_file_suffixes : prototype($@) {
> > +    my ($path, @suffixes) = @_;
>
> I am questioning a bit the sanity of having "suffixes" throughout this
> module instead of simply an "extension" that covers them all and can be
> split if needed...
>
> Do we have/anticipate particular use cases where this is more
> convenient?

To be really honest, it was 50/50 on the "extension" vs "suffixes"
decision. I then opted for suffixes instead, because they seemed to be
less ambiguous than an "extension". Let me elaborate.

Let's say we have a file called "foo.tar.gz" -- what would its extension
be? Some might say that ".tar.gz" is its extension, while some others
might say ".gz"; both have different reasonings but aren't necessarily
more or less correct than the other.

In our case, both you and I would say that ".tar.gz" is the extension,
but...

Rust's std::path::Path::extension [1] would return "gz" for the above,
while C++'s std::filesystem::path::extension [2] returns ".gz" (but they
don't have such a case in their docs!). Java has java.nio.file.Path, but
conveniently doesn't care about extensions. Kotlin fixest his by adding
an "extension" property to that class, which returns "gz" [3].

Python's pathlib is the only one I've seen go the "suffix route" and
provides the pathlib.PurePath.suffix and .suffixes methods [4]. This
type of approach also came up on the (Rust) tracking issue for
Path::file_prefix [5].

When I was looking up all that, I decided to go for the prefix +
suffix(es) route, as that just seemed to be the most unambiguous.
Convenience wasn't really a factor here, because there's the
"path_with_file_* family" of functions that should handle most of the
replacement cases.

(Besides, I found it quite nice that I could join() on the prefix +
suffixes (e.g. join(".", "foo", "tar", "gz")) that I first extracted
with path_file_prefix() and path_file_suffixes(); I like that there's an
"inverse" operation.)

*But,* I wouldn't be opposed to adding a function that just returns
"tar.gz" for the above case. Perhaps with a different name though :P

[1] https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
[2] https://en.cppreference.com/w/cpp/filesystem/path/extension
[3] https://github.com/JetBrains/kotlin/blob/rrr/2.1.0/core-docs/libraries/stdlib/jdk7/src/kotlin/io/path/PathUtils.kt#L46
[4] https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
[5] https://github.com/rust-lang/rust/issues/86319

>
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "one of the provided suffixes is undef"
> > +	if any { !defined($_) } @suffixes;
> > +    croak "one of the provided suffixes contains a path separator"
> > +	if any { $_ =~ m|/| } @suffixes;
> > +    croak "one of the provided suffixes contains a dot"
> > +	if any { $_ =~ m|\.| } @suffixes;
> > +
> > +    return undef if !defined(path_file_name($path));
> > +
> > +    my $parent = path_parent($path);
> > +
> > +    # sanity check -- should not happen because we checked for file name,
> > +    # and the existence of a file name implies there's a parent
> > +    confess "parent of \$path is undef" if !defined($parent);
> > +
> > +    # Don't modify $path if there are no suffixes to be removed
> > +    my @existing_suffixes = path_file_suffixes($path);
> > +    return $path if !scalar(@suffixes) && !scalar(@existing_suffixes);
> > +
> > +    my $prefix = path_file_prefix($path);
> > +
> > +    # sanity check
> > +    confess "\$prefix is undef" if !defined($prefix);
> > +
> > +    my $file_name = join(".", $prefix, @suffixes);
> > +
> > +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> > +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> > +    # (Done also in path_with_file_name)
> > +    if ($parent eq '.' && $path !~ m|/|) {
> > +	return $file_name;
> > +    }
> > +
> > +    return path_push($parent, $file_name);
> > +}
> > +
> > +=head3 path_file_suffix($path)
> > +
> > +Returns the suffix of the C<$path>'s file name. If the C<$path> does not have a
> > +valid file name or if the file name has no suffix, C<undef> is returned
> > +instead.
> > +
> > +The suffix of a file name is essentially its extension, e.g.
> > +C</etc/resolv.conf> has the suffix C<conf>. If there are multiple suffixes,
> > +only the last will be returned; e.g. C</tmp/archive.tar.gz> has the suffix C<gz>.
> > +
> > +B<Note:> Files like e.g. C</tmp/foo.> have an empty string as suffix.
> > +
> > +For getting all suffixes of a path, see C<L<< path_file_suffixes()|/"path_file_suffixes($path)" >>>.
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_file_suffix : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my $file_name = path_file_name($path);
> > +    return undef if !defined($file_name);
> > +
> > +    (undef, $file_name) = _path_file_prefix($file_name);
> > +
> > +    my @suffixes = _path_file_suffixes($file_name);
> > +
> > +    return pop(@suffixes);
> > +}
> > +
> > +=head3 path_with_file_suffix($path, $suffix)
> > +
> > +Returns C<$path> with a new C<$suffix>. This is similar to
> > +C<L<< path_with_file_suffixes()|/"path_with_file_suffixes($path, @suffixes)" >>>,
> > +except that only the last suffix of the file name is replaced.
> > +
> > +If the C<$path> does not have a file name, C<undef> is returned.
> > +
> > +    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", "gz");
> > +    # /tmp/archive.tar.gz
> > +
> > +If the file name has no suffixes, the C<$suffix> is appended instead:
> > +
> > +    my $new_path = path_with_file_suffix("/etc/resolv", "conf");
> > +    # /etc/resolv.conf
> > +
> > +If C<$suffix> is C<undef>, the file name's (last) suffix will be removed (if
> > +there is one):
> > +
> > +    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", undef);
> > +    # /tmp/archive.tar
> > +
> > +    my $new_path = path_with_file_suffix("/etc/resolv", undef);
> > +    # /etc/resolv
> > +
> > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > +
> > +    my $new_path = path_with_file_suffix("/tmp/archive.tar.zst", "");
> > +    # /tmp/archive.tar.
> > +
> > +    my $new_path = path_with_file_suffix("/etc/resolv", "");
> > +    # /etc/resolv.
> > +
> > +Throws an exception if any of the arguments is C<undef>, or if C<$suffix>
> > +contains a path separator (C</>) or a C<.>.
> > +
> > +=cut
> > +
> > +sub path_with_file_suffix : prototype($$) {
> > +    my ($path, $suffix) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    if (defined($suffix)) {
> > +	croak "\$suffix contains a path separator" if $suffix =~ m|/|;
> > +	croak "\$suffix contains a dot" if $suffix =~ m|\.|;
> > +    }
> > +
> > +    return undef if !defined(path_file_name($path));
> > +
> > +    my $parent = path_parent($path);
> > +
> > +    # sanity check -- should not happen because we checked for file name,
> > +    # and the existence of a file name implies there's a parent
> > +    confess "parent of \$path is undef" if !defined($parent);
> > +
> > +    my @suffixes = path_file_suffixes($path);
> > +
> > +    # Don't modify $path if there is no suffix to be removed
> > +    return $path if !scalar(@suffixes) && !defined($suffix);
> > +
> > +    pop(@suffixes);
> > +    push(@suffixes, $suffix) if defined($suffix);
> > +
> > +    my $prefix = path_file_prefix($path);
> > +
> > +    # sanity check
> > +    confess "\$prefix is undef" if !defined($prefix);
> > +
> > +    my $file_name = join(".", $prefix, @suffixes);
> > +
> > +    # Because the parent of "foo" is ".", return $file_name to stay consistent.
> > +    # Otherwise, we'd end up with a current path ref prepended ("./$file_name")
> > +    # (Done also in path_with_file_name)
> > +    if ($parent eq '.' && $path !~ m|/|) {
> > +	return $file_name;
> > +    }
> > +
> > +    return path_push($parent, $file_name);
> > +}
> > +
> > +=head3 path_file_parts($path)
> > +
> > +Returns the parts that constitute the file name (prefix and suffixes) of a
> > +C<$path> as a list. If the C<$path> does not have a valid file name, an empty
> > +list is returned instead.
> > +
> > +In scalar context, returns a reference to a list.
> > +
> > +These parts are split in such a way that allows them to be C<join>ed together,
> > +resulting in the original file name of the given C<$path> again.
> > +
> > +    my $file_parts = path_file_parts("/etc/pve/firewall/cluster.fw");
> > +    # ("cluster", "fw")
> > +    my $file_name = join(".", $file_parts->@*);
> > +
> > +    my @file_parts = path_file_parts("/tmp/archive.tar.gz");
> > +    # ("archive", "tar", "gz")
> > +    my $file_name = join(".", @file_parts);
> > +
> > +Throws an exception if C<$path> is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_file_parts : prototype($) {
> > +    my ($path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +
> > +    my $file_name = path_file_name($path);
> > +    if (!defined($file_name)) {
> > +	return wantarray ? () : [];
> > +    }
> > +
> > +    my $prefix;
> > +    ($prefix, $file_name) = _path_file_prefix($file_name);
> > +
> > +    my @suffixes = _path_file_suffixes($file_name);
> > +
> > +    my @file_parts = ($prefix, @suffixes);
> > +
> > +    return wantarray ? @file_parts : \@file_parts;
> > +}
> > +
> > +=head3 path_starts_with($path, $other_path)
> > +
> > +Checks whether a C<$path> starts with the components of C<$other_path>.
> > +
> > +    my $starts_with = path_starts_with("/etc/pve/firewall/cluster.fw", "/etc/pve");
> > +    # 1
> > +
> > +Throws an exception if any of the arguments is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_starts_with : prototype($$) {
> > +    my ($path, $other_path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "\$other_path if undef" if !defined($other_path);
> > +
> > +    # Nothing starts with nothing
> > +    return 1 if ($path eq '' && $other_path eq '');
> > +
> > +    # Nothing cannot start with something
> > +    # Something cannot start with nothing
> > +    return if ($path eq '' || $other_path eq '');
> > +
> > +    my @components = path_components($path);
> > +    my @other_components = path_components($other_path);
> > +
> > +    my @pairs = zip_shortest(\@components, \@other_components);
> > +
> > +    # for my ($comp, $other_comp) (@pairs) is experimental
> > +    for my $pair (@pairs) {
> > +	my ($comp, $other_comp) = $pair->@*;
> > +
> > +	if ($comp ne $other_comp) {
> > +	    return;
> > +	}
> > +    }
> > +
> > +    return 1;
> > +}
> > +
> > +=head3 path_ends_with($path, $other_path)
> > +
> > +Checks whether a C<$path> ends with the components of C<$other_path>.
> > +
> > +    my $ends_with = path_ends_with("/etc/pve/firewall/cluster.fw", "firewall/cluster.fw");
> > +    # 1
> > +
> > +Throws an exception if any of the arguments is C<undef>.
> > +
> > +=cut
> > +
> > +sub path_ends_with : prototype($$) {
> > +    my ($path, $other_path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "\$other_path if undef" if !defined($other_path);
> > +
> > +    # Nothing ends with nothing
> > +    return 1 if ($path eq '' && $other_path eq '');
> > +
> > +    # Nothing cannot end with something
> > +    # Something cannot end with nothing
> > +    return if ($path eq '' || $other_path eq '');
> > +
> > +    my @components_rev = reverse(path_components($path));
> > +    my @other_components_rev = reverse(path_components($other_path));
> > +
> > +    my @pairs_rev = zip_shortest(\@components_rev, \@other_components_rev);
> > +
> > +    # for my ($comp, $other_comp) (@pairs_rev) is experimental
> > +    for my $pair (@pairs_rev) {
> > +	my ($comp, $other_comp) = $pair->@*;
> > +
> > +	if ($comp ne $other_comp) {
> > +	    return;
> > +	}
> > +    }
> > +
> > +    return 1;
> > +}
> > +
> > +=head3 path_equals($path, $other_path)
> > +
> > +Checks whether C<$path> equals C<$other_path>. The paths are compared
> > +by their components, meaning that it's not necessary to
> > +L<< normalize|/"path_normalize($path)" >> them beforehand.
> > +
> > +=cut
> > +
> > +sub path_equals : prototype($$) {
> > +    my ($path, $other_path) = @_;
> > +
> > +    croak "\$path is undef" if !defined($path);
> > +    croak "\$other_path if undef" if !defined($other_path);
> > +
> > +    # Nothing is nothing
> > +    return 1 if ($path eq '' && $other_path eq '');
> > +
> > +    # Nothing is not something
> > +    # Something is not nothing
> > +    return if ($path eq '' || $other_path eq '');
> > +
> > +    my @components = path_components($path);
> > +    my @other_components = path_components($other_path);
> > +
> > +    return if scalar(@components) != scalar(@other_components);
> > +
> > +    my @pairs = zip_longest(\@components, \@other_components);
> > +
> > +    # for my ($comp, $other_comp) (@pairs_rev) is experimental
> > +    for my $pair (@pairs) {
> > +	my ($comp, $other_comp) = $pair->@*;
> > +
> > +	return if !defined($comp) || !defined($other_comp);
> > +
> > +	if ($comp ne $other_comp) {
> > +	    return;
> > +	}
> > +    }
> > +
> > +    return 1;
> > +}
> > +
> > +1;
> > -- 
> > 2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
  2025-01-09  9:56     ` Max Carrara
@ 2025-01-09 11:06       ` Wolfgang Bumiller
  2025-01-09 12:56         ` Max Carrara
  0 siblings, 1 reply; 23+ messages in thread
From: Wolfgang Bumiller @ 2025-01-09 11:06 UTC (permalink / raw)
  To: Max Carrara; +Cc: pve-devel

On Thu, Jan 09, 2025 at 10:56:16AM +0100, Max Carrara wrote:
> On Wed Jan 8, 2025 at 3:05 PM CET, Wolfgang Bumiller wrote:
> > On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> > > +my sub _path_file_suffixes : prototype($) {
> > > +    my ($file_name_no_prefix) = @_;
> > > +
> > > +    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> > > +
> > > +    # Suffixes are extracted "manually" because join()ing the result of split()
> > > +    # results in a different file name than the original. Let's say you have a
> > > +    # file named "foo.bar.". The correct suffixes would be ("bar", "").
> > > +    # With split, you get the following:
> > > +    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
> > > +    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> > > +    my @suffixes = ();
> > > +    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> > > +	my $suffix = $1;
> > > +	$suffix =~ s|^\.||;
> > > +	push(@suffixes, $suffix);
> > > +    }
> > > +
> > > +    return @suffixes;
> > > +}
> > > +
> > > +=head3 path_file_suffixes($path)
> > > +
> > > +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> > > +not have a valid file name, an empty list is returned instead.
> > > +
> > > +In scalar context, returns a reference to a list.
> >
> > (^ Isn't this a bit awkward?)
> 
> Well, do you think it is? I've been using that kind of "return style"
> here and there and have been liking it, but if it's weird to others, I
> can adapt it. :P

If you remove the distinction and always return the list, the scalar
context would give you the final element, which would be "the extension"
('gz', not 'tar.gz' :P), which would also be a nice shortcut for what
I'd anticipate to be one of the most common(?) uses.

> 
> >
> > > +
> > > +The suffixes of a path are essentially the file name's extensions, the parts
> > > +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> > > +
> > > +    my @suffixes = path_file_suffixes("/etc/resolv.conf");
> > > +    # ("conf")
> > > +
> > > +    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> > > +    # ["tar", "zst"]
> > > +
> > > +Throws an exception if C<$path> is C<undef>.
> > > +
> > > +=cut
> > > +
> > > +sub path_file_suffixes : prototype($) {
> > > +    my ($path) = @_;
> > > +
> > > +    croak "\$path is undef" if !defined($path);
> > > +
> > > +    my $file_name = path_file_name($path);
> > > +    if (!defined($file_name)) {
> > > +	return wantarray ? () : [];
> > > +    }
> > > +
> > > +    (undef, $file_name) = _path_file_prefix($file_name);
> > > +
> > > +    my @suffixes = _path_file_suffixes($file_name);
> > > +
> > > +    return wantarray ? @suffixes : \@suffixes;
> > > +}
> > > +
> > > +=head3 path_with_file_suffixes($path, @suffixes)
> > > +
> > > +Returns C<$path> with new C<@suffixes>. This is similar to
> > > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > > +except that the suffixes of the file name are replaced.
> > > +
> > > +If the C<$path> does not have a file name, C<undef> is returned.
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> > > +    # /tmp/archive.pxar.gz
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> > > +    # /tmp/archive.gz
> > > +
> > > +If the file name has no suffixes, the C<@suffixes> are appended instead:
> > > +
> > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> > > +    # /etc/resolv.conf
> > > +
> > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> > > +    # /etc/resolv.conf.zst
> > > +
> > > +If there are no C<@suffixes> provided, the file name's suffixes will
> > > +be removed (if there are any):
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> > > +    # /tmp/archive
> > > +
> > > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > > +
> > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> > > +    # /tmp/archive....zst
> > > +
> > > +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> > > +if any suffix contains a path separator (C</>) or a C<.>.
> > > +
> > > +=cut
> > > +
> > > +sub path_with_file_suffixes : prototype($@) {
> > > +    my ($path, @suffixes) = @_;
> >
> > I am questioning a bit the sanity of having "suffixes" throughout this
> > module instead of simply an "extension" that covers them all and can be
> > split if needed...
> >
> > Do we have/anticipate particular use cases where this is more
> > convenient?
> 
> To be really honest, it was 50/50 on the "extension" vs "suffixes"
> decision. I then opted for suffixes instead, because they seemed to be
> less ambiguous than an "extension". Let me elaborate.
> 
> Let's say we have a file called "foo.tar.gz" -- what would its extension
> be? Some might say that ".tar.gz" is its extension, while some others
> might say ".gz"; both have different reasonings but aren't necessarily
> more or less correct than the other.
> 
> In our case, both you and I would say that ".tar.gz" is the extension,
> but...
> 
> Rust's std::path::Path::extension [1] would return "gz" for the above,
> while C++'s std::filesystem::path::extension [2] returns ".gz" (but they
> don't have such a case in their docs!). Java has java.nio.file.Path, but
> conveniently doesn't care about extensions. Kotlin fixest his by adding
> an "extension" property to that class, which returns "gz" [3].
> 
> Python's pathlib is the only one I've seen go the "suffix route" and
> provides the pathlib.PurePath.suffix and .suffixes methods [4]. This
> type of approach also came up on the (Rust) tracking issue for
> Path::file_prefix [5].
> 
> When I was looking up all that, I decided to go for the prefix +
> suffix(es) route, as that just seemed to be the most unambiguous.
> Convenience wasn't really a factor here, because there's the
> "path_with_file_* family" of functions that should handle most of the
> replacement cases.
> 
> (Besides, I found it quite nice that I could join() on the prefix +
> suffixes (e.g. join(".", "foo", "tar", "gz")) that I first extracted
> with path_file_prefix() and path_file_suffixes(); I like that there's an
> "inverse" operation.)
> 
> *But,* I wouldn't be opposed to adding a function that just returns
> "tar.gz" for the above case. Perhaps with a different name though :P
> 
> [1] https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
> [2] https://en.cppreference.com/w/cpp/filesystem/path/extension
> [3] https://github.com/JetBrains/kotlin/blob/rrr/2.1.0/core-docs/libraries/stdlib/jdk7/src/kotlin/io/path/PathUtils.kt#L46
> [4] https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
> [5] https://github.com/rust-lang/rust/issues/86319

TBH I wouldn't care if it returned "tar.gz", ".tar.gz", "gz" or ".gz"
as long as it's documented ;-).

But yeah, I guess suffixes makes sense - "forces"(🤪) you to not ignore
it in your code...


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
  2025-01-09 11:06       ` Wolfgang Bumiller
@ 2025-01-09 12:56         ` Max Carrara
  0 siblings, 0 replies; 23+ messages in thread
From: Max Carrara @ 2025-01-09 12:56 UTC (permalink / raw)
  To: Wolfgang Bumiller; +Cc: pve-devel

On Thu Jan 9, 2025 at 12:06 PM CET, Wolfgang Bumiller wrote:
> On Thu, Jan 09, 2025 at 10:56:16AM +0100, Max Carrara wrote:
> > On Wed Jan 8, 2025 at 3:05 PM CET, Wolfgang Bumiller wrote:
> > > On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> > > > +my sub _path_file_suffixes : prototype($) {
> > > > +    my ($file_name_no_prefix) = @_;
> > > > +
> > > > +    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> > > > +
> > > > +    # Suffixes are extracted "manually" because join()ing the result of split()
> > > > +    # results in a different file name than the original. Let's say you have a
> > > > +    # file named "foo.bar.". The correct suffixes would be ("bar", "").
> > > > +    # With split, you get the following:
> > > > +    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
> > > > +    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> > > > +    my @suffixes = ();
> > > > +    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> > > > +	my $suffix = $1;
> > > > +	$suffix =~ s|^\.||;
> > > > +	push(@suffixes, $suffix);
> > > > +    }
> > > > +
> > > > +    return @suffixes;
> > > > +}
> > > > +
> > > > +=head3 path_file_suffixes($path)
> > > > +
> > > > +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> > > > +not have a valid file name, an empty list is returned instead.
> > > > +
> > > > +In scalar context, returns a reference to a list.
> > >
> > > (^ Isn't this a bit awkward?)
> > 
> > Well, do you think it is? I've been using that kind of "return style"
> > here and there and have been liking it, but if it's weird to others, I
> > can adapt it. :P
>
> If you remove the distinction and always return the list, the scalar
> context would give you the final element, which would be "the extension"
> ('gz', not 'tar.gz' :P), which would also be a nice shortcut for what
> I'd anticipate to be one of the most common(?) uses.

Sure, that's fine by me!

Though, I still wanna keep path_file_suffix() around, as that returns
only the last suffix (the extension), as that's a bit more explicit.

There are a few other functions that return a listref in scalar context
though; I'd prefer to adapt those too then, for consistency's sake and
all that. (Since this is a fresh library, I want it to have no surprises
about its usage ;P )

I'll have to adapt the tests a little for this, but otherwise this
shouldn't be much of a hassle to change.

>
> > 
> > >
> > > > +
> > > > +The suffixes of a path are essentially the file name's extensions, the parts
> > > > +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> > > > +
> > > > +    my @suffixes = path_file_suffixes("/etc/resolv.conf");
> > > > +    # ("conf")
> > > > +
> > > > +    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> > > > +    # ["tar", "zst"]
> > > > +
> > > > +Throws an exception if C<$path> is C<undef>.
> > > > +
> > > > +=cut
> > > > +
> > > > +sub path_file_suffixes : prototype($) {
> > > > +    my ($path) = @_;
> > > > +
> > > > +    croak "\$path is undef" if !defined($path);
> > > > +
> > > > +    my $file_name = path_file_name($path);
> > > > +    if (!defined($file_name)) {
> > > > +	return wantarray ? () : [];
> > > > +    }
> > > > +
> > > > +    (undef, $file_name) = _path_file_prefix($file_name);
> > > > +
> > > > +    my @suffixes = _path_file_suffixes($file_name);
> > > > +
> > > > +    return wantarray ? @suffixes : \@suffixes;
> > > > +}
> > > > +
> > > > +=head3 path_with_file_suffixes($path, @suffixes)
> > > > +
> > > > +Returns C<$path> with new C<@suffixes>. This is similar to
> > > > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > > > +except that the suffixes of the file name are replaced.
> > > > +
> > > > +If the C<$path> does not have a file name, C<undef> is returned.
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> > > > +    # /tmp/archive.pxar.gz
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> > > > +    # /tmp/archive.gz
> > > > +
> > > > +If the file name has no suffixes, the C<@suffixes> are appended instead:
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> > > > +    # /etc/resolv.conf
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> > > > +    # /etc/resolv.conf.zst
> > > > +
> > > > +If there are no C<@suffixes> provided, the file name's suffixes will
> > > > +be removed (if there are any):
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> > > > +    # /tmp/archive
> > > > +
> > > > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> > > > +    # /tmp/archive....zst
> > > > +
> > > > +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> > > > +if any suffix contains a path separator (C</>) or a C<.>.
> > > > +
> > > > +=cut
> > > > +
> > > > +sub path_with_file_suffixes : prototype($@) {
> > > > +    my ($path, @suffixes) = @_;
> > >
> > > I am questioning a bit the sanity of having "suffixes" throughout this
> > > module instead of simply an "extension" that covers them all and can be
> > > split if needed...
> > >
> > > Do we have/anticipate particular use cases where this is more
> > > convenient?
> > 
> > To be really honest, it was 50/50 on the "extension" vs "suffixes"
> > decision. I then opted for suffixes instead, because they seemed to be
> > less ambiguous than an "extension". Let me elaborate.
> > 
> > Let's say we have a file called "foo.tar.gz" -- what would its extension
> > be? Some might say that ".tar.gz" is its extension, while some others
> > might say ".gz"; both have different reasonings but aren't necessarily
> > more or less correct than the other.
> > 
> > In our case, both you and I would say that ".tar.gz" is the extension,
> > but...
> > 
> > Rust's std::path::Path::extension [1] would return "gz" for the above,
> > while C++'s std::filesystem::path::extension [2] returns ".gz" (but they
> > don't have such a case in their docs!). Java has java.nio.file.Path, but
> > conveniently doesn't care about extensions. Kotlin fixest his by adding
> > an "extension" property to that class, which returns "gz" [3].
> > 
> > Python's pathlib is the only one I've seen go the "suffix route" and
> > provides the pathlib.PurePath.suffix and .suffixes methods [4]. This
> > type of approach also came up on the (Rust) tracking issue for
> > Path::file_prefix [5].
> > 
> > When I was looking up all that, I decided to go for the prefix +
> > suffix(es) route, as that just seemed to be the most unambiguous.
> > Convenience wasn't really a factor here, because there's the
> > "path_with_file_* family" of functions that should handle most of the
> > replacement cases.
> > 
> > (Besides, I found it quite nice that I could join() on the prefix +
> > suffixes (e.g. join(".", "foo", "tar", "gz")) that I first extracted
> > with path_file_prefix() and path_file_suffixes(); I like that there's an
> > "inverse" operation.)
> > 
> > *But,* I wouldn't be opposed to adding a function that just returns
> > "tar.gz" for the above case. Perhaps with a different name though :P
> > 
> > [1] https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
> > [2] https://en.cppreference.com/w/cpp/filesystem/path/extension
> > [3] https://github.com/JetBrains/kotlin/blob/rrr/2.1.0/core-docs/libraries/stdlib/jdk7/src/kotlin/io/path/PathUtils.kt#L46
> > [4] https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
> > [5] https://github.com/rust-lang/rust/issues/86319
>
> TBH I wouldn't care if it returned "tar.gz", ".tar.gz", "gz" or ".gz"
> as long as it's documented ;-).
>
> But yeah, I guess suffixes makes sense - "forces"(🤪) you to not ignore
> it in your code...

ACK -- will keep it that way then.



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-01-09 12:57 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
2025-01-08 14:05   ` Wolfgang Bumiller
2025-01-09  9:56     ` Max Carrara
2025-01-09 11:06       ` Wolfgang Bumiller
2025-01-09 12:56         ` Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 02/12] test: add directory for tests of PVE::Path module Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 03/12] test: add tests for path_is_absolute and path_is_relative of PVE::Path Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 04/12] test: add tests for path_components " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 05/12] test: add tests for path_join " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 06/12] test: add tests for path_push " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 07/12] test: add tests for path_parent " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 08/12] test: add tests for path_starts_with, path_ends_with, path_equals Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 09/12] test: add tests for file path ops functions of PVE::Path Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 10/12] test: add tests for path_normalize " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 11/12] introduce PVE::Filesystem Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 12/12] debian: introduce package libproxmox-fs-path-utils-perl Max Carrara
2025-01-02 13:46 ` [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Fiona Ebner
2025-01-02 13:53   ` Fiona Ebner
2025-01-02 15:54     ` Max Carrara
2025-01-03  9:49       ` Fiona Ebner
2025-01-03 10:41         ` Thomas Lamprecht
2025-01-03 12:37           ` Fiona Ebner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal