From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Max Carrara <m.carrara@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
Date: Thu, 9 Jan 2025 12:06:40 +0100 [thread overview]
Message-ID: <ipkh37upodgful7xzaschqfyupvy4yb454vljll7v2ab6ury62@u2nksoadgcc2> (raw)
In-Reply-To: <D6XGAZ0XA9IZ.2CIB4A13ERV1Z@proxmox.com>
On Thu, Jan 09, 2025 at 10:56:16AM +0100, Max Carrara wrote:
> On Wed Jan 8, 2025 at 3:05 PM CET, Wolfgang Bumiller wrote:
> > On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> > > +my sub _path_file_suffixes : prototype($) {
> > > + my ($file_name_no_prefix) = @_;
> > > +
> > > + confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> > > +
> > > + # Suffixes are extracted "manually" because join()ing the result of split()
> > > + # results in a different file name than the original. Let's say you have a
> > > + # file named "foo.bar.". The correct suffixes would be ("bar", "").
> > > + # With split, you get the following:
> > > + # split(/\./, ".bar.") --> ("", "bar") --> join()ed to "foo..bar"
> > > + # split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> > > + my @suffixes = ();
> > > + while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> > > + my $suffix = $1;
> > > + $suffix =~ s|^\.||;
> > > + push(@suffixes, $suffix);
> > > + }
> > > +
> > > + return @suffixes;
> > > +}
> > > +
> > > +=head3 path_file_suffixes($path)
> > > +
> > > +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> > > +not have a valid file name, an empty list is returned instead.
> > > +
> > > +In scalar context, returns a reference to a list.
> >
> > (^ Isn't this a bit awkward?)
>
> Well, do you think it is? I've been using that kind of "return style"
> here and there and have been liking it, but if it's weird to others, I
> can adapt it. :P
If you remove the distinction and always return the list, the scalar
context would give you the final element, which would be "the extension"
('gz', not 'tar.gz' :P), which would also be a nice shortcut for what
I'd anticipate to be one of the most common(?) uses.
>
> >
> > > +
> > > +The suffixes of a path are essentially the file name's extensions, the parts
> > > +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> > > +
> > > + my @suffixes = path_file_suffixes("/etc/resolv.conf");
> > > + # ("conf")
> > > +
> > > + my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> > > + # ["tar", "zst"]
> > > +
> > > +Throws an exception if C<$path> is C<undef>.
> > > +
> > > +=cut
> > > +
> > > +sub path_file_suffixes : prototype($) {
> > > + my ($path) = @_;
> > > +
> > > + croak "\$path is undef" if !defined($path);
> > > +
> > > + my $file_name = path_file_name($path);
> > > + if (!defined($file_name)) {
> > > + return wantarray ? () : [];
> > > + }
> > > +
> > > + (undef, $file_name) = _path_file_prefix($file_name);
> > > +
> > > + my @suffixes = _path_file_suffixes($file_name);
> > > +
> > > + return wantarray ? @suffixes : \@suffixes;
> > > +}
> > > +
> > > +=head3 path_with_file_suffixes($path, @suffixes)
> > > +
> > > +Returns C<$path> with new C<@suffixes>. This is similar to
> > > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > > +except that the suffixes of the file name are replaced.
> > > +
> > > +If the C<$path> does not have a file name, C<undef> is returned.
> > > +
> > > + my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> > > + # /tmp/archive.pxar.gz
> > > +
> > > + my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> > > + # /tmp/archive.gz
> > > +
> > > +If the file name has no suffixes, the C<@suffixes> are appended instead:
> > > +
> > > + my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> > > + # /etc/resolv.conf
> > > +
> > > + my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> > > + # /etc/resolv.conf.zst
> > > +
> > > +If there are no C<@suffixes> provided, the file name's suffixes will
> > > +be removed (if there are any):
> > > +
> > > + my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> > > + # /tmp/archive
> > > +
> > > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > > +
> > > + my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> > > + # /tmp/archive....zst
> > > +
> > > +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> > > +if any suffix contains a path separator (C</>) or a C<.>.
> > > +
> > > +=cut
> > > +
> > > +sub path_with_file_suffixes : prototype($@) {
> > > + my ($path, @suffixes) = @_;
> >
> > I am questioning a bit the sanity of having "suffixes" throughout this
> > module instead of simply an "extension" that covers them all and can be
> > split if needed...
> >
> > Do we have/anticipate particular use cases where this is more
> > convenient?
>
> To be really honest, it was 50/50 on the "extension" vs "suffixes"
> decision. I then opted for suffixes instead, because they seemed to be
> less ambiguous than an "extension". Let me elaborate.
>
> Let's say we have a file called "foo.tar.gz" -- what would its extension
> be? Some might say that ".tar.gz" is its extension, while some others
> might say ".gz"; both have different reasonings but aren't necessarily
> more or less correct than the other.
>
> In our case, both you and I would say that ".tar.gz" is the extension,
> but...
>
> Rust's std::path::Path::extension [1] would return "gz" for the above,
> while C++'s std::filesystem::path::extension [2] returns ".gz" (but they
> don't have such a case in their docs!). Java has java.nio.file.Path, but
> conveniently doesn't care about extensions. Kotlin fixest his by adding
> an "extension" property to that class, which returns "gz" [3].
>
> Python's pathlib is the only one I've seen go the "suffix route" and
> provides the pathlib.PurePath.suffix and .suffixes methods [4]. This
> type of approach also came up on the (Rust) tracking issue for
> Path::file_prefix [5].
>
> When I was looking up all that, I decided to go for the prefix +
> suffix(es) route, as that just seemed to be the most unambiguous.
> Convenience wasn't really a factor here, because there's the
> "path_with_file_* family" of functions that should handle most of the
> replacement cases.
>
> (Besides, I found it quite nice that I could join() on the prefix +
> suffixes (e.g. join(".", "foo", "tar", "gz")) that I first extracted
> with path_file_prefix() and path_file_suffixes(); I like that there's an
> "inverse" operation.)
>
> *But,* I wouldn't be opposed to adding a function that just returns
> "tar.gz" for the above case. Perhaps with a different name though :P
>
> [1] https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
> [2] https://en.cppreference.com/w/cpp/filesystem/path/extension
> [3] https://github.com/JetBrains/kotlin/blob/rrr/2.1.0/core-docs/libraries/stdlib/jdk7/src/kotlin/io/path/PathUtils.kt#L46
> [4] https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
> [5] https://github.com/rust-lang/rust/issues/86319
TBH I wouldn't care if it returned "tar.gz", ".tar.gz", "gz" or ".gz"
as long as it's documented ;-).
But yeah, I guess suffixes makes sense - "forces"(🤪) you to not ignore
it in your code...
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2025-01-09 11:07 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
2025-01-08 14:05 ` Wolfgang Bumiller
2025-01-09 9:56 ` Max Carrara
2025-01-09 11:06 ` Wolfgang Bumiller [this message]
2025-01-09 12:56 ` Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 02/12] test: add directory for tests of PVE::Path module Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 03/12] test: add tests for path_is_absolute and path_is_relative of PVE::Path Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 04/12] test: add tests for path_components " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 05/12] test: add tests for path_join " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 06/12] test: add tests for path_push " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 07/12] test: add tests for path_parent " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 08/12] test: add tests for path_starts_with, path_ends_with, path_equals Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 09/12] test: add tests for file path ops functions of PVE::Path Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 10/12] test: add tests for path_normalize " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 11/12] introduce PVE::Filesystem Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 12/12] debian: introduce package libproxmox-fs-path-utils-perl Max Carrara
2025-01-02 13:46 ` [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Fiona Ebner
2025-01-02 13:53 ` Fiona Ebner
2025-01-02 15:54 ` Max Carrara
2025-01-03 9:49 ` Fiona Ebner
2025-01-03 10:41 ` Thomas Lamprecht
2025-01-03 12:37 ` Fiona Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ipkh37upodgful7xzaschqfyupvy4yb454vljll7v2ab6ury62@u2nksoadgcc2 \
--to=w.bumiller@proxmox.com \
--cc=m.carrara@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox