all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: "Max Carrara" <m.carrara@proxmox.com>
To: "Wolfgang Bumiller" <w.bumiller@proxmox.com>
Cc: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path
Date: Thu, 09 Jan 2025 13:56:15 +0100	[thread overview]
Message-ID: <D6XK4S035SMX.29YOUT77ZCCA0@proxmox.com> (raw)
In-Reply-To: <ipkh37upodgful7xzaschqfyupvy4yb454vljll7v2ab6ury62@u2nksoadgcc2>

On Thu Jan 9, 2025 at 12:06 PM CET, Wolfgang Bumiller wrote:
> On Thu, Jan 09, 2025 at 10:56:16AM +0100, Max Carrara wrote:
> > On Wed Jan 8, 2025 at 3:05 PM CET, Wolfgang Bumiller wrote:
> > > On Fri, Dec 20, 2024 at 07:51:56PM +0100, Max Carrara wrote:
> > > > +my sub _path_file_suffixes : prototype($) {
> > > > +    my ($file_name_no_prefix) = @_;
> > > > +
> > > > +    confess "\$file_name_no_prefix is undef" if !defined($file_name_no_prefix);
> > > > +
> > > > +    # Suffixes are extracted "manually" because join()ing the result of split()
> > > > +    # results in a different file name than the original. Let's say you have a
> > > > +    # file named "foo.bar.". The correct suffixes would be ("bar", "").
> > > > +    # With split, you get the following:
> > > > +    #     split(/\./, ".bar.")     --> ("", "bar")     --> join()ed to "foo..bar"
> > > > +    #     split(/\./, ".bar.", -1) --> ("", "bar", "") --> join()ed to "foo..bar."
> > > > +    my @suffixes = ();
> > > > +    while ($file_name_no_prefix =~ s|^(\.[^\.]*)||) {
> > > > +	my $suffix = $1;
> > > > +	$suffix =~ s|^\.||;
> > > > +	push(@suffixes, $suffix);
> > > > +    }
> > > > +
> > > > +    return @suffixes;
> > > > +}
> > > > +
> > > > +=head3 path_file_suffixes($path)
> > > > +
> > > > +Returns the suffixes of the C<$path>'s file name as a list. If the C<$path> does
> > > > +not have a valid file name, an empty list is returned instead.
> > > > +
> > > > +In scalar context, returns a reference to a list.
> > >
> > > (^ Isn't this a bit awkward?)
> > 
> > Well, do you think it is? I've been using that kind of "return style"
> > here and there and have been liking it, but if it's weird to others, I
> > can adapt it. :P
>
> If you remove the distinction and always return the list, the scalar
> context would give you the final element, which would be "the extension"
> ('gz', not 'tar.gz' :P), which would also be a nice shortcut for what
> I'd anticipate to be one of the most common(?) uses.

Sure, that's fine by me!

Though, I still wanna keep path_file_suffix() around, as that returns
only the last suffix (the extension), as that's a bit more explicit.

There are a few other functions that return a listref in scalar context
though; I'd prefer to adapt those too then, for consistency's sake and
all that. (Since this is a fresh library, I want it to have no surprises
about its usage ;P )

I'll have to adapt the tests a little for this, but otherwise this
shouldn't be much of a hassle to change.

>
> > 
> > >
> > > > +
> > > > +The suffixes of a path are essentially the file name's extensions, the parts
> > > > +that come after the L<< prefix|/"path_file_prefix($path)" >>.
> > > > +
> > > > +    my @suffixes = path_file_suffixes("/etc/resolv.conf");
> > > > +    # ("conf")
> > > > +
> > > > +    my $suffixes = path_file_prefix("/tmp/archive.tar.zst");
> > > > +    # ["tar", "zst"]
> > > > +
> > > > +Throws an exception if C<$path> is C<undef>.
> > > > +
> > > > +=cut
> > > > +
> > > > +sub path_file_suffixes : prototype($) {
> > > > +    my ($path) = @_;
> > > > +
> > > > +    croak "\$path is undef" if !defined($path);
> > > > +
> > > > +    my $file_name = path_file_name($path);
> > > > +    if (!defined($file_name)) {
> > > > +	return wantarray ? () : [];
> > > > +    }
> > > > +
> > > > +    (undef, $file_name) = _path_file_prefix($file_name);
> > > > +
> > > > +    my @suffixes = _path_file_suffixes($file_name);
> > > > +
> > > > +    return wantarray ? @suffixes : \@suffixes;
> > > > +}
> > > > +
> > > > +=head3 path_with_file_suffixes($path, @suffixes)
> > > > +
> > > > +Returns C<$path> with new C<@suffixes>. This is similar to
> > > > +C<L<< path_with_file_name()|/"path_with_file_name($path, $file_name)" >>>,
> > > > +except that the suffixes of the file name are replaced.
> > > > +
> > > > +If the C<$path> does not have a file name, C<undef> is returned.
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "pxar", "gz");
> > > > +    # /tmp/archive.pxar.gz
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "gz");
> > > > +    # /tmp/archive.gz
> > > > +
> > > > +If the file name has no suffixes, the C<@suffixes> are appended instead:
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf");
> > > > +    # /etc/resolv.conf
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/etc/resolv", "conf", "zst");
> > > > +    # /etc/resolv.conf.zst
> > > > +
> > > > +If there are no C<@suffixes> provided, the file name's suffixes will
> > > > +be removed (if there are any):
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst");
> > > > +    # /tmp/archive
> > > > +
> > > > +Note that an empty string is still a valid suffix (an "empty" file ending):
> > > > +
> > > > +    my $new_path = path_with_file_suffixes("/tmp/archive.tar.zst", "", "", "", "zst");
> > > > +    # /tmp/archive....zst
> > > > +
> > > > +Throws an exception if C<$path> or any of the C<@suffixes> is C<undef>, or
> > > > +if any suffix contains a path separator (C</>) or a C<.>.
> > > > +
> > > > +=cut
> > > > +
> > > > +sub path_with_file_suffixes : prototype($@) {
> > > > +    my ($path, @suffixes) = @_;
> > >
> > > I am questioning a bit the sanity of having "suffixes" throughout this
> > > module instead of simply an "extension" that covers them all and can be
> > > split if needed...
> > >
> > > Do we have/anticipate particular use cases where this is more
> > > convenient?
> > 
> > To be really honest, it was 50/50 on the "extension" vs "suffixes"
> > decision. I then opted for suffixes instead, because they seemed to be
> > less ambiguous than an "extension". Let me elaborate.
> > 
> > Let's say we have a file called "foo.tar.gz" -- what would its extension
> > be? Some might say that ".tar.gz" is its extension, while some others
> > might say ".gz"; both have different reasonings but aren't necessarily
> > more or less correct than the other.
> > 
> > In our case, both you and I would say that ".tar.gz" is the extension,
> > but...
> > 
> > Rust's std::path::Path::extension [1] would return "gz" for the above,
> > while C++'s std::filesystem::path::extension [2] returns ".gz" (but they
> > don't have such a case in their docs!). Java has java.nio.file.Path, but
> > conveniently doesn't care about extensions. Kotlin fixest his by adding
> > an "extension" property to that class, which returns "gz" [3].
> > 
> > Python's pathlib is the only one I've seen go the "suffix route" and
> > provides the pathlib.PurePath.suffix and .suffixes methods [4]. This
> > type of approach also came up on the (Rust) tracking issue for
> > Path::file_prefix [5].
> > 
> > When I was looking up all that, I decided to go for the prefix +
> > suffix(es) route, as that just seemed to be the most unambiguous.
> > Convenience wasn't really a factor here, because there's the
> > "path_with_file_* family" of functions that should handle most of the
> > replacement cases.
> > 
> > (Besides, I found it quite nice that I could join() on the prefix +
> > suffixes (e.g. join(".", "foo", "tar", "gz")) that I first extracted
> > with path_file_prefix() and path_file_suffixes(); I like that there's an
> > "inverse" operation.)
> > 
> > *But,* I wouldn't be opposed to adding a function that just returns
> > "tar.gz" for the above case. Perhaps with a different name though :P
> > 
> > [1] https://doc.rust-lang.org/std/path/struct.Path.html#method.extension
> > [2] https://en.cppreference.com/w/cpp/filesystem/path/extension
> > [3] https://github.com/JetBrains/kotlin/blob/rrr/2.1.0/core-docs/libraries/stdlib/jdk7/src/kotlin/io/path/PathUtils.kt#L46
> > [4] https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.suffix
> > [5] https://github.com/rust-lang/rust/issues/86319
>
> TBH I wouldn't care if it returned "tar.gz", ".tar.gz", "gz" or ".gz"
> as long as it's documented ;-).
>
> But yeah, I guess suffixes makes sense - "forces"(🤪) you to not ignore
> it in your code...

ACK -- will keep it that way then.



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

  reply	other threads:[~2025-01-09 12:57 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-20 18:51 [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 01/12] introduce PVE::Path Max Carrara
2025-01-08 14:05   ` Wolfgang Bumiller
2025-01-09  9:56     ` Max Carrara
2025-01-09 11:06       ` Wolfgang Bumiller
2025-01-09 12:56         ` Max Carrara [this message]
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 02/12] test: add directory for tests of PVE::Path module Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 03/12] test: add tests for path_is_absolute and path_is_relative of PVE::Path Max Carrara
2024-12-20 18:51 ` [pve-devel] [PATCH v2 pve-common 04/12] test: add tests for path_components " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 05/12] test: add tests for path_join " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 06/12] test: add tests for path_push " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 07/12] test: add tests for path_parent " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 08/12] test: add tests for path_starts_with, path_ends_with, path_equals Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 09/12] test: add tests for file path ops functions of PVE::Path Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 10/12] test: add tests for path_normalize " Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 11/12] introduce PVE::Filesystem Max Carrara
2024-12-20 18:52 ` [pve-devel] [PATCH v2 pve-common 12/12] debian: introduce package libproxmox-fs-path-utils-perl Max Carrara
2025-01-02 13:46 ` [pve-devel] [PATCH v2 pve-common 00/12] Introduce and Package PVE::Path & PVE::Filesystem Fiona Ebner
2025-01-02 13:53   ` Fiona Ebner
2025-01-02 15:54     ` Max Carrara
2025-01-03  9:49       ` Fiona Ebner
2025-01-03 10:41         ` Thomas Lamprecht
2025-01-03 12:37           ` Fiona Ebner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D6XK4S035SMX.29YOUT77ZCCA0@proxmox.com \
    --to=m.carrara@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    --cc=w.bumiller@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal