From: Wolfgang Bumiller <w.bumiller@proxmox.com>
To: Gabriel Goller <g.goller@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
Subject: Re: [pbs-devel] [PATCH pathpatterns v2] match_list: added `matches_mode_lazy()`, which only retrieves file mode if necessary
Date: Thu, 17 Aug 2023 11:01:26 +0200 [thread overview]
Message-ID: <geluhhwcgsj5otowki6gf67lpzo7zbaxmhwdbuf2nemvunnpwm@bohtdrk5lg7e> (raw)
In-Reply-To: <20230816140330.191947-1-g.goller@proxmox.com>
Getting there :-)
I'd still like a few changes.
On Wed, Aug 16, 2023 at 04:03:29PM +0200, Gabriel Goller wrote:
> Added `matches_mode_lazy()` function, which takes a closure that should return the
> `file_mode`. The function will go through the patterns and match by path only
> until it finds a pattern which does not have `MatchFlag::ANY_FILE_TYPE`, in case
> it will call the closure, which will return a `file_mode`. This will ensure that
> the closure (which in our case executes `stat()`) will only be run once(at most),
> every pattern will only be processed once and that the order of the patterns
> will be respected.
>
> Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
> ---
>
> changes v2:
> - updated the `matches_path()` function to take a closure and renamed
> it to `matches_mode_lazy()`. This allows us to only match once, so we
> don't need to match before and after `stat()` (without and with
> `file_mode`) anymore.
>
> changes v1:
> - added `matches_path()` function, which matches by path only and returns an
> error if the `file_mode` is required.
>
> src/match_list.rs | 203 +++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 202 insertions(+), 1 deletion(-)
>
> diff --git a/src/match_list.rs b/src/match_list.rs
> index c5b14e0..9914130 100644
> --- a/src/match_list.rs
> +++ b/src/match_list.rs
> @@ -1,6 +1,6 @@
> //! Helpers for include/exclude lists.
> -
> use bitflags::bitflags;
> +use std::{error::Error, fmt};
>
> use crate::PatternFlag;
>
> @@ -39,6 +39,17 @@ impl Default for MatchFlag {
> }
> }
>
> +#[derive(Debug, PartialEq)]
> +pub struct FileModeRequired;
> +
> +impl fmt::Display for FileModeRequired {
> + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> + write!(f, "File mode is required for matching")
> + }
> +}
> +
> +impl Error for FileModeRequired {}
> +
> /// A pattern entry. (Glob patterns or literal patterns.)
> // Note:
> // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
> @@ -304,12 +315,26 @@ impl MatchEntry {
>
> self.matches_path_exact(path)
> }
> +
These can be dropped, because we can actually avoid the need to call any
`matches` functions that would require a mode if we don't actually have
a mode.
For this purpose, `MatchEntry` should just have a
pub fn needs_file_mode(&self) -> bool;
(also utilized in its `fn matches_mode()` for less duplicate bitflag
handling code)
> + /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
> + pub fn matches_path<T: AsRef<[u8]>>(&self, path: T) -> Result<bool, FileModeRequired> {
> + self.matches_path_do(path.as_ref())
> + }
> +
> + fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequired> {
> + if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
> + return Err(FileModeRequired);
> + }
> +
> + Ok(self.matches_path_suffix_do(path))
> + }
> }
>
> #[doc(hidden)]
> pub trait MatchListEntry {
> fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> + fn entry_matches_path(&self, path: &[u8]) -> Result<Option<MatchType>, FileModeRequired>;
Instead of this, we should expose
fn needs_file_mode(&self) -> bool
in this trait.
> }
>
> impl MatchListEntry for &'_ MatchEntry {
> @@ -328,6 +353,15 @@ impl MatchListEntry for &'_ MatchEntry {
> None
> }
> }
> +
> + fn entry_matches_path(&self, path: &[u8]) -> Result<Option<MatchType>, FileModeRequired> {
> + let matches = self.matches_path(path)?;
> + if matches {
> + Ok(Some(self.match_type()))
> + } else {
> + Ok(None)
> + }
> + }
> }
>
> impl MatchListEntry for &'_ &'_ MatchEntry {
> @@ -346,6 +380,15 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
> None
> }
> }
> +
> + fn entry_matches_path(&self, path: &[u8]) -> Result<Option<MatchType>, FileModeRequired> {
> + let matches = self.matches_path(path)?;
> + if matches {
> + Ok(Some(self.match_type()))
> + } else {
> + Ok(None)
> + }
> + }
> }
>
> /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
> @@ -374,6 +417,22 @@ pub trait MatchList {
> }
And here would be the main API break.
All of the methods here in `MatchList` should act as lazy instead of
having this as a new method.
The only downside is that this means fixing up all the test cases... ;-)
>
> fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> +
> + /// Check whether this list contains anything exactly matching the path, returns error if
> + /// `file_mode` is required for exact matching and cannot be retrieved using closure.
> + fn matches_mode_lazy<T: AsRef<[u8]>, E: Error, U: FnMut() -> Result<u32, E>>(
Style nit: with large trait bounds, please move them into a `where`
clause instead (same in the other cases where you add these).
> + &self,
> + path: T,
> + get_file_mode: U,
> + ) -> Result<Option<MatchType>, E> {
> + self.matches_mode_lazy_do(path.as_ref(), get_file_mode)
> + }
> +
> + fn matches_mode_lazy_do<E: Error, T: FnMut() -> Result<u32, E>>(
> + &self,
> + path: &[u8],
> + get_file_mode: T,
> + ) -> Result<Option<MatchType>, E>;
> }
>
> impl<'a, T> MatchList for T
> @@ -408,6 +467,39 @@ where
>
> None
> }
> +
> + fn matches_mode_lazy_do<E: Error, U: FnMut() -> Result<u32, E>>(
`FnOnce` should suffice, to guarantee at the API level that we only call
it once.
You can make this work by moving it into an `Option` first:
let mut get_file_mode = Some(get_file_mode);
let mut file_mode = None;
for m in ... {
if file_mode.is_none() && m.needs_file_mode() {
// unwrap: we always have get_file_mode XOR file_mode as Some()
file_mode = Some((get_file_mode.take().unwrap())()?);
}
if let Some(mt) = m.entry_matches(path, file_mode) {
return Ok(Some(mt));
}
}
Note that we just call the regular `entry_matches`, since given `None`
as file mode it simply won't bother with the file mode, but at the same
time, if the pattern does *want* a file mode, we try to fetch it.
Now, as a bonus, we could also make `None` work again as a parameter for
these methods by replacing the `FnOnce` parameter with a new small
custom trait:
pub trait GetFileMode {
type Error;
fn get(self) -> Result<u32, Self::Error>;
}
With this being our own trait, we can have a generic impl along with a
one for `Option<u32>`:
impl<T> GetFileMode for T
where
T: FnOnce() -> Result<u32, E>,
{
type Error = E;
fn get(self) -> Result<u32, Self::Error> {
return self();
}
}
impl GetFileMode for Option<u32> {
type Error = FileModeRequired;
fn get(self) -> Result<u32, Self::Error> {
self.ok_or(FileModeRequired)
}
}
^ otherwise we also won't need the `FileModeRequired` error type ;-)
This will also make adapting the test cases much easier (only the `Ok()`
around the result types needs to be added, no closures for the modes).
> + &self,
> + path: &[u8],
> + mut get_file_mode: U,
> + ) -> Result<Option<MatchType>, E> {
> + // This is an &self method on a `T where T: 'a`.
> + let this: &'a Self = unsafe { std::mem::transmute(self) };
> +
> + let mut file_mode: Option<u32> = None;
> + for m in this.into_iter().rev() {
> + if file_mode.is_none() {
> + match m.entry_matches_path(path) {
> + Ok(mt) => {
> + if mt.is_some() {
> + return Ok(mt);
> + }
> + }
> + Err(_) => match get_file_mode() {
> + Ok(x) => file_mode = Some(x),
> + Err(e) => return Err(e),
> + },
> + }
> + }
> + if file_mode.is_some() {
> + let full_match = m.entry_matches(path, file_mode);
> + if full_match.is_some() {
> + return Ok(full_match);
> + }
> + }
> + }
> + Ok(None)
> + }
> }
>
> #[test]
prev parent reply other threads:[~2023-08-17 9:01 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-16 14:03 Gabriel Goller
2023-08-16 14:03 ` [pbs-devel] [PATCH proxmox-backup v4] fix #4380: check if file is excluded before running `stat()` Gabriel Goller
2023-08-17 9:01 ` Wolfgang Bumiller [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=geluhhwcgsj5otowki6gf67lpzo7zbaxmhwdbuf2nemvunnpwm@bohtdrk5lg7e \
--to=w.bumiller@proxmox.com \
--cc=g.goller@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.