public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path
@ 2023-08-09 10:19 Gabriel Goller
  2023-08-09 10:19 ` [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()` Gabriel Goller
  2023-08-11  8:26 ` [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Wolfgang Bumiller
  0 siblings, 2 replies; 8+ messages in thread
From: Gabriel Goller @ 2023-08-09 10:19 UTC (permalink / raw)
  To: pbs-devel

Added `matches_path()` function, which only matches against the path and returns
an error if a file_mode pattern is found/needed in the matching list. This is
useful when we want to check if a file is excluded before running `stat()` on
the file to get the file_mode (which could fail).

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
---
 src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 158 insertions(+), 1 deletion(-)

diff --git a/src/match_list.rs b/src/match_list.rs
index c5b14e0..acad328 100644
--- a/src/match_list.rs
+++ b/src/match_list.rs
@@ -1,6 +1,6 @@
 //! Helpers for include/exclude lists.
-
 use bitflags::bitflags;
+use std::fmt;
 
 use crate::PatternFlag;
 
@@ -39,6 +39,17 @@ impl Default for MatchFlag {
     }
 }
 
+#[derive(Debug, PartialEq)]
+pub struct FileModeRequiredForMatching;
+
+impl fmt::Display for FileModeRequiredForMatching {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        write!(f, "File mode is required for matching")
+    }
+}
+
+impl std::error::Error for FileModeRequiredForMatching {}
+
 /// A pattern entry. (Glob patterns or literal patterns.)
 // Note:
 // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
@@ -304,12 +315,32 @@ impl MatchEntry {
 
         self.matches_path_exact(path)
     }
+
+    /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
+    pub fn matches_path<T: AsRef<[u8]>>(
+        &self,
+        path: T,
+    ) -> Result<bool, FileModeRequiredForMatching> {
+        self.matches_path_do(path.as_ref())
+    }
+
+    fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
+        if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
+            return Err(FileModeRequiredForMatching);
+        }
+
+        Ok(self.matches_path_suffix_do(path))
+    }
 }
 
 #[doc(hidden)]
 pub trait MatchListEntry {
     fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
     fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
+    fn entry_matches_path(
+        &self,
+        path: &[u8],
+    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
 }
 
 impl MatchListEntry for &'_ MatchEntry {
@@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
             None
         }
     }
+
+    fn entry_matches_path(
+        &self,
+        path: &[u8],
+    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
+        if let Ok(b) = self.matches_path(path) {
+            if b {
+                Ok(Some(self.match_type()))
+            } else {
+                Ok(None)
+            }
+        } else {
+            Err(FileModeRequiredForMatching)
+        }
+    }
 }
 
 impl MatchListEntry for &'_ &'_ MatchEntry {
@@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
             None
         }
     }
+
+    fn entry_matches_path(
+        &self,
+        path: &[u8],
+    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
+        if let Ok(b) = self.matches_path(path) {
+            if b {
+                Ok(Some(self.match_type()))
+            } else {
+                Ok(None)
+            }
+        } else {
+            Err(FileModeRequiredForMatching)
+        }
+    }
 }
 
 /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
@@ -374,6 +435,20 @@ pub trait MatchList {
     }
 
     fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
+
+    /// Check whether this list contains anything exactly matching the path, returns error if
+    /// `file_mode` is required for exact matching.
+    fn matches_path<T: AsRef<[u8]>>(
+        &self,
+        path: T,
+    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
+        self.matches_path_do(path.as_ref())
+    }
+
+    fn matches_path_do(
+        &self,
+        path: &[u8],
+    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
 }
 
 impl<'a, T> MatchList for T
@@ -408,6 +483,24 @@ where
 
         None
     }
+
+    fn matches_path_do(
+        &self,
+        path: &[u8],
+    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
+        // This is an &self method on a `T where T: 'a`.
+        let this: &'a Self = unsafe { std::mem::transmute(self) };
+
+        for m in this.into_iter().rev() {
+            if let Ok(mt) = m.entry_matches_path(path) {
+                if mt.is_some() {
+                    return Ok(mt);
+                }
+            }
+        }
+
+        Err(FileModeRequiredForMatching)
+    }
 }
 
 #[test]
@@ -530,3 +623,67 @@ fn test_path_relativity() {
     assert_eq!(matchlist.matches("foo/slash", None), None);
     assert_eq!(matchlist.matches("foo/slash-a", None), None);
 }
+
+#[test]
+fn test_matches_path() {
+    let vec = vec![
+        MatchEntry::include(crate::Pattern::path("as*").unwrap()),
+        MatchEntry::include(crate::Pattern::path("a*").unwrap()),
+    ];
+    assert_eq!(vec.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let list: &[MatchEntry] = &vec[..];
+    assert_eq!(list.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let list: Vec<&MatchEntry> = vec.iter().collect();
+    assert_eq!(list.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let list: &[&MatchEntry] = &list[..];
+    assert_eq!(list.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let vec = vec![
+        MatchEntry::include(crate::Pattern::path("a*").unwrap()),
+        MatchEntry::include(crate::Pattern::path("a*").unwrap())
+            .flags(MatchFlag::MATCH_REGULAR_FILES),
+    ];
+    assert_eq!(vec.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let vec = vec![
+        MatchEntry::include(crate::Pattern::path("a*").unwrap()),
+        MatchEntry::include(crate::Pattern::path("asdf").unwrap()),
+    ];
+    assert_eq!(vec.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let list: &[MatchEntry] = &vec[..];
+    assert_eq!(list.matches_path("azdf"), Ok(Some(MatchType::Include)));
+
+    let vec = vec![
+        MatchEntry::include(crate::Pattern::path("a*").unwrap()).flags(MatchFlag::ANY_FILE_TYPE),
+        MatchEntry::include(crate::Pattern::path("asdf").unwrap()).flags(MatchFlag::ANY_FILE_TYPE),
+    ];
+    assert_eq!(vec.matches_path("asdf"), Ok(Some(MatchType::Include)));
+    assert_eq!(vec.matches_path("adbb"), Ok(Some(MatchType::Include)));
+}
+
+#[test]
+fn test_matches_path_error() {
+    let vec = vec![
+        MatchEntry::include(crate::Pattern::path("a*").unwrap())
+            .flags(MatchFlag::MATCH_REGULAR_FILES),
+        MatchEntry::include(crate::Pattern::path("asdf").unwrap()),
+    ];
+    assert_eq!(vec.matches_path("asdf"), Ok(Some(MatchType::Include)));
+
+    let list: &[MatchEntry] = &vec[..];
+    assert_eq!(list.matches_path("azdf"), Err(FileModeRequiredForMatching));
+
+    let list: Vec<&MatchEntry> = vec.iter().collect();
+    assert_eq!(list.matches_path("abdf"), Err(FileModeRequiredForMatching));
+
+    let list: &[&MatchEntry] = &list[..];
+    assert_eq!(list.matches_path("acdf"), Err(FileModeRequiredForMatching));
+
+    let vec = vec![MatchEntry::include(crate::Pattern::path("a*").unwrap())
+        .flags(MatchFlag::MATCH_DIRECTORIES)];
+    assert_eq!(vec.matches_path("asdf"), Err(FileModeRequiredForMatching));
+}
-- 
2.39.2





^ permalink raw reply	[flat|nested] 8+ messages in thread

* [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()`
  2023-08-09 10:19 [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Gabriel Goller
@ 2023-08-09 10:19 ` Gabriel Goller
  2023-08-11  8:51   ` Wolfgang Bumiller
  2023-08-11  8:26 ` [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Wolfgang Bumiller
  1 sibling, 1 reply; 8+ messages in thread
From: Gabriel Goller @ 2023-08-09 10:19 UTC (permalink / raw)
  To: pbs-devel

Before running `stat()` we now make a `matches_path()` call, which checks
if the file/directory is excluded (`matches_path()` tries to match using
only the path, without the file mode). If the file is not excluded,
we `stat()` and then check again if the path matches using the file_mode and
the `matches()` function.
Added `pathpatterns` crate to local overrides in cargo.toml.

changes v2:
 - checking for `read` and `execute` permissions before entering directory,
   doesn't work because there are a lot of side-effects (executed by
   different user, AppArmor, SELinux, ...).
changes v1:
 - checking for excluded files with `matches()` before executing `stat()`,
   this doesn't work because we get the file_mode from `stat()` and don't 
   want to ignore it when matching.

Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
---
 Cargo.toml                    |  1 +
 pbs-client/src/pxar/create.rs | 45 ++++++++++++++++++++++++++---------
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/Cargo.toml b/Cargo.toml
index 2a05c471..c54bc28b 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -262,6 +262,7 @@ proxmox-rrd.workspace = true
 
 #proxmox-apt = { path = "../proxmox-apt" }
 #proxmox-openid = { path = "../proxmox-openid-rs" }
+#pathpatterns = {path = "../pathpatterns" }
 
 #pxar = { path = "../pxar" }
 
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 2577cf98..a6d608e7 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -232,7 +232,7 @@ impl Archiver {
             let entry_counter = self.entry_counter;
 
             let old_patterns_count = self.patterns.len();
-            self.read_pxar_excludes(dir.as_raw_fd())?;
+            self.read_pxar_excludes(&mut dir)?;
 
             let mut file_list = self.generate_directory_file_list(&mut dir, is_root)?;
 
@@ -315,8 +315,23 @@ impl Archiver {
         }
     }
 
-    fn read_pxar_excludes(&mut self, parent: RawFd) -> Result<(), Error> {
-        let fd = match self.open_file(parent, c_str!(".pxarexclude"), OFlag::O_RDONLY, false)? {
+    fn read_pxar_excludes(&mut self, parent: &mut Dir) -> Result<(), Error> {
+        let mut exclude_file_found = false;
+        for file in parent.iter() {
+            if file?.file_name().to_str()? == ".pxarexclude" {
+                exclude_file_found = true;
+                break;
+            }
+        }
+        if !exclude_file_found {
+            return Ok(());
+        }
+        let fd = match self.open_file(
+            parent.as_raw_fd(),
+            c_str!(".pxarexclude"),
+            OFlag::O_RDONLY,
+            false,
+        )? {
             Some(fd) => fd,
             None => return Ok(()),
         };
@@ -420,7 +435,7 @@ impl Archiver {
         for file in dir.iter() {
             let file = file?;
 
-            let file_name = file.file_name().to_owned();
+            let file_name = file.file_name();
             let file_name_bytes = file_name.to_bytes();
             if file_name_bytes == b"." || file_name_bytes == b".." {
                 continue;
@@ -434,9 +449,17 @@ impl Archiver {
             assert_single_path_component(os_file_name)?;
             let full_path = self.path.join(os_file_name);
 
+            let match_path = PathBuf::from("/").join(full_path.clone());
+            let pre_stat_match = self
+                .patterns
+                .matches_path(match_path.as_os_str().as_bytes());
+            if pre_stat_match == Ok(Some(MatchType::Exclude)) {
+                continue;
+            }
+
             let stat = match nix::sys::stat::fstatat(
                 dir_fd,
-                file_name.as_c_str(),
+                file_name.to_owned().as_c_str(),
                 nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
             ) {
                 Ok(stat) => stat,
@@ -444,11 +467,11 @@ impl Archiver {
                 Err(err) => return Err(err).context(format!("stat failed on {:?}", full_path)),
             };
 
-            let match_path = PathBuf::from("/").join(full_path.clone());
-            if self
-                .patterns
-                .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode))
-                == Some(MatchType::Exclude)
+            if pre_stat_match.is_err()
+                && self
+                    .patterns
+                    .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode))
+                    == Some(MatchType::Exclude)
             {
                 continue;
             }
@@ -462,7 +485,7 @@ impl Archiver {
             }
 
             file_list.push(FileListEntry {
-                name: file_name,
+                name: file_name.to_owned(),
                 path: full_path,
                 stat,
             });
-- 
2.39.2





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path
  2023-08-09 10:19 [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Gabriel Goller
  2023-08-09 10:19 ` [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()` Gabriel Goller
@ 2023-08-11  8:26 ` Wolfgang Bumiller
  2023-08-11  8:32   ` Wolfgang Bumiller
  2023-08-14  9:32   ` Gabriel Goller
  1 sibling, 2 replies; 8+ messages in thread
From: Wolfgang Bumiller @ 2023-08-11  8:26 UTC (permalink / raw)
  To: Gabriel Goller; +Cc: pbs-devel

On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
> Added `matches_path()` function, which only matches against the path and returns
> an error if a file_mode pattern is found/needed in the matching list. This is
> useful when we want to check if a file is excluded before running `stat()` on
> the file to get the file_mode (which could fail).
> 
> Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
> ---
>  src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 158 insertions(+), 1 deletion(-)
> 
> diff --git a/src/match_list.rs b/src/match_list.rs
> index c5b14e0..acad328 100644
> --- a/src/match_list.rs
> +++ b/src/match_list.rs
> @@ -1,6 +1,6 @@
>  //! Helpers for include/exclude lists.
> -
>  use bitflags::bitflags;
> +use std::fmt;
>  
>  use crate::PatternFlag;
>  
> @@ -39,6 +39,17 @@ impl Default for MatchFlag {
>      }
>  }
>  
> +#[derive(Debug, PartialEq)]
> +pub struct FileModeRequiredForMatching;

Let's shorten this to just `FileModeRequired` ;-)

> +
> +impl fmt::Display for FileModeRequiredForMatching {
> +    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> +        write!(f, "File mode is required for matching")
> +    }
> +}
> +
> +impl std::error::Error for FileModeRequiredForMatching {}
> +
>  /// A pattern entry. (Glob patterns or literal patterns.)
>  // Note:
>  // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
> @@ -304,12 +315,32 @@ impl MatchEntry {
>  
>          self.matches_path_exact(path)
>      }
> +
> +    /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
> +    pub fn matches_path<T: AsRef<[u8]>>(
> +        &self,
> +        path: T,
> +    ) -> Result<bool, FileModeRequiredForMatching> {
> +        self.matches_path_do(path.as_ref())
> +    }
> +
> +    fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
> +        if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
> +            return Err(FileModeRequiredForMatching);
> +        }
> +
> +        Ok(self.matches_path_suffix_do(path))
> +    }
>  }
>  
>  #[doc(hidden)]
>  pub trait MatchListEntry {
>      fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>      fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> +    fn entry_matches_path(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;

>  }
>  
>  impl MatchListEntry for &'_ MatchEntry {
> @@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
>              None
>          }
>      }
> +
> +    fn entry_matches_path(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> +        if let Ok(b) = self.matches_path(path) {

This can just use `?`, it's the exact same error type after all.
(Also `if let Ok` is generally best avoided since the 'else' branch
discards the error, and if not, it's often a case like this where it can
juse use '?' ;-) ).

Effectively this could be as short as

    Ok(self.matches_path(path)?.then(|| self.match_type()))

> +            if b {

As an additional hint: when you nest ifs around things you can match,
you can just include both cases in the patterns:
    match self.matches_path(path) {
        Ok(true) => Ok(Some(self.match_type())),
        Ok(false) => Ok(None),
        Err(err) => Err(err),
        // where this Err() case already tells you that you can use '?' instead
    }

> +                Ok(Some(self.match_type()))
> +            } else {
> +                Ok(None)
> +            }
> +        } else {
> +            Err(FileModeRequiredForMatching)
> +        }
> +    }
>  }
>  
>  impl MatchListEntry for &'_ &'_ MatchEntry {
> @@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
>              None
>          }
>      }
> +
> +    fn entry_matches_path(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {

same

> +        if let Ok(b) = self.matches_path(path) {
> +            if b {
> +                Ok(Some(self.match_type()))
> +            } else {
> +                Ok(None)
> +            }
> +        } else {
> +            Err(FileModeRequiredForMatching)
> +        }
> +    }
>  }
>  
>  /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
> @@ -374,6 +435,20 @@ pub trait MatchList {
>      }
>  
>      fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> +
> +    /// Check whether this list contains anything exactly matching the path, returns error if
> +    /// `file_mode` is required for exact matching.
> +    fn matches_path<T: AsRef<[u8]>>(
> +        &self,
> +        path: T,
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> +        self.matches_path_do(path.as_ref())
> +    }
> +
> +    fn matches_path_do(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>  }
>  
>  impl<'a, T> MatchList for T
> @@ -408,6 +483,24 @@ where
>  
>          None
>      }
> +
> +    fn matches_path_do(
> +        &self,
> +        path: &[u8],
> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {

Given the amount of tiny match helpers we run through with those 2
traits already I wonder if we should just make a breaking change here
instead and only have the versions with the `Result` while users (or
defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
(since a mode of 0 should match anything ;-) ).

But I don't have any strong feelings about this, so either way is fine
with me.

> +        // This is an &self method on a `T where T: 'a`.
> +        let this: &'a Self = unsafe { std::mem::transmute(self) };
> +
> +        for m in this.into_iter().rev() {
> +            if let Ok(mt) = m.entry_matches_path(path) {

IIRC the intention was actually to immediately fail if we hit a pattern
with a file mode, since we wouldn't be able to tell if it would already
exclude the file, otherwise we really *could* just skip this entirely
and have a failing stat() call just use `Some(!0)` as file mode.

Basically, if the user runs into an inaccessible file they have to
append `--exclude=/that/one/file` to the CLI invocation to fix it,
whereas otherwise they'd still get an error.

So this should just be

    if let Some(mt) = m.entry_matches_path(path)? {
        return Some(mt);
    }


> +                if mt.is_some() {
> +                    return Ok(mt);
> +                }
> +            }
> +        }
> +
> +        Err(FileModeRequiredForMatching)

Also this is wrong. If nothing matches, nothing matches ;-) (just like
in the other matching variants).
Which immediately tells you that skipping the error above doesn't make
much sense :-) )




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path
  2023-08-11  8:26 ` [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Wolfgang Bumiller
@ 2023-08-11  8:32   ` Wolfgang Bumiller
  2023-08-11  8:38     ` Wolfgang Bumiller
  2023-08-14  9:32   ` Gabriel Goller
  1 sibling, 1 reply; 8+ messages in thread
From: Wolfgang Bumiller @ 2023-08-11  8:32 UTC (permalink / raw)
  To: Gabriel Goller; +Cc: pbs-devel

On Fri, Aug 11, 2023 at 10:26:22AM +0200, Wolfgang Bumiller wrote:
> On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
> > Added `matches_path()` function, which only matches against the path and returns
> > an error if a file_mode pattern is found/needed in the matching list. This is
> > useful when we want to check if a file is excluded before running `stat()` on
> > the file to get the file_mode (which could fail).
> > 
> > Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
> > ---
> >  src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 158 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/match_list.rs b/src/match_list.rs
> > index c5b14e0..acad328 100644
> > --- a/src/match_list.rs
> > +++ b/src/match_list.rs
> > @@ -1,6 +1,6 @@
> >  //! Helpers for include/exclude lists.
> > -
> >  use bitflags::bitflags;
> > +use std::fmt;
> >  
> >  use crate::PatternFlag;
> >  
> > @@ -39,6 +39,17 @@ impl Default for MatchFlag {
> >      }
> >  }
> >  
> > +#[derive(Debug, PartialEq)]
> > +pub struct FileModeRequiredForMatching;
> 
> Let's shorten this to just `FileModeRequired` ;-)
> 
> > +
> > +impl fmt::Display for FileModeRequiredForMatching {
> > +    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> > +        write!(f, "File mode is required for matching")
> > +    }
> > +}
> > +
> > +impl std::error::Error for FileModeRequiredForMatching {}
> > +
> >  /// A pattern entry. (Glob patterns or literal patterns.)
> >  // Note:
> >  // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
> > @@ -304,12 +315,32 @@ impl MatchEntry {
> >  
> >          self.matches_path_exact(path)
> >      }
> > +
> > +    /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
> > +    pub fn matches_path<T: AsRef<[u8]>>(
> > +        &self,
> > +        path: T,
> > +    ) -> Result<bool, FileModeRequiredForMatching> {
> > +        self.matches_path_do(path.as_ref())
> > +    }
> > +
> > +    fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
> > +        if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
> > +            return Err(FileModeRequiredForMatching);
> > +        }
> > +
> > +        Ok(self.matches_path_suffix_do(path))
> > +    }
> >  }
> >  
> >  #[doc(hidden)]
> >  pub trait MatchListEntry {
> >      fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> >      fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> > +    fn entry_matches_path(
> > +        &self,
> > +        path: &[u8],
> > +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
> 
> >  }
> >  
> >  impl MatchListEntry for &'_ MatchEntry {
> > @@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
> >              None
> >          }
> >      }
> > +
> > +    fn entry_matches_path(
> > +        &self,
> > +        path: &[u8],
> > +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> > +        if let Ok(b) = self.matches_path(path) {
> 
> This can just use `?`, it's the exact same error type after all.
> (Also `if let Ok` is generally best avoided since the 'else' branch
> discards the error, and if not, it's often a case like this where it can
> juse use '?' ;-) ).
> 
> Effectively this could be as short as
> 
>     Ok(self.matches_path(path)?.then(|| self.match_type()))
> 
> > +            if b {
> 
> As an additional hint: when you nest ifs around things you can match,
> you can just include both cases in the patterns:
>     match self.matches_path(path) {
>         Ok(true) => Ok(Some(self.match_type())),
>         Ok(false) => Ok(None),
>         Err(err) => Err(err),
>         // where this Err() case already tells you that you can use '?' instead
>     }
> 
> > +                Ok(Some(self.match_type()))
> > +            } else {
> > +                Ok(None)
> > +            }
> > +        } else {
> > +            Err(FileModeRequiredForMatching)
> > +        }
> > +    }
> >  }
> >  
> >  impl MatchListEntry for &'_ &'_ MatchEntry {
> > @@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
> >              None
> >          }
> >      }
> > +
> > +    fn entry_matches_path(
> > +        &self,
> > +        path: &[u8],
> > +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> 
> same
> 
> > +        if let Ok(b) = self.matches_path(path) {
> > +            if b {
> > +                Ok(Some(self.match_type()))
> > +            } else {
> > +                Ok(None)
> > +            }
> > +        } else {
> > +            Err(FileModeRequiredForMatching)
> > +        }
> > +    }
> >  }
> >  
> >  /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
> > @@ -374,6 +435,20 @@ pub trait MatchList {
> >      }
> >  
> >      fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
> > +
> > +    /// Check whether this list contains anything exactly matching the path, returns error if
> > +    /// `file_mode` is required for exact matching.
> > +    fn matches_path<T: AsRef<[u8]>>(
> > +        &self,
> > +        path: T,
> > +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> > +        self.matches_path_do(path.as_ref())
> > +    }
> > +
> > +    fn matches_path_do(
> > +        &self,
> > +        path: &[u8],
> > +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
> >  }
> >  
> >  impl<'a, T> MatchList for T
> > @@ -408,6 +483,24 @@ where
> >  
> >          None
> >      }
> > +
> > +    fn matches_path_do(
> > +        &self,
> > +        path: &[u8],
> > +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> 
> Given the amount of tiny match helpers we run through with those 2
> traits already I wonder if we should just make a breaking change here
> instead and only have the versions with the `Result` while users (or
> defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
> (since a mode of 0 should match anything ;-) ).

(of s/of 0/of !0/` of course)




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path
  2023-08-11  8:32   ` Wolfgang Bumiller
@ 2023-08-11  8:38     ` Wolfgang Bumiller
  0 siblings, 0 replies; 8+ messages in thread
From: Wolfgang Bumiller @ 2023-08-11  8:38 UTC (permalink / raw)
  To: Gabriel Goller; +Cc: pbs-devel

On Fri, Aug 11, 2023 at 10:32:59AM +0200, Wolfgang Bumiller wrote:
> On Fri, Aug 11, 2023 at 10:26:22AM +0200, Wolfgang Bumiller wrote:
> > On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
(...)
> > 
> > Given the amount of tiny match helpers we run through with those 2
> > traits already I wonder if we should just make a breaking change here
> > instead and only have the versions with the `Result` while users (or
> > defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
> > (since a mode of 0 should match anything ;-) ).
> 
> (of s/of 0/of !0/` of course)

In fact, I'm technically wrong, since the value here just gets masked
and isn't actually a bit field, (similarly wrong will be my comment
further down in my original reply),

However! Since this won't be a valid value, we could still use it and
just special-case this in the `matches_mode()` function.
If, however, we go with an API-breaking route, this is definitely better
implemented as either `enum Mode { Match(u32), Ignore, Fail }` or an
additional "required: bool" parameter... but we need to hand that
through all the layers of helpers...




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()`
  2023-08-09 10:19 ` [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()` Gabriel Goller
@ 2023-08-11  8:51   ` Wolfgang Bumiller
  2023-08-14  7:41     ` Gabriel Goller
  0 siblings, 1 reply; 8+ messages in thread
From: Wolfgang Bumiller @ 2023-08-11  8:51 UTC (permalink / raw)
  To: Gabriel Goller; +Cc: pbs-devel

On Wed, Aug 09, 2023 at 12:19:13PM +0200, Gabriel Goller wrote:
> Before running `stat()` we now make a `matches_path()` call, which checks

I was more thinking to use `matches_path()` only if `stat()` actually
fails and then propagate `stat()`'s error (instead of the error that a
file mode is required... ;-) )
That way we wouldn't match twice in the other case.

> if the file/directory is excluded (`matches_path()` tries to match using
> only the path, without the file mode). If the file is not excluded,
> we `stat()` and then check again if the path matches using the file_mode and
> the `matches()` function.
> Added `pathpatterns` crate to local overrides in cargo.toml.
> 
> changes v2:
>  - checking for `read` and `execute` permissions before entering directory,
>    doesn't work because there are a lot of side-effects (executed by
>    different user, AppArmor, SELinux, ...).
> changes v1:
>  - checking for excluded files with `matches()` before executing `stat()`,
>    this doesn't work because we get the file_mode from `stat()` and don't 
>    want to ignore it when matching.
> 
> Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
> ---
>  Cargo.toml                    |  1 +
>  pbs-client/src/pxar/create.rs | 45 ++++++++++++++++++++++++++---------
>  2 files changed, 35 insertions(+), 11 deletions(-)
> 
> diff --git a/Cargo.toml b/Cargo.toml
> index 2a05c471..c54bc28b 100644
> --- a/Cargo.toml
> +++ b/Cargo.toml
> @@ -262,6 +262,7 @@ proxmox-rrd.workspace = true
>  
>  #proxmox-apt = { path = "../proxmox-apt" }
>  #proxmox-openid = { path = "../proxmox-openid-rs" }
> +#pathpatterns = {path = "../pathpatterns" }
>  
>  #pxar = { path = "../pxar" }
>  
> diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
> index 2577cf98..a6d608e7 100644
> --- a/pbs-client/src/pxar/create.rs
> +++ b/pbs-client/src/pxar/create.rs
> @@ -232,7 +232,7 @@ impl Archiver {
>              let entry_counter = self.entry_counter;
>  
>              let old_patterns_count = self.patterns.len();
> -            self.read_pxar_excludes(dir.as_raw_fd())?;
> +            self.read_pxar_excludes(&mut dir)?;
>  
>              let mut file_list = self.generate_directory_file_list(&mut dir, is_root)?;
>  
> @@ -315,8 +315,23 @@ impl Archiver {
>          }
>      }
>  
> -    fn read_pxar_excludes(&mut self, parent: RawFd) -> Result<(), Error> {
> -        let fd = match self.open_file(parent, c_str!(".pxarexclude"), OFlag::O_RDONLY, false)? {
> +    fn read_pxar_excludes(&mut self, parent: &mut Dir) -> Result<(), Error> {
> +        let mut exclude_file_found = false;
> +        for file in parent.iter() {
> +            if file?.file_name().to_str()? == ".pxarexclude" {
> +                exclude_file_found = true;
> +                break;
> +            }
> +        }
> +        if !exclude_file_found {

Why are we adding this check?
It's not necessary.

> +            return Ok(());
> +        }
> +        let fd = match self.open_file(
> +            parent.as_raw_fd(),
> +            c_str!(".pxarexclude"),
> +            OFlag::O_RDONLY,
> +            false,
> +        )? {
>              Some(fd) => fd,
>              None => return Ok(()),

^ This already means the file does not exist.

>          };
> @@ -420,7 +435,7 @@ impl Archiver {
>          for file in dir.iter() {
>              let file = file?;
>  
> -            let file_name = file.file_name().to_owned();
> +            let file_name = file.file_name();
>              let file_name_bytes = file_name.to_bytes();
>              if file_name_bytes == b"." || file_name_bytes == b".." {
>                  continue;
> @@ -434,9 +449,17 @@ impl Archiver {
>              assert_single_path_component(os_file_name)?;
>              let full_path = self.path.join(os_file_name);
>  
> +            let match_path = PathBuf::from("/").join(full_path.clone());
> +            let pre_stat_match = self
> +                .patterns
> +                .matches_path(match_path.as_os_str().as_bytes());

As said above, I'd like this to instead happen in stat's `Err() => {}`
case.

> +            if pre_stat_match == Ok(Some(MatchType::Exclude)) {
> +                continue;
> +            }
> +
>              let stat = match nix::sys::stat::fstatat(
>                  dir_fd,
> -                file_name.as_c_str(),
> +                file_name.to_owned().as_c_str(),
>                  nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
>              ) {
>                  Ok(stat) => stat,
> @@ -444,11 +467,11 @@ impl Archiver {
>                  Err(err) => return Err(err).context(format!("stat failed on {:?}", full_path)),
>              };
>  
> -            let match_path = PathBuf::from("/").join(full_path.clone());
> -            if self
> -                .patterns
> -                .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode))
> -                == Some(MatchType::Exclude)
> +            if pre_stat_match.is_err()

And then this won't be needed and AFAICT isn't correct with the current
implementation of `matches_path()` of this series since it might have
found a match *after* one of a higher precedence would have already
matched potentially differently.

> +                && self
> +                    .patterns
> +                    .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode))
> +                    == Some(MatchType::Exclude)
>              {
>                  continue;
>              }
> @@ -462,7 +485,7 @@ impl Archiver {
>              }
>  
>              file_list.push(FileListEntry {
> -                name: file_name,
> +                name: file_name.to_owned(),
>                  path: full_path,
>                  stat,
>              });
> -- 
> 2.39.2




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()`
  2023-08-11  8:51   ` Wolfgang Bumiller
@ 2023-08-14  7:41     ` Gabriel Goller
  0 siblings, 0 replies; 8+ messages in thread
From: Gabriel Goller @ 2023-08-14  7:41 UTC (permalink / raw)
  To: Wolfgang Bumiller; +Cc: pbs-devel

On 8/11/23 10:51, Wolfgang Bumiller wrote:
> On Wed, Aug 09, 2023 at 12:19:13PM +0200, Gabriel Goller wrote:
>> Before running `stat()` we now make a `matches_path()` call, which checks
> I was more thinking to use `matches_path()` only if `stat()` actually
> fails and then propagate `stat()`'s error (instead of the error that a
> file mode is required... ;-) )
> That way we wouldn't match twice in the other case.
I used this approach so that we don't `stat` every excluded files as 
well (I think `matches_path` is
probably faster than `stat`). Also we will very rarely make the matching 
twice, because most of the
excluded paths come from the `--exclude` parameter or the `.pxarexclude` 
file, which both don't
allow to set the `file_mode`.
>> if the file/directory is excluded (`matches_path()` tries to match using
>> only the path, without the file mode). If the file is not excluded,
>> we `stat()` and then check again if the path matches using the file_mode and
>> the `matches()` function.
>> Added `pathpatterns` crate to local overrides in cargo.toml.
>>
>> changes v2:
>>   - checking for `read` and `execute` permissions before entering directory,
>>     doesn't work because there are a lot of side-effects (executed by
>>     different user, AppArmor, SELinux, ...).
>> changes v1:
>>   - checking for excluded files with `matches()` before executing `stat()`,
>>     this doesn't work because we get the file_mode from `stat()` and don't
>>     want to ignore it when matching.
>>
>> Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
>> ---
>>   Cargo.toml                    |  1 +
>>   pbs-client/src/pxar/create.rs | 45 ++++++++++++++++++++++++++---------
>>   2 files changed, 35 insertions(+), 11 deletions(-)
>>
>> diff --git a/Cargo.toml b/Cargo.toml
>> index 2a05c471..c54bc28b 100644
>> --- a/Cargo.toml
>> +++ b/Cargo.toml
>> @@ -262,6 +262,7 @@ proxmox-rrd.workspace = true
>>   
>>   #proxmox-apt = { path = "../proxmox-apt" }
>>   #proxmox-openid = { path = "../proxmox-openid-rs" }
>> +#pathpatterns = {path = "../pathpatterns" }
>>   
>>   #pxar = { path = "../pxar" }
>>   
>> diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
>> index 2577cf98..a6d608e7 100644
>> --- a/pbs-client/src/pxar/create.rs
>> +++ b/pbs-client/src/pxar/create.rs
>> @@ -232,7 +232,7 @@ impl Archiver {
>>               let entry_counter = self.entry_counter;
>>   
>>               let old_patterns_count = self.patterns.len();
>> -            self.read_pxar_excludes(dir.as_raw_fd())?;
>> +            self.read_pxar_excludes(&mut dir)?;
>>   
>>               let mut file_list = self.generate_directory_file_list(&mut dir, is_root)?;
>>   
>> @@ -315,8 +315,23 @@ impl Archiver {
>>           }
>>       }
>>   
>> -    fn read_pxar_excludes(&mut self, parent: RawFd) -> Result<(), Error> {
>> -        let fd = match self.open_file(parent, c_str!(".pxarexclude"), OFlag::O_RDONLY, false)? {
>> +    fn read_pxar_excludes(&mut self, parent: &mut Dir) -> Result<(), Error> {
>> +        let mut exclude_file_found = false;
>> +        for file in parent.iter() {
>> +            if file?.file_name().to_str()? == ".pxarexclude" {
>> +                exclude_file_found = true;
>> +                break;
>> +            }
>> +        }
>> +        if !exclude_file_found {
> Why are we adding this check?
> It's not necessary.

It is not strictly necessary, but the problem is that the function tries 
to open `.pxarexclude` in every directory.
Also directories were we have no access to, which will obviously fail. 
So if we have a directory with limited permissions,
(regardless if a `.pxarexclude` lives in it or not) we will get an error 
`failed to open .pxarexclude file` (even though it
doesn't even exist).

>> +            return Ok(());
>> +        }
>> +        let fd = match self.open_file(
>> +            parent.as_raw_fd(),
>> +            c_str!(".pxarexclude"),
>> +            OFlag::O_RDONLY,
>> +            false,
>> +        )? {
>>               Some(fd) => fd,
>>               None => return Ok(()),
> ^ This already means the file does not exist.
>
>>           };
>> @@ -420,7 +435,7 @@ impl Archiver {
>>           for file in dir.iter() {
>>               let file = file?;
>>   
>> -            let file_name = file.file_name().to_owned();
>> +            let file_name = file.file_name();
>>               let file_name_bytes = file_name.to_bytes();
>>               if file_name_bytes == b"." || file_name_bytes == b".." {
>>                   continue;
>> @@ -434,9 +449,17 @@ impl Archiver {
>>               assert_single_path_component(os_file_name)?;
>>               let full_path = self.path.join(os_file_name);
>>   
>> +            let match_path = PathBuf::from("/").join(full_path.clone());
>> +            let pre_stat_match = self
>> +                .patterns
>> +                .matches_path(match_path.as_os_str().as_bytes());
> As said above, I'd like this to instead happen in stat's `Err() => {}`
> case.
>
>> +            if pre_stat_match == Ok(Some(MatchType::Exclude)) {
>> +                continue;
>> +            }
>> +
>>               let stat = match nix::sys::stat::fstatat(
>>                   dir_fd,
>> -                file_name.as_c_str(),
>> +                file_name.to_owned().as_c_str(),
>>                   nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW,
>>               ) {
>>                   Ok(stat) => stat,
>> @@ -444,11 +467,11 @@ impl Archiver {
>>                   Err(err) => return Err(err).context(format!("stat failed on {:?}", full_path)),
>>               };
>>   
>> -            let match_path = PathBuf::from("/").join(full_path.clone());
>> -            if self
>> -                .patterns
>> -                .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode))
>> -                == Some(MatchType::Exclude)
>> +            if pre_stat_match.is_err()
> And then this won't be needed and AFAICT isn't correct with the current
> implementation of `matches_path()` of this series since it might have
> found a match *after* one of a higher precedence would have already
> matched potentially differently.

I'll fix the implementation of `matches_path()` so that it actually 
fails (returns `Err`) when a pattern
with a `file_mode` is encountered (as it was intended). Otherwise this 
should work, as I didn't change
anything (except the preceding `matches_path()`).

>> +                && self
>> +                    .patterns
>> +                    .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode))
>> +                    == Some(MatchType::Exclude)
>>               {
>>                   continue;
>>               }
>> @@ -462,7 +485,7 @@ impl Archiver {
>>               }
>>   
>>               file_list.push(FileListEntry {
>> -                name: file_name,
>> +                name: file_name.to_owned(),
>>                   path: full_path,
>>                   stat,
>>               });
>> -- 
>> 2.39.2




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path
  2023-08-11  8:26 ` [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Wolfgang Bumiller
  2023-08-11  8:32   ` Wolfgang Bumiller
@ 2023-08-14  9:32   ` Gabriel Goller
  1 sibling, 0 replies; 8+ messages in thread
From: Gabriel Goller @ 2023-08-14  9:32 UTC (permalink / raw)
  To: Wolfgang Bumiller; +Cc: pbs-devel

On 8/11/23 10:26, Wolfgang Bumiller wrote:

> On Wed, Aug 09, 2023 at 12:19:12PM +0200, Gabriel Goller wrote:
>> Added `matches_path()` function, which only matches against the path and returns
>> an error if a file_mode pattern is found/needed in the matching list. This is
>> useful when we want to check if a file is excluded before running `stat()` on
>> the file to get the file_mode (which could fail).
>>
>> Signed-off-by: Gabriel Goller <g.goller@proxmox.com>
>> ---
>>   src/match_list.rs | 159 +++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 158 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/match_list.rs b/src/match_list.rs
>> index c5b14e0..acad328 100644
>> --- a/src/match_list.rs
>> +++ b/src/match_list.rs
>> @@ -1,6 +1,6 @@
>>   //! Helpers for include/exclude lists.
>> -
>>   use bitflags::bitflags;
>> +use std::fmt;
>>   
>>   use crate::PatternFlag;
>>   
>> @@ -39,6 +39,17 @@ impl Default for MatchFlag {
>>       }
>>   }
>>   
>> +#[derive(Debug, PartialEq)]
>> +pub struct FileModeRequiredForMatching;
> Let's shorten this to just `FileModeRequired` ;-)
Agree
>> +
>> +impl fmt::Display for FileModeRequiredForMatching {
>> +    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
>> +        write!(f, "File mode is required for matching")
>> +    }
>> +}
>> +
>> +impl std::error::Error for FileModeRequiredForMatching {}
>> +
>>   /// A pattern entry. (Glob patterns or literal patterns.)
>>   // Note:
>>   // For regex we'd likely use the POSIX extended REs via `regexec(3)`, since we're targetting
>> @@ -304,12 +315,32 @@ impl MatchEntry {
>>   
>>           self.matches_path_exact(path)
>>       }
>> +
>> +    /// Check whether the path contains a matching suffix. Returns an error if a file mode is required.
>> +    pub fn matches_path<T: AsRef<[u8]>>(
>> +        &self,
>> +        path: T,
>> +    ) -> Result<bool, FileModeRequiredForMatching> {
>> +        self.matches_path_do(path.as_ref())
>> +    }
>> +
>> +    fn matches_path_do(&self, path: &[u8]) -> Result<bool, FileModeRequiredForMatching> {
>> +        if !self.flags.contains(MatchFlag::ANY_FILE_TYPE) {
>> +            return Err(FileModeRequiredForMatching);
>> +        }
>> +
>> +        Ok(self.matches_path_suffix_do(path))
>> +    }
>>   }
>>   
>>   #[doc(hidden)]
>>   pub trait MatchListEntry {
>>       fn entry_matches(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>>       fn entry_matches_exact(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> +    fn entry_matches_path(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>>   }
>>   
>>   impl MatchListEntry for &'_ MatchEntry {
>> @@ -328,6 +359,21 @@ impl MatchListEntry for &'_ MatchEntry {
>>               None
>>           }
>>       }
>> +
>> +    fn entry_matches_path(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
>> +        if let Ok(b) = self.matches_path(path) {
> This can just use `?`, it's the exact same error type after all.
> (Also `if let Ok` is generally best avoided since the 'else' branch
> discards the error, and if not, it's often a case like this where it can
> juse use '?' ;-) ).
>
> Effectively this could be as short as
>
>      Ok(self.matches_path(path)?.then(|| self.match_type()))
>
>> +            if b {
> As an additional hint: when you nest ifs around things you can match,
> you can just include both cases in the patterns:
>      match self.matches_path(path) {
>          Ok(true) => Ok(Some(self.match_type())),
>          Ok(false) => Ok(None),
>          Err(err) => Err(err),
>          // where this Err() case already tells you that you can use '?' instead
>      }
>
>> +                Ok(Some(self.match_type()))
>> +            } else {
>> +                Ok(None)
>> +            }
>> +        } else {
>> +            Err(FileModeRequiredForMatching)
>> +        }
>> +    }
>>   }
>>   
>>   impl MatchListEntry for &'_ &'_ MatchEntry {
>> @@ -346,6 +392,21 @@ impl MatchListEntry for &'_ &'_ MatchEntry {
>>               None
>>           }
>>       }
>> +
>> +    fn entry_matches_path(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> same
>
>> +        if let Ok(b) = self.matches_path(path) {
>> +            if b {
>> +                Ok(Some(self.match_type()))
>> +            } else {
>> +                Ok(None)
>> +            }
>> +        } else {
>> +            Err(FileModeRequiredForMatching)
>> +        }
>> +    }
>>   }
>>   
>>   /// This provides [`matches`](MatchList::matches) and [`matches_exact`](MatchList::matches_exact)
>> @@ -374,6 +435,20 @@ pub trait MatchList {
>>       }
>>   
>>       fn matches_exact_do(&self, path: &[u8], file_mode: Option<u32>) -> Option<MatchType>;
>> +
>> +    /// Check whether this list contains anything exactly matching the path, returns error if
>> +    /// `file_mode` is required for exact matching.
>> +    fn matches_path<T: AsRef<[u8]>>(
>> +        &self,
>> +        path: T,
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
>> +        self.matches_path_do(path.as_ref())
>> +    }
>> +
>> +    fn matches_path_do(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching>;
>>   }
>>   
>>   impl<'a, T> MatchList for T
>> @@ -408,6 +483,24 @@ where
>>   
>>           None
>>       }
>> +
>> +    fn matches_path_do(
>> +        &self,
>> +        path: &[u8],
>> +    ) -> Result<Option<MatchType>, FileModeRequiredForMatching> {
> Given the amount of tiny match helpers we run through with those 2
> traits already I wonder if we should just make a breaking change here
> instead and only have the versions with the `Result` while users (or
> defaulted helpers in the trait) just pass `file_mode.unwrap_or(!0)`
> (since a mode of 0 should match anything ;-) ).
>
> But I don't have any strong feelings about this, so either way is fine
> with me.

Hmm, I don't get what you mean by the `versions with the Result`?
Do you mean we should just modify the `matches()` and `matches_exact()` 
function to
return a `Result` if the mode is `ANY_FILE_TYPE` and we need one (a file 
mode) to match?

>> +        // This is an &self method on a `T where T: 'a`.
>> +        let this: &'a Self = unsafe { std::mem::transmute(self) };
>> +
>> +        for m in this.into_iter().rev() {
>> +            if let Ok(mt) = m.entry_matches_path(path) {
> IIRC the intention was actually to immediately fail if we hit a pattern
> with a file mode, since we wouldn't be able to tell if it would already
> exclude the file, otherwise we really *could* just skip this entirely
> and have a failing stat() call just use `Some(!0)` as file mode.
>
> Basically, if the user runs into an inaccessible file they have to
> append `--exclude=/that/one/file` to the CLI invocation to fix it,
> whereas otherwise they'd still get an error.
>
> So this should just be
>
>      if let Some(mt) = m.entry_matches_path(path)? {
>          return Some(mt);
>      }

Hmm, my solution would have been to traverse all the patterns with a 
`ANY_FILE_TYPE` mode and
if there is a match, nice, return. If there is no match, check if there 
are other patterns without
the `ANY_FILE_TYPE` mode (then we need to fail because we don't know the 
`file_mode` yet...) or
return nothing if there are no patterns left. I don't know if this is 
the correct approach though...

>> +                if mt.is_some() {
>> +                    return Ok(mt);
>> +                }
>> +            }
>> +        }
>> +
>> +        Err(FileModeRequiredForMatching)
> Also this is wrong. If nothing matches, nothing matches ;-) (just like
> in the other matching variants).
> Which immediately tells you that skipping the error above doesn't make
> much sense :-) )
Yes, you are right, corrected it.




^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-08-14  9:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-09 10:19 [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Gabriel Goller
2023-08-09 10:19 ` [pbs-devel] [PATCH proxmox-backup v3 1/1] fix #4380: check if file is excluded before running `stat()` Gabriel Goller
2023-08-11  8:51   ` Wolfgang Bumiller
2023-08-14  7:41     ` Gabriel Goller
2023-08-11  8:26 ` [pbs-devel] [PATCH pathpatterns] match_list: added `matches_path()` function, which matches only the path Wolfgang Bumiller
2023-08-11  8:32   ` Wolfgang Bumiller
2023-08-11  8:38     ` Wolfgang Bumiller
2023-08-14  9:32   ` Gabriel Goller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal