all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: Dominik Csapak <d.csapak@proxmox.com>
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: pdm-devel@lists.proxmox.com
Subject: Re: [pdm-devel] [PATCH datacenter-manager v4 02/10] lib: add pdm-search crate
Date: Wed, 3 Sep 2025 10:14:50 +0200	[thread overview]
Message-ID: <c211b39c-d9e0-48e4-9d29-fba6ce2e85f5@proxmox.com> (raw)
In-Reply-To: <5zrkhdsc4sw2jsqlqrn7ijbj2oyav6bcpa7f42ybawwrv6zu4a@kjhewrghg7xa>



On 9/2/25 5:32 PM, Wolfgang Bumiller wrote:
> On Thu, Aug 28, 2025 at 03:16:01PM +0200, Dominik Csapak wrote:
>> Introduce a new create for search & filter related code. It currently
>> includes basic parsing & testing of search terms. Intended to be used on
>> some API calls that allow for more complex filters, such as the
>> resources API.
>>
>> Contains a `SearchTerm` and a `Search` struct. The former represents
>> a single term to search for, with an optional category and if it's
>> optional or not. The latter represents a full search with multiple
>> terms.
>>
>> Short syntax summary:
>> 'sometext'  : normal search term
>> '+sometext' : required search term
>> 'cat:text'  : looks for 'text' in the category 'cat'
>>
>> required terms have to exist in the resulting match, while optional
>> ones are OR'd (so at least one optional match must exist)
> 
> ^ Would be good to clarify that (and fix if not) if *no* optional one
> matches *but* there *are* matching required ones, it's also a positive
> match...

no that's wrong?

if no optional ones have to match they don't influence the result at all?

required terms must all match, and at least one from the set optional ones

for example:

i search for

+foo type:qemu type:lxc

foo is required, type:qemu and type:lxc optional

if one thing would match foo but not type:qemu and not type:lxc it 
should not appear in the result list, because otherwise
i could have written

+foo

and achieve the same result

but i wanted everything with "foo" in it that is from type "qemu" or "lxc"

does that make sense?

Maybe i have to describe it better in the commit message?

> 
>>
>> This is loosely inspired by various search syntaxes, e.g. the one from
>> gitlab[0] or query-dsl from elastic[1]
>>
>> [1] https://docs.gitlab.com/user/search/advanced_search/
>> [2] https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-simple-query-string-query
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> changes from v3:
>> * add missing doc comments
>> * more tests
>>
>>   Cargo.toml                |   2 +
>>   lib/pdm-search/Cargo.toml |  12 ++
>>   lib/pdm-search/src/lib.rs | 339 ++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 353 insertions(+)
>>   create mode 100644 lib/pdm-search/Cargo.toml
>>   create mode 100644 lib/pdm-search/src/lib.rs
>>
>> diff --git a/Cargo.toml b/Cargo.toml
>> index f07733d..ae2816a 100644
>> --- a/Cargo.toml
>> +++ b/Cargo.toml
>> @@ -19,6 +19,7 @@ members = [
>>       "lib/pdm-api-types",
>>       "lib/pdm-client",
>>       "lib/pdm-config",
>> +    "lib/pdm-search",
>>       "lib/pdm-ui-shared",
>>   
>>       "cli/client",
>> @@ -86,6 +87,7 @@ pdm-api-types = { path = "lib/pdm-api-types" }
>>   pdm-buildcfg = { path = "lib/pdm-buildcfg" }
>>   pdm-config = { path = "lib/pdm-config" }
>>   pdm-client = { version = "0.2", path = "lib/pdm-client" }
>> +pdm-search = { version = "0.2", path = "lib/pdm-search" }
>>   pdm-ui-shared = { version = "0.2", path = "lib/pdm-ui-shared" }
>>   proxmox-fido2 = { path = "cli/proxmox-fido2" }
>>   
>> diff --git a/lib/pdm-search/Cargo.toml b/lib/pdm-search/Cargo.toml
>> new file mode 100644
>> index 0000000..5f51e75
>> --- /dev/null
>> +++ b/lib/pdm-search/Cargo.toml
>> @@ -0,0 +1,12 @@
>> +[package]
>> +name = "pdm-search"
>> +description = "Proxmox Datacenter Manager shared ui modules"
>> +homepage = "https://www.proxmox.com"
>> +
>> +version.workspace = true
>> +edition.workspace = true
>> +license.workspace = true
>> +repository.workspace = true
>> +
>> +[dependencies]
>> +anyhow.workspace = true
>> diff --git a/lib/pdm-search/src/lib.rs b/lib/pdm-search/src/lib.rs
>> new file mode 100644
>> index 0000000..4854166
>> --- /dev/null
>> +++ b/lib/pdm-search/src/lib.rs
>> @@ -0,0 +1,339 @@
>> +//! Abstraction over a [`Search`] that contains multiple [`SearchTerm`]s.
>> +//!
>> +//! Provides methods to filter an item over a combination of such terms and
>> +//! construct them from text, and serialize them back to text.
>> +use std::str::FromStr;
> 
> ↑ you never use any of its methods, for the single trait implementation
> it's fine to just `impl std::str::FromStr` below
> 
> OTOH you do implement `Display` twice without importing it (which is
> fine, don't but importing just `std::fmt` will shorten it all...)
> 

right, i'll do `use std::fmt` and `fmt::Display` etc.

> 
>> +
>> +use anyhow::bail;
>> +
>> +#[derive(Clone)]
>> +pub struct Search {
>> +    required_terms: Vec<SearchTerm>,
>> +    optional_terms: Vec<SearchTerm>,
> 
> ↑ In theory - is partitioning them very useful, or would it be a better
> UX if we kept them in a single vec such that `.parse().to_string()`
> doesn't reorder them?
> 

mhmm. i have to think about that, but since we might
want to filter a large amount of resource, having it pre-partitioned
can speed up the matching

we could save both, original order and partitioned values

since most of the time a single search won't contain that many terms

(i guess 10-20 at max) so the little extra wasted space
could be a good tradeoff for keeping the order

OTOH, we don't currently parse and display it again

we parse it for user input, and display it for auto generated
searches, in both cases it does not really matter if
we have the original order or not....

>> +}
>> +
>> +impl Default for Search {
> 
> A derive should work here
> 
>> +    fn default() -> Self {
>> +        Self::new()
>> +    }
>> +}
>> +
>> +impl FromIterator<SearchTerm> for Search {
>> +    fn from_iter<T: IntoIterator<Item = SearchTerm>>(iter: T) -> Self {
>> +        let (optional_terms, required_terms) = iter.into_iter().partition(|term| term.optional);
>> +
>> +        Self {
>> +            required_terms,
>> +            optional_terms,
>> +        }
>> +    }
>> +}
>> +
>> +impl<S: AsRef<str>> From<S> for Search {
>> +    fn from(value: S) -> Self {
>> +        value
>> +            .as_ref()
>> +            .split_whitespace()
>> +            .filter_map(|term| term.parse().ok())
> 
> Maybe add a note here why we do this, as this is quite painful to read.
> We throw away errors because we expect users to produce them? ;-)

yes, basically

if a user enters something we cannot sure what we're searching for
there is currently only two cases where that can happen:

the string ':a'; was it meant with an empty category and
the value "a", or the literal string ':a'

the other error case is when the term would be empty
* a single '+' -> could again interpret literally
* an empty string '' -> should not be possible from the user side
   since we split on whitespace

i could adapt it to take that as a literal term though and implement
From<AsRef<str>> instead of FromStr

> 
>> +            .collect()
>> +    }
>> +}
>> +
>> +impl Search {
>> +    /// Create a new empty [`Search`]
>> +    pub fn new() -> Self {
>> +        Self::with_terms(Vec::new())
>> +    }
>> +
>> +    /// Returns true if no [`SearchTerm`] exist
>> +    pub fn is_empty(&self) -> bool {
>> +        self.required_terms.is_empty() && self.optional_terms.is_empty()
>> +    }
>> +
>> +    /// Create a new [`Search`] with the given [`SearchTerm`]s
>> +    pub fn with_terms(terms: Vec<SearchTerm>) -> Self {
> 
> (
> Since we already have `FromIterator`, this could be
> 
>      pub fn with_terms<I>(terms: I) -> Self
>      where
>          I: IntoIterator<Item = SearchTerm>;
> )

true

> 
>> +        terms.into_iter().collect()
>> +    }
>> +
>> +    /// Test if the given `Fn(&SearchTerm) -> bool` for all [`SearchTerm`] configured matches
>> +    ///
>> +    /// Returns true if it matches considering the constraints:
>> +    /// if there are no filters, returns true
>> +    pub fn matches<F: Fn(&SearchTerm) -> bool>(&self, matches: F) -> bool {
>> +        if self.is_empty() {
>> +            return true;
>> +        }
>> +
>> +        if self.required_terms.iter().map(&matches).any(|f| !f) {
>> +            return false;
>> +        }
>> +
>> +        if !self.optional_terms.is_empty() && self.optional_terms.iter().map(&matches).all(|f| !f) {
>> +            return false;
>> +        }
>> +
>> +        true
>> +    }
>> +
>> +    /// Returns true if the combination of [`SearchTerm`]s require that this category value must be
>> +    /// true. Useful to find out if some condition is required (e.g. type == 'remote')
>> +    pub fn category_value_required(&self, category: &str, value: &str) -> bool {
>> +        for term in &self.required_terms {
>> +            if term.category.as_deref() == Some(category) && value.contains(&term.value) {
> 
> Why does this use `.contains()` but the loop over optional ones uses
> `==`?

good point

this probably does not belong here at all, but as a helper in the api
where we search

we only use for 'fast skipping' some infos there when we know we only
look for remotes

i'd leave it out here and make a special check in the api. we can still
introduce it if we need it somewhere else too

> 
>> +                return true;
>> +            }
>> +        }
>> +
>> +        let mut optional_count = 0;
>> +
>> +        for term in &self.optional_terms {
>> +            if term.category.as_deref() == Some(category) && term.value == value {
>> +                optional_count += 1;
>> +            }
>> +        }
>> +
>> +        self.required_terms.is_empty()
>> +            && self.optional_terms.len() == optional_count
>> +            && optional_count > 0
>> +    }
>> +}
>> +
>> +impl std::fmt::Display for Search {
>> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
>> +        for (count, term) in self.required_terms.iter().enumerate() {
>> +            if count != 0 {
>> +                write!(f, " ")?;
>> +            }
>> +
>> +            write!(f, "{term}")?;
>> +        }
> 
> personal opinion: enumerate() to only handle the first element
> differently seems a bit much, if you consider that it keeps a counter,
> you could keep a "separator" instead which you just update once:
> 
>      let mut sep = "";
>      for term in &self.required_terms {
>          write!(f, "{sep}{term}")?;
>          sep = " ";
>      }
> 
yes makes sense

>> +
>> +        if !self.required_terms.is_empty() && !self.optional_terms.is_empty() {
>> +            write!(f, " ")?;
>> +        }
> 
> ↑ This could then be skipped entirely and we can reuse `sep` inside the
> lower loop the same way - if no previous text was there, the first
> element will still be using an empty string for `sep` before updating it
> to be `" "` for the remaining elements...
> 
> Then the entire implementation is just:
>      let mut sep = "";
>      for term in &self.required_terms {
>          write!(f, "{sep}{term}")?;
>          sep = " ";
>      }
>      for term in &self.optional_terms {
>          write!(f, "{sep}{term}")?;
>          sep = " ";
>      }
> 
> of if you're feeling fancy:
> 
>      let mut sep = "";
>      for term in self.required_terms.iter().chain(self.optional_terms.iter()) {
>          write!(f, "{sep}{term}")?;
>          sep = " ";
>      }
> 

thanks i'll see to integrate taht

>> +
>> +        for (count, term) in self.optional_terms.iter().enumerate() {
>> +            if count != 0 {
>> +                write!(f, " ")?;
>> +            }
>> +
>> +            write!(f, "{term}")?;
>> +        }
>> +
>> +        Ok(())
>> +    }
>> +}
>> +
>> +#[derive(Debug, Clone, PartialEq)]
>> +pub struct SearchTerm {
>> +    optional: bool,
>> +    pub value: String,
>> +    pub category: Option<String>,
>> +}
>> +
>> +impl SearchTerm {
>> +    /// Creates a new [`SearchTerm`].
>> +    pub fn new<S: Into<String>>(term: S) -> Self {
> 
> ↑ use `Into<String>`, `ToString` will cause an unnecessary clone when
> passing an already owned `String` as it is implemented via `Display`.
> 
>> +        Self {
>> +            value: term.to_string(),
>> +            optional: false,
>> +            category: None,
>> +        }
>> +    }
>> +
>> +    /// Builder style method to set the category
>> +    pub fn category<S: ToString>(mut self, category: Option<S>) -> Self {
>> +        self.category = category.map(|s| s.to_string());
>> +        self
>> +    }
>> +
>> +    /// Builder style method to mark this [`SearchTerm`] as optional
>> +    pub fn optional(mut self, optional: bool) -> Self {
>> +        self.optional = optional;
>> +        self
>> +    }
>> +}
>> +
>> +impl FromStr for SearchTerm {
>> +    type Err = anyhow::Error;
>> +
>> +    fn from_str(s: &str) -> Result<Self, Self::Err> {
> 
> ↑ use: (mut term: &str)
> 
>> +        let mut optional = true;
>> +        let mut term: String = s.into();
> 
> ↑ skip the premature allocation ;-)
> 
>> +        if term.starts_with("+") {
>> +            optional = false;
>> +            term.remove(0);
>> +        }
> 
> ↑
>      if let Some(rest) = term.strip_prefix("+") {
>          term = rest;
>          optional = false;
>      }
> 
>> +
>> +        let (term, category) = if let Some(idx) = term.find(":") {
>> +            let mut real_term = term.split_off(idx);
>> +            real_term.remove(0); // remove ':'
> 
> ↑ 2 expensive avoidable ops, see below
> 
>> +            (real_term, Some(term))
>> +        } else {
>> +            (term, None)
>> +        };
> 
> ↑
>      let category;
>      (term, category) = match term.split_once(':') {
>          Some((category, term)) => (term, Some(category)),
>          None => (term, None),
>      }
> 
>> +
>> +        if term.is_empty() {
>> +            bail!("term cannot be empty");
>> +        }
>> +
>> +        if category == Some("".into()) {
> 
> ↑ Can avoid this (also, `String::new()` would be cheaper) via:
> 
>      if category.is_some_and(str::is_empty) {
> 
> 
>> +            bail!("category cannot be empty");
>> +        }
>> +
>> +        Ok(SearchTerm::new(term).optional(optional).category(category))
>> +    }
>> +}
>> +
>> +impl std::fmt::Display for SearchTerm {
>> +    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
>> +        if !self.optional {
>> +            f.write_str("+")?;
>> +        }
>> +
>> +        if let Some(cat) = &self.category {
>> +            f.write_str(cat)?;
>> +            f.write_str(":")?;
>> +        }
>> +
>> +        f.write_str(&self.value)
>> +    }
>> +}
>> +
>> +#[cfg(test)]
>> +mod tests {
>> +    use crate::{Search, SearchTerm};
>> +
>> +    #[test]
>> +    fn parse_test_simple_filter() {
>> +        assert_eq!(
>> +            "foo".parse::<SearchTerm>().unwrap(),
>> +            SearchTerm::new("foo").optional(true),
>> +        );
>> +    }
>> +
>> +    #[test]
>> +    fn parse_test_requires_filter() {
>> +        assert_eq!(
>> +            "+foo".parse::<SearchTerm>().unwrap(),
>> +            SearchTerm::new("foo"),
>> +        );
>> +    }
>> +
>> +    #[test]
>> +    fn parse_test_category_filter() {
>> +        assert_eq!(
>> +            "foo:bar".parse::<SearchTerm>().unwrap(),
>> +            SearchTerm::new("bar").optional(true).category(Some("foo"))
>> +        );
>> +        assert_eq!(
>> +            "+foo:bar".parse::<SearchTerm>().unwrap(),
>> +            SearchTerm::new("bar").category(Some("foo"))
>> +        );
>> +    }
>> +
>> +    #[test]
>> +    fn parse_test_invalid_filter() {
>> +        assert!(":bar".parse::<SearchTerm>().is_err());
>> +        assert!("+cat:".parse::<SearchTerm>().is_err());
>> +        assert!("+".parse::<SearchTerm>().is_err());
>> +        assert!(":".parse::<SearchTerm>().is_err());
>> +    }
>> +
>> +    #[test]
>> +    fn match_tests() {
>> +        let search = Search::from_iter(vec![
>> +            SearchTerm::new("required1").optional(false),
>> +            SearchTerm::new("required2").optional(false),
>> +            SearchTerm::new("optional1").optional(true),
>> +            SearchTerm::new("optional2").optional(true),
>> +        ]);
>> +
>> +        // each case contains results for
>> +        // required1, required2, optional1, optional2
>> +        // and if it should match or not
>> +        let cases = [
>> +            ((true, true, true, false), true),
>> +            ((true, true, false, true), true),
>> +            ((true, true, true, true), true),
>> +            ((true, true, false, false), false),
>> +            ((true, false, false, false), false),
>> +            ((false, false, false, false), false),
>> +            ((false, true, false, false), false),
>> +            ((false, false, true, true), false),
>> +            ((false, true, true, true), false),
>> +            ((true, false, true, true), false),
>> +            ((true, false, true, false), false),
>> +            ((false, true, true, false), false),
>> +            ((false, false, true, false), false),
>> +            ((true, false, false, true), false),
>> +            ((false, true, false, true), false),
>> +            ((false, false, false, true), false),
>> +        ];
>> +        for (input, expected) in cases {
>> +            assert!(
>> +                search.matches(|term| {
>> +                    match term.value.as_str() {
>> +                        "required1" => input.0,
>> +                        "required2" => input.1,
>> +                        "optional1" => input.2,
>> +                        "optional2" => input.3,
>> +                        _ => unreachable!(),
>> +                    }
>> +                }) == expected
>> +            )
>> +        }
>> +    }
>> +
>> +    #[test]
>> +    fn category_value_required() {
>> +        let search: Search = Search::from_iter(vec![SearchTerm::new("foo")]);
>> +        assert!(!search.category_value_required("bar", "baz"));
>> +
>> +        let search: Search = Search::from_iter(vec![SearchTerm::new("foo").optional(true)]);
>> +        assert!(!search.category_value_required("bar", "baz"));
>> +
>> +        let search: Search = Search::from_iter(vec![SearchTerm::new("foo").category(Some("bar"))]);
>> +        assert!(!search.category_value_required("bar", "baz"));
>> +
>> +        let search: Search = Search::from_iter(vec![SearchTerm::new("baz").category(Some("bar"))]);
>> +        assert!(search.category_value_required("bar", "baz"));
>> +
>> +        let search: Search = Search::from_iter(vec![SearchTerm::new("foo")
>> +            .optional(true)
>> +            .category(Some("bar"))]);
>> +        assert!(!search.category_value_required("bar", "baz"));
>> +
>> +        let search: Search = Search::from_iter(vec![SearchTerm::new("baz")
>> +            .optional(true)
>> +            .category(Some("bar"))]);
>> +        assert!(search.category_value_required("bar", "baz"));
>> +
>> +        let search: Search = Search::from_iter(vec![
>> +            SearchTerm::new("baz").optional(true).category(Some("bar")),
>> +            SearchTerm::new("foo").optional(true),
>> +        ]);
>> +        assert!(!search.category_value_required("bar", "baz"));
>> +    }
>> +
>> +    #[test]
>> +    fn test_display() {
>> +        let term = SearchTerm::new("foo");
>> +        assert_eq!("+foo", &term.to_string());
>> +
>> +        let term = SearchTerm::new("foo").optional(true);
>> +        assert_eq!("foo", &term.to_string());
>> +
>> +        let term = SearchTerm::new("foo").optional(false);
>> +        assert_eq!("+foo", &term.to_string());
>> +
>> +        let term = SearchTerm::new("foo").category(Some("bar"));
>> +        assert_eq!("+bar:foo", &term.to_string());
>> +
>> +        let term = SearchTerm::new("foo").optional(true).category(Some("bar"));
>> +        assert_eq!("bar:foo", &term.to_string());
>> +
>> +        let term = SearchTerm::new("foo").optional(false).category(Some("bar"));
>> +        assert_eq!("+bar:foo", &term.to_string());
>> +    }
>> +}
>> -- 
>> 2.47.2



_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel

  reply	other threads:[~2025-09-03  8:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-28 13:15 [pdm-devel] [PATCH datacenter-manager v4 00/10] implement more complex search syntax Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 01/10] pdm-api-types: resources: add helper methods for fields Dominik Csapak
2025-09-02 14:29   ` Wolfgang Bumiller
2025-09-03  7:38     ` Dominik Csapak
2025-09-03  7:53       ` Wolfgang Bumiller
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 02/10] lib: add pdm-search crate Dominik Csapak
2025-09-02 15:33   ` Wolfgang Bumiller
2025-09-03  8:14     ` Dominik Csapak [this message]
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 03/10] server: api: resources: add more complex filter syntax Dominik Csapak
2025-08-28 13:53   ` Shannon Sterz
2025-08-28 14:18     ` Dominik Csapak
2025-09-03  8:49   ` Wolfgang Bumiller
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 04/10] ui: add possibility to insert into search box Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 05/10] ui: dashboard: remotes panel: open search on click Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 06/10] ui: dashboard: guest panel: search for guest states when clicking on them Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 07/10] ui: dashboard: search for nodes when clicking on the nodes panel Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 08/10] ui: search box: add clear trigger Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 09/10] ui: dashboard: guest panel: use `List` instead of `DataTable` Dominik Csapak
2025-08-28 13:16 ` [pdm-devel] [PATCH datacenter-manager v4 10/10] ui: dashboard: guest panel: add search icon for better discoverability Dominik Csapak
2025-09-03 13:34 ` [pdm-devel] superseded: [PATCH datacenter-manager v4 00/10] implement more complex search syntax Dominik Csapak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c211b39c-d9e0-48e4-9d29-fba6ce2e85f5@proxmox.com \
    --to=d.csapak@proxmox.com \
    --cc=pdm-devel@lists.proxmox.com \
    --cc=w.bumiller@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal