From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pdm-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id 9A69E1FF183
	for <inbox@lore.proxmox.com>; Wed,  5 Nov 2025 11:57:59 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id 8B6322100E;
	Wed,  5 Nov 2025 11:58:39 +0100 (CET)
Mime-Version: 1.0
Date: Wed, 05 Nov 2025 11:58:05 +0100
Message-Id: <DE0PHQIDCL1P.3VR35J8P02V05@proxmox.com>
From: "Lukas Wagner" <l.wagner@proxmox.com>
To: "Dominik Csapak" <d.csapak@proxmox.com>, "Proxmox Datacenter Manager
 development discussion" <pdm-devel@lists.proxmox.com>, "Lukas Wagner"
 <l.wagner@proxmox.com>
X-Mailer: aerc 0.21.0-0-g5549850facc2-dirty
References: <20251103123521.266258-1-l.wagner@proxmox.com>
 <20251103123521.266258-5-l.wagner@proxmox.com>
 <a9e4fe9a-3ecd-4a11-ad3e-e72c846b72b7@proxmox.com>
In-Reply-To: <a9e4fe9a-3ecd-4a11-ad3e-e72c846b72b7@proxmox.com>
X-Bm-Milter-Handled: 55990f41-d878-4baa-be0a-ee34c49e34d2
X-Bm-Transport-Timestamp: 1762340267510
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.027 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [config.id]
Subject: Re: [pdm-devel] [PATCH datacenter-manager v2 04/12] views: add
 implementation for view filters
X-BeenThere: pdm-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Datacenter Manager development discussion
 <pdm-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pdm-devel>, 
 <mailto:pdm-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pdm-devel/>
List-Post: <mailto:pdm-devel@lists.proxmox.com>
List-Help: <mailto:pdm-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel>, 
 <mailto:pdm-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox Datacenter Manager development discussion
 <pdm-devel@lists.proxmox.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pdm-devel-bounces@lists.proxmox.com
Sender: "pdm-devel" <pdm-devel-bounces@lists.proxmox.com>

On Wed Nov 5, 2025 at 11:08 AM CET, Dominik Csapak wrote:
>> +
>> +/// Get view filter with a given ID.
>> +///
>> +/// Returns an error if the view filter configuration file could not be read, or
>> +/// if the view filter with the provided ID does not exist.
>> +pub fn get_view_filter(filter_id: &str) -> Result<ViewFilter, Error> {
>> +    pdm_config::views::get_view_filter_config(filter_id).map(ViewFilter::new)
>> +}
>> +
>> +/// View filter implementation.
>> +///
>> +/// Given a [`ViewFilterConfig`], this struct can be used to check if a resource/remote/node
>> +/// matches the filter rules.
>> +#[derive(Clone)]
>> +pub struct ViewFilter {
>> +    config: ViewFilterConfig,
>> +}
>
> wouldn't a newtype suffice here too?
>
> pub struct ViewFilter(ViewFilterConfig)

Personally I'm not the biggest fan of newtypes unless I'm 100%
convinced that the type is only ever going to contain this single
member (simple example: pub struct UnixTimestamp(i64)), just to avoid
having to change any code when I have to introduce another member (e.g.
changing self.0 to self.config). Since it's only a very minor syntactic
thing, I'd be tempted to leave it as is - unless you insist I change it.

>
> ? alternatively, what about having freestanding functions that
> take a `&ViewFilterConfig` as parameter ?
>

I want to shield callers from the ViewFilterConfig itself (also the
reason why the `get_view_filter` fn exist, instead of having the caller
call directly into pdm_config) - so I'd prefer to keep this struct.

> If we're doing it this way though, I'd rather implement a
> From<ViewFilterConfig> for ViewFilter than a `new` method
> (or maybe both)
>
>> +
>> +impl ViewFilter {
>> +    /// Create a new [`ViewFiler`].
>> +    pub fn new(config: ViewFilterConfig) -> Self {
>> +        Self { config }
>> +    }
>> +
>> +    /// Check if a [`Resource`] matches the filter rules.
>> +    pub fn resource_matches(&self, remote: &str, resource: &Resource) -> bool {
>> +        // NOTE: Establishing a cache here is not worth the effort at the moment, evaluation of
>> +        // rules is *very* fast.
>> +        //
>> +        // Some experiments were performed with a cache that works roughly as following:
>> +        //   - HashMap<ViewId, HashMap<ResourceId, bool>> in a mutex
>> +        //   - Cache invalidated if view-filter config digest changed
>> +        //   - Cache invalidated if certain resource fields such as tags or resource pools change
>> +        //     from the last time (also with a digest-based implementation)
>> +        //
>> +        // Experimented with the `fake-remote` feature and and 15000 guests showed that
>> +        // caching was only faster than direct evaluation if the number of rules in the
>> +        // ViewFilterConfig is *huge* (e.g. >1000 `include-resource-id` entries). But even for those,
>> +        // direct evaluation was always plenty fast, with evaluation times ~20ms for *all* resources.
>> +        //
>> +        // -> for any *realistic* filter config, we should be good with direct evaluation, as long
>> +        // as we don't add any filter rules which are very expensive to evaluate.
>
> isn't that (full) info more suited for the commit message than a  comment?
>
> e.g. a single line comment with 'caching here is currently not worth it'
> and the full text in the commit message should also be ok?
>
> (no hard feelings though)
>

I'll move it to the commit message and leave a short summary in the
comment - thanks!

>> +
>> +        let resource_data = resource.into();
>> +
>> +        self.check_if_included(remote, &resource_data)
>> +            && !self.check_if_excluded(remote, &resource_data)
>> +    }
>> +
>> +    /// Check if a remote can be safely skipped based on the filter rule definition.
>> +    ///
>> +    /// When there are `include remote:<...>` or `exclude remote:<...>` rules, we can use these to
>> +    /// check if a remote needs to be considered at all.
>> +    pub fn can_skip_remote(&self, remote: &str) -> bool {
>> +        let mut has_any_include_remote = false;
>> +        let mut matches_any_include_remote = false;
>> +
>> +        let mut any_other = false;
>> +
>> +        for include in &self.config.include {
>> +            if let FilterRule::Remote(r) = include {
>> +                has_any_include_remote = true;
>> +                if r == remote {
>> +                    matches_any_include_remote = true;
>> +                    break;
>> +                }
>> +            } else {
>> +                any_other = true;
>> +            }
>> +        }
>> +
>> +        let matches_any_exclude_remote = self.config.exclude.iter().any(|e| {
>> +            if let FilterRule::Remote(r) = e {
>> +                r == remote
>> +            } else {
>> +                false
>> +            }
>> +        });
>> +
>> +        (has_any_include_remote && !matches_any_include_remote && !any_other)
>> +            || matches_any_exclude_remote
>> +    }
>> +
>> +    /// Check if a node is matched by the filter rules.
>> +    ///
>> +    /// This is equivalent to checking an actual node resource.
>> +    pub fn is_node_included(&self, remote: &str, node: &str) -> bool {
>> +        let resource_data = ResourceData {
>> +            resource_type: ResourceType::Node,
>> +            tags: None,
>> +            resource_pool: None,
>> +            resource_id: &format!("remote/{remote}/node/{node}"),
>> +        };
>> +
>> +        self.check_if_included(remote, &resource_data)
>> +            && !self.check_if_excluded(remote, &resource_data)
>> +    }
>> +
>> +    /// Returns the name of the view filter.
>> +    pub fn name(&self) -> &str {
>> +        &self.config.id
>> +    }
>> +
>> +    fn check_if_included(&self, remote: &str, resource: &ResourceData) -> bool {
>> +        if self.config.include.is_empty() {
>> +            // If there are no include rules, any resource is included (unless excluded)
>> +            return true;
>> +        }
>> +
>> +        check_rules(&self.config.include, remote, resource)
>> +    }
>> +
>> +    fn check_if_excluded(&self, remote: &str, resource: &ResourceData) -> bool {
>> +        check_rules(&self.config.exclude, remote, resource)
>> +    }
>> +}
>> +
>> +fn check_rules(rules: &[FilterRule], remote: &str, resource: &ResourceData) -> bool {
>> +    for rule in rules {
>> +        let verdict = match rule {
>> +            FilterRule::ResourceType(resource_type) => resource.resource_type == *resource_type,
>> +            FilterRule::ResourcePool(pool) => resource.resource_pool == Some(pool),
>> +            FilterRule::ResourceId(resource_id) => resource.resource_id == resource_id,
>> +            FilterRule::Tag(tag) => {
>> +                if let Some(resource_tags) = resource.tags {
>> +                    resource_tags.contains(tag)
>> +                } else {
>> +                    false
>> +                }
>> +            }
>> +            FilterRule::Remote(included_remote) => included_remote == remote,
>> +        };
>> +
>> +        if verdict {
>> +            return true;
>> +        }
>> +    }
>> +
>> +    false
>
> wouldn't this boil down to:
>
> return rules.any(|rule| match rule { ... } ) ?
>
> instead of looping and doing an early return manually?
>

You are absolutely right - I'll change it as suggested.

>> +}
>> +
>> +struct ResourceData<'a> {
>> +    resource_type: ResourceType,
>> +    tags: Option<&'a [String]>,
>> +    resource_pool: Option<&'a String>,
>> +    resource_id: &'a str,
>> +}
>> +
>> +impl<'a> From<&'a Resource> for ResourceData<'a> {
>> +    fn from(value: &'a Resource) -> Self {
>> +        match value {
>> +            Resource::PveQemu(resource) => ResourceData {
>> +                resource_type: value.resource_type(),
>> +                tags: Some(&resource.tags),
>> +                resource_pool: Some(&resource.pool),
>> +                resource_id: value.global_id(),
>> +            },
>> +            Resource::PveLxc(resource) => ResourceData {
>> +                resource_type: value.resource_type(),
>> +                tags: Some(&resource.tags),
>> +                resource_pool: Some(&resource.pool),
>> +                resource_id: value.global_id(),
>> +            },
>> +            Resource::PveNode(_)
>> +            | Resource::PveSdn(_)
>> +            | Resource::PbsNode(_)
>> +            | Resource::PbsDatastore(_)
>> +            | Resource::PveStorage(_) => ResourceData {
>> +                resource_type: value.resource_type(),
>> +                tags: None,
>> +                resource_pool: None,
>> +                resource_id: value.global_id(),
>> +            },
>> +        }
>> +    }
>> +}
>
> Is it really worht it, to define a seperate type that you only use 
> internally?
> couldn't you simple use the &Resource type directly?
> or maybe just having 2 helper methods to extract the relevant info?
> (the type and global_id is already abstracted, so it's only relevant
> for the tags and the resource_pool ?)
>
> but i guess i'd have to see it to determine what is better...

The sole reason why there is this 'itermediary' type is the
'is_node_included' function. It is used when we don't handle resources,
but only really care about the node (e.g. task API). I didn't want to
'synthesize' some fake PveNode resource in the 'is_node_included'
function; it felt kind of wrong.


_______________________________________________
pdm-devel mailing list
pdm-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pdm-devel