From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pmg-devel@lists.proxmox.com
Subject: Re: [pmg-devel] [PATCH pmg-api 03/12] RuleCache: reorganize how we gather marks and spaminfo
Date: Tue, 20 Feb 2024 12:10:35 +0100
Message-ID: <20240220121035.5b7f6889@rosa.proxmox.com>
In-Reply-To: <20240209125440.2572239-4-d.csapak@proxmox.com>
On Fri, 9 Feb 2024 13:54:27 +0100
Dominik Csapak <d.csapak@proxmox.com> wrote:
> instead of collecting the spaminfo (+match) separately, collect this
> per target together with the regular marks. With this, we can omit the
> 'global' marks list, since each target has their own anyway.
>
> We want this, since when we'll implement and/invert for matches, the marks
> can differ between targets, since the spamlevel can diverge for them and
> that can be and-combined with objects that add marks. For that to be
> possible we have to save each match + info per target instead of
> globally.
>
> Since we don't change the actual matching behaviour with this patch,
> for the remove action, we can simply use the marks from the first target
> (as they currently have to be identical).
I don't think this premise holds - or rather, the reasoning seems a bit off:
* marks are generated by what_match
* global (not per-part) matches are virus and spam - these just mark with an
  empty array-ref [], indicating that they affect the whole mail
* per-part what-matches are MatchField and the content-type/filename
  matches - they add a list of all parts they match
* the only what_match that might differ per user/target is the spam-match,
  which marks the complete mail

So marks are identical per rule across all targets, because the only place
where they could differ just pushes the contents of an empty array to the
list.
(sorry if this sounds a bit pedantic - but it sadly took me 30 minutes
with Data::Dumper to get my head around this)
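
To illustrate the point above, here is a minimal standalone sketch (not the
actual RuleCache code). The sub collect_marks, the part names and the two
match variables are made up for the example; the only thing it assumes from
the real code is that whole-mail matches return an empty array-ref and
per-part matches return a list of parts.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for what the what-objects return from what_match:
# whole-mail matches (virus, spam) return an empty array-ref,
# per-part matches return the list of matching parts.
my $spam_match = [];                    # affects the whole mail
my $part_match = ['part-2', 'part-5'];  # made-up part identifiers

# Collect marks for one target; $include_spam models whether the
# spam-match applied for that target.
sub collect_marks {
    my ($include_spam) = @_;
    my $marks = [];
    push @$marks, @$part_match;
    push @$marks, @$spam_match if $include_spam;  # pushes nothing
    return $marks;
}

my $marks_a = collect_marks(1);  # target where the spam level matched
my $marks_b = collect_marks(0);  # target where it did not

print join(',', @$marks_a), "\n";
print join(',', @$marks_b), "\n";
```

Both lines print the same list, since pushing the contents of [] does not
add anything to the marks.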
>
> Conversely, we currently save the spaminfo per target, but later in
> pmg-smtp-filter we only ever use the first one we encounter, so instead
> save it only the first time and use that.
We currently get the spaminfo as part of the hashref returned by
RuleCache::what_match, next to its only other member, 'targets'.
Maybe we could return it as a second value from what_match instead and
save ourselves the second level of nesting (see inline).
Please disregard if this becomes obsolete with one of the later patches.
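
A rough sketch of the two-value return suggested above, under stated
assumptions: what_match_sketch is a hypothetical, heavily simplified
stand-in for RuleCache::what_match (the real method takes different
arguments), and the addresses and the sa_score field are example data only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for RuleCache::what_match: returns the per-target
# marks and, separately, the first spaminfo that was encountered.
sub what_match_sketch {
    my ($targets, $target_info) = @_;

    my $res = {};   # keyed directly by target, no {targets} level
    my $spaminfo;   # returned as second value

    for my $target (@$targets) {
        my $info = $target_info->{$target} or next;
        push @{ $res->{$target}->{marks} }, @{ $info->{marks} // [] };
        # only save spaminfo once
        $spaminfo //= $info->{spaminfo};
    }

    return ($res, $spaminfo);
}

my ($marks, $spaminfo) = what_match_sketch(
    ['a@example.com', 'b@example.com'],
    {
        'a@example.com' => { marks => [], spaminfo => { sa_score => 7 } },
        'b@example.com' => { marks => [], spaminfo => { sa_score => 7 } },
    },
);

# The caller can then check the targets directly.
for my $target (sort keys %$marks) {
    next if !defined($marks->{$target}->{marks});
    print "$target matched\n";
}
```

With this shape the callers would not need to look into {targets} at all,
and the spaminfo could be passed on as a plain scalar.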
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> src/PMG/RuleCache.pm | 32 ++++++++++----------------------
> src/PMG/RuleDB/Remove.pm | 19 +++++++++++++++----
> src/bin/pmg-smtp-filter | 18 +++++-------------
> 3 files changed, 30 insertions(+), 39 deletions(-)
>
> diff --git a/src/PMG/RuleCache.pm b/src/PMG/RuleCache.pm
> index fd22a16..4f7ebe7 100644
> --- a/src/PMG/RuleCache.pm
> +++ b/src/PMG/RuleCache.pm
> @@ -304,37 +304,25 @@ sub what_match {
> if (scalar($what->{groups}->@*) == 0) {
> # match all targets
> foreach my $target (@{$msginfo->{targets}}) {
> - $res->{$target}->{marks} = [];
> + $res->{targets}->{$target}->{marks} = [];
here this could become $res->{$target}->{marks}
> }
> -
> - $res->{marks} = [];
> return $res;
> }
>
> - my $marks;
> -
> for my $group ($what->{groups}->@*) {
> for my $obj ($group->{objects}->@*) {
> if (!$obj->can('what_match_targets')) {
> if (my $match = $obj->what_match($queue, $element, $msginfo, $dbh)) {
> - push @$marks, @$match;
> + for my $target ($msginfo->{targets}->@*) {
> + push $res->{targets}->{$target}->{marks}->@*, $match->@*;
here as well
> + }
> }
> - }
> - }
> - }
> -
> - foreach my $target (@{$msginfo->{targets}}) {
> - $res->{$target}->{marks} = $marks;
> - $res->{marks} = $marks;
> - }
> -
> - for my $group ($what->{groups}->@*) {
> - for my $obj ($group->{objects}->@*) {
> - if ($obj->can ("what_match_targets")) {
> - my $target_info;
> - if ($target_info = $obj->what_match_targets($queue, $element, $msginfo, $dbh)) {
> - foreach my $k (keys %$target_info) {
> - $res->{$k} = $target_info->{$k};
> + } else {
> + if (my $target_info = $obj->what_match_targets($queue, $element, $msginfo, $dbh)) {
> + foreach my $k (keys $target_info->%*) {
> + push $res->{targets}->{$k}->{marks}->@*, $target_info->{$k}->{marks}->@*;
and here
> + # only save spaminfo once
> + $res->{spaminfo} = $target_info->{$k}->{spaminfo} if !defined($res->{spaminfo});
this would need to be changed (and returned as second value below)
> }
> }
> }
> diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
> index e7c353c..5812602 100644
> --- a/src/PMG/RuleDB/Remove.pm
> +++ b/src/PMG/RuleDB/Remove.pm
> @@ -198,9 +198,15 @@ sub execute {
>
> my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>
> - if (!$self->{all} && ($#$marks == -1)) {
> - # no marks
> - return;
> + if (!$self->{all}) {
> + my $found_mark = 0;
> + for my $target (keys $marks->{targets}->%*) {
> + if (scalar($marks->{targets}->{$target}->{marks}->@*) > 0) {
> + $found_mark = 1;
> + last;
> + }
> + }
> + return if !$found_mark;
> }
>
> my $subgroups = $mod_group->subgroups ($targets);
> @@ -256,7 +262,12 @@ sub execute {
> }
>
> $self->{message_seen} = 0;
> - $self->delete_marked_parts($queue, $entity, $html, $rtype, $marks, $rulename);
> +
> + # since all matches are or combinded, marks for all targets must be the same if they exist
> + # so simply use the first one here
maybe "since currently all marks are equal for all targets, use the first
one"?
> + my $match_marks = $marks->{targets}->{$tg->[0]}->{marks};
> +
> + $self->delete_marked_parts($queue, $entity, $html, $rtype, $match_marks, $rulename);
> delete $self->{message_seen};
>
> if ($msginfo->{testmode}) {
> diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
> index 7da3de8..71043b0 100755
> --- a/src/bin/pmg-smtp-filter
> +++ b/src/bin/pmg-smtp-filter
> @@ -276,8 +276,9 @@ sub apply_rules {
> foreach my $target (@{$msginfo->{targets}}) {
> next if $final->{$target};
> next if !defined ($rule_marks{$rule->{id}});
> - next if !defined ($rule_marks{$rule->{id}}->{$target});
> - next if !defined ($rule_marks{$rule->{id}}->{$target}->{marks});
> + next if !defined ($rule_marks{$rule->{id}}->{targets});
you could get rid of this line if what_match returned the spaminfo as a second value.
> + next if !defined ($rule_marks{$rule->{id}}->{targets}->{$target});
> + next if !defined ($rule_marks{$rule->{id}}->{targets}->{$target}->{marks});
and here get rid of {targets}->
> next if !$rulecache->to_match ($rule->{id}, $target, $ldap);
>
> $final->{$target} = $fin;
> @@ -320,24 +321,15 @@ sub apply_rules {
> my $targets = $rule_targets{$rule->{id}};
> next if !$targets;
>
> - my $spaminfo;
> - foreach my $t (@$targets) {
> - if ($rule_marks{$rule->{id}}->{$t} && $rule_marks{$rule->{id}}->{$t}->{spaminfo}) {
> - $spaminfo = $rule_marks{$rule->{id}}->{$t}->{spaminfo};
> - # we assume spam info is the same for all matching targets
> - last;
> - }
> - }
> -
> my $vars = $self->get_prox_vars (
> - $queue, $entity, $msginfo, $rule, $rule_targets{$rule->{id}}, $spaminfo);
> + $queue, $entity, $msginfo, $rule, $rule_targets{$rule->{id}}, $rule_marks{$rule->{id}}->{spaminfo});
>
> my @sorted_actions = sort {$a->priority <=> $b->priority} @{$rule_actions{$rule->{id}}};
>
> foreach my $action (@sorted_actions) {
> $action->execute(
> $queue, $self->{ruledb}, $mod_group, $rule_targets{$rule->{id}}, $msginfo, $vars,
> - $rule_marks{$rule->{id}}->{marks}, $ldap
> + $rule_marks{$rule->{id}}, $ldap
> );
> last if $action->final;
> }