public inbox for pmg-devel@lists.proxmox.com
* [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
@ 2022-11-23  9:23 Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

v2->v3:
* dropped the useless decode/encode/decode chain in decode_rfc1522
* moved try_decode_utf8 to patch 1 as it's now used there
* renamed 'encode_user_stat' to 'user_stat_to_perlstring' as this is what
  the helper actually does
* the 2 patches for pmg-gui make it possible to add user black/whitelist
  entries for non-ascii e-mails
* quickly re-verified that pmgpolicy should be robust for smtputf8 mail
  (postfix hands the data over as utf-8 - and pmgpolicy does not parse it)

Thanks again to Dominik for the off-list suggestions!

original cover-letter for v2:
v1->v2:
* dropped already applied patches
* added a patch for one further glitch in ModField/Notify actions (when
  parsing/replacing non-ascii characters) - patch 1/5+2/5
* added support for utf-8 data in the mailflow additionally for:
** quarantine API handling
** user BL/WL (the GUI still needs adaptation to parse e-mail-addresses
   more liberally - but else it seems to work)
** pmgqm (spamreports)
** statistics

still missing support for:
* LDAP
* Who Objects

huge thanks to Dominik for taking the time to review and test the v1!

original cover-letter for v1:
this patchseries partially fixes #2465 and #2541, two quite often reported
issues, which are causing quite a disappointing experience for users
in non-ascii only environments

the main assumptions of the patches are:
* envelope addresses are either ascii or utf-8 (latter only with smtputf8)
* thus we can unconditionally de-/encode envelope addresses for database
  results/lookups
* the matching in the rule-objects will see the relevant parts of the mail
  as properly encoded perl-strings (with multi-byte characters - e.g. the
  euro sign as \x{20ac} instead of \x{e2}\x{82}\x{ac})
(I did a bit of testing to verify them, by e.g. sending an ISO-8859-1
encoded mail and matching for an umlaut in the subject)
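
To illustrate the third assumption, here is a minimal sketch (not part of
the series; variable names are made up for the example) of the difference
between the utf-8 bytes on the wire and the perl-string the rule objects
are expected to see:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The euro sign as it arrives on the wire: three UTF-8 bytes.
my $bytes = "\xe2\x82\xac";

# Decoding yields a perl-string holding a single wide character.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length %d\n", length($bytes);   # bytes: length 3
printf "chars: length %d, code point U+%04X\n", length($chars), ord($chars);

# A pattern written as a perl-string only matches the decoded form.
my $pattern = qr/\x{20ac}/;
print "decoded string matches\n"  if $chars =~ $pattern;
print "raw bytes do not match\n" if $bytes !~ $pattern;

# Encoding restores the original byte sequence, e.g. for database parameters.
my $roundtrip = encode('UTF-8', $chars);
```

The same round trip (decode on input, encode on output) is what the
series applies at the database and syslog boundaries.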

While going through the RuleDB classes I remembered, that we have a few
pieces of legacy objects (Attach, ReportSpam, Counter actions) there, and
went ahead with deprecating them (initially I simply deleted them, but
decided to be more cautious and just log the deprecation until 8.0, when
we can drop them explicitly). They cannot be instantiated currently (short
of a direct insert into the database) - but I don't know if they were ever
used in pre 5.0 times in their current form. - patch 2/5.

Out of scope of the series for now:
* utf-8 support in the LDAP subsystem (deployments with a configured LDAP
  profile still won't be able to process smtputf8 mails) - mostly until I
  get around to creating a test-environment with the appropriate schema for
  having non-ascii mail-addresses
* Domain/Email objects - did not find the time to consider how to store
  them most sensibly (puny-code, utf-8) and if the choice should be
  carried over to all of our 'email' formats (it probably shouldn't)

patches 1/5 and 4/5 address 2 small bugs I ran into while testing

Given that I quite often miss a few fine points or use-cases I'd be very
grateful for some more experimenting/testing!


pmg-api:
Stoiko Ivanov (8):
  utils: return perl string from decode_rfc1522
  ruledb: properly substitute prox_vars in headers
  fix #2541 ruledb: encode relevant values as utf-8 in database
  ruledb: encode e-mail addresses for syslog
  partially fix #2465: handle smtputf8 addresses in the rule-system
  quarantine: handle utf8 data
  pmgqm: handle smtputf8 data
  statistics: handle utf8 data.

 src/PMG/API2/Quarantine.pm      | 16 ++++----
 src/PMG/CLI/pmgqm.pm            | 24 ++++++------
 src/PMG/HTMLMail.pm             |  7 ++--
 src/PMG/MailQueue.pm            | 10 +++--
 src/PMG/Quarantine.pm           | 13 ++++---
 src/PMG/RuleDB.pm               | 24 ++++++++----
 src/PMG/RuleDB/Accept.pm        |  2 +-
 src/PMG/RuleDB/BCC.pm           | 23 +++++++++--
 src/PMG/RuleDB/Block.pm         |  2 +-
 src/PMG/RuleDB/Disclaimer.pm    |  2 +-
 src/PMG/RuleDB/Group.pm         |  4 +-
 src/PMG/RuleDB/MatchField.pm    |  8 +++-
 src/PMG/RuleDB/MatchFilename.pm |  5 ++-
 src/PMG/RuleDB/ModField.pm      | 19 +++-------
 src/PMG/RuleDB/Notify.pm        | 24 +++++++++---
 src/PMG/RuleDB/Quarantine.pm    | 19 ++++++++--
 src/PMG/RuleDB/Remove.pm        | 20 +++++++---
 src/PMG/RuleDB/Rule.pm          |  2 +-
 src/PMG/RuleDB/Spam.pm          | 17 +++++----
 src/PMG/RuleDB/WhoRegex.pm      |  5 ++-
 src/PMG/Statistic.pm            | 67 ++++++++++++++++++++++++---------
 src/PMG/Utils.pm                | 32 ++++++++++++++--
 src/bin/pmg-smtp-filter         |  7 ++--
 23 files changed, 238 insertions(+), 114 deletions(-)

pmg-gui:
Stoiko Ivanov (2):
  utils: add custom validator for pmg-email-address
  userblocklists: use PMGMail as validator for pmail

 js/UserBlackWhiteList.js | 2 +-
 js/Utils.js              | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

-- 
2.30.2





* [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context) - adapt it to return a perlstring, under the
assumption that data is utf-8 encoded (holds true for ascii and
smtputf8 mails)

the try_decode_utf8 helper sub will be used extensively in
later patches and is inspired by commit
43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that the valid multibyte utf-8 characters do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£") so if something decodes without error
from utf-8 it will in all likelihood have been utf-8 to begin with
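
A minimal standalone sketch of how the heuristic behaves (the helper body
is copied from the diff below; the sample values are only illustrative):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Same body as the helper added by this patch, copied so the sketch runs standalone.
sub try_decode_utf8 {
    my ($data) = @_;
    # CHECK = 1 (Encode::FB_CROAK) makes decode die on malformed input;
    # in that case the eval yields undef and the input is returned as-is.
    return eval { decode('UTF-8', $data, 1) } // $data;
}

my $utf8_pound   = "\xc2\xa3";  # "£" encoded as UTF-8 (two bytes)
my $latin1_pound = "\xa3";      # "£" as a single ISO-8859-1 byte

my $decoded  = try_decode_utf8($utf8_pound);    # valid UTF-8: decoded
my $fallback = try_decode_utf8($latin1_pound);  # malformed UTF-8: unchanged

printf "decoded:  length %d, U+%04X\n", length($decoded),  ord($decoded);
printf "fallback: length %d, U+%04X\n", length($fallback), ord($fallback);
```

Both calls end up with the one-character string U+00A3. The limit of the
heuristic is a single-byte sequence that happens to be valid utf-8 (e.g.
the ISO-8859-1 bytes "\xc3\xa9"), which would be decoded as one character.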

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/Utils.pm | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..cfb8852 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1088,6 +1088,7 @@ sub decode_to_html {
     return $res;
 }
 
+# assume enc contains utf-8 and mime-encoded data returns a perl-string (with wide characters)
 sub decode_rfc1522 {
     my ($enc) = @_;
 
@@ -1102,7 +1103,7 @@ sub decode_rfc1522 {
 		if ($cs) {
 		    $res .= decode($cs, $d);
 		} else {
-		    $res .= $d;
+		    $res .= try_decode_utf8($d);
 		}
 	    }
 	}
@@ -1542,4 +1543,9 @@ sub get_existing_object_id {
     return;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+    return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
 1;
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

by storing the variables as perl-string (not mime-encoded, and not
utf-8 encoded), and appropriately dealing with multi-line values to
input (folding the headers and encoding as mime).

This fixes another glitch not caught by
d3d6b5dff9e4447d16cb92e0fdf26f67d9384423

the Subject was always displayed with a '?' at the end (due to the
(quoted-printable encoded) \n added).

Additionally adapt the other callsites of PMG::Utils::subst_values
where applicable.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/BCC.pm      |  2 +-
 src/PMG/RuleDB/ModField.pm | 13 +------------
 src/PMG/RuleDB/Notify.pm   |  4 ++--
 src/PMG/Utils.pm           | 17 +++++++++++++++++
 src/bin/pmg-smtp-filter    |  2 +-
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index d364690..4867d83 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -117,7 +117,7 @@ sub execute {
 
     my $rulename = $vars->{RULE} // 'unknown';
 
-    my $bcc_to = PMG::Utils::subst_values($self->{target}, $vars);
+    my $bcc_to = PMG::Utils::subst_values_for_header($self->{target}, $vars);
 
     if ($bcc_to =~ m/^\s*$/) {
 	# this happens if a notification is triggered by bounce mails
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 4ebb618..34108d1 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -5,7 +5,6 @@ use warnings;
 use DBI;
 use Digest::SHA;
 use Encode qw(encode decode);
-use MIME::Words qw(encode_mimewords);
 
 use PMG::Utils;
 use PMG::ModGroup;
@@ -98,17 +97,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $fvalue = '';
-
-    foreach my $line (split('\r?\n\s*',PMG::Utils::subst_values ($self->{field_value}, $vars))) {
-	$fvalue .= "\n" if $fvalue;
-	$fvalue .= encode_mimewords(encode('UTF-8', $line), 'Charset' => 'UTF-8');
-    }
-
-    # support for multiline values (i.e. __SPAM_INFO__)
-    $fvalue =~ s/\n/\n\t/sg; # indent content
-    $fvalue =~ s/\n\s*\n//sg;   # remove empty line
-    $fvalue =~ s/\n?\s*$//s;    # remove trailing spaces
+    my $fvalue = PMG::Utils::subst_values_for_header($self->{field_value}, $vars);
 
     my $subgroups = $mod_group->subgroups($targets);
 
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index d67221e..7b38e0d 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -211,8 +211,8 @@ sub execute {
     my $rulename = $vars->{RULE} // 'unknown';
 
     my $body = PMG::Utils::subst_values($self->{body}, $vars);
-    my $subject = PMG::Utils::subst_values($self->{subject}, $vars);
-    my $to = PMG::Utils::subst_values($self->{to}, $vars);
+    my $subject = PMG::Utils::subst_values_for_header($self->{subject}, $vars);
+    my $to = PMG::Utils::subst_values_for_header($self->{to}, $vars);
 
     if ($to =~ m/^\s*$/) {
 	# this happens if a notification is triggered by bounce mails
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cfb8852..cc30e67 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -203,6 +203,23 @@ sub subst_values {
     return $body;
 }
 
+sub subst_values_for_header {
+    my ($header, $dh) = @_;
+
+    my $res = '';
+    foreach my $line (split('\r?\n\s*', subst_values ($header, $dh))) {
+	$res .= "\n" if $res;
+	$res .= MIME::Words::encode_mimewords(encode('UTF-8', $line), 'Charset' => 'UTF-8');
+    }
+
+    # support for multiline values (i.e. __SPAM_INFO__)
+    $res =~ s/\n/\n\t/sg; # indent content
+    $res =~ s/\n\s*\n//sg;   # remove empty line
+    $res =~ s/\n?\s*$//s;    # remove trailing spaces
+
+    return $res;
+}
+
 sub reinject_mail {
     my ($entity, $sender, $targets, $xforward, $me, $params) = @_;
 
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 35a6ac6..45e68a7 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -152,7 +152,7 @@ sub get_prox_vars {
     } if !$spaminfo;
 
     my $vars = {
-	'SUBJECT' => mime_to_perl_string($entity->head->get ('subject', 0) || 'No Subject'),
+	'SUBJECT' => PMG::Utils::decode_rfc1522($entity->head->get ('subject', 0) || 'No Subject'),
 	'RULE' => $rule->{name},
 	'RULE_INFO' => $msginfo->{rule_info},
 	'SENDER' => $msginfo->{sender},
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

This patch adds support for storing rule names, comments(info), and
most relevant values (e.g. the header content to match) in utf-8 in
the database.

backwards-compatibility should not be an issue:
* currently the database should not contain any utf-8 multibyte
  characters, as our tooling prevented this due to sending
  wide-characters, which causes an exception in DBI.
* any character > 127 and < 256 will be correctly interpreted when
  stored in a perl-string (this happens if the decode fails in
  try_decode_utf8), and will be correctly encoded when storing into
  the database.

the database is created with SQL_ASCII encoding - which behaves by
interpreting bytes <= 127 as ascii, while those > 127 are not interpreted
(see [0]). This just means that we have to explicitly en-/decode upon
storing/reading from there.
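
As a rough sketch of the store/load pattern and the backwards-compatibility
argument above (the %column hash and the save_name/load_name subs are
hypothetical stand-ins for a text column and the RuleDB accessors, not code
from this patch):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Helper from patch 1/8, copied so the sketch runs standalone.
sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

# Hypothetical stand-in for a text column: it only ever holds bytes.
my %column;

# Encode explicitly when storing, decode (best effort) when reading.
sub save_name { my ($id, $name) = @_; $column{$id} = encode('UTF-8', $name); }
sub load_name { my ($id) = @_; return try_decode_utf8($column{$id}); }

# New data: a perl-string with wide characters survives the round trip.
my $name = "B\x{fc}ro \x{20ac}";
save_name(1, $name);
my $loaded = load_name(1);

# Legacy data: a bare byte > 127 fails to decode, is kept as-is on load,
# and is written back as proper utf-8 on the next save.
$column{2} = "B\xfcro";
my $legacy = load_name(2);
save_name(2, $legacy);
```

After the last call the legacy row holds "B\xc3\xbcro", i.e. the same name
re-encoded as utf-8.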

This patch currently omits most Who objects:
* for email/domain we'd still need to consider how to store them
  (puny-code for the domain part, or everything as UTF-8) and it would
  need changes to the API-types.
* the LDAP objects currently would not work too well, since our LDAPCache
  is not UTF-8 safe - and fixing it warrants its own patch-series
* WhoRegex should work and be able to handle many use-cases

The ContentType values should also contain only ascii characters per
RFC6838 [1] and RFC2045 [2].

[0] https://www.postgresql.org/docs/13/multibyte.html
[1] https://datatracker.ietf.org/doc/html/rfc6838#section-4.2
[2] https://datatracker.ietf.org/doc/html/rfc2045#section-5.1

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB.pm               | 24 ++++++++++++++++--------
 src/PMG/RuleDB/Accept.pm        |  2 +-
 src/PMG/RuleDB/BCC.pm           |  2 +-
 src/PMG/RuleDB/Block.pm         |  2 +-
 src/PMG/RuleDB/Disclaimer.pm    |  2 +-
 src/PMG/RuleDB/Group.pm         |  4 ++--
 src/PMG/RuleDB/MatchField.pm    |  8 ++++++--
 src/PMG/RuleDB/MatchFilename.pm |  5 ++++-
 src/PMG/RuleDB/ModField.pm      |  6 ++++--
 src/PMG/RuleDB/Notify.pm        |  2 +-
 src/PMG/RuleDB/Quarantine.pm    |  3 ++-
 src/PMG/RuleDB/Remove.pm        | 12 +++++++-----
 src/PMG/RuleDB/Rule.pm          |  2 +-
 src/PMG/RuleDB/WhoRegex.pm      |  5 ++++-
 14 files changed, 51 insertions(+), 28 deletions(-)

diff --git a/src/PMG/RuleDB.pm b/src/PMG/RuleDB.pm
index 895acc6..a6b0b79 100644
--- a/src/PMG/RuleDB.pm
+++ b/src/PMG/RuleDB.pm
@@ -5,6 +5,7 @@ use warnings;
 use DBI;
 use HTML::Entities;
 use Data::Dumper;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 
@@ -70,8 +71,8 @@ sub create_group_with_obj {
 
     defined($obj) || die "proxmox: undefined object";
 
-    $name //= '';
-    $info //= '';
+    $name = encode('UTF-8', $name // '');
+    $info = encode('UTF-8', $info // '');
 
     eval {
 
@@ -174,7 +175,9 @@ sub save_group {
 	$self->{dbh}->do("UPDATE Objectgroup " .
 			 "SET Name = ?, Info = ? " .
 			 "WHERE ID = ?", undef,
-			 $og->{name}, $og->{info}, $og->{id});
+			 encode('UTF-8', $og->{name}),
+			 encode('UTF-8', $og->{info}),
+			 $og->{id});
 
 	return $og->{id};
 
@@ -183,7 +186,7 @@ sub save_group {
 	    "INSERT INTO Objectgroup (Name, Info, Class) " .
 	    "VALUES (?, ?, ?);");
 
-	$sth->execute($og->name, $og->info, $og->class);
+	$sth->execute(encode('UTF-8', $og->name), encode('UTF-8', $og->info), $og->class);
 
 	return $og->{id} = PMG::Utils::lastid($self->{dbh}, 'objectgroup_id_seq');
     }
@@ -212,7 +215,9 @@ sub delete_group {
 	$sth->execute($groupid);
 
 	if (my $ref = $sth->fetchrow_hashref()) {
-	    die "Group '$ref->{groupname}' is used by rule '$ref->{rulename}' - unable to delete\n";
+	    my $groupname = PMG::Utils::try_decode_utf8($ref->{groupname});
+	    my $rulename = PMG::Utils::try_decode_utf8($ref->{rulename});
+	    die "Group '$groupname' is used by rule '$rulename' - unable to delete\n";
 	}
 
         $sth->finish();
@@ -474,6 +479,7 @@ sub load_object_full {
 sub load_group_by_name {
     my ($self, $name) = @_;
 
+    $name = encode('UTF-8', $name);
     my $sth = $self->{dbh}->prepare("SELECT * FROM Objectgroup " .
 				    "WHERE name = ?");
 
@@ -598,13 +604,14 @@ sub save_rule {
     defined($rule->{direction}) ||
 	die "undefined rule attribute - direction: ERROR";
 
+    my $rulename = encode('UTF-8', $rule->{name});
     if (defined($rule->{id})) {
 
 	$self->{dbh}->do(
 	    "UPDATE Rule " .
 	    "SET Name = ?, Priority = ?, Active = ?, Direction = ? " .
 	    "WHERE ID = ?", undef,
-	    $rule->{name}, $rule->{priority}, $rule->{active},
+	    $rulename, $rule->{priority}, $rule->{active},
 	    $rule->{direction}, $rule->{id});
 
 	return $rule->{id};
@@ -614,7 +621,7 @@ sub save_rule {
 	    "INSERT INTO Rule (Name, Priority, Active, Direction) " .
 	    "VALUES (?, ?, ?, ?);");
 
-	$sth->execute($rule->name, $rule->priority, $rule->active,
+	$sth->execute($rulename, $rule->priority, $rule->active,
 		      $rule->direction);
 
 	return $rule->{id} = PMG::Utils::lastid($self->{dbh}, 'rule_id_seq');
@@ -779,7 +786,8 @@ sub load_rules {
     $sth->execute();
 
     while (my $ref = $sth->fetchrow_hashref()) {
-	my $rule = PMG::RuleDB::Rule->new($ref->{name}, $ref->{priority},
+	my $rulename = PMG::Utils::try_decode_utf8($ref->{name});
+	my $rule = PMG::RuleDB::Rule->new($rulename, $ref->{priority},
 					  $ref->{active}, $ref->{direction});
 	$rule->{id} = $ref->{id};
 	push @$rules, $rule;
diff --git a/src/PMG/RuleDB/Accept.pm b/src/PMG/RuleDB/Accept.pm
index cd67ea2..4ebd6da 100644
--- a/src/PMG/RuleDB/Accept.pm
+++ b/src/PMG/RuleDB/Accept.pm
@@ -93,7 +93,7 @@ sub execute {
     my $dkim = $msginfo->{dkim} // {};
     my $subgroups = $mod_group->subgroups($targets, !$dkim->{sign});
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index 4867d83..6244dd9 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -115,7 +115,7 @@ sub execute {
 
     my $subgroups = $mod_group->subgroups($targets, 1);
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $bcc_to = PMG::Utils::subst_values_for_header($self->{target}, $vars);
 
diff --git a/src/PMG/RuleDB/Block.pm b/src/PMG/RuleDB/Block.pm
index c758787..25bb74e 100644
--- a/src/PMG/RuleDB/Block.pm
+++ b/src/PMG/RuleDB/Block.pm
@@ -89,7 +89,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     if ($msginfo->{testmode}) {
 	my $fh = $msginfo->{test_fh};
diff --git a/src/PMG/RuleDB/Disclaimer.pm b/src/PMG/RuleDB/Disclaimer.pm
index d3003b2..c6afe54 100644
--- a/src/PMG/RuleDB/Disclaimer.pm
+++ b/src/PMG/RuleDB/Disclaimer.pm
@@ -193,7 +193,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $subgroups = $mod_group->subgroups($targets);
 
diff --git a/src/PMG/RuleDB/Group.pm b/src/PMG/RuleDB/Group.pm
index 2508305..baa68ce 100644
--- a/src/PMG/RuleDB/Group.pm
+++ b/src/PMG/RuleDB/Group.pm
@@ -12,8 +12,8 @@ sub new {
     my ($type, $name, $info, $class) = @_;
 
     my $self = {
-	name => $name,
-	info => $info,
+	name => PMG::Utils::try_decode_utf8($name),
+	info => PMG::Utils::try_decode_utf8($info),
 	class => $class,
     };
 
diff --git a/src/PMG/RuleDB/MatchField.pm b/src/PMG/RuleDB/MatchField.pm
index 2671ea4..2b56058 100644
--- a/src/PMG/RuleDB/MatchField.pm
+++ b/src/PMG/RuleDB/MatchField.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 use MIME::Words;
 
 use PVE::SafeSyslog;
@@ -50,9 +51,10 @@ sub load_attr {
     defined($field) || die "undefined object attribute: ERROR";
     defined($field_value) || die "undefined object attribute: ERROR";
 
+    my $decoded_field_value = PMG::Utils::try_decode_utf8($field_value);
     # use known constructor, bless afterwards (because sub class can have constructor
     # with other parameter signature).
-    my $obj =  PMG::RuleDB::MatchField->new($field, $field_value, $ogroup);
+    my $obj =  PMG::RuleDB::MatchField->new($field, $decoded_field_value, $ogroup);
     bless $obj, $class;
 
     $obj->{id} = $id;
@@ -69,6 +71,7 @@ sub save {
 
     my $new_value = "$self->{field}:$self->{field_value}";
     $new_value =~ s/\\/\\\\/g;
+    $new_value = encode('UTF-8', $new_value);
 
     if (defined ($self->{id})) {
 	# update
@@ -105,7 +108,8 @@ sub parse_entity {
 	for my $value ($entity->head->get_all($self->{field})) {
 	    chomp $value;
 
-	    my $decvalue = MIME::Words::decode_mimewords($value);
+	    my $decvalue = PMG::Utils::decode_rfc1522($value);
+	    $decvalue = PMG::Utils::try_decode_utf8($decvalue);
 
 	    if ($decvalue =~ m|$self->{field_value}|i) {
 		push @$res, $id;
diff --git a/src/PMG/RuleDB/MatchFilename.pm b/src/PMG/RuleDB/MatchFilename.pm
index 7e5b486..c9cdbe0 100644
--- a/src/PMG/RuleDB/MatchFilename.pm
+++ b/src/PMG/RuleDB/MatchFilename.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 use MIME::Words;
 
 use PMG::Utils;
@@ -41,8 +42,9 @@ sub load_attr {
     my $class = ref($type) || $type;
 
     defined($value) || die "undefined value: ERROR";;
+    my $decvalue = PMG::Utils::try_decode_utf8($value);
 
-    my $obj = $class->new($value, $ogroup);
+    my $obj = $class->new($decvalue, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -57,6 +59,7 @@ sub save {
 
     my $new_value = $self->{fname};
     $new_value =~ s/\\/\\\\/g;
+    $new_value = encode('UTF-8', $new_value);
 
     if (defined($self->{id})) {
 	# update
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 34108d1..6232322 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -56,7 +56,9 @@ sub load_attr {
 
     (defined($field) && defined($field_value)) || return undef;
 
-    my $obj = $class->new($field, $field_value, $ogroup);
+    my $dec_field_value = PMG::Utils::try_decode_utf8($field_value);
+
+    my $obj = $class->new($field, $dec_field_value, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $field, $field_value, $ogroup);
@@ -69,7 +71,7 @@ sub save {
 
     defined($self->{ogroup}) || return undef;
 
-    my $new_value = "$self->{field}:$self->{field_value}";
+    my $new_value = encode('UTF-8', "$self->{field}:$self->{field_value}");
 
     if (defined ($self->{id})) {
 	# update
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index 7b38e0d..8a9945b 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -208,7 +208,7 @@ sub execute {
 
     my $from = 'postmaster';
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $body = PMG::Utils::subst_values($self->{body}, $vars);
     my $subject = PMG::Utils::subst_values_for_header($self->{subject}, $vars);
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 1426393..9d802fe 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 
@@ -89,7 +90,7 @@ sub execute {
     
     my $subgroups = $mod_group->subgroups($targets, 1);
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index 6b27b91..da6c25f 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -63,12 +63,14 @@ sub load_attr {
 
     defined ($value) || die "undefined value: ERROR";
 
-    my $obj;
+    my ($obj, $text);
 
     if ($value =~ m/^([01])\,([01])(\:(.*))?$/s) {
-	$obj = $class->new($1, $4, $ogroup, $2);
+	$text = PMG::Utils::try_decode_utf8($4);
+	$obj = $class->new($1, $text, $ogroup, $2);
     } elsif ($value =~ m/^([01])(\:(.*))?$/s) {
-	$obj = $class->new($1, $3, $ogroup);
+	$text = PMG::Utils::try_decode_utf8($3);
+	$obj = $class->new($1, $text, $ogroup);
     } else {
 	$obj = $class->new(0, undef, $ogroup);
     }
@@ -89,7 +91,7 @@ sub save {
     $value .= ','. ($self->{quarantine} ? '1' : '0');
 
     if ($self->{text}) {
-	$value .= ":$self->{text}";
+	$value .= encode('UTF-8', ":$self->{text}");
     }
 
     if (defined ($self->{id})) {
@@ -194,7 +196,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets,
 	$msginfo, $vars, $marks, $ldap) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     if (!$self->{all} && ($#$marks == -1)) {
 	# no marks
diff --git a/src/PMG/RuleDB/Rule.pm b/src/PMG/RuleDB/Rule.pm
index c49ad21..e7c9146 100644
--- a/src/PMG/RuleDB/Rule.pm
+++ b/src/PMG/RuleDB/Rule.pm
@@ -12,7 +12,7 @@ sub new {
     my ($type, $name, $priority, $active, $direction) = @_;
 
     my $self = { 
-	name => $name // '',
+	name => PMG::Utils::try_decode_utf8($name) // '',
 	priority => $priority // 0,
 	active => $active // 0,
     }; 
diff --git a/src/PMG/RuleDB/WhoRegex.pm b/src/PMG/RuleDB/WhoRegex.pm
index 37ec3aa..5c13604 100644
--- a/src/PMG/RuleDB/WhoRegex.pm
+++ b/src/PMG/RuleDB/WhoRegex.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 
 use PMG::Utils;
 use PMG::RuleDB::Object;
@@ -43,7 +44,8 @@ sub load_attr {
 
     defined($value) || die "undefined value: ERROR";
 
-    my $obj = $class->new ($value, $ogroup);
+    my $decoded_value = PMG::Utils::try_decode_utf8($value);
+    my $obj = $class->new ($decoded_value, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -59,6 +61,7 @@ sub save {
 
     my $adr = $self->{address};
     $adr =~ s/\\/\\\\/g;
+    $adr = encode('UTF-8', $adr);
 
     if (defined ($self->{id})) {
 	# update
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (2 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

as done in 114655f4fdb07c789a361b2f397f5345eafd16c6 for Accept and
Block.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/BCC.pm        | 19 +++++++++++++++++--
 src/PMG/RuleDB/Notify.pm     | 18 ++++++++++++++++--
 src/PMG/RuleDB/Quarantine.pm | 16 ++++++++++++++--
 src/PMG/RuleDB/Remove.pm     |  8 +++++++-
 4 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index 6244dd9..0f016f8 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -3,6 +3,7 @@ package PMG::RuleDB::BCC;
 use strict;
 use warnings;
 use DBI;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 
@@ -164,10 +165,24 @@ sub execute {
 		$entity, $msginfo->{sender}, \@bcc_targets,
 		$msginfo->{xforward}, $msginfo->{fqdn}, $param);
 	    foreach (@bcc_targets) {
+		my $target = encode('UTF-8', $_);
 		if ($qid) {
-		    syslog('info', "%s: bcc to <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+		    syslog(
+			'info',
+			"%s: bcc to <%s> (rule: %s, %s)",
+			$queue->{logid},
+			$target,
+			$rulename,
+			$qid,
+		    );
 		} else {
-		    syslog('err', "%s: bcc to <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+		    syslog(
+			'err',
+			"%s: bcc to <%s> (rule: %s) failed",
+			$queue->{logid},
+			$target,
+			$rulename,
+		    );
 		}
 	    }
 	}
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index 8a9945b..68f9b4e 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -259,10 +259,24 @@ sub execute {
 	my $qid = PMG::Utils::reinject_mail(
 	    $top, $from, \@targets, undef, $msginfo->{fqdn});
 	foreach (@targets) {
+	    my $target = encode('UTF-8', $_);
 	    if ($qid) {
-		syslog('info', "%s: notify <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+		syslog(
+		    'info',
+		    "%s: notify <%s> (rule: %s, %s)",
+		    $queue->{logid},
+		    $target,
+		    $rulename,
+		    $qid,
+		);
 	    } else {
-		syslog ('err', "%s: notify <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+		syslog (
+		    'err',
+		    "%s: notify <%s> (rule: %s) failed",
+		    $queue->{logid},
+		    $target,
+		    $rulename,
+		);
 	    }
 	}
     }
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 9d802fe..0fc8352 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -101,7 +101,13 @@ sub execute {
 	    if (my $qid = $queue->quarantine_mail($ruledb, 'V', $entity, $tg, $msginfo, $vars, $ldap)) {
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 
 		$queue->set_status ($tg, 'delivered');
@@ -111,7 +117,13 @@ sub execute {
 	    if (my $qid = $queue->quarantine_mail($ruledb, 'S', $entity, $tg, $msginfo, $vars, $ldap)) {
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 
 		$queue->set_status($tg, 'delivered');
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index da6c25f..e7c353c 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -235,7 +235,13 @@ sub execute {
 		}
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 	    }
 	}
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (3 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

the envelope addresses are used in the rule-system for lookups and
statistics. When the mail is received with smtputf8 the addresses are
decoded (multi-byte perl-strings) and thus need encoding before using
them as parameter in a database query.

This patch encodes the addresses as utf-8 for the relevant queries
unconditionally, because envelope-senders should either be:
* (a subset of) ascii (no smtputf8) - which is invariant for utf-8
  encoding
* valid utf-8 (smtputf8)

The patch does not address the issues with multi-byte addresses in our
LDAP-implementation (hence the partial fix), but should still be an
improvement for many deployments.
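
As a rough illustration of why the unconditional encoding described above
should be safe, here is a minimal sketch. It uses only the core Encode
module; the two addresses are hypothetical examples and are not taken from
the patch.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical example addresses (assumptions for illustration only).
my $ascii_addr    = "user\@example.com";
my $smtputf8_addr = "m\x{fc}ller\@example.com";   # decoded perl string

# Encode both unconditionally, as the patch does before database queries.
my $ascii_bytes    = encode('UTF-8', $ascii_addr);
my $smtputf8_bytes = encode('UTF-8', $smtputf8_addr);

# ASCII is invariant under UTF-8 encoding, so nothing changes here.
print "ascii address unchanged\n" if $ascii_bytes eq $ascii_addr;

# The non-ascii address gets longer: U+00FC is one character but two bytes.
printf "chars=%d bytes=%d\n", length($smtputf8_addr), length($smtputf8_bytes);
```

The point of the sketch is only that a pure-ascii address comes out
byte-for-byte identical, so no check for smtputf8 is needed first.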

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/MailQueue.pm    | 10 ++++++----
 src/PMG/RuleDB/Spam.pm  |  5 +++--
 src/bin/pmg-smtp-filter |  5 +++--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/PMG/MailQueue.pm b/src/PMG/MailQueue.pm
index 2841b07..8355c30 100644
--- a/src/PMG/MailQueue.pm
+++ b/src/PMG/MailQueue.pm
@@ -6,6 +6,7 @@ use warnings;
 use PVE::SafeSyslog;
 use MIME::Parser;
 use IO::File;
+use Encode;
 use File::Sync;
 use File::Basename;
 use File::Path;
@@ -141,6 +142,7 @@ sub quarantinedb_insert {
     my ($self, $ruledb, $lcid, $ldap, $qtype, $header, $sender, $file, $targets, $vars) = @_;
 
     eval {
+	$sender = encode('UTF-8', $sender);
 	my $dbh = $ruledb->{dbh};
 
 	my $insert_cmds = "SELECT nextval ('cmailstore_id_seq'); INSERT INTO CMailStore " .
@@ -188,11 +190,11 @@ sub quarantinedb_insert {
 	    if ($pmail eq lc ($r)) {
 		$receiver = "NULL";
 	    } else {
-		$receiver = $dbh->quote ($r);
+		$receiver = $dbh->quote (encode('UTF-8', $r));
 	    }
 
 
-	    $pmail = $dbh->quote ($pmail);
+	    $pmail = $dbh->quote (encode('UTF-8', $pmail));
 	    $insert_cmds .= "INSERT INTO CMSReceivers " .
 		"(CMailStore_CID, CMailStore_RID, PMail, Receiver, TicketID, Status, MTime) " .
 		"VALUES ($lcid, currval ('cmailstore_id_seq'), $pmail, $receiver, $tid, 'N', $now); ";
@@ -294,8 +296,8 @@ sub quarantine_mail {
 	$entity->head->delete ('Return-Path');
 
 	# prepend Delivered-To and Return-Path (like QMAIL MAILDIR FORMAT)
-	$entity->head->add ('Return-Path', join (',', $sender), 0);
-	$entity->head->add ('Delivered-To', join (',', @$tg), 0);
+	$entity->head->add ('Return-Path', encode('UTF-8', join (',', $sender)), 0);
+	$entity->head->add ('Delivered-To', encode('UTF-8', join (',', @$tg)), 0);
 
 	$entity->print ($fh);
 
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index cc9a347..99056a3 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 use Time::HiRes qw (gettimeofday);
 
 use PVE::SafeSyslog;
@@ -135,8 +136,8 @@ sub get_blackwhite {
     my $cond = '';
     foreach my $r (@$targets) {
 	my $pmail = $msginfo->{pmail}->{$r} || lc ($r);
-	my $qr = $dbh->quote ($pmail);
-	$cond .= " OR " if $cond;  
+	my $qr = $dbh->quote (encode('UTF-8', $pmail));
+	$cond .= " OR " if $cond;
 	$cond .= "pmail = $qr";
     }	 
 
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 45e68a7..911e9cd 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 
 use Carp;
+use Encode qw(encode);
 use Getopt::Long;
 use Time::HiRes qw (usleep gettimeofday tv_interval);
 use POSIX qw(:sys_wait_h errno_h signal_h);
@@ -791,10 +792,10 @@ sub handle_smtp {
 	$insert_cmds .= ($queue->{sa_score} || 0) . ',';
 	$insert_cmds .= $dbh->quote($queue->{vinfo}) . ',';
 	$insert_cmds .= $time_total . ',';
-	$insert_cmds .= $dbh->quote($msginfo->{sender}) . ');';
+	$insert_cmds .= $dbh->quote(encode('UTF-8', $msginfo->{sender})) . ');';
 
 	foreach my $r (@{$msginfo->{targets}}) {
-	    my $tmp = $dbh->quote($r);
+	    my $tmp = $dbh->quote(encode('UTF-8',$r));
 	    my $blocked = $queue->{status}->{$r} eq 'blocked' ? 1 : 0;
 	    $insert_cmds .= "INSERT INTO CReceivers (CStatistic_CID, CStatistic_RID, Receiver, Blocked) " .
 		"VALUES ($lcid, currval ('cstatistic_id_seq'), $tmp, '$blocked'); ";
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (4 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:15   ` Dominik Csapak
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/API2/Quarantine.pm | 10 +++++-----
 src/PMG/HTMLMail.pm        |  7 ++++---
 src/PMG/Quarantine.pm      | 13 +++++++------
 src/PMG/RuleDB/Spam.pm     | 12 ++++++------
 4 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/src/PMG/API2/Quarantine.pm b/src/PMG/API2/Quarantine.pm
index ddf7c04..819c78c 100644
--- a/src/PMG/API2/Quarantine.pm
+++ b/src/PMG/API2/Quarantine.pm
@@ -141,8 +141,8 @@ my $parse_header_info = sub {
     my $sender = PMG::Utils::decode_rfc1522(PVE::Tools::trim($head->get('sender')));
     $res->{sender} = $sender if $sender && ($sender ne $res->{from});
 
-    $res->{envelope_sender} = $ref->{sender};
-    $res->{receiver} = $ref->{receiver} // $ref->{pmail};
+    $res->{envelope_sender} = PMG::Utils::try_decode_utf8($ref->{sender});
+    $res->{receiver} = PMG::Utils::try_decode_utf8($ref->{receiver} // $ref->{pmail});
     $res->{id} = 'C' . $ref->{cid} . 'R' . $ref->{rid} . 'T' . $ref->{ticketid};
     $res->{time} = $ref->{time};
     $res->{bytes} = $ref->{bytes};
@@ -437,7 +437,7 @@ __PACKAGE__->register_method ({
 	$sth->execute();
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    push @$res, { mail => $ref->{pmail} };
+	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
 	}
 
 	return $res;
@@ -532,7 +532,7 @@ __PACKAGE__->register_method ({
 	}
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    push @$res, { mail => $ref->{pmail} };
+	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
 	}
 
 	return $res;
@@ -569,7 +569,7 @@ my $quarantine_api = sub {
     }
 
     if ($check_pmail || $role eq 'quser') {
-	$sth->execute($pmail);
+	$sth->execute(encode('UTF-8', $pmail));
     } else {
 	$sth->execute();
     }
diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
index 87f5c40..207c52c 100644
--- a/src/PMG/HTMLMail.pm
+++ b/src/PMG/HTMLMail.pm
@@ -192,9 +192,10 @@ sub read_raw_email {
     # read header
     my $header;
     while (defined(my $line = <$fh>)) {
-	$raw_header .= $line;
-	chomp $line;
-	push @$header, $line;
+	my $decoded_line = PMG::Utils::try_decode_utf8($line);
+	$raw_header .= $decoded_line;
+	chomp $decoded_line;
+	push @$header, $decoded_line;
 	last if $line =~ m/^\s*$/;
     }
 
diff --git a/src/PMG/Quarantine.pm b/src/PMG/Quarantine.pm
index 77af8cc..aa6b948 100644
--- a/src/PMG/Quarantine.pm
+++ b/src/PMG/Quarantine.pm
@@ -3,6 +3,7 @@ package PMG::Quarantine;
 use strict;
 use warnings;
 use Net::SMTP;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 use PVE::Tools;
@@ -16,7 +17,7 @@ sub add_to_blackwhite {
 
     my $name = $listname eq 'BL' ? 'BL' : 'WL';
     my $oname = $listname eq 'BL' ? 'WL' : 'BL';
-    my $qu = $dbh->quote ($username);
+    my $qu = $dbh->quote (encode('UTF-8', $username));
 
     my $sth = $dbh->prepare(
 	"SELECT * FROM UserPrefs WHERE pmail = $qu AND (Name = 'BL' OR Name = 'WL')");
@@ -25,13 +26,13 @@ sub add_to_blackwhite {
     my $list = { 'WL' => {}, 'BL' => {} };
 
     while (my $ref = $sth->fetchrow_hashref()) {
-	my $data = $ref->{data};
+	my $data = PMG::Utils::try_decode_utf8($ref->{data});
 	$data =~ s/[,;]/ /g;
 	my @alist = split('\s+', $data);
 
 	my $tmp = {};
 	foreach my $a (@alist) {
-	    if ($a =~ m/^[[:ascii:]]+$/) {
+	    if ($a =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/) {
 		$tmp->{$a} = 1;
 	    }
 	}
@@ -50,7 +51,7 @@ sub add_to_blackwhite {
 	    if ($delete) {
 		delete($list->{$name}->{$v});
 	    } else {
-		if ($v =~ m/[[:^ascii:]]/) {
+		if ($v =~ m/[\s\\]/) {
 		    die "email address '$v' contains invalid characters\n";
 		}
 		$list->{$name}->{$v} = 1;
@@ -58,8 +59,8 @@ sub add_to_blackwhite {
 	    }
 	}
 
-	my $wlist = $dbh->quote(join (',', keys %{$list->{WL}}) || '');
-	my $blist = $dbh->quote(join (',', keys %{$list->{BL}}) || '');
+	my $wlist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{WL}})) || '');
+	my $blist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{BL}})) || '');
 
 	if (!$delete) {
 	    my $maxlen = 200000;
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index 99056a3..bc1d422 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -94,7 +94,7 @@ sub parse_addrlist {
 	my $regex = $addr;
 	# SA like checks
 	$regex =~ s/[\000\\\(]/_/gs;		# is this really necessasry ?
-	$regex =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g;	# escape possible metachars
+	$regex =~ s/([^\*\?_\w])/\\$1/g;	# escape possible metachars
 	$regex =~ tr/?/./;			# replace "?" with "."
 	$regex =~ s/\*+/\.\*/g;			# replace "*" with  ".*"
 
@@ -149,13 +149,13 @@ sub get_blackwhite {
 	$sth->execute();
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    my $pmail = lc ($ref->{pmail});
+	    my $pmail = lc (PMG::Utils::try_decode_utf8($ref->{pmail}));
 	    if ($ref->{name} eq 'WL') {
 		$target_info->{$pmail}->{whitelist} = 
-		    parse_addrlist($ref->{data});
+		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
 	    } elsif ($ref->{name} eq 'BL') {
 		$target_info->{$pmail}->{blacklist} = 
-		    parse_addrlist($ref->{data});
+		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
 	    }
 	}
 
@@ -205,7 +205,7 @@ sub what_match_targets {
 		($list = $queue->{blackwhite}->{$pmail}->{whitelist}) &&
 		check_addrlist($list, $queue->{all_from_addrs})) {
 		syslog('info', "%s: sender in user (%s) whitelist", 
-		       $queue->{logid}, $pmail);
+		       $queue->{logid}, encode('UTF-8', $pmail));
 	    } else {
 		$target_info->{$t}->{marks} = []; # never add additional marks here
 		$target_info->{$t}->{spaminfo} = $info;
@@ -234,7 +234,7 @@ sub what_match_targets {
 		$target_info->{$t}->{marks} = [];
 		$target_info->{$t}->{spaminfo} = $info;
 		syslog ('info', "%s: sender in user (%s) blacklist", 
-			$queue->{logid}, $pmail);
+			$queue->{logid}, encode('UTF-8',$pmail));
 	    }
 	}
     }
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (5 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:20   ` Dominik Csapak
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

$data->{pmail} is both used in the template rendering ('Spam Report for
$pmail'), and as content for the To header, which need different
treatment. Thus introduce 'pmail_raw' additionally.
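
A minimal sketch of the two treatments, assuming the database hands back
UTF-8 bytes. It calls Encode::decode directly instead of
PMG::Utils::try_decode_utf8, and the address is a hypothetical example; the
variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical raw value as it might come back from the database (bytes).
my $pmail_raw = "m\xc3\xbcller\@example.com";

# For the HTML template: decode to characters first, then escape for HTML.
my $pmail_html = encode_entities(decode('UTF-8', $pmail_raw));

# For the To header: keep the undecoded bytes untouched.
my $to_header = $pmail_raw;

print "template value: $pmail_html\n";
printf "To header length in bytes: %d\n", length($to_header);
```

Escaping the raw bytes directly would turn each byte of the multi-byte
sequence into its own entity, which is presumably why the decode has to
happen first for the template value.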

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/CLI/pmgqm.pm | 24 +++++++++++++-----------
 src/PMG/Utils.pm     |  7 ++++---
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/src/PMG/CLI/pmgqm.pm b/src/PMG/CLI/pmgqm.pm
index dbec8ef..7293579 100755
--- a/src/PMG/CLI/pmgqm.pm
+++ b/src/PMG/CLI/pmgqm.pm
@@ -2,6 +2,7 @@ package PMG::CLI::pmgqm;
 
 use strict;
 use Data::Dumper;
+use Encode qw(encode);
 use Template;
 use MIME::Entity;
 use HTML::Entities;
@@ -17,6 +18,7 @@ use PVE::SafeSyslog;
 use PVE::Tools;
 use PVE::INotify;
 use PVE::CLIHandler;
+use PVE::JSONSchema qw(get_standard_option);
 
 use PMG::RESTEnvironment;
 use PMG::Utils;
@@ -57,7 +59,7 @@ sub get_item_data {
     }
 
     $item->{envelope_sender} = $ref->{sender};
-    $item->{pmail} = $ref->{pmail};
+    $item->{pmail} = encode_entities(PMG::Utils::try_decode_utf8($ref->{pmail}));
     $item->{receiver} = $ref->{receiver} || $ref->{pmail};
 
     $item->{date} = strftime("%F", localtime($ref->{time}));
@@ -157,11 +159,10 @@ __PACKAGE__->register_method ({
     parameters => {
 	additionalProperties => 0,
 	properties => {
-	    receiver => {
+	    receiver => get_standard_option('pmg-email-address', {
 		description => "Generate report for a single email address. If not specified, generate reports for all users.",
-		type => 'string', format => 'email',
 		optional => 1,
-	    },
+	    }),
 	    timespan => {
 		description => "Select time span.",
 		type => 'string',
@@ -175,11 +176,10 @@ __PACKAGE__->register_method ({
 		enum => ['short', 'verbose', 'custom'],
 		optional => 1,
 	    },
-	    redirect => {
+	    redirect => get_standard_option('pmg-email-address', {
 		description => "Redirect spam report email to this address.",
-		type => 'string', format => 'email',
 		optional => 1,
-	    },
+	    }),
 	    debug => {
 		description => "Debug mode. Print raw email to stdout instead of sending them.",
 		type => 'boolean',
@@ -280,7 +280,7 @@ __PACKAGE__->register_method ({
 	    "ORDER BY pmail, time, receiver");
 
 	if ($target) {
-	    $sth->execute($target);
+	    $sth->execute(encode('UTF-8', $target));
 	} else {
 	    $sth->execute();
 	}
@@ -302,16 +302,18 @@ __PACKAGE__->register_method ({
 	};
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    if ($creceiver ne $ref->{pmail}) {
+	    my $decoded_pmail = PMG::Utils::try_decode_utf8($ref->{pmail});
+	    if ($creceiver ne $decoded_pmail) {
 
 		$finalize->() if $data;
 
 		$data = clone($global_data);
 
-		$creceiver = $ref->{pmail};
+		$creceiver = $decoded_pmail;
 		$mailcount = 0;
 
-		$data->{pmail} = $creceiver;
+		$data->{pmail} = encode_entities($decoded_pmail);
+		$data->{pmail_raw} = $ref->{pmail};
 		$data->{managehref} = "$protocol_fqdn_port/quarantine";
 		if ($data->{authmode} ne 'ldap') {
 		    $data->{ticket} = PMG::Ticket::assemble_quarantine_ticket($data->{pmail});
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cc30e67..5c9e873 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1143,12 +1143,13 @@ sub rfc1522_to_html {
 	    my ($d, $cs) = @$r;
 	    if ($d) {
 		if ($cs) {
-		    $res .= encode_entities(decode($cs, $d));
+		    $res .= encode('UTF-8', decode($cs, $d));
 		} else {
-		    $res .= encode_entities($d);
+		    $res .= $d;
 		}
 	    }
 	}
+	$res = encode_entities(decode('UTF-8', $res));
     };
 
     $res = $enc if $@;
@@ -1257,7 +1258,7 @@ sub finalize_report {
 
     my $top = MIME::Entity->build(
 	Type    => "multipart/related",
-	To      => $data->{pmail},
+	To      => $data->{pmail_raw},
 	From    => $mailfrom,
 	Subject => bencode_header(decode_entities($title)));
 
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (6 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:26   ` Dominik Csapak
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/Statistic.pm | 67 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 17 deletions(-)

diff --git a/src/PMG/Statistic.pm b/src/PMG/Statistic.pm
index 6d27930..96ef61d 100755
--- a/src/PMG/Statistic.pm
+++ b/src/PMG/Statistic.pm
@@ -3,6 +3,7 @@ package PMG::Statistic;
 use strict;
 use warnings;
 use DBI;
+use Encode qw(encode);
 use Time::Local;
 use Time::Zone;
 
@@ -545,6 +546,22 @@ my $compute_sql_orderby = sub {
     return $orderby;
 };
 
+sub user_stat_to_perlstring {
+    my ($entry) = @_;
+
+    my $res = { };
+
+    for my $a (keys %$entry) {
+	if ($a eq 'receiver' || $a eq 'sender' || $a eq 'contact') {
+	    $res->{$a} = PMG::Utils::try_decode_utf8($entry->{$a});
+	} else {
+	    $res->{$a} = $entry->{$a};
+	}
+    }
+
+    return $res;
+}
+
 sub user_stat_contact_details {
     my ($self, $rdb, $receiver, $limit, $sorters, $filter) = @_;
 
@@ -554,19 +571,21 @@ sub user_stat_contact_details {
 
     my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
 
+    my $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%"));
+
     my $query = "SELECT * FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail " .
 	"AND NOT direction AND sender != '' AND receiver = ? " .
-	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
 	"ORDER BY $orderby limit $limit";
 
     my $sth = $rdb->{dbh}->prepare($query);
 
-    $sth->execute($receiver);
+    $sth->execute(encode('UTF-8',$receiver));
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -583,11 +602,14 @@ sub user_stat_contact {
 
     my $cond_good_mail = $self->query_cond_good_mail($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $query = "SELECT receiver as contact, count(*) AS count, sum (bytes) AS bytes, " .
 	"count (virusinfo) as viruscount " .
 	"FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid " .
-	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
 	"AND $cond_good_mail AND NOT direction AND sender != '' ";
 
     if ($advfilter) {
@@ -603,7 +625,7 @@ sub user_stat_contact {
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -620,20 +642,23 @@ sub user_stat_sender_details {
 
     my $cond_good_mail = $self->query_cond_good_mail($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $sth = $rdb->{dbh}->prepare(
 	"SELECT " .
 	"blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
 	"FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND " .
 	"$cond_good_mail AND NOT direction AND sender = ? " .
-	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
 	"ORDER BY $orderby limit $limit");
 
-    $sth->execute($sender);
+    $sth->execute(encode('UTF-8',$sender));
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -650,11 +675,14 @@ sub user_stat_sender {
 
     my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $query = "SELECT sender,count(*) AS count, sum (bytes) AS bytes, " .
 	"count (virusinfo) as viruscount, " .
 	"count (CASE WHEN spamlevel >= 3 THEN 1 ELSE NULL END) as spamcount " .
 	"FROM CStatistic WHERE $cond_good_mail AND NOT direction AND sender != '' " .
-	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
 	"GROUP BY sender ORDER BY $orderby limit $limit";
 
     my $sth = $rdb->{dbh}->prepare($query);
@@ -662,7 +690,7 @@ sub user_stat_sender {
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -679,18 +707,21 @@ sub user_stat_receiver_details {
 
     my $cond_good_mail = $self->query_cond_good_mail($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $sth = $rdb->{dbh}->prepare(
 	"SELECT blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
 	"FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail AND receiver = ? " .
-	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
 	"ORDER BY $orderby limit $limit");
 
-    $sth->execute($receiver);
+    $sth->execute(encode('UTF-8',$receiver));
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -708,6 +739,9 @@ sub user_stat_receiver {
     my $cond_good_mail = $self->query_cond_good_mail ($from, $to) . " AND " .
 	"receiver IS NOT NULL AND receiver != ''";
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $query = "SELECT receiver, " .
 	"count(*) AS count, " .
 	"sum (bytes) AS bytes, " .
@@ -728,7 +762,7 @@ sub user_stat_receiver {
     }
 
     $query .= "AND $cond_good_mail and direction " .
-	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
 	"GROUP BY receiver ORDER BY $orderby LIMIT $limit";
 
     my $sth = $rdb->{dbh}->prepare($query);
@@ -736,7 +770,7 @@ sub user_stat_receiver {
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -873,9 +907,8 @@ sub recent_receivers {
     my $sth =  $rdb->{dbh}->prepare($cmd);
 
     $sth->execute ($from, $limit);
-
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
     $sth->finish();
 
-- 
2.30.2






* [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (7 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

matching the pattern in the backend (allowing most characters inside
of e-mail addresses).

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 js/UserBlackWhiteList.js | 2 +-
 js/Utils.js              | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/js/UserBlackWhiteList.js b/js/UserBlackWhiteList.js
index 4f4a756..44d75b3 100644
--- a/js/UserBlackWhiteList.js
+++ b/js/UserBlackWhiteList.js
@@ -127,7 +127,7 @@ Ext.define('PMG.UserBlackWhiteList', {
 	{
 	    xtype: 'combobox',
 	    displayField: 'mail',
-	    vtype: 'email',
+	    vtype: 'proxmoxMail',
 	    allowBlank: false,
 	    valueField: 'mail',
 	    store: {
diff --git a/js/Utils.js b/js/Utils.js
index dc924d2..7fa154e 100644
--- a/js/Utils.js
+++ b/js/Utils.js
@@ -898,3 +898,12 @@ Ext.define('PMG.Async', {
 	);
     },
 });
+
+// custom Vtypes
+Ext.apply(Ext.form.field.VTypes, {
+    // matches the pmg-email-address in pmg-api
+    PMGMail: function(v) {
+	return (/[^\s\\@]+@[^\s/\\@]+/).test(v);
+    },
+    PMGMailText: gettext('Example') + ": user@example.com",
+});
-- 
2.30.2






* [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (8 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
  2022-11-26  7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

to be able to add entries to the user lists for non-ascii addresses

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 js/UserBlackWhiteList.js | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/js/UserBlackWhiteList.js b/js/UserBlackWhiteList.js
index 44d75b3..1344496 100644
--- a/js/UserBlackWhiteList.js
+++ b/js/UserBlackWhiteList.js
@@ -127,7 +127,7 @@ Ext.define('PMG.UserBlackWhiteList', {
 	{
 	    xtype: 'combobox',
 	    displayField: 'mail',
-	    vtype: 'proxmoxMail',
+	    vtype: 'PMGMail',
 	    allowBlank: false,
 	    valueField: 'mail',
 	    store: {
-- 
2.30.2






* Re: [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (9 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
@ 2022-11-23 14:09 ` Dominik Csapak
  2022-11-26  7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
  11 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:09 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

all in all works mostly well,
tested various weird emails with various rules
that include emojis/non-ascii characters

(weird mails as in a mix of smtputf8, mixed charsets and quoted-printable fields
with mixed encodings, with and without non-ascii characters in the
sender/recipient)

things that did not work and need to be fixed if we want to apply this:

* LDAP, you mentioned it, but it fails in a really non-obvious way
   and drops mails currently
* user wl/bl from the quarantine interface
   (some en/decode is missing, and garbage reaches the user lists)

things that worked in my tests:

* sending emails (with/without smtputf8)
* quarantining mails
* notification/modify/header/disclaimer/etc. with non-ascii characters
* various what/who objects with non-ascii characters
* greylisting with non-ascii characters in sender/recipient
* modifying user wl/bl
* matching user wl/bl
* log tracker
* statistics

i did find some things to note in the individual patches, i'll answer there





* Re: [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
@ 2022-11-23 14:15   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:15 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

i'd like to have some rationale for the changes in the commit message
at least for the more non-obvious ones (regex changes for example)

comments inline

On 11/23/22 10:23, Stoiko Ivanov wrote:
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/API2/Quarantine.pm | 10 +++++-----
>   src/PMG/HTMLMail.pm        |  7 ++++---
>   src/PMG/Quarantine.pm      | 13 +++++++------
>   src/PMG/RuleDB/Spam.pm     | 12 ++++++------
>   4 files changed, 22 insertions(+), 20 deletions(-)
> 
> diff --git a/src/PMG/API2/Quarantine.pm b/src/PMG/API2/Quarantine.pm
> index ddf7c04..819c78c 100644
> --- a/src/PMG/API2/Quarantine.pm
> +++ b/src/PMG/API2/Quarantine.pm
> @@ -141,8 +141,8 @@ my $parse_header_info = sub {
>       my $sender = PMG::Utils::decode_rfc1522(PVE::Tools::trim($head->get('sender')));
>       $res->{sender} = $sender if $sender && ($sender ne $res->{from});
>   
> -    $res->{envelope_sender} = $ref->{sender};
> -    $res->{receiver} = $ref->{receiver} // $ref->{pmail};
> +    $res->{envelope_sender} = PMG::Utils::try_decode_utf8($ref->{sender});
> +    $res->{receiver} = PMG::Utils::try_decode_utf8($ref->{receiver} // $ref->{pmail});

maybe we should note here in a comment that these are not headers
but part of the smtp dialog and cannot be quoted-printable/base64 encoded?
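
To make the point concrete: since these values come from the smtp dialog,
a plain UTF-8 decode should be all that is needed, with no RFC 2047 style
decoding. The helper below is a hypothetical stand-in written for this
note, not the actual PMG::Utils::try_decode_utf8; the real helper may
differ in detail.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in (assumption): decode strictly as UTF-8 and fall
# back to the unchanged input if the data is not valid UTF-8.
sub try_decode_utf8_sketch {
    my ($data) = @_;
    return $data if !defined($data);
    my $copy = $data;   # decode() with a CHECK value may modify its argument
    my $decoded = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return defined($decoded) ? $decoded : $data;
}

my $from_db = "m\xc3\xbcller\@example.com";   # UTF-8 bytes, as stored
my $latin1  = "m\xfcller\@example.com";       # not valid UTF-8

# Valid UTF-8 is decoded to characters, invalid input is passed through.
printf "decoded length: %d\n", length(try_decode_utf8_sketch($from_db));
printf "fallback equals input: %s\n",
    (try_decode_utf8_sketch($latin1) eq $latin1 ? 'yes' : 'no');
```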

>       $res->{id} = 'C' . $ref->{cid} . 'R' . $ref->{rid} . 'T' . $ref->{ticketid};
>       $res->{time} = $ref->{time};
>       $res->{bytes} = $ref->{bytes};
> @@ -437,7 +437,7 @@ __PACKAGE__->register_method ({
>   	$sth->execute();
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    push @$res, { mail => $ref->{pmail} };
> +	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
>   	}
>   
>   	return $res;
> @@ -532,7 +532,7 @@ __PACKAGE__->register_method ({
>   	}
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    push @$res, { mail => $ref->{pmail} };
> +	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
>   	}
>   
>   	return $res;
> @@ -569,7 +569,7 @@ my $quarantine_api = sub {
>       }
>   
>       if ($check_pmail || $role eq 'quser') {
> -	$sth->execute($pmail);
> +	$sth->execute(encode('UTF-8', $pmail));
>       } else {
>   	$sth->execute();
>       }
> diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
> index 87f5c40..207c52c 100644
> --- a/src/PMG/HTMLMail.pm
> +++ b/src/PMG/HTMLMail.pm
> @@ -192,9 +192,10 @@ sub read_raw_email {
>       # read header
>       my $header;
>       while (defined(my $line = <$fh>)) {
> -	$raw_header .= $line;
> -	chomp $line;
> -	push @$header, $line;
> +	my $decoded_line = PMG::Utils::try_decode_utf8($line);
> +	$raw_header .= $decoded_line;
> +	chomp $decoded_line;
> +	push @$header, $decoded_line;
>   	last if $line =~ m/^\s*$/;
>       }
>   
> diff --git a/src/PMG/Quarantine.pm b/src/PMG/Quarantine.pm
> index 77af8cc..aa6b948 100644
> --- a/src/PMG/Quarantine.pm
> +++ b/src/PMG/Quarantine.pm
> @@ -3,6 +3,7 @@ package PMG::Quarantine;
>   use strict;
>   use warnings;
>   use Net::SMTP;
> +use Encode qw(encode);
>   
>   use PVE::SafeSyslog;
>   use PVE::Tools;
> @@ -16,7 +17,7 @@ sub add_to_blackwhite {
>   
>       my $name = $listname eq 'BL' ? 'BL' : 'WL';
>       my $oname = $listname eq 'BL' ? 'WL' : 'BL';
> -    my $qu = $dbh->quote ($username);
> +    my $qu = $dbh->quote (encode('UTF-8', $username));
>   
>       my $sth = $dbh->prepare(
>   	"SELECT * FROM UserPrefs WHERE pmail = $qu AND (Name = 'BL' OR Name = 'WL')");
> @@ -25,13 +26,13 @@ sub add_to_blackwhite {
>       my $list = { 'WL' => {}, 'BL' => {} };
>   
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	my $data = $ref->{data};
> +	my $data = PMG::Utils::try_decode_utf8($ref->{data});
>   	$data =~ s/[,;]/ /g;
>   	my @alist = split('\s+', $data);
>   
>   	my $tmp = {};
>   	foreach my $a (@alist) {
> -	    if ($a =~ m/^[[:ascii:]]+$/) {
> +	    if ($a =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/) {

that change seems a bit dangerous; maybe we should at least
filter out control characters here?
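
One possible shape for such a filter, as a sketch only: is_acceptable_entry is a hypothetical helper that is not part of the patch, and the sample addresses are made up. It keeps the pattern proposed above and additionally rejects anything containing a control character:

```perl
use strict;
use warnings;

# Hypothetical helper (not in the patch): the proposed pattern plus an
# up-front rejection of control characters such as BEL, ESC or DEL.
sub is_acceptable_entry {
    my ($entry) = @_;
    return 0 if $entry =~ m/[[:cntrl:]]/;
    return $entry =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/ ? 1 : 0;
}

for my $candidate ("j\x{fc}rgen\@example.com", "evil\x{07}\@example.com") {
    my $verdict = is_acceptable_entry($candidate) ? 'accepted' : 'rejected';
    print "$verdict\n";
}
```

The same control-character check could also guard the 'invalid characters' test further down in add_to_blackwhite.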

>   		$tmp->{$a} = 1;
>   	    }
>   	}
> @@ -50,7 +51,7 @@ sub add_to_blackwhite {
>   	    if ($delete) {
>   		delete($list->{$name}->{$v});
>   	    } else {
> -		if ($v =~ m/[[:^ascii:]]/) {
> +		if ($v =~ m/[\s\\]/) {

same here: going from 'non-ascii is forbidden' to 'only whitespace and \ are forbidden'
is a bit broad imho

>   		    die "email address '$v' contains invalid characters\n";
>   		}
>   		$list->{$name}->{$v} = 1;
> @@ -58,8 +59,8 @@ sub add_to_blackwhite {
>   	    }
>   	}
>   
> -	my $wlist = $dbh->quote(join (',', keys %{$list->{WL}}) || '');
> -	my $blist = $dbh->quote(join (',', keys %{$list->{BL}}) || '');
> +	my $wlist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{WL}})) || '');
> +	my $blist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{BL}})) || '');
>   
>   	if (!$delete) {
>   	    my $maxlen = 200000;
> diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
> index 99056a3..bc1d422 100644
> --- a/src/PMG/RuleDB/Spam.pm
> +++ b/src/PMG/RuleDB/Spam.pm
> @@ -94,7 +94,7 @@ sub parse_addrlist {
>   	my $regex = $addr;
>   	# SA like checks
>   	$regex =~ s/[\000\\\(]/_/gs;		# is this really necessasry ?
> -	$regex =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g;	# escape possible metachars
> +	$regex =~ s/([^\*\?_\w])/\\$1/g;	# escape possible metachars

what does \w include here beyond a-zA-Z0-9?
(a short explanation in the commit message would be enough imo)
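
For reference, on a decoded (character) string \w follows Unicode rules, so besides a-zA-Z0-9 and '_' it also matches letters, digits, combining marks and connector punctuation from other scripts. A small sketch with a made-up address shows the difference in escaping:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Decoded address, as parse_addrlist would see it after try_decode_utf8
# (the sample value is made up for illustration).
my $addr = decode('UTF-8', "j\xc3\xbcrgen*\@example.com");

(my $old = $addr) =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g;    # previous character class
(my $new = $addr) =~ s/([^\*\?_\w])/\\$1/g;           # class from the patch

# Encode again for output, since STDOUT expects bytes.
print encode('UTF-8', "old: $old\n");
print encode('UTF-8', "new: $new\n");
```

With the old class the non-ascii letter gets a backslash in front of it; with \w it is left alone, while '@' and '.' are still escaped in both cases.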

>   	$regex =~ tr/?/./;			# replace "?" with "."
>   	$regex =~ s/\*+/\.\*/g;			# replace "*" with  ".*"
>   
> @@ -149,13 +149,13 @@ sub get_blackwhite {
>   	$sth->execute();
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    my $pmail = lc ($ref->{pmail});
> +	    my $pmail = lc (PMG::Utils::try_decode_utf8($ref->{pmail}));
>   	    if ($ref->{name} eq 'WL') {
>   		$target_info->{$pmail}->{whitelist} =
> -		    parse_addrlist($ref->{data});
> +		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
>   	    } elsif ($ref->{name} eq 'BL') {
>   		$target_info->{$pmail}->{blacklist} =
> -		    parse_addrlist($ref->{data});
> +		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
>   	    }
>   	}
>   
> @@ -205,7 +205,7 @@ sub what_match_targets {
>   		($list = $queue->{blackwhite}->{$pmail}->{whitelist}) &&
>   		check_addrlist($list, $queue->{all_from_addrs})) {
>   		syslog('info', "%s: sender in user (%s) whitelist",
> -		       $queue->{logid}, $pmail);
> +		       $queue->{logid}, encode('UTF-8', $pmail));
>   	    } else {
>   		$target_info->{$t}->{marks} = []; # never add additional marks here
>   		$target_info->{$t}->{spaminfo} = $info;
> @@ -234,7 +234,7 @@ sub what_match_targets {
>   		$target_info->{$t}->{marks} = [];
>   		$target_info->{$t}->{spaminfo} = $info;
>   		syslog ('info', "%s: sender in user (%s) blacklist",
> -			$queue->{logid}, $pmail);
> +			$queue->{logid}, encode('UTF-8',$pmail));
>   	    }
>   	}
>       }






* Re: [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
@ 2022-11-23 14:20   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:20 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

comments inline

On 11/23/22 10:23, Stoiko Ivanov wrote:
> $data->{pmail} is both used in the template rendering ('Spam Report for
> $pmail'), and as content for the To header, which need different
> treatment. Thus introduce 'pmail_raw' additionally.
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/CLI/pmgqm.pm | 24 +++++++++++++-----------
>   src/PMG/Utils.pm     |  7 ++++---
>   2 files changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/src/PMG/CLI/pmgqm.pm b/src/PMG/CLI/pmgqm.pm
> index dbec8ef..7293579 100755
> --- a/src/PMG/CLI/pmgqm.pm
> +++ b/src/PMG/CLI/pmgqm.pm
> @@ -2,6 +2,7 @@ package PMG::CLI::pmgqm;
>   
>   use strict;
>   use Data::Dumper;
> +use Encode qw(encode);
>   use Template;
>   use MIME::Entity;
>   use HTML::Entities;
> @@ -17,6 +18,7 @@ use PVE::SafeSyslog;
>   use PVE::Tools;
>   use PVE::INotify;
>   use PVE::CLIHandler;
> +use PVE::JSONSchema qw(get_standard_option);
>   
>   use PMG::RESTEnvironment;
>   use PMG::Utils;
> @@ -57,7 +59,7 @@ sub get_item_data {
>       }
>   
>       $item->{envelope_sender} = $ref->{sender};
> -    $item->{pmail} = $ref->{pmail};
> +    $item->{pmail} = encode_entities(PMG::Utils::try_decode_utf8($ref->{pmail}));
>       $item->{receiver} = $ref->{receiver} || $ref->{pmail};
>   
>       $item->{date} = strftime("%F", localtime($ref->{time}));
> @@ -157,11 +159,10 @@ __PACKAGE__->register_method ({
>       parameters => {
>   	additionalProperties => 0,
>   	properties => {
> -	    receiver => {
> +	    receiver => get_standard_option('pmg-email-address', {
>   		description => "Generate report for a single email address. If not specified, generate reports for all users.",
> -		type => 'string', format => 'email',
>   		optional => 1,
> -	    },
> +	    }),
>   	    timespan => {
>   		description => "Select time span.",
>   		type => 'string',
> @@ -175,11 +176,10 @@ __PACKAGE__->register_method ({
>   		enum => ['short', 'verbose', 'custom'],
>   		optional => 1,
>   	    },
> -	    redirect => {
> +	    redirect => get_standard_option('pmg-email-address', {
>   		description => "Redirect spam report email to this address.",
> -		type => 'string', format => 'email',
>   		optional => 1,
> -	    },
> +	    }),
>   	    debug => {
>   		description => "Debug mode. Print raw email to stdout instead of sending them.",
>   		type => 'boolean',
> @@ -280,7 +280,7 @@ __PACKAGE__->register_method ({
>   	    "ORDER BY pmail, time, receiver");
>   
>   	if ($target) {
> -	    $sth->execute($target);
> +	    $sth->execute(encode('UTF-8', $target));
>   	} else {
>   	    $sth->execute();
>   	}
> @@ -302,16 +302,18 @@ __PACKAGE__->register_method ({
>   	};
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    if ($creceiver ne $ref->{pmail}) {
> +	    my $decoded_pmail = PMG::Utils::try_decode_utf8($ref->{pmail});
> +	    if ($creceiver ne $decoded_pmail) {
>   
>   		$finalize->() if $data;
>   
>   		$data = clone($global_data);
>   
> -		$creceiver = $ref->{pmail};
> +		$creceiver = $decoded_pmail;
>   		$mailcount = 0;
>   
> -		$data->{pmail} = $creceiver;
> +		$data->{pmail} = encode_entities($decoded_pmail);
> +		$data->{pmail_raw} = $ref->{pmail};
>   		$data->{managehref} = "$protocol_fqdn_port/quarantine";
>   		if ($data->{authmode} ne 'ldap') {
>   		    $data->{ticket} = PMG::Ticket::assemble_quarantine_ticket($data->{pmail});
> diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
> index cc30e67..5c9e873 100644
> --- a/src/PMG/Utils.pm
> +++ b/src/PMG/Utils.pm
> @@ -1143,12 +1143,13 @@ sub rfc1522_to_html {
>   	    my ($d, $cs) = @$r;
>   	    if ($d) {
>   		if ($cs) {
> -		    $res .= encode_entities(decode($cs, $d));
> +		    $res .= encode('UTF-8', decode($cs, $d));
>   		} else {
> -		    $res .= encode_entities($d);
> +		    $res .= $d;
>   		}
>   	    }
>   	}
> +	$res = encode_entities(decode('UTF-8', $res));

this change is not really explained in the commit message
and is a bit confusing

couldn't we simply do:

encode_entities(decode_rfc1522($enc))

?

afaics rfc1522_to_html is mostly the same as decode_rfc1522,
but with an 'encode_entities' after decoding


>       };
>   
>       $res = $enc if $@;
> @@ -1257,7 +1258,7 @@ sub finalize_report {
>   
>       my $top = MIME::Entity->build(
>   	Type    => "multipart/related",
> -	To      => $data->{pmail},
> +	To      => $data->{pmail_raw},
>   	From    => $mailfrom,
>   	Subject => bencode_header(decode_entities($title)));
>   






* Re: [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data.
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
@ 2022-11-23 14:26   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:26 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

again, a slightly more detailed commit message would be nice

On 11/23/22 10:23, Stoiko Ivanov wrote:
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/Statistic.pm | 67 +++++++++++++++++++++++++++++++++-----------
>   1 file changed, 50 insertions(+), 17 deletions(-)
> 
> diff --git a/src/PMG/Statistic.pm b/src/PMG/Statistic.pm
> index 6d27930..96ef61d 100755
> --- a/src/PMG/Statistic.pm
> +++ b/src/PMG/Statistic.pm
> @@ -3,6 +3,7 @@ package PMG::Statistic;
>   use strict;
>   use warnings;
>   use DBI;
> +use Encode qw(encode);
>   use Time::Local;
>   use Time::Zone;
>   
> @@ -545,6 +546,22 @@ my $compute_sql_orderby = sub {
>       return $orderby;
>   };
>   
> +sub user_stat_to_perlstring {
> +    my ($entry) = @_;
> +
> +    my $res = { };
> +
> +    for my $a (keys %$entry) {
> +	if ($a eq 'receiver' || $a eq 'sender' || $a eq 'contact') {
> +	    $res->{$a} = PMG::Utils::try_decode_utf8($entry->{$a});
> +	} else {
> +	    $res->{$a} = $entry->{$a};
> +	}
> +    }
> +
> +    return $res;
> +}
> +
>   sub user_stat_contact_details {
>       my ($self, $rdb, $receiver, $limit, $sorters, $filter) = @_;
>   
> @@ -554,19 +571,21 @@ sub user_stat_contact_details {
>   
>       my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
>   
> +    my $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%"));
> +
>       my $query = "SELECT * FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail " .
>   	"AND NOT direction AND sender != '' AND receiver = ? " .
> -	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
>   	"ORDER BY $orderby limit $limit";
>   
>       my $sth = $rdb->{dbh}->prepare($query);
>   
> -    $sth->execute($receiver);
> +    $sth->execute(encode('UTF-8',$receiver));
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -583,11 +602,14 @@ sub user_stat_contact {
>   
>       my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $query = "SELECT receiver as contact, count(*) AS count, sum (bytes) AS bytes, " .
>   	"count (virusinfo) as viruscount " .
>   	"FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid " .
> -	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
>   	"AND $cond_good_mail AND NOT direction AND sender != '' ";
>   
>       if ($advfilter) {
> @@ -603,7 +625,7 @@ sub user_stat_contact {
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -620,20 +642,23 @@ sub user_stat_sender_details {
>   
>       my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $sth = $rdb->{dbh}->prepare(
>   	"SELECT " .
>   	"blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
>   	"FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND " .
>   	"$cond_good_mail AND NOT direction AND sender = ? " .
> -	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
>   	"ORDER BY $orderby limit $limit");
>   
> -    $sth->execute($sender);
> +    $sth->execute(encode('UTF-8',$sender));
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -650,11 +675,14 @@ sub user_stat_sender {
>   
>       my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $query = "SELECT sender,count(*) AS count, sum (bytes) AS bytes, " .
>   	"count (virusinfo) as viruscount, " .
>   	"count (CASE WHEN spamlevel >= 3 THEN 1 ELSE NULL END) as spamcount " .
>   	"FROM CStatistic WHERE $cond_good_mail AND NOT direction AND sender != '' " .
> -	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
>   	"GROUP BY sender ORDER BY $orderby limit $limit";
>   
>       my $sth = $rdb->{dbh}->prepare($query);
> @@ -662,7 +690,7 @@ sub user_stat_sender {
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -679,18 +707,21 @@ sub user_stat_receiver_details {
>   
>       my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $sth = $rdb->{dbh}->prepare(
>   	"SELECT blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
>   	"FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail AND receiver = ? " .
> -	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
>   	"ORDER BY $orderby limit $limit");
>   
> -    $sth->execute($receiver);
> +    $sth->execute(encode('UTF-8',$receiver));
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -708,6 +739,9 @@ sub user_stat_receiver {
>       my $cond_good_mail = $self->query_cond_good_mail ($from, $to) . " AND " .
>   	"receiver IS NOT NULL AND receiver != ''";
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $query = "SELECT receiver, " .
>   	"count(*) AS count, " .
>   	"sum (bytes) AS bytes, " .
> @@ -728,7 +762,7 @@ sub user_stat_receiver {
>       }
>   
>       $query .= "AND $cond_good_mail and direction " .
> -	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .

we have this pattern 6 times in this diff; wouldn't it be easier to do something like this
(naming is not optimal, just what came to my mind):

sub sql_filter_text {
     my ($dbh, $field, $filter) = @_;
     my $filter_text = $filter ? "AND $field like ". $dbh->quote(...). " " : '';
     return $filter_text;
}

and call it in the functions with

my $filter_text = sql_filter_text($rdb->{dbh}, 'receiver', $filter);

and simply use it with:

$query .= "...." . $filter_text . "...";

?
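
A runnable sketch of that helper, under stated assumptions: the argument to quote() mirrors what the patch passes today (the "..." above is not spelled out, so this is a guess), and MockDBH is a made-up stand-in for the DBI handle so the example runs without a database:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Sketch of the suggested helper. Assumption: quote() receives the UTF-8
# encoded "%filter%" pattern, as in the current patch.
sub sql_filter_text {
    my ($dbh, $field, $filter) = @_;
    return '' if !$filter;
    return "AND $field like " . $dbh->quote(encode('UTF-8', "%${filter}%")) . ' ';
}

# Hypothetical minimal replacement for a DBI handle; real code would pass
# $rdb->{dbh} and rely on DBI's own quote().
package MockDBH {
    sub new { return bless {}, shift }
    sub quote { my ($self, $s) = @_; $s =~ s/'/''/g; return "'$s'" }
}

my $dbh = MockDBH->new();
print sql_filter_text($dbh, 'receiver', 'example.com'), "\n";
print "[", sql_filter_text($dbh, 'receiver', undef), "]\n";
```

Returning an empty string for an empty filter would also avoid building a pattern when no filter is given, which the first hunk in user_stat_contact_details appears to do since it quotes unconditionally.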

>   	"GROUP BY receiver ORDER BY $orderby LIMIT $limit";
>   
>       my $sth = $rdb->{dbh}->prepare($query);
> @@ -736,7 +770,7 @@ sub user_stat_receiver {
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -873,9 +907,8 @@ sub recent_receivers {
>       my $sth =  $rdb->{dbh}->prepare($cmd);
>   
>       $sth->execute ($from, $limit);
> -
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>       $sth->finish();
>   






* [pmg-devel] applied-gui: [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (10 preceding siblings ...)
  2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
@ 2022-11-26  7:00 ` Thomas Lamprecht
  11 siblings, 0 replies; 16+ messages in thread
From: Thomas Lamprecht @ 2022-11-26  7:00 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

On 23/11/2022 at 10:23, Stoiko Ivanov wrote:
> pmg-gui:
> Stoiko Ivanov (2):
>   utils: add custom validator for pmg-email-address
>   userblocklists: use PMGMail as validator for pmail

before I forget: applied those two yesterday, thanks!






Thread overview: 16+ messages
2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
2022-11-23 14:15   ` Dominik Csapak
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
2022-11-23 14:20   ` Dominik Csapak
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
2022-11-23 14:26   ` Dominik Csapak
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
2022-11-26  7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
