* [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
@ 2022-11-23  9:23 Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

v2->v3:
* dropped the useless decode/encode/decode chain in decode_rfc1522
* moved try_decode_utf8 to patch 1 as it's now used there
* renamed 'encode_user_stat' to 'user_stat_to_perlstring' as this is what
  the helper actually does
* the 2 patches for pmg-gui make it possible to add user black/whitelist
  entries for non-ascii e-mails
* quickly re-verified that pmgpolicy should be robust for smtputf8 mail
  (postfix hands the data over as utf-8 - and pmgpolicy does not parse it)

Thanks again to Dominik for the off-list suggestions!

original cover-letter for v2:
v1->v2:
* dropped already applied patches
* added a patch for one further glitch in ModField/Notify actions (when
  parsing/replacing non-ascii characters) - patch 1/5+2/5
* added support for utf-8 data in the mailflow additionally for:
** quarantine API handling
** user BL/WL (the GUI still needs adaptation to parse e-mail-addresses
   more liberally - but else it seems to work)
** pmgqm (spamreports)
** statistics

still missing support for:
* LDAP
* Who Objects

huge thanks to Dominik for taking the time to review and test the v1!

original cover-letter for v1:
this patchseries partially fixes #2465 and #2541, two quite often reported
issues, which are causing quite a disappointing experience for users
in non-ascii only environments

the main assumptions of the patches are:
* envelope addresses are either ascii or utf-8 (latter only with smtputf8)
* thus we can unconditionally de-/encode envelope addresses for database
  results/lookups
* the matching in the rule-objects will see the relevant parts of the mail
  as properly encoded perl-strings (with multi-byte characters - e.g. the
  euro sign as \x{20ac} instead of \x{e2}\x{82}\x{ac})
(I did a bit of testing to verify them, by e.g. sending an ISO-8859-1
encoded mail and matching for an umlaut in the subject)
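
A minimal sketch of the perl-string assumption above, using only the core
Encode module. The sample subject and the variable names are made up for
illustration and are not taken from the patches:

```perl
use strict;
use warnings;
use Encode qw(decode);

# The euro sign as raw UTF-8 octets: three single-byte characters.
my $bytes = "\xe2\x82\xac";
# The same data as a decoded perl-string: one wide character.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length %d\n", length($bytes);
printf "chars: length %d, codepoint U+%04X\n", length($chars), ord($chars);

# An ISO-8859-1 encoded subject (hypothetical sample data), decoded with
# its declared charset, can be matched against a pattern with an umlaut.
my $latin1_subject = "Gr\xfc\xdfe";
my $subject = decode('ISO-8859-1', $latin1_subject);
my $matched = ($subject =~ m/\x{fc}/) ? 1 : 0;
print "umlaut matched: $matched\n";
```

Once every input is decoded with its own charset, the rule objects only ever
compare perl-strings, independent of how the mail was encoded on the wire.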

While going through the RuleDB classes I remembered, that we have a few
pieces of legacy objects (Attach, ReportSpam, Counter actions) there, and
went ahead with deprecating them (initially I simply deleted them, but
decided to be more cautious and just log the deprecation until 8.0, when
we can drop them explicitly). They cannot be instantiated currently (short
of a direct insert into the database) - but I don't know if they were ever
used in pre 5.0 times in their current form. - patch 2/5.

Out of scope of the series for now:
* utf-8 support in the LDAP subsystem (deployments with a configured LDAP
  profile still won't be able to process smtputf8 mails) - mostly until I
  get around to creating a test environment with the appropriate schema
  for non-ascii mail addresses
* Domain/Email objects - did not find the time to consider how to store
  them most sensibly (puny-code, utf-8) and if the choice should be
  carried over to all of our 'email' formats (it probably shouldn't)

patches 1/5 and 4/5 address 2 small bugs I ran into while testing

Given that I quite often miss a few fine points or use-cases I'd be very
grateful for some more experimenting/testing!


pmg-api:
Stoiko Ivanov (8):
  utils: return perl string from decode_rfc1522
  ruledb: properly substitute prox_vars in headers
  fix #2541 ruledb: encode relevant values as utf-8 in database
  ruledb: encode e-mail addresses for syslog
  partially fix #2465: handle smtputf8 addresses in the rule-system
  quarantine: handle utf8 data
  pmgqm: handle smtputf8 data
  statistics: handle utf8 data.

 src/PMG/API2/Quarantine.pm      | 16 ++++----
 src/PMG/CLI/pmgqm.pm            | 24 ++++++------
 src/PMG/HTMLMail.pm             |  7 ++--
 src/PMG/MailQueue.pm            | 10 +++--
 src/PMG/Quarantine.pm           | 13 ++++---
 src/PMG/RuleDB.pm               | 24 ++++++++----
 src/PMG/RuleDB/Accept.pm        |  2 +-
 src/PMG/RuleDB/BCC.pm           | 23 +++++++++--
 src/PMG/RuleDB/Block.pm         |  2 +-
 src/PMG/RuleDB/Disclaimer.pm    |  2 +-
 src/PMG/RuleDB/Group.pm         |  4 +-
 src/PMG/RuleDB/MatchField.pm    |  8 +++-
 src/PMG/RuleDB/MatchFilename.pm |  5 ++-
 src/PMG/RuleDB/ModField.pm      | 19 +++-------
 src/PMG/RuleDB/Notify.pm        | 24 +++++++++---
 src/PMG/RuleDB/Quarantine.pm    | 19 ++++++++--
 src/PMG/RuleDB/Remove.pm        | 20 +++++++---
 src/PMG/RuleDB/Rule.pm          |  2 +-
 src/PMG/RuleDB/Spam.pm          | 17 +++++----
 src/PMG/RuleDB/WhoRegex.pm      |  5 ++-
 src/PMG/Statistic.pm            | 67 ++++++++++++++++++++++++---------
 src/PMG/Utils.pm                | 32 ++++++++++++++--
 src/bin/pmg-smtp-filter         |  7 ++--
 23 files changed, 238 insertions(+), 114 deletions(-)

pmg-gui:
Stoiko Ivanov (2):
  utils: add custom validator for pmg-email-address
  userblocklists: use PMGMail as validator for pmail

 js/UserBlackWhiteList.js | 2 +-
 js/Utils.js              | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

-- 
2.30.2





* [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context) - adapt it to return a perlstring, under the
assumption that data is utf-8 encoded (holds true for ascii and
smtputf8 mails)

the try_decode_utf8 helper sub will be used extensively in later
patches and is inspired by commit
43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that valid multibyte utf-8 sequences do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£"), so if something decodes without error
from utf-8 it will in all likelihood have been utf-8 to begin with
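
A short sketch of how the helper behaves for the three relevant kinds of
input. The sub body mirrors the one added in this patch; the sample inputs
and variable names are only illustrative:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Same body as the helper in this patch: strict decode (CHECK = 1 makes
# decode die on malformed input), falling back to the unchanged data.
sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

# Valid UTF-8 octets for the pound sign decode to one character.
my $from_utf8 = try_decode_utf8("\xc2\xa3");
# A lone byte > 127 is not valid UTF-8, so it is returned as-is.
my $from_latin1 = try_decode_utf8("\xa3");
# Plain ascii is valid UTF-8 and comes back unchanged.
my $from_ascii = try_decode_utf8("plain");

printf "utf-8 input:   length %d, U+%04X\n", length($from_utf8), ord($from_utf8);
printf "latin-1 input: length %d, U+%04X\n", length($from_latin1), ord($from_latin1);
print "ascii input:   $from_ascii\n";
```

Both non-ascii inputs end up as the same one-character perl-string, which is
why the fallback is harmless for data that was never utf-8.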

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/Utils.pm | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..cfb8852 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1088,6 +1088,7 @@ sub decode_to_html {
     return $res;
 }
 
+# assume enc contains utf-8 and mime-encoded data returns a perl-string (with wide characters)
 sub decode_rfc1522 {
     my ($enc) = @_;
 
@@ -1102,7 +1103,7 @@ sub decode_rfc1522 {
 		if ($cs) {
 		    $res .= decode($cs, $d);
 		} else {
-		    $res .= $d;
+		    $res .= try_decode_utf8($d);
 		}
 	    }
 	}
@@ -1542,4 +1543,9 @@ sub get_existing_object_id {
     return;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+    return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
 1;
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

by storing the variables as perl-string (not mime-encoded, and not
utf-8 encoded), and appropriately dealing with multi-line values to
input (folding the headers and encoding as mime).

This fixes another glitch not caught by
d3d6b5dff9e4447d16cb92e0fdf26f67d9384423:
the Subject was always displayed with a trailing '?' (due to the
(quoted-printable encoded) \n that was added).

Additionally adapt the other callsites of PMG::Utils::subst_values
where applicable.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/BCC.pm      |  2 +-
 src/PMG/RuleDB/ModField.pm | 13 +------------
 src/PMG/RuleDB/Notify.pm   |  4 ++--
 src/PMG/Utils.pm           | 17 +++++++++++++++++
 src/bin/pmg-smtp-filter    |  2 +-
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index d364690..4867d83 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -117,7 +117,7 @@ sub execute {
 
     my $rulename = $vars->{RULE} // 'unknown';
 
-    my $bcc_to = PMG::Utils::subst_values($self->{target}, $vars);
+    my $bcc_to = PMG::Utils::subst_values_for_header($self->{target}, $vars);
 
     if ($bcc_to =~ m/^\s*$/) {
 	# this happens if a notification is triggered by bounce mails
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 4ebb618..34108d1 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -5,7 +5,6 @@ use warnings;
 use DBI;
 use Digest::SHA;
 use Encode qw(encode decode);
-use MIME::Words qw(encode_mimewords);
 
 use PMG::Utils;
 use PMG::ModGroup;
@@ -98,17 +97,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $fvalue = '';
-
-    foreach my $line (split('\r?\n\s*',PMG::Utils::subst_values ($self->{field_value}, $vars))) {
-	$fvalue .= "\n" if $fvalue;
-	$fvalue .= encode_mimewords(encode('UTF-8', $line), 'Charset' => 'UTF-8');
-    }
-
-    # support for multiline values (i.e. __SPAM_INFO__)
-    $fvalue =~ s/\n/\n\t/sg; # indent content
-    $fvalue =~ s/\n\s*\n//sg;   # remove empty line
-    $fvalue =~ s/\n?\s*$//s;    # remove trailing spaces
+    my $fvalue = PMG::Utils::subst_values_for_header($self->{field_value}, $vars);
 
     my $subgroups = $mod_group->subgroups($targets);
 
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index d67221e..7b38e0d 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -211,8 +211,8 @@ sub execute {
     my $rulename = $vars->{RULE} // 'unknown';
 
     my $body = PMG::Utils::subst_values($self->{body}, $vars);
-    my $subject = PMG::Utils::subst_values($self->{subject}, $vars);
-    my $to = PMG::Utils::subst_values($self->{to}, $vars);
+    my $subject = PMG::Utils::subst_values_for_header($self->{subject}, $vars);
+    my $to = PMG::Utils::subst_values_for_header($self->{to}, $vars);
 
     if ($to =~ m/^\s*$/) {
 	# this happens if a notification is triggered by bounce mails
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cfb8852..cc30e67 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -203,6 +203,23 @@ sub subst_values {
     return $body;
 }
 
+sub subst_values_for_header {
+    my ($header, $dh) = @_;
+
+    my $res = '';
+    foreach my $line (split('\r?\n\s*', subst_values ($header, $dh))) {
+	$res .= "\n" if $res;
+	$res .= MIME::Words::encode_mimewords(encode('UTF-8', $line), 'Charset' => 'UTF-8');
+    }
+
+    # support for multiline values (i.e. __SPAM_INFO__)
+    $res =~ s/\n/\n\t/sg; # indent content
+    $res =~ s/\n\s*\n//sg;   # remove empty line
+    $res =~ s/\n?\s*$//s;    # remove trailing spaces
+
+    return $res;
+}
+
 sub reinject_mail {
     my ($entity, $sender, $targets, $xforward, $me, $params) = @_;
 
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 35a6ac6..45e68a7 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -152,7 +152,7 @@ sub get_prox_vars {
     } if !$spaminfo;
 
     my $vars = {
-	'SUBJECT' => mime_to_perl_string($entity->head->get ('subject', 0) || 'No Subject'),
+	'SUBJECT' => PMG::Utils::decode_rfc1522($entity->head->get ('subject', 0) || 'No Subject'),
 	'RULE' => $rule->{name},
 	'RULE_INFO' => $msginfo->{rule_info},
 	'SENDER' => $msginfo->{sender},
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

This patch adds support for storing rule names, comments(info), and
most relevant values (e.g. the header content to match) in utf-8 in
the database.

backwards-compatibility should not be an issue:
* currently the database should not contain any utf-8 multibyte
  characters, as our tooling prevented this due to sending
  wide-characters, which causes an exception in DBI.
* any character > 127 and < 256 will be correctly interpreted when
  stored in a perl-string (this happens if the decode fails in
  try_decode_utf8), and will be correctly encoded when storing into
  the database.

the database is created with SQL_ASCII encoding - which behaves by
interpreting bytes <= 127 as ascii, while those > 127 are not interpreted
(see [0]). This just means that we have to explicitly en-/decode upon
storing to/reading from there.
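
The store/load round trip described above can be sketched without a
database, treating a plain scalar as the stored bytes. The rule name and the
legacy value are made-up sample data; try_decode_utf8 is repeated from
patch 1 so the sketch is self-contained:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

# New data: a perl-string with a wide character is encoded before storing
# and decoded again after reading.
my $name = "Pr\x{fc}fregel \x{20ac}";
my $stored = encode('UTF-8', $name);       # bytes handed to the database
my $loaded = try_decode_utf8($stored);     # perl-string again
print "round trip ok\n" if $loaded eq $name;

# Legacy data: a single byte > 127 that is not valid UTF-8. The decode
# fails, the value is kept as-is, and it is encoded properly on the next save.
my $legacy = "caf\xe9";
my $legacy_loaded = try_decode_utf8($legacy);
my $legacy_resaved = encode('UTF-8', $legacy_loaded);
printf "legacy value re-saved as %d bytes\n", length($legacy_resaved);
```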

This patch currently omits most Who objects:
* for email/domain we'd still need to consider how to store them
  (puny-code for the domain part, or everything as UTF-8) and it would
  need changes to the API-types.
* the LDAP objects currently would not work too well, since our LDAPCache
  is not UTF-8 safe - and fixing that warrants its own patch-series
* WhoRegex should work and be able to handle many use-cases

The ContentType values should also contain only ascii characters per
RFC6838 [1] and RFC2045 [2].

[0] https://www.postgresql.org/docs/13/multibyte.html
[1] https://datatracker.ietf.org/doc/html/rfc6838#section-4.2
[2] https://datatracker.ietf.org/doc/html/rfc2045#section-5.1

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB.pm               | 24 ++++++++++++++++--------
 src/PMG/RuleDB/Accept.pm        |  2 +-
 src/PMG/RuleDB/BCC.pm           |  2 +-
 src/PMG/RuleDB/Block.pm         |  2 +-
 src/PMG/RuleDB/Disclaimer.pm    |  2 +-
 src/PMG/RuleDB/Group.pm         |  4 ++--
 src/PMG/RuleDB/MatchField.pm    |  8 ++++++--
 src/PMG/RuleDB/MatchFilename.pm |  5 ++++-
 src/PMG/RuleDB/ModField.pm      |  6 ++++--
 src/PMG/RuleDB/Notify.pm        |  2 +-
 src/PMG/RuleDB/Quarantine.pm    |  3 ++-
 src/PMG/RuleDB/Remove.pm        | 12 +++++++-----
 src/PMG/RuleDB/Rule.pm          |  2 +-
 src/PMG/RuleDB/WhoRegex.pm      |  5 ++++-
 14 files changed, 51 insertions(+), 28 deletions(-)

diff --git a/src/PMG/RuleDB.pm b/src/PMG/RuleDB.pm
index 895acc6..a6b0b79 100644
--- a/src/PMG/RuleDB.pm
+++ b/src/PMG/RuleDB.pm
@@ -5,6 +5,7 @@ use warnings;
 use DBI;
 use HTML::Entities;
 use Data::Dumper;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 
@@ -70,8 +71,8 @@ sub create_group_with_obj {
 
     defined($obj) || die "proxmox: undefined object";
 
-    $name //= '';
-    $info //= '';
+    $name = encode('UTF-8', $name // '');
+    $info = encode('UTF-8', $info // '');
 
     eval {
 
@@ -174,7 +175,9 @@ sub save_group {
 	$self->{dbh}->do("UPDATE Objectgroup " .
 			 "SET Name = ?, Info = ? " .
 			 "WHERE ID = ?", undef,
-			 $og->{name}, $og->{info}, $og->{id});
+			 encode('UTF-8', $og->{name}),
+			 encode('UTF-8', $og->{info}),
+			 $og->{id});
 
 	return $og->{id};
 
@@ -183,7 +186,7 @@ sub save_group {
 	    "INSERT INTO Objectgroup (Name, Info, Class) " .
 	    "VALUES (?, ?, ?);");
 
-	$sth->execute($og->name, $og->info, $og->class);
+	$sth->execute(encode('UTF-8', $og->name), encode('UTF-8', $og->info), $og->class);
 
 	return $og->{id} = PMG::Utils::lastid($self->{dbh}, 'objectgroup_id_seq');
     }
@@ -212,7 +215,9 @@ sub delete_group {
 	$sth->execute($groupid);
 
 	if (my $ref = $sth->fetchrow_hashref()) {
-	    die "Group '$ref->{groupname}' is used by rule '$ref->{rulename}' - unable to delete\n";
+	    my $groupname = PMG::Utils::try_decode_utf8($ref->{groupname});
+	    my $rulename = PMG::Utils::try_decode_utf8($ref->{rulename});
+	    die "Group '$groupname' is used by rule '$rulename' - unable to delete\n";
 	}
 
         $sth->finish();
@@ -474,6 +479,7 @@ sub load_object_full {
 sub load_group_by_name {
     my ($self, $name) = @_;
 
+    $name = encode('UTF-8', $name);
     my $sth = $self->{dbh}->prepare("SELECT * FROM Objectgroup " .
 				    "WHERE name = ?");
 
@@ -598,13 +604,14 @@ sub save_rule {
     defined($rule->{direction}) ||
 	die "undefined rule attribute - direction: ERROR";
 
+    my $rulename = encode('UTF-8', $rule->{name});
     if (defined($rule->{id})) {
 
 	$self->{dbh}->do(
 	    "UPDATE Rule " .
 	    "SET Name = ?, Priority = ?, Active = ?, Direction = ? " .
 	    "WHERE ID = ?", undef,
-	    $rule->{name}, $rule->{priority}, $rule->{active},
+	    $rulename, $rule->{priority}, $rule->{active},
 	    $rule->{direction}, $rule->{id});
 
 	return $rule->{id};
@@ -614,7 +621,7 @@ sub save_rule {
 	    "INSERT INTO Rule (Name, Priority, Active, Direction) " .
 	    "VALUES (?, ?, ?, ?);");
 
-	$sth->execute($rule->name, $rule->priority, $rule->active,
+	$sth->execute($rulename, $rule->priority, $rule->active,
 		      $rule->direction);
 
 	return $rule->{id} = PMG::Utils::lastid($self->{dbh}, 'rule_id_seq');
@@ -779,7 +786,8 @@ sub load_rules {
     $sth->execute();
 
     while (my $ref = $sth->fetchrow_hashref()) {
-	my $rule = PMG::RuleDB::Rule->new($ref->{name}, $ref->{priority},
+	my $rulename = PMG::Utils::try_decode_utf8($ref->{name});
+	my $rule = PMG::RuleDB::Rule->new($rulename, $ref->{priority},
 					  $ref->{active}, $ref->{direction});
 	$rule->{id} = $ref->{id};
 	push @$rules, $rule;
diff --git a/src/PMG/RuleDB/Accept.pm b/src/PMG/RuleDB/Accept.pm
index cd67ea2..4ebd6da 100644
--- a/src/PMG/RuleDB/Accept.pm
+++ b/src/PMG/RuleDB/Accept.pm
@@ -93,7 +93,7 @@ sub execute {
     my $dkim = $msginfo->{dkim} // {};
     my $subgroups = $mod_group->subgroups($targets, !$dkim->{sign});
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index 4867d83..6244dd9 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -115,7 +115,7 @@ sub execute {
 
     my $subgroups = $mod_group->subgroups($targets, 1);
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $bcc_to = PMG::Utils::subst_values_for_header($self->{target}, $vars);
 
diff --git a/src/PMG/RuleDB/Block.pm b/src/PMG/RuleDB/Block.pm
index c758787..25bb74e 100644
--- a/src/PMG/RuleDB/Block.pm
+++ b/src/PMG/RuleDB/Block.pm
@@ -89,7 +89,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     if ($msginfo->{testmode}) {
 	my $fh = $msginfo->{test_fh};
diff --git a/src/PMG/RuleDB/Disclaimer.pm b/src/PMG/RuleDB/Disclaimer.pm
index d3003b2..c6afe54 100644
--- a/src/PMG/RuleDB/Disclaimer.pm
+++ b/src/PMG/RuleDB/Disclaimer.pm
@@ -193,7 +193,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $subgroups = $mod_group->subgroups($targets);
 
diff --git a/src/PMG/RuleDB/Group.pm b/src/PMG/RuleDB/Group.pm
index 2508305..baa68ce 100644
--- a/src/PMG/RuleDB/Group.pm
+++ b/src/PMG/RuleDB/Group.pm
@@ -12,8 +12,8 @@ sub new {
     my ($type, $name, $info, $class) = @_;
 
     my $self = {
-	name => $name,
-	info => $info,
+	name => PMG::Utils::try_decode_utf8($name),
+	info => PMG::Utils::try_decode_utf8($info),
 	class => $class,
     };
 
diff --git a/src/PMG/RuleDB/MatchField.pm b/src/PMG/RuleDB/MatchField.pm
index 2671ea4..2b56058 100644
--- a/src/PMG/RuleDB/MatchField.pm
+++ b/src/PMG/RuleDB/MatchField.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 use MIME::Words;
 
 use PVE::SafeSyslog;
@@ -50,9 +51,10 @@ sub load_attr {
     defined($field) || die "undefined object attribute: ERROR";
     defined($field_value) || die "undefined object attribute: ERROR";
 
+    my $decoded_field_value = PMG::Utils::try_decode_utf8($field_value);
     # use known constructor, bless afterwards (because sub class can have constructor
     # with other parameter signature).
-    my $obj =  PMG::RuleDB::MatchField->new($field, $field_value, $ogroup);
+    my $obj =  PMG::RuleDB::MatchField->new($field, $decoded_field_value, $ogroup);
     bless $obj, $class;
 
     $obj->{id} = $id;
@@ -69,6 +71,7 @@ sub save {
 
     my $new_value = "$self->{field}:$self->{field_value}";
     $new_value =~ s/\\/\\\\/g;
+    $new_value = encode('UTF-8', $new_value);
 
     if (defined ($self->{id})) {
 	# update
@@ -105,7 +108,8 @@ sub parse_entity {
 	for my $value ($entity->head->get_all($self->{field})) {
 	    chomp $value;
 
-	    my $decvalue = MIME::Words::decode_mimewords($value);
+	    my $decvalue = PMG::Utils::decode_rfc1522($value);
+	    $decvalue = PMG::Utils::try_decode_utf8($decvalue);
 
 	    if ($decvalue =~ m|$self->{field_value}|i) {
 		push @$res, $id;
diff --git a/src/PMG/RuleDB/MatchFilename.pm b/src/PMG/RuleDB/MatchFilename.pm
index 7e5b486..c9cdbe0 100644
--- a/src/PMG/RuleDB/MatchFilename.pm
+++ b/src/PMG/RuleDB/MatchFilename.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 use MIME::Words;
 
 use PMG::Utils;
@@ -41,8 +42,9 @@ sub load_attr {
     my $class = ref($type) || $type;
 
     defined($value) || die "undefined value: ERROR";;
+    my $decvalue = PMG::Utils::try_decode_utf8($value);
 
-    my $obj = $class->new($value, $ogroup);
+    my $obj = $class->new($decvalue, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -57,6 +59,7 @@ sub save {
 
     my $new_value = $self->{fname};
     $new_value =~ s/\\/\\\\/g;
+    $new_value = encode('UTF-8', $new_value);
 
     if (defined($self->{id})) {
 	# update
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 34108d1..6232322 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -56,7 +56,9 @@ sub load_attr {
 
     (defined($field) && defined($field_value)) || return undef;
 
-    my $obj = $class->new($field, $field_value, $ogroup);
+    my $dec_field_value = PMG::Utils::try_decode_utf8($field_value);
+
+    my $obj = $class->new($field, $dec_field_value, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $field, $field_value, $ogroup);
@@ -69,7 +71,7 @@ sub save {
 
     defined($self->{ogroup}) || return undef;
 
-    my $new_value = "$self->{field}:$self->{field_value}";
+    my $new_value = encode('UTF-8', "$self->{field}:$self->{field_value}");
 
     if (defined ($self->{id})) {
 	# update
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index 7b38e0d..8a9945b 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -208,7 +208,7 @@ sub execute {
 
     my $from = 'postmaster';
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $body = PMG::Utils::subst_values($self->{body}, $vars);
     my $subject = PMG::Utils::subst_values_for_header($self->{subject}, $vars);
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 1426393..9d802fe 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 
@@ -89,7 +90,7 @@ sub execute {
     
     my $subgroups = $mod_group->subgroups($targets, 1);
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index 6b27b91..da6c25f 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -63,12 +63,14 @@ sub load_attr {
 
     defined ($value) || die "undefined value: ERROR";
 
-    my $obj;
+    my ($obj, $text);
 
     if ($value =~ m/^([01])\,([01])(\:(.*))?$/s) {
-	$obj = $class->new($1, $4, $ogroup, $2);
+	$text = PMG::Utils::try_decode_utf8($4);
+	$obj = $class->new($1, $text, $ogroup, $2);
     } elsif ($value =~ m/^([01])(\:(.*))?$/s) {
-	$obj = $class->new($1, $3, $ogroup);
+	$text = PMG::Utils::try_decode_utf8($3);
+	$obj = $class->new($1, $text, $ogroup);
     } else {
 	$obj = $class->new(0, undef, $ogroup);
     }
@@ -89,7 +91,7 @@ sub save {
     $value .= ','. ($self->{quarantine} ? '1' : '0');
 
     if ($self->{text}) {
-	$value .= ":$self->{text}";
+	$value .= encode('UTF-8', ":$self->{text}");
     }
 
     if (defined ($self->{id})) {
@@ -194,7 +196,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets,
 	$msginfo, $vars, $marks, $ldap) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     if (!$self->{all} && ($#$marks == -1)) {
 	# no marks
diff --git a/src/PMG/RuleDB/Rule.pm b/src/PMG/RuleDB/Rule.pm
index c49ad21..e7c9146 100644
--- a/src/PMG/RuleDB/Rule.pm
+++ b/src/PMG/RuleDB/Rule.pm
@@ -12,7 +12,7 @@ sub new {
     my ($type, $name, $priority, $active, $direction) = @_;
 
     my $self = { 
-	name => $name // '',
+	name => PMG::Utils::try_decode_utf8($name) // '',
 	priority => $priority // 0,
 	active => $active // 0,
     }; 
diff --git a/src/PMG/RuleDB/WhoRegex.pm b/src/PMG/RuleDB/WhoRegex.pm
index 37ec3aa..5c13604 100644
--- a/src/PMG/RuleDB/WhoRegex.pm
+++ b/src/PMG/RuleDB/WhoRegex.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 
 use PMG::Utils;
 use PMG::RuleDB::Object;
@@ -43,7 +44,8 @@ sub load_attr {
 
     defined($value) || die "undefined value: ERROR";
 
-    my $obj = $class->new ($value, $ogroup);
+    my $decoded_value = PMG::Utils::try_decode_utf8($value);
+    my $obj = $class->new ($decoded_value, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -59,6 +61,7 @@ sub save {
 
     my $adr = $self->{address};
     $adr =~ s/\\/\\\\/g;
+    $adr = encode('UTF-8', $adr);
 
     if (defined ($self->{id})) {
 	# update
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (2 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

as done in 114655f4fdb07c789a361b2f397f5345eafd16c6 for Accept and
Block.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/BCC.pm        | 19 +++++++++++++++++--
 src/PMG/RuleDB/Notify.pm     | 18 ++++++++++++++++--
 src/PMG/RuleDB/Quarantine.pm | 16 ++++++++++++++--
 src/PMG/RuleDB/Remove.pm     |  8 +++++++-
 4 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index 6244dd9..0f016f8 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -3,6 +3,7 @@ package PMG::RuleDB::BCC;
 use strict;
 use warnings;
 use DBI;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 
@@ -164,10 +165,24 @@ sub execute {
 		$entity, $msginfo->{sender}, \@bcc_targets,
 		$msginfo->{xforward}, $msginfo->{fqdn}, $param);
 	    foreach (@bcc_targets) {
+		my $target = encode('UTF-8', $_);
 		if ($qid) {
-		    syslog('info', "%s: bcc to <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+		    syslog(
+			'info',
+			"%s: bcc to <%s> (rule: %s, %s)",
+			$queue->{logid},
+			$target,
+			$rulename,
+			$qid,
+		    );
 		} else {
-		    syslog('err', "%s: bcc to <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+		    syslog(
+			'err',
+			"%s: bcc to <%s> (rule: %s) failed",
+			$queue->{logid},
+			$target,
+			$rulename,
+		    );
 		}
 	    }
 	}
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index 8a9945b..68f9b4e 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -259,10 +259,24 @@ sub execute {
 	my $qid = PMG::Utils::reinject_mail(
 	    $top, $from, \@targets, undef, $msginfo->{fqdn});
 	foreach (@targets) {
+	    my $target = encode('UTF-8', $_);
 	    if ($qid) {
-		syslog('info', "%s: notify <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+		syslog(
+		    'info',
+		    "%s: notify <%s> (rule: %s, %s)",
+		    $queue->{logid},
+		    $target,
+		    $rulename,
+		    $qid,
+		);
 	    } else {
-		syslog ('err', "%s: notify <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+		syslog (
+		    'err',
+		    "%s: notify <%s> (rule: %s) failed",
+		    $queue->{logid},
+		    $target,
+		    $rulename,
+		);
 	    }
 	}
     }
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 9d802fe..0fc8352 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -101,7 +101,13 @@ sub execute {
 	    if (my $qid = $queue->quarantine_mail($ruledb, 'V', $entity, $tg, $msginfo, $vars, $ldap)) {
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 
 		$queue->set_status ($tg, 'delivered');
@@ -111,7 +117,13 @@ sub execute {
 	    if (my $qid = $queue->quarantine_mail($ruledb, 'S', $entity, $tg, $msginfo, $vars, $ldap)) {
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 
 		$queue->set_status($tg, 'delivered');
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index da6c25f..e7c353c 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -235,7 +235,13 @@ sub execute {
 		}
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 	    }
 	}
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (3 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

The envelope addresses are used in the rule-system for lookups and
statistics. When the mail is received with smtputf8, the addresses are
decoded (multi-byte perl-strings) and thus need to be encoded before they
are used as parameters in a database query.

This patch encodes the addresses as utf-8 for the relevant queries
unconditionally, because envelope-senders should either be:
* (a subset of) ascii (no smtputf8) - which is invariant for utf-8
  encoding
* valid utf-8 (smtputf8)

The patch does not address the issues with multi-byte addresses in our
LDAP-implementation (hence the partial fix), but should still be an
improvement for many deployments.
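
As a minimal sketch of why the unconditional encoding is safe (the sample
addresses below are made up for illustration and do not come from the
patch): encoding a pure-ascii address yields the same string, while a
decoded non-ascii address turns into the utf-8 byte string that can be
handed to the database.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample addresses, for illustration only.
my $ascii_addr = "user\@example.com";
my $utf8_addr  = "\x{fc}ser\@example.com";    # decoded perl string, U+00FC = u-umlaut

# ascii is invariant under utf-8 encoding, so this is effectively a no-op
my $ascii_bytes = encode('UTF-8', $ascii_addr);

# the non-ascii character is turned into a two-byte utf-8 sequence
my $utf8_bytes = encode('UTF-8', $utf8_addr);

printf "ascii: %d chars -> %d bytes\n", length($ascii_addr), length($ascii_bytes);
printf "utf-8: %d chars -> %d bytes\n", length($utf8_addr),  length($utf8_bytes);
```

The encoded value is what would then be passed to $dbh->quote() or to
$sth->execute().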

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/MailQueue.pm    | 10 ++++++----
 src/PMG/RuleDB/Spam.pm  |  5 +++--
 src/bin/pmg-smtp-filter |  5 +++--
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/PMG/MailQueue.pm b/src/PMG/MailQueue.pm
index 2841b07..8355c30 100644
--- a/src/PMG/MailQueue.pm
+++ b/src/PMG/MailQueue.pm
@@ -6,6 +6,7 @@ use warnings;
 use PVE::SafeSyslog;
 use MIME::Parser;
 use IO::File;
+use Encode;
 use File::Sync;
 use File::Basename;
 use File::Path;
@@ -141,6 +142,7 @@ sub quarantinedb_insert {
     my ($self, $ruledb, $lcid, $ldap, $qtype, $header, $sender, $file, $targets, $vars) = @_;
 
     eval {
+	$sender = encode('UTF-8', $sender);
 	my $dbh = $ruledb->{dbh};
 
 	my $insert_cmds = "SELECT nextval ('cmailstore_id_seq'); INSERT INTO CMailStore " .
@@ -188,11 +190,11 @@ sub quarantinedb_insert {
 	    if ($pmail eq lc ($r)) {
 		$receiver = "NULL";
 	    } else {
-		$receiver = $dbh->quote ($r);
+		$receiver = $dbh->quote (encode('UTF-8', $r));
 	    }
 
 
-	    $pmail = $dbh->quote ($pmail);
+	    $pmail = $dbh->quote (encode('UTF-8', $pmail));
 	    $insert_cmds .= "INSERT INTO CMSReceivers " .
 		"(CMailStore_CID, CMailStore_RID, PMail, Receiver, TicketID, Status, MTime) " .
 		"VALUES ($lcid, currval ('cmailstore_id_seq'), $pmail, $receiver, $tid, 'N', $now); ";
@@ -294,8 +296,8 @@ sub quarantine_mail {
 	$entity->head->delete ('Return-Path');
 
 	# prepend Delivered-To and Return-Path (like QMAIL MAILDIR FORMAT)
-	$entity->head->add ('Return-Path', join (',', $sender), 0);
-	$entity->head->add ('Delivered-To', join (',', @$tg), 0);
+	$entity->head->add ('Return-Path', encode('UTF-8', join (',', $sender)), 0);
+	$entity->head->add ('Delivered-To', encode('UTF-8', join (',', @$tg)), 0);
 
 	$entity->print ($fh);
 
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index cc9a347..99056a3 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode);
 use Time::HiRes qw (gettimeofday);
 
 use PVE::SafeSyslog;
@@ -135,8 +136,8 @@ sub get_blackwhite {
     my $cond = '';
     foreach my $r (@$targets) {
 	my $pmail = $msginfo->{pmail}->{$r} || lc ($r);
-	my $qr = $dbh->quote ($pmail);
-	$cond .= " OR " if $cond;  
+	my $qr = $dbh->quote (encode('UTF-8', $pmail));
+	$cond .= " OR " if $cond;
 	$cond .= "pmail = $qr";
     }	 
 
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 45e68a7..911e9cd 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 
 use Carp;
+use Encode qw(encode);
 use Getopt::Long;
 use Time::HiRes qw (usleep gettimeofday tv_interval);
 use POSIX qw(:sys_wait_h errno_h signal_h);
@@ -791,10 +792,10 @@ sub handle_smtp {
 	$insert_cmds .= ($queue->{sa_score} || 0) . ',';
 	$insert_cmds .= $dbh->quote($queue->{vinfo}) . ',';
 	$insert_cmds .= $time_total . ',';
-	$insert_cmds .= $dbh->quote($msginfo->{sender}) . ');';
+	$insert_cmds .= $dbh->quote(encode('UTF-8', $msginfo->{sender})) . ');';
 
 	foreach my $r (@{$msginfo->{targets}}) {
-	    my $tmp = $dbh->quote($r);
+	    my $tmp = $dbh->quote(encode('UTF-8',$r));
 	    my $blocked = $queue->{status}->{$r} eq 'blocked' ? 1 : 0;
 	    $insert_cmds .= "INSERT INTO CReceivers (CStatistic_CID, CStatistic_RID, Receiver, Blocked) " .
 		"VALUES ($lcid, currval ('cstatistic_id_seq'), $tmp, '$blocked'); ";
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (4 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:15   ` Dominik Csapak
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/API2/Quarantine.pm | 10 +++++-----
 src/PMG/HTMLMail.pm        |  7 ++++---
 src/PMG/Quarantine.pm      | 13 +++++++------
 src/PMG/RuleDB/Spam.pm     | 12 ++++++------
 4 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/src/PMG/API2/Quarantine.pm b/src/PMG/API2/Quarantine.pm
index ddf7c04..819c78c 100644
--- a/src/PMG/API2/Quarantine.pm
+++ b/src/PMG/API2/Quarantine.pm
@@ -141,8 +141,8 @@ my $parse_header_info = sub {
     my $sender = PMG::Utils::decode_rfc1522(PVE::Tools::trim($head->get('sender')));
     $res->{sender} = $sender if $sender && ($sender ne $res->{from});
 
-    $res->{envelope_sender} = $ref->{sender};
-    $res->{receiver} = $ref->{receiver} // $ref->{pmail};
+    $res->{envelope_sender} = PMG::Utils::try_decode_utf8($ref->{sender});
+    $res->{receiver} = PMG::Utils::try_decode_utf8($ref->{receiver} // $ref->{pmail});
     $res->{id} = 'C' . $ref->{cid} . 'R' . $ref->{rid} . 'T' . $ref->{ticketid};
     $res->{time} = $ref->{time};
     $res->{bytes} = $ref->{bytes};
@@ -437,7 +437,7 @@ __PACKAGE__->register_method ({
 	$sth->execute();
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    push @$res, { mail => $ref->{pmail} };
+	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
 	}
 
 	return $res;
@@ -532,7 +532,7 @@ __PACKAGE__->register_method ({
 	}
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    push @$res, { mail => $ref->{pmail} };
+	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
 	}
 
 	return $res;
@@ -569,7 +569,7 @@ my $quarantine_api = sub {
     }
 
     if ($check_pmail || $role eq 'quser') {
-	$sth->execute($pmail);
+	$sth->execute(encode('UTF-8', $pmail));
     } else {
 	$sth->execute();
     }
diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
index 87f5c40..207c52c 100644
--- a/src/PMG/HTMLMail.pm
+++ b/src/PMG/HTMLMail.pm
@@ -192,9 +192,10 @@ sub read_raw_email {
     # read header
     my $header;
     while (defined(my $line = <$fh>)) {
-	$raw_header .= $line;
-	chomp $line;
-	push @$header, $line;
+	my $decoded_line = PMG::Utils::try_decode_utf8($line);
+	$raw_header .= $decoded_line;
+	chomp $decoded_line;
+	push @$header, $decoded_line;
 	last if $line =~ m/^\s*$/;
     }
 
diff --git a/src/PMG/Quarantine.pm b/src/PMG/Quarantine.pm
index 77af8cc..aa6b948 100644
--- a/src/PMG/Quarantine.pm
+++ b/src/PMG/Quarantine.pm
@@ -3,6 +3,7 @@ package PMG::Quarantine;
 use strict;
 use warnings;
 use Net::SMTP;
+use Encode qw(encode);
 
 use PVE::SafeSyslog;
 use PVE::Tools;
@@ -16,7 +17,7 @@ sub add_to_blackwhite {
 
     my $name = $listname eq 'BL' ? 'BL' : 'WL';
     my $oname = $listname eq 'BL' ? 'WL' : 'BL';
-    my $qu = $dbh->quote ($username);
+    my $qu = $dbh->quote (encode('UTF-8', $username));
 
     my $sth = $dbh->prepare(
 	"SELECT * FROM UserPrefs WHERE pmail = $qu AND (Name = 'BL' OR Name = 'WL')");
@@ -25,13 +26,13 @@ sub add_to_blackwhite {
     my $list = { 'WL' => {}, 'BL' => {} };
 
     while (my $ref = $sth->fetchrow_hashref()) {
-	my $data = $ref->{data};
+	my $data = PMG::Utils::try_decode_utf8($ref->{data});
 	$data =~ s/[,;]/ /g;
 	my @alist = split('\s+', $data);
 
 	my $tmp = {};
 	foreach my $a (@alist) {
-	    if ($a =~ m/^[[:ascii:]]+$/) {
+	    if ($a =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/) {
 		$tmp->{$a} = 1;
 	    }
 	}
@@ -50,7 +51,7 @@ sub add_to_blackwhite {
 	    if ($delete) {
 		delete($list->{$name}->{$v});
 	    } else {
-		if ($v =~ m/[[:^ascii:]]/) {
+		if ($v =~ m/[\s\\]/) {
 		    die "email address '$v' contains invalid characters\n";
 		}
 		$list->{$name}->{$v} = 1;
@@ -58,8 +59,8 @@ sub add_to_blackwhite {
 	    }
 	}
 
-	my $wlist = $dbh->quote(join (',', keys %{$list->{WL}}) || '');
-	my $blist = $dbh->quote(join (',', keys %{$list->{BL}}) || '');
+	my $wlist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{WL}})) || '');
+	my $blist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{BL}})) || '');
 
 	if (!$delete) {
 	    my $maxlen = 200000;
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index 99056a3..bc1d422 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -94,7 +94,7 @@ sub parse_addrlist {
 	my $regex = $addr;
 	# SA like checks
 	$regex =~ s/[\000\\\(]/_/gs;		# is this really necessasry ?
-	$regex =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g;	# escape possible metachars
+	$regex =~ s/([^\*\?_\w])/\\$1/g;	# escape possible metachars
 	$regex =~ tr/?/./;			# replace "?" with "."
 	$regex =~ s/\*+/\.\*/g;			# replace "*" with  ".*"
 
@@ -149,13 +149,13 @@ sub get_blackwhite {
 	$sth->execute();
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    my $pmail = lc ($ref->{pmail});
+	    my $pmail = lc (PMG::Utils::try_decode_utf8($ref->{pmail}));
 	    if ($ref->{name} eq 'WL') {
 		$target_info->{$pmail}->{whitelist} = 
-		    parse_addrlist($ref->{data});
+		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
 	    } elsif ($ref->{name} eq 'BL') {
 		$target_info->{$pmail}->{blacklist} = 
-		    parse_addrlist($ref->{data});
+		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
 	    }
 	}
 
@@ -205,7 +205,7 @@ sub what_match_targets {
 		($list = $queue->{blackwhite}->{$pmail}->{whitelist}) &&
 		check_addrlist($list, $queue->{all_from_addrs})) {
 		syslog('info', "%s: sender in user (%s) whitelist", 
-		       $queue->{logid}, $pmail);
+		       $queue->{logid}, encode('UTF-8', $pmail));
 	    } else {
 		$target_info->{$t}->{marks} = []; # never add additional marks here
 		$target_info->{$t}->{spaminfo} = $info;
@@ -234,7 +234,7 @@ sub what_match_targets {
 		$target_info->{$t}->{marks} = [];
 		$target_info->{$t}->{spaminfo} = $info;
 		syslog ('info', "%s: sender in user (%s) blacklist", 
-			$queue->{logid}, $pmail);
+			$queue->{logid}, encode('UTF-8',$pmail));
 	    }
 	}
     }
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (5 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:20   ` Dominik Csapak
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

$data->{pmail} is both used in the template rendering ('Spam Report for
$pmail'), and as content for the To header, which need different
treatment. Thus introduce 'pmail_raw' additionally.
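
A rough sketch of the two representations, under stated assumptions:
try_decode_utf8_sketch is a hypothetical stand-in for
PMG::Utils::try_decode_utf8 (whose actual implementation lives in patch 1
and may differ), and the address is an invented example.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical stand-in for PMG::Utils::try_decode_utf8: return the decoded
# perl string if the input is valid utf-8, otherwise the input unchanged.
sub try_decode_utf8_sketch {
    my ($data) = @_;
    my $copy = $data;    # decode() with a CHECK value may modify its argument
    my $decoded = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return defined($decoded) ? $decoded : $data;
}

# utf-8 bytes as they would come back from the database (invented example)
my $db_pmail = "\x{c3}\x{bc}ser\@example.com";

my $data = {
    # decoded and entity-encoded, for rendering in the HTML template
    pmail => encode_entities(try_decode_utf8_sketch($db_pmail)),
    # untouched bytes, for the To header of the generated report mail
    pmail_raw => $db_pmail,
};

print "template: $data->{pmail}\n";
```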

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/CLI/pmgqm.pm | 24 +++++++++++++-----------
 src/PMG/Utils.pm     |  7 ++++---
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/src/PMG/CLI/pmgqm.pm b/src/PMG/CLI/pmgqm.pm
index dbec8ef..7293579 100755
--- a/src/PMG/CLI/pmgqm.pm
+++ b/src/PMG/CLI/pmgqm.pm
@@ -2,6 +2,7 @@ package PMG::CLI::pmgqm;
 
 use strict;
 use Data::Dumper;
+use Encode qw(encode);
 use Template;
 use MIME::Entity;
 use HTML::Entities;
@@ -17,6 +18,7 @@ use PVE::SafeSyslog;
 use PVE::Tools;
 use PVE::INotify;
 use PVE::CLIHandler;
+use PVE::JSONSchema qw(get_standard_option);
 
 use PMG::RESTEnvironment;
 use PMG::Utils;
@@ -57,7 +59,7 @@ sub get_item_data {
     }
 
     $item->{envelope_sender} = $ref->{sender};
-    $item->{pmail} = $ref->{pmail};
+    $item->{pmail} = encode_entities(PMG::Utils::try_decode_utf8($ref->{pmail}));
     $item->{receiver} = $ref->{receiver} || $ref->{pmail};
 
     $item->{date} = strftime("%F", localtime($ref->{time}));
@@ -157,11 +159,10 @@ __PACKAGE__->register_method ({
     parameters => {
 	additionalProperties => 0,
 	properties => {
-	    receiver => {
+	    receiver => get_standard_option('pmg-email-address', {
 		description => "Generate report for a single email address. If not specified, generate reports for all users.",
-		type => 'string', format => 'email',
 		optional => 1,
-	    },
+	    }),
 	    timespan => {
 		description => "Select time span.",
 		type => 'string',
@@ -175,11 +176,10 @@ __PACKAGE__->register_method ({
 		enum => ['short', 'verbose', 'custom'],
 		optional => 1,
 	    },
-	    redirect => {
+	    redirect => get_standard_option('pmg-email-address', {
 		description => "Redirect spam report email to this address.",
-		type => 'string', format => 'email',
 		optional => 1,
-	    },
+	    }),
 	    debug => {
 		description => "Debug mode. Print raw email to stdout instead of sending them.",
 		type => 'boolean',
@@ -280,7 +280,7 @@ __PACKAGE__->register_method ({
 	    "ORDER BY pmail, time, receiver");
 
 	if ($target) {
-	    $sth->execute($target);
+	    $sth->execute(encode('UTF-8', $target));
 	} else {
 	    $sth->execute();
 	}
@@ -302,16 +302,18 @@ __PACKAGE__->register_method ({
 	};
 
 	while (my $ref = $sth->fetchrow_hashref()) {
-	    if ($creceiver ne $ref->{pmail}) {
+	    my $decoded_pmail = PMG::Utils::try_decode_utf8($ref->{pmail});
+	    if ($creceiver ne $decoded_pmail) {
 
 		$finalize->() if $data;
 
 		$data = clone($global_data);
 
-		$creceiver = $ref->{pmail};
+		$creceiver = $decoded_pmail;
 		$mailcount = 0;
 
-		$data->{pmail} = $creceiver;
+		$data->{pmail} = encode_entities($decoded_pmail);
+		$data->{pmail_raw} = $ref->{pmail};
 		$data->{managehref} = "$protocol_fqdn_port/quarantine";
 		if ($data->{authmode} ne 'ldap') {
 		    $data->{ticket} = PMG::Ticket::assemble_quarantine_ticket($data->{pmail});
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cc30e67..5c9e873 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1143,12 +1143,13 @@ sub rfc1522_to_html {
 	    my ($d, $cs) = @$r;
 	    if ($d) {
 		if ($cs) {
-		    $res .= encode_entities(decode($cs, $d));
+		    $res .= encode('UTF-8', decode($cs, $d));
 		} else {
-		    $res .= encode_entities($d);
+		    $res .= $d;
 		}
 	    }
 	}
+	$res = encode_entities(decode('UTF-8', $res));
     };
 
     $res = $enc if $@;
@@ -1257,7 +1258,7 @@ sub finalize_report {
 
     my $top = MIME::Entity->build(
 	Type    => "multipart/related",
-	To      => $data->{pmail},
+	To      => $data->{pmail_raw},
 	From    => $mailfrom,
 	Subject => bencode_header(decode_entities($title)));
 
-- 
2.30.2






* [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data.
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (6 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:26   ` Dominik Csapak
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/Statistic.pm | 67 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 17 deletions(-)

diff --git a/src/PMG/Statistic.pm b/src/PMG/Statistic.pm
index 6d27930..96ef61d 100755
--- a/src/PMG/Statistic.pm
+++ b/src/PMG/Statistic.pm
@@ -3,6 +3,7 @@ package PMG::Statistic;
 use strict;
 use warnings;
 use DBI;
+use Encode qw(encode);
 use Time::Local;
 use Time::Zone;
 
@@ -545,6 +546,22 @@ my $compute_sql_orderby = sub {
     return $orderby;
 };
 
+sub user_stat_to_perlstring {
+    my ($entry) = @_;
+
+    my $res = { };
+
+    for my $a (keys %$entry) {
+	if ($a eq 'receiver' || $a eq 'sender' || $a eq 'contact') {
+	    $res->{$a} = PMG::Utils::try_decode_utf8($entry->{$a});
+	} else {
+	    $res->{$a} = $entry->{$a};
+	}
+    }
+
+    return $res;
+}
+
 sub user_stat_contact_details {
     my ($self, $rdb, $receiver, $limit, $sorters, $filter) = @_;
 
@@ -554,19 +571,21 @@ sub user_stat_contact_details {
 
     my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
 
+    my $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%"));
+
     my $query = "SELECT * FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail " .
 	"AND NOT direction AND sender != '' AND receiver = ? " .
-	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
 	"ORDER BY $orderby limit $limit";
 
     my $sth = $rdb->{dbh}->prepare($query);
 
-    $sth->execute($receiver);
+    $sth->execute(encode('UTF-8',$receiver));
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -583,11 +602,14 @@ sub user_stat_contact {
 
     my $cond_good_mail = $self->query_cond_good_mail($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $query = "SELECT receiver as contact, count(*) AS count, sum (bytes) AS bytes, " .
 	"count (virusinfo) as viruscount " .
 	"FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid " .
-	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
 	"AND $cond_good_mail AND NOT direction AND sender != '' ";
 
     if ($advfilter) {
@@ -603,7 +625,7 @@ sub user_stat_contact {
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -620,20 +642,23 @@ sub user_stat_sender_details {
 
     my $cond_good_mail = $self->query_cond_good_mail($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $sth = $rdb->{dbh}->prepare(
 	"SELECT " .
 	"blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
 	"FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND " .
 	"$cond_good_mail AND NOT direction AND sender = ? " .
-	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
 	"ORDER BY $orderby limit $limit");
 
-    $sth->execute($sender);
+    $sth->execute(encode('UTF-8',$sender));
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -650,11 +675,14 @@ sub user_stat_sender {
 
     my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $query = "SELECT sender,count(*) AS count, sum (bytes) AS bytes, " .
 	"count (virusinfo) as viruscount, " .
 	"count (CASE WHEN spamlevel >= 3 THEN 1 ELSE NULL END) as spamcount " .
 	"FROM CStatistic WHERE $cond_good_mail AND NOT direction AND sender != '' " .
-	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
 	"GROUP BY sender ORDER BY $orderby limit $limit";
 
     my $sth = $rdb->{dbh}->prepare($query);
@@ -662,7 +690,7 @@ sub user_stat_sender {
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -679,18 +707,21 @@ sub user_stat_receiver_details {
 
     my $cond_good_mail = $self->query_cond_good_mail($from, $to);
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $sth = $rdb->{dbh}->prepare(
 	"SELECT blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
 	"FROM CStatistic, CReceivers " .
 	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail AND receiver = ? " .
-	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
 	"ORDER BY $orderby limit $limit");
 
-    $sth->execute($receiver);
+    $sth->execute(encode('UTF-8',$receiver));
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -708,6 +739,9 @@ sub user_stat_receiver {
     my $cond_good_mail = $self->query_cond_good_mail ($from, $to) . " AND " .
 	"receiver IS NOT NULL AND receiver != ''";
 
+    my $filter_pattern;
+    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
     my $query = "SELECT receiver, " .
 	"count(*) AS count, " .
 	"sum (bytes) AS bytes, " .
@@ -728,7 +762,7 @@ sub user_stat_receiver {
     }
 
     $query .= "AND $cond_good_mail and direction " .
-	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
 	"GROUP BY receiver ORDER BY $orderby LIMIT $limit";
 
     my $sth = $rdb->{dbh}->prepare($query);
@@ -736,7 +770,7 @@ sub user_stat_receiver {
 
     my $res = [];
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
 
     $sth->finish();
@@ -873,9 +907,8 @@ sub recent_receivers {
     my $sth =  $rdb->{dbh}->prepare($cmd);
 
     $sth->execute ($from, $limit);
-
     while (my $ref = $sth->fetchrow_hashref()) {
-	push @$res, $ref;
+	push @$res, user_stat_to_perlstring($ref);
     }
     $sth->finish();
 
-- 
2.30.2






* [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (7 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

matching the pattern in the backend (allowing most characters inside
of e-mail addresses).

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 js/UserBlackWhiteList.js | 2 +-
 js/Utils.js              | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/js/UserBlackWhiteList.js b/js/UserBlackWhiteList.js
index 4f4a756..44d75b3 100644
--- a/js/UserBlackWhiteList.js
+++ b/js/UserBlackWhiteList.js
@@ -127,7 +127,7 @@ Ext.define('PMG.UserBlackWhiteList', {
 	{
 	    xtype: 'combobox',
 	    displayField: 'mail',
-	    vtype: 'email',
+	    vtype: 'proxmoxMail',
 	    allowBlank: false,
 	    valueField: 'mail',
 	    store: {
diff --git a/js/Utils.js b/js/Utils.js
index dc924d2..7fa154e 100644
--- a/js/Utils.js
+++ b/js/Utils.js
@@ -898,3 +898,12 @@ Ext.define('PMG.Async', {
 	);
     },
 });
+
+// custom Vtypes
+Ext.apply(Ext.form.field.VTypes, {
+    // matches the pmg-email-address in pmg-api
+    PMGMail: function(v) {
+	return (/[^\s\\@]+@[^\s/\\@]+/).test(v);
+    },
+    PMGMailText: gettext('Example') + ": user@example.com",
+});
-- 
2.30.2






* [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (8 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
@ 2022-11-23  9:23 ` Stoiko Ivanov
  2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
  2022-11-26  7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
  11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23  9:23 UTC (permalink / raw)
  To: pmg-devel

to be able to add non-ascii addresses to the lists

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 js/UserBlackWhiteList.js | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/js/UserBlackWhiteList.js b/js/UserBlackWhiteList.js
index 44d75b3..1344496 100644
--- a/js/UserBlackWhiteList.js
+++ b/js/UserBlackWhiteList.js
@@ -127,7 +127,7 @@ Ext.define('PMG.UserBlackWhiteList', {
 	{
 	    xtype: 'combobox',
 	    displayField: 'mail',
-	    vtype: 'proxmoxMail',
+	    vtype: 'PMGMail',
 	    allowBlank: false,
 	    valueField: 'mail',
 	    store: {
-- 
2.30.2






* Re: [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (9 preceding siblings ...)
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
@ 2022-11-23 14:09 ` Dominik Csapak
  2022-11-26  7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
  11 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:09 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

all in all works mostly well,
tested various weird emails with various rules
that include emojis/non-ascii characters

(weird mails as in a mix of smtputf8,mixed charsets and quoted-printable fields
with mixed encodings, with and without non-ascii characters in the
sender/recipient)

things that did not work and need to be fixed if we want to apply this:

* LDAP, you mentioned it, but it fails in a really non obvious way
   and drops mails currently
* user wl/bl from the quarantine interface
   (some en/decode is missing, and garbage reaches the user lists)

things that worked in my tests:

* sending emails (with/without smtputf8)
* quarantining mails
* notification/modify/header/disclaimer/etc. with non-ascii characters
* various what/who objects with non-ascii characters
* greylisting with non-ascii characters in sender/recipient
* modifying user wl/bl
* matching user wl/bl
* log tracker
* statistics

i did find some things to note in the individual patches, i'll answer there





* Re: [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
@ 2022-11-23 14:15   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:15 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

i'd like to have some rationale for the changes in the commit message
at least for the more non-obvious ones (regex changes for example)

comments inline

On 11/23/22 10:23, Stoiko Ivanov wrote:
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/API2/Quarantine.pm | 10 +++++-----
>   src/PMG/HTMLMail.pm        |  7 ++++---
>   src/PMG/Quarantine.pm      | 13 +++++++------
>   src/PMG/RuleDB/Spam.pm     | 12 ++++++------
>   4 files changed, 22 insertions(+), 20 deletions(-)
> 
> diff --git a/src/PMG/API2/Quarantine.pm b/src/PMG/API2/Quarantine.pm
> index ddf7c04..819c78c 100644
> --- a/src/PMG/API2/Quarantine.pm
> +++ b/src/PMG/API2/Quarantine.pm
> @@ -141,8 +141,8 @@ my $parse_header_info = sub {
>       my $sender = PMG::Utils::decode_rfc1522(PVE::Tools::trim($head->get('sender')));
>       $res->{sender} = $sender if $sender && ($sender ne $res->{from});
>   
> -    $res->{envelope_sender} = $ref->{sender};
> -    $res->{receiver} = $ref->{receiver} // $ref->{pmail};
> +    $res->{envelope_sender} = PMG::Utils::try_decode_utf8($ref->{sender});
> +    $res->{receiver} = PMG::Utils::try_decode_utf8($ref->{receiver} // $ref->{pmail});

maybe we should note here in a comment that these are not headers
but part of the smtp dialog and cannot be quoted-printable/base64 encoded?

>       $res->{id} = 'C' . $ref->{cid} . 'R' . $ref->{rid} . 'T' . $ref->{ticketid};
>       $res->{time} = $ref->{time};
>       $res->{bytes} = $ref->{bytes};
> @@ -437,7 +437,7 @@ __PACKAGE__->register_method ({
>   	$sth->execute();
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    push @$res, { mail => $ref->{pmail} };
> +	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
>   	}
>   
>   	return $res;
> @@ -532,7 +532,7 @@ __PACKAGE__->register_method ({
>   	}
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    push @$res, { mail => $ref->{pmail} };
> +	    push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
>   	}
>   
>   	return $res;
> @@ -569,7 +569,7 @@ my $quarantine_api = sub {
>       }
>   
>       if ($check_pmail || $role eq 'quser') {
> -	$sth->execute($pmail);
> +	$sth->execute(encode('UTF-8', $pmail));
>       } else {
>   	$sth->execute();
>       }
> diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
> index 87f5c40..207c52c 100644
> --- a/src/PMG/HTMLMail.pm
> +++ b/src/PMG/HTMLMail.pm
> @@ -192,9 +192,10 @@ sub read_raw_email {
>       # read header
>       my $header;
>       while (defined(my $line = <$fh>)) {
> -	$raw_header .= $line;
> -	chomp $line;
> -	push @$header, $line;
> +	my $decoded_line = PMG::Utils::try_decode_utf8($line);
> +	$raw_header .= $decoded_line;
> +	chomp $decoded_line;
> +	push @$header, $decoded_line;
>   	last if $line =~ m/^\s*$/;
>       }
>   
> diff --git a/src/PMG/Quarantine.pm b/src/PMG/Quarantine.pm
> index 77af8cc..aa6b948 100644
> --- a/src/PMG/Quarantine.pm
> +++ b/src/PMG/Quarantine.pm
> @@ -3,6 +3,7 @@ package PMG::Quarantine;
>   use strict;
>   use warnings;
>   use Net::SMTP;
> +use Encode qw(encode);
>   
>   use PVE::SafeSyslog;
>   use PVE::Tools;
> @@ -16,7 +17,7 @@ sub add_to_blackwhite {
>   
>       my $name = $listname eq 'BL' ? 'BL' : 'WL';
>       my $oname = $listname eq 'BL' ? 'WL' : 'BL';
> -    my $qu = $dbh->quote ($username);
> +    my $qu = $dbh->quote (encode('UTF-8', $username));
>   
>       my $sth = $dbh->prepare(
>   	"SELECT * FROM UserPrefs WHERE pmail = $qu AND (Name = 'BL' OR Name = 'WL')");
> @@ -25,13 +26,13 @@ sub add_to_blackwhite {
>       my $list = { 'WL' => {}, 'BL' => {} };
>   
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	my $data = $ref->{data};
> +	my $data = PMG::Utils::try_decode_utf8($ref->{data});
>   	$data =~ s/[,;]/ /g;
>   	my @alist = split('\s+', $data);
>   
>   	my $tmp = {};
>   	foreach my $a (@alist) {
> -	    if ($a =~ m/^[[:ascii:]]+$/) {
> +	    if ($a =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/) {

that change seems a bit dangerous; maybe we should at least
filter out control characters here?

>   		$tmp->{$a} = 1;
>   	    }
>   	}
> @@ -50,7 +51,7 @@ sub add_to_blackwhite {
>   	    if ($delete) {
>   		delete($list->{$name}->{$v});
>   	    } else {
> -		if ($v =~ m/[[:^ascii:]]/) {
> +		if ($v =~ m/[\s\\]/) {

same here: going from 'non-ascii is forbidden' to 'only whitespace and \ are forbidden'
is a bit broad imho
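
One possible way to tighten both checks, as a sketch only: list_entry_ok is a hypothetical helper, not part of the patch. It keeps the shape of the new regex but also rejects Unicode control characters (\p{Cc}), and anchors with \z instead of $ so a trailing newline cannot slip through. Whether format characters (\p{Cf}) should be rejected as well is left open.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the patch): accept "local", "local@domain"
# or wildcard entries, but refuse whitespace, backslashes and control chars.
sub list_entry_ok {
    my ($entry) = @_;
    return $entry =~ m/^[^\s\\\@\p{Cc}]+(?:\@[^\s\/\\\@\p{Cc}]+)?\z/ ? 1 : 0;
}

my @candidates = (
    "user\@example.com",            # plain ascii
    "m\x{fc}ller\@example.com",     # non-ascii local part
    "evil\x{1b}[31m\@example.com",  # contains an ESC control character
);
my @accepted = grep { list_entry_ok($_) } @candidates;
```

The same character class could be reused for the rejecting check, e.g. m/[\s\\\p{Cc}]/ before the die.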

>   		    die "email address '$v' contains invalid characters\n";
>   		}
>   		$list->{$name}->{$v} = 1;
> @@ -58,8 +59,8 @@ sub add_to_blackwhite {
>   	    }
>   	}
>   
> -	my $wlist = $dbh->quote(join (',', keys %{$list->{WL}}) || '');
> -	my $blist = $dbh->quote(join (',', keys %{$list->{BL}}) || '');
> +	my $wlist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{WL}})) || '');
> +	my $blist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{BL}})) || '');
>   
>   	if (!$delete) {
>   	    my $maxlen = 200000;
> diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
> index 99056a3..bc1d422 100644
> --- a/src/PMG/RuleDB/Spam.pm
> +++ b/src/PMG/RuleDB/Spam.pm
> @@ -94,7 +94,7 @@ sub parse_addrlist {
>   	my $regex = $addr;
>   	# SA like checks
>   	$regex =~ s/[\000\\\(]/_/gs;		# is this really necessasry ?
> -	$regex =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g;	# escape possible metachars
> +	$regex =~ s/([^\*\?_\w])/\\$1/g;	# escape possible metachars

what does \w match here beyond a-zA-Z0-9?
(a short explanation in the commit message would be enough imo)
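
For reference, a small sketch of the difference, assuming default regex semantics (no 'unicode_strings' feature, no /a or /u modifier). On a decoded perl string, \w also matches non-ASCII word characters such as 'ü', so they are no longer backslash-escaped. On undecoded UTF-8 octets each high byte still counts as a non-word character and is escaped on its own. The wrapper name escape_metachars is made up for the demonstration; the substitution is the one from the diff.

```perl
use strict;
use warnings;
use Encode qw(decode);

# same substitution as in parse_addrlist, wrapped for demonstration
sub escape_metachars {
    my ($regex) = @_;
    $regex =~ s/([^\*\?_\w])/\\$1/g;    # escape possible metachars
    return $regex;
}

my $bytes = "m\xc3\xbcller\@example.com";    # UTF-8 encoded octets
my $chars = decode('UTF-8', $bytes);         # perl character string

# character string: only '@' and '.' get a backslash
my $escaped_chars = escape_metachars($chars);

# octets: the two bytes of 'ü' are escaped separately as well
my $escaped_bytes = escape_metachars($bytes);
```

This is also why the change only makes sense together with the try_decode_utf8 calls in get_blackwhite.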

>   	$regex =~ tr/?/./;			# replace "?" with "."
>   	$regex =~ s/\*+/\.\*/g;			# replace "*" with  ".*"
>   
> @@ -149,13 +149,13 @@ sub get_blackwhite {
>   	$sth->execute();
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    my $pmail = lc ($ref->{pmail});
> +	    my $pmail = lc (PMG::Utils::try_decode_utf8($ref->{pmail}));
>   	    if ($ref->{name} eq 'WL') {
>   		$target_info->{$pmail}->{whitelist} =
> -		    parse_addrlist($ref->{data});
> +		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
>   	    } elsif ($ref->{name} eq 'BL') {
>   		$target_info->{$pmail}->{blacklist} =
> -		    parse_addrlist($ref->{data});
> +		    parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
>   	    }
>   	}
>   
> @@ -205,7 +205,7 @@ sub what_match_targets {
>   		($list = $queue->{blackwhite}->{$pmail}->{whitelist}) &&
>   		check_addrlist($list, $queue->{all_from_addrs})) {
>   		syslog('info', "%s: sender in user (%s) whitelist",
> -		       $queue->{logid}, $pmail);
> +		       $queue->{logid}, encode('UTF-8', $pmail));
>   	    } else {
>   		$target_info->{$t}->{marks} = []; # never add additional marks here
>   		$target_info->{$t}->{spaminfo} = $info;
> @@ -234,7 +234,7 @@ sub what_match_targets {
>   		$target_info->{$t}->{marks} = [];
>   		$target_info->{$t}->{spaminfo} = $info;
>   		syslog ('info', "%s: sender in user (%s) blacklist",
> -			$queue->{logid}, $pmail);
> +			$queue->{logid}, encode('UTF-8',$pmail));
>   	    }
>   	}
>       }






* Re: [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
@ 2022-11-23 14:20   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:20 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

comments inline

On 11/23/22 10:23, Stoiko Ivanov wrote:
> $data->{pmail} is both used in the template rendering ('Spam Report for
> $pmail'), and as content for the To header, which need different
> treatment. Thus introduce 'pmail_raw' additionally.
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/CLI/pmgqm.pm | 24 +++++++++++++-----------
>   src/PMG/Utils.pm     |  7 ++++---
>   2 files changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/src/PMG/CLI/pmgqm.pm b/src/PMG/CLI/pmgqm.pm
> index dbec8ef..7293579 100755
> --- a/src/PMG/CLI/pmgqm.pm
> +++ b/src/PMG/CLI/pmgqm.pm
> @@ -2,6 +2,7 @@ package PMG::CLI::pmgqm;
>   
>   use strict;
>   use Data::Dumper;
> +use Encode qw(encode);
>   use Template;
>   use MIME::Entity;
>   use HTML::Entities;
> @@ -17,6 +18,7 @@ use PVE::SafeSyslog;
>   use PVE::Tools;
>   use PVE::INotify;
>   use PVE::CLIHandler;
> +use PVE::JSONSchema qw(get_standard_option);
>   
>   use PMG::RESTEnvironment;
>   use PMG::Utils;
> @@ -57,7 +59,7 @@ sub get_item_data {
>       }
>   
>       $item->{envelope_sender} = $ref->{sender};
> -    $item->{pmail} = $ref->{pmail};
> +    $item->{pmail} = encode_entities(PMG::Utils::try_decode_utf8($ref->{pmail}));
>       $item->{receiver} = $ref->{receiver} || $ref->{pmail};
>   
>       $item->{date} = strftime("%F", localtime($ref->{time}));
> @@ -157,11 +159,10 @@ __PACKAGE__->register_method ({
>       parameters => {
>   	additionalProperties => 0,
>   	properties => {
> -	    receiver => {
> +	    receiver => get_standard_option('pmg-email-address', {
>   		description => "Generate report for a single email address. If not specified, generate reports for all users.",
> -		type => 'string', format => 'email',
>   		optional => 1,
> -	    },
> +	    }),
>   	    timespan => {
>   		description => "Select time span.",
>   		type => 'string',
> @@ -175,11 +176,10 @@ __PACKAGE__->register_method ({
>   		enum => ['short', 'verbose', 'custom'],
>   		optional => 1,
>   	    },
> -	    redirect => {
> +	    redirect => get_standard_option('pmg-email-address', {
>   		description => "Redirect spam report email to this address.",
> -		type => 'string', format => 'email',
>   		optional => 1,
> -	    },
> +	    }),
>   	    debug => {
>   		description => "Debug mode. Print raw email to stdout instead of sending them.",
>   		type => 'boolean',
> @@ -280,7 +280,7 @@ __PACKAGE__->register_method ({
>   	    "ORDER BY pmail, time, receiver");
>   
>   	if ($target) {
> -	    $sth->execute($target);
> +	    $sth->execute(encode('UTF-8', $target));
>   	} else {
>   	    $sth->execute();
>   	}
> @@ -302,16 +302,18 @@ __PACKAGE__->register_method ({
>   	};
>   
>   	while (my $ref = $sth->fetchrow_hashref()) {
> -	    if ($creceiver ne $ref->{pmail}) {
> +	    my $decoded_pmail = PMG::Utils::try_decode_utf8($ref->{pmail});
> +	    if ($creceiver ne $decoded_pmail) {
>   
>   		$finalize->() if $data;
>   
>   		$data = clone($global_data);
>   
> -		$creceiver = $ref->{pmail};
> +		$creceiver = $decoded_pmail;
>   		$mailcount = 0;
>   
> -		$data->{pmail} = $creceiver;
> +		$data->{pmail} = encode_entities($decoded_pmail);
> +		$data->{pmail_raw} = $ref->{pmail};
>   		$data->{managehref} = "$protocol_fqdn_port/quarantine";
>   		if ($data->{authmode} ne 'ldap') {
>   		    $data->{ticket} = PMG::Ticket::assemble_quarantine_ticket($data->{pmail});
> diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
> index cc30e67..5c9e873 100644
> --- a/src/PMG/Utils.pm
> +++ b/src/PMG/Utils.pm
> @@ -1143,12 +1143,13 @@ sub rfc1522_to_html {
>   	    my ($d, $cs) = @$r;
>   	    if ($d) {
>   		if ($cs) {
> -		    $res .= encode_entities(decode($cs, $d));
> +		    $res .= encode('UTF-8', decode($cs, $d));
>   		} else {
> -		    $res .= encode_entities($d);
> +		    $res .= $d;
>   		}
>   	    }
>   	}
> +	$res = encode_entities(decode('UTF-8', $res));

this change is not really explained in the commit message
and is a bit confusing

couldn't we simply do:

encode_entities(decode_rfc1522($enc))

?

afaics rfc1522_to_html is mostly the same as decode_rfc1522,
but with an 'encode_entities' after decoding
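
A rough sketch of that idea. decode_rfc1522_sketch is only a stand-in built on Encode's 'MIME-Header' encoding; it assumes that PMG::Utils::decode_rfc1522 returns a perl character string (as the subject of patch 1/8 says) and does not reproduce its error handling.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Stand-in for PMG::Utils::decode_rfc1522 (assumption: returns characters)
sub decode_rfc1522_sketch {
    my ($enc) = @_;
    return decode('MIME-Header', $enc);
}

# decode first, then entity-encode the resulting character string
sub rfc1522_to_html_sketch {
    my ($enc) = @_;
    return encode_entities(decode_rfc1522_sketch($enc));
}

my $html = rfc1522_to_html_sketch('=?UTF-8?Q?Gr=C3=BC=C3=9Fe?=');
```

The order matters: entity-encoding the undecoded octets would emit one entity per byte instead of one per character.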


>       };
>   
>       $res = $enc if $@;
> @@ -1257,7 +1258,7 @@ sub finalize_report {
>   
>       my $top = MIME::Entity->build(
>   	Type    => "multipart/related",
> -	To      => $data->{pmail},
> +	To      => $data->{pmail_raw},
>   	From    => $mailfrom,
>   	Subject => bencode_header(decode_entities($title)));
>   






* Re: [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data.
  2022-11-23  9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
@ 2022-11-23 14:26   ` Dominik Csapak
  0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:26 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

again, a slightly more detailed commit message would be nice

On 11/23/22 10:23, Stoiko Ivanov wrote:
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/Statistic.pm | 67 +++++++++++++++++++++++++++++++++-----------
>   1 file changed, 50 insertions(+), 17 deletions(-)
> 
> diff --git a/src/PMG/Statistic.pm b/src/PMG/Statistic.pm
> index 6d27930..96ef61d 100755
> --- a/src/PMG/Statistic.pm
> +++ b/src/PMG/Statistic.pm
> @@ -3,6 +3,7 @@ package PMG::Statistic;
>   use strict;
>   use warnings;
>   use DBI;
> +use Encode qw(encode);
>   use Time::Local;
>   use Time::Zone;
>   
> @@ -545,6 +546,22 @@ my $compute_sql_orderby = sub {
>       return $orderby;
>   };
>   
> +sub user_stat_to_perlstring {
> +    my ($entry) = @_;
> +
> +    my $res = { };
> +
> +    for my $a (keys %$entry) {
> +	if ($a eq 'receiver' || $a eq 'sender' || $a eq 'contact') {
> +	    $res->{$a} = PMG::Utils::try_decode_utf8($entry->{$a});
> +	} else {
> +	    $res->{$a} = $entry->{$a};
> +	}
> +    }
> +
> +    return $res;
> +}
> +
>   sub user_stat_contact_details {
>       my ($self, $rdb, $receiver, $limit, $sorters, $filter) = @_;
>   
> @@ -554,19 +571,21 @@ sub user_stat_contact_details {
>   
>       my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
>   
> +    my $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%"));
> +
>       my $query = "SELECT * FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail " .
>   	"AND NOT direction AND sender != '' AND receiver = ? " .
> -	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
>   	"ORDER BY $orderby limit $limit";
>   
>       my $sth = $rdb->{dbh}->prepare($query);
>   
> -    $sth->execute($receiver);
> +    $sth->execute(encode('UTF-8',$receiver));
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -583,11 +602,14 @@ sub user_stat_contact {
>   
>       my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $query = "SELECT receiver as contact, count(*) AS count, sum (bytes) AS bytes, " .
>   	"count (virusinfo) as viruscount " .
>   	"FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid " .
> -	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
>   	"AND $cond_good_mail AND NOT direction AND sender != '' ";
>   
>       if ($advfilter) {
> @@ -603,7 +625,7 @@ sub user_stat_contact {
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -620,20 +642,23 @@ sub user_stat_sender_details {
>   
>       my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $sth = $rdb->{dbh}->prepare(
>   	"SELECT " .
>   	"blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
>   	"FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND " .
>   	"$cond_good_mail AND NOT direction AND sender = ? " .
> -	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
>   	"ORDER BY $orderby limit $limit");
>   
> -    $sth->execute($sender);
> +    $sth->execute(encode('UTF-8',$sender));
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -650,11 +675,14 @@ sub user_stat_sender {
>   
>       my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $query = "SELECT sender,count(*) AS count, sum (bytes) AS bytes, " .
>   	"count (virusinfo) as viruscount, " .
>   	"count (CASE WHEN spamlevel >= 3 THEN 1 ELSE NULL END) as spamcount " .
>   	"FROM CStatistic WHERE $cond_good_mail AND NOT direction AND sender != '' " .
> -	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
>   	"GROUP BY sender ORDER BY $orderby limit $limit";
>   
>       my $sth = $rdb->{dbh}->prepare($query);
> @@ -662,7 +690,7 @@ sub user_stat_sender {
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -679,18 +707,21 @@ sub user_stat_receiver_details {
>   
>       my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $sth = $rdb->{dbh}->prepare(
>   	"SELECT blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
>   	"FROM CStatistic, CReceivers " .
>   	"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail AND receiver = ? " .
> -	($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
>   	"ORDER BY $orderby limit $limit");
>   
> -    $sth->execute($receiver);
> +    $sth->execute(encode('UTF-8',$receiver));
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -708,6 +739,9 @@ sub user_stat_receiver {
>       my $cond_good_mail = $self->query_cond_good_mail ($from, $to) . " AND " .
>   	"receiver IS NOT NULL AND receiver != ''";
>   
> +    my $filter_pattern;
> +    $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
>       my $query = "SELECT receiver, " .
>   	"count(*) AS count, " .
>   	"sum (bytes) AS bytes, " .
> @@ -728,7 +762,7 @@ sub user_stat_receiver {
>       }
>   
>       $query .= "AND $cond_good_mail and direction " .
> -	($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> +	($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .

we have this pattern 6 times in this diff, wouldn't it be easier to do something like this:
(naming is not optimal, just what came to my mind)

sub sql_filter_text {
     my ($dbh, $field, $filter) = @_;
     my $filter_text = $filter ? "AND $field like ". $dbh->quote(...). " " : '';
     return $filter_text
}

and call it in the functions with

my $filter_text = sql_filter_text($rdb->{dbh}, 'receiver', $filter);

and simply use it with:

$query .= "...." . $filter_text . "...";

?
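
Filled in, such a helper could look roughly like the sketch below. The quote() argument is taken from the repeated pattern in the diff. StubDBH is a made-up stand-in for a DBI handle so that the snippet runs without a database; real callers would pass $rdb->{dbh}. Note that $field is interpolated as-is and must therefore be a trusted column name.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Build the optional "AND <field> like '%...%'" fragment in one place.
# $field must be a trusted column name, it is interpolated unquoted.
sub sql_filter_text {
    my ($dbh, $field, $filter) = @_;
    return '' if !$filter;
    my $pattern = $dbh->quote(encode('UTF-8', "%${filter}%"));
    return "AND $field like $pattern ";
}

# Made-up minimal replacement for a DBI handle (assumption for this sketch)
{
    package StubDBH;
    sub new { return bless {}, shift }
    sub quote { my ($self, $s) = @_; $s =~ s/'/''/g; return "'$s'" }
}

my $dbh = StubDBH->new();
my $filter_text = sql_filter_text($dbh, 'receiver', "m\x{fc}ller");
my $no_filter   = sql_filter_text($dbh, 'receiver', undef);
```

Having the 'if $filter' guard inside the helper would also keep all call sites consistent.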

>   	"GROUP BY receiver ORDER BY $orderby LIMIT $limit";
>   
>       my $sth = $rdb->{dbh}->prepare($query);
> @@ -736,7 +770,7 @@ sub user_stat_receiver {
>   
>       my $res = [];
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>   
>       $sth->finish();
> @@ -873,9 +907,8 @@ sub recent_receivers {
>       my $sth =  $rdb->{dbh}->prepare($cmd);
>   
>       $sth->execute ($from, $limit);
> -
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	push @$res, $ref;
> +	push @$res, user_stat_to_perlstring($ref);
>       }
>       $sth->finish();
>   






* [pmg-devel] applied-gui: [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
  2022-11-23  9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (10 preceding siblings ...)
  2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
@ 2022-11-26  7:00 ` Thomas Lamprecht
  11 siblings, 0 replies; 16+ messages in thread
From: Thomas Lamprecht @ 2022-11-26  7:00 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

Am 23/11/2022 um 10:23 schrieb Stoiko Ivanov:
> pmg-gui:
> Stoiko Ivanov (2):
>   utils: add custom validator for pmg-email-address
>   userblocklists: use PMGMail as validator for pmail

before I forget: applied those two yesterday, thanks!





