* [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
` (10 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context) - adapt it to return a perlstring, under the
assumption that data is utf-8 encoded (holds true for ascii and
smtputf8 mails)
The try_decode_utf8 helper sub will be used extensively in
later patches and is inspired by commit
43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that valid multibyte utf-8 characters do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£"), so if something decodes without error
from utf-8 it will in all likelihood have been utf-8 to begin with.
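A minimal standalone sketch of that heuristic, assuming only the core
Encode module. The helper body is copied from the diff below; the sample
byte strings are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Strict decode (CHECK = 1 croaks on malformed input); on failure the
# input is handed back unchanged.
sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

my $utf8_bytes   = "\xc2\xa3";  # UTF-8 encoding of U+00A3 POUND SIGN
my $latin1_bytes = "\xa3";      # the same character as a lone Latin-1 byte

my $decoded = try_decode_utf8($utf8_bytes);    # decodes to one character
my $kept    = try_decode_utf8($latin1_bytes);  # malformed UTF-8, returned as-is
my $ascii   = try_decode_utf8("plain");        # ASCII is unaffected

printf "decoded: %d char(s), U+%04X\n", length($decoded), ord($decoded);
printf "kept:    %d char(s), U+%04X\n", length($kept), ord($kept);
```

Both results compare equal as perl strings, which is why a failed decode
is harmless for single-byte input.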
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/Utils.pm | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..cfb8852 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1088,6 +1088,7 @@ sub decode_to_html {
return $res;
}
+# assume enc contains utf-8 and mime-encoded data returns a perl-string (with wide characters)
sub decode_rfc1522 {
my ($enc) = @_;
@@ -1102,7 +1103,7 @@ sub decode_rfc1522 {
if ($cs) {
$res .= decode($cs, $d);
} else {
- $res .= $d;
+ $res .= try_decode_utf8($d);
}
}
}
@@ -1542,4 +1543,9 @@ sub get_existing_object_id {
return;
}
+sub try_decode_utf8 {
+ my ($data) = @_;
+ return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
1;
--
2.30.2
* [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
` (9 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
by storing the variables as perl-string (not mime-encoded, and not
utf-8 encoded), and appropriately dealing with multi-line values to
input (folding the headers and encoding as mime).
This fixes another glitch not caught by
d3d6b5dff9e4447d16cb92e0fdf26f67d9384423:
the Subject was always displayed with a '?' at the end (due to the
quoted-printable encoded \n that was added).
Additionally adapt the other callsites of PMG::Utils::subst_values
where applicable.
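A rough standalone sketch of the folding steps, not the patch itself:
fold_header_value is a hypothetical name, and the core Encode
'MIME-Header' encoding stands in for MIME::Words::encode_mimewords (which
the patch uses) so that the example runs without extra modules. The
sample value is made up:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Encode each logical line separately, then join the lines as folded
# header continuation lines (newline followed by a tab).
sub fold_header_value {
    my ($value) = @_;

    my $res = '';
    foreach my $line (split(/\r?\n\s*/, $value)) {
        $res .= "\n" if $res;
        $res .= encode('MIME-Header', $line);  # stand-in for encode_mimewords
    }

    $res =~ s/\n/\n\t/sg;    # indent continuation lines
    $res =~ s/\n\s*\n//sg;   # remove empty lines
    $res =~ s/\n?\s*$//s;    # remove trailing whitespace and newline

    return $res;
}

# perl string with non-ascii characters, two lines, trailing blank line
my $out = fold_header_value("Gr\x{fc}\x{df}e\r\n  second line\n\n");
print "$out\n";
```

Because the newline is stripped before encoding, it never ends up inside
an encoded word, which is the cause of the trailing '?' described above.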
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/RuleDB/BCC.pm | 2 +-
src/PMG/RuleDB/ModField.pm | 13 +------------
src/PMG/RuleDB/Notify.pm | 4 ++--
src/PMG/Utils.pm | 17 +++++++++++++++++
src/bin/pmg-smtp-filter | 2 +-
5 files changed, 22 insertions(+), 16 deletions(-)
diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index d364690..4867d83 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -117,7 +117,7 @@ sub execute {
my $rulename = $vars->{RULE} // 'unknown';
- my $bcc_to = PMG::Utils::subst_values($self->{target}, $vars);
+ my $bcc_to = PMG::Utils::subst_values_for_header($self->{target}, $vars);
if ($bcc_to =~ m/^\s*$/) {
# this happens if a notification is triggered by bounce mails
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 4ebb618..34108d1 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -5,7 +5,6 @@ use warnings;
use DBI;
use Digest::SHA;
use Encode qw(encode decode);
-use MIME::Words qw(encode_mimewords);
use PMG::Utils;
use PMG::ModGroup;
@@ -98,17 +97,7 @@ sub execute {
my ($self, $queue, $ruledb, $mod_group, $targets,
$msginfo, $vars, $marks) = @_;
- my $fvalue = '';
-
- foreach my $line (split('\r?\n\s*',PMG::Utils::subst_values ($self->{field_value}, $vars))) {
- $fvalue .= "\n" if $fvalue;
- $fvalue .= encode_mimewords(encode('UTF-8', $line), 'Charset' => 'UTF-8');
- }
-
- # support for multiline values (i.e. __SPAM_INFO__)
- $fvalue =~ s/\n/\n\t/sg; # indent content
- $fvalue =~ s/\n\s*\n//sg; # remove empty line
- $fvalue =~ s/\n?\s*$//s; # remove trailing spaces
+ my $fvalue = PMG::Utils::subst_values_for_header($self->{field_value}, $vars);
my $subgroups = $mod_group->subgroups($targets);
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index d67221e..7b38e0d 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -211,8 +211,8 @@ sub execute {
my $rulename = $vars->{RULE} // 'unknown';
my $body = PMG::Utils::subst_values($self->{body}, $vars);
- my $subject = PMG::Utils::subst_values($self->{subject}, $vars);
- my $to = PMG::Utils::subst_values($self->{to}, $vars);
+ my $subject = PMG::Utils::subst_values_for_header($self->{subject}, $vars);
+ my $to = PMG::Utils::subst_values_for_header($self->{to}, $vars);
if ($to =~ m/^\s*$/) {
# this happens if a notification is triggered by bounce mails
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cfb8852..cc30e67 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -203,6 +203,23 @@ sub subst_values {
return $body;
}
+sub subst_values_for_header {
+ my ($header, $dh) = @_;
+
+ my $res = '';
+ foreach my $line (split('\r?\n\s*', subst_values ($header, $dh))) {
+ $res .= "\n" if $res;
+ $res .= MIME::Words::encode_mimewords(encode('UTF-8', $line), 'Charset' => 'UTF-8');
+ }
+
+ # support for multiline values (i.e. __SPAM_INFO__)
+ $res =~ s/\n/\n\t/sg; # indent content
+ $res =~ s/\n\s*\n//sg; # remove empty line
+ $res =~ s/\n?\s*$//s; # remove trailing spaces
+
+ return $res;
+}
+
sub reinject_mail {
my ($entity, $sender, $targets, $xforward, $me, $params) = @_;
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 35a6ac6..45e68a7 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -152,7 +152,7 @@ sub get_prox_vars {
} if !$spaminfo;
my $vars = {
- 'SUBJECT' => mime_to_perl_string($entity->head->get ('subject', 0) || 'No Subject'),
+ 'SUBJECT' => PMG::Utils::decode_rfc1522($entity->head->get ('subject', 0) || 'No Subject'),
'RULE' => $rule->{name},
'RULE_INFO' => $msginfo->{rule_info},
'SENDER' => $msginfo->{sender},
--
2.30.2
* [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
` (8 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
This patch adds support for storing rule names, comments(info), and
most relevant values (e.g. the header content to match) in utf-8 in
the database.
backwards-compatibility should not be an issue:
* currently the database should not contain any utf-8 multibyte
characters, as our tooling prevented this due to sending
wide-characters, which causes an exception in DBI.
* any character > 127 and < 256 will be correctly interpreted when
stored in a perl-string (this happens if the decode fails in
try_decode_utf8), and will be correctly encoded when storing into
the database.
the database is created with SQL_ASCII encoding - which behaves by
interpreting bytes <= 127 as ascii and those > 127 are not interpreted
(see [0], which just means that we have to explicitly en-/decode upon
storing/reading from there)
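The round trip described above can be sketched without a database,
assuming only the core Encode module. The variable names and sample
values are illustrative, and the byte strings stand in for what a
SQL_ASCII column would hand back:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Same helper as in PMG::Utils (patch 1/8).
sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

# New data: perl string -> UTF-8 bytes on store -> perl string on load.
my $name   = "R\x{e8}gle \x{2603}";     # contains a wide character (> 255)
my $stored = encode('UTF-8', $name);    # bytes only, safe to bind in DBI
my $loaded = try_decode_utf8($stored);

# Legacy data: a lone byte > 127 is not valid UTF-8, so the decode fails
# and the value is kept; perl treats the byte as the character U+00E9.
my $legacy         = "caf\xe9";
my $legacy_loaded  = try_decode_utf8($legacy);
my $legacy_resaved = encode('UTF-8', $legacy_loaded);  # now proper UTF-8

printf "stored %d bytes for %d characters\n", length($stored), length($name);
```

After the first re-save a legacy value is stored as UTF-8 and decodes
like any newly created entry.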
This patch currently omits most Who objects:
* for email/domain we'd still need to consider how to store them
(puny-code for the domain part, or everything as UTF-8) and it would
need changes to the API-types.
* the LDAP objects currently would not work too well, since our LDAPCache
is not UTF-8 safe - and fixing that warrants its own patch series
* WhoRegex should work and be able to handle many use-cases
The ContentType values should also contain only ascii characters per
RFC6838 [1] and RFC2045 [2].
[0] https://www.postgresql.org/docs/13/multibyte.html
[1] https://datatracker.ietf.org/doc/html/rfc6838#section-4.2
[2] https://datatracker.ietf.org/doc/html/rfc2045#section-5.1
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/RuleDB.pm | 24 ++++++++++++++++--------
src/PMG/RuleDB/Accept.pm | 2 +-
src/PMG/RuleDB/BCC.pm | 2 +-
src/PMG/RuleDB/Block.pm | 2 +-
src/PMG/RuleDB/Disclaimer.pm | 2 +-
src/PMG/RuleDB/Group.pm | 4 ++--
src/PMG/RuleDB/MatchField.pm | 8 ++++++--
src/PMG/RuleDB/MatchFilename.pm | 5 ++++-
src/PMG/RuleDB/ModField.pm | 6 ++++--
src/PMG/RuleDB/Notify.pm | 2 +-
src/PMG/RuleDB/Quarantine.pm | 3 ++-
src/PMG/RuleDB/Remove.pm | 12 +++++++-----
src/PMG/RuleDB/Rule.pm | 2 +-
src/PMG/RuleDB/WhoRegex.pm | 5 ++++-
14 files changed, 51 insertions(+), 28 deletions(-)
diff --git a/src/PMG/RuleDB.pm b/src/PMG/RuleDB.pm
index 895acc6..a6b0b79 100644
--- a/src/PMG/RuleDB.pm
+++ b/src/PMG/RuleDB.pm
@@ -5,6 +5,7 @@ use warnings;
use DBI;
use HTML::Entities;
use Data::Dumper;
+use Encode qw(encode);
use PVE::SafeSyslog;
@@ -70,8 +71,8 @@ sub create_group_with_obj {
defined($obj) || die "proxmox: undefined object";
- $name //= '';
- $info //= '';
+ $name = encode('UTF-8', $name // '');
+ $info = encode('UTF-8', $info // '');
eval {
@@ -174,7 +175,9 @@ sub save_group {
$self->{dbh}->do("UPDATE Objectgroup " .
"SET Name = ?, Info = ? " .
"WHERE ID = ?", undef,
- $og->{name}, $og->{info}, $og->{id});
+ encode('UTF-8', $og->{name}),
+ encode('UTF-8', $og->{info}),
+ $og->{id});
return $og->{id};
@@ -183,7 +186,7 @@ sub save_group {
"INSERT INTO Objectgroup (Name, Info, Class) " .
"VALUES (?, ?, ?);");
- $sth->execute($og->name, $og->info, $og->class);
+ $sth->execute(encode('UTF-8', $og->name), encode('UTF-8', $og->info), $og->class);
return $og->{id} = PMG::Utils::lastid($self->{dbh}, 'objectgroup_id_seq');
}
@@ -212,7 +215,9 @@ sub delete_group {
$sth->execute($groupid);
if (my $ref = $sth->fetchrow_hashref()) {
- die "Group '$ref->{groupname}' is used by rule '$ref->{rulename}' - unable to delete\n";
+ my $groupname = PMG::Utils::try_decode_utf8($ref->{groupname});
+ my $rulename = PMG::Utils::try_decode_utf8($ref->{rulename});
+ die "Group '$groupname' is used by rule '$rulename' - unable to delete\n";
}
$sth->finish();
@@ -474,6 +479,7 @@ sub load_object_full {
sub load_group_by_name {
my ($self, $name) = @_;
+ $name = encode('UTF-8', $name);
my $sth = $self->{dbh}->prepare("SELECT * FROM Objectgroup " .
"WHERE name = ?");
@@ -598,13 +604,14 @@ sub save_rule {
defined($rule->{direction}) ||
die "undefined rule attribute - direction: ERROR";
+ my $rulename = encode('UTF-8', $rule->{name});
if (defined($rule->{id})) {
$self->{dbh}->do(
"UPDATE Rule " .
"SET Name = ?, Priority = ?, Active = ?, Direction = ? " .
"WHERE ID = ?", undef,
- $rule->{name}, $rule->{priority}, $rule->{active},
+ $rulename, $rule->{priority}, $rule->{active},
$rule->{direction}, $rule->{id});
return $rule->{id};
@@ -614,7 +621,7 @@ sub save_rule {
"INSERT INTO Rule (Name, Priority, Active, Direction) " .
"VALUES (?, ?, ?, ?);");
- $sth->execute($rule->name, $rule->priority, $rule->active,
+ $sth->execute($rulename, $rule->priority, $rule->active,
$rule->direction);
return $rule->{id} = PMG::Utils::lastid($self->{dbh}, 'rule_id_seq');
@@ -779,7 +786,8 @@ sub load_rules {
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
- my $rule = PMG::RuleDB::Rule->new($ref->{name}, $ref->{priority},
+ my $rulename = PMG::Utils::try_decode_utf8($ref->{name});
+ my $rule = PMG::RuleDB::Rule->new($rulename, $ref->{priority},
$ref->{active}, $ref->{direction});
$rule->{id} = $ref->{id};
push @$rules, $rule;
diff --git a/src/PMG/RuleDB/Accept.pm b/src/PMG/RuleDB/Accept.pm
index cd67ea2..4ebd6da 100644
--- a/src/PMG/RuleDB/Accept.pm
+++ b/src/PMG/RuleDB/Accept.pm
@@ -93,7 +93,7 @@ sub execute {
my $dkim = $msginfo->{dkim} // {};
my $subgroups = $mod_group->subgroups($targets, !$dkim->{sign});
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
foreach my $ta (@$subgroups) {
my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index 4867d83..6244dd9 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -115,7 +115,7 @@ sub execute {
my $subgroups = $mod_group->subgroups($targets, 1);
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
my $bcc_to = PMG::Utils::subst_values_for_header($self->{target}, $vars);
diff --git a/src/PMG/RuleDB/Block.pm b/src/PMG/RuleDB/Block.pm
index c758787..25bb74e 100644
--- a/src/PMG/RuleDB/Block.pm
+++ b/src/PMG/RuleDB/Block.pm
@@ -89,7 +89,7 @@ sub execute {
my ($self, $queue, $ruledb, $mod_group, $targets,
$msginfo, $vars, $marks) = @_;
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
if ($msginfo->{testmode}) {
my $fh = $msginfo->{test_fh};
diff --git a/src/PMG/RuleDB/Disclaimer.pm b/src/PMG/RuleDB/Disclaimer.pm
index d3003b2..c6afe54 100644
--- a/src/PMG/RuleDB/Disclaimer.pm
+++ b/src/PMG/RuleDB/Disclaimer.pm
@@ -193,7 +193,7 @@ sub execute {
my ($self, $queue, $ruledb, $mod_group, $targets,
$msginfo, $vars, $marks) = @_;
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
my $subgroups = $mod_group->subgroups($targets);
diff --git a/src/PMG/RuleDB/Group.pm b/src/PMG/RuleDB/Group.pm
index 2508305..baa68ce 100644
--- a/src/PMG/RuleDB/Group.pm
+++ b/src/PMG/RuleDB/Group.pm
@@ -12,8 +12,8 @@ sub new {
my ($type, $name, $info, $class) = @_;
my $self = {
- name => $name,
- info => $info,
+ name => PMG::Utils::try_decode_utf8($name),
+ info => PMG::Utils::try_decode_utf8($info),
class => $class,
};
diff --git a/src/PMG/RuleDB/MatchField.pm b/src/PMG/RuleDB/MatchField.pm
index 2671ea4..2b56058 100644
--- a/src/PMG/RuleDB/MatchField.pm
+++ b/src/PMG/RuleDB/MatchField.pm
@@ -4,6 +4,7 @@ use strict;
use warnings;
use DBI;
use Digest::SHA;
+use Encode qw(encode);
use MIME::Words;
use PVE::SafeSyslog;
@@ -50,9 +51,10 @@ sub load_attr {
defined($field) || die "undefined object attribute: ERROR";
defined($field_value) || die "undefined object attribute: ERROR";
+ my $decoded_field_value = PMG::Utils::try_decode_utf8($field_value);
# use known constructor, bless afterwards (because sub class can have constructor
# with other parameter signature).
- my $obj = PMG::RuleDB::MatchField->new($field, $field_value, $ogroup);
+ my $obj = PMG::RuleDB::MatchField->new($field, $decoded_field_value, $ogroup);
bless $obj, $class;
$obj->{id} = $id;
@@ -69,6 +71,7 @@ sub save {
my $new_value = "$self->{field}:$self->{field_value}";
$new_value =~ s/\\/\\\\/g;
+ $new_value = encode('UTF-8', $new_value);
if (defined ($self->{id})) {
# update
@@ -105,7 +108,8 @@ sub parse_entity {
for my $value ($entity->head->get_all($self->{field})) {
chomp $value;
- my $decvalue = MIME::Words::decode_mimewords($value);
+ my $decvalue = PMG::Utils::decode_rfc1522($value);
+ $decvalue = PMG::Utils::try_decode_utf8($decvalue);
if ($decvalue =~ m|$self->{field_value}|i) {
push @$res, $id;
diff --git a/src/PMG/RuleDB/MatchFilename.pm b/src/PMG/RuleDB/MatchFilename.pm
index 7e5b486..c9cdbe0 100644
--- a/src/PMG/RuleDB/MatchFilename.pm
+++ b/src/PMG/RuleDB/MatchFilename.pm
@@ -4,6 +4,7 @@ use strict;
use warnings;
use DBI;
use Digest::SHA;
+use Encode qw(encode);
use MIME::Words;
use PMG::Utils;
@@ -41,8 +42,9 @@ sub load_attr {
my $class = ref($type) || $type;
defined($value) || die "undefined value: ERROR";;
+ my $decvalue = PMG::Utils::try_decode_utf8($value);
- my $obj = $class->new($value, $ogroup);
+ my $obj = $class->new($decvalue, $ogroup);
$obj->{id} = $id;
$obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -57,6 +59,7 @@ sub save {
my $new_value = $self->{fname};
$new_value =~ s/\\/\\\\/g;
+ $new_value = encode('UTF-8', $new_value);
if (defined($self->{id})) {
# update
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 34108d1..6232322 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -56,7 +56,9 @@ sub load_attr {
(defined($field) && defined($field_value)) || return undef;
- my $obj = $class->new($field, $field_value, $ogroup);
+ my $dec_field_value = PMG::Utils::try_decode_utf8($field_value);
+
+ my $obj = $class->new($field, $dec_field_value, $ogroup);
$obj->{id} = $id;
$obj->{digest} = Digest::SHA::sha1_hex($id, $field, $field_value, $ogroup);
@@ -69,7 +71,7 @@ sub save {
defined($self->{ogroup}) || return undef;
- my $new_value = "$self->{field}:$self->{field_value}";
+ my $new_value = encode('UTF-8', "$self->{field}:$self->{field_value}");
if (defined ($self->{id})) {
# update
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index 7b38e0d..8a9945b 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -208,7 +208,7 @@ sub execute {
my $from = 'postmaster';
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
my $body = PMG::Utils::subst_values($self->{body}, $vars);
my $subject = PMG::Utils::subst_values_for_header($self->{subject}, $vars);
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 1426393..9d802fe 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -4,6 +4,7 @@ use strict;
use warnings;
use DBI;
use Digest::SHA;
+use Encode qw(encode);
use PVE::SafeSyslog;
@@ -89,7 +90,7 @@ sub execute {
my $subgroups = $mod_group->subgroups($targets, 1);
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
foreach my $ta (@$subgroups) {
my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index 6b27b91..da6c25f 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -63,12 +63,14 @@ sub load_attr {
defined ($value) || die "undefined value: ERROR";
- my $obj;
+ my ($obj, $text);
if ($value =~ m/^([01])\,([01])(\:(.*))?$/s) {
- $obj = $class->new($1, $4, $ogroup, $2);
+ $text = PMG::Utils::try_decode_utf8($4);
+ $obj = $class->new($1, $text, $ogroup, $2);
} elsif ($value =~ m/^([01])(\:(.*))?$/s) {
- $obj = $class->new($1, $3, $ogroup);
+ $text = PMG::Utils::try_decode_utf8($3);
+ $obj = $class->new($1, $text, $ogroup);
} else {
$obj = $class->new(0, undef, $ogroup);
}
@@ -89,7 +91,7 @@ sub save {
$value .= ','. ($self->{quarantine} ? '1' : '0');
if ($self->{text}) {
- $value .= ":$self->{text}";
+ $value .= encode('UTF-8', ":$self->{text}");
}
if (defined ($self->{id})) {
@@ -194,7 +196,7 @@ sub execute {
my ($self, $queue, $ruledb, $mod_group, $targets,
$msginfo, $vars, $marks, $ldap) = @_;
- my $rulename = $vars->{RULE} // 'unknown';
+ my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
if (!$self->{all} && ($#$marks == -1)) {
# no marks
diff --git a/src/PMG/RuleDB/Rule.pm b/src/PMG/RuleDB/Rule.pm
index c49ad21..e7c9146 100644
--- a/src/PMG/RuleDB/Rule.pm
+++ b/src/PMG/RuleDB/Rule.pm
@@ -12,7 +12,7 @@ sub new {
my ($type, $name, $priority, $active, $direction) = @_;
my $self = {
- name => $name // '',
+ name => PMG::Utils::try_decode_utf8($name) // '',
priority => $priority // 0,
active => $active // 0,
};
diff --git a/src/PMG/RuleDB/WhoRegex.pm b/src/PMG/RuleDB/WhoRegex.pm
index 37ec3aa..5c13604 100644
--- a/src/PMG/RuleDB/WhoRegex.pm
+++ b/src/PMG/RuleDB/WhoRegex.pm
@@ -4,6 +4,7 @@ use strict;
use warnings;
use DBI;
use Digest::SHA;
+use Encode qw(encode);
use PMG::Utils;
use PMG::RuleDB::Object;
@@ -43,7 +44,8 @@ sub load_attr {
defined($value) || die "undefined value: ERROR";
- my $obj = $class->new ($value, $ogroup);
+ my $decoded_value = PMG::Utils::try_decode_utf8($value);
+ my $obj = $class->new ($decoded_value, $ogroup);
$obj->{id} = $id;
$obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -59,6 +61,7 @@ sub save {
my $adr = $self->{address};
$adr =~ s/\\/\\\\/g;
+ $adr = encode('UTF-8', $adr);
if (defined ($self->{id})) {
# update
--
2.30.2
* [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (2 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
` (7 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
as done in 114655f4fdb07c789a361b2f397f5345eafd16c6 for Accept and
Block.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/RuleDB/BCC.pm | 19 +++++++++++++++++--
src/PMG/RuleDB/Notify.pm | 18 ++++++++++++++++--
src/PMG/RuleDB/Quarantine.pm | 16 ++++++++++++++--
src/PMG/RuleDB/Remove.pm | 8 +++++++-
4 files changed, 54 insertions(+), 7 deletions(-)
diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index 6244dd9..0f016f8 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -3,6 +3,7 @@ package PMG::RuleDB::BCC;
use strict;
use warnings;
use DBI;
+use Encode qw(encode);
use PVE::SafeSyslog;
@@ -164,10 +165,24 @@ sub execute {
$entity, $msginfo->{sender}, \@bcc_targets,
$msginfo->{xforward}, $msginfo->{fqdn}, $param);
foreach (@bcc_targets) {
+ my $target = encode('UTF-8', $_);
if ($qid) {
- syslog('info', "%s: bcc to <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+ syslog(
+ 'info',
+ "%s: bcc to <%s> (rule: %s, %s)",
+ $queue->{logid},
+ $target,
+ $rulename,
+ $qid,
+ );
} else {
- syslog('err', "%s: bcc to <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+ syslog(
+ 'err',
+ "%s: bcc to <%s> (rule: %s) failed",
+ $queue->{logid},
+ $target,
+ $rulename,
+ );
}
}
}
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index 8a9945b..68f9b4e 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -259,10 +259,24 @@ sub execute {
my $qid = PMG::Utils::reinject_mail(
$top, $from, \@targets, undef, $msginfo->{fqdn});
foreach (@targets) {
+ my $target = encode('UTF-8', $_);
if ($qid) {
- syslog('info', "%s: notify <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+ syslog(
+ 'info',
+ "%s: notify <%s> (rule: %s, %s)",
+ $queue->{logid},
+ $target,
+ $rulename,
+ $qid,
+ );
} else {
- syslog ('err', "%s: notify <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+ syslog (
+ 'err',
+ "%s: notify <%s> (rule: %s) failed",
+ $queue->{logid},
+ $target,
+ $rulename,
+ );
}
}
}
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 9d802fe..0fc8352 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -101,7 +101,13 @@ sub execute {
if (my $qid = $queue->quarantine_mail($ruledb, 'V', $entity, $tg, $msginfo, $vars, $ldap)) {
foreach (@$tg) {
- syslog ('info', "$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)", $_, $qid, $rulename);
+ syslog (
+ 'info',
+ "$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)",
+ encode('UTF-8',$_),
+ $qid,
+ $rulename,
+ );
}
$queue->set_status ($tg, 'delivered');
@@ -111,7 +117,13 @@ sub execute {
if (my $qid = $queue->quarantine_mail($ruledb, 'S', $entity, $tg, $msginfo, $vars, $ldap)) {
foreach (@$tg) {
- syslog ('info', "$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)", $_, $qid, $rulename);
+ syslog (
+ 'info',
+ "$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)",
+ encode('UTF-8',$_),
+ $qid,
+ $rulename,
+ );
}
$queue->set_status($tg, 'delivered');
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index da6c25f..e7c353c 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -235,7 +235,13 @@ sub execute {
}
foreach (@$tg) {
- syslog ('info', "$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)", $_, $qid, $rulename);
+ syslog (
+ 'info',
+ "$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)",
+ encode('UTF-8',$_),
+ $qid,
+ $rulename,
+ );
}
}
}
--
2.30.2
* [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (3 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
` (6 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
the envelope addresses are used in the rule-system for lookups and
statistics. When the mail is received with smtputf8 the addresses are
decoded (multi-byte perl-strings) and thus need encoding before using
them as parameter in a database query.
This patch encodes the addresses as utf-8 for the relevant queries
unconditionally, because envelope-senders should either be:
* (a subset of) ascii (no smtputf8) - which is invariant for utf-8
encoding
* valid utf-8 (smtputf8)
The patch does not address the issues with multi-byte addresses in our
LDAP-implementation (hence the partial fix), but should still be an
improvement for many deployments.
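A small sketch of why the unconditional encoding is safe, assuming only
the core Encode module; the addresses are made-up examples:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $ascii_addr = 'user@example.com';          # plain ascii envelope address
my $utf8_addr  = "j\x{f6}rg\@example.com";    # decoded smtputf8 address

my $ascii_bytes = encode('UTF-8', $ascii_addr);  # identical to the input
my $utf8_bytes  = encode('UTF-8', $utf8_addr);   # U+00F6 becomes two bytes

printf "%s: %d chars -> %d bytes\n", 'ascii', length($ascii_addr), length($ascii_bytes);
printf "%s: %d chars -> %d bytes\n", 'utf8 ', length($utf8_addr), length($utf8_bytes);
```

ASCII addresses pass through byte for byte, so only smtputf8 addresses
change when they are encoded before being quoted for a query.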
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/MailQueue.pm | 10 ++++++----
src/PMG/RuleDB/Spam.pm | 5 +++--
src/bin/pmg-smtp-filter | 5 +++--
3 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/src/PMG/MailQueue.pm b/src/PMG/MailQueue.pm
index 2841b07..8355c30 100644
--- a/src/PMG/MailQueue.pm
+++ b/src/PMG/MailQueue.pm
@@ -6,6 +6,7 @@ use warnings;
use PVE::SafeSyslog;
use MIME::Parser;
use IO::File;
+use Encode;
use File::Sync;
use File::Basename;
use File::Path;
@@ -141,6 +142,7 @@ sub quarantinedb_insert {
my ($self, $ruledb, $lcid, $ldap, $qtype, $header, $sender, $file, $targets, $vars) = @_;
eval {
+ $sender = encode('UTF-8', $sender);
my $dbh = $ruledb->{dbh};
my $insert_cmds = "SELECT nextval ('cmailstore_id_seq'); INSERT INTO CMailStore " .
@@ -188,11 +190,11 @@ sub quarantinedb_insert {
if ($pmail eq lc ($r)) {
$receiver = "NULL";
} else {
- $receiver = $dbh->quote ($r);
+ $receiver = $dbh->quote (encode('UTF-8', $r));
}
- $pmail = $dbh->quote ($pmail);
+ $pmail = $dbh->quote (encode('UTF-8', $pmail));
$insert_cmds .= "INSERT INTO CMSReceivers " .
"(CMailStore_CID, CMailStore_RID, PMail, Receiver, TicketID, Status, MTime) " .
"VALUES ($lcid, currval ('cmailstore_id_seq'), $pmail, $receiver, $tid, 'N', $now); ";
@@ -294,8 +296,8 @@ sub quarantine_mail {
$entity->head->delete ('Return-Path');
# prepend Delivered-To and Return-Path (like QMAIL MAILDIR FORMAT)
- $entity->head->add ('Return-Path', join (',', $sender), 0);
- $entity->head->add ('Delivered-To', join (',', @$tg), 0);
+ $entity->head->add ('Return-Path', encode('UTF-8', join (',', $sender)), 0);
+ $entity->head->add ('Delivered-To', encode('UTF-8', join (',', @$tg)), 0);
$entity->print ($fh);
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index cc9a347..99056a3 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -4,6 +4,7 @@ use strict;
use warnings;
use DBI;
use Digest::SHA;
+use Encode qw(encode);
use Time::HiRes qw (gettimeofday);
use PVE::SafeSyslog;
@@ -135,8 +136,8 @@ sub get_blackwhite {
my $cond = '';
foreach my $r (@$targets) {
my $pmail = $msginfo->{pmail}->{$r} || lc ($r);
- my $qr = $dbh->quote ($pmail);
- $cond .= " OR " if $cond;
+ my $qr = $dbh->quote (encode('UTF-8', $pmail));
+ $cond .= " OR " if $cond;
$cond .= "pmail = $qr";
}
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 45e68a7..911e9cd 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -4,6 +4,7 @@ use strict;
use warnings;
use Carp;
+use Encode qw(encode);
use Getopt::Long;
use Time::HiRes qw (usleep gettimeofday tv_interval);
use POSIX qw(:sys_wait_h errno_h signal_h);
@@ -791,10 +792,10 @@ sub handle_smtp {
$insert_cmds .= ($queue->{sa_score} || 0) . ',';
$insert_cmds .= $dbh->quote($queue->{vinfo}) . ',';
$insert_cmds .= $time_total . ',';
- $insert_cmds .= $dbh->quote($msginfo->{sender}) . ');';
+ $insert_cmds .= $dbh->quote(encode('UTF-8', $msginfo->{sender})) . ');';
foreach my $r (@{$msginfo->{targets}}) {
- my $tmp = $dbh->quote($r);
+ my $tmp = $dbh->quote(encode('UTF-8',$r));
my $blocked = $queue->{status}->{$r} eq 'blocked' ? 1 : 0;
$insert_cmds .= "INSERT INTO CReceivers (CStatistic_CID, CStatistic_RID, Receiver, Blocked) " .
"VALUES ($lcid, currval ('cstatistic_id_seq'), $tmp, '$blocked'); ";
--
2.30.2
* [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (4 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 14:15 ` Dominik Csapak
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
` (5 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/API2/Quarantine.pm | 10 +++++-----
src/PMG/HTMLMail.pm | 7 ++++---
src/PMG/Quarantine.pm | 13 +++++++------
src/PMG/RuleDB/Spam.pm | 12 ++++++------
4 files changed, 22 insertions(+), 20 deletions(-)
diff --git a/src/PMG/API2/Quarantine.pm b/src/PMG/API2/Quarantine.pm
index ddf7c04..819c78c 100644
--- a/src/PMG/API2/Quarantine.pm
+++ b/src/PMG/API2/Quarantine.pm
@@ -141,8 +141,8 @@ my $parse_header_info = sub {
my $sender = PMG::Utils::decode_rfc1522(PVE::Tools::trim($head->get('sender')));
$res->{sender} = $sender if $sender && ($sender ne $res->{from});
- $res->{envelope_sender} = $ref->{sender};
- $res->{receiver} = $ref->{receiver} // $ref->{pmail};
+ $res->{envelope_sender} = PMG::Utils::try_decode_utf8($ref->{sender});
+ $res->{receiver} = PMG::Utils::try_decode_utf8($ref->{receiver} // $ref->{pmail});
$res->{id} = 'C' . $ref->{cid} . 'R' . $ref->{rid} . 'T' . $ref->{ticketid};
$res->{time} = $ref->{time};
$res->{bytes} = $ref->{bytes};
@@ -437,7 +437,7 @@ __PACKAGE__->register_method ({
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, { mail => $ref->{pmail} };
+ push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
}
return $res;
@@ -532,7 +532,7 @@ __PACKAGE__->register_method ({
}
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, { mail => $ref->{pmail} };
+ push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
}
return $res;
@@ -569,7 +569,7 @@ my $quarantine_api = sub {
}
if ($check_pmail || $role eq 'quser') {
- $sth->execute($pmail);
+ $sth->execute(encode('UTF-8', $pmail));
} else {
$sth->execute();
}
diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
index 87f5c40..207c52c 100644
--- a/src/PMG/HTMLMail.pm
+++ b/src/PMG/HTMLMail.pm
@@ -192,9 +192,10 @@ sub read_raw_email {
# read header
my $header;
while (defined(my $line = <$fh>)) {
- $raw_header .= $line;
- chomp $line;
- push @$header, $line;
+ my $decoded_line = PMG::Utils::try_decode_utf8($line);
+ $raw_header .= $decoded_line;
+ chomp $decoded_line;
+ push @$header, $decoded_line;
last if $line =~ m/^\s*$/;
}
diff --git a/src/PMG/Quarantine.pm b/src/PMG/Quarantine.pm
index 77af8cc..aa6b948 100644
--- a/src/PMG/Quarantine.pm
+++ b/src/PMG/Quarantine.pm
@@ -3,6 +3,7 @@ package PMG::Quarantine;
use strict;
use warnings;
use Net::SMTP;
+use Encode qw(encode);
use PVE::SafeSyslog;
use PVE::Tools;
@@ -16,7 +17,7 @@ sub add_to_blackwhite {
my $name = $listname eq 'BL' ? 'BL' : 'WL';
my $oname = $listname eq 'BL' ? 'WL' : 'BL';
- my $qu = $dbh->quote ($username);
+ my $qu = $dbh->quote (encode('UTF-8', $username));
my $sth = $dbh->prepare(
"SELECT * FROM UserPrefs WHERE pmail = $qu AND (Name = 'BL' OR Name = 'WL')");
@@ -25,13 +26,13 @@ sub add_to_blackwhite {
my $list = { 'WL' => {}, 'BL' => {} };
while (my $ref = $sth->fetchrow_hashref()) {
- my $data = $ref->{data};
+ my $data = PMG::Utils::try_decode_utf8($ref->{data});
$data =~ s/[,;]/ /g;
my @alist = split('\s+', $data);
my $tmp = {};
foreach my $a (@alist) {
- if ($a =~ m/^[[:ascii:]]+$/) {
+ if ($a =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/) {
$tmp->{$a} = 1;
}
}
@@ -50,7 +51,7 @@ sub add_to_blackwhite {
if ($delete) {
delete($list->{$name}->{$v});
} else {
- if ($v =~ m/[[:^ascii:]]/) {
+ if ($v =~ m/[\s\\]/) {
die "email address '$v' contains invalid characters\n";
}
$list->{$name}->{$v} = 1;
@@ -58,8 +59,8 @@ sub add_to_blackwhite {
}
}
- my $wlist = $dbh->quote(join (',', keys %{$list->{WL}}) || '');
- my $blist = $dbh->quote(join (',', keys %{$list->{BL}}) || '');
+ my $wlist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{WL}})) || '');
+ my $blist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{BL}})) || '');
if (!$delete) {
my $maxlen = 200000;
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index 99056a3..bc1d422 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -94,7 +94,7 @@ sub parse_addrlist {
my $regex = $addr;
# SA like checks
$regex =~ s/[\000\\\(]/_/gs; # is this really necessasry ?
- $regex =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g; # escape possible metachars
+ $regex =~ s/([^\*\?_\w])/\\$1/g; # escape possible metachars
$regex =~ tr/?/./; # replace "?" with "."
$regex =~ s/\*+/\.\*/g; # replace "*" with ".*"
@@ -149,13 +149,13 @@ sub get_blackwhite {
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
- my $pmail = lc ($ref->{pmail});
+ my $pmail = lc (PMG::Utils::try_decode_utf8($ref->{pmail}));
if ($ref->{name} eq 'WL') {
$target_info->{$pmail}->{whitelist} =
- parse_addrlist($ref->{data});
+ parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
} elsif ($ref->{name} eq 'BL') {
$target_info->{$pmail}->{blacklist} =
- parse_addrlist($ref->{data});
+ parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
}
}
@@ -205,7 +205,7 @@ sub what_match_targets {
($list = $queue->{blackwhite}->{$pmail}->{whitelist}) &&
check_addrlist($list, $queue->{all_from_addrs})) {
syslog('info', "%s: sender in user (%s) whitelist",
- $queue->{logid}, $pmail);
+ $queue->{logid}, encode('UTF-8', $pmail));
} else {
$target_info->{$t}->{marks} = []; # never add additional marks here
$target_info->{$t}->{spaminfo} = $info;
@@ -234,7 +234,7 @@ sub what_match_targets {
$target_info->{$t}->{marks} = [];
$target_info->{$t}->{spaminfo} = $info;
syslog ('info', "%s: sender in user (%s) blacklist",
- $queue->{logid}, $pmail);
+ $queue->{logid}, encode('UTF-8',$pmail));
}
}
}
--
2.30.2
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
@ 2022-11-23 14:15 ` Dominik Csapak
0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:15 UTC (permalink / raw)
To: Stoiko Ivanov, pmg-devel
i'd like to have some rationale for the changes in the commit message,
at least for the less obvious ones (the regex changes, for example)
comments inline
On 11/23/22 10:23, Stoiko Ivanov wrote:
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
> src/PMG/API2/Quarantine.pm | 10 +++++-----
> src/PMG/HTMLMail.pm | 7 ++++---
> src/PMG/Quarantine.pm | 13 +++++++------
> src/PMG/RuleDB/Spam.pm | 12 ++++++------
> 4 files changed, 22 insertions(+), 20 deletions(-)
>
> diff --git a/src/PMG/API2/Quarantine.pm b/src/PMG/API2/Quarantine.pm
> index ddf7c04..819c78c 100644
> --- a/src/PMG/API2/Quarantine.pm
> +++ b/src/PMG/API2/Quarantine.pm
> @@ -141,8 +141,8 @@ my $parse_header_info = sub {
> my $sender = PMG::Utils::decode_rfc1522(PVE::Tools::trim($head->get('sender')));
> $res->{sender} = $sender if $sender && ($sender ne $res->{from});
>
> - $res->{envelope_sender} = $ref->{sender};
> - $res->{receiver} = $ref->{receiver} // $ref->{pmail};
> + $res->{envelope_sender} = PMG::Utils::try_decode_utf8($ref->{sender});
> + $res->{receiver} = PMG::Utils::try_decode_utf8($ref->{receiver} // $ref->{pmail});
maybe we should note here in a comment that these are not headers
but part of the smtp dialog and cannot be quoted-printable/base64 encoded?
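A minimal sketch of why a plain UTF-8 decode is enough for these two fields, assuming (as noted above) that envelope addresses come from the SMTP dialog as raw octets and never as MIME encoded words. try_decode_utf8 is copied from patch 1/8; the sample addresses are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Copied from patch 1/8 of this series.
sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

# Hypothetical envelope address as stored in the database: UTF-8 octets.
my $stored = "j\xc3\xbcrgen\@example.com";
my $decoded = try_decode_utf8($stored);    # perl string, the umlaut is one character

# Octets that are not valid UTF-8 (here latin1) are passed through unchanged.
my $latin1 = "j\xfcrgen\@example.com";
my $untouched = try_decode_utf8($latin1);
```

No decode_rfc1522 call is involved here, which is the point such a code comment could make.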
> $res->{id} = 'C' . $ref->{cid} . 'R' . $ref->{rid} . 'T' . $ref->{ticketid};
> $res->{time} = $ref->{time};
> $res->{bytes} = $ref->{bytes};
> @@ -437,7 +437,7 @@ __PACKAGE__->register_method ({
> $sth->execute();
>
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, { mail => $ref->{pmail} };
> + push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
> }
>
> return $res;
> @@ -532,7 +532,7 @@ __PACKAGE__->register_method ({
> }
>
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, { mail => $ref->{pmail} };
> + push @$res, { mail => PMG::Utils::try_decode_utf8($ref->{pmail}) };
> }
>
> return $res;
> @@ -569,7 +569,7 @@ my $quarantine_api = sub {
> }
>
> if ($check_pmail || $role eq 'quser') {
> - $sth->execute($pmail);
> + $sth->execute(encode('UTF-8', $pmail));
> } else {
> $sth->execute();
> }
> diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
> index 87f5c40..207c52c 100644
> --- a/src/PMG/HTMLMail.pm
> +++ b/src/PMG/HTMLMail.pm
> @@ -192,9 +192,10 @@ sub read_raw_email {
> # read header
> my $header;
> while (defined(my $line = <$fh>)) {
> - $raw_header .= $line;
> - chomp $line;
> - push @$header, $line;
> + my $decoded_line = PMG::Utils::try_decode_utf8($line);
> + $raw_header .= $decoded_line;
> + chomp $decoded_line;
> + push @$header, $decoded_line;
> last if $line =~ m/^\s*$/;
> }
>
> diff --git a/src/PMG/Quarantine.pm b/src/PMG/Quarantine.pm
> index 77af8cc..aa6b948 100644
> --- a/src/PMG/Quarantine.pm
> +++ b/src/PMG/Quarantine.pm
> @@ -3,6 +3,7 @@ package PMG::Quarantine;
> use strict;
> use warnings;
> use Net::SMTP;
> +use Encode qw(encode);
>
> use PVE::SafeSyslog;
> use PVE::Tools;
> @@ -16,7 +17,7 @@ sub add_to_blackwhite {
>
> my $name = $listname eq 'BL' ? 'BL' : 'WL';
> my $oname = $listname eq 'BL' ? 'WL' : 'BL';
> - my $qu = $dbh->quote ($username);
> + my $qu = $dbh->quote (encode('UTF-8', $username));
>
> my $sth = $dbh->prepare(
> "SELECT * FROM UserPrefs WHERE pmail = $qu AND (Name = 'BL' OR Name = 'WL')");
> @@ -25,13 +26,13 @@ sub add_to_blackwhite {
> my $list = { 'WL' => {}, 'BL' => {} };
>
> while (my $ref = $sth->fetchrow_hashref()) {
> - my $data = $ref->{data};
> + my $data = PMG::Utils::try_decode_utf8($ref->{data});
> $data =~ s/[,;]/ /g;
> my @alist = split('\s+', $data);
>
> my $tmp = {};
> foreach my $a (@alist) {
> - if ($a =~ m/^[[:ascii:]]+$/) {
> + if ($a =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/) {
that change seems a bit dangerous, maybe we should at least
filter out some control characters here?
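One way this could look, as a sketch only: keep the new shape check from the hunk above, but reject any control character first. The helper name is hypothetical and not part of the patch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the patch: same shape check as the new
# regex above, preceded by a control-character filter.
sub is_acceptable_list_entry {
    my ($addr) = @_;
    # \s already covers tab and newline; [[:cntrl:]] also catches NUL, BEL, ESC, DEL
    return 0 if $addr =~ m/[[:cntrl:]]/;
    return $addr =~ m/^[^\s\\\@]+(?:\@[^\s\/\\\@]+)?$/ ? 1 : 0;
}
```

With this, "us\x07er\@example.com" is rejected, while the regex from the patch alone would accept it, since BEL is neither whitespace nor a backslash.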
> $tmp->{$a} = 1;
> }
> }
> @@ -50,7 +51,7 @@ sub add_to_blackwhite {
> if ($delete) {
> delete($list->{$name}->{$v});
> } else {
> - if ($v =~ m/[[:^ascii:]]/) {
> + if ($v =~ m/[\s\\]/) {
same here, going from 'non-ascii is forbidden' to 'only whitespace and \ are forbidden'
is a bit broad imho
> die "email address '$v' contains invalid characters\n";
> }
> $list->{$name}->{$v} = 1;
> @@ -58,8 +59,8 @@ sub add_to_blackwhite {
> }
> }
>
> - my $wlist = $dbh->quote(join (',', keys %{$list->{WL}}) || '');
> - my $blist = $dbh->quote(join (',', keys %{$list->{BL}}) || '');
> + my $wlist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{WL}})) || '');
> + my $blist = $dbh->quote(encode('UTF-8', join (',', keys %{$list->{BL}})) || '');
>
> if (!$delete) {
> my $maxlen = 200000;
> diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
> index 99056a3..bc1d422 100644
> --- a/src/PMG/RuleDB/Spam.pm
> +++ b/src/PMG/RuleDB/Spam.pm
> @@ -94,7 +94,7 @@ sub parse_addrlist {
> my $regex = $addr;
> # SA like checks
> $regex =~ s/[\000\\\(]/_/gs; # is this really necessasry ?
> - $regex =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g; # escape possible metachars
> + $regex =~ s/([^\*\?_\w])/\\$1/g; # escape possible metachars
what does \w include here beyond a-zA-Z0-9?
(a short explanation in the commit message would be enough imo)
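For illustration: on a decoded perl string, \w matches Unicode word characters (letters, digits, marks and connector punctuation such as _), not only a-zA-Z0-9. The sketch below applies the old and the new substitution from the hunk above to a made-up address with a Cyrillic local part; the helper names are hypothetical.

```perl
use strict;
use warnings;

# Old substitution: everything outside ASCII alphanumerics, '*', '?' and '_'
# gets a backslash in front of it.
sub escape_old {
    my ($s) = @_;
    $s =~ s/([^\*\?_a-zA-Z0-9])/\\$1/g;
    return $s;
}

# New substitution: Unicode word characters are left alone as well.
sub escape_new {
    my ($s) = @_;
    $s =~ s/([^\*\?_\w])/\\$1/g;
    return $s;
}

# Hypothetical decoded address: five Cyrillic letters, then '@example.com'.
my $addr = "\x{43f}\x{43e}\x{447}\x{442}\x{430}\@example.com";

my $old = escape_old($addr);   # backslash before each Cyrillic letter, '@' and '.'
my $new = escape_new($addr);   # backslash only before '@' and '.'
```

So once the list data is decoded to a perl string, the old character class would put a backslash in front of every non-ASCII letter, and the new one only escapes actual punctuation.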
> $regex =~ tr/?/./; # replace "?" with "."
> $regex =~ s/\*+/\.\*/g; # replace "*" with ".*"
>
> @@ -149,13 +149,13 @@ sub get_blackwhite {
> $sth->execute();
>
> while (my $ref = $sth->fetchrow_hashref()) {
> - my $pmail = lc ($ref->{pmail});
> + my $pmail = lc (PMG::Utils::try_decode_utf8($ref->{pmail}));
> if ($ref->{name} eq 'WL') {
> $target_info->{$pmail}->{whitelist} =
> - parse_addrlist($ref->{data});
> + parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
> } elsif ($ref->{name} eq 'BL') {
> $target_info->{$pmail}->{blacklist} =
> - parse_addrlist($ref->{data});
> + parse_addrlist(PMG::Utils::try_decode_utf8($ref->{data}));
> }
> }
>
> @@ -205,7 +205,7 @@ sub what_match_targets {
> ($list = $queue->{blackwhite}->{$pmail}->{whitelist}) &&
> check_addrlist($list, $queue->{all_from_addrs})) {
> syslog('info', "%s: sender in user (%s) whitelist",
> - $queue->{logid}, $pmail);
> + $queue->{logid}, encode('UTF-8', $pmail));
> } else {
> $target_info->{$t}->{marks} = []; # never add additional marks here
> $target_info->{$t}->{spaminfo} = $info;
> @@ -234,7 +234,7 @@ sub what_match_targets {
> $target_info->{$t}->{marks} = [];
> $target_info->{$t}->{spaminfo} = $info;
> syslog ('info', "%s: sender in user (%s) blacklist",
> - $queue->{logid}, $pmail);
> + $queue->{logid}, encode('UTF-8',$pmail));
> }
> }
> }
^ permalink raw reply [flat|nested] 16+ messages in thread
* [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (5 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 14:20 ` Dominik Csapak
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
` (4 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
$data->{pmail} is both used in the template rendering ('Spam Report for
$pmail'), and as content for the To header, which need different
treatment. Thus introduce 'pmail_raw' additionally.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/CLI/pmgqm.pm | 24 +++++++++++++-----------
src/PMG/Utils.pm | 7 ++++---
2 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/src/PMG/CLI/pmgqm.pm b/src/PMG/CLI/pmgqm.pm
index dbec8ef..7293579 100755
--- a/src/PMG/CLI/pmgqm.pm
+++ b/src/PMG/CLI/pmgqm.pm
@@ -2,6 +2,7 @@ package PMG::CLI::pmgqm;
use strict;
use Data::Dumper;
+use Encode qw(encode);
use Template;
use MIME::Entity;
use HTML::Entities;
@@ -17,6 +18,7 @@ use PVE::SafeSyslog;
use PVE::Tools;
use PVE::INotify;
use PVE::CLIHandler;
+use PVE::JSONSchema qw(get_standard_option);
use PMG::RESTEnvironment;
use PMG::Utils;
@@ -57,7 +59,7 @@ sub get_item_data {
}
$item->{envelope_sender} = $ref->{sender};
- $item->{pmail} = $ref->{pmail};
+ $item->{pmail} = encode_entities(PMG::Utils::try_decode_utf8($ref->{pmail}));
$item->{receiver} = $ref->{receiver} || $ref->{pmail};
$item->{date} = strftime("%F", localtime($ref->{time}));
@@ -157,11 +159,10 @@ __PACKAGE__->register_method ({
parameters => {
additionalProperties => 0,
properties => {
- receiver => {
+ receiver => get_standard_option('pmg-email-address', {
description => "Generate report for a single email address. If not specified, generate reports for all users.",
- type => 'string', format => 'email',
optional => 1,
- },
+ }),
timespan => {
description => "Select time span.",
type => 'string',
@@ -175,11 +176,10 @@ __PACKAGE__->register_method ({
enum => ['short', 'verbose', 'custom'],
optional => 1,
},
- redirect => {
+ redirect => get_standard_option('pmg-email-address', {
description => "Redirect spam report email to this address.",
- type => 'string', format => 'email',
optional => 1,
- },
+ }),
debug => {
description => "Debug mode. Print raw email to stdout instead of sending them.",
type => 'boolean',
@@ -280,7 +280,7 @@ __PACKAGE__->register_method ({
"ORDER BY pmail, time, receiver");
if ($target) {
- $sth->execute($target);
+ $sth->execute(encode('UTF-8', $target));
} else {
$sth->execute();
}
@@ -302,16 +302,18 @@ __PACKAGE__->register_method ({
};
while (my $ref = $sth->fetchrow_hashref()) {
- if ($creceiver ne $ref->{pmail}) {
+ my $decoded_pmail = PMG::Utils::try_decode_utf8($ref->{pmail});
+ if ($creceiver ne $decoded_pmail) {
$finalize->() if $data;
$data = clone($global_data);
- $creceiver = $ref->{pmail};
+ $creceiver = $decoded_pmail;
$mailcount = 0;
- $data->{pmail} = $creceiver;
+ $data->{pmail} = encode_entities($decoded_pmail);
+ $data->{pmail_raw} = $ref->{pmail};
$data->{managehref} = "$protocol_fqdn_port/quarantine";
if ($data->{authmode} ne 'ldap') {
$data->{ticket} = PMG::Ticket::assemble_quarantine_ticket($data->{pmail});
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cc30e67..5c9e873 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1143,12 +1143,13 @@ sub rfc1522_to_html {
my ($d, $cs) = @$r;
if ($d) {
if ($cs) {
- $res .= encode_entities(decode($cs, $d));
+ $res .= encode('UTF-8', decode($cs, $d));
} else {
- $res .= encode_entities($d);
+ $res .= $d;
}
}
}
+ $res = encode_entities(decode('UTF-8', $res));
};
$res = $enc if $@;
@@ -1257,7 +1258,7 @@ sub finalize_report {
my $top = MIME::Entity->build(
Type => "multipart/related",
- To => $data->{pmail},
+ To => $data->{pmail_raw},
From => $mailfrom,
Subject => bencode_header(decode_entities($title)));
--
2.30.2
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
@ 2022-11-23 14:20 ` Dominik Csapak
0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:20 UTC (permalink / raw)
To: Stoiko Ivanov, pmg-devel
comments inline
On 11/23/22 10:23, Stoiko Ivanov wrote:
> $data->{pmail} is both used in the template rendering ('Spam Report for
> $pmail'), and as content for the To header, which need different
> treatment. Thus introduce 'pmail_raw' additionally.
>
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
> src/PMG/CLI/pmgqm.pm | 24 +++++++++++++-----------
> src/PMG/Utils.pm | 7 ++++---
> 2 files changed, 17 insertions(+), 14 deletions(-)
>
> diff --git a/src/PMG/CLI/pmgqm.pm b/src/PMG/CLI/pmgqm.pm
> index dbec8ef..7293579 100755
> --- a/src/PMG/CLI/pmgqm.pm
> +++ b/src/PMG/CLI/pmgqm.pm
> @@ -2,6 +2,7 @@ package PMG::CLI::pmgqm;
>
> use strict;
> use Data::Dumper;
> +use Encode qw(encode);
> use Template;
> use MIME::Entity;
> use HTML::Entities;
> @@ -17,6 +18,7 @@ use PVE::SafeSyslog;
> use PVE::Tools;
> use PVE::INotify;
> use PVE::CLIHandler;
> +use PVE::JSONSchema qw(get_standard_option);
>
> use PMG::RESTEnvironment;
> use PMG::Utils;
> @@ -57,7 +59,7 @@ sub get_item_data {
> }
>
> $item->{envelope_sender} = $ref->{sender};
> - $item->{pmail} = $ref->{pmail};
> + $item->{pmail} = encode_entities(PMG::Utils::try_decode_utf8($ref->{pmail}));
> $item->{receiver} = $ref->{receiver} || $ref->{pmail};
>
> $item->{date} = strftime("%F", localtime($ref->{time}));
> @@ -157,11 +159,10 @@ __PACKAGE__->register_method ({
> parameters => {
> additionalProperties => 0,
> properties => {
> - receiver => {
> + receiver => get_standard_option('pmg-email-address', {
> description => "Generate report for a single email address. If not specified, generate reports for all users.",
> - type => 'string', format => 'email',
> optional => 1,
> - },
> + }),
> timespan => {
> description => "Select time span.",
> type => 'string',
> @@ -175,11 +176,10 @@ __PACKAGE__->register_method ({
> enum => ['short', 'verbose', 'custom'],
> optional => 1,
> },
> - redirect => {
> + redirect => get_standard_option('pmg-email-address', {
> description => "Redirect spam report email to this address.",
> - type => 'string', format => 'email',
> optional => 1,
> - },
> + }),
> debug => {
> description => "Debug mode. Print raw email to stdout instead of sending them.",
> type => 'boolean',
> @@ -280,7 +280,7 @@ __PACKAGE__->register_method ({
> "ORDER BY pmail, time, receiver");
>
> if ($target) {
> - $sth->execute($target);
> + $sth->execute(encode('UTF-8', $target));
> } else {
> $sth->execute();
> }
> @@ -302,16 +302,18 @@ __PACKAGE__->register_method ({
> };
>
> while (my $ref = $sth->fetchrow_hashref()) {
> - if ($creceiver ne $ref->{pmail}) {
> + my $decoded_pmail = PMG::Utils::try_decode_utf8($ref->{pmail});
> + if ($creceiver ne $decoded_pmail) {
>
> $finalize->() if $data;
>
> $data = clone($global_data);
>
> - $creceiver = $ref->{pmail};
> + $creceiver = $decoded_pmail;
> $mailcount = 0;
>
> - $data->{pmail} = $creceiver;
> + $data->{pmail} = encode_entities($decoded_pmail);
> + $data->{pmail_raw} = $ref->{pmail};
> $data->{managehref} = "$protocol_fqdn_port/quarantine";
> if ($data->{authmode} ne 'ldap') {
> $data->{ticket} = PMG::Ticket::assemble_quarantine_ticket($data->{pmail});
> diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
> index cc30e67..5c9e873 100644
> --- a/src/PMG/Utils.pm
> +++ b/src/PMG/Utils.pm
> @@ -1143,12 +1143,13 @@ sub rfc1522_to_html {
> my ($d, $cs) = @$r;
> if ($d) {
> if ($cs) {
> - $res .= encode_entities(decode($cs, $d));
> + $res .= encode('UTF-8', decode($cs, $d));
> } else {
> - $res .= encode_entities($d);
> + $res .= $d;
> }
> }
> }
> + $res = encode_entities(decode('UTF-8', $res));
this change is not really explained in the commit message
and is a bit confusing
couldn't we simply do:
encode_entities(decode_rfc1522($enc))
?
afaics rfc1522_to_html is mostly the same as decode_rfc1522,
but with an 'encode_entities' after decoding
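A rough sketch of that composition, under loud assumptions: so that it runs without PMG or HTML::Entities installed, it uses core Encode's 'MIME-Header' decoder as a stand-in for decode_rfc1522 and a reduced, hypothetical escaper as a stand-in for encode_entities (the real one emits named entities such as &uuml;). All three sub names are made up.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for PMG::Utils::decode_rfc1522 (assumption): core Encode's
# 'MIME-Header' decoder turns encoded words into a perl string.
sub decode_rfc1522_standin {
    my ($enc) = @_;
    return decode('MIME-Header', $enc);
}

# Stand-in for HTML::Entities::encode_entities (hypothetical, reduced):
# escapes markup characters and emits numeric entities for non-ASCII.
sub encode_entities_standin {
    my ($str) = @_;
    $str =~ s/&/&amp;/g;
    $str =~ s/</&lt;/g;
    $str =~ s/>/&gt;/g;
    $str =~ s/"/&quot;/g;
    $str =~ s/([^\x00-\x7f])/sprintf('&#%d;', ord($1))/ge;
    return $str;
}

# Decode first, entity-encode once at the end.
sub rfc1522_to_html_sketch {
    my ($enc) = @_;
    my $res = eval { encode_entities_standin(decode_rfc1522_standin($enc)) };
    return $@ ? $enc : $res;    # same fallback as the existing '$res = $enc if $@'
}

my $html = rfc1522_to_html_sketch('=?UTF-8?Q?J=C3=BCrgen?= <j@example.com>');
```

The design point is only the order of operations: entity encoding happens on the fully decoded perl string, so there is no need to encode to UTF-8 and decode again inside the loop.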
> };
>
> $res = $enc if $@;
> @@ -1257,7 +1258,7 @@ sub finalize_report {
>
> my $top = MIME::Entity->build(
> Type => "multipart/related",
> - To => $data->{pmail},
> + To => $data->{pmail_raw},
> From => $mailfrom,
> Subject => bencode_header(decode_entities($title)));
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data.
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (6 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 14:26 ` Dominik Csapak
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
` (3 subsequent siblings)
11 siblings, 1 reply; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/Statistic.pm | 67 +++++++++++++++++++++++++++++++++-----------
1 file changed, 50 insertions(+), 17 deletions(-)
diff --git a/src/PMG/Statistic.pm b/src/PMG/Statistic.pm
index 6d27930..96ef61d 100755
--- a/src/PMG/Statistic.pm
+++ b/src/PMG/Statistic.pm
@@ -3,6 +3,7 @@ package PMG::Statistic;
use strict;
use warnings;
use DBI;
+use Encode qw(encode);
use Time::Local;
use Time::Zone;
@@ -545,6 +546,22 @@ my $compute_sql_orderby = sub {
return $orderby;
};
+sub user_stat_to_perlstring {
+ my ($entry) = @_;
+
+ my $res = { };
+
+ for my $a (keys %$entry) {
+ if ($a eq 'receiver' || $a eq 'sender' || $a eq 'contact') {
+ $res->{$a} = PMG::Utils::try_decode_utf8($entry->{$a});
+ } else {
+ $res->{$a} = $entry->{$a};
+ }
+ }
+
+ return $res;
+}
+
sub user_stat_contact_details {
my ($self, $rdb, $receiver, $limit, $sorters, $filter) = @_;
@@ -554,19 +571,21 @@ sub user_stat_contact_details {
my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
+ my $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%"));
+
my $query = "SELECT * FROM CStatistic, CReceivers " .
"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail " .
"AND NOT direction AND sender != '' AND receiver = ? " .
- ($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+ ($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
"ORDER BY $orderby limit $limit";
my $sth = $rdb->{dbh}->prepare($query);
- $sth->execute($receiver);
+ $sth->execute(encode('UTF-8',$receiver));
my $res = [];
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
@@ -583,11 +602,14 @@ sub user_stat_contact {
my $cond_good_mail = $self->query_cond_good_mail($from, $to);
+ my $filter_pattern;
+ $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
my $query = "SELECT receiver as contact, count(*) AS count, sum (bytes) AS bytes, " .
"count (virusinfo) as viruscount " .
"FROM CStatistic, CReceivers " .
"WHERE cid = cstatistic_cid AND rid = cstatistic_rid " .
- ($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+ ($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
"AND $cond_good_mail AND NOT direction AND sender != '' ";
if ($advfilter) {
@@ -603,7 +625,7 @@ sub user_stat_contact {
my $res = [];
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
@@ -620,20 +642,23 @@ sub user_stat_sender_details {
my $cond_good_mail = $self->query_cond_good_mail($from, $to);
+ my $filter_pattern;
+ $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
my $sth = $rdb->{dbh}->prepare(
"SELECT " .
"blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
"FROM CStatistic, CReceivers " .
"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND " .
"$cond_good_mail AND NOT direction AND sender = ? " .
- ($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+ ($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
"ORDER BY $orderby limit $limit");
- $sth->execute($sender);
+ $sth->execute(encode('UTF-8',$sender));
my $res = [];
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
@@ -650,11 +675,14 @@ sub user_stat_sender {
my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
+ my $filter_pattern;
+ $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
my $query = "SELECT sender,count(*) AS count, sum (bytes) AS bytes, " .
"count (virusinfo) as viruscount, " .
"count (CASE WHEN spamlevel >= 3 THEN 1 ELSE NULL END) as spamcount " .
"FROM CStatistic WHERE $cond_good_mail AND NOT direction AND sender != '' " .
- ($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+ ($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
"GROUP BY sender ORDER BY $orderby limit $limit";
my $sth = $rdb->{dbh}->prepare($query);
@@ -662,7 +690,7 @@ sub user_stat_sender {
my $res = [];
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
@@ -679,18 +707,21 @@ sub user_stat_receiver_details {
my $cond_good_mail = $self->query_cond_good_mail($from, $to);
+ my $filter_pattern;
+ $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
my $sth = $rdb->{dbh}->prepare(
"SELECT blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
"FROM CStatistic, CReceivers " .
"WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail AND receiver = ? " .
- ($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+ ($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
"ORDER BY $orderby limit $limit");
- $sth->execute($receiver);
+ $sth->execute(encode('UTF-8',$receiver));
my $res = [];
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
@@ -708,6 +739,9 @@ sub user_stat_receiver {
my $cond_good_mail = $self->query_cond_good_mail ($from, $to) . " AND " .
"receiver IS NOT NULL AND receiver != ''";
+ my $filter_pattern;
+ $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
+
my $query = "SELECT receiver, " .
"count(*) AS count, " .
"sum (bytes) AS bytes, " .
@@ -728,7 +762,7 @@ sub user_stat_receiver {
}
$query .= "AND $cond_good_mail and direction " .
- ($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
+ ($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
"GROUP BY receiver ORDER BY $orderby LIMIT $limit";
my $sth = $rdb->{dbh}->prepare($query);
@@ -736,7 +770,7 @@ sub user_stat_receiver {
my $res = [];
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
@@ -873,9 +907,8 @@ sub recent_receivers {
my $sth = $rdb->{dbh}->prepare($cmd);
$sth->execute ($from, $limit);
-
while (my $ref = $sth->fetchrow_hashref()) {
- push @$res, $ref;
+ push @$res, user_stat_to_perlstring($ref);
}
$sth->finish();
--
2.30.2
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data.
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
@ 2022-11-23 14:26 ` Dominik Csapak
0 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:26 UTC (permalink / raw)
To: Stoiko Ivanov, pmg-devel
again, a bit more commit message would be nice
On 11/23/22 10:23, Stoiko Ivanov wrote:
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
> src/PMG/Statistic.pm | 67 +++++++++++++++++++++++++++++++++-----------
> 1 file changed, 50 insertions(+), 17 deletions(-)
>
> diff --git a/src/PMG/Statistic.pm b/src/PMG/Statistic.pm
> index 6d27930..96ef61d 100755
> --- a/src/PMG/Statistic.pm
> +++ b/src/PMG/Statistic.pm
> @@ -3,6 +3,7 @@ package PMG::Statistic;
> use strict;
> use warnings;
> use DBI;
> +use Encode qw(encode);
> use Time::Local;
> use Time::Zone;
>
> @@ -545,6 +546,22 @@ my $compute_sql_orderby = sub {
> return $orderby;
> };
>
> +sub user_stat_to_perlstring {
> + my ($entry) = @_;
> +
> + my $res = { };
> +
> + for my $a (keys %$entry) {
> + if ($a eq 'receiver' || $a eq 'sender' || $a eq 'contact') {
> + $res->{$a} = PMG::Utils::try_decode_utf8($entry->{$a});
> + } else {
> + $res->{$a} = $entry->{$a};
> + }
> + }
> +
> + return $res;
> +}
> +
> sub user_stat_contact_details {
> my ($self, $rdb, $receiver, $limit, $sorters, $filter) = @_;
>
> @@ -554,19 +571,21 @@ sub user_stat_contact_details {
>
> my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
>
> + my $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%"));
> +
> my $query = "SELECT * FROM CStatistic, CReceivers " .
> "WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail " .
> "AND NOT direction AND sender != '' AND receiver = ? " .
> - ($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> + ($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
> "ORDER BY $orderby limit $limit";
>
> my $sth = $rdb->{dbh}->prepare($query);
>
> - $sth->execute($receiver);
> + $sth->execute(encode('UTF-8',$receiver));
>
> my $res = [];
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
>
> $sth->finish();
> @@ -583,11 +602,14 @@ sub user_stat_contact {
>
> my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>
> + my $filter_pattern;
> + $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
> my $query = "SELECT receiver as contact, count(*) AS count, sum (bytes) AS bytes, " .
> "count (virusinfo) as viruscount " .
> "FROM CStatistic, CReceivers " .
> "WHERE cid = cstatistic_cid AND rid = cstatistic_rid " .
> - ($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> + ($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
> "AND $cond_good_mail AND NOT direction AND sender != '' ";
>
> if ($advfilter) {
> @@ -603,7 +625,7 @@ sub user_stat_contact {
>
> my $res = [];
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
>
> $sth->finish();
> @@ -620,20 +642,23 @@ sub user_stat_sender_details {
>
> my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>
> + my $filter_pattern;
> + $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
> my $sth = $rdb->{dbh}->prepare(
> "SELECT " .
> "blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
> "FROM CStatistic, CReceivers " .
> "WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND " .
> "$cond_good_mail AND NOT direction AND sender = ? " .
> - ($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> + ($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
> "ORDER BY $orderby limit $limit");
>
> - $sth->execute($sender);
> + $sth->execute(encode('UTF-8',$sender));
>
> my $res = [];
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
>
> $sth->finish();
> @@ -650,11 +675,14 @@ sub user_stat_sender {
>
> my $cond_good_mail = $self->query_cond_good_mail ($from, $to);
>
> + my $filter_pattern;
> + $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
> my $query = "SELECT sender,count(*) AS count, sum (bytes) AS bytes, " .
> "count (virusinfo) as viruscount, " .
> "count (CASE WHEN spamlevel >= 3 THEN 1 ELSE NULL END) as spamcount " .
> "FROM CStatistic WHERE $cond_good_mail AND NOT direction AND sender != '' " .
> - ($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> + ($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
> "GROUP BY sender ORDER BY $orderby limit $limit";
>
> my $sth = $rdb->{dbh}->prepare($query);
> @@ -662,7 +690,7 @@ sub user_stat_sender {
>
> my $res = [];
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
>
> $sth->finish();
> @@ -679,18 +707,21 @@ sub user_stat_receiver_details {
>
> my $cond_good_mail = $self->query_cond_good_mail($from, $to);
>
> + my $filter_pattern;
> + $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
> my $sth = $rdb->{dbh}->prepare(
> "SELECT blocked, bytes, ptime, sender, receiver, spamlevel, time, virusinfo " .
> "FROM CStatistic, CReceivers " .
> "WHERE cid = cstatistic_cid AND rid = cstatistic_rid AND $cond_good_mail AND receiver = ? " .
> - ($filter ? "AND sender like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> + ($filter_pattern ? "AND sender like " . $filter_pattern . ' ' : '') .
> "ORDER BY $orderby limit $limit");
>
> - $sth->execute($receiver);
> + $sth->execute(encode('UTF-8',$receiver));
>
> my $res = [];
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
>
> $sth->finish();
> @@ -708,6 +739,9 @@ sub user_stat_receiver {
> my $cond_good_mail = $self->query_cond_good_mail ($from, $to) . " AND " .
> "receiver IS NOT NULL AND receiver != ''";
>
> + my $filter_pattern;
> + $filter_pattern = $rdb->{dbh}->quote(encode('UTF-8', "%${filter}%")) if $filter;
> +
> my $query = "SELECT receiver, " .
> "count(*) AS count, " .
> "sum (bytes) AS bytes, " .
> @@ -728,7 +762,7 @@ sub user_stat_receiver {
> }
>
> $query .= "AND $cond_good_mail and direction " .
> - ($filter ? "AND receiver like " . $rdb->{dbh}->quote("%${filter}%") . ' ' : '') .
> + ($filter_pattern ? "AND receiver like " . $filter_pattern . ' ' : '') .
we have this pattern 6 times in this diff, wouldn't it be easier to do something like this:
(naming is not optimal, just what came to my mind)
    sub sql_filter_text {
        my ($dbh, $field, $filter) = @_;
        my $filter_text = $filter ? "AND $field like ". $dbh->quote(...). " " : '';
        return $filter_text
    }
and call it in the functions with
    my $filter_text = sql_filter_text($rdb->{dbh}, 'receiver', $filter);
and simply use it with:
    $query .= "...." . $filter_text . "...";
?
> "GROUP BY receiver ORDER BY $orderby LIMIT $limit";
>
> my $sth = $rdb->{dbh}->prepare($query);
> @@ -736,7 +770,7 @@ sub user_stat_receiver {
>
> my $res = [];
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
>
> $sth->finish();
> @@ -873,9 +907,8 @@ sub recent_receivers {
> my $sth = $rdb->{dbh}->prepare($cmd);
>
> $sth->execute ($from, $limit);
> -
> while (my $ref = $sth->fetchrow_hashref()) {
> - push @$res, $ref;
> + push @$res, user_stat_to_perlstring($ref);
> }
> $sth->finish();
>
* [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (7 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
` (2 subsequent siblings)
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
matching the pattern in the backend (allowing most characters inside
of e-mail addresses).
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
js/UserBlackWhiteList.js | 2 +-
js/Utils.js | 9 +++++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/js/UserBlackWhiteList.js b/js/UserBlackWhiteList.js
index 4f4a756..44d75b3 100644
--- a/js/UserBlackWhiteList.js
+++ b/js/UserBlackWhiteList.js
@@ -127,7 +127,7 @@ Ext.define('PMG.UserBlackWhiteList', {
{
xtype: 'combobox',
displayField: 'mail',
- vtype: 'email',
+ vtype: 'proxmoxMail',
allowBlank: false,
valueField: 'mail',
store: {
diff --git a/js/Utils.js b/js/Utils.js
index dc924d2..7fa154e 100644
--- a/js/Utils.js
+++ b/js/Utils.js
@@ -898,3 +898,12 @@ Ext.define('PMG.Async', {
);
},
});
+
+// custom Vtypes
+Ext.apply(Ext.form.field.VTypes, {
+ // matches the pmg-email-address in pmg-api
+ PMGMail: function(v) {
+ return (/[^\s\\@]+@[^\s/\\@]+/).test(v);
+ },
+ PMGMailText: gettext('Example') + ": user@example.com",
+});
--
2.30.2
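A minimal sketch of how the check in the patch above behaves, runnable without Ext JS. The regular expression is copied from the diff; the function names `pmgMailCheck` and `anchoredCheck` and the anchored variant are hypothetical and not part of the patch. Because the pattern is not anchored, `test()` succeeds as soon as any substring of the input looks like an address. Whether that is intended depends on the backend pattern, which is not shown in this thread.

```javascript
// Copy of the PMGMail regular expression from the patch above.
const pmgMailPattern = /[^\s\\@]+@[^\s/\\@]+/;

// Hypothetical stand-in for the VType function, usable outside Ext JS.
function pmgMailCheck(v) {
    return pmgMailPattern.test(v);
}

// Hypothetical stricter variant (an assumption, not in the patch):
// the same character classes, anchored to the whole input.
const anchoredPattern = /^[^\s\\@]+@[^\s/\\@]+$/;

function anchoredCheck(v) {
    return anchoredPattern.test(v);
}

console.log(pmgMailCheck('user@example.com'));        // true
console.log(pmgMailCheck('üser@exämple.com'));        // true, non-ascii is accepted
console.log(pmgMailCheck('no-at-sign'));              // false
console.log(pmgMailCheck('two words@example.com'));   // true, "words@example.com" matches
console.log(anchoredCheck('two words@example.com'));  // false, whitespace before the @
```

In Ext JS a field with `vtype: 'PMGMail'` looks up the function of that name on `Ext.form.field.VTypes` and uses `PMGMailText` as the error message, so the name in the field config has to match the defined VType exactly.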
* [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (8 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
@ 2022-11-23 9:23 ` Stoiko Ivanov
2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
2022-11-26 7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
11 siblings, 0 replies; 16+ messages in thread
From: Stoiko Ivanov @ 2022-11-23 9:23 UTC (permalink / raw)
To: pmg-devel
to be able to add non-ascii addresses to the lists
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
js/UserBlackWhiteList.js | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/js/UserBlackWhiteList.js b/js/UserBlackWhiteList.js
index 44d75b3..1344496 100644
--- a/js/UserBlackWhiteList.js
+++ b/js/UserBlackWhiteList.js
@@ -127,7 +127,7 @@ Ext.define('PMG.UserBlackWhiteList', {
{
xtype: 'combobox',
displayField: 'mail',
- vtype: 'proxmoxMail',
+ vtype: 'PMGMail',
allowBlank: false,
valueField: 'mail',
store: {
--
2.30.2
* Re: [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (9 preceding siblings ...)
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
@ 2022-11-23 14:09 ` Dominik Csapak
2022-11-26 7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht
11 siblings, 0 replies; 16+ messages in thread
From: Dominik Csapak @ 2022-11-23 14:09 UTC (permalink / raw)
To: Stoiko Ivanov, pmg-devel
all in all works mostly well,
tested various weird emails with various rules
that include emojis/non-ascii characters
(weird mails as in a mix of smtputf8, mixed charsets and quoted-printable fields
with mixed encodings, with and without non-ascii characters in the
sender/recipient)
things that did not work and need to be fixed if we want to apply this:
* LDAP, you mentioned it, but it fails in a really non-obvious way
and currently drops mails
* user wl/bl from the quarantine interface
(some en/decode is missing, and garbage reaches the user lists)
things that worked in my tests:
* sending emails (with/without smtputf8)
* quarantining mails
* notification/modify/header/disclaimer/etc. with non-ascii characters
* various what/who objects with non-ascii characters
* greylisting with non-ascii characters in sender/recipient
* modifying user wl/bl
* matching user wl/bl
* log tracker
* statistics
i did find some things to note in the individual patches, i'll answer there
* [pmg-devel] applied-gui: [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
2022-11-23 9:23 [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
` (10 preceding siblings ...)
2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
@ 2022-11-26 7:00 ` Thomas Lamprecht
11 siblings, 0 replies; 16+ messages in thread
From: Thomas Lamprecht @ 2022-11-26 7:00 UTC (permalink / raw)
To: Stoiko Ivanov, pmg-devel
On 23/11/2022 at 10:23, Stoiko Ivanov wrote:
> pmg-gui:
> Stoiko Ivanov (2):
> utils: add custom validator for pmg-email-address
> userblocklists: use PMGMail as validator for pmail
before I forget: applied those two yesterday, thanks!