public inbox for pmg-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails
@ 2022-11-09 18:27 Stoiko Ivanov
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution Stoiko Ivanov
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Stoiko Ivanov @ 2022-11-09 18:27 UTC (permalink / raw)
  To: pmg-devel

this patchseries partially fixes #2465 and #2541, two quite often reported
issues, which are causing quite a disappointing experience for users
in non-ascii only environments

the main assumption of the patches are:
* envelope addresses are either ascii or utf-8 (latter only with smtputf8)
* thus we can unconditionally de-/encode envelope addresses for database
  results/lookups
* the matching in the rule-objects will see the relevant parts of the mail
  as properly encoded perl-strings (with multi-byte characters - e.g. the
  euro sign as \x{20ac} instead of \x{e2}\x{82}\x{ac})
(I did a bit of testing to verify them, by e.g. sending an ISO-8859-1
encoded mail and matching for an umlaut in the subject)

While going through the RuleDB classes I remembered, that we have a few
pieces of legacy objects (Attach, ReportSpam, Counter actions) there, and
went ahead with deprecating them (initially I simply deleted them, but
decided to be more cautious and just log the deprecation until 8.0, when
we can drop them explicitly). They cannot be instantiated currently (short
of a direct insert into the database) - but I don't know if they were ever
used in pre 5.0 times in their current form. - patch 2/5.

Out of scope of the series for now:
* utf-8 support in the LDAP subsystem (deployments with a configured LDAP
  profile still won't be able to process smtputf8 mails) - mostly until I
  get around to create test-environment with the appropriate schema for
  having non-ascii mail-addresses
* Domain/Email objects - did not find the time to consider how to store
  them most sensibly (puny-code, utf-8) and if the choice should be
  carried over to all of our 'email' formats (it probably shouldn't)

patches 1/5 and 4/5 address 2 small bugs I ran into while testing

Given that I quite often miss a few fine points or use-cases I'd be very
grateful for some more experimenting/testing!

Stoiko Ivanov (5):
  ruledb: modfield: properly encode field after variable substitution
  ruledb: add deprecation warnings for unused actions
  fix #2541 ruledb: encode relevant values as utf-8 in database
  ruledb: encode e-mail addresses for syslog
  partially fix #2465: handle smtputf8 addresses in the rule-system

 src/PMG/MailQueue.pm            | 10 +++++----
 src/PMG/RuleDB.pm               | 37 ++++++++++++++++++++++++++-------
 src/PMG/RuleDB/Accept.pm        |  2 +-
 src/PMG/RuleDB/Attach.pm        |  7 +++++++
 src/PMG/RuleDB/BCC.pm           | 19 ++++++++++++++---
 src/PMG/RuleDB/Block.pm         |  2 +-
 src/PMG/RuleDB/Counter.pm       |  5 +++++
 src/PMG/RuleDB/Disclaimer.pm    |  2 +-
 src/PMG/RuleDB/Group.pm         |  4 ++--
 src/PMG/RuleDB/MatchField.pm    |  6 +++++-
 src/PMG/RuleDB/MatchFilename.pm |  5 ++++-
 src/PMG/RuleDB/ModField.pm      | 10 ++++++---
 src/PMG/RuleDB/Notify.pm        | 19 ++++++++++++++---
 src/PMG/RuleDB/Quarantine.pm    | 19 ++++++++++++++---
 src/PMG/RuleDB/Remove.pm        | 20 ++++++++++++------
 src/PMG/RuleDB/ReportSpam.pm    |  5 +++++
 src/PMG/RuleDB/Rule.pm          |  2 +-
 src/PMG/RuleDB/Spam.pm          |  5 +++--
 src/PMG/RuleDB/WhoRegex.pm      |  5 ++++-
 src/PMG/Utils.pm                |  5 +++++
 src/bin/pmg-smtp-filter         |  3 ++-
 21 files changed, 150 insertions(+), 42 deletions(-)

-- 
2.30.2





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution
  2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
@ 2022-11-09 18:27 ` Stoiko Ivanov
  2022-11-11 13:56   ` [pmg-devel] applied: " Thomas Lamprecht
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions Stoiko Ivanov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Stoiko Ivanov @ 2022-11-09 18:27 UTC (permalink / raw)
  To: pmg-devel

this patch follows 6296d93fecb84e71603c15218f6ffc9732173491 in
properly encoding the added header-field, the way the subject is
encoded for a notification.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/ModField.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index 3e66ac3..fb15076 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -4,6 +4,8 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode decode);
+use MIME::Words qw(encode_mimewords);
 
 use PMG::Utils;
 use PMG::ModGroup;
@@ -107,7 +109,7 @@ sub execute {
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $e) = (@$ta[0], @$ta[1]);
-	$e->head->replace($self->{field}, $fvalue);
+	$e->head->replace($self->{field}, encode_mimewords(encode('UTF-8', $fvalue), "Charset" => "UTF-8"));
     }
 }
 
-- 
2.30.2





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions
  2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution Stoiko Ivanov
@ 2022-11-09 18:27 ` Stoiko Ivanov
  2022-11-14 16:02   ` Dominik Csapak
  2022-11-15 14:32   ` [pmg-devel] applied: " Thomas Lamprecht
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 3/5] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: Stoiko Ivanov @ 2022-11-09 18:27 UTC (permalink / raw)
  To: pmg-devel

* ReportSpam
* Attach
* Counter

are all still present since (at least) the release of PMG 5.0, but
were never exposed in the API/GUI.

All of them in their current form don't seem to fit well nowadays, or
their functionality was taken over by some other Action:
* Attach - the functionality is currently present in the Notify action
  (attach original mail)
* Counter - without a matching What object simply increasing a counter
  by one in the database serves no purpose
* ReportSpam - sending potentially sensitive mail automatically to the
  public SpamAssassin project does not seem to fit well nowadays

Instead of dropping them right away - this patch adds logging when
they are encountered while loading or when they are run, to keep
backwards-compatibility for users who have very long-running PMG
instances (not sure if the actions were ever used in the pre git-days
of PMG)

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB.pm            | 14 +++++++++++++-
 src/PMG/RuleDB/Attach.pm     |  7 +++++++
 src/PMG/RuleDB/Counter.pm    |  5 +++++
 src/PMG/RuleDB/ReportSpam.pm |  5 +++++
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/PMG/RuleDB.pm b/src/PMG/RuleDB.pm
index faed404..895acc6 100644
--- a/src/PMG/RuleDB.pm
+++ b/src/PMG/RuleDB.pm
@@ -289,7 +289,14 @@ sub load_objectgroups {
 sub get_object {
     my ($self, $otype) = @_;
 
-     my $obj;
+    my $obj;
+
+    # FIXME: remove deprecated types and files with PMG 8.0
+    my $deprecated_types = {
+	4004 => "Attach",
+	4008 => "ReportSpam",
+	4999 => "Counter",
+    };
 
     # WHO OBJECTS
     if ($otype == PMG::RuleDB::Domain::otype()) {
@@ -386,9 +393,14 @@ sub get_object {
 	    die "proxmox: unknown object type: ERROR";
     }
 
+    if ( grep( $_ == $otype, keys %$deprecated_types)) {
+	syslog('warning', "proxmox: deprecated object of type %s found!",
+	    $deprecated_types->{$otype});
+    }
     return $obj;
 }
 
+# FIXME: remove with PMG 8.0
 sub load_counters_data {
     my ($self) = @_;
 
diff --git a/src/PMG/RuleDB/Attach.pm b/src/PMG/RuleDB/Attach.pm
index 78b2749..cdb9d89 100644
--- a/src/PMG/RuleDB/Attach.pm
+++ b/src/PMG/RuleDB/Attach.pm
@@ -1,10 +1,14 @@
 package PMG::RuleDB::Attach;
 
+# FIXME: remove with PMG 8.0
+
 use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
 
+use PVE::SafeSyslog;
+
 use PMG::Utils;
 use PMG::ModGroup;
 use PMG::RuleDB::Object;
@@ -84,6 +88,9 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets,
 	$msginfo, $vars, $marks) = @_;
 
+    syslog('warning', "%s: deprecated action 'Attach' will be removed with PMG 8.0.",
+	   $queue->{logid},);
+
     my $subgroups = $mod_group->subgroups($targets);
 
     foreach my $ta (@$subgroups) {
diff --git a/src/PMG/RuleDB/Counter.pm b/src/PMG/RuleDB/Counter.pm
index 9872ab0..6c7721b 100644
--- a/src/PMG/RuleDB/Counter.pm
+++ b/src/PMG/RuleDB/Counter.pm
@@ -1,5 +1,7 @@
 package PMG::RuleDB::Counter;
 
+# FIXME: remove with PMG 8.0
+
 use strict;
 use warnings;
 use DBI;
@@ -87,6 +89,9 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
+    syslog('warning', "%s: deprecated action 'Counter' will be removed with PMG 8.0.",
+	   $queue->{logid},);
+
     eval {
 	$ruledb->{dbh}->begin_work;
 	
diff --git a/src/PMG/RuleDB/ReportSpam.pm b/src/PMG/RuleDB/ReportSpam.pm
index e0ac004..c8ae9bd 100644
--- a/src/PMG/RuleDB/ReportSpam.pm
+++ b/src/PMG/RuleDB/ReportSpam.pm
@@ -1,5 +1,7 @@
 package PMG::RuleDB::ReportSpam;
 
+# FIXME: remove with PMG 8.0
+
 use strict;
 use warnings;
 use DBI;
@@ -85,6 +87,9 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
+    syslog('warning', "%s: deprecated action 'Attach' will be removed with PMG 8.0.",
+	   $queue->{logid},);
+
     my $rulename = $vars->{RULE} // 'unknown';
 
     my $subgroups = $mod_group->subgroups($targets);
-- 
2.30.2





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] [PATCH pmg-api 3/5] fix #2541 ruledb: encode relevant values as utf-8 in database
  2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution Stoiko Ivanov
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions Stoiko Ivanov
@ 2022-11-09 18:27 ` Stoiko Ivanov
  2022-11-14 14:36   ` Dominik Csapak
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 4/5] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Stoiko Ivanov @ 2022-11-09 18:27 UTC (permalink / raw)
  To: pmg-devel

This patch adds support for storing rule names, comments(info), and
most relevant values (e.g. the header content to match) in utf-8 in
the database.

backwards-compatibility should not be an issue:
* following the argumentation from commit
  43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage
* we only need to consider that the valid multibyte utf-8 characters
  do not really yield sensible combinations of single-byte characters
  (starting with a byte > 127 - e.g. "£")

the database is created with SQL_ASCII encoding - which behaves by
interpreting bytes <= 127 as ascii and those > 127 are not interpreted
(see [0], which just means that we have to explicitly en-/decode upon
storing/reading from there)

This patch currently omits most Who objects:
* for email/domain we'd still need to consider how to store them
  (puny-code for the domain part, or everything as UTF-8) and it would
  need changes to the API-types.
* the LDAP objects currently would not work too well, since our LDAPCache
  is not UTF-8 safe - and fixing warants its own patch-series
* WhoRegex should work and be able to handle many use-cases

The ContentType values should also contain only ascii characters per
RFC6838 [1] and RFC2045 [2].

[0] https://www.postgresql.org/docs/13/multibyte.html
[1] https://datatracker.ietf.org/doc/html/rfc6838#section-4.2
[2] https://datatracker.ietf.org/doc/html/rfc2045#section-5.1

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB.pm               | 23 ++++++++++++++++-------
 src/PMG/RuleDB/Accept.pm        |  2 +-
 src/PMG/RuleDB/BCC.pm           |  2 +-
 src/PMG/RuleDB/Block.pm         |  2 +-
 src/PMG/RuleDB/Disclaimer.pm    |  2 +-
 src/PMG/RuleDB/Group.pm         |  4 ++--
 src/PMG/RuleDB/MatchField.pm    |  6 +++++-
 src/PMG/RuleDB/MatchFilename.pm |  5 ++++-
 src/PMG/RuleDB/ModField.pm      |  6 ++++--
 src/PMG/RuleDB/Notify.pm        |  2 +-
 src/PMG/RuleDB/Quarantine.pm    |  3 ++-
 src/PMG/RuleDB/Remove.pm        | 12 +++++++-----
 src/PMG/RuleDB/Rule.pm          |  2 +-
 src/PMG/RuleDB/WhoRegex.pm      |  5 ++++-
 src/PMG/Utils.pm                |  5 +++++
 15 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/src/PMG/RuleDB.pm b/src/PMG/RuleDB.pm
index 895acc6..9d6d99d 100644
--- a/src/PMG/RuleDB.pm
+++ b/src/PMG/RuleDB.pm
@@ -5,6 +5,7 @@ use warnings;
 use DBI;
 use HTML::Entities;
 use Data::Dumper;
+use Encode qw(encode decode);
 
 use PVE::SafeSyslog;
 
@@ -72,7 +73,8 @@ sub create_group_with_obj {
 
     $name //= '';
     $info //= '';
-
+    $name = encode('UTF-8', $name);
+    $info = encode('UTF-8',$info);
     eval {
 
 	$self->{dbh}->begin_work;
@@ -174,7 +176,9 @@ sub save_group {
 	$self->{dbh}->do("UPDATE Objectgroup " .
 			 "SET Name = ?, Info = ? " .
 			 "WHERE ID = ?", undef,
-			 $og->{name}, $og->{info}, $og->{id});
+			 encode('UTF-8', $og->{name}),
+			 encode('UTF-8', $og->{info}),
+			 $og->{id});
 
 	return $og->{id};
 
@@ -183,7 +187,7 @@ sub save_group {
 	    "INSERT INTO Objectgroup (Name, Info, Class) " .
 	    "VALUES (?, ?, ?);");
 
-	$sth->execute($og->name, $og->info, $og->class);
+	$sth->execute(encode('UTF-8', $og->name), encode('UTF-8', $og->info), $og->class);
 
 	return $og->{id} = PMG::Utils::lastid($self->{dbh}, 'objectgroup_id_seq');
     }
@@ -212,7 +216,9 @@ sub delete_group {
 	$sth->execute($groupid);
 
 	if (my $ref = $sth->fetchrow_hashref()) {
-	    die "Group '$ref->{groupname}' is used by rule '$ref->{rulename}' - unable to delete\n";
+	    my $groupname = PMG::Utils::try_deocode_utf8($ref->{groupname});
+	    my $rulename = PMG::Utils::try_deocode_utf8($ref->{rulename});
+	    die "Group '$groupname' is used by rule '$rulename' - unable to delete\n";
 	}
 
         $sth->finish();
@@ -474,6 +480,7 @@ sub load_object_full {
 sub load_group_by_name {
     my ($self, $name) = @_;
 
+    $name = PMG::Utils::try_decode_utf8($name);
     my $sth = $self->{dbh}->prepare("SELECT * FROM Objectgroup " .
 				    "WHERE name = ?");
 
@@ -598,13 +605,14 @@ sub save_rule {
     defined($rule->{direction}) ||
 	die "undefined rule attribute - direction: ERROR";
 
+    my $rulename = encode('UTF-8', $rule->{name});
     if (defined($rule->{id})) {
 
 	$self->{dbh}->do(
 	    "UPDATE Rule " .
 	    "SET Name = ?, Priority = ?, Active = ?, Direction = ? " .
 	    "WHERE ID = ?", undef,
-	    $rule->{name}, $rule->{priority}, $rule->{active},
+	    $rulename, $rule->{priority}, $rule->{active},
 	    $rule->{direction}, $rule->{id});
 
 	return $rule->{id};
@@ -614,7 +622,7 @@ sub save_rule {
 	    "INSERT INTO Rule (Name, Priority, Active, Direction) " .
 	    "VALUES (?, ?, ?, ?);");
 
-	$sth->execute($rule->name, $rule->priority, $rule->active,
+	$sth->execute($rulename, $rule->priority, $rule->active,
 		      $rule->direction);
 
 	return $rule->{id} = PMG::Utils::lastid($self->{dbh}, 'rule_id_seq');
@@ -779,7 +787,8 @@ sub load_rules {
     $sth->execute();
 
     while (my $ref = $sth->fetchrow_hashref()) {
-	my $rule = PMG::RuleDB::Rule->new($ref->{name}, $ref->{priority},
+	my $rulename = PMG::Utils::try_decode_utf8($ref->{name});
+	my $rule = PMG::RuleDB::Rule->new($rulename, $ref->{priority},
 					  $ref->{active}, $ref->{direction});
 	$rule->{id} = $ref->{id};
 	push @$rules, $rule;
diff --git a/src/PMG/RuleDB/Accept.pm b/src/PMG/RuleDB/Accept.pm
index cd67ea2..4ebd6da 100644
--- a/src/PMG/RuleDB/Accept.pm
+++ b/src/PMG/RuleDB/Accept.pm
@@ -93,7 +93,7 @@ sub execute {
     my $dkim = $msginfo->{dkim} // {};
     my $subgroups = $mod_group->subgroups($targets, !$dkim->{sign});
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index d364690..c1225f3 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -115,7 +115,7 @@ sub execute {
 
     my $subgroups = $mod_group->subgroups($targets, 1);
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $bcc_to = PMG::Utils::subst_values($self->{target}, $vars);
 
diff --git a/src/PMG/RuleDB/Block.pm b/src/PMG/RuleDB/Block.pm
index c758787..25bb74e 100644
--- a/src/PMG/RuleDB/Block.pm
+++ b/src/PMG/RuleDB/Block.pm
@@ -89,7 +89,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     if ($msginfo->{testmode}) {
 	my $fh = $msginfo->{test_fh};
diff --git a/src/PMG/RuleDB/Disclaimer.pm b/src/PMG/RuleDB/Disclaimer.pm
index d3003b2..c6afe54 100644
--- a/src/PMG/RuleDB/Disclaimer.pm
+++ b/src/PMG/RuleDB/Disclaimer.pm
@@ -193,7 +193,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets, 
 	$msginfo, $vars, $marks) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $subgroups = $mod_group->subgroups($targets);
 
diff --git a/src/PMG/RuleDB/Group.pm b/src/PMG/RuleDB/Group.pm
index 2508305..baa68ce 100644
--- a/src/PMG/RuleDB/Group.pm
+++ b/src/PMG/RuleDB/Group.pm
@@ -12,8 +12,8 @@ sub new {
     my ($type, $name, $info, $class) = @_;
 
     my $self = {
-	name => $name,
-	info => $info,
+	name => PMG::Utils::try_decode_utf8($name),
+	info => PMG::Utils::try_decode_utf8($info),
 	class => $class,
     };
 
diff --git a/src/PMG/RuleDB/MatchField.pm b/src/PMG/RuleDB/MatchField.pm
index 2671ea4..8246e6e 100644
--- a/src/PMG/RuleDB/MatchField.pm
+++ b/src/PMG/RuleDB/MatchField.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(decode encode);
 use MIME::Words;
 
 use PVE::SafeSyslog;
@@ -50,9 +51,10 @@ sub load_attr {
     defined($field) || die "undefined object attribute: ERROR";
     defined($field_value) || die "undefined object attribute: ERROR";
 
+    my $decoded_field_value = PMG::Utils::try_decode_utf8($field_value);
     # use known constructor, bless afterwards (because sub class can have constructor
     # with other parameter signature).
-    my $obj =  PMG::RuleDB::MatchField->new($field, $field_value, $ogroup);
+    my $obj =  PMG::RuleDB::MatchField->new($field, $decoded_field_value, $ogroup);
     bless $obj, $class;
 
     $obj->{id} = $id;
@@ -69,6 +71,7 @@ sub save {
 
     my $new_value = "$self->{field}:$self->{field_value}";
     $new_value =~ s/\\/\\\\/g;
+    $new_value = encode('UTF-8', $new_value);
 
     if (defined ($self->{id})) {
 	# update
@@ -106,6 +109,7 @@ sub parse_entity {
 	    chomp $value;
 
 	    my $decvalue = MIME::Words::decode_mimewords($value);
+	    $decvalue = PMG::Utils::try_decode_utf8($decvalue);
 
 	    if ($decvalue =~ m|$self->{field_value}|i) {
 		push @$res, $id;
diff --git a/src/PMG/RuleDB/MatchFilename.pm b/src/PMG/RuleDB/MatchFilename.pm
index 7e5b486..06bf931 100644
--- a/src/PMG/RuleDB/MatchFilename.pm
+++ b/src/PMG/RuleDB/MatchFilename.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode decode);
 use MIME::Words;
 
 use PMG::Utils;
@@ -41,8 +42,9 @@ sub load_attr {
     my $class = ref($type) || $type;
 
     defined($value) || die "undefined value: ERROR";;
+    my $decvalue = PMG::Utils::try_decode_utf8($value);
 
-    my $obj = $class->new($value, $ogroup);
+    my $obj = $class->new($decvalue, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -57,6 +59,7 @@ sub save {
 
     my $new_value = $self->{fname};
     $new_value =~ s/\\/\\\\/g;
+    $new_value = encode('UTF-8', $new_value);
 
     if (defined($self->{id})) {
 	# update
diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
index fb15076..1e1727f 100644
--- a/src/PMG/RuleDB/ModField.pm
+++ b/src/PMG/RuleDB/ModField.pm
@@ -57,7 +57,9 @@ sub load_attr {
 
     (defined($field) && defined($field_value)) || return undef;
 
-    my $obj = $class->new($field, $field_value, $ogroup);
+    my $dec_field_value = PMG::Utils::try_decode_utf8($field_value);
+
+    my $obj = $class->new($field, $dec_field_value, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $field, $field_value, $ogroup);
@@ -70,7 +72,7 @@ sub save {
 
     defined($self->{ogroup}) || return undef;
 
-    my $new_value = "$self->{field}:$self->{field_value}";
+    my $new_value = encode('UTF-8', "$self->{field}:$self->{field_value}");
 
     if (defined ($self->{id})) {
 	# update
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index af853a3..bca5ebf 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -208,7 +208,7 @@ sub execute {
 
     my $from = 'postmaster';
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     my $body = PMG::Utils::subst_values($self->{body}, $vars);
     my $subject = PMG::Utils::subst_values($self->{subject}, $vars);
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 1426393..30bc5ec 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(decode encode);
 
 use PVE::SafeSyslog;
 
@@ -89,7 +90,7 @@ sub execute {
     
     my $subgroups = $mod_group->subgroups($targets, 1);
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     foreach my $ta (@$subgroups) {
 	my ($tg, $entity) = (@$ta[0], @$ta[1]);
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index 6b27b91..da6c25f 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -63,12 +63,14 @@ sub load_attr {
 
     defined ($value) || die "undefined value: ERROR";
 
-    my $obj;
+    my ($obj, $text);
 
     if ($value =~ m/^([01])\,([01])(\:(.*))?$/s) {
-	$obj = $class->new($1, $4, $ogroup, $2);
+	$text = PMG::Utils::try_decode_utf8($4);
+	$obj = $class->new($1, $text, $ogroup, $2);
     } elsif ($value =~ m/^([01])(\:(.*))?$/s) {
-	$obj = $class->new($1, $3, $ogroup);
+	$text = PMG::Utils::try_decode_utf8($3);
+	$obj = $class->new($1, $text, $ogroup);
     } else {
 	$obj = $class->new(0, undef, $ogroup);
     }
@@ -89,7 +91,7 @@ sub save {
     $value .= ','. ($self->{quarantine} ? '1' : '0');
 
     if ($self->{text}) {
-	$value .= ":$self->{text}";
+	$value .= encode('UTF-8', ":$self->{text}");
     }
 
     if (defined ($self->{id})) {
@@ -194,7 +196,7 @@ sub execute {
     my ($self, $queue, $ruledb, $mod_group, $targets,
 	$msginfo, $vars, $marks, $ldap) = @_;
 
-    my $rulename = $vars->{RULE} // 'unknown';
+    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
 
     if (!$self->{all} && ($#$marks == -1)) {
 	# no marks
diff --git a/src/PMG/RuleDB/Rule.pm b/src/PMG/RuleDB/Rule.pm
index c49ad21..e7c9146 100644
--- a/src/PMG/RuleDB/Rule.pm
+++ b/src/PMG/RuleDB/Rule.pm
@@ -12,7 +12,7 @@ sub new {
     my ($type, $name, $priority, $active, $direction) = @_;
 
     my $self = { 
-	name => $name // '',
+	name => PMG::Utils::try_decode_utf8($name) // '',
 	priority => $priority // 0,
 	active => $active // 0,
     }; 
diff --git a/src/PMG/RuleDB/WhoRegex.pm b/src/PMG/RuleDB/WhoRegex.pm
index 37ec3aa..ccc94a0 100644
--- a/src/PMG/RuleDB/WhoRegex.pm
+++ b/src/PMG/RuleDB/WhoRegex.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(decode encode);
 
 use PMG::Utils;
 use PMG::RuleDB::Object;
@@ -43,7 +44,8 @@ sub load_attr {
 
     defined($value) || die "undefined value: ERROR";
 
-    my $obj = $class->new ($value, $ogroup);
+    my $decoded_value = PMG::Utils::try_decode_utf8($value);
+    my $obj = $class->new ($decoded_value, $ogroup);
     $obj->{id} = $id;
 
     $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
@@ -59,6 +61,7 @@ sub save {
 
     my $adr = $self->{address};
     $adr =~ s/\\/\\\\/g;
+    $adr = encode('UTF-8', $adr);
 
     if (defined ($self->{id})) {
 	# update
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..23f60eb 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1542,4 +1542,9 @@ sub get_existing_object_id {
     return;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+    return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
 1;
-- 
2.30.2





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] [PATCH pmg-api 4/5] ruledb: encode e-mail addresses for syslog
  2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (2 preceding siblings ...)
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 3/5] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
@ 2022-11-09 18:27 ` Stoiko Ivanov
  2022-11-14 14:49   ` Dominik Csapak
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 5/5] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
  2022-11-14 16:02 ` [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
  5 siblings, 1 reply; 13+ messages in thread
From: Stoiko Ivanov @ 2022-11-09 18:27 UTC (permalink / raw)
  To: pmg-devel

as done in 114655f4fdb07c789a361b2f397f5345eafd16c6 for Accept and
Block.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/BCC.pm        | 17 +++++++++++++++--
 src/PMG/RuleDB/Notify.pm     | 17 +++++++++++++++--
 src/PMG/RuleDB/Quarantine.pm | 16 ++++++++++++++--
 src/PMG/RuleDB/Remove.pm     |  8 +++++++-
 4 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
index c1225f3..e56f051 100644
--- a/src/PMG/RuleDB/BCC.pm
+++ b/src/PMG/RuleDB/BCC.pm
@@ -165,9 +165,22 @@ sub execute {
 		$msginfo->{xforward}, $msginfo->{fqdn}, $param);
 	    foreach (@bcc_targets) {
 		if ($qid) {
-		    syslog('info', "%s: bcc to <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+		    syslog(
+			'info',
+			"%s: bcc to <%s> (rule: %s, %s)",
+			$queue->{logid},
+			encode('UTF-8',$_),
+			$rulename,
+			$qid,
+		    );
 		} else {
-		    syslog('err', "%s: bcc to <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+		    syslog(
+			'err',
+			"%s: bcc to <%s> (rule: %s) failed",
+			$queue->{logid},
+			encode('UTF-8',$_),
+			$rulename,
+		    );
 		}
 	    }
 	}
diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
index bca5ebf..ee5d2ac 100644
--- a/src/PMG/RuleDB/Notify.pm
+++ b/src/PMG/RuleDB/Notify.pm
@@ -260,9 +260,22 @@ sub execute {
 	    $top, $from, \@targets, undef, $msginfo->{fqdn});
 	foreach (@targets) {
 	    if ($qid) {
-		syslog('info', "%s: notify <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
+		syslog(
+		    'info',
+		    "%s: notify <%s> (rule: %s, %s)",
+		    $queue->{logid},
+		    encode('UTF-8', $_),
+		    $rulename,
+		    $qid,
+		);
 	    } else {
-		syslog ('err', "%s: notify <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
+		syslog (
+		    'err',
+		    "%s: notify <%s> (rule: %s) failed",
+		    $queue->{logid},
+		    encode('UTF-8', $_),
+		    $rulename,
+		);
 	    }
 	}
     }
diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
index 30bc5ec..f7154d8 100644
--- a/src/PMG/RuleDB/Quarantine.pm
+++ b/src/PMG/RuleDB/Quarantine.pm
@@ -101,7 +101,13 @@ sub execute {
 	    if (my $qid = $queue->quarantine_mail($ruledb, 'V', $entity, $tg, $msginfo, $vars, $ldap)) {
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 
 		$queue->set_status ($tg, 'delivered');
@@ -111,7 +117,13 @@ sub execute {
 	    if (my $qid = $queue->quarantine_mail($ruledb, 'S', $entity, $tg, $msginfo, $vars, $ldap)) {
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 
 		$queue->set_status($tg, 'delivered');
diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
index da6c25f..e7c353c 100644
--- a/src/PMG/RuleDB/Remove.pm
+++ b/src/PMG/RuleDB/Remove.pm
@@ -235,7 +235,13 @@ sub execute {
 		}
 
 		foreach (@$tg) {
-		    syslog ('info', "$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)", $_, $qid, $rulename);
+		    syslog (
+			'info',
+			"$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)",
+			encode('UTF-8',$_),
+			$qid,
+			$rulename,
+		    );
 		}
 	    }
 	}
-- 
2.30.2





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] [PATCH pmg-api 5/5] partially fix #2465: handle smtputf8 addresses in the rule-system
  2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (3 preceding siblings ...)
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 4/5] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
@ 2022-11-09 18:27 ` Stoiko Ivanov
  2022-11-14 16:03   ` Dominik Csapak
  2022-11-14 16:02 ` [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
  5 siblings, 1 reply; 13+ messages in thread
From: Stoiko Ivanov @ 2022-11-09 18:27 UTC (permalink / raw)
  To: pmg-devel

the envelope addresses are used in the rule-system for lookups and
statistics. When the mail is received with smtputf8 the addresses are
decoded (multi-byte perl-strings) and thus need encoding before using
them as parameter in a database query.

This patch encodes the addresses as utf-8 for the relevant queries
unconditionally, because envelope-senders should either be:
* (a subset of) ascii (no smtputf8) - which is invariant for utf-8
  encoding
* valid utf-8 (smtputf8)

The patch does not address the issues with multi-byte addresses in our
LDAP-implementation (hence the partial fix), but should still be an
improvment for many deployments

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/MailQueue.pm    | 10 ++++++----
 src/PMG/RuleDB/Spam.pm  |  5 +++--
 src/bin/pmg-smtp-filter |  3 ++-
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/PMG/MailQueue.pm b/src/PMG/MailQueue.pm
index 2841b07..8355c30 100644
--- a/src/PMG/MailQueue.pm
+++ b/src/PMG/MailQueue.pm
@@ -6,6 +6,7 @@ use warnings;
 use PVE::SafeSyslog;
 use MIME::Parser;
 use IO::File;
+use Encode;
 use File::Sync;
 use File::Basename;
 use File::Path;
@@ -141,6 +142,7 @@ sub quarantinedb_insert {
     my ($self, $ruledb, $lcid, $ldap, $qtype, $header, $sender, $file, $targets, $vars) = @_;
 
     eval {
+	$sender = encode('UTF-8', $sender);
 	my $dbh = $ruledb->{dbh};
 
 	my $insert_cmds = "SELECT nextval ('cmailstore_id_seq'); INSERT INTO CMailStore " .
@@ -188,11 +190,11 @@ sub quarantinedb_insert {
 	    if ($pmail eq lc ($r)) {
 		$receiver = "NULL";
 	    } else {
-		$receiver = $dbh->quote ($r);
+		$receiver = $dbh->quote (encode('UTF-8', $r));
 	    }
 
 
-	    $pmail = $dbh->quote ($pmail);
+	    $pmail = $dbh->quote (encode('UTF-8', $pmail));
 	    $insert_cmds .= "INSERT INTO CMSReceivers " .
 		"(CMailStore_CID, CMailStore_RID, PMail, Receiver, TicketID, Status, MTime) " .
 		"VALUES ($lcid, currval ('cmailstore_id_seq'), $pmail, $receiver, $tid, 'N', $now); ";
@@ -294,8 +296,8 @@ sub quarantine_mail {
 	$entity->head->delete ('Return-Path');
 
 	# prepend Delivered-To and Return-Path (like QMAIL MAILDIR FORMAT)
-	$entity->head->add ('Return-Path', join (',', $sender), 0);
-	$entity->head->add ('Delivered-To', join (',', @$tg), 0);
+	$entity->head->add ('Return-Path', encode('UTF-8', join (',', $sender)), 0);
+	$entity->head->add ('Delivered-To', encode('UTF-8', join (',', @$tg)), 0);
 
 	$entity->print ($fh);
 
diff --git a/src/PMG/RuleDB/Spam.pm b/src/PMG/RuleDB/Spam.pm
index cc9a347..b7e7dd4 100644
--- a/src/PMG/RuleDB/Spam.pm
+++ b/src/PMG/RuleDB/Spam.pm
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 use DBI;
 use Digest::SHA;
+use Encode qw(encode decode);
 use Time::HiRes qw (gettimeofday);
 
 use PVE::SafeSyslog;
@@ -135,8 +136,8 @@ sub get_blackwhite {
     my $cond = '';
     foreach my $r (@$targets) {
 	my $pmail = $msginfo->{pmail}->{$r} || lc ($r);
-	my $qr = $dbh->quote ($pmail);
-	$cond .= " OR " if $cond;  
+	my $qr = $dbh->quote (encode('UTF-8', $pmail));
+	$cond .= " OR " if $cond;
 	$cond .= "pmail = $qr";
     }	 
 
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index eaecd21..b7a130a 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -4,6 +4,7 @@ use strict;
 use warnings;
 
 use Carp;
+use Encode qw(encode decode);
 use Getopt::Long;
 use Time::HiRes qw (usleep gettimeofday tv_interval);
 use POSIX qw(:sys_wait_h errno_h signal_h);
@@ -818,7 +819,7 @@ sub handle_smtp {
 	$insert_cmds .= $dbh->quote($msginfo->{sender}) . ');';
 
 	foreach my $r (@{$msginfo->{targets}}) {
-	    my $tmp = $dbh->quote($r);
+	    my $tmp = $dbh->quote(encode('UTF-8',$r));
 	    my $blocked = $queue->{status}->{$r} eq 'blocked' ? 1 : 0;
 	    $insert_cmds .= "INSERT INTO CReceivers (CStatistic_CID, CStatistic_RID, Receiver, Blocked) " .
 		"VALUES ($lcid, currval ('cstatistic_id_seq'), $tmp, '$blocked'); ";
-- 
2.30.2





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] applied: [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution Stoiko Ivanov
@ 2022-11-11 13:56   ` Thomas Lamprecht
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Lamprecht @ 2022-11-11 13:56 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

Am 09/11/2022 um 19:27 schrieb Stoiko Ivanov:
> this patch follows 6296d93fecb84e71603c15218f6ffc9732173491 in
> properly encoding the added header-field, the way the subject is
> encoded for a notification.
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>  src/PMG/RuleDB/ModField.pm | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
>

applied this one for now, thanks!




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 3/5] fix #2541 ruledb: encode relevant values as utf-8 in database
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 3/5] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
@ 2022-11-14 14:36   ` Dominik Csapak
  0 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2022-11-14 14:36 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

misin

comments inline:

On 11/9/22 19:27, Stoiko Ivanov wrote:
> This patch adds support for storing rule names, comments(info), and
> most relevant values (e.g. the header content to match) in utf-8 in
> the database.
> 
> backwards-compatibility should not be an issue:
> * following the argumentation from commit
>    43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage
> * we only need to consider that the valid multibyte utf-8 characters
>    do not really yield sensible combinations of single-byte characters
>    (starting with a byte > 127 - e.g. "£")
> 
> the database is created with SQL_ASCII encoding - which behaves by
> interpreting bytes <= 127 as ascii and those > 127 are not interpreted
> (see [0], which just means that we have to explicitly en-/decode upon
> storing/reading from there)
> 
> This patch currently omits most Who objects:
> * for email/domain we'd still need to consider how to store them
>    (puny-code for the domain part, or everything as UTF-8) and it would
>    need changes to the API-types.
> * the LDAP objects currently would not work too well, since our LDAPCache
>    is not UTF-8 safe - and fixing warants its own patch-series
> * WhoRegex should work and be able to handle many use-cases
> 
> The ContentType values should also contain only ascii characters per
> RFC6838 [1] and RFC2045 [2].
> 
> [0] https://www.postgresql.org/docs/13/multibyte.html
> [1] https://datatracker.ietf.org/doc/html/rfc6838#section-4.2
> [2] https://datatracker.ietf.org/doc/html/rfc2045#section-5.1
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/RuleDB.pm               | 23 ++++++++++++++++-------
>   src/PMG/RuleDB/Accept.pm        |  2 +-
>   src/PMG/RuleDB/BCC.pm           |  2 +-
>   src/PMG/RuleDB/Block.pm         |  2 +-
>   src/PMG/RuleDB/Disclaimer.pm    |  2 +-
>   src/PMG/RuleDB/Group.pm         |  4 ++--
>   src/PMG/RuleDB/MatchField.pm    |  6 +++++-
>   src/PMG/RuleDB/MatchFilename.pm |  5 ++++-
>   src/PMG/RuleDB/ModField.pm      |  6 ++++--
>   src/PMG/RuleDB/Notify.pm        |  2 +-
>   src/PMG/RuleDB/Quarantine.pm    |  3 ++-
>   src/PMG/RuleDB/Remove.pm        | 12 +++++++-----
>   src/PMG/RuleDB/Rule.pm          |  2 +-
>   src/PMG/RuleDB/WhoRegex.pm      |  5 ++++-
>   src/PMG/Utils.pm                |  5 +++++
>   15 files changed, 55 insertions(+), 26 deletions(-)
> 
> diff --git a/src/PMG/RuleDB.pm b/src/PMG/RuleDB.pm
> index 895acc6..9d6d99d 100644
> --- a/src/PMG/RuleDB.pm
> +++ b/src/PMG/RuleDB.pm
> @@ -5,6 +5,7 @@ use warnings;
>   use DBI;
>   use HTML::Entities;
>   use Data::Dumper;
> +use Encode qw(encode decode);
>   
>   use PVE::SafeSyslog;
>   
> @@ -72,7 +73,8 @@ sub create_group_with_obj {
>   
>       $name //= '';
>       $info //= '';
> -
> +    $name = encode('UTF-8', $name);
> +    $info = encode('UTF-8',$info);

nit: spacing and we could combine with the lines above:

$name = encode('UTF-8', $name // '');
...


>       eval {
>   
>   	$self->{dbh}->begin_work;
> @@ -174,7 +176,9 @@ sub save_group {
>   	$self->{dbh}->do("UPDATE Objectgroup " .
>   			 "SET Name = ?, Info = ? " .
>   			 "WHERE ID = ?", undef,
> -			 $og->{name}, $og->{info}, $og->{id});
> +			 encode('UTF-8', $og->{name}),
> +			 encode('UTF-8', $og->{info}),
> +			 $og->{id});
>   
>   	return $og->{id};
>   
> @@ -183,7 +187,7 @@ sub save_group {
>   	    "INSERT INTO Objectgroup (Name, Info, Class) " .
>   	    "VALUES (?, ?, ?);");
>   
> -	$sth->execute($og->name, $og->info, $og->class);
> +	$sth->execute(encode('UTF-8', $og->name), encode('UTF-8', $og->info), $og->class);
>   
>   	return $og->{id} = PMG::Utils::lastid($self->{dbh}, 'objectgroup_id_seq');
>       }
> @@ -212,7 +216,9 @@ sub delete_group {
>   	$sth->execute($groupid);
>   
>   	if (my $ref = $sth->fetchrow_hashref()) {
> -	    die "Group '$ref->{groupname}' is used by rule '$ref->{rulename}' - unable to delete\n";
> +	    my $groupname = PMG::Utils::try_deocode_utf8($ref->{groupname});
> +	    my $rulename = PMG::Utils::try_deocode_utf8($ref->{rulename});

typo: deocode vs decode

> +	    die "Group '$groupname' is used by rule '$rulename' - unable to delete\n";
>   	}
>   
>           $sth->finish();
> @@ -474,6 +480,7 @@ sub load_object_full {
>   sub load_group_by_name {
>       my ($self, $name) = @_;
>   
> +    $name = PMG::Utils::try_decode_utf8($name);

isn't this wrong?
we encode the name before writing into the database, so we nave to
encode it again when getting it out of the db, no?

otoh, after a short grep, we never call 'load_group_by_name' anyway,
so we could just delete

>       my $sth = $self->{dbh}->prepare("SELECT * FROM Objectgroup " .
>   				    "WHERE name = ?");
>   
> @@ -598,13 +605,14 @@ sub save_rule {
>       defined($rule->{direction}) ||
>   	die "undefined rule attribute - direction: ERROR";
>   
> +    my $rulename = encode('UTF-8', $rule->{name});
>       if (defined($rule->{id})) {
>   
>   	$self->{dbh}->do(
>   	    "UPDATE Rule " .
>   	    "SET Name = ?, Priority = ?, Active = ?, Direction = ? " .
>   	    "WHERE ID = ?", undef,
> -	    $rule->{name}, $rule->{priority}, $rule->{active},
> +	    $rulename, $rule->{priority}, $rule->{active},
>   	    $rule->{direction}, $rule->{id});
>   
>   	return $rule->{id};
> @@ -614,7 +622,7 @@ sub save_rule {
>   	    "INSERT INTO Rule (Name, Priority, Active, Direction) " .
>   	    "VALUES (?, ?, ?, ?);");
>   
> -	$sth->execute($rule->name, $rule->priority, $rule->active,
> +	$sth->execute($rulename, $rule->priority, $rule->active,
>   		      $rule->direction);
>   
>   	return $rule->{id} = PMG::Utils::lastid($self->{dbh}, 'rule_id_seq');
> @@ -779,7 +787,8 @@ sub load_rules {
>       $sth->execute();
>   
>       while (my $ref = $sth->fetchrow_hashref()) {
> -	my $rule = PMG::RuleDB::Rule->new($ref->{name}, $ref->{priority},
> +	my $rulename = PMG::Utils::try_decode_utf8($ref->{name});
> +	my $rule = PMG::RuleDB::Rule->new($rulename, $ref->{priority},
>   					  $ref->{active}, $ref->{direction});
>   	$rule->{id} = $ref->{id};
>   	push @$rules, $rule;
> diff --git a/src/PMG/RuleDB/Accept.pm b/src/PMG/RuleDB/Accept.pm
> index cd67ea2..4ebd6da 100644
> --- a/src/PMG/RuleDB/Accept.pm
> +++ b/src/PMG/RuleDB/Accept.pm
> @@ -93,7 +93,7 @@ sub execute {
>       my $dkim = $msginfo->{dkim} // {};
>       my $subgroups = $mod_group->subgroups($targets, !$dkim->{sign});
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       foreach my $ta (@$subgroups) {
>   	my ($tg, $entity) = (@$ta[0], @$ta[1]);
> diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
> index d364690..c1225f3 100644
> --- a/src/PMG/RuleDB/BCC.pm
> +++ b/src/PMG/RuleDB/BCC.pm
> @@ -115,7 +115,7 @@ sub execute {
>   
>       my $subgroups = $mod_group->subgroups($targets, 1);
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       my $bcc_to = PMG::Utils::subst_values($self->{target}, $vars);
>   
> diff --git a/src/PMG/RuleDB/Block.pm b/src/PMG/RuleDB/Block.pm
> index c758787..25bb74e 100644
> --- a/src/PMG/RuleDB/Block.pm
> +++ b/src/PMG/RuleDB/Block.pm
> @@ -89,7 +89,7 @@ sub execute {
>       my ($self, $queue, $ruledb, $mod_group, $targets,
>   	$msginfo, $vars, $marks) = @_;
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       if ($msginfo->{testmode}) {
>   	my $fh = $msginfo->{test_fh};
> diff --git a/src/PMG/RuleDB/Disclaimer.pm b/src/PMG/RuleDB/Disclaimer.pm
> index d3003b2..c6afe54 100644
> --- a/src/PMG/RuleDB/Disclaimer.pm
> +++ b/src/PMG/RuleDB/Disclaimer.pm
> @@ -193,7 +193,7 @@ sub execute {
>       my ($self, $queue, $ruledb, $mod_group, $targets,
>   	$msginfo, $vars, $marks) = @_;
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       my $subgroups = $mod_group->subgroups($targets);
>   
> diff --git a/src/PMG/RuleDB/Group.pm b/src/PMG/RuleDB/Group.pm
> index 2508305..baa68ce 100644
> --- a/src/PMG/RuleDB/Group.pm
> +++ b/src/PMG/RuleDB/Group.pm
> @@ -12,8 +12,8 @@ sub new {
>       my ($type, $name, $info, $class) = @_;
>   
>       my $self = {
> -	name => $name,
> -	info => $info,
> +	name => PMG::Utils::try_decode_utf8($name),
> +	info => PMG::Utils::try_decode_utf8($info),
>   	class => $class,
>       };
>   
> diff --git a/src/PMG/RuleDB/MatchField.pm b/src/PMG/RuleDB/MatchField.pm
> index 2671ea4..8246e6e 100644
> --- a/src/PMG/RuleDB/MatchField.pm
> +++ b/src/PMG/RuleDB/MatchField.pm
> @@ -4,6 +4,7 @@ use strict;
>   use warnings;
>   use DBI;
>   use Digest::SHA;
> +use Encode qw(decode encode);

again here

>   use MIME::Words;
>   
>   use PVE::SafeSyslog;
> @@ -50,9 +51,10 @@ sub load_attr {
>       defined($field) || die "undefined object attribute: ERROR";
>       defined($field_value) || die "undefined object attribute: ERROR";
>   
> +    my $decoded_field_value = PMG::Utils::try_decode_utf8($field_value);
>       # use known constructor, bless afterwards (because sub class can have constructor
>       # with other parameter signature).
> -    my $obj =  PMG::RuleDB::MatchField->new($field, $field_value, $ogroup);
> +    my $obj =  PMG::RuleDB::MatchField->new($field, $decoded_field_value, $ogroup);
>       bless $obj, $class;
>   
>       $obj->{id} = $id;
> @@ -69,6 +71,7 @@ sub save {
>   
>       my $new_value = "$self->{field}:$self->{field_value}";
>       $new_value =~ s/\\/\\\\/g;
> +    $new_value = encode('UTF-8', $new_value);
>   
>       if (defined ($self->{id})) {
>   	# update
> @@ -106,6 +109,7 @@ sub parse_entity {
>   	    chomp $value;
>   
>   	    my $decvalue = MIME::Words::decode_mimewords($value);
> +	    $decvalue = PMG::Utils::try_decode_utf8($decvalue);

doesn't that do utf8 decoding already? we have the mail here not the database
object...

>   
>   	    if ($decvalue =~ m|$self->{field_value}|i) {
>   		push @$res, $id;
> diff --git a/src/PMG/RuleDB/MatchFilename.pm b/src/PMG/RuleDB/MatchFilename.pm
> index 7e5b486..06bf931 100644
> --- a/src/PMG/RuleDB/MatchFilename.pm
> +++ b/src/PMG/RuleDB/MatchFilename.pm
> @@ -4,6 +4,7 @@ use strict;
>   use warnings;
>   use DBI;
>   use Digest::SHA;
> +use Encode qw(encode decode);
>   use MIME::Words;
>   
>   use PMG::Utils;
> @@ -41,8 +42,9 @@ sub load_attr {
>       my $class = ref($type) || $type;
>   
>       defined($value) || die "undefined value: ERROR";;
> +    my $decvalue = PMG::Utils::try_decode_utf8($value);
>   
> -    my $obj = $class->new($value, $ogroup);
> +    my $obj = $class->new($decvalue, $ogroup);
>       $obj->{id} = $id;
>   
>       $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
> @@ -57,6 +59,7 @@ sub save {
>   
>       my $new_value = $self->{fname};
>       $new_value =~ s/\\/\\\\/g;
> +    $new_value = encode('UTF-8', $new_value);
>   
>       if (defined($self->{id})) {
>   	# update
> diff --git a/src/PMG/RuleDB/ModField.pm b/src/PMG/RuleDB/ModField.pm
> index fb15076..1e1727f 100644
> --- a/src/PMG/RuleDB/ModField.pm
> +++ b/src/PMG/RuleDB/ModField.pm
> @@ -57,7 +57,9 @@ sub load_attr {
>   
>       (defined($field) && defined($field_value)) || return undef;
>   
> -    my $obj = $class->new($field, $field_value, $ogroup);
> +    my $dec_field_value = PMG::Utils::try_decode_utf8($field_value);
> +
> +    my $obj = $class->new($field, $dec_field_value, $ogroup);
>       $obj->{id} = $id;
>   
>       $obj->{digest} = Digest::SHA::sha1_hex($id, $field, $field_value, $ogroup);
> @@ -70,7 +72,7 @@ sub save {
>   
>       defined($self->{ogroup}) || return undef;
>   
> -    my $new_value = "$self->{field}:$self->{field_value}";
> +    my $new_value = encode('UTF-8', "$self->{field}:$self->{field_value}");
>   
>       if (defined ($self->{id})) {
>   	# update
> diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
> index af853a3..bca5ebf 100644
> --- a/src/PMG/RuleDB/Notify.pm
> +++ b/src/PMG/RuleDB/Notify.pm
> @@ -208,7 +208,7 @@ sub execute {
>   :bd
>       my $from = 'postmaster';
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       my $body = PMG::Utils::subst_values($self->{body}, $vars);
>       my $subject = PMG::Utils::subst_values($self->{subject}, $vars);
> diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
> index 1426393..30bc5ec 100644
> --- a/src/PMG/RuleDB/Quarantine.pm
> +++ b/src/PMG/RuleDB/Quarantine.pm
> @@ -4,6 +4,7 @@ use strict;
>   use warnings;
>   use DBI;
>   use Digest::SHA;
> +use Encode qw(decode encode);

why decode too?

>   
>   use PVE::SafeSyslog;
>   
> @@ -89,7 +90,7 @@ sub execute {
>       
>       my $subgroups = $mod_group->subgroups($targets, 1);
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       foreach my $ta (@$subgroups) {
>   	my ($tg, $entity) = (@$ta[0], @$ta[1]);
> diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
> index 6b27b91..da6c25f 100644
> --- a/src/PMG/RuleDB/Remove.pm
> +++ b/src/PMG/RuleDB/Remove.pm
> @@ -63,12 +63,14 @@ sub load_attr {
>   
>       defined ($value) || die "undefined value: ERROR";
>   
> -    my $obj;
> +    my ($obj, $text);
>   
>       if ($value =~ m/^([01])\,([01])(\:(.*))?$/s) {
> -	$obj = $class->new($1, $4, $ogroup, $2);
> +	$text = PMG::Utils::try_decode_utf8($4);
> +	$obj = $class->new($1, $text, $ogroup, $2);
>       } elsif ($value =~ m/^([01])(\:(.*))?$/s) {
> -	$obj = $class->new($1, $3, $ogroup);
> +	$text = PMG::Utils::try_decode_utf8($3);
> +	$obj = $class->new($1, $text, $ogroup);
>       } else {
>   	$obj = $class->new(0, undef, $ogroup);
>       }
> @@ -89,7 +91,7 @@ sub save {
>       $value .= ','. ($self->{quarantine} ? '1' : '0');
>   
>       if ($self->{text}) {
> -	$value .= ":$self->{text}";
> +	$value .= encode('UTF-8', ":$self->{text}");
>       }
>   
>       if (defined ($self->{id})) {
> @@ -194,7 +196,7 @@ sub execute {
>       my ($self, $queue, $ruledb, $mod_group, $targets,
>   	$msginfo, $vars, $marks, $ldap) = @_;
>   
> -    my $rulename = $vars->{RULE} // 'unknown';
> +    my $rulename = encode('UTF-8', $vars->{RULE} // 'unknown');
>   
>       if (!$self->{all} && ($#$marks == -1)) {
>   	# no marks
> diff --git a/src/PMG/RuleDB/Rule.pm b/src/PMG/RuleDB/Rule.pm
> index c49ad21..e7c9146 100644
> --- a/src/PMG/RuleDB/Rule.pm
> +++ b/src/PMG/RuleDB/Rule.pm
> @@ -12,7 +12,7 @@ sub new {
>       my ($type, $name, $priority, $active, $direction) = @_;
>   
>       my $self = {
> -	name => $name // '',
> +	name => PMG::Utils::try_decode_utf8($name) // '',
>   	priority => $priority // 0,
>   	active => $active // 0,
>       };
> diff --git a/src/PMG/RuleDB/WhoRegex.pm b/src/PMG/RuleDB/WhoRegex.pm
> index 37ec3aa..ccc94a0 100644
> --- a/src/PMG/RuleDB/WhoRegex.pm
> +++ b/src/PMG/RuleDB/WhoRegex.pm
> @@ -4,6 +4,7 @@ use strict;
>   use warnings;
>   use DBI;
>   use Digest::SHA;
> +use Encode qw(decode encode);

again here

>   
>   use PMG::Utils;
>   use PMG::RuleDB::Object;
> @@ -43,7 +44,8 @@ sub load_attr {
>   
>       defined($value) || die "undefined value: ERROR";
>   
> -    my $obj = $class->new ($value, $ogroup);
> +    my $decoded_value = PMG::Utils::try_decode_utf8($value);
> +    my $obj = $class->new ($decoded_value, $ogroup);
>       $obj->{id} = $id;
>   
>       $obj->{digest} = Digest::SHA::sha1_hex($id, $value, $ogroup);
> @@ -59,6 +61,7 @@ sub save {
>   
>       my $adr = $self->{address};
>       $adr =~ s/\\/\\\\/g;
> +    $adr = encode('UTF-8', $adr);
>   
>       if (defined ($self->{id})) {
>   	# update
> diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
> index cef232b..23f60eb 100644
> --- a/src/PMG/Utils.pm
> +++ b/src/PMG/Utils.pm
> @@ -1542,4 +1542,9 @@ sub get_existing_object_id {
>       return;
>   }
>   
> +sub try_decode_utf8 {
> +    my ($data) = @_;
> +    return eval { decode('UTF-8', $data, 1) } // $data;
> +}
> +
>   1;





^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 4/5] ruledb: encode e-mail addresses for syslog
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 4/5] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
@ 2022-11-14 14:49   ` Dominik Csapak
  0 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2022-11-14 14:49 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

question: since the rulenames are now utf-8, wouldn't we have to encode
them here too?

nits inline:

On 11/9/22 19:27, Stoiko Ivanov wrote:
> as done in 114655f4fdb07c789a361b2f397f5345eafd16c6 for Accept and
> Block.
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/RuleDB/BCC.pm        | 17 +++++++++++++++--
>   src/PMG/RuleDB/Notify.pm     | 17 +++++++++++++++--
>   src/PMG/RuleDB/Quarantine.pm | 16 ++++++++++++++--
>   src/PMG/RuleDB/Remove.pm     |  8 +++++++-
>   4 files changed, 51 insertions(+), 7 deletions(-)
> 
> diff --git a/src/PMG/RuleDB/BCC.pm b/src/PMG/RuleDB/BCC.pm
> index c1225f3..e56f051 100644
> --- a/src/PMG/RuleDB/BCC.pm
> +++ b/src/PMG/RuleDB/BCC.pm
> @@ -165,9 +165,22 @@ sub execute {
>   		$msginfo->{xforward}, $msginfo->{fqdn}, $param);
>   	    foreach (@bcc_targets) {
>   		if ($qid) {
> -		    syslog('info', "%s: bcc to <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
> +		    syslog(
> +			'info',
> +			"%s: bcc to <%s> (rule: %s, %s)",
> +			$queue->{logid},
> +			encode('UTF-8',$_),
> +			$rulename,
> +			$qid,
> +		    );
>   		} else {
> -		    syslog('err', "%s: bcc to <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
> +		    syslog(
> +			'err',
> +			"%s: bcc to <%s> (rule: %s) failed",
> +			$queue->{logid},
> +			encode('UTF-8',$_),
> +			$rulename,
> +		    );
>   		}

i'd rather prefer to encode $_ once before the if and using that instead of having
two encode calls:

my $target = encode(... $_);

if ($qid) {
     syslog(..., $target);
else {
     syslog(..., $target);
}

>   	    }
>   	}
> diff --git a/src/PMG/RuleDB/Notify.pm b/src/PMG/RuleDB/Notify.pm
> index bca5ebf..ee5d2ac 100644
> --- a/src/PMG/RuleDB/Notify.pm
> +++ b/src/PMG/RuleDB/Notify.pm
> @@ -260,9 +260,22 @@ sub execute {
>   	    $top, $from, \@targets, undef, $msginfo->{fqdn});
>   	foreach (@targets) {
>   	    if ($qid) {
> -		syslog('info', "%s: notify <%s> (rule: %s, %s)", $queue->{logid}, $_, $rulename, $qid);
> +		syslog(
> +		    'info',
> +		    "%s: notify <%s> (rule: %s, %s)",
> +		    $queue->{logid},
> +		    encode('UTF-8', $_),
> +		    $rulename,
> +		    $qid,
> +		);
>   	    } else {
> -		syslog ('err', "%s: notify <%s> (rule: %s) failed", $queue->{logid}, $_, $rulename);
> +		syslog (
> +		    'err',
> +		    "%s: notify <%s> (rule: %s) failed",
> +		    $queue->{logid},
> +		    encode('UTF-8', $_),
> +		    $rulename,
> +		);
>   	    }

same here

>   	}
>       }
> diff --git a/src/PMG/RuleDB/Quarantine.pm b/src/PMG/RuleDB/Quarantine.pm
> index 30bc5ec..f7154d8 100644
> --- a/src/PMG/RuleDB/Quarantine.pm
> +++ b/src/PMG/RuleDB/Quarantine.pm
> @@ -101,7 +101,13 @@ sub execute {
>   	    if (my $qid = $queue->quarantine_mail($ruledb, 'V', $entity, $tg, $msginfo, $vars, $ldap)) {
>   
>   		foreach (@$tg) {
> -		    syslog ('info', "$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)", $_, $qid, $rulename);
> +		    syslog (
> +			'info',
> +			"$queue->{logid}: moved mail for <%s> to virus quarantine - %s (rule: %s)",
> +			encode('UTF-8',$_),
> +			$qid,
> +			$rulename,
> +		    );
>   		}
>   
>   		$queue->set_status ($tg, 'delivered');
> @@ -111,7 +117,13 @@ sub execute {
>   	    if (my $qid = $queue->quarantine_mail($ruledb, 'S', $entity, $tg, $msginfo, $vars, $ldap)) {
>   
>   		foreach (@$tg) {
> -		    syslog ('info', "$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)", $_, $qid, $rulename);
> +		    syslog (
> +			'info',
> +			"$queue->{logid}: moved mail for <%s> to spam quarantine - %s (rule: %s)",
> +			encode('UTF-8',$_),
> +			$qid,
> +			$rulename,
> +		    );
>   		}
>   
>   		$queue->set_status($tg, 'delivered');
> diff --git a/src/PMG/RuleDB/Remove.pm b/src/PMG/RuleDB/Remove.pm
> index da6c25f..e7c353c 100644
> --- a/src/PMG/RuleDB/Remove.pm
> +++ b/src/PMG/RuleDB/Remove.pm
> @@ -235,7 +235,13 @@ sub execute {
>   		}
>   
>   		foreach (@$tg) {
> -		    syslog ('info', "$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)", $_, $qid, $rulename);
> +		    syslog (
> +			'info',
> +			"$queue->{logid}: moved mail for <%s> to attachment quarantine - %s (rule: %s)",
> +			encode('UTF-8',$_),
> +			$qid,
> +			$rulename,
> +		    );
>   		}
>   	    }
>   	}





^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails
  2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
                   ` (4 preceding siblings ...)
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 5/5] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
@ 2022-11-14 16:02 ` Dominik Csapak
  5 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2022-11-14 16:02 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

ok tested a bit around with this series

generally "works" as in mail flows and reaches the right things
(recipient/quarantine/etc) AFAICS

some things are a bit broken:

* using notification/modify field with smtputf8 has not the desired result:
   sending an email with smtputf8 and an utf8 encoded subject results in
   the subject being \x "encoded", in the quarantine the notifications
   and the resulting mail on delivery
   (unicode characters configured in the rule themselves show properly)

   ideally this would be detected and properly de/encoded

* still some issues with the statistics database
   (talked to stoiko off list about that)

* the quarantine ui is rather broken with this:
   neither the sender/recipient nor the mail/subject are correctly (en?)decoded
   such that the utf-8 bytes are double encoded

   we may want to save the info if the mail came from a 'smtputf8' source somewhere
   so that we can properly de/encode the info again?

   also i'm not sure if we want to release it, with the quarantine in this state.
   i guess it'll be one of the first bug reports then..

What worked well:

* using unicode characters in the rule system (where appropriate):
   - rule names
   - rule comments
   - rule values

   i tested as many rules as i could find where it would make sense:
   match field, attachment replacement, notify text, modify field, and so on

* sending / receiving mails with unicode characters in the sender/recipient


What's missing:

* ldap and who objects are a big one -> we should soon think about how we can do that
* statistics entries


all in all a good step in the right direction, thanks :)




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions Stoiko Ivanov
@ 2022-11-14 16:02   ` Dominik Csapak
  2022-11-15 14:32   ` [pmg-devel] applied: " Thomas Lamprecht
  1 sibling, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2022-11-14 16:02 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

LGTM




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 5/5] partially fix #2465: handle smtputf8 addresses in the rule-system
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 5/5] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
@ 2022-11-14 16:03   ` Dominik Csapak
  0 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2022-11-14 16:03 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

as talked off-list, there seems to be something not right here
as the statistics code still seems to trigger some errors







^ permalink raw reply	[flat|nested] 13+ messages in thread

* [pmg-devel] applied: [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions
  2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions Stoiko Ivanov
  2022-11-14 16:02   ` Dominik Csapak
@ 2022-11-15 14:32   ` Thomas Lamprecht
  1 sibling, 0 replies; 13+ messages in thread
From: Thomas Lamprecht @ 2022-11-15 14:32 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

Am 09/11/2022 um 19:27 schrieb Stoiko Ivanov:
> * ReportSpam
> * Attach
> * Counter
> 
> are all still present since (at least) the release of PMG 5.0, but
> were never exposed in the API/GUI.
> 
> All of them in their current form don't seem to fit well nowadays, or
> their functionality was taken over by some other Action:
> * Attach - the functionality is currently present in the Notify action
>   (attach original mail)
> * Counter - without a matching What object simply increasing a counter
>   by one in the database serves no purpose
> * ReportSpam - sending potentially sensitive mail automatically to the
>   public SpamAssassin project does not seem to fit well nowadays
> 
> Instead of dropping them right away - this patch adds logging when
> they are encountered while loading or when they are run, to keep
> backwards-compatibility for users who have very long-running PMG
> instances (not sure if the actions were ever used in the pre git-days
> of PMG)
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>  src/PMG/RuleDB.pm            | 14 +++++++++++++-
>  src/PMG/RuleDB/Attach.pm     |  7 +++++++
>  src/PMG/RuleDB/Counter.pm    |  5 +++++
>  src/PMG/RuleDB/ReportSpam.pm |  5 +++++
>  4 files changed, 30 insertions(+), 1 deletion(-)
> 
>

applied, thanks!




^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-11-15 14:32 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-09 18:27 [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Stoiko Ivanov
2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 1/5] ruledb: modfield: properly encode field after variable substitution Stoiko Ivanov
2022-11-11 13:56   ` [pmg-devel] applied: " Thomas Lamprecht
2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 2/5] ruledb: add deprecation warnings for unused actions Stoiko Ivanov
2022-11-14 16:02   ` Dominik Csapak
2022-11-15 14:32   ` [pmg-devel] applied: " Thomas Lamprecht
2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 3/5] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
2022-11-14 14:36   ` Dominik Csapak
2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 4/5] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
2022-11-14 14:49   ` Dominik Csapak
2022-11-09 18:27 ` [pmg-devel] [PATCH pmg-api 5/5] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
2022-11-14 16:03   ` Dominik Csapak
2022-11-14 16:02 ` [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for non-ascii tests and mails Dominik Csapak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal