From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: pmg-devel@lists.proxmox.com
Subject: [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails
Date: Wed, 23 Nov 2022 10:23:26 +0100
Message-ID: <20221123092336.11423-1-s.ivanov@proxmox.com>
v2->v3:
* dropped the useless decode/encode/decode chain in decode_rfc1522
* moved try_decode_utf8 to patch 1 as it's now used there
* renamed 'encode_user_stat' to 'user_stat_to_perlstring' as this is what
the helper actually does
* the 2 patches for pmg-gui make it possible to add user black/whitelist
entries for non-ascii e-mails
* quickly re-verified that pmgpolicy should be robust for smtputf8 mail
(postfix hands the data over as utf-8 - and pmgpolicy does not parse it)
Thanks again to Dominik for the off-list suggestions!
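For readers who want to try the behaviour described in the v3 notes, here is a minimal sketch of what "returning a perl string" from an RFC 1522 header and a "try to decode as utf-8" helper could look like. It is not the code from the patches: try_decode_utf8_sketch is a hypothetical stand-in, and the only library calls used are decode() and FB_CROAK from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper (NOT the implementation from the patch): return the
# decoded perl string if $data is valid UTF-8, otherwise return $data as-is.
sub try_decode_utf8_sketch {
    my ($data) = @_;
    return $data if !defined($data);

    # decode() with a CHECK value may modify its argument, so work on a copy.
    my $copy = $data;
    my $decoded = eval { decode('UTF-8', $copy, FB_CROAK) };
    return defined($decoded) ? $decoded : $data;
}

# An RFC 1522/2047 encoded-word, decoded straight to a perl string
# (characters, not octets) via Encode's MIME-Header encoding.
my $subject = decode('MIME-Header', '=?ISO-8859-1?Q?Gr=FC=DFe?=');

# Valid UTF-8 octets become one character; invalid input is passed through.
my $euro   = try_decode_utf8_sketch("\xe2\x82\xac");
my $latin1 = try_decode_utf8_sketch("M\xfcller");

printf "subject has %d characters\n", length($subject);
printf "euro has %d character(s)\n", length($euro);
```

The fall-through on invalid input is a design choice of this sketch only; the real helper may behave differently.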
original cover-letter for v2:
v1->v2:
* dropped already applied patches
* added a patch for one further glitch in ModField/Notify actions (when
parsing/replacing non-ascii characters) - patch 1/5+2/5
* added support for utf-8 data in the mailflow additionally for:
** quarantine API handling
** user BL/WL (the GUI still needs adaptation to parse e-mail-addresses
more liberally - but else it seems to work)
** pmgqm (spamreports)
** statistics
still missing support for:
* LDAP
* Who Objects
huge thanks to Dominik for taking the time to review and test the v1!
original cover-letter for v1:
this patchseries partially fixes #2465 and #2541, two frequently reported
issues that cause a disappointing experience for users in non-ascii-only
environments
the main assumptions of the patches are:
* envelope addresses are either ascii or utf-8 (latter only with smtputf8)
* thus we can unconditionally de-/encode envelope addresses for database
results/lookups
* the matching in the rule-objects will see the relevant parts of the mail
as properly encoded perl-strings (with multi-byte characters - e.g. the
euro sign as \x{20ac} instead of \x{e2}\x{82}\x{ac})
(I did a bit of testing to verify them, by e.g. sending an ISO-8859-1
encoded mail and matching for an umlaut in the subject)
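To illustrate the third assumption, here is a small standalone sketch of the difference between raw octets and a decoded perl string. It uses only the core Encode module; the sample values are made up for illustration and nothing here is taken from the PMG code.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw UTF-8 octets of the euro sign, as they arrive on the wire.
my $bytes = "\xe2\x82\xac";

# Decoding turns the three octets into a single character (U+20AC).
my $chars = decode('UTF-8', $bytes);
printf "octets: %d, characters: %d\n", length($bytes), length($chars);

# A character-level pattern matches only the decoded form.
my $decoded_matches = ($chars =~ /^\x{20ac}$/) ? 1 : 0;
my $raw_matches     = ($bytes =~ /\x{20ac}/)   ? 1 : 0;

# The same text in two wire charsets ends up as the same perl string,
# so a rule only has to be written once (sample name is made up).
my $from_latin1 = decode('ISO-8859-1', "M\xfcller");
my $from_utf8   = decode('UTF-8', "M\xc3\xbcller");
my $same = ($from_latin1 eq $from_utf8) ? 1 : 0;

# Encode again at the boundary, e.g. before a database write or syslog.
my $out = encode('UTF-8', $chars);
```

Decoding once at the input boundary and encoding once at the output boundary keeps the matching code free of charset handling.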
While going through the RuleDB classes I remembered that we have a few
pieces of legacy objects (Attach, ReportSpam, Counter actions) there, and
went ahead with deprecating them (initially I simply deleted them, but
decided to be more cautious and just log the deprecation until 8.0, when
we can drop them explicitly). They cannot be instantiated currently (short
of a direct insert into the database) - but I don't know if they were ever
used in pre-5.0 times in their current form. - patch 2/5.
Out of scope of the series for now:
* utf-8 support in the LDAP subsystem (deployments with a configured LDAP
profile still won't be able to process smtputf8 mails) - mostly until I
get around to creating a test environment with the appropriate schema for
non-ascii mail addresses
* Domain/Email objects - did not find the time to consider how to store
them most sensibly (punycode, utf-8) and if the choice should be
carried over to all of our 'email' formats (it probably shouldn't)
patches 1/5 and 4/5 address 2 small bugs I ran into while testing
Given that I quite often miss a few fine points or use-cases I'd be very
grateful for some more experimenting/testing!
pmg-api:
Stoiko Ivanov (8):
utils: return perl string from decode_rfc1522
ruledb: properly substitute prox_vars in headers
fix #2541 ruledb: encode relevant values as utf-8 in database
ruledb: encode e-mail addresses for syslog
partially fix #2465: handle smtputf8 addresses in the rule-system
quarantine: handle utf8 data
pmgqm: handle smtputf8 data
statistics: handle utf8 data
src/PMG/API2/Quarantine.pm | 16 ++++----
src/PMG/CLI/pmgqm.pm | 24 ++++++------
src/PMG/HTMLMail.pm | 7 ++--
src/PMG/MailQueue.pm | 10 +++--
src/PMG/Quarantine.pm | 13 ++++---
src/PMG/RuleDB.pm | 24 ++++++++----
src/PMG/RuleDB/Accept.pm | 2 +-
src/PMG/RuleDB/BCC.pm | 23 +++++++++--
src/PMG/RuleDB/Block.pm | 2 +-
src/PMG/RuleDB/Disclaimer.pm | 2 +-
src/PMG/RuleDB/Group.pm | 4 +-
src/PMG/RuleDB/MatchField.pm | 8 +++-
src/PMG/RuleDB/MatchFilename.pm | 5 ++-
src/PMG/RuleDB/ModField.pm | 19 +++-------
src/PMG/RuleDB/Notify.pm | 24 +++++++++---
src/PMG/RuleDB/Quarantine.pm | 19 ++++++++--
src/PMG/RuleDB/Remove.pm | 20 +++++++---
src/PMG/RuleDB/Rule.pm | 2 +-
src/PMG/RuleDB/Spam.pm | 17 +++++----
src/PMG/RuleDB/WhoRegex.pm | 5 ++-
src/PMG/Statistic.pm | 67 ++++++++++++++++++++++++---------
src/PMG/Utils.pm | 32 ++++++++++++++--
src/bin/pmg-smtp-filter | 7 ++--
23 files changed, 238 insertions(+), 114 deletions(-)
pmg-gui:
Stoiko Ivanov (2):
utils: add custom validator for pmg-email-address
userblocklists: use PMGMail as validator for pmail
js/UserBlackWhiteList.js | 2 +-
js/Utils.js | 9 +++++++++
2 files changed, 10 insertions(+), 1 deletion(-)
--
2.30.2
Thread overview: 16+ messages
2022-11-23 9:23 Stoiko Ivanov [this message]
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from decode_rfc1522 Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 2/8] ruledb: properly substitute prox_vars in headers Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 3/8] fix #2541 ruledb: encode relevant values as utf-8 in database Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 4/8] ruledb: encode e-mail addresses for syslog Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 5/8] partially fix #2465: handle smtputf8 addresses in the rule-system Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 6/8] quarantine: handle utf8 data Stoiko Ivanov
2022-11-23 14:15 ` Dominik Csapak
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 7/8] pmgqm: handle smtputf8 data Stoiko Ivanov
2022-11-23 14:20 ` Dominik Csapak
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-api v3 8/8] statistics: handle utf8 data Stoiko Ivanov
2022-11-23 14:26 ` Dominik Csapak
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-gui v3 1/2] utils: add custom validator for pmg-email-address Stoiko Ivanov
2022-11-23 9:23 ` [pmg-devel] [PATCH pmg-gui v3 2/2] userblocklists: use PMGMail as validator for pmail Stoiko Ivanov
2022-11-23 14:09 ` [pmg-devel] [PATCH pmg-api/pmg-gui v3] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
2022-11-26 7:00 ` [pmg-devel] applied-gui: " Thomas Lamprecht