public inbox for pmg-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: Dominik Csapak <d.csapak@proxmox.com>
To: pmg-devel@lists.proxmox.com
Subject: [pmg-devel] [PATCH pmg-api v4 01/12] utils: return perl string from decode_rfc1522
Date: Thu, 24 Nov 2022 13:21:01 +0100	[thread overview]
Message-ID: <20221124122112.666868-2-d.csapak@proxmox.com> (raw)
In-Reply-To: <20221124122112.666868-1-d.csapak@proxmox.com>

From: Stoiko Ivanov <s.ivanov@proxmox.com>

decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context) - adapt it to return a perlstring, under the
assumption that data is utf-8 encoded (holds true for ascii and
smtputf8 mails)

the try_decode_utf8 helper sub backwards will be used extensively in
later patches and is inspired by commit
43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that the valid multibyte utf-8 characters do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£") so if something decodes without error
from utf-8 it will in all likelyhood have been utf-8 to begin with

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 src/PMG/Utils.pm | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..cfb8852 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1088,6 +1088,7 @@ sub decode_to_html {
     return $res;
 }
 
+# assume enc contains utf-8 and mime-encoded data returns a perl-string (with wide characters)
 sub decode_rfc1522 {
     my ($enc) = @_;
 
@@ -1102,7 +1103,7 @@ sub decode_rfc1522 {
 		if ($cs) {
 		    $res .= decode($cs, $d);
 		} else {
-		    $res .= $d;
+		    $res .= try_decode_utf8($d);
 		}
 	    }
 	}
@@ -1542,4 +1543,9 @@ sub get_existing_object_id {
     return;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+    return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
 1;
-- 
2.30.2





  reply	other threads:[~2022-11-24 12:21 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-24 12:21 [pmg-devel] [PATCH pmg-api v4 00/12] ruledb - improve experience for non-ascii tests and mails Dominik Csapak
2022-11-24 12:21 ` Dominik Csapak [this message]
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 02/12] ruledb: properly substitute prox_vars in headers Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 03/12] fix #2541 ruledb: encode relevant values as utf-8 in database Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 04/12] ruledb: encode e-mail addresses for syslog Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 05/12] partially fix #2465: handle smtputf8 addresses in the rule-system Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 06/12] quarantine: handle utf8 data Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 07/12] pmgqm: handle smtputf8 data Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 08/12] statistics: handle utf8 data Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 09/12] quarantine: fix adding non-ascii senders to wl/bl Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 10/12] utils: refactor rfc1522_to_html Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 11/12] ldap: improve unicode support Dominik Csapak
2022-11-24 12:21 ` [pmg-devel] [PATCH pmg-api v4 12/12] statistics: refactor filter_text generation Dominik Csapak
2022-11-24 15:45 ` [pmg-devel] applied-series: [PATCH pmg-api v4 00/12] ruledb - improve experience for non-ascii tests and mails Thomas Lamprecht

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221124122112.666868-2-d.csapak@proxmox.com \
    --to=d.csapak@proxmox.com \
    --cc=pmg-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal