From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <s.ivanov@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 5AF1CB005
 for <pmg-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:24:14 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 3C90C20184
 for <pmg-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:23:44 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pmg-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:23:42 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id AD85344452
 for <pmg-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:23:42 +0100 (CET)
From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: pmg-devel@lists.proxmox.com
Date: Wed, 23 Nov 2022 10:23:27 +0100
Message-Id: <20221123092336.11423-2-s.ivanov@proxmox.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221123092336.11423-1-s.ivanov@proxmox.com>
References: <20221123092336.11423-1-s.ivanov@proxmox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL -0.083 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 KAM_NUMSUBJECT 0.5 Subject ends in numbers excluding current years
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: [pmg-devel] [PATCH pmg-api v3 1/8] utils: return perl string from
 decode_rfc1522
X-BeenThere: pmg-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Mail Gateway development discussion
 <pmg-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pmg-devel>, 
 <mailto:pmg-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pmg-devel/>
List-Post: <mailto:pmg-devel@lists.proxmox.com>
List-Help: <mailto:pmg-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel>, 
 <mailto:pmg-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 23 Nov 2022 09:24:14 -0000

decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context). Adapt it to return a perl string, under the
assumption that the data is UTF-8 encoded (which holds true for ASCII
and SMTPUTF8 mails).

The try_decode_utf8 helper sub will be used extensively in later
patches and is inspired by commit
43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We assume that valid multibyte UTF-8 sequences do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127, e.g. "£"), so if something decodes without error
from UTF-8, it will in all likelihood have been UTF-8 to begin with.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
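Note for reviewers: a minimal standalone sketch of the heuristic from
the commit message, in case it helps to try it outside of PMG. The
helper body mirrors the one added in this patch; the sample inputs and
variable names are illustrative assumptions and do not come from
pmg-api.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Same shape as the helper added by this patch: strict UTF-8 decode,
# falling back to the unmodified input if decoding dies.
sub try_decode_utf8 {
    my ($data) = @_;
    return eval { decode('UTF-8', $data, 1) } // $data;
}

# Illustrative inputs (assumptions, not taken from the patch).
my $utf8_bytes   = "\xe2\x82\xac";   # U+20AC (euro sign) as three UTF-8 bytes
my $latin1_bytes = "\xa3";           # pound sign as one latin-1 byte, invalid as UTF-8
my $ascii_bytes  = "plain ascii";    # ASCII is valid UTF-8 and decodes to itself

my $decoded  = try_decode_utf8($utf8_bytes);
my $fallback = try_decode_utf8($latin1_bytes);
my $ascii    = try_decode_utf8($ascii_bytes);

printf "utf-8 input:   %d byte(s) in, %d character(s) out\n",
    length($utf8_bytes), length($decoded);
printf "latin-1 input: returned %s\n",
    $fallback eq $latin1_bytes ? "unchanged" : "modified";
printf "ascii input:   returned %s\n",
    $ascii eq $ascii_bytes ? "unchanged" : "modified";
```

The third argument to decode (1, i.e. Encode::FB_CROAK) makes invalid
input die instead of being replaced with substitution characters,
which is what lets the eval/fallback distinguish the two cases.
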
 src/PMG/Utils.pm | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..cfb8852 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1088,6 +1088,7 @@ sub decode_to_html {
     return $res;
 }
 
+# assumes $enc contains utf-8 and mime-encoded data; returns a perl string (with wide characters)
 sub decode_rfc1522 {
     my ($enc) = @_;
 
@@ -1102,7 +1103,7 @@ sub decode_rfc1522 {
 		if ($cs) {
 		    $res .= decode($cs, $d);
 		} else {
-		    $res .= $d;
+		    $res .= try_decode_utf8($d);
 		}
 	    }
 	}
@@ -1542,4 +1543,9 @@ sub get_existing_object_id {
     return;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+    return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
 1;
-- 
2.30.2