From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dominik Csapak
To: pmg-devel@lists.proxmox.com
Date: Thu, 24 Nov 2022 13:21:01 +0100
Message-Id: <20221124122112.666868-2-d.csapak@proxmox.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221124122112.666868-1-d.csapak@proxmox.com>
References: <20221124122112.666868-1-d.csapak@proxmox.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [pmg-devel] [PATCH pmg-api v4 01/12] utils: return perl string from decode_rfc1522
List-Id: Proxmox Mail Gateway development discussion

From: Stoiko Ivanov

decode_rfc1522 is a more robust version of decode_mimewords (in scalar
context) - adapt it to return a perlstring, under the assumption that
data is utf-8 encoded (holds true for ascii and smtputf8 mails)

the try_decode_utf8 helper sub will be used extensively in later
patches and is inspired by commit
43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that the valid multibyte utf-8 characters do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£") so if something decodes without error
from utf-8 it will in all likelihood have been utf-8 to begin with

Signed-off-by: Stoiko Ivanov
Signed-off-by: Dominik Csapak
---
 src/PMG/Utils.pm | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index cef232b..cfb8852 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -1088,6 +1088,7 @@ sub decode_to_html {
     return $res;
 }
 
+# assume enc contains utf-8 and mime-encoded data returns a perl-string (with wide characters)
 sub decode_rfc1522 {
     my ($enc) = @_;
 
@@ -1102,7 +1103,7 @@ sub decode_rfc1522 {
                 if ($cs) {
                     $res .= decode($cs, $d);
                 } else {
-                    $res .= $d;
+                    $res .= try_decode_utf8($d);
                 }
             }
         }
@@ -1542,4 +1543,9 @@ sub get_existing_object_id {
     return;
 }
 
+sub try_decode_utf8 {
+    my ($data) = @_;
+    return eval { decode('UTF-8', $data, 1) } // $data;
+}
+
 1;
-- 
2.30.2