From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <d.csapak@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id EE22691980
 for <pmg-devel@lists.proxmox.com>; Mon, 14 Nov 2022 17:02:38 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id D195125650
 for <pmg-devel@lists.proxmox.com>; Mon, 14 Nov 2022 17:02:08 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pmg-devel@lists.proxmox.com>; Mon, 14 Nov 2022 17:02:07 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 57AF443B94;
 Mon, 14 Nov 2022 17:02:07 +0100 (CET)
Message-ID: <604d6ab9-26bf-308d-0072-a2a45b867ced@proxmox.com>
Date: Mon, 14 Nov 2022 17:02:06 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:107.0) Gecko/20100101
 Thunderbird/107.0
Content-Language: en-US
To: Stoiko Ivanov <s.ivanov@proxmox.com>, pmg-devel@lists.proxmox.com
References: <20221109182728.629576-1-s.ivanov@proxmox.com>
From: Dominik Csapak <d.csapak@proxmox.com>
In-Reply-To: <20221109182728.629576-1-s.ivanov@proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results: =?UTF-8?Q?0=0A=09?=AWL 0.066 Adjusted
 score from AWL reputation of From: =?UTF-8?Q?address=0A=09?=BAYES_00 -1.9
 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict
 =?UTF-8?Q?Alignment=0A=09?=NICE_REPLY_A -0.001 Looks like a legit reply (A)
 SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF
 =?UTF-8?Q?Record=0A=09?=SPF_PASS -0.001 SPF: sender matches SPF record
Subject: Re: [pmg-devel] [PATCH pmg-api 0/5] ruledb - improve experience for
 non-ascii tests and mails
X-BeenThere: pmg-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Mail Gateway development discussion
 <pmg-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pmg-devel>, 
 <mailto:pmg-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pmg-devel/>
List-Post: <mailto:pmg-devel@lists.proxmox.com>
List-Help: <mailto:pmg-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel>, 
 <mailto:pmg-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Mon, 14 Nov 2022 16:02:39 -0000

ok, tested this series a bit

generally "works", as in mail flows and reaches the right destination
(recipient/quarantine/etc.) AFAICS

some things are a bit broken:

* using notification/modify field with smtputf8 does not have the desired result:
   sending an email with smtputf8 and a utf8 encoded subject results in
   the subject being \x "encoded" in the quarantine, the notifications
   and the resulting mail on delivery
   (unicode characters configured in the rule themselves show properly)

   ideally this would be detected and properly de/encoded

* still some issues with the statistics database
   (talked to stoiko off list about that)

* the quarantine ui is rather broken with this:
   neither the sender/recipient nor the mail/subject is correctly (en?)decoded,
   so the utf-8 bytes end up double encoded

   we may want to save somewhere whether the mail came from a 'smtputf8' source,
   so that we can properly de/encode the info again?

   also i'm not sure if we want to release it with the quarantine in this state.
   i guess it'll be one of the first bug reports then..
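
to illustrate what "double encoded" means for the quarantine point above, here
is a minimal sketch in python (pmg-api itself is perl, and i have not checked
where exactly it goes wrong there). it assumes the usual mechanism: bytes that
are already utf-8 get treated as latin-1 characters and are encoded to utf-8 a
second time. the helper names are made up for this example only.

```python
# illustration only: hypothetical helpers, not code from pmg-api

def double_encode(text: str) -> bytes:
    """Encode to UTF-8, misread those bytes as Latin-1, then encode again."""
    utf8_bytes = text.encode("utf-8")
    # every single byte is turned into one Latin-1 character here
    misread = utf8_bytes.decode("latin-1")
    return misread.encode("utf-8")


def undo_double_encoding(data: bytes) -> str:
    """Reverse double_encode(); raises an error if data was not double encoded."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


# U+00FC (u with diaeresis) is b"\xc3\xbc" in UTF-8;
# after the second encoding it becomes four bytes: b"\xc3\x83\xc2\xbc"
broken = double_encode("\u00fc")
print(broken)

# the round trip gives back the original string
subject = "Gr\u00fc\u00dfe"
print(undo_double_encoding(double_encode(subject)) == subject)
```

the reverse step only works if we know the data was double encoded in the
first place, which is one more reason to store the 'smtputf8' info with the
mail.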

What worked well:

* using unicode characters in the rule system (where appropriate):
   - rule names
   - rule comments
   - rule values

   i tested as many rules as i could find where it would make sense:
   match field, attachment replacement, notify text, modify field, and so on

* sending / receiving mails with unicode characters in the sender/recipient


What's missing:

* ldap and who objects are a big one -> we should soon think about how we can do that
* statistics entries


all in all a good step in the right direction, thanks :)