From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
 by lore.proxmox.com (Postfix) with ESMTPS id 099A21FF168
 for ; Tue, 18 Feb 2025 18:18:51 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 2643612999;
 Tue, 18 Feb 2025 18:18:47 +0100 (CET)
Message-ID:
Date: Tue, 18 Feb 2025 18:18:13 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
To: Stoiko Ivanov , pmg-devel@lists.proxmox.com
References: <20250218135416.54504-1-s.ivanov@proxmox.com>
 <20250218135416.54504-3-s.ivanov@proxmox.com>
Content-Language: en-US
From: Friedrich Weber
In-Reply-To: <20250218135416.54504-3-s.ivanov@proxmox.com>
X-SPAM-LEVEL: Spam detection results: 0
 AWL 0.001 Adjusted score from AWL reputation of From: address
 BAYES_00 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING 0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS -0.001 SPF: sender matches SPF record
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information.
 [contenttypefilter.pm, proxmox.com]
Subject: Re: [pmg-devel] [PATCH pmg-api v2 2/2] ruledb: content-type: add flag for matching only based on magic/content
X-BeenThere: pmg-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Mail Gateway development discussion
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pmg-devel-bounces@lists.proxmox.com
Sender: "pmg-devel"

On 18/02/2025 14:54, Stoiko Ivanov wrote:
> our current content-type matching is sensibly quite cautious in
> matching if any available information indicates a potential match:
> * mime-type detection based on file contents
> * mime-type detection based on file suffix
> * content-type header
>
> Sometimes this can lead to surprises (e.g. when a MUA sets the
> filetype of a pdf to application/octet-stream (the default type if no
> information is available), or a filter for zip-files matching
> docx-files).
>
> This change gives users the option to restrict matching only on the
> content as detected by xdg_mime_get_mime_type_for_data.
>
> This is a fix for the initial request in #2691 and addresses the
> suggestion from Friedrich from:
> https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2

Thanks for tackling this! I think having a flag like only-content makes
sense.

I tested this a bit and there seems to be one issue. Steps to reproduce:

- add a What object with a Content Type Filter for application/pdf and
  enable the new "Ignore header information" flag
- create a rule that blocks incoming mails matching this What object
- send an email with a random 1K blob as attachment that sets
  Content-Type: application/pdf and some non-descriptive filename for
  the attachment:

swaks --from [...] --to [...] --server [...] --attach-type application/pdf --attach-name foo.bin --attach <(dd if=/dev/urandom bs=1k count=1)

The email is blocked by the rule.
But I would expect it to be accepted, because
`xdg_mime_get_mime_type_for_data` shouldn't recognize the random blob as
PDF, and the user-provided Content-Type application/pdf should be
ignored.

I think the email is blocked because the magic ct [1] defaults to the
user-provided Content-Type, and since `xdg_mime_get_mime_type_for_data`
returns application/octet-stream, we keep it at the user-provided
Content-Type. I guess it would be nicer if the magic ct didn't default
to the user-provided Content-Type when "Ignore header information" is
enabled, but I'm not sure how easily this can be done.

[1]
https://git.proxmox.com/?p=pmg-api.git;a=blob;f=src/PMG/Utils.pm;h=0b8945f245;hb=6bbc222#l623

>
> matches on the other items can be created with Match Field objects
> (for the content-type header) and Filename (for the match based on the
> provided filename) - combinations of those should give us the complete
> flexibility.
>
> inspired by the changes for disclaimer released with PMG 8.1:
> 51d1507 ("fix #2430: ruledb disclaimer: make separator configurable")
>
> Signed-off-by: Stoiko Ivanov
> ---
> would be grateful for suggestions that are a better fit than 'only-content'!
>
>  src/PMG/RuleDB/ContentTypeFilter.pm | 75 ++++++++++++++++++++++++++---
>  1 file changed, 68 insertions(+), 7 deletions(-)
>
> diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
> index 0199311..550a880 100644
> --- a/src/PMG/RuleDB/ContentTypeFilter.pm
> +++ b/src/PMG/RuleDB/ContentTypeFilter.pm
> @@ -26,7 +26,7 @@ sub otype_text {
>  }
>
>  sub new {
> -    my ($type, $fvalue, $ogroup) = @_;
> +    my ($type, $fvalue, $ogroup, $only_content) = @_;
>
>      my $class = ref($type) || $type;
>
> @@ -36,6 +36,7 @@ sub new {
>      }
>
>      my $self = $class->SUPER::new('content-type', $fvalue, $ogroup);
> +    $self->{only_content} = $only_content;
>
>      return $self;
>  }
> @@ -52,9 +53,50 @@ sub load_attr {
>          $obj->{field_value} = $nt;
>      }
>
> +    my $sth = $ruledb->{dbh}->prepare(
> +        "SELECT * FROM Attribut WHERE Object_ID = ?");
> +
> +    $sth->execute($id);
> +
> +    $obj->{only_content} = 0;
> +
> +    while (my $ref = $sth->fetchrow_hashref()) {
> +        if ($ref->{name} eq 'only_content') {
> +            $obj->{only_content} = $ref->{value};
> +        }
> +    }
> +
> +    $sth->finish();
> +
> +    $obj->{id} = $id;
> +
> +    $obj->{digest} = Digest::SHA::sha1_hex( $id, $value, $ogroup, $obj->{only_content});
> +
>      return $obj;
>  }
>
> +sub save {
> +    my ($self, $ruledb) = @_;
> +
> +    if (defined($self->{id})) {
> +        #update - clean old attribut entries
> +        $ruledb->{dbh}->do(
> +            "DELETE FROM Attribut WHERE Object_ID = ?",
> +            undef, $self->{id});
> +    }
> +
> +    $self->{id} = $self->SUPER::save($ruledb);
> +
> +    if (defined($self->{only_content})) {
> +        $ruledb->{dbh}->do(
> +            "INSERT INTO Attribut (Value, Name, Object_ID) VALUES (?, 'only_content', ?) ".
> +            "ON CONFLICT(Object_ID, Name) DO UPDATE SET Value = Excluded.Value ",
> +            undef, $self->{only_content}, $self->{id});
> +    }
> +
> +    return $self->{id};
> +}
> +
>  sub parse_entity {
>      my ($self, $entity) = @_;
>
> @@ -78,12 +120,16 @@ sub parse_entity {
>
>              my $glob_ct = $entity->{PMX_glob_ct};
>
> -            if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
> -                push @$res, $id;
> -            } elsif ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
> -                push @$res, $id;
> -            } elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
> +            my $check_only_content = ${self}->{only_content} // 1;
> +
> +            if ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
>                  push @$res, $id;
> +            } elsif (!$check_only_content) {
> +                if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
> +                    push @$res, $id;
> +                } elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
> +                    push @$res, $id;
> +                }
>              }
>          }
>
> @@ -112,19 +158,34 @@ sub properties {
>              pattern => '[0-9a-zA-Z\/\\\[\]\+\-\.\*\_]+',
>              maxLength => 1024,
>          },
> +        'only-content' => {
> +            description => "use content-type from scanning only (ignore filename and header)",
> +            type => 'boolean',
> +            optional => 1,
> +            default => 0,
> +        },
>      };
>  }
>
>  sub get {
>      my ($self) = @_;
>
> -    return { contenttype => $self->{field_value} };
> +    return {
> +        contenttype => $self->{field_value},
> +        'only-content' => $self->{only_content},
> +    };
>  }
>
>  sub update {
>      my ($self, $param) = @_;
>
>      $self->{field_value} = $param->{contenttype};
> +
> +    if (defined($param->{'only-content'}) && $param->{'only-content'} == 1) {
> +        $self->{only_content} = 1;
> +    } else {
> +        delete $self->{only_content};
> +    }
>  }
>
>  1;
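
To make the suspected interaction from the reproducer concrete, here is a
minimal Perl sketch under stated assumptions. It is not the code from
PMG::Utils or ContentTypeFilter.pm: `effective_magic_ct` and
`matches_filter` are hypothetical helpers, and the `$ignore_header`
parameter is only an assumed way of passing the flag down. The first
helper models the fallback as described in the review text (the header
value is kept when detection only yields application/octet-stream), the
second mirrors the matching order of the quoted parse_entity hunk.

```perl
use strict;
use warnings;

# Hypothetical model of the suspected fallback: the detected type is used
# only when it is more specific than application/octet-stream; otherwise
# the header-provided type is kept.
sub effective_magic_ct {
    my ($header_ct, $detected_ct, $ignore_header) = @_;

    my $is_specific = defined($detected_ct)
        && $detected_ct ne 'application/octet-stream';
    return $detected_ct if $is_specific;

    # Assumed variant: with the flag set, never fall back to the header.
    return $detected_ct if $ignore_header;

    return $header_ct;
}

# Matching order of the quoted parse_entity hunk, reduced to plain values.
sub matches_filter {
    my ($filter, $only_content, $magic_ct, $header_ct, $glob_ct) = @_;

    # Same default as in the quoted hunk: an undefined flag counts as set.
    my $check_only_content = $only_content // 1;

    return 1 if $magic_ct && $magic_ct =~ m|$filter|;
    return 0 if $check_only_content;
    return 1 if $header_ct && $header_ct =~ m|$filter|;
    return 1 if $glob_ct && $glob_ct =~ m|$filter|;
    return 0;
}

# Values from the reproducer: random blob, header claims application/pdf.
my $filter   = 'application/pdf';
my $header   = 'application/pdf';
my $detected = 'application/octet-stream';

my $with_fallback    = effective_magic_ct($header, $detected, 0);
my $without_fallback = effective_magic_ct($header, $detected, 1);

# prints "with fallback: match"
printf "with fallback: %s\n",
    matches_filter($filter, 1, $with_fallback, $header, undef) ? 'match' : 'no match';
# prints "without fallback: no match"
printf "without fallback: %s\n",
    matches_filter($filter, 1, $without_fallback, $header, undef) ? 'match' : 'no match';
```

With the fallback in place, the header value reaches the magic ct check
and the filter matches even though only-content matching is requested.
Skipping the fallback when the flag is set gives the result expected in
the reproducer, at least in this simplified model.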