From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <pmg-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
 by lore.proxmox.com (Postfix) with ESMTPS id B9D981FF168
 for <inbox@lore.proxmox.com>; Tue, 18 Feb 2025 14:54:50 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id A5292DFDE;
 Tue, 18 Feb 2025 14:54:46 +0100 (CET)
From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: pmg-devel@lists.proxmox.com
Date: Tue, 18 Feb 2025 14:54:14 +0100
Message-Id: <20250218135416.54504-3-s.ivanov@proxmox.com>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250218135416.54504-1-s.ivanov@proxmox.com>
References: <20250218135416.54504-1-s.ivanov@proxmox.com>
MIME-Version: 1.0
X-SPAM-LEVEL: Spam detection results: 0
 AWL 0.068 Adjusted score from AWL reputation of From: address
 BAYES_00 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING 0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS -0.001 SPF: sender matches SPF record
Subject: [pmg-devel] [PATCH pmg-api v2 2/2] ruledb: content-type: add flag
 for matching only based on magic/content
X-BeenThere: pmg-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Mail Gateway development discussion
 <pmg-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pmg-devel>,
 <mailto:pmg-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pmg-devel/>
List-Post: <mailto:pmg-devel@lists.proxmox.com>
List-Help: <mailto:pmg-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel>,
 <mailto:pmg-devel-request@lists.proxmox.com?subject=subscribe>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pmg-devel-bounces@lists.proxmox.com
Sender: "pmg-devel" <pmg-devel-bounces@lists.proxmox.com>

Our current content-type matching is deliberately cautious: it matches
if any of the available information indicates a potential match:
* mime-type detection based on file contents
* mime-type detection based on file suffix
* content-type header

Sometimes this leads to surprises, e.g. when a MUA sets the filetype of
a pdf to application/octet-stream (the default type if no information
is available), or when a filter for zip-files matches docx-files.

This change gives users the option to restrict matching to the content
type detected by xdg_mime_get_mime_type_for_data.

This is a fix for the initial request in #2691 and addresses the
suggestion from Friedrich in:
https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2

Matches on the other items can be created with Match Field objects (for
the content-type header) and Filename objects (for the match based on
the provided filename). Combinations of those should give us complete
flexibility.

Inspired by the changes for the disclaimer released with PMG 8.1:
51d1507 ("fix #2430: ruledb disclaimer: make separator configurable")

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
would be grateful for suggestions that are a better fit than 'only-content'!
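
For illustration only, a minimal standalone sketch of the match order
this patch aims for. It assumes a single MIME part whose three detected
types are already known; `ct_matches` and the hash keys `magic`, `header`
and `glob` are hypothetical names used here and are not part of pmg-api.
The flag is passed explicitly instead of being read from the object.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of pmg-api: follows the branch order of
# parse_entity in this patch for one MIME part.
sub ct_matches {
    my ($pattern, $only_content, $part) = @_;

    # The type detected from the file contents is always checked first.
    my $magic_ct = $part->{magic};
    return 1 if $magic_ct && $magic_ct =~ m|$pattern|;

    # With the flag set, header and filename-based types are ignored.
    return 0 if $only_content;

    for my $key (qw(header glob)) {
        my $ct = $part->{$key};
        return 1 if $ct && $ct =~ m|$pattern|;
    }
    return 0;
}

# Assumed example: a pdf that the MUA labelled as application/octet-stream.
my $part = {
    magic  => 'application/pdf',
    header => 'application/octet-stream',
    glob   => 'application/pdf',
};

printf "octet-stream filter, all sources:  %d\n",
    ct_matches('application/octet-stream', 0, $part);
printf "octet-stream filter, only content: %d\n",
    ct_matches('application/octet-stream', 1, $part);
printf "pdf filter, only content:          %d\n",
    ct_matches('application/pdf', 1, $part);
```

In this sketch the octet-stream filter matches the pdf only while the
header is still consulted; with the flag set, only the pdf filter matches.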
 src/PMG/RuleDB/ContentTypeFilter.pm | 75 ++++++++++++++++++++++++++---
 1 file changed, 68 insertions(+), 7 deletions(-)

diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
index 0199311..550a880 100644
--- a/src/PMG/RuleDB/ContentTypeFilter.pm
+++ b/src/PMG/RuleDB/ContentTypeFilter.pm
@@ -26,7 +26,7 @@ sub otype_text {
 }
 
 sub new {
-    my ($type, $fvalue, $ogroup) = @_;
+    my ($type, $fvalue, $ogroup, $only_content) = @_;
 
     my $class = ref($type) || $type;
 
@@ -36,6 +36,7 @@ sub new {
     }
 
     my $self = $class->SUPER::new('content-type', $fvalue, $ogroup);
+    $self->{only_content} = $only_content;
 
     return $self;
 }
@@ -52,9 +53,50 @@ sub load_attr {
         $obj->{field_value} = $nt;
     }
 
+    my $sth = $ruledb->{dbh}->prepare(
+        "SELECT * FROM Attribut WHERE Object_ID = ?");
+
+    $sth->execute($id);
+
+    $obj->{only_content} = 0;
+
+    while (my $ref = $sth->fetchrow_hashref()) {
+        if ($ref->{name} eq 'only_content') {
+            $obj->{only_content} = $ref->{value};
+        }
+    }
+
+    $sth->finish();
+
+    $obj->{id} = $id;
+
+    $obj->{digest} = Digest::SHA::sha1_hex( $id, $value, $ogroup, $obj->{only_content});
+
     return $obj;
 }
 
+sub save {
+    my ($self, $ruledb) = @_;
+
+    if (defined($self->{id})) {
+        #update - clean old attribut entries
+        $ruledb->{dbh}->do(
+            "DELETE FROM Attribut WHERE Object_ID = ?",
+            undef, $self->{id});
+    }
+
+    $self->{id} = $self->SUPER::save($ruledb);
+
+    if (defined($self->{only_content})) {
+        $ruledb->{dbh}->do(
+            "INSERT INTO Attribut (Value, Name, Object_ID) VALUES (?, 'only_content', ?) ".
+            "ON CONFLICT(Object_ID, Name) DO UPDATE SET Value = Excluded.Value ",
+            undef, $self->{only_content}, $self->{id});
+    }
+
+    return $self->{id};
+}
+
 sub parse_entity {
     my ($self, $entity) = @_;
 
@@ -78,12 +120,16 @@ sub parse_entity {
 
         my $glob_ct = $entity->{PMX_glob_ct};
 
-        if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
-            push @$res, $id;
-        } elsif ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
-            push @$res, $id;
-        } elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
+        my $check_only_content = ${self}->{only_content} // 1;
+
+        if ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
             push @$res, $id;
+        } elsif (!$check_only_content) {
+            if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
+                push @$res, $id;
+            } elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
+                push @$res, $id;
+            }
         }
     }
 
@@ -112,19 +158,34 @@ sub properties {
             pattern => '[0-9a-zA-Z\/\\\[\]\+\-\.\*\_]+',
             maxLength => 1024,
         },
+        'only-content' => {
+            description => "use content-type from scanning only (ignore filename and header)",
+            type => 'boolean',
+            optional => 1,
+            default => 0,
+        },
     };
 }
 
 sub get {
     my ($self) = @_;
 
-    return { contenttype => $self->{field_value} };
+    return {
+        contenttype => $self->{field_value},
+        'only-content' => $self->{only_content},
+    };
 }
 
 sub update {
     my ($self, $param) = @_;
 
     $self->{field_value} = $param->{contenttype};
+
+    if (defined($param->{'only-content'}) && $param->{'only-content'} == 1) {
+        $self->{only_content} = 1;
+    } else {
+        delete $self->{only_content};
+    }
 }
 
 1;
-- 
2.39.5

_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel