From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: pmg-devel@lists.proxmox.com
Subject: [pmg-devel] [PATCH pmg-api v5 3/4] ruledb: content-type: add flag for matching only based on magic/content
Date: Fri, 21 Feb 2025 17:48:17 +0100
Message-ID: <20250221164821.207845-4-s.ivanov@proxmox.com>
In-Reply-To: <20250221164821.207845-1-s.ivanov@proxmox.com>
Our current content-type matching is deliberately quite cautious and
matches if any of the available information indicates a potential match:
* mime-type detection based on file contents
* mime-type detection based on file suffix
* content-type header
Sometimes this can lead to surprises, e.g. when a MUA sets the
filetype of a pdf to application/octet-stream (the default type if no
information is available), or when a filter for zip-files matches
docx-files.
This change gives users the option to restrict matching to the
content type as detected by xdg_mime_get_mime_type_for_data.
This is a fix for the initial request in #2691 and addresses the
suggestion from Friedrich in:
https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2
Matches on the other items can be created with Match Field objects
(for the content-type header) and Filename objects (for the match based
on the provided filename); combinations of those should give us complete
flexibility.
Inspired by the changes for the disclaimer released with PMG 8.1:
51d1507 ("fix #2430: ruledb disclaimer: make separator configurable")
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
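Note: the matching order described in the commit message can be sketched
standalone as below. This is only an illustration under stated
assumptions, not the code from this patch: content_type_matches() and the
%pdf hash are hypothetical names, and the content types are made-up
sample values for the octet-stream case mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PMG): returns 1 if a permitted source
# matches the pattern, 0 otherwise.
sub content_type_matches {
    my ($pattern, $only_content, %ct) = @_;

    # The type detected from the file contents is always consulted.
    return 1 if defined($ct{magic}) && $ct{magic} =~ m|$pattern|;

    # With the flag set, header and filename-based types are ignored.
    return 0 if $only_content;

    for my $source (qw(header glob)) {
        return 1 if defined($ct{$source}) && $ct{$source} =~ m|$pattern|;
    }
    return 0;
}

# Assumed sample: a PDF that the MUA labelled application/octet-stream.
my %pdf = (
    magic  => 'application/pdf',
    header => 'application/octet-stream',
    glob   => 'application/pdf',
);

for my $only_content (0, 1) {
    my $hit = content_type_matches('application/octet-stream',
        $only_content, %pdf);
    print "only_content=$only_content: ", ($hit ? "match" : "no match"), "\n";
}
```

Without the flag, the octet-stream filter matches the PDF through its
header; with the flag set, only the detected content type counts, so the
same filter no longer matches.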
src/PMG/RuleDB/ContentTypeFilter.pm | 75 ++++++++++++++++++++++++++---
1 file changed, 68 insertions(+), 7 deletions(-)
diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
index fb45e95..e44bf3c 100644
--- a/src/PMG/RuleDB/ContentTypeFilter.pm
+++ b/src/PMG/RuleDB/ContentTypeFilter.pm
@@ -26,7 +26,7 @@ sub otype_text {
}
sub new {
- my ($type, $fvalue, $ogroup) = @_;
+ my ($type, $fvalue, $ogroup, $only_content) = @_;
my $class = ref($type) || $type;
@@ -36,6 +36,7 @@ sub new {
}
my $self = $class->SUPER::new('content-type', $fvalue, $ogroup);
+ $self->{only_content} = $only_content;
return $self;
}
@@ -52,9 +53,50 @@ sub load_attr {
$obj->{field_value} = $nt;
}
+ my $sth = $ruledb->{dbh}->prepare(
+ "SELECT * FROM Attribut WHERE Object_ID = ?");
+
+ $sth->execute($id);
+
+ $obj->{only_content} = 0;
+
+ while (my $ref = $sth->fetchrow_hashref()) {
+ if ($ref->{name} eq 'only_content') {
+ $obj->{only_content} = $ref->{value};
+ }
+ }
+
+ $sth->finish();
+
+ $obj->{id} = $id;
+
+ $obj->{digest} = Digest::SHA::sha1_hex( $id, $value, $ogroup, $obj->{only_content});
+
return $obj;
}
+sub save {
+ my ($self, $ruledb) = @_;
+
+ if (defined($self->{id})) {
+ #update - clean old attribut entries
+ $ruledb->{dbh}->do(
+ "DELETE FROM Attribut WHERE Object_ID = ?",
+ undef, $self->{id});
+ }
+
+ $self->{id} = $self->SUPER::save($ruledb);
+
+ if (defined($self->{only_content})) {
+ $ruledb->{dbh}->do(
+ "INSERT INTO Attribut (Value, Name, Object_ID) VALUES (?, 'only_content', ?) ".
+ "ON CONFLICT(Object_ID, Name) DO UPDATE SET Value = Excluded.Value ",
+ undef, $self->{only_content}, $self->{id});
+ }
+
+ return $self->{id};
+}
+
sub parse_entity {
my ($self, $entity) = @_;
@@ -78,12 +120,16 @@ sub parse_entity {
my $glob_ct = $entity->{PMX_glob_ct};
- if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
- push @$res, $id;
- } elsif ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
- push @$res, $id;
- } elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
+ my $check_only_content = ${self}->{only_content} // 1;
+
+ if ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
push @$res, $id;
+ } elsif (!$check_only_content) {
+ if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
+ push @$res, $id;
+ } elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
+ push @$res, $id;
+ }
}
}
@@ -112,19 +158,34 @@ sub properties {
pattern => '[0-9a-zA-Z\/\\\[\]\+\-\.\*\_]+',
maxLength => 1024,
},
+ 'only-content' => {
+ description => "use content-type from scanning only (ignore filename and header)",
+ type => 'boolean',
+ optional => 1,
+ default => 0,
+ },
};
}
sub get {
my ($self) = @_;
- return { contenttype => $self->{field_value} };
+ return {
+ contenttype => $self->{field_value},
+ 'only-content' => $self->{only_content},
+ };
}
sub update {
my ($self, $param) = @_;
$self->{field_value} = $param->{contenttype};
+
+ if (defined($param->{'only-content'}) && $param->{'only-content'} == 1) {
+ $self->{only_content} = 1;
+ } else {
+ delete $self->{only_content};
+ }
}
1;
--
2.39.5
_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel
Thread overview: 9+ messages
2025-02-21 16:48 [pmg-devel] [PATCH pmg-api/pmg-gui v5] add additional attributes to ContentTypeFilter and MatchField Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-api v5 1/4] ruledb: disclaimer: simplify update-case Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-api v5 2/4] utils: content-type: don't fallback to header information for magic Stoiko Ivanov
2025-02-21 16:48 ` Stoiko Ivanov [this message]
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-api v5 4/4] fix #2709: ruledb: match-field: optionally restrict to top mime-part Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-gui v5 1/3] rules/object: remove icon from remove button Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-gui v5 2/3] rules/content-typefilter: add checkbox for file content only matching Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-gui v5 3/3] fix #2709: rules: match-field: add top-level-only checkbox Stoiko Ivanov
2025-02-21 17:26 ` [pmg-devel] applied: [PATCH pmg-api/pmg-gui v5] add additional attributes to ContentTypeFilter and MatchField Thomas Lamprecht