From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: pmg-devel@lists.proxmox.com
Subject: [pmg-devel] [PATCH pmg-api v5 2/4] utils: content-type: don't fallback to header information for magic
Date: Fri, 21 Feb 2025 17:48:16 +0100
Message-ID: <20250221164821.207845-3-s.ivanov@proxmox.com>
In-Reply-To: <20250221164821.207845-1-s.ivanov@proxmox.com>

file-type detection based on content/magic is the only piece of
information that is not determined by the headers of the e-mail, and
thus not directly controlled by the sender.

this patch removes the fallback to the mime-type from the content-type
header in case magic_mime_type_for_file does not detect the type (i.e.
only yields application/octet-stream).

the one exception is archives, where we eagerly try to gain
information: if the header says a part is an archive, but the content
is not detected as such, we still try to unpack it.

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
src/PMG/RuleDB/ArchiveFilter.pm | 2 +-
src/PMG/RuleDB/ContentTypeFilter.pm | 2 +-
src/PMG/Utils.pm | 12 ++++--------
src/bin/pmg-smtp-filter | 8 +++++++-
4 files changed, 13 insertions(+), 11 deletions(-)
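
note (not part of the commit): a minimal, self-contained sketch of the
archive exception described in the commit message, which may help when
reading the pmg-smtp-filter hunk. the helper name effective_archive_ct
and the two sample hashes are hypothetical and only for illustration;
just the PMX_magic_ct/PMX_header_ct keys and the comparison against
application/octet-stream are taken from this patch.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the archive exception in unpack_entity:
# the header content-type is consulted only if magic detection yielded
# the generic 'application/octet-stream'.
sub effective_archive_ct {
    my ($entity) = @_;

    my ($magic, $headerct) = $entity->@{'PMX_magic_ct', 'PMX_header_ct'};
    return $magic if !$magic || !$headerct;
    return $magic eq 'application/octet-stream' ? $headerct : $magic;
}

# Illustrative entities (plain hashes, not real MIME::Entity objects).
my $undetected = {
    PMX_magic_ct  => 'application/octet-stream',
    PMX_header_ct => 'application/zip',
};
my $detected = {
    PMX_magic_ct  => 'application/pdf',
    PMX_header_ct => 'application/zip',
};

print effective_archive_ct($undetected), "\n";   # application/zip
print effective_archive_ct($detected), "\n";     # application/pdf
```

in this sketch the header value never overrides a concrete magic
result, so a sender-supplied content-type can only cause an additional
unpack attempt, not hide a detected type.
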
diff --git a/src/PMG/RuleDB/ArchiveFilter.pm b/src/PMG/RuleDB/ArchiveFilter.pm
index 3d9890c..d7f6399 100644
--- a/src/PMG/RuleDB/ArchiveFilter.pm
+++ b/src/PMG/RuleDB/ArchiveFilter.pm
@@ -59,7 +59,7 @@ sub parse_entity {
if (my $id = $entity->head->mime_attr ('x-proxmox-tmp-aid')) {
chomp $id;
- my $header_ct = $entity->head->mime_attr ('content-type');
+ my $header_ct = $entity->{PMX_header_ct};
my $magic_ct = $entity->{PMX_magic_ct};
diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
index 0199311..fb45e95 100644
--- a/src/PMG/RuleDB/ContentTypeFilter.pm
+++ b/src/PMG/RuleDB/ContentTypeFilter.pm
@@ -72,7 +72,7 @@ sub parse_entity {
if (my $id = $entity->head->mime_attr ('x-proxmox-tmp-aid')) {
chomp $id;
- my $header_ct = $entity->head->mime_attr ('content-type');
+ my $header_ct = $entity->{PMX_header_ct};
my $magic_ct = $entity->{PMX_magic_ct};
diff --git a/src/PMG/Utils.pm b/src/PMG/Utils.pm
index 0b8945f..b2a75fb 100644
--- a/src/PMG/Utils.pm
+++ b/src/PMG/Utils.pm
@@ -598,7 +598,7 @@ sub magic_mime_type_for_file {
my $bufsize = Xdgmime::xdg_mime_get_max_buffer_extents();
die "got strange value for max_buffer_extents" if $bufsize > 4096*10;
- my $ct = "application/octet-stream";
+ my $ct;
my $fh = IO::File->new("<$filename") ||
die "unable to open file '$filename' - $!";
@@ -611,6 +611,7 @@ sub magic_mime_type_for_file {
die "unable to read file '$filename' - $!" if ($len < 0);
+ $ct ||= "application/octet-stream";
return $ct;
}
@@ -619,14 +620,9 @@ sub add_ct_marks {
if (my $path = $entity->{PMX_decoded_path}) {
- # set a reasonable default if magic does not give a result
- $entity->{PMX_magic_ct} = $entity->head->mime_attr('content-type');
+ $entity->{PMX_header_ct} = $entity->head->mime_attr('content-type');
- if (my $ct = magic_mime_type_for_file($path)) {
- if ($ct ne 'application/octet-stream' || !$entity->{PMX_magic_ct}) {
- $entity->{PMX_magic_ct} = $ct;
- }
- }
+ $entity->{PMX_magic_ct} = magic_mime_type_for_file($path);
my $filename = $entity->head->recommended_filename;
$filename = basename($path) if !defined($filename) || $filename eq '';
diff --git a/src/bin/pmg-smtp-filter b/src/bin/pmg-smtp-filter
index 6061459..60737ea 100755
--- a/src/bin/pmg-smtp-filter
+++ b/src/bin/pmg-smtp-filter
@@ -561,9 +561,15 @@ sub run_dequeue {
sub unpack_entity {
my ($self, $unpack, $entity, $msginfo, $queue) = @_;
- my ($magic, $path) = $entity->@{'PMX_magic_ct', 'PMX_decoded_path'};
+ my ($magic, $headerct, $path) = $entity->@{'PMX_magic_ct', 'PMX_header_ct', 'PMX_decoded_path'};
if ($magic && $path) {
+ # in order to not miss information from a misdetected archive use information provided in the
+ # header here as well
+ if ($headerct && ($magic && $magic eq 'application/octet-stream')) {
+ $magic = $headerct;
+ }
+
my $filename = basename ($path);
if (PMG::Unpack::is_archive ($magic)) {
--
2.39.5
_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel
Thread overview: 9+ messages
2025-02-21 16:48 [pmg-devel] [PATCH pmg-api/pmg-gui v5] add additional attributes to ContentTypeFilter and MatchField Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-api v5 1/4] ruledb: disclaimer: simplify update-case Stoiko Ivanov
2025-02-21 16:48 ` Stoiko Ivanov [this message]
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-api v5 3/4] ruledb: content-type: add flag for matching only based on magic/content Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-api v5 4/4] fix #2709: ruledb: match-field: optionally restrict to top mime-part Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-gui v5 1/3] rules/object: remove icon from remove button Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-gui v5 2/3] rules/content-typefilter: add checkbox for file content only matching Stoiko Ivanov
2025-02-21 16:48 ` [pmg-devel] [PATCH pmg-gui v5 3/3] fix #2709: rules: match-field: add top-level-only checkbox Stoiko Ivanov
2025-02-21 17:26 ` [pmg-devel] applied: [PATCH pmg-api/pmg-gui v5] add additional attributes to ContentTypeFilter and MatchField Thomas Lamprecht