public inbox for pmg-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pmg-devel] [PATCH pmg-api/pmg-gui] add information source attribute to content-type what-object
@ 2025-02-12 15:12 Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 1/2] ruledb: disclaimer: simplify update-case Stoiko Ivanov
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-12 15:12 UTC (permalink / raw)
  To: pmg-devel

The following patch series was started shortly after the release of PMG 8.1,
but I did not find the time to get it in shape for sending.
They follow the patches for adding 'top' and 'add_separator' to the
Disclaimer action.

The current content-type filter can sometimes surprise users (e.g.
https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2 and
https://bugzilla.proxmox.com/show_bug.cgi?id=2691#c0 ,but also a few
cases in our technical support-channels come up here and there):

It matches if any of:
* content-type header
* file-magic
* filename (extension)
match the content type, the what-object matches.
by adding an attribute for each of the sources users can restrict which
of the sources should be taken into consideration

the first patches for both repositories are independent (I just ran into
them while looking into this).

minimally tested locally, by sending a plain-text file called
testtext.pdf, and a pdf-file renamed to have a `.docx` suffix.

pmg-api:
Stoiko Ivanov (2):
  ruledb: disclaimer: simplify update-case
  ruledb: content-type: add flags for source of matching

 src/PMG/RuleDB/ContentTypeFilter.pm | 91 +++++++++++++++++++++++++++--
 src/PMG/RuleDB/Disclaimer.pm        |  8 +--
 2 files changed, 89 insertions(+), 10 deletions(-)

pmg-gui:
Stoiko Ivanov (2):
  rules/object: remove icon from remove button
  rules/content-typefilter: add match source checkboxes

 js/ObjectGroup.js |  3 +--
 js/Utils.js       | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

-- 
2.39.5



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [pmg-devel] [PATCH pmg-api 1/2] ruledb: disclaimer: simplify update-case
  2025-02-12 15:12 [pmg-devel] [PATCH pmg-api/pmg-gui] add information source attribute to content-type what-object Stoiko Ivanov
@ 2025-02-12 15:12 ` Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching Stoiko Ivanov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-12 15:12 UTC (permalink / raw)
  To: pmg-devel

as one object(_id) belongs is of one object type, we can safely remove
all associated attribut entries from the database at once instead of
running 2 separate statements to delete each individually

noticed this while adding new attribut entries to the content-type
check.

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/Disclaimer.pm | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/PMG/RuleDB/Disclaimer.pm b/src/PMG/RuleDB/Disclaimer.pm
index 574f62e..cb9b5fb 100644
--- a/src/PMG/RuleDB/Disclaimer.pm
+++ b/src/PMG/RuleDB/Disclaimer.pm
@@ -113,11 +113,9 @@ sub save {
 	    "UPDATE Object SET Value = ? WHERE ID = ?",
 	    undef, $value, $self->{id});
 
-	for my $prop (qw(top separator)) {
-	    $ruledb->{dbh}->do(
-		"DELETE FROM Attribut WHERE Name = ? and Object_ID = ?",
-		undef, $prop,  $self->{id});
-	}
+	$ruledb->{dbh}->do(
+	    "DELETE FROM Attribut WHERE Object_ID = ?",
+		undef, $self->{id});
     } else {
 	# insert
 
-- 
2.39.5



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching
  2025-02-12 15:12 [pmg-devel] [PATCH pmg-api/pmg-gui] add information source attribute to content-type what-object Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 1/2] ruledb: disclaimer: simplify update-case Stoiko Ivanov
@ 2025-02-12 15:12 ` Stoiko Ivanov
  2025-02-17 13:03   ` Dominik Csapak
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 1/2] rules/object: remove icon from remove button Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 2/2] rules/content-typefilter: add match source checkboxes Stoiko Ivanov
  3 siblings, 1 reply; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-12 15:12 UTC (permalink / raw)
  To: pmg-devel

our current content-type matching is sensibly quite cautious in
matching if any available information indicates a potential match:
* mime-type detection based on file contents
* mime-type detection based on file suffix
* content-type header

Sometimes this can lead to surprises (e.g. when a MUA sets the
filetype of a pdf to application/octet-stream (the default type if no
information is available).

This change gives users the option to rely only on some of the sources
for matching.

This is a fix for the intial request in #2691 and addresses the
suggestion from Friedrich from:
https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2

inspired by the changes for disclaimer released with PMG 8.1:
51d1507 ("fix #2430: ruledb disclaimer: make separator configurable")

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 src/PMG/RuleDB/ContentTypeFilter.pm | 91 +++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 5 deletions(-)

diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
index 0199311..0dafa64 100644
--- a/src/PMG/RuleDB/ContentTypeFilter.pm
+++ b/src/PMG/RuleDB/ContentTypeFilter.pm
@@ -26,7 +26,7 @@ sub otype_text {
 }
 
 sub new {
-    my ($type, $fvalue, $ogroup) = @_;
+    my ($type, $fvalue, $ogroup, $header, $magic, $glob) = @_;
 
     my $class = ref($type) || $type;
 
@@ -36,6 +36,9 @@ sub new {
     }
 
     my $self = $class->SUPER::new('content-type', $fvalue, $ogroup);
+    $self->{header} = $header;
+    $self->{magic} = $magic;
+    $self->{glob} = $glob;
 
     return $self;
 }
@@ -52,9 +55,53 @@ sub load_attr {
 	$obj->{field_value} = $nt;
     }
 
+    my $sth = $ruledb->{dbh}->prepare(
+	"SELECT * FROM Attribut WHERE Object_ID = ?");
+
+    $sth->execute($id);
+
+    $obj->{header} = $obj->{magic} = $obj->{glob} = 1;
+
+    while (my $ref = $sth->fetchrow_hashref()) {
+	if ($ref->{name} =~ /^(header|magic|glob)$/) {
+	    $obj->{$1} = $ref->{value};
+	}
+    }
+
+    $sth->finish();
+
+    $obj->{id} = $id;
+
+    $obj->{digest} = Digest::SHA::sha1_hex(
+	$id, $value, $ogroup, $obj->{header} // 1, $obj->{magic} //1 , $obj->{glob} // 1);
+
     return $obj;
 }
 
+sub save {
+    my ($self, $ruledb) = @_;
+
+    if (defined($self->{id})) {
+	#update - clean old attribut entries
+	$ruledb->{dbh}->do(
+	    "DELETE FROM Attribut WHERE Object_ID = ?",
+	    undef, $self->{id});
+    }
+
+    $self->{id} = $self->SUPER::save($ruledb);
+
+    for my $prop (qw(header magic glob)) {
+	if (defined($self->{$prop})) {
+	    $ruledb->{dbh}->do(
+		"INSERT INTO Attribut (Value, Name, Object_ID) VALUES (?, ?, ?) ".
+		"ON CONFLICT(Object_ID, Name) DO UPDATE SET Value = Excluded.Value ",
+		undef, $self->{$prop}, $prop,  $self->{id});
+	}
+    }
+
+    return $self->{id};
+}
+
 sub parse_entity {
     my ($self, $entity) = @_;
 
@@ -78,11 +125,14 @@ sub parse_entity {
 
 	my $glob_ct = $entity->{PMX_glob_ct};
 
-	if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
+	my $check_header = !defined($self->{header}) || ${self}->{header};
+	my $check_magic = !defined($self->{magic}) || ${self}->{magic};
+	my $check_glob = !defined($self->{glob}) || ${self}->{glob};
+	if ($header_ct && $check_header && $header_ct =~ m|$self->{field_value}|) {
 	    push @$res, $id;
-	} elsif ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
+	} elsif ($magic_ct && $check_magic && $magic_ct =~ m|$self->{field_value}|) {
 	    push @$res, $id;
-	} elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
+	} elsif ($glob_ct && $check_glob && $glob_ct =~ m|$self->{field_value}|) {
 	    push @$res, $id;
 	}
     }
@@ -112,19 +162,50 @@ sub properties {
 	    pattern => '[0-9a-zA-Z\/\\\[\]\+\-\.\*\_]+',
 	    maxLength => 1024,
 	},
+	header => {
+	    description => "use content-type from mail-header for matching",
+	    type => 'boolean',
+	    optional => 1,
+	    default => 1,
+	},
+	magic => {
+	    description => "use content-type from scanning the content for matching",
+	    type => 'boolean',
+	    optional => 1,
+	    default => 1,
+	},
+	glob => {
+	    description => "use content-type based on file-name for matching",
+	    type => 'boolean',
+	    optional => 1,
+	    default => 1,
+	},
     };
 }
 
 sub get {
     my ($self) = @_;
 
-    return { contenttype => $self->{field_value} };
+    return {
+	contenttype => $self->{field_value},
+	header => $self->{header},
+	magic => $self->{magic},
+	glob => $self->{glob},
+    };
 }
 
 sub update {
     my ($self, $param) = @_;
 
     $self->{field_value} = $param->{contenttype};
+
+    for my $prop (qw(header magic glob)) {
+	if (defined($param->{$prop}) && $param->{$prop} == 0) {
+	    $self->{$prop} = 0;
+	} else {
+	    delete $self->{$prop};
+	}
+    }
 }
 
 1;
-- 
2.39.5



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [pmg-devel] [PATCH pmg-gui 1/2] rules/object: remove icon from remove button
  2025-02-12 15:12 [pmg-devel] [PATCH pmg-api/pmg-gui] add information source attribute to content-type what-object Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 1/2] ruledb: disclaimer: simplify update-case Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching Stoiko Ivanov
@ 2025-02-12 15:12 ` Stoiko Ivanov
  2025-02-13 10:37   ` Stoiko Ivanov
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 2/2] rules/content-typefilter: add match source checkboxes Stoiko Ivanov
  3 siblings, 1 reply; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-12 15:12 UTC (permalink / raw)
  To: pmg-devel

the icons were introduced to the listing, and it seems their rendering
when removing was not noticed - w/o this the message is e.g.:
"Are you sure you want to remove entry
'<span class="fa-fw fa fa-file-image-o'"></span> Content Type Filter:
..."

Fixes: ea4f2a7 ("add icons to the object types")

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 js/ObjectGroup.js | 3 +--
 js/Utils.js       | 5 +++++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/js/ObjectGroup.js b/js/ObjectGroup.js
index 3c8de64..1807e97 100644
--- a/js/ObjectGroup.js
+++ b/js/ObjectGroup.js
@@ -125,8 +125,7 @@ Ext.define('PMG.ObjectGroup', {
 	    },
 	    callback: reload,
 	    getRecordName: function(rec) {
-		return PMG.Utils.format_otype(rec.data.otype) +
-		    ': ' + rec.data.descr;
+		return PMG.Utils.format_otype_subject(rec.data.otype) + ': ' + rec.data.descr;
 	    },
 	    waitMsgTarget: me,
 	});
diff --git a/js/Utils.js b/js/Utils.js
index 9b5f054..94e3c95 100644
--- a/js/Utils.js
+++ b/js/Utils.js
@@ -128,6 +128,11 @@ Ext.define('PMG.Utils', {
 	return icon + text;
     },
 
+    format_otype_subject: function(otype) {
+	let editor = PMG.Utils.object_editors[otype];
+	return editor.subject ?? 'unknown';
+    },
+
     format_otype: function(otype) {
 	let editor = PMG.Utils.object_editors[otype];
 	let iconCls = 'fa fa-question-circle';
-- 
2.39.5



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [pmg-devel] [PATCH pmg-gui 2/2] rules/content-typefilter: add match source checkboxes
  2025-02-12 15:12 [pmg-devel] [PATCH pmg-api/pmg-gui] add information source attribute to content-type what-object Stoiko Ivanov
                   ` (2 preceding siblings ...)
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 1/2] rules/object: remove icon from remove button Stoiko Ivanov
@ 2025-02-12 15:12 ` Stoiko Ivanov
  3 siblings, 0 replies; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-12 15:12 UTC (permalink / raw)
  To: pmg-devel

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 js/Utils.js | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/js/Utils.js b/js/Utils.js
index 94e3c95..1f9dafa 100644
--- a/js/Utils.js
+++ b/js/Utils.js
@@ -416,6 +416,27 @@ Ext.define('PMG.Utils', {
 		    allowBlank: false,
 		    reset: Ext.emptyFn,
 		},
+		{
+		    xtype: 'proxmoxcheckbox',
+		    name: 'magic',
+		    fieldLabel: gettext("Detected filetype"),
+		    checked: true,
+		    uncheckedValue: '0',
+		},
+		{
+		    xtype: 'proxmoxcheckbox',
+		    name: 'header',
+		    fieldLabel: gettext("Content-Type header"),
+		    checked: true,
+		    uncheckedValue: '0',
+		},
+		{
+		    xtype: 'proxmoxcheckbox',
+		    name: 'glob',
+		    fieldLabel: gettext("File suffix"),
+		    checked: true,
+		    uncheckedValue: '0',
+		},
 	    ],
 	},
 	3004: {
-- 
2.39.5



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pmg-devel] [PATCH pmg-gui 1/2] rules/object: remove icon from remove button
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 1/2] rules/object: remove icon from remove button Stoiko Ivanov
@ 2025-02-13 10:37   ` Stoiko Ivanov
  0 siblings, 0 replies; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-13 10:37 UTC (permalink / raw)
  To: pmg-devel

On Wed, 12 Feb 2025 16:12:40 +0100
Stoiko Ivanov <s.ivanov@proxmox.com> wrote:

> the icons were introduced to the listing, and it seems their rendering
> when removing was not noticed - w/o this the message is e.g.:
> "Are you sure you want to remove entry
> '<span class="fa-fw fa fa-file-image-o'"></span> Content Type Filter:
> ..."
> 
> Fixes: ea4f2a7 ("add icons to the object types")
after talking with Dominik off-list went and checked again:
the issue surfaced due to a (much more recent) commit in
proxmox-widet-toolkit:
7bb124c ("button: htmlEncode the name/id for the confirm message")

I think we can drop the icon for the remove-message, but if someone feels
that it makes for a worse UX I can see if this can be sensibly added back.


> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>  js/ObjectGroup.js | 3 +--
>  js/Utils.js       | 5 +++++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/js/ObjectGroup.js b/js/ObjectGroup.js
> index 3c8de64..1807e97 100644
> --- a/js/ObjectGroup.js
> +++ b/js/ObjectGroup.js
> @@ -125,8 +125,7 @@ Ext.define('PMG.ObjectGroup', {
>  	    },
>  	    callback: reload,
>  	    getRecordName: function(rec) {
> -		return PMG.Utils.format_otype(rec.data.otype) +
> -		    ': ' + rec.data.descr;
> +		return PMG.Utils.format_otype_subject(rec.data.otype) + ': ' + rec.data.descr;
>  	    },
>  	    waitMsgTarget: me,
>  	});
> diff --git a/js/Utils.js b/js/Utils.js
> index 9b5f054..94e3c95 100644
> --- a/js/Utils.js
> +++ b/js/Utils.js
> @@ -128,6 +128,11 @@ Ext.define('PMG.Utils', {
>  	return icon + text;
>      },
>  
> +    format_otype_subject: function(otype) {
> +	let editor = PMG.Utils.object_editors[otype];
> +	return editor.subject ?? 'unknown';
> +    },
> +
>      format_otype: function(otype) {
>  	let editor = PMG.Utils.object_editors[otype];
>  	let iconCls = 'fa fa-question-circle';



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching
  2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching Stoiko Ivanov
@ 2025-02-17 13:03   ` Dominik Csapak
  2025-02-17 14:44     ` Stoiko Ivanov
  0 siblings, 1 reply; 8+ messages in thread
From: Dominik Csapak @ 2025-02-17 13:03 UTC (permalink / raw)
  To: Stoiko Ivanov, pmg-devel

code looks mostly fine (some comments inline)

but one higher level question:

does it really make sense to expose 3 settings for this?

e.g. wouldn't it much more simple to have a 'default' and 'strict' mode
where 'strict' only matches on the contents file type?

both filename&header can be set aribtrarily by the sender, so doing
filtering based on that seem dangerous?

Or is there some specific use case that makes it necessary
to specify all 3 sources separately ?

On 2/12/25 16:12, Stoiko Ivanov wrote:
> our current content-type matching is sensibly quite cautious in
> matching if any available information indicates a potential match:
> * mime-type detection based on file contents
> * mime-type detection based on file suffix
> * content-type header
> 
> Sometimes this can lead to surprises (e.g. when a MUA sets the
> filetype of a pdf to application/octet-stream (the default type if no
> information is available).
> 
> This change gives users the option to rely only on some of the sources
> for matching.
> 
> This is a fix for the intial request in #2691 and addresses the
> suggestion from Friedrich from:
> https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2
> 
> inspired by the changes for disclaimer released with PMG 8.1:
> 51d1507 ("fix #2430: ruledb disclaimer: make separator configurable")
> 
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---
>   src/PMG/RuleDB/ContentTypeFilter.pm | 91 +++++++++++++++++++++++++++--
>   1 file changed, 86 insertions(+), 5 deletions(-)
> 
> diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
> index 0199311..0dafa64 100644
> --- a/src/PMG/RuleDB/ContentTypeFilter.pm
> +++ b/src/PMG/RuleDB/ContentTypeFilter.pm
> @@ -26,7 +26,7 @@ sub otype_text {
>   }
>   
>   sub new {
> -    my ($type, $fvalue, $ogroup) = @_;
> +    my ($type, $fvalue, $ogroup, $header, $magic, $glob) = @_;
>   
>       my $class = ref($type) || $type;
>   
> @@ -36,6 +36,9 @@ sub new {
>       }
>   
>       my $self = $class->SUPER::new('content-type', $fvalue, $ogroup);
> +    $self->{header} = $header;
> +    $self->{magic} = $magic;
> +    $self->{glob} = $glob;
>   
>       return $self;
>   }
> @@ -52,9 +55,53 @@ sub load_attr {
>   	$obj->{field_value} = $nt;
>       }
>   
> +    my $sth = $ruledb->{dbh}->prepare(
> +	"SELECT * FROM Attribut WHERE Object_ID = ?");
> +
> +    $sth->execute($id);
> +
> +    $obj->{header} = $obj->{magic} = $obj->{glob} = 1;
> +
> +    while (my $ref = $sth->fetchrow_hashref()) {
> +	if ($ref->{name} =~ /^(header|magic|glob)$/) {
> +	    $obj->{$1} = $ref->{value};
> +	}
> +    }
> +
> +    $sth->finish();
> +
> +    $obj->{id} = $id;
> +
> +    $obj->{digest} = Digest::SHA::sha1_hex(
> +	$id, $value, $ogroup, $obj->{header} // 1, $obj->{magic} //1 , $obj->{glob} // 1);

missing whitespace: '//1' -> '// 1'

> +
>       return $obj;
>   }
>   
> +sub save {
> +    my ($self, $ruledb) = @_;
> +
> +    if (defined($self->{id})) {
> +	#update - clean old attribut entries
> +	$ruledb->{dbh}->do(
> +	    "DELETE FROM Attribut WHERE Object_ID = ?",
> +	    undef, $self->{id});
> +    }
> +
> +    $self->{id} = $self->SUPER::save($ruledb);
> +
> +    for my $prop (qw(header magic glob)) {
> +	if (defined($self->{$prop})) {
> +	    $ruledb->{dbh}->do(
> +		"INSERT INTO Attribut (Value, Name, Object_ID) VALUES (?, ?, ?) ".
> +		"ON CONFLICT(Object_ID, Name) DO UPDATE SET Value = Excluded.Value ",
> +		undef, $self->{$prop}, $prop,  $self->{id});
> +	}
> +    }
> +
> +    return $self->{id};
> +}
> +
>   sub parse_entity {
>       my ($self, $entity) = @_;
>   
> @@ -78,11 +125,14 @@ sub parse_entity {
>   
>   	my $glob_ct = $entity->{PMX_glob_ct};
>   
> -	if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
> +	my $check_header = !defined($self->{header}) || ${self}->{header};
> +	my $check_magic = !defined($self->{magic}) || ${self}->{magic};
> +	my $check_glob = !defined($self->{glob}) || ${self}->{glob};

IMHO would be more readable when using the same syntax as above:

my $check_header = $self->{header} // 1;

what do you think?

> +	if ($header_ct && $check_header && $header_ct =~ m|$self->{field_value}|) {
>   	    push @$res, $id;
> -	} elsif ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
> +	} elsif ($magic_ct && $check_magic && $magic_ct =~ m|$self->{field_value}|) {
>   	    push @$res, $id;
> -	} elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
> +	} elsif ($glob_ct && $check_glob && $glob_ct =~ m|$self->{field_value}|) {
>   	    push @$res, $id;
>   	}
>       }
> @@ -112,19 +162,50 @@ sub properties {
>   	    pattern => '[0-9a-zA-Z\/\\\[\]\+\-\.\*\_]+',
>   	    maxLength => 1024,
>   	},
> +	header => {
> +	    description => "use content-type from mail-header for matching",

is that really the 'mail-header' or the "mime-part header" ?
(not sure how to express that in a way it's clear to all)

> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 1,
> +	},
> +	magic => {
> +	    description => "use content-type from scanning the content for matching",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 1,
> +	},
> +	glob => {
> +	    description => "use content-type based on file-name for matching",
> +	    type => 'boolean',
> +	    optional => 1,
> +	    default => 1,
> +	},
>       };
>   }
>   
>   sub get {
>       my ($self) = @_;
>   
> -    return { contenttype => $self->{field_value} };
> +    return {
> +	contenttype => $self->{field_value},
> +	header => $self->{header},
> +	magic => $self->{magic},
> +	glob => $self->{glob},
> +    };
>   }
>   
>   sub update {
>       my ($self, $param) = @_;
>   
>       $self->{field_value} = $param->{contenttype};
> +
> +    for my $prop (qw(header magic glob)) {
> +	if (defined($param->{$prop}) && $param->{$prop} == 0) {
> +	    $self->{$prop} = 0;
> +	} else {
> +	    delete $self->{$prop};
> +	}
> +    }
>   }
>   
>   1;



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching
  2025-02-17 13:03   ` Dominik Csapak
@ 2025-02-17 14:44     ` Stoiko Ivanov
  0 siblings, 0 replies; 8+ messages in thread
From: Stoiko Ivanov @ 2025-02-17 14:44 UTC (permalink / raw)
  To: Dominik Csapak; +Cc: pmg-devel

On Mon, 17 Feb 2025 14:03:55 +0100
Dominik Csapak <d.csapak@proxmox.com> wrote:

> code looks mostly fine (some comments inline)
thanks!

> 
> but one higher level question:
> 
> does it really make sense to expose 3 settings for this?
> 
> e.g. wouldn't it much more simple to have a 'default' and 'strict' mode
> where 'strict' only matches on the contents file type?
The idea was to make it explicit what is matched and how.

I'd say that the current way (and future default) corresponds to 'strict'
(in the sense that the what-match eagerly happens if any of the 3 options
yield a match)


> 
> both filename&header can be set aribtrarily by the sender, so doing
> filtering based on that seem dangerous?
matching on the magic only can also be misleading - with some tests I'm
currently running - xlsx (and other msoffice documents) get (righly)
recognized as application/zip by magic (the filename and header do fit in
that case).

But yes - this should at least be added to the docs to make it explicit
that most things matched here are controlled by the sender of the mails.
So I'll add a patch for the docs to the v2.


> 
> Or is there some specific use case that makes it necessary
> to specify all 3 sources separately ?
see above (since I just ran into it) - if someone want's to block
zip-files, but allow docx (might seem a bit far-fetched, but afair it was
something like that, that caused me to open the issue in the first place
back then).

> 
> On 2/12/25 16:12, Stoiko Ivanov wrote:
> > our current content-type matching is sensibly quite cautious in
> > matching if any available information indicates a potential match:
> > * mime-type detection based on file contents
> > * mime-type detection based on file suffix
> > * content-type header
> > 
> > Sometimes this can lead to surprises (e.g. when a MUA sets the
> > filetype of a pdf to application/octet-stream (the default type if no
> > information is available).
> > 
> > This change gives users the option to rely only on some of the sources
> > for matching.
> > 
> > This is a fix for the intial request in #2691 and addresses the
> > suggestion from Friedrich from:
> > https://bugzilla.proxmox.com/show_bug.cgi?id=5618#c2
> > 
> > inspired by the changes for disclaimer released with PMG 8.1:
> > 51d1507 ("fix #2430: ruledb disclaimer: make separator configurable")
> > 
> > Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> > ---
> >   src/PMG/RuleDB/ContentTypeFilter.pm | 91 +++++++++++++++++++++++++++--
> >   1 file changed, 86 insertions(+), 5 deletions(-)
> > 
> > diff --git a/src/PMG/RuleDB/ContentTypeFilter.pm b/src/PMG/RuleDB/ContentTypeFilter.pm
> > index 0199311..0dafa64 100644
> > --- a/src/PMG/RuleDB/ContentTypeFilter.pm
> > +++ b/src/PMG/RuleDB/ContentTypeFilter.pm
> > @@ -26,7 +26,7 @@ sub otype_text {
> >   }
> >   
> >   sub new {
> > -    my ($type, $fvalue, $ogroup) = @_;
> > +    my ($type, $fvalue, $ogroup, $header, $magic, $glob) = @_;
> >   
> >       my $class = ref($type) || $type;
> >   
> > @@ -36,6 +36,9 @@ sub new {
> >       }
> >   
> >       my $self = $class->SUPER::new('content-type', $fvalue, $ogroup);
> > +    $self->{header} = $header;
> > +    $self->{magic} = $magic;
> > +    $self->{glob} = $glob;
> >   
> >       return $self;
> >   }
> > @@ -52,9 +55,53 @@ sub load_attr {
> >   	$obj->{field_value} = $nt;
> >       }
> >   
> > +    my $sth = $ruledb->{dbh}->prepare(
> > +	"SELECT * FROM Attribut WHERE Object_ID = ?");
> > +
> > +    $sth->execute($id);
> > +
> > +    $obj->{header} = $obj->{magic} = $obj->{glob} = 1;
> > +
> > +    while (my $ref = $sth->fetchrow_hashref()) {
> > +	if ($ref->{name} =~ /^(header|magic|glob)$/) {
> > +	    $obj->{$1} = $ref->{value};
> > +	}
> > +    }
> > +
> > +    $sth->finish();
> > +
> > +    $obj->{id} = $id;
> > +
> > +    $obj->{digest} = Digest::SHA::sha1_hex(
> > +	$id, $value, $ogroup, $obj->{header} // 1, $obj->{magic} //1 , $obj->{glob} // 1);  
> 
> missing whitespace: '//1' -> '// 1'
> 
> > +
> >       return $obj;
> >   }
> >   
> > +sub save {
> > +    my ($self, $ruledb) = @_;
> > +
> > +    if (defined($self->{id})) {
> > +	#update - clean old attribut entries
> > +	$ruledb->{dbh}->do(
> > +	    "DELETE FROM Attribut WHERE Object_ID = ?",
> > +	    undef, $self->{id});
> > +    }
> > +
> > +    $self->{id} = $self->SUPER::save($ruledb);
> > +
> > +    for my $prop (qw(header magic glob)) {
> > +	if (defined($self->{$prop})) {
> > +	    $ruledb->{dbh}->do(
> > +		"INSERT INTO Attribut (Value, Name, Object_ID) VALUES (?, ?, ?) ".
> > +		"ON CONFLICT(Object_ID, Name) DO UPDATE SET Value = Excluded.Value ",
> > +		undef, $self->{$prop}, $prop,  $self->{id});
> > +	}
> > +    }
> > +
> > +    return $self->{id};
> > +}
> > +
> >   sub parse_entity {
> >       my ($self, $entity) = @_;
> >   
> > @@ -78,11 +125,14 @@ sub parse_entity {
> >   
> >   	my $glob_ct = $entity->{PMX_glob_ct};
> >   
> > -	if ($header_ct && $header_ct =~ m|$self->{field_value}|) {
> > +	my $check_header = !defined($self->{header}) || ${self}->{header};
> > +	my $check_magic = !defined($self->{magic}) || ${self}->{magic};
> > +	my $check_glob = !defined($self->{glob}) || ${self}->{glob};  
> 
> IMHO would be more readable when using the same syntax as above:
> 
> my $check_header = $self->{header} // 1;
> 
> what do you think?
> 
> > +	if ($header_ct && $check_header && $header_ct =~ m|$self->{field_value}|) {
> >   	    push @$res, $id;
> > -	} elsif ($magic_ct && $magic_ct =~ m|$self->{field_value}|) {
> > +	} elsif ($magic_ct && $check_magic && $magic_ct =~ m|$self->{field_value}|) {
> >   	    push @$res, $id;
> > -	} elsif ($glob_ct && $glob_ct =~ m|$self->{field_value}|) {
> > +	} elsif ($glob_ct && $check_glob && $glob_ct =~ m|$self->{field_value}|) {
> >   	    push @$res, $id;
> >   	}
> >       }
> > @@ -112,19 +162,50 @@ sub properties {
> >   	    pattern => '[0-9a-zA-Z\/\\\[\]\+\-\.\*\_]+',
> >   	    maxLength => 1024,
> >   	},
> > +	header => {
> > +	    description => "use content-type from mail-header for matching",  
> 
> is that really the 'mail-header' or the "mime-part header" ?
> (not sure how to express that in a way it's clear to all)
> 
> > +	    type => 'boolean',
> > +	    optional => 1,
> > +	    default => 1,
> > +	},
> > +	magic => {
> > +	    description => "use content-type from scanning the content for matching",
> > +	    type => 'boolean',
> > +	    optional => 1,
> > +	    default => 1,
> > +	},
> > +	glob => {
> > +	    description => "use content-type based on file-name for matching",
> > +	    type => 'boolean',
> > +	    optional => 1,
> > +	    default => 1,
> > +	},
> >       };
> >   }
> >   
> >   sub get {
> >       my ($self) = @_;
> >   
> > -    return { contenttype => $self->{field_value} };
> > +    return {
> > +	contenttype => $self->{field_value},
> > +	header => $self->{header},
> > +	magic => $self->{magic},
> > +	glob => $self->{glob},
> > +    };
> >   }
> >   
> >   sub update {
> >       my ($self, $param) = @_;
> >   
> >       $self->{field_value} = $param->{contenttype};
> > +
> > +    for my $prop (qw(header magic glob)) {
> > +	if (defined($param->{$prop}) && $param->{$prop} == 0) {
> > +	    $self->{$prop} = 0;
> > +	} else {
> > +	    delete $self->{$prop};
> > +	}
> > +    }
> >   }
> >   
> >   1;  
> 
> 



_______________________________________________
pmg-devel mailing list
pmg-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pmg-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-02-17 14:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-12 15:12 [pmg-devel] [PATCH pmg-api/pmg-gui] add information source attribute to content-type what-object Stoiko Ivanov
2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 1/2] ruledb: disclaimer: simplify update-case Stoiko Ivanov
2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-api 2/2] ruledb: content-type: add flags for source of matching Stoiko Ivanov
2025-02-17 13:03   ` Dominik Csapak
2025-02-17 14:44     ` Stoiko Ivanov
2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 1/2] rules/object: remove icon from remove button Stoiko Ivanov
2025-02-13 10:37   ` Stoiko Ivanov
2025-02-12 15:12 ` [pmg-devel] [PATCH pmg-gui 2/2] rules/content-typefilter: add match source checkboxes Stoiko Ivanov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal