public inbox for pmg-devel@lists.proxmox.com
From: Dominik Csapak <d.csapak@proxmox.com>
To: pmg-devel@lists.proxmox.com
Subject: [pmg-devel] [PATCH pmg-api] fix #3734: scrub 'url' from style tags/attributes
Date: Thu, 25 Nov 2021 12:22:31 +0100
Message-ID: <20211125112231.3403069-1-d.csapak@proxmox.com>

If 'view images' is disabled for the quarantine, *no* images are
expected to be loaded. But besides img attributes (src/href/etc.),
CSS can also load external images via the 'url()' function.

Since HTML::Scrubber does not parse or iterate over CSS, we simply
strip the 'url' keyword and the protocol part from those
tags/attributes. This technically leaves invalid CSS behind, but
browsers should cope with that. (A 'clean' removal of the whole value
would take much more effort because of quoting.)

The style tags have to be scrubbed in 'dump_html', since
HTML::Scrubber has no way to modify the *content* of a tag, only its
attributes.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
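To illustrate what the substitution described above does to typical CSS
values, here is a minimal standalone sketch. It reuses the pattern from the
patch in a plain named sub; the sample CSS strings and the example.com URLs
are made up for illustration and do not come from the patch.

```perl
use strict;
use warnings;

# Same substitution as in the patch, wrapped in a plain named sub.
# $value is a reference to a string scalar and is edited in place.
sub remove_urls {
    my ($value) = @_;
    $$value =~ s|url\s*\(\s*(['"]?)[a-z]+://|($1|gi;
    return;
}

# Hypothetical sample inputs, not taken from the patch.
my @samples = (
    q|a { background: url("http://example.com/a.png") }|,
    q|b { background: URL( 'https://example.com/b.png' ) }|,
    q|c { background: url(//example.com/c.png) }|,
);

for my $css (@samples) {
    my $scrubbed = $css;
    remove_urls(\$scrubbed);
    print "$css\n  => $scrubbed\n";
}
```

The first two values lose the 'url' keyword and the scheme, so they end up
as ("example.com/a.png") and ('example.com/b.png' ), which is no longer a
valid url() value. The third sample, a protocol-relative URL, contains no
'scheme://' part and is therefore left unchanged by this pattern.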
 src/PMG/HTMLMail.pm | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/PMG/HTMLMail.pm b/src/PMG/HTMLMail.pm
index b69a596..987dc39 100644
--- a/src/PMG/HTMLMail.pm
+++ b/src/PMG/HTMLMail.pm
@@ -15,8 +15,16 @@ use HTML::Scrubber;
 use PMG::Utils;
 use PMG::MIMEUtils;
 
+# $value is a ref to a string scalar
+my sub remove_urls {
+    my ($value) = @_;
+    # remove all urls with a protocol, this leaves partially invalid
+    # css, but prevents the browser from loading them
+    $$value =~ s|url\s*\(\s*(['"]?)[a-z]+://|($1|gi;
+}
+
 sub dump_html {
-    my ($tree, $cid_hash) = @_;
+    my ($tree, $cid_hash, $viewimages) = @_;
 
     my @html = ();
 
@@ -37,6 +45,11 @@ sub dump_html {
 			    $node->{src} = $datauri;
 			}
 		    }
+		} elsif ($tag eq 'style' && !$viewimages) {
+		    for my $el ($node->content_refs_list()) {
+			next if ref $$el;
+			remove_urls($el);
+		    }
 		}
 
 		if($start) { # on the way in
@@ -137,7 +150,13 @@ sub getscrubber {
 	    span => 1,
 	    src => $viewimages ? qr{^(?!(?:java)?script)}i : 0,
 	    start => 1,
-	    style => 1,
+	    style => $viewimages ? 1 : sub {
+		my ($obj, $tag_name, $attr_name, $value) = @_;
+
+		remove_urls(\$value);
+
+		return $value;
+	    },
 	    summary => 1,
 	    tabindex => 1,
 	    target => 1,
@@ -267,7 +286,7 @@ sub entity_to_html {
 	$tree->parse($raw);
 	$tree->eof();
 
-	my $whtml = dump_html($tree, $viewimages ? $cid_hash : {});
+	my $whtml = dump_html($tree, $viewimages ? $cid_hash : {}, $viewimages); #scrubs style tags
 	$tree->delete;
 
 	# remove dangerous/unneeded elements
-- 
2.30.2
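
The loop added to dump_html relies on content_refs_list() returning scalar
references to the content slots of a node, so that text segments can be
edited in place while child elements are skipped. The following sketch only
mimics that behaviour with a plain array; it does not use HTML::Element, and
the @content data is a made-up stand-in.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the content of a <style> node: plain strings
# are text segments, a hash reference stands in for a child element.
my @content = (
    q|body { background: url("http://example.com/bg.png"); }|,
    { _tag => 'b' },
    q|p { color: red; }|,
);

# \(@content) yields one reference per array element, so assigning
# through $$el modifies the original slot.
my @refs = \(@content);

for my $el (@refs) {
    next if ref $$el;    # slot holds a child node, not text: skip it
    $$el =~ s|url\s*\(\s*(['"]?)[a-z]+://|($1|gi;
}

print "$_\n" for grep { !ref } @content;
```

Only the first text segment contains a url() with a scheme, so only that
slot is rewritten; the hash reference and the second text segment stay as
they were.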
