aboutsummaryrefslogtreecommitdiff
path: root/doc/plugins/htmlscrubber.mdwn
blob: 08c81212bdf19fdf6acc0c4756d6becddcaee036 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
[[!template id=plugin name=htmlscrubber core=1 author="[[Joey]]"]]
[[!tag type/html]]

This plugin is enabled by default. It sanitizes the html on pages it renders
to avoid XSS attacks and the like.

It excludes all html tags and attributes except for those that are
whitelisted using the same lists as used by Mark Pilgrim's Universal Feed
Parser, documented at
<http://web.archive.org/web/20110726052341/http://feedparser.org/docs/html-sanitization.html>.
Notably it strips `style` and `link` tags, and the `style` attribute.

Any attributes that could be used to specify a URL are checked to ensure
that they are known, safe schemes.  It will also block embedded javascript
in such URLs.

It uses the [[!cpan HTML::Scrubber]] perl module to perform its html
sanitisation, and this perl module also deals with various entity encoding
tricks.

While I believe that this makes ikiwiki as resistant to malicious html
content as anything else on the web, I cannot guarantee that it will
actually protect every user of every browser from every browser security
hole, badly designed feature, etc. I can provide NO WARRANTY, like it says
in ikiwiki's [[GPL]] license. 

The web's security model is *fundamentally broken*; ikiwiki's html
sanitisation is only a patch on the underlying gaping hole that is your web
browser.

Note that enabling or disabling the htmlscrubber plugin also affects some
other HTML-related functionality, such as whether [[meta]] allows
potentially unsafe HTML tags.

The `htmlscrubber_skip` configuration setting can be used to skip scrubbing
of some pages. Set it to a [[ikiwiki/PageSpec]], such as 
`posts/* and !comment(*) and !*/Discussion`, and pages matching that can have
all the evil CSS, JavsScript, and unsafe html elements you like. One safe
way to use this is to use [[lockedit]] to lock those pages, so only admins
can edit them.

----

Some examples of embedded javascript that won't be let through when this
plugin is active:

* script tag test <script>window.location='http://example.org';</script>
* <span style="background: url(javascript:window.location='http://example.org/')">CSS script test</span>
* <span style="&#x61;&#x6e;&#x79;&#x3a;&#x20;&#x65;&#x78;&#x70;&#x72;&#x65;&#x73;&#x73;&#x69;&#x6f;&#x6e;&#x28;&#x77;&#x69;&#x6e;&#x64;&#x6f;&#x77;&#x2e;&#x6c;&#x6f;&#x63;&#x61;&#x74;&#x69;&#x6f;&#x6e;&#x3d;&#x27;&#x68;&#x74;&#x74;&#x70;&#x3a;&#x2f;&#x2f;&#x65;&#x78;&#x61;&#x6d;&#x70;&#x6c;&#x65;&#x2e;&#x6f;&#x72;&#x67;&#x2f;&#x27;&#x29;">entity-encoded CSS script test</span>
* <span style="&#97;&#110;&#121;&#58;&#32;&#101;&#120;&#112;&#114;&#101;&#115;&#115;&#105;&#111;&#110;&#40;&#119;&#105;&#110;&#100;&#111;&#119;&#46;&#108;&#111;&#99;&#97;&#116;&#105;&#111;&#110;&#61;&#39;&#104;&#116;&#116;&#112;&#58;&#47;&#47;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#111;&#114;&#103;&#47;&#39;&#41;">entity-encoded CSS script test</span>
* <a href="javascript&#x3A;alert('foo')">click me</a>