16.1. How to allow special HTML tags

PHP-Nuke checks and filters user input automatically, using the functions filter_text and check_html. filter_text checks for bad words and replaces them, then calls check_html, which checks for HTML tags and strips them completely if its second parameter is "nohtml".
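The call chain can be pictured with a small sketch. The functions sketch_filter_text and sketch_check_html and the $badWords list are hypothetical stand-ins written for illustration, not PHP-Nuke's own code; the real functions do considerably more than this.

```php
<?php
// Hypothetical stand-ins for illustration only; not the PHP-Nuke source.

// Strip every tag when $strip is "nohtml"; otherwise return the text as is
// (the real check_html would filter against $AllowableHTML at this point).
function sketch_check_html(string $str, string $strip = ""): string
{
    if ($strip === "nohtml") {
        return strip_tags($str);
    }
    return $str;
}

// Replace bad words first, then hand the result to the HTML check.
function sketch_filter_text(string $message, string $strip = ""): string
{
    $badWords = ["darn"]; // assumed word list, for the example only
    $message = str_ireplace($badWords, "*", $message);
    return sketch_check_html($message, $strip);
}

echo sketch_filter_text("A <b>darn</b> subject", "nohtml"), "\n"; // A * subject
echo sketch_filter_text("A <b>darn</b> comment"), "\n";           // A <b>*</b> comment
```

The first call mirrors how subjects are filtered with "nohtml", the second how comment bodies are filtered without it.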

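Table 16-1 shows all modules that call filter_text, together with the line on which the call is made. You can see that filter_text is used to filter mail content in WebMail, comment subjects and bodies in News and Surveys, review comments in Reviews, submitted story subjects and texts in Submit_News, and the username, bio and message in Your_Account.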

Table 16-1. Calls to filter_text from PHP-Nuke modules (v.6.8)

Module                            Line with call to filter_text
--------------------------------  ------------------------------------------------
modules/WebMail/readmail.php      $res = filter_text($res); // Security fix by Ulf Harnhammar 2002
modules/News/comments.php         $comment = FixQuotes(nl2br(filter_text($comment)));
modules/News/comments.php         $subject = FixQuotes(filter_text($subject, "nohtml"));
modules/News/comments.php         $comment = FixQuotes(filter_text($comment));
modules/Reviews/index.php         $comments = FixQuotes(nl2br(filter_text($comments)));
modules/Submit_News/index.php     $subject = FixQuotes(filter_text($subject, "nohtml"));
modules/Submit_News/index.php     $story = FixQuotes(nl2br(filter_text($story)));
modules/Submit_News/index.php     $storyext = FixQuotes(nl2br(filter_text($storyext)));
modules/Submit_News/index.php     $story = FixQuotes(filter_text($story));
modules/Submit_News/index.php     $storyext = FixQuotes(filter_text($storyext));
modules/Surveys/comments.php      $subject = FixQuotes(filter_text($subject, "nohtml"));
modules/Surveys/comments.php      $comment = FixQuotes(nl2br(filter_text($comment)));
modules/Surveys/comments.php      $comment = FixQuotes(filter_text($comment));
modules/Your_Account/index.php    filter_text($username);
modules/Your_Account/index.php    if ($bio) { filter_text($bio); $bio = $EditedMessage; $bio = FixQuotes($bio); }
modules/Your_Account/index.php    $the_message = FixQuotes(filter_text($the_message, "nohtml"));

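check_html, in turn, is called not only from filter_text but also directly. Table 16-2 shows all modules that call check_html and the line on which it is called. You can see that check_html is called to check the HTML input in search queries (Downloads, Encyclopedia, Search, Web_Links), review fields (Reviews), user profile fields (Your_Account), and journal entries and comments (Journal).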

Table 16-2. Calls to check_html from PHP-Nuke modules (v.6.8)

Module                            Line with call to check_html
--------------------------------  ------------------------------------------------
modules/Downloads/index.php       $query = check_html($query, nohtml);
modules/Encyclopedia/search.php   $query = check_html($query, nohtml);
modules/Encyclopedia/search.php   $query = check_html($query, nohtml);
modules/Reviews/index.php         $title = stripslashes(check_html($title, "nohtml"));
modules/Reviews/index.php         $text = stripslashes(check_html($text, ""));
modules/Reviews/index.php         $reviewer = stripslashes(check_html($reviewer, "nohtml"));
modules/Reviews/index.php         $url_title = stripslashes(check_html($url_title, "nohtml"));
modules/Reviews/index.php         $title = stripslashes(FixQuotes(check_html($title, "nohtml")));
modules/Reviews/index.php         $text = stripslashes(Fixquotes(urldecode(check_html($text, ""))));
modules/Reviews/index.php         $comments = stripslashes(FixQuotes(check_html($comments)));
modules/Search/index.php          $query = stripslashes(check_html($query, nohtml));
modules/Web_Links/index.php       $query = check_html($query, nohtml);
modules/Your_Account/index.php    $username = check_html($username, nohtml);
modules/Your_Account/index.php    $user_email = check_html($user_email, nohtml);
modules/Your_Account/index.php    $user_email = check_html($user_email, nohtml);
modules/Your_Account/index.php    $femail = check_html($femail, nohtml);
modules/Your_Account/index.php    $user_website = check_html($user_website, nohtml);
modules/Your_Account/index.php    $bio = check_html($bio, nohtml);
modules/Your_Account/index.php    $user_avatar = check_html($user_avatar, nohtml);
modules/Your_Account/index.php    $user_icq = check_html($user_icq, nohtml);
modules/Your_Account/index.php    $user_aim = check_html($user_aim, nohtml);
modules/Your_Account/index.php    $user_yim = check_html($user_yim, nohtml);
modules/Your_Account/index.php    $user_msnm = check_html($user_msnm, nohtml);
modules/Your_Account/index.php    $user_occ = check_html($user_occ, nohtml);
modules/Your_Account/index.php    $user_from = check_html($user_from, nohtml);
modules/Your_Account/index.php    $user_interests = check_html($user_interests, nohtml);
modules/Your_Account/index.php    $realname = check_html($realname, nohtml);
modules/Journal/display.php       $row[bodytext]=check_html($row[bodytext], $strip);
modules/Journal/display.php       $row[comment]=check_html($row[comment], $strip);

check_html uses the $AllowableHTML array defined in config.php. The idea is that only the tags listed in $AllowableHTML are allowed. However, even if you explicitly allow the img tag in $AllowableHTML, check_html (and therefore filter_text, which calls it) will still strip it. The line that does this is:

$str = eregi_replace("<[[:space:]]* img[[:space:]]*([^>]*)[[:space:]]*>", '', $str);

You can comment out that line, but doing so is certainly a security risk: it allows people to post harmful code in img tags.
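To see what this kind of replacement does to a piece of input, here is a minimal sketch. It uses preg_replace because the ereg family of functions was removed in PHP 7. The pattern is a simplified approximation rather than a character-for-character port of the line above, and strip_img_tags is a hypothetical helper name.

```php
<?php
// Approximation of an img-stripping step using PCRE.
// strip_img_tags is a hypothetical helper, not part of PHP-Nuke.
function strip_img_tags(string $str): string
{
    // Match "<", optional whitespace, "img", then everything up to ">".
    return preg_replace('/<\s*img\b[^>]*>/i', '', $str);
}

$input = 'Look: <img src="x.png" onerror="alert(1)"> done';
echo strip_img_tags($input), "\n"; // Look:  done
```

The onerror attribute in the sample input is the kind of harmful code that the stripping is meant to keep out.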

You can also comment out the line that removes all attributes except href from the <a> tag:

$str = eregi_replace("<a[^>]*href[[:space:]]*=[[:space:]]*\"?[[:space:]]*([^\" >]*)[[:space:]]*\"?[^>]*>", '<a href="\\1">', $str); # "

These changes affect the checks at every call site listed in Table 16-1 and Table 16-2, so again, be careful about security. Only grant this convenience to users you trust.
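The effect of the anchor rewrite can be illustrated with a close PCRE equivalent. This is a sketch that assumes a modern PHP without eregi_replace; normalize_anchor is a hypothetical helper name, and \s stands in for the POSIX [[:space:]] class.

```php
<?php
// PCRE approximation of the anchor clean-up: keep only the href attribute.
// normalize_anchor is a hypothetical helper, not part of PHP-Nuke.
function normalize_anchor(string $str): string
{
    return preg_replace(
        '/<a[^>]*href\s*=\s*"?\s*([^" >]*)\s*"?[^>]*>/i',
        '<a href="$1">',
        $str
    );
}

$input = '<a href="http://example.com/" onclick="evil()" target="_blank">link</a>';
echo normalize_anchor($input), "\n";
// <a href="http://example.com/">link</a>
```

With the original line commented out, attributes such as onclick and target in the sample would pass through to the later allow-list check instead of being dropped here.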

Put the tags you want to allow in the $AllowableHTML array in the config.php file. Here is a (quite liberal) example:

$AllowableHTML = array("b"=>1,
         "i"=>1,
         "a"=>2,
         "em"=>1,
         "br"=>1,
         "strong"=>1,
         "blockquote"=>1,
         "tt"=>1,
         "li"=>1,
         "ol"=>1,
         "h1"=>1,
         "h2"=>1,
         "h3"=>1,
         "h4"=>1,
         "center"=>1,
         "img"=>2,
         "alt"=>1,
         "table"=>2,
         "tr"=>2,
         "td"=>2,
         "p"=>2,
         "div"=>2,
         "font"=>2,
         "ul"=>1);

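The numbers that appear next to each tag have the following meaning:

1: the tag is allowed, but its attributes are removed.
2: the tag is allowed together with its attributes.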
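The effect of the two values can be sketched as follows, on the assumption that 1 keeps only the bare tag and 2 also keeps its attributes. apply_allowable_html is a hypothetical helper written for this illustration. It covers only the allow-list step and does not reproduce the separate img and anchor rewrites in check_html.

```php
<?php
// Simplified sketch of allow-list filtering; apply_allowable_html is a
// hypothetical helper, not PHP-Nuke's check_html.
function apply_allowable_html(string $str, array $allowed): string
{
    return preg_replace_callback(
        '/<(\/?)([a-z][a-z0-9]*)([^>]*)>/i',
        function (array $m) use ($allowed): string {
            $tag = strtolower($m[2]);
            $level = $allowed[$tag] ?? 0;
            if ($level === 0) {
                return '';                  // tag not listed: drop it
            }
            if ($m[1] === '/') {
                return "</$tag>";           // closing tag: emit without attributes
            }
            if ($level === 1) {
                return "<$tag>";            // assumed meaning of 1: tag only
            }
            return "<$tag" . $m[3] . ">";   // assumed meaning of 2: keep attributes
        },
        $str
    );
}

$allowed = ["b" => 1, "a" => 2];
$html = '<b class="x">bold</b> <a href="/home">home</a> <script>x()</script>';
echo apply_allowable_html($html, $allowed), "\n";
// <b>bold</b> <a href="/home">home</a> x()
```

Note that dropping a tag removes only the markup: the text between the removed script tags stays in the output.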

See also Section 3.7.