Bytes IT Community

Pop quiz, hot shots!

In an attempt to get another discussion going about security in PHP, I've
put together this little exercise, the purpose of which is to illustrate how
bedevilingly tricky writing secure code can be sometimes.

Think you're up to the challenge? Here goes:

Bob is the author of Bobster, an online community web app written in PHP. At
its core are its public message forums. As web apps of this sort are
plentiful, Bob is constantly adding new features in order to stay ahead
of the competition. One feature that users have oft requested is the ability
to enter messages in HTML. So he did a search on Google and found this nifty
JavaScript HTML editor.

Bob is no bozo, of course. He is aware of the dangers of accepting HTML, the
biggest of which is the injection of malicious JavaScript into pages of a
site. Since Bob knows this editor does not permit the entry of arbitrary
HTML code, he figured that a check on the HTTP referrer is a good first line
of defense, ensuring only form submissions that originate from the site are
processed. As a second line of defense, he added regular expression code
that looks for <script> tags in the message text.

The following is Bob's code:

$text = $_POST['msg'];
$user_id = $_SESSION['user_id'];
$referrer = parse_url($_SERVER['HTTP_REFERER']);

// the submission must come from a page on the server
if($referrer['host'] != $_SERVER['HTTP_HOST']) {

// remove JavaScript from text
$text = preg_replace('/<script.*?>.*?<\/script>/si', '', $text);

// save the message
AddMessage($user_id, $text);
}

Now, analyse this code and note as many weaknesses as you can find, keeping
in mind that Bob's main objective is preventing JavaScript injection. Extra
credit will be given for other security concerns, however. Those who answer
along the lines of "he shouldn't accept HTML in the first place" will receive
no points. What interests us here is why not.
Jul 17 '05 #1


Regarding this well-known quote, often attributed to Chung Leong's famous
"Fri, 9 Apr 2004 00:23:16 -0400" speech:
I'll bite. I need to test my security skills, anyhow.

The following is Bob's code:

$text = $_POST['msg'];
Well, Bob did learn one thing in PHP class: read the value from $_POST
rather than relying on register_globals. There are also no "assumed unset"
variables to poison in case register_globals is on. Good Bob.
$user_id = $_SESSION['user_id'];
I'll assume that the session has already been started and validated?
$referrer = parse_url($_SERVER['HTTP_REFERER']);
FATAL!: Never trust the referrer. This header is supplied by the client, and
can be swapped out as easily as downloading a Mozilla plugin; some browsers
and proxies leave it out altogether. This, the User-Agent, and the remote IP
are completely useless as mission-critical data. In fact, I wouldn't trust
anything that the client gives me, short of a good, solid set of $_SESSION[]
variables I set earlier.

THE SOLUTION!: The code should work securely no matter what it's fed. The
Web is stateless, so the security should not rely on what happened in a
foggy past. If it's absolutely necessary, statefulness should depend on a
unique, meaningless (it encodes nothing about the user), quickly expiring,
and heavily random session-ID.
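
One common way to act on this, offered here as my own illustration rather than something the poster spelled out, is a random per-session form token checked on submission in place of the referrer. The sketch below assumes hypothetical helper names (issueFormToken, checkFormToken) and a 'form_token' session key; a plain array stands in for $_SESSION. Note too that Bob's check `$referrer['host'] != $_SERVER['HTTP_HOST']` uses `!=`, so as written it appears to accept only off-site referrers, the opposite of what its comment says.

```php
<?php
// Hypothetical helpers, not part of Bob's code: a single-use form token.

/** Create a token, remember it in the session array, and return it for the form. */
function issueFormToken(array &$session): string
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters from a CSPRNG
    $session['form_token'] = $token;
    return $token;
}

/** True only if the submitted token matches the stored one. */
function checkFormToken(array &$session, ?string $submitted): bool
{
    $expected = $session['form_token'] ?? null;
    unset($session['form_token']); // spend the token whether or not it matched

    return is_string($expected)
        && is_string($submitted)
        && hash_equals($expected, $submitted); // constant-time comparison
}

// Usage sketch: a plain array stands in for $_SESSION.
$session = [];
$token   = issueFormToken($session);        // embed this in a hidden form field
$ok      = checkFormToken($session, $token); // call this when the form comes back
```

In a real page the array would be $_SESSION itself and the submitted value would come from $_POST; the token only proves the form was served to this session, so the message text still needs filtering.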

// the submission must come from a page on the server
if($referrer['host'] != $_SERVER['HTTP_HOST']) {

// remove JavaScript from text
$text = preg_replace('/<script.*?>.*?<\/script>/si', '', $text);
FATAL!: Think onLoad, onMouseOver, and <A HREF="javascript:...">. There's
more than one way to load a script.
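
A quick way to see this is to feed Bob's own pattern a few inputs that carry script without a <script> element. The payload strings below are made-up examples; the pattern is copied from the post.

```php
<?php
// Bob's filter, using the pattern exactly as posted.
function bobFilter(string $text): string
{
    return preg_replace('/<script.*?>.*?<\/script>/si', '', $text);
}

// Illustrative inputs: event handlers and a javascript: URL.
$payloads = [
    '<img src="x.gif" onMouseOver="alert(1)">',
    '<body onLoad="alert(1)">',
    '<A HREF="javascript:alert(1)">click me</A>',
];

foreach ($payloads as $p) {
    // None of these contains a <script> element, so the filter has nothing to remove.
    echo (bobFilter($p) === $p ? 'survives: ' : 'altered:  '), $p, "\n";
}
```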

FATAL!: This doesn't drop the OBJECT or APPLET tags. Prepare to enter a
world of ActiveX and Java pain. Also, allowing unauthorized DIV, SPAN, or
STYLE tags (STYLE as a tag, not an attribute) can cause visual headaches on
the rest of the page.

WARNING!: Although it might not be strict HTML, some browsers might
interpret lines like '< SCRIPT...' (with a space) or
'<SCRIPT LANGUAGE="JavaScript" SRC="sourcefile.js">' (a SRC attribute and no
closing tag) as parsable. Bob's pattern needs both a literal "<script" and a
closing "</script>", so it leaves either form alone. It's best to just dump
any SCRIPT tags, regardless.
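
As a rough sketch of "just dump any SCRIPT tags", the looser pattern below (my own, not from the post) strips opening and closing SCRIPT tags separately instead of requiring a matched pair. It is still a denylist applied in a single pass, so a nested input such as `<scr<script>ipt>` can reassemble into a tag after stripping; treat it as a demonstration of the gap, not as a fix.

```php
<?php
$bob   = '/<script.*?>.*?<\/script>/si';   // Bob's pattern from the post
$loose = '/<\s*\/?\s*script\b[^>]*>/i';    // hypothetical: any SCRIPT tag, paired or not

$unclosed = '<SCRIPT LANGUAGE="JavaScript" SRC="sourcefile.js">';
$spaced   = '< SCRIPT>alert(1)</ SCRIPT>';

// Bob's pattern finds no "<script ... </script>" pair in either input.
$bobUnclosed = preg_replace($bob, '', $unclosed);
$bobSpaced   = preg_replace($bob, '', $spaced);

// The looser pattern removes the tags themselves and keeps the text between them.
$looseUnclosed = preg_replace($loose, '', $unclosed);
$looseSpaced   = preg_replace($loose, '', $spaced);
```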

WARNING!: There are a few other tags Bob should watch out for: OBJECT and
IFRAME tags can load an outside page in a small "internal frame", right
along with all its malicious script. The browser's cross-domain scripting
restrictions keep that code from doing much if it's offsite, but it can
still run some annoying attacks (ActiveX controls, popups, the JavaScript
Looping Alert Box of Doom). An attacker *might* be able to hijack Bob's
page, too, with a "Bookmarklet" style
<A HREF="javascript:javascriptCodeGoesHere()"> link. (Actually, the
OBJECT tag can load an ActiveX plugin, too, can't it... time to add another
"FATAL!")

THE SOLUTION!: Match the tags in the text against an array of known
ACCEPTABLE tags. For anything else, convert the < and > to &lt; and &gt;.
I'd recommend this over just an empty-replace. Otherwise, great swaths of
text might be wiped out by having an errant < near the beginning and an
errant > near the end of a comment (as ASCII-Art arrows, perhaps).
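
A minimal sketch of that idea, assuming a small illustrative tag list and a hypothetical function name: escape the whole message first, then switch back on only the exact tags on the list. Working in that order means anything unrecognised is already harmless text.

```php
<?php
/**
 * Escape everything, then re-enable a short allowlist of attribute-free tags.
 * The function name and the default tag list are assumptions for illustration.
 */
function allowSimpleTags(string $text, array $allowed = ['b', 'i', 'em', 'strong']): string
{
    // Every <, >, & and quote becomes an entity, so nothing is markup any more.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Restore only exact forms such as <b> or </b>: no attributes, no extra spaces.
    $names = implode('|', array_map('preg_quote', $allowed));

    return preg_replace('/&lt;(\/?)(' . $names . ')&gt;/i', '<$1$2>', $escaped);
}

echo allowSimpleTags('<b>hi</b> <script>alert(1)</script> a <-- b --> c'), "\n";
```

Because nothing is deleted, stray arrows and angle brackets in ordinary text survive as visible characters instead of taking the text between them along.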

Once acceptable tags are weeded out, process each tag. Tags such as <B> or
<I> should be re-emitted with no attributes at all, and complex tags should
carry only the attributes Bob wants to allow. Some clever regexps should do
the trick. Remember that attribute values can be delimited by double quotes,
single quotes, or nothing at all after the equals sign.
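
For a tag that does need an attribute, one hedged sketch is to accept a single strict form and leave everything else escaped. The function below is hypothetical and deliberately narrow: it allows only `<a href="...">` with a double-quoted http or https URL, so single-quoted and unquoted values are simply not re-enabled rather than parsed.

```php
<?php
/**
 * Escape everything, then re-enable links of one exact shape.
 * Hypothetical helper; the accepted form is an assumption for illustration.
 */
function allowLinks(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Opening tag: exactly one href, double-quoted, http or https only.
    // The URL may contain &amp; but no other entity, so it cannot hold a quote.
    $escaped = preg_replace(
        '/&lt;a href=&quot;(https?:\/\/(?:[^&\s]|&amp;)+)&quot;&gt;/i',
        '<a href="$1">',
        $escaped
    );

    // Closing tag. A closer whose opener was rejected is restored too,
    // which is untidy but carries no script.
    return preg_replace('/&lt;\/a&gt;/i', '</a>', $escaped);
}

echo allowLinks('<a href="http://example.com/?a=1&b=2">ok</a>'), "\n";
echo allowLinks('<a href="javascript:alert(1)">bad</a>'), "\n";
```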

Bob should also make sure all tags are closed, to prevent the "Didn't close
<I> and the whole page is italic" problem. Other visual checks would be
anti-page-widening filters and overlarge image-size checking.
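
The unclosed-tag check can be sketched with a small stack, assuming the message has already been reduced to bare allowlisted tags such as <b> and </b>. The function name and tag list are hypothetical, and mis-nested input (a closer that does not match the most recent opener) is only tolerated, not repaired.

```php
<?php
/**
 * Append closing tags for any allowlisted tag left open.
 * Assumes tags carry no attributes; the default tag list is illustrative.
 */
function closeOpenTags(string $html, array $allowed = ['b', 'i', 'em', 'strong']): string
{
    $names = implode('|', array_map('preg_quote', $allowed));
    preg_match_all('/<(\/?)(' . $names . ')>/i', $html, $matches, PREG_SET_ORDER);

    $open = []; // stack of tags opened but not yet closed
    foreach ($matches as $m) {
        $tag = strtolower($m[2]);
        if ($m[1] === '') {
            $open[] = $tag;              // opening tag: push
        } elseif (end($open) === $tag) {
            array_pop($open);            // closes the most recent opener
        }
        // A closer that does not match the top of the stack is ignored here.
    }

    // Close whatever is still open, innermost first.
    while ($open) {
        $html .= '</' . array_pop($open) . '>';
    }

    return $html;
}

echo closeOpenTags('<b><i>whole page italic?'), "\n";
```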

// save the message
AddMessage($user_id, $text);
AddMessage(). We can only hope.

Remember, Bob. Don't store the data in a file with an extension that can
get parsed or CGI'd by Apache, and never, EVER just include() the file,
expecting innocuous HTML.
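
The difference is easy to demonstrate with a throwaway file. In this sketch the temporary file and its contents are made up; the point is that include hands the file to the PHP engine while file_get_contents returns it as inert data.

```php
<?php
// A "message" that is really PHP source, saved the way a stored post might be.
$path = tempnam(sys_get_temp_dir(), 'msg');
file_put_contents($path, '<?php echo "code ran"; ?>');

// include treats the file as PHP, so the payload executes.
ob_start();
include $path;
$included = ob_get_clean();

// file_get_contents returns the bytes untouched; escape them before output.
$raw  = file_get_contents($path);
$safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

unlink($path);
```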
}


--
-- Rudy Fleminger
-- sp@mmers.and.evil.ones.will.bow-down-to.us
(put "Hey!" in the Subject line for priority processing!)
-- http://www.pixelsaredead.com
Jul 17 '05 #2
