Regarding this well-known quote, often attributed to Chung Leong's famous
"Fri, 9 Apr 2004 00:23:16 -0400" speech:
[color=blue]
> In an attempt to get another discussion going about security in PHP, I've
> put together this little exercise, the purpose of which is to illustrate how
> bedevilingly tricky writing secure code can be sometimes.
>
> Think you're up to the challenge? Here goes:
>
> Bob is the author of Bobster, a online community web-app written in PHP. At
> the core of it is its public message forums. As this sort of web-apps are
> many aplenty, Bob is constantly adding new features in order to stay ahead
> of the competition. One feature that users have oft requested is the ability
> to enter messages in HTML. So he did a search on Google and found this nifty
> Javascript HTML editor.
>
> Bob is no bozo, of course. He is aware of the dangers of accepting HTML, the
> biggest of which is the injection of malicious Javascript into pages of a
> site. Since Bob knows this editor does not permit the entry of arbitrary
> HTML code, he figured that a check on the HTTP referrer is a good first line
> of defense, ensuring only form submissions that originate from the site are
> processed. As a second line of defense, he added regular expression code
> that looks for <script> tags in the message text.[/color]
I'll bite. I need to test my security skills, anyhow.
[color=blue]
>
> The following is Bob's code:
>
> $text = $_POST['msg'];[/color]
Well, Bob did learn one thing in PHP class... he reads $_POST directly
instead of relying on register_globals. There are also no "assumed unset"
variables to exploit in case register_globals is on. Good Bob.
[color=blue]
> $user_id = $_SESSION['user_id'];[/color]
I'll assume that the session has already been started and validated?
[color=blue]
> $referrer = parse_url($_SERVER['HTTP_REFERER']);[/color]
FATAL!: Never trust the referrer. This header is supplied by the client, and
can be swapped out as easily as downloading a Mozilla plugin. This, the
User-Agent, and the remote IP are completely useless as mission-critical
data. In fact, I wouldn't trust anything that the client gives me, short of
a good, solid set of $_SESSION[] variables I set earlier.
THE SOLUTION!: The code should work securely no matter what it's fed. The
Web is stateless, so the security should not rely on what happened in a
foggy past. If it's absolutely necessary, statefulness should depend on a
unique, irrelevant, quickly expiring, and heavily random session-ID.
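To make that concrete, here is a rough sketch of one way to replace the
referrer check with something the server set earlier: a random token kept in
the session and echoed back by the form. This is my illustration, not Bob's
code. The function names and the 'token' form field are hypothetical, and
random_bytes() assumes PHP 7 or later.

```php
<?php
// Hypothetical helpers (not part of Bob's code): a per-session form token
// used instead of the referrer check.

// Create the token once per session and return it for embedding in the form.
function form_token(array &$session): string
{
    if (empty($session['form_token'])) {
        // 32 random bytes become a 64-character hex string.
        $session['form_token'] = bin2hex(random_bytes(32));
    }
    return $session['form_token'];
}

// Accept a submission only if it carries the token issued earlier.
function token_is_valid(array $session, array $post): bool
{
    return isset($session['form_token'], $post['token'])
        && is_string($post['token'])
        && hash_equals($session['form_token'], $post['token']);
}
```

Bob would call form_token($_SESSION) when printing the form (as a hidden
field) and token_is_valid($_SESSION, $_POST) before touching the message.
The token only proves the form came from Bob's site; the HTML filtering
still has to stand on its own.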
[color=blue]
>
> // the submission must come from a page on the server
> if($referrer['host'] != $_SERVER['HTTP_HOST']) {
>
> // remove Javscript from text
> $text = preg_replace('/<script.*?>.*?<\/script>/si', '', $text);[/color]
FATAL!: Think onLoad, onMouseOver, and <A HREF="javascript:...">. There's
more than one way to load a script.
FATAL!: This doesn't drop the OBJECT or APPLET tags. Prepare to enter a
world of ActiveX and Java pain. Also, allowing unauthorized DIV, SPAN, or
STYLE tags (STYLE as a tag, not a parameter) can cause visual headaches on
the rest of the page.
WARNING!: Although it might not be HTML strict, some browsers might
interpret lines like: "< SCRIPT..." (with a space) or "<SCRIPT
LANGUAGE="JavaScript" SRC="sourcefile.js">" (SRC without a closing tag) as
parsable. It's best to just dump any SCRIPT tags, regardless.
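The holes above are easy to check by feeding a few payloads through Bob's
regex. The sketch below copies his preg_replace() call into a function; the
function name, the payloads, and the evil.example URLs are made up for
illustration. Only the plain paired <script> block should come out altered.

```php
<?php
// Bob's filter, copied from the quoted code and wrapped in a function.
function bobs_filter(string $text): string
{
    return preg_replace('/<script.*?>.*?<\/script>/si', '', $text);
}

// Illustrative payloads (invented for this sketch).
$payloads = [
    'paired script'   => '<script>alert(1)</script>',
    'event handler'   => '<b onmouseover="alert(1)">hover me</b>',
    'javascript URL'  => '<a href="javascript:alert(1)">click</a>',
    'unclosed script' => '<script src="http://evil.example/x.js">',
    'object tag'      => '<object data="http://evil.example/x"></object>',
];

foreach ($payloads as $name => $payload) {
    $out = bobs_filter($payload);
    // Report whether the filter changed the payload at all.
    echo str_pad($name, 18), $out === $payload ? 'passes untouched' : 'altered', "\n";
}
```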
WARNING!: There are a few other tags Bob should watch out for: OBJECT and
IFRAME tags can load an outside page in a small "internal frame", right
along with all its malicious script. The browser's same-origin restrictions
keep that code from doing much to Bob's page if it's offsite, but it can
still run some annoying attacks (ActiveX controls, popups, the JavaScript
Looping Alert Box of Doom). An attacker *might* be able to hijack Bob's
page, too, with a "Bookmarklet"-style
<A HREF="javascript:javascriptCodeGoesHere()"> link. (Actually, the
OBJECT tag can load an ActiveX plugin, too, can't it... time to add another
"FATAL!")
THE SOLUTION!: Match the code against an Array of known ACCEPTABLE tags.
For anything else, convert the < and > to &lt; and &gt;. I'd recommend this
over just an empty-replace. Otherwise, great swaths of text might be
wiped out by an errant < near the beginning and an errant > near the
end of a comment (as ASCII-art arrows, perhaps).
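A minimal sketch of that idea, assuming a small whitelist of parameterless
tags: escape everything first, then turn only the bare whitelisted tags back
into live HTML. The tag list and the function name are my own placeholders,
and the array constant assumes PHP 5.6 or later.

```php
<?php
// Hypothetical whitelist; Bob would pick his own list of tags.
const ALLOWED_TAGS = ['b', 'i', 'u', 'em', 'strong'];

function escape_unknown_tags(string $text): string
{
    // Escape everything first, so nothing is live HTML by default.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Then restore only bare opening/closing forms of whitelisted tags.
    // A tag carrying any attribute does not match and stays escaped.
    $names = implode('|', array_map('preg_quote', ALLOWED_TAGS));
    return preg_replace_callback(
        '/&lt;(\/?)(' . $names . ')&gt;/i',
        function (array $m): string {
            return '<' . $m[1] . strtolower($m[2]) . '>';
        },
        $safe
    );
}
```

Because the default is "escape", a tag Bob never thought about stays
harmless text instead of slipping through, and stray < and > characters
survive as visible arrows rather than eating the comment.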
Once acceptable tags are weeded out, process each tag. Tags such as <B> or
<I> should be reduced to their bare form, without parameters, and complex
tags should only have the parameters Bob wants to allow. Some clever regexps
should do the trick. Remember that parameter values can be wrapped in
double-quotes, single quotes, or left unquoted after an equals sign.
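For a complex tag, one way to do this is to throw the submitted tag away and
rebuild it from the one parameter Bob allows. The sketch below does that for
<A>, keeping only an http or https HREF. The function name and the choice of
allowed schemes are assumptions of mine, not anything from Bob's code.

```php
<?php
// Rebuild an opening <a ...> tag, keeping only a safe href.
// Hypothetical helper: $tag is assumed to be a single opening <a> tag.
function clean_anchor(string $tag): string
{
    // The value may be double-quoted, single-quoted, or unquoted.
    // (?| ... ) is a PCRE branch-reset group, so the value is always $m[1].
    $pattern = '/\bhref\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>]+))/i';

    if (preg_match($pattern, $tag, $m) && preg_match('#^https?://#i', $m[1])) {
        // Re-escape the value so it cannot break out of the attribute.
        return '<a href="' . htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8') . '">';
    }

    // No href, or a scheme outside the whitelist (javascript: and friends).
    return '<a>';
}
```

Nothing from the original tag is copied through except the checked URL, so
onclick and other unwanted parameters simply never reach the output.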
Bob should also make sure all tags are closed, to prevent the "Didn't close
<I> and the whole page is italic" problem. Other visual checks would be
anti-page-widening filters and overlarge image-size checking.
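The unclosed-tag check can be sketched roughly like this, assuming the text
has already been reduced to bare whitelisted tags with no parameters. The
function name and the default tag list are placeholders.

```php
<?php
// Append closing tags for any whitelisted tag left open.
// Assumes $html contains only bare tags such as <b> or </i>.
// Stray closing tags with no matching opener are left as they are.
function close_open_tags(string $html, array $allowed = ['b', 'i', 'u', 'em', 'strong']): string
{
    $names = implode('|', array_map('preg_quote', $allowed));
    preg_match_all('/<(\/?)(' . $names . ')>/i', $html, $matches, PREG_SET_ORDER);

    $open = [];  // tag names currently open, oldest first
    foreach ($matches as $m) {
        $tag = strtolower($m[2]);
        if ($m[1] === '') {
            $open[] = $tag;  // opening tag: remember it
            continue;
        }
        // Closing tag: forget the most recent matching opener, if any.
        $pos = array_search($tag, array_reverse($open, true), true);
        if ($pos !== false) {
            unset($open[$pos]);
            $open = array_values($open);
        }
    }

    // Close whatever is still open, innermost first.
    foreach (array_reverse($open) as $tag) {
        $html .= '</' . $tag . '>';
    }
    return $html;
}
```

This only stops a forgotten <I> from leaking into the rest of the page. It
does not repair mis-nested tags, and the widening and image-size checks
would need their own passes.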
[color=blue]
>
> // save the message
> AddMessage($user_id, $text);[/color]
AddMessage(). We can only hope.
Remember, Bob. Don't store the data in a file with an extension that can
get parsed or CGI'd by Apache, and never, EVER just include() the file,
expecting innocuous HTML.
[color=blue]
> }
>
> Now, analyse this code and note as many weaknesses as you can find, keeping
> in mind that Bob's main objective is preventing Javascript injection. Extra
> credits will be given for other security concerns, however. Those who answer
> along the line of "he shouldn't accept HTML in the first place" will receive
> no points. What interests us here is why not.[/color]
--
-- Rudy Fleminger
--
sp@mmers.and.evil.ones.will.bow-down-to.us
(put "Hey!" in the Subject line for priority processing!)
--
http://www.pixelsaredead.com