
When to clean input text

P: n/a
I was wondering what people do with text provided by the user in a
form. Some cleaning needs to be done at some stage if you are going to
be putting it in a database or displaying it etc. But when is the time
to do that?

Do you clean it as soon as you get it?
Do you pass around the original text and clean it when you use it?

What about magic slashes? You need to addslashes before using in a db
statement, but you need to strip them when displaying. When do you do
that?

TIA.

Craig
Jul 17 '05 #1
8 Replies


P: n/a
Great question. The answer is that you always clean it. The OWASP (
http://www.owasp.org ) compiles a list of the top 10 most critical web
application flaws every year, and every year, unvalidated input is at
the top of the list. Here is what I do. I wrote a "datascrubber"
class that I use on every page that accepts any kind of input either
from POST or GET variables. The datascrubber is very simple: it runs
a series of tests on each variable passed to the page. The tests
include:

type (using is_int, is_float, etc)
minimum and maximum lengths
minimum and maximum values
regex - compare it to a regex to see if it matches the expected
pattern (email address, URL, etc).

If the variable data passes all the tests, then I push it into an
array called $clean[] and if not it goes into $unclean[]. At this
point I do the addslashes as well. Once this is done, I can call any
variable from the $clean[] array and be sure that it passed the tests
I set for it. I've encapsulated all this into an object for easy
reuse and I can provide that to you if you would like.
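
The checks listed above could be sketched roughly as follows. This is not the poster's datascrubber class: the function names `passes_rule()` and `scrub()` and the rule keys (`type`, `min_length`, `max_value`, `regex`, and so on) are hypothetical. Since form values always arrive as strings, the sketch assumes `filter_var()` for the type test rather than `is_int()`, and it leaves escaping out entirely.

```php
<?php
// Hypothetical helper names and rule keys; a sketch, not the poster's class.

// Return true when $value satisfies every test listed in $rule.
function passes_rule(string $value, array $rule): bool
{
    // Type: form data arrives as strings, so check the string's shape.
    $type = $rule['type'] ?? 'string';
    if ($type === 'int' && filter_var($value, FILTER_VALIDATE_INT) === false) {
        return false;
    }
    if ($type === 'float' && filter_var($value, FILTER_VALIDATE_FLOAT) === false) {
        return false;
    }

    // Minimum and maximum lengths.
    $length = strlen($value);
    if (isset($rule['min_length']) && $length < $rule['min_length']) {
        return false;
    }
    if (isset($rule['max_length']) && $length > $rule['max_length']) {
        return false;
    }

    // Minimum and maximum values (only meaningful for numeric input).
    if (isset($rule['min_value']) || isset($rule['max_value'])) {
        if (!is_numeric($value)) {
            return false;
        }
        $number = (float) $value;
        if (isset($rule['min_value']) && $number < $rule['min_value']) {
            return false;
        }
        if (isset($rule['max_value']) && $number > $rule['max_value']) {
            return false;
        }
    }

    // Regex: the value must match the expected pattern.
    if (isset($rule['regex']) && preg_match($rule['regex'], $value) !== 1) {
        return false;
    }
    return true;
}

// Sort each expected field into $clean or $unclean.
function scrub(array $input, array $rules): array
{
    $clean = [];
    $unclean = [];
    foreach ($rules as $name => $rule) {
        $value = $input[$name] ?? null;
        if (is_string($value) && passes_rule($value, $rule)) {
            $clean[$name] = $value;
        } else {
            $unclean[$name] = $value;
        }
    }
    return [$clean, $unclean];
}

// Example rules; the email pattern is deliberately loose and only illustrative.
$rules = [
    'age'   => ['type' => 'int', 'min_value' => 0, 'max_value' => 150],
    'email' => ['max_length' => 254, 'regex' => '/^[^@\s]+@[^@\s]+\.[^@\s]+$/'],
];
[$clean, $unclean] = scrub(['age' => '42', 'email' => 'not-an-email'], $rules);
print_r($clean);
print_r($unclean);
```

Only fields named in `$rules` are ever copied, so unexpected request parameters never reach `$clean`. In real use `$input` would be `$_POST` or `$_GET`.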

Jimmy
Codingscape.com

Craig Thomson <cr***@spam.free> wrote in message news:<bp********************************@4ax.com>. ..
I was wondering what people do with text provided by the user in a
form. Some cleaning needs to be done at some stage if you are going to
be putting it in a database or displaying it etc. But when is the time
to do that?

Do you clean it as soon as you get it?
Do you pass around the original text and clean it when you use it?

What about magic slashes? You need to addslashes before using in a db
statement, but you need to strip them when displaying. When do you do
that?

TIA.

Craig

Jul 17 '05 #2

P: n/a
"Craig Thomson" <cr***@spam.free> wrote in message
news:bp********************************@4ax.com...
I was wondering what people do with text provided by the user in a
form. Some cleaning needs to be done at some stage if you are going to
be putting it in a database or displaying it etc. But when is the time
to do that?

Do you clean it as soon as you get it?
Do you pass around the original text and clean it when you use it?
I use the latter approach, since you can only tell whether something is
"clean" or not when it's used in a particular context. An example would be
text with unescaped single quotes.
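
To illustrate the point about context, here is a small sketch in which one raw string needs two different treatments. The variable names are made up, and `addslashes()` appears only because it is the function this thread talks about; a database driver's own escaping function or a prepared statement is the more usual choice for SQL.

```php
<?php
// The same raw input, as the user typed it.
$raw = "O'Reilly <b>books</b>";

// HTML context: neutralise markup and both kinds of quotes.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// SQL string-literal context: backslash-escape the quote.
$forSql = addslashes($raw);

echo $forHtml, "\n"; // O&#039;Reilly &lt;b&gt;books&lt;/b&gt;
echo $forSql, "\n";  // O\'Reilly <b>books</b>
```

Neither result is "clean" in the other context, which is why the raw value is the one worth passing around.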

A good rule to go by, I think, is "functions should always validate
parameters passed to them." For example, say I have the function
GetUser($user_id). Since an integer is expected, the function should either
fail immediately when a non-integer is passed or cast the parameter into an
int.
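
A minimal sketch of the fail-immediately variant of that rule, assuming a `GetUser()` whose lookup is stubbed with an in-memory array (the user data and the error message are invented for illustration):

```php
<?php
// Hypothetical GetUser(): the database lookup is replaced by a stub array.
function GetUser($user_id)
{
    // Accept ints and integer-looking strings such as "42"; reject the rest.
    $id = filter_var($user_id, FILTER_VALIDATE_INT);
    if ($id === false || $id < 1) {
        throw new InvalidArgumentException('user_id must be a positive integer');
    }

    $users = [1 => 'craig', 2 => 'chung']; // stand-in for a real query
    return $users[$id] ?? null;
}

echo GetUser('2'), "\n"; // chung

try {
    GetUser('1 OR 1=1');
} catch (InvalidArgumentException $e) {
    echo 'rejected: ', $e->getMessage(), "\n";
}
```

The alternative mentioned above, casting with `(int) $user_id`, never fails but silently turns `'1 OR 1=1'` into `1`, so which variant fits depends on whether a bad ID should be an error.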
What about magic slashes? You need to addslashes before using in a db
statement, but you need to strip them when displaying. When do you do
that?


Magic quotes, IMHO, is the dumbest feature of PHP. Turn it off if you can.
If not, use a statement in a header file to strip off slashes from all
incoming data ($_GET, $_POST), and then escape quotes manually.
Jul 17 '05 #4

P: n/a
On 16 Apr 2004 10:05:15 -0700, ja****@jamesj.zyx.net (Jimmy Jacobson)
wrote:
I wrote a "datascrubber"
class that I use on every page that accepts any kind of input either
from POST or GET variables. [...]I've encapsulated all this into an object for easy
reuse and I can provide that to you if you would like.


Thanks, I would love to see it!

Craig
Jul 17 '05 #5

P: n/a
On Fri, 16 Apr 2004 18:39:05 -0400, "Chung Leong"
<ch***********@hotmail.com> wrote:
What about magic slashes? You need to addslashes before using in a db
statement, but you need to strip them when displaying. When do you do
that?


Magic quotes, IMHO, is the dumbest feature of PHP. Turn it off if you can.
If not, use a statement in a header file to strip off slashes from all
incoming data ($_GET, $_POST), and then escape quotes manually.


What do you mean by putting a statement in a header file? Do you mean
turning it off using an option placed in a header file? Or do you mean
checking if it is on and stripping the slashes as you read the $_GET
and $_POST data?

Craig

Jul 17 '05 #6

P: n/a
"Craig Thomson" <cr***@spam.free> wrote in message
news:t0********************************@4ax.com...
On Fri, 16 Apr 2004 18:39:05 -0400, "Chung Leong"
<ch***********@hotmail.com> wrote:
What about magic slashes? You need to addslashes before using in a db
statement, but you need to strip them when displaying. When do you do
that?


Magic quotes, IMHO, is the dumbest feature of PHP. Turn it off if you can.
If not, use a statement in a header file to strip off slashes from all
incoming data ($_GET, $_POST), and then escape quotes manually.


What do you mean by putting a statement in a header file? Do you mean
turning it off using an option placed in a header file? Or do you mean
checking if it is on and stripping the slashes as you read the $_GET
and $_POST data?


The assumption is that you have a file which is included at the beginning of
every script. Global.php or something like that. In this file, you place the
slash stripping code, so that all your scripts will get data without
slashes.

Example:

if(get_magic_quotes_gpc()) {
    function __stripslashes (&$s) { $s = stripslashes($s); }

    array_walk($_POST, '__stripslashes');
    array_walk($_GET, '__stripslashes');
}

This is necessary if you can't change the setting in php.ini.
Jul 17 '05 #7

P: n/a
On Mon, 19 Apr 2004 18:31:13 -0400, "Chung Leong"
<ch***********@hotmail.com> wrote:
The assumption is that you have a file which is included at the beginning of
every script. Global.php or something like that. In this file, you place the
slash stripping code, so that all your scripts will get data without
slashes.

Example:

if(get_magic_quotes_gpc()) {
    function __stripslashes (&$s) { $s = stripslashes($s); }

    array_walk($_POST, '__stripslashes');
    array_walk($_GET, '__stripslashes');
}

This is necessary if you can't change the setting in php.ini.


Thanks Chuck.

I was thinking about this and wondered if it would be possible for the
GET or POST element to be an array itself? If so, wouldn't your user
function be better as:

function __stripslashes (&$s) {
    if ( is_array($s) ) { __stripslashes($s); }
    $s = stripslashes($s);
}

But I'm not sure under what circumstances it would be an array. A
multi select box may do it. What do you think?

Craig
Jul 17 '05 #8

P: n/a
Actually it should be:
function __stripslashes (&$s) {
    if ( is_array($s) ) { array_walk($s, '__stripslashes'); }
    else { $s = stripslashes($s); }
}
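
For comparison, the same recursion can be sketched without by-reference arguments, returning a stripped copy instead. The name `strip_slashes_deep()` and the sample data are hypothetical. Because it does not call `get_magic_quotes_gpc()`, which was removed in PHP 8.0, it can be tried on any version by feeding it already-escaped data by hand:

```php
<?php
// Hypothetical helper: recursively strip slashes and return a new value.
function strip_slashes_deep($value)
{
    return is_array($value)
        ? array_map('strip_slashes_deep', $value) // recurse into nested arrays
        : stripslashes((string) $value);          // plain value: strip it
}

// Hand-escaped input, shaped like a form with a multi-select named snacks[].
$escaped = [
    'name'   => "O\\'Reilly",
    'snacks' => ['option \\"1\\"', "option \\'2\\'"],
];
$plain = strip_slashes_deep($escaped);
print_r($plain);
```

With a single input array, `array_map()` preserves string keys, so the field names survive the round trip.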

I have a test case below.

Thanks, Chung, for the code.

Craig

PS: And Chung, sorry for calling you Chuck in a previous post. My
mistake.

-----------------------------------------------

<html><head><title> Test Forms </title></head><body>
<pre><?PHP print_r($_POST) ?></pre>
<HR>
<?PHP
if( get_magic_quotes_gpc() ) {
    function __stripslashes (&$s) {
        if ( is_array($s) ) {
            array_walk($s, '__stripslashes');
        } else {
            $s = stripslashes($s);
        }
    }

    array_walk($_POST, '__stripslashes');
}
?>

<pre><?PHP print_r($_POST) ?></pre>
<hr>

<form action="<?PHP echo $_SERVER['PHP_SELF']; ?>" method="post">
<select multiple name="snacks[]">
    <option value='option "1"'>Option one</option>
    <option value="option '2'">Option two</option>
    <option value='option "3"'>Option three</option>
    <option value="option '4'">Option four</option>
    <option value='option "5"'>Option five</option>
    <option value="option '6'">Option six</option>
</select>
<input type="submit" value="Submit" name="add">
</form>
</body>
</html>
Jul 17 '05 #9
