Validating input...

Hi...

What's the best method of validating input characters? I would like to
prevent users submitting exotic characters (such as those acquired on
Windows Systems by pressing ALT+[keypad number of your choice]) and thought
a way of doing this would be to compare the submitted strings with the array
keys returned by get_html_translation_table(HTML_ENTITIES), but padding this
array out with all the remaining normal keyboard characters.

But... am I reinventing the wheel? Surely there must be an existing function
along the lines of: valid_charset($str_blah, "iso-8859-01") or some such?!

Here's hoping...

Plankmeister.
Jul 17 '05 #1
5 Replies


I don't know what you mean by validating input characters, really. I mean,
whatever they type is going to get translated into *something* in your
charset. If you want to restrict the user to using a specified charset,
however, this can be done client-side as it's a part of the HTML 4.0
standard. See http://www.w3.org/TR/REC-html40/interact/forms.html
(specifically accept-charset). One thing I would caution you on, however, is
to watch your definition of valid. For example, if you ask for an address
and someone gives:

Königin-Elisabeth Straße 47a in 14059 Berlin (the address of my favorite
hotel), are you going to consider that invalid input?

Jul 17 '05 #2

"Agelmar" <if**********@comcast.net> wrote in message
news:bt************@ID-30799.news.uni-berlin.de...
> I don't know what you mean by validating input characters, really. I mean,

I didn't explain it very well... Apologies... What I mean by 'validating
input characters' is having a function {for instance
valid_charset($str_to_check, "iso-8859-01") } which returns true if all the
characters in the passed string are characters that appear in the indicated
character set, but false if it finds characters that are not valid for the
specified character set. This way, bizarre characters (such as that produced
by ALT + keypad 985) would be rejected...

> whatever they type is going to get translated into *something* in your
> charset. If you want to restrict the user to using a specified charset,
> however, this can be done client-side as it's a part of the HTML 4.0

Doing anything client-side is unreliable as a crafty user can make his own
form and submit it, thereby circumventing the validation. I do 100% of my
validation on the server side, though in most cases I also do client-side
validation so that your average user doesn't have to wait for the submission
to be processed before seeing where they went wrong.

> standard. See http://www.w3.org/TR/REC-html40/interact/forms.html
> (specifically accept-charset). One thing I would caution you on, however,
> is to watch your definition of valid. For example, if you ask for an address
> and someone gives:
>
> Königin-Elisabeth Straße 47a in 14059 Berlin (the address of my favorite
> Hotel), are you going to consider that invalid input?

Yeah... that would be valid because all those characters appear in
get_html_translation_table(HTML_ENTITIES).

Plankmeister
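
A minimal sketch of the kind of check described in this post, limited to ISO-8859-1. PHP has no built-in valid_charset(); the function name below is hypothetical, and the sketch assumes the submitted string arrives as UTF-8 (an assumption, since the pages in this thread were served as ISO-8859-1). It accepts only the printable ISO-8859-1 characters, U+0020 to U+007E and U+00A0 to U+00FF.

```php
<?php
// Hypothetical helper: true when every character of a UTF-8 string is a
// printable ISO-8859-1 character (U+0020-U+007E or U+00A0-U+00FF).
function valid_latin1_text(string $utf8): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match() returns false for malformed UTF-8, which also fails here.
    // \A and \z anchor to the true start and end (no trailing-newline slack).
    return preg_match('/\A[\x{20}-\x{7E}\x{A0}-\x{FF}]*\z/u', $utf8) === 1;
}

var_dump(valid_latin1_text("K\u{00F6}nigin-Elisabeth Stra\u{00DF}e 47a")); // bool(true)
var_dump(valid_latin1_text("koppa: \u{03D9}"));                            // bool(false)
```

The German address from earlier in the thread passes, while U+03D9 (decimal 985) does not.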
Jul 17 '05 #3

On Wed, 7 Jan 2004 23:57:47 +0100, "The Plankmeister"
<pl******************@hotmail.com> wrote:
> "Agelmar" <if**********@comcast.net> wrote in message
> news:bt************@ID-30799.news.uni-berlin.de...
>> I don't know what you mean by validating input characters, really. I mean,
>
> I didn't explain it very well... Apologies... What I mean by 'validating
> input characters' is having a function {for instance
> valid_charset($str_to_check, "iso-8859-01") } which returns true if all the
> characters in the passed string are characters that appear in the indicated
> character set, but false if it finds characters that are not valid for the
> specified character set. This way, bizarre characters (such as that produced
> by ALT + keypad 985) would be rejected...

Alt+985 gives me a + sign on Windows, so I'm not sure what you mean here.

All 256 byte values (0-255) are valid ISO-8859-1, although there are two
ranges of control characters. If you try to copy and paste a Unicode
character in, e.g. Chinese characters, and you've got the right character set
headers, then IE and Mozilla (at least) send the HTML entity and NOT the raw
Unicode character. So you end up with something like &#38995;, which is all
valid ISO-8859-1.

>> whatever they type is going to get translated into *something* in your
>> charset. If you want to restrict the user to using a specified charset,
>> however, this can be done client-side as it's a part of the HTML 4.0
>
> Doing anything client-side is unreliable as a crafty user can make his own
> form and submit it, thereby circumventing the validation. I do 100% of my
> validation on the server side, though in most cases I also do client-side
> validation so that your average user doesn't have to wait for the submission
> to be processed before seeing where they went wrong.

I suppose you'd want to avoid 0-31 and 127-159 in ISO-8859-1/15 to avoid the
control characters (is this what you want?), but all 256 values _are_ valid.
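
The two control ranges mentioned here translate directly into a byte-level check. A minimal sketch, assuming the string holds raw ISO-8859-1 bytes; the function name is hypothetical. Note that it also rejects tab, carriage return and newline, since they fall within 0-31, so a multi-line field would need those bytes allowed explicitly.

```php
<?php
// Hypothetical helper: true when a raw ISO-8859-1 byte string contains no
// bytes from the control ranges 0-31 (0x00-0x1F) and 127-159 (0x7F-0x9F).
function has_no_control_bytes(string $latin1): bool
{
    // No /u modifier, so the pattern is applied byte by byte.
    return preg_match('/[\x00-\x1F\x7F-\x9F]/', $latin1) === 0;
}

var_dump(has_no_control_bytes("Stra\xDFe 47a")); // bool(true)
var_dump(has_no_control_bytes("bad\x07bell"));   // bool(false)
```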

--
Andy Hassall (an**@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk)
Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)
Jul 17 '05 #4

Use a regular expression:

$s = "This is a test";

// Accept only printable ASCII, 0x20 (space) to 0x7E (~).
// 0x7F is the DEL control character, so it is excluded.
if (preg_match("/^[\\x20-\\x7E]*$/", $s)) {
    echo "Hey!";
}

If the user types in characters that fall outside of ISO-8859-1 (and the
page is set to use that encoding), I think the browser would replace them
with HTML entities (e.g. &#985;). If you don't want these then you'd have to
check for them explicitly.
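
One way to check for such references explicitly is to extract each one and compare its numeric value. A minimal sketch; the function name is hypothetical, and accepting the hexadecimal form (&#x3D9;) as well as the decimal form is an assumption about what a hand-crafted form submission might contain.

```php
<?php
// Hypothetical helper: returns the code points of all numeric character
// references (decimal or hexadecimal) whose value lies above 255, i.e.
// references to characters that ISO-8859-1 cannot represent directly.
function find_non_latin1_references(string $input): array
{
    $found = [];
    preg_match_all('/&#(?:(\d+)|[xX]([0-9A-Fa-f]+));/', $input, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Group 2 is non-empty only for hexadecimal references.
        $code = ($m[2] ?? '') !== '' ? hexdec($m[2]) : (int) $m[1];
        if ($code > 255) {
            $found[] = $code;
        }
    }
    return $found;
}

print_r(find_non_latin1_references("caf&#233; &#985; &#x3D9;"));
```

An empty result means the input contains no references above 255; a reference such as &#233; is left alone because it names an ISO-8859-1 character.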

Jul 17 '05 #5

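Ok... I think this should work (it certainly seems to, and I've tested it
for an inordinately long time...), but is there anything I'm overlooking?

function valid_string($str_to_check)
{
    // [256-999] is a character class, not a numeric range, so instead match
    // any decimal reference (&#NNN;) whose value is 256 or more.
    $pattern = '/&#0*(?:25[6-9]|2[6-9]\d|[3-9]\d\d|[1-9]\d{3,});/';
    return !preg_match($pattern, $str_to_check);
}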

Jul 17 '05 #6
