Regular Expression for validating a url field

What is wrong with that?

regex =
/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/

if(field.hpage.value != regex.test(field.hpage.value)){
alert("Bad Homepage")
field.hpage.focus()
field.hpage.select()
return false
}
return true
}

The regex should be all right... also the name for the fields are 100%
okay.. if I change the != for ==, it will go through... which is
completely wrong =o(

Thanks guys.. cheers
Tizzah
tizzah.co.nr

Mar 1 '06 #1
Tizzah wrote:
What is wrong with that?

regex =
/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]
{1,5})?\/.*)?$/

if(field.hpage.value != regex.test(field.hpage.value)){
alert("Bad Homepage")
field.hpage.focus()
field.hpage.select()
return false
}
return true
}
(Do not use the tab character for indentation, at least in postings.
Use multiples of two or four space characters instead.)

Plenty of things are wrong with this. From top to bottom, and left to
right:

- `(https?)' is equivalent to `(http|https)' and more efficient than
the latter.
- Valid domain names may contain uppercase ASCII characters.
- Valid domain names may contain more than one consecutive hyphen (`-'),
ref. IDN, and they may begin or end with a hyphen or a dot.
- The literal hyphen does not need to be escaped at the beginning
or the end of a character class (`[...]').
- The literal dot (`.') does not need to be escaped in a character
class.
- The {1} quantifier is always redundant.
- ([0-9]{1,5})? is equivalent to \d{0,5} (not considering backreferences).
- Valid domain names may contain more than 5 consecutive decimal digits.
- Valid top-level domain names must not contain any decimal digit.
- Valid top-level domain names are not restricted to five letters, and
the .test TLD specified in RFC2606 for testing purposes has only four
letters.
- A URI does not need to include the path delimiter `/' if there are no
further path components.
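
For illustration, here is a small sketch that applies only the purely syntactic simplifications from the list above (`https?', no `{1}', no needless escapes, `\d{0,5}') and checks that the result behaves like the original on a few strings. The sample strings and the variable names are assumptions made for this example. The semantic problems (lowercase only, TLD limited to 2..5 letters, and so on) are deliberately left in, so this is not a recommended URL pattern.

```javascript
// Original pattern from the question, unchanged.
const original =
  /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/;

// Same structure, with only the syntactic simplifications applied:
//   (http|https)  -> https?
//   [\-\.]{1}     -> [-.]
//   ([0-9]{1,5})? -> \d{0,5}
// Groups are non-capturing because no backreferences are used.
const simplified =
  /^https?:\/\/[a-z0-9]+(?:[-.][a-z0-9]+)*\.[a-z]{2,5}(?:\d{0,5}\/.*)?$/;

// Hypothetical sample inputs, chosen only for this comparison.
const samples = ["http://x.org", "https://example.com/path", "http://f/", "HTTP://X.ORG", ""];

for (const s of samples) {
  console.log(JSON.stringify(s), original.test(s), simplified.test(s));
}
```

Both patterns should agree on every sample; note that the uppercase variant is rejected by both, which is one of the flaws listed above.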
The regex should be all right...
For appropriate values of "all right".
also the name for the fields are 100% okay.. if I change the != for ==,
it will go through... which is completely wrong =o(


Your code simply does not make sense. RegExp.prototype.test() returns a
boolean value, either `true' or `false'. You are comparing that value
against a supposed string value, and since you do not perform a strict
comparison (`!==' or `==='), you are forcing implicit type conversion on
both operands. (Skip the following section if you are not interested in
the inner workings of the language.)

,-[ECMAScript 3 Final]
|
| 11.9.2 The Does-not-equals Operator ( != )
|
| The production
| EqualityExpression : EqualityExpression != RelationalExpression
| is evaluated as follows:
|
| 1. Evaluate EqualityExpression.
| 2. Call GetValue(Result(1)).
| 3. Evaluate RelationalExpression.
| 4. Call GetValue(Result(3)).
| 5. Perform the comparison Result(4) == Result(2). (Section 11.9.3.)
| [...]
| 11.9.3 The Abstract Equality Comparison Algorithm
|
| The comparison x == y, where x and y are values, produces true or false.
| Such a comparison is performed as follows:
|
| 1. If Type(x) is different from Type(y), go to step 14.

Type(x) = String, Type(y) = Boolean. Condition applies, go to step 14.

| [...]
| 14. If x is null and y is undefined, return true.
| 15. If x is undefined and y is null, return true.
| 16. If Type(x) is Number and Type(y) is String,
| return the result of the comparison x == ToNumber(y).
| 17. If Type(x) is String and Type(y) is Number,
| return the result of the comparison ToNumber(x) == y.
| 18. If Type(x) is Boolean, return the result of the comparison
| ToNumber(x) == y.

None of the above applies, continue.

| 19. If Type(y) is Boolean, return the result of the comparison
| x == ToNumber(y).

Condition applies. Return the result of x == ToNumber(y).

x_1 := x
y_1 := y

| The comparison x == y, where x and y are values, produces true or
| false.

x := x_1
y := ToNumber(y_1).

| 1. If Type(x) is different from Type(y), go to step 14.

Case 1: y_1 = false (no match). ToNumber(false) = 0 --> y := 0.
Case 2: y_1 = true (match). ToNumber(true) = 1 --> y := 1.

In both cases:

Type(x) = String, Type(y) = Number. Condition applies, go to step 14.

| 14. If x is null and y is undefined, return true.
| 15. If x is undefined and y is null, return true.
| 16. If Type(x) is Number and Type(y) is String,
| return the result of the comparison x == ToNumber(y).

None of the above applies, continue.

| 17. If Type(x) is String and Type(y) is Number,
| return the result of the comparison ToNumber(x) == y.

x_2 := x
y_2 := y

| The comparison x == y, where x and y are values, produces true or
| false.

x := ToNumber(x_2)
y := y_2.

| 1. If Type(x) is different from Type(y), go to step 14.

Case 1: x_2 = "" (empty string). ToNumber("") = 0 --> x := 0.

Case 2: x_2 = "N" (not empty). In that case, ToNumber("N") always
returns a number value. If "N" is not the string representation
of a numeric literal, that value is NaN.

In both cases:

Type(x) = Number, Type(y) = Number. Condition does not apply,
continue.
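
The conversions used so far can be reproduced in a script engine. This is only a sketch: it assumes that calling Number() as a function is an acceptable stand-in for the internal ToNumber operation (which it is for boolean and string values), and the input list is made up for the example.

```javascript
// Number(v) applies the ToNumber conversion to booleans and strings.
const inputs = [false, true, "", "000", "0x0", "0001", "0x1", "http://x.org"];
const converted = inputs.map((v) => Number(v));

inputs.forEach((v, i) => {
  console.log(JSON.stringify(v), "->", converted[i]);
});
```

The last entry is the interesting one: anything that looks like a URI converts to NaN.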

| 2. If Type(x) is Undefined, return true.
| 3. If Type(x) is Null, return true.
| 4. If Type(x) is not Number, go to step 11.
| 5. If x is NaN, return false.

This condition applies if x_1 is not a string representation
of a numeric literal, read: could be a URI. In that case,
`false' is returned to the calling algorithm, so ultimately
`false' is returned to the algorithm of `!=', its Result(5)
being `false':

| 5. Perform the comparison Result(4) == Result(2). (Section 11.9.3.)

Result(5) := false

| 6. If Result(5) is true, return false. Otherwise, return true.

Therefore, `true' is returned then!
___________

If the condition ("x is NaN") does not apply, i.e. x_1 can be
interpreted as a number (read: is definitely not a URI), continue.

| 6. If y is NaN, return false.

This never applies here; always continue.

| 7. If x is the same number value as y, return true.

Case 1: x = y. Applies if

- x_1 (being the value of field.hpage.value) is the empty string,
because "" is converted to 0, and there can be no match for
/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}
(([0-9]{1,5})?\/.*)?$/ in "", so `false' [being the result of
regex.test(field.hpage.value)] is converted to 0. x = y = 0.

- x_1 is "0...0" or "0x0...0", and there is no match, because x_1
is converted to 0, and `false' is converted to 0. x = y = 0.

- x_1 is "0...01" or "0x0...1", and there is a match, because
x_1 is converted to 1, and `true' is converted to 1. x = y = 1.
Since the Regular Expression never matches "0...01" or "0x0...1",
this sub-case never applies.

`true' is returned here to the calling algorithm, and to its calling
algorithm, so ultimately `true' is returned to the algorithm of
`!=', its Result(5) being `true':

| 5. Perform the comparison Result(4) == Result(2). (Section 11.9.3.)

Result(5) := true

| 6. If Result(5) is true, return false. Otherwise, return true.

Therefore, `false' is returned then!
___________

Case 2: x != y. Applies if

- x_1 can be interpreted as a number but is different from "",
"0...0" and "0x0...0", and there is no match, because x_1 is then
converted to a value n != 0, and `false' is converted to 0.
x = n != 0 = y.

The condition

| 7. If x is the same number value as y, return true.

would not apply in case 2, therefore we continue.

| 8. If x is +0 and y is -0, return true.
| 9. If x is -0 and y is +0, return true.

None of the above applies, continue.

| 10. Return false.

`false' is returned here to the calling algorithm, and to its calling
algorithm, so ultimately `false' is returned to the algorithm of `!=',
its Result(5) being `false':

| 5. Perform the comparison Result(4) == Result(2). (Section 11.9.3.)

Result(5) := false

| 6. If Result(5) is true, return false. Otherwise, return true.

Therefore, it returns `true' then!
_______________________________________________________________________

The outcome of the algorithm for the `==' operator is (of course) the
boolean opposite of the algorithm result for the `!=' operator, and
vice-versa.
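
The path through the algorithm walked through above can be condensed into a few lines. This is a sketch under assumptions, not the specification's wording: stringEqualsBoolean is a name invented for this example, it models only the String-versus-Boolean case, and Number() stands in for ToNumber.

```javascript
// Models only the String == Boolean path of ES3 section 11.9.3.
function stringEqualsBoolean(x, y) {
  if (typeof x !== "string" || typeof y !== "boolean") {
    throw new TypeError("this sketch only covers String == Boolean");
  }
  const yNum = Number(y);               // step 19: x == ToNumber(y)
  const xNum = Number(x);               // step 17: ToNumber(x) == y
  if (Number.isNaN(xNum)) return false; // step 5: x is NaN
  return xNum === yNum;                 // steps 7-10 (=== treats +0 and -0 as equal)
}

// Compare the model with the built-in operator on a few made-up inputs.
const strings = ["", "000", "0x0", "0001", "0x1", "1", "http://f/", "http://x.org"];
for (const s of strings) {
  for (const b of [false, true]) {
    console.log(JSON.stringify(s), "==", b, "->", stringEqualsBoolean(s, b), s == b);
  }
}
```

The two result columns should be identical for every row.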

So if the control's value is "", the (equals-)condition

field.hpage.value == regex.test(field.hpage.value)

is true:

0. "" == false ("no match")
1. "" == ToNumber(false)
2. "" == 0
3. ToNumber("") == 0
4. 0 == 0
5. true

If the control's value is "0...0" or "0x0...0", the condition is true:

0. "0...0" == false ("no match")
1. "0...0" == ToNumber(false)
2. "0...0" == 0
3. ToNumber("0...0") == 0
4. 0 == 0
5. true

If the control's value is "0...1" or "0x0...1", or another value that
can be interpreted as a number different from 0, the condition is false:

0. "0...1" == false ("no match")
1. "0...1" == ToNumber(false)
2. "0...1" == 0
3. ToNumber("0...1") == 0
4. 1 == 0
5. false

If the control's value is "http://f/" (not a URL, according to your
standards), the condition is false.

0. "http://f/" == false ("no match")
1. "http://f/" == ToNumber(false)
2. "http://f/" == 0
3. ToNumber("http://f/") == 0
4. NaN == 0
5. false (according to 11.9.3, step 5)

If the control's value is "http://x.org" (a URL, according to your
standards), the condition is _false_:

0. "http://x.org" == true ("match")
1. "http://x.org" == ToNumber(true)
2. "http://x.org" == 1
3. ToNumber("http://x.org") == 1
4. NaN == 1
5. false
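
The traces above are for `=='; the condition in the question uses `!=', so every result is inverted. The following sketch shows which inputs the original condition would reject. buggyRejects is a name made up for this example, and the sample values are taken from the traces.

```javascript
const regex =
  /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/;

// The condition from the question: true means "Bad Homepage" is shown.
function buggyRejects(value) {
  return value != regex.test(value);
}

const samples = ["", "000", "0001", "http://f/", "http://x.org"];
for (const value of samples) {
  console.log(
    JSON.stringify(value),
    "match:", regex.test(value),
    "rejected:", buggyRejects(value)
  );
}
```

With `!=', the empty string and "000" slip through unrejected, while everything that looks like a URI is rejected whether or not the expression matches it.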
You are looking for

if (!regex.test(field.hpage.value))
{
alert("Bad Homepage");
// ...
return false;
}
return true;

and probably a Regular Expression for matching URLs that makes sense,
see RFC3986.
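
A complete version of that check might look as follows. It is only a sketch: checkHomepage and fakeField are names invented here, `notify' is a hypothetical parameter standing in for alert() so that the code also runs outside a browser, and the pattern is still the one from the question, flaws included.

```javascript
const regex =
  /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/;

// `field` is the form object from the question; `notify` replaces alert().
function checkHomepage(field, notify) {
  const control = field.hpage;
  if (!regex.test(control.value)) {
    notify("Bad Homepage");
    control.focus();
    control.select();
    return false;
  }
  return true;
}

// Stand-in for a real form control, only for demonstration.
function fakeField(value) {
  return { hpage: { value, focus() {}, select() {} } };
}

console.log(checkHomepage(fakeField("http://x.org"), console.log)); // true
console.log(checkHomepage(fakeField(""), console.log)); // logs "Bad Homepage", then false
```

In a page, one would pass the real form object and `alert' (or something less intrusive) instead of the stand-ins.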
PointedEars
Mar 1 '06 #2
Tizzah wrote:
What is wrong with that?

regex =
/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$/

if(field.hpage.value != regex.test(field.hpage.value)){
alert("Bad Homepage")
field.hpage.focus()
field.hpage.select()
return false
}
return true
}

The regex should be all right... also the name for the fields are 100%
okay.. if I change the != for ==, it will go through... which is
completely wrong =o(


The problem with using a regular expression to check URLs (or any other
address such as an e-mail address) is that even if the string fits
within the range of valid addresses, you don't know if it's actually valid.

The only real way to validate a URL is to test it - ping comes to mind.
Let the user enter whatever they want at the client. When they send
the data to the server, validate it there and if the URL is bogus, tell
the user in the subsequent page.

If they want to persist with an invalid address, you can either assume
the site isn't up but might be later, or that the user doesn't want to
enter a valid address - how you deal with that is up to you.
--
Rob
Mar 2 '06 #3
RobG wrote:
The problem with using a regular expression to check URLs (or any other
address such as an e-mail address) is that even if the string fits
within the range of valid addresses, you don't know if it's actually
valid.
Exactly.
The only real way to validate a URL is to test it - ping comes to mind.


Only if you do not think it through. For example, have you ever tried to
ping(1) microsoft.com or one of their subdomains? ;-) (They, among others,
are DROPping or filtering ICMP requests, which is considered antisocial.)

What comes to /my/ mind here is of course to use DNS directly, therefore
host(1) or nslookup(1) (from the BIND9 host utilities), where the latter
is deprecated.
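
If the server side happens to be written in JavaScript, the same DNS check can be done without shelling out. The sketch below assumes Node.js (nothing in this thread says which server environment is used); hostResolves is a name invented for this example, and the `lookup' parameter exists only so that the function can be tried without network access.

```javascript
const dns = require("dns");

// Resolves to true if the host name of the URL can be resolved, else false.
// `lookup` defaults to Node's dns.promises.lookup and can be replaced for testing.
async function hostResolves(urlString, lookup = dns.promises.lookup) {
  let hostname;
  try {
    hostname = new URL(urlString).hostname;
  } catch (err) {
    return false; // not parseable as an absolute URL
  }
  if (hostname === "") {
    return false; // e.g. mailto: URLs have no host component
  }
  try {
    await lookup(hostname);
    return true;
  } catch (err) {
    return false; // resolution failed
  }
}
```

Usage would be along the lines of hostResolves("http://x.org/").then(console.log). A successful lookup still says nothing about whether an HTTP server is answering on that host.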
PointedEars
Mar 2 '06 #4
Thomas 'PointedEars' Lahn wrote:
RobG wrote:

The problem with using a regular expression to check URLs (or any other
address such as an e-mail address) is that even if the string fits
within the range of valid addresses, you don't know if it's actually
valid.

Exactly.

The only real way to validate a URL is to test it - ping comes to mind.

Only if you do not think it through. For example, have you ever tried to
ping(1) microsoft.com or one of their subdomains? ;-) (They, among others,
are DROPping or filtering ICMP requests, which is considered antisocial.)


Ping returns resolved domain names if they can be found (e.g. ping
www.microsoft.com and the resolved address is returned, even though the
request will time out).

I don't presume it is the best strategy, just one that came quickly to
mind. Without knowing what the OP's criteria are for a valid address,
all we can do is toss up a few possibilities.

[...]
--
Rob
Mar 2 '06 #5
RobG <rg***@iinet.net.au> writes:
Ping returns resolved domain names if they can be found (e.g. ping
www.microsoft.com and the resolved address is returned, even though
the request will time out).


If that's what you want, you could just use nslookup instead.

Neither pinging nor DNS resolution will tell you if a web server is
running, though.
/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
Mar 2 '06 #6
Lasse Reichstein Nielsen wrote:
RobG <rg***@iinet.net.au> writes:
Ping returns resolved domain names if they can be found (e.g. ping
www.microsoft.com and the resolved address is returned, even though
the request will time out).


If that's what you want, you could just use nslookup instead.

Neither pinging nor DNS resolution will tell you if a web server is
running, though.


True, to test this, one would have to make an HTTP request to a standard HTTP
port (using nc/netcat(1), HEAD(1) from the libwww-perl package, server-side
XMLHttpRequest and the like). Which would still include the possibility of
a Web server that was temporarily down for maintenance being recognized as
not existing.
PointedEars
Mar 3 '06 #7
JRS: In article <11*********************@i40g2000cwc.googlegroups.com>,
dated Tue, 28 Feb 2006 20:30:58 remote, seen in
news:comp.lang.javascript, Tizzah <ti****@gmail.com> posted :
What is wrong with that?

One can, in principle, validate the full grammar of a URL against all
current applicable RFCs - but one then needs to watch for new RFCs which
may change the situation - and it's possible that there are sites
offering URLs that work but do not comply perfectly with the grammar.

AFAICS, only two forms of test are reasonable.

One can attempt to access the alleged URL in some manner, and see what
reply that gives; one learns something about the validity of that URL at
that instant.

Or one can look at a string to see whether there's a reasonable chance
of it being a valid URL or whether it cannot be but may be some other
form of data; that's easier of course if context permits testing for a
specific type of protocol.

Example : http://xxx and https://xxx and mailto:xxx can certainly start
a URL; faxto:xxx might well indicate a new protocol; c:xxx and c:/xxx
almost certainly indicate a mistake.

It's reasonable to check that mailto: is followed by a match for
.+@.+\..+ and that http:// is followed by .+\..*/.+ though, maybe with
another \..+ .
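
Such a plausibility test could be sketched as below. The function name classify and the three result strings are inventions for this example; the two patterns are the ones just given, taken literally, which means that an http URL without anything after the host (such as http://x.org) counts as implausible here.

```javascript
// Returns "plausible", "implausible" or "unknown-scheme".
function classify(s) {
  const m = /^([a-z][a-z0-9+.-]*):(.*)$/i.exec(s);
  if (!m) return "implausible"; // no scheme at all
  const scheme = m[1].toLowerCase();
  const rest = m[2];
  if (scheme.length === 1) return "implausible"; // c:xxx is probably a drive letter
  if (scheme === "mailto") {
    return /^.+@.+\..+$/.test(rest) ? "plausible" : "implausible";
  }
  if (scheme === "http" || scheme === "https") {
    return /^\/\/.+\..*\/.+$/.test(rest) ? "plausible" : "implausible";
  }
  return "unknown-scheme"; // e.g. faxto:xxx might be a new protocol
}

const examples = [
  "http://www.merlyn.demon.co.uk/astro.htm",
  "mailto:a@b.c",
  "faxto:xxx",
  "c:/xxx",
  "http://x.org",
];
for (const e of examples) {
  console.log(JSON.stringify(e), "->", classify(e));
}
```

What to do with "unknown-scheme" depends on the context; accepting it with a warning seems closest to the intent described above.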
For the OP's purpose, there's no need to validate the alleged URL
locally. A URL-fetching agent needs to survive being offered any string
whatsoever; and if it works, the URL was right. No validator can
possibly check whether a URL is actually right; for example,
<URL:http://www.merlyn.demon.co.uk/astro.htm> is grammatically valid,
and can be fetched. However, it can hardly be what the summoner
actually will want.

--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
<URL:http://www.jibbering.com/faq/> JL/RC: FAQ of news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
Mar 3 '06 #8
