Thanks for your response.
I was talking about the second point.
However, I am already done with the part you mentioned in your reply, i.e.
(http:// | https:// )[ www ]{.}(com|edu|co|..) etc.
The real problem starts after that. For example,
http://www.abc.com/xyz/this.htm
is a valid URL, and so is
http://www.abc123.com/xyz/this.htm?abc=xy_z'dfh&this=that123
Breaking the URL into its constituent parts is a good idea, but it may not be
the optimal one. I would prefer a solution based on a regular expression, or a
hybrid one.
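
To illustrate the kind of regular-expression solution being asked for, here is a minimal sketch in Python. The pattern and the name looks_like_url are assumptions made for illustration only. It covers the scheme, the host, and an optional port, path, query string and fragment. It is not a complete URL grammar and does not check that the site exists.

```python
import re

# Hypothetical pattern: the character classes are illustrative assumptions,
# not a full RFC 3986 grammar.
URL_RE = re.compile(
    r"https?://"                                        # scheme
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+"  # dot-terminated host labels
    r"[A-Za-z]{2,}"                                     # last label (top-level domain)
    r"(?::\d{1,5})?"                                    # optional port
    r"(?:/[^\s?#]*)?"                                   # optional path
    r"(?:\?[^\s#]*)?"                                   # optional query string
    r"(?:#\S*)?"                                        # optional fragment
)


def looks_like_url(text: str) -> bool:
    """Return True if the whole string matches the sketch pattern."""
    return URL_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in (
        "http://www.abc.com/xyz/this.htm",
        "http://www.abc123.com/xyz/this.htm?abc=xy_z'dfh&this=that123",
        "ftp://www.abc.com",
    ):
        print(candidate, looks_like_url(candidate))
```

The path and query parts deliberately accept almost any non-whitespace character, which is why the apostrophe in the second example is allowed. A stricter version would narrow those character classes.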
Thanks again.
--
Regards,
Shahzad Atta
"Mark Rae" <mark@mark-N-O-S-P-A-M-rae.co.uk> wrote in message
news:OiMesstoEHA.3876@TK2MSFTNGP15.phx.gbl...
> "Shahzad Atta" <shahzad.atta@eintelligencesoft.com> wrote in message
> news:u9SRuPtoEHA.1576@TK2MSFTNGP12.phx.gbl...
>
> > I need to write a URL validation function. This function needs to be
> > robust and correct. I tried different solutions using regular expressions
> > but cannot find a real one. I tried to write my own, but it is still not
> > perfect. Any help would be appreciated.
>
> Do you mean that
>
> a) the URL should point to a working site, e.g. http://www.microsoft.com,
> or
>
> b) that it just has to "look" like a real URL, e.g.
> http://www.mysitewhichdoesnotexist.co.zz
>
> If a), then use HttpWebRequest and HttpWebResponse to see if you can
> browse to it and get a valid response back.
>
> If b), you'll need to break it up into its constituent parts and validate
> each individually e.g.
>
> Starts with http:// or https://
>
> Site >= 3 characters long and <= 64 characters long, doesn't start with a
> number, doesn't contain any of the invalid characters (e.g. "/" etc)
>
> Extension is one of the valid top-level domains e.g. .com, .net, .info,
> .gov, .museum etc or is one of the valid country identifiers e.g. .co.uk,
> .ca etc
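
The part-by-part checks in the quoted reply could be sketched roughly as follows, here in Python using the standard urllib.parse module to split the URL. The names KNOWN_SUFFIXES, SITE_RE and validate_parts are hypothetical, the suffix list is only the handful of examples mentioned in the reply, and stripping a leading "www." before checking the site name is an assumption.

```python
import re
from urllib.parse import urlsplit

# Illustrative subset only (assumption): a real check needs the full, current list.
KNOWN_SUFFIXES = (".com", ".net", ".info", ".gov", ".museum", ".co.uk", ".ca")

# Site name: 3 to 64 characters, letters/digits/hyphens, not starting with a digit.
SITE_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{2,63}")


def validate_parts(url: str) -> bool:
    """Check scheme, site name and extension separately."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # urlsplit rejects some malformed hosts outright

    # Starts with http:// or https://
    if parts.scheme not in ("http", "https"):
        return False

    host = (parts.hostname or "").lower()

    # Extension must be one of the known suffixes; try the longest first
    # so that ".co.uk" is preferred over any shorter suffix.
    suffix = next(
        (s for s in sorted(KNOWN_SUFFIXES, key=len, reverse=True) if host.endswith(s)),
        None,
    )
    if suffix is None:
        return False

    # Assumption: an optional leading "www." is not part of the site name.
    name = host[: -len(suffix)]
    if name.startswith("www."):
        name = name[4:]

    return SITE_RE.fullmatch(name) is not None
```

This sketch rejects hosts with extra subdomains, because the site-name pattern does not allow dots. The path and query string are not validated at all here, so a hybrid approach would add a pattern for those parts on top.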