I'm not developing web crawlers, but a quick thought of mine is:
string link = "../../wohoo.asp";
string thisPageURL = "http://www.xyz.com/wohoo.asp";
// split on ../ (verbatim string, so the regex engine sees \x2E as a literal dot)
string[] linkParts = System.Text.RegularExpressions.Regex.Split(link,
    @"\x2E\x2E/");
// split the page URL on /
string[] URLParts = System.Text.RegularExpressions.Regex.Split(thisPageURL,
    "/");
linkParts.Length - 1 is now the number of "../" steps, i.e. how many
directory levels to go up, and the last element of linkParts is the wanted
page. The URL of the new page is then concatenated from the URLParts array,
excluding its last linkParts.Length elements (the current page's file name
plus one directory per "../"), followed by the last element of linkParts.
Just a quick shot at a solution...
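A minimal sketch of the steps above, not a definitive implementation. The
class and method names are hypothetical, and the base page
"http://www.xyz.com/a/b/page.asp" is an assumption made for illustration
(it is not from the original post). It only handles absolute http URLs and
links made of leading "../" steps plus a file name. For comparison it also
calls the .NET constructor Uri(Uri baseUri, string relativeUri), which
resolves relative links without manual splitting.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RelativeLinks
{
    // Hypothetical helper following the split-and-concatenate idea.
    // Assumes pageUrl is absolute ("http://host/...") and link consists
    // only of leading "../" steps followed by a file name.
    public static string ResolveBySplitting(string pageUrl, string link)
    {
        string[] linkParts = Regex.Split(link, @"\x2E\x2E/"); // split on ../
        string[] urlParts = Regex.Split(pageUrl, "/");        // split on /

        // Drop the page's own file name plus one directory per "../".
        int keep = urlParts.Length - linkParts.Length;

        // Never climb above the host: "http:", "" and the host name
        // are the first three elements of urlParts.
        keep = Math.Max(keep, 3);

        return string.Join("/", urlParts, 0, keep)
            + "/" + linkParts[linkParts.Length - 1];
    }

    // Alternative using the framework's own relative URI resolution.
    public static string ResolveWithUri(string pageUrl, string link)
    {
        return new Uri(new Uri(pageUrl), link).AbsoluteUri;
    }

    public static void Main()
    {
        string page = "http://www.xyz.com/a/b/page.asp"; // assumed example page
        string link = "../../wohoo.asp";

        Console.WriteLine(ResolveBySplitting(page, link)); // http://www.xyz.com/wohoo.asp
        Console.WriteLine(ResolveWithUri(page, link));     // http://www.xyz.com/wohoo.asp
    }
}
```

The Uri version is probably the safer choice for a crawler, since it should
also cope with forms the manual split ignores, such as "./page.asp",
"/root.asp" or "sub/../page.asp".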
/mortb
"ask josephsen" <jaj(((a)))oticon.dk> wrote in message
news:4090c8a4$0$1118$4d4eb98e@news.dk.uu.net...
> Hi NG
>
> I'm making a program to crawl the internet. It works by retrieving all links
> in a page, downloading the page of each link and again retrieving all the
> links. (If there is better ways I'd like to hear)
>
> My problem is relative links (like "../../wohoo.asp"). What is the smartest
> way to get the full url (http://www.xyz.com/wohoo.asp)? Do I have to parse
> the relative link in relation to the url where the relative link was found
> and then concatenate it? Does anyone know how other search-engines/crawlers
> walk the net?
>
> Thanks :)
>
> ./ask