On Thu, 27 May 2004 13:11:59 +0200, Rico Huijbers
<E.A.M.Huijbers@REMOVEstudent.tue.nl> wrote:
[color=blue]
>Shaoyong Wang wrote:
>[color=green]
>> Dear All,
>>
>> I want to write some simple PHP code to check whether each URL in a given
>> list is broken. The URLs come in various formats, for example,
>>
>> http://www.afro.com/history/history.html
>>
>> http://www.worldhistorycompass.com/index.htm
>>
>> http://www.afup.org/article.php3?id_article=242
>>
>> fsockopen(...) only accepts a bare hostname (like www.yahoo.com), so I
>> simply chose fopen(url, 'r') to do the test.
>>
>> This method seems to work well for the majority of my URLs. However,
>> the following site seems to cause a problem:
>>
>> $site = fopen("http://www.marketingpower.com/live/topics12.php",'r');
>>
>> My browser works very hard on this link and, after a long time, stops
>> without continuing to the next command. (I am using Netscape on
>> Linux.)
>>
>> Initially I thought fopen might not like URLs of the form
>> http://xxxx/xxx.php. However, links like
>>
>> http://www.afup.org/article.php3?id_article=242
>>
>> work just fine.
>>
>> So I am confused. Can anybody give me a hint on this?
>>
>> Thanks.[/color]
>
>I've tried to use nc to open the site you mentioned. It seems it does a
>whole lot of redirecting and setting cookies. Might be that's what's
>giving the URL wrapper trouble opening the page properly.
>
>If that's the case, you will have no other choice than to write your own
>HTTP request handler, which handles redirects and cookies. Which isn't
>too hard, but it's a lot of work :).[/color]
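Couldn't you just do something like this?

$hostname = preg_replace('/http:\/\/([^\/]*).*/', '$1', $url);
// The 10-second connect timeout keeps a dead host from stalling the script.
$conn_id = fsockopen('tcp://' . gethostbyname($hostname), 80, $errno, $errstr, 10);
// "Connection: close" makes the server close the socket after replying;
// without it an HTTP/1.1 connection stays open and a read-to-EOF blocks.
fputs($conn_id, "GET $url HTTP/1.1\r\nHost: $hostname\r\nConnection: close\r\n\r\n");

Here $url is the URL in question; after that, you read the response from
$conn_id and do whatever you would normally do with it.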
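
To flesh that idea out a little, here is a minimal sketch of a link checker built on the same raw-socket approach. It assumes a reasonably modern PHP (7.1 or later) and plain http:// URLs. The function names split_url, status_code and check_url are made up for illustration, and the 10-second timeout is an arbitrary choice. It sends HEAD rather than GET so that only the headers come back, and it does not follow redirects: a 3xx status is simply returned, so the caller can decide what to do with it.

```php
<?php
// Hypothetical helper: split a URL into host, port, and request path.
function split_url(string $url): array
{
    $parts = parse_url($url) ?: [];
    $path = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query'];
    }
    return [$parts['host'] ?? '', $parts['port'] ?? 80, $path];
}

// Hypothetical helper: pull the status code out of a line such as
// "HTTP/1.1 200 OK". Returns 0 if the line is not an HTTP status line.
function status_code(string $statusLine): int
{
    return preg_match('#^HTTP/\d\.\d\s+(\d{3})#', $statusLine, $m) ? (int) $m[1] : 0;
}

// Returns the HTTP status code for $url, or 0 if the host cannot be
// reached or does not answer within $timeout seconds.
function check_url(string $url, int $timeout = 10): int
{
    [$host, $port, $path] = split_url($url);
    if ($host === '') {
        return 0;
    }
    $conn = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($conn === false) {
        return 0;
    }
    // Also bound the time spent waiting for the reply, not just the connect.
    stream_set_timeout($conn, $timeout);
    fwrite($conn, "HEAD $path HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $line = fgets($conn);
    fclose($conn);
    return $line === false ? 0 : status_code($line);
}
```

Calling check_url('http://www.afup.org/article.php3?id_article=242') would then give you a number to test against, for example treating 0 or anything from 400 upwards as broken. Some servers answer HEAD differently from GET, so take the result as a hint rather than proof.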