Curl gives 403 forbidden

I'm trying to retrieve information from a website using PHP and Curl.
This is the code I use:

<?php
$tturl = "http://teletekst.nos.nl/";
echo "opening $tturl ...\n";
$ch = curl_init();
if (! $ch) die( "Cannot allocate a new PHP-CURL handle\n" );
$fp = fopen("ttread.htm", "w");
if (! $fp) die( "Cannot open ttread.htm for writing\n" );
curl_setopt($ch, CURLOPT_FILE, $fp);   // write the response body to this file
curl_setopt($ch, CURLOPT_URL, $tturl);
curl_exec($ch);
curl_close($ch);
fclose($fp);
echo "finished\n";
?>

This results in a 403 Forbidden page. However, if I type the URL
http://teletekst.nos.nl/ in my browser then it works fine (also with
cookies disabled). If I change $tturl in the script to
http://www.nos.nl/ it works. What is the difference between typing
it in my browser and accessing it with curl? Is there a workaround for
this?

Greetingz Bas

Aug 26 '05 #1
"Basta" wrote:
> I'm trying to retrieve information from a website using PHP and Curl.
> This is the code I use:
> (snip)
> This results in a 403 Forbidden page. However, if I type the URL
> http://teletekst.nos.nl/ in my browser then it works fine (also with
> cookies disabled).


That's probably because the owners of teletekst.nos.nl are fed up with
having idiot robots crawling all over their site and stealing its content.

If you had bothered to visit <http://teletekst.nos.nl/robots.txt> you might
have noticed that robots are not permitted to access this website. You're
getting a 403 response because their website has identified that you're
accessing it improperly.

There are probably some things you could do to bypass the blocks on this
website, but I'm not going to tell you what they are. Create your own
content. Don't steal it from other websites.

--
phil [dot] ronan @ virgin [dot] net
http://vzone.virgin.net/phil.ronan/
Aug 26 '05 #2
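
For readers who want to check a robots.txt file from PHP before fetching a page, here is a minimal sketch. It assumes PHP 7.1 or later, the function name isPathDisallowed is made up for illustration, and it only understands plain "User-agent" and "Disallow" prefix rules (no Allow lines, no wildcards). Note that robots.txt is a convention that clients honour voluntarily; a 403 response comes from the server's own checks, so this sketch does not by itself explain the error discussed in this thread.

```php
<?php
// Sketch only: simplified robots.txt check (hypothetical helper name).
// Returns true when $path matches a Disallow prefix in a group that
// applies to $agent. A "User-agent: *" group is treated as applying to everyone.
function isPathDisallowed(string $robotsTxt, string $path, string $agent = '*'): bool
{
    $applies = false;      // true while inside a group that matches $agent
    $inAgentLines = false; // true while reading consecutive User-agent lines
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*/', '', $line)); // strip comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);
        if ($field === 'user-agent') {
            if (!$inAgentLines) {
                $applies = false; // a new group starts here
            }
            $inAgentLines = true;
            if ($value === '*' || strcasecmp($value, $agent) === 0) {
                $applies = true;
            }
        } else {
            $inAgentLines = false;
            if ($field === 'disallow' && $applies && $value !== ''
                && strpos($path, $value) === 0) {
                return true;
            }
        }
    }
    return false;
}
```

The robots.txt text itself would have to be downloaded first (for example with curl and CURLOPT_RETURNTRANSFER) and then passed in as the first argument.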
> There are probably some things you could do to bypass the blocks on this
> website, but I'm not going to tell you what they are. Create your own
> content. Don't steal it from other websites.


Thanks for your help. So I'm stealing content from a website? I can read
it, but then I have to forget it as soon as possible, otherwise I'm a
thief. Interesting thought. I'm surprised you didn't even bother to
ask what purpose I needed it for.

Aug 28 '05 #3
On 2005-08-26, Basta <ba*******@gmail.com> wrote:
> (snip)
> What is the difference between typing it in my browser and accessing
> it with curl? Is there a workaround for this?


Perhaps it checks on user-agent?

--
Cheers,
- Jacob Atzen
Aug 28 '05 #4
> Perhaps it checks on user-agent?

Setting the CURLOPT_USERAGENT to "Mozilla/5.0 (Windows; U; Windows NT
5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0" doesn't help.

Aug 29 '05 #5
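
Since changing the User-Agent alone did not help, it can be useful to send a few more browser-like headers and to read the HTTP status code directly instead of inspecting the saved page. The following is only a sketch under assumptions: the helper names buildBrowserLikeOptions and fetchWithStatus are made up, the Accept and Accept-Language values are guesses at what a browser might send, and nothing here is known to be what this particular site checks.

```php
<?php
// Sketch only: hypothetical helpers; the header values are assumptions,
// not values the site is known to require.
function buildBrowserLikeOptions(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects as a browser would
        CURLOPT_ENCODING       => '',   // offer every encoding libcurl supports
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8',
            'Accept-Language: nl,en;q=0.8',
        ],
    ];
}

// Performs the request and returns [HTTP status code, body or false].
function fetchWithStatus(array $options): array
{
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return [$status, $body];
}
```

Calling fetchWithStatus(buildBrowserLikeOptions($tturl, $ua)) with the URL and User-Agent string from the posts above would return the status code as the first array element, which makes it easy to see whether a given header change turns the 403 into a 200.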
Basta wrote:
>> Perhaps it checks on user-agent?
>
> Setting the CURLOPT_USERAGENT to "Mozilla/5.0 (Windows; U; Windows NT
> 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0" doesn't help.


Or it may be a referrer or cookie issue. Better to use verbose mode and post
the log file here.

Sample for verbose mode and logging:
$fp_err = fopen('verbose_file.txt', 'ab+');
fwrite($fp_err, date('Y-m-d H:i:s')."\n\n"); // add a timestamp to the verbose log
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_STDERR, $fp_err);

Also, check
<http://curl.haxx.se/libcurl/php/examples/?ex=cookiejar.php> for cookie
handling.

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Aug 30 '05 #6
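
The verbose-logging, referrer and cookie suggestions above can be combined into one request. This is a rough sketch, not a tested fix for this site: the function names, the log and cookie file paths, and the referer value are all hypothetical, and it assumes PHP 7 or later with the curl extension loaded.

```php
<?php
// Sketch only: hypothetical helpers combining verbose logging, a Referer
// header and a cookie jar. Paths and the referer value are placeholders.
function buildDebugOptions(string $url, $logHandle, string $cookieFile, string $referer): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_VERBOSE        => true,        // log request and response headers
        CURLOPT_STDERR         => $logHandle,  // send the verbose log to this stream
        CURLOPT_FAILONERROR    => true,        // make curl_exec() fail on HTTP >= 400
        CURLOPT_REFERER        => $referer,    // sent as the Referer header
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are loaded from here
    ];
}

// Returns the response body, or false if the log cannot be opened or the request fails.
function fetchWithDebugLog(string $url, string $logPath, string $cookieFile, string $referer)
{
    $log = fopen($logPath, 'ab');
    if ($log === false) {
        return false;
    }
    fwrite($log, date('Y-m-d H:i:s') . "\n\n"); // timestamp each run

    $ch = curl_init();
    curl_setopt_array($ch, buildDebugOptions($url, $log, $cookieFile, $referer));
    $body = curl_exec($ch);
    if ($body === false) {
        fwrite($log, 'curl error: ' . curl_error($ch) . "\n");
    }
    curl_close($ch); // has no effect on PHP 8+, so also drop the reference
    unset($ch);      // frees the handle before the log stream is closed
    fclose($log);
    return $body;
}
```

After a run, the log file should contain the request headers curl actually sent and the response headers that came back, which can be compared with what a browser sends for the same URL.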
