Curl gives 403 forbidden

I'm trying to retrieve information from a website using PHP and Curl.
This is the code I use:

<?php
$tturl = "http://teletekst.nos.nl/";
echo "opening $tturl ...\n";
$ch = curl_init();
if (! $ch) die("Cannot allocate a new PHP-CURL handle\n");
$fp = fopen("ttread.htm", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_URL, $tturl);
curl_exec($ch);
curl_close($ch);
fclose($fp);
echo "finished\n";
?>

This results in a 403 Forbidden page. However, if I type the URL
http://teletekst.nos.nl/ in my browser, it works fine (also with
cookies disabled). If I change $tturl in the script to
http://www.nos.nl/, it works. What is the difference between typing
it in my browser and accessing it with curl? Is there a workaround for
this?

Greetingz Bas

Aug 26 '05 #1
"Basta" wrote:
> I'm trying to retrieve information from a website using PHP and Curl.
> This is the code I use:
> (snip)
> This results in a 403 forbidden page. However if I type the url
> http://teletekst.nos.nl/ in my browser then it works fine (also with
> cookies disabled).


That's probably because the owners of teletekst.nos.nl are fed up with
having idiot robots crawling all over their site and stealing its content.

If you had bothered to visit <http://teletekst.nos.nl/robots.txt> you might
have noticed that robots are not permitted to access this website. You're
getting a 403 response because their website has identified that you're
accessing it improperly.
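
For anyone who wants to check a robots.txt file from a script before fetching pages, here is a minimal sketch. `isPathDisallowed` is a hypothetical helper name, not part of PHP or cURL. It is deliberately simplified: it only reads `User-agent` and `Disallow` lines, uses plain prefix matching, treats every matching group (including `*`) as applicable, and ignores `Allow` lines and wildcards. The rules in the usage note are made up for illustration and are not the actual contents of the site's robots.txt.

```php
<?php
// Hypothetical helper: returns true if $path is disallowed for $agent
// according to the robots.txt text in $robotsTxt.
// Simplified: only User-agent and Disallow lines, prefix matching only.
function isPathDisallowed(string $robotsTxt, string $agent, string $path): bool
{
    $applies = false;      // does the current group apply to $agent?
    $inAgentLines = false; // are we reading consecutive User-agent lines?
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Strip comments and surrounding whitespace.
        $line = trim(preg_replace('/#.*$/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);
        if ($field === 'user-agent') {
            $matches = ($value === '*')
                || ($value !== '' && stripos($agent, $value) !== false);
            // Consecutive User-agent lines share one group of rules.
            $applies = $inAgentLines ? ($applies || $matches) : $matches;
            $inAgentLines = true;
        } else {
            $inAgentLines = false;
            if ($field === 'disallow' && $applies && $value !== ''
                && strpos($path, $value) === 0) {
                return true;
            }
        }
    }
    return false;
}
```

With the illustrative rules `"User-agent: *\nDisallow: /\n"`, a call such as `isPathDisallowed($rules, 'MyBot/1.0', '/index.html')` reports the path as disallowed. An empty `Disallow:` line blocks nothing.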

There are probably some things you could do to bypass the blocks on this
website, but I'm not going to tell you what they are. Create your own
content. Don't steal it from other websites.

--
phil [dot] ronan @ virgin [dot] net
http://vzone.virgin.net/phil.ronan/
Aug 26 '05 #2
> There are probably some things you could do to bypass the blocks on this
> website, but I'm not going to tell you what they are. Create your own
> content. Don't steal it from other websites.


Thanx for your help. So I'm stealing content from a website? I can read
it but then I have to forget it as soon as possible, otherwise I'm a
thief. Interesting thought. I'm surprised you didn't even bother to
ask what purpose I needed it for.

Aug 28 '05 #3
On 2005-08-26, Basta <ba*******@gmail.com> wrote:
> I'm trying to retrieve information from a website using PHP and Curl.
> This is the code I use:
> (snip)
> This results in a 403 forbidden page. However if I type the url
> http://teletekst.nos.nl/ in my browser then it works fine (also with
> cookies disabled). If I change $tturl in the script to
> http://www.nos.nl/ it works. What is the difference between typing
> it in my browser or accessing it with curl? Is there a workaround for
> this?


Perhaps it checks on user-agent?

--
Cheers,
- Jacob Atzen
Aug 28 '05 #4
> Perhaps it checks on user-agent?

Setting CURLOPT_USERAGENT to "Mozilla/5.0 (Windows; U; Windows NT
5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0" doesn't help.
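
A User-Agent string is only one of the headers a browser sends, so it can help to set a few more and to look at the status code the server actually returns. The sketch below is one way to do that, not a known fix for this site. `browserLikeOptions` and `fetchWithStatus` are hypothetical helper names, and the `Accept` and `Accept-Language` values are illustrative assumptions.

```php
<?php
// Hypothetical helper: builds a cURL option array that resembles a browser
// request more closely than a User-Agent string alone.
// The header values are illustrative assumptions.
function browserLikeOptions(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8',
            'Accept-Language: nl,en;q=0.8',
        ],
        CURLOPT_FOLLOWLOCATION => true, // follow 3xx redirects
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
    ];
}

// Hypothetical helper: performs the request and returns [status code, body].
function fetchWithStatus(array $options): array
{
    $ch = curl_init();
    if ($ch === false) {
        throw new RuntimeException('Cannot allocate a cURL handle');
    }
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);                          // false on transfer failure
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // e.g. 200 or 403
    // The handle is released when $ch goes out of scope.
    return [$status, $body];
}
```

Usage would look like `[$status, $body] = fetchWithStatus(browserLikeOptions($tturl, $agent));`, where `$agent` holds the User-Agent string. Printing `$status` shows whether the response is still a 403.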

Aug 29 '05 #5
Basta wrote:
>> Perhaps it checks on user-agent?
>
> Setting the CURLOPT_USERAGENT to "Mozilla/5.0 (Windows; U; Windows NT
> 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0" doesn't help.


Or it may be a referrer or cookie issue. Better to use verbose mode and post
the log file here.

Sample for verbose mode and logging:
$fp_err = fopen('verbose_file.txt', 'ab+');
fwrite($fp_err, date('Y-m-d H:i:s')."\n\n"); // add timestamp to the verbose log
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_STDERR, $fp_err);

Also, check
<http://curl.haxx.se/libcurl/php/examples/?ex=cookiejar.php> for cookie
handling.
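
To test the referrer and cookie idea, the relevant cURL options can be grouped as in the sketch below. `sessionOptions` is a hypothetical helper name, and the cookie file name and referrer URL in the usage note are assumptions chosen for illustration. Whether they change the 403 response is not known.

```php
<?php
// Hypothetical helper: cURL options for keeping cookies across requests
// and sending a Referer header. $cookieFile must be writable by PHP.
function sessionOptions(string $cookieFile, string $referer): array
{
    return [
        CURLOPT_COOKIEJAR  => $cookieFile, // cookies are written here when the handle is closed
        CURLOPT_COOKIEFILE => $cookieFile, // previously stored cookies are read from here
        CURLOPT_REFERER    => $referer,    // value of the Referer request header
    ];
}
```

These would be applied to an existing handle with `curl_setopt_array($ch, sessionOptions('cookies.txt', 'http://www.nos.nl/'));` before calling `curl_exec($ch)`.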

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Aug 30 '05 #6
