473,386 Members | 1,745 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

Re: HTTP request error with urlopen

Try:

import re
import urllib2
url = 'http://www.google.com/search?num=20&hl=en&q=ipod&btnG=Search'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent' : user_agent}
req = urllib2.Request(url, None, headers)
file_source=open("google_source.txt", 'w')
file_source.write(urllib2.urlopen(req).read())
file_source.close()

I think Google blocks the User-Agent urllib2 sends.

--Jonas Galvez, http://jonasgalvez.com.br/log

On Thu, Jul 3, 2008 at 3:52 AM, spandana g <sp***********@gmail.comwrote:
Hello ,

I have written a code to get the page source of the google search
page .. this is working for other urls. I have this problem with

import re
from urllib2 import urlopen
string='http://www.google.com/search?num=20&hl=en&q=ipod&btnG=Search'
file_source=file("google_source.txt",'w')
file_source.write(urlopen(string).read())
page_content=file_source.readlines()

Traceback (most recent call last) :
File "C:/Python25/google.py", line 5,in <module>
file_source.write(urlopen(string).read())
File "C:\Python25\lib\urllib2.py", line 124 , in urlopen
return__opener.open(url, data)
File "C:\Python25\lib\urllib2.py", line 387 , in open
response =meth(req, response)
File "C:\Python25\lib\urllib2.py", line 498 , in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python25\lib\urllib2.py", line 425, in error
return self._call_chain(*args)
File "C:\Python25\lib\urllib2.py", line 360, in __call_chain
result = func(*args)
File "C:\Python25\lib\urllib2.py", line 506, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden

Actually urlopen is working for google labs sets page but not for the
google.com and even I have same problem with wikipedia . Please let me know
.. If any one of have any idea about this .

Thank You,
Spandana.



--
http://mail.python.org/mailman/listinfo/python-list
Jul 4 '08 #1
0 1252

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: Matthew Wilson | last post by:
I am writing a script to check on my router's external IP address. My ISP refreshes my IP very often and I use dyndns for the hostname for my computer. My Netgear mr814 router has a webserver that...
2
by: OvErboRed | last post by:
Hi, I'm trying to determine whether a given URL exists. I'm new to Python but I think that urllib is the tool for the job. However, if I give it a non-existent file, it simply returns the 404 page....
7
by: Michael Foord | last post by:
#!/usr/bin/python -u # 15-09-04 # v1.0.0 # auth_example.py # A simple script manually demonstrating basic authentication. # Copyright Michael Foord # Free to use, modify and relicense. #...
4
by: Dekaritae | last post by:
I have a script that I've written in Perl that retrieves files generated from a template. It works decently enough, but I'd like to rewrite it in Python (Perl was just a detour; it was originally...
0
by: Alimah | last post by:
My objective is to log onto a wiki account (specifically wikipedia) using the http proxies provided by them (145.97.39.130 - 145.97.39.140:80). The operating system is Windows XP/Windows Server 2003....
0
by: Nico Grubert | last post by:
Hi there, I am trying to open an https site and pass a request to it in order to simulate the submit of an HTML form on a https site that sets an authentication cookie for a tomcat application,...
1
by: iBlaine | last post by:
I'm hoping someone here can answer my problem - I'm getting a 500 error when I run this code. What it should do is setup cookies, log in, then post a file to a form. The problem is it throws an...
4
by: Mike Driscoll | last post by:
Hi, I have been using the following code for over a year in one of my programs: f = urllib2.urlopen('https://www.companywebsite.com/somestring') It worked great until the middle of the...
2
by: silk.odyssey | last post by:
I am getting the following error trying to download an html page using urllib2. urllib2.HTTPError: HTTP Error 204: NoContent The url is of this type: ...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.