Trying to read Google search page from Python

Hi,

My program reads as follows:

import urllib

print "-------- Google Web Page --------"
print urllib.urlopen('http://www.google.com//').read()

print "-------- Google Search Web Page --------"
print urllib.urlopen('http://www.google.com/search?hl=en&q=ted').read()

The first urllib read works fine. The second one, where I am trying to
read in Google's search results, returns a web page saying I do not have
permission:
"Your client does not have permission to get URL "
Is there a way to do this? I am trying to write a program to read in
Google's search results.

-Ted
Aug 19 '08 #1
On Aug 19, 9:47 am, "tedpot...@gmail.com" <tedpot...@gmail.com> wrote:
> Hi,
>
> My program reads as follows
> import urllib
>
> print "-------- Google Web Page --------"
> print urllib.urlopen('http://www.google.com//').read()
>
> print "-------- Google Search Web Page --------"
> print urllib.urlopen('http://www.google.com/search?hl=en&q=ted').read()
>
> The first urllib read works fine. The second one, where I am trying to
> read in Google's search results, returns a web page saying I do not have
> permission:
> "Your client does not have permission to get URL "
> Is there a way to do this? I am trying to write a program to read in
> Google's search results.
>
> -Ted

This is a PHP discussion group - not a Python group - but I'll answer.

It's against Google's Terms of Service to do what you're doing, so
they're blocking you. (Not you specifically, but anyone who requests
their search results in that manner.)

If you want to do it anyway, you'd have to trick Google into thinking
you're an actual web user. So you'd have to do some spoofing. I'll
leave that as an exercise for the reader.

Walter
Aug 19 '08 #2
te*******@gmail.com wrote:
> Hi,
>
> My program reads as follows
> import urllib
> print "-------- Google Web Page --------"
> print urllib.urlopen('http://www.google.com//').read()
>
> print "-------- Google Search Web Page --------"
> print urllib.urlopen('http://www.google.com/search?hl=en&q=ted').read()
>
> The first urllib read works fine. The second one, where I am trying to
> read in Google's search results, returns a web page saying I do not have
> permission:
> "Your client does not have permission to get URL "
> Is there a way to do this? I am trying to write a program to read in
> Google's search results.
>
> -Ted

Actually, the problem is not Google blocking you. Your request is
incorrect. But as Walter indicated, this is not a Python support group.
Try comp.lang.python.

And also, as Walter indicated, it is against Google's TOS. They aren't
blocking you now - but they will if they catch you.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Aug 20 '08 #3
> Actually, the problem is not Google blocking you. Your request is
> incorrect.

http://www.google.com// is strangely formed, but it works. Google
doesn't appear to block automated requests to their front page.

Google _is_ blocking the other request.

Viewing http://www.google.com/search?hl=en&q=ted in Firefox works
fine.

"curl http://www.google.com/search?hl=en&q=ted" returns the error he
mentioned previously. Probably returns the error via Python for the
same reason.

Walter
Aug 20 '08 #4
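One way to see what separates the script's request from Firefox's is to look at the User-Agent header that urllib sends by default. A minimal sketch, assuming Python 3, where `urllib.request` replaces the Python 2 `urllib.urlopen` used in the original post; the variable names are illustrative only:

```python
import urllib.request

# build_opener() creates an opener with the same default handlers
# and default headers that urlopen() relies on.
opener = urllib.request.build_opener()

# addheaders is a list of (name, value) pairs added to every request
# that does not already set that header itself.
default_headers = dict(opener.addheaders)
default_agent = default_headers["User-agent"]

# The value has the form "Python-urllib/<version>", which a server can
# easily tell apart from a browser's User-Agent string.
print(default_agent)
```

curl likewise announces itself with its own User-Agent unless told otherwise, which fits the observation that curl and Python get the same error while Firefox does not.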
WalterGR wrote:
>> Actually, the problem is not Google blocking you. Your request is
>> incorrect.
>
> http://www.google.com// is strangely formed, but it works. Google
> doesn't appear to block automated requests to their front page.
>
> Google _is_ blocking the other request.
>
> Viewing http://www.google.com/search?hl=en&q=ted in Firefox works
> fine.
>
> "curl http://www.google.com/search?hl=en&q=ted" returns the error he
> mentioned previously. Probably returns the error via Python for the
> same reason.
>
> Walter

Your crystal ball must be working better than mine. I can't tell that.
But I could see a lot of other possibilities.

But this is not a python group, so I won't discuss them here.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Aug 20 '08 #5
On Aug 19, 7:10 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> Your crystal ball must be working better than mine. I can't tell that.
> But I could see a lot of other possibilities.

Sorry to hear about your crystal ball. But you don't need one in this
particular case.

All one needs is knowledge of user agents and user agent overriding,
and then one can test my hypothesis. (Which, given that I've now
tested it, is in fact, fact.)

Walter
Aug 20 '08 #6
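
The user-agent hypothesis can be checked mechanically without sending anything to Google. The sketch below is a stand-in under stated assumptions, not a description of Google's actual behavior: it starts a hypothetical local server (`PickyHandler`) that answers 403 to urllib's default User-Agent and 200 to anything else, then requests the same path with and without an overridden header. It assumes Python 3; `fetch_status` and the `ExampleBrowser/1.0` string are invented for illustration.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: refuses urllib's default User-Agent."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        if agent.startswith("Python-urllib"):
            status, body = 403, b"Your client does not have permission"
        else:
            status, body = 200, b"ok"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # suppress per-request logging to keep the demo quiet


def fetch_status(url, user_agent=None):
    """Return the HTTP status code, optionally overriding the User-Agent."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


# Port 0 asks the OS for any free port on the loopback interface.
server = HTTPServer(("127.0.0.1", 0), PickyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/search?hl=en&q=ted" % server.server_port

default_status = fetch_status(url)  # sent with urllib's default User-Agent
custom_status = fetch_status(url, user_agent="ExampleBrowser/1.0")
print(default_status, custom_status)

server.shutdown()
server.server_close()
```

Against this stand-in server the only difference between the two requests is the User-Agent header, which is the variable the hypothesis is about. Whether a real site permits automated requests is a separate question governed by its terms of service.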
