
Trying to read Google search page from Python

Hi,

My program reads as follows:

import urllib

print "-------- Google Web Page --------"
print urllib.urlopen('http://www.google.com//').read()

print "-------- Google Search Web Page --------"
print urllib.urlopen('http://www.google.com/search?hl=en&q=ted').read()

The first urllib read works fine. The second one, where I am trying to
read in Google's search results, returns a web page saying I do not have
permission:
"Your client does not have permission to get URL "
Is there a way to do this? I am trying to write a program to read in
Google's search results.

-Ted
Aug 19 '08 #1
5 Replies


On Aug 19, 9:47 am, "tedpot...@gmail.com" <tedpot...@gmail.com> wrote:
> Hi,
>
> My program reads as follows:
> import urllib
>
> print "-------- Google Web Page --------"
> print urllib.urlopen('http://www.google.com//').read()
>
> print "-------- Google Search Web Page --------"
> print urllib.urlopen('http://www.google.com/search?hl=en&q=ted').read()
>
> The first urllib read works fine. The second one, where I am trying to
> read in Google's search results, returns a web page saying I do not have
> permission:
> "Your client does not have permission to get URL "
> Is there a way to do this? I am trying to write a program to read in
> Google's search results.
>
> -Ted

This is a PHP discussion group - not a Python group - but I'll answer.

It's against Google's Terms of Service to do what you're doing, so
they're blocking you. (Not you specifically, but anyone who requests
their search results in that manner.)

If you want to do it anyway, you'd have to trick Google into thinking
you're an actual web user. So you'd have to do some spoofing. I'll
leave that as an exercise for the reader.

Walter
Aug 19 '08 #2

te*******@gmail.com wrote:
> Hi,
>
> My program reads as follows:
> import urllib
> print "-------- Google Web Page --------"
> print urllib.urlopen('http://www.google.com//').read()
>
> print "-------- Google Search Web Page --------"
> print urllib.urlopen('http://www.google.com/search?hl=en&q=ted').read()
>
> The first urllib read works fine. The second one, where I am trying to
> read in Google's search results, returns a web page saying I do not have
> permission:
> "Your client does not have permission to get URL "
> Is there a way to do this? I am trying to write a program to read in
> Google's search results.
>
> -Ted

Actually, the problem is not Google blocking you. Your request is
incorrect. But as Walter indicated, this is not a Python support group.
Try comp.lang.python.

And also, as Walter indicated, it is against Google's TOS. They aren't
blocking you now - but they will if they catch you.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Aug 20 '08 #3

> Actually, the problem is not google blocking you. Your request is
> incorrect.

http://www.google.com// is strangely formed, but it works. Google
doesn't appear to block automated requests to their front page.

Google _is_ blocking the other request.

Viewing http://www.google.com/search?hl=en&q=ted in Firefox works
fine.

"curl http://www.google.com/search?hl=en&q=ted" returns the error he
mentioned previously. Python's urllib probably gets the same error for the
same reason.

Walter
Aug 20 '08 #4
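
One likely reason curl and Python are treated differently from Firefox is the User-Agent header each client sends. The following is a minimal sketch in Python 3 (the code in this thread is Python 2), assuming the standard library's urllib.request; the helper name default_user_agent is made up for illustration. It makes no network request and only inspects the header urllib would send by default.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent string urllib.request sends when none is set.

    Hypothetical helper for illustration: it builds a default opener and
    reads the header list that the opener attaches to every request.
    """
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples; the default entry
    # is stored under the name "User-agent".
    return dict(opener.addheaders).get("User-agent")


if __name__ == "__main__":
    print(default_user_agent())
```

On a stock install the value begins with "Python-urllib/", which does not look like a browser, so a server that filters on User-Agent can tell such requests apart from Firefox.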

WalterGR wrote:
>> Actually, the problem is not google blocking you. Your request is
>> incorrect.
>
> http://www.google.com// is strangely formed, but it works. Google
> doesn't appear to block automated requests to their front page.
>
> Google _is_ blocking the other request.
>
> Viewing http://www.google.com/search?hl=en&q=ted in Firefox works
> fine.
>
> "curl http://www.google.com/search?hl=en&q=ted" returns the error he
> mentioned previously. Probably returns the error via Python for the
> same reason.
>
> Walter

Your crystal ball must be working better than mine. I can't tell that.
But I could see a lot of other possibilities.

But this is not a Python group, so I won't discuss them here.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================

Aug 20 '08 #5

On Aug 19, 7:10 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> Your crystal ball must be working better than mine. I can't tell that.
> But I could see a lot of other possibilities.

Sorry to hear about your crystal ball. But you don't need one in this
particular case.

All one needs is knowledge of user agents and user agent overriding,
and then one can test my hypothesis. (Which, given that I've now
tested it, is in fact, fact.)

Walter
Aug 20 '08 #6
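
The user agent overriding mentioned above can be sketched as follows in Python 3, assuming the standard library's urllib.request and urllib.parse. The names BROWSER_UA, build_search_request, and fetch, and the User-Agent string itself, are assumptions made up for illustration; they do not come from the thread. As noted earlier in the thread, automated querying is against Google's Terms of Service, and whether a given User-Agent is accepted may change at any time, so treat this as an illustration of the mechanism rather than a recommended approach.

```python
import urllib.parse
import urllib.request

# Hypothetical browser-like User-Agent string, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_search_request(query, user_agent=BROWSER_UA):
    """Build (but do not send) a request with an overridden User-Agent."""
    # urlencode escapes the query so spaces and symbols are safe in a URL.
    params = urllib.parse.urlencode({"hl": "en", "q": query})
    url = "http://www.google.com/search?" + params
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(request, timeout=10):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    req = build_search_request("ted")
    # Inspect what would be sent without touching the network.
    print(req.full_url)
    print(req.get_header("User-agent"))
```

Building the request separately from sending it makes it possible to check the URL and headers first; calling fetch(req) is the only step that touches the network.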
