
Re: using urllib2


Okay, I tried to follow that, and it is kinda hard. But since you obviously
know what you are doing, where did you learn this? Or where can I learn
this?
Maric Michaud wrote:

On Friday 27 June 2008 10:43:06, Alexnb wrote:
> I have never used urllib or urllib2. I have looked online and on mailing
> lists for help with this, but I can't figure out my problem because
> people haven't been helping me, which is why I am here! :]
>
> Okay, so basically I want to be able to submit a word to dictionary.com
> and then get the definitions. However, to start off learning urllib2, I
> just want to do a simple Google search. Before you get mad, what I have
> found on urllib2 hasn't helped me. Anyway, how would you go about doing
> this? No, I did not post the HTML, but if you want, right-click in your
> browser and hit "view source" on the Google homepage. Basically, what I
> want to know is how to submit the values (the search term) and then
> search for that value. Here's what I know:
>
> import urllib2
> response = urllib2.urlopen("http://www.google.com/")
> html = response.read()
> print html
>
> Now I know that all this does is print the source, but that's about all I
> know. I know it may be a lot to ask to have someone show/help me, but I
> really would appreciate it.

This example is for Google. Of course, using pygoogle is easier in this
case, but the same approach works in the general case:

>>>[207]: import urllib, urllib2

You need to trick the server with an imaginary User-Agent.

>>>[208]: def google_search(terms):
  .....:     # URL-encode the query parameters ('&' and spaces are escaped).
  .....:     query = urllib.urlencode({'hl': 'fr', 'q': terms})
  .....:     # Send the request with a browser-like User-Agent header.
  .....:     request = urllib2.Request(
  .....:         "http://www.google.com/search?" + query,
  .....:         headers={'User-Agent': 'MyNav 1.0 (compatible; MSIE 6.0; Linux)'})
  .....:     return urllib2.urlopen(request).read()
  .....:

>>>[212]: res = google_search("python & co")
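
For anyone reading this on Python 3, where urllib2 was split into urllib.request and urllib.parse, here is a rough sketch of the same request. The helper name build_search_request is made up for illustration, and the example only builds the request so it can be inspected without touching the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_search_request(terms):
    """Build (but do not send) a GET request for a search query.

    Hypothetical helper: the name is not from the thread.
    """
    # urlencode escapes spaces as '+' and '&' as '%26'.
    query = urlencode({'hl': 'fr', 'q': terms})
    return Request(
        "http://www.google.com/search?" + query,
        headers={'User-Agent': 'MyNav 1.0 (compatible; MSIE 6.0; Linux)'})

request = build_search_request("python & co")
print(request.full_url)
# → http://www.google.com/search?hl=fr&q=python+%26+co
```

Passing the request to urllib.request.urlopen(request) and calling .read() on the result would return the page as bytes, which then need decoding before any text processing.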
Now you have the whole HTML response; you'll have to parse it to recover the
data. A quick and dirty attempt on the Google results page:

>>>[213]: import re

>>>[214]: [re.sub('<.+?>', '', e) for e in re.findall('<h2 class=r>.*?</h2>', res)]
...[229]:
['Python Gallery',
'Coffret Monty Python And Co 3 DVD : La Premi\xe8re folie des Monty ...',
'Re: os x, panther, python &amp; co: msg#00041',
'Re: os x, panther, python &amp; co: msg#00040',
'Cardiff Web Site Design, Professional web site design services ...',
'Python Properties',
'Frees &lt; Programs &lt; Python &lt; Bin-Co',
'Torb: an interface between Tcl and CORBA',
'Royal Python Morphs',
'Python &amp; Co']
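
The results above still contain HTML entities such as &amp; and &lt;. As a rough sketch in Python 3, the same regex approach can be combined with html.unescape to decode them. The function name extract_titles and the sample markup are invented for illustration; a real results page will look different, and a proper HTML parser is usually more robust than regexes.

```python
import html
import re

def extract_titles(page):
    """Return the text of each <h2 class=r> heading, tags stripped, entities decoded."""
    headings = re.findall(r'<h2 class=r>.*?</h2>', page, flags=re.DOTALL)
    # Drop every tag inside the heading, then turn &amp; etc. back into characters.
    return [html.unescape(re.sub(r'<.+?>', '', h)) for h in headings]

# Made-up sample standing in for a fetched results page.
sample = ('<h2 class=r><a href="http://example.com/1">Python <b>Gallery</b></a></h2>'
          '<p>snippet</p>'
          '<h2 class=r><a href="http://example.com/2">Python &amp; Co</a></h2>')

titles = extract_titles(sample)
print(titles)
# → ['Python Gallery', 'Python & Co']
```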
--
_____________

Maric Michaud
--
http://mail.python.org/mailman/listinfo/python-list
--
View this message in context: http://www.nabble.com/using-urllib2-...p18160312.html
Sent from the Python - python-list mailing list archive at Nabble.com.

Jun 27 '08 #1


I stumbled across this a while back: http://www.voidspace.org.uk/python/a.../urllib2.shtml.
It covers quite a bit. The urllib2 module is pretty straightforward
once you've used it a few times. Some of the class naming and whatnot
takes a bit of getting used to (I found that to be the most confusing
bit).



Jun 27 '08 #2


I have read that multiple times. It is hard to understand, but it did help a
little. I found a bit of a workaround for now, which is not what I
ultimately want. However, even when I can get to the page I want, say
"http://dictionary.reference.com/browse/cheese", I look in Firebug, an
extension, and see the definition in JavaScript:

<table class="luna-Ent">
<tbody>
<tr>
<td class="dn" valign="top">1.</td>
<td valign="top">the curd of milk separated from the whey and prepared in
many ways as a food. </td>

The problem is that if I use code like this to get the HTML of that page
in Python:

response = urllib2.urlopen("the website....")
html = response.read()
print html

then I get a bunch of stuff, but it doesn't show me the code with the
table that the definition is in. So I am asking: how do I access this
JavaScript? Also, could someone point me to a better reference than the
last one, whether it be a book or anything? That one really doesn't tell
me much.
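
One caveat: urllib2 only downloads what the server sends and never runs JavaScript, so markup that the browser builds afterwards will not be in the response. If the table is in the downloaded HTML, something like the following Python 3 sketch could pull out the cell text. The class name DefinitionParser is made up, and the fragment is the one quoted above with the closing tags added as an assumption.

```python
from html.parser import HTMLParser

class DefinitionParser(HTMLParser):
    """Collect the text of <td> cells inside <table class="luna-Ent">."""

    def __init__(self):
        super().__init__()
        self.in_table = False
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "table" and dict(attrs).get("class") == "luna-Ent":
            self.in_table = True
        elif tag == "td" and self.in_table:
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it per cell.
        if self.in_cell:
            self.cells[-1] += data

# Fragment from the post; the closing tags are assumed.
fragment = """<table class="luna-Ent">
<tbody>
<tr>
<td class="dn" valign="top">1.</td>
<td valign="top">the curd of milk separated from the whey and prepared in
many ways as a food. </td>
</tr>
</tbody>
</table>"""

parser = DefinitionParser()
parser.feed(fragment)
# Collapse line breaks and repeated spaces inside each cell.
cells = [" ".join(c.split()) for c in parser.cells]
print(cells)
```

If cells comes back empty for a real page, that suggests the table is not in the HTML the server returned.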


Jun 27 '08 #3

