
simple Question about using BeautifulSoup

Alexnb
#1: Aug 20 '08

Okay, I have used BeautifulSoup a lot lately, but I am wondering, how do you
open a local html file?

Usually I do something like this for a url

soup = BeautifulSoup(urllib.urlopen('http://www.website.com'))

but the file extension doesn't work. So how do I open one?


Grzegorz Staniak
#2: Aug 20 '08

On 2008-08-20, Alexnb <alexnbryan@gmail.com> wrote:
> Okay, I have used BeautifulSoup a lot lately, but I am wondering, how do you
> open a local html file?
>
> Usually I do something like this for a url
>
> soup = BeautifulSoup(urllib.urlopen('http://www.website.com'))
>
> but the file extension doesn't work. So how do I open one?

Have you tried the local file URL, like "file:///home/user/file.html"?
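
The file URL suggestion can be sketched roughly as follows. This assumes Python 3, where urlopen lives in urllib.request rather than directly in urllib as in the post above; the helper name open_local_html and the temporary page.html file are made up for the illustration.

```python
import tempfile
import urllib.request
from pathlib import Path


def open_local_html(path):
    """Open a local file through urllib by converting its path to a file:// URL."""
    # Path.as_uri() only accepts absolute paths, hence resolve().
    url = Path(path).resolve().as_uri()
    return urllib.request.urlopen(url)


# Demo with a throwaway file; "page.html" is a hypothetical name.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "page.html"
    page.write_text("<html><title>Local page</title></html>", encoding="utf-8")
    with open_local_html(page) as response:
        markup = response.read()  # bytes, as with an http:// response
```

The object returned by open_local_html is file-like, so it should be usable wherever the thread passes the result of urlopen for a web address.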

GS
--
Grzegorz Staniak <gstaniak _at_ wp [dot] pl>

Diez B. Roggisch
#3: Aug 20 '08

Alexnb wrote:
> Okay, I have used BeautifulSoup a lot lately, but I am wondering, how do
> you open a local html file?
>
> Usually I do something like this for a url
>
> soup = BeautifulSoup(urllib.urlopen('http://www.website.com'))
>
> but the file extension doesn't work. So how do I open one?

The docs for urllib.urlopen clearly state that it returns a file-like
object, which BeautifulSoup seems to grok.

So... how about passing another file-like object, like... *drumroll* - a
file?

soup = BeautifulSoup(open("myfile.html"))

Apart from that, there is the documented possibility of passing the HTML
as a string, which means

soup = BeautifulSoup(open("myfile.html").read())

will work as well.
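
Both variants can be sketched a little more defensively. This assumes the bs4 package (Beautiful Soup 4) on Python 3, which is newer than the version used in this thread; the parser name "html.parser" and the helper names are choices made for the example, not something taken from the posts above. The with block closes the file once parsing is done.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 is installed


def soup_from_file(path):
    """Parse a local HTML file by passing the open file object itself."""
    with open(path, encoding="utf-8") as fh:
        return BeautifulSoup(fh, "html.parser")


def soup_from_string(path):
    """Parse the same file by reading it into a string first."""
    with open(path, encoding="utf-8") as fh:
        return BeautifulSoup(fh.read(), "html.parser")


# Demo with a throwaway copy of "myfile.html" in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "myfile.html"
    page.write_text("<html><head><title>Hello</title></head></html>", encoding="utf-8")
    title_from_file = soup_from_file(page).title.string
    title_from_string = soup_from_string(page).title.string
```

Either helper should give an equivalent soup; passing the file object simply saves the explicit read() call.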

Diez