Re: simple Question about using BeautifulSoup

On Wed, 20 Aug 2008 07:33:32 -0700 (PDT), Alexnb <al********@gmail.com> wrote:
> Okay, I have used BeautifulSoup a lot lately, but I am wondering, how do you
> open a local html file?
>
> Usually I do something like this for a url
>
> soup = BeautifulSoup(urllib.urlopen('http://www.website.com'))

urllib.urlopen gives you a file-like object for a resource at a URL.

file gives you a file-like object for a file on the local filesystem.

soup = BeautifulSoup(file('/the/name/of/the/file'))

Jean-Paul
Aug 20 '08 #1
In article <ma*************************************@python.org>,
Jean-Paul Calderone <ex*****@divmod.com> wrote:
>On Wed, 20 Aug 2008 07:33:32 -0700 (PDT), Alexnb <al********@gmail.com> wrote:
>>
>> Okay, I have used BeautifulSoup a lot lately, but I am wondering, how do you
>> open a local html file?
>>
>> Usually I do something like this for a url
>>
>> soup = BeautifulSoup(urllib.urlopen('http://www.website.com'))
>
>urllib.urlopen gives you a file-like object for a resource at a URL.
>
>file gives you a file-like object for a file on the local filesystem.
>
>soup = BeautifulSoup(file('/the/name/of/the/file'))

Except that you should use open() instead of file() -- docs prior to
Python 2.5 mistakenly had this reversed.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

Adopt A Process -- stop killing all your children!
Aug 24 '08 #2
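
For anyone reading this thread on a current Python: file() no longer exists in Python 3, and urllib.urlopen moved to urllib.request.urlopen, so open() is the only one of the two that still works. Below is a minimal sketch of the open() approach recommended above. It assumes the newer bs4 package (BeautifulSoup 4) rather than the BeautifulSoup 3 module used in 2008. The helper name load_soup, the "html.parser" choice, the UTF-8 encoding, and the temporary file are illustrative assumptions, not something from the posts.

```python
import os
import tempfile

try:
    # Assumption: BeautifulSoup 4 (the `bs4` package) on Python 3.
    # The thread used `from BeautifulSoup import BeautifulSoup` (version 3).
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None  # the sketch still runs, it just skips parsing


def load_soup(path):
    """Parse a local HTML file (hypothetical helper).

    Returns None when bs4 is not installed.
    """
    if BeautifulSoup is None:
        return None
    # open() returns a file-like object, which BeautifulSoup reads directly.
    # The encoding is an assumption about the file being parsed.
    with open(path, encoding="utf-8") as fh:
        return BeautifulSoup(fh, "html.parser")


if __name__ == "__main__":
    # Write a throwaway HTML file so the example is self-contained.
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".html", delete=False, encoding="utf-8"
    )
    tmp.write("<html><head><title>Hello</title></head><body></body></html>")
    tmp.close()
    try:
        soup = load_soup(tmp.name)
        if soup is not None:
            print(soup.title.string)
    finally:
        os.remove(tmp.name)
```

The with block closes the file as soon as parsing finishes, which the one-liners in the thread leave to the garbage collector.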
