beautiful soup library question

Hi all,

I'm trying to extract some information from an HTML file using
Beautiful Soup. The strings I want to get come after br tags, e.g.:

<font size='6'>
<br>this info
<br>more info
<br>and more info
</font>

I can navigate to the first br tag using find_next_sibling, but how do
I get the string after the br's?
br.contents is empty.

thanks for any ideas.

Mar 10 '06 #1
me*****@gmail.com wrote:
I'm trying to extract some information from an HTML file using
Beautiful Soup. The strings I want to get come after br tags, e.g.:

<font size='6'>
<br>this info
<br>more info
<br>and more info
</font>

I can navigate to the first br tag using find_next_sibling, but how do
I get the string after the br's?
br.contents is empty.


I'm not familiar with Beautiful Soup specifically, but this isn't how
the <br> tag works. Unlike tags such as <li> or <p>, whose closing tags
are optional in HTML, <br> does not contain anything; it's just a line
break. In XHTML it would be written <br />, indicating that it's a
standalone tag.

Instead you want to traverse the contents of the font tag, taking into
account line breaks that you encounter.
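
A rough sketch of that idea, using only Python's standard-library html.parser rather than Beautiful Soup. It is event-driven rather than a walk over a parsed tree, so treat it as one possible approach and not the poster's method. The class name TextAfterBreaks and the sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextAfterBreaks(HTMLParser):
    """Hypothetical helper: collect the text that follows each <br> inside <font>."""

    def __init__(self):
        super().__init__()
        self.in_font = False   # True while we are inside a <font> element
        self.after_br = False  # True once a <br> has been seen and no text yet
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self.in_font = True
        elif tag == "br" and self.in_font:
            # <br> has no content of its own; the text we want comes next.
            self.after_br = True

    def handle_endtag(self, tag):
        if tag == "font":
            self.in_font = False
            self.after_br = False

    def handle_data(self, data):
        text = data.strip()
        if self.after_br and text:
            self.lines.append(text)
            self.after_br = False


# Sample markup copied from the question.
html = """<font size='6'>
<br>this info
<br>more info
<br>and more info
</font>"""

parser = TextAfterBreaks()
parser.feed(html)
parser.close()
print(parser.lines)
```

Because <br /> is reported through the same start-tag handler, the sketch should behave the same way on XHTML-style input.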

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
Fear is an emotion indispensable for survival.
-- Hannah Arendt
Mar 10 '06 #2
Here's how I print each line after the <br>'s:

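# Older (pre-bs4) Beautiful Soup API on Python 2, as originally posted
import BeautifulSoup as Soup
page = open("test.html").read()
soup = Soup.BeautifulSoup(page)
# fetch() returns every <br> tag; .next is the node parsed right after it
for br in soup.fetch('br'):
    print br.next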
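
For anyone reading this with a current install: the snippet above targets the old BeautifulSoup module on Python 2. A roughly equivalent sketch for Python 3 with the bs4 package might look like the following. It assumes bs4 is installed, and it parses an inline copy of the markup from the question instead of reading test.html.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (stands in for the contents of test.html).
html = """<font size='6'>
<br>this info
<br>more info
<br>and more info
</font>"""

soup = BeautifulSoup(html, "html.parser")

lines = []
for br in soup.find_all("br"):
    # <br> is an empty element, so the text lives in the node that follows it.
    node = br.next_sibling
    # next_sibling may be None or another tag; keep only non-blank strings.
    if isinstance(node, str):
        text = node.strip()
        if text:
            lines.append(text)

print(lines)
```

Here find_all() plays the role of the old fetch(), and next_sibling is used instead of .next so that only nodes at the same level as the <br> are considered.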

Mar 11 '06 #3
