BeautifulSoup: problems with parsing a website

Marco Hornung

Hi guys,

I'm using the Python library BeautifulSoup (BS) to parse some
information out of a German soccer website.
I spent some quality time with the BS docs, but I couldn't really
figure out how to get what I was looking for.

Here's the deal:
I want to parse the article shown on the website. To do so, I want to
use the tag <div class="txt_fliesstext"> as a starting point. Once I
have found that tag, I want to collect all the "br" tags that follow it,
up to the point where a new CSS class appears.
I tried several variations of the findAll() call, but nothing seems to
work. For example,
soup.findAll('br', attrs={'class':'txt_fliesstext'}, text=True)
returns a thousand additional tags that I don't want, and
soup.findAll(attrs={'class':'txt_fliesstext'})
gives a much better result, but in that case I only get a few tags
instead of all the tags I want.
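
For illustration, here is a small sketch of how the two kinds of call behave. It is written against the current bs4 package (where findAll() is spelled find_all()), not the 2008 release used in this post, so details may differ. SAMPLE_HTML is an assumption: it is modelled on the excerpt from the page, with closing tags added so that the sample is self-contained.

```python
from bs4 import BeautifulSoup

# Assumed sample markup, modelled on the page excerpt; the closing
# </div> tags are added here and are not part of the original excerpt.
SAMPLE_HTML = """
<div id="area_fliesstext">
<div class="txt_fliesstext_bold">Mit 28 Punkten stand der KSC nach der Hinrunde sensationell auf Platz 6.</div>
<br><br>
<div class="txt_fliesstext">Doch in der Rückrunde brachen die Badener regelrecht ein und holten nur noch 15 Zähler.<br />
<br />
43 Punkte reichten am Ende für den 11. Tabellenplatz, ein mehr als respektables Ergebnis für einen Aufsteiger.<br />
<br />
</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The <br> tags themselves carry no class attribute, so filtering
# <br> by class matches nothing in this sample.
br_with_class = soup.find_all("br", attrs={"class": "txt_fliesstext"})

# Filtering on the class alone returns the container <div>, not its
# contents; "txt_fliesstext_bold" is a different class and is skipped.
article_divs = soup.find_all(attrs={"class": "txt_fliesstext"})

# The <br> tags of interest are children of that container.
brs_inside = article_divs[0].find_all("br")

print(len(br_with_class), len(article_divs), len(brs_inside))
```

In this sample the class filter belongs on the enclosing div, and the "br" tags are then looked up inside it.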

Any suggestions?
Thanks in advance!

Website:
http://www.bundesliga.de/de/liga/new...hp?f=94820.php
Some HTML code from the website:
<div id="area_headline">
  <div class="txt_headline_red">Erst Höhenflug, dann Absturz</div>
</div>
<div id="area_fliesstext">
  <div class="txt_fliesstext_bold">Mit 28 Punkten stand der KSC nach der Hinrunde sensationell auf Platz 6.</div>
  <br><br>
  <div class="txt_fliesstext">Doch in der Rückrunde brachen die Badener regelrecht ein und holten nur noch 15 Zähler.<br />
  <br />
  43 Punkte reichten am Ende für den 11. Tabellenplatz, ein mehr als respektables Ergebnis für einen Aufsteiger.<br />
  <br />
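
One possible way to get at the article text, sketched under assumptions: instead of collecting the "br" tags, take the div with class txt_fliesstext and read the text pieces that the "br" tags separate. The sketch uses the current bs4 package. extract_article and SAMPLE_HTML are hypothetical names introduced here, and the sample markup adds the closing tags that the excerpt above leaves out.

```python
from bs4 import BeautifulSoup

# Assumed sample markup, modelled on the excerpt above with the
# missing closing </div> tags added.
SAMPLE_HTML = """
<div id="area_fliesstext">
<div class="txt_fliesstext_bold">Mit 28 Punkten stand der KSC nach der Hinrunde sensationell auf Platz 6.</div>
<br><br>
<div class="txt_fliesstext">Doch in der Rückrunde brachen die Badener regelrecht ein und holten nur noch 15 Zähler.<br />
<br />
43 Punkte reichten am Ende für den 11. Tabellenplatz, ein mehr als respektables Ergebnis für einen Aufsteiger.<br />
<br />
</div>
</div>
"""


def extract_article(html, css_class="txt_fliesstext"):
    """Return the text chunks inside the first <div> with the given class.

    Hypothetical helper: each chunk is one run of text between <br> tags.
    """
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=css_class)
    if container is None:
        return []
    # stripped_strings skips whitespace-only strings, so the empty
    # gaps between consecutive <br> tags are dropped.
    return [" ".join(chunk.split()) for chunk in container.stripped_strings]


paragraphs = extract_article(SAMPLE_HTML)
for paragraph in paragraphs:
    print(paragraph)
```

Because the search stops at the container's closing tag, text that belongs to a later div with a different class is not picked up.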

Jun 27 '08 #1
