
HTML Table-of-Content Extraction Script

I'm looking for a function which extracts a table of contents of HTML file(s) from <Hx....><a name=...></a>...</Hx> and possibly auto-creates the anchors.
Maybe something already exists?
Robert
Nov 28 '06 #1
3 Replies


robert wrote:
I'm looking for a function which extracts a table of contents of HTML file(s) from <Hx....><a name=...></a>...</Hx> and possibly auto-creates the anchors.
Maybe something already exists?
You can try mine:
http://www.thomas-guettler.de/script...eadings.py.txt

--
Thomas Güttler, http://www.thomas-guettler.de/ http://www.tbz-pariv.de/
E-Mail: guettli (*) thomas-guettler + de
Spam Catcher: ni**************@thomas-guettler.de

Nov 28 '06 #2

robert wrote:
I'm looking for a function which extracts a table of contents
of HTML file(s) from <Hx....><a name=...></a>...</Hx>
and possibly auto-creates the anchors.
Maybe something already exists?
that's the kind of stuff you'll write in approximately two minutes using
BeautifulSoup (or if you prefer the ElementTree API, ElementSoup).

start here:

http://www.crummy.com/software/BeautifulSoup/
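
For illustration, here is a minimal sketch of the kind of script being discussed. It is not from the thread: to stay runnable without third-party packages it uses the standard library's html.parser rather than BeautifulSoup, and the names TocParser, slugify and extract_toc are hypothetical. It assumes headings are not nested inside each other, and it only proposes anchor names for headings that lack one; writing them back into the HTML is left out.

```python
import re
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class TocParser(HTMLParser):
    """Collect (level, anchor, title) for every <h1>..<h6> in a document."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.entries = []    # finished (level, anchor, title) tuples
        self._level = None   # level of the heading currently open, if any
        self._anchor = None  # first id / <a name=...> seen for that heading
        self._text = []      # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <H2> arrives as "h2".
        attrs = dict(attrs)
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._anchor = attrs.get("id")
            self._text = []
        elif tag == "a" and self._level is not None and self._anchor is None:
            self._anchor = attrs.get("name") or attrs.get("id")

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace inside the heading text.
            title = " ".join("".join(self._text).split())
            self.entries.append((self._level, self._anchor, title))
            self._level = None


def slugify(title):
    """Derive a candidate anchor name from a heading title (hypothetical scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"


def extract_toc(html):
    """Return [(level, anchor, title)], inventing anchors where none exist."""
    parser = TocParser()
    parser.feed(html)
    parser.close()
    # Reserve anchors already present so generated ones cannot clash with them.
    used = {anchor for _, anchor, _ in parser.entries if anchor}
    toc = []
    for level, anchor, title in parser.entries:
        if anchor is None:
            base = slugify(title)
            anchor, n = base, 2
            while anchor in used:
                anchor = f"{base}-{n}"
                n += 1
            used.add(anchor)
        toc.append((level, anchor, title))
    return toc


sample = """
<H1><a name="intro"></a>Introduction</H1>
<p>Some text.</p>
<h2>Getting   started</h2>
<h2>Getting started</h2>
"""
# Print an indented outline, one line per heading.
for level, anchor, title in extract_toc(sample):
    print("  " * (level - 1) + f"{title} (#{anchor})")
```

With BeautifulSoup the parser class would likely shrink to a single search over the heading tags; the anchor-generation step would stay much the same.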

</F>

Nov 28 '06 #3


Fredrik Lundh wrote:
robert wrote:
I'm looking for a function which extracts a table of contents
of HTML file(s) from <Hx....><a name=...></a>...</Hx>
and possibly auto-creates the anchors.
Maybe something already exists?

that's the kind of stuff you'll write in approximately two minutes using
BeautifulSoup (or if you prefer the ElementTree API, ElementSoup).

start here:

http://www.crummy.com/software/BeautifulSoup/

</F>
splity does that, but it's not Python.

Cheers,
-T

Nov 29 '06 #4
