
Converting HTML to ASCII

Hi. I'm looking for a Python lib to convert HTML to
ASCII. A quick Google search turned up several options
(fewer than I would expect, considering how easy this
is to do in *other* languages... :| ), but I have 2
requirements that none of them seem to meet:

1) Be able to handle badly formed, or illegal, HTML
as well as possible. Some of the converters I tried
ended up dying on a weird character (that is, a
high-ASCII char). Others somehow got confused and
dumped the JavaScript as well.

2) Not embellish the text in any way - no asterisks,
no bracket links, no __ for underlines.

Can anyone direct me to something which could help me
for this?

--Thanks a mil.

Jul 18 '05 #1
3 Replies


gf gf <un**************@yahoo.com> wrote:
Hi. I'm looking for a Python lib to convert HTML to
ASCII.

man lynx
man links
man w3m

--
William Park <op**********@yahoo.ca>, Toronto, Canada
Slackware Linux -- because I can type.

Jul 18 '05 #2
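
The tools named in reply #2 are text-mode browsers, and they can be driven from Python as an external process. Below is a minimal sketch for lynx, assuming it is installed and on PATH. The flags -dump, -nolist, -force_html and -stdin are taken from the lynx manual as best I recall it, so check `man lynx` for your version; the function names are made up for this illustration.

```python
import shutil
import subprocess


def lynx_command():
    """Build the argument list for a plain-text dump read from standard input.

    Assumed flags (verify against `man lynx`):
      -dump        render the page to stdout and exit
      -nolist      omit the list of link references from the dump
      -force_html  treat the input as HTML even without a file extension
      -stdin       read the document from standard input
    """
    return ["lynx", "-dump", "-nolist", "-force_html", "-stdin"]


def html_to_text_lynx(html):
    """Pipe an HTML string through lynx and return the rendered text."""
    if shutil.which("lynx") is None:
        raise RuntimeError("lynx is not installed or not on PATH")
    result = subprocess.run(
        lynx_command(),
        input=html.encode("utf-8", "replace"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if lynx exits with an error
    )
    return result.stdout.decode("utf-8", "replace")
```

Whether the output is entirely free of embellishment (requirement 2) depends on the lynx version and its configuration, so it is worth inspecting a sample dump before relying on it.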

Try Beautiful Soup!

On your first requirement (handling badly formed, or illegal, HTML
as well as possible), from the description:
"It won't choke if you give it ill-formed markup: it'll just give you access to
a correspondingly ill-formed data structure."

http://www.crummy.com/software/BeautifulSoup/

Hans Christian
Jul 18 '05 #3
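
For comparison, the two requirements from the original question can also be sketched with nothing but the standard library. This is not Beautiful Soup's API: it uses Python 3's html.parser, which likewise tolerates unclosed or stray tags. The names TextExtractor and html_to_ascii are hypothetical, and the choice of which tags start a new line is an assumption.

```python
import unicodedata
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes; drop all markup and the contents of script/style."""

    SKIP = {"script", "style"}
    # Assumed set of tags that should start a new line in the output.
    BLOCK = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &amp; etc. arrive decoded
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:  # ignore JavaScript and CSS bodies
            self.parts.append(data)


def html_to_ascii(html):
    """Return the visible text of `html` as plain ASCII, with no decoration."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Split accented letters into base letter + accent, then drop non-ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of whitespace and discard empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


sample = '<p>Caf\u00e9 <b>bold</b> <a href="x">link</a><script>alert(1)</script><p>unclosed &amp; done'
print(html_to_ascii(sample))
# Cafe bold link
# unclosed & done
```

Dropping non-ASCII characters outright is the bluntest option; characters with no ASCII decomposition simply vanish, so a replacement table may be preferable for real input.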

gf gf wrote:
Hi. I'm looking for a Python lib to convert HTML to
ASCII.


You might find these threads on comp.lang.python interesting:
http://tinyurl.com/5zmpn
http://tinyurl.com/6mxmb

Kent
Jul 18 '05 #4
