Behaviour of htmllib's HTML parser and formatter

Hi,

I have an HTML page that displays some content, and a part of that
content is HTML changed into regular text. The encoding of the page
is UTF-8.

Here's the code that makes the change (the HTML in self.contents is
UTF-8 encoded):

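import cStringIO
import formatter
import htmllib

# Inside the method: render the HTML in self.contents as plain text
file = cStringIO.StringIO()
parser = htmllib.HTMLParser(formatter.AbstractFormatter(
    formatter.DumbWriter(file=file)))
parser.feed(self.contents)
parser.close()
# Keep only the first `size` bytes of the rendered text
data = file.getvalue()[:size]
return data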

This renders entities such as &nbsp; as black diamonds with a ? sign
in them in Firefox, so I guess something is going wrong along the way.
Any suggestions what it might be?
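
One possible explanation, offered as an assumption rather than a confirmed diagnosis: Python 2's htmlentitydefs documents its replacement text as ISO Latin-1, so htmllib may emit &nbsp; as the single byte 0xA0, which is not valid UTF-8 on its own. The sketch below uses the Python 3 standard library (htmllib no longer exists there) only to show that byte-level mismatch; the helper name entity_bytes is hypothetical.

```python
from html.entities import name2codepoint


def entity_bytes(name, encoding):
    """Hypothetical helper: encode the character an entity name stands for."""
    return chr(name2codepoint[name]).encode(encoding)


# &nbsp; is code point 160; Latin-1 and UTF-8 encode it differently.
latin1_nbsp = entity_bytes("nbsp", "latin-1")  # one byte: 0xA0
utf8_nbsp = entity_bytes("nbsp", "utf-8")      # two bytes: 0xC2 0xA0

# A browser told the page is UTF-8 cannot decode the lone 0xA0 byte,
# so it substitutes U+FFFD, the replacement character.
shown = latin1_nbsp.decode("utf-8", errors="replace")
print(repr(latin1_nbsp), repr(utf8_nbsp), repr(shown))
```

If this is the cause, the Latin-1 bytes would need to be re-encoded as UTF-8 before they are mixed into the page. Separately, slicing the byte string with [:size] could cut a multi-byte UTF-8 character in half, which would produce the same symptom at the end of the text.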

Thanks,

Morten
Jul 18 '05 #1