Wanted: Python script to convert between UTF-8 and XML entities

Does someone have a little python script that will read a file in
UTF-8/UTF-16/UTF-32 (my choice) and search for all the characters between
0x7f-0xffffff and convert them to an ASCII digit string that begins with
"&#" and ends with ";" and output the whole thing? If not, could someone
tell me how to write one?

How about a script to do the inverse?

Thanks!
siegfried
Aug 30 '08 #1
Siegfried Heintze wrote:

> Does someone have a little python script that will read a file in
> UTF-8/UTF-16/UTF-32 (my choice) and search for all the characters between
> 0x7f-0xffffff and convert them to an ASCII digit string that begins with
> "&#" and ends with ";" and output the whole thing? If not, could someone
> tell me how to write one?

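# Python 3; pass "utf-16" or "utf-32" as the encoding if the file uses one
with open("filename.txt", encoding="utf-8") as f:
    text = f.read()
# replace each non-ASCII character with a "&#NNNN;" character reference
text = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(text)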

Tweak as necessary.
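
The inverse direction asked about in the question could look something like the sketch below. It assumes Python 3, and the names `to_char_refs`, `from_char_refs`, and `_CHAR_REF` are made up for illustration, not taken from any library. The regular expression only handles numeric references (decimal and hexadecimal), not named entities such as `&amp;`.

```python
import re

# Matches decimal (&#8364;) and hexadecimal (&#x20AC;) character references.
_CHAR_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def to_char_refs(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_char_refs(text):
    """Replace numeric character references with the characters they name."""
    def _replace(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave references beyond the Unicode range (U+10FFFF) untouched.
        return chr(code) if code <= 0x10FFFF else match.group(0)

    return _CHAR_REF.sub(_replace, text)


if __name__ == "__main__":
    encoded = to_char_refs("caf\u00e9 \u20ac")
    print(encoded)
    print(from_char_refs(encoded) == "caf\u00e9 \u20ac")
```

Note that `from_char_refs` also decodes references to ASCII characters (for example `&#60;` becomes `<`), and it cannot tell a reference produced by `to_char_refs` from one that was already in the input, so the round trip is only exact for text that contained no numeric references to begin with.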

</F>

Aug 30 '08 #2