processing a large utf-8 file

Since the .encoding attribute of file objects is read-only, what is the
proper way to process large UTF-8 text files?

I need "bulk" processing (i.e. in blocks, since the file is about 1 GB), but
reading it in fixed-size blocks of bytes is bound to split multi-byte UTF-8
characters at block boundaries.

Jul 19 '05 #1
Ivan Voras wrote:
Since the .encoding attribute of file objects is read-only, what is the
proper way to process large UTF-8 text files?


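You should use codecs.open, or codecs.getreader to get a StreamReader
for UTF-8. The reader keeps any incomplete byte sequence buffered until
the next read, so a block never ends in the middle of a character.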
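
As a rough illustration of the codecs.getreader route, here is a minimal sketch written for a current Python 3 interpreter. The function name iter_utf8_blocks and the block_chars parameter are made up for this example and are not part of the codecs module. It wraps a file opened in binary mode in a UTF-8 StreamReader and yields decoded blocks of bounded size.

```python
import codecs


def iter_utf8_blocks(path, block_chars=1024 * 1024):
    """Yield decoded text blocks of at most block_chars characters.

    Hypothetical helper for illustration; only codecs.getreader and
    StreamReader.read come from the standard library.
    """
    with open(path, "rb") as raw:
        # getreader returns the StreamReader class for the codec;
        # calling it wraps the underlying byte stream.
        reader = codecs.getreader("utf-8")(raw)
        while True:
            # The argument bounds both the bytes fetched per underlying
            # read and the number of characters returned. Bytes of an
            # incomplete sequence stay buffered for the next call.
            block = reader.read(block_chars)
            if not block:
                break
            yield block
```

Each yielded block is an ordinary str, so the processing loop can be as simple as "for block in iter_utf8_blocks(filename): ...". Calling codecs.open(filename, "r", encoding="utf-8") gives a similarly wrapped stream without opening the file yourself.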

Regards,
Martin
Jul 19 '05 #2
