how to write unicode to a txt file?

I want to convert an srt file to Unicode so that mplayer can display
Chinese subtitles properly.
I did it like this:

txt=open('dmd-guardian-cd1.srt').read()
txt=unicode(txt,'gb18030')
open('dmd-guardian-cd1.srt','w').write(txt)

But it seems that Python can't write unicode to a file directly;
I got an error at the 3rd line:
UnicodeEncodeError: 'ascii' codec can't encode characters in position
85-96: ordinal not in range(128)

How to save the unicode string to the file, please?
Thanks!

Jan 17 '07 #1
2 Replies


Frank Potter wrote:
> But it seems that python can't directly write unicode to a file,
> I got an error at the 3rd line:
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 85-96: ordinal not in range(128)
>
> How to save the unicode string to the file, please?

You have to tell Python what encoding to use (i.e. how to translate the
code points into bytes):

>>> txt = u"ähnlicher als gewöhnlich üblich"
>>> import codecs
>>> codecs.open("tmp.txt", "w", "utf8").write(txt)
>>> codecs.open("tmp.txt", "r", "utf8").read()
u'\xe4hnlicher als gew\xf6hnlich \xfcblich'

You would perhaps use 'gb18030' instead of 'utf8'.

Peter
Jan 17 '07 #2
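
The conversion the question asks for (read GB18030, write Unicode) can be sketched as follows. This is only a sketch and it assumes Python 3, where the built-in `open` takes an `encoding` argument, rather than the Python 2 used in the thread. The helper name `convert_encoding`, the sample text and the file names are hypothetical, and UTF-8 is assumed as the target encoding.

```python
import os
import tempfile


def convert_encoding(src_path, dst_path, src_encoding="gb18030", dst_encoding="utf-8"):
    """Re-encode a text file (hypothetical helper, not from the thread)."""
    # Decode the bytes on disk into text using the source encoding.
    # newline="" leaves line endings (e.g. CRLF in .srt files) untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    # Encode the text back into bytes using the target encoding.
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
    return text


# Build a small GB18030-encoded sample file in a temporary directory.
sample = "1\r\n00:00:01,000 --> 00:00:02,000\r\n你好\r\n"
workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "sample_gb18030.srt")
dst_file = os.path.join(workdir, "sample_utf8.srt")
with open(src_file, "wb") as f:
    f.write(sample.encode("gb18030"))

converted = convert_encoding(src_file, dst_file)
```

Writing to a separate output file, instead of overwriting the input as the original snippet does, keeps the source intact if the decoding step fails.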

Frank Potter wrote:
> But it seems that python can't directly write unicode to a file,

You need to use the open function from the codecs module:

>>> import codecs
>>> a = codecs.open("pru_uni.txt", "w", "utf-8")
>>> txt = unicode("campeón\n", "utf-8")
>>> a.write(txt)
>>> a.close()

So, then, from command line:

facundo@expiron:~$ file pru_uni.txt
pru_uni.txt: UTF-8 Unicode text

:)

Regards,

--
.. Facundo
..
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
Jan 17 '07 #3
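
The error in the question arises because Python 2 implicitly encodes a unicode string with the ASCII codec when it is written to a plain file object. A minimal sketch of the same failure and its fix, assuming Python 3 (where the encoding step has to be spelled out explicitly); the variable names are illustrative only:

```python
text = "campeón"

# Encoding non-ASCII text with the ASCII codec fails, just like the
# implicit conversion in the original question.
ascii_failed = False
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_failed = True
    print("ascii codec failed:", exc.reason)

# Choosing a codec that covers the characters succeeds and round-trips.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)
```

Both replies avoid the failure the same way: `codecs.open` is given an explicit encoding, so the ASCII default is never used.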
