pipeline encoding

My locale is set to UTF-8. The command:
python -c "print u'\u03A9'"
gives me the desired result and doesn't produce any error.

But when I want to redirect the output to a file I invoke:
python -c "print u'\u03A9'" > file.txt
I get an error:

File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u03a9' in
position 0: ordinal not in range(128)

How to cope with it?

-tt.
Dec 6 '07 #1
Tomasz Toczyski wrote:
> My locale is set to UTF-8. The command:
> python -c "print u'\u03A9'"
> gives me the desired result and doesn't produce any error.
>
> But when I want to redirect the output to a file I invoke:
> python -c "print u'\u03A9'" > file.txt
> I get an error:
>
> File "<string>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u03a9' in
> position 0: ordinal not in range(128)
>
> How to cope with it?

Python tries to guess the stdout encoding from the terminal
settings, so the first print works.

However, redirecting to a file means that it can't do so, because it doesn't
(and shouldn't) make any assumptions about the desired output encoding -
after all, it might be appending to an XML file with e.g. latin1 encoding.

So you need to explicitly encode the unicode object with the desired
encoding:
python -c "print u'\u03A9'.encode('utf-8')" > file.txt
Diez
Dec 6 '07 #2
Tomasz Toczyski <tt@praterm.com.pl> writes:
> My locale is set to UTF-8. The command:
> python -c "print u'\u03A9'"
> gives me the desired result and doesn't produce any error.
>
> But when I want to redirect the output to a file I invoke:
> python -c "print u'\u03A9'" > file.txt
> I get an error:
>
> File "<string>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u03a9' in
> position 0: ordinal not in range(128)
>
> How to cope with it?

If you print to a terminal, Python can use the terminal encoding,
but if you redirect to a file, Python doesn't know
what encoding to use (e.g. how an existing file was encoded)
and refuses to guess.
You have to specify that encoding explicitly:
python -c "print u'\u03A9'.encode('utf-8')" > file.txt

HTH,
Rob
Dec 6 '07 #3
Diez B. Roggisch wrote:
> Python tries and guesses the stdout-encoding based on the terminal
> settings. So the first print works.
>
> However, piping to a file means that it can't do so, because it doesn't
> (and shouldn't) make any assumptions on the output encoding desired -
> after all, it might be appending to a XML-file with e.g. latin1
> encoding.
>
> So you need to explicitly encode the unicode-object with the desired
> encoding:
> python -c "print u'\u03A9'.encode('utf-8')" > file.txt

Thanks. It is a solution to my problem, but:

Is there any command-line option for telling Python what encoding to use
for stdout?

To be honest, I have a more complicated program than the example that I
have presented - there are many print commands inside and it is not very
feasible for me to put .encode('utf-8') inside every print occurrence.

-tt.
Dec 6 '07 #4
Tomek Toczyski wrote:
> Diez B. Roggisch:
>> Python tries and guesses the stdout-encoding based on the terminal
>> settings. So the first print works.
>>
>> However, piping to a file means that it can't do so, because it
>> doesn't (and shouldn't) make any assumptions on the output encoding
>> desired - after all, it might be appending to a XML-file with e.g.
>> latin1 encoding.
>>
>> So you need to explicitly encode the unicode-object with the desired
>> encoding:
>> python -c "print u'\u03A9'.encode('utf-8')" > file.txt
>
> Thanks. It is a solution to my problem, but:
>
> Is there any command-line option for telling Python what encoding to
> use for stdout?
>
> To be honest, I have a more complicated program than the example that I
> have presented - there are many print commands inside and it is not very
> feasible for me to put .encode('utf-8') inside every print occurrence.

No, there isn't, and it's easily remedied by doing
def eprint(msg):
    print msg.encode('utf-8')

and then doing

eprint('whatever')

instead of

print 'whatever'

Diez
Dec 6 '07 #5
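A slightly more defensive variant of the eprint helper from reply #5 might look like the sketch below. It assumes that some callers pass byte strings that are already encoded: in Python 2, calling .encode('utf-8') on a byte string first decodes it as ASCII, so a non-ASCII byte string would raise UnicodeDecodeError. The name to_utf8 and the stream parameter are hypothetical, not from the thread, and the code is written to run on both Python 2 and Python 3.

```python
import sys


def to_utf8(msg):
    """Return msg as UTF-8 bytes; byte strings pass through unchanged."""
    if isinstance(msg, bytes):  # on Python 2, bytes is an alias for str
        return msg
    return msg.encode("utf-8")


def eprint(msg, stream=None):
    """Write msg plus a newline to a byte-oriented stream."""
    if stream is None:
        # Python 3 exposes the byte stream as sys.stdout.buffer; Python 2's
        # sys.stdout already accepts byte strings directly.
        stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(to_utf8(msg) + b"\n")
```

Passing an explicit stream (for example an io.BytesIO) makes the helper easy to check without redirecting the real stdout.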
Are there any command line option for telling python what encoding to
use for stdout?
Not a command line option. However, you can wrap sys.stdout with a
stream that automatically performs an encoding. If all your print
statements output Unicode strings, you can do

sys.stdout = codecs.getwriter("utf-8")(sys.stdout)

HTH,
Martin
Dec 6 '07 #6
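To see what the wrapper from reply #6 does to the bytes, here is a minimal sketch that applies codecs.getwriter to an in-memory stream instead of the real sys.stdout. Using io.BytesIO as a stand-in for a redirected stdout is an assumption made only so the result can be inspected; the variable names are illustrative.

```python
import codecs
import io

# io.BytesIO stands in for a redirected, byte-oriented stdout (an assumption
# made so the example can be checked without touching the real sys.stdout).
raw = io.BytesIO()
writer = codecs.getwriter("utf-8")(raw)

# Text written to the wrapper is encoded before it reaches the byte stream.
writer.write(u"\u03a9\n")
encoded = raw.getvalue()  # the two-byte UTF-8 form of U+03A9, then a newline
```

The caveat "if all your print statements output Unicode strings" matters in Python 2: a non-ASCII byte string written to the wrapper would be implicitly decoded as ASCII first and fail.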
Tomek Toczyski <gu*@kajak.org.pl> writes:
> Is there any command-line option for telling Python what encoding to
> use for stdout?
>
> To be honest, I have a more complicated program than the example that I
> have presented - there are many print commands inside and it is not
> very feasible for me to put .encode('utf-8') inside every print
> occurrence.

You can use sitecustomize.py [1]_ for that purpose, e.g.
create this file in your current directory:

# sitecustomize.py
import sys
sys.setdefaultencoding('utf-8')

and run Python like this:

PYTHONPATH=. python -c "print u'\u03A9'" > file.txt

But remember that when you copy this file to the global
PYTHONPATH on your system it will affect all Python
programs.

.. [1] http://docs.python.org/lib/module-site.html

HTH,
Rob
Dec 6 '07 #7
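For readers on later interpreters: Python 2.6 and newer (released after this thread) read the PYTHONIOENCODING environment variable, which sets the encoding of stdin/stdout/stderr without touching any print statement or the default encoding. The sketch below runs a child interpreter with that variable set and captures its output through a pipe; the function name run_with_io_encoding is hypothetical, and the snippet assumes sys.executable points at a usable interpreter.

```python
import os
import subprocess
import sys


def run_with_io_encoding(encoding):
    """Run a child interpreter with PYTHONIOENCODING set; return its raw stdout."""
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    child_code = "print(u'\\u03a9')"  # valid under both Python 2 and Python 3
    # stdout is a pipe here, so the child cannot take its encoding from a terminal.
    return subprocess.check_output([sys.executable, "-c", child_code], env=env)
```

On the shell this corresponds to prefixing the command with PYTHONIOENCODING=utf-8, in the same way the PYTHONPATH example above prefixes the command with an environment variable.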
Tomek Toczyski wrote:
> Diez B. Roggisch:
>> Python tries and guesses the stdout-encoding based on the terminal
>> settings. So the first print works.
>>
>> However, piping to a file means that it can't do so, because it doesn't
>> (and shouldn't) make any assumptions on the output encoding desired -
>> after all, it might be appending to a XML-file with e.g. latin1
>> encoding.
>>
>> So you need to explicitly encode the unicode-object with the desired
>> encoding:
>> python -c "print u'\u03A9'.encode('utf-8')" > file.txt
>
> Thanks. It is a solution to my problem, but:
>
> Is there any command-line option for telling Python what encoding to use
> for stdout?
>
> To be honest, I have a more complicated program than the example that I
> have presented - there are many print commands inside and it is not very
> feasible for me to put .encode('utf-8') inside every print occurrence.

Alternatively, you can wrap stdout:

# -*- coding: utf-8 -*-
import sys

if sys.stdout.encoding is None:
    import locale
    import codecs

    encoding = locale.getpreferredencoding() # or just "utf-8"
    streamwriter = codecs.lookup(encoding).streamwriter
    sys.stdout = streamwriter(sys.stdout)

print u"ähnlich üblich möglich"
Dec 6 '07 #8
On Thu, 06 Dec 2007 14:13:04 -0300, Tomasz Toczyski <tt@praterm.com.pl>
wrote:
> My locale is set to UTF-8. The command:
> python -c "print u'\u03A9'"
> gives me the desired result and doesn't produce any error.

Because in this case stdout is bound to your terminal and Python can ask
the OS which encoding it uses.

> But when I want to redirect the output to a file I invoke:
> python -c "print u'\u03A9'" > file.txt
> I get an error:
>
> File "<string>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u03a9' in
> position 0: ordinal not in range(128)

In this case, stdout is redirected, and a file can be written in any
encoding you like. So unless you tell Python which encoding to use, it
refuses to guess. Try:

python -c "print u'\u03A9'.encode('utf-8')" > file.txt

Also try: python -c "import sys; print sys.stdout.encoding"
and see what happens in both cases.

--
Gabriel Genellina

Dec 7 '07 #9
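The check suggested in reply #9 can be folded into a small helper. In the Python 2 of this thread, sys.stdout.encoding is None when stdout is redirected, and print then falls back to the default ASCII codec, which is where the error comes from. The sketch below mimics that fallback; effective_encoding and the two stand-in streams are hypothetical names chosen for illustration.

```python
import io


def effective_encoding(stream, fallback="ascii"):
    """Return the stream's declared encoding, or fallback if it has none."""
    return getattr(stream, "encoding", None) or fallback


# A text wrapper with a declared encoding, standing in for a terminal.
terminal_like = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
# A bare byte stream with no encoding attribute, standing in for a redirect.
redirected_like = io.BytesIO()
```

Calling effective_encoding(sys.stdout) once at a terminal and once with output redirected shows the same difference that the one-liner in reply #9 prints.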
"Martin v. Lwis":
> Not a command line option. However, you can wrap sys.stdout with a
> stream that automatically performs an encoding. If all your print
> statements output Unicode strings, you can do
>
> sys.stdout = codecs.getwriter("utf-8")(sys.stdout)

It is the best solution for me.
Thanks.

-tt.
Dec 7 '07 #10
