469,267 Members | 1,085 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,267 developers. It's quick & easy.

UTF-8 output problems

I am having a slight problem with UTF-8 output with Python. I have the
following program:

x = 0

while x < 0x4000:
print u"This is Unicode code point %d (0x%x): %s" % (x, x,
unichr(x))
x += 1

This program works perfectly when run directly:

mbt@pepper:~/tmp$ python test.py
This is Unicode code point 0 (0x0):
This is Unicode code point 1 (0x1):
This is Unicode code point 2 (0x2):
This is Unicode code point 3 (0x3):
This is Unicode code point 4 (0x4):
This is Unicode code point 5 (0x5):
This is Unicode code point 6 (0x6):
This is Unicode code point 7 (0x7):
This is Unicode code point 8 (0x8):
This is Unicode code point 9 (0x9):
This is Unicode code point 10 (0xa):
(... continued)

However, when I attempt to redirect the output to a file:

mbt@pepper:~/tmp$ python test.py >f
Traceback (most recent call last):
File "test.py", line 6, in <module>
print u"This is Unicode code point %d (0x%x): %s" % (x, x,
unichr(x))
UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in
position 39: ordinal not in range(128)

This is slightly confusing to me. The output goes all the way to the
end of the program when it is not redirected. Why is Python treating
the situation differently when the output is redirected? This failure
occurs for all redirection, by the way: >, >>, 1>2, pipes, and so forth.

Any ideas?

— Mike

--
Michael B. Trausch
fd****@gmail.com
Phone: (404) 592-5746
Jabber IM:
fd****@gmail.com
fd****@livejournal.com
Demand Freedom! Use open and free protocols, standards, and software!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQBF8gW+0kE/IBnFmjARAg4SAJ0RBrk/+W1udAMJXVGN1ev5Cid1MwCePLEj
N/AcFNwgm9mgYtP61Z9HYs0=
=w41X
-----END PGP SIGNATURE-----

Mar 10 '07 #1
2 5599
In <ma***************************************@python. org>, Michael B.
Trausch wrote:
However, when I attempt to redirect the output to a file:

mbt@pepper:~/tmp$ python test.py >f
Traceback (most recent call last):
File "test.py", line 6, in <module>
print u"This is Unicode code point %d (0x%x): %s" % (x, x,
unichr(x))
UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in
position 39: ordinal not in range(128)

This is slightly confusing to me. The output goes all the way to the
end of the program when it is not redirected. Why is Python treating
the situation differently when the output is redirected?
If you print to a terminal `sys.stdout` is connected to that terminal and
there are ways to figure out that it is a terminal (`os.isatty()`) and
which encoding the terminal excepts. At least in most cases. But there
is no way to tell what encoding a file or pipe should have. So Python
refuses to guess.

If an encoding could be determined the `sys.stdout.encoding` attribute is
set to the name, otherwise it's `None`.

Ciao,
Marc 'BlackJack' Rintsch
Mar 10 '07 #2
Michael B. Trausch wrote:
I am having a slight problem with UTF-8 output with Python. I have the
following program:

x = 0

while x < 0x4000:
print u"This is Unicode code point %d (0x%x): %s" % (x, x,
unichr(x))
x += 1

This program works perfectly when run directly:

mbt@pepper:~/tmp$ python test.py
This is Unicode code point 0 (0x0):
This is Unicode code point 1 (0x1):
This is Unicode code point 2 (0x2):
This is Unicode code point 3 (0x3):
This is Unicode code point 4 (0x4):
This is Unicode code point 5 (0x5):
This is Unicode code point 6 (0x6):
This is Unicode code point 7 (0x7):
This is Unicode code point 8 (0x8):
This is Unicode code point 9 (0x9):
This is Unicode code point 10 (0xa):
(... continued)

However, when I attempt to redirect the output to a file:

mbt@pepper:~/tmp$ python test.py >f
Traceback (most recent call last):
File "test.py", line 6, in <module>
print u"This is Unicode code point %d (0x%x): %s" % (x, x,
unichr(x))
UnicodeEncodeError: 'ascii' codec can't encode character u'\x80' in
position 39: ordinal not in range(128)

This is slightly confusing to me. The output goes all the way to the
end of the program when it is not redirected. Why is Python treating
the situation differently when the output is redirected? This failure
occurs for all redirection, by the way: >, >>, 1>2, pipes, and so forth.

Any ideas?
In complement to Marc reply, you can open a file with a specific encoding
(see codecs.open() function), and use print >f,... to fill that file.

A+

Laurent.
Mar 10 '07 #3

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

9 posts views Thread by lawrence | last post: by
4 posts views Thread by Alban Hertroys | last post: by
38 posts views Thread by Haines Brown | last post: by
32 posts views Thread by Wolfgang Draxinger | last post: by
6 posts views Thread by archana | last post: by
7 posts views Thread by Jimmy Shaw | last post: by
23 posts views Thread by Allan Ebdrup | last post: by
reply views Thread by zhoujie | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.