Program works great, except under less, cron or execl (Unicode?)

Sam
I have a program which works great when run from the command line.

But when I run it combined with something else such as:
- piping it through less
- cron
- execl (i.e. calling it from another python program)

it gives me a Unicode error:

  File "../myparser.py", line 261, in set_attributes
    print "self.atd['Name'] is: ", self.atd['Name']
UnicodeEncodeError: 'ascii' codec can't encode character u'\xeb' in position 7: ordinal not in range(128)

I'd post the whole program here, except it involves weird Unicode
strings.

I could probably change the program to get it working under less/cron/
etc.

But I'd rather understand exactly what the issue is. Why does it work
fine when run directly from the command line, but not otherwise?
Sep 18 '08 #1
4 Replies


Sam wrote:
But I'd rather understand exactly what the issue is. Why does it work
fine when run directly from the command line, but not otherwise?

Most probably because when not running directly inside a terminal, it gets
its stdin/stdout as pipes, and Python can't guess the proper encoding for a
pipe the way it does for a terminal.

And thus, when you print unicode to the pipe, it can't decide which encoding
to use and falls back to the default 'ascii' codec, which can't represent
u'\xeb'.

To circumvent this, try wrapping stdout in a codecs-module wrapper with a
proper encoding applied (e.g. utf-8).

You might do that conditionally, based on whether sys.stdout.encoding is
set, although I'm not 100% sure what it actually gets set to when used in a
subprocess. But this should give you an idea of where to look.

Diez
Sep 18 '08 #2
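
A minimal sketch of the wrapping suggested in the reply above, assuming UTF-8 is the encoding you want. The helper names wrap_utf8 and ensure_encoded_stdout and the sample string are illustrative, not from the thread. The getattr on "buffer" is only there so the same sketch also runs on Python 3, where sys.stdout expects text rather than bytes.

```python
import codecs
import io
import sys


def wrap_utf8(byte_stream):
    """Return a writer that encodes unicode text to UTF-8 before writing."""
    return codecs.getwriter("utf-8")(byte_stream)


def ensure_encoded_stdout():
    """Wrap sys.stdout only when no encoding could be detected for it."""
    if getattr(sys.stdout, "encoding", None) is None:
        # Python 3 exposes the underlying byte stream as sys.stdout.buffer;
        # on Python 2, sys.stdout itself accepts byte strings.
        raw = getattr(sys.stdout, "buffer", sys.stdout)
        sys.stdout = wrap_utf8(raw)


# The failure mode from the traceback: ASCII cannot represent u'\xeb'.
try:
    u"Zo\xeb".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Demonstration on an in-memory byte stream standing in for a pipe.
pipe = io.BytesIO()
out = wrap_utf8(pipe)
out.write(u"Zo\xeb\n")  # hypothetical sample text containing u'\xeb'
encoded = pipe.getvalue()
```

In a real script, calling ensure_encoded_stdout() once near the top, before the first print, would be the conditional wrapping described above.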

Most probably because when to running directly inside a terminal, it gets

That was of course meant to be "not running directly inside a terminal".
Diez
Sep 18 '08 #3

Sam
Diez for the win... :)

sys.stdout.encoding does indeed have the proper value, UTF-8, when called
from the command line.

But when piped into anything or called from anywhere else it's None.

Just for completeness, here's my test program:
#!/usr/bin/env python
import sys
print sys.stdout.encoding

And here are the results:
$ ./encoding.py
UTF-8
$ ./encoding.py | cat
None
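
For anyone reproducing this on a newer interpreter: Python 3 usually still reports an encoding when piped, so the None above is specific to Python 2. The terminal-versus-pipe difference itself can still be observed with sys.stdout.isatty(). A small sketch, where CHILD and piped_result are illustrative names and not part of the test program above:

```python
import subprocess
import sys

# Child program: report whether its stdout is attached to a terminal.
CHILD = "import sys; sys.stdout.write(str(sys.stdout.isatty()))"

# Capturing the child's output hands it a pipe as stdout, much like `| cat`.
piped_result = subprocess.check_output(
    [sys.executable, "-c", CHILD]
).decode("ascii").strip()
```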

Really, really annoying!

So how can I set sys.stdout.encoding so it's UTF-8 when piped through
cat (or anything else)?

I tried assigning to it, but no dice.
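
One option not mentioned in this thread, and only available from Python 2.6 onwards: the PYTHONIOENCODING environment variable sets the encoding of the standard streams even when stdout is a pipe. A hedged sketch that launches a child interpreter with the variable set; CHILD, env and reported are illustrative names:

```python
import os
import subprocess
import sys

# Child program: write out the encoding Python chose for its stdout.
CHILD = "import sys; sys.stdout.write(str(sys.stdout.encoding))"

# Copy the current environment and request UTF-8 for the standard streams.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# stdout is a pipe here, yet the child should honour PYTHONIOENCODING.
reported = subprocess.check_output(
    [sys.executable, "-c", CHILD], env=env
).decode("ascii").strip()
```

From a shell, the equivalent would be running `PYTHONIOENCODING=utf-8 ./encoding.py | cat`, assuming the interpreter is 2.6 or newer.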

Sep 19 '08 #4

In message
<c8**********************************@v16g2000prc. googlegroups.com>, Sam
wrote:
So how can I set sys.stdout.encoding so it's UTF-8 when piped through
cat (or anything else).

I tried assigning to it, but no dice.
You could try wrapping it in a file object that does explicit encoding
translation, using codecs.EncodedFile
<http://docs.python.org/lib/module-codecs.html>.
Sep 24 '08 #5
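
A small sketch of codecs.EncodedFile, with one caveat: it transcodes byte strings from one encoding to another, so it fits data that is already encoded (Latin-1 is assumed here purely for illustration). For unicode objects like the one in the traceback, a codecs.getwriter wrapper is probably the closer fit. The names pipe and recoder are illustrative.

```python
import codecs
import io

# In-memory byte stream standing in for a piped stdout.
pipe = io.BytesIO()

# Bytes written to `recoder` are decoded as Latin-1 (data_encoding)
# and re-encoded as UTF-8 (file_encoding) before reaching `pipe`.
recoder = codecs.EncodedFile(pipe, "latin-1", "utf-8")

# b"\xeb" is the Latin-1 byte for the character u'\xeb'.
recoder.write(b"Zo\xeb")
transcoded = pipe.getvalue()
```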
