Python beginner, unicode encode/decode Q

1 Objective: to write little programs to help me learn German. See the
code after the numbered comments. Thanks in advance for any direction or
suggestions.

tk

2 I want keyboard answer input, for example:

answer_str = raw_input(' Enter answer ') Herr Üü

[ I type in the following characters: Herr Üü ]
print answer_str
Output on screen is Herr Üü

3 The code in history 1 and 2 was run interactively under Debian Linux
(Python 2.4) and interactively under Windows 98 first edition (IDLE,
Python 2.3.5), and it works.

4 The code in history 3 and 4, run from within a .py file, produces
different output from the example in the book.

5 I want to work under Debian Linux, but because the program failed
there when I ran the code from a file, I thought I should fire up the
Win98 IDLE/Python and see if it ran there. It failed there too when run
from within a file.

6 The sample code is from pages 108-109 of "Python for Dummies".
The book says: "Python's file objects and StringIO objects
don't support raw Unicode; the usual workaround is to encode Unicode as
UTF-8 before saving it to a file or StringIO object."
The sample code from the book is French, as indicated here, but trying
German produces the same result.

7 I have searched the net under all the keywords I could think of, but
this is as close as I have got to accomplishing my task. I suspect I may
not be understanding "StringIO objects don't support raw Unicode", but I
don't know.
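
As a rough illustration of the workaround quoted in item 6 (a sketch, not the book's code): encode the Unicode string to UTF-8 bytes before writing, and decode again after reading. `io.BytesIO` is an assumption here, standing in for a file opened in binary mode; it needs Python 2.6 or later, so it would not run on the 2.3/2.4 interpreters mentioned above.

```python
import io

text = u"Libert\u00e9"             # Unicode string; \u00e9 is e with an acute accent
encoded = text.encode("utf-8")     # plain bytes; the accented letter takes two bytes

buffer = io.BytesIO()              # in-memory stand-in for a binary file
buffer.write(encoded)              # only encoded bytes are written

restored = buffer.getvalue().decode("utf-8")   # decode when reading back
print(restored == text)
```

The point is only that the encode step and the decode step must name the same encoding; the round trip then gives back the original string.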
# -*- coding: utf-8 -*-

# code run under Debian Linux interactively from a terminal, and it works

print " u'Libert\u00e9' "

# y = raw_input('Enter >') commented out

y = u'Lbert\u00e9'
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

history 1 works and here is the screen copy of the interactive session
>>> y = raw_input ('>')
>Libert\xc3\xa9
>>> q = 'Libert\xc3\xa9'
>>> q.decode('utf-8')
u'Libert\xe9'
>>> print q
Liberté
>>>
[ screen output is next line ]

Lberté

history 2
# code run under Win98, first edition, within IDLE interactively; it
# succeeded in producing correct results.
# y = raw_input('Enter >') commented out

y = u'Lbert\u00e9'
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

history 1 works and here is the screen copy of the interactive session
>>> y = raw_input ('>')
>Libert\xc3\xa9
>>> q = 'Libert\xc3\xa9'
>>> q.decode('utf-8')
u'Libert\xe9'
>>> print q
Liberté
>>>
[ screen output is next line ]

Lberté


# history 3

# this code is run from within IDLE on Win98, inside a Python file.
# The code DOES NOT produce the proper output.

# -*- coding: utf-8 -*-

# print "u'Libert\u00e9'" printed to screen

y = raw_input('Enter >')

# y = u'Lbert\u00e9' commented out

y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

# the output on the lines below was produced on the screen after the run

enter u'Libert\u00e9' on screen to copy into the y string
Enter >u'Libert\u00e9'

u'Libert\u00e9'

The code DOES NOT produce Liberté but instead produces u'Libert\u00e9'

# history 4

# this code is run from within a terminal on Debian Linux, inside a
# Python file.
# The code does not produce the proper output but produces the same
# output as when run on Windows.

# -*- coding: utf-8 -*-

print "u'Libert\u00e9'"  # printed to screen

y = raw_input('Enter >')

# y = u'Lbert\u00e9' commented out

y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

# the output on the lines below was produced on the screen after the run

enter u'Libert\u00e9' on screen to copy into the y string
Enter >u'Libert\u00e9'
u'Libert\u00e9'

The code DID NOT produce Liberté but instead produced u'Libert\u00e9'
Jul 14 '08 #1
1 Reply


raw_input returns what you entered. You entered u'Libert\u00e9' so
that's what was printed out.
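
To make that concrete, here is a small sketch comparing what was typed at the prompt with what the same characters mean when they appear as a literal in source code. The variable names are just for illustration. The typed text is 15 ordinary characters (quotes and backslash included), while the literal is the 7-character word.

```python
# Exactly the characters typed at the prompt. The doubled backslash is
# only how a single literal backslash is written in source code.
typed = "u'Libert\\u00e9'"

# What the interpreter builds when u'Libert\u00e9' appears in a program.
literal = u"Libert\u00e9"

print(len(typed))         # length of the typed text
print(len(literal))       # length of the actual word
print(typed == literal)   # the two are different strings
```

Escape sequences are processed by the Python parser when it reads source code, not by raw_input when it reads the keyboard.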

If you want to be able to enter escape sequences like \u00e9 and have
them decoded to the appropriate character then you must do something
like this:

# The code
text = raw_input('Enter >')
decoded_text = text.decode("unicode-escape")
print decoded_text
# The output
Enter >Libert\u00e9
Liberté
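
The same idea can be wrapped in a small helper so it is easy to try without typing at a prompt. This is only a sketch: `decode_typed_escapes` is a made-up name, and it assumes the typed text is plain ASCII (which is the case when accents are entered as \uXXXX escapes). It is written so that it should also run on Python 3, where the text has to be turned into bytes before the "unicode_escape" codec can be applied.

```python
def decode_typed_escapes(text):
    """Turn typed escape sequences such as \\u00e9 into real characters.

    Assumes `text` contains only ASCII characters; anything else
    raises an error rather than being silently mangled.
    """
    return text.encode("ascii").decode("unicode_escape")


typed = "Libert\\u00e9"      # 12 characters: Libert, a backslash, then u00e9
decoded = decode_typed_escapes(typed)

print(len(typed))
print(len(decoded))
print(decoded == u"Libert\u00e9")
```

If the user types the accented letters directly instead of as escapes, this helper is the wrong tool; that input would need to be decoded with the terminal's own encoding instead.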

HTH
Jul 14 '08 #2
