Is unicode_escape broken?


I am confused by unicode_escape functionality - it doesn't seem to
follow string_escape functionality.

I would expect that, given the same string (a byte string for
string_escape and the equivalent unicode string for unicode_escape),
the two codecs would produce more or less the same output, but:
>>> "\t\\t".encode('string_escape')
'\\t\\\\t'
>>> u"\t\\t".encode('unicode_escape')
'\\t\\t'

(I would have expected "\\t\\\\t" )

and then round-tripping also seems to be broken for unicode_escape:
>>> "\t\\t".encode('string_escape').decode('string_escape')
'\t\\t'
>>> u"\t\\t".encode('unicode_escape').decode('unicode_escape')
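The round-trip expectation can be written as a small check. This is only a sketch, and it assumes a Python 3 interpreter, where str.encode('unicode_escape') returns bytes and the string_escape codec no longer exists; the helper name round_trips and the sample strings are made up for illustration and are not part of the original session.

```python
def round_trips(text):
    """Return True if unicode_escape encoding followed by decoding gives back text."""
    escaped = text.encode("unicode_escape")  # bytes on Python 3 (assumed interpreter)
    return escaped.decode("unicode_escape") == text


# Hypothetical sample inputs: the string from the post, a lone backslash, plain text.
samples = ["\t\\t", "\\", "plain"]
results = {s: round_trips(s) for s in samples}

for s in samples:
    print(repr(s), results[s])
```

A codec that escapes the tab but leaves the backslash alone cannot pass this check, because the decoder then reads the unescaped backslash as the start of a new escape sequence.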


Python Version "Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310
32 bit (Intel)] on win32"


Dec 13 '05 #1
I also believe this is a bug.

Here's an even shorter demonstration of the behavior:
>>> u"\\".encode("unicode_escape").decode("unicode_escape")

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 0: \ at end of string
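The traceback suggests the encoder emitted a single unescaped backslash, which the decoder then treats as an incomplete escape sequence. A minimal sketch of the decoder side, assuming a Python 3 interpreter and feeding it bytes directly; try_decode is a hypothetical helper, not something from the session above:

```python
def try_decode(raw):
    """Decode raw bytes with unicode_escape; return (ok, value_or_error_message)."""
    try:
        return True, raw.decode("unicode_escape")
    except UnicodeDecodeError as exc:
        return False, str(exc)


lone = try_decode(b"\\")       # one backslash byte: an incomplete escape sequence
doubled = try_decode(b"\\\\")  # an escaped backslash: decodes to a single backslash

print(lone)
print(doubled)
```

So for the round trip to work, the encoder has to write a backslash as two backslash bytes, which is what string_escape does in the first example.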

To report a bug, follow the directions at the bottom of this page:

To report a bug not listed above, always check the SourceForge Bug
Tracker[1] to see if they've already been reported. Use the bug tracker to
report new bugs. If you have a patch, please use the SourceForge Patch
Manager[2]. Please mention that you are reporting a bug in 2.4.2, and note
that you must have a SourceForge account and be logged in to submit a
bug report or patch (we require this in case we need more information
from you).

[1] http://sourceforge.net/bugs/?group_id=5470
[2] http://sourceforge.net/patch/?group_id=5470


Dec 14 '05 #2

