re: unicode wrap unicode object?
ygao wrote:
> I must use utf-8 for Chinese.
Sure. But please don't do that:
> >>> import sys
> >>> reload(sys)
> >>> sys.setdefaultencoding("utf-8")
As Fredrik says, you should really avoid changing the
default encoding.
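Rather than changing the default, you can name the encoding explicitly wherever bytes enter or leave the program. A minimal sketch, assuming Python 3 syntax (where `bytes` and `str` are separate types); the helper names `read_text` and `write_text` are hypothetical and only for illustration:

```python
# Hypothetical helpers: convert explicitly at the program boundaries
# instead of relying on (or changing) the interpreter's default encoding.

def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode incoming bytes to text, naming the encoding explicitly."""
    return raw.decode(encoding)


def write_text(text: str, encoding: str = "utf-8") -> bytes:
    """Encode text back to bytes only when it leaves the program."""
    return text.encode(encoding)


raw = b"\xe9\xab\x98"      # the three UTF-8 bytes from this thread
text = read_text(raw)      # a single character, U+9AD8
assert text == "\u9ad8"
assert write_text(text) == raw
```

Inside the program everything stays as text; the encoding only appears at the two conversion points.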
> >>> s='\xe9\xab\x98' #this utf-8 string
> >>> ss=U'\xe9\xab\x98'
> >>> ss1=ss.encode('unicode_escape').decode('string_escape')
> >>> s1=s.decode('unicode_escape')
> >>> s1==ss
> True
> >>> ss1==s
> True
Ok. But how about that:
py> s='\xe9\xab\x98'
py> ss=u'\u9ad8'
py> s1=s.decode('utf-8')
py> s1==ss
True
Here, ss is a single character, which uses 3 bytes in UTF-8.
In your example, ss has three characters (U+00E9, U+00AB, U+0098),
which are not Chinese, but Latin-1 (European) code points.
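The difference can be checked directly. This is a small sketch, again assuming Python 3 syntax; decoding with `latin-1` is used here as a stand-in for what `U'\xe9\xab\x98'` does, since both map each byte value to the code point of the same number:

```python
import unicodedata

raw = b"\xe9\xab\x98"                # the three bytes from the thread

one_char = raw.decode("utf-8")       # interpret the bytes as UTF-8
three_chars = raw.decode("latin-1")  # one code point per byte value

# One character, three bytes in UTF-8.
print(len(one_char), hex(ord(one_char)))
print(unicodedata.name(one_char))

# Three separate characters, none of them Chinese.
print(len(three_chars), [hex(ord(c)) for c in three_chars])
```

The same three bytes give one CJK ideograph under UTF-8 and three unrelated code points when each byte is taken as a character on its own.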
Regards,
Martin