Can the compile function have a bug?

>>> compile('U"中"','c:/test','single')
<code object ? at 00F06B60, file "c:/test", line 1>
>>> d=compile('U"中"','c:/test','single')
>>> d
<code object ? at 00F06BA0, file "c:/test", line 1>
>>> exec(d)
u'\xd6\xd0'
>>> U"中"
u'\u4e2d'
>>>
Why is the result different?
Is this a bug, or is there another reason?

Oct 8 '06 #1
4 Replies


ygao wrote:
> >>> compile('U"中"','c:/test','single')
> <code object ? at 00F06B60, file "c:/test", line 1>
> >>> d=compile('U"中"','c:/test','single')
> >>> d
> <code object ? at 00F06BA0, file "c:/test", line 1>
> >>> exec(d)
> u'\xd6\xd0'
> >>> U"中"
> u'\u4e2d'
>
> why is the result different?
> a bug or another reason?

How that particular output came to be I don't know, but you should be able
to avoid the confusion by either passing a unicode string to compile() or
specifying the encoding:
>>> exec compile(u'u"中"','c:/test','single')
u'\u4e2d'
>>> exec compile('# -*- coding: utf8 -*-\nu"中"','c:/test','single')
u'\u4e2d'
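
For readers on Python 3, the two workarounds above can be sketched roughly as follows. This is only an illustration, not code from the thread: the `<demo>` file name, the variable names, and the choice of GBK for the byte source are all assumptions, and `'exec'` mode is used instead of `'single'` so the result can be captured in a namespace.

```python
# Sketch for Python 3 (the thread itself uses Python 2 syntax).
# Assumptions: '<demo>' is a placeholder file name, and the raw bytes
# are GBK-encoded, as they might be on a Chinese-locale console.

CHAR = '\u4e2d'  # the character discussed in the thread

# Workaround 1: pass text to compile(), so no byte decoding is involved.
ns_text = {}
exec(compile('value = "' + CHAR + '"', '<demo>', 'exec'), ns_text)

# Workaround 2: pass bytes, prefixed with a PEP 263 coding line that
# names the encoding the bytes are actually in.
source_bytes = (
    b'# -*- coding: gbk -*-\n'
    b'value = "' + CHAR.encode('gbk') + b'"\n'
)
ns_bytes = {}
exec(compile(source_bytes, '<demo>', 'exec'), ns_bytes)

# Both namespaces should now hold the same one-character string.
print(ascii(ns_text['value']), ascii(ns_bytes['value']))
```

The point in both cases is the same as in the Python 2 session: the compiler has to be told, one way or another, how the bytes inside the string literal are encoded.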

Peter

PS: In an all-UTF-8 environment I would have /expected/ to see
>>> your_encoding = "utf8"
>>> identity = "latin1"
>>> u'\u4e2d'.encode(your_encoding).decode(identity)
u'\xe4\xb8\xad'

and that's indeed what I get over here:
>>> exec compile('u"中"','c:/test','single')
u'\xe4\xb8\xad'
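
The round trip in the PS can be reproduced in Python 3 as a small sketch. The variable names here are illustrative assumptions; the only claim is that decoding UTF-8 bytes as Latin-1 turns each byte into one code point.

```python
# Sketch (Python 3): what happens when UTF-8 bytes are read as Latin-1.
ch = '\u4e2d'

utf8_bytes = ch.encode('utf-8')         # three bytes for this character
misread = utf8_bytes.decode('latin-1')  # one code point per byte

print(ascii(misread))  # three separate characters instead of one

# The damage is reversible as long as the wrong decoding was Latin-1:
recovered = misread.encode('latin-1').decode('utf-8')
```

Latin-1 maps every byte value to the code point with the same number, which is why it behaves like an "identity" decoding and why the misread string can be converted back without loss.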
Oct 9 '06 #2

Peter Otten wrote:
> PS: In an all-UTF-8 environment I would have /expected/ to see
> >>> your_encoding = "utf8"
> >>> identity = "latin1"
> >>> u'\u4e2d'.encode(your_encoding).decode(identity)
> u'\xe4\xb8\xad'
>
> and that's indeed what I get over here:
> >>> exec compile('u"中"','c:/test','single')
> u'\xe4\xb8\xad'

But it's not an all-UTF-8 environment; his_encoding = 'gb2312' or one
of its heirs/successors :-)
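
A quick way to check this explanation is to encode the character with several candidate encodings and see which ones give the two bytes from the original output. The following is a Python 3 sketch; the helper name `misread_as_latin1` and the list of candidates are assumptions made for illustration.

```python
# Sketch (Python 3): which console encodings would yield u'\xd6\xd0'?

def misread_as_latin1(ch, encoding):
    """Encode ch with `encoding`, then decode the bytes one-to-one as Latin-1."""
    return ch.encode(encoding).decode('latin-1')

ch = '\u4e2d'
observed = '\xd6\xd0'  # what exec(d) printed in the original post

# gb2312 plus two later supersets of it, and utf-8 for comparison.
candidates = ['gb2312', 'gbk', 'gb18030', 'utf-8']
matches = [enc for enc in candidates if misread_as_latin1(ch, enc) == observed]

print(matches)
```

Under these assumptions the GB-family encodings reproduce the observed output while UTF-8 does not, which is consistent with the byte string having been typed on a GB2312-style console.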

Cheers,
John

Oct 9 '06 #3

John Machin wrote:
> But it's not an all-UTF-8 environment; his_encoding = 'gb2312' or one
> of its heirs/successors :-)

Ouch. Almost understanding a problem hurts more than not understanding it at
all. I just had a refresher of the experience...

Peter
Oct 9 '06 #4

Peter Otten wrote:
> How that particular output came to be I don't know, but you should be able
> to avoid the confusion by either passing a unicode string to compile() or
> specifying the encoding:
> >>> exec compile(u'u"中"','c:/test','single')
> u'\u4e2d'
> >>> exec compile('# -*- coding: utf8 -*-\nu"中"','c:/test','single')
> u'\u4e2d'

this is what I want!
many thanks!
Oct 9 '06 #5
