On Thu, 18 Mar 2004 19:10:08 +0000, thehaas wrote:
> "Martin v. Löwis" <martin@v.loewis.de> wrote:
>> thehaas@binary.net wrote:
>> > Obviously, 'Grüß'!='Gr\xfc\xdf' .
>
>> It is not at all obvious that they are different. In fact, they
>> are the same, assuming the second string is encoding in Latin-1.
>
>> > Any ideas on how I can get the correct value?
>
>> Pray tell: what is the correct value?
>
> The correct value is 'Grüß', or at least have it equal to that.
>
> Maybe I should back up -- I'm interfacing into a Windows API. In that API, I see 'Grüß' as:
> >>> plist[-1].Reference
> u'Gr\xfc\xdf'
>
> My value in goodProcList is:
> >>> goodProcRef[18]
> 'Gr\xfc\xdf'
>
> (yeah, goodProcList isn't in Unicode -- that's probably the cause of all this)
>
> When I test their equality:
>
> >>> goodProcRef[18] == plist[-1].Reference
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 2: ordinal not in range(128)
>
> If I try to manually encode goodProcRef[18], I get the same thing:
>
> >>> goodProcRef[18].encode('utf-8')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 2: ordinal not in range(128)

In my experience, you must first decode a byte string before you can encode it. Calling .encode() on a plain str makes Python decode it with the default 'ascii' codec first, which is why you get a UnicodeDecodeError. So, assuming your data is Latin-1:
>>> goodProcRef='Gr\xfc\xdf'.decode('latin-1')
>>> goodProcRef
u'Gr\xfc\xdf'
Now you can compare goodProcRef and plist[-1].Reference and get True.
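
For anyone reading this on a newer interpreter, here is a minimal sketch of the same comparison written for Python 3, where bytes stands in for the old str and str stands in for the old unicode type. The variable names are made up for illustration, and Latin-1 is an assumption about how the original bytes were encoded.

```python
# Python 3 sketch: bytes stands in for the Python 2 byte string (str),
# and str stands in for the Python 2 unicode type.
raw_reference = b'Gr\xfc\xdf'   # hypothetical stand-in for goodProcRef[18]
api_reference = 'Gr\xfc\xdf'    # hypothetical stand-in for plist[-1].Reference

# This is the step Python 2 attempted implicitly with the 'ascii' codec:
# byte 0xfc is outside the ASCII range, so decoding fails.
try:
    raw_reference.decode('ascii')
    ascii_decodable = True
except UnicodeDecodeError:
    ascii_decodable = False

# Assumption: the bytes are Latin-1. Decode explicitly, then compare text to text.
decoded_reference = raw_reference.decode('latin-1')
references_match = decoded_reference == api_reference

print(ascii_decodable, references_match)
```

The point is to decode explicitly, with an encoding you actually know, before comparing or re-encoding anything.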
Once both values are unicode strings, you can encode them easily:
>>> goodProcRef.encode('UTF8')
'Gr\xc3\xbc\xc3\x9f'
>>> plist[-1].Reference.encode('UTF8')
'Gr\xc3\xbc\xc3\x9f'
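
If values arrive from several sources, it may be convenient to normalize everything to text at the boundary and only encode on the way out. The helper below is a hypothetical sketch (to_text is not part of any library), written for Python 3, and the 'latin-1' default is an assumption you would replace with the real encoding of your data.

```python
def to_text(value, encoding='latin-1'):
    """Return value as text, decoding it first if it is a byte string.

    Hypothetical helper; the 'latin-1' default is only an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Both the byte-string form and the text form end up as the same UTF-8 bytes.
utf8_from_bytes = to_text(b'Gr\xfc\xdf').encode('utf-8')
utf8_from_text = to_text('Gr\xfc\xdf').encode('utf-8')

print(utf8_from_bytes == utf8_from_text)
```

With that in place, comparisons happen on text only, and the choice of output encoding is made in one spot.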
Hope this helps,
Riccardo
--
-=Riccardo Galli=-
_,e.
s~ ``
~@. ideralis Programs
.. ol
`**~
http://www.sideralis.net