Jon Skeet [C# MVP] <sk***@pobox.com> wrote in
news:MP*********************@msnews.microsoft.com:
> Peter K <xd****@hotmail.com> wrote:
>> in the processing of some text files, I have found I have strings
>> like:
>>
>> f=E5t
>> pr=F8ve
>>
>> where the strings "=E5" and "=F8" are Danish characters "å" and "ø".
>> I can work this out myself, but how can my program know - or at least
>> what "decoding" do I need to do to get the correct characters in my
>> string?
> It's not entirely clear what you mean. Does the text file contain
> "=E5" but it *should* contain a Danish character, or were you doing
> some replacement for us?
>
> What's the source of the text file?
The text files are actually XML (if that makes a difference).
In fact, the XML nodes I am interested in contain a base64-encoded
string. So I extract this string, which is a long piece of text
starting:
"DQo8YnI+PGZvbnQgc2l6ZT0xIGNvbG9yPXdoaXRlIGZhY2U9IlZlcmRhbmEL".
I decode this base64 string and get another text string, which is in
HTML format (with all sorts of HTML tags in it).
Some of this HTML contains the 3 symbols
= E 5
for example. (No spaces between the symbols).
They obviously represent the Danish letter 'å' (obvious because a
Danish speaker can tell from the word it appears in).
So for example, the text file may contain strings like:
"f=E5et"
(I don't know how you see this string in your news reader - but I see it
as 6 characters: f = E 5 e t).
As a part of processing, my program needs to convert these strings (=E5)
to real text (å).
But I don't know what this encoding is, or how to decode it.
Thanks,
Peter
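
The "=E5" and "=F8" sequences match the quoted-printable transfer encoding (RFC 2045), where "=" followed by two hex digits stands for one byte. In ISO-8859-1, byte 0xE5 is 'å' and 0xF8 is 'ø'. Below is a minimal sketch of the two steps described in the post (base64-decode the node text, then turn the =XX escapes into characters). It is written in Python only to keep it short, not because the thread uses it. The function names are hypothetical, and the ISO-8859-1 charset is an assumption inferred from those two byte values; the post does not name a charset.

```python
import base64
import quopri

# Assumption: the post does not state the charset. ISO-8859-1 fits the
# byte values shown (0xE5 = å, 0xF8 = ø).
CHARSET = "iso-8859-1"


def decode_base64_node(b64_text: str) -> str:
    """Base64-decode the text of an XML node into the HTML string."""
    return base64.b64decode(b64_text).decode(CHARSET)


def decode_quoted_printable(text: str) -> str:
    """Replace quoted-printable =XX escapes with the characters they encode."""
    raw = quopri.decodestring(text.encode(CHARSET))
    return raw.decode(CHARSET)


# The base64 prefix quoted in the post decodes to the start of the HTML.
prefix = "DQo8YnI+PGZvbnQgc2l6ZT0xIGNvbG9yPXdoaXRlIGZhY2U9IlZlcmRhbmEL"
html_start = decode_base64_node(prefix)
assert html_start.startswith("\r\n<br><font size=1 color=white")

# The two sample words from the post.
assert decode_quoted_printable("f=E5et") == "f\u00e5et"    # "fået"
assert decode_quoted_printable("pr=F8ve") == "pr\u00f8ve"  # "prøve"
```

One caveat: the decoded prefix contains literal "=" signs in the HTML attributes (size=1, color=white) that were not escaped as "=3D". If the real data mixes unescaped "=" with =XX escapes, a blanket quoted-printable decode could mangle an attribute value whose first two characters happen to be hex digits, so it is worth checking the data before decoding everything.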