decoding =E5, =F8

Hi

in the processing of some text files, I have found I have strings like:

f=E5t
pr=F8ve

where the strings "=E5" and "=F8" are Danish characters "å" and "ø". I can
work this out myself, but how can my program know - or at least what
"decoding" do I need to do to get the correct characters in my string?

Thanks,
Peter

Jun 15 '07 #1


Peter K <xd****@hotmail.com> wrote:
> in the processing of some text files, I have found I have strings like:
>
> f=E5t
> pr=F8ve
>
> where the strings "=E5" and "=F8" are danish characters "å" and "ø". I can
> work this out myself, but how can my program know - or at least what
> "decoding" do I need to do to get the correct characters in my string?

It's not entirely clear what you mean. Does the text file contain
"=E5" but it *should* contain a Danish character, or were you doing
some replacement for us?

What's the source of the text file?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Jun 15 '07 #2

Jon Skeet [C# MVP] <sk***@pobox.com> wrote in
news:MP*********************@msnews.microsoft.com:
> Peter K <xd****@hotmail.com> wrote:
>> in the processing of some text files, I have found I have strings like:
>>
>> f=E5t
>> pr=F8ve
>>
>> where the strings "=E5" and "=F8" are danish characters "å" and "ø". I can
>> work this out myself, but how can my program know - or at least what
>> "decoding" do I need to do to get the correct characters in my string?
>
> It's not entirely clear what you mean. Does the text file contain
> "=E5" but it *should* contain a Danish character, or were you doing
> some replacement for us?
>
> What's the source of the text file?

The text files are actually xml (if that makes a difference).

And in actual fact the xml nodes I am interested in contain a "base64
encoded" string. So I extract this string, which is a long piece of text
starting:
"DQo8YnI+PGZvbnQgc2l6ZT0xIGNvbG9yPXdoaXRlIGZhY2U9IlZlcmRhbmEL".

I un-encode this base64 string, and get another text string which is in
"html" format. (With all sorts of html tags in it).

Some of this html contains the 3 symbols
= E 5
for example. (No spaces between the symbols).

They obviously represent the Danish letter 'å' (obvious because a
Danish speaker can see it from the word it appears in).

So for example, the text file may contain strings like:

"f=E5et"

(I don't know how you see this string in your news reader - but I see it
as 6 characters: f = E 5 e t).
As a part of processing, my program needs to convert these strings (=E5)
to real text (å).

But I don't know what this encoding is, or how to decode it.

Thanks,
Peter
Jun 15 '07 #3
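
For reference, the base64 step described in the post above can be sketched roughly as follows. This is only a sketch under assumptions: NodeText and DecodeBase64 are made-up names, and ISO-8859-1 is an assumed charset, since the export carries no encoding information. Convert.FromBase64String and Encoding.GetEncoding are the standard .NET calls.

```csharp
using System;
using System.Text;

public static class NodeText
{
    // Decodes a base64 payload to text.
    // charsetName is an assumption: nothing in the export says which
    // charset the decoded bytes are in.
    public static string DecodeBase64(string base64, string charsetName = "iso-8859-1")
    {
        // Turn the base64 text back into the raw bytes it represents.
        byte[] bytes = Convert.FromBase64String(base64);

        // Interpret those bytes as text in the assumed charset.
        return Encoding.GetEncoding(charsetName).GetString(bytes);
    }
}
```

For example, NodeText.DecodeBase64("Zj1FNWV0") yields the six characters f=E5et. Base64 decoding alone leaves the =E5 escape untouched, which is why a second decoding pass is needed.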

Looks like encoding for SMTP, is it a string from an email?
You can see how this encoding is done here:
http://www.lesnikowski.com/mail/Rfc/rfc2047.txt
I don't know if there is a freeware component that can decode this kind of
thing.

Kind Regards,
Allan Ebdrup
Jun 15 '07 #4

* Allan Ebdrup wrote, On 15-6-2007 16:01:
> Looks like encoding for SMTP, is it a string from an email?
> You can see how this encoding is done here:
> http://www.lesnikowski.com/mail/Rfc/rfc2047.txt
> I don't know if there is a freeware component that can decode this kind of
> thing.
>
> Kind Regards,
> Allan Ebdrup

Indeed it looks like quoted printable encoding of text messages sent
through mail or otherwise. Maybe the System.Net.Mail namespace can help
you out here. But I'd have to dig just as far as you would.

Jesse
Jun 15 '07 #5

Jesse Houwing <je***********@nospam-sogeti.nl> wrote in
news:#t**************@TK2MSFTNGP02.phx.gbl:
> * Allan Ebdrup wrote, On 15-6-2007 16:01:
>> Looks like encoding for SMTP, is it a string from an email?
>> You can see how this encoding is done here:
>> http://www.lesnikowski.com/mail/Rfc/rfc2047.txt
>> I don't know if there is a freeware component that can decode this
>> kind of thing.
>>
>> Kind Regards,
>> Allan Ebdrup
>
> Indeed it looks like quoted printable encoding of text messages sent
> through mail or otherwise. Maybe the System.Net.Mail namespace can
> help you out here. But I'd have to dig just as far as you would.

Ah - this could be right, because some of the other texts I get are not
encoded at all and come with all sorts of "multipart" metadata like
content-type and content-transfer-encoding. And some of them say "quoted-
printable".

The base64-encoded html I am currently having trouble with has no such
encoding information however.

The xml files I am processing are not containing emails as far as I know,
but all sorts of data which has been extracted from a "Domino-Notes"
database and transformed to xml. Horrible xml if you ask me.

Thanks for the help/pointers.
Peter
Jun 15 '07 #6
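
Since the base64-encoded HTML carries no content-transfer-encoding header, one way a program could guess is to look for the patterns quoted-printable leaves behind. The following is a heuristic sketch only, not a reliable detector: QpSniffer and LooksQuotedPrintable are made-up names, and the choice of patterns (soft line breaks, the escaped "=" written as =3D, and escapes for bytes 0x80 and above) is an assumption about what the data looks like.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QpSniffer
{
    // Matches "=3D" (an escaped equals sign) or "=" followed by two
    // uppercase hex digits whose value is in the range 0x80..0xFF.
    private static readonly Regex Escape =
        new Regex("=(3D|[89A-F][0-9A-F])", RegexOptions.CultureInvariant);

    // Heuristic: true when the text contains a soft line break ("=" at the
    // end of a line) or something that looks like a quoted-printable escape.
    public static bool LooksQuotedPrintable(string text)
    {
        return text.Contains("=\r\n") || Escape.IsMatch(text);
    }
}
```

This reports true for f=E5t and false for size=1 color=white, but it can be fooled: an unencoded HTML attribute such as width=80 also matches. Where a content-transfer-encoding header is present, it should take precedence over any guess like this.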

Peter K wrote:
> Jesse Houwing <je***********@nospam-sogeti.nl> wrote in
> news:#t**************@TK2MSFTNGP02.phx.gbl:
>> * Allan Ebdrup wrote, On 15-6-2007 16:01:
>>> Looks like encoding for SMTP, is it a string from an email?
>>> You can see how this encoding is done here:
>>> http://www.lesnikowski.com/mail/Rfc/rfc2047.txt
>>> I don't know if there is a freeware component that can decode this
>>> kind of thing.
>> Indeed it looks like quoted printable encoding of text messages sent
>> through mail or otherwise. Maybe the System.Net.Mail namespace can
>> help you out here. But I'd have to dig just as far as you would.
>
> Ah - this could be right, because some of the other texts I get are not
> encoded at all and come with all sorts of "multipart" metadata like
> content-type and content-transfer-encoding. And some of them say "quoted-
> printable".
>
> The base64-encoded html I am currently having trouble with has no such
> encoding information however.
>
> The xml files I am processing are not containing emails as far as I know,
> but all sorts of data which has been extracted from a "Domino-Notes"
> database and transformed to xml. Horrible xml if you ask me.

A hack and a real Quoted Printable decode (note that real quoted
printable also uses soft line breaks, an "=" at the end of a line, which
neither method handles - you will need to add that if needed):

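using System.Globalization;
using System.Text;

public static class QP
{
    // Hack: replaces only the six Danish escapes (ISO-8859-1 byte values).
    public static string FromQP1(string s)
    {
        return s.Replace("=E6", "æ").Replace("=F8", "ø").Replace("=E5", "å")
                .Replace("=C6", "Æ").Replace("=D8", "Ø").Replace("=C5", "Å");
    }

    // General decode: every "=XX" becomes the character with hex code XX.
    // The cast to char is only correct for ISO-8859-1, whose byte values
    // coincide with Unicode code points. Throws on soft line breaks and on
    // an "=" that is not followed by two hex digits.
    public static string FromQP2(string s)
    {
        StringBuilder sb = new StringBuilder();
        int ix = 0;
        while (ix < s.Length)
        {
            if (s[ix] == '=')
            {
                // Parse the two hex digits after "=" and append that character.
                sb.Append((char)int.Parse(s.Substring(ix + 1, 2), NumberStyles.HexNumber));
                ix += 3;
            }
            else
            {
                // Ordinary character: copy it through unchanged.
                sb.Append(s[ix]);
                ix++;
            }
        }
        return sb.ToString();
    }
}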
Arne
Jun 16 '07 #7
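
The newline handling mentioned in the last post refers to quoted-printable soft line breaks (the encoding is defined in RFC 2045, section 6.7). A sketch of a decoder that handles them is given below. It is not from the thread and has not been run against the Notes export: QuotedPrintable and Decode are made-up names, and the ISO-8859-1 default is an assumption. It collects bytes first and converts them with the chosen charset at the end, so multi-byte charsets such as UTF-8 also work.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedPrintable
{
    // Decodes quoted-printable text. charsetName is an assumption; use the
    // charset from the content-type header when one is available.
    public static string Decode(string input, string charsetName = "iso-8859-1")
    {
        Encoding charset = Encoding.GetEncoding(charsetName);
        var bytes = new List<byte>(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c != '=')
            {
                // Literal character (normally ASCII): encode it as-is.
                bytes.AddRange(charset.GetBytes(new[] { c }));
                i++;
            }
            else if (i + 1 < input.Length && input[i + 1] == '\n')
            {
                // Soft line break written as "=" + LF: drop both.
                i += 2;
            }
            else if (i + 2 < input.Length && input[i + 1] == '\r' && input[i + 2] == '\n')
            {
                // Soft line break written as "=" + CRLF: drop all three.
                i += 3;
            }
            else if (i + 2 < input.Length
                     && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
            {
                // "=XX": one byte given as two hex digits.
                bytes.Add(Convert.ToByte(input.Substring(i + 1, 2), 16));
                i += 3;
            }
            else
            {
                // Malformed escape: keep the "=" literally instead of throwing.
                bytes.Add((byte)'=');
                i++;
            }
        }
        return charset.GetString(bytes.ToArray());
    }
}
```

With the default charset, QuotedPrintable.Decode("f=E5t") gives "fåt" and QuotedPrintable.Decode("pr=F8ve") gives "prøve". Keeping a malformed "=" rather than throwing is a lenient design choice for data like this, where unencoded HTML attributes may be mixed in.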
