Bytes IT Community

Encoding to ISO-8859-1 problems

Hi,
We are trying to encode text to ISO-8859-1, but we are having problems doing
it with the encoders in .NET. For some cultures we get unknown characters.
The same text comes out fine if we post it (from IE) from a page in
ISO-8859-1 to another page using ISO-8859-1, but we cannot take a .NET string
or a UTF-8 string, convert it to ISO-8859-1 and display it with this encoding
with the same content in the string...

Does anyone know how IE does it? Is there a correspondence table, or is there
any information in the Unicode encoding that says "this character has this
style and should convert to another character with this style", or something
like it?

Thanks

ThunderMusic
Feb 1 '07 #1


ThunderMusic <No*************************@NoSpAm.com> wrote:
> We are trying to encode text to ISO-8859-1, but we are having problems doing
> it with the encoders in .NET. For some cultures we get unknown characters.
> The same text comes out fine if we post it (from IE) from a page in
> ISO-8859-1 to another page using ISO-8859-1, but we cannot take a .NET string
> or a UTF-8 string, convert it to ISO-8859-1 and display it with this encoding
> with the same content in the string...
>
> Does anyone know how IE does it? Is there a correspondence table, or is there
> any information in the Unicode encoding that says "this character has this
> style and should convert to another character with this style", or something
> like it?

It's not very clear exactly what's going on. It's quite possible that
when you post in IE, it's not using 8859-1 for the post even if the
pages returned *are* genuinely using 8859-1. It's hard to see how it
would be displayed in that case though.

If a character isn't in 8859-1, you won't be able to represent it in
8859-1.

Do you *have* to use 8859-1 rather than an encoding which can represent
the whole of Unicode (e.g. UTF-8)?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 1 '07 #2

It's actually because we are sending mail to Hotmail. Our application uses
UTF-8 and Hotmail uses ISO-8859-1, so if we send a message encoded in UTF-8,
the displayed message comes out plain ugly in Hotmail...

So we must find a way to convert things correctly. To give an example of
what should be converted: a character like the oe ligature (o and e tied
together) should be converted to the ISO-8859-1 "oe" (o and e as separate
letters), because the "tied together" version does not exist in 8859-1...

Does anyone know how to do it seamlessly?

Thanks

ThunderMusic

Feb 1 '07 #3

ThunderMusic <No*************************@NoSpAm.com> wrote:
> It's actually because we are sending mail to Hotmail. Our application uses
> UTF-8 and Hotmail uses ISO-8859-1, so if we send a message encoded in UTF-8,
> the displayed message comes out plain ugly in Hotmail...

Does Hotmail *really* only use ISO-8859-1? Eek - how horrible.

> So we must find a way to convert things correctly. To give an example of
> what should be converted: a character like the oe ligature (o and e tied
> together) should be converted to the ISO-8859-1 "oe" (o and e as separate
> letters), because the "tied together" version does not exist in 8859-1...
>
> Does anyone know how to do it seamlessly?

Right. At that point you're talking about much more than just normal
encoding - and unfortunately we've reached the limit of my knowledge
there :(
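
One way to approach the substitution asked about above is a two-step conversion: replace known characters with hand-picked ISO-8859-1 spellings first, then encode. The following is only a sketch under assumptions. The class name `Latin1Transliterator`, the method `ToLatin1` and the contents of the substitution table are illustrative and come from no standard; a real application would need its own table. As far as I know, Unicode normalization alone does not help for this particular character, because U+0153 has no decomposition.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class Latin1Transliterator
{
    // Illustrative substitution table (an assumption, not exhaustive).
    static readonly Dictionary<char, string> Substitutes = new Dictionary<char, string>
    {
        { '\u0153', "oe" },  // small ligature oe
        { '\u0152', "OE" },  // capital ligature OE
        { '\u2018', "'" },   // left single quotation mark
        { '\u2019', "'" },   // right single quotation mark
        { '\u201C', "\"" },  // left double quotation mark
        { '\u201D', "\"" },  // right double quotation mark
        { '\u2013', "-" },   // en dash
        { '\u20AC', "EUR" }, // euro sign
    };

    // Hypothetical helper: substitute first, then encode to ISO-8859-1.
    public static byte[] ToLatin1(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            string replacement;
            if (Substitutes.TryGetValue(c, out replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }

        // Anything still outside ISO-8859-1 is replaced by '?'.
        Encoding latin1 = Encoding.GetEncoding(
            "iso-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return latin1.GetBytes(sb.ToString());
    }

    static void Main()
    {
        byte[] bytes = ToLatin1("c\u0153ur \u00E0 c\u0153ur");
        // Decoding the bytes again shows the ligature spelled as two letters.
        Console.WriteLine(Encoding.GetEncoding("iso-8859-1").GetString(bytes));
    }
}
```

The explicit replacement fallback is there so the behaviour for unmapped characters does not depend on the runtime's default.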

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 1 '07 #4

You can encode your strings into ISO-8859-1 or into any other encoding;
see:

http://msdn2.microsoft.com/en-us/lib...ng(d=ide).aspx

using System.IO;
using System.Text;

string fileName = pathToYourFile;

byte[] encodedString =
    Encoding.GetEncoding("iso-8859-1").GetBytes("Whatever you want to say");

// is the same as (28591 is the code page number of ISO-8859-1)
byte[] sameBytes =
    Encoding.GetEncoding(28591).GetBytes("Whatever you want to say");

// you can then write the bytes to a FileStream
using (FileStream file = File.Create(fileName))
{
    file.Write(encodedString, 0, encodedString.Length);
}

// you can also write to a StreamWriter directly
// false: overwrite, true: append
using (StreamWriter writer = new StreamWriter(fileName, false,
    Encoding.GetEncoding("iso-8859-1")))
{
    writer.WriteLine("Whatever you want to say");
}

Regards,
Joachim

Feb 1 '07 #5

jo*****@yamagata-europe.com <jo*****@yamagata-europe.com> wrote:
> You can encode your strings into ISO-8859-1 or into any other encoding;
> see:
>
> http://msdn2.microsoft.com/en-us/lib...ng(d=ide).aspx

The problem is that the OP is trying to represent characters which
aren't actually in ISO-8859-1 though.
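
To see which strings are affected, the encoder can be asked to fail instead of substituting. This is a minimal sketch, assuming the `Encoding.GetEncoding` overload that takes explicit fallbacks; `Latin1Check` and `FitsLatin1` are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

static class Latin1Check
{
    // Hypothetical helper: true if every character of text exists in ISO-8859-1.
    public static bool FitsLatin1(string text)
    {
        // Exception fallbacks make the encoder throw instead of silently
        // substituting something for characters it cannot represent.
        Encoding strict = Encoding.GetEncoding(
            "iso-8859-1",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(FitsLatin1("caf\u00E9")); // U+00E9 is in ISO-8859-1
        Console.WriteLine(FitsLatin1("c\u0153ur")); // U+0153 is not
    }
}
```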

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 1 '07 #6

Actually, Hotmail supports 4 types of encoding, but neither UTF-8 nor UTF-16,
so it's becoming a problem...

Thanks... I'm still searching for information on the net...

ThunderMusic

Feb 1 '07 #7

"ThunderMusic" <No*************************@NoSpAm.com> wrote in message
news:u3**************@TK2MSFTNGP02.phx.gbl...
> It's actually because we are sending mail to Hotmail. Our application uses
> UTF-8 and Hotmail uses ISO-8859-1, so if we send a message encoded in UTF-8,
> the displayed message comes out plain ugly in Hotmail...
>
> So we must find a way to convert things correctly. To give an example of
> what should be converted: a character like the oe ligature (o and e tied
> together) should be converted to the ISO-8859-1 "oe" (o and e as separate
> letters), because the "tied together" version does not exist in 8859-1...
>
> Does anyone know how to do it seamlessly?

The discrepancy you are seeing in the way IE behaves is due to Windows
drawing a correlation between the Windows code page 1252 and ISO-8859-1.
Windows-1252 is a character set based on ISO-8859-1: all characters have the
same encoding except for those in the 128-159 range. In this range
ISO-8859-1 has a set of control codes that are almost never used these days.
Windows-1252 borrows this area to squeeze in some extra characters.

When a page from a source claiming the ISO-8859-1 charset uses characters
in this range IE just renders the Windows-1252 characters for them. However
something sticking more strictly to ISO-8859-1 just doesn't know what to do
with them.
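
The difference in the 128-159 range can be checked by decoding the same byte with both encodings. This is a sketch, assuming a modern .NET runtime where code page 1252 has to be registered through `CodePagesEncodingProvider`; on the .NET Framework that registration line can be dropped. `CodePageComparison` and `Decode` are hypothetical names.

```csharp
using System;
using System.Text;

static class CodePageComparison
{
    // Hypothetical helper: decode one byte and return the resulting code point.
    public static int Decode(Encoding encoding, byte value)
    {
        return encoding.GetString(new byte[] { value })[0];
    }

    static void Main()
    {
        // Assumption: needed on .NET Core / .NET 5+, where 1252 is not built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding latin1 = Encoding.GetEncoding(28591); // ISO-8859-1
        Encoding cp1252 = Encoding.GetEncoding(1252);  // Windows-1252

        // 0x9C is a control code in ISO-8859-1 but the oe ligature in Windows-1252.
        Console.WriteLine("U+{0:X4}", Decode(latin1, 0x9C)); // U+009C
        Console.WriteLine("U+{0:X4}", Decode(cp1252, 0x9C)); // U+0153
    }
}
```

If the receiving side really treats ISO-8859-1 as Windows-1252, encoding with code page 1252 would carry the ligature through, but whether a given mail client does that is something to verify rather than assume.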

This doesn't solve your problem, I know. If what you say is true then
Hotmail is unable to communicate well with all its possible clients.
That's so shocking it leaves me wondering whether there is something else
wrong.

Can you show some code you are using to generate the email?

Feb 2 '07 #8

> If what you say is true then
> Hotmail is unable to communicate well with all its possible clients.
> That's so shocking it leaves me wondering whether there is something else
> wrong.

I strongly, strongly doubt that Hotmail is unable to handle UTF-8.

> Can you show some code you are using to generate the email?

Agreed, most likely the problem is here :-)
From experience I know that the chances that I discover a bug in a compiler
are slim (it has happened only twice). So before blaming something on the
compiler, I check my code 30 times!

This is one of those. I am so sure that hotmail can handle utf-8, that I
would check my code 30 times :-)
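
One thing worth checking in the mail-generating code is whether the message actually declares its charset. The OP's code was never posted, so the following is only a guess at the kind of thing to look for: a sketch using `System.Net.Mail`, with placeholder addresses and hypothetical names `Utf8MailSketch` and `Build`.

```csharp
using System;
using System.Net.Mail;
using System.Text;

static class Utf8MailSketch
{
    // Hypothetical helper: builds a message whose subject and body are
    // explicitly marked as UTF-8 rather than left to a default.
    public static MailMessage Build(string from, string to, string subject, string body)
    {
        MailMessage message = new MailMessage(from, to);
        message.Subject = subject;
        message.SubjectEncoding = Encoding.UTF8;
        message.Body = body;
        message.BodyEncoding = Encoding.UTF8;
        return message;
    }

    static void Main()
    {
        using (MailMessage message = Build(
            "sender@example.com", "recipient@example.com",
            "C\u0153ur", "c\u0153ur \u00E0 c\u0153ur"))
        {
            Console.WriteLine(message.BodyEncoding.WebName); // utf-8
            // Sending is omitted here: SmtpClient.Send needs a real server.
        }
    }
}
```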
--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Feb 2 '07 #9
