On Aug 22, 3:27 pm, Steven Blair <steven.bl...@btinternet.com> wrote:
> USMesg = "¬Credits¬Remaining¬";
> byte[] bArray = Encoding.UTF8.GetBytes(USMesg);
> [1] = 0xAC //This is the character I was expecting
> Why has an extra byte been inserted after each ¬?
This is how UTF-8 works. The character "¬" is U+00AC, and UTF-8
encodes every code point from U+0080 to U+07FF as two bytes, so "¬"
becomes the byte pair 0xC2 0xAC. The 0xC2 is not padding; it is the
lead byte of that two-byte sequence.
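
To see the two-byte form for yourself, here is a minimal sketch. The
class name Utf8Demo is just something I made up for illustration; the
calls are the standard System.Text ones.

```csharp
using System;
using System.Text;

class Utf8Demo
{
    static void Main()
    {
        // U+00AC lies above U+007F, so UTF-8 needs two bytes for it.
        byte[] bytes = Encoding.UTF8.GetBytes("\u00AC");

        // BitConverter.ToString joins the hex bytes with dashes.
        Console.WriteLine(BitConverter.ToString(bytes)); // C2-AC
    }
}
```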
> I tried using Encoding.ASCII.GetBytes()
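ASCII and UTF-8 are not interchangeable. ASCII only defines code
points up to U+007F, so it cannot represent "¬" at all;
Encoding.ASCII.GetBytes() substitutes "?" (0x3F) for it.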
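
A short sketch of what the ASCII encoder does with a character it
cannot represent (AsciiDemo is a hypothetical name, and I am assuming
the default replacement fallback has not been changed):

```csharp
using System;
using System.Text;

class AsciiDemo
{
    static void Main()
    {
        // "¬" is outside ASCII, so the encoder emits '?' (0x3F) in its place.
        byte[] bytes = Encoding.ASCII.GetBytes("\u00ACCredits");
        Console.WriteLine(BitConverter.ToString(bytes)); // 3F-43-72-65-64-69-74-73
    }
}
```

Note that the original character is lost: nobody reading those bytes
back can tell that the 0x3F was ever a "¬".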
> Anyone any idea what's happened and how I can get round this problem?
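What is happening is that characters are being converted to bytes,
using the character encoding you specify. Each encoding defines its
own mapping between characters and bytes, so the same character can
come out as different bytes, or a different number of bytes,
depending on which encoding you pick.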
Perhaps what you want is Encoding.Default.GetBytes()? This will use
the system default ANSI codepage (probably Windows-1252 in your case,
which is close to, but not the same as, ISO-8859-1, aka "Latin-1" or
"Western European"). Windows-1252 encodes "¬" as 0xAC, but a machine
configured with a different codepage might not.
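
If one byte per character is what you are after, naming the encoding
explicitly avoids depending on the machine's settings. The sketch
below assumes ISO-8859-1 is acceptable to whoever reads the bytes
(that is my assumption, not something you have told us), and the
class name is made up. I picked ISO-8859-1 rather than Encoding.Default
partly because, as far as I know, newer .NET runtimes (.NET Core and
later) return UTF-8 from Encoding.Default.

```csharp
using System;
using System.Text;

class Latin1Demo
{
    static void Main()
    {
        // Ask for the encoding by name instead of relying on the system default.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] bytes = latin1.GetBytes("\u00ACCredits\u00ACRemaining\u00AC");

        // One byte per character: 3 separators + 7 + 9 letters.
        Console.WriteLine(bytes.Length);            // 19
        Console.WriteLine(bytes[0].ToString("X2")); // AC
    }
}
```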
However, if you want to write predictable code, you must agree with
whoever will read the bytes back upon which encoding to use.
Otherwise, when reading 0xC2AC back into a string, the reader might
get a tiny picture of a tiny goat instead of the "¬".
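
No goats in this one, but here is a sketch of the mismatch: bytes
written as UTF-8 and read back as ISO-8859-1. The class and variable
names are hypothetical, and ISO-8859-1 simply stands in for "some
other encoding the reader assumed".

```csharp
using System;
using System.Text;

class MismatchDemo
{
    static void Main()
    {
        byte[] wire = Encoding.UTF8.GetBytes("\u00AC"); // C2 AC

        // Reader assumes ISO-8859-1: each byte becomes its own character.
        string wrong = Encoding.GetEncoding("iso-8859-1").GetString(wire);
        Console.WriteLine(wrong.Length);                  // 2
        Console.WriteLine(((int)wrong[0]).ToString("X4")); // 00C2
        Console.WriteLine(((int)wrong[1]).ToString("X4")); // 00AC

        // Reader uses the same encoding as the writer: the text survives.
        string right = Encoding.UTF8.GetString(wire);
        Console.WriteLine(right == "\u00AC");             // True
    }
}
```

The wrong decode yields "Â¬" (U+00C2 followed by U+00AC), which is the
usual sign that UTF-8 was read as a single-byte encoding.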