Odd string encoding behaviour

I'm having a problem with encoding a string... here's my code:

byte[] s = System.Text.Encoding.ASCII.GetBytes(FieldContent);

Now, this works fine as long as there are no characters above 127; for
example, a 0x99 byte turns out as a 0x3f byte.
I know that ASCII is just 7 bits, but I tried the other encoding formats,
and they didn't get me what I needed... UTF7 did the same thing as ASCII,
UTF8 gave 0xC299 for each 0x99 byte, and UNICODE gave good results, but in
Unicode format.

What am I doing wrong?

Miki
Nov 15 '05 #1

Well, I managed to find a solution of some sort:

System.Text.Encoding e = System.Text.Encoding.GetEncoding("iso-8859-1");
output = BitConverter.ToString(e.GetBytes(FieldContent)).Replace("-", " ");

Is there something equivalent to the iso-8859-1 codepage?

Miki
Nov 15 '05 #2

> Is there something equivalent to the iso-8859-1 codepage?
1252 is the MS equivalent (it is in fact iso-8859-1 with some extras)

--
Mihai
-------------------------
Replace _year_ with _ to get the real email
Nov 15 '05 #4

Miki Watts <mi*****@netvision.net.il> wrote:
> I'm having a problem with encoding a string... here's my code:
>
> byte[] s = System.Text.Encoding.ASCII.GetBytes(FieldContent);
>
> Now, this works fine as long as there are no characters above 127; for
> example, a 0x99 byte turns out as a 0x3f byte.
> I know that ASCII is just 7 bits, but I tried the other encoding formats,
> and they didn't get me what I needed... UTF7 did the same thing as ASCII,
> UTF8 gave 0xC299 for each 0x99 byte, and UNICODE gave good results, but in
> Unicode format.
>
> What am I doing wrong?

Nothing. What do you think it's doing wrong? It's doing exactly what it
should be - it's encoding your text in the various different ways,
depending on the encoding type used.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information.
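
To make "encoding your text in the various different ways" concrete, here is a minimal sketch that encodes the single character U+0099 with several encodings. The class and method names (EncodingDemo, Hex) are made up for illustration and do not come from the thread.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encodes `text` with `encoding` and returns the resulting bytes
    // as space-separated hex, e.g. "C2 99".
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }
}
```

Calling Hex with the string "\u0099" should give 3F for Encoding.ASCII (the character is outside ASCII, so it is replaced with '?'), C2 99 for Encoding.UTF8, 99 00 for Encoding.Unicode (UTF-16, little-endian) and 99 for Encoding.GetEncoding("iso-8859-1"). None of these results is wrong; each is simply a different byte representation of the same character.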

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #6

Mihai N. <nm**************@yahoo.com> wrote:
> > Is there something equivalent to the iso-8859-1 codepage?
>
> 1252 is the MS equivalent (it is in fact iso-8859-1 with some extras)

Sort of - using the 8859-1 code page, you'll actually end up with bytes
effectively being "passed through", even if they shouldn't really be.
(I'm talking about characters 128-159 IIRC.) Code page 1252 has
entirely different characters in that range (the extras you mean).

If the OP wants 8859-1, he can just use the form he's already shown, or
ask for codepage number 28591. It's not a good idea though, if he's
basically using it to treat a string as a sequence of bytes instead of
chars.
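
A rough sketch of both points: the name "iso-8859-1" and the number 28591 resolve to the same encoding, and that encoding passes every byte value through unchanged. The class and method names here (Latin1Demo and its members) are illustrative assumptions, not anything from the thread.

```csharp
using System;
using System.Text;

public static class Latin1Demo
{
    // True if the name and the code page number resolve to the same code page.
    public static bool NameAndNumberAgree()
    {
        return Encoding.GetEncoding("iso-8859-1").CodePage
            == Encoding.GetEncoding(28591).CodePage;
    }

    // Decodes all 256 byte values to a string and encodes them back,
    // returning true if every byte survives the round trip.
    public static bool RoundTripsAllBytes()
    {
        Encoding latin1 = Encoding.GetEncoding(28591);
        byte[] original = new byte[256];
        for (int i = 0; i < original.Length; i++) original[i] = (byte)i;

        string asText = latin1.GetString(original);
        byte[] back = latin1.GetBytes(asText);

        if (back.Length != original.Length) return false;
        for (int i = 0; i < original.Length; i++)
        {
            if (back[i] != original[i]) return false;
        }
        return true;
    }

    // Shows why the trick is fragile: the same string written out as UTF-8
    // is no longer 256 bytes, because chars 128-255 take two bytes each.
    public static int Utf8LengthOfAllBytes()
    {
        Encoding latin1 = Encoding.GetEncoding(28591);
        byte[] original = new byte[256];
        for (int i = 0; i < original.Length; i++) original[i] = (byte)i;
        return Encoding.UTF8.GetBytes(latin1.GetString(original)).Length;
    }
}
```

The round trip only holds while every step uses the same encoding. As soon as the string is written or read with anything else (Utf8LengthOfAllBytes returns 384 rather than 256), the "bytes" are no longer the ones that went in.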

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #8

> If the OP wants 8859-1, he can just use the form he's already shown, or
> ask for codepage number 28591. It's not a good idea though, if he's
> basically using it to treat a string as a sequence of bytes instead of
> chars.

Well, yes, basically, that is what I want to do: I want a string (i.e.
dynamically resizable) that contains the exact bytes that I want, without
interpretation or encoding. I haven't found any other construct that can do
this for me though. (byte[] should be what I need, but it's not dynamic.)
Nov 15 '05 #10

Miki Watts <mi*****@netvision.net.il> wrote:
> > If the OP wants 8859-1, he can just use the form he's already shown, or
> > ask for codepage number 28591. It's not a good idea though, if he's
> > basically using it to treat a string as a sequence of bytes instead of
> > chars.
>
> well, yes, basically, that is what I want to do, I want a string (i.e.
> dynamic resize) that contains the exact bytes that I want

Strings don't contain bytes. They contain characters. You shouldn't use
them for binary data - that's not what they're designed for.

> without
> interpretation or encoding. I haven't found any other construct that can do
> this for me though. (byte[] should be what I need, but it's not dynamic).

String itself isn't dynamic either - once created, a string is fixed.
It just has methods to make it easy to create a new string with (say)
the value of two strings concatenated.

I suspect that MemoryStream might be helpful to you though.
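
As a minimal sketch of that suggestion, assuming all that is needed is a byte buffer that grows as data is appended: a MemoryStream accepts raw bytes without any encoding step, and ToArray() yields a plain byte[] at the end. The class name, method name and sample byte values below are made up for illustration.

```csharp
using System;
using System.IO;

public static class ByteBufferDemo
{
    // Appends some arbitrary bytes to a growable buffer and returns
    // the accumulated contents as a byte[].
    public static byte[] Build()
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            // Single bytes are stored exactly as given, including values above 0x7F.
            buffer.WriteByte(0x99);

            // Whole arrays can be appended too; the stream grows as needed.
            byte[] more = { 0x00, 0xFF, 0x41 };
            buffer.Write(more, 0, more.Length);

            return buffer.ToArray();
        }
    }
}
```

Build() returns the four bytes 99 00 FF 41, with no character interpretation applied at any point.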

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #11

> I suspect that MemoryStream might be helpful to you though.

ok, thanks. I'll check it out.
Nov 15 '05 #12
