Odd string encoding behaviour

I'm having a problem with encoding a string. Here's my code:

byte[] s = System.Text.Encoding.ASCII.GetBytes(FieldContent);

This works fine as long as there are no values above 127; for
example, a 0x99 byte comes out as a 0x3F byte.
I know that ASCII is just 7 bits, but I tried the other encodings
and they didn't give me what I needed. UTF7 did the same thing as ASCII,
UTF8 gave 0xC2 0x99 for each 0x99 byte, and Unicode gave good results, but in
Unicode format.

What am I doing wrong?

Miki
Nov 15 '05 #1
Well, I managed to find a solution of sorts:

System.Text.Encoding e = System.Text.Encoding.GetEncoding("iso-8859-1");
output = BitConverter.ToString(e.GetBytes(FieldContent)).Replace("-", " ");

Is there something equivalent to the iso-8859-1 codepage?

Miki
Nov 15 '05 #2
> Is there something equivalent to the iso-8859-1 codepage?
1252 is the MS equivalent (it is in fact iso-8859-1 with some extras)

--
Mihai
-------------------------
Replace _year_ with _ to get the real email
Nov 15 '05 #4
Miki Watts <mi*****@netvision.net.il> wrote:
> I'm having a problem with encoding a string. Here's my code:
>
> byte[] s = System.Text.Encoding.ASCII.GetBytes(FieldContent);
>
> This works fine as long as there are no values above 127; for
> example, a 0x99 byte comes out as a 0x3F byte.
> I know that ASCII is just 7 bits, but I tried the other encodings
> and they didn't give me what I needed. UTF7 did the same thing as ASCII,
> UTF8 gave 0xC2 0x99 for each 0x99 byte, and Unicode gave good results, but in
> Unicode format.
>
> What am I doing wrong?


Nothing. What do you think it's doing wrong? It's doing exactly what it
should be - it's encoding your text in the various different ways,
depending on the encoding type used.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #6
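To make the point above concrete, here is a minimal sketch that encodes the same one-character string with several encodings and prints the bytes as hex. The `EncodingDemo` class, the `Hex` helper and the sample value of `fieldContent` (a single char with code point 0x99) are assumptions made for illustration; they are not from the original post.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encodes `text` with `encoding` and returns the bytes as
    // space-separated hex, using the same BitConverter trick as post #2.
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }

    public static void Main()
    {
        // Assumed sample input: one char whose code point is 0x99.
        string fieldContent = "\u0099";

        // ASCII has no character above 127, so the char becomes '?' (0x3F).
        Console.WriteLine("ASCII:      " + Hex(Encoding.ASCII, fieldContent));      // 3F
        // UTF-8 needs two bytes for any code point from 0x80 to 0x7FF.
        Console.WriteLine("UTF-8:      " + Hex(Encoding.UTF8, fieldContent));       // C2 99
        // Encoding.Unicode is UTF-16 little-endian: two bytes per char here.
        Console.WriteLine("UTF-16LE:   " + Hex(Encoding.Unicode, fieldContent));    // 99 00
        // ISO-8859-1 maps code points 0-255 to the byte with the same value.
        Console.WriteLine("ISO-8859-1: " + Hex(Encoding.GetEncoding("iso-8859-1"), fieldContent)); // 99
    }
}
```

Each line of output is a correct encoding of the same character; only ISO-8859-1 happens to produce the single byte 0x99 the original poster expected.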
Mihai N. <nm**************@yahoo.com> wrote:
>> Is there something equivalent to the iso-8859-1 codepage?
>
> 1252 is the MS equivalent (it is in fact iso-8859-1 with some extras)


Sort of - using the 8859-1 code page, you'll actually end up with bytes
effectively being "passed through", even if they shouldn't really be.
(I'm talking about characters 128-159 IIRC.) Code page 1252 has
entirely different characters in that range (the extras you mean).

If the OP wants 8859-1, he can just use the form he's already shown, or
ask for codepage number 28591. It's not a good idea though, if he's
basically using it to treat a string as a sequence of bytes instead of
chars.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #8
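A short sketch of the difference described above: the same byte, 0x99, decoded once with code page 28591 (ISO-8859-1) and once with code page 1252. The `CodePageDemo` class and `DecodeByte` helper are hypothetical names. The `RegisterProvider` call is an assumption about the runtime: on .NET Core and .NET 5+, code page 1252 is only available after registering `CodePagesEncodingProvider`, and on older .NET Framework versions that type may require the System.Text.Encoding.CodePages package, in which case the line can be dropped.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    // Decodes a single byte with `encoding` and returns the code point
    // of the resulting character.
    public static int DecodeByte(Encoding encoding, byte value)
    {
        string decoded = encoding.GetString(new[] { value });
        return decoded[0];
    }

    public static void Main()
    {
        // Assumption: required on .NET Core / .NET 5+ to make 1252 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding latin1 = Encoding.GetEncoding(28591); // ISO-8859-1
        Encoding cp1252 = Encoding.GetEncoding(1252);  // Windows-1252

        // ISO-8859-1 passes the byte value through as the code point.
        Console.WriteLine("28591: U+{0:X4}", DecodeByte(latin1, 0x99)); // U+0099
        // Windows-1252 assigns a printable character (the trademark sign).
        Console.WriteLine("1252:  U+{0:X4}", DecodeByte(cp1252, 0x99)); // U+2122
    }
}
```

So the two code pages agree outside the 128-159 range but are not interchangeable inside it, which is exactly where the original poster's 0x99 falls.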
> If the OP wants 8859-1, he can just use the form he's already shown, or
> ask for codepage number 28591. It's not a good idea though, if he's
> basically using it to treat a string as a sequence of bytes instead of
> chars.

Well, yes, basically that is what I want to do: I want a string (i.e.
dynamic resizing) that contains the exact bytes that I want, without
interpretation or encoding. I haven't found any other construct that can do
this for me though. (byte[] should be what I need, but it's not dynamic.)
Nov 15 '05 #10
Miki Watts <mi*****@netvision.net.il> wrote:
>> If the OP wants 8859-1, he can just use the form he's already shown, or
>> ask for codepage number 28591. It's not a good idea though, if he's
>> basically using it to treat a string as a sequence of bytes instead of
>> chars.
>
> Well, yes, basically that is what I want to do: I want a string (i.e.
> dynamic resizing) that contains the exact bytes that I want

Strings don't contain bytes. They contain characters. You shouldn't use
them for binary data - that's not what they're designed for.

> without
> interpretation or encoding. I haven't found any other construct that can do
> this for me though. (byte[] should be what I need, but it's not dynamic.)

String itself isn't dynamic either - once created, a string is fixed.
It just has methods to make it easy to create a new string with (say)
the value of two strings concatenated.

I suspect that MemoryStream might be helpful to you though.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #11
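As a rough illustration of the MemoryStream suggestion, the sketch below collects raw byte chunks in a growable buffer and then takes a fixed byte[] snapshot, with no character encoding involved. The `ByteBufferDemo` class, the `Concat` helper and the sample byte values are assumptions for illustration only.

```csharp
using System;
using System.IO;

public static class ByteBufferDemo
{
    // Appends each chunk to a MemoryStream, which grows as needed,
    // and returns everything written so far as one byte array.
    public static byte[] Concat(params byte[][] chunks)
    {
        using (var buffer = new MemoryStream())
        {
            foreach (byte[] chunk in chunks)
            {
                buffer.Write(chunk, 0, chunk.Length);
            }
            return buffer.ToArray();
        }
    }

    public static void Main()
    {
        // Bytes above 127 are stored as-is; nothing is interpreted as text.
        byte[] all = Concat(new byte[] { 0x41, 0x99 }, new byte[] { 0xFF });
        Console.WriteLine(BitConverter.ToString(all)); // 41-99-FF
    }
}
```

If the data is built one value at a time, `MemoryStream.WriteByte` or a `List<byte>` would serve the same purpose.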
> I suspect that MemoryStream might be helpful to you though.

ok, thanks. I'll check it out.
Nov 15 '05 #12
