Encoding Question

Imagine the following scenario.

You receive a byte array from a socket. This byte array contains both
text and binary data; it contains text fields delimited by specified
byte sequences.

For example:

"one \xC0\x80 two \xC0\x80 three"

The way I'm currently dealing with this is to convert the byte array to
a string with the following function, then split the string by the
delimiter substring.

public static string GetString(byte[] data)
{
    StringBuilder sb = new StringBuilder();

    for (int i = 0; i < data.Length; ++i) {
        sb.Append((char)data[i]);
    }

    return sb.ToString();
}

I know this is a hack, but is there a better way?
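
For readers wondering why the cast counts as a hack, here is a minimal sketch. Casting each byte to char maps it to the code point with the same number (U+0000 to U+00FF), so any multi-byte text is silently mangled. The class name CastDemo is made up for illustration, and the choice of UTF-8 is an assumption, since the thread never names the encoding on the wire.

```csharp
using System;
using System.Text;

public static class CastDemo
{
    // The function from the question, unchanged in behavior:
    // every byte becomes the char with the same numeric value.
    public static string GetString(byte[] data)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.Length; ++i)
        {
            sb.Append((char)data[i]);
        }
        return sb.ToString();
    }

    // Decodes the same bytes with a real encoding (UTF-8 is assumed here).
    public static string Decode(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }
}
```

For the UTF-8 bytes C3 A9 (a single "é"), CastDemo.GetString returns the two characters U+00C3 and U+00A9, while CastDemo.Decode returns the one character U+00E9.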
Nov 15 '05 #1
C# Learner <cs****@learner.here> wrote:
I know this is a hack, but is there a better way?


Yes. You definitely, definitely shouldn't be doing that. Instead, you
should be reading blocks into memory, and then scanning for the
delimiters. Then build a string using
Encoding.whatever.GetString (byte[], int, int).
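
The approach described above could be sketched roughly as follows: scan the raw bytes for the delimiter, then decode each field with Encoding.GetString(byte[], int, int). The class and method names (DelimitedFields, IndexOf, Split) are invented for illustration, and the caller has to supply whichever encoding the protocol actually uses. The sketch assumes the whole message is already in one array.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DelimitedFields
{
    // Returns the index of the first occurrence of pattern in data at or
    // after start, or -1 if there is none.
    public static int IndexOf(byte[] data, byte[] pattern, int start)
    {
        for (int i = start; i <= data.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j])
            {
                j++;
            }
            if (j == pattern.Length)
            {
                return i;
            }
        }
        return -1;
    }

    // Splits data on the delimiter bytes and decodes each field on its own,
    // so the delimiter bytes never pass through the text decoder.
    public static List<string> Split(byte[] data, byte[] delimiter, Encoding encoding)
    {
        if (delimiter.Length == 0)
        {
            throw new ArgumentException("Delimiter must not be empty.", "delimiter");
        }

        List<string> fields = new List<string>();
        int start = 0;
        int hit;
        while ((hit = IndexOf(data, delimiter, start)) >= 0)
        {
            fields.Add(encoding.GetString(data, start, hit - start));
            start = hit + delimiter.Length;
        }
        // Whatever follows the last delimiter is the final field.
        fields.Add(encoding.GetString(data, start, data.Length - start));
        return fields;
    }
}
```

With the example from the question, splitting on the bytes C0 80 and decoding as ASCII gives the three fields "one ", " two " and " three" (the spaces around the delimiters are part of the fields).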

However, if you have control over the protocol, it would be better to
prefix each string with the number of bytes in it - that way you don't
need to do any scanning.
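
A length-prefixed layout might look something like the sketch below. The wire format is purely an assumption for illustration (a 4-byte big-endian byte count followed by the encoded text); the thread does not specify one, and the names LengthPrefixed, Write and ReadAll are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LengthPrefixed
{
    // Encodes text as: 4-byte big-endian byte count, then the encoded bytes.
    public static byte[] Write(string text, Encoding encoding)
    {
        byte[] payload = encoding.GetBytes(text);
        byte[] frame = new byte[4 + payload.Length];
        frame[0] = (byte)(payload.Length >> 24);
        frame[1] = (byte)(payload.Length >> 16);
        frame[2] = (byte)(payload.Length >> 8);
        frame[3] = (byte)payload.Length;
        Buffer.BlockCopy(payload, 0, frame, 4, payload.Length);
        return frame;
    }

    // Reads consecutive length-prefixed strings until the data is used up.
    // No scanning is needed: each prefix says exactly where the field ends.
    public static List<string> ReadAll(byte[] data, Encoding encoding)
    {
        List<string> result = new List<string>();
        int pos = 0;
        while (pos < data.Length)
        {
            if (data.Length - pos < 4)
            {
                throw new FormatException("Truncated length prefix.");
            }
            int length = (data[pos] << 24) | (data[pos + 1] << 16)
                       | (data[pos + 2] << 8) | data[pos + 3];
            pos += 4;
            if (length < 0 || length > data.Length - pos)
            {
                throw new FormatException("Truncated or invalid field.");
            }
            result.Add(encoding.GetString(data, pos, length));
            pos += length;
        }
        return result;
    }
}
```

Because the prefix counts bytes rather than characters, the text may contain any byte sequence, including one that would otherwise look like a delimiter.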

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #2
Jon Skeet [C# MVP] wrote:
C# Learner <cs****@learner.here> wrote:

<snip>

I know this is a hack, but is there a better way?

Yes. You definitely, definitely shouldn't be doing that. Instead, you
should be reading blocks into memory, and then scanning for the
delimiters. Then build a string using
Encoding.whatever.GetString (byte[], int, int).


Ah, so the only "problem" is finding the delimiters in the byte array then.

I completely forgot that Encoding.Whatever.GetString() can take index
and count parameters. Thanks for pointing that out!

However, if you have control over the protocol, it would be better to
prefix each string with the number of bytes in it - that way you don't
need to do any scanning.


Not in this case.

I guess I'll just write a library method that splits the byte array into
strings then.

Cheers
Nov 15 '05 #3
C# Learner <cs****@learner.here> wrote:
Yes. You definitely, definitely shouldn't be doing that. Instead, you
should be reading blocks into memory, and then scanning for the
delimiters. Then build a string using
Encoding.whatever.GetString (byte[], int, int).

Ah, so the only "problem" is finding the delimiters in the byte array then.


Yup.

I completely forgot that Encoding.Whatever.GetString() can take index
and count parameters. Thanks for pointing that out!


No problem.

However, if you have control over the protocol, it would be better to
prefix each string with the number of bytes in it - that way you don't
need to do any scanning.


Not in this case.

I guess I'll just write a library method that splits the byte array into
strings then.


Righto. Don't forget that things get tricky if you've got to read the
stream in chunks, but you need to combine multiple chunks to decode
them, etc. It's all doable, just tricky...
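
One part of the chunking problem, a multi-byte character split across two reads, can be handled with a stateful System.Text.Decoder obtained from Encoding.GetDecoder(). The sketch below shows only that part. It does not deal with a delimiter that straddles a chunk boundary, and the class name ChunkedDecoding and the in-memory array of chunks are stand-ins for real socket reads.

```csharp
using System;
using System.Text;

public static class ChunkedDecoding
{
    // Decodes a sequence of chunks as one continuous byte stream.
    // The Decoder remembers incomplete trailing bytes between calls.
    public static string DecodeChunks(byte[][] chunks, Encoding encoding)
    {
        Decoder decoder = encoding.GetDecoder();
        StringBuilder sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // Ask how many chars this chunk yields given the decoder's state.
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int written = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, written);
        }
        return sb.ToString();
    }
}
```

For example, the UTF-8 bytes of "café" split as 63 61 66 C3 and A9 still decode to "café", whereas calling Encoding.UTF8.GetString on each chunk separately would corrupt the last character.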

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #4
Jon Skeet [C# MVP] wrote:

<snip>
I guess I'll just write a library method that splits the byte array into
strings then.


Righto. Don't forget that things get tricky if you've got to read the
stream in chunks, but you need to combine multiple chunks to decode
them, etc. It's all doable, just tricky...


Thankfully, I don't need to do that in this case. :-)

Regards
Nov 15 '05 #5
Ray

"C# Learner" <cs****@learner.here> wrote in message
news:#6**************@tk2msftngp13.phx.gbl...
I know this is a hack, but is there a better way?


I have a similar problem reading in a file that can have various delimiters,
even combinations of them. I do the following using the Split function. The
try/catch block conveniently does nothing if it encounters an error
such as two delimiters together. Perhaps you can adapt this to your problem.

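// filestr (the file path) and columns (the maximum number of fields
// per line) are assumed to be defined elsewhere.
string delimStr = " ,:;\t";
char[] delimiter = delimStr.ToCharArray();
string[] split = null;
string newStr = "";   // must be initialized before using +=
string line;
StreamReader sr = new StreamReader(filestr);
while ((line = sr.ReadLine()) != null)
{
    split = line.Split(delimiter, columns);
    foreach (string s in split)
    {
        try
        {
            newStr += s;
        }
        catch {} // do nothing
    }
}
sr.Close();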
Nov 15 '05 #6