Separate words

375 Contributor
Hello

I want to split a string into its separate words.
E.g. given the string Client Server,
it should be split into
Client
Server
I searched and found this:

    string MainString = "String Manipulation";
    string[] Split = MainString.Split(new Char[] { ' ' });
    MessageBox.Show(Convert.ToString(Split[0]));
    MessageBox.Show(Convert.ToString(Split[1]));

However, this does not work if there are two spaces between two words.
Moreover, I also want it to split strings that contain more than two words.

Eg.
Client Server Connection
to be split into
Client
Server
Connection



Kindly do help
Regards
cmrhema
Jul 19 '07 #1
bharathreddy
111 New Member
Hi,

Please find the article below. I hope it helps.

Strings in .NET and C#
The System.String type (shorthand string in C#) is one of the most important types in .NET, and unfortunately it's much misunderstood. This article attempts to deal with some of the basics of the type.

What is a string?
A string is basically a sequence of characters. Each character is a Unicode character in the range U+0000 to U+FFFF (more on that later). The string type (I'll use the C# shorthand rather than putting System.String each time) has the following characteristics :

It is a reference type
It's a common misconception that string is a value type. That's because its immutability (see next point) makes it act sort of like a value type. It actually acts like a normal reference type. See my articles on parameter passing and memory for more details of the differences between value types and reference types.
It's immutable
You can never actually change the contents of a string, at least with safe code which doesn't use reflection. Because of this, you often end up changing the value of a string variable. For instance, the code s = s.Replace ("foo", "bar"); doesn't change the contents of the string that s originally referred to - it just sets the value of s to a new string, which is a copy of the old string but with "foo" replaced by "bar".
It can contain nulls
C programmers are used to strings being sequences of characters ending in '\0', the nul or null character. (I'll use "null" because that's what the Unicode code chart calls it in the detail; don't get it confused with the null keyword in C# - char is a value type, so can't be a null reference!) In .NET, strings can contain null characters with no problems at all as far as the string methods themselves are concerned. However, other classes (for instance many of the Windows Forms ones) may well think that the string finishes at the first null character - if your string ever appears to be truncated oddly, that could be the problem.
It overloads the == operator
When the == operator is used to compare two strings, the Equals method is called, which checks for the equality of the contents of the strings rather than the references themselves. For instance, "hello".Substring(0, 4)=="hell" is true, even though the references on the two sides of the operator are different (they refer to two different string objects, which both contain the same character sequence). Note that operator overloading only works here if both sides of the operator are string expressions at compile time - operators aren't applied polymorphically. If either side of the operator is of type object as far as the compiler is concerned, the normal == operator will be applied, and simple reference equality will be tested.
Interning
.NET has the concept of an "intern pool". It's basically just a set of strings, but it makes sure that every time you reference the same string literal, you get a reference to the same string. This is probably language-dependent, but it's certainly true in C# and VB.NET, and I'd be very surprised to see a language it didn't hold for, as IL makes it very easy to do (probably easier than failing to intern literals). As well as literals being automatically interned, you can intern strings manually with the Intern method, and check whether or not there is already an interned string with the same character sequence in the pool using the IsInterned method. This somewhat unintuitively returns a string rather than a boolean - if an equal string is in the pool, a reference to that string is returned. Otherwise, null is returned. Likewise, the Intern method returns a reference to an interned string - either the string you passed in if it was already in the pool, or a newly created interned string, or an equal string which was already in the pool.
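The points above about immutability, == and interning can be checked with a small console program. This is only a sketch: the class name StringIdentityDemo and the helper BuildAtRunTime are made up for illustration, and the string is built at run time so that it is guaranteed to be a different object from the literal.

```csharp
using System;

public static class StringIdentityDemo
{
    // Hypothetical helper: builds "hell" at run time, so the result is
    // a different object from the literal "hell".
    public static string BuildAtRunTime()
    {
        return new string(new[] { 'h', 'e', 'l', 'l' });
    }

    public static void Main()
    {
        string literal = "hell";
        string built = BuildAtRunTime();

        // Both operands are typed as string: the overloaded == compares contents.
        Console.WriteLine(built == literal);    // True

        // Both operands are typed as object: reference equality is used instead.
        object left = built;
        object right = literal;
        Console.WriteLine(left == right);       // False

        // Intern returns the pooled instance for a given character sequence,
        // so interning two equal strings yields the same reference.
        Console.WriteLine(object.ReferenceEquals(
            string.Intern(built), string.Intern(literal)));   // True

        // Replace returns a new string; the original is left untouched.
        string s = "foo fighters";
        string t = s.Replace("foo", "bar");
        Console.WriteLine(s);   // foo fighters
        Console.WriteLine(t);   // bar fighters
    }
}
```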

Literals
Literals are how you hard-code strings into C# programs. There are two types of string literals in C# - regular string literals and verbatim string literals. Regular string literals are similar to those in many other languages such as Java and C - they start and end with ", and various characters (in particular, " itself, \, and carriage return (CR) and line feed (LF)) need to be "escaped" to be represented in the string. Verbatim string literals allow pretty much anything within them, and end at the first " which isn't doubled. Even carriage returns and line feeds can appear in the literal! To obtain a " within the string itself, you need to write "". Verbatim string literals are distinguished by having an @ before the opening quote. Here are some examples of the two types of literal, and what they amount to:

Regular literal          Verbatim literal       Resulting string
"Hello"                  @"Hello"               Hello
"Backslash: \\"          @"Backslash: \"        Backslash: \
"Quote: \""              @"Quote: """           Quote: "
"CRLF:\r\nPost CRLF"     @"CRLF:                CRLF:
                         Post CRLF"             Post CRLF

For other escape sequences, please see the relevant FAQ entry. Note that the difference is only for the compiler's sake. Once the string is in the compiled code, there's no such thing as a verbatim string literal vs a regular string literal.
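To see that the two literal forms really do produce the same string, here is a short sketch that compares a few of the pairs from the table. The class and constant names are invented for the example. The CRLF row is left out because what a verbatim literal contains at a line break depends on the line endings of the source file.

```csharp
using System;

public static class LiteralDemo
{
    // Each pair is the same string written as a regular and a verbatim literal.
    public const string RegularBackslash = "Backslash: \\";
    public const string VerbatimBackslash = @"Backslash: \";

    public const string RegularQuote = "Quote: \"";
    public const string VerbatimQuote = @"Quote: """;

    public static void Main()
    {
        Console.WriteLine(RegularBackslash == VerbatimBackslash);   // True
        Console.WriteLine(RegularQuote == VerbatimQuote);           // True

        // The escape sequence \\ stands for a single backslash character.
        Console.WriteLine(RegularBackslash.Length);                 // 12
    }
}
```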

Strings and the debugger
Numerous people run into problems when inspecting strings in the debugger, both with VS.NET 2002 and VS.NET 2003. Ironically, the problems are often generated by the debugger trying to be helpful, and either displaying the string as a regular string literal with backslash-escaped characters in, or displaying it as a verbatim string literal complete with leading @. This leads to many questions asking how the @ can be removed, despite the fact that it's not really there in the first place - it's only how the debugger's showing it. Also, some versions of VS.NET will stop displaying the contents of the string at the first null character, and evaluate its Length property incorrectly, calculating the value itself instead of asking the managed code. Again, it then considers the string to finish at the first null character.

Given the confusion this has caused, I believe it's best to examine strings in a different way when debugging, at least if you think something odd is going on. I suggest using a method like the one below, which will print the contents of a string to the console in a safe way. Depending on what kind of application you're developing, you may want to write this information to a log file, to the debug or trace listeners, or pop it up in a message box.

    static readonly string[] LowNames =
    {
        "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
        "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
        "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
        "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US"
    };

    public static void DisplayString (string text)
    {
        Console.WriteLine ("String length: {0}", text.Length);
        foreach (char c in text)
        {
            if (c < 32)
            {
                // Control character: show its name rather than printing it
                Console.WriteLine ("<{0}> U+{1:x4}", LowNames[c], (int)c);
            }
            else if (c > 127)
            {
                Console.WriteLine ("(Possibly non-printable) U+{0:x4}", (int)c);
            }
            else
            {
                Console.WriteLine ("{0} U+{1:x4}", c, (int)c);
            }
        }
    }



Memory usage
In the current implementation at least, strings take up 20+(n/2)*4 bytes (rounding the value of n/2 down), where n is the number of characters in the string. The string type is unusual in that the size of the object itself varies. The only other classes which do this (as far as I know) are arrays. Essentially, a string is a character array in memory, plus the length of the array and the length of the string (in characters). The length of the array isn't always the same as the length in characters, as strings can be "over-allocated" within mscorlib.dll, to make building them up easier. (StringBuilder does this, for instance.) While strings are immutable to the outside world, code within mscorlib can change the contents, so StringBuilder creates a string with a larger internal character array than the current contents requires, then appends to that string until the character array is no longer big enough to cope, at which point it creates a new string with a larger array. The string length member also contains a flag in its top bit to say whether or not the string contains any non-ASCII characters. This allows for extra optimisation in some cases.

Although strings aren't null-terminated as far as the API is concerned, the character array is null-terminated, as this means it can be passed directly to unmanaged functions without any copying being involved, assuming the inter-op specifies that the string should be marshalled as Unicode.

Encoding
(If you don't know about character encodings and Unicode, please read my article on the subject first.)

As stated at the start of the article, strings are always in Unicode encoding. The idea of "a Big-5 string" or "a string in UTF-8 encoding" is a mistake (as far as .NET is concerned) and usually indicates a lack of understanding of either encodings or the way .NET handles strings. It's very important to understand this - treating a string as if it represented some valid text in a non-Unicode encoding is almost always a mistake.

Now, the Unicode coded character set (one of the flaws of Unicode is that the one term is used for various things, including a coded character set and a character encoding scheme) contains more than 65536 characters. This means that a single char (System.Char) cannot cover every character. This leads to the use of surrogates where characters above U+FFFF are represented in strings as two characters. Essentially, string uses the UTF-16 character encoding form. Most developers may well not need to know much about this, but it's worth at least being aware of it.
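As a rough illustration of surrogates, the sketch below stores one character above U+FFFF and looks at how it appears in a string. U+1D11E (musical symbol G clef) is just a convenient sample, and the class name is made up.

```csharp
using System;

public static class SurrogateDemo
{
    // U+1D11E lies above U+FFFF, so it cannot fit in a single char.
    public const string GClef = "\U0001D11E";

    public static void Main()
    {
        // One Unicode character, but two UTF-16 code units.
        Console.WriteLine(GClef.Length);                            // 2
        Console.WriteLine(char.IsHighSurrogate(GClef[0]));          // True
        Console.WriteLine(char.IsLowSurrogate(GClef[1]));           // True

        // ConvertToUtf32 combines the surrogate pair back into the code point.
        Console.WriteLine("U+{0:X}", char.ConvertToUtf32(GClef, 0)); // U+1D11E
    }
}
```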

Culture and internationalization oddities
Some of the oddities of Unicode lead to oddities in string and character handling. Many of the string methods are culture-sensitive - in other words, what they do depends on the culture of the current thread. For example, what would you expect "i".ToUpper() to return? Most people would say "I", but in Turkish the correct answer is "İ" (Unicode U+0130, "Latin capital I with dot above"). To perform a culture-insensitive case change, you can use CultureInfo.InvariantCulture, and pass that to the overload of String.ToUpper which takes a CultureInfo.
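A small sketch of the difference, passing the culture explicitly instead of relying on the current thread. The helper names are invented, and the Turkish result assumes that Turkish culture data is actually available on the machine running the code, so treat that line as illustrative.

```csharp
using System;
using System.Globalization;

public static class CaseDemo
{
    // Culture-insensitive upper-casing, as suggested in the article.
    public static string UpperInvariant(string text)
    {
        return text.ToUpper(CultureInfo.InvariantCulture);
    }

    // Assumption: the "tr-TR" culture and its casing rules are available here.
    public static string UpperTurkish(string text)
    {
        return text.ToUpper(new CultureInfo("tr-TR"));
    }

    public static void Main()
    {
        Console.WriteLine(UpperInvariant("i") == "I");              // True

        // Shows U+0130 when Turkish casing rules are applied.
        Console.WriteLine("U+{0:X4}", (int)UpperTurkish("i")[0]);
    }
}
```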

There are further oddities when it comes to comparing, sorting, and finding the index of a substring. Some of these are culture-specific, and some aren't. For instance, in all cultures (as far as I can see), "lassen" and "la\u00dfen" (a "sharp S" or eszett being the Unicode-escaped character in there) are considered equal when CompareTo or Compare are used, but not when Equals is used. IndexOf will treat the eszett as the same as "ss", unless you use a CompareInfo.IndexOf and specify CompareOptions.Ordinal as the options to use.

Some other Unicode characters appear to be completely invisible to the normal IndexOf. Someone asked in the C# newsgroup why a search/replace method was going into an infinite loop. It was repeatedly using Replace to replace all double spaces with a single space, and checking whether or not it had finished by using IndexOf, so that multiple spaces would collapse to a single space. Unfortunately, this was failing due to a "strange" character in the original string between two spaces. IndexOf matched the double space, ignoring the extra character, but Replace didn't. I don't know which exact character was in the real data, but it can be easily reproduced using U+200C which is a zero-width non-joiner character (whatever that means, exactly!). Put one of those in the middle of the text you're searching in, and IndexOf will ignore it, but Replace won't. Again, to make the two methods behave the same, you can use CompareInfo.IndexOf and pass in CompareOptions.Ordinal. My guess is that there's a lot of code which would fail on "awkward" data like this. (I wouldn't for a moment claim that all my code is immune, either.)
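One way to keep the search and the replacement in step is sketched below. Note that it swaps in the String.IndexOf overload that takes a StringComparison rather than the CompareInfo.IndexOf call mentioned above; both request an ordinal search. CollapseSpaces is a hypothetical helper name.

```csharp
using System;

public static class SpaceCollapser
{
    public static string CollapseSpaces(string text)
    {
        // An ordinal search compares char values only, so it agrees with
        // Replace(string, string), which also works ordinally.
        while (text.IndexOf("  ", StringComparison.Ordinal) >= 0)
        {
            text = text.Replace("  ", " ");
        }
        return text;
    }

    public static void Main()
    {
        // A zero-width non-joiner (U+200C) sits between two single spaces.
        string input = "Client   Server \u200C Connection";
        string result = CollapseSpaces(input);

        // The loop terminates, and the U+200C character is left in place.
        Console.WriteLine(result == "Client Server \u200C Connection");   // True
    }
}
```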
Jul 19 '07 #2
RoninZA
78 New Member
That was a little long-winded, so you can try the following code instead. It's C# in VS2005, and I've replaced all space characters with + to improve readability:

    private string[] splitSentence(string sentence)
    {
        //First we're going to strip out any double spaces, and replace them with
        //single spaces
        while (sentence.IndexOf("++") > -1)
            sentence = sentence.Replace("++", "+");

        //Now we're going to split the sentence into separate words, into a string
        //array, which we will be returning out of this function
        string[] words = sentence.Split('+');

        return words;
    }
Hope this helps :)

PS: Remember to replace the '+' with spaces when you test the code!
Jul 19 '07 #3
cmrhema
375 Contributor
Thanks, both of you.
It's resolved.
I did it the following way:

    string MainString = lstClientSendData.Text;
    string[] Split = MainString.Split(new Char[] { ' ' });
    int s1 = Split.Length;
    string final;

    for (int i = 0; i < s1; i++)
    {
        final = Split[i];
        // Skip the empty entries produced by consecutive spaces
        if (final != "")
        {
            MessageBox.Show(final);
        }
    }

Thanks once again.
Jul 20 '07 #4
Plater
7,872 Recognized Expert Expert
Better yet, the Split() function is overloaded to take a 2nd parameter that says what to do with "empty sets".
So if you split a string based on ' ' (a space) and you had the string
"I ran  fast" (2 spaces in between ran and fast), the output array would be
"I"
"ran"
"fast"

instead of
"I"
"ran"
""
"fast"
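
For reference, the second parameter is a StringSplitOptions value, and StringSplitOptions.RemoveEmptyEntries is the one that drops the empty strings. A minimal sketch, assuming .NET 2.0 or later (SplitWords is just a made-up helper name):

```csharp
using System;

public static class WordSplitter
{
    public static string[] SplitWords(string sentence)
    {
        // RemoveEmptyEntries drops the "" entries that appear between
        // consecutive separators.
        return sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Two spaces between "Client" and "Server", one before "Connection".
        foreach (string word in SplitWords("Client  Server Connection"))
        {
            Console.WriteLine(word);
        }
        // Prints Client, Server and Connection on separate lines.
    }
}
```

This handles any number of words and any run of spaces in one call, so neither the while loop nor the check for empty strings is needed.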
Jul 20 '07 #5
cmrhema
375 Contributor
Yes Plater, that's exactly what I wanted in my program.
Jul 21 '07 #6
