Hi,
Is there a class that can split a string on commas such
that commas inside quotes are ignored?
I know we can use the Text::ParseWords module in Perl to do this, but I
am new to C#/.NET and couldn't find anything similar.
For example:
string str = "one field, two field, \"field val one, field val two, field val three\", three field";
Then I should get the following in str_arr:
str_arr[0] = "one field"
str_arr[1] = "two field"
str_arr[2] = "field val one, field val two, field val three"
str_arr[3] = "three field"
Thanks in advance,
Ashoo.
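For readers on .NET Framework 2.0 or later (an assumption; it is not available in VS2003 / .NET 1.1), one class that fits this description is Microsoft.VisualBasic.FileIO.TextFieldParser, which can also be used from C#. The sketch below is only a minimal illustration; the wrapper class and method names (FieldParserSketch, ParseLine) are hypothetical, and on .NET Framework the project needs a reference to the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class FieldParserSketch
{
    // Hypothetical helper: parses one comma-delimited line and
    // treats double-quoted sections as single fields.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // commas inside quotes stay in the field
            parser.TrimWhiteSpace = true;            // strip spaces around each field
            return parser.ReadFields();              // null if there is nothing to read
        }
    }
}
```

Calling FieldParserSketch.ParseLine(str) on the example string should give the four fields listed in the question. The parser throws MalformedLineException on input it cannot parse, so real code would probably want to catch that.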
string[] str_arr = str.Split(",");
string[] str_arr = str.Split(',');
does not work. I tried it. It gives me
str_arr[0] = "one field"
str_arr[1] = "two field"
str_arr[2] = "\"field val one"
str_arr[3] = "field val two"
str_arr[4] = "field val three\""
str_arr[5] = "three field"
instead of
str_arr[0] = "one field"
str_arr[1] = "two field"
str_arr[2] = "field val one, field val two, field val three"
str_arr[3] = "three field"
Look at the str_arr[2] value.
Thanks,
Ashoo
--
Sent via .NET Newsgroups http://www.dotnetnewsgroups.com
Hi,
You can use a regular expression to get this result. Here is
one that should give you what you want. I
assumed that you are using VS2003.
(?:(?<Vals>.*?),)+"(?<Vals>.*?)"(?:,(?<Vals>.*?))+$
The output for your example will look like this:
SubMatch: [Vals]
1:one field
2:two field
3:field val one, field val two,field val three
4:three field
Try it out.
At this time, using Split more than once seems to be the only option. I
can't recall any other function that would do this more quickly.
How do you use that function in Perl? Post some code and maybe it will
ring a bell for some C# users.
Thanks
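The "split more than once" idea can be sketched roughly as follows: split on the quote character first, then split only the pieces that were outside quotes on commas. This is just one way to read the suggestion, not the poster's own code. The names TwoPassSplitter and SplitQuoted are made up for the example, and it assumes .NET 2.0 or later for List<string>.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPassSplitter
{
    // Hypothetical helper: splits on commas but keeps quoted runs whole.
    public static List<string> SplitQuoted(string input)
    {
        var fields = new List<string>();

        // Pass 1: split on the quote character. Pieces at even indexes
        // were outside quotes; pieces at odd indexes were inside quotes.
        string[] pieces = input.Split('"');

        for (int i = 0; i < pieces.Length; i++)
        {
            if (i % 2 == 1)
            {
                // Quoted text: keep it as one field, commas included.
                fields.Add(pieces[i]);
                continue;
            }

            // Pass 2: unquoted text is split on commas.
            foreach (string part in pieces[i].Split(','))
            {
                string trimmed = part.Trim();
                if (trimmed.Length > 0)
                {
                    fields.Add(trimmed); // empty leftovers next to quotes are dropped
                }
            }
        }
        return fields;
    }
}
```

Note the trade-offs: this version drops empty unquoted fields (such as "a,,b"), and it assumes the quotes are balanced and that a quoted section is a whole field by itself.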
This is how you could do it in Perl:
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords;
my $line = "one field, two field, \"field val one, field val two,field val three\", three field";
# parse_line(delimiter, keep, line): keep = 0 removes the quotes from the result
my @fields = parse_line(',', 0, $line);
print "$_\n" for @fields;
Thanks,
Ashoo
// Needs: using System.Collections; using System.Text;
public ArrayList parseWords(string s)
{
    if (s == null)
    {
        return null;
    }
    bool bQuote = false;                        // true while inside a quoted section
    ArrayList al = new ArrayList();
    StringBuilder sTemp = new StringBuilder();  // collects the current field
    for (int i = 0; i < s.Length; i++)
    {
        switch (s[i])
        {
            case ',':
                if (!bQuote)
                {
                    // Comma outside quotes: the current field is complete.
                    al.Add(sTemp.ToString());
                    sTemp.Length = 0;
                }
                else
                {
                    // Comma inside quotes is part of the field.
                    sTemp.Append(s[i]);
                }
                break;
            case '"':
                // Toggle the quote state; the quote character itself is dropped.
                bQuote = !bQuote;
                break;
            default:
                sTemp.Append(s[i]);
                break;
        }
    }
    // Add the last field (skipped if it is empty).
    if (sTemp.Length > 0)
    {
        al.Add(sTemp.ToString());
    }
    return al;
}
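One thing to watch with the parseWords scanner in this thread: it keeps the space that follows each comma, so the second element comes back as " two field" rather than "two field". A trimmed variant of the same character-by-character idea might look like the sketch below. The names QuoteAwareScanner and Parse are hypothetical, List<string> assumes .NET 2.0 or later, and unlike parseWords this version returns an empty list for null and always emits the last field, even when it is empty.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareScanner
{
    // Hypothetical helper: one pass over the string, toggling a flag on quotes.
    public static List<string> Parse(string s)
    {
        var fields = new List<string>();
        if (s == null)
        {
            return fields;
        }

        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in s)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;               // toggle state, drop the quote
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString().Trim()); // field boundary
                current.Length = 0;
            }
            else
            {
                current.Append(c);                  // ordinary character or quoted comma
            }
        }

        fields.Add(current.ToString().Trim());      // last field, possibly empty
        return fields;
    }
}
```

Trimming after the quotes are removed also strips spaces just inside the quotes, which may or may not be what you want.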
Have you tried using regular expressions, instead of the Split()?
Hi,
I tried the regular expression, and I am using VS.NET 2003.
This is how I have used it:
Regex regEx = new Regex("(??<Vals>.*?),)+\"(?<Vals>.*?)\"(?:,(?<Vals>.*?))+$");
string[] text1 = regEx.Split(text);
I am getting the following run-time error:
"An unhandled exception of type 'System.ArgumentException' occurred in
system.dll
Additional information: parsing
"(??<Vals>.*?),)+"(?<Vals>.*?)"(?:,(?<Vals>.*?))+$" - Unrecognized
grouping construct."
Can you please advise as to what I am doing wrong?
Thanks,
Ashoo
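If the goal is to keep using Regex.Split, a different pattern from the one in this thread may be easier to work with: split only on commas that are followed by an even number of quote characters, i.e. commas that are not inside a quoted section. The sketch below is an illustration under the assumption that the quotes are balanced and never escaped; the class and method names (RegexSplitSketch, Split) are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexSplitSketch
{
    // A comma counts as a separator only if the remainder of the string
    // holds an even number of quotes (so the comma is outside any quoted run).
    private static readonly Regex CommaOutsideQuotes =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    // Hypothetical helper: split, then strip surrounding spaces and quotes.
    public static string[] Split(string text)
    {
        string[] parts = CommaOutsideQuotes.Split(text);
        for (int i = 0; i < parts.Length; i++)
        {
            parts[i] = parts[i].Trim().Trim('"');
        }
        return parts;
    }
}
```

The lookahead rescans the rest of the string for every comma, so this is probably fine for single lines but could be slow on very long input.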
This is good stuff. How would the rest of the code look? I tried just
using plain regex but I couldn't get it to return the array.
Ron
This one removes all the unwanted characters as well:
// Needs: using System; using System.Collections;
public Form1()
{
    InitializeComponent();
    ArrayList al = ParseString(" \"M1, M2, M3, M4, \"S1, S2, S3 , S4,\"M5, M6,\" S5, S6, \" M7, M8, M9 \"");
    foreach (String aItem in al)
    {
        Console.WriteLine(aItem);
    }
}
public ArrayList ParseString(string strInput)
{
    string strTemp = "";     // text collected since the last emitted item
    string ModString = "";
    bool bQuote = false;     // true while inside a quoted section
    ArrayList aParsedString = new ArrayList();
    for (int i = 0; i < strInput.Length; i++)
    {
        if (strInput[i] == '"' && !bQuote)
        {
            bQuote = true;
        }
        else if (strInput[i] == '"' && bQuote)
        {
            // Closing quote: emit the quoted text as one item.
            ModString = strTemp.Replace("\"", "").Trim();
            aParsedString.Add(ModString);
            strTemp = "";
            bQuote = false;
        }
        strTemp += strInput[i];
        if (strInput[i] == ',' && !bQuote)
        {
            // Comma outside quotes: emit the item collected so far.
            ModString = strTemp.Replace(",", " ").Replace("\"", "").Trim();
            aParsedString.Add(ModString);
            strTemp = "";
        }
    }
    // Note: text after the last comma or closing quote is not emitted, and a
    // comma directly after a closing quote produces an empty item.
    return aParsedString;
}
Hi Ashoo,
Here is the implementation of the expression:
Regex reg = new Regex("(?:(?.*?),)+[\\s]\"(?.*?)\"(?:,[\\s](?.*?))+$");
MatchCollection MatchColl;
MatchColl = reg.Matches("one field, two field, \"field val one, field val two, field val three\", three field");
string[] vals;
foreach (Match mat in MatchColl) {
    vals = Array.CreateInstance(typeof(string), mat.Groups["Vals"].Captures.Count);
    int i = 0;
    foreach (Capture cap in mat.Groups["Vals"].Captures) {
        vals(i) = cap.Value();
        i++;
    }
}
I wrote this in VB.NET and converted it to C# for you, so check the
syntax. In any case, the code runs.
Let me know if you have any questions about it.
Lucky
Also note that I have modified the expression slightly. You can set the
Regex options (ignore case, multiline, and so on) as your requirements
dictate, but in your case I think only "ignore case" is needed.
Lucky,
I wish I could get this to work but I'm getting two major errors:
parsing "(?.*?),)+[\s]"(?.*?)"(?:,[\s](?.*?))+$" - Unrecognized grouping
construct.
And the looping part is generating a whole series of errors.
"RSH" <wa*************@yahoo.com> wrote in news:ud8yT#fFGHA.3120
@TK2MSFTNGP10.phx.gbl: parsing "(?.*?),)+[\s]"(?.*?)"(?:,[\s](?.*?))+$" - Unrecognized grouping construct.
Looks like the poster missed a '(' at the beginning. The second ')' is
unmatched. I haven't tested, but try just adding another '(' at the very
beginning.
-mdb
Hi,
As I said, I wrote this in VB.NET and converted it for you, but the
converter missed some parts, so I have rewritten the code by hand in C#.
Here is the code. Try it and let me know.
Regex reg = new Regex("(?:(?<Vals>.*?),)+[\\s]\"(?<Vals>.*?)\"(?:,[\\s](?<Vals>.*?))+$");
MatchCollection MatchColl;
MatchColl = reg.Matches("one field, two field, \"field val one, field val two, field val three\", three field");
string[] vals;
foreach (Match mat in MatchColl)
{
    // One array slot per capture of the named group "Vals".
    vals = new string[mat.Groups["Vals"].Captures.Count];
    int i = 0;
    foreach (Capture cap in mat.Groups["Vals"].Captures)
    {
        vals[i] = cap.Value;
        i++;
    }
}
You need to import this namespace in order to use this code:
using System.Text.RegularExpressions;
Lucky
http://spaces.msn.com/members/staceyw/Blog/cns!1pnsZpX0fPvDxLKC6rAAhLsQ!352.entry
--
William Stacey [MVP]