I think this is very simple but I am having difficulty doing it. Basically
take a comma separated list:
abc, def, ghi, jk
A list with only one token does not have any commas:
abc
The first letter of each token (abc) must not be a number. I am simply
trying to parse it to get an array of tokens:
abc
def
ghi
jk
....or for the single token one:
abc
I can easily do this with String.Replace and String.Split, but would like to
do this with regular expressions. Yet I cannot seem to get it to work, here
is what I have so far:
String input = "abc, def, ghi, jk";
String pattern = @"^((?<name>\D.*?)(\x2C )?)+?$";
Match match = Regex.Match(input, pattern, RegexOptions.ExplicitCapture);
Any input would be appreciated,
Thanks
I don't think regular expressions are the right tool for this job, Bob.
Regular expressions are used to search for patterns, that is, strings which
share certain characteristics but are not identical. In your
case, you want to convert a comma-delimited string into an array, and
String.Split() does just that.
--
HTH,
Kevin Spencer
Microsoft MVP
.Net Developer
A watched clock never boils.
In article <uL**************@TK2MSFTNGP09.phx.gbl>,
Bob <no****@nowhere.com> wrote:
: I think this is very simple but I am having difficult doing it. Basically
: take a comma separated list:
: abc, def, ghi, jk
:
: A list with only one token does not have any commas:
: abc
:
: The first letter of each token (abc) must not be a number. I am simply
: trying to parse it to get an array of tokens:
: abc
: def
: ghi
: jk
:
: ...or for the single token one:
: abc
:
: I can easily do this with String.Replace and String.Split, but would like to
: do this with regular expressions. Yet I cannot seem to get it to work, here
: is what I have so far:
:
: String input = "abc, def, ghi, jk";
: String pattern = @"^((?<name>\D.*?)(\x2C )?)+?$";
: Match match = Regex.Match(input, pattern, RegexOptions.ExplicitCapture);
:
: Any input would be appreciated,
Consider the following code:
static void Main(string[] args)
{
    string[] inputs = new string[]
    {
        "abc, def, ghi, jk",
        "abc",
        "good, 1bad, good, 2bad",
        "trailingcomma,",
        ",",
        ",,",
        ",,,",
    };

    string pattern =
        @"^(
            (
              |                  # ignore empties
              (?<token>\D.*?)    # a token worth keeping
              |\d.*?             # or one to ignore
            )
            \s*                  # eat trailing whitespace
            (,\s*|$)             # separator or done
          )+$                    # catch a sequence of the above
        ";

    Regex tokens = new Regex(pattern, RegexOptions.IgnorePatternWhitespace);

    foreach (string input in inputs)
    {
        Match m = tokens.Match(input);

        Console.WriteLine("input = [" + input + "]:");
        if (m.Success)
        {
            if (m.Groups["token"].Captures.Count > 0)
                foreach (Capture c in m.Groups["token"].Captures)
                    Console.WriteLine(" - [" + c.Value + "]");
            else
                Console.WriteLine(" - no captures");
        }
        else
            Console.WriteLine(" - no match.");
    }
}
Its output is
input = [abc, def, ghi, jk]:
 - [abc]
 - [def]
 - [ghi]
 - [jk]
input = [abc]:
 - [abc]
input = [good, 1bad, good, 2bad]:
 - [good]
 - [good]
input = [trailingcomma,]:
 - [trailingcomma]
input = [,]:
 - no captures
input = [,,]:
 - no captures
input = [,,,]:
 - no captures
It's easy to anticipate Jon Skeet's objections to the regular
expression above, and he'd certainly be on solid ground. Passing the
result of a split through a filter would be much clearer, e.g.,
public static void ExtractGoodTokens(string[] inputs)
{
    Regex goodtoken = new Regex(@"^\D");

    foreach (string input in inputs)
    {
        ArrayList goodtokens = new ArrayList();

        foreach (string token in Regex.Split(input, @"\s*,\s*"))
            if (goodtoken.IsMatch(token))
                goodtokens.Add(token);

        Console.WriteLine("input = [" + input + "]:");
        if (goodtokens.Count > 0)
            foreach (string token in goodtokens)
                Console.WriteLine(" - [" + token + "]");
        else
            Console.WriteLine(" - none");
    }
}
Hope this helps,
Greg
--
I have felt for a long time that a talent for programming consists largely
of the ability to switch readily from microscopic to macroscopic views of
things, i.e., to change levels of abstraction fluently.
-- Donald E. Knuth, "Structured Programming with go to Statements"
How about
string[] aryList = strList.Split(new char[] {','});
???
--
HTH,
Kevin Spencer
Microsoft MVP
.Net Developer
A watched clock never boils.
On Sun, 30 Oct 2005 20:06:37 -0800, "Bob" <no****@nowhere.com> wrote:
: I can easily do this with String.Replace and String.Split, but would like to
: do this with regular expressions. Yet I cannot seem to get it to work, here
: is what I have so far:
:
: String input = "abc, def, ghi, jk";
: String pattern = @"^((?<name>\D.*?)(\x2C )?)+?$";
This pattern is far from what you want.
First of all, because it starts with ^ and ends with $, it will always
match either the complete string or nothing at all.
Secondly, a group's Value doesn't hold multiple matches; it only holds the
last capture made within the given regular expression match (the earlier
ones are reachable only through the group's Captures collection). All
ExplicitCapture does is make sure that (\x2C ) as well as the outer
parentheses don't count as groups. The value of the "name" group will only
contain the characters captured on the last loop.
This leads to the third problem. As the regex is written, it will
capture a single character and then simply loop and repeat.
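A minimal sketch of the three points above, assuming the pattern and input exactly as posted. The class name CaptureDemo and the helper Run are made up for illustration. It prints the whole match, the value of the "name" group, and every entry in that group's Captures collection.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Runs the pattern from the original post, unchanged.
    // (Run is a hypothetical helper name, not part of any library.)
    public static Match Run(string input)
    {
        string pattern = @"^((?<name>\D.*?)(\x2C )?)+?$";
        return Regex.Match(input, pattern, RegexOptions.ExplicitCapture);
    }

    public static void Main()
    {
        Match match = Run("abc, def, ghi, jk");
        Group name = match.Groups["name"];

        // Anchored with ^ and $, so a successful match spans the whole input.
        Console.WriteLine("whole match: [" + match.Value + "]");

        // Value reports only the last capture made by the group.
        Console.WriteLine("name value:  [" + name.Value + "]");

        // Captures keeps every iteration of the loop.
        Console.WriteLine("captures:    " + name.Captures.Count);
        foreach (Capture c in name.Captures)
            Console.WriteLine(" - [" + c.Value + "]");
    }
}
```

Run against the posted input, the group value should come out as the single character k, with one capture per letter, because the lazy .*? never needs to grow.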
This is how it should be done:
(Using RegexOptions.IgnorePatternWhitespace)
string patternSplit =
    @"
    (?<=,|^)  #The character preceding the match is either a comma or
              #the beginning of the string
    \D.*?     #The string itself should be a non-digit followed by
              #any number of characters
    (?=,|$)   #The first character after the match should be , or
              #the end of the string
    ";
This will find all the valid substrings while ignoring those beginning
with a digit.
It will, however, not complain if the string contains invalid
entries. For example, "12abc,def,ghi" will return "def" and "ghi" as
the two matches while just ignoring 12abc.
If you need to validate that the string doesn't contain any invalid
entries, you will have to write a separate regular expression that
tries to capture the entire string.
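Here is a rough sketch of how the two pieces could fit together. Extract uses the lookaround pattern above, written on one line. The AllValid pattern is only my guess at what a separate whole-string check might look like, and the names LookaroundTokens, Extract and IsValid are invented for the example. Note that the lookaround pattern is written for separators without a following blank, as in "12abc,def,ghi"; with ", " separators the blank becomes the first character of the next match, so that case would need extra handling.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundTokens
{
    // The lookaround pattern from the post, on a single line.
    private static readonly Regex Token =
        new Regex(@"(?<=,|^)\D.*?(?=,|$)");

    // Hypothetical validation pattern (an assumption, not from the thread):
    // every entry must start with a character that is not a digit, comma
    // or whitespace, and entries are separated by a comma plus optional blanks.
    private static readonly Regex AllValid =
        new Regex(@"^\s*[^\d,\s][^,]*(,\s*[^\d,\s][^,]*)*$");

    // Collects every match of the lookaround pattern.
    public static List<string> Extract(string input)
    {
        List<string> result = new List<string>();
        foreach (Match m in Token.Matches(input))
            result.Add(m.Value);
        return result;
    }

    // True only if no entry in the list starts with a digit.
    public static bool IsValid(string input)
    {
        return AllValid.IsMatch(input);
    }

    public static void Main()
    {
        string input = "12abc,def,ghi";
        Console.WriteLine("valid: " + IsValid(input));
        foreach (string token in Extract(input))
            Console.WriteLine(" - [" + token + "]");   // def and ghi, as described above
    }
}
```

Keeping extraction and validation as two patterns means the caller can decide whether a bad entry should be skipped silently or rejected.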
--
Marcus Andrén
Forgot to add, remove the members that start with a number.
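Something along these lines, as a sketch only. The class and method names (SplitAndFilter, GoodTokens) are made up, and I am assuming that blank entries should be dropped and that "number" means a leading decimal digit.

```csharp
using System;
using System.Collections.Generic;

public static class SplitAndFilter
{
    // Splits on commas, trims each piece, and keeps only the pieces
    // that are non-empty and do not start with a digit.
    public static List<string> GoodTokens(string input)
    {
        List<string> good = new List<string>();
        foreach (string raw in input.Split(','))
        {
            string token = raw.Trim();
            if (token.Length > 0 && !char.IsDigit(token[0]))
                good.Add(token);
        }
        return good;
    }

    public static void Main()
    {
        foreach (string token in GoodTokens("good, 1bad, good, 2bad"))
            Console.WriteLine(token);
    }
}
```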
--
HTH,
Kevin Spencer
Microsoft MVP
.Net Developer
A watched clock never boils.