RegEx problem

Hi,
I have a problem with the following code and can't find the bug:

// Set [8,9,54]
ArrayList aArray = new ArrayList();
regStr = new Regex(@"\[(?:(\d+)[,]?)*(\d+)\]");
if (text != null && regStr.IsMatch(text))
{
    Match m = regStr.Match(text);
    GroupCollection groups = m.Groups;
    number = 0;
    for (int i = 1; i < groups.Count; i++)
    {
        foreach (Capture c in groups[i].Captures)
        {
            aArray.Add(c.Value.ToString());
            number++;
        }
    }
}

[8,9] : that works; my aArray contains 8 and 9
[16,5] : OK, I get 16 and 5
[16,34] : not OK, I get 3 items in my array: 16, 3 and 4
[16] : not OK, I get 2 items in my array: 1 and 6

Why does m.Groups have 3 groups for [16,34]? Likewise for [16]: why does
m.Groups have 2 groups?
I think it must be the last part of my regular expression, (\d+). This is one
group even if there are more numbers in it. How can I solve this?

Thanks in advance,
jac

Jun 28 '07 #1
"jac" <ja*@discussions.microsoft.com> wrote in message
news:00**********************************@microsoft.com...
> Hi,
> I have a problem with the following code and can't find the bug:
>
> // Set [8,9,54]
> ArrayList aArray = new ArrayList();
> regStr = new Regex(@"\[(?:(\d+)[,]?)*(\d+)\]");

Why the '?' behind '[,]'?
It allows the pattern to match only part of a number and put the rest
into the next number.
And why the brackets around the comma?
They seem superfluous to me.
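
To make the effect visible, here is a minimal, self-contained sketch that runs the pattern and the capture loop from the question over the four inputs listed there. Only the pattern and the loop come from the thread; the class and method names (OriginalPatternDemo, CaptureAll) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OriginalPatternDemo
{
    // The pattern from the question, unchanged (including the optional comma).
    private static readonly Regex Original = new Regex(@"\[(?:(\d+)[,]?)*(\d+)\]");

    // Hypothetical helper: collects every capture of every numbered group,
    // in the same order as the loop in the question.
    public static List<string> CaptureAll(string text)
    {
        var values = new List<string>();
        Match m = Original.Match(text);
        if (!m.Success) return values;
        for (int i = 1; i < m.Groups.Count; i++)
        {
            foreach (Capture c in m.Groups[i].Captures)
            {
                values.Add(c.Value);
            }
        }
        return values;
    }

    public static void Main()
    {
        foreach (string input in new[] { "[8,9]", "[16,5]", "[16,34]", "[16]" })
        {
            Console.WriteLine(input + " -> " + string.Join(" | ", CaptureAll(input)));
        }
        // [8,9] -> 8 | 9
        // [16,5] -> 16 | 5
        // [16,34] -> 16 | 3 | 4
        // [16] -> 1 | 6
    }
}
```

The repeated group first swallows every number, which leaves nothing for the trailing (\d+). The engine then backtracks, and because the comma is optional, a single digit such as "3" counts as a complete repetition, so the last digit is handed to the trailing group.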

Christof
Jun 28 '07 #2
\[(?<number>\d+)(?:,(?<number>\d+))*\]

should do the trick. Your current pattern leaves too many options open: both
the comma and the whole first group are optional, which they should not be.

The new expression reads

find a [
find a number (one or more digits)
optionally find a comma followed by a number
repeat optional group if possible
find a ]

Both numbers are captured in the same named group, which makes it easier
to extract the values:

Match m = regStr.Match(text);
foreach (Capture c in m.Groups["number"].Captures)
{
    aArray.Add(c.Value);
}

number = aArray.Count;
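
The fragment above assumes that regStr, text, aArray and number already exist. A self-contained sketch of the same idea could look like the following; the class name NamedGroupDemo and the helper Extract are hypothetical, and List<string> is used in place of ArrayList purely as an assumption for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Both occurrences of (?<number>...) feed the same group, so all captures
    // end up in one CaptureCollection, in order of appearance.
    private static readonly Regex Numbers =
        new Regex(@"\[(?<number>\d+)(?:,(?<number>\d+))*\]");

    // Hypothetical helper: returns the numbers as strings, or an empty list
    // when the text is null or contains no match.
    public static List<string> Extract(string text)
    {
        var values = new List<string>();
        if (text == null) return values;

        Match m = Numbers.Match(text);
        if (!m.Success) return values;

        foreach (Capture c in m.Groups["number"].Captures)
        {
            values.Add(c.Value);
        }
        return values;
    }

    public static void Main()
    {
        foreach (string input in new[] { "[8,9,54]", "[16,34]", "[16]" })
        {
            Console.WriteLine(input + " -> " + string.Join(" | ", Extract(input)));
        }
        // [8,9,54] -> 8 | 9 | 54
        // [16,34] -> 16 | 34
        // [16] -> 16
    }
}
```

Note that the pattern is not anchored, so Match finds the first bracketed set anywhere in the text; adding ^ and $ would be one way to require that the whole string is a set.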

Alternatively, you could use string.Split with '[', ',' and ']' as
separator characters, which would probably be faster as well. You can
instruct string.Split to ignore empty entries.

string[] results = "[16,23,1]".Split(new char[] { ',', '[', ']' },
StringSplitOptions.RemoveEmptyEntries);
int number = results.Length;

I'd prefer this solution over the regex one.

Jesse
Jun 28 '07 #3
Because I can have zero or more sets like 15,12,5,13, hence ((\d+)[,]?).
In the set I can have 0 or 1 comma, but the set can occur multiple times
(example: [12,4,56,7,14,25,12]) or not at all, and then I think I fall into
the last part of it (example: [45]).

"Christof Nordiek" wrote:
> Why the '?' behind '[,]'?
> It allows the pattern to match only part of a number and put the rest
> into the next number.
> And why the brackets around the comma?
> They seem superfluous to me.
Jun 28 '07 #4
Hello,

First, a very good and detailed answer! (You got a positive rating from me.)

But I would prefer the string.Split solution that you also presented.
A quick test with a loop and two timestamps will show you why!

All the best,

Martin

"Jesse Houwing" wrote:
Optionally you could also do a string.Split with '[', ',' and ']' as
separator characters which would probably be faster as well. You can
instruct string.Split to ignore empty groups.

Jun 28 '07 #5
Thank you, it works nicely, and it was a very good description of how to
read a regex.
"Jesse Houwing" wrote:
The new expression reads

find a [
find a number (one or more digits)
optionally find a comma followed by a number
repeat optional group if possible
find a ]

Jun 28 '07 #6
* Martin# wrote, On 28-6-2007 18:40:
> Hello,
>
> First, a very good and detailed answer! (You got a positive rating from me.)

Thank you :)

> But I would prefer the string.Split solution that you also presented.
> A quick test with a loop and two timestamps will show you why!

I haven't tested it, but my guess is that the difference is major. Regex can
do beautiful things, but it isn't the best tool for every problem. As I
said before, I'd prefer this solution over the regex one: it's both
easier to read and faster. The only problem is that it doesn't validate
the input, while the regex would do that for you.

I'm not sure whether an int.TryParse would slow down the loop you tried
enough to make it slower than a regex, though; my guess is that it's still
faster than a regex.
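
A rough sketch of the Split-plus-int.TryParse idea, for anyone who wants the validation back without a regex. SplitDemo and TryParseSet are hypothetical names, and the strictness rules (brackets required, digits only, no empty entries) are an assumption about what "valid" should mean here. Unlike the Split call posted earlier, it deliberately does not use StringSplitOptions.RemoveEmptyEntries, so input such as [16,,3] is rejected. No timing claim is made for it.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SplitDemo
{
    // Hypothetical helper: parses "[n,n,...]" into integers.
    // Returns false (and an empty list) if the text is not a well-formed set.
    public static bool TryParseSet(string text, out List<int> numbers)
    {
        numbers = new List<int>();
        if (text == null || text.Length < 2 ||
            text[0] != '[' || text[text.Length - 1] != ']')
        {
            return false;
        }

        var parsed = new List<int>();
        string inner = text.Substring(1, text.Length - 2);
        foreach (string part in inner.Split(','))
        {
            int value;
            // NumberStyles.None accepts decimal digits only:
            // no sign, no whitespace, no empty string.
            if (!int.TryParse(part, NumberStyles.None,
                              CultureInfo.InvariantCulture, out value))
            {
                return false;
            }
            parsed.Add(value);
        }

        numbers = parsed;
        return true;
    }

    public static void Main()
    {
        List<int> numbers;
        bool ok = TryParseSet("[16,23,1]", out numbers);
        Console.WriteLine(ok + ": " + string.Join(",", numbers)); // True: 16,23,1
    }
}
```

One difference from the regex worth keeping in mind: numbers too large for an int make TryParse fail, whereas \d+ would happily match them.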

> All the best,
>
> Martin

And to you.

Jesse
Jun 28 '07 #7
"jac" <ja*@discussions.microsoft.com> wrote in message
news:86**********************************@microsoft.com...
> Because I can have zero or more sets like 15,12,5,13, hence ((\d+)[,]?).
> In the set I can have 0 or 1 comma, but the set can occur multiple times
> (example: [12,4,56,7,14,25,12]) or not at all, and then I think I fall into
> the last part of it (example: [45]).

But the 45 would simply be the last number, which is already in the regex,
and the previous group, the one with the comma, would be matched zero times.
Actually, that's the cause of the fault: the first part can match even
if there is no comma.
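
If the aim is to keep the original two-group shape, the smallest change consistent with this explanation is to make the comma mandatory inside the repeated group. The sketch below is an illustration of that idea, not something posted in the thread; the names MandatoryCommaDemo and CaptureAll are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MandatoryCommaDemo
{
    // Same structure as the pattern in the question, but the comma is no
    // longer optional, so a repetition can never end in the middle of a number.
    private static readonly Regex Fixed = new Regex(@"\[(?:(\d+),)*(\d+)\]");

    // Hypothetical helper: same capture loop as in the question.
    public static List<string> CaptureAll(string text)
    {
        var values = new List<string>();
        Match m = Fixed.Match(text);
        if (!m.Success) return values;
        for (int i = 1; i < m.Groups.Count; i++)
        {
            foreach (Capture c in m.Groups[i].Captures)
            {
                values.Add(c.Value);
            }
        }
        return values;
    }

    public static void Main()
    {
        foreach (string input in new[] { "[8,9,54]", "[16,34]", "[16]" })
        {
            Console.WriteLine(input + " -> " + string.Join(" | ", CaptureAll(input)));
        }
        // [8,9,54] -> 8 | 9 | 54
        // [16,34] -> 16 | 34
        // [16] -> 16
    }
}
```

With this pattern, group 1 holds every number that is followed by a comma and group 2 holds the last one; for [16] group 1 simply has no captures.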

Christof
Jun 29 '07 #8
