Bytes | Software Development & Data Engineering Community

Regular expression match

New to Regex and I'm having a hard time figuring this one out.

I need a regular expression that will match based on balanced square brackets.
For example:
[$AA[123]], [BB[bb],CC], [a[b[c]]]

I'm trying to write a regex that will parse the above into 3 pieces:
1) [$AA[123]] (or $AA[123] would be fine)
2) [BB[bb],CC]
3) [a[b[c]]]

Basically, any time the square brackets balance, I want to be able to pluck
out that value.

I wrote: (?<column>\[([^\[\]]*(\[.*\])*)*\])

Basically, look for a [, then repeatedly take either a character that is
neither "[" nor "]", or find an internal set of [...] and take everything
between them, until you hit a ]. It's the latter part that doesn't work, as I
can't seem to figure out how to get the brackets to balance properly. Is this
just something regular expressions are not meant to do? Thanks for any help.
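
A short C# sketch of why this pattern cannot balance anything: the `.*` inside the inner group is greedy and knows nothing about nesting, so it runs to the end of the line and backs off only as far as it must to find the closing brackets. The class and method names below are illustrative and not from the post.

```csharp
using System;
using System.Text.RegularExpressions;

class GreedyOvershoot
{
    // The pattern from the post, unchanged.
    public const string Original = @"(?<column>\[([^\[\]]*(\[.*\])*)*\])";

    // Returns the text captured by the named group "column" for the first match.
    public static string FirstColumn(string input)
    {
        return Regex.Match(input, Original).Groups["column"].Value;
    }

    static void Main()
    {
        string input = "[$AA[123]], [BB[bb],CC], [a[b[c]]]";
        // The greedy .* consumes the rest of the line, then gives back just
        // enough characters for the two closing \] tokens to match.
        Console.WriteLine(FirstColumn(input));
    }
}
```

On the sample line this prints the whole input as a single match rather than three separate pieces.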
Nov 15 '05 #1
I think String.Split(',') will help, since you have a comma as the delimiter.
For example:
string s = "[a][b][c[d]], [a[b]], c[d][e]";
string[] sa = s.Split(',');

This gives you an array whose length you can check and whose elements you can traverse with a for loop.
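
A quick check of this suggestion against the sample line from the original post (a sketch; `CommaSplit` and `Pieces` are names invented here). A plain split also cuts at the comma inside [BB[bb],CC], so it returns four pieces instead of three:

```csharp
using System;

class CommaSplit
{
    // Splits on every comma, including commas nested inside brackets.
    public static string[] Pieces(string input)
    {
        return input.Split(',');
    }

    static void Main()
    {
        string input = "[$AA[123]], [BB[bb],CC], [a[b[c]]]";
        string[] parts = Pieces(input);
        Console.WriteLine(parts.Length);        // 4, not the 3 pieces wanted
        foreach (string part in parts)
        {
            Console.WriteLine(part.Trim());     // "[BB[bb]" and "CC]" come out separately
        }
    }
}
```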

:-)
Kalpesh

Nov 15 '05 #2
Hi Jed,

Why don't you create a state machine instead? Iterate through the string and
count [ (+1) and ] (-1). When the count is 0 and you are at a comma, you
know that the comma is a delimiter.
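
The counting idea can be sketched in C# roughly as follows. This is a minimal version that assumes the brackets are already balanced and does no error checking; `BracketSplitter` and `SplitTopLevel` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

class BracketSplitter
{
    // Splits the input on commas that sit outside all square brackets.
    public static List<string> SplitTopLevel(string input)
    {
        var pieces = new List<string>();
        int depth = 0;   // +1 for '[', -1 for ']'
        int start = 0;   // index where the current piece begins

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '[')
            {
                depth++;
            }
            else if (c == ']')
            {
                depth--;
            }
            else if (c == ',' && depth == 0)
            {
                // A comma at depth 0 is a delimiter between sets.
                pieces.Add(input.Substring(start, i - start).Trim());
                start = i + 1;
            }
        }

        // The last piece has no trailing comma.
        pieces.Add(input.Substring(start).Trim());
        return pieces;
    }

    static void Main()
    {
        foreach (string piece in SplitTopLevel("[$AA[123]], [BB[bb],CC], [a[b[c]]]"))
        {
            Console.WriteLine(piece);
        }
    }
}
```

On the sample line this yields [$AA[123]], [BB[bb],CC] and [a[b[c]]]. A fuller version would probably also reject input where the depth goes negative or does not end at zero.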

--
Miha Markic - RightHand .NET consulting & software development
miha at rthand com

Nov 15 '05 #3
Jed,

Try this:

(?:(?:^|\s*,\s*)(?<column>\[(([^,]*\[[^,]*\][^,]*)*|[^,]*)\]))

This will not work for the string you wrote, since there is a comma in this
expression: [BB[bb],CC]. Is this a mistake?

If it's not a mistake, get back to me and we will figure it out.

Picho

Nov 15 '05 #6
Thanks for the help, Picho. The comma is not a mistake, unfortunately.
Really, the only rules for the syntax are that the square brackets will be
balanced and that a comma (outside all brackets) will separate each set.
Between the brackets, all characters are valid, along with additional sets
of square brackets. I am trying to get only the outermost set in each case
(as shown in my original post), as the inner brackets have a different
meaning than the outer brackets (I did not make up this syntax!).

Thanks for any additional ideas you might have.

Nov 15 '05 #8
Thanks, Miha. I was hoping to break apart my whole file using regular
expressions. I've broken the file down to this point and was trying to
stick with regular expressions. I have actually written a procedure that does
exactly what you suggest, which I had planned as a temporary solution. When I
started, I thought a regular expression would be able to handle something
like this fairly easily, but that just shows my complete lack of
experience with them.

Now it's become more of a crusade just to see whether it's in fact possible to
balance brackets using regular expressions.

Thanks for the feedback.

Nov 15 '05 #9
Hi hemol,

Yes, I understand you. However, regular expressions are more about pattern
matching than parameter matching, IMO.
The kind of problem you are describing is just not meant to be solved with
regex, IMO.
Anyway, I am curious too whether anybody comes up with a regex solution.
--
Miha Markic - DXSquad/RightHand .NET consulting & software development
miha at rthand com


Nov 15 '05 #10

This should work:

\[(?>[^\[\]]+|\[(?<DEPTH>)|\](?<-DEPTH>))*(?(DEPTH)(?!))\]

This is based on the method described in the book "Mastering Regular
Expressions" by Jeffrey E. F. Friedl (O'Reilly). It is an excellent book
that covers regular expressions in many different languages.

Basically, the .NET flavor of Regex allows for matching nested constructs
like this. It is a very powerful feature, but it can be a little tricky.
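
For readers unfamiliar with the syntax, here is the same pattern spread over several lines with comments, using RegexOptions.IgnorePatternWhitespace. This is an annotated sketch rather than anything taken from the book; the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class BalancedBrackets
{
    // Same pattern as above; whitespace and # comments are ignored by the engine.
    const string Pattern = @"
        \[                    # opening bracket of the outermost set
        (?>                   # atomic group: do not give back what was consumed
            [^\[\]]+          #   a run of non-bracket characters
          | \[ (?<DEPTH>)     #   nested opening bracket: push an empty capture onto DEPTH
          | \] (?<-DEPTH>)    #   nested closing bracket: pop one capture, fails if DEPTH is empty
        )*
        (?(DEPTH)(?!))        # if DEPTH still holds a capture, force the match to fail
        \]                    # closing bracket of the outermost set
    ";

    // Returns every outermost balanced [...] set found in the input.
    public static List<string> OutermostSets(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern, RegexOptions.IgnorePatternWhitespace))
        {
            result.Add(m.Value);
        }
        return result;
    }

    static void Main()
    {
        foreach (string set in OutermostSets("[$AA[123]], [BB[bb],CC], [a[b[c]]]"))
        {
            Console.WriteLine(set);
        }
    }
}
```

The key point is that a closing bracket can only be consumed inside the loop while DEPTH is non-empty, so the loop stops just before the bracket that closes the outermost set, and the final \] picks that one up.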
Brian Davis
www.knowdotnet.com


Nov 15 '05 #11
Thanks for the expression. Unfortunately, this didn't seem to pick up the
right pieces. It missed the first set of square brackets. So when parsing
something like:

[AA[aa]BB],[CC]

It returned [aa] and [CC] (and a bunch of empty results).

There is some syntax in your expression that I don't understand (a fair
amount), so I'll have to study it. In any case, if nothing else, I learned
something about regular expressions, and that I'm much better at writing state
machines than regular expressions! Thanks for the help.

Nov 15 '05 #12

When I test it, it seems to work as expected. Here is the code snippet and
the output:

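Code:
// Requires: using System; using System.Text.RegularExpressions;
// Keep the pattern on one line: a line break inside a verbatim (@"...")
// string would become part of the pattern.
MatchCollection mc = Regex.Matches(
    "[AA[aa]BB],[CC]",
    @"\[(?>[^\[\]]+|\[(?<DEPTH>)|\](?<-DEPTH>))*(?(DEPTH)(?!))\]");
foreach (Match m in mc)
{
    Console.WriteLine(m.Value);
}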

Output:
[AA[aa]BB]
[CC]
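
If only the text inside the outer brackets is wanted (the original post said $AA[123] would be fine), one possible variation wraps the repeated part in an extra named group. The group name `inner` and the class and method names are assumptions of this sketch, not part of the posted pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class InnerContent
{
    // The posted pattern with one added named group, "inner", around the loop.
    const string Pattern =
        @"\[(?<inner>(?>[^\[\]]+|\[(?<DEPTH>)|\](?<-DEPTH>))*)(?(DEPTH)(?!))\]";

    // Returns the contents of each outermost set without its enclosing brackets.
    public static List<string> InnerContents(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            result.Add(m.Groups["inner"].Value);
        }
        return result;
    }

    static void Main()
    {
        foreach (string inner in InnerContents("[AA[aa]BB],[CC]"))
        {
            Console.WriteLine(inner);   // AA[aa]BB, then CC
        }
    }
}
```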
Brian Davis
www.knowdotnet.com

Nov 15 '05 #13
