469,936 Members | 2,317 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,936 developers. It's quick & easy.

Parsing C# string usinsg RegEx

Hi there,

I have another question for .NET RegEx experts.

I am reading in a C Sharp file line by line and I am trying to detect
comments that start with either // of ///. What I am particularly
interested is the comments themselves. I am interested in some stats with
regards to the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments but not the xml tags.

Here are a few of regular expressions that I have tried but
unsuccessfully.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"

I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the right most set of comments.

Thanks so much for any input!
Natalia

Nov 16 '05 #1
3 2111
Natalie, you need to grab the comments with XML and then post-process what
you've
grabbed using an XML dom. You can easily modify the last regex that I sent to
allow
for documentation comments ///, append all such instances into a string to
process as XML.

Regex regex = new Regex(
"(?ms)(?# Specify our options )" +
"^.*?((?<lineComment>///?)|/\\*)" +
"(?<comments>.*?)" +
"(?(lineComment)$|\\*/)");

if ( match.Groups["lineComment"].Value == "///" ) {
string xmlString += match.Groups["comments"].Value;
}

Expressions are not a jack of all trades, nor are they the best or fastest
parsing structure for
all cases. Use the right tool for the job. Hope this helps in your endeavor.
--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

"Natalia DeBow" <na***********@unisys.com> wrote in message
news:c9**********@si05.rsvl.unisys.com...
Hi there,

I have another question for .NET RegEx experts.

I am reading in a C Sharp file line by line and I am trying to detect
comments that start with either // of ///. What I am particularly
interested is the comments themselves. I am interested in some stats with
regards to the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments but not the xml tags.

Here are a few of regular expressions that I have tried but
unsuccessfully.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"

I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the right most set of comments.

Thanks so much for any input!
Natalia

Nov 16 '05 #2
Hi, inline

"Natalia DeBow" <na***********@unisys.com> wrote in message
news:c9**********@si05.rsvl.unisys.com...
Hi there,

I have another question for .NET RegEx experts.

I am reading in a C Sharp file line by line and I am trying to detect
comments that start with either // of ///. What I am particularly
interested is the comments themselves. I am interested in some stats with
regards to the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments but not the xml tags.

Here are a few of regular expressions that I have tried but
unsuccessfully.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"
Problems:
1) '.+' inside "</?.+>", will match anything including '>'
2) '.*' inside (?<comments>.*), will match anything including '<'

I suggest trying this:

strRex = @"///?\s(?:(?:<[^>]+>)|(?<comments>[^<]+))*";

Case d and e will not match, because they don't contain any comments you
want.

HTH,
greetings


I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the right most set of comments.

Thanks so much for any input!
Natalia

Nov 16 '05 #3
There is a great Visual Studio .NET add-in called Project Line Counter
you should have a look at downloadable from www.wndtabs.com

"Natalia DeBow" <na***********@unisys.com> wrote in message
news:c9**********@si05.rsvl.unisys.com...
Hi there,

I have another question for .NET RegEx experts.

I am reading in a C Sharp file line by line and I am trying to detect
comments that start with either // of ///. What I am particularly
interested is the comments themselves. I am interested in some stats with
regards to the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments but not the xml tags.

Here are a few of regular expressions that I have tried but
unsuccessfully.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"

I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the right most set of comments.

Thanks so much for any input!
Natalia

Nov 16 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

9 posts views Thread by Ravi Singh (UCSD) | last post: by
17 posts views Thread by Mark | last post: by
3 posts views Thread by aspineux | last post: by
6 posts views Thread by John Rogers | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.