Parsing C# string using RegEx

Hi there,

I have another question for .NET RegEx experts.

I am reading in a C Sharp file line by line and I am trying to detect
comments that start with either // or ///. What I am particularly
interested in is the comment text itself. I want some stats on
the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments but not the xml tags.

Here are a few of the regular expressions that I have tried, without
success.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"

I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the rightmost set of comments.

Thanks so much for any input!
Natalia

Nov 16 '05 #1
Natalia, you need to grab the comments together with their XML and then
post-process what you've grabbed using an XML DOM. You can easily modify the
last regex that I sent to allow for /// documentation comments, then append
all such instances into a single string to process as XML.

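Regex regex = new Regex(
    "(?ms)(?# Specify our options )" +
    "^.*?((?<lineComment>///?)|/\\*)" +
    "(?<comments>.*?)" +
    "(?(lineComment)$|\\*/)");

// Collect the text of every /// documentation comment for XML post-processing.
// 'source' is assumed to hold the full contents of the C# file.
string xmlString = "";
foreach (Match match in regex.Matches(source)) {
    if (match.Groups["lineComment"].Value == "///") {
        xmlString += match.Groups["comments"].Value;
    }
}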

Regular expressions are not a jack of all trades, nor are they the best or
fastest parsing structure for all cases. Use the right tool for the job.
Hope this helps in your endeavor.
--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers
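
The post-processing step suggested above could look roughly like the sketch below. It is not Justin's code: it assumes modern C# with top-level statements, uses LINQ to XML's XElement in place of the XML DOM he mentions, finds the /// lines with a simple prefix check rather than his regex, and wraps everything in a made-up <doc> root element so the fragments parse as one document. ExtractDocCommentText and the sample lines are hypothetical.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

// Returns the text of all /// documentation comments with the XML tags removed.
// The <doc> wrapper is an assumption: doc comments have no single root element.
string ExtractDocCommentText(string[] lines)
{
    var xml = new StringBuilder("<doc>");
    foreach (string line in lines)
    {
        string trimmed = line.TrimStart();
        if (trimmed.StartsWith("///", StringComparison.Ordinal))
        {
            // Keep everything after the /// marker, one comment line per line.
            xml.Append(trimmed.Substring(3)).Append('\n');
        }
    }
    xml.Append("</doc>");

    // XElement.Parse throws XmlException if the comments are not well-formed XML.
    // Value concatenates the text nodes, which drops the tags themselves.
    return XElement.Parse(xml.ToString()).Value;
}

string[] source =
{
    "/// <summary>",
    "/// Adds <c>x</c> to the total.",
    "/// </summary>",
    "int Add(int x) { return total += x; }",
};

string text = ExtractDocCommentText(source);
Console.WriteLine(text.Trim());
Console.WriteLine(Encoding.UTF8.GetByteCount(text.Trim()));
```

Letting an XML parser strip the tags avoids having to describe nested tags in the regex at all, at the cost of failing on comments that are not well-formed XML.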

Nov 16 '05 #2
Hi, my comments are inline.

"Natalia DeBow" <na***********@unisys.com> wrote in message
news:c9**********@si05.rsvl.unisys.com...
@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"
Problems:
1) '.+' inside "</?.+>" is greedy and will match anything, including '>'
2) '.*' inside (?<comments>.*) will match anything, including '<'
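
A small sketch of the first problem, assuming modern C# with top-level statements; the sample line is made up for illustration. It shows why only the rightmost comment text is left over for the comments group: the greedy tag pattern swallows everything up to the last '>' on the line.

```csharp
using System;
using System.Text.RegularExpressions;

string line = "/// <summary> first </summary> second";

// Original tag pattern: '.+' is greedy and '.' also matches '>', so one "tag"
// stretches from the first '<' to the last '>' on the line.
string greedyTag = Regex.Match(line, @"</?.+>").Value;

// Negated character class: the tag match stops at the first '>'.
string boundedTag = Regex.Match(line, @"</?[^>]+>").Value;

Console.WriteLine(greedyTag);   // <summary> first </summary>
Console.WriteLine(boundedTag);  // <summary>
```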

I suggest trying this:

strRex = @"///?\s(?:(?:<[^>]+>)|(?<comments>[^<]+))*";

For cases d and e the pattern still matches, but the comments group captures
nothing, because those lines don't contain any comment text you want.
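
One thing worth knowing when using a pattern like this: in .NET, a group that sits inside a repetition keeps every capture in Group.Captures, while Group.Value only returns the last one. The sketch below is a hedged illustration, not part of the original reply. It assumes modern C# with top-level statements, and CountCommentBytes, the trimming of whitespace and the choice of UTF-8 are all assumptions made for the example.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Pattern from the reply above.
var commentRegex = new Regex(@"///?\s(?:(?:<[^>]+>)|(?<comments>[^<]+))*");

// Returns the number of UTF-8 bytes of comment text on one source line.
// Group.Value would expose only the last capture, so walk Group.Captures.
int CountCommentBytes(string line)
{
    Match match = commentRegex.Match(line);
    if (!match.Success) return 0;

    int bytes = 0;
    foreach (Capture capture in match.Groups["comments"].Captures)
    {
        // Trimming is an assumption: whitespace around tags is not counted.
        bytes += Encoding.UTF8.GetByteCount(capture.Value.Trim());
    }
    return bytes;
}

string sample = "/// <summary> Adds <c>x</c> to the total </summary>";
Console.WriteLine(CountCommentBytes(sample));  // prints 17
```

Note that the pattern requires one whitespace character after the slashes, so a comment written as //text with no space is not counted by this sketch.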

HTH,
greetings

Nov 16 '05 #3
There is a great Visual Studio .NET add-in called Project Line Counter
that you should have a look at. It is downloadable from www.wndtabs.com

Nov 16 '05 #4