Parsing C# string using RegEx

Hi there,

I have another question for .NET RegEx experts.

I am reading in a C Sharp file line by line and I am trying to detect
comments that start with either // or ///. What I am particularly
interested in is the comments themselves. I want some stats with
regard to the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments but not the xml tags.

Here are a few of the regular expressions that I have tried, but
unsuccessfully.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"

I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the right most set of comments.

Thanks so much for any input!
Natalia

Nov 16 '05 #1
Natalia, you need to grab the comments along with their XML and then
post-process what you've grabbed using an XML DOM. You can easily modify
the last regex that I sent to allow for documentation comments (///),
then append all such instances into a string to process as XML.

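Regex regex = new Regex(
    "(?ms)(?# Specify our options )" +
    "^.*?((?<lineComment>///?)|/\\*)" +
    "(?<comments>.*?)" +
    "(?(lineComment)$|\\*/)");

// Collect the text of every /// documentation comment for XML processing.
// 'source' is assumed to hold the contents of the C# file being scanned.
string xmlString = "";
foreach (Match match in regex.Matches(source)) {
    if (match.Groups["lineComment"].Value == "///") {
        xmlString += match.Groups["comments"].Value;
    }
}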

Regular expressions are not a jack of all trades, nor are they the best
or fastest parsing structure for all cases. Use the right tool for the
job. Hope this helps in your endeavor.
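
As a rough illustration of the "post-process as XML" idea, here is a minimal sketch. It assumes each documentation comment sits on its own line starting with ///, and that the combined comments form well-formed XML once wrapped in a root element. The names DocCommentText, Extract and the <doc> wrapper are made up for this example, and it uses XElement from System.Xml.Linq rather than the XmlDocument DOM; either should work.

```csharp
using System;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of the original reply.
public static class DocCommentText
{
    // Joins the bodies of all /// lines, wraps them in a made-up <doc> root,
    // and lets the XML parser strip the tags, leaving only the text content.
    public static string Extract(string[] lines)
    {
        var xml = new StringBuilder("<doc>");
        foreach (string line in lines)
        {
            string trimmed = line.TrimStart();
            if (trimmed.StartsWith("///", StringComparison.Ordinal))
            {
                // Drop the leading "///" and keep the rest of the line.
                xml.Append(trimmed.Substring(3)).Append('\n');
            }
        }
        xml.Append("</doc>");

        // XElement.Parse throws XmlException if the comments are not
        // well-formed XML (for example an unescaped '&' or '<').
        return XElement.Parse(xml.ToString()).Value;
    }
}
```

The returned string still carries the original line breaks and indentation, so it may be worth calling Trim() on it before counting anything.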
--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

Nov 16 '05 #2
Hi, inline

"Natalia DeBow" <na***********@unisys.com> wrote in message
news:c9**********@si05.rsvl.unisys.com...

Here are a few of the regular expressions that I have tried, but
unsuccessfully.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"
Problems:
1) '.+' inside "</?.+>" is greedy and will match anything, including '>'
2) '.*' inside (?<comments>.*) will match anything, including '<'

I suggest trying this:

strRex = @"///?\s(?:(?:<[^>]+>)|(?<comments>[^<]+))*";

Cases d and e will not yield any captures in the comments group, because
they don't contain any comment text you want.
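
A minimal sketch of how the pattern above might be applied to one line at a time. In .NET, Group.Value holds only the last capture of a repeated group, which would explain why only the rightmost comment comes back; Group.Captures holds every capture. The class name CommentText and the method Extract are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original reply.
public static class CommentText
{
    // Pattern suggested above: skip tags, capture the text between them.
    private static readonly Regex Rex = new Regex(
        @"///?\s(?:(?:<[^>]+>)|(?<comments>[^<]+))*");

    // Returns every non-blank comment fragment found on one source line.
    public static List<string> Extract(string line)
    {
        var fragments = new List<string>();
        Match match = Rex.Match(line);
        if (!match.Success) return fragments;

        // Captures keeps every repetition of the group; Value keeps only the last.
        foreach (Capture capture in match.Groups["comments"].Captures)
        {
            string text = capture.Value.Trim();
            if (text.Length > 0) fragments.Add(text);
        }
        return fragments;
    }
}
```

For the comment-bytes statistic, one option is to sum System.Text.Encoding.UTF8.GetByteCount(fragment) over the returned list. Note that a "//" inside a string literal (a URL, say) would also be picked up by this pattern.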

HTH,
greetings


Nov 16 '05 #3
There is a great Visual Studio .NET add-in called Project Line Counter
that you should have a look at. It can be downloaded from www.wndtabs.com

Nov 16 '05 #4
