Problem with Regular Expressions in .NET

Hi,

I'm using regular expressions to extract some information from my
vb.net source code files.

I have something like this:

1: '<class name="xyz" description="xxxxxx"/>
2: Class xyz
... other lines of code ...
y: End Class

I want to use a regular expression to extract the part of the file
from line 1 to line y as a single string. To do this I use the
following regular expression pattern:

'\s*<class.*End\s+Class

But it doesn't work: it returns zero matches. I tried setting the
IgnoreCase flag and the Multiline flag, but neither helps. Note
that if I use the following pattern:

'\s*<class.*/>

to extract only line 1, it works.
So it seems that .* (which should match all characters between
'<class and End Class) doesn't work when the text spans multiple lines.

What can I do?

Thanks in advance for the help.

Bye

Gianluca
Nov 16 '05 #1
3 Replies


1. You need the Singleline flag (RegexOptions.Singleline); otherwise the
'.' character won't match newlines.
2. Use something like "\bclass\b.*\bEnd\s+Class\b"; the "\b" ensures a word
boundary.
3. This won't work with nested classes, or with more than one class per file.
4. This won't recognize comments or strings: if your code contains a string
literal "end class" or a comment ' end class, the regex will think this is
the end of the class.
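
A minimal sketch of point 1, assuming the sample input from the question (the module name and the exact sample text are illustrative, not taken from the poster's real files). It runs the original pattern once with default options and once with RegexOptions.Singleline:

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SinglelineDemo
    Sub Main()
        ' Sample input modeled on the snippet in the question (assumed content).
        Dim source As String =
            "'<class name=""xyz"" description=""xxxxxx""/>" & vbCrLf &
            "Class xyz" & vbCrLf &
            "    ' ... other lines of code ..." & vbCrLf &
            "End Class" & vbCrLf

        ' The pattern from the original question, unchanged.
        Dim pattern As String = "'\s*<class.*End\s+Class"

        ' Default options: "." does not match the line feed, so there is no match.
        Dim plain As Match = Regex.Match(source, pattern)
        Console.WriteLine(plain.Success)    ' prints False

        ' Singleline: "." also matches newlines, so the whole span is found.
        Dim dotAll As Match = Regex.Match(source, pattern, RegexOptions.Singleline)
        Console.WriteLine(dotAll.Success)   ' prints True
        If dotAll.Success Then Console.WriteLine(dotAll.Value)
    End Sub
End Module
```

Because .* is greedy, a file with several classes would be matched from the first '<class comment up to the last End Class, which is the limitation described in point 3.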

Correct handling of these cases is possible, but quite complex. If you need
this, maybe parsing the file line by line would be easier.
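
One way the line-by-line idea could look, as a rough sketch rather than a complete VB.NET parser. ExtractClassBlocks is a hypothetical helper, and the sample classes Inner and abc are made up for illustration. It assumes each class declaration and each End Class sits at the start of its own line, so attributes on the same line, line continuations, and other block types are not handled:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LineByLineDemo
    ' Hypothetical helper: collects each block that starts at a '<class ...> comment
    ' line and ends at the End Class that closes the outermost class after it.
    Function ExtractClassBlocks(source As String) As List(Of String)
        Dim blocks As New List(Of String)()
        Dim current As List(Of String) = Nothing
        Dim depth As Integer = 0

        For Each line As String In source.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.None)
            Dim trimmed As String = line.Trim()

            If current Is Nothing Then
                ' Not inside a block yet: wait for the marker comment.
                If Regex.IsMatch(trimmed, "^'\s*<class\b", RegexOptions.IgnoreCase) Then
                    current = New List(Of String) From {line}
                    depth = 0
                End If
                Continue For
            End If

            current.Add(line)
            ' Comment lines start with ' and so never match the anchored patterns below.
            If Regex.IsMatch(trimmed, "^End\s+Class\b", RegexOptions.IgnoreCase) Then
                depth -= 1
                If depth <= 0 Then
                    blocks.Add(String.Join(vbCrLf, current))
                    current = Nothing
                End If
            ElseIf Regex.IsMatch(trimmed, "^(?:\w+\s+)*Class\s+\w+", RegexOptions.IgnoreCase) Then
                ' A (possibly nested) class declaration, with optional modifiers.
                depth += 1
            End If
        Next
        Return blocks
    End Function

    Sub Main()
        ' Assumed sample: one class with a nested class, followed by a second class.
        Dim source As String = String.Join(vbCrLf, New String() {
            "'<class name=""xyz"" description=""xxxxxx""/>",
            "Class xyz",
            "    Class Inner",
            "    End Class",
            "End Class",
            "'<class name=""abc"" description=""yyyyyy""/>",
            "Class abc",
            "End Class"})

        For Each block As String In ExtractClassBlocks(source)
            Console.WriteLine(block)
            Console.WriteLine("----")
        Next
    End Sub
End Module
```

Anchoring the patterns at the start of the trimmed line is what keeps a comment such as ' end class, or a string literal in the middle of a line, from being mistaken for the end of the class.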

Niki

Nov 16 '05 #2

Niki,

Ten minutes ago I wrote in the general group that I had not seen any
messages from you for some days. My mistake, I see.

Cor
Nov 16 '05 #3

"Niki Estner" <ni*********@cube.net> wrote in message news:<#6*************@TK2MSFTNGP09.phx.gbl>...

You are right: with the Singleline flag set, it works.
I thought that this flag was the default, which is why I
didn't try it before.
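
For reference, a small sketch of how the two flags differ, since neither is on by default. The sample string and module name are illustrative assumptions. Multiline only changes where ^ and $ match, while Singleline changes what "." matches:

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexOptionsDemo
    Sub Main()
        ' Two lines separated by a line feed (assumed sample text).
        Dim sample As String = "first" & vbLf & "second"

        ' Default options: "." does not match the newline.
        Console.WriteLine(Regex.IsMatch(sample, "first.second"))                           ' prints False

        ' Multiline does not affect "."; it only lets ^ and $ match at line boundaries.
        Console.WriteLine(Regex.IsMatch(sample, "first.second", RegexOptions.Multiline))   ' prints False
        Console.WriteLine(Regex.IsMatch(sample, "^second$"))                               ' prints False
        Console.WriteLine(Regex.IsMatch(sample, "^second$", RegexOptions.Multiline))       ' prints True

        ' Singleline makes "." match every character, newline included.
        Console.WriteLine(Regex.IsMatch(sample, "first.second", RegexOptions.Singleline))  ' prints True
    End Sub
End Module
```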

Thanks for your suggestions.

Bye

Gianluca

Nov 16 '05 #4
