Regular Expression Pattern Help

Hello,
I want to parse some HTML with a regex and have the following problem:
--- html to parse start ---

some text <span class="x">
some text with linebreaks
and tabs and <b>tags <i>in it</i>
goes here
</span> another text

--- html to parse end ---
Now my question: How do I write the pattern so that I get
"some text with linebreaks and tabs and <b>tags <i>in it</i> goes here"
as a match?

TIA Martin
Nov 15 '05 #1
I highly recommend getting Regular Expression Workbench by Eric Gunnerson.
It helps a lot with this sort of stuff.
http://www.gotdotnet.com/Community/U...1-4ee2729d7322

The important thing is to set RegexOptions to Singleline, so that the dot also
matches line breaks. Another important note is to use .*? to match zero or more
characters between the span tags non-greedily (the ? makes it non-greedy), that
is, the match ends at the next </span> tag found.

// In a verbatim (@) string, a double quote is written as "".
System.Text.RegularExpressions.Regex regex = new
System.Text.RegularExpressions.Regex(@"<span class=""x"">(?<Text>.*?)</span>",
System.Text.RegularExpressions.RegexOptions.Singleline);

System.Text.RegularExpressions.Match match = regex.Match(str);

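This creates a named group called Text; read it with match.Groups["Text"].Value
once match.Success is true. See the MS help for Group.Captures (which Match
inherits) for what else you can do with the result of the line above.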
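
Putting the advice above together, here is a minimal sketch of a complete program, assuming the HTML is already in a string. The class name SpanExtractDemo and the helper ExtractSpanText are hypothetical names made up for illustration, and the whitespace-collapsing step is an addition to produce the single-line result asked for in the question; it is not part of the pattern itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpanExtractDemo
{
    // Hypothetical helper: returns the inner HTML of the first
    // <span class="x">, with runs of whitespace collapsed to single
    // spaces, or null when the pattern does not match.
    public static string ExtractSpanText(string html)
    {
        // Singleline lets '.' match '\n'; '.*?' stops at the first </span>.
        Regex regex = new Regex(
            @"<span class=""x"">(?<Text>.*?)</span>",
            RegexOptions.Singleline);

        Match match = regex.Match(html);
        if (!match.Success)
        {
            return null;
        }

        // Assumption: the asker wants line breaks and tabs folded into spaces.
        string inner = match.Groups["Text"].Value;
        return Regex.Replace(inner, @"\s+", " ").Trim();
    }

    public static void Main()
    {
        string html =
            "some text <span class=\"x\">\n" +
            "some text with linebreaks\n" +
            "\tand tabs and <b>tags <i>in it</i>\n" +
            "goes here\n" +
            "</span> another text";

        // Prints: some text with linebreaks and tabs and <b>tags <i>in it</i> goes here
        Console.WriteLine(ExtractSpanText(html));
    }
}
```

Because .*? ends at the first </span> it finds, this sketch would cut the result short if the span itself contained another span; for nested markup an HTML parser is likely the safer choice.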

ms-help://MS.VSCC.2003/MS.MSDNQTR.2003FEB.1033/cpref/html/frlrfsystemtextregularexpressionsgroupclasscapturestopic.htm

Hope that gets you started!

Mike Mayer - Visual C# MVP
http://www.mag37.com/csharp/
mi**@mag37.com

"Martin Andert" <ma***********@gmx.de> wrote in message
news:9a**************************@posting.google.com...

Nov 15 '05 #2
