Extract text with C# and RegExp

Hello,
I have an HTML text with custom tags that look like HTML comments,
such as:

"text text text <p>text</p> text test test
text text text <p>text</p> text test test
<!-- @MyTag@ -->extract this<!-- /@MyTag@ -->
text text text <p>text</p> text test test
<!-- @MyTag@ -->and this<!-- /@MyTag@ -->
text text text <p>text</p> text test test"

My regexp should extract the first part of the text up to the first opening tag
(<!-- @MyTag@ -->), then the text between the tags ("extract this" and "and
this"). I've had a headache trying to find the right pattern. Any help? Thanks!

Alberto

Jul 18 '07 #1
* Alberto Sartori wrote, On 18-7-2007 15:10:
[quoted question snipped]

<!-- @(?<tagname>\w+)@ -->(?<content>.*?)<!-- /@\k<tagname>@ -->

should do the trick.

<!-- @            looks for the beginning of an opening tag
(?<tagname>\w+)   looks for the name of the tag and captures it
@ -->             end of the opening tag

(?<content>.*?)   captures the contents of the tag

<!-- /@           looks for the beginning of an end tag
\k<tagname>       ensures it's the same tag name as the one before
@ -->             end of the end tag

The tagname is captured in a group named 'tagname' and the content of
the tag in a group named 'content'.

Once you've gotten a match in your text you can reference the contents
like this:

Match m = Regex.Match(...);
if (m.Success)
{
    // name of the matched tag, e.g. "MyTag"
    string tagname = m.Groups["tagname"].Value;
    // text between the opening and the closing tag
    string content = m.Groups["content"].Value;
}

Kind regards,

Jesse
Jul 18 '07 #2
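
The reply above reads a single match. The question also asks for the text before the first opening tag and for the content of every tag pair, so here is a minimal sketch of how that could look. It reuses Jesse's pattern unchanged. The class name MyTagExtractor, both method names, and the use of RegexOptions.Singleline (so that tag content may span several lines) are assumptions added for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named here only for illustration.
public static class MyTagExtractor
{
    // Pattern taken from the reply above. RegexOptions.Singleline is an
    // assumption: it lets '.' match newlines inside a tag's content.
    private static readonly Regex TagPattern = new Regex(
        @"<!-- @(?<tagname>\w+)@ -->(?<content>.*?)<!-- /@\k<tagname>@ -->",
        RegexOptions.Singleline);

    // Returns everything before the opening tag of the first complete
    // tag pair, or the whole input when no tag pair is found.
    public static string TextBeforeFirstTag(string input)
    {
        Match first = TagPattern.Match(input);
        return first.Success ? input.Substring(0, first.Index) : input;
    }

    // Returns the content of every tag pair, in document order.
    public static List<string> TagContents(string input)
    {
        var contents = new List<string>();
        foreach (Match m in TagPattern.Matches(input))
        {
            contents.Add(m.Groups["content"].Value);
        }
        return contents;
    }
}
```

Called on the sample text from the question, TagContents should return the two strings "extract this" and "and this", and TextBeforeFirstTag should return the lines that precede the first <!-- @MyTag@ --> comment. Regex.Matches is used instead of Regex.Match because it returns all non-overlapping matches rather than only the first one.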
