* Alberto Sartori wrote, On 18-7-2007 15:10:
Hello,
I have an HTML text with custom tags that look like HTML comments,
such as:
"text text text <p>text</p>text test test
text text text <p>text</p>text test test
<!-- @MyTag@ -->extract this<!-- /@MyTag@ -->
text text text <p>text</p>text test test
<!-- @MyTag@ -->and this<!-- /@MyTag@ -->
text text text <p>text</p>text test test"
My regexp should extract the first part of the text up to the first
opening tag (<!-- @MyTag@ -->), then the text between the tags ("extract
this" and "and this"). Finding the right pattern has given me a headache.
Any help? Thanks!
Alberto
<!-- @(?<tagname>\w+)@ -->(?<content>.*?)<!-- /@\k<tagname>@ -->
should do the trick.
<!-- @            matches the beginning of an opening tag
(?<tagname>\w+)   matches the name of the tag and captures it
@ -->             matches the end of the opening tag
(?<content>.*?)   captures the contents of the tag (lazily, so it stops at the nearest matching end tag)
<!-- /@           matches the beginning of an end tag
\k<tagname>       ensures it's the same tag name as the one captured before
@ -->             matches the end of the end tag
The tagname is captured in a group named 'tagname' and the content of
the tag in a group named 'content'.
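A minimal sketch of the pattern in use, assuming C#/.NET (the named-group and \k syntax suggests it). The class name TagPatternDemo and the sample strings are made up for illustration. RegexOptions.Singleline is an addition that is not part of the pattern above: without it, '.' does not match a newline, so tag content spanning several lines would not be matched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagPatternDemo
{
    // Pattern from the post. RegexOptions.Singleline is an assumption added
    // here so that '.' also matches newlines inside the tag content.
    public static readonly Regex TagRegex = new Regex(
        @"<!-- @(?<tagname>\w+)@ -->(?<content>.*?)<!-- /@\k<tagname>@ -->",
        RegexOptions.Singleline);

    public static void Main()
    {
        // Content spanning two lines: matched only because of Singleline.
        Match multiLine = TagRegex.Match(
            "<!-- @MyTag@ -->line one\nline two<!-- /@MyTag@ -->");
        Console.WriteLine(multiLine.Success);   // True

        // Closing tag name differs from the opening one, so the
        // \k<tagname> backreference prevents a match.
        Match mismatch = TagRegex.Match(
            "<!-- @MyTag@ -->text<!-- /@Other@ -->");
        Console.WriteLine(mismatch.Success);    // False
    }
}
```

If tag content never contains line breaks, the Singleline option can be dropped and the pattern behaves as written in the post.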
Once you've gotten a match in your text you can reference the contents
like this:
Match m = Regex.Match(...);
if (m.Success)
{
    string tagname = m.Groups["tagname"].Value;
    string content = m.Groups["content"].Value;
}
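To cover the whole request (the text before the first opening tag, then each tagged piece), one possible approach is to loop over Regex.Matches and use Match.Index for the leading text. This is only a sketch, assuming C#/.NET; the class name ExtractTagsDemo, the helper names LeadingText and TagContents, and the shortened sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExtractTagsDemo
{
    // Pattern from the post, unchanged.
    private const string Pattern =
        @"<!-- @(?<tagname>\w+)@ -->(?<content>.*?)<!-- /@\k<tagname>@ -->";

    // Returns the text before the first tag pair
    // (the whole input if no tag pair is found).
    public static string LeadingText(string input)
    {
        Match first = Regex.Match(input, Pattern);
        return first.Success ? input.Substring(0, first.Index) : input;
    }

    // Returns the content of every tag pair, in document order.
    public static List<string> TagContents(string input)
    {
        var contents = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            contents.Add(m.Groups["content"].Value);
        }
        return contents;
    }

    public static void Main()
    {
        // Shortened, hypothetical sample modelled on the question.
        string input =
            "text <p>text</p> test\n" +
            "<!-- @MyTag@ -->extract this<!-- /@MyTag@ -->\n" +
            "text <p>text</p> test\n" +
            "<!-- @MyTag@ -->and this<!-- /@MyTag@ -->\n";

        Console.WriteLine(LeadingText(input));
        foreach (string content in TagContents(input))
        {
            Console.WriteLine(content);
        }
    }
}
```

Regex.Match returns only the first match, so Regex.Matches (or repeated calls to Match.NextMatch) is needed when the text contains more than one tag pair.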
Kind regards,
Jesse