get inner content with regular expression

How can I get the inner content of a tag with a regular expression?

I couldn't get the opening and closing tags to match properly.

Input
"fjkdjfkdj <div>sadfdf dfdf <b>dfd</b>dfdf<div>nested<div>tags</div>.</div>
</div>dfdfdf"

Get content of the first div tag

Output
"sadfdf dfdf <b>dfd</b>dfdf<div>nested<div>tags</div>.</div>"
Thank you,
Sami

Aug 24 '08 #1
There is no simple way to do it with an ordinary regular expression, other
than hardcoding the nesting, as there are nested <div> and </div> tags inside
the main <div> tag. In the program I'm building right now, I use regex to find
the content between two tags in HTML (if it were XML it would be much
easier!), but I don't have multiple tags with the same name.
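
To make the nesting problem concrete, here is a small sketch in Python (my choice for illustration; the thread itself is about .NET). It runs the two obvious patterns against the input from the question. The extra sibling <div>other</div> appended in html_two is a hypothetical addition, not part of the original input.

```python
import re

# Input from the question, including its line break.
html = ('fjkdjfkdj <div>sadfdf dfdf <b>dfd</b>dfdf<div>nested<div>tags</div>.</div>\n'
        '</div>dfdfdf')

# Lazy match: stops at the first </div>, which closes the innermost div.
lazy = re.search(r"<div>(.*?)</div>", html, re.DOTALL).group(1)

# Greedy match: runs to the last </div> in the whole input. That only looks
# right while there is a single top-level div, so add a hypothetical sibling.
html_two = html + "<div>other</div>"
greedy = re.search(r"<div>(.*)</div>", html_two, re.DOTALL).group(1)

print(repr(lazy))
print(repr(greedy))
```

The lazy pattern cuts the content short, and the greedy one swallows the following div as well. Neither counts how many tags are still open, which is the part a plain pattern cannot express.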
Now, if your content is XML (it can be HTML, but it must be well-formed),
there is a much easier approach: read it as an XML document and search for
the correct tag node. (To see if your HTML fits, search for a well-formed
HTML checker.)
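
A minimal sketch of the XML approach, using Python's standard xml.etree.ElementTree rather than a .NET XML reader. It assumes the markup is well-formed. The function name inner_xml and the wrapping <root> element are my own additions for illustration: the sample input has text outside the div, so it needs a single root before an XML parser will accept it.

```python
import xml.etree.ElementTree as ET


def inner_xml(markup, tag):
    """Return the inner markup of the first <tag> element, or None."""
    # Hypothetical wrapper element so the fragment has exactly one root.
    root = ET.fromstring("<root>" + markup + "</root>")
    elem = root.find(".//" + tag)  # first match in document order
    if elem is None:
        return None
    # Leading text, then each child serialized together with its tail text.
    parts = [elem.text or ""]
    parts.extend(ET.tostring(child, encoding="unicode") for child in elem)
    return "".join(parts)


html = ('fjkdjfkdj <div>sadfdf dfdf <b>dfd</b>dfdf<div>nested<div>tags</div>.</div>\n'
        '</div>dfdfdf')
print(repr(inner_xml(html, "div")))
```

The parser does the tag matching, so nesting depth is not a concern. The catch is that real-world HTML is often not well-formed, and then the parse step raises an error instead of returning anything.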
Aug 25 '08 #2
Hello maximz2005,
You can use the Html Agility Pack (on CodePlex) to read the HTML as if it were
XML, and then you can easily get the contents you want. You can also use regex
for this, though you'd end up with the more advanced constructs (the hardest
ones to understand), like balancing groups. (More info here: http://blogs.msdn.com/bclteam/archiv...15/396452.aspx)
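
Balancing groups are a .NET regex feature that pushes and pops captures to keep track of open tags. For comparison, here is a rough sketch of the same counting idea done by hand in Python, whose re module has no balancing groups. A regex only locates the <div> and </div> tags, and a counter tracks the depth. The function name first_div_inner is hypothetical, and the sketch ignores self-closing tags, comments and scripts.

```python
import re

# Matches an opening <div ...> or a closing </div>; group 1 is "/" for a close.
DIV_TAG = re.compile(r"<(/?)div\b[^>]*>", re.IGNORECASE)


def first_div_inner(html):
    """Return the inner content of the first balanced <div>, or None."""
    depth = 0
    start = None
    for match in DIV_TAG.finditer(html):
        if match.group(1) == "":        # opening tag
            if depth == 0:
                start = match.end()     # content begins after the first <div>
            depth += 1
        elif depth > 0:                 # closing tag of a div we have seen
            depth -= 1
            if depth == 0:
                return html[start:match.start()]
    return None                         # no div, or it was never closed


html = ('fjkdjfkdj <div>sadfdf dfdf <b>dfd</b>dfdf<div>nested<div>tags</div>.</div>\n'
        '</div>dfdfdf')
print(repr(first_div_inner(html)))
```

On the input from the question this returns everything between the first <div> and its matching </div>, including the line break before that closing tag.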

--
Jesse Houwing
jesse.houwing at sogeti.nl
Aug 25 '08 #3
