
Grab Text Between Tags

Newbie
 
Join Date: Apr 2007
Posts: 2
#1: Apr 11 '07
I have researched the gist of how to make this work, but I am running into a problem that I can't seem to fix.

I use the following code to grab the text between these two tags.
    $content = "[tag]Hello[/tag]";
    preg_match_all("/(\[([\w]+)\])(.*)(\[\/\\2\])/", $content, $matches);
    print_r($matches);
This successfully yields the following result:
    Array
    (
        [0] => Array
            (
                [0] => [tag]Hello[/tag]
            )
        [1] => Array
            (
                [0] => [tag]
            )
        [2] => Array
            (
                [0] => tag
            )
        [3] => Array
            (
                [0] => Hello
            )
        [4] => Array
            (
                [0] => [/tag]
            )
    )
The problem arises when there are duplicate tags.
    $content = "[tag]Hello[/tag] More Text [tag]Hello2[/tag]";
    preg_match_all("/(\[([\w]+)\])(.*)(\[\/\\2\])/", $content, $matches);
    print_r($matches);
With this I end up with the following:
    Array
    (
        [0] => Array
            (
                [0] => [tag]Hello[/tag] More Text [tag]Hello2[/tag]
            )
        [1] => Array
            (
                [0] => [tag]
            )
        [2] => Array
            (
                [0] => tag
            )
        [3] => Array
            (
                [0] => Hello[/tag] More Text [tag]Hello2
            )
        [4] => Array
            (
                [0] => [/tag]
            )
    )
I need the match to stop at the first closing tag and then register the second opening and closing tag pair as a separate match in the array. Any ideas?

Newbie
 
Join Date: Apr 2007
Posts: 18
#2: Apr 11 '07

re: Grab Text Between Tags



The quantifier .* in your regular expression is greedy: it matches as many characters as it can while still letting the rest of the pattern match, so it runs through to the last closing tag.

To make a quantifier match as few characters as possible, put the "un-greedy" (lazy) indicator "?" directly after it.

For example:

[PHP]preg_match_all("/(\[([\w]+)\])(.*?)(\[\/\\2\])/", $content, $matches);[/PHP]

Notice that (.*?) now matches the characters between the tags instead of the previous (.*).

(.*?) matches up to the first (\[\/\\2\]) that lets the pattern succeed.

Before, (.*) matched up to the last (\[\/\\2\]).
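
To see the difference side by side, here is a minimal sketch that runs both patterns against the two-tag string from the question. The variable names $greedy and $lazy are only illustrative. The pattern is written in single quotes so the backreference can be typed as \2, and the PREG_SET_ORDER flag is an optional extra that groups the results by match, so each tag pair ends up in its own sub-array.

```php
<?php
// Sample input taken from the question above.
$content = "[tag]Hello[/tag] More Text [tag]Hello2[/tag]";

// Greedy: (.*) runs to the last closing tag, so the whole string is one match.
preg_match_all('/(\[(\w+)\])(.*)(\[\/\2\])/', $content, $greedy, PREG_SET_ORDER);

// Lazy: (.*?) stops at the first closing tag, so each pair is its own match.
preg_match_all('/(\[(\w+)\])(.*?)(\[\/\2\])/', $content, $lazy, PREG_SET_ORDER);

echo count($greedy), "\n";  // 1
echo $greedy[0][3], "\n";   // Hello[/tag] More Text [tag]Hello2
echo count($lazy), "\n";    // 2
echo $lazy[0][3], "\n";     // Hello
echo $lazy[1][3], "\n";     // Hello2
```

With PREG_SET_ORDER, $lazy[0] holds everything for the first tag pair and $lazy[1] everything for the second. Without the flag you get the default layout shown in the question, where $matches[3] would contain both "Hello" and "Hello2".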
Newbie
 
Join Date: Apr 2007
Posts: 2
#3: Apr 11 '07

re: Grab Text Between Tags




Rock on! Thanks a lot. I thought I had tried that but I guess working at 4 am with 0 sleep can screw with your head. Thanks again.