
Truncate string but keep full html tag content

Hi, I have this nice function that truncates a string containing HTML.
It works very well, but I have now run into a problem.

In some cases the input looks like this:
<a href="some url">this is a link</a>
and I want the output to be exactly the same, even if it is longer than the maximum length specified.

In other words, the function must not cut anywhere inside an HTML element.

Can someone help me, please?
I've searched for a solution and thought about it for quite some time.

<?php
/**
 * truncateHtml can truncate a string up to a number of characters while preserving whole words and HTML tags
 *
 * @param string $text String to truncate.
 * @param integer $length Length of returned string, including ellipsis.
 * @param string $ending Ending to be appended to the trimmed string.
 * @param boolean $exact If false, $text will not be cut mid-word
 * @param boolean $considerHtml If true, HTML tags would be handled correctly
 *
 * @return string Trimmed string.
 */
function truncateHtml($text, $length = 100, $ending = '...', $exact = false, $considerHtml = true) {
    if ($considerHtml) {
        // if the plain text is shorter than the maximum length, return the whole text
        if (strlen(preg_replace('/<.*?>/', '', $text)) <= $length) {
            return $text;
        }
        // split the text into scannable pieces: an optional html-tag followed by plain text
        preg_match_all('/(<.+?>)?([^<>]*)/s', $text, $lines, PREG_SET_ORDER);
        $total_length = strlen($ending);
        $open_tags = array();
        $truncate = '';
        foreach ($lines as $line_matchings) {
            // if there is any html-tag in this line, handle it and add it (uncounted) to the output
            if (!empty($line_matchings[1])) {
                // if it's an "empty element" with or without xhtml-conform closing slash
                if (preg_match('/^<(\s*.+?\/\s*|\s*(img|br|input|hr|area|base|basefont|col|frame|isindex|link|meta|param)(\s.+?)?)>$/is', $line_matchings[1])) {
                    // do nothing
                // if tag is a closing tag
                } else if (preg_match('/^<\s*\/([^\s]+?)\s*>$/s', $line_matchings[1], $tag_matchings)) {
                    // delete tag from $open_tags list (names are stored in lower case)
                    $pos = array_search(strtolower($tag_matchings[1]), $open_tags);
                    if ($pos !== false) {
                        unset($open_tags[$pos]);
                    }
                // if tag is an opening tag
                } else if (preg_match('/^<\s*([^\s>!]+).*?>$/s', $line_matchings[1], $tag_matchings)) {
                    // add tag to the beginning of $open_tags list
                    array_unshift($open_tags, strtolower($tag_matchings[1]));
                }
                // add html-tag to $truncate'd text
                $truncate .= $line_matchings[1];
            }
            // calculate the length of the plain text part of the line; handle entities as one character
            $content_length = strlen(preg_replace('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', ' ', $line_matchings[2]));
            if ($total_length + $content_length > $length) {
                // the number of characters which are left
                $left = $length - $total_length;
                $entities_length = 0;
                // search for html entities
                if (preg_match_all('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', $line_matchings[2], $entities, PREG_OFFSET_CAPTURE)) {
                    // calculate the real length of all entities in the legal range
                    foreach ($entities[0] as $entity) {
                        if ($entity[1] + 1 - $entities_length <= $left) {
                            $left--;
                            $entities_length += strlen($entity[0]);
                        } else {
                            // no more characters left
                            break;
                        }
                    }
                }
                $truncate .= substr($line_matchings[2], 0, $left + $entities_length);
                // maximum length is reached, so get off the loop
                break;
            } else {
                $truncate .= $line_matchings[2];
                $total_length += $content_length;
            }
            // if the maximum length is reached, get off the loop
            if ($total_length >= $length) {
                break;
            }
        }
    } else {
        if (strlen($text) <= $length) {
            return $text;
        } else {
            $truncate = substr($text, 0, $length - strlen($ending));
        }
    }
    // if the words shouldn't be cut in the middle...
    if (!$exact) {
        // ...search the last occurrence of a space...
        $spacepos = strrpos($truncate, ' ');
        // strrpos() returns false when there is no space; isset() would not detect that
        if ($spacepos !== false) {
            // ...and cut the text in this position
            $truncate = substr($truncate, 0, $spacepos);
        }
    }
    // add the defined ending to the text
    $truncate .= $ending;
    if ($considerHtml) {
        // close all unclosed html-tags
        foreach ($open_tags as $tag) {
            $truncate .= '</' . $tag . '>';
        }
    }
    return $truncate;
}

?>
Apr 1 '12 #1
1 Reply


This cannot be done, because everything is inside the HTML tag 'html'...
Apr 7 '12 #2
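
One possible way around the objection above is to name the tags whose elements must stay whole, instead of protecting every tag. The sketch below is not a drop-in replacement for truncateHtml: the function name truncateKeepingElements and the $keepTags parameter are made up for illustration. It assumes protected elements are not nested inside themselves, counts bytes rather than characters, counts an entity at its raw length, does not count $ending toward $length, and does not re-close other tags that are left open the way the original does with $open_tags.

```php
<?php
/**
 * Sketch only: truncate $html to about $length visible characters, but never
 * cut inside an element whose tag is listed in $keepTags.
 * The function name and the $keepTags parameter are hypothetical.
 */
function truncateKeepingElements($html, $length = 100, $ending = '...', $keepTags = array('a')) {
    $names = implode('|', array_map('preg_quote', $keepTags));
    // Alternatives, in order: a complete protected element, any other single
    // tag, a run of plain text, or a stray '<'.
    $pattern = '/<(' . $names . ')\b[^>]*>.*?<\/\1\s*>|<[^>]*>|[^<]+|</is';
    preg_match_all($pattern, $html, $pieces, PREG_SET_ORDER);

    $out = '';
    $used = 0; // visible characters emitted so far ($ending is not counted)
    foreach ($pieces as $m) {
        $piece = $m[0];
        if (!empty($m[1])) {
            // Protected element: keep all of it, or none of it.
            if ($used >= $length) {
                return rtrim($out) . $ending;
            }
            $out .= $piece;
            $used += strlen(strip_tags($piece));
        } elseif ($piece[0] === '<' && strlen($piece) > 1) {
            // Any other tag is copied through without being counted.
            $out .= $piece;
        } elseif ($used + strlen($piece) <= $length) {
            // Plain text that still fits.
            $out .= $piece;
            $used += strlen($piece);
        } else {
            // Plain text that does not fit: cut it at the last space.
            $cut = substr($piece, 0, max(0, $length - $used));
            $space = strrpos($cut, ' ');
            if ($space !== false) {
                $cut = substr($cut, 0, $space);
            }
            return $out . $cut . $ending;
        }
    }
    return $out;
}
```

With this sketch, truncateKeepingElements('<a href="some url">this is a link</a>', 5) returns the link unchanged, and truncateKeepingElements('one two three four', 9) returns 'one two...'. Because a protected element is appended whole whenever any budget remains, the result can run past $length, which is the behaviour asked for in the question.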
