Truncate string but keep full html tag content

Hi, I have this nice function to truncate a string that contains HTML.
It works very well, but I have now run into a problem.

In some cases the input will be, for example:
<a href="some url">this is a link</a>
and I want the output to be exactly the same, even if it is longer than the maximum length specified.

In other words, the text inside HTML tags must never be cut.

Can someone help me, please?
I've searched for a solution and thought about it for quite some time.

<?php
/**
 * truncateHtml can truncate a string up to a number of characters while preserving whole words and HTML tags
 *
 * @param string $text String to truncate.
 * @param integer $length Length of returned string, including ellipsis.
 * @param string $ending Ending to be appended to the trimmed string.
 * @param boolean $exact If false, $text will not be cut mid-word
 * @param boolean $considerHtml If true, HTML tags would be handled correctly
 *
 * @return string Trimmed string.
 */
function truncateHtml($text, $length = 100, $ending = '...', $exact = false, $considerHtml = true) {
    if ($considerHtml) {
        // if the plain text is shorter than the maximum length, return the whole text
        if (strlen(preg_replace('/<.*?>/', '', $text)) <= $length) {
            return $text;
        }
        // split the text into (optional html-tag, following plain text) pairs
        preg_match_all('/(<.+?>)?([^<>]*)/s', $text, $lines, PREG_SET_ORDER);
        $total_length = strlen($ending);
        $open_tags = array();
        $truncate = '';
        foreach ($lines as $line_matchings) {
            // if there is any html-tag in this line, handle it and add it (uncounted) to the output
            if (!empty($line_matchings[1])) {
                // if it's an "empty element" with or without xhtml-conform closing slash
                if (preg_match('/^<(\s*.+?\/\s*|\s*(img|br|input|hr|area|base|basefont|col|frame|isindex|link|meta|param)(\s.+?)?)>$/is', $line_matchings[1])) {
                    // do nothing
                // if tag is a closing tag
                } else if (preg_match('/^<\s*\/([^\s]+?)\s*>$/s', $line_matchings[1], $tag_matchings)) {
                    // delete tag from $open_tags list (names are stored in lower case)
                    $pos = array_search(strtolower($tag_matchings[1]), $open_tags);
                    if ($pos !== false) {
                        unset($open_tags[$pos]);
                    }
                // if tag is an opening tag
                } else if (preg_match('/^<\s*([^\s>!]+).*?>$/s', $line_matchings[1], $tag_matchings)) {
                    // add tag to the beginning of $open_tags list
                    array_unshift($open_tags, strtolower($tag_matchings[1]));
                }
                // add html-tag to $truncate'd text
                $truncate .= $line_matchings[1];
            }
            // calculate the length of the plain text part of the line; handle entities as one character
            $content_length = strlen(preg_replace('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', ' ', $line_matchings[2]));
            if ($total_length + $content_length > $length) {
                // the number of characters which are left
                $left = $length - $total_length;
                $entities_length = 0;
                // search for html entities (named, decimal and hexadecimal)
                if (preg_match_all('/&[0-9a-z]{2,8};|&#[0-9]{1,7};|&#x[0-9a-f]{1,6};/i', $line_matchings[2], $entities, PREG_OFFSET_CAPTURE)) {
                    // calculate the real length of all entities in the legal range
                    foreach ($entities[0] as $entity) {
                        if ($entity[1] + 1 - $entities_length <= $left) {
                            $left--;
                            $entities_length += strlen($entity[0]);
                        } else {
                            // no more characters left
                            break;
                        }
                    }
                }
                $truncate .= substr($line_matchings[2], 0, $left + $entities_length);
                // maximum length is reached, so get off the loop
                break;
            } else {
                $truncate .= $line_matchings[2];
                $total_length += $content_length;
            }
            // if the maximum length is reached, get off the loop
            if ($total_length >= $length) {
                break;
            }
        }
    } else {
        if (strlen($text) <= $length) {
            return $text;
        } else {
            $truncate = substr($text, 0, $length - strlen($ending));
        }
    }
    // if the words shouldn't be cut in the middle...
    if (!$exact) {
        // ...search the last occurrence of a space...
        $spacepos = strrpos($truncate, ' ');
        // strrpos() returns false when there is no space, so isset() would always be true here
        if ($spacepos !== false) {
            // ...and cut the text in this position
            $truncate = substr($truncate, 0, $spacepos);
        }
    }
    // add the defined ending to the text
    $truncate .= $ending;
    if ($considerHtml) {
        // close all unclosed html-tags
        foreach ($open_tags as $tag) {
            $truncate .= '</' . $tag . '>';
        }
    }
    return $truncate;
}

?>
Apr 1 '12 #1
Luuk
This cannot be done, because everything is inside the HTML tag 'html'...
Apr 7 '12 #2
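
For a fragment such as the one in the question (not a full page wrapped in 'html'), one possible approach is to cut only text that sits outside of any element and to emit every element that has already been opened in full. The following is a minimal sketch under that assumption, not a drop-in replacement for truncateHtml. The function name truncateOutsideTags is hypothetical, and the sketch assumes well-formed markup, counts bytes with strlen() like the original, and does not treat entities specially.

```php
<?php
/**
 * Hypothetical helper (not from the original post): truncates only the text
 * that sits outside of any element. An element that has been opened is
 * always emitted in full, even if that exceeds $length.
 */
function truncateOutsideTags(string $html, int $length = 100, string $ending = '...'): string
{
    // Void elements never get a closing tag, so they must not change the depth.
    $void = ['img', 'br', 'input', 'hr', 'area', 'base', 'col', 'link', 'meta', 'param'];

    // Split into text and tag tokens, keeping the tags as separate tokens.
    $tokens = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $out = '';
    $used = 0;    // visible characters emitted so far
    $depth = 0;   // number of currently open elements
    $cut = false; // whether anything was dropped

    foreach ($tokens as $token) {
        // Stop only between elements, never inside one.
        if ($depth === 0 && $used >= $length) {
            $cut = true;
            break;
        }
        if (preg_match('/^<[^>]+>$/', $token) === 1) {
            if (preg_match('/^<\s*\/\s*[a-z][a-z0-9]*\s*>$/i', $token)) {
                // closing tag
                $depth = max(0, $depth - 1);
            } elseif (preg_match('/^<\s*([a-z][a-z0-9]*)\b[^>]*>$/i', $token, $m)
                && !in_array(strtolower($m[1]), $void, true)
                && substr($token, -2) !== '/>') {
                // opening tag of a non-void, non-self-closing element
                $depth++;
            }
            $out .= $token; // tags are never counted
            continue;
        }
        $len = strlen($token);
        if ($depth > 0 || $used + $len <= $length) {
            // text inside an element, or text that still fits: keep it whole
            $out .= $token;
            $used += $len;
            continue;
        }
        // text outside any element that exceeds the budget: cut at the last space
        $piece = substr($token, 0, $length - $used);
        $space = strrpos($piece, ' ');
        if ($space !== false) {
            $piece = substr($piece, 0, $space);
        }
        $out .= $piece;
        $cut = true;
        break;
    }
    return $cut ? $out . $ending : $out;
}

// The link from the question is longer than 10 characters but is returned unchanged.
echo truncateOutsideTags('<a href="some url">this is a link</a>', 10), "\n";
```

Because the word-boundary search only ever looks at a plain-text token, it cannot land on a space inside a tag's attributes. Unlike truncateHtml, the ending is not counted against $length in this sketch.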
