
Safely cut off short preview version of long string

 
  #1  
July 17th, 2005, 01:30 PM
Hans Gruber
Guest
 

Hi all,

Here's a problem I have been working on for a while but can't seem to
solve satisfactorily.

I have a database of blog entries. Because the entries vary in length and
can be quite long, I want to build an overview page that shows a preview
of each entry, say 700 characters at most.

My problem has to do with HTML tags. If, for example, an entry contains a
<BLOCKQUOTE> with a long quote, my function breaks off somewhere halfway
through the quote. The result then lacks the closing </BLOCKQUOTE>, which
wrecks the layout of the resulting page.

I would like to build a function that cuts a string down to at most X
characters but plays it safe when it encounters an HTML tag. It does not
matter if the result is, say, only 670 characters long; it only matters
that it approximates the maximum length and doesn't mess up the HTML tags.

Can anyone point me in the right direction?

Hans

  #2  
July 17th, 2005, 01:30 PM
Peter Fox
Guest
 

Following on from Hans Gruber's message...
>My problem has to do with HTML tags. If for example an entry contains a
><BLOCKQUOTE> with a large quote, my function would break off somewhere
>halfway in the quote. The end result of course won't have the
></BLOCKQUOTE>, rendering the resulting page horribly bad.
>
>I would like to build a function that breaks a string up to max X
>characters long, but plays it safe when it encounters any HTML tag: it
>does not matter if the end result is a string of say 670 characters long,
>it only matters that it approximates the max character setting and doesn't
>mess up the HTML tags.

A simple way would be to decide roughly where your end point is going to
be (not inside <...>), then keep all the tags that follow it but remove
the text between them.

The reason for keeping all the following tags is that entries can have
complex nested structures, and working out which tags are still open
would take lots of complicated parsing - just not worth the effort. Also,
an entry could start with, say, <center> and end with </center> many
pages apart.


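For example:
1 - Split the string to get the first X chars, and work with the
remainder of the string.
2 - Explode the remainder by '<' so that every element _except possibly
array[0]_ starts with a tag and therefore looks like "ATAG>some text" (or
"/ATAG>some text").
3 - If array[0] contains a '>', the cut landed inside a tag and array[0]
begins with the tail of that tag; keep it up to and including the '>'.
Otherwise array[0] is plain text and can be dropped.
(NB there are two edge cases: no more tags at all, so array[0] is the
only element, and this tag followed immediately by another, in which case
the '>' is the last character of array[0], if you see what I mean.)
4 - Now strip the bits after '>' from each remaining element, implode
with '<' (putting a '<' in front of the first tag as well) and add the
result to the end of the text.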
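
The steps above can be sketched roughly as follows. This is only an illustration in Python, where `split` and `join` stand in for the explode and implode mentioned above; the function name `preview` and its parameters are made up for the example. It assumes any literal '<' or '>' in the text is already escaped (&lt; / &gt;), so a bare '>' always closes a tag.

```python
def preview(html: str, max_chars: int) -> str:
    """Keep the first max_chars characters, then every later tag minus its text."""
    # Step 1: take the first max_chars characters, work with the remainder.
    head, rest = html[:max_chars], html[max_chars:]

    # Step 2: split the remainder on '<'. Every piece after the first
    # starts with a tag body such as 'b>some text' or '/b>some text'.
    pieces = rest.split("<")

    # Step 3: a '>' in the first piece means the cut fell inside a tag,
    # so keep the tail of that tag (up to and including the '>').
    first = pieces[0]
    if ">" in first:
        head += first[: first.index(">") + 1]

    # Step 4: keep each later tag and drop the text after its '>'.
    # Pieces without a '>' (a stray '<') are dropped entirely.
    tags = []
    for piece in pieces[1:]:
        end = piece.find(">")
        if end != -1:
            tags.append("<" + piece[: end + 1])

    return head + "".join(tags)


entry = "<p>Hello <b>big</b> world</p><blockquote>A long quote</blockquote>"
print(preview(entry, 12))
# prints: <p>Hello <b></b></p><blockquote></blockquote>
```

Note that the result keeps empty elements (the blockquote above has no text left), and it can be noticeably longer than X when many tags follow the cut. The cut can also land in the middle of an entity such as &amp;, which this sketch does not handle.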

--
PETER FOX Not the same since the pancake business flopped
peterfox@eminent.demon.co.uk.not.this.bit.no.html
2 Tees Close, Witham, Essex.
Gravity beer in Essex <http://www.eminent.demon.co.uk>
 
