substr on UTF-8 strings returned from MySQL

I am trying to truncate some Chinese text returned from MySQL. If I use the
substr function, the last one or two Chinese characters appear as garbage
symbols instead of the characters they should display. The PHP page itself
is already UTF-8 encoded.

What is the best way of truncating such UTF-8 strings (e.g., to return the
first 50 Chinese "words")? Chinese words appear like
"?????????????????????" (21 Chinese "words" shown here).
Jul 17 '05 #1
peter <pe***@mail.co.uk> wrote:
> I am trying to truncate some Chinese text returned from MySQL. If I use the
> substr function, the last one or two Chinese characters appear as garbage
> symbols instead of the characters they should display. The PHP page itself
> is already UTF-8 encoded.

Behold the multibyte string functions:
http://nl2.php.net/manual/en/ref.mbstring.php

> What is the best way of truncating such UTF-8 strings (e.g., to return the
> first 50 Chinese "words")? Chinese words appear like
> "?????????????????????" (21 Chinese "words" shown here).

mb_substr will do the trick: substr counts bytes, while mb_substr counts
characters.
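
For what it's worth, here is a minimal sketch of the mb_substr approach. It assumes the mbstring extension is loaded and that the string really is UTF-8 by the time PHP sees it. The helper name truncate_utf8 and the sample text are made up for illustration; they are not from the original posts.

```php
<?php
// Hypothetical helper: keep at most $max characters (not bytes) of a UTF-8 string.
function truncate_utf8(string $s, int $max): string
{
    // Passing 'UTF-8' explicitly avoids depending on the configured internal encoding.
    return mb_substr($s, 0, $max, 'UTF-8');
}

// Sample value standing in for a column fetched from MySQL (assumption).
$text = '中文字符串截取示例';   // 9 characters, 3 bytes each in UTF-8

// substr() works on bytes: 4 bytes is one whole character plus one stray byte,
// which is what shows up as a garbage symbol in the browser.
$broken = substr($text, 0, 4);

// mb_substr() works on characters, so the cut always lands on a boundary.
$first = truncate_utf8($text, 5);

echo $first, "\n";
```

For the original question, the call would be truncate_utf8($row_value, 50), where $row_value is whatever variable holds the MySQL result.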

--

Daniel Tryba

Jul 17 '05 #2

"Daniel Tryba" <ne****************@canopus.nl> wrote in message
news:ce**********@news.tue.nl...
> peter <pe***@mail.co.uk> wrote:
>> I am trying to truncate some Chinese text returned from MySQL. If I use the
>> substr function, the last one or two Chinese characters appear as garbage
>> symbols instead of the characters they should display. The PHP page itself
>> is already UTF-8 encoded.
>
> Behold the multibyte string functions:
> http://nl2.php.net/manual/en/ref.mbstring.php
>
>> What is the best way of truncating such UTF-8 strings (e.g., to return the
>> first 50 Chinese "words")? Chinese words appear like
>> "?????????????????????" (21 Chinese "words" shown here).
>
> mb_substr will do the trick....
>
> --
> Daniel Tryba


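The PCRE engine in newer versions of PHP can also handle UTF-8 text when the
pattern carries the u modifier. preg_match('/^(.{1,50})/us', $s, $m) would
leave the first 50 characters of $s (or the whole string, if it is shorter)
in $m[1]. The ^ anchors the match at the start of the string, and the s
modifier lets . match newlines as well.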
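
A rough sketch of the PCRE route, in case mbstring is not available. The function name first_chars_pcre is hypothetical, and the sketch assumes $max is a non-negative integer. Note that this counts Unicode code points, which matches "characters" for ordinary Chinese text.

```php
<?php
// Hypothetical helper: return at most $max UTF-8 characters from the start of $s.
function first_chars_pcre(string $s, int $max): string
{
    // u: treat pattern and subject as UTF-8.
    // s: let "." match newlines, so multi-line text is not cut at the first line.
    // {0,$max} (rather than {1,$max}) also matches an empty string.
    if (preg_match('/^.{0,' . $max . '}/us', $s, $m) === 1) {
        return $m[0];
    }
    // preg_match() returns false when $s is not valid UTF-8; give back an
    // empty string here rather than a possibly broken byte sequence.
    return '';
}

echo first_chars_pcre('中文字符串截取示例', 5), "\n";
```

If the function returns an empty string for non-empty input, the data was probably not valid UTF-8 to begin with, which is worth checking before blaming the truncation.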
Jul 17 '05 #3
