We have a CMS based on PHP and MySQL. Recently we received a request to
support multiple languages, so that sites in a particular language can be
created. I did some searching on Google, and it seems I have to build
multibyte support into PHP and MySQL. Mbstring
(http://us3.php.net/mbstring) claims to support multiple languages, with a
caution saying it might not work properly.
After further research it seems Unicode might be the way to go, since
Unicode can represent all characters (in all languages) as integers,
which in turn can be handled in PHP as it has excellent integer support. But
since all the data is stored in MySQL, we need Unicode support in MySQL
too, and it offers 2 formats (http://www.mysql.com/doc/en/Charset-Unicode.html):
UCS-2 (for storing data) and UTF-8 (for encoding). Here is where I need
help. Do I opt for UCS-2 or go ahead with UTF-8? What are the advantages and
disadvantages of each?
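To make the UCS-2 versus UTF-8 question concrete, here is a minimal sketch
of how I understand the size trade-off. It assumes the mbstring extension is
loaded and that the script file itself is saved as UTF-8; encoded_sizes()
and the sample strings are hypothetical and not part of our CMS.

```php
<?php
// Hypothetical helper (not part of the CMS): report how many bytes a
// UTF-8 input string occupies in UTF-8 and in UCS-2.
// Assumes the mbstring extension is loaded.
function encoded_sizes(string $utf8Text): array
{
    // Re-encode the same characters as big-endian UCS-2.
    $ucs2Text = mb_convert_encoding($utf8Text, 'UCS-2BE', 'UTF-8');

    return [
        'chars' => mb_strlen($utf8Text, 'UTF-8'), // characters, not bytes
        'utf8'  => strlen($utf8Text),             // bytes when stored as UTF-8
        'ucs2'  => strlen($ucs2Text),             // bytes when stored as UCS-2
    ];
}

// Sample strings are made up for illustration.
foreach (['hello', 'héllo', '日本語'] as $sample) {
    $sizes = encoded_sizes($sample);
    printf(
        "%d chars: %d bytes as UTF-8, %d bytes as UCS-2\n",
        $sizes['chars'],
        $sizes['utf8'],
        $sizes['ucs2']
    );
}
```

If I have this right, UTF-8 stays at one byte per character for plain ASCII
and grows to two or more bytes for other scripts, while UCS-2 is always two
bytes per character.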
Now back to our CMS: can we make changes so that this new support is
transparent to the code (that doesn't sound right)? Any suggestions on how I
can minimize the amount of rework we have to do on the code to accommodate
Unicode? Are there any other suggestions on how to approach this
transformation?
--Turi