
PHP & Unicode

Hello all,

I have a simple question on PHP and the way it handles strings.

Let's say I have a database and a php script that communicates with the
database. The database has some kind of character encoding - let's say
UTF-8, UTF-16, or something different.

When I select some string from the database, for example:

$res = mysql_query("SELECT movie_name FROM my_table");
$row = mysql_fetch_row($res);
echo $row[0];

the database will return the movie_name field in a correctly-encoded
way.

My question is: how do you tell the PHP interpreter what encoding to
use when displaying the text that the MySQL queries return? In other
words, will $row[0] be displayed correctly regardless of the database
encoding, provided the database encoding and the HTML <meta> tags are
the same, or do I have to set the PHP encoding in some config file?
What character representation does PHP use when working with strings?

Best Regards,
Slavi

Oct 24 '05 #1
I'm not sure, since I have some encoding problems of my own, but it
should be enough for the HTML meta tag to match your database's
encoding. PHP shouldn't care.
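
A minimal sketch of why PHP "shouldn't care", assuming PHP 7 or later with the mbstring extension loaded: a PHP string is a sequence of bytes with no encoding attached, so echo passes whatever bytes the database returned straight through. The movie title below is a made-up value for illustration.

```php
<?php
// Made-up example value: "Amélie" as UTF-8 bytes ("é" is 0xC3 0xA9).
$name = "Am\xC3\xA9lie";

// strlen() counts bytes, because PHP strings are plain byte sequences.
$byteCount = strlen($name);

// mb_strlen() counts characters, but only because we tell it the encoding.
$charCount = mb_strlen($name, 'UTF-8');

// The raw bytes that echo would send to the client, shown as hex.
$hex = bin2hex($name);

echo $byteCount, " bytes, ", $charCount, " characters, hex ", $hex, "\n";
```

PHP itself never needs to know the encoding just to pass the string along. Functions that count or cut characters do need to know it, which is why the mb_* functions take an encoding argument.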

JD


Oct 24 '05 #2
In article <11**********************@z14g2000cwz.googlegroups .com>,
sl***********@gmail.com wrote:

[...]
Let's say I have a database and a php script that communicates with the
database. The database has some kind of character encoding - let's say
UTF-8, UTF-16, or something different.
[...]
My question is, how do you tell the PHP interpreter what encoding to
use when displaying the text that the mysql queries return? In other
words, will the $row[0] be displayed correctly regardless the database
encoding, provided the database encoding and the HTML <meta> tags are
the same


No. First off, you'll need to use a character encoding that makes sense
on the Web. UTF-8 makes sense, UTF-16 does not. So if your database uses
UTF-16, you'll need to convert to UTF-8 before serving.[*]

In addition, you need to ensure that the user agent (a browser, for
example) is correctly informed of which character encoding applies.
(Unless you want to rely on chance, this is *always* a requirement, with
any character encoding, not just when you work with UTF-8.) You do so
by having your server accompany the document with an appropriate
Content-Type header. For example, if it's a UTF-8 encoded HTML file,
your server must say Content-Type: text/html; charset=utf-8. (Whether
the file name extension is ".php" or ".html" is irrelevant.)

An alternative to configuring the server to do so is to have PHP
generate the Content-Type header:

header("Content-Type: text/html; charset=utf-8");
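
One detail worth sketching, assuming PHP 7 or later: header() only works if it runs before any output has been sent. The helper name content_type_header() is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): builds the header line
// for a given charset label.
function content_type_header(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() must be called before any output, including whitespace
// before the opening <?php tag. headers_sent() reports whether it is
// already too late.
if (!headers_sent()) {
    header(content_type_header());
}

// Only now start the document body.
echo "<!DOCTYPE html>\n<title>Movies</title>\n";
```

PHP also has a default_charset setting in php.ini that supplies the charset for the Content-Type header it sends by default. An explicit header() call like the one above overrides it for that response.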

Contrary to popular belief, a META HTTP-EQUIV is *not* a reliable
alternative.

Notes:
- I'm not entirely sure what you mean by "displaying". PHP doesn't
display. Nor does a Web server. It is the *browser*'s job to "display"
(whether visually or otherwise).
- All this assumes that what you're trying to do is meant for the Web. An
intranet situation may have different requirements and possibilities.

[*] How exactly to do the conversion in PHP I can't tell you. I'm sure
it can be found in the documentation. It might also be that your
database allows you to request output in a specific character
encoding. If so, that route might be more efficient.
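
For the footnote, here is a minimal sketch of both routes, assuming PHP 7 or later with the mbstring extension. The sample bytes, the function name open_utf8_connection() and its parameters are hypothetical. mb_convert_encoding() and mysqli::set_charset() are real PHP functions, but whether the database route fits depends on your server setup.

```php
<?php
// Route 1: convert in PHP.
// Made-up sample: "Amélie" as UTF-16BE bytes, two bytes per character.
$utf16 = hex2bin('0041006d00e9006c00690065');

// mb_convert_encoding(string, to_encoding, from_encoding).
// iconv('UTF-16BE', 'UTF-8', $utf16) is an alternative if iconv is loaded.
$utf8 = mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');

echo bin2hex($utf8), "\n";

// Route 2: ask the database to deliver UTF-8 in the first place.
// Hypothetical helper, defined but not called here because it needs a
// running MySQL server and real credentials.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = new mysqli($host, $user, $pass, $db);
    // Sets the connection character set, so result strings arrive as UTF-8
    // regardless of how the columns are stored.
    $conn->set_charset('utf8mb4');
    return $conn;
}
```

With the second route no conversion is needed in PHP, so the strings can be echoed as they arrive, provided the Content-Type header also says UTF-8.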

--
Sander Tekelenburg, <http://www.euronet.nl/~tekelenb/>

Mac user: "Macs only have 40 viruses, tops!"
PC user: "SEE! Not even the virus writers support Macs!"
Oct 24 '05 #3
