473,320 Members | 2,124 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

php5 mssql utf-8 problem

hi all

since nearly 2 days i fight with mssql and utf-8 as i need to store and
retrieve arabic characters.

i tried the com approach

http://groups.google.com/group/maili...1aa0bd48de23f0

i tried as well the odbtp library, odbc and obviously the mssql native
extension.. but i never succeded so far to store utf-8 data in the
mssql server (column type nvarchar). if anybody has an idea, an
approach or thinks he knows what i might have done wrong.. it is all
highly appreciated.

ou yes the content type is specified as follows in the document

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />

cheers ralphie

Jan 24 '06 #1
11 31699
I was fighting with this problem today(i tested romanian deutch and
russian characters).
The main requiremnts are:
1. Your document must be UTF-8
2. Your database table or column must be UTF-8
3. Before calling any queries you should execute the following query

SET NAMES 'utf8'

this will set conection to mysql in utf-8 mode if it wasn't set by the
default MySql config.
4. Just Insert your data in the table and then you will be able to
select it.

Hope this will help.

P.S. I tested all this stuff on PHP 5 and Mysql 4.1 so if your config
is different sorry.

Jan 24 '06 #2
thanks alex i will try to do something similar tomorrow.. but am not
really hopeful, because i use microsoft sql server.. and the connection
between php and mssql seems to be realy unfriendly ot utf-8 :-(

Jan 24 '06 #3
ralphie wrote:
hi all

since nearly 2 days i fight with mssql and utf-8 as i need to store and
retrieve arabic characters.

i tried the com approach

http://groups.google.com/group/maili...1aa0bd48de23f0

i tried as well the odbtp library, odbc and obviously the mssql native
extension.. but i never succeded so far to store utf-8 data in the
mssql server (column type nvarchar). if anybody has an idea, an
approach or thinks he knows what i might have done wrong.. it is all
highly appreciated.

ou yes the content type is specified as follows in the document

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />

cheers ralphie


At work, we also have to deal with the Arabic script on a daily basis
and we also happen to use MSSQL. I have found is that there's no good
way to make use of the nvarchar, ntext column type. Maybe there's a
secret knob somewhere. I didn't find it in any event. So what we do is
just store the text as UTF-8 in regular varchar and text columns.

Varchar columns are a bit problematic, since you risk chopping a
character in half when truncation happens. Retrieving varchar columns
wider than 255 is also somewhat tricky using the extension (COM on the
other hand, leaks memory). So we usually use a text column when there
is no natural limit on text length.

MSSQL will sometimes complain when you put a UTF-8 string into a query
because it doesn't think that what you have is valid text. What you
have to do is just CONVERT() or CAST() to force it to accept the
string.

Jan 25 '06 #4
thanks chung, i'm really happy to find a mate fighting the same
problems.

you suggest using normal varchar or text fields. how to store utf-8 in
them? well i know if i insert it through php it works and it turns out
in some wired characters (at least displayed like that in the
enterprise manager). the nice thing is if i fetch it again via php it
comes back as arabic text. that so far is cool. the problem arises if i
want to import data via the enterprise manager, this one would convert
the arabic signs to ?? only. if i want to copy past the data into it,
it does not accept it at all in normal fields but displays it correctly
in the n-types.. the ohter thing i wonder about is the sorting and
stuff.. if the data are stored in this wired (sorry don't know the name
for what that is, just some symbols) format i wonder about the
collation (sorting, etc) stuff? would it still work fine?

any further advise is greatly appreciated!

Jan 25 '06 #5
ralphie wrote:
thanks chung, i'm really happy to find a mate fighting the same
problems.

you suggest using normal varchar or text fields. how to store utf-8 in
them? well i know if i insert it through php it works and it turns out
in some wired characters (at least displayed like that in the
enterprise manager). the nice thing is if i fetch it again via php it
comes back as arabic text. that so far is cool. the problem arises if i
want to import data via the enterprise manager, this one would convert
the arabic signs to ?? only. if i want to copy past the data into it,
it does not accept it at all in normal fields but displays it correctly
in the n-types.. the ohter thing i wonder about is the sorting and
stuff.. if the data are stored in this wired (sorry don't know the name
for what that is, just some symbols) format i wonder about the
collation (sorting, etc) stuff? would it still work fine?


Yeah, that's the problem with storing it as UTF-8: the text will show
up as garbage in Enterprise Manager and Query Analyser. Collation will
also be weird. If you set the column to do binary sort, the result
would be somewhat reasonable, as the alef will still come before the
alef maksura encoded in UTF-8. The digits come after the letters
though, which might not be what you'd want. The order would be
incorrect too if you need to use pe or gaf.

The n- columns offer the best solution. I would give ADODB another
shot. COM is problematic in PHP4. Support might have improved in PHP5.
In theory, if you pass CP_UTF8 as codepage to COM(), you'd get UTF-8
text out of the resultsets. I remember that when I tried it though,
nothing happened. This was a couple years ago. Maybe newer versions of
ADODB would work better.

Jan 25 '06 #6
after another day of attempts without success (i think i succeded to
write utf-8 into mssql, showed only ??? though but this is probably
because of enterprise manager not beeing able to properly display) i
give it up (never got proper utf-8 out of the db, only straight after
import but that was mostlikely not even utf-8.. whatever i'm tired and
will join your way to use the "strange sign" method and write manually
an import script as it works only if i stay whithin php.. whatever
thanks for walking along with me and your inputs chung.

Jan 25 '06 #7

ralphie wrote:
after another day of attempts without success (i think i succeded to
write utf-8 into mssql, showed only ??? though but this is probably
because of enterprise manager not beeing able to properly display) i
give it up (never got proper utf-8 out of the db, only straight after
import but that was mostlikely not even utf-8.. whatever i'm tired and
will join your way to use the "strange sign" method and write manually
an import script as it works only if i stay whithin php.. whatever
thanks for walking along with me and your inputs chung.


Yeah, the lack of support is rather disappointing. One of these days
I'm going to sit down and right an improved version of the mssql
extension.

Jan 25 '06 #8
if you get to this point let me know me at ralphie.ch yeah? maybe its
easier to improve the odbtp though as it seems to be a pretty straight
forward thing, but still didnt help me.. well the neatest is definetly
if the native extension works anyway.. or we just need to leave
mssql.. but probably its bound to customers in your case as well..

Jan 26 '06 #9

ralphie wrote:
if you get to this point let me know me at ralphie.ch yeah? maybe its
easier to improve the odbtp though as it seems to be a pretty straight
forward thing, but still didnt help me.. well the neatest is definetly
if the native extension works anyway.. or we just need to leave
mssql.. but probably its bound to customers in your case as well..


Well, since things are working the way they're now, I won't start
working on it anytime soon. My thinking is to develop something with
identical API to the current mssql extension, but takes to the server
using OLE-DB instead of the old DB library.

Jan 26 '06 #10
i see.. eventhough i'd rather have proper utf8 in the database what?
whatever somethimes you have to cut down your expectation to reality :-)

Jan 26 '06 #11
ralphie wrote:
i see.. eventhough i'd rather have proper utf8 in the database what?
whatever somethimes you have to cut down your expectation to reality :-)


Ok, after playing around with this a bit more I have somewhat of a
solution. It's a ugly, nasty hack and quite limited. Basically, since
the mssql extension won't return the data as UTF-8, we will ask MSSQL
to return the text as binary, and then we convert it to UTF-8 with
iconv.

To insert some text, we first convert from UTF-8 to UCS-2, little
endian:

$ucs2 = iconv("UTF-8", "UCS-2LE", $utf8);

Then we put it into binary representation:

$a = unpack('H*hex', $s);
$hex = '0x' . $a['hex'];

Now stick that into a INSERT statement. MSSQL will do an implicit
conversion to nvarchar if that's the type of the column:

mssql_query("INSERT INTO table (nvarchar_col) VALUES($hex)");
Retrieval is a bit tricky, as the mssql extension is too dumb to get
varbinary bigger than 255. It handles image columns with no problem
however. So what we do is cast from nvarchar to varbinary, then from
there to image:

mssql_query("SELECT CAST(CAST(nvarchar_col AS varbinary(8000)) AS
image) AS col FROM table");

The data comes out in UCS-2, so we have to convert it to UTF-8:

$utf8 = iconv("UCS-2LE", "UTF-8", $ucs2);

It's not a great solution but seems to work in my tests. Conversion is
only possible between nvarchar and varbinary. You can't cast a ntext
column to image. So you're limited to nvarchar's 4000 maximum size.

Jan 27 '06 #12

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Philipp Lenssen | last post by:
I want to try the new Yahoo API, but I get the PHP5 error message Unknown encoding "utf-8" in the marked line below... (After that, applying the Xpath will fail as symptom of the first bug.) ...
8
by: Dorthe Luebbert | last post by:
Hi, we have to convert a quite large ISO-application (Mysql4, PHP5) to UTF8. We found out so far that there is no collation for MySQL which is able to sort all character sets correctly. So this...
7
by: Jan Wagner | last post by:
Hi, I'm running into a problem with php 4.1.1 on IIS (XP Pro version). For example writing a page test.php that starts with <?php session_start(); ?> <?php echo '<?xml version="1.0"'; ?>...
0
by: Benjamin Fischer | last post by:
Hi, I'm running a "Windows Server 2003" with "MS SQL 2000" IIS and PHP5.1 as test environment for my project. I would like to access my MSSQL-Server via Windows ODBC and PDO. I created a System...
1
by: R | last post by:
Hi All, I'm using UTF-8 encoding for my PHP scripts, output is also UTF-8. all files have UTF-8 BOM: 0xEF 0xBB 0xBF at the beginning I have PHP Version 5.1.2 on WinNT and PHP 5.1.1 on Gentoo...
2
by: Alfred | last post by:
I'm about to host a solution for our developers and it will use PHP5 on Suse Enterprise Linux 9 and MS SQL Server API. The developers are kind of green to this. Is this MS SQL Server API fairly...
23
by: Allan Ebdrup | last post by:
I hava an ajax web application where i hvae problems with UTF-8 encoding oc chineese chars. My Ajax webapplication runs in a HTML page that is UTF-8 Encoded. I copy and paste some chineese chars...
2
by: Todd Michels | last post by:
Hi all, I have PHP 5.2.1 installed on IIS, WinXP Pro. The MSSQL module is not loading, or any other module that needs to be loaded manually. PHP in general works, I am able to run...
2
by: pks83 | last post by:
In MSSQL I am using datetime as the column type and trying to execute the below mentioned query select time from tickets where ticket_id = '1'; when i execute the query on sql manager window it...
3
by: Shawn Beasley | last post by:
Hi List, I am searching franticly for a solution (or the procedure) to setting the coding of a new DB to UTF-8. I can find no setting in the Server Manager, during creation of the DB, to...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
0
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.