473,699 Members | 3,239 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

MySQL 5.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

Hello All,

This post is essentially a reply a previous post/thread
here on this mailing.databas e.myodbc group titled:

MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

[This version has a couple subtle edits from the orginial I posted
on mailing.databas e.myodbc - I'm cross posting here on this
topic/subject related newsgroup]

I was wondering if anybody has experienced the same issues
challenges I'm experiencing I'll describe shortly. Once
resolved some fascinating and powerful multi-lingual
apps incorporating non-English/latin character sets can be
realized by many developers.

I have a Unicode utf8 English - Arabic - Hebrew - Greek (and
several other languages) database in Microsoft Excel. I KNOW
that it is Unicode utf8 data because MySQL tells me it
recognizes the encoding as such but not in the context I want.

Allow me to explain ...

I can search the Unicode utf8 encoding with no problem in
Excel. While in Excel I highlight a complete word or a
partial string of an Arabic word copy it to the clipboard
(i.e. memory). I then do a find and the process is the
same successful result as if it was an English string.

MySQL 5.0 is supposed to handle Unicode utf8

I created a MySQL database I named: languages

CREATE DATABASE languages ;

and I implemented the following command on a MySQL
command prompt:

ALTER DATABASE languages DEFAULT CHARACTER SET utf8;

No problem (so far) MySQL seemingly recognized utf8 and
accepted it. My understanding is with the ALTER command
the tables I create against languages will be utf8.

I now created a table I named mainlang which denotes it
will be the main table for my languages.

mysql>CREATE TABLE mainlang
->(
->langNumID varchar(30),
->colB varchar(30),
->colC varchar(30),
->primary key (langNumID, colB)
->);

Again so far no problem: Table successfully created.
My third column 'colC' is where the Unicode data
will be stored.

I now attempt to import the database from my
Excel file into my MySQL database as follows:

mysql>load data infile 'c:\\arabicdict ionary.csv'
->into table mainlang
->fields terminated by ','
->lines terminated by '\n'
->(langNumID, colB, colC);
ERROR 1406 (22001): Data too long for 'colC' at row 1

So what to do? I did a search and found other
people seemingly had the same problem and someone
suggested:

ALTER DATABASE languages DEFAULT CHARACTER SET cp1250;

I dropped mainlang, recreated it, redid the load and
Lo and behold ... it seemed to work. No Data too long
error occurred and when I did the following query:

mysql>select langNumID, colB, colC
->from mainlang
->where colB = '4994';

I see colA have a correct numeric value, colB a
correct numeric value (4994) and for colC a string of
unintelligible characters with diacritical marks,
oomlats etc. which I know is the cp1250 encoding
interpretation of the Unicode utf8 data which is
similarly unintelligible in its own regard.

Now what I try is: do a copy of the obscure colC
cp1250 character string into the clipboard/memory
and then do the following tweak on the original
select statement to see if I can search on the
(now) cp1250 character string:

mysql>select langNumID, colB, colC
->from mainlang
->where colc = 'paste of the cp1250 character string';

The computer would not allow a paste unless I pressed
the escape key. On initiating this select command
I got an empty set (no match)

My questions are:

Has anyone been successful creating a Unicode utf8
MySQL database that accepts Arabic?

If yes, how did you get around or not encounter the
Data too long issue?

Have you tried the cp1250 (or cp1251 - same mechanics
same results) work around as I have? Are you
able to search the cp1250 character string (my colC)?
If yes, how did you successfully manage to do it?

Lastly, if I take the cp1250 encoded string and paste
it into Excel ... I can string search the cp1250
encoding with no problem.

Also, here's how I know my Unicode utf-8 data is
correct apart from my own manual cross-referencing
and being recognized by MySQL in some respect:

When I copy the Unicode utf8 encoding and try to
paste it into the select command to see what would
happen I get the following error:

ERROR 1257 (HY000): Illegal mix of collations
(cp1250_general _ci, IMPLICIT) and
(utf8_general_c i, COERCIBLE) for operation '='

So what I have here is a situation where MySQL
is recognizing Unicode utf8 encoding but not
from the respect of packing a table!

Go Figure ...

Anyone wishing to share any insight or solution would
be GREATLY appeciated. I promise if I find a solution
I will share it.

Thank you Very Much, Shukran Jiddan, Todah Rabah,
Muchos Gracias ...

Joel S
(585) 255-0997
jrs_14618 at yahoo.com

Jun 13 '06 #1
1 4849
jr*******@yahoo .com wrote:
Hello All,

This post is essentially a reply a previous post/thread
here on this mailing.databas e.myodbc group titled:

MySQL 4.0, FULL-TEXT Indexing and Search Arabic Data, Unicode

[This version has a couple subtle edits from the orginial I posted
on mailing.databas e.myodbc - I'm cross posting here on this
topic/subject related newsgroup]

I was wondering if anybody has experienced the same issues
challenges I'm experiencing I'll describe shortly. Once
resolved some fascinating and powerful multi-lingual
apps incorporating non-English/latin character sets can be
realized by many developers.

I have a Unicode utf8 English - Arabic - Hebrew - Greek (and
several other languages) database in Microsoft Excel. I KNOW
that it is Unicode utf8 data because MySQL tells me it
recognizes the encoding as such but not in the context I want.

Allow me to explain ...

I can search the Unicode utf8 encoding with no problem in
Excel. While in Excel I highlight a complete word or a
partial string of an Arabic word copy it to the clipboard
(i.e. memory). I then do a find and the process is the
same successful result as if it was an English string.

MySQL 5.0 is supposed to handle Unicode utf8

I created a MySQL database I named: languages

CREATE DATABASE languages ;

and I implemented the following command on a MySQL
command prompt:

ALTER DATABASE languages DEFAULT CHARACTER SET utf8;

No problem (so far) MySQL seemingly recognized utf8 and
accepted it. My understanding is with the ALTER command
the tables I create against languages will be utf8.

I now created a table I named mainlang which denotes it
will be the main table for my languages.

mysql>CREATE TABLE mainlang
->(
->langNumID varchar(30),
->colB varchar(30),
->colC varchar(30),
->primary key (langNumID, colB)
->);

Again so far no problem: Table successfully created.
My third column 'colC' is where the Unicode data
will be stored.

I now attempt to import the database from my
Excel file into my MySQL database as follows:

mysql>load data infile 'c:\\arabicdict ionary.csv'
->into table mainlang
->fields terminated by ','
->lines terminated by '\n'
->(langNumID, colB, colC);
ERROR 1406 (22001): Data too long for 'colC' at row 1

So what to do? I did a search and found other
people seemingly had the same problem and someone
suggested:

ALTER DATABASE languages DEFAULT CHARACTER SET cp1250;

I dropped mainlang, recreated it, redid the load and
Lo and behold ... it seemed to work. No Data too long
error occurred and when I did the following query:

mysql>select langNumID, colB, colC
->from mainlang
->where colB = '4994';

I see colA have a correct numeric value, colB a
correct numeric value (4994) and for colC a string of
unintelligible characters with diacritical marks,
oomlats etc. which I know is the cp1250 encoding
interpretation of the Unicode utf8 data which is
similarly unintelligible in its own regard.

Now what I try is: do a copy of the obscure colC
cp1250 character string into the clipboard/memory
and then do the following tweak on the original
select statement to see if I can search on the
(now) cp1250 character string:

mysql>select langNumID, colB, colC
->from mainlang
->where colc = 'paste of the cp1250 character string';

The computer would not allow a paste unless I pressed
the escape key. On initiating this select command
I got an empty set (no match)

My questions are:

Has anyone been successful creating a Unicode utf8
MySQL database that accepts Arabic?

If yes, how did you get around or not encounter the
Data too long issue?

Have you tried the cp1250 (or cp1251 - same mechanics
same results) work around as I have? Are you
able to search the cp1250 character string (my colC)?
If yes, how did you successfully manage to do it?

Lastly, if I take the cp1250 encoded string and paste
it into Excel ... I can string search the cp1250
encoding with no problem.

Also, here's how I know my Unicode utf-8 data is
correct apart from my own manual cross-referencing
and being recognized by MySQL in some respect:

When I copy the Unicode utf8 encoding and try to
paste it into the select command to see what would
happen I get the following error:

ERROR 1257 (HY000): Illegal mix of collations
(cp1250_general _ci, IMPLICIT) and
(utf8_general_c i, COERCIBLE) for operation '='

So what I have here is a situation where MySQL
is recognizing Unicode utf8 encoding but not
from the respect of packing a table!

Go Figure ...

Anyone wishing to share any insight or solution would
be GREATLY appeciated. I promise if I find a solution
I will share it.

Thank you Very Much, Shukran Jiddan, Todah Rabah,
Muchos Gracias ...

Joel S
(585) 255-0997
jrs_14618 at yahoo.com


No idea, Joel. Why don't you try asking in a mysql database newsgroup - such as
comp.databases. mysql. This newsgroup is for PHP programming.

--
=============== ===
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attgl obal.net
=============== ===
Jun 14 '06 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
3970
by: Tarbuza Smith | last post by:
I was told that it is easy to create a table in MYSQL but it is hard to create a table with relationship like in Oracle or SQL Server by using Relationship or Query by Design or whatever wysiwyg tools. It has to be done by SQL query. Is there a better way to do that in MySQL? I would appreciate any suggestions.
74
8008
by: John Wells | last post by:
Yes, I know you've seen the above subject before, so please be gentle with the flamethrowers. I'm preparing to enter a discussion with management at my company regarding going forward as either a MySql shop or a Postgresql shop. It's my opinion that we should be using PG, because of the full ACID support, and the license involved. A consultant my company hired before bringing me in is pushing hard for MySql, citing speed and community...
39
8402
by: Mairhtin O'Feannag | last post by:
Hello, I have a client (customer) who asked the question : "Why would I buy and use UDB, when MySql is free?" I had to say I was stunned. I have no experience with MySql, so I was left sort of stammering and sputtering, and managed to pull out something I heard a couple of years back - that there was no real transaction safety in MySql. In flight transactions could be lost.
4
3116
by: rass.elma | last post by:
Hi all I fetch the following value from a string (VCAHR(250))colmun in a MySql table: "300000000000000000000000000000000000000000000000000" When I write it out using echo() I get : 3E+50 Appearently the PHP interpreter converts the VCAHR value automatically in a float despite the fact that the value is defined as a VCAHR in
1
13679
by: marcfischman | last post by:
Please help. I have a website running on a linux/apache/mysql/php server. I receive about 8,000-10,000 visitors a day with about 200,000 to 300,000 page views. The server is a RedHat Linux server running PHP 5.x, MySQL 5.x, Apache 2.x We have been suffering from a number of performance issues. Our hosting company has set our max connections to 100, and we are using persistent connections in PHP. At times the mysqld process takes 100%...
0
1958
by: Vineeta | last post by:
Volt is the leader in providing workforce design and is currently the 5th largest staffing firm nationwide. Volt is an E.O.E. Our client is a top entertainment/community network and a large-scale consumer facing website. At present they are seeking a highly motivated, energetic MySQl DBA to fill full-time position in their stratospherically growing organization. Our client offers entrepreneurial culture - ideas are always welcome,...
0
2284
by: yeahuh | last post by:
Quick and dirty version. Godaddy server using MySQL 4.0.24 I’m trying a left join to obtain id’s in table A(cars) that are NOT in car_id in table B(newspaper): *This is a cut down version to simplify testing. Full version is posted towards the end. SELECT C.id FROM cars C LEFT OUTER JOIN newspaper N USING (C.id=N.car_id) WHERE N.car_id IS NOT NULL;
0
11698
by: cwho.work | last post by:
Hi! We are using apache ibatis with our MySQL 5.0 database (using innodb tables), in our web application running on Tomcat 5. Recently we started getting a number of errors relating to java.sql.SQLException: Deadlock found when trying to get lock; Try restarting transaction message from server: "Lock wait timeout exceeded; try restarting transaction"; We get such errors generally on inserts or updates while applying a
4
1713
omerbutt
by: omerbutt | last post by:
hi every one I am A new Bee to php mysql and i was surfing through the net to learn about how to secure the mysql when you are working in a web environment while working with php html and javascript i came through this article http://articles.techrepublic.com.com/5100-6350_11-5287638.html and before i proceede i must tell you that iam using win xp professional sp2 where were given two main and very first step before you start making your...
2
2750
by: paulysa | last post by:
Hi there. I am experimenting with Apache/MySQL/PHP to set up a contact list. Background Am a bit of a dabbler. Downloaded MySQL 5.0.67 for windows and Apache 2.2.9 with OpenSSL-0.9.8h-r2 and PHP 5.2.6 to experiment on my PC with WinXP with SP2 before perhaps trying a Linux server system. Full install of MySQL and Apache both work separately. But a full install of PHP failed. Now I wish to uninstall PHP and try a default PHP install but...
0
8633
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9199
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
8947
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
7787
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
5891
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4392
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4642
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
3076
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2366
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.