Bytes | Software Development & Data Engineering Community

select query on latin1 or utf8 column: which is faster?

Assume you have two varchar (or Text) columns named L and U which are
identical except that the charset for L is latin1 and the charset for
U is utf8. All the records in L and U are identical in terms of
content, consisting of only 7 bit ASCII characters. Both columns have
indexes of the same type (e.g. assume Unique indexes if you want).

Here's my question: Will the fact that column U has a utf8 charset
make select queries run slower on that column? For example, will the
query

Select * from table where U='blahblah'

run slower than the query

Select * from table where L='blahblah'

??

Significantly slower?

I'm thinking that a query on the latin1 column would go faster since
the program knows upfront that one byte equals one character, and
vice-versa; whereas in the same query on a utf8 column the program has
a lot more "overhead" because it has to constantly be determining how
many bytes represent a character. Since queries on string columns are
case insensitive, the program can't just do a byte-for-byte
comparison; rather, it has to compare *characters*, and sometimes
convert a character from upper to lower case, or vice versa, in order
to do the case-insensitive comparison.
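
As a rough illustration of the "how many bytes represent a character" overhead described above, here is a minimal Python sketch of UTF-8 length detection. It is not MySQL's actual implementation; the function names `utf8_sequence_length` and `count_characters` are hypothetical and exist only for this example. For 7-bit ASCII data, every byte takes the first branch.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes the UTF-8 sequence starting with lead_byte occupies."""
    if lead_byte < 0x80:            # 7-bit ASCII: one byte, one character
        return 1
    if 0xC2 <= lead_byte <= 0xDF:   # two-byte sequence
        return 2
    if 0xE0 <= lead_byte <= 0xEF:   # three-byte sequence
        return 3
    if 0xF0 <= lead_byte <= 0xF4:   # four-byte sequence
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead_byte:#x}")


def count_characters(data: bytes) -> int:
    """Count characters in well-formed UTF-8 data by walking lead bytes."""
    count = 0
    i = 0
    while i < len(data):
        i += utf8_sequence_length(data[i])
        count += 1
    return count


print(count_characters(b"blahblah"))                 # 8 bytes, 8 characters
print(count_characters("caf\u00e9".encode("utf-8")))  # 5 bytes, 4 characters
```

In a latin1 column the character count is simply the byte count, so this walk is unnecessary; whether the difference is measurable in a real query is a separate question.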

The actual column in question is going to store URLs, so it should
only need to hold 7 bit ascii characters (in theory at least). So, in
terms of content, it shouldn't matter whether I make the column latin1
or utf8. But in terms of query speed....on, say, a few million
records...??
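
To make the "it shouldn't matter in terms of content" point concrete: a string made only of 7-bit ASCII characters encodes to exactly the same bytes in latin1 and utf8. The sketch below checks this in Python; the helper name `encodings_match` and the sample URL are assumptions made for the example, not anything from MySQL.

```python
def encodings_match(text: str) -> bool:
    """True if text has the same byte representation in latin1 and utf8."""
    return text.encode("latin-1") == text.encode("utf-8")


url = "http://example.com/path?q=blahblah"   # hypothetical ASCII-only URL
accented = "caf\u00e9"                        # contains one non-ASCII character

print(encodings_match(url))        # True: identical bytes in both charsets
print(encodings_match(accented))   # False: the accented letter differs
print(len(accented.encode("latin-1")), len(accented.encode("utf-8")))  # 4 5
```

So for ASCII-only URLs the stored bytes are the same either way; any speed difference would have to come from how the server compares and sizes the values, not from the data itself.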

I would like to do everything in utf8 (web pages, forms, mysql
database columns, etc.). But since that one column might be heavily
queried, maybe I should make an exception and do it in latin1?? I wish
the mysql docs would speak to these issues.... Thanks for any help.

Paul

(ps, if you know of any good websites or books that deal with this
issue, let me know....thanks).
Jul 23 '05 #1
Paul wrote:
> Here's my question: Will the fact that column U has a utf8 charset
> make select queries run slower on that column?


It's an interesting question, but my educated guess is that the
character set is not high on the list of factors that affect query
performance. I would guess that making sure that the column is indexed
appropriately, and that the MySQL service's cache settings are tuned
well, would have a much greater impact on performance.

I think that the MySQL docs don't make explicit claims about performance
of this feature over that feature because there are so many other factors.

The length of the strings, the number of records in the table, the
degree to which the values are unique within that field or not, the type
of query terms used to fetch them, and the server hardware configuration
all can be influential on performance, and might make it hard to make
blanket statements about one other factor such as character set.

In general, before you worry about fine-grained performance issues, you
should identify where your bottlenecks truly are, and take care of
those. This should be based on performance measurements that are
representative of your system and usage, not claims made in documentation.
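
One way to take such measurements is a small timing harness like the Python sketch below. It is only a sketch under stated assumptions: `median_seconds` is a hypothetical helper, and the dictionary lookup is a stand-in workload so the example runs without a database. In practice you would pass a callable that executes the real query against the L column, then another for the U column, through whatever MySQL client library you already use.

```python
import statistics
import time
from typing import Callable


def median_seconds(run_query: Callable[[], object], repeats: int = 50) -> float:
    """Run run_query repeatedly and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload (an in-memory lookup), NOT a MySQL query.
rows = {f"http://example.com/{i}": i for i in range(100_000)}
elapsed = median_seconds(lambda: rows.get("http://example.com/4242"))
print(f"median lookup time: {elapsed:.9f} s")
```

The median is used rather than the mean so that a few slow runs (cold caches, background load) do not dominate the result. Run it against a table of realistic size before drawing conclusions.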

Regards,
Bill K.
Jul 23 '05 #2
