Bytes | Software Development & Data Engineering Community

select query on latin1 or utf8 column: which is faster?

Assume you have two varchar (or Text) columns named L and U which are
identical except that the charset for L is latin1 and the charset for
U is utf8. All the records in L and U are identical in terms of
content, consisting of only 7-bit ASCII characters. Both columns have
indexes of the same type (e.g. assume Unique indexes if you want).

Here's my question: Will the fact that column U has a utf8 charset
make select queries run slower on that column? For example, will the query

Select * from table where U='blahblah'

run slower than the query

Select * from table where L='blahblah'


Significantly slower?

I'm thinking that a query on the latin1 column would go faster since
the program knows upfront that one byte equals one character, and
vice-versa; whereas in the same query on a utf8 column the program has
a lot more "overhead" because it has to constantly be determining how
many bytes represent a character. Since queries on string columns are
case insensitive, the program can't just do a byte-for-byte
comparison; rather, it has to compare *characters*, and sometimes
convert a character from upper to lower case, or vice versa, in order
to do the case-insensitive comparison.
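
As a small illustration of the premise (not of how MySQL compares strings internally, which this sketch makes no claim about), the Python snippet below shows that a string of only 7-bit ASCII characters encodes to the same bytes in latin1 and utf8, with one byte per character. The URL and the helper name `encoded_forms` are made up for the example.

```python
def encoded_forms(text: str) -> tuple[bytes, bytes]:
    """Return the latin1 and utf8 encodings of text, in that order."""
    return text.encode("latin-1"), text.encode("utf-8")


# Hypothetical URL made of 7-bit ASCII characters only.
url = "http://example.com/blahblah"
latin1_bytes, utf8_bytes = encoded_forms(url)

# For pure ASCII content the two encodings are byte-for-byte identical,
# and utf8 uses exactly one byte per character.
same_bytes = latin1_bytes == utf8_bytes
one_byte_per_char = len(utf8_bytes) == len(url)

# A non-ASCII character is where the encodings diverge:
# "é" is one byte in latin1 but two bytes in utf8.
accented = "caf\u00e9"
accented_latin1, accented_utf8 = encoded_forms(accented)

print(same_bytes, one_byte_per_char)
print(len(accented_latin1), len(accented_utf8))
```

So for URL-only data the stored bytes would be the same either way; any speed difference would have to come from how the server handles the charset and collation, not from the data itself.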

The actual column in question is going to store URLs, so it should
only need to hold 7-bit ASCII characters (in theory at least). So, in
terms of content, it shouldn't matter whether I make the column latin1
or utf8. But in terms of query speed... on, say, a few million records?

I would like to do everything in utf8 (web pages, forms, MySQL
database columns, etc.). But since that one column might be heavily
queried, maybe I should make an exception and do it in latin1? I wish
the MySQL docs would speak to these issues. Thanks for any help.


(P.S. If you know of any good websites or books that deal with this
issue, let me know. Thanks.)
Jul 23 '05 #1
Paul wrote:
Here's my question: Will the fact that column U has a utf8 charset
make select queries run slower on that column?

It's an interesting question, but my educated guess is that the
character set is not high on the list of factors that affect query
performance. I would guess that making sure that the column is indexed
appropriately, and that the MySQL service's cache settings are tuned
well, would have a much greater impact on performance.

I think that the MySQL docs don't make explicit claims about performance
of this feature over that feature because there are so many other factors.

The length of the strings, the number of records in the table, the
degree to which the values are unique within that field or not, the type
of query terms used to fetch them, and the server hardware configuration
all can be influential on performance, and might make it hard to make
blanket statements about one other factor such as character set.

In general, before you worry about fine-grained performance issues, you
should identify where your bottlenecks truly are, and take care of
those. This should be based on performance measurements that are
representative of your system and usage, not claims made in documentation.
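
A minimal sketch of that kind of measurement in Python, assuming you wrap each SELECT in a function of your own: `time_query` and `fake_lookup` are hypothetical names, and `fake_lookup` is only a stand-in so the snippet runs without a database. In practice you would pass a function that runs the real query against the latin1 column, then one for the utf8 column, on a table of realistic size.

```python
import statistics
import time
from typing import Callable


def time_query(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock seconds over `repeats` calls of run_query."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_lookup() -> int:
    # Stand-in workload (an assumption for this example); replace with a
    # function that executes the real SELECT through your MySQL driver.
    return sum(range(10_000))


median_seconds = time_query(fake_lookup)
print(f"median: {median_seconds:.6f} s")
```

Taking the median of several runs reduces the effect of one-off stalls, and caches should be in the same state (warm or cold) for both columns before comparing the numbers.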

Bill K.
Jul 23 '05 #2


