469,934 Members | 1,857 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,934 developers. It's quick & easy.

select query on latin1 or utf8 column: which is faster?

Assume you have two varchar (or Text) columns named L and U which are
identical except that the charset for L is latin1 and the charset for
U is utf8. All the records in L and U are identical in terms of
content, consisting of only 7 bit ASCII characters. Both columns have
indexes of the same type (e.g. assume Unique indexes if you want).

Here's my question: Will the fact that column U has a utf8 charset
make select queries run slower on that column? For example, will the
query

Select * from table where U='blahblah'

run slower than the query

Select * from table where L='blahblah'

??

Significantly slower?

I'm thinking that a query on the latin1 column would go faster since
the program knows upfront that one byte equals one character, and
vice-versa; whereas in the same query on a utf8 column the program has
a lot more "overhead" because it has to constantly be determining how
many bytes represent a character. Since queries on string columns are
case insensitive, the program can't just do a byte-for-byte
comparision; rather, it has to compare *characters*, and sometimes
convert a character from upper to lower case, or vice versa, in order
to do the case-insensitive comparison.

The actual column in question is going to store URLs, so it should
only need to hold 7 bit ascii characters (in theory at least). So, in
terms of content, it shouldn't matter whether I make the column latin1
or utf8. But in terms of query speed....on, say, a few million
records...??

I would like to do everything in utf8 (web pages, forms, mysql
database columns, etc.). But since that one column might be heavily
queried, maybe I should make an exception and do it in latin1?? I wish
the mysql docs would speak to these issues.... Thanks for any help.

Paul

(ps, if you know of any good websites or books that deal with this
issue, let me know....thanks).
Jul 23 '05 #1
1 4122
Paul wrote:
Here's my question: Will the fact that column U has a utf8 charset
make select queries run slower on that column?


It's an interesting question, but my educated guess is that the
character set is not high on the list of factors that affect query
performance. I would guess that making sure that the column is indexed
appropriately, and that the MySQL service's cache settings are tuned
well, would have a much greater impact on performance.

I think that the MySQL docs don't make explicit claims about performance
of this feature over that feature because there are so many other factors.

The length of the strings, the number of records in the table, the
degree to which the values are unique within that field or not, the type
of query terms used to fetch them, and the server hardware configuration
all can be influential on performance, and might make it hard to make
blanket statements about one other factor such as character set.

In general, before you worry about fine-grained performance issues, you
should identify where your bottlenecks truly are, and take care of
those. This should be based on performance measurements that are
representative of your system and usage, not claims made in documentation.

Regards,
Bill K.
Jul 23 '05 #2

This discussion thread is closed

Replies have been disabled for this discussion.

By using this site, you agree to our Privacy Policy and Terms of Use.