Consider the output of these two explain statements:
mysql> explain select count(*)
from cities,addresse s,venues,events
where cities.latitude <= 30.2741903768
and cities.latitude >= 30.2596976232
and cities.longitud e <= -97.734387326
and cities.longitud e >= -97.751168674
and events.venue = venues.id
and venues.address = addresses.id
and addresses.city = cities.id
and events.submit_d ate >= '1976-12-03 18:33:27'
and events.active = 1
order by events.start,ev ents.end limit 1000;
+-----------+-------+-------------------------------+-----------+---------+-----------------+------+---------------------------------------------+
| table | type | possible_keys | key |
key_len | ref | rows | Extra
|
+-----------+-------+-------------------------------+-----------+---------+-----------------+------+---------------------------------------------+
| cities | range | id,latlong,lati tude,longitude | longitude |
4 | NULL | 2 | where used; Using temporary; Using
filesort |
| addresses | ref | id,city | city |
4 | cities.id | 16 |
|
| venues | ref | id,address | address |
4 | addresses.id | 20 |
|
| events | ref | submit_date,ven ue,active,sva | venue |
5 | venues.id,const | 2 | where used
|
+-----------+-------+-------------------------------+-----------+---------+-----------------+------+---------------------------------------------+
4 rows in set (0.00 sec)
mysql> explain select count(*)
from cities,addresse s,venues,events
where cities.latitude <= 42.6937078406
and cities.latitude >= 41.9690701594
and cities.longitud e <= -82.5528748628
and cities.longitud e >= -83.5387911372
and events.venue = venues.id
and venues.address = addresses.id
and addresses.city = cities.id
and events.submit_d ate >= '1976-12-03 18:36:37'
and events.active = 1
order by events.start,ev ents.end limit 1000;
+-----------+--------+-------------------------------+--------+---------+----------------+-------+----------------------------+
| table | type | possible_keys | key |
key_len | ref | rows | Extra |
+-----------+--------+-------------------------------+--------+---------+----------------+-------+----------------------------+
| events | ref | submit_date,ven ue,active,sva | active |
1 | const | 16135 | where used; Using filesort |
| venues | eq_ref | id,address | id |
4 | events.venue | 1 | |
| addresses | eq_ref | id,city | id |
4 | venues.address | 1 | |
| cities | eq_ref | id,latlong,lati tude,longitude | id |
4 | addresses.city | 1 | where used |
+-----------+--------+-------------------------------+--------+---------+----------------+-------+----------------------------+
4 rows in set (0.00 sec)
Note that the only differences between them are the specific values
compared to the latitude, longitude and submit_date columns. Why then
does the first query perform so few comparisons and the second perform
so many? I've tried the same queries several times over the past
couple days. The server has been restarted, the MYI files have been
myisamchk'd. The tables have been flushed. (Is there a way to tell
the server to rebuild its indexes?) To show that explain isn't just
screwed up, the first query takes less that a tenth of a second on my
system while the second takes nearly six seconds. Both return a count
of slightly more than 100.
Here are the table definitions for the events and cities tables:
CREATE TABLE events (
id int(10) unsigned NOT NULL auto_increment,
start date NOT NULL default '0000-00-00',
end date NOT NULL default '0000-00-00',
time varchar(20) NOT NULL default '',
info varchar(255) NOT NULL default '',
event varchar(128) NOT NULL default '',
venue int(10) unsigned NOT NULL default '0',
active tinyint(3) unsigned NOT NULL default '1',
submitter int(10) unsigned NOT NULL default '0',
submit_date date NOT NULL default '0000-00-00',
update_date date NOT NULL default '0000-00-00',
price varchar(32) NOT NULL default '',
UNIQUE KEY id (id),
KEY submit_date (submit_date),
KEY update_date (update_date),
KEY venue (venue,active),
KEY event (event,active),
KEY date (start,active,e nd),
KEY active (active),
KEY sva (submit_date,ve nue,active)
) TYPE=MyISAM;
CREATE TABLE cities (
id int(10) unsigned NOT NULL auto_increment,
city varchar(128) NOT NULL default '',
state char(2) NOT NULL default '',
country varchar(128) NOT NULL default '',
latitude float NOT NULL default '-255',
longitude float NOT NULL default '-255',
UNIQUE KEY id (id),
KEY city (city),
KEY state (state),
KEY country (country),
KEY latlong (latitude,longi tude),
KEY latitude (latitude),
KEY longitude (longitude)
) TYPE=MyISAM;
All other involved tables just chain between the events and cities
tables. All tables are TYPE=MyISAM.
Based on some inputs I received to a similar post last November, I've
tried the following suggestions:
* reorganized the lat/long comparisons to compare both latitudes
then both longitudes
* recast the lat/long comparisons to use the BETWEEN function
instead of comparison operators
* added separate latitude and longitude indexes to the cities
table instead of just the composite latlong index.
At one point I thought that STRAIGHT_JOIN was going to be my
salvation. Alas, while it seems to solve the problem some of the
time, it also appears to just move the problem around (other similar
queries perform badly at times, and for no apparent reason).
Accordingly, I've dumped STRAIGHT_JOIN and just let the MySQL query
optimizer do its thing without any help from me.
My server is 3.23.41-log as reported by mysqladmin, protocol version
10. Connections are to localhost via a unix domain socket.
I can pretty easily upgrade to the latest micro release (.58). Is
there anything to suggest that the query optimizer is somehow better
in 4.0.18?
Thanks,
Skip Montanaro
Got gigs? http://www.musi-cal.com/submit.html
Got spam? http://www.spambayes.org/
sk**@pobox.com