473,396 Members | 1,971 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

18k queries against 70k rows

D
18k queries against 70k rows

query:
select cn from geoip where [decimal IP] between start and end

table:
"start_ip","end_ip","start","end","cc","cn"
"11.0.0.0","12.1.52.63",184549376,201405503,"US"," United States"
"111.0.0.0","111.0.0.255",1862270976,1862271231,"K R","Korea"
"112.63.162.116","112.63.162.119",1883218548,18832 18551,"ES","Spain"

After the queries are cached -thank god for query caching- it takes less
than a minute. Before the queries are cached it takes upwards of half an
hour. I have the primary key set to the start and end columns. Any ideas for
speeding things up?
Jul 23 '05 #1
7 1429
D wrote:
18k queries against 70k rows

query:
select cn from geoip where [decimal IP] between start and end

table:
"start_ip","end_ip","start","end","cc","cn"
"11.0.0.0","12.1.52.63",184549376,201405503,"US"," United States"
"111.0.0.0","111.0.0.255",1862270976,1862271231,"K R","Korea"
"112.63.162.116","112.63.162.119",1883218548,18832 18551,"ES","Spain"

After the queries are cached -thank god for query caching- it takes less
than a minute. Before the queries are cached it takes upwards of half an
hour. I have the primary key set to the start and end columns. Any ideas for
speeding things up?


Always when trying to improve performance, examine the query using
EXPLAIN to see if it's really using the indexes you intend it to use.
You might need to create another single-column index on 'end'.

Another idea might be to load your 18k decimal IP's into another table,
and do a join to geoip. For instance:

CREATE TEMPORARY TABLE tmpiplist (
decimalip VARCHAR(12) NOT NULL PRIMARY KEY
);
LOAD DATA INFILE 'iplist.txt' INTO TABLE tmpiplist;

SELECT i.decimalip, COALESCE(g.cn, 'unknown') AS cn
FROM tmpiplist AS i LEFT OUTER JOIN geoip AS g
ON i.decimalip BETWEEN g.start and g.end;

At least that way you'd be eliminating the overhead of parsing and
executing 18,000 SQL statements; you'd execute one query and get back
18,000 rows in the result set.

Regards,
Bill K.
Jul 23 '05 #2
D
thanks for responding

explain query:

explain select cn from geoip where 3734003077 between start and end

result:

"id","select_type","table","type","possible_keys", "key","key_len","ref","rows","Extra"
1,"SIMPLE","geoip","ALL","","",(null),"",70865,"Us ing where"
Jul 23 '05 #3
D wrote:
thanks for responding

explain query:

explain select cn from geoip where 3734003077 between start and end

result:

"id","select_type","table","type","possible_keys", "key","key_len","ref","rows","Extra"
1,"SIMPLE","geoip","ALL","","",(null),"",70865,"Us ing where"


Well, I had in mind that _you_ would do the analysis of what this means.
I suggested that you run the query with EXPLAIN so that _you_ could
interpret the results.

See http://dev.mysql.com/doc/mysql/en/explain.html

That page has some good examples of how to interpret the output of
EXPLAIN. (Hint: the EXPLAIN output above indicates that the query is
using no index.)

You should try defining an additional index on your 'end' column, and
using >= and <= instead of BETWEEN. The MySQL optimizer might not be
smart enough to use a compound index with BETWEEN.

Regards,
Bill K.
Jul 23 '05 #4
D
Yes, I read that page. Thanks for the additional tips.
Jul 23 '05 #5
D
I think I've made a bit of progress. I created one compound unique index
with start and end. The between statement indeed doesn't use the index for
my query.

explain select * from geoip where start<=1130773714 and end>=1130773714

1, 'SIMPLE', 'geoip', 'range', 'Index_1', 'Index_1', 4, '', 10747, 'Using
where'
Jul 23 '05 #6
D wrote:
I think I've made a bit of progress. I created one compound unique index
with start and end. The between statement indeed doesn't use the index for
my query.

explain select * from geoip where start<=1130773714 and end>=1130773714

1, 'SIMPLE', 'geoip', 'range', 'Index_1', 'Index_1', 4, '', 10747, 'Using
where'


So now you've got it down from scanning 70,865 rows to 10,747 rows.
That should improve the speed of the query somewhat.

I'm guessing that it's able to use the index to evaluate "start <= ...",
but not "end >= ..." (the usual problem with compound indexes: find
everyone named "Steve" in the telephone book).

Now try creating an additional index on end.

Regards,
Bill K.
Jul 23 '05 #7
D
I created two indexes and it did not use them. The compound index was the
only one it would use. The compound index is smaller than two independent
indexes.

I think the between statement deals better with constants, such as: col
between 1 and 10...not: foo between col1 and col2.
Jul 23 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Phil Powell | last post by:
http://www.php.net/array_filter I went there at first for my information on filtering mySQL query results using PHP, to no avail. This is more of a Vignette construct (my native environment)...
4
by: Dave Edwards | last post by:
I understand that I can fill a datagrid with multiple queries, but I cannot figure out how to fill a dataset with the same query but run against multiple SQL servers, the query , table structure...
0
by: Jim Crate | last post by:
<<finishing message this time, after accidentally sending before>> I have a couple queries in a PL/PGSQL function which execute very slowly (around one minute each) which execute in .5 second...
2
by: Jeffrey Melloy | last post by:
I have been using tsearch2 for quite a while with a fair amount of success. The other day I was playiing around with a query, and randomly changed a few things. I noticed a 10 times speedup and...
3
by: Ian Roddis | last post by:
Hello, I want to embed SQL type queries within an XML data record. The XML looks something like this: <DISPLAYPAGE> <FIELD NAME="SERVER" TYPE="DROPDOWN"> <OPTION>1<OPTION> <OPTION>2<OPTION>...
13
by: JayCallas | last post by:
I know this question has been asked. And the usual answer is don't use cursors or any other looping method. Instead, try to find a solution that uses set-based queries. But this brings up...
7
by: lawrence k | last post by:
Okay, I just backed up my database, just in case. The whole schema for the database is here: http://www.accumulist.com/index.php?whatPage=db.php You can run any SELECT query against this...
16
by: shawno | last post by:
I'm relatively new to DB2 and have noticed a problem with my queries when doing a like with ucase. Since queries need to be case insensitive, I have to do a ucase on the column(s) being searched...
9
by: jehugaleahsa | last post by:
Hello: I am writing a cute little class that will cache queries against a database. Currently, I am implementing this by storing the command text, parameter values and generated DataRows. ...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.