473,396 Members | 1,714 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

quieries taking a while - key problem?

crabpot8
I am decent at queries, but recently i made a new DB with about 50000+ entries. Queries are taking almost 20+ seconds, and i think it is that i dont have decently structured keys - this is the first time I was the one that created the DB structure. Here is the info that seemed relevant - ANY HELP APPRECIATED

There is one 'primary' key - a mix of server, timestamp, and ipaddress
This is a few server access logs put into a database

Also - if you know of a good tutorial on keys - i have been trying to read up - but everywhere either deals with what they are or how to optimize queries, and none of them seem to deal with the issue together

crabpot8

____________________________________________

Field Type Null Default
ip_address char(15) No
timestamp timestamp No CURRENT_TIMESTAMP
filename varchar(1000) Yes NULL
http_status smallint(6) Yes NULL
bytes_sent int(11) Yes NULL
client_info text Yes NULL
client_id text Yes NULL
client_stat_results text Yes NULL
file_size int(11) Yes NULL
send_time int(11) Yes NULL
resends smallint(6) Yes NULL
failed_resends smallint(6) Yes NULL
pres_id smallint(6) Yes NULL
file_time int(11) Yes NULL
helix_ enum('a', 'b') No a


Indexes:
Keyname Type Cardinality Field
PRIMARY PRIMARY 55485 ip_address
timestamp
helix_


Space usage:
Type Usage
Data 39,552 KiB
Index 0 B
Total 39,552 KiB
Row Statistics:
Statements Value
Format Compact
Rows 73,709
Row length ř 549
Row size ř 549 B
Creation Feb 04, 2008 at 12:36 PM
________________________________________________

here are the types of query i have to run ran - each took over 20 seconds
Expand|Select|Wrap|Line Numbers
  1. SELECT COUNT(*) FROM rmaccess WHERE timestamp BETWEEN 20080225004627 AND 20080225014627
  2.  
  3. SELECT COUNT(*) FROM rmaccess WHERE timestamp BETWEEN 20080225014627 AND 20080225024627
  4.  
  5. SELECT COUNT(*) FROM rmaccess WHERE timestamp BETWEEN 20080225024627 AND 20080225034627
Please enclose any code within the proper code tags. See the Posting Guidelines on how to do that.

MODERATOR
Mar 11 '08 #1
3 1361
mwasif
802 Expert 512MB
Hi crabpot8,

Your query is not using the INDEX at all. Use EXPLAIN to confirm this.

Reason: In multiple column indexes, they are always read from left to right. You have defined the index in the order
Expand|Select|Wrap|Line Numbers
  1. ip_address
  2. timestamp
  3. helix_
To properly use the index for your said query, you must have to create a combine index in the following order
Expand|Select|Wrap|Line Numbers
  1. timestamp
  2. ip_address
  3. helix_
Let us know if you need more help.
Mar 12 '08 #2
Hi crabpot8,

Your query is not using the INDEX at all. Use EXPLAIN to confirm this.

Reason: In multiple column indexes, they are always read from left to right. You have defined the index in the order
Expand|Select|Wrap|Line Numbers
  1. ip_address
  2. timestamp
  3. helix_
To properly use the index for your said query, you must have to create a combine index in the following order
Expand|Select|Wrap|Line Numbers
  1. timestamp
  2. ip_address
  3. helix_
Let us know if you need more help.

Thanks a lot! I actually did it a little differently - I am not sure if its maximum effectiveness but it definitely works better. I needed the three columns to be unique together, but I am generating charts of server stats so i normally use the timestamps in my queries. So i did this - any opinions or am i headed in the right direction?
Expand|Select|Wrap|Line Numbers
  1. Primary Key on timestamp, helix_
  2. Unique Key on ip_address, timestamp, helix_
These are in the correct order that i have them in my table. My queries work in a reasonable time now, but if I made the second key timestamp, ip_address, helix_ , would it help even more?


I still would love a good tutorial on how to make indexes on data like this that does not need a unique ID, but has a column or set of columns that will suffice.

Thanks for the help!
~crabpot
Mar 12 '08 #3
Hm - I tried changing my keys up and here are the results -

SQL Statement -
Expand|Select|Wrap|Line Numbers
  1. SELECT COUNT( timestamp )
  2. FROM rmaccess
  3. WHERE timestamp
  4. BETWEEN 20080225034627
  5. AND 20080225044627
SQL Statement (with different timestamp values) Ran about 150 times on both



Keys version 1
  • Primary - timestamp,helix_
  • Unique - ip_address, timestamp, helix_

Explain Of SQL

Expand|Select|Wrap|Line Numbers
  1. id - 1
  2. select_type - SIMPLE
  3. table - rmaccess
  4. type - range
  5. possible_keys - PRIMARY
  6. key - PRIMARY
  7. key_len - 4
  8. ref - NULL
  9. rows - 6
  10. Extra - Using where; Using index
  11.  
Speed Test Results
Expand|Select|Wrap|Line Numbers
  1. 1st - 0.97661185264587
  2. 2nd - 0.71671605110168 
  3. 3rd - 0.67619180679321 

Keys version 2
  • Primary - timestamp,helix_
  • Unique - timestamp, ip_address, helix_

Explain of SQL
Expand|Select|Wrap|Line Numbers
  1. id - 1
  2. select_type - SIMPLE
  3. table - rmaccess
  4. type - range
  5. possible_keys - PRIMARY, unique
  6. key - PRIMARY
  7. key_len - 4
  8. ref - NULL
  9. rows - 6
  10. Extra - Using where; Using index
  11.  
Speed Test Results
Expand|Select|Wrap|Line Numbers
  1. 1st - 1.0137181282043 
  2. 2nd - 0.66393899917603 
  3. 3rd - 0.67385196685791 
Interesting Note: phpmyadmin said this with version two:
'PRIMARY and INDEX keys should not both be set for column `timestamp`'



So from looking at the explains - I am guessing (i did a little reading, but have not had time to do a lot) That the two queries are equivalent with the different indexes. Strangely enough - perhaps this is what phpmyadmin meant by should not be using for two queries - the version one, with timestamp in the middle of the unique rather than the front - ran faster in subsequent tests (as in perhaps it cached more). The only reason i can think of is that if it used two keys in the query it might try to cache them both, versus if it just used one it might cache more and return other results faster.

~crabpot8

PS - i know this is a lot to dig results from, but I really cant find a good tutorial on this
Mar 12 '08 #4

Sign in to post your reply or Sign up for a free account.

Similar topics

9
by: Mikhail Teterin | last post by:
Hello! I'd like to have a variable of a pointer-to-function type. The two possible values are of type (*)(FILE *) and (*)(void *). For example: getter = straight ? fgetc : gzgetc; nextchar...
0
by: RSH | last post by:
I have a simple Windows Form, where i have a datagridview. The user can switch between databases and select the tables via DropDownLists. I am having a problem that I can't really pinpoint when...
13
by: DH | last post by:
Hi, I'm trying to strip the html and other useless junk from a html page.. Id like to create something like an automated text editor, where it takes the keywords from a txt file and removes them...
2
by: Adi | last post by:
Hello friend, Lately, i cam accross a small problem that has causing quite more pain. Scanf hangs whenever i try to re- take input into the same char array. I've pasted below excerpt of my code:...
8
by: so many sites so little time | last post by:
this time i am going to use the scripts from the book and just change the names to match what i am trying to do now i just got an error that i know shouldnt be an error i think. this is the error:...
6
by: billyusa | last post by:
Hello all, im taking a intro to c++ class as part of my design comm, and having issues withthe codelab part (online). First off it doesnt give any help to what you are doing wrong. the prob is: ...
3
by: tshad | last post by:
Using asp.net 2.0, I am finding that at times, the old javascript will still be there. I was working with it for a couple of hours and the changes seem to happen. But at the end of the day, I...
3
by: Robert Bevington | last post by:
Hi all, I ran into memory problems while tying to search and replace a very large text file. To solve this I break the file up into chunks and run the search and replace on each chunk. This...
1
by: sheldonlg | last post by:
I have inherited code with a TDC control. In this file, there are two javascripts of interest. One of these is a function, filter(), which is inside <script language=javascript></script>. The...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.