MySQL underperforming - trying to identify the bottleneck


Currently we have a database with a main table containing 3 million
records - we want to increase that to 10 million, but that's not a
possibility at the moment.
Nearly all 3 million records are deleted and replaced every day - all
through the day - currently we're handling this by having two sets of
tables: one for inserting, one for searching.

Deletes of a block of records (10k to 1 million rows, distinguished by
a client identifier field) happen on the 'alt' set of tables; then
records are inserted from CSV files using LOAD DATA INFILE (the CSV
file is created by loading XML or CSV files in proprietary client
formats, then validating and rewriting the data in our format).
To facilitate faster search times, summary tables are updated from the
latest load - i.e. INSERT INTO summarytable SELECT fields FROM
alttable JOIN supportingtables WHERE clientID = $clientID.
Then we LOAD INDEX INTO CACHE for all the relevant tables (key_buffer
is set to 512MB).
Then we switch a flag in an info table to tell the searches to start
pulling from the updated tables, and then we repeat the process on the
set that was previously the search set.
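
To make the cycle concrete, a stripped-down version of one load looks
roughly like this (table, column, and file names here are placeholders,
not our real schema):

-- 1. clear this client's block on the inactive ('alt') set
DELETE FROM tblMainAlt WHERE fldClientID = 17;

-- 2. bulk-load the validated, reformatted CSV
LOAD DATA INFILE '/tmp/client17.csv'
INTO TABLE tblMainAlt
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- 3. rebuild the summary rows for this client
INSERT INTO tblSummaryAlt
SELECT m.fldResort, m.fldPrice, s.fldCountry /* etc. */
FROM tblMainAlt m
JOIN tblSupporting s ON s.fldResortID = m.fldResortID
WHERE m.fldClientID = 17;

-- 4. preload the MyISAM key cache before going live
LOAD INDEX INTO CACHE tblMainAlt, tblSummaryAlt;

-- 5. flip the flag so searches read the freshly loaded set
UPDATE tblInfo SET fldActiveSet = 'alt';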

During this time even simple queries can end up in the slow query log,
and I can't figure out why.

This query benchmarks at approx. 0.25s:
SELECT fldResort AS dest_name, fldResort as ap_destname,
fldDestinationAPC, min( fldPrice ) AS price, fldCountry as country,
fldBoardBasis, fldFlyTime, sum( fldOfferCount ) as offercount
FROM tblSummaryFull WHERE fldStatus = 0 AND fldDepartureDate >=
'2006-12-27' AND fldDepartureDate <= '2007-01-02' AND fldDuration >= 7
AND fldDuration <= 7 AND tblSummaryFull.fldSearchTypes LIKE '%all%'
GROUP BY dest_name, fldBoardBasis ORDER BY price
It's using where, temporary and filesort, with a key length of 3, and
examined 23k rows.
The log reads:
Query_time: 11 Lock_time: 0 Rows_sent: 267 Rows_examined: 23889
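
For anyone wanting to reproduce the plan figures above, they come from
EXPLAIN on the same statement (a trimmed-down equivalent; the BETWEEN
and = rewrites match the original range conditions):

EXPLAIN SELECT fldResort AS dest_name, MIN(fldPrice) AS price
FROM tblSummaryFull
WHERE fldStatus = 0
  AND fldDepartureDate BETWEEN '2006-12-27' AND '2007-01-02'
  AND fldDuration = 7
  AND fldSearchTypes LIKE '%all%'
GROUP BY dest_name, fldBoardBasis
ORDER BY price;
-- reports: key_len 3, Extra: Using where; Using temporary; Using filesort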

But even the most basic queries are being affected

SELECT * FROM tblResortInfo WHERE fldClientID=17 AND fldAccomRef='3883'

Benchmarked at 0.02s (there are 0 results for this query).
From the log: # Query_time: 11 Lock_time: 0 Rows_sent: 0 Rows_examined: 1

The site is at very low traffic at the moment (around 3k visitors per day).

I'm doing everything I can to improve performance and query speeds
before next summer (when we're aiming for around 30k per day), but I
can't seem to do anything about this, and if queries won't run at their
optimal speed then all this work has been for nothing.

It's probably worth noting that our CPU usage is barely at 50% - ditto
with RAM.
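
For reference, the standard statements for watching the server live
while the loads run:

SHOW FULL PROCESSLIST;                  -- what every connection is doing right now
SHOW STATUS LIKE 'Key_%';               -- key cache read/write hit counters
SHOW VARIABLES LIKE 'long_query_time';  -- the slow-log threshold (default 10s)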

Nov 2 '06 #1
On 2 Nov 2006 04:09:45 -0800, in mailing.database.mysql "NancyJ"
<ha***@hazelryan.co.uk>
<11**********************@e3g2000cwe.googlegroups.com> wrote:
>| [...]
| Nearly all 3 million records are deleted and replaced every day
Why?
>| [...]
| During this time even simple queries can end up in the slow query log,
| and I can't figure out why.
What indices have you set for the table(s)?
Nov 2 '06 #2

Jeff North wrote:
On 2 Nov 2006 04:09:45 -0800, in mailing.database.mysql "NancyJ"
<ha***@hazelryan.co.uk>
<11**********************@e3g2000cwe.googlegroups.com> wrote:
| Nearly all 3 million records are deleted and replaced every day [...]

Why?
Because they change every day - we have around 30 data suppliers and
every day they supply us with a new file - sometimes they want to add
to their current dataset, sometimes they want to replace it with a
whole new data set.
>| [...]
| During this time even simple queries can end up in the slow query log,
| and I can't figure out why.

What indices have you set for the table(s)?
We have nearly 100 tables - it would take all day to list every index.
Under good conditions all our uncached queries are fast - I'm trying to
find the cause of simple queries, which aren't locked or limited by CPU
or memory, going 1000 times slower than they should.
Nov 2 '06 #3
NancyJ wrote:
Currently we have a database with a main table containing 3 million
records [...]

It shouldn't really matter why a DBA deletes or adds tables or index
fields. The server should be, and is, able to handle this and then some
if you have the right configuration.

Having said that, turn on slow query logging on your server and start
looking at what is causing the bottlenecks with the mysqldumpslow
command. It will give you a somewhat aggregated tally of what is going
on with all of your queries. Using the mysqldumpslow results, start
creating indexes on the guilty parties. Can't get any simpler than
that, eh?

You should set the long_query_time parameter (it's a threshold
parameter) in the my.cnf file. Usually that is set at 5 seconds, or
whatever you feel is the right number.
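
For example, a my.cnf fragment along these lines (the log path is
illustrative, and on 4.x/5.0-era servers a restart is needed for it to
take effect):

# my.cnf - enable and tune the slow query log
[mysqld]
log-slow-queries = /var/log/mysql/slow.log   # illustrative path
long_query_time  = 5                         # seconds; the default is 10

# then summarise the log, grouping similar queries, sorted by time:
#   mysqldumpslow -s t /var/log/mysql/slow.log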
Nov 2 '06 #4
On 2 Nov 2006 08:41:30 -0800, in mailing.database.mysql "NancyJ"
<ha***@hazelryan.co.uk>
<11*********************@m73g2000cwd.googlegroups.com> wrote:
>| [...]
| | Nearly all 3 million records are deleted and replaced every day [...]
| Why?
| Because they change every day - we have around 30 data suppliers and
| every day they supply us with a new file - sometimes they want to add
| to their current dataset, sometimes they want to replace it with a
| whole new data set.
Just wanting clarification of why it was necessary to delete the
records :-)

What method are you using to delete the records - DELETE FROM or
TRUNCATE TABLE?
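
The distinction matters, particularly for MyISAM: DELETE removes rows
one at a time and leaves holes in the data file, while TRUNCATE TABLE
effectively drops and recreates the table. A sketch (table and column
names are placeholders):

-- per-client block delete: logs each row, leaves gaps in the .MYD file
DELETE FROM tblMain WHERE fldClientID = 17;

-- whole-table wipe: effectively drop + recreate; much faster and
-- resets AUTO_INCREMENT, but can't be limited with a WHERE clause
TRUNCATE TABLE tblMain;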
>| [...]
| What indices have you set for the table(s)?
| We have nearly 100 tables - it would take all day to list every index.
| Under good conditions all our uncached queries are fast - I'm trying to
| find the cause of simple queries, which aren't locked or limited by CPU
| or memory, going 1000 times slower than they should.
This may not be a database issue. If you are deleting and recreating
tables/files, the actual data may be fragmented on the hard drive. Have
you tried defragging your drive?
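
There's also a MySQL-level equivalent worth trying first (assuming
MyISAM tables, which your use of LOAD INDEX INTO CACHE suggests):
OPTIMIZE TABLE rewrites the data file to reclaim the holes left by mass
deletes and sorts the index pages. Table names here are placeholders:

-- reclaims space from deleted rows and defragments the data file;
-- it locks the table while running, so run it on the inactive set
-- before flipping the search flag
OPTIMIZE TABLE tblMainAlt, tblSummaryAlt;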
Nov 2 '06 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

