Improve query performance on large table

Bing Wu

Hi Folks,

I have a very large table containing 170 million rows of coordinats:

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;

ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
ALTER TABLE "DB2ADMIN"."COORDINATE" ADD CONSTRAINT
"PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

When I issue:

SELECT X,Y,Z FROM COORDINATE WHERE FID<1000;

it takes more than 10 minutes to finish. The system runs on a Sun Cobalt box (dual-PIII 1.4G, 1GB mem). I have tuned the Buffer Pool to 500MB. Is there any other way to improve the performance?

A strange thing is, when I join the above SQL with another table, it reduces the time:

SELECT C.X,C.Y,C.Z FROM COORDINATE AS C, FRAME AS F
WHERE C.FID=F.ID AND F.ID<1000

It takes only 7 minutes. FID is a foreign key of table 'FRAME'.

Please advice.

Thanks,

Bing

Nov 12 '05 #1

Subscribe Post Reply

10780

Mark A

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bo**********@news.ox.ac.uk...

Hi Folks,

I have a very large table containing 170 million rows of coordinats:

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;

ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
ALTER TABLE "DB2ADMIN"."COORDINATE" ADD CONSTRAINT
"PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

When I issue:

SELECT X,Y,Z FROM COORDINATE WHERE FID<1000;

it takes more than 10 minutes to finish. The system runs on a Sun Cobalt box (dual-PIII 1.4G, 1GB mem). I have tuned the Buffer Pool to 500MB. Is
there any other way to improve the performance?
A strange thing is, when I join the above SQL with another table, it reduces the time:
SELECT C.X,C.Y,C.Z FROM COORDINATE AS C, FRAME AS F
WHERE C.FID=F.ID AND F.ID<1000

It takes only 7 minutes. FID is a foreign key of table 'FRAME'.

Please advice.

Thanks,

Bing

Did you perform runstats utility with full statistics on the table and all
indexes?
RUNSTATS ON TABLE table-name DETAILED INDEXES ALL WITH DISTRIBUTION ON ALL
COLUMNS

What percent of the table is returned with this query?

Did you do a visual explain on this query? If so, is the index used?

If you only tested the first query once, it could be because the buffer
pools are loaded for the second query.

Nov 12 '05 #2

Joachim Banzhaf

Hi Bing,

I guess you did not
RUNSTATS ON TABLE table-name WITH DISTRIBUTION AND DETAILED INDEXES ALL
after creating the index and filling the table with data?

With no distribution statistics <1000 would be estimated to roughly 50% of
the rows (85 million) which would not qualify for using an index.
If you expect a relatively small result set (e.g. 1000 fid's) this will
help.

Joachim Banzhaf

Bing Wu wrote:

Hi Folks,

I have a very large table containing 170 million rows of coordinats:

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;

ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
ALTER TABLE "DB2ADMIN"."COORDINATE" ADD CONSTRAINT
"PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

When I issue:

SELECT X,Y,Z FROM COORDINATE WHERE FID<1000;

it takes more than 10 minutes to finish. The system runs on a Sun Cobalt
box (dual-PIII 1.4G, 1GB mem). I have tuned the Buffer Pool to 500MB. Is
there any other way to improve the performance?

A strange thing is, when I join the above SQL with another table, it
reduces the time:

SELECT C.X,C.Y,C.Z FROM COORDINATE AS C, FRAME AS F
WHERE C.FID=F.ID AND F.ID<1000

It takes only 7 minutes. FID is a foreign key of table 'FRAME'.

Please advice.

Thanks,

Bing

Nov 12 '05 #3

Bing Wu

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bo**********@news.ox.ac.uk...
Hi Folks,

I have a very large table containing 170 million rows of coordinats:

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;

ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
ALTER TABLE "DB2ADMIN"."COORDINATE" ADD CONSTRAINT
"PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

When I issue:

SELECT X,Y,Z FROM COORDINATE WHERE FID<1000;

it takes more than 10 minutes to finish. The system runs on a Sun Cobalt

box (dual-PIII 1.4G, 1GB mem). I have tuned the Buffer Pool to 500MB. Is
there any other way to improve the performance?
A strange thing is, when I join the above SQL with another table, it

reduces the time:
SELECT C.X,C.Y,C.Z FROM COORDINATE AS C, FRAME AS F
WHERE C.FID=F.ID AND F.ID<1000

It takes only 7 minutes. FID is a foreign key of table 'FRAME'.

Please advice.

Thanks,

Bing

Did you perform runstats utility with full statistics on the table and all
indexes?
RUNSTATS ON TABLE table-name DETAILED INDEXES ALL WITH DISTRIBUTION ON ALL
COLUMNS

What percent of the table is returned with this query?

Did you do a visual explain on this query? If so, is the index used?

If you only tested the first query once, it could be because the buffer
pools are loaded for the second query.

Nov 12 '05 #4

Mark A

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:3F**************@biop.ox.ac.uk...

Thanks,

The query returns 5% of rows. I have the result from dynexpln:

SQL Statement:
DECLARE C1 CURSOR
FOR
select c.x, c.y, c.z
from coordinate as c, frame as f
where f.tid=1 and c.fid=f.id and f.id<1000
Section Code Page = 819

Estimated Cost = 4862935.000000
Estimated Cardinality = 3394883.500000

Access Table Name = DB2ADMIN.FRAME ID = 2,38
| #Columns = 0
| Index Scan: Name = DB2ADMIN.PK_FRAME3 ID = 5
| | Regular Index (Not Clustered)
| | Index Columns:
| | | 1: ID (Ascending)
| | #Key Columns = 1
| | | Start Key: Beginning of Index
| | | Stop Key: Exclusive Value
| | | | 1: 1000
| | Index-Only Access
| | Index Prefetch: None
| | Insert Into Sorted Temp Table ID = t1
| | | #Columns = 1
| | | #Sort Key Columns = 1
| | | | Key 1: (Ascending)
| | | Sortheap Allocation Parameters:
| | | | #Rows = 804
| | | | Row Width = 12
| | | Piped
| | | Duplicate Elimination
| Isolation Level: Uncommitted Read
| Lock Intents
| | Table: Intent None
| | Row : None
Sorted Temp Table Completion ID = t1
List Prefetch Preparation
| Access Table Name = DB2ADMIN.FRAME ID = 2,38
| | #Columns = 1
| | Relation Scan
| | | Prefetch: 50 Pages
| | Lock Intents
| | | Table: Intent Share
| | | Row : Next Key Share
| | Sargable Predicate(s)
| | | #Predicates = 2
| | Process Build Table for Hash Join
Hash Join
| Early Out: Single Match Per Outer Row
| Estimated Build Size: 12093
| Estimated Probe Size: 220287872
| Access Table Name = DB2ADMIN.COORDINATE ID = 2,6
| | #Columns = 0
| | Index Scan: Name = DB2ADMIN.IDX_COORDINATE1 ID = 2
| | | Regular Index (Not Clustered)
| | | Index Columns:
| | | | 1: FID (Ascending)
| | | #Key Columns = 1
| | | | Start Key: Beginning of Index
| | | | Stop Key: Exclusive Value
| | | | | 1: 1000
| | | Index-Only Access
| | | Index Prefetch: Eligible 31867
| | | Insert Into Sorted Temp Table ID = t2
| | | | #Columns = 1
| | | | #Sort Key Columns = 1
| | | | | Key 1: (Ascending)
| | | | Sortheap Allocation Parameters:
| | | | | #Rows = 6835684
| | | | | Row Width = 12
| | | | Piped
| | | | Duplicate Elimination
| | Isolation Level: Uncommitted Read
| | Lock Intents
| | | Table: Intent None
| | | Row : None
| Sorted Temp Table Completion ID = t2
| List Prefetch Preparation
| | Access Table Name = DB2ADMIN.COORDINATE ID = 2,6
| | | #Columns = 4
| | | Relation Scan
| | | | Prefetch: 1029488 Pages
| | | Lock Intents
| | | | Table: Intent Share
| | | | Row : Next Key Share
| | | Sargable Predicate(s)
| | | | #Predicates = 1
| | | Process Probe Table for Hash Join
Return Data to Application
| #Columns = 3
I did test two queries several times and it still truns out query2 3 mins ahead of query1. Any thoughts?
Cheers,

Bing

If you did the runstats requested with detail statistics (you did not say
above), and this is static SQL, then you need to rebind your package.

You did not show the access plan of query 1, but it may be that it just
takes that long to return the 200,000 rows (5%).

However I would use the Performance Wizard in the Control Center to tune the
database parameters, such as sort heaps, etc.

Nov 12 '05 #5

Bing Wu

Mark A wrote:

If you did the runstats requested with detail statistics (you did not say
above), and this is static SQL, then you need to rebind your package.
Yes, I did run the runstats.
You did not show the access plan of query 1, but it may be that it just
takes that long to return the 200,000 rows (5%).

Ahh, the returned rows are 8.5 million (that's about 500MB in file). But 10 minutes is still a bit long...
However I would use the Performance Wizard in the Control Center to tune the
database parameters, such as sort heaps, etc.

Thanks for the info. I'll try to figure around.

Bing

Nov 12 '05 #6

Mark A

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bo**********@news.ox.ac.uk...

Ahh, the returned rows are 8.5 million (that's about 500MB in file). But 10 minutes is still a bit long...

Why not take the output file and copy it to another file to see how long it
takes for the operating system to create a file that large.

Nov 12 '05 #7

Bing Wu

Thanks Mark. The copy I/O only took half minute to complete. I doubt there might be some parameters need to be tuned. Since I have done a MySQL implementation, the same SQL on the same datasets took only 3 minutes...

Bing

Mark A wrote:

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bo**********@news.ox.ac.uk...
Ahh, the returned rows are 8.5 million (that's about 500MB in file). But

10 minutes is still a bit long...

Why not take the output file and copy it to another file to see how long it
takes for the operating system to create a file that large.

Nov 12 '05 #8

by: Leader | last post by:

Hi All, I am getting slower performance of select statements in MS SQL. I am finding select statements in MS SQL are even slower than MS ACCESS. Is there any way to improve the performance of...

Microsoft SQL Server

How do you improve SQL performance over large amount of data?

by: charlies224 | last post by:

Hi, I am using SQL 2000 and has a table that contains more than 2 million rows of data (and growing). Right now, I have encountered 2 problems: 1) Sometimes, when I try to query against this...

Microsoft SQL Server

Going for a LARGE Table: Any Tips?

by: Good Man | last post by:

Hi there I'm developing a large web application. Part of this web application will be storing numerical chart data in a MySQL table - these numbers will be already calculated, and are just being...

MySQL Database

Query performance question on a large table

by: Együd Csaba | last post by:

Hi All, how can I improve the query performance in the following situation: I have a big (4.5+ million rows) table. One query takes approx. 9 sec to finish resulting ~10000 rows. But if I run...

PostgreSQL Database

Query performance

by: Robert Stearns | last post by:

I am currently using the following query to select rows for a report. It takes forever, even though there are < 100 rows in the result set; The problem is that animals, epd and ent_herdid are all...

DB2 Database

Large table performance problem with MYSQL performance with ASP.NET

by: murray | last post by:

Large table performance problem with MYSQL performance with ASP.NET I am doing some benchmarking of an application using MYSQL under Windows and I confused about the results. Environment...

MySQL Database

Help! Large table performance problem with MYSQL performance with ASP.NET

by: OctoTools | last post by:

Large table performance problem with MYSQL performance with ASP.NET I am doing some benchmarking of an application using MYSQL under Windows and I confused about the results. Environment...

MySQL Database

Query Performance

by: db2udbgirl | last post by:

Is this possible to tune this query further : Env : DB2 UDB 8.2 on AIX 5.3, Non partitioned tables Query: SELECT ETL.T00601.* FROM ETL.T00601, ETL.STG_LAAM_CARD20_BUS_ASCT_BUS_FCN where

DB2 Database

Improve query performance?

by: colintis | last post by:

From the below SQL statement, can anyone tell me any methods to improve it? As this query has keep running without results in my latest try, I have to break the process as it takes too long (waited...

Microsoft Access / VBA

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

Improve query performance on large table

Similar topics