
Improve query performance on large table

Hi Folks,

I have a very large table containing 170 million rows of coordinates:

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;

ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
ALTER TABLE "DB2ADMIN"."COORDINATE" ADD CONSTRAINT
"PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

When I issue:

SELECT X,Y,Z FROM COORDINATE WHERE FID<1000;

it takes more than 10 minutes to finish. The system is a Sun Cobalt box (dual PIII 1.4 GHz, 1 GB memory), and I have set the buffer pool to 500 MB. Is there any other way to improve the performance?

A strange thing is that when I join the same query with another table, it runs faster:

SELECT C.X,C.Y,C.Z FROM COORDINATE AS C, FRAME AS F
WHERE C.FID=F.ID AND F.ID<1000

It takes only 7 minutes. FID is a foreign key referencing table 'FRAME'.

Please advise.

Thanks,

Bing

Nov 12 '05 #1

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bo**********@news.ox.ac.uk...
Hi Folks,

I have a very large table containing 170 million rows of coordinats:

CREATE TABLE "DB2ADMIN"."COORDINATE" (
"FID" INTEGER NOT NULL ,
"AID" INTEGER NOT NULL ,
"X" REAL NOT NULL ,
"Y" REAL NOT NULL ,
"Z" REAL NOT NULL )
IN "USERSPACE1" ;

ALTER TABLE "DB2ADMIN"."COORDINATE" APPEND ON;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE1" ON "DB2ADMIN"."COORDINATE"
("FID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
CREATE INDEX "DB2ADMIN"."IDX_COORDINATE2" ON "DB2ADMIN"."COORDINATE"
("AID" ASC) PCTFREE 0 ALLOW REVERSE SCANS;
ALTER TABLE "DB2ADMIN"."COORDINATE" ADD CONSTRAINT
"PK_COORDINATE8" PRIMARY KEY ("FID", "AID");

When I issue:

SELECT X,Y,Z FROM COORDINATE WHERE FID<1000;

it takes more than 10 minutes to finish. The system runs on a Sun Cobalt box (dual-PIII 1.4G, 1GB mem). I have tuned the Buffer Pool to 500MB. Is
there any other way to improve the performance?
A strange thing is, when I join the above SQL with another table, it reduces the time:
SELECT C.X,C.Y,C.Z FROM COORDINATE AS C, FRAME AS F
WHERE C.FID=F.ID AND F.ID<1000

It takes only 7 minutes. FID is a foreign key of table 'FRAME'.

Please advice.

Thanks,

Bing


Did you run the RUNSTATS utility with full statistics on the table and all
indexes?

RUNSTATS ON TABLE table-name WITH DISTRIBUTION ON ALL COLUMNS AND DETAILED
INDEXES ALL

What percent of the table is returned with this query?

Did you do a visual explain on this query? If so, is the index used?

If you only tested the first query once, the difference could simply be that
the buffer pools were already loaded when you ran the second query.
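
Roughly, the sequence I have in mind from the command line processor is something like this (a sketch only -- MYDB is a placeholder for your database name, and the EXPLAIN statement assumes you have already created the explain tables from sqllib/misc/EXPLAIN.DDL):

CONNECT TO MYDB;

-- Collect detailed table and index statistics, including distribution
-- statistics, so the optimizer can estimate the selectivity of FID < 1000.
RUNSTATS ON TABLE DB2ADMIN.COORDINATE
  WITH DISTRIBUTION ON ALL COLUMNS AND DETAILED INDEXES ALL;

-- Capture the access plan of the slow statement; format it with db2exfmt
-- (or open it in Visual Explain) and check whether IDX_COORDINATE1 is used.
EXPLAIN PLAN FOR
  SELECT X, Y, Z FROM COORDINATE WHERE FID < 1000;

CONNECT RESET;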
Nov 12 '05 #2
Hi Bing,

I guess you did not
RUNSTATS ON TABLE table-name WITH DISTRIBUTION AND DETAILED INDEXES ALL
after creating the index and filling the table with data?

Without distribution statistics, FID < 1000 would be estimated to match
roughly 50% of the rows (85 million), which would disqualify the index.
If you actually expect a relatively small result set (e.g. 1000 FIDs),
collecting distribution statistics should help.
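
To verify afterwards that the distribution statistics made it into the catalog, something like this should do it (a rough sketch; column names are per the DB2 V8 SYSCAT views, so check against your release):

-- Frequency ('F') and quantile ('Q') statistics collected by
-- RUNSTATS ... WITH DISTRIBUTION are stored here. If this returns no rows
-- for FID, the optimizer still falls back to default selectivity estimates.
SELECT TYPE, SEQNO, COLVALUE, VALCOUNT
FROM SYSCAT.COLDIST
WHERE TABSCHEMA = 'DB2ADMIN'
  AND TABNAME   = 'COORDINATE'
  AND COLNAME   = 'FID'
ORDER BY TYPE, SEQNO;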

Joachim Banzhaf

Bing Wu wrote:
[original question quoted in full -- snipped]
Nov 12 '05 #3
Thanks,

The query returns about 5% of the rows. Here is the output from dynexpln:

SQL Statement:
DECLARE C1 CURSOR
FOR
select c.x, c.y, c.z
from coordinate as c, frame as f
where f.tid=1 and c.fid=f.id and f.id<1000
Section Code Page = 819

Estimated Cost = 4862935.000000
Estimated Cardinality = 3394883.500000

Access Table Name = DB2ADMIN.FRAME ID = 2,38
| #Columns = 0
| Index Scan: Name = DB2ADMIN.PK_FRAME3 ID = 5
| | Regular Index (Not Clustered)
| | Index Columns:
| | | 1: ID (Ascending)
| | #Key Columns = 1
| | | Start Key: Beginning of Index
| | | Stop Key: Exclusive Value
| | | | 1: 1000
| | Index-Only Access
| | Index Prefetch: None
| | Insert Into Sorted Temp Table ID = t1
| | | #Columns = 1
| | | #Sort Key Columns = 1
| | | | Key 1: (Ascending)
| | | Sortheap Allocation Parameters:
| | | | #Rows = 804
| | | | Row Width = 12
| | | Piped
| | | Duplicate Elimination
| Isolation Level: Uncommitted Read
| Lock Intents
| | Table: Intent None
| | Row : None
Sorted Temp Table Completion ID = t1
List Prefetch Preparation
| Access Table Name = DB2ADMIN.FRAME ID = 2,38
| | #Columns = 1
| | Relation Scan
| | | Prefetch: 50 Pages
| | Lock Intents
| | | Table: Intent Share
| | | Row : Next Key Share
| | Sargable Predicate(s)
| | | #Predicates = 2
| | Process Build Table for Hash Join
Hash Join
| Early Out: Single Match Per Outer Row
| Estimated Build Size: 12093
| Estimated Probe Size: 220287872
| Access Table Name = DB2ADMIN.COORDINATE ID = 2,6
| | #Columns = 0
| | Index Scan: Name = DB2ADMIN.IDX_COORDINATE1 ID = 2
| | | Regular Index (Not Clustered)
| | | Index Columns:
| | | | 1: FID (Ascending)
| | | #Key Columns = 1
| | | | Start Key: Beginning of Index
| | | | Stop Key: Exclusive Value
| | | | | 1: 1000
| | | Index-Only Access
| | | Index Prefetch: Eligible 31867
| | | Insert Into Sorted Temp Table ID = t2
| | | | #Columns = 1
| | | | #Sort Key Columns = 1
| | | | | Key 1: (Ascending)
| | | | Sortheap Allocation Parameters:
| | | | | #Rows = 6835684
| | | | | Row Width = 12
| | | | Piped
| | | | Duplicate Elimination
| | Isolation Level: Uncommitted Read
| | Lock Intents
| | | Table: Intent None
| | | Row : None
| Sorted Temp Table Completion ID = t2
| List Prefetch Preparation
| | Access Table Name = DB2ADMIN.COORDINATE ID = 2,6
| | | #Columns = 4
| | | Relation Scan
| | | | Prefetch: 1029488 Pages
| | | Lock Intents
| | | | Table: Intent Share
| | | | Row : Next Key Share
| | | Sargable Predicate(s)
| | | | #Predicates = 1
| | | Process Probe Table for Hash Join
Return Data to Application
| #Columns = 3

I did test both queries several times, and query 2 still finishes about 3 minutes ahead of query 1. Any thoughts?

Cheers,

Bing
Mark A wrote:
[previous reply quoted in full -- snipped]
Nov 12 '05 #4

"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:3F**************@biop.ox.ac.uk...
Thanks,

The query returns 5% of rows. I have the result from dynexpln:

SQL Statement:
DECLARE C1 CURSOR
FOR
select c.x, c.y, c.z
from coordinate as c, frame as f
where f.tid=1 and c.fid=f.id and f.id<1000
Section Code Page = 819

Estimated Cost = 4862935.000000
Estimated Cardinality = 3394883.500000

Access Table Name = DB2ADMIN.FRAME ID = 2,38
| #Columns = 0
| Index Scan: Name = DB2ADMIN.PK_FRAME3 ID = 5
| | Regular Index (Not Clustered)
| | Index Columns:
| | | 1: ID (Ascending)
| | #Key Columns = 1
| | | Start Key: Beginning of Index
| | | Stop Key: Exclusive Value
| | | | 1: 1000
| | Index-Only Access
| | Index Prefetch: None
| | Insert Into Sorted Temp Table ID = t1
| | | #Columns = 1
| | | #Sort Key Columns = 1
| | | | Key 1: (Ascending)
| | | Sortheap Allocation Parameters:
| | | | #Rows = 804
| | | | Row Width = 12
| | | Piped
| | | Duplicate Elimination
| Isolation Level: Uncommitted Read
| Lock Intents
| | Table: Intent None
| | Row : None
Sorted Temp Table Completion ID = t1
List Prefetch Preparation
| Access Table Name = DB2ADMIN.FRAME ID = 2,38
| | #Columns = 1
| | Relation Scan
| | | Prefetch: 50 Pages
| | Lock Intents
| | | Table: Intent Share
| | | Row : Next Key Share
| | Sargable Predicate(s)
| | | #Predicates = 2
| | Process Build Table for Hash Join
Hash Join
| Early Out: Single Match Per Outer Row
| Estimated Build Size: 12093
| Estimated Probe Size: 220287872
| Access Table Name = DB2ADMIN.COORDINATE ID = 2,6
| | #Columns = 0
| | Index Scan: Name = DB2ADMIN.IDX_COORDINATE1 ID = 2
| | | Regular Index (Not Clustered)
| | | Index Columns:
| | | | 1: FID (Ascending)
| | | #Key Columns = 1
| | | | Start Key: Beginning of Index
| | | | Stop Key: Exclusive Value
| | | | | 1: 1000
| | | Index-Only Access
| | | Index Prefetch: Eligible 31867
| | | Insert Into Sorted Temp Table ID = t2
| | | | #Columns = 1
| | | | #Sort Key Columns = 1
| | | | | Key 1: (Ascending)
| | | | Sortheap Allocation Parameters:
| | | | | #Rows = 6835684
| | | | | Row Width = 12
| | | | Piped
| | | | Duplicate Elimination
| | Isolation Level: Uncommitted Read
| | Lock Intents
| | | Table: Intent None
| | | Row : None
| Sorted Temp Table Completion ID = t2
| List Prefetch Preparation
| | Access Table Name = DB2ADMIN.COORDINATE ID = 2,6
| | | #Columns = 4
| | | Relation Scan
| | | | Prefetch: 1029488 Pages
| | | Lock Intents
| | | | Table: Intent Share
| | | | Row : Next Key Share
| | | Sargable Predicate(s)
| | | | #Predicates = 1
| | | Process Probe Table for Hash Join
Return Data to Application
| #Columns = 3
I did test two queries several times and it still truns out query2 3 mins ahead of query1. Any thoughts?
Cheers,

Bing

If you did the RUNSTATS requested with detailed statistics (you did not say
above), and this is static SQL, then you need to rebind your package.

You did not show the access plan of query 1, but it may be that it just
takes that long to return the 200,000 rows (5%).

In any case, I would use the Performance Wizard in the Control Center to tune
database parameters such as the sort heaps.
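
For what it's worth, a rough sketch of the rebind and sort heap changes from the command line (MYDB and MYPKG are placeholders, and the sizes are examples only, not recommendations for your machine):

CONNECT TO MYDB;

-- Only relevant if the query is bound into a static package.
REBIND PACKAGE MYPKG;

-- Per-database sort heap (in 4 KB pages) and the instance-wide sort threshold.
-- Depending on the release, the new values may not take effect until the
-- database is deactivated and reactivated.
UPDATE DB CFG FOR MYDB USING SORTHEAP 4096;
UPDATE DBM CFG USING SHEAPTHRES 20000;

CONNECT RESET;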
Nov 12 '05 #5

Mark A wrote:

If you did the RUNSTATS requested with detailed statistics (you did not say
above), and this is static SQL, then you need to rebind your package.
Yes, I did run the runstats.
You did not show the access plan of query 1, but it may be that it just
takes that long to return the 200,000 rows (5%).

Ah, the returned rows are actually 8.5 million (about 500 MB when written to a file). But 10 minutes still seems a bit long...
However I would use the Performance Wizard in the Control Center to tune the
database parameters, such as sort heaps, etc.


Thanks for the info. I'll try to figure it out.

Bing

Nov 12 '05 #6
"Bing Wu" <bi**@biop.ox.ac.uk> wrote in message
news:bo**********@news.ox.ac.uk...

Ah, the returned rows are actually 8.5 million (about 500 MB when written to a file). But 10 minutes still seems a bit long...

Why not take the output file and copy it to another file to see how long it
takes for the operating system to create a file that large.
Nov 12 '05 #7
Thanks Mark. The file copy took only half a minute to complete, so I suspect some database parameters still need tuning. In a MySQL implementation I did earlier, the same SQL on the same data set took only 3 minutes...
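
In case it helps anyone following along, this is roughly what I am looking at to see the current settings (MYDB is just a placeholder for my database name):

-- Database-level configuration: SORTHEAP, buffer pool and prefetch settings.
GET DB CFG FOR MYDB SHOW DETAIL

-- Instance-level settings such as SHEAPTHRES.
GET DBM CFG

-- Logical vs. physical buffer pool reads, to see whether the 500 MB buffer
-- pool is actually being hit (needs the buffer pool monitor switch on).
GET SNAPSHOT FOR BUFFERPOOLS ON MYDB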

Bing

Mark A wrote:
[previous reply quoted in full -- snipped]


Nov 12 '05 #8


