By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,835 Members | 1,454 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,835 IT Pros & Developers. It's quick & easy.

rows read vs. rows selected.

P: n/a
Hi All:

I have a query which is running against large table. The query:

SELECT DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE,
COUNT(*) FROM ERD.ADDRESSA GROUP BY DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3,
CITY, STATE, POSTAL_CODE HAVING COUNT(*) > 2

The applcation snapshot monitor shows:
Rows selected = 181
Rows read = 646485498
Rows written = 933743357
The rows read is too high compared to rows selected.
(The table has 933743360 rows)

My questions is: What's the best way to increase selectivity of this
query?
Thanks in Advance.

Vijay
The plan is using the correct index:

RETURN
( 1)
|
FILTER
( 2)
|
GRPBY
( 3)
|
TBSCAN
( 4)
|
SORT
( 5)
|
IXSCAN
( 6)
/ \
Index: Table:
ERD ERD
ADDR_QRYA ADDRESSA

Nov 12 '05 #1
Share this Question
Share on Google+
3 Replies


P: n/a
"UDBDBA" <vi***********@gmail.com> wrote in message
news:11**********************@g49g2000cwa.googlegr oups.com...
Hi All:

I have a query which is running against large table. The query:

SELECT DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE,
COUNT(*) FROM ERD.ADDRESSA GROUP BY DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3,
CITY, STATE, POSTAL_CODE HAVING COUNT(*) > 2

The applcation snapshot monitor shows:
Rows selected = 181
Rows read = 646 485 498
Rows written = 933 743 357
The rows read is too high compared to rows selected.
(The table has 933 743 360 rows)

My questions is: What's the best way to increase selectivity of this
query?
Thanks in Advance.

Vijay

If you want to increase selectivity, you need to use a WHERE clause before
the group by. If you want to evaluate every row, then you cannot do that.

I don't think that the snapshot monitor data you are looking at is telling
you what you think it is.

The best way to improve the performance on a table scan on a very large
table is to use multiple containers (on separate physical devices) and to
enable intra-partition parallelism. Make sure your prefetch size is n times
your extent size of the tablespace (where n is the number of containers).
You could also consider splitting the data into multiple tables and using a
UNION ALL view, which would also enable parallelism. The degree of
parallelism you want depends on the number of CPUs and separate physical
disk devices you have.

You should also look sort heaps, temporary tablespaces, and make sure they
are large enough.

If your application does mostly table scans (data warehouse application),
increase your tablespace page size to 16K or 32K (this will hurt your OLTP
transactions if you have any).
Nov 12 '05 #2

P: n/a
UDBDBA wrote:
Hi All:

I have a query which is running against large table. The query:

SELECT DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE,
COUNT(*) FROM ERD.ADDRESSA GROUP BY DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3,
CITY, STATE, POSTAL_CODE HAVING COUNT(*) > 2

The applcation snapshot monitor shows:
Rows selected = 181
Rows read = 646485498
Rows written = 933743357
The rows read is too high compared to rows selected.
(The table has 933743360 rows)

My questions is: What's the best way to increase selectivity of this
query?
Thanks in Advance.

Vijay
The plan is using the correct index:

RETURN
( 1)
|
FILTER
( 2)
|
GRPBY
( 3)
|
TBSCAN
( 4)
|
SORT
( 5)
|
IXSCAN
( 6)
/ \
Index: Table:
ERD ERD
ADDR_QRYA ADDRESSA

What's the ADDR_QRYA index defined on?
Do you ahve an index on your group by columns (in that order!):
(DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE)
Reason why I'm asking is that the SORT should go (do you have sort
overflows? (rows written...??).

A totally different approach could be to employ an MQT.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #3

P: n/a
The index definition:
create UNIQUE index addr_qrya on erd.addressa (DSRC_ACCT_ID,ADDR_ID)
INCLUDE (ADDR1,ADDR2, ADDR3, CITY, STATE, POSTAL_CODE) ALLOW REVERSE
SCANS;

The access graph shows an index only access, but the SORT and TBSCAN in
access graph .. can you explain it?

We evaluated MQT vs. Adding an extra index with INCLUDE columns. Seems
like this index could be used by other queries and saves us the
overhead of double the operation (update,insert.logging,locking...) by
having MQT.

Thanks!
Vijay

Nov 12 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.