473,396 Members | 1,923 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

rows read vs. rows selected.

Hi All:

I have a query which is running against large table. The query:

SELECT DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE,
COUNT(*) FROM ERD.ADDRESSA GROUP BY DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3,
CITY, STATE, POSTAL_CODE HAVING COUNT(*) > 2

The applcation snapshot monitor shows:
Rows selected = 181
Rows read = 646485498
Rows written = 933743357
The rows read is too high compared to rows selected.
(The table has 933743360 rows)

My questions is: What's the best way to increase selectivity of this
query?
Thanks in Advance.

Vijay
The plan is using the correct index:

RETURN
( 1)
|
FILTER
( 2)
|
GRPBY
( 3)
|
TBSCAN
( 4)
|
SORT
( 5)
|
IXSCAN
( 6)
/ \
Index: Table:
ERD ERD
ADDR_QRYA ADDRESSA

Nov 12 '05 #1
3 5829
"UDBDBA" <vi***********@gmail.com> wrote in message
news:11**********************@g49g2000cwa.googlegr oups.com...
Hi All:

I have a query which is running against large table. The query:

SELECT DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE,
COUNT(*) FROM ERD.ADDRESSA GROUP BY DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3,
CITY, STATE, POSTAL_CODE HAVING COUNT(*) > 2

The applcation snapshot monitor shows:
Rows selected = 181
Rows read = 646 485 498
Rows written = 933 743 357
The rows read is too high compared to rows selected.
(The table has 933 743 360 rows)

My questions is: What's the best way to increase selectivity of this
query?
Thanks in Advance.

Vijay

If you want to increase selectivity, you need to use a WHERE clause before
the group by. If you want to evaluate every row, then you cannot do that.

I don't think that the snapshot monitor data you are looking at is telling
you what you think it is.

The best way to improve the performance on a table scan on a very large
table is to use multiple containers (on separate physical devices) and to
enable intra-partition parallelism. Make sure your prefetch size is n times
your extent size of the tablespace (where n is the number of containers).
You could also consider splitting the data into multiple tables and using a
UNION ALL view, which would also enable parallelism. The degree of
parallelism you want depends on the number of CPUs and separate physical
disk devices you have.

You should also look sort heaps, temporary tablespaces, and make sure they
are large enough.

If your application does mostly table scans (data warehouse application),
increase your tablespace page size to 16K or 32K (this will hurt your OLTP
transactions if you have any).
Nov 12 '05 #2
UDBDBA wrote:
Hi All:

I have a query which is running against large table. The query:

SELECT DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE,
COUNT(*) FROM ERD.ADDRESSA GROUP BY DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3,
CITY, STATE, POSTAL_CODE HAVING COUNT(*) > 2

The applcation snapshot monitor shows:
Rows selected = 181
Rows read = 646485498
Rows written = 933743357
The rows read is too high compared to rows selected.
(The table has 933743360 rows)

My questions is: What's the best way to increase selectivity of this
query?
Thanks in Advance.

Vijay
The plan is using the correct index:

RETURN
( 1)
|
FILTER
( 2)
|
GRPBY
( 3)
|
TBSCAN
( 4)
|
SORT
( 5)
|
IXSCAN
( 6)
/ \
Index: Table:
ERD ERD
ADDR_QRYA ADDRESSA

What's the ADDR_QRYA index defined on?
Do you ahve an index on your group by columns (in that order!):
(DSRC_ACCT_ID, ADDR1, ADDR2, ADDR3, CITY, STATE, POSTAL_CODE)
Reason why I'm asking is that the SORT should go (do you have sort
overflows? (rows written...??).

A totally different approach could be to employ an MQT.

Cheers
Serge
--
Serge Rielau
DB2 SQL Compiler Development
IBM Toronto Lab
Nov 12 '05 #3
The index definition:
create UNIQUE index addr_qrya on erd.addressa (DSRC_ACCT_ID,ADDR_ID)
INCLUDE (ADDR1,ADDR2, ADDR3, CITY, STATE, POSTAL_CODE) ALLOW REVERSE
SCANS;

The access graph shows an index only access, but the SORT and TBSCAN in
access graph .. can you explain it?

We evaluated MQT vs. Adding an extra index with INCLUDE columns. Seems
like this index could be used by other queries and saves us the
overhead of double the operation (update,insert.logging,locking...) by
having MQT.

Thanks!
Vijay

Nov 12 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: Mojtaba Faridzad | last post by:
Hi, with SetDataBinding( ) a DataGrid shows a DataView. user can select some rows in the grid by holding cotrol key. when user clicks on Delete button, I should delete all selected rows. I am...
3
by: BBFrost | last post by:
Ok, I know how to count the number of selected datagrid rows using the code below. What has me stumped is how to determine when the selected rows within a datagrid have been changed. The...
5
by: BBFrost | last post by:
Win2000 ..Net 1.1 SP1 c# using Visual Studio Ok, I'm currently in a "knock down - drag out" tussle with the .Net 1.1 datagrid. I've come to realize that a 'block' of rows highlighted within...
1
by: Alex K. | last post by:
I need to be able to jump to first selected row in DataGrid. How can I do this without using IsSelected for each row? Is there some sort of collection which contains all selected rows? Thank you.
1
by: Jon | last post by:
Question: does datagrid1.isSelected(i) point to the same row as datatable.row(i).delete after datagrid sorted?? I am using datagrid1.isSelected(i) to identify datatable rows that have been...
2
by: Ettenurb | last post by:
I was hoping someone has come across this and came up with a solution. We have upgraded our custom software to us Infragistics UltraWinGrid 2006 CLR 2.0. The code below worked with a previous...
0
by: Tim Kelley | last post by:
I have a dataviewgrid in my project which the user will select multiple rows. They will click a button and my program will perform some operation on the selected rows. My question is, how do I...
3
by: dianaj86 | last post by:
I have multiple dropdownlists each one filled with values from a specific column in the table. Also I have multiple textboxes corresponding to dropdownlists. For example, when I select an item from...
11
by: patsman77 | last post by:
Hello All, I am having a frustrating time trying to figure this out. I have a form, in which I want to have fields WEEK, ID, FAVORITE, UNDERDOG, SPREAD, FSCORE, USCORE, and TIME. The form is...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.