473,734 Members | 2,576 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

SQL Query x Multiple table Scans

Hello,

Db2 Linux LUW FP 15.

Consider table A (ID varchar, EXECUTION_DATE date).

a) I want to first retrieve all IDs that have not executed during the
last 90 days:

select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)

b) Now I need the last execution for all IDs that have NOT executed
during the last 90 days:

select TABLE_A.ID, MAX(EXECUTION_D ATE) from TABLE_A,

(
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
) as TABLE_C

where TABLE_A.ID = TABLE_C.ID and
EXECUTION_DATE < current date - 90 days

group by TABLE_A.ID

That was easy. The question is ... is there any way to do the same
thing using a better approach, perhaps SINGLE table scan on table_A ?
This table is really big.

TIA, Michel.
Nov 30 '07 #1
12 2446
On Nov 30, 6:33 pm, Michel Esber <mic...@us.auto matos.comwrote:
Hello,

Db2 Linux LUW FP 15.

Consider table A (ID varchar, EXECUTION_DATE date).

a) I want to first retrieve all IDs that have not executed during the
last 90 days:

select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)

b) Now I need the last execution for all IDs that have NOT executed
during the last 90 days:

select TABLE_A.ID, MAX(EXECUTION_D ATE) from TABLE_A,

(
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
) as TABLE_C

where TABLE_A.ID = TABLE_C.ID and
EXECUTION_DATE < current date - 90 days

group by TABLE_A.ID

That was easy. The question is ... is there any way to do the same
thing using a better approach, perhaps SINGLE table scan on table_A ?
This table is really big.

TIA, Michel.
What indexes do you have on table_A and table_B, and what access plan
do you currently have?

/Lennart

Nov 30 '07 #2
On Nov 30, 10:59 pm, Lennart <Erik.Lennart.J ons...@gmail.co mwrote:
On Nov 30, 6:33 pm, Michel Esber <mic...@us.auto matos.comwrote:
Hello,
Db2 Linux LUW FP 15.
Consider table A (ID varchar, EXECUTION_DATE date).
a) I want to first retrieve all IDs that have not executed during the
last 90 days:
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
b) Now I need the last execution for all IDs that have NOT executed
during the last 90 days:
select TABLE_A.ID, MAX(EXECUTION_D ATE) from TABLE_A,
(
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
) as TABLE_C
where TABLE_A.ID = TABLE_C.ID and
EXECUTION_DATE < current date - 90 days
group by TABLE_A.ID
That was easy. The question is ... is there any way to do the same
thing using a better approach, perhaps SINGLE table scan on table_A ?
This table is really big.
TIA, Michel.

What indexes do you have on table_A and table_B, and what access plan
do you currently have?
In addition, what are the number of rows for A and B?

/Lennart

Nov 30 '07 #3
Maybe I'm missing something obvious....:

SELECT ID, MAX(EXECUTION_D ATE) FROM TABLEA GROUP BY ID
HAVING MAX(EXECUTION_D ATE) < CURRENT DATE - 90 DAYS

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Dec 1 '07 #4
On 1 dez, 00:17, Serge Rielau <srie...@ca.ibm .comwrote:
Maybe I'm missing something obvious....:

SELECT ID, MAX(EXECUTION_D ATE) FROM TABLEA GROUP BY ID
HAVING MAX(EXECUTION_D ATE) < CURRENT DATE - 90 DAYS

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Hi Serge,

In fact, this query does return the last execution prior to current
date - 90 days, but it does not consider that IDs should not execute
between current date - 90 days and current date

The above query returns IDs that have and have not executed between
current date - 90 days and current date.

The query I have posted works fine. I am looking for a way to tune
it ... Is there any way to produce the same result set with a single
table scan ?

TIA, Michel.
Dec 2 '07 #5
On 30 nov, 20:00, Lennart <Erik.Lennart.J ons...@gmail.co mwrote:
On Nov 30, 10:59 pm, Lennart <Erik.Lennart.J ons...@gmail.co mwrote:


On Nov 30, 6:33 pm, Michel Esber <mic...@us.auto matos.comwrote:
Hello,
Db2 Linux LUW FP 15.
Consider table A (ID varchar, EXECUTION_DATE date).
a) I want to first retrieve all IDs that have not executed during the
last 90 days:
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
b) Now I need the last execution for all IDs that have NOT executed
during the last 90 days:
select TABLE_A.ID, MAX(EXECUTION_D ATE) from TABLE_A,
(
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
) as TABLE_C
where TABLE_A.ID = TABLE_C.ID and
EXECUTION_DATE < current date - 90 days
group by TABLE_A.ID
That was easy. The question is ... is there any way to do the same
thing using a better approach, perhaps SINGLE table scan on table_A ?
This table is really big.
TIA, Michel.
What indexes do you have on table_A and table_B, and what access plan
do you currently have?

In addition, what are the number of rows for A and B?

/Lennart

Original Statement:
------------------

select R.MACHINE_ID, R.SW_ID, MAX(MONITOR_DAT E)

from ASSET.T_METERIN G R inner join

(
select distinct I.MACHINE_ID, I.SW_ID
from ASSET.TBL_ASSET _SW I
where NOT EXISTS
(select 1
from ASSET.T_METERIN G M
where MONITOR_DATE between current date - 90 days and current
date and
M. MACHINE_ID = I.MACHINE_ID and
M.SW_ID = I.SW_ID
fetch first 1 rows only)
) as C

on (C.MACHINE_ID = R.MACHINE_ID and C.SW_ID = R.SW_ID)

where MONITOR_DATE < current date - 90 days

group by R.MACHINE_ID, R.SW_ID


Access Plan:
-----------
Total Cost: 3507.52
Query Degree: 1

Rows
RETURN
( 1)
Cost
I/O
|
1.60851e-05
GRPBY
( 2)
3507.52
325.697
|
1.60851e-05
TBSCAN
( 3)
3507.52
325.697
|
0.00401062
SORT
( 4)
3507.51
325.697
|
0.00401062
NLJOIN
( 5)
3507.51
325.697
/---+---\
249.338 0.0172951
TBSCAN IXSCAN
( 6) ( 13)
2514.15 25.0315
196.182 1
| |
249.338 18785
SORT INDEX: ASSET
( 7) MTRIDX03
2514.12
196.182
|
249.338
TBSCAN
( 8)
2513.97
196.182
|
249.338
SORT
( 9)
2513.94
196.182
|
249.338
HSJOIN
( 10)
2513.78
196.182
/------+------\
3500.53 4325
IXSCAN IXSCAN
( 11) ( 12)
1034.09 1478.12
137.182 59
| |
18785 4325
INDEX: ASSET INDEX: SYSIBM
MTRIDX03 SQL061207070618 270
Indexes:

Table = T_METERING, index = ASSET.MTRIDX03 Columns = +SW_ID
+MONITOR_DATE+M ACHINE_ID
Table = TBL_ASSET_SW, index = SYSIBM.SQL06120 7070618270 Columns =
+MACHINE_ID+SW_ ID

T_METERING has 20M rows, TBL_ASSET_SW has 5M rows.

Thanks, Michel.
Dec 2 '07 #6
On Nov 30, 9:33 am, Michel Esber <mic...@us.auto matos.comwrote:
Hello,

Db2 Linux LUW FP 15.

Consider table A (ID varchar, EXECUTION_DATE date).

a) I want to first retrieve all IDs that have not executed during the
last 90 days:

select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)

b) Now I need the last execution for all IDs that have NOT executed
during the last 90 days:

select TABLE_A.ID, MAX(EXECUTION_D ATE) from TABLE_A,

(
select distinct ID from table_B where NOT EXISTS
(select 1
from table_A
where EXECUTION_DATE between current date - 90 days and current
date AND
table_A.ID = table_B.ID)
) as TABLE_C

where TABLE_A.ID = TABLE_C.ID and
EXECUTION_DATE < current date - 90 days

group by TABLE_A.ID

That was easy. The question is ... is there any way to do the same
thing using a better approach, perhaps SINGLE table scan on table_A ?
This table is really big.
I read the rest of the thread. I'll see what I can do with this.

First, let's restate the problem: you want the set of all IDs that
have not executed in the last 90 days, and you want the date of their
last execution.

-- id, and the last execution date of all ids that join to the inner
query
SELECT a.id,
max(execution_d ate)
FROM table_a a
JOIN (
-- the set of all ids that have not executed in the last 90
days
SELECT id
FROM table_a
GROUP BY id
HAVING MAX(execution_d ate) < CURRENT DATE - 90 DAYS
) lt90
ON a.id = lt90.id

In your last post on this, it looks like you're interested in SW_ID
and MACHINE_ID, and there is an index that contains those columns plus
the date you're interested in, so I would expect this query to be
reasonably efficient.
Dec 2 '07 #7


Michel Esber wrote:
On 1 dez, 00:17, Serge Rielau <srie...@ca.ibm .comwrote:
>Maybe I'm missing something obvious....:

SELECT ID, MAX(EXECUTION_D ATE) FROM TABLEA GROUP BY ID
HAVING MAX(EXECUTION_D ATE) < CURRENT DATE - 90 DAYS

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Hi Serge,

In fact, this query does return the last execution prior to current
date - 90 days, but it does not consider that IDs should not execute
between current date - 90 days and current date

The above query returns IDs that have and have not executed between
current date - 90 days and current date.

The query I have posted works fine. I am looking for a way to tune
it ... Is there any way to produce the same result set with a single
table scan ?

TIA, Michel.
Serge's query works properly on my LUW FP14 system. I'd recommend trying
a small test case (a sample table containing three rows) to verify that
this is failing on your system. If it's failing, then it needs to be
reported. I strongly doubt that FP15 will break this simple SQL statement.

[psherman@T40 ~]$ db2 "create table t1 (id char(2) not null,
execution_date date not null with default)"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "insert into t1 values ('A','2007-12-01')"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "insert into t1 values ('A','2006-12-01')"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "insert into t1 values ('B','2006-12-01')"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "select id, max(execution_d ate) from t1 group by id"
ID 2
-- ----------
A 12/01/2007
B 12/01/2006

2 record(s) selected.

[psherman@T40 ~]$ db2 "select id, max(execution_d ate) from t1 group by
id having max(execution_d ate) < current_date - 90 days"

ID 2
-- ----------
B 12/01/2006

1 record(s) selected.
Phil Sherman
Dec 2 '07 #8
On 2 dez, 13:01, Phil Sherman <psher...@ameri tech.netwrote:
Michel Esber wrote:
On 1 dez, 00:17, Serge Rielau <srie...@ca.ibm .comwrote:
Maybe I'm missing something obvious....:
SELECT ID, MAX(EXECUTION_D ATE) FROM TABLEA GROUP BY ID
HAVING MAX(EXECUTION_D ATE) < CURRENT DATE - 90 DAYS
Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Hi Serge,
In fact, this query does return the last execution prior to current
date - 90 days, but it does not consider that IDs should not execute
between current date - 90 days and current date
The above query returns IDs that have and have not executed between
current date - 90 days and current date.
The query I have posted works fine. I am looking for a way to tune
it ... Is there any way to produce the same result set with a single
table scan ?
TIA, Michel.

Serge's query works properly on my LUW FP14 system. I'd recommend trying
a small test case (a sample table containing three rows) to verify that
this is failing on your system. If it's failing, then it needs to be
reported. I strongly doubt that FP15 will break this simple SQL statement.

[psherman@T40 ~]$ db2 "create table t1 (id char(2) not null,
execution_date date not null with default)"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "insert into t1 values ('A','2007-12-01')"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "insert into t1 values ('A','2006-12-01')"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "insert into t1 values ('B','2006-12-01')"
DB20000I The SQL command completed successfully.
[psherman@T40 ~]$ db2 "select id, max(execution_d ate) from t1 group by id"
ID 2
-- ----------
A 12/01/2007
B 12/01/2006

2 record(s) selected.

[psherman@T40 ~]$ db2 "select id, max(execution_d ate) from t1 group by
id having max(execution_d ate) < current_date - 90 days"

ID 2
-- ----------
B 12/01/2006

1 record(s) selected.

Phil Sherman-
Phil, Serge and Dean, thank you for your time. Let me try to explain
why this is incorrect:

Consider the following table data:

db2 "select MONITOR_DATE from ASSET.T_METERIN G where
MACHINE_ID='000 B6A10620ED121C9 481542' and SW_ID=77"

MONITOR_DATE
------------
10/17/2007
10/18/2007
11/30/2007

If I run my original query:

db2 "select MACHINE_ID, SW_ID from ASSET.TBL_ASSET _SW S where not
exists (select 1 from ASSET.T_METERIN G M where
M.MACHINE_ID=S. MACHINE_ID and S.SW_ID=M.SW_ID and MONITOR_DATE between
current date - 30 days and current date) and
MACHINE_ID='000 B6A10620ED121C9 481542' and SW_ID=77"

I get no results. That´s abovius, because there was an execution on
11/30/2007. Note that I do need to join with TBL_ASSET_SW for other
purposes.

Now, executing the queries you have suggested:

db2 "select MACHINE_ID,SW_I D from ASSET.T_METERIN G where
MACHINE_ID='000 B6A10620ED121C9 481542' group by MACHINE_ID, SW_ID
having MAX(MONITOR_DAT E) < current date - 90 days"

SW_ID 77 is not returned here, which is nice.

Next:

db2 "select S.MACHINE_ID, S.SW_ID
from ASSET.TBL_ASSET _SW S
where NOT EXISTS (select MACHINE_ID, SW_ID
from ASSET.T_METERIN G M
where M.MACHINE_ID='0 00B6A10620ED121 C9481542'
and
S.MACHINE_ID = M.MACHINE_ID and
S.SW_ID = M.SW_ID
group by MACHINE_ID, SW_ID
having MAX(MONITOR_DAT E) < current date - 30
days)
and MACHINE_ID='000 B6A10620ED121C9 481542' and SW_ID=77"

It does return one row:

MACHINE_ID SW_ID
------------------------ ----------
000B6A10620ED12 1C9481542 77.

I understand what´s happening here, and the result set is not
incorrect (according to the query). But for my scenario, this is just
wrong. I need to certify that the execution did not happen between
current date - X days and current date.

Hope that it makes sense now.

Thanks again.

Dec 2 '07 #9
Michel Esber wrote:
I understand what´s happening here, and the result set is not
incorrect (according to the query).
I do not understand what's happening here.
Hope that it makes sense now.
No it doesn't, you are assuming we have the same local knowledge as you.
1. We are missing the sample data for the asset table
2. You keep flip flopping between 30 days and 90 days which confuses the
matter
3. Shouldn't matter, but: Why are you adding the literal into the NOT
EXISTS with the HAVING?

I propose we restart with a clean slate:
Define both tables, define contents for both. Define the output.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Dec 2 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
7207
by: hrishy | last post by:
Hi All I have the following query to be tuned.. Select distinct PA.PersonAddress_IDX, AT.Name AddressType, A.Line1 Address1, A.Line2 Address2, A.City, A.State, A.County, A.Country, A.PostalCode, A.AllowPostalSoftYN, PA.ChangedBy, PA.ChangedDT, PA.DeletedYN ,PA.Person_Key, PA.Address_Key, PA.AddressType_Key FROM PersonAddress_h PA,Address_h A,AddressType_h AT
4
5777
by: DFS | last post by:
This UNION query is very slow. With only 3,000 records in the Parent table and 7,000 records in the Child table, it takes about 60 seconds to run and returns about 2200 records. Any ideas on speeding it up? Thanks. -- PART 1: HAS NO CHILD RECORDS SELECT P.PROJECT_ID, 'No Child Data' AS SUB_TYPE
1
2708
by: sumGirl | last post by:
Hi all. Anyone know if sql server collects stats on how many full table scans a table gets and if so how do I get at those stats? Trying to track down poorly indexed tables / processes and I am guessing that sql server does have this data secreted away somewhere much like my db2 and informix databases do.
0
1453
by: apb18 | last post by:
A bit of query plan strangeness. Suppose you have an inheritance tree such that the columns 'ID' and 'field' appear in the top level table, call that table XXX. tables YYY and ZZZ both inherit XXX. Now suppose there exists some query that returns a set of IDs that match some criteria (that query may involve various tests/joins/etc on other arbitrary tables). Executing that query alone produces an optimal plan and the exact result set...
7
10812
by: Bing Wu | last post by:
Hi Folks, I have a very large table containing 170 million rows of coordinats: CREATE TABLE "DB2ADMIN"."COORDINATE" ( "FID" INTEGER NOT NULL , "AID" INTEGER NOT NULL , "X" REAL NOT NULL , "Y" REAL NOT NULL , "Z" REAL NOT NULL )
5
6340
by: sql-db2-dba | last post by:
We have DB2 UDB v8.1 fixpak3 on AIX 5. Production and Development configuarations (at least for DB2) are identical albeit production is a 2-way server while development has only one processor. Tables and indexes have the same schema. In fact, the dev database was taken from a prod backup recently. Size of the tables differ slightly. Yet, on a given query (with 4 tables joined), it took 30-50 times longer to run in prod than on development....
9
1500
by: Bernd Hohmann | last post by:
Maybe someone has an idea for this: table "customer_article" contains prices for every article by customer and has about 400.000 now. customer char(5) article char(5) price double There is a table "customer_article_bak" with the same structure but
0
207
by: Daniel Roy | last post by:
In addition to setting this parameter to TRUE, you'll need to either specify the PARALLEL clause for the table(s) involved, or use the relevant hints ("PARALLEL"). Note also that parallelization is available only for some specific operations (such as full table scans). It won't work for searches using a primary key, as is your case. Daniel
1
15982
by: Rudolf Bargholz | last post by:
Hi, We have created triggers to log modifications to tables in our application. The triggers work fine, just on one of the tables in our database the triggers fail with the error message <SQL0437W Performance of this complex query may be sub-optimal. Reason code: "1". SQLSTATE=01602>. The same trigger on other tables works fine (the triggers are autogenerated with the same structure for most of the tables used in our application). I...
0
8777
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9452
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9310
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9237
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8187
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6737
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
4551
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4813
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
2729
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.