473,395 Members | 1,692 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

FILTERING LEFT TABLE IN LEFT OUTER JOIN

I have two tables that are pretty big. I need about 10 rows in the
left table and the right table is filtered to 5 rows as well. It
seems when I join the tables in the FROM clause, I have to filter my
left table in the WHERE clause and cannot filter it in the FROM
clause.

This seems like it would cause a lot of overhead especially when my
left table is ten thousand rows. Am I wrong in thinking that the two
tables get joined with the FROM clause, causing a huge result set,
then get filtered with the WHERE clause?

Does anyone know if maybe I am doing this wrong or thinking about it
wrong?
For example, I want a query to return the number of dollars made in
the state of NY even if a store does not exist in the right table:

----------
table1
----------
store #
state

----------
table2
----------
store #
type
earnings

-WORKS CORRECTLY BUT SEEMS VERY EXPENSIVE IF LEFT AND RIGHT TABLES ARE
BIG
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT')
and t1.state = 'NY'

----------
result
----------
231, $0
232, $10000
233, $50000
234, $0
I cant just join the tables, otherwise, I would get this result set:

----------
result
----------
232, $10000
233, $50000

-I HAVE TRIED THE FOLLOWING, BUT IT DOESN'T WORK
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT' and t1.state =
'NY')

Any advice??????

May 11 '07 #1
9 9194
sh********@gmail.com wrote:
I have two tables that are pretty big. I need about 10 rows in the
left table and the right table is filtered to 5 rows as well. It
seems when I join the tables in the FROM clause, I have to filter my
left table in the WHERE clause and cannot filter it in the FROM
clause.

This seems like it would cause a lot of overhead especially when my
left table is ten thousand rows. Am I wrong in thinking that the two
tables get joined with the FROM clause, causing a huge result set,
then get filtered with the WHERE clause?

Does anyone know if maybe I am doing this wrong or thinking about it
wrong?
For example, I want a query to return the number of dollars made in
the state of NY even if a store does not exist in the right table:

----------
table1
----------
store #
state

----------
table2
----------
store #
type
earnings

-WORKS CORRECTLY BUT SEEMS VERY EXPENSIVE IF LEFT AND RIGHT TABLES ARE
BIG
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT')
and t1.state = 'NY'

----------
result
----------
231, $0
232, $10000
233, $50000
234, $0
I cant just join the tables, otherwise, I would get this result set:

----------
result
----------
232, $10000
233, $50000

-I HAVE TRIED THE FOLLOWING, BUT IT DOESN'T WORK
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT' and t1.state =
'NY')

Any advice??????
This approach should work, but the syntax may be NOT quite right:

select store, coalesce(earnings,0)
from (select * from t1 where state='NY') as ta
left outer
join (select * from t2 where type='RESTAURANT') as tb
on ta.store=tb.store
May 11 '07 #2
Ian
sh********@gmail.com wrote:
I have two tables that are pretty big. I need about 10 rows in the
left table and the right table is filtered to 5 rows as well. It
seems when I join the tables in the FROM clause, I have to filter my
left table in the WHERE clause and cannot filter it in the FROM
clause.
This won't make any difference. DB2 rewrites queries during
optimization -- so putting the predicates in the WHERE clause
as opposed join predicates won't matter.

Does anyone know if maybe I am doing this wrong or thinking about it
wrong?
You're thinking about it wrong.

Are you actually having a performance problem? Or you're just thinking
that there could be a problem?
Any advice??????
Have you done RUNSTATS on your tables? Do you have any indexes on your
tables? What does the access path look like?
May 11 '07 #3
I actually don't have a problem yet on our test boxes since our test
server doesn't contain nearly the amount of data in production. But
we will be sending LIVE soon and I was reviewing this code and it just
brought up a concern. Our LIVE system has much more data, so I just
wanted to make sure I understood things correctly. Thanks for the
advice.

However, I have to disagree with your conclusion as a whole because
using the FROM clause when limiting the right table makes a big
difference than when using the WHERE clause to filter the right
table. I have seen this first hand through testing with this query.

Seems like Bob's response would be efficient. Any thoughts??

May 11 '07 #4
SVK
On May 11, 2:55 am, Ian <ianb...@mobileaudio.comwrote:
shaneva...@gmail.com wrote:
I have two tables that are pretty big. I need about 10 rows in the
left table and the right table is filtered to 5 rows as well. It
seems when I join the tables in the FROM clause, I have to filter my
left table in the WHERE clause and cannot filter it in the FROM
clause.

This won't make any difference. DB2 rewrites queries during
optimization -- so putting the predicates in the WHERE clause
as opposed join predicates won't matter.
Does anyone know if maybe I am doing this wrong or thinking about it
wrong?

You're thinking about it wrong.

Are you actually having a performance problem? Or you're just thinking
that there could be a problem?
Any advice??????

Have you done RUNSTATS on your tables? Do you have any indexes on your
tables? What does the access path look like?
I fully agree with Ian:
1) It needs to be clearly stated: what indexes are available for those
original tables?
2) EXPLAIN must be executed on those various types of possible
queries, to fully understand the access path?
3) The access path may seriously change depending on RUNSTATS
performed or not, and on different RUNSTATS parameters.

Preliminary I can suggest, to improve performance of this particular
query, i.e. the original one:

select store,
coalesce(earnings, '$0')
from table1
LEFT OUTER
JOIN table2
on ( t1.store = t2.store
and t2.type = 'RESTAURANT'
)
and t1.state = 'NY'

The following indexes would be really useful:

create ... index ix1_table1
on table1
(
state asc,
store # asc
);

create ... index ix1_table2
on table2
(
store # asc,
type asc
) ;

Please try it.
I'm pretty sure you will see big difference in performance.

May 11 '07 #5
Hi.

sh********@gmail.com wrote:
However, I have to disagree with your conclusion as a whole because
using the FROM clause when limiting the right table makes a big
difference than when using the WHERE clause to filter the right
table. I have seen this first hand through testing with this query.
When you put a WHERE clause on the right table, it acts to make the
report behave as if it was an inner join. If you are filtering the
right table in the FROM clause, then I think DB2 needs to scan through
the left table (using any filter clauses from either the table
definition or the WHERE clause) and then join to the right table to
see if there are valid rows to connect to; if you filter the right
table in the WHERE clause, then DB2 can re-write the query to first
pick up the right table using that filter clause, and join into the
left table. So, there could be a big performance difference. In any
case, I would expect there to be significant results differences.

-Chris

May 11 '07 #6
Ian
sh********@gmail.com wrote:
I actually don't have a problem yet on our test boxes since our test
server doesn't contain nearly the amount of data in production. But
we will be sending LIVE soon and I was reviewing this code and it just
brought up a concern. Our LIVE system has much more data, so I just
wanted to make sure I understood things correctly. Thanks for the
advice.

However, I have to disagree with your conclusion as a whole because
using the FROM clause when limiting the right table makes a big
difference than when using the WHERE clause to filter the right
table. I have seen this first hand through testing with this query.
OK. But answer the questions I posted earlier. If we're talking
about an untuned system where the optimizer has to make guesses, seeing
big differences is certainly a possibility.
Seems like Bob's response would be efficient. Any thoughts??
Again, this is completely dependent on how DB2 decides to rewrite the
query.

A long time ago, it used to be that you could take advantage of
"opportunities" within the optimizer to cause certain behavior.
However, the optimizer query-rewrite phase is generally a lot
smarter than any of us. So the chances of you being able to outsmart
it are slim.

And of course, you can verify this by simply looking at the explains.
Ian
May 11 '07 #7
thanks for all the feedback. this is very helpful. I am not sure on
the statistics nor am I sure indexes. I have to get the DBA's
involved since I do not have access to do this stuff on the boxes. I
will take what I have found out here and apply and research. Thanks
again!

May 12 '07 #8
On May 11, 9:06 am, shaneva...@gmail.com wrote:
-WORKS CORRECTLY BUT SEEMS VERY EXPENSIVE IF LEFT AND RIGHT TABLES ARE
BIG
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT')
and t1.state = 'NY'

----------
result
----------
231, $0
232, $10000
233, $50000
234, $0
I doubt this query works corrrectly. Because, ON condition of LEFT
OUTER JOIN specifies selection condition of rows of RIGHT TABLE, and
all rows of LEFT TABLE would be selected even if the ON condition
includes predicate for LEFT TABLE(In your example t1.state = 'NY').
I want to recommend you to add rows t1.state <'NY' then execute the
Query again. You will see to be selected those added rows with
t1.state <'NY', even you had specified " and t1.state = 'NY' ".
While rows of RIGHT TABLE must not be selected even satisfied
predicates for RIGHT TABLE(i.e. t1.store = t2.store and t2.type =
'RESTAURANT').
If you want select LEFT TABLE, you should specify selection condition
in WHERE caluse. If wrote in WHERE clause DB2 is enough clever(as Ian
wrote). So, you need not worry about performance.
Indexes recommended by SVK must be effective.
Again, I want to recommend you to try following Query with that
indexes and see access path by EXPLAIN.
select store, coalesce(earnings, '$0')
from table1 LEFT OUTER JOIN
table2
on t1.store = t2.store and t2.type = 'RESTAURANT'
WHERE t1.state = 'NY'
;
Another point is usuallly parentheses in conditions are not influenced
on access path(then performance), if all predicate are connected by
only AND or only OR(In other words there's no mixture of AND and OR).
Parentheses are only influenced on readability of the conditions, if
predicates are not connected with mixture of AND and OR.

May 13 '07 #9
On 10 May 2007 17:06:49 -0700, sh********@gmail.com wrote:
>I have two tables that are pretty big. I need about 10 rows in the
left table and the right table is filtered to 5 rows as well. It
seems when I join the tables in the FROM clause, I have to filter my
left table in the WHERE clause and cannot filter it in the FROM
clause.

This seems like it would cause a lot of overhead especially when my
left table is ten thousand rows. Am I wrong in thinking that the two
tables get joined with the FROM clause, causing a huge result set,
then get filtered with the WHERE clause?

Does anyone know if maybe I am doing this wrong or thinking about it
wrong?
For example, I want a query to return the number of dollars made in
the state of NY even if a store does not exist in the right table:

----------
table1
----------
store #
state

----------
table2
----------
store #
type
earnings

-WORKS CORRECTLY BUT SEEMS VERY EXPENSIVE IF LEFT AND RIGHT TABLES ARE
BIG
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT')
and t1.state = 'NY'

----------
result
----------
231, $0
232, $10000
233, $50000
234, $0
I cant just join the tables, otherwise, I would get this result set:

----------
result
----------
232, $10000
233, $50000

-I HAVE TRIED THE FOLLOWING, BUT IT DOESN'T WORK
select store, coalesce(earnings, '$0') from table1 LEFT OUTER JOIN
table2
on (t1.store = t2.store and t2.type = 'RESTAURANT' and t1.state =
'NY')

Any advice??????
Depending on the amount of information involved, a subquery might be
the better solution:

SELECT
Store,
(
SELECT
COASLESCE(Earnings, '$0')
FROM
Table2
WHERE
Table2.Store = Table1.Store
AND Table2.Type = 'RESTAURANT'
)
FROM
Table1
WHERE
Table1.State = 'NY'

B.
May 16 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: KT | last post by:
This might not be possible, but on the chance that it can - is there a way to do the following: Given a arbitray one dimesional value list: ('AALGX','12345','XXXXX','AAINX','AMMXX') Is there...
8
by: Matt | last post by:
Hello I have to tables ar and arb, ar holds articles and a swedish description, arb holds descriptions in other languages. I want to retreive all articles that match a criteria from ar and...
3
by: Dam | last post by:
Using SqlServer : Query 1 : SELECT def.lID as IdDefinition, TDC_AUneValeur.VALEURDERETOUR as ValeurDeRetour FROM serveur.Data_tblDEFINITIONTABLEDECODES def,...
0
by: Steve Claflin | last post by:
I'm using mysql 4.0.x, which does not allow the following query (which works fine in 4.1). Is there any way I can issue a single query to achieve the same results? (I want a list of all records...
3
by: Martin | last post by:
Hello everybody, I have the following question. As a join clause on Oracle we use " table1.field1 = table2.field1 (+) " On SQL Server we use " table1.field1 *= table2.field1 " Does DB2...
4
by: Anthony Robinson | last post by:
I was actually just wondering if someone could possibly take a look and tell me what I may be doing wrong in this query? I keep getting ambiguous column errors and have no idea why...? Thanks in...
1
by: Bob Alston | last post by:
I am trying to use an access sql statement in a module that does a left outer join on one table - joined to a table that has a selection criteria. The result is an inner join. If I do this in...
5
by: kumar_rangan1976 | last post by:
I need the below sybase code to be migrated in UDB : select distinct c.partnumber as I_PART, case when d.IntegratorID = 'DCX05' then 'U' when d.IntegratorID = 'DCX04' then 'M'...
2
by: Rhishabh07 | last post by:
difference between (left, right, inner, outer) join? anybody can help me? Thanx
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.