Avoiding sequential scans with OR join condition

Mike Mascari

Hello. I have a query like:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y IN (big_table.y1, big_table.y2);

I have indexes on both big_table.y1 and big_table.y2 and on
little_table.x and little_table.y. The result is a sequential scan of
big_table. In order to prevent this, I've rewritten the query as:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y1
UNION
SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y2

which does allow an index scan, but suffers from two separate queries
along with a unique sort, which, from the data, represents 90% of the
tuples returned by both queries.

Is there any way to write the first query such that indexes will be used?

Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #1

Subscribe Reply

3600

Janning Vygen

Am Samstag, 16. Oktober 2004 07:23 schrieb Mike Mascari:

Hello. I have a query like:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y IN (big_table.y1, big_table.y2);

I have indexes on both big_table.y1 and big_table.y2 and on
little_table.x and little_table.y. The result is a sequential scan of
big_table. In order to prevent this,
Maybe the postgres planner decided to choose a seq scan because the planner
thinks it is faster, and often it is right. Did you vacuum analyze before?

try:
VACCUM ANALYZE;
SET enable_seq_scan to off;
EXPLAIN ANALYZE <your query>
SET enable_seq_scan to on;
EXPLAIN ANALYZE <your query>

you will see why postgres planner did choose a seq scan and if it was right to
do so (but never disable seq scan on production environment, not even for one
query. you do not want it.)

(i hope syntax is correct otherwise consult the manual)
I've rewritten the query as:
SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y1
UNION
SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y2

which does allow an index scan, but suffers from two separate queries
along with a unique sort, which, from the data, represents 90% of the
tuples returned by both queries.
this is the reason it seems why postgres choose a seq scan in the first query.
if it has to scan 90% of data anyway, it is faster than doing two index
lookups before.
Is there any way to write the first query such that indexes will be used?
i do not know your db design but it looks queer to me to have a big_table with
two columns y1 and y2 which seems to have the same meaning (some value which
is compared to another value of little_table).

why dont you put just one column "y" in your big_table?

kind regards,
janning
Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 23 '05 #2

Martijn van Oosterhout

If the problem is the sort, use UNION ALL.

As for the query restructuring, I don't know if there is a way of
restructuring the query to do it in a single query. You would be able
to contruct a query plan that would do it, something like:

-> Nested Loop
-> Append
-> Index Scan on big_table.y1
-> Index Scan on big_table.y2
-> Index Scan on little_table

But I have no idea how to get PostgreSQL to produce this...

On Sat, Oct 16, 2004 at 01:23:09AM -0400, Mike Mascari wrote:

Hello. I have a query like:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y IN (big_table.y1, big_table.y2);

I have indexes on both big_table.y1 and big_table.y2 and on
little_table.x and little_table.y. The result is a sequential scan of
big_table. In order to prevent this, I've rewritten the query as:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y1
UNION
SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y2

which does allow an index scan, but suffers from two separate queries
along with a unique sort, which, from the data, represents 90% of the
tuples returned by both queries.

Is there any way to write the first query such that indexes will be used?

Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly
--
Martijn van Oosterhout <kl*****@svana. org> http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQFBcS3mY5T wig3Ge+YRAujSAK C1CPpjQnNccACK9 X0hq9uN5ByLzgCe Ms+N
vkNJj2sF+ADGXB8 x5UyZ8rA=
=Xcfb
-----END PGP SIGNATURE-----

Nov 23 '05 #3

Tom Lane

Mike Mascari <ma*****@mascar i.com> writes:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y IN (big_table.y1, big_table.y2); Is there any way to write the first query such that indexes will be used?

I'm afraid you're stuck with the UNION workaround. The planner's
treatment of OR indexscans is entirely separate from its treatment of
join indexscans, so it's just not capable of forming the sort of plan
you are envisioning. It'd be nice to improve that someday, but it'd
take either a pile of duplicate code, or a fairly thorough rewrite
of indxpath.c/orindxpath.c.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddres sHere" to ma*******@postg resql.org)

Nov 23 '05 #4

Sim Zacks

I would use 2 left joins and use the where condition to make sure one
of them is true, such as:

select big_table.* from
big_table left join little_table as l1 on big_table.y1=l1 .y and
l1.x=10
left join little_table as l2 on big_table.y2=l2 .y and l1.x=10
where l1.p_key is not null and l2.p_key is not null

I have never tried this in postgresql, but in my experience with
various other DB engines it is a lot faster then using an or in the
join and faster then a union.

Thank You
Sim Zacks
IT Manager
CompuLab
04-829-0145 - Office
04-832-5251 - Fax

_______________ _______________ _______________ _______________ _______________ _____

Hello. I have a query like:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y IN (big_table.y1, big_table.y2);

I have indexes on both big_table.y1 and big_table.y2 and on
little_table.x and little_table.y. The result is a sequential scan of
big_table. In order to prevent this, I've rewritten the query as:

SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y1
UNION
SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
little_table.y = big_table.y2

which does allow an index scan, but suffers from two separate queries
along with a unique sort, which, from the data, represents 90% of the
tuples returned by both queries.

Is there any way to write the first query such that indexes will be used?

Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #5

Mike Mascari

Sim Zacks wrote:

I would use 2 left joins and use the where condition to make sure one
of them is true, such as:

select big_table.* from
big_table left join little_table as l1 on big_table.y1=l1 .y and
l1.x=10
left join little_table as l2 on big_table.y2=l2 .y and l1.x=10
where l1.p_key is not null and l2.p_key is not null

I have never tried this in postgresql, but in my experience with
various other DB engines it is a lot faster then using an or in the
join and faster then a union.

Wow! Thanks! That certainly did the trick.

Mike Mascari

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #6

Mike Mascari

Mike Mascari wrote:

Sim Zacks wrote:
I would use 2 left joins and use the where condition to make sure one
of them is true, such as:

select big_table.* from
big_table left join little_table as l1 on big_table.y1=l1 .y and
l1.x=10
left join little_table as l2 on big_table.y2=l2 .y and l1.x=10
where l1.p_key is not null and l2.p_key is not null

I have never tried this in postgresql, but in my experience with
various other DB engines it is a lot faster then using an or in the
join and faster then a union.

Wow! Thanks! That certainly did the trick.

I'm thinking that the WHERE clauses condition should read:

WHERE l1.p_pkey is not null OR l2.p_key is not null;

My condition for a given selection of a big_table tuple is that either
y1 or y2 exist as a valid x from little_table. So I think I need an OR
instead of an AND. And AND condition would require that both y1 and y2
for the sample tuple of big_table be a valid x from little_table. Correct?

Mike Mascari
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #7

Sim Zacks

Mike,

You are probably correct, I was thinking in English, not SQL. That's
what happens when I bang code too early in the morning.

Thank You
Sim Zacks
IT Manager
CompuLab
04-829-0145 - Office
04-832-5251 - Fax

_______________ _______________ _______________ _______________ _______________ _____

Mike Mascari wrote:

Sim Zacks wrote:
I would use 2 left joins and use the where condition to make sure one
of them is true, such as:

select big_table.* from
big_table left join little_table as l1 on big_table.y1=l1 .y and
l1.x=10
left join little_table as l2 on big_table.y2=l2 .y and l1.x=10
where l1.p_key is not null and l2.p_key is not null

I have never tried this in postgresql, but in my experience with
various other DB engines it is a lot faster then using an or in the
join and faster then a union.

Wow! Thanks! That certainly did the trick.

I'm thinking that the WHERE clauses condition should read:

WHERE l1.p_pkey is not null OR l2.p_key is not null;

My condition for a given selection of a big_table tuple is that either
y1 or y2 exist as a valid x from little_table. So I think I need an OR
instead of an AND. And AND condition would require that both y1 and y2
for the sample tuple of big_table be a valid x from little_table. Correct?

Mike Mascari
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddres sHere" to ma*******@postg resql.org)

Nov 23 '05 #8

Bruno Wolff III

On Sun, Oct 17, 2004 at 03:30:32 -0400,
Mike Mascari <ma*****@mascar i.com> wrote:

I'm thinking that the WHERE clauses condition should read:

WHERE l1.p_pkey is not null OR l2.p_key is not null;

That seems to make more sense. I was puzzling about that condition myself.
If both keys where not null, there wouldn't even be a need for "left" joins.

Note that the output is different than you were originally getting as
well, since previously rows looked like one big table row combined with
one little table row. Now you are getting one big table row combined with
two little table rows (one of which might be null). This is probably
still faster, but you will need to change how you use the output.

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #9

Similar topics

3073

MySQL JOIN Query--help

by: Xenophobe | last post by:

I know this isn't a MySQL forum, but my question is related to a PHP project. I have two tables. table1 table2 "table1" contains 2 columns, ID and FirstName:

PHP

3604

left join in case of....

by: Quarco | last post by:

Hey, Suppose you have a table_ex like: id nuber1 chr ( = a or b) 1 1000 a 2 1001 a 3 1002 b 4 1003 a etc..

MySQL Database

18113

Outer Join syntax

by: Martin | last post by:

I am now working on SQL Server 2000 having had previous experience on a different database. Both of the OUTER JOIN syntaxes is different from what I am used to and I am finding it slightly confusing. For example, given two tables : wipm_tbl_mi wipm_tbl_wi (which may not have data in it for a specific record that exists in the first table.)

Microsoft SQL Server

4854

Right outer join problem

by: Omavlana | last post by:

Hi, I need your help to resolve this problem. I have written a right outer join query between 2 indipendent tables as follows. select b.Account_desc, b.Account, a.CSPL_CSPL from Actual_data_final a right outer join Actual_account_Tbl b on a.account_desc = b.account_desc where (a.source_type = 'TY02' or a.source_type is null) and (a.month = '2ND HALF' or a.month is null) and (a.year = 2004 or a.year is null) and (a.product = 'NP' or...

Microsoft SQL Server

4975

OUTER JOIN problem

by: Matt | last post by:

Hello I have to tables ar and arb, ar holds articles and a swedish description, arb holds descriptions in other languages. I want to retreive all articles that match a criteria from ar and also display their corresponding entries in arb, but if there is NO entry in arb I still want it to show up as NULL or something, so that I can get the attention that there IS no language associated with that article.

Microsoft SQL Server

10054

Diff between "LEFT OUTER JOIN" and " *= " when condition

by: Dam | last post by:

Using SqlServer : Query 1 : SELECT def.lID as IdDefinition, TDC_AUneValeur.VALEURDERETOUR as ValeurDeRetour FROM serveur.Data_tblDEFINITIONTABLEDECODES def, serveur.Data_tblTABLEDECODEAUNEVALEUR TDC_AUneValeur where def.TYPEDETABLEDECODES = 4

Microsoft SQL Server

14286

Difference on condition in JOIN & WHERE

by: Jane T | last post by:

I appreciate how difficult it is to resolve a problem without all the information but maybe someone has come across a similar problem. I have an 'extract' table which has 1853 rows when I ask for all rows where period_ = 3. The allocation table for info has 210 rows. I have two scripts below. The first script where I specify a period on a join, brings back 1853 lines and works. The second script where I specify the period in the...

Microsoft SQL Server

2641

Creating global temporary table on join condition

by: hgriva1 | last post by:

Hi, Is there a way in which i can create a global temporary table based on join condition eg: scott@ISNS>CREATE GLOBAL TEMPORARY TABLE x 2 AS 3 SELECT deptno,dname 4 FROM( 5 select e.deptno,d.dname 6 from emp e ,dept d

Oracle Database

9621

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...

General

9454

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

10106

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

10040

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

Windows Server

9914

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...

General

8939

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

5484

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

4012

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

C# / C Sharp

3611

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP