By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
438,373 Members | 1,984 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 438,373 IT Pros & Developers. It's quick & easy.

Postgresql sql query - selecting rows outside a join

P: n/a
Hi all,

I am trying to do a query that returns all rows that are _not_ part of
a join, and so far I cannot seem to find a query that doesn't take 30
minutes or more to run.

The basic query is "select * from tableA where tableA_id NOT IN (select
tableA_id from tableB)".

Is there a more efficient way of doing this?

Regards,
Graham
--


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 12 '05 #1
Share this Question
Share on Google+
4 Replies


P: n/a
Hello

On Mon, 1 Dec 2003, Graham Leggett wrote:
Hi all,

I am trying to do a query that returns all rows that are _not_ part of
a join, and so far I cannot seem to find a query that doesn't take 30
minutes or more to run.

The basic query is "select * from tableA where tableA_id NOT IN (select
tableA_id from tableB)".

Is there a more efficient way of doing this?

Regards,
Graham
--


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 12 '05 #2

P: n/a
> Graham Leggett

Hi all,

I am trying to do a query that returns all rows that are
_not_ part of a join, and so far I cannot seem to find a
query that doesn't take 30 minutes or more to run.

The basic query is "select * from tableA where tableA_id NOT
IN (select tableA_id from tableB)".

Is there a more efficient way of doing this?

Mysql's version to do something like that is, afaik:

SELECT * FROM tableA LEFT JOIN tableB USING(tableA_id) WHERE
tableB.tableA_id IS NULL;

Perhaps that makes more efficient use of your indices.
Another version is something like:
SELECT *, COUNT(tableB.*) FROM tableA LEFT JOIN tableB USING(tableA_id)
GROUP BY columns_of_tableA HAVING count(tableB.*) = 0;

And perhaps a rewrite to use EXISTS (although that is claimed to be
similar in speed or even slower as of postgres 7.4) is useful:
SELECT * FROM tableA WHERE NOT EXISTS(SELECT * FROM tableB WHERE
tableB.tableA_id = tableA.tableA_id)

There are a few others, but it all depens on your index structure and
table sizes whether they work more efficient or not.

Best regards,

Arjen


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 12 '05 #3

P: n/a
Hello

If you can, use PostgreSQL version 7.4. There is some optimalisation for
this questions. You can change your query from select .. IN (select) to
select .. exists (select). More about it you can find in FAQ.

regards
Pavel

On Mon, 1 Dec 2003, Graham Leggett wrote:
Hi all,

I am trying to do a query that returns all rows that are _not_ part of
a join, and so far I cannot seem to find a query that doesn't take 30
minutes or more to run.

The basic query is "select * from tableA where tableA_id NOT IN (select
tableA_id from tableB)".

Is there a more efficient way of doing this?

Regards,
Graham
--


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 12 '05 #4

P: n/a

On Dec 1, 2003, at 10:15 AM, Graham Leggett wrote:
Hi all,

I am trying to do a query that returns all rows that are _not_ part of
a join, and so far I cannot seem to find a query that doesn't take 30
minutes or more to run.

The basic query is "select * from tableA where tableA_id NOT IN (select
tableA_id from tableB)".

Is there a more efficient way of doing this?

Regards,
Graham
--

This should be more efficient:

select * from tableA where not exists (select null from tableB where
tableA_id = tableA.tableA_id);

Version 7.4 fixes the slow IN - NOT IN, but I haven't tested it myself
yet to see how it compares to this.

Adam Ruth
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 12 '05 #5

This discussion thread is closed

Replies have been disabled for this discussion.