By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
464,363 Members | 1,291 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 464,363 IT Pros & Developers. It's quick & easy.

Fastest way to query limited time ordered data set

P: n/a
Hi,

Suppose I have 2 tables. TableA has about 3 million rows and TableB has
about 50,000 rows. Assuming a btree index is used in the time column.

I need to query a limited and time ordered data in TableA that are not in
TableB. Because TableA can be huge, I only want to compare the top 1000 rows
ordered by time in TableA. What would be the fastest way to write the SQL
query?

Method 1
--------
SELECT * FROM TableA a
WHERE date_part('epoch', CURRENT_TIMESTAMP - a.col2) > 3600 AND
NOT EXISTS (SELECT b.col1 FROM TableB b WHERE b.col1 = a.col1 AND b.col2 >=
100)
ORDER BY a.col1 LIMIT 1000;

Method 2
--------
SELECT * FROM (SELECT * FROM TableA WHERE date_part('epoch',
CURRENT_TIMESTAMP - col2) > 3600 ORDER BY col1 LIMIT 1000) a
WHERE NOT EXISTS (SELECT b.col1 FROM TableB b WHERE b.col1 = a.col1 AND
b.col2 >= 100);

Method 3
--------
SELECT * FROM (SELECT * FROM TableA ORDER BY col1 LIMIT 1000) a
WHERE date_part('epoch', CURRENT_TIMESTAMP - a.col2) > 3600 AND NOT EXISTS
(SELECT b.col1 FROM TableB b WHERE b.col1 = a.col1 AND b.col2 >= 100);

Method 4
--------
SELECT * FROM (SELECT * FROM TableA ORDER BY col1 LIMIT 1000) a
WHERE date_part('epoch', CURRENT_TIMESTAMP - a.col2) > 3600 AND a.col1 NOT
IN (SELECT b.col1 FROM TableB b WHERE b.col2 >= 100);
There are a few more ways to write this (using views, temp tables), but what
would be fastest in PostgreSQL? Please suggest other SQL statements. Method
1 may not work because it may end up looping the entire TableA.

Thanks.


Nov 23 '05 #1
Share this question for a faster answer!
Share on Google+

This discussion thread is closed

Replies have been disabled for this discussion.