Bytes IT Community

Performance weirdness with/without vacuum analyze

I asked earlier about ways of doing an UPDATE involving a left outer
join and got some very useful feedback.

This has thrown up a (to me) strange anomaly about the speed of such
an update.

The input to this query is a fairly large (the example I'm working
with has 335,000 rows) set of records containing numbers to be looked
up in the lookup table. This lookup table has 239 rows.

I'm always reading the suggestion that doing a 'VACUUM ANALYZE' on a
database is 'A Good Thing' as it helps the planner to do the best thing, so
I arranged a vacuum analyze on the input records.
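For reference, the analyze step was of this form (a sketch; the table names here are placeholders, not the real ones):

```sql
-- Refresh the planner's statistics for both tables involved in the UPDATE.
-- 'input_records' and 'lookup_table' are placeholder names.
VACUUM ANALYZE input_records;
VACUUM ANALYZE lookup_table;
```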

Running the query takes about 13 mins or so.

If, however, I *don't* do an analyze, but leave the input table as
it was when imported, the run takes about 2.5 mins!

Looking at the output from 'explain' I can see that the main difference
in the way the planner does it is that it does a merge join in the non-analyze
case, and a hash join in the analyze case.

Unfortunately I don't really know what this is implying, hence the call
for assistance.
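One way to check which plan is actually faster (a sketch; the UPDATE itself is elided, the table name is a placeholder, and note that EXPLAIN ANALYZE really executes the statement) is to toggle the planner's join methods and compare:

```sql
EXPLAIN ANALYZE UPDATE input_records SET ...;  -- hash join plan (after ANALYZE)

SET enable_hashjoin = off;  -- steer the planner away from a hash join
EXPLAIN ANALYZE UPDATE input_records SET ...;  -- should now pick the merge join
SET enable_hashjoin = on;   -- restore the default
```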

I have a file with all sorts of info about the problem (details of tables,
output of 'explain' etc) but as it is about 5K in size, and wide as well, I
didn't want to dump it in the list without any warning!

Regards,
Harry.
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 12 '05 #1
3 Replies


On Monday 13 October 2003 19:17, Harry Broomhall wrote:
> I asked earlier about ways of doing an UPDATE involving a left outer
> join and got some very useful feedback.
>
> This has thrown up a (to me) strange anomaly about the speed of such
> an update.
>
> The input to this query is a fairly large (the example I'm working
> with has 335,000 rows) set of records containing numbers to be looked
> up in the lookup table. This lookup table has 239 rows.
Can you look up those 239 values in the 335,000 rows instead? The result
will be the same, but it will probably be a lot faster...
> I'm always reading the suggestion that doing a 'VACUUM ANALYZE' on a
> database is 'A Good Thing' as it helps the planner to do the best thing, so
> I arranged a vacuum analyze on the input records.
>
> Running the query takes about 13 mins or so.
>
> If, however, I *don't* do an analyze, but leave the input table as
> it was when imported, the run takes about 2.5 mins!
That's weird...
> Looking at the output from 'explain' I can see that the main difference
> in the way the planner does it is that it does a merge join in the
> non-analyze case, and a hash join in the analyze case.
Try turning the comparison the other way round, as I suggested, and see if
that makes any difference...

> Unfortunately I don't really know what this is implying, hence the call
> for assistance.
>
> I have a file with all sorts of info about the problem (details of
> tables, output of 'explain' etc) but as it is about 5K in size, and wide as
> well, I didn't want to dump it in the list without any warning!


EXPLAIN ANALYZE output for both cases, plus your postgresql.conf tuning and
hardware/software information, would be very helpful. Compress it before you
post if you think it's too big.
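One way to capture that output (a sketch; the file name and table name are placeholders): in psql, redirect output to a file with \o, run the statement, then close the file:

```sql
\o plan_with_analyze.txt
EXPLAIN ANALYZE UPDATE input_records SET ...;  -- beware: this runs the UPDATE for real
\o
```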

HTH

Shridhar

Nov 12 '05 #2

Shridhar Daithankar writes:
> On Monday 13 October 2003 19:17, Harry Broomhall wrote: [SNIP]
> > The input to this query is a fairly large (the example I'm working
> > with has 335,000 rows) set of records containing numbers to be looked
> > up in the lookup table. This lookup table has 239 rows.
>
> Can you look up those 239 values in the 335,000 rows instead? The result
> will be the same, but it will probably be a lot faster...

I'm not entirely sure how I would do this, as the 'lookup' is actually
a join. I thought that the order of naming the joined tables didn't
matter (except for 'left' and 'right'), similar to the fact that 1 + 2 is
the same as 2 + 1.

[SNIP]
> EXPLAIN ANALYZE output for both cases, plus your postgresql.conf tuning and
> hardware/software information, would be very helpful. Compress it before you
> post if you think it's too big.

This would amount to an attachment, and I'm not sure how this list views
such things.

I've emailed you the file separately.

Regards,
Harry.

Nov 12 '05 #3


Harry Broomhall <ha*************@uk.easynet.net> writes:
> I'm not entirely sure how I would do this, as the 'lookup' is actually
> a join. I thought that the order of naming the joined tables didn't
> matter (except for 'left' and 'right'), similar to the fact that 1 + 2 is
> the same as 2 + 1.


Outer joins are commutative (1 + 2 == 2 + 1), at least barring the left/right
issue, but they are not associative: (1 + 2) + 3 != 1 + (2 + 3).

There are cases where the optimizer could prove a particular set of joins
could be reordered this way, but it doesn't know how to do that currently.
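A toy illustration with hypothetical tables a, b and c, joined on a column k:

```sql
-- Commutative (barring LEFT vs RIGHT): these return the same rows.
SELECT * FROM a LEFT JOIN b ON a.k = b.k;
SELECT * FROM b RIGHT JOIN a ON a.k = b.k;

-- Not associative: for a row of a with no match in b, the first query
-- drops the row (the inner join compares c.k against a NULL b.k), while
-- the second keeps it, extended with NULLs.
SELECT * FROM (a LEFT JOIN b ON a.k = b.k) JOIN c ON b.k = c.k;
SELECT * FROM a LEFT JOIN (b JOIN c ON b.k = c.k) ON a.k = b.k;
```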

--
greg

Nov 12 '05 #4

This discussion thread is closed
