By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
437,660 Members | 1,299 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 437,660 IT Pros & Developers. It's quick & easy.

Perfomance difference between 7.2 and 7.3

P: n/a
Hi all:

I have here a table with the following schema:

Table "todocinetv"
Column | Type | Modifiers
-------------+-----------------------------+----------------------
id | integer | not null default '0'
datestamp | timestamp without time zone | not null
thread | integer | not null default '0'
parent | integer | not null default '0'
author | character(37) | not null default ''
subject | character(255) | not null default ''
email | character(200) | not null default ''
attachment | character(64) | default ''
host | character(50) | not null default ''
email_reply | character(1) | not null default 'N'
approved | character(1) | not null default 'N'
msgid | character(100) | not null default ''
modifystamp | integer | not null default '0'
userid | integer | not null default '0'
Indexes: todocinetv_approved,
todocinetv_author,
todocinetv_datestamp,
todocinetv_modifystamp,
todocinetv_msgid,
todocinetv_parent,
todocinetv_subject,
todocinetv_thread,
todocinetv_userid,
todocinetvpri_key
(It's actually a table created by the discussion board application
Phorum (version 3.3)).
This table has about 28000 rows, and is running with Postgres 7.2.3
under Red Hat 8.0, in a 2.4 Ghz. Pentiun 4 with 512 Mb. of RAM.
The problem I'm having is that, when you access the main page of the
discussion board, it takes forever to show you the list of posts. The
query that Phorum uses for doing so is:
phorum=# explain
phorum-# SELECT thread, modifystamp, count(id) AS tcount,
datetime(modifystamp) AS latest, max(id) as maxid FROM todocinetv WHERE
approved='Y' GROUP BY thread, modifystamp ORDER BY modifystamp desc,
thread desc limit 30;
NOTICE: QUERY PLAN:

Limit (cost=40354.79..40354.79 rows=30 width=12)
-> Sort (cost=40354.79..40354.79 rows=2879 width=12)
-> Aggregate (cost=39901.43..40189.35 rows=2879 width=12)
-> Group (cost=39901.43..40045.39 rows=28792 width=12)
-> Sort (cost=39901.43..39901.43 rows=28792 width=12)
-> Seq Scan on todocinetv
(cost=0.00..37768.90 rows=28792 width=12)
This query takes up to 3 minutes to execute. I have tried to strip it
down and leaving it in its most vanilla form (without "count(id)" and
such), and it's still almost as slow:
phorum=# explain
phorum-# SELECT thread, modifystamp, datetime(modifystamp) AS latest
from todocinetv WHERE approved='Y' ORDER BY modifystamp desc, thread
desc limit 30;
NOTICE: QUERY PLAN:

Limit (cost=39901.43..39901.43 rows=30 width=8)
-> Sort (cost=39901.43..39901.43 rows=28792 width=8)
-> Seq Scan on todocinetv (cost=0.00..37768.90 rows=28792
width=8)
But here is the weird thing: I dump the table, export it into another
machine running Postgres 7.3.2 (Celeron 1.7 Ghz, 512 Mb. of memory), and
the query takes only 2 or 3 seconds to execute, even though the query
plan is almost the same:
provphorum=# explain
provphorum-# SELECT thread, modifystamp, count(id) AS tcount,
modifystamp AS latest, max(id) as maxid FROM todocinetv WHERE
approved='Y' GROUP BY thread, modifystamp ORDER BY modifystamp desc,
thread desc limit 30 ;
QUERY PLAN

------------------------------------------------------------------------------------------------
Limit (cost=5765.92..5765.99 rows=30 width=12)
-> Sort (cost=5765.92..5772.96 rows=2817 width=12)
Sort Key: modifystamp, thread
-> Aggregate (cost=5252.34..5604.49 rows=2817 width=12)
-> Group (cost=5252.34..5463.63 rows=28172 width=12)
-> Sort (cost=5252.34..5322.77 rows=28172 width=12)
Sort Key: thread, modifystamp
-> Seq Scan on todocinetv
(cost=0.00..3170.15 rows=28172 width=12)
Filter: (approved = 'Y'::bpchar)
(9 rows)
(I took out the "datetime" function, since 7.3 didn't accept it and I
didn't think it was relevant to the performance problem (am I wrong?))

So my question is: what causes such a big difference? (3 min. vs. 3
seconds) Does the version difference (7.2 vs. 7.3) account for all of
it? Or should I start looking at other factors? As I said, both machines
are almost equivalent hardware-wise, and as for the number of shared
buffers, the faster machine actually has less of them (the 7.3 machine
has "shared_buffers = 768", while the 7.2 one has "shared_buffers = 1024").

Paulo Jan.
DDnet.

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 12 '05 #1
Share this Question
Share on Google+
8 Replies


P: n/a
Paulo Jan <ad***@digital.ddnet.es> writes:
-> Seq Scan on todocinetv (cost=0.00..37768.90 rows=28792
width=8)


The estimated cost seems to be more than one disk page read per row
returned. This suggests to me that you have a huge amount of dead space
in that table --- try a VACUUM FULL on it. If that fixes the problem,
then you need to improve your housekeeping procedures on the 7.2
installation: run vacuums more often and ensure that your FSM settings
are large enough.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #2

P: n/a
On Wed, Nov 12, 2003 at 16:30:41 +0100,
Paulo Jan <ad***@digital.ddnet.es> wrote:
This table has about 28000 rows, and is running with Postgres 7.2.3
under Red Hat 8.0, in a 2.4 Ghz. Pentiun 4 with 512 Mb. of RAM.
You probably want to use 7.4 for this since a new way to do aggragates using
hashes has been added. 7.4 is currently in release candidate status and
maybe be released as early as next Monday.

This query takes up to 3 minutes to execute. I have tried to strip
it down and leaving it in its most vanilla form (without "count(id)" and
But here is the weird thing: I dump the table, export it into
another machine running Postgres 7.3.2 (Celeron 1.7 Ghz, 512 Mb. of
memory), and the query takes only 2 or 3 seconds to execute, even though
the query plan is almost the same:


This makes it sound like you haven't been properly vacuuming and/or
analyzing the database. You might want to run a vacuum full on the
production db and see if that speeds things up. Once you have done
a vacuum full then regular vacuums should keep the number of dead tuples
down (as long as FSM is set high enough).

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 12 '05 #3

P: n/a
Tom Lane wrote:
Paulo Jan <ad***@digital.ddnet.es> writes:
-> Seq Scan on todocinetv (cost=0.00..37768.90 rows=28792
width=8)

The estimated cost seems to be more than one disk page read per row
returned. This suggests to me that you have a huge amount of dead space
in that table --- try a VACUUM FULL on it. If that fixes the problem,


Argh!!! The thing is, I *had* run VACUUM several times before posting
to the list... but it was VACUUM ANALYZE, not VACUUM FULL. And here I
was, wondering why VACUUMing so much didn't have any effect...

Paulo Jan.
DDnet.

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 12 '05 #4

P: n/a
Is this correct?

vacuum by itself just cleans out the old extraneous tuples so that they
aren't in the way anymore
vacuum analyze rebuilds indexes. If you add an index to a table it won't be
used until you vacuum analyze it
vacuum full actually compresses the table on disk by reclaiming the space
from the old tuples after they have been removed.
----- Original Message -----
From: "Tom Lane" <tg*@sss.pgh.pa.us>
To: "Paulo Jan" <ad***@digital.ddnet.es>
Cc: <pg***********@postgresql.org>
Sent: Wednesday, November 12, 2003 8:38 AM
Subject: Re: [GENERAL] Perfomance difference between 7.2 and 7.3

Paulo Jan <ad***@digital.ddnet.es> writes:
-> Seq Scan on todocinetv (cost=0.00..37768.90 rows=28792
width=8)


The estimated cost seems to be more than one disk page read per row
returned. This suggests to me that you have a huge amount of dead space
in that table --- try a VACUUM FULL on it. If that fixes the problem,
then you need to improve your housekeeping procedures on the 7.2
installation: run vacuums more often and ensure that your FSM settings
are large enough.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #5

P: n/a
"Rick Gigger" <ri**@alpinenetworking.com> writes:
Is this correct?

vacuum by itself just cleans out the old extraneous tuples so that they
aren't in the way anymore
Actually it puts the free space in each page on a list (the free space
map) so it can be reused for new tuples without having to allocate
fresh pages. It finds free space by looking for tuples that can't be
seen any more by any transaction.
vacuum analyze rebuilds indexes. If you add an index to a table it won't be
used until you vacuum analyze it
It doesn't rebuild indexes--REINDEX does that. ANALYZE measures the
size and statistics of the data in the table, so the planner can do a
good job.
vacuum full actually compresses the table on disk by reclaiming the space
from the old tuples after they have been removed.


It moves tuples around and frees up pages at the end of the table,
thus compacting it.

So you're mostly wrong on all three. :)

-Doug

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 12 '05 #6

P: n/a
On Wed, 12 Nov 2003, Rick Gigger wrote:
Is this correct?

vacuum by itself just cleans out the old extraneous tuples so that they
aren't in the way anymore
vacuum analyze rebuilds indexes. If you add an index to a table it won't be
used until you vacuum analyze it
vacuum full actually compresses the table on disk by reclaiming the space
from the old tuples after they have been removed.


You don't have to analyze AFTER index creation, just at some point in
time. I.e.:

create table test ...

import into table test 1000000 rows

analyze test;

create index test_field1_dx on test (id);

select * from test where id=4567; <- this will likely use the index.
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 12 '05 #7

P: n/a
> > Is this correct?

vacuum by itself just cleans out the old extraneous tuples so that they
aren't in the way anymore


Actually it puts the free space in each page on a list (the free space
map) so it can be reused for new tuples without having to allocate
fresh pages. It finds free space by looking for tuples that can't be
seen any more by any transaction.
vacuum analyze rebuilds indexes. If you add an index to a table it won't be used until you vacuum analyze it


It doesn't rebuild indexes--REINDEX does that. ANALYZE measures the
size and statistics of the data in the table, so the planner can do a
good job.


Is REINDEX something that needs to be done on a periodic basis?
vacuum full actually compresses the table on disk by reclaiming the space from the old tuples after they have been removed.


It moves tuples around and frees up pages at the end of the table,
thus compacting it.

So you're mostly wrong on all three. :)

-Doug


Thanks!

Rick
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #8

P: n/a
On Thu, Nov 13, 2003 at 12:06:05 -0700,
Rick Gigger <ri**@alpinenetworking.com> wrote:

Is REINDEX something that needs to be done on a periodic basis?


In version prior to 7.4 some patterns of use will require periodic
reindexing. The problem case is when the index column monoticly
increases (or decreases) and old values eventually get deleted.
In this case the index blocks for the deleted values don't get
reused and the size of the index will continually grow.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 12 '05 #9

This discussion thread is closed

Replies have been disabled for this discussion.