
VACUUM Question

Hi,
just a few questions on VACUUM.

I run a vacuum analyze on the database every night as part of a
maintenance job.

During the day a job loads 30,000-70,000 records into each of two tables.
This job runs 2-3 times a day; the first run is mainly inserts, the 2nd and
3rd runs are mostly updates.
Both tables hold in the area of 1-3 million records.

How reasonable is it to run a VACUUM ANALYZE before and after the
insert/update of the data?

Also, I noticed that I get quite a performance improvement if I run a
count(*) on the two tables before the insert. Any reason for that?
One more question: on one server the VACUUM ANALYZE before the insert
takes approx. 2 min; after the insert the same command takes 15 min.

I'm running PostgreSQL 7.3.4.

Thanks for any help

Alex



---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #1
6 Replies

Alex <al**@meerkatsoft.com> writes:
> How reasonable is it to run a VACUUM ANALYZE before and after the
> insert/update of the data?
On a busy system you should run vacuum more often than once per day.

You should probably run a VACUUM after the update. And running ANALYZE at the
same time isn't a bad idea, especially if the update changes the distribution
a lot.

There won't be anything to VACUUM after the insert, but perhaps you still want
to run ANALYZE. Note that a plain ANALYZE uses a statistical sample which is
much faster, whereas VACUUM ANALYZE has to look at every record anyways so
it's slower but produces more accurate statistics. If you don't have
performance problems then using VACUUM ANALYZE isn't a bad idea, but it's
probably overkill.
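As a rough sketch of the schedule described above (the table names here are placeholders; substitute your own):

```sql
-- After an update-heavy run: reclaim dead tuples and refresh statistics.
VACUUM ANALYZE my_table_1;
VACUUM ANALYZE my_table_2;

-- After an insert-only run: nothing to reclaim, so a plain ANALYZE
-- (which samples rows and is much cheaper) is usually enough.
ANALYZE my_table_1;
ANALYZE my_table_2;
```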
> Also, I noticed that I get quite a performance improvement if I run a
> count(*) on the two tables before the insert. Any reason for that?
Uh, just preloading the kernel cache with blocks from the table and index?
> One more question: on one server the VACUUM ANALYZE before the insert takes
> approx. 2 min; after the insert the same command takes 15 min.


You might try a VACUUM FULL sometime when you can deal with 15 min of downtime
or so. Actually it would probably be longer. Perhaps the table that's taking
15 min has a ton of extra dead tuples left over from the FSM settings being too
low and/or vacuum being too infrequent.
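The free space map settings live in postgresql.conf. A sketch of what to look at (the values below are purely illustrative, not recommendations; 7.3-era defaults are quite low, so size them to your installation):

```
# postgresql.conf -- free space map sizing (PostgreSQL 7.x settings;
# the values shown are illustrative only)
max_fsm_relations = 1000    # tables and indexes whose free space is tracked
max_fsm_pages = 100000      # pages with free space that VACUUM can remember
```

If the FSM is too small, VACUUM cannot remember all the freed space, the tables bloat, and each subsequent vacuum takes longer.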

--
greg

Nov 23 '05 #2

Greg Stark wrote:
> There won't be anything to VACUUM after the insert, but perhaps you still want
> to run ANALYZE. Note that a plain ANALYZE uses a statistical sample which is
> much faster, whereas VACUUM ANALYZE has to look at every record anyways so
> it's slower but produces more accurate statistics.


That is not true. My nightly vacuum analyze clearly is sampling:

INFO: "tablename": 22102 pages, 30000 rows sampled, 1712934 estimated
total rows


Nov 23 '05 #3

>> One more question: on one server the VACUUM ANALYZE before the insert takes
>> approx. 2 min; after the insert the same command takes 15 min.
>
> You might try a VACUUM FULL sometime when you can deal with 15 min of downtime
> or so. Actually it would probably be longer. Perhaps the table that's taking
> 15 min has a ton of extra dead tuples left over from the FSM settings being too
> low and/or vacuum being too infrequent.


Does VACUUM FULL just lock entire tables and thus cause you to
essentially have downtime on that database because it doesn't respond
quickly, or do you actually have to shut down Postgres to safely do a
VACUUM FULL?


Nov 23 '05 #4

Rick Gigger <ri**@alpinenetworking.com> wrote:
> Does VACUUM FULL just lock entire tables and thus cause you to
> essentially have downtime on that database because it doesn't respond
> quickly, or do you actually have to shut down Postgres to safely do a
> VACUUM FULL?


The former. You don't shut the database server down, but it won't be
responsive while vacuum full is running.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Nov 23 '05 #5

On Fri, Jun 04, 2004 at 03:40:45PM -0400, Bill Moran wrote:
> The former. You don't shut the database server down, but it won't be
> responsive while vacuum full is running.


But only for the table that's currently being vacuumed: there's only one
table locked at any time.

(A different issue is the IO storm caused by the vacuum, which makes
everything else slower.)
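Since only one table is locked at a time, a practical consequence is that VACUUM FULL can be run table by table across separate low-traffic windows rather than over the whole database at once (table names below are hypothetical):

```sql
-- Compact one table per maintenance window instead of the whole database;
-- only the table being vacuumed is exclusively locked.
VACUUM FULL VERBOSE my_table_1;

-- ...later, in another quiet window:
VACUUM FULL VERBOSE my_table_2;
```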

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"In the depths of our unconscious there is an obsessive need for a logical
and coherent universe. But the real universe is always one step beyond
logic." (Irulan)

Nov 23 '05 #6

Joseph Shraibman <jk*@selectacast.net> writes:
> Greg Stark wrote:
>> There won't be anything to VACUUM after the insert, but perhaps you still
>> want to run ANALYZE. Note that a plain ANALYZE uses a statistical sample
>> which is much faster, whereas VACUUM ANALYZE has to look at every record
>> anyways so it's slower but produces more accurate statistics.
> That is not true. My nightly vacuum analyze clearly is sampling:
> INFO: "tablename": 22102 pages, 30000 rows sampled, 1712934 estimated
> total rows


There is (only) one aspect in which VACUUM ANALYZE will produce more
accurate stats than a standalone ANALYZE: what it stores into
pg_class.reltuples is the true total tuple count produced by the VACUUM
phase, rather than the estimate produced by the ANALYZE phase. What
goes into pg_statistic will be the same in both cases, however, and will
be based on a sample not the whole table.

BTW, a VACUUM just after a table is loaded by INSERT/COPY is not
necessarily useless. It won't reclaim space, there being none to
reclaim, but it will mark all the rows as "known committed", thereby
saving some amount of I/O that would otherwise be incurred by later
transactions.
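To see the difference Tom describes, you can compare the stored estimate with the real count (the table name is a placeholder):

```sql
-- pg_class.reltuples: exact after a VACUUM ANALYZE, but only a sampled
-- estimate after a standalone ANALYZE.
SELECT relname, reltuples
FROM pg_class
WHERE relname = 'my_table_1';

-- The true row count, for comparison (note: a full scan on a large table).
SELECT count(*) FROM my_table_1;
```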

regards, tom lane


Nov 23 '05 #7
