
CLUSTER

I have a table that I am constantly inserting into (around 10 times a
second right now, but I hope to increase that later). I hold these rows for
a week, then summarize and delete them. During that week I need to access
ranges of these rows based on a timestamp column in each row, set to now()
when I insert them. I have this column indexed, but if I let the system run
for a few weeks it can take a fairly long time to get information from this
table (for instance, the avg() of values inserted in the last 5 minutes can
take over a minute to calculate). If I run CLUSTER on the timestamp index,
the time drops back under 5 seconds. I checked the plan with EXPLAIN
ANALYZE; originally it was always doing seqscans, so I set enable_seqscan =
FALSE just before running this query, which forces it to use the index and
speeds it up quite a bit.

Does anyone know a way I can reorder the table without doing a CLUSTER?
CLUSTER stops all insertions into the table and takes several minutes,
during which time a large backlog builds up. Or should I be using some
other method of speeding up this table?
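
Roughly, the setup looks like this (the table, column, and index names are
made up for illustration, not the real schema):

  -- hypothetical table holding one row per inserted sample
  CREATE TABLE samples (
      val numeric NOT NULL,
      ts  timestamptz NOT NULL DEFAULT now()
  );
  CREATE INDEX samples_ts_idx ON samples (ts);

  -- the query that slows down over time: average of everything
  -- inserted in the last 5 minutes
  SET enable_seqscan = FALSE;   -- force the planner onto the timestamp index
  SELECT avg(val) FROM samples WHERE ts > now() - interval '5 minutes';
  SET enable_seqscan = TRUE;

  -- reorder the heap on the timestamp index (7.x syntax); this takes an
  -- exclusive lock and blocks inserts for the duration
  CLUSTER samples_ts_idx ON samples;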

--- Adam Kavan
--- ak****@cox.net

Nov 12 '05 #1
2 Replies


Hello Adam

I had a similar problem, although my table is much smaller; it keeps around
50K rows. If you are inserting 10 per second you have 10*3600*24*7, that's
about 6 million records. I'm also assuming here that you constantly delete
any record more than one week old. Is this correct, or do you delete all
rows once a week and start over with the table empty?

Essentially what we do here is vacuum very frequently (every five minutes
or so), being very careful about FSM settings, and run a VACUUM FULL and a
REINDEX once a week; that keeps the table performance within limits. We
never run CLUSTER for that.
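
Roughly, the routine looks like this (the table name is the illustrative
one from above, and the FSM numbers are only examples to tune for your own
load):

  -- every few minutes, from cron: reclaim dead tuples so new inserts
  -- can reuse the space, and keep planner statistics current
  VACUUM ANALYZE samples;

  -- once a week, during a quiet window:
  VACUUM FULL samples;    -- compacts the table; takes an exclusive lock
  REINDEX TABLE samples;  -- rebuilds the indexes from scratch

  -- postgresql.conf: the free space map must be large enough to track
  -- every page freed between vacuums, e.g.
  --   max_fsm_relations = 1000
  --   max_fsm_pages     = 200000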

We run PG 7.3.2. In any case, if your version is below the 7.4 line you are
exposed to index bloat, which is probably the cause of your problem.

hth
cl.


Nov 12 '05 #2

At 02:02 AM 10/24/03 -0300, Claudio Lapidus wrote:

>Hello Adam
>
>I had a similar problem, although my table is much smaller; it keeps around
>50K rows. If you are inserting 10 per second you have 10*3600*24*7, that's
>about 6 million records. I'm also assuming here that you constantly delete
>any record more than one week old. Is this correct, or do you delete all
>rows once a week and start over with the table empty?

Every night at midnight I delete everything over 7 days old.

>Essentially what we do here is vacuum very frequently (every five minutes
>or so), being very careful about FSM settings, and run a VACUUM FULL and a
>REINDEX once a week; that keeps the table performance within limits. We
>never run CLUSTER for that.

I'm running pg_autovacuum on my database, and 7.4 (updating to beta 5 today
to see if it improves things).
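
For reference, the nightly purge is just a cron job that runs something
like this (again with the made-up table name from the first post):

  -- at midnight: drop everything more than a week old, then vacuum so
  -- the freed space can be reused by the continuing inserts
  DELETE FROM samples WHERE ts < now() - interval '7 days';
  VACUUM ANALYZE samples;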

--- Adam Kavan
--- ak****@cox.net

Nov 12 '05 #3
