Ron,
Thanks for your help. Following your lead, I found the FM at
http://www.postgresql.com/docs/7.3/s...-commands.html
and it was very helpful. Serves me right for looking in books.
It's funny how these things work out. The text there suggested
I "see the section on Routine Reindexing" (section 8.3), which
provided a very good description of why reindexing might
be required. I've quoted it below. Reading that discussion,
I began to wonder if PostgreSQL's particular B-tree implementation might
show a performance degradation if you indexed a column that was
steadily increasing (just as some sorts break down when you try
to sort a nearly-ordered list). The main field I'm indexing
on is a SERIAL PRIMARY KEY (also used as a foreign key in a number
of child tables). Since the table implements a sequence and uses
it as the primary key, the indexed element is almost by definition
monotonically increasing... something that may not be healthy for
an intensely accessed index.
Anyway, I made a quick hack to use something else as a key
and early testing looks like it gains me a non-trivial performance
boost. Of course I'm NOT SURE YET, but it looks promising.
I'll have to do a bit more testing and implement a proper design.
So thanks again for your suggestion. It may have paid off even
better than I'd hoped.
Gary
Here's the text cited above:
PostgreSQL is unable to reuse B-tree index pages in certain cases. The
problem is that if indexed rows are deleted, those index pages can only be
reused by rows with similar values. For example, if indexed rows are deleted
and newly inserted/updated rows have much higher values, the new rows can't
use the index space made available by the deleted rows. Instead, such new
rows must be placed on new index pages. In such cases, disk space used by the
index will grow indefinitely, even if VACUUM is run frequently.
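If I'm reading that right, my usage pattern is practically a textbook
case. A sketch of the shape of it (table and column names here are
made up for illustration, not my real schema):

```sql
-- Hypothetical sketch of the access pattern described above.
CREATE TABLE events (
    event_id SERIAL PRIMARY KEY,   -- monotonically increasing key
    payload  TEXT
);

-- Old events get purged from the LOW end of the key range...
DELETE FROM events WHERE event_id < 1000000;

-- ...while new events always land at the HIGH end. Per the 7.3
-- documentation quoted above, the B-tree pages freed by the DELETE
-- can't be reused by these higher values, so the index keeps
-- growing until a REINDEX rebuilds it, however often VACUUM runs.
INSERT INTO events (payload) VALUES ('new event');
```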
rstp <rs**@linuxwaves.com> wrote in message
news:<3F**************@linuxwaves.com>...

G.W. Lucas wrote:

I apologize if this is an RTFM question, but I have not been able
to find a definitive answer elsewhere.
Does a "REINDEX TABLE" lock the table while it is working?
Can applications write data to the table during the REINDEX?
I am working on a real-time application that handles over 2 million
"events" per day and runs 7-by-24. Most of these events involve an
UPDATE to one or two tables, though a small subset involve a bit
more work. Naturally, I am performing VACUUM nearly constantly and
VACUUM FULL rather more often than I'd prefer. Anyway, testing has
shown that certain important queries happen a LOT faster shortly
after a REINDEX on the relevant tables. Apparently, the frequent
UPDATES are battering the indices to the point where their
performance degrades. I am currently doing
a REINDEX once-per-day (during a relative lull in the event rate),
but would like to do it more often. The problem is that at the
high event rate I need to process, I can't afford to have my
application stall while waiting for a REINDEX. Will it do so?
Thanks in advance for your help.
According to the FM, REINDEX is used to rebuild corrupted indexes, and it
does indeed lock the table. The documentation also suggests that, unless
you need continuous access to the index you're trying to rebuild, you may
be better off just dropping the index and creating a new one - I'm not
sure whether that locks the table, but it may be worth a try.
Here is the relevant section of REINDEX documentation:
'Note: Another approach to dealing with a corrupted user-table index is
just to drop and recreate it. This may in fact be preferable if you
would like to maintain some semblance of normal operation on the table
meanwhile. REINDEX acquires exclusive lock on the table, while CREATE
INDEX only locks out writes not reads of the table. '
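So the drop-and-recreate route would look something like this (the index
and table names below are just placeholders, not from your schema - I
haven't tried this on a table with your kind of load):

```sql
-- Placeholder names; adapt to your schema. Note this approach
-- won't work for the index backing a PRIMARY KEY constraint,
-- only for ordinary secondary indexes.
DROP INDEX events_ts_idx;

-- Per the note quoted above, CREATE INDEX locks out writes but
-- not reads, so queries can still run while the index rebuilds.
CREATE INDEX events_ts_idx ON events (event_ts);
```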
I would recommend that you look at installing pgAdminII on a Windows box
for its superior help features, if for nothing else. It includes
documentation for version 7.3 and it is very easy to find relevant
documentation when you need it.
HTH
Ron