Bytes IT Community

Large table performance and vacuum

Hi,

I'm having a performance problem with a large database table
we use with PostgreSQL 7.3.4. The table is:

db=> \d log
Table "public.log"
Column | Type | Modifiers
---------+----------+-----------
hid | integer |
tid | integer |
aid | integer |
id | smallint |
sid | smallint |
cid | smallint |
Indexes: aid_cid_hid_idx btree (aid, cid, hid) WHERE (cid = 32),
hid_idx btree (hid),
tid_idx btree (tid)

The table consists of batches of rows with the same hid, and
hid comes from a sequence, so lower hid entries are older.
About 25 million rows are inserted per day, which
corresponds to about 250,000 unique hids.

In order to keep the table manageable (for things like
pg_dump and vacuum) we're copying out older log entries
(starting with the first hid in the log table) and deleting
them.
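Roughly, the maintenance cycle looks like this (the archive table
name and the cutoff value here are illustrative, not our actual setup):

```sql
-- Illustrative only: move the oldest batch of rows into an archive
-- table, delete them from log, then plain-VACUUM so the freed
-- space can be reused.
BEGIN;
INSERT INTO log_archive SELECT * FROM log WHERE hid < 1000000;
DELETE FROM log WHERE hid < 1000000;
COMMIT;
VACUUM ANALYZE log;
```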

The problem is the query "SELECT * FROM log ORDER BY hid
LIMIT 1;", which both EXPLAIN and EXPLAIN ANALYZE show as
Limit / Index Scan on hid_idx. This was very fast before we
started deleting old log entries from the table, but has
started taking an extremely long time, about 341 seconds.

However "SELECT * FROM log ORDER BY hid DESC LIMIT 1;",
which EXPLAIN shows as Limit / Index Scan Backward on
hid_idx is still extremely fast, < 1 ms.

Below I've attached the latest VACUUM ANALYZE VERBOSE output
and EXPLAIN ANALYZE results for the 2 queries mentioned
above run immediately after the vacuum. For the heck of it
there's also a backtrace from the postgres backend that was
running the "EXPLAIN ANALYZE SELECT * FROM log ORDER BY hid
LIMIT 1;" query.

Any ideas what the problem is? I'm guessing it has to do
with all of the unused tuples in the log table and will go
away if I do a VACUUM FULL. Unfortunately that's not
generally an option; the log table is used 24/7 and locking
it for the several hours this would take would require
downtime on our system.

Thanks,
Gavin Scott
ga***@ipalsoftware.com

VACUUM ANALYZE VERBOSE log OUTPUT

INFO: --Relation public.log--
INFO: Index hid_idx: Pages 696549; Tuples 180783386: Deleted 11184600.
CPU 41.45s/141.71u sec elapsed 2248.23 sec.
INFO: Index tid_idx: Pages 608998; Tuples 180866770: Deleted 11184600.
CPU 36.58s/143.80u sec elapsed 3405.02 sec.
INFO: Index aid_cid_hid_idx: Pages 155544; Tuples 22308137: Deleted 1365950.
CPU 10.29s/20.38u sec elapsed 1409.63 sec.
INFO: Removed 11184600 tuples in 72368 pages.
CPU 5.11s/5.11u sec elapsed 240.97 sec.
INFO: Index hid_idx: Pages 697061; Tuples 169790852: Deleted 11184513.
CPU 43.17s/139.36u sec elapsed 2213.30 sec.
INFO: Index tid_idx: Pages 609744; Tuples 169942979: Deleted 11184513.
CPU 35.78s/138.05u sec elapsed 3357.42 sec.
INFO: Index aid_cid_hid_idx: Pages 155775; Tuples 20945181: Deleted 1399976.
CPU 10.22s/19.64u sec elapsed 1408.34 sec.
INFO: Removed 11184513 tuples in 71240 pages.
CPU 5.11s/5.38u sec elapsed 147.56 sec.
INFO: Index hid_idx: Pages 698247; Tuples 163654063: Deleted 6521246.
CPU 41.11s/118.39u sec elapsed 2194.98 sec.
INFO: Index tid_idx: Pages 610817; Tuples 163795162: Deleted 6521246.
CPU 35.28s/117.13u sec elapsed 3170.12 sec.
INFO: Index aid_cid_hid_idx: Pages 156031; Tuples 20172239: Deleted 820413.
CPU 10.57s/17.95u sec elapsed 1438.76 sec.
INFO: Removed 6521246 tuples in 42430 pages.
CPU 3.23s/3.51u sec elapsed 162.80 sec.
INFO: Pages 1342806: Changed 242787, Empty 0; Tup 163016168: Vac 28890359, Keep 0, UnUsed 17697429.
Total CPU 326.60s/908.01u sec elapsed 22642.23 sec.
INFO: Analyzing public.log

db=> EXPLAIN ANALYZE SELECT * FROM log ORDER BY hid DESC LIMIT 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.99 rows=1 width=18) (actual time=0.05..0.06 rows=1 loops=1)
-> Index Scan Backward using hid_idx on log (cost=0.00..161101169.41 rows=163016160 width=18) (actual time=0.05..0.05 rows=2 loops=1)
Total runtime: 0.11 msec
(3 rows)

db=> EXPLAIN ANALYZE SELECT * FROM log ORDER BY hid LIMIT 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.99 rows=1 width=18) (actual time=341820.90..341820.92 rows=1 loops=1)
-> Index Scan using hid_idx on log (cost=0.00..161101169.41 rows=163016160 width=18) (actual time=341820.90..341820.91 rows=2 loops=1)
Total runtime: 341820.98 msec
(3 rows)

BACKTRACE OF POSTGRES PROCESS RUNNING
"EXPLAIN ANALYZE SELECT * FROM log ORDER BY hid LIMIT 1;"

(gdb) where
#0 0x40314714 in read () from /lib/libc.so.6
#1 0x00000300 in ?? ()
#2 0x08126ab9 in mdread ()
#3 0x081275f7 in smgrread ()
#4 0x0811cd05 in ReadBufferInternal ()
#5 0x0811cae5 in ReadBuffer ()
#6 0x080802b1 in _bt_getbuf ()
#7 0x08082c8b in _bt_step ()
#8 0x080830ae in _bt_endpoint ()
#9 0x08082b6b in _bt_first ()
#10 0x080808d1 in btgettuple ()
#11 0x08178ea9 in FunctionCall2 ()
#12 0x0807cd53 in index_getnext ()
#13 0x080dccdf in IndexNext ()
#14 0x080d8a1a in ExecScan ()
#15 0x080dcd68 in ExecIndexScan ()
#16 0x080d663d in ExecProcNode ()
#17 0x080dfe25 in ExecLimit ()
#18 0x080d66f9 in ExecProcNode ()
#19 0x080d5454 in ExecutePlan ()
#20 0x080d4a89 in ExecutorRun ()
#21 0x0812a4ab in ProcessQuery ()
#22 0x0812896b in pg_exec_query_string ()
#23 0x08129a83 in PostgresMain ()
#24 0x0810fd54 in DoBackend ()
#25 0x0810f63d in BackendStartup ()
#26 0x0810e6d0 in ServerLoop ()
#27 0x0810e109 in PostmasterMain ()
#28 0x080e96ed in main ()
#29 0x4025c1c4 in __libc_start_main () from /lib/libc.so.6

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #1


A long time ago, in a galaxy far, far away, ga***@ipalsoftware.com (Gavin Scott) wrote:
Any ideas what the problem is? I'm guessing it has to do
with all of the unused tuples in the log table and will go
away if I do a VACUUM FULL. Unfortunately that's not
generally an option; the log table is used 24/7 and locking
it for the several hours this would take would require
downtime on our system.


The problem with the "ascending" order is that the query walks through
the bulk of the "dead zone" because it won't know those pages are
irrelevant until it actually examines them.

Doing a plain old vacuum (not FULL) would quite likely help by marking
the dead pages as being well and truly dead, and would allow dead
pages to get reused. The more often you DELETE old rows, and then
VACUUM them, the better it is likely to work out, with the caveat that
the VACUUMs will be mighty I/O-expensive.

Another approach might be to have a series of "rotor tables," where on
successive days, you insert into a different table each time. You
insert into "table_1" on Monday, then shift to "table_2" on Tuesday,
and so forth. You can TRUNCATE table_1 once the data is no longer
needed, which will be extremely fast, unlike delete/vacuum.
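A rough sketch of that rotor scheme, reusing the columns from your
log table (the table names are just for illustration):

```sql
-- One table per day of the rotation; the application writes to
-- whichever table is "current" that day.
CREATE TABLE table_1 (hid integer, tid integer, aid integer,
                      id smallint, sid smallint, cid smallint);
CREATE TABLE table_2 (hid integer, tid integer, aid integer,
                      id smallint, sid smallint, cid smallint);
-- ... table_3, table_4, and so on ...

-- When a day's data expires, reclaim its space instantly:
TRUNCATE TABLE table_1;
```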

The _problem_ with that is that if you need to do queries that cross
days, they have to be done very carefully in order to not go
disastrously badly. The "obvious" way is to assemble the tables into
a view, as with

create view table_log as
select * from table_1 union all select * from table_2 union all
select * from table_3 union all select * from table_4;

Unfortunately, self-joins get _really, really, really_ expensive
because that instantly expands into a 16-way join. Doing queries
therefore requires some cleverness; you can't assume the optimizer
will sort it all out cleanly.
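To see why, consider an innocent-looking self-join against that view:

```sql
-- Each side of the join expands to the 4-branch UNION ALL, so the
-- planner effectively faces 4 x 4 = 16 table pairings.
SELECT a.hid
FROM table_log a
JOIN table_log b ON a.hid = b.hid;
```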
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://cbbrowne.com/info/emacs.html
"While preceding your entrance with a grenade is a good tactic in
Quake, it can lead to problems if attempted at work." -- C Hacking
-- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html
Nov 23 '05 #2

Gavin Scott <ga***@ipalsoftware.com> writes:
The problem is the query "SELECT * FROM log ORDER BY hid
LIMIT 1;", which both EXPLAIN and EXPLAIN ANALYZE show as
Limit / Index Scan on hid_idx. This was very fast before we
started deleting old log entries from the table, but has
started taking an extremely long time, about 341 seconds.


I'm suspecting that you need to REINDEX hid_idx. This is an
aspect of the pre-7.4 "index bloat" problem: the left end of the index
now consists of entirely-empty pages, which not only occupy space but
take time to scan through for a query like this.
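For the record, the rebuild itself is just the following (bear in
mind it locks out writes to the table while it runs):

```sql
-- Rebuilds hid_idx from the live tuples only, discarding the
-- empty left-edge pages; writers block for the duration.
REINDEX INDEX hid_idx;
```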

regards, tom lane


Nov 23 '05 #3

Tom Lane wrote:
Gavin Scott <ga***@ipalsoftware.com> writes:
The problem is the query "SELECT * FROM log ORDER BY hid
LIMIT 1;", which both EXPLAIN and EXPLAIN ANALYZE show as
Limit / Index Scan on hid_idx. This was very fast before we
started deleting old log entries from the table, but has
started taking an extremely long time, about 341 seconds.

I'm suspecting that you need to REINDEX hid_idx. This is an
aspect of the pre-7.4 "index bloat" problem: the left end of the index
now consists of entirely-empty pages, which not only occupy space but
take time to scan through for a query like this.


Assuming a "normal" usage pattern, regular VACUUMing, and no
instances of corrupted indexes, are there any scenarios in which one
would need to REINDEX either user or system tables post 7.4?

Mike Mascari

Nov 23 '05 #4

Mike Mascari <ma*****@mascari.com> writes:
Assuming a "normal" usage pattern, regular VACUUMing, and no
instances of corrupted indexes, are there any scenarios in which one
would need to REINDEX either user or system tables post 7.4?


Ideally not, but we'll have to wait for more field experience before we
really know whether the existing fix covers all "normal" usage patterns.

regards, tom lane


Nov 23 '05 #5

Gavin,

I have a small off-topic question, because I'm currently trying to find out how many write accesses a "normal" DBMS can handle, to get an idea of what counts as a lot and what counts as a little.

You are talking about 25 million rows per day. Could you give me an idea of your architecture? Is this a number that a usual DBMS can handle, or does it require a special architecture (clustering, blades, etc.)? Do you have any information about what is a lot, what is a little, and how much a DBMS can handle?

Any help would be greatly appreciated.

Thanks,
Tom
Jun 20 '06 #6
