Indexes and Tables: Growth and Treatment (Modified by Thomas F. O'Connell)

Thomas F. O'Connell

I'm helping manage a postgres installation that continually consumes a
considerable amount of disk space, and I'm hoping to learn a bit more
about both treating the symptoms and addressing the causes.

Here are the basics:

It's a pg 7.4.1 installation on a Debian stable GNU/Linux 2.6.2 box
with 4 GB of RAM, four 2.4 GHz processors, and 36 GB of disk space.

There are thousands of tables, many of which are object-relational
(i.e., many are subclasses of sets of top-level tables). There are
indexes in place for joins that apply to many of the columns in the
subclassed tables.

It's a high turnover database, in that the applications that use it
perform thousands of inserts, updates, and deletes on a daily basis.

We're seeing about 5-10 GB of increased disk space used on a daily
basis if a vacuum (full) or reindexdb is not performed. We were doing
one vacuum analyze full a week with nightly vacuum analyzes. We began
manually reindexing the worst offenders once we passed 50% disk usage
regularly.
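
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the full 36 GB is available to the database, that usage sits at exactly 50%, and that growth is linear at the reported 5-10 GB per day. All three are simplifications, and the function name is purely illustrative.

```python
DISK_GB = 36.0                   # total disk space, from the post
USED_FRACTION = 0.50             # assumed: usage at the 50% mark mentioned above
GROWTH_GB_PER_DAY = (5.0, 10.0)  # reported daily growth range

def days_until_full(disk_gb, used_fraction, growth_gb_per_day):
    """Days of headroom left, assuming linear growth (a simplification)."""
    free_gb = disk_gb * (1.0 - used_fraction)
    return free_gb / growth_gb_per_day

slow, fast = (days_until_full(DISK_GB, USED_FRACTION, g) for g in GROWTH_GB_PER_DAY)
print(f"{fast:.1f} to {slow:.1f} days of headroom")  # 1.8 to 3.6 days of headroom
```

Under those assumptions, skipping maintenance for even a few days would exhaust the remaining space.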

So here are my questions:

1. Is adding reindexdb to cron to reindex the entire database nightly
overkill?

2. If we turn on pg_autovacuum and leave in place one weekly vacuum
full, is that a reasonable strategy?

3. Otherwise, is it better in general to vacuum prior to reindexing?

4. What are the best places to look for causes of the velocity of
growth?

Thanks!

-tfo
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #1

Tom Lane

"Thomas F. O'Connell" <tf*@sitening.com> writes:

> It's a high turnover database, in that the applications that use it
> perform thousands of inserts, updates, and deletes on a daily basis.
>
> We're seeing about 5-10 GB of increased disk space used on a daily
> basis if a vacuum (full) or reindexdb is not performed. We were doing
> one vacuum analyze full a week with nightly vacuum analyzes.

Try hourly vacuums. If that doesn't stem the tide, make it more often
(or try autovacuum). Also make sure that your FSM settings are large
enough; if they're not then no amount of plain vacuuming will keep you
out of trouble.

With sufficiently frequent plain vacuums you really shouldn't need
vacuum full at all.
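
For illustration, an hourly plain vacuum could be scheduled from cron along these lines. This is only a sketch: the helper names are hypothetical, it assumes `vacuumdb` is on the cron user's PATH and can connect without a password prompt, and the schedule is an example, not a recommendation. The `--all`, `--analyze`, and `--full` switches are standard `vacuumdb` options.

```python
import shlex

def vacuumdb_command(analyze=False, full=False):
    """Build a vacuumdb invocation covering every database (hypothetical helper)."""
    cmd = ["vacuumdb", "--all"]
    if analyze:
        cmd.append("--analyze")
    if full:
        cmd.append("--full")
    return cmd

def crontab_line(schedule, cmd):
    """Format a crontab entry from a five-field schedule and a command list."""
    return f"{schedule} {shlex.join(cmd)}"

# Plain (non-full) vacuum at the top of every hour.
hourly = crontab_line("0 * * * *", vacuumdb_command())
print(hourly)  # 0 * * * * vacuumdb --all
```

Making the interval shorter only means changing the schedule field, e.g. `*/15 * * * *` for every 15 minutes.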

I can't recommend an analyze frequency on what you've told us.

regards, tom lane

Nov 23 '05 #2

Thomas F. O'Connell

On Jul 13, 2004, at 6:58 PM, Tom Lane wrote:

> Try hourly vacuums. If that doesn't stem the tide, make it more often
> (or try autovacuum).

I will try autovacuum.

> Also make sure that your FSM settings are large
> enough; if they're not then no amount of plain vacuuming will keep you
> out of trouble.

I was just reading up on FSM settings today. In fact, here's the output
of a recent VACUUM VERBOSE:

INFO: free space map: 1000 relations, 11599 pages stored; 100064 total
pages needed
DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 178 kB
shared memory.

So clearly we need to increase max_fsm_pages. How is this related to
vacuuming? And is it related at all to index growth?
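
The figures in that output can be compared mechanically. The sketch below is an assumption-laden illustration: it parses the two lines quoted above (rejoined where they were wrapped) with regular expressions written for this exact wording, and the function and key names are made up for the example.

```python
import re

# Output quoted in the message above, with the wrapped lines rejoined.
VACUUM_OUTPUT = (
    "INFO: free space map: 1000 relations, 11599 pages stored; "
    "100064 total pages needed\n"
    "DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 178 kB "
    "shared memory.\n"
)

def fsm_summary(text):
    """Extract free space map figures from VACUUM VERBOSE output (hypothetical helper)."""
    info = re.search(
        r"free space map: (\d+) relations, (\d+) pages stored; "
        r"(\d+) total pages needed",
        text,
    )
    detail = re.search(r"Allocated FSM size: (\d+) relations \+ (\d+) pages", text)
    if info is None or detail is None:
        raise ValueError("free space map summary not found")
    relations_used, pages_stored, pages_needed = (int(g) for g in info.groups())
    relations_allocated, pages_allocated = (int(g) for g in detail.groups())
    return {
        "relations_used": relations_used,
        "relations_allocated": relations_allocated,
        "pages_stored": pages_stored,
        "pages_needed": pages_needed,
        "pages_allocated": pages_allocated,
        # Extra page slots needed beyond the current allocation.
        "page_shortfall": max(0, pages_needed - pages_allocated),
        # True when every allocated relation slot is already in use.
        "relations_at_limit": relations_used >= relations_allocated,
    }

summary = fsm_summary(VACUUM_OUTPUT)
print(summary["page_shortfall"])      # 80064
print(summary["relations_at_limit"])  # True
```

Here the shortfall is 100064 - 20000 = 80064 pages, and all 1000 allocated relation slots are in use, which on a database with thousands of tables suggests max_fsm_relations may also be too small.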

> With sufficiently frequent plain vacuums you really shouldn't need
> vacuum full at all.

So is the only benefit of vacuum full the aggressive disk-space
compaction it performs? Is there any point at which the extra compacting
actually results in a performance enhancement?

> I can't recommend an analyze frequency on what you've told us.

What more information would you need to make a recommendation?

Thanks for all the tips!

-tfo

Nov 23 '05 #3

Thomas F. O'Connell

Tom,

If I've got the RAM, should I have max_fsm_relations be large enough to
cover _all_ user tables and indexes?

Thanks!

-tfo

On Jul 13, 2004, at 6:58 PM, Tom Lane wrote:

> Try hourly vacuums. If that doesn't stem the tide, make it more often
> (or try autovacuum). Also make sure that your FSM settings are large
> enough; if they're not then no amount of plain vacuuming will keep you
> out of trouble.

Nov 23 '05 #4
