Controlling maximal on-disk size of a table

I have a table that is essentially a log: new entries stream in continually, and from time to time I throw old entries away to keep the table from growing.

I understand that in addition to issuing a
DELETE FROM log WHERE date < xxx
I also need to issue a
VACUUM log
so that new entries can reuse the space of deleted entries.

Now I want to reserve a certain amount of disk storage to hold the log table. So I first let the table (plus its index, its toast table and toast index) grow until it is about to reach the maximum size. Then a daemon continually deletes old entries and vacuums the table so that the on-disk usage stays more or less constant from then on. At least, that is the idea.

Of course I would like to keep as much history as possible, given the available space. The log may be quiet at times and quite busy at others, and the size of the text entries may vary quite a bit.

Now to make a good guess about when to issue the next delete, I need to estimate how much of the on-disk usage is accounted for by deleted entries.
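The decision itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and all three parameters are hypothetical, `free_bytes` is whatever free-space estimate is available, `reserve_bytes` is the headroom one wants to keep for bursts, and `avg_row_bytes` is a rough average on-disk size per log entry.

```python
import math


def rows_to_delete(free_bytes, reserve_bytes, avg_row_bytes):
    """Return how many of the oldest rows to delete so that the estimated
    free space climbs back to the desired reserve (0 if already enough)."""
    if avg_row_bytes <= 0:
        raise ValueError("avg_row_bytes must be positive")
    shortfall = reserve_bytes - free_bytes
    if shortfall <= 0:
        # Enough headroom already: keep as much history as possible.
        return 0
    # Round up so the reserve is actually reached, not just approached.
    return math.ceil(shortfall / avg_row_bytes)


if __name__ == "__main__":
    print(rows_to_delete(free_bytes=10_000, reserve_bytes=50_000, avg_row_bytes=300))
```

Deleting only the shortfall, rather than a fixed batch, keeps the history as long as the space budget allows; the quality of the result depends entirely on how good the free-space estimate is.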

I can of course count the number of entries, estimate the bytes needed for storage by averaging the length of the text column and adding the size of the fixed columns, and compare that to the on-disk size to conclude how much space is still available. As for the index, I assume it has a fixed size per row.
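As a rough illustration of that arithmetic, here is a minimal Python sketch. All names are hypothetical, and the default `tuple_overhead` of 28 bytes is only an assumed per-row figure for the row header and item pointer, not a measured value. The sketch also ignores page fill, alignment padding and TOAST compression, so the result is a ballpark at best.

```python
def estimate_free_bytes(on_disk_bytes, row_count, avg_text_len,
                        fixed_row_bytes, index_bytes_per_row,
                        tuple_overhead=28):
    """Estimate free space as on-disk size minus estimated live data.

    tuple_overhead is an assumed per-row overhead; adjust it to whatever
    matches the actual table.
    """
    per_row = avg_text_len + fixed_row_bytes + tuple_overhead + index_bytes_per_row
    live_bytes = int(row_count * per_row)
    # Never report negative free space if the estimate overshoots.
    return max(0, on_disk_bytes - live_bytes)


if __name__ == "__main__":
    print(estimate_free_bytes(on_disk_bytes=1_000_000, row_count=1000,
                              avg_text_len=200, fixed_row_bytes=20,
                              index_bytes_per_row=40))
```

The row count and the average text length are the expensive inputs here, since obtaining them exactly means scanning the table.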

But these queries are expensive because the log may easily contain millions of entries, with an on-disk size in the range of a few GB, and they must be repeated quite often to prevent sudden bursts of new entries from overflowing the log.

Is there a better way to get at the current "free space" inside of a table/index?

Nov 23 '05 #1
What you're looking for sounds like an extended (destructive) version
of what autovacuum does.

So you might try looking at the statistics tables the way autovacuum does.
I don't know exactly how it does that, but it seems that this would let
you incrementally maintain approximate information about what is
happening to the table.
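One way to read this suggestion, sketched in Python under assumptions: the daemon periodically reads cumulative "rows inserted" and "rows deleted" counters for the table (PostgreSQL's statistics views expose such counters, e.g. n_tup_ins and n_tup_del in pg_stat_user_tables, when row-level statistics are being collected) and keeps a running estimate from the deltas. The class and method names below are hypothetical, and the sketch counts rows, not bytes, so with variable-length text entries it is only approximate.

```python
class ReusableRowTracker:
    """Incrementally track how many vacuumed row slots are still free,
    based on cumulative insert/delete counters read from statistics."""

    def __init__(self, n_ins, n_del):
        self.last_ins = n_ins
        self.last_del = n_del
        self.pending_dead = 0  # deleted rows not yet vacuumed
        self.reusable = 0      # vacuumed slots available to new rows

    def update(self, n_ins, n_del):
        """Feed the latest cumulative counters; return reusable slots."""
        inserted = n_ins - self.last_ins
        deleted = n_del - self.last_del
        self.last_ins, self.last_del = n_ins, n_del
        # Assume new rows fill reusable slots first; any excess grows the table.
        self.reusable = max(0, self.reusable - inserted)
        self.pending_dead += deleted
        return self.reusable

    def vacuumed(self):
        """Call after a VACUUM: dead rows become reusable space."""
        self.reusable += self.pending_dead
        self.pending_dead = 0


if __name__ == "__main__":
    tracker = ReusableRowTracker(n_ins=1000, n_del=0)
    tracker.update(1500, 200)   # 500 inserts, 200 deletes since last check
    tracker.vacuumed()
    print(tracker.update(1550, 200))
```

Reading two counters is cheap compared with scanning millions of rows, which is the appeal of this approach; an occasional exact measurement could be used to correct the drift of the running estimate.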

Good luck,

d.
--
David Helgason,
Business Development et al.,
Over the Edge I/S (http://otee.dk)
Direct line +45 2620 0663
Main line +45 3264 5049

On 16. nov 2004, at 13:21, Nils Rennebarth wrote:

Nov 23 '05 #2
