
speeding up inserts


First of all, we are still running SAP DB at the moment but are in the process of moving to PostgreSQL, so this seemed like a good place to post this type of question.

In our environment we do transaction processing; each transaction accounts for 10-30 inserts and 3-4 selects. We also have users who work through a management interface, running all sorts of queries on the data once it gets into the db. Most of the user queries are selects, with a few updates and fewer inserts.

The basic problem is that transaction times are very critical; one second is a big deal. The data a transaction inserts into the db does not have to get there instantly; it can be delayed (to a point, anyway). Since there is only so much you can do to speed up inserts, I have been testing a different system for getting the data from the application into the database.

Now what our application does is create the queries as it runs; then, instead of inserting them into the database, it writes them all out to a single file at the end of the transaction. This is a huge performance boost. We then use a separate daemon to run the disk queue once every second and do all the inserts. If for some reason the main application can't write to disk, it falls back to inserting them directly.
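
Roughly, the two pieces look like this (a minimal sketch only; the spool path, table names, and connection details are made up for illustration, not our real ones, and it assumes one statement per line in the spool file).

Application side, at the end of a transaction:

    use strict;
    use warnings;
    use DBI;

    # SQL generated during one transaction; statements and table are illustrative.
    my @queries = (
        "INSERT INTO txn_detail (txn_id, item, amount) VALUES (1001, 'widget', 9.95)",
        "INSERT INTO txn_detail (txn_id, item, amount) VALUES (1001, 'gadget', 4.50)",
    );

    # Append to the spool file; fall back to direct inserts if it can't be written.
    my $spool = '/var/spool/app/insert-queue.sql';
    if (open my $fh, '>>', $spool) {
        print {$fh} "$_;\n" for @queries;
        close $fh;
    }
    else {
        my $dbh = DBI->connect('dbi:Pg:dbname=app', 'app', 'secret',
                               { AutoCommit => 0, RaiseError => 1 });
        $dbh->do($_) for @queries;
        $dbh->commit;
    }

And the daemon that drains it, run once a second:

    use strict;
    use warnings;
    use DBI;

    # Claim the spool file atomically by renaming it, then replay everything
    # in it as a single database transaction.
    my $spool = '/var/spool/app/insert-queue.sql';
    my $work  = "$spool.work";
    exit 0 unless -s $spool && rename($spool, $work);

    my $dbh = DBI->connect('dbi:Pg:dbname=app', 'app', 'secret',
                           { AutoCommit => 0, RaiseError => 1 });
    open my $in, '<', $work or die "cannot read $work: $!";
    while (my $sql = <$in>) {
        chomp $sql;
        $sql =~ s/;\s*\z//;
        $dbh->do($sql) if length $sql;
    }
    close $in;
    $dbh->commit;        # every queued insert lands in one commit
    unlink $work;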

Is this a crazy way to handle this? No matter what I have tried, opening a file on disk and writing a single line to it is way faster than any database I have used. I even tried using BerkeleyDB as the queue instead of the flat file, but that wasn't a whole lot faster than using the cached database handles (our application runs under mod_perl).

Chris


Nov 12 '05 #1

On Dec 31, 2003, at 14:16, Chris Ochs wrote:
Now what our application does is create the queries as it runs; then, instead of inserting them into the database, it writes them all out to a single file at the end of the transaction. This is a huge performance boost. We then use a separate daemon to run the disk queue once every second and do all the inserts. If for some reason the main application can't write to disk, it falls back to inserting them directly.

Is this a crazy way to handle this? No matter what I have tried, opening a file on disk and writing a single line to it is way faster than any database I have used. I even tried using BerkeleyDB as the queue instead of the flat file, but that wasn't a whole lot faster than using the cached database handles (our application runs under mod_perl).


In my application, I've built a "TransactionPipeline" class that queues up transactions for asynchronous storage. It made an incredible difference in transaction processing speed in places where the individual transaction isn't critical (the atomicity is important, but the main place this is used is in transactions that record the state of a device, for example).

Conceptually, it's somewhat similar to yours. The thread that runs the queue is triggered by the addition of a new transaction through a normal notification mechanism, so it's basically idle unless something is going on; if there's a lot going on, the queue simply builds up until the incoming rate drops below the rate at which we can actually process it.
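
In Perl terms the idea is roughly the following (just a sketch of the pattern, not our actual code; the DSN, credentials, and SQL are invented):

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;
    use DBI;

    # One shared queue: request handlers enqueue finished SQL, a single
    # background thread drains it into the database.
    my $queue = Thread::Queue->new();

    my $writer = threads->create(sub {
        my $dbh = DBI->connect('dbi:Pg:dbname=app', 'app', 'secret',
                               { AutoCommit => 0, RaiseError => 1 });
        # dequeue() blocks, so the writer sleeps until something is enqueued
        while (defined(my $sql = $queue->dequeue())) {
            $dbh->do($sql);
            $dbh->commit;
        }
        $dbh->disconnect;
    });

    # Producer side: hand the statement off and return immediately.
    $queue->enqueue(
        "INSERT INTO device_state (device_id, state) VALUES (42, 'online')");

    $queue->enqueue(undef);   # undef tells the writer thread to exit
    $writer->join;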

It's been the only thing that's kept our application running (against
MS SQL Server until we can throw that away in favor of postgres).

--
Dustin Sallings <du****@spy.net>

Nov 12 '05 #2
"Chris Ochs" <ch***@paymentonline.com> writes:
Is this a crazy way to handle this?


Depends. Do you care if you lose that data (if the system crashes
before your daemon can insert it into the database)? I think the
majority of the win you are seeing comes from the fact that the data
doesn't actually have to get to disk --- your "write to file" never
gets further than kernel disk buffers in RAM.
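
To make that file write as durable as a committed database transaction, you would have to flush and fsync it yourself, and that is where the time actually goes. A tiny illustration (the path and statement are arbitrary):

    use strict;
    use warnings;
    use IO::Handle;

    open my $fh, '>>', '/var/spool/app/insert-queue.sql' or die $!;
    print {$fh} "INSERT INTO txn_detail (txn_id, item) VALUES (1001, 'widget');\n";
    $fh->flush;   # push perl's buffer into the kernel
    $fh->sync;    # fsync(2): force the kernel buffers onto disk --
                  # the expensive step the plain file write skips
    close $fh;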

I would think that you could get essentially the same win by aggregating
your database transactions into bigger ones. From a reliability point
of view you're doing that anyway --- whatever work the daemon processes
at a time is the real transaction size.
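
For example, executing all of a transaction's inserts under a single commit looks something like this in DBI (a sketch only; the DSN and table are hypothetical):

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=app', 'app', 'secret',
                           { AutoCommit => 0, RaiseError => 1 });

    my $sth = $dbh->prepare(
        'INSERT INTO txn_detail (txn_id, item, amount) VALUES (?, ?, ?)');

    # The 10-30 inserts from one application transaction become a single
    # database transaction: one commit (and one fsync) for the whole batch.
    my @rows = ([1001, 'widget', 9.95], [1001, 'gadget', 4.50]);
    $sth->execute(@$_) for @rows;
    $dbh->commit;
    $dbh->disconnect;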

regards, tom lane


Nov 12 '05 #3
"Chris Ochs" <ch***@paymentonline.com> writes:
Is this a crazy way to handle this?


Depends. Do you care if you lose that data (if the system crashes
before your daemon can insert it into the database)? I think the
majority of the win you are seeing comes from the fact that the data
doesn't actually have to get to disk --- your "write to file" never
gets further than kernel disk buffers in RAM.

I would think that you could get essentially the same win by aggregating
your database transactions into bigger ones. From a reliability point
of view you're doing that anyway --- whatever work the daemon processes
at a time is the real transaction size.

regards, tom lane


The transactions are as big as they can be; all the data is committed at once. I'm guessing that for any database to be as fast as I want, it just needs bigger/better hardware, which isn't an option at the moment.

I was also thinking about data loss with the disk queue. Right now the risk is very small, but as we do more transactions it gets bigger. So right now, yes, it's an acceptable risk given the chance of it happening and what a worst-case scenario would look like, but at some point it wouldn't be.

Chris



Nov 12 '05 #4


