473,395 Members | 1,653 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

vacuumdb hanging database cluster

When I run:
vacuumdb --full --all --analyze --quiet
on my database cluster it will complete in < 2 minutes (this cluster
is a few million total rows and ~2GB).

After testing, I set this up as an off-hours cron job and it worked
fine for several days then hung the whole database. After my pager
pulled me from bed I found the vacuumdb process still running but the
vacuum process on the first database (alphabetically) was showing it
was waiting:
postgres: vacuumdb --full --all --analyze --quiet
postgres: postgres firstdb [local] VACUUM waiting

A couple hundred processes were showing as "startup waiting" and one
was "idle in transaction". The process in the "VACUUM waiting" state
was the only one connected to that database - all other connections
were to other databases.

CPU and disk utilization were essentially zero. Suspecting a lock
problem I attempted to use a pre-existing connection to view pg_locks
but it would not respond.

I killed the vacuum process and all the processes in the "waiting"
states cleared within a second or two and system returned to normal.
The pg_locks query also returned but showed no useful info.

I tracked down the process that was "idle in transaction" and it was a
pg_dump process running on another machine. This process does a
periodic dump of one very small table and should complete in a
fraction of a second but was still waiting since the previous day -
apparently without deleterious effects.

There was no useful info in the log.

I've stopped running the vacuum full job via cron till I can trust it.
Any ideas on how to track/prevent this behavior? Server is version
7.4.1 and my web searches have proved futile.

Cheers,
Steve
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #1
3 2796
Steve Crawford <sc*******@pinpointresearch.com> writes:
A couple hundred processes were showing as "startup waiting" and one
was "idle in transaction". The process in the "VACUUM waiting" state
was the only one connected to that database - all other connections
were to other databases.
I suspect what must have happened is that the vacuum process was trying
to vacuum one of the shared catalogs (pg_database or pg_shadow), and was
blocked trying to get exclusive lock because someone else (the "idle in
transaction" guy) was holding some lock on that table. At this point
all incoming connections, to any database, will block behind the VACUUM
until the idle guy closes his transaction and thereby releases his lock.
I tracked down the process that was "idle in transaction" and it was a
pg_dump process running on another machine. This process does a
periodic dump of one very small table and should complete in a
fraction of a second but was still waiting since the previous day -
apparently without deleterious effects.
What was it waiting on? Since it was idle instead of waiting, the
problem must have been on the client side. I've not heard of pg_dump
just going to sleep for no reason...
I've stopped running the vacuum full job via cron till I can trust it.
Any ideas on how to track/prevent this behavior? Server is version
7.4.1 and my web searches have proved futile.


My recommendation would be to lose the --full. If you're doing
sufficiently frequent vacuuming you have no need for that, and getting
rid of it means vacuum doesn't take exclusive table locks. That means
it will neither block nor be blocked by ordinary readers and writers.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #2
On Monday 26 July 2004 2:18 pm, Tom Lane wrote:
Steve Crawford <sc*******@pinpointresearch.com> writes:
A couple hundred processes were showing as "startup waiting" and
one was "idle in transaction". The process in the "VACUUM
waiting" state was the only one connected to that database - all
other connections were to other databases.


I suspect what must have happened is that the vacuum process was
trying to vacuum one of the shared catalogs (pg_database or
pg_shadow), and was blocked trying to get exclusive lock because
someone else (the "idle in transaction" guy) was holding some lock
on that table. At this point all incoming connections, to any
database, will block behind the VACUUM until the idle guy closes
his transaction and thereby releases his lock.


I agree that it has the "smell" of a system-level lock - I just
couldn't get any info from pg_locks till it was cleared.
I tracked down the process that was "idle in transaction" and it
was a pg_dump process running on another machine. This process
does a periodic dump of one very small table and should complete
in a fraction of a second but was still waiting since the
previous day - apparently without deleterious effects.


What was it waiting on? Since it was idle instead of waiting, the
problem must have been on the client side. I've not heard of
pg_dump just going to sleep for no reason...


Beats the heck out of me. We periodically dump some selected small
tables via a script using:
pg_dump -i -h $dbhost -U $dbuser -t $dumptable > dumpfile

It's very vanilla and generally works fine but sometimes (perhaps 1
per 1000+ runs) ends up idle in transaction. I'm going to take a much
closer look at pg_locks next time it happens.

The -i is because pg_dump on the client machine is 7.4.2 and the
server is 7.4.1 but that doesn't seem to be a problem.
I've stopped running the vacuum full job via cron till I can
trust it. Any ideas on how to track/prevent this behavior? Server
is version 7.4.1 and my web searches have proved futile.


My recommendation would be to lose the --full. If you're doing
sufficiently frequent vacuuming you have no need for that, and
getting rid of it means vacuum doesn't take exclusive table locks.
That means it will neither block nor be blocked by ordinary readers
and writers.


Yes, that's where I've headed. I'll save the occasional full vacuum
for manual running when I can watch it.

Thanks,
Steve
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #3
Steve Crawford <sc*******@pinpointresearch.com> writes:
I tracked down the process that was "idle in transaction" and it
was a pg_dump process running on another machine.
What was it waiting on?

Beats the heck out of me. We periodically dump some selected small
tables via a script using:
pg_dump -i -h $dbhost -U $dbuser -t $dumptable > dumpfile It's very vanilla and generally works fine but sometimes (perhaps 1
per 1000+ runs) ends up idle in transaction. I'm going to take a much
closer look at pg_locks next time it happens.


If it is "idle in transaction" and not "<something> waiting" then it is
not blocked waiting for someone's lock, so pg_locks is unlikely to yield
the answer. I think it's got to be something funny on the pg_dump side.
Or maybe a communications problem?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 23 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

68
by: rkusenet | last post by:
http://www.eweek.com/article2/0,1759,1820667,00.asp The database market grew by 10.3 percent in 2004, fueled largely by hunger for business intelligence and analytics, according to numbers...
4
by: Keary Suska | last post by:
I received the following errors from an automated full vacuum: vacuumdb: vacuuming of database "milemgr" failed: ERROR: tuple concurrently updated ERROR: Vacuum command failed: Inappropriate...
22
by: Lonni Friedman | last post by:
Greetings, I've got an annoying problem. I'm currently running PostgreSQL-7.3.4 on Linux (x86). This problem started with 7.3.3. I've got a database that is on the larger side (about 3GB dump)....
7
by: Randy Yates | last post by:
I'm a complete newbie to postgres so please look the other way if these questions are really stupid. Is it legitimate to have one database per data file? For organizational and backup purposes,...
1
by: bmt | last post by:
I intend to set up a large database with PostgreSQL: about 1.3 TB, very basic queries (but 1500 per second at peak charge), and most probably only a single one index. The overall application, which...
3
by: Bruno LIVERNAIS | last post by:
Hi, We are currently installing a DB2 V9 ESE on a Linux server (RHEL4U4-x86_64). Installation runs successfully on each node. Database user environment is OK and the instance is well created. To...
0
by: Tony Barker | last post by:
Helpful associates, We are going to acquire and new database server that is intel based and we want to have some failover capabilities. Our question now is should we go with Linux based cluster...
5
by: Virendra | last post by:
I need to figure out programmatically if an instance is a cluster or non-cluster. Also, if it is a cluster what are nodes and associated cluster directories in the instance. All I know is the...
3
by: Mark D Powell | last post by:
I attempted to upgrade my SQL Server 2000 Enterprise Edition (32 bit) to SQL Severer 2005 EE 32 bit and while the prereqs passed the installer will not let me select the version 8.00.x (2000 SP4)...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.