
backend exit mystery

I have a libpq client program that repeatedly connects to a DB, queries, and
then disconnects. After a seemingly random number of such successful
sessions (sometimes 30, sometimes hundreds), the backend mysteriously exits
after the client calls PQsetdbLogin(), and the client hangs. Any clues?
Details below...

Client: C program linked with 7.2.1 libpq on HP-UX B.11.00 E 9000/803.
Server: PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96.

Client code snippet:

 1  if (text_db_conn == NULL || PQstatus(text_db_conn) != CONNECTION_OK) {
 2      if (text_db_conn != NULL) PQfinish(text_db_conn);
 3      fprintf(stderr, "Connecting to DB...\n");
 4      fflush(stderr);
 5      text_db_conn = PQsetdbLogin(IP, PORT, NULL, NULL,
 6                                  "mydb", "myuser", NULL);
 7      if (PQstatus(text_db_conn) == CONNECTION_BAD) {
 8          fprintf(stderr, "Connection attempt failed.\n");
 9      } else {
10          fprintf(stderr, "Connected.\n");
11      }
12  }

Client hangs after line 5. Client backtrace when hanging:

(gdb) bt
#0 0x20da28 in _select_sys ()
#1 0x1ec788 in select ()
#2 0xb9818 in pqWait ()
#3 0x4000f0e0 in __d_trap_fptr ()
#4 0x1ec788 in select ()
Error accessing memory address 0xffffffbf: Bad address.
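
The backtrace above shows the client blocked in select() underneath pqWait(). As a rough illustration only (this is not libpq's code, and wait_readable() and demo_idle_pipe() are hypothetical names), the sketch below shows the difference between calling select() with a NULL timeout, which can block forever if the peer never answers, and calling it with a bounded timeout, which returns 0 once the time is up.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until fd is readable or timeout_sec seconds have passed.
 * Returns 1 if readable, 0 on timeout, -1 on error (including EINTR).
 * A negative timeout_sec means "wait forever": select() then gets a
 * NULL timeout, which is the kind of unbounded wait the backtrace
 * suggests (an assumption, not something read from libpq's source).
 */
static int wait_readable(int fd, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    struct timeval *tvp = NULL;   /* NULL makes select() block indefinitely */
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    if (timeout_sec >= 0) {
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        tvp = &tv;
    }

    rc = select(fd + 1, &rfds, NULL, NULL, tvp);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Demo on a local pipe: nothing has been written, so a 0-second wait times out. */
static int demo_idle_pipe(void)
{
    int fds[2];
    int rc;

    if (pipe(fds) != 0)
        return -1;
    rc = wait_readable(fds[0], 0);
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

A caller that gets 0 back can give up and report a timeout instead of hanging; passing a negative timeout reproduces the wait-forever behaviour.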

Server log (with server_min_messages = debug5) shows:

2003-10-10 17:04:01 [28501] DEBUG: BackendStartup: forked pid=20296 socket=8
2003-10-10 17:04:01 [20296] LOG: connection received: host=10.0.1.1 port=61438
2003-10-10 17:05:34 [28501] DEBUG: reaping dead processes
2003-10-10 17:05:34 [28501] DEBUG: child process (pid 20296) exited with exit code 0

I attached to this backend before it exited, and got this backtrace:

(gdb) bt
#0 0x420e8182 in recv () from /lib/i686/libc.so.6
#1 0x081115c8 in secure_read (port=0x82be9f0, ptr=0x826f100, len=8192) at be-secure.c:301
#2 0x08115322 in pq_recvbuf () at pqcomm.c:463
#3 0x08115439 in pq_getbytes (s=0xbfffdd70 "Ho*\b8?*\b8???d\210\024", len=4) at pqcomm.c:538
#4 0x081472fa in ProcessStartupPacket (port=0x82be9f0, SSLdone=0 '\0') at postmaster.c:1094
#5 0x08148914 in DoBackend (port=0x82be9f0) at postmaster.c:2178
#6 0x081483ff in BackendStartup (port=0x82be9f0) at postmaster.c:1924
#7 0x081471f1 in ServerLoop () at postmaster.c:1027
#8 0x08146be6 in PostmasterMain (argc=4, argv=0x82a5ae8) at postmaster.c:788
#9 0x081160dc in main (argc=4, argv=0xbfffe8b4) at main.c:210
#10 0x42017499 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) p debug_query_string
$1 = 0x0
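
In the backend backtrace above, ProcessStartupPacket() is asking pq_getbytes() for len=4 and the process is sitting in recv(), so the backend appears to be waiting for the very first bytes of the startup packet. As far as I know those four bytes are a length word in network byte order, but treat that as an assumption. The sketch below is not PostgreSQL's code (read_exact() and read_length_word() are hypothetical names); it only illustrates how such a read blocks until the client sends something and how a closed connection shows up as EOF.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly len bytes from a socket.
 * Returns 0 on success, 1 if the peer closed the connection before
 * len bytes arrived (EOF), and -1 on any other error.
 * recv() blocks for as long as the peer sends nothing at all.
 */
static int read_exact(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);

    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            return 1;               /* orderly shutdown by the peer */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/* Read a 4-byte length word and convert it from network byte order. */
static int read_length_word(int fd, uint32_t *out)
{
    uint32_t netlen;
    int rc = read_exact(fd, &netlen, sizeof netlen);

    if (rc == 0)
        *out = ntohl(netlen);
    return rc;
}
```

With this shape of reader, a client that completes the TCP connect but never delivers its first packet leaves the server side parked in recv(), which is consistent with what the backtrace shows.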
TIA.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 12 '05 #1
6 Replies


On Fri, 10 Oct 2003, Ed L. wrote:
> I have a libpq client program that repeatedly connects to a DB, queries, and
> then disconnects. After a seemingly random number of such successful
> sessions (sometimes 30, sometimes hundreds), the backend mysteriously exits
> after the client calls PQsetdbLogin(), and the client hangs. Any clues?
> Details below...
>
> Client: C program linked with 7.2.1 libpq on HP-UX B.11.00 E 9000/803.
> Server: PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96.


How's the memory situation on the server box?
Don't forget Linux has that GREAT feature that randomly kills processes
when memory is tight... Perhaps there's something in syslog.
--
Jeff Trout <je**@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Nov 12 '05 #2

On Saturday October 11 2003 9:00, Jeff wrote:
> On Fri, 10 Oct 2003, Ed L. wrote:
> > I have a libpq client program that repeatedly connects to a DB, queries,
> > and then disconnects. After a seemingly random number of such
> > successful sessions (sometimes 30, sometimes hundreds), the backend
> > mysteriously exits after the client calls PQsetdbLogin(), and the
> > client hangs. Any clues? Details below...
> >
> > Client: C program linked with 7.2.1 libpq on HP-UX B.11.00 E 9000/803.
> > Server: PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96.
>
> How's the memory situation on the server box?
> Don't forget Linux has that GREAT feature that randomly kills processes
> when memory is tight... Perhaps there's something in syslog.


Hmmm... it's quite repeatable, though not easy to predict after how many
sessions, so I don't think it's that. Happens whether memory is tight or
not.


Nov 12 '05 #3

On Friday October 10 2003 4:46, Ed L. wrote:
> I have a libpq client program that repeatedly connects to a DB, queries,
> and then disconnects. After a seemingly random number of such successful
> sessions (sometimes 30, sometimes hundreds), the backend mysteriously
> exits after the client calls PQsetdbLogin(), and the client hangs. Any
> clues? Details below...
>
> Client: C program linked with 7.2.1 libpq on HP-UX B.11.00 E 9000/803.
> Server: PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.96.


Still looking for clues as to the cause of this repeatable connection
failure. Passing an explicit connection timeout to PQconnectdb() escapes
from any long hangs, but the hanging is still an issue I'd like to
understand. Attached is a small C program that reliably reproduces this
problem on the setup above. I added an explicit timeout to PQconnectdb()
to wait only 30 seconds. I'm curious to know if anyone can easily repeat
the problem (careful, it will generate a bit of traffic, CPU load, and run
forever). My last example run showed 17 timeouts seemingly randomly
dispersed among 5000 consecutive connection attempts.
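
The attached program is not reproduced in this thread. Purely as a guess at its general shape, the sketch below runs a connect/disconnect cycle in a loop and counts the failures. The names attempt_fn, run_stats, run_attempts(), pq_attempt() and fake_attempt() are all made up for illustration. The libpq part is compiled only when USE_LIBPQ is defined and uses PQconnectdb(), PQstatus(), PQerrorMessage() and PQfinish(); the timeout is assumed to be passed as the connect_timeout conninfo keyword, which needs a libpq new enough to support it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef USE_LIBPQ
#include <libpq-fe.h>
#endif

/* Hypothetical callback: returns 0 if one connect/disconnect cycle worked. */
typedef int (*attempt_fn)(void *arg);

struct run_stats {
    long attempts;
    long failures;
};

/* Run n cycles back to back and count how many of them failed. */
static struct run_stats run_attempts(attempt_fn attempt, void *arg, long n)
{
    struct run_stats st = {0, 0};
    long i;

    assert(attempt != NULL);

    for (i = 0; i < n; i++) {
        st.attempts++;
        if (attempt(arg) != 0) {
            st.failures++;
            fprintf(stderr, "attempt %ld failed\n", i + 1);
        }
    }
    return st;
}

#ifdef USE_LIBPQ
/* One libpq cycle; arg is a conninfo string that includes connect_timeout. */
static int pq_attempt(void *arg)
{
    PGconn *conn = PQconnectdb((const char *) arg);
    int failed = (PQstatus(conn) != CONNECTION_OK);

    if (failed)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    PQfinish(conn);
    return failed;
}
#endif

/* Stand-in for builds without libpq: pretends every 7th attempt fails. */
static int fake_attempt(void *arg)
{
    long *calls = arg;

    (*calls)++;
    return (*calls % 7 == 0) ? 1 : 0;
}
```

Built with -DUSE_LIBPQ and linked against libpq, a run comparable to the one described would be something like run_attempts(pq_attempt, conninfo, 5000), where conninfo is a string along the lines of "host=<IP> port=<PORT> dbname=mydb user=myuser connect_timeout=30" with the real host and port filled in.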

The server has plenty of available memory and is a dual-processor machine
running Linux 2.4.18-3smp. I tried to catch snapshots of the Recv-Q and
Send-Q sizes on the server from netstat during a hang. That's a little iffy
given the timing of grepping netstat output, but it seems the server's
Recv-Q's were always zero and the Send-Q's were occasionally in the tens
(bytes?).

TIA for any help.

Nov 12 '05 #4


Ed Loehr <ed@LoehrTech.com> writes:
> Attached is a small C program that reliably reproduces this
> problem on the setup above. I added an explicit timeout to PQconnectdb()
> to wait only 30 seconds. I'm curious to know if anyone can easily repeat
> the problem (careful, it will generate a bit of traffic, CPU load, and run
> forever). My last example run showed 17 timeouts seemingly randomly
> dispersed among 5000 consecutive connection attempts.


I tried to duplicate the problem, without success --- 20000 connection
attempts without failure. Setup is HPUX 10.20 client, RHL8 server
(2.4.18-24.8.0 kernel); but it's a single-processor machine, not dual as
in your example. I was using CVS-tip PG sources, also.

regards, tom lane


Nov 12 '05 #6

