More than 1024 connections from the same c-backend

Andreas Muck

Hi!

We have an application running on Linux (SuSE 7.2, kernel 2.4.16) that
opens lots of connections to a Postgres database and occasionally dies
with a segfault. Trying to reproduce the crash, I came up with the
following test code:

--------------------- pgsql-test.c ---------------------
#include <stdio.h>
#include <unistd.h>     /* sleep() */
#include <libpq-fe.h>

int main(int argc, char **argv)
{
    PGconn *conn;
    int i;

    /* Connections are deliberately left open (PQfinish is commented
     * out below) so the number of open descriptors keeps growing. */
    for (i = 0; i < 2048; i++)
    {
        conn = PQsetdbLogin("localhost", "5432", NULL, NULL,
                            "template1", "postgres", NULL);

        if (PQstatus(conn) == CONNECTION_BAD)
            printf("%5d: Connection to database FAILED\n", i+1);
        else
            printf("%5d: Connection to database OK\n", i+1);

        /* Slow down shortly before the crash point. */
        if (i > 1010)
        {
            sleep(10);
        }

        // PQfinish(conn);
    }

    // sleep(300);
    return 0;
}
--------------------------------------------------------

The test program segfaults after it opens 1020 connections. At that point it has
exactly 1024 open file descriptors, including stdin, stdout, stderr and
a file descriptor on /proc/sys/kernel/shmmax.

The system limit on open file descriptors is set to 65535 (both ulimit
and /proc/sys/fs/file-max). It's not related to the max-backends limit
in postmaster either. The test program crashes even if postmaster is not
running at all.

The program seems to crash when it returns from pqWaitTimed(). As
pqWaitTimed uses select() to poll the file descriptors, I suppose the
crash is related to the limit of 1024 file descriptors that fd_set can hold.
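
The descriptor limit and the capacity of fd_set are independent, which is easy to check. Here is a minimal sketch, assuming a POSIX system; the helper names are made up for illustration and are not part of libpq:

```c
#include <limits.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper: soft limit on open descriptors for this
 * process, LONG_MAX if unlimited, or -1 if getrlimit() fails. */
long soft_fd_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t) LONG_MAX)
        return LONG_MAX;
    return (long) rl.rlim_cur;
}

/* Hypothetical helper: 1 if the process is allowed to obtain
 * descriptor numbers that an fd_set cannot represent, else 0. */
int limit_exceeds_fd_setsize(void)
{
    return soft_fd_limit() > (long) FD_SETSIZE;
}

/* Print both values side by side. */
void print_fd_limits(void)
{
    printf("soft RLIMIT_NOFILE: %ld\n", soft_fd_limit());
    printf("FD_SETSIZE:         %d\n", (int) FD_SETSIZE);
}
```

Calling print_fd_limits() from main() with a raised ulimit should show a descriptor limit well above FD_SETSIZE, in which case any code path that uses select() is exposed to descriptors it cannot track.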

The weird thing is that it's not the select() that segfaults. The
segfault occurs on return from pqWaitTimed(). It is 100% reproducible
on one machine, but it doesn't crash on another one. GDB can't show a
backtrace from the core file:

(gdb) bt
#0 0x08049ab3 in connectDBComplete ()
Cannot access memory at address 0x1

When stepping through the program in gdb, I can see the "conn" pointer
getting lost on the 1021st connect when pqWaitTimed() returns. So
it looks like the return stack gets corrupted or something like that.

Can anyone confirm this? Am I missing anything here?

Any idea how to get more than 1024 connections with one backend?

Andi

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #1

Andreas Muck

Sorry, libpq was compiled without debug symbols in the previous
email. Here's what gdb shows after compiling postgresql-7.3.4
with --enable-debug:

Core was generated by `./pgsql-test-static'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
#0 0x0804a3f3 in connectDBComplete (conn=???) at fe-connect.c:1124
1124 flag = PQconnectPoll(conn);
(gdb) bt
#0 0x0804a3f3 in connectDBComplete (conn=???) at fe-connect.c:1124
Cannot access memory at address 0x0

Andi

Nov 11 '05 #2

Tom Lane

Andreas Muck <bb*******************@blitztrade.de> writes:

We have an application running on Linux (SuSE 7.2, kernel 2.4.16) that
opens lots of connections to a Postgres database and occasionally dies
with a segfault. The program seems to crash when it returns from pqWaitTimed(). As
pqWaitTimed uses select() to poll the file descriptors, I suppose the
crash is related to the limit of 1024 file descriptors that fd_set can hold.

If that's the size of fd_set on your machine, then yes, this doesn't
surprise me at all. The code that calls select() is no doubt clobbering
some bit beyond the end of the fd_set array.
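
To illustrate the point: FD_SET() does no range checking, so a descriptor number at or above FD_SETSIZE sets a bit outside the fd_set. If the fd_set is a local variable, that bit lands in neighbouring stack data, which would be consistent with the "conn" pointer getting lost on return. The following is only a sketch of a guard; checked_fd_set is a hypothetical name, not something libpq provides:

```c
#include <sys/select.h>

/* Hypothetical helper: add fd to *set only if an fd_set can
 * represent it. Returns 0 on success, -1 if fd is out of range. */
int checked_fd_set(int fd, fd_set *set)
{
    /* fd_set holds descriptor numbers 0 .. FD_SETSIZE - 1 only;
     * anything else would write outside the structure. */
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_SET(fd, set);
    return 0;
}
```

A guard like this turns the memory corruption into a reportable error, but it does not raise the limit; connections whose sockets get numbers of FD_SETSIZE or higher still cannot be waited on with select().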

7.4 is designed to use poll() in preference to select() (if available),
because of previous complaints about exactly this problem. Not sure if
you want to update to 7.4 beta, but you could consider lifting the
pqWait code out of 7.4.
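
For reference, waiting on a single socket with poll() looks roughly like the sketch below. This is not the 7.4 pqWait code, just an illustration of the technique; wait_readable is a hypothetical helper name. poll() takes an array of descriptors rather than a fixed-size bitmap, so the descriptor number itself does not matter:

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper: wait until sock is readable or timeout_ms
 * milliseconds pass. Returns 1 if an event is pending on sock
 * (readable, error or hangup), 0 on timeout, -1 on error. */
int wait_readable(int sock, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = sock;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the call. Note that this
     * simple loop restarts the full timeout on each retry. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    return rc;
}
```

With libpq, the socket to pass in would be the one returned by PQsocket(conn), for example wait_readable(PQsocket(conn), 10000) for a ten second timeout.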

regards, tom lane

Nov 11 '05 #3

Andreas Muck

Tom Lane wrote:

7.4 is designed to use poll() in preference to select() (if available),
because of previous complaints about exactly this problem. Not sure if
you want to update to 7.4 beta, but you could consider lifting the
pqWait code out of 7.4.

I see. I'll check whether upgrading to 7.4 is an option. Thank you for
confirming the problem!

Regards,
Andreas

Nov 11 '05 #4
