Need help with bug (glibc, linux, malloc-related)

I have a long-running program that eventually crashes when valloc()
returns 0. This program is relatively non-trivial, as it's written in
Ada, is multithreaded, and has a lot of SSE routines. A memory leak would
be the most obvious cause, but this appears to be more sinister than a
simple memory leak.

After a lot of running around and searching through the code, I found an
anomaly that I'd like to understand, in case it's the cause of
valloc() returning 0. It may be unrelated to my problem above, but I
can't be sure. I've recreated this anomaly in a very simple program.

Basically, mallinfo() seems to produce garbage results in
multithreaded code. In a very simple program where I fire up 2 pthreads
and have them malloc() and free() a bunch of stuff, once all the threads
are finished, I print out malloc_stats() and mallinfo() and I seem to
get garbage for the mmap()-related fields.

Most of the time I run the code, the hblks and hblkhd fields of
mallinfo() come back 0 and 0, but a fair percentage of the time I get
a strange answer where hblks is either 2, 5, -3 or -1 or something
like that. It's almost like there's a race condition inside the
malloc()/free() code that updates these fields.

This is out-of-the-box Ubuntu with gcc 4.1.2

I've included the code at the bottom, but here is an example output:
Arena 0:
system bytes = 135168
in use bytes = 288
Arena 1:
system bytes = 135168
in use bytes = 1128
Total (incl. mmap):
system bytes = 4045234176
in use bytes = 4044965256
max mmap regions = 1
max mmap bytes = 250003456
hblks : -1 hblkshd : -250003456
The mmap() and hblk data (from mallinfo()) seem totally corrupted to
me; in this particular case, they've gone negative. In this code, the
"answer" should be 0 since everything has been freed, should it not?
Are these numbers supposed to be meaningful?
(This code has a pretty large malloc, but I get similar results with
more reasonably sized mallocs, like 10 megs.)
---------------------------------
Built with:
gcc main.c -lpthread

Here is the code I'm running:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <malloc.h>

/* pthread start routines take a void * and return a void * */
void *detection_thread(void *arg);

int main(void)
{
    pthread_t thread1, thread2;
    struct mallinfo mi;

    // spawn threads
    pthread_create(&thread1, NULL, detection_thread, NULL);
    pthread_create(&thread2, NULL, detection_thread, NULL);

    // wait for threads to return
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    printf("********************************\n");
    malloc_stats();
    mi = mallinfo();
    printf("hblks : %d hblkshd : %d\n", mi.hblks, mi.hblkhd);
    return 0;
}

void *detection_thread(void *arg)
{
    int *slappy;
    int i;

    (void) arg;   /* unused */

    // allocate and immediately free a large block, many times over
    for (i = 0; i < 5000; i++)
    {
        slappy = malloc(1000*1000*250);

        if (slappy == NULL)
        {
            printf("CRASH\n");
            exit(1);
        }
        free(slappy);
    }

    printf("Done!\n");
    return NULL;
}

Feb 5 '07 #1
In article <11*********************@h3g2000cwc.googlegroups.com>,
Louis B. (ldb) <ld********@hotmail.com> wrote:
>I have a long-running program that eventually crashes when valloc()
>returns 0. This program is relatively non-trivial, as it's written in
>Ada, is multithreaded, and has a lot of SSE routines. A memory leak would
>be the most obvious cause, but this appears to be more sinister than a
>simple memory leak.

None of valloc(), Ada, multithreading, or SSE are defined by the C
language, so your problem is well beyond the scope of comp.lang.c.

A more general Linux programming newsgroup might be better able to answer
your question.
dave

--
Dave Vandervies dj******@csclub.uwaterloo.ca
Biblethumper (n): Someone who ought to try opening it up and reading it
for comprehension for a change, already.
--Shamelessly Stolen From Anthony de Boer in the scary devil monastery
Feb 5 '07 #2
"Louis B. (ldb)" <ld********@hotmail.com> writes:
> I have a long-running program that eventually crashes when valloc()
> returns 0. This program is relatively non-trivial, as it's written in
> Ada, is multithreaded, and has a lot of SSE routines. A memory leak would
> be the most obvious cause, but this appears to be more sinister than a
> simple memory leak.

comp.lang.c is not the right place to submit a glibc bug report.
I'd suggest a GNU newsgroup or mailing list instead.
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
Feb 5 '07 #3
"Louis B. (ldb)" wrote:
>
> .... snip ...
>
> Basically, mallinfo() seems to produce garbage results in
> multithreaded code. In a very simple program where I fire up 2 pthreads
> and have them malloc() and free() a bunch of stuff, once all the threads
> are finished, I print out malloc_stats() and mallinfo() and I seem to
> get garbage for the mmap()-related fields.

This is all OT for c.l.c and you should try a newsgroup dedicated
to your system and/or threads. However ...

Basically, malloc (and mallinfo) are running in user space. If
those are true threads, as opposed to full processes with separate
data areas, of course the system will get confused. You need to
protect all access to the malloc and mallinfo packages with
suitable constructs, such as semaphores, monitors, etc. You can
see one implementation of both packages (for DJGPP - these things
are system specific) as nmalloc at:

<http://cbfalconer.home.att.net/download/>

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews

Feb 5 '07 #4
On Feb 5, 5:10 pm, CBFalconer <cbfalco...@yahoo.com> wrote:
> "Louis B. (ldb)" wrote:
>
> ... snip ...
>> Basically, mallinfo() seems to produce garbage results in
>> multithreaded code. In a very simple program where I fire up 2 pthreads
>> and have them malloc() and free() a bunch of stuff, once all the threads
>> are finished, I print out malloc_stats() and mallinfo() and I seem to
>> get garbage for the mmap()-related fields.
>
> This is all OT for c.l.c and you should try a newsgroup dedicated
> to your system and/or threads. However ...
>
> Basically, malloc (and mallinfo) are running in user space. If
> those are true threads, as opposed to full processes with separate
> data areas, of course the system will get confused.

As you said, it's OT, but:
<OT>In this case the system shouldn't get confused if it conforms to
POSIX, which requires that "Each function defined in the System
Interfaces volume of IEEE Std 1003.1-2001 is thread-safe unless
explicitly stated otherwise". malloc() is a system interface function
and nothing is explicitly stated about its thread safety. So IMO it
*should* be safe to use, unless of course it isn't (and that is
documented).
To the OP: check the malloc() man page first. If it says it conforms
to POSIX/SUSv3, file a bug report, but not for gcc; this is a (g)libc
problem.</OT>
<snip>
--
WYCIWYG - what you C is what you get

Feb 5 '07 #5
On Mon, 05 Feb 2007 07:18:45 -0800, Louis B. (ldb) wrote:
> <snip>
> Most of the time I run the code, the hblks and hblkhd fields of
> mallinfo() come back 0 and 0, but a fair percentage of the time I get
> a strange answer where hblks is either 2, 5, -3 or -1 or something
> like that. It's almost like there's a race condition inside the
> malloc()/free() code that updates these fields.
>
> This is out-of-the-box Ubuntu with gcc 4.1.2

I can say with an extremely high degree of confidence that there isn't a
race condition in malloc()/free(), unless some other code is interposing
these functions. (I was once convinced for three days I had found a bug in
GCC, until I spotted that superfluous semicolon ;)

Try Valgrind. Valgrind has plug-ins to analyze threaded code and detect,
as best it can, unprotected shared resources. (Valgrind will also catch
memory errors with better diagnostics than other software.)

If that fails, try another newsgroup. This one is definitely not the group
you want.

- Bill
Feb 5 '07 #6
Louis B. (ldb) wrote On 02/05/07 10:18,:
> I have a long-running program that eventually crashes when valloc()
> returns 0. [...]

Others have pointed out that this isn't a C question.
However, one possible source of confusion may be a C
mistake:

> struct mallinfo mi;
> [...]
> mi = mallinfo();
> printf("hblks : %d hblkshd : %d\n", mi.hblks, mi.hblkhd);

There's no `struct mallinfo' in Standard C, but on the
box I'm using at the moment all the members of that struct
are of type `unsigned long'. If that's true of your machine,
too, then you're printing them with the wrong format specifier:
"%d" requires a corresponding `(signed) int' argument, not an
`unsigned long'. Turn up your warning levels, and fix what
the compiler complains about.

That might not cure what ails you -- but when you're faced
with a mystery, it's always a good policy to get your code into
squeaky-clean condition before concluding that you've found a
bug.

--
Er*********@sun.com
Feb 5 '07 #7
Eric Sosman wrote:
> Louis B. (ldb) wrote On 02/05/07 10:18,:
>> I have a long-running program that eventually crashes when valloc()
>> returns 0. [...]
>
> Others have pointed out that this isn't a C question.
> However, one possible source of confusion may be a C mistake:
>
>> struct mallinfo mi;
>> [...]
>> mi = mallinfo();
>> printf("hblks : %d hblkshd : %d\n", mi.hblks, mi.hblkhd);
>
> There's no `struct mallinfo' in Standard C, but on the
> box I'm using at the moment all the members of that struct
> are of type `unsigned long'. If that's true of your machine,
> too, then you're printing them with the wrong format specifier:
> "%d" requires a corresponding `(signed) int' argument, not an
> `unsigned long'. Turn up your warning levels, and fix what
> the compiler complains about.

Here is the header for my malldbg module, which was deliberately
designed to be compatible with the POSIX mallinfo module, except
for DJGPP. It has some added features. It is specific to the
DJGPP system, where an int and a long are identical. The
mallsethook and malldbgdumpfile functions are not present in
POSIX. Note that the malldbg module is written in standard C
(apart from the int size mentioned above), i.e. the
system-dependent stuff is isolated in nmalloc.c. The connection is
established via sysquery.h. You can see the whole thing at:

<http://cbfalconer.home.att.net/download/nmalloc.zip>

/* -------- malldbg.h ----------- */

/* Copyright (c) 2003 by Charles B. Falconer
Licensed under the terms of the GNU LIBRARY GENERAL PUBLIC
LICENSE and/or the terms of COPYING.DJ, all available at
<http://www.delorie.com>.

Bug reports to <mailto:cb********@worldnet.att.net>
*/

#ifndef malldbg_h
#define malldbg_h

/* This is to be used in conjunction with a version of
nmalloc.c compiled with:

gcc -DNDEBUG -o malloc.o -c nmalloc.c

after which linking malldbg.o and malloc.o will
provide the usual malloc, free, realloc calls.
Both malloc.o and malldbg.o can be components
of the normal run time library.
*/

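#include <stddef.h>
#include <stdio.h>   /* FILE, used by malldbgdumpfile(); harmless if sysquery.h already includes it */
#include "sysquery.h"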

struct mallinfo {
   int arena;     /* Total space being managed */
   int ordblks;   /* Count of allocated & free blocks */
   int smblks;
   int hblks;     /* Count of free blocks */
   int hblkhd;    /* Size of the 'lastsbrk' block */
   int usmblks;
   int fsmblks;
   int uordblks;  /* Heap space in use w/o overhead */
   int fordblks;  /* Total space in free lists */
   int keepcost;  /* Overhead in tracking storage */
};

struct mallinfo mallinfo(void);
int malloc_verify(void);
int malloc_debug(int level);
void mallocmap(void);
FILE *malldbgdumpfile(FILE *fp);
M_HOOKFN mallsethook(enum m_hook_kind which,
                     M_HOOKFN newhook);

#endif
/* -------- malldbg.h ----------- */

Feb 5 '07 #8
