Need help with bug (glibc, linux, malloc-related)

I have a long-running program that eventually crashes when valloc()
returns 0. This program is relatively non-trivial: it's written in
Ada, is multithreaded, and has a lot of SSE routines. A memory leak would
be the most obvious cause, but this appears to be more sinister than a
simple memory leak.

After a lot of running around and searching through the code, I found an
anomaly that I'd like to explain, and to understand whether it's the cause
of valloc() returning 0. It may be unrelated to my problem above, but I
can't be sure. I've recreated this anomaly in a very simple program.

Basically, mallinfo() seems to produce garbage results in multithreaded
code. In a very simple program, I fire up 2 pthreads and have them
malloc() and free() a bunch of stuff. Once all the threads are
finished, I print out malloc_stats() and mallinfo(), and I seem to get
garbage for the mmap()-related fields.

Most of the time I run the code, the hblks and hblkhd fields of
mallinfo() come back 0 and 0, but a fair percentage of the time I get
a strange answer where hblks is 2, 5, -3, -1 or something
like that. It's almost like there's a race condition inside the
malloc()/free() code that updates these fields.

This is out-of-the-box Ubuntu with gcc 4.1.2

I've included the code at the bottom, but here is an example output:
Arena 0:
system bytes = 135168
in use bytes = 288
Arena 1:
system bytes = 135168
in use bytes = 1128
Total (incl. mmap):
system bytes = 4045234176
in use bytes = 4044965256
max mmap regions = 1
max mmap bytes = 250003456
hblks : -1 hblkshd : -250003456
The mmap() and hblk (from mallinfo()) data seem to be totally
corrupted, to me. (In this particular case, they've gone negative.) In
this code, the "answer" should be 0 since everything has been freed,
should it not? Are these numbers supposed to be meaningful?
(This code has a pretty large malloc, but I get similar results with more
reasonably sized mallocs, like 10 megs.)
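
For what it's worth, the totals in that sample output are at least consistent with a negative mmap byte count being added to the arena totals and then printed as an unsigned 32-bit value. The sketch below only reproduces that arithmetic. That malloc_stats() actually computes its totals this way on this system is an assumption, and the names wrapped_total and print_wrapped_totals are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: add a (possibly negative) mmap byte count to the
   arena bytes and reduce the sum modulo 2^32, as an unsigned 32-bit
   field would. */
uint32_t wrapped_total(int64_t arena_bytes, int64_t mmap_bytes)
{
    return (uint32_t)(arena_bytes + mmap_bytes);
}

/* Recompute the two "Total (incl. mmap)" lines from the sample output. */
void print_wrapped_totals(void)
{
    /* Arena 0 + Arena 1 system bytes, plus hblkhd = -250003456 */
    printf("system bytes = %lu\n",
           (unsigned long)wrapped_total(135168 + 135168, -250003456));
    /* Arena 0 + Arena 1 in-use bytes, plus the same negative count */
    printf("in use bytes = %lu\n",
           (unsigned long)wrapped_total(288 + 1128, -250003456));
}
```

Calling print_wrapped_totals() gives 4045234176 and 4044965256, the same figures as in the output above, which suggests the huge totals follow from the one negative mmap counter rather than being separately corrupted.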
---------------------------------
Built with:
gcc main.c -lpthread

Here is the code I'm running:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <malloc.h>

void *detection_thread();

main (int argc, char *argv[])
{
    pthread_t thread1, thread2;
    pthread_t thread3, thread4;
    struct mallinfo mi;

    // spawn threads
    pthread_create( &thread1, NULL, detection_thread, NULL);
    pthread_create( &thread2, NULL, detection_thread, NULL);

    // wait for threads to return
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    printf("********************************\n");
    malloc_stats();
    mi = mallinfo();
    printf("hblks : %d hblkshd : %d\n", mi.hblks, mi.hblkhd);
}

void *detection_thread()
{
    int *slappy;
    int i;
    struct mallinfo mi;

    for (i = 0; i < 5000; i++)
    {
        slappy = malloc(1000*1000*250);

        if (slappy == NULL)
        {
            printf("CRASH\n");
            exit(1);
        }
        free(slappy);
    }

    printf("Done!\n");
}

Feb 5 '07 #1
In article <11*********************@h3g2000cwc.googlegroups.com>,
Louis B. (ldb) <ld********@hotmail.com> wrote:
>I have a long running program that eventually crashes when valloc()
>returns a 0. This program is relatively non-trivial as it's written in
>Ada, is multithreaded, has a lot of SSE routines. A memory leak would
>be the most obvious cause but this appears to be more sinister than a
>simple memory leak.

None of valloc(), Ada, multithreading, or SSE are defined by the C
language, so your problem is well beyond the scope of comp.lang.c.

A more general Linux programming newsgroup might be better able to answer
your question.
dave

--
Dave Vandervies dj******@csclub.uwaterloo.ca
Biblethumper (n): Someone who ought to try opening it up and reading it
for comprehension for a change, already.
--Shamelessly Stolen From Anthony de Boer in the scary devil monastery
Feb 5 '07 #2
"Louis B. (ldb)" <ld********@hotmail.com> writes:
> I have a long running program that eventually crashes when valloc()
> returns a 0. This program is relatively non-trivial as it's written in
> Ada, is multithreaded, has a lot of SSE routines. A memory leak would
> be the most obvious cause but this appears to be more sinister than a
> simple memory leak.

comp.lang.c is not the right place to submit a glibc bug report.
I'd suggest a GNU newsgroup or mailing list instead.
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
Feb 5 '07 #3
"Louis B. (ldb)" wrote:
>
> .... snip ...
>
> Basically, mallinfo() seems to produce garbage results in multi-
> threaded code. In a very simple program where I fire up 2 pthreads
> have them malloc() and free a bunch of stuff, once all the threads
> are finished, I print out malloc_stats() and mallinfo() and I seem
> to get garbage for the mmap() related fields.

This is all OT for c.l.c and you should try a newsgroup dedicated
to your system and/or threads. However ...

Basically, malloc (and mallinfo) are running in user space. If
those are true threads, as opposed to full processes with separate
data areas, of course the system will get confused. You need to
protect all access to the malloc and mallinfo packages with
suitable constructs, such as semaphores, monitors, etc. You can
see one implementation of both packages (for DJGPP - these things
are system specific) as nmalloc at:

<http://cbfalconer.home.att.net/download/>
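
A minimal sketch of the kind of protection suggested above, assuming POSIX threads, might look like the following. The wrapper names locked_malloc and locked_free and the lock alloc_lock are hypothetical, and whether a given C library's malloc actually needs this is disputed later in the thread.

```c
#include <pthread.h>
#include <stdlib.h>

/* One lock shared by every wrapper below (hypothetical names). */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate while holding the lock, so that no other wrapped call can
   run inside the allocator at the same time. */
void *locked_malloc(size_t size)
{
    void *p;

    pthread_mutex_lock(&alloc_lock);
    p = malloc(size);
    pthread_mutex_unlock(&alloc_lock);
    return p;
}

/* Release memory while holding the same lock. */
void locked_free(void *p)
{
    pthread_mutex_lock(&alloc_lock);
    free(p);
    pthread_mutex_unlock(&alloc_lock);
}
```

A statistics query would have to take the same lock to see a consistent snapshot. This only helps if every allocation in the program goes through the wrappers; calls made inside libraries bypass them.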

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews

Feb 5 '07 #4
On Feb 5, 5:10 pm, CBFalconer <cbfalco...@yahoo.com> wrote:
> "Louis B. (ldb)" wrote:
>
> ... snip ...
> > Basically, mallinfo() seems to produce garbage results in multi-
> > threaded code. In a very simple program where I fire up 2 pthreads
> > have them malloc() and free a bunch of stuff, once all the threads
> > are finished, I print out malloc_stats() and mallinfo() and I seem
> > to get garbage for the mmap() related fields.
>
> This is all OT for c.l.c and you should try a newsgroup dedicated
> to your system and/or threads. However ...
>
> Basically, malloc (and mallinfo) are running in user space. If
> those are true threads, as opposed to full processes with separate
> data areas, of course the system will get confused.

As you said, it's <OT>In this case the system shouldn't get confused
if it conforms to POSIX - it requires that "Each function defined in
the System Interfaces volume of IEEE Std 1003.1-2001 is thread-safe
unless explicitly stated otherwise". malloc() is a system interface
function and nothing is explicitly stated about thread-safeness. So
IMO it *should* be safe to use, unless of course, it isn't (and that
is documented).
To the OP: check the malloc() man page first. If it says it conforms
to POSIX/SUSv3, file a bug report, but not for gcc, this is a (g)libc
problem.</OT>
<snip>
--
WYCIWYG - what you C is what you get

Feb 5 '07 #5
On Mon, 05 Feb 2007 07:18:45 -0800, Louis B. (ldb) wrote:
<snip>
> Most of the time I run the code, the hblks and hblkhd fields of
> mallinfo() come back 0 and 0, but a fair percentage of the time I get
> a strange answer where hblks is either 2, 5, -3 or -1 or something
> like that. It's almost like there's a race condition inside the
> malloc()/free() code that updates these fields.
>
> This is out-of-the-box Ubuntu with gcc 4.1.2

I can say with an extremely high degree of confidence that there isn't a
race condition in malloc()/free(), unless some other code is interposing
these functions. (I was once convinced for three days I had found a bug in
GCC, until I spotted that superfluous semicolon ;)

Try Valgrind. Valgrind has plug-ins to analyze threaded code and detect,
as best it can, unprotected shared resources. (Valgrind will also catch
memory errors with better diagnostics than other software.)

If that fails, try another newsgroup. This one is definitely not the group
you want.

- Bill
Feb 5 '07 #6
Louis B. (ldb) wrote On 02/05/07 10:18,:
> I have a long running program that eventually crashes when valloc()
> returns a 0. [...]

Others have pointed out that this isn't a C question.
However, one possible source of confusion may be a C
mistake:

> struct mallinfo mi;
> [...]
> mi = mallinfo();
> printf("hblks : %d hblkshd : %d\n", mi.hblks, mi.hblkhd);

There's no `struct mallinfo' in Standard C, but on the
box I'm using at the moment all the members of that struct
are of type `unsigned long'. If that's true of your machine,
too, then you're printing them with the wrong format specifier:
"%d" requires a corresponding `(signed) int' argument, not an
`unsigned long'. Turn up your warning levels, and fix what
the compiler complains about.
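
As a sketch of what a matching format looks like on a platform where the members really are unsigned long, something like the following would do. The struct stats_ul and the function format_stats are hypothetical stand-ins for illustration, not the real struct mallinfo from any particular system.

```c
#include <stdio.h>

/* Hypothetical stand-in for a platform whose statistics fields are
   unsigned long, as described above; not the real struct mallinfo. */
struct stats_ul {
    unsigned long hblks;
    unsigned long hblkhd;
};

/* Format both fields with %lu, the conversion that matches an
   unsigned long argument. Returns whatever snprintf() returns. */
int format_stats(char *buf, size_t len, const struct stats_ul *s)
{
    return snprintf(buf, len, "hblks : %lu hblkhd : %lu",
                    s->hblks, s->hblkhd);
}
```

Where the members are plain int, "%d" is already the right conversion. Either way, building with gcc -Wall makes the compiler check each printf argument against its conversion specifier.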

That might not cure what ails you -- but when you're faced
with a mystery, it's always a good policy to get your code into
squeaky-clean condition before concluding that you've found a
bug.

--
Er*********@sun.com
Feb 5 '07 #7
Eric Sosman wrote:
> Louis B. (ldb) wrote On 02/05/07 10:18,:
> > I have a long running program that eventually crashes when valloc()
> > returns a 0. [...]
>
> Others have pointed out that this isn't a C question.
> However, one possible source of confusion may be a C mistake:
> > struct mallinfo mi;
> > [...]
> > mi = mallinfo();
> > printf("hblks : %d hblkshd : %d\n", mi.hblks, mi.hblkhd);
>
> There's no `struct mallinfo' in Standard C, but on the
> box I'm using at the moment all the members of that struct
> are of type `unsigned long'. If that's true of your machine,
> too, then you're printing them with the wrong format specifier:
> "%d" requires a corresponding `(signed) int' argument, not an
> `unsigned long'. Turn up your warning levels, and fix what
> the compiler complains about.

Here is the header for my malldbg module, which was deliberately
designed to be compatible with the POSIX mallinfo module, except
for DJGPP. It has some added features. It is specific to the
DJGPP system, where an int and a long are identical. The
mallsethook and malldbgdumpfile functions are not present in
POSIX. Note that the malldbg module is written in standard C
(apart from the int size mentioned above), i.e. the system
dependent stuff is isolated in nmalloc.c. The connection is
established via sysquery.h. You can see the whole thing at:

<http://cbfalconer.home.att.net/download/nmalloc.zip>

/* -------- malldbg.h ----------- */

/* Copyright (c) 2003 by Charles B. Falconer
Licensed under the terms of the GNU LIBRARY GENERAL PUBLIC
LICENSE and/or the terms of COPYING.DJ, all available at
<http://www.delorie.com>.

Bug reports to <mailto:cb********@worldnet.att.net>
*/

#ifndef malldbg_h
#define malldbg_h

/* This is to be used in conjunction with a version of
nmalloc.c compiled with:

gcc -DNDEBUG -o malloc.o -c nmalloc.c

after which linking malldbg.o and malloc.o will
provide the usual malloc, free, realloc calls.
Both malloc.o and malldbg.o can be components
of the normal run time library.
*/

#include <stddef.h>
#include "sysquery.h"

struct mallinfo {
int arena; /* Total space being managed */
int ordblks; /* Count of allocated & free blocks */
int smblks;
int hblks; /* Count of free blocks */
int hblkhd; /* Size of the 'lastsbrk' block */
int usmblks;
int fsmblks;
int uordblks; /* Heap space in use w/o overhead */
int fordblks; /* Total space in free lists */
int keepcost; /* Overhead in tracking storage */
};

struct mallinfo mallinfo(void);
int malloc_verify(void);
int malloc_debug(int level);
void mallocmap(void);
FILE *malldbgdumpfile(FILE *fp);
M_HOOKFN mallsethook(enum m_hook_kind which,
                     M_HOOKFN newhook);

#endif
/* -------- malldbg.h ----------- */
--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews

Feb 5 '07 #8
