return memory to the OS

~Gee

Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

When I run this program with top running, I see that even after line
19 is printed the memory(which top returns as 16 M) is not returned to
the operating system(Red Hat Linux 9.). I understand that this is an
optimization by STL to cache the memory for future use, but is it
possible to make STL return this memory to the OS (say using a system
call)?

Any inputs appreciated.
Thanks!

Jul 22 '05 #1

Subscribe Post Reply

3816

John Harrison

On 12 Jul 2004 11:47:14 -0700, ~Gee <hu********@hotmail.com> wrote:

Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

When I run this program with top running, I see that even after line
19 is printed the memory(which top returns as 16 M) is not returned to
the operating system(Red Hat Linux 9.). I understand that this is an
optimization by STL to cache the memory for future use, but is it
possible to make STL return this memory to the OS (say using a system
call)?

Any inputs appreciated.
Thanks!

The STL should not cache memory in the way you describe, but if it is then
you should write your own custom allocator for std::list then you can
allocate and free memory in any way you like.

This topic is covered in The C++ Standard Library by Josuttis for
instance, but post again if you need help doing this.

john

Jul 22 '05 #2

JKop

~Gee posted:

When I run this program with top running, I see that even after line 19 is printed the memory(which top returns as 16 M) is not returned to the operating system(Red Hat Linux 9.). I understand that this is an optimization by STL to cache the memory for future use, but is it possible to make STL return this memory to the OS (say using a system call)?

Any inputs appreciated.
Thanks!

Courtesy of yourDictionary.com:
top
n.
1. The uppermost part, point, surface, or end.

2. The part farthest from a given reference point: took
a jump shot from the top of the key.

3. The crown of the head: from top to toe.

4. The part of a plant, such as a rutabaga, that is
above the ground.

5. Something, such as a lid or cap, that covers or forms
an uppermost part.

6. A garment worn on the upper half of the body,
especially a sweater or knit shirt.

7. Nautical A platform enclosing the head of each mast
of a sailing ship, to which the topmast rigging is
attached.

8. The highest degree, pitch, or point; the peak, acme,
or zenith: "It had come at a time when he was not feeling
at the top of his form" (Anthony Powell).

9. a. The highest position or rank: at the top of his
profession. b. A person in this position.

10. Games The highest card or cards in a suit or hand.

11. The best part.

12. The earliest part or beginning: She played the piece
again, from the top.

13. Baseball The first half of an inning.

14. Sports a. A stroke that lands above the center of a
ball, as in golf or tennis, giving it a forward spin. b. A
forward spin on a ball resulting from such a stroke.

adj.
1. Situated at the top: the top shelf.

2. Of the highest degree, quality, rank, or amount: in
top form; the top ten bestsellers.

3. In a position of preeminence: the top historian in
her department.

v. topped, top·ping, tops
v. tr.
1. To form, furnish with, or serve as a top.

2. To reach the top of.

3. To go over the top of.

4. To exceed or surpass.

5. To be at the head of: She topped her class.

6. To remove the top or uppermost part from; crop:
topped the fruit trees.

Given that, could you kindly inform us of *your* definition
of "top".
-JKop

Jul 22 '05 #3

James Moughan

hu********@hotmail.com (~Gee) wrote in message news:<83**************************@posting.google. com>...

Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

When I run this program with top running, I see that even after line
19 is printed the memory(which top returns as 16 M) is not returned to
the operating system(Red Hat Linux 9.). I understand that this is an
optimization by STL to cache the memory for future use, but is it
possible to make STL return this memory to the OS (say using a system
call)?

Any inputs appreciated.
Thanks!

This is probably not a feature of the STL. (Some implementations do
keep a small object pool, e.g. SGI's, but probably not that big.)

I'd expect it's an implementation detail of malloc in g++. Remember
that when you free memory, you free it to the memory manager, and not
to the OS. The memory manager itself can only free trailing memory
after the last allocated block. Since this is not usually a large
percentage of the total, and since the call to do it is expensive,
it's not a priority. So usually, your application's memory space will
only grow. (I would actually have expected it to release 16M of
trailing memory, mind you. That's a little extreme.)

There are a couple of ways around this. One is to write a pool
allocator which requests the full amount of memory up front, then
serves it out. On supporting architectures, g++ malloc should respond
to very large requests by mmapping them, in which case it will free
the memory back to the OS when it's freed by the application. You'll
also get better memory usage for a linked list of ints, since you can
serve less than 16 bytes at a time.

The other is to write your own implementation of malloc. Good luck
with that one. ;-)

Jam

Jul 22 '05 #4

ckroom

You stupid arrogant, really don't know what's top? I suppose you're a
person who *only* knows windows, the arrogant's O.S.
--
ckroom
http://nazaries.net/~ckroom

Jul 22 '05 #5

Denis Remezov

James Moughan wrote:

hu********@hotmail.com (~Gee) wrote in message news:<83**************************@posting.google. com>...
Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

When I run this program with top running, I see that even after line
19 is printed the memory(which top returns as 16 M) is not returned to
the operating system(Red Hat Linux 9.). I understand that this is an
optimization by STL to cache the memory for future use, but is it
possible to make STL return this memory to the OS (say using a system
call)?

Any inputs appreciated.
Thanks!

A few words of confirmation and comments:
This is probably not a feature of the STL. (Some implementations do
keep a small object pool, e.g. SGI's, but probably not that big.)

I could reproduce the same behaviour on my system. The STL certainly
did not cache 16 Mb in my case, and I doubt that any sane
implementation would (not until we start counting the memory in
multiple digits of Gb, anyway).
Those 16 Mb were returned to the process heap. That was easy
verifiable by allocating another big chunk (slightly less than 16 Mb)
in line 18. top was printing the same value for the process.

I'd expect it's an implementation detail of malloc in g++. Remember
that when you free memory, you free it to the memory manager, and not
to the OS. The memory manager itself can only free trailing memory
after the last allocated block. Since this is not usually a large
percentage of the total, and since the call to do it is expensive,
it's not a priority. So usually, your application's memory space will
only grow. (I would actually have expected it to release 16M of
trailing memory, mind you. That's a little extreme.)

In the original version, cout (line 15) made a call to operator new,
which allocated a few bytes from the heap just past the 16 Mb of
tempList. Apparently, these few bytes prevented the process memory
pool from being truncated, leaving a 16 Mb gap as a free heap space
At least, this is possible on some *n*x systems, where the memory
pool usable for heap is one coniguous block (disclaimer: my
knowledge here is limited and possibly archaic).
By commenting out line 15 (or simply replacing it with
printf("done adding.\n");) I was able to see the memory footprint
of the process decrease by 16 Mb by line 20, i.e. now those pages
could be returned to the kernel.
There are a couple of ways around this. One is to write a pool
allocator which requests the full amount of memory up front, then
serves it out. On supporting architectures, g++ malloc should respond
to very large requests by mmapping them, in which case it will free
the memory back to the OS when it's freed by the application. You'll
also get better memory usage for a linked list of ints, since you can
serve less than 16 bytes at a time.

The other is to write your own implementation of malloc. Good luck
with that one. ;-)

Jam

I like the idea of implementing a custom allocator class (to be
supplied to tempList) based on mmap. Sharing the same heap with
the rest of the program (including the standard library) will likely
cause memory effects like the one above.

Denis

Jul 22 '05 #6

JKop

ckroom posted:

You stupid arrogant, really don't know what's top? I suppose you're a person who *only* knows windows, the arrogant's O.S.

"arrogant" is an adjective, get yourself an English grammar
book.
Actually, I had fun.
-JKop

Jul 22 '05 #7

Stewart Gordon

JKop wrote:

<snip>

Given that, could you kindly inform us of *your* definition
of "top".

A Unix utility that lists running processes and various information
about them like CPU and memory usage and how much processing time
they've eaten up so far.

Stewart.

--
My e-mail is valid but not my primary mailbox, aside from its being the
unfortunate victim of intensive mail-bombing at the moment. Please keep
replies on the 'group where everyone may benefit.

Jul 22 '05 #8

~Gee

<snip>

There are a couple of ways around this. One is to write a pool
allocator which requests the full amount of memory up front, then
serves it out. On supporting architectures, g++ malloc should respond
to very large requests by mmapping them, in which case it will free
the memory back to the OS when it's freed by the application. You'll
also get better memory usage for a linked list of ints, since you can
serve less than 16 bytes at a time.

The other is to write your own implementation of malloc. Good luck
with that one. ;-)

How difficult is it to implement my own allocator? I was thinking of
over-riding the new and delete of the C++ language and to use a linked
list to keep track of all the freed memory. So every time my program
calls delete, I will add that memory to the free list. Supposing, my
program does a new again. It checks the linked list if there is any
free memory and it is given the re-used memory from the list. What can
my program do if this memory is too small for it to use. Should it
keep searching the linked list to find the next available memory?
Would'nt this be a performance hit? Any suggestions welcome!
Thanks!

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Jul 22 '05 #9

Ioannis Vranos

~Gee wrote:

Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

With the code tweaked to be portable:
#include<iostream>
#include<list>
#include <cstdlib>

int main()
{
using namespace std;
{
list<int> tempList;

cout << "going to start adding now ..." << endl;
for(long i=0; i<= 1000000; i++)
{
tempList.push_back(i);
}
cout << "done adding." << endl;
system("pause");
}

cout << "out of scope" << endl;
system("pause");
return 0;
}
When i use it with MINGW (a GCC port for Windows) i get the same results
both with and without optimisations.
With VC++ things proceed as they should, list is destroyed immediately
when outside of the block. It seems like a compiler *defect* to me.
Objects should be destroyed when they should, and the programmer may
count on that.

Regards,

Ioannis Vranos

http://www23.brinkster.com/noicys

Jul 22 '05 #10

Kai-Uwe Bux

Ioannis Vranos wrote:

~Gee wrote:
Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

With the code tweaked to be portable:
#include<iostream>
#include<list>
#include <cstdlib>

int main()
{
using namespace std;
{
list<int> tempList;

cout << "going to start adding now ..." << endl;
for(long i=0; i<= 1000000; i++)
{
tempList.push_back(i);
}
cout << "done adding." << endl;
system("pause");
}

cout << "out of scope" << endl;
system("pause");
return 0;
}
When i use it with MINGW (a GCC port for Windows) i get the same results
both with and without optimisations.
With VC++ things proceed as they should, list is destroyed immediately
when outside of the block. It seems like a compiler *defect* to me.
Objects should be destroyed when they should, and the programmer may
count on that.

What makes you think that the object was not destroyed at the end of the
scope? I am pretty confident it was. However, there is no guarantee that
memory will be returned to the operating system. Very likely the list
implementation uses some memory pool for more efficient allocation and
deallocation. If you want to avoid that, use your own allocator.

The following code is a quick hack to demonstrate that the list nodes are
actually deallocated at the end of the scope:

#include<iostream>
#include<list>
#include <cstdlib>
#include <new>

template < typename T >
class tracing_allocator : public std::allocator<T> {
public:

tracing_allocator ( void ) :
std::allocator<T>()
{}

tracing_allocator ( const tracing_allocator & other ) :
std::allocator<T>( other )
{}

template < typename S >
tracing_allocator ( const tracing_allocator<S> & other ) :
std::allocator<T>( other )
{}

~tracing_allocator ( void )
{}

template < typename S >
struct rebind {
typedef tracing_allocator<S> other;
};

typename std::allocator<T>::pointer
allocate ( typename std::allocator<T>::size_type length,
const void* ptr = 0 ) {
std::cerr << "a" <<std::flush;
return( std::allocator<T>::allocate( length, ptr ) );
}

void deallocate ( typename std::allocator<T>::pointer ptr,
typename std::allocator<T>::size_type length) {
std::cerr << "d" << std::flush;
std::allocator<T>::deallocate( ptr, length );
}

void construct ( typename std::allocator<T>::pointer ptr,
typename std::allocator<T>::const_reference value ) {
std::cerr << "c" << std::flush;
std::allocator<T>::construct( ptr, value );
}

void destroy ( typename std::allocator<T>::pointer ptr ) {
std::cerr << "k" << std::flush;
std::allocator<T>::destroy( ptr );
}

};

template < typename T >
inline
bool operator== ( const tracing_allocator<T>& a1,
const tracing_allocator<T>& a2 ) {
return( true );
}

template < typename T >
inline
bool operator!= ( const tracing_allocator<T>& a1,
const tracing_allocator<T>& a2 ) {
return( false );
}

int main()
{
using namespace std;
{
list<int,tracing_allocator<int> > tempList;

cout << "going to start adding now ..." << endl;
for(long i=0; i<= 10; i++)
{
tempList.push_back(i);
}
cout << "done adding." << endl;
system("pause");
}

cout << "out of scope" << endl;
system("pause");
return 0;
}
BTW, I tried several allocators with g++: one implementing new/delete
and one implementing malloc/free. Memory was not returned to the operating
system in either case. My guess is that the underlying C library already
implements some sort of pooling. In this case, you are just out of luck.
Kai-Uwe

Jul 22 '05 #11

Ioannis Vranos

Kai-Uwe Bux wrote:

What makes you think that the object was not destroyed at the end of the
scope? I am pretty confident it was. However, there is no guarantee that
memory will be returned to the operating system. Very likely the list
implementation uses some memory pool for more efficient allocation and
deallocation.

Yes you are right, I didn't think that.

Regards,

Ioannis Vranos

http://www23.brinkster.com/noicys

Jul 22 '05 #12

Denis Remezov

~Gee wrote:

<snip>
There are a couple of ways around this. One is to write a pool
allocator which requests the full amount of memory up front, then
serves it out. On supporting architectures, g++ malloc should respond
to very large requests by mmapping them, in which case it will free
the memory back to the OS when it's freed by the application. You'll
also get better memory usage for a linked list of ints, since you can
serve less than 16 bytes at a time.

The other is to write your own implementation of malloc. Good luck
with that one. ;-)

How difficult is it to implement my own allocator? I was thinking of
over-riding the new and delete of the C++ language and to use a linked
list to keep track of all the freed memory. So every time my program
calls delete, I will add that memory to the free list. Supposing, my
program does a new again. It checks the linked list if there is any
free memory and it is given the re-used memory from the list. What can
my program do if this memory is too small for it to use. Should it
keep searching the linked list to find the next available memory?
Would'nt this be a performance hit? Any suggestions welcome!

Simply providing your own versions of new and delete will likely
not be enough in your case. New and delete (for the problem at
hand, anyway) will have to allocate from a big pool, pre-allocated
from the system. It seems to me that in your case this pool can
grow, can shrink from the end (by pages not in use by heap allocation
functions), but cannot have holes in it (uncommitted regions of
address space). The latter means that if you deallocate lots of
memory even as a contiguous block (e.g. tempList stuff) but happen
to have something else allocated at a higher address, the library
memory functions (delete, free) will not be able to return the unused
gap to the system.
The memory management functions (new, malloc, delete, free) often
do better. There are systems with APIs that do allow decommitting
regions inside a bigger address region (e.g. freeing pages on a
system level). Alternatively, new and malloc can use more than one
pool as need be (e.g. when requested a large amount in a single call).
In either case, if you call operator new with a large argument
(say 16 Mb), a matching call to operator delete might be able to
return the pages to the system (e.g. by decommitting a subregion
or destroying a separate one, created by new).

However, when you allocate your memory by relatively small
pieces, the library memory functions generally cannot guess the
optimal allocation strategy in regards to the system memory
management. Your program, by the way, may still work as desired
(i.e. freeing the 16 Mb by line 18) on some systems, under certain
circumstances, when all of the tempList stuff is allocated
contiguously. Under less ideal circumstances, it won't.

If it is really a problem, you can deal with it by providing
your own allocator for the tempList. Look up the second template
parameter for std::list. That allocator would preallocate a
contiguous and deallocatable (on the system level) region, then
use its own heap management. For bulk-preallocation, it may be
sufficient to just rely on a single call to malloc() or operator
new, or it may not, in which case you might want to do some
system-specific programming. Details are system-specific
(some have already been mentioned in earlier posts).

Denis

Jul 22 '05 #13

Zev_K

hu********@hotmail.com (~Gee) wrote in message news:<83**************************@posting.google. com>...

<snip>
There are a couple of ways around this. One is to write a pool
allocator which requests the full amount of memory up front, then
serves it out. On supporting architectures, g++ malloc should respond
to very large requests by mmapping them, in which case it will free
the memory back to the OS when it's freed by the application. You'll
also get better memory usage for a linked list of ints, since you can
serve less than 16 bytes at a time.

The other is to write your own implementation of malloc. Good luck
with that one. ;-)

How difficult is it to implement my own allocator? I was thinking of
over-riding the new and delete of the C++ language and to use a linked
list to keep track of all the freed memory. So every time my program
calls delete, I will add that memory to the free list. Supposing, my
program does a new again. It checks the linked list if there is any
free memory and it is given the re-used memory from the list. What can
my program do if this memory is too small for it to use. Should it
keep searching the linked list to find the next available memory?
Would'nt this be a performance hit? Any suggestions welcome!
Thanks!

Presuming you want to use a first fit allocator, you want to continue
searching the free list until you find a block of memory large enough
to use. If you want to get more sophisticated and join adjacent blocks
on the free list, you would want to use a doubly linked list, and give
each block top and bottom tags.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Jul 22 '05 #14

~Gee

> BTW, I tried several allocators with g++: one implementing new/delete

and one implementing malloc/free. Memory was not returned to the operating
system in either case. My guess is that the underlying C library already
implements some sort of pooling. In this case, you are just out of luck.

So, is there some way of finding out how much memory the underlying
pool is using? Is there a library call that I can make out how much
memory is cached by the compiler?

Thanks!

Jul 22 '05 #15

Kai-Uwe Bux

~Gee wrote:

BTW, I tried several allocators with g++: one implementing new/delete
and one implementing malloc/free. Memory was not returned to the
operating system in either case. My guess is that the underlying C
library already implements some sort of pooling. In this case, you are
just out of luck.

So, is there some way of finding out how much memory the underlying
pool is using? Is there a library call that I can make out how much
memory is cached by the compiler?

Thanks!

AFAIK, there is no standard way of doing this. It all depends on the
compiler/library. You might want to check with your vendor or try a news
group devoted to your platform.

From your original post, I undestand that you are using Linux. Now, if you
use the GNU compiler and library, you can actually have a look at the
source code of malloc/free, twist that to define your_malloc/your_free and
roll a your_malloc/your_free based allocator. However, it might not be
worth the trouble: Are you sure you need to return memory to the OS? Keep
in mind that the memory in the pool is available to your application and
will be reused.
Best

Kai-Uwe Bux

Jul 22 '05 #16

~Gee

> From your original post, I undestand that you are using Linux. Now, if you

use the GNU compiler and library, you can actually have a look at the
source code of malloc/free, twist that to define your_malloc/your_free and
roll a your_malloc/your_free based allocator. However, it might not be
worth the trouble: Are you sure you need to return memory to the OS? Keep
in mind that the memory in the pool is available to your application and
will be reused.

I do not need to return the memory to the OS. I just need to know how
much memory process is actually using and how much of it is stored in
the pools. I am guessing that,

total memory reported by top = memory used by my process + memory used
by the pools.

Thanks!

Jul 22 '05 #17

bartek

hu********@hotmail.com (~Gee) wrote in news:83c2e7de.0407121047.3a260df2
@posting.google.com:

Hi Folks!
Please see the program below:

1 #include<iostream>
2 #include<list>
3 #include <unistd.h>
4 using namespace std;
5 int main()
6 {
7 {
8 list<int> tempList;
9
10 cout << "going to start adding now ..." << endl;
11 for(long i=0; i<= 1000000; i++)
12 {
13 tempList.push_back(i);
14 }
15 cout << "done adding." << endl;
16 sleep(10);
17 }
18
19 cout << "out of scope" << endl;
20 sleep(10);
21 return 0;
22 }

When I run this program with top running, I see that even after line
19 is printed the memory(which top returns as 16 M) is not returned to
the operating system(Red Hat Linux 9.). I understand that this is an
optimization by STL to cache the memory for future use, but is it
possible to make STL return this memory to the OS (say using a system
call)?

It looks like an obvious small object pooling artifact.
The list<int> container most allocates lots of small nodes, which are
probably around 12 bytes each. AFAIK this object size qualifies for the
small object allocator to put them into its own memory pool. Usually such
pools are never actually explicitly returned to the system, and are
reclaimed only in the act of process termination.

You could do a quick test by supplying a bigger object as an element for
the list.

E.g.

#include <iostream>
#include <list>

struct Big
{
int array[64];
};

using namespace std;

int main()
{
{
list<Big> tempList;

cout << "going to start adding now ..." << endl;

for(long i=0; i<= 1000000; i++)
{
tempList.push_back(Big());
}

cout << "done adding. press enter." << endl;
cin.get();
}

cout << "out of scope. press enter." << endl;
cin.get();

return 0;
}

Cheers.

--
:: bartekd / o2 pl
:: "out of confusion comes chaos -- out of chaos comes confusion and fear
:: -- then comes lunch."

Jul 22 '05 #18

return memory to the OS

Similar topics