Can array allocation cause memory fragmentation

Andreas Schmitt

Hi,

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)realloc( Messages, sizeof(MMessage) *
(m_num_messages + Num_builtin_messages) + 1 );

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just
adding one element, that is wrong. You always need to allocate in
batches to prevent memory fragmentation. Always increment by 5 or 10 at
the least, some const value, and keep track of the actual size of the
array so that you only have to go up in size when you really need it."

My question now.. from my undertanding, an array is always allocated as
N consecutive blocks of memory to hold its items. How can this cause
memory fragmentation, no matter how small the increase in a reallocation
is...?

Sep 14 '07 #1

Subscribe Reply

6175

Victor Bazarov

Andreas Schmitt wrote:

[..]
My question now.. from my undertanding, an array is always allocated
as N consecutive blocks of memory to hold its items. How can this
cause memory fragmentation, no matter how small the increase in a
reallocation is...?

Memory fragmentation is going to happen. Period. Now, how much is too
much is the 64 thousand dollar question - it may mean that if it happens
too soon, you're going to run out of memory, or the performance of the
system is going to be affected or whatever. The more often your code
allocates and deallocates memory, the sooner you're going to run into
the problem caused by the heap fragmentation. That's all. Making your
array allocate in larger chunks makes reallocations somewhat _less_
_frequent_, which in turn may mean that during the run your program is
not going to drive the heap to the threshold of being problematically
fragmented.

Now, how is this a C++ language question, I am not sure.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Sep 14 '07 #2

crisgoogle

On Sep 14, 9:36 am, Andreas Schmitt <keldorkat...@gmx.dewrote:

Hi,

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)realloc( Messages, sizeof(MMessage) *
(m_num_messages + Num_builtin_messages) + 1 );

First of all, this is a dangerous way to (mis)use realloc. If realloc
_can't_ allocate the new buffer, it returns NULL, which you then copy
into your old pointer. The old value of Messages, your only reference
to the previously allocated memory, is gone.

The correct way to do this is to store the result of realloc in a
temporary, and only assign the value to Messages once you know that
the old value of Messages is no longer needed.

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just
adding one element, that is wrong. You always need to allocate in
batches to prevent memory fragmentation. Always increment by 5 or 10 at
the least, some const value, and keep track of the actual size of the
array so that you only have to go up in size when you really need it."

My question now.. from my undertanding, an array is always allocated as
N consecutive blocks of memory to hold its items. How can this cause
memory fragmentation, no matter how small the increase in a reallocation
is...?

The issue isn't really that your realloc by itself causes
fragmentation, it's that other allocations interspersed with it will.

Let's say that your realloc call really does need to move the memory.
Now you have a hole of size 'oldsize' in the memory pool. Now some
other part of the code allocates a few bytes, which may go immediately
after your newer block. Now you reallocate again, leaving a hole of
oldsize + x, and so on, and so on. You'll wind up leaving a string of
holes, all differing in size by a single object.

At the least, you should reallocate for several extra objects, as your
commenter mentioned. This simply reduces the number of reallocations
and therefore the probability of the above scenario being a problem.
(The typical way to do this is to increase the buffer size by some
proportion, by the way, not by some absolute value.)

If you're reallocating frequently, you should also consider some
(probably system-specific) way of allocating this particular array
always from a pool separate from other allocations.

Sep 14 '07 #3

Jim Langston

"Andreas Schmitt" <ke**********@gmx.dewrote in message
news:fc*************@news.t-online.com...

Hi,

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)realloc( Messages, sizeof(MMessage) *
(m_num_messages + Num_builtin_messages) + 1 );

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just adding
one element, that is wrong. You always need to allocate in batches to
prevent memory fragmentation. Always increment by 5 or 10 at the least,
some const value, and keep track of the actual size of the array so that
you only have to go up in size when you really need it."

My question now.. from my undertanding, an array is always allocated as N
consecutive blocks of memory to hold its items. How can this cause memory
fragmentation, no matter how small the increase in a reallocation is...?

Cris answered the question rather well. The more you allocate/reallocate
memory the more fragmentation that can happen. The way a few STL
implementaions do it, asi I understand, is to double the amount of memory
each time. This allocates more memory than needed which may not be used,
but reduces the number of allocations that will be needed. But, what is
this fragmentation? Consider

messages you first allocate 100 bytes.
100 bytes = messages
then you allocate 101
100 bytes = nothing
101 bytes = messages
then you allocate 102 bytes. It won't fit in the 100 bytes so it'll go
after that block
100 bytes = nothing
101 bytes = nothing
102 bytes = messages

You've now used up 303 bytes but are only using 102. Next time you go to
allocate 103 bytes, it would grab it from the beginning (hopefully) but then
you probably have other allcoations taking place for other variables, and
you wind up with differnet size holes of unused memory in the freestore.
(For example, say some other variable needed 105 bytes now, it would grab it
making it:

105 bytes = someothervariable
96 bytes = unused
102 bytes = messages

enough of this goes on in the program and you start running out of memory to
grab your increasingly large blocks of memory. Even though the memory is
there unused, it's all broken up in little chunks that can't be grabbed at
once. One way to prevent this is to grab more memory than you need,
reducing the amount of reallocations that take place. Doubling is one way.
Instead of grabbing 101 blocks, grab 200. If you run out of that 200, grab
400 next time, etc... This helps a bit decrease the amount of allocations
that take place. Even though fragmentation will still occur, it might help
a little.

Sep 14 '07 #4

James Kanze

On Sep 14, 6:36 pm, Andreas Schmitt <keldorkat...@gmx.dewrote:

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)realloc( Messages, sizeof(MMessage) *
(m_num_messages + Num_builtin_messages) + 1 );

If MMessage isn't a POD, this won't work.

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just
adding one element, that is wrong. You always need to allocate in
batches to prevent memory fragmentation. Always increment by 5 or 10 at
the least, some const value, and keep track of the actual size of the
array so that you only have to go up in size when you really need it."

Is this from the same person who requires realloc, rather than
new and delete? He obviously doesn't understand memory
management very well.

My question now.. from my undertanding, an array is always
allocated as N consecutive blocks of memory to hold its items.
How can this cause memory fragmentation, no matter how small
the increase in a reallocation is...?

Indirectly, it could allocate a block of N, then allocate a
block of N+1 and free the block of N. If there are intervening
allocations, there could be an isolated block the size of N
elements. In practice, however:

-- Historically, you'd using realloc because you expect to
increase the size of the existing buffer, with no new
allocation. If there's not space, at least with the
algorithms which were frequent 20 years ago (when I was
working in C), the growing block would fairly quickly
migrate to the end of the free space arena, where it would
almost always be possible to grow it without reallocating.
Unless you can reasonably expect something like this,
there's no point of using realloc.

-- If realloc cannot grow something in place, it must copy it.
I suspect that with many modern implementations of
malloc/free, this would almost always be the case (but I've
not actually looked to see). Copying is, or can be, an
expensive operation; the usual solution when unlimited
growth is required is to use exponential allocation, always
multiplying the size by a fixed factor. (For various
reasons, 1.5 is generally to be preferred, although 2 is
very common as well.)

std::vector uses the second strategy, and that's really what you
should be using here.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sep 15 '07 #5

James Kanze

On Sep 14, 10:35 pm, "Jim Langston" <tazmas...@rocketmail.comwrote:
[...]

The way a few STL implementaions do it, asi I understand, is
to double the amount of memory each time. This allocates more
memory than needed which may not be used, but reduces the
number of allocations that will be needed.

The language specification requires that
std::vector<>::push_back have amoritized constant time. This
requires some sort of exponential growth strategy. In older
implementations, 2 was a common factor, but Andy Koenig (I
think) pointed out that this has the serious drawback that even
if previously allocated blocks are contiguous, they will never
be large enough for the next allocation; the upper limit if you
want some memory reuse is, IIRC, the (1 + sqrt(5))/2---roughly
1.6. 1.5 is pretty close, and easy to calculate without
floating point arithmetic, and I suspect that this is the factor
used in many more recent implementations.

But, what is
this fragmentation? Consider

messages you first allocate 100 bytes.
100 bytes = messages
then you allocate 101
100 bytes = nothing
101 bytes = messages
then you allocate 102 bytes. It won't fit in the 100 bytes so it'll go
after that block
100 bytes = nothing
101 bytes = nothing
102 bytes = messages

You've now used up 303 bytes but are only using 102. Next
time you go to allocate 103 bytes, it would grab it from the
beginning (hopefully) but then you probably have other
allcoations taking place for other variables, and you wind up
with differnet size holes of unused memory in the freestore.

You can't make those sort of assumptions. It all depends on
realloc, and how it works. And realloc doesn't work well if
there are intervening allocations.

(For example, say some other variable needed 105 bytes now, it
would grab it making it:

105 bytes = someothervariable
96 bytes = unused
102 bytes = messages

enough of this goes on in the program and you start running
out of memory to grab your increasingly large blocks of
memory. Even though the memory is there unused, it's all
broken up in little chunks that can't be grabbed at once.

What typically happens is that most of the allocated memory is
at the bottom of the free space arena. Fairly rapidly, realloc
can't find any room in the fragments, and migrates to the top.
Where everything that follows is free, so the next realloc's
just change the size of the buffer.

At least, that was the way it worked 20 years ago, when realloc
was relevant. Today, I suspect that you're right, and that
there will often be enough intervening allocations to require
realloc to actually reallocate and copy. At which point, it
ceases to be relevant. (Of course, an implementation could also
note when a block was being realloc'ed, and when it did actually
allocate something new, place the new memory somewhere where it
would have a maximum of space behind it, and avoid allocating
other memory behind it, so as to increase the chances of being
able to grow the allocation in place.)

One way to prevent this is to grab more memory than you need,
reducing the amount of reallocations that take place.
Doubling is one way. Instead of grabbing 101 blocks, grab
200. If you run out of that 200, grab 400 next time, etc...
This helps a bit decrease the amount of allocations that take
place. Even though fragmentation will still occur, it might
help a little.

Doubling is a very poor solution for this, as it means that you
can never reuse the freed memory for a new block, see above.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sep 15 '07 #6

by: cody | last post by:

i basically want to create an object which contains an array (the last element of the class). the size of the array is determined when the object is created. for performance reasons (avoiding...

C / C++

Object Oriented Programming in C, Avoid Memory Allocation

by: Andrew Au | last post by:

Dear all, I am trying to write a piece of software that use Object Oriented design and implement it with C, I did the following == In Object.h == typedef struct ObjectStructure* Object; ...

C / C++

Memory management and allocation

by: Dan Nilsen | last post by:

Hi! I'm writing a small piece of software that basically runs on an embedded system with a Power-PC cpu. This runs on a stripped down version of Linux - Busybox. As I'm writing a piece of...

C / C++

Checking return values for errors, a matter of style?

by: Johan Tibell | last post by:

I've written a piece of code that uses sockets a lot (I know that sockets aren't portable C, this is not a question about sockets per se). Much of my code ended up looking like this: if...

C / C++

Maximum Size of Byte Array

by: Gerrit | last post by:

Hi all, I'm getting an OutOfMemoryException when I initialize a byte array in C# like this: Byte test = new Byte; I'm using ASP.NET 2.0. In ASP.Net 1.1 it works fine. So what am I doing...

C# / C Sharp

Dynamic Memory Allocation for Matrix(2D array)

by: Peterwkc | last post by:

Hello all expert, i have two program which make me desperate bu after i have noticed the forum, my future is become brightness back. By the way, my problem is like this i the first program was...

C / C++

158

A solution for the allocation failures problem

by: jacob navia | last post by:

1: It is not possible to check EVERY malloc result within complex software. 2: The reasonable solution (use a garbage collector) is not possible for whatever reasons. 3: A solution like the...

C / C++

Re: Array optimizing problem in C++?

by: James Kanze | last post by:

On Apr 10, 4:06 pm, Jerry Coffin <jcof...@taeus.comwrote: Just a note, but in all of these languages, *all* objects are dynamically allocated. I wonder if this isn't more what caused you...

C / C++

query to website

by: eternalLearner | last post by:

i am writing an client application that connects to the server and then sends data to server. but, the problem is that i get a java.net.ConnectException: Connection refused error. The code i have...

Java

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...

Networking - Hardware / Configuration

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...

C# / C Sharp

php

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

Can array allocation cause memory fragmentation

Similar topics