Can array allocation cause memory fragmentation

Andreas Schmitt

Hi,

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)real loc( Messages, sizeof(MMessage ) *
(m_num_messages + Num_builtin_mes sages) + 1 );

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just
adding one element, that is wrong. You always need to allocate in
batches to prevent memory fragmentation. Always increment by 5 or 10 at
the least, some const value, and keep track of the actual size of the
array so that you only have to go up in size when you really need it."

My question now.. from my undertanding, an array is always allocated as
N consecutive blocks of memory to hold its items. How can this cause
memory fragmentation, no matter how small the increase in a reallocation
is...?

Sep 14 '07 #1

Subscribe Reply

6212

Victor Bazarov

Andreas Schmitt wrote:

[..]
My question now.. from my undertanding, an array is always allocated
as N consecutive blocks of memory to hold its items. How can this
cause memory fragmentation, no matter how small the increase in a
reallocation is...?

Memory fragmentation is going to happen. Period. Now, how much is too
much is the 64 thousand dollar question - it may mean that if it happens
too soon, you're going to run out of memory, or the performance of the
system is going to be affected or whatever. The more often your code
allocates and deallocates memory, the sooner you're going to run into
the problem caused by the heap fragmentation. That's all. Making your
array allocate in larger chunks makes reallocations somewhat _less_
_frequent_, which in turn may mean that during the run your program is
not going to drive the heap to the threshold of being problematically
fragmented.

Now, how is this a C++ language question, I am not sure.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Sep 14 '07 #2

crisgoogle

On Sep 14, 9:36 am, Andreas Schmitt <keldorkat...@g mx.dewrote:

Hi,

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)real loc( Messages, sizeof(MMessage ) *
(m_num_messages + Num_builtin_mes sages) + 1 );

First of all, this is a dangerous way to (mis)use realloc. If realloc
_can't_ allocate the new buffer, it returns NULL, which you then copy
into your old pointer. The old value of Messages, your only reference
to the previously allocated memory, is gone.

The correct way to do this is to store the result of realloc in a
temporary, and only assign the value to Messages once you know that
the old value of Messages is no longer needed.

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just
adding one element, that is wrong. You always need to allocate in
batches to prevent memory fragmentation. Always increment by 5 or 10 at
the least, some const value, and keep track of the actual size of the
array so that you only have to go up in size when you really need it."

My question now.. from my undertanding, an array is always allocated as
N consecutive blocks of memory to hold its items. How can this cause
memory fragmentation, no matter how small the increase in a reallocation
is...?

The issue isn't really that your realloc by itself causes
fragmentation, it's that other allocations interspersed with it will.

Let's say that your realloc call really does need to move the memory.
Now you have a hole of size 'oldsize' in the memory pool. Now some
other part of the code allocates a few bytes, which may go immediately
after your newer block. Now you reallocate again, leaving a hole of
oldsize + x, and so on, and so on. You'll wind up leaving a string of
holes, all differing in size by a single object.

At the least, you should reallocate for several extra objects, as your
commenter mentioned. This simply reduces the number of reallocations
and therefore the probability of the above scenario being a problem.
(The typical way to do this is to increase the buffer size by some
proportion, by the way, not by some absolute value.)

If you're reallocating frequently, you should also consider some
(probably system-specific) way of allocating this particular array
always from a pool separate from other allocations.

Sep 14 '07 #3

Jim Langston

"Andreas Schmitt" <ke**********@g mx.dewrote in message
news:fc******** *****@news.t-online.com...

Hi,

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)real loc( Messages, sizeof(MMessage ) *
(m_num_messages + Num_builtin_mes sages) + 1 );

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just adding
one element, that is wrong. You always need to allocate in batches to
prevent memory fragmentation. Always increment by 5 or 10 at the least,
some const value, and keep track of the actual size of the array so that
you only have to go up in size when you really need it."

My question now.. from my undertanding, an array is always allocated as N
consecutive blocks of memory to hold its items. How can this cause memory
fragmentation, no matter how small the increase in a reallocation is...?

Cris answered the question rather well. The more you allocate/reallocate
memory the more fragmentation that can happen. The way a few STL
implementaions do it, asi I understand, is to double the amount of memory
each time. This allocates more memory than needed which may not be used,
but reduces the number of allocations that will be needed. But, what is
this fragmentation? Consider

messages you first allocate 100 bytes.
100 bytes = messages
then you allocate 101
100 bytes = nothing
101 bytes = messages
then you allocate 102 bytes. It won't fit in the 100 bytes so it'll go
after that block
100 bytes = nothing
101 bytes = nothing
102 bytes = messages

You've now used up 303 bytes but are only using 102. Next time you go to
allocate 103 bytes, it would grab it from the beginning (hopefully) but then
you probably have other allcoations taking place for other variables, and
you wind up with differnet size holes of unused memory in the freestore.
(For example, say some other variable needed 105 bytes now, it would grab it
making it:

105 bytes = someothervariab le
96 bytes = unused
102 bytes = messages

enough of this goes on in the program and you start running out of memory to
grab your increasingly large blocks of memory. Even though the memory is
there unused, it's all broken up in little chunks that can't be grabbed at
once. One way to prevent this is to grab more memory than you need,
reducing the amount of reallocations that take place. Doubling is one way.
Instead of grabbing 101 blocks, grab 200. If you run out of that 200, grab
400 next time, etc... This helps a bit decrease the amount of allocations
that take place. Even though fragmentation will still occur, it might help
a little.

Sep 14 '07 #4

James Kanze

On Sep 14, 6:36 pm, Andreas Schmitt <keldorkat...@g mx.dewrote:

I recently worked on an open source project and tried to make on of the
arrays they are using dynamically allocated to get rid of the max size.

I used the realloc instead of the usual C++ new and delete because they
asked me to stick to their "coding style"

Well whatever..

here's what I wrote:

Messages = (MMessage*)real loc( Messages, sizeof(MMessage ) *
(m_num_messages + Num_builtin_mes sages) + 1 );

If MMessage isn't a POD, this won't work.

And here is one comment that came as a result of this:

"If the goal is to make it properly dynamic and he is actually just
adding one element, that is wrong. You always need to allocate in
batches to prevent memory fragmentation. Always increment by 5 or 10 at
the least, some const value, and keep track of the actual size of the
array so that you only have to go up in size when you really need it."

Is this from the same person who requires realloc, rather than
new and delete? He obviously doesn't understand memory
management very well.

My question now.. from my undertanding, an array is always
allocated as N consecutive blocks of memory to hold its items.
How can this cause memory fragmentation, no matter how small
the increase in a reallocation is...?

Indirectly, it could allocate a block of N, then allocate a
block of N+1 and free the block of N. If there are intervening
allocations, there could be an isolated block the size of N
elements. In practice, however:

-- Historically, you'd using realloc because you expect to
increase the size of the existing buffer, with no new
allocation. If there's not space, at least with the
algorithms which were frequent 20 years ago (when I was
working in C), the growing block would fairly quickly
migrate to the end of the free space arena, where it would
almost always be possible to grow it without reallocating.
Unless you can reasonably expect something like this,
there's no point of using realloc.

-- If realloc cannot grow something in place, it must copy it.
I suspect that with many modern implementations of
malloc/free, this would almost always be the case (but I've
not actually looked to see). Copying is, or can be, an
expensive operation; the usual solution when unlimited
growth is required is to use exponential allocation, always
multiplying the size by a fixed factor. (For various
reasons, 1.5 is generally to be preferred, although 2 is
very common as well.)

std::vector uses the second strategy, and that's really what you
should be using here.

--
James Kanze (GABI Software) email:ja******* **@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sep 15 '07 #5

James Kanze

On Sep 14, 10:35 pm, "Jim Langston" <tazmas...@rock etmail.comwrote :
[...]

The way a few STL implementaions do it, asi I understand, is
to double the amount of memory each time. This allocates more
memory than needed which may not be used, but reduces the
number of allocations that will be needed.

The language specification requires that
std::vector<>:: push_back have amoritized constant time. This
requires some sort of exponential growth strategy. In older
implementations , 2 was a common factor, but Andy Koenig (I
think) pointed out that this has the serious drawback that even
if previously allocated blocks are contiguous, they will never
be large enough for the next allocation; the upper limit if you
want some memory reuse is, IIRC, the (1 + sqrt(5))/2---roughly
1.6. 1.5 is pretty close, and easy to calculate without
floating point arithmetic, and I suspect that this is the factor
used in many more recent implementations .

But, what is
this fragmentation? Consider

messages you first allocate 100 bytes.
100 bytes = messages
then you allocate 101
100 bytes = nothing
101 bytes = messages
then you allocate 102 bytes. It won't fit in the 100 bytes so it'll go
after that block
100 bytes = nothing
101 bytes = nothing
102 bytes = messages

You've now used up 303 bytes but are only using 102. Next
time you go to allocate 103 bytes, it would grab it from the
beginning (hopefully) but then you probably have other
allcoations taking place for other variables, and you wind up
with differnet size holes of unused memory in the freestore.

You can't make those sort of assumptions. It all depends on
realloc, and how it works. And realloc doesn't work well if
there are intervening allocations.

(For example, say some other variable needed 105 bytes now, it
would grab it making it:

105 bytes = someothervariab le
96 bytes = unused
102 bytes = messages

enough of this goes on in the program and you start running
out of memory to grab your increasingly large blocks of
memory. Even though the memory is there unused, it's all
broken up in little chunks that can't be grabbed at once.

What typically happens is that most of the allocated memory is
at the bottom of the free space arena. Fairly rapidly, realloc
can't find any room in the fragments, and migrates to the top.
Where everything that follows is free, so the next realloc's
just change the size of the buffer.

At least, that was the way it worked 20 years ago, when realloc
was relevant. Today, I suspect that you're right, and that
there will often be enough intervening allocations to require
realloc to actually reallocate and copy. At which point, it
ceases to be relevant. (Of course, an implementation could also
note when a block was being realloc'ed, and when it did actually
allocate something new, place the new memory somewhere where it
would have a maximum of space behind it, and avoid allocating
other memory behind it, so as to increase the chances of being
able to grow the allocation in place.)

One way to prevent this is to grab more memory than you need,
reducing the amount of reallocations that take place.
Doubling is one way. Instead of grabbing 101 blocks, grab
200. If you run out of that 200, grab 400 next time, etc...
This helps a bit decrease the amount of allocations that take
place. Even though fragmentation will still occur, it might
help a little.

Doubling is a very poor solution for this, as it means that you
can never reuse the freed memory for a new block, see above.

--
James Kanze (GABI Software) email:ja******* **@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sep 15 '07 #6

Similar topics

2029

creating a objekt with an array linear in memory

by: cody | last post by:

i basically want to create an object which contains an array (the last element of the class). the size of the array is determined when the object is created. for performance reasons (avoiding cache misses) the whole objekt should be in one linear chunk of memory, that is the array starts where the objekt ends in memory. class Cool { Cool (int arraysize) { .. }

C / C++

2808

Object Oriented Programming in C, Avoid Memory Allocation

by: Andrew Au | last post by:

Dear all, I am trying to write a piece of software that use Object Oriented design and implement it with C, I did the following == In Object.h == typedef struct ObjectStructure* Object; Object object_create(); int object_getIntegerAttribute(Object object);

C / C++

1859

Memory management and allocation

by: Dan Nilsen | last post by:

Hi! I'm writing a small piece of software that basically runs on an embedded system with a Power-PC cpu. This runs on a stripped down version of Linux - Busybox. As I'm writing a piece of code that basically acts as a server and that will be running for weeks or months and probably even longer, memory management is a topic that is quite crucial.

C / C++

3645

Checking return values for errors, a matter of style?

by: Johan Tibell | last post by:

I've written a piece of code that uses sockets a lot (I know that sockets aren't portable C, this is not a question about sockets per se). Much of my code ended up looking like this: if (function(socket, args) == -1) { perror("function"); exit(EXIT_FAILURE); } I feel that the ifs destroy the readability of my code. Would it be

C / C++

16446

Maximum Size of Byte Array

by: Gerrit | last post by:

Hi all, I'm getting an OutOfMemoryException when I initialize a byte array in C# like this: Byte test = new Byte; I'm using ASP.NET 2.0. In ASP.Net 1.1 it works fine. So what am I doing wrong?

C# / C Sharp

7978

Dynamic Memory Allocation for Matrix(2D array)

by: Peterwkc | last post by:

Hello all expert, i have two program which make me desperate bu after i have noticed the forum, my future is become brightness back. By the way, my problem is like this i the first program was compiled and run without any erros but the second program has a run time error when the function return from allocate and the ptr become NULL. How to fixed this? Second Program: /* Best Method to allocate memory for 2D Array because it's ...

C / C++

158

6136

A solution for the allocation failures problem

by: jacob navia | last post by:

1: It is not possible to check EVERY malloc result within complex software. 2: The reasonable solution (use a garbage collector) is not possible for whatever reasons. 3: A solution like the one proposed by Mr McLean (aborting) is not possible for software quality reasons. The program must decide

C / C++

2042

Re: Array optimizing problem in C++?

by: James Kanze | last post by:

On Apr 10, 4:06 pm, Jerry Coffin <jcof...@taeus.comwrote: Just a note, but in all of these languages, *all* objects are dynamically allocated. I wonder if this isn't more what caused you problems, rather than garbage collection. In a language in which all objects are dynamically allocated, garbage collection is almost a necessity. Can you imagine what

C / C++

3035

query to website

by: eternalLearner | last post by:

i am writing an client application that connects to the server and then sends data to server. but, the problem is that i get a java.net.ConnectException: Connection refused error. The code i have written is : //package bdn; /* The java.net package contains the basics needed for network operations. */ import java.net.*; /* The java.io package contains the basics needed for IO operations. */ import java.io.*; /** The SocketClient class...

Java

9666

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...

General

10410

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

10200

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

9984

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...

General

9020

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

7529

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

5551

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

3701

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

2909

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

General