execution time becomes unpredictable?!

vashwath

Might be off topic but I don't know where to post this question.Hope
some body clears my doubt.

The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said we
should not use dynamic allocation in real time systems, the code will
not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?

Thanks

Nov 14 '05 #1

Subscribe Post Reply

2510

Neil Kurzman

va******@rediffmail.com wrote:

Might be off topic but I don't know where to post this question.Hope
some body clears my doubt.

The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said we
should not use dynamic allocation in real time systems, the code will
not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?

Thanks

Because the time to find a block of memory the correct size will vary. Of
course since the system runs "forever", what will your system do if a
malloc fails due to fragmentation.

Nov 14 '05 #2

Walter Roberson

In article <11*********************@f14g2000cwb.googlegroups. com>,
<va******@rediffmail.com> wrote:

Might be off topic but I don't know where to post this question.Hope
some body clears my doubt. The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said we
should not use dynamic allocation in real time systems, the code will
not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?

In the general case, the lead is correct, but not -inevitably- so.

The problem with malloc() is that usually it is implimented by
searching through a free list looking for a chunk of a large enough
size (after alignment constraints). If memory is badly fragmented,
there could be a lot of available memory but it might take a long time
to find a chunk known to be long enough.

Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging together
adjacent regions; that could in turn trigger more merging, for an
indefinite amount of time spent in reclamation.

The more work that free() does to merge adjacent areas of memory, the
less deterministic the time it will take is -- but if it does not do
merging, then malloc() and kin could end up taking quite a bit of time
to search and could plausibly decide that there is no available memory
even though there are plenty of mergable spaces left.

Some free() implimentations hold off merging until malloc() request
comes in that could be satisfied by merging but not by itself alone.
In that case, free() might be fast, but every once in a while malloc() could
take quite a bit of time.
There are, I believe, versions of malloc() and free() around that have
upper bounds on time. For example, malloc() could divide memory into
a series of pools of fixed size, and always return the smallest
pool that fits. If the pools are sized as powers of 2, then simple
bitmaps can be used to discover whether a pool member is available
or not, and returning a pool member can come out as simply or'ing in
a bit to say the element is free. This pool strategy has to be tuned
to allow for enough of the right size of pool members, which might be
done once at initialization time. There are some varients that will
crack apart larger pools if necessary to provide more small entries:
this makes the timing less deterministic, but the algorithm can
be initialized to place a maximum limit on the depth it will do this
to, thus placing a maximum limit on the malloc()/free() time.

When you are working with real-time systems, there may be some
areas where sheer speed is important, but -generally-
"real-time systems" are more concerned with -predictability- of
response times than with sheer speed. An interrupt must be handled
within so many milliseconds -- so you can't be held up for several
seconds on garbage collection for something low priority. The
handling of the interrupt will not necessarily take very long
(it might involve very little computation as such), but for
"real-time systems" that not-very-long-time usually has to happen
within a relatively short time from the event trigger.
--
Warning: potentially contains traces of nuts.

Nov 14 '05 #3

Thomas Matthews

Peter Nilsson wrote:

va******@rediffmail.com wrote:
Might be off topic but I don't know where to post this question.

comp.programming perhaps.

Hope some body clears my doubt.

The coding standard of the project which I am working on say's
not to use malloc.When I asked my lead(I have just started
working) he said we should not use dynamic allocation in real
time systems, the code will not run in predictable time period.

You have responses already explaining why, but your lead's
concerns are surprising given that real time systems are
implemented in Java and other languages that use garbage
collection.

In critical systems, the main concern is what to do if a
dynamic allocation fails.

Actually, many real-time systems are implemented in C
and C++ as well as other languages that don't have
garbage collection.

As others have stated, garbage collection and memory
fragmentation are a nightmare for real-time and
critical systems. Even stack allocation causes
trembles in experienced programmers; they are
worried about overruning a limited stack area.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library

Nov 14 '05 #4

Richard Bos

ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:

The problem with malloc() is that usually it is implimented by
searching through a free list looking for a chunk of a large enough
size (after alignment constraints). If memory is badly fragmented,
there could be a lot of available memory but it might take a long time
to find a chunk known to be long enough.

Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging together
adjacent regions; that could in turn trigger more merging, for an
indefinite amount of time spent in reclamation.

Surely only a very unsmart implementation would ever need more than two
merges for a single free()? Once with the block before it; once with the
block after it. You only need more if you save up merges, which seems to
me to be a decidedly suboptimal solution for more reasons than this.

Richard

Nov 14 '05 #5

Eric Sosman

Richard Bos wrote:

ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:

Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging together
adjacent regions; that could in turn trigger more merging, for an
indefinite amount of time spent in reclamation.

Surely only a very unsmart implementation would ever need more than two
merges for a single free()? Once with the block before it; once with the
block after it. You only need more if you save up merges, which seems to
me to be a decidedly suboptimal solution for more reasons than this.

Two words: "Buddy system."

--
Eric Sosman
es*****@acm-dot-org.invalid

Nov 14 '05 #6

Chris Croughton

On 30 Mar 2005 21:20:50 -0800, Peter Nilsson
<ai***@acay.com.au> wrote:

va******@rediffmail.com wrote:
Might be off topic but I don't know where to post this question.
comp.programming perhaps.
Hope some body clears my doubt.

The coding standard of the project which I am working on say's
not to use malloc.When I asked my lead(I have just started
working) he said we should not use dynamic allocation in real
time systems, the code will not run in predictable time period.

You have responses already explaining why, but your lead's
concerns are surprising given that real time systems are
implemented in Java and other languages that use garbage
collection.

/Some/ real time systems are implemented in Java and other languages
with GC. Those will be ones where deterministic behaviour in the time
domain is not important to the application (or at least to that part of
it using dynamic memory -- a user interface, for instance, probably
doesn't care within a few hundred milliseconds, but the underlying
control application does care a lot). Not all applications can afford
to ignore the time, however.
In critical systems, the main concern is what to do if a
dynamic allocation fails.

That's certainly a major concern, and one for which there is often no
good answer except "don't do it".

Chris C

Nov 14 '05 #7

CBFalconer

Richard Bos wrote:

ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:
The problem with malloc() is that usually it is implimented by
searching through a free list looking for a chunk of a large
enough size (after alignment constraints). If memory is badly
fragmented, there could be a lot of available memory but it might
take a long time to find a chunk known to be long enough.

Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging
together adjacent regions; that could in turn trigger more
merging, for an indefinite amount of time spent in reclamation.

Surely only a very unsmart implementation would ever need more
than two merges for a single free()? Once with the block before
it; once with the block after it. You only need more if you save
up merges, which seems to me to be a decidedly suboptimal solution
for more reasons than this.

They only need two merges max, but the problem is finding those
candidates. Some systems search lists for adjacencies, in fact I
find most do this. [1] This makes the free operation O(n).
nmalloc keeps, for each block, links to physically adjacent blocks,
which makes the free operation O(1). The significant delays only
show up (in the O(N) systems) after many thousands of allocations
have been made.

The cost of those links appears in 'wasted' memory when mallocing.
In nmalloc they account for an extra 8 bytes per allocation (since
the links need to be bidirectional to avoid any searches).
Similarly the search for a free block is limited to a few block
sizes, expanding in size by two. This makes that operation O(1)
also, but increases the possibility of needless failure. For
details see:

<http://cbfalconer.home.att.net/download/nmalloc.zip>

[1] 'most' here is a sampling of DJGPP 2.03, CRTDLL.DLL in W98,
MSVCRT.DLL from VC 6, and whatever was in CYGWIN a few years ago.
IIRC all exhibited the O(N*N) behaviour in freeing long lists of
allocations. This was the principal reason for creating nmalloc.
It also added healthy debugging capabilities due to the extensive
linking. A quick check can show up inconsistencies, and catch
errant memory usage fairly early.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #8

CBFalconer

Eric Sosman wrote:

Richard Bos wrote:
ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:

Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging
together adjacent regions; that could in turn trigger more
merging, for an indefinite amount of time spent in reclamation.

Surely only a very unsmart implementation would ever need more
than two merges for a single free()? Once with the block before
it; once with the block after it. You only need more if you saveto
up merges, which seems me to be a decidedly suboptimal solution
for more reasons than this.

Two words: "Buddy system."

Which I have not looked at for many years, but I believe it has the
potential of leaving up to 50% of memory unavailable. Am I wrong?
My Knuths are hidden away somewhere.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #9

Rob Thorpe

we******@gmail.com wrote in message news:<11**********************@l41g2000cwc.googleg roups.com>...

va******@rediffmail.com wrote:
Might be off topic but I don't know where to post this question.Hope
some body clears my doubt.

The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said
we should not use dynamic allocation in real time systems, the code
will not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?
The implementation of malloc used for System V Unix (which was used by
compilers beyond System V Unix) gets slower and slower after repeated
usage. In fact I have seen it actually eventually lose memory
allocations. Some implementations of malloc/free are just really poor.

I hope they don't still use SysV malloc.

Basically the C language standard does not mandate anything about
performance and real implementations of memory heaps *really do*
execute in unpredictable amounts of time. That said, there certainly
do exist O(1) implementations of malloc/free (and realloc, if we ignore
the memcpy overhead.) So malloc() is only slow on *some* systems --
but it may be hard to know which such systems. As far as I know, gcc,
because its open source and "Lea" malloc implementation is reasonably
documented, we know that its memory allocation scheme is fairly
reasonable.
(The following isn't really that relevant to the O/P question, but
interesting:)

Ben Zorn at the university of Colorado used to have a nice webpage
with the source of many malloc's on it including his. Unfortunately
it looks like the university have taken his personal webpage down
since he moved to Microsoft.

Real implementations of malloc at use bining at least, and many other
tricks. For fast machines like modern desktops no-one seems to care
that much to make malloc run very fast. They do try to make it
stable, and stop it from wasting loads of memory which many simple
mallocs do. Very slow algorithms can't be used, but saving some
memory by spending a little more time is OK.

On a very small embedded system it might not be possible to use an
elaborate malloc algorithm because of code space. (Which is obvious,
why did I type it? :) )
OTOH, I've measured MSVC's memory allocation speed, and
its *MUCH* faster than gcc, but since its closed source, I can't sit
here and tell you that its even O(1) in all cases.

The time spent can be split up between malloc and free in different
ways. For example you could split it up so it works like this:-

malloc 1
------
m1 Look at size asked for, find the nearest bin
m2 Get a pointer to that bin from the free list
m3 attach it to the used list (? maybe if there is a used list)
m4 store some data just before the pointer location
m5 give it to the program

free 1
----
f1 Go to the ptr location look at the data you put in just before
f2 Add it back onto the relevant free list
f3 Do some merging (? maybe)
f4 Update the used list (? maybe)

Here, if there is no used list everything malloc does is fast, but
free must traverse the free list. It may also have to merge.

But the algorithm can be done the other way around, so in step 1
malloc traverses one of the free lists and takes an element from the
end. Some of the other data structures you could use would have the
same problem (but I'm sure there'd be one that would be reasonably
fast at both ends at the expense of space).

It could be that GNU malloc splits the time up differently to MS
malloc. Also, it could be that it has an interface into the operating
system that allows it to work faster from knowledge gleaned from
inside the OS. I don't think GNU malloc does that, even on Linux.

Anyway, a "large machine" malloc would probably want to use some
devious way to avoid fragmentation, which would probably make it
unlikely to be O(1). A malloc for a small machine might not need to
so thoroughly though.

(I'm aware I'm telling somes to Paul Hsieh about speed, which is
rather like talking to Gengis Khan about conquering Asia. Don't flame
me too much for any of the above that is wrong.)

Nov 14 '05 #10

Eric Sosman

CBFalconer wrote:

Eric Sosman wrote:
Richard Bos wrote:
ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:

Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging
together adjacent regions; that could in turn trigger more
merging, for an indefinite amount of time spent in reclamation.

Surely only a very unsmart implementation would ever need more
than two merges for a single free()? Once with the block before
it; once with the block after it. You only need more if you saveto
up merges, which seems me to be a decidedly suboptimal solution
for more reasons than this.

Two words: "Buddy system."

Which I have not looked at for many years, but I believe it has the
potential of leaving up to 50% of memory unavailable. Am I wrong?
My Knuths are hidden away somewhere.

For a binary buddy system, the worst case occurs when
all request sizes are of the form (1<<n)+1: the size must be
rounded up to 1<<(n+1), thus wasting just a smidgen less than
half the memory.

Because programmers are superstitious and like to use
powers of two even when there's no reason to, cases close to
the worst are distressingly common. Programmer habitually
requests memory in power-of-two sizes, the memory manager
tacks on a few bytes of metadata, and lo! the block size is
power-of-two plus a small amount ...

Buddy systems don't necessarily have to be binary,
where a large block splits into two equally-sized buddies.
I've seen mention of a Fibonacci system, where a large
block of size F[n] splits into buddies of F[n-1] and F[n-2];
such a scheme might cope better with programmers' fondness
for magic numbers. Can't say I've ever seen such a system
in production, though.

<Drags out Knuth>: "J.M. Robson has shown [...] that
dynamic storage allocation strategies which never relocate
reserved blocks cannot possibly be guaranteed to use memory
efficiently [...] Even when blocks are restricted to be
of sizes 1 and 2, overflow might occur with the memory only
about 2/3 full, no matter what allocation algorithm is used!"

--
Er*********@sun.com

Nov 14 '05 #11

REH

<va******@rediffmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegro ups.com...

Might be off topic but I don't know where to post this question.Hope
some body clears my doubt.

The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said we
should not use dynamic allocation in real time systems, the code will
not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?

Thanks

What that means is the time to execute the call to malloc is
non-deterministic. That is, you cannot know exactly how long it will take.
The reason it is important in real-time systems, is you have hard
dead-lines. You have X amount of time to handle an event. You cannot prove
that the software can meet its deal-lines if you call functions with
non-deterministic execution times.

REH

Nov 14 '05 #12

Stephen Sprunk

"REH" <bo***@nowhere.net> wrote in message
news:d2**********@cui1.lmms.lmco.com...

<va******@rediffmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegro ups.com...
Might be off topic but I don't know where to post this question.
Hope some body clears my doubt.

The coding standard of the project which I am working on say's
not to use malloc.When I asked my lead(I have just started
working) he said we should not use dynamic allocation in real
time systems, the code will not run in predictable time period.
Can anybody tell what does he mean?Why the execution time
becomes unpredictable?

What that means is the time to execute the call to malloc is
non-deterministic. That is, you cannot know exactly how long it
will take. The reason it is important in real-time systems, is you
have hard dead-lines. You have X amount of time to handle an
event. You cannot prove that the software can meet its deal-lines
if you call functions with non-deterministic execution times.

Non-determinism isn't necessarily bad as long as you can prove the _maximum_
time required for all the functions you call (plus your own function)
doesn't exceed the deadline. The problem with malloc() and free() is that
there is usually no way to prove the maximum time required for one or the
other (or sometimes both). I'm not even sure it's possible in theory for
both.

In some environments, though, this is not enough and the system needs the
exact same cycle count for _every_ execution. Few systems can provide this
even at the hardware level, but they exist.

S

Nov 14 '05 #13

E. Robert Tisdale

va******@rediffmail.com wrote:

The coding standard [for] the project which I am working on
say's not to use malloc.
When I asked my lead (I have just started working), he said that,
"We should not use dynamic allocation in real time systems
[because] the code will not run in predictable time period."
Can anybody tell what he means?
Why [does] the execution time becomes unpredictable?

It is practically impossible to predict how long
it will take malloc to find a free memory fragment
that is at least as large as you requested.

Dynamic memory allocation may still be used in real-time programs
in the initialization phase
before it is expected to service real-time events.

Modern implementations of malloc
on high performance computer architectures are so fast
that you would almost never miss a real-time deadline
but that's not the problem. The problem is that,
if your program begins to miss real-time deadlines intermittently,
you would have no practical way to eliminate malloc
as the source of the problem.

This still wouldn't be a problem except that
real-time applications are so often also *critcal* applications --
you just can't afford to miss deadlines.
For this reason, real-time programmers like your lead
prefer to avoid dynamic memory allocation altogether.

Nov 14 '05 #14

Walter Roberson

In article <42***************@news.individual.net>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:

ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:
Similarily, depending on how smart the free() operation is, the
procedure for returning a chunk of memory might involve merging together
adjacent regions; that could in turn trigger more merging, for an
indefinite amount of time spent in reclamation.
Surely only a very unsmart implementation would ever need more than two
merges for a single free()? Once with the block before it; once with the
block after it. You only need more if you save up merges, which seems to
me to be a decidedly suboptimal solution for more reasons than this.

I don't know if there is a formal name like "Buddy System"
for what I was thinking of. Consider, though, a system which
is written to only merge together blocks to produce larger
superblocks with magic size and alignment properties.

To put it more concretely, free() might only merge together
blocks to the extent that the resulting block size was 2^N
aligned on a "natural" binary boundary.

In such a case, if the block size being returned was the
minimum size handled by the allocator (e.g, 4 bytes) then
the return could trigger a merge with the 4 bytes beside it,
and the resulting 8 byte block could then be merged with the
8 byte block beside that, and so on up through the powers of
2 to the maximum allocation size.

This sort of merging is trivial if handled by a bitmap, but
if for some reason this kind of merging was being done -and-
there was the traditional "each free block points to the
next free block" chain, then a number of pointer fixups could
end up being done.
This issue only arises in situations where certain block sizes or
alignments are magic and those magic properties are to be preserved; as
you point out, it does not come into play if free() just merges
together adjacent blocks of whatever size and alignment happen to be
there.

--
"No one has the right to destroy another person's belief by
demanding empirical evidence." -- Ann Landers

Nov 14 '05 #15

REH

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112291657.2c150949c1527017fcfc6778c3e4ea67@t eranews...

"REH" <bo***@nowhere.net> wrote in message
news:d2**********@cui1.lmms.lmco.com...
<va******@rediffmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegro ups.com...
Might be off topic but I don't know where to post this question.
Hope some body clears my doubt.

The coding standard of the project which I am working on say's
not to use malloc.When I asked my lead(I have just started
working) he said we should not use dynamic allocation in real
time systems, the code will not run in predictable time period.
Can anybody tell what does he mean?Why the execution time
becomes unpredictable?
What that means is the time to execute the call to malloc is
non-deterministic. That is, you cannot know exactly how long it
will take. The reason it is important in real-time systems, is you
have hard dead-lines. You have X amount of time to handle an
event. You cannot prove that the software can meet its deal-lines
if you call functions with non-deterministic execution times.

Non-determinism isn't necessarily bad as long as you can prove the

_maximum_ time required for all the functions you call (plus your own function)
doesn't exceed the deadline. If you can prove there is a max. time, then it *is* deterministic.
The problem with malloc() and free() is that
there is usually no way to prove the maximum time required for one or the
other (or sometimes both). I'm not even sure it's possible in theory for
both.

Yes, hence, non-determinstic. sheesh.

Nov 14 '05 #16

Keith Thompson

"REH" <bo***@nowhere.net> writes:

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112291657.2c150949c1527017fcfc6778c3e4ea67@t eranews...

[...]

Non-determinism isn't necessarily bad as long as you can prove the
_maximum_ time required for all the functions you call (plus your
own function) doesn't exceed the deadline.

If you can prove there is a max. time, then it *is* deterministic.

If the time required has a fixed upper bound, but the time for any
instance depends on the phase of the moon divided by a random number,
I'd call that non-deterministic (with a fixed upper bound).

If the word "deterministic" has some other meaning in some specific
field, it's not one I'm familiar with.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #17

Minti

we******@gmail.com wrote:

va******@rediffmail.com wrote:
Might be off topic but I don't know where to post this question.Hope some body clears my doubt.

The coding standard of the project which I am working on say's not to use malloc.When I asked my lead(I have just started working) he said we should not use dynamic allocation in real time systems, the code
will not run in predictable time period.Can anybody tell what does he mean?Why the execution time becomes unpredictable?
The implementation of malloc used for System V Unix (which was used

by compilers beyond System V Unix) gets slower and slower after repeated
usage. In fact I have seen it actually eventually lose memory
allocations. Some implementations of malloc/free are just really poor.
Just for my reference what algorithm did it use?

Basically the C language standard does not mandate anything about
performance and real implementations of memory heaps *really do*
execute in unpredictable amounts of time. That said, there certainly
do exist O(1) implementations of malloc/free (and realloc, if we ignore the memcpy overhead.) So malloc() is only slow on *some* systems --
but it may be hard to know which such systems. As far as I know, gcc, because its open source and "Lea" malloc implementation is reasonably
documented, we know that its memory allocation scheme is fairly
reasonable. OTOH, I've measured MSVC's memory allocation speed, and
its *MUCH* faster than gcc, but since its closed source, I can't sit
here and tell you that its even O(1) in all cases.

This O(1) is I believe only guaranteed as long as the memory requests
are satified using Memory pools, once your memory pools are exhausted,
all bets are off. One more thing, I came across a comment by James
Gosling that in many cases on parallel machines, Java's "new" was much
faster than C's "malloc". I believe that this _only_ because that the
C's implementation was not "parallelized", any other suspicions?

Note to OP: Try to get to know about your compiler and Operating System
specifially how it handles memory allocations. What sort of memory
pools are kept? If the memory pools don't meet your requirement,
allocate your own memory pools, and handle requests using your own
memory pool. There are several algorithms. I am not sure if CBFalconer
is offering source code along. But if he is good news.

--
Imanpreet Singh Arora
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #18

James McIninch

<posted & mailed>

It doesn't have anything to do with C per se.

Namely, if you dynamically request memory, you don't know in advance what
the system is going to do to satisfy the request. Is it going to hand you a
chunk already allocated to your program? does it need to search a
free-memory list to fins one for you? Will it rearrange memory to get you a
chunk the size you want? Will it allocate space in a page file or in swap?

You can't know, thus you can't predict how long it may take to answer the
request. 1 millisecond? 100 miliseconds? 2 seconds? You have no idea.

va******@rediffmail.com wrote:

Might be off topic but I don't know where to post this question.Hope
some body clears my doubt.

The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said we
should not use dynamic allocation in real time systems, the code will
not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?

Thanks

--
Remove '.nospam' from e-mail address to reply by e-mail

Nov 14 '05 #19

REH

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...

"REH" <bo***@nowhere.net> writes:
"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112291657.2c150949c1527017fcfc6778c3e4ea67@t eranews...

If the time required has a fixed upper bound, but the time for any
instance depends on the phase of the moon divided by a random number,
I'd call that non-deterministic (with a fixed upper bound).

If the word "deterministic" has some other meaning in some specific
field, it's not one I'm familiar with.

Well, then we'll have to agree to disagree. To me, and everyone I have ever
worked with in the real-time embedded world, non-deterministic means it
cannot be bounded. Whether or not you can predict the time of an individual
run will take, the fact that it still has a fixed upper bound, make it
deterministic.

Nov 14 '05 #20

Keith Thompson

"REH" <bo***@nowhere.net> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"REH" <bo***@nowhere.net> writes:
> "Stephen Sprunk" <st*****@sprunk.org> wrote in message
> news:1112291657.2c150949c1527017fcfc6778c3e4ea67@t eranews...

If the time required has a fixed upper bound, but the time for any
instance depends on the phase of the moon divided by a random number,
I'd call that non-deterministic (with a fixed upper bound).

If the word "deterministic" has some other meaning in some specific
field, it's not one I'm familiar with.

Well, then we'll have to agree to disagree. To me, and everyone I have ever
worked with in the real-time embedded world, non-deterministic means it
cannot be bounded. Whether or not you can predict the time of an individual
run will take, the fact that it still has a fixed upper bound, make it
deterministic.

Ok, fair enough. Since I don't work in the real-time embedded world,
I'm not familiar with its jargon. (You can probably expect similar
confusion from others outside your field -- just as we C standards
junkies run into confusion about terms like "byte" and "object".)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #21

Christian Bau

In article <11*********************@f14g2000cwb.googlegroups. com>,
va******@rediffmail.com wrote:

Might be off topic but I don't know where to post this question.Hope
some body clears my doubt.

The coding standard of the project which I am working on say's not to
use malloc.When I asked my lead(I have just started working) he said we
should not use dynamic allocation in real time systems, the code will
not run in predictable time period.Can anybody tell what does he
mean?Why the execution time becomes unpredictable?

Well, can you predict it? If you can't, it is unpredictable (at least to
you). If you write software that has to complete a task in a given time
with one hundred percent certainty, then you have to be able to predict
how long it takes or at least give an upper bound.

So how long does a call malloc (1000) take? Give me an answer, then
prove that it is correct.

Nov 14 '05 #22

Flash Gordon

REH wrote:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"REH" <bo***@nowhere.net> writes:
"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112291657.2c150949c1527017fcfc6778c3e4ea6 7@teranews...

If the time required has a fixed upper bound, but the time for any
instance depends on the phase of the moon divided by a random number,
I'd call that non-deterministic (with a fixed upper bound).

If the word "deterministic" has some other meaning in some specific
field, it's not one I'm familiar with.

Well, then we'll have to agree to disagree. To me, and everyone I have ever
worked with in the real-time embedded world, non-deterministic means it
cannot be bounded. Whether or not you can predict the time of an individual
run will take, the fact that it still has a fixed upper bound, make it
deterministic.

Regardless of whether having a fixed upper bound makes it deterministic,
there are times when you need the execution time to be as consistent as
possible. I know this because I've worked on such systems.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.

Nov 14 '05 #23

Walter Roberson

In article <d2*********@cui1.lmms.lmco.com>, REH <bo***@nowhere.net> wrote:

Well, then we'll have to agree to disagree. To me, and everyone I have ever
worked with in the real-time embedded world, non-deterministic means it
cannot be bounded. Whether or not you can predict the time of an individual
run will take, the fact that it still has a fixed upper bound, make it
deterministic.

Though it helps to know the range of time distributions, and
to know whether the worst-case scenarios can be avoided by careful
use of the resource, and to know whether the worst-case scenarios
can be somehow cleaned up during spare compute cycles (e.g.,
if the worst case involves a whole bunch of searches, spare cycles
might be used to perform an interruptable resort-by-size algorithm.)

Time variations are not unimportant in real-time work: it's still
good to know, for example, how well the system will stand up when
pushed beyond it's guaranteed-to-handle limits.

e.g. does the 101st phone call just result in a dialtone 20 ms later
than would otherwise be the case, or does it result in all 100 existing
calls being dropped? Does the 101st incoming missile still get shot
down but closer to its target than is comfortable, or does the 101st
incoming missile get ignored and left to hit the target untracked, or
do all the missiles get ignored as soon as the 101'st shows up, or
does it lose track of a random missile for a moment and regains track
of it later, effectively "juggling" which ones are being tracked?

--
Warning: potentially contains traces of nuts.

Nov 14 '05 #24

Walter Roberson

In article <1i************@brenda.flash-gordon.me.uk>,
Flash Gordon <sp**@flash-gordon.me.uk> wrote:

Regardless of whether having a fixed upper bound makes it deterministic,
there are times when you need the execution time to be as consistent as
possible. I know this because I've worked on such systems.

Would you be able to give us an example without compromising NDAs?

I can think of one example: information about security systems can
be gleaned by careful monitoring of the processor work -- cycle
counts, even which -section- of a microprocessor is being used.
On highly secure devices, you want to avoid having your opponents
do the equivilent of putting the stethoscope to the lock and deducing
the numbers one at a time by hearing the lock workload difference
when the key is varying degrees of correct.
--
"I want to make sure [a user] can't get through ... an online
experience without hitting a Microsoft ad"
-- Steve Ballmer [Microsoft Chief Executive]

Nov 14 '05 #25

CBFalconer

we******@gmail.com wrote:

.... snip ...
Basically the C language standard does not mandate anything about
performance and real implementations of memory heaps *really do*
execute in unpredictable amounts of time. That said, there certainly
do exist O(1) implementations of malloc/free (and realloc, if we ignore
the memcpy overhead.) So malloc() is only slow on *some* systems --
but it may be hard to know which such systems. As far as I know, gcc,
because its open source and "Lea" malloc implementation is reasonably
documented, we know that its memory allocation scheme is fairly
reasonable. OTOH, I've measured MSVC's memory allocation speed, and
its *MUCH* faster than gcc, but since its closed source, I can't sit
here and tell you that its even O(1) in all cases.

Do try to be reasonably accurate. malloc is a part of the library,
so it has nothing to do with either gcc or MSVC, but everything to
do with the runtime library. Both sources (many in the case of
gcc) are available, but the MSVC stuff cannot be published without
permission. gcc operates on many systems with different
libraries. MSVC operates on an x86 under Windoze only. Depending
on where malloc gets the memory to hand out, the execution time can
be perfectly predictable, or at least have an upper bound put to
it. The time binds appear during free.

In practice you can often play implementor and link in a new malloc
package before linking the runtime library. There are no
guarantees once you do this.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

Nov 14 '05 #26

websnarf

Rob Thorpe wrote:

we******@gmail.com wrote:
The implementation of malloc used for System V Unix (which was
used by compilers beyond System V Unix) gets slower and slower
after repeated usage. In fact I have seen it actually eventually
lose memory allocations. Some implementations of malloc/free are
just really poor.
I hope they don't still use SysV malloc.

Right. Very few platforms do. But it clearly has stuck in the
memories/experiences of some people. This is why embedded developers
very often stay away from malloc. I was just diagnosing why people
might feel this way (apparently SCO Unix still uses this malloc.)

And the language standard is of no use here. Compare this to C++'s STL
vector sorting. The C++ language standard actually requires that the
algorithm have O(n*ln(n)) performance and that it performs a "stable
sort" and so on. Something as simple as "malloc and free must have
O(1) performance" would be a extremely valuable.
[...] The time spent can be split up between malloc and free
in different ways. For example you could split it up so it
works like this:-

malloc 1
------
m1 Look at size asked for, find the nearest bin
m2 Get a pointer to that bin from the free list
m3 attach it to the used list (? maybe if there is a used list)
m4 store some data just before the pointer location
m5 give it to the program

free 1
----
f1 Go to the ptr location look at the data you put in just before
f2 Add it back onto the relevant free list
f3 Do some merging (? maybe)
f4 Update the used list (? maybe)

Here, if there is no used list everything malloc does is fast,
but free must traverse the free list. It may also have to merge.
Both malloc and free can use "binning". My own testing indicates that
merging is necessary to avoid "lost allocations" or a failure to
allocate even when memory is available. I.e., merging is not optional.
There are also other interesting ideas/tricks that are worth
consideration.
But the algorithm can be done the other way around, so in
step 1 malloc traverses one of the free lists and takes an
element from the end. Some of the other data structures
you could use would have the same problem (but I'm sure
there'd be one that would be reasonably fast at both ends
at the expense of space).

It could be that GNU malloc splits the time up differently
to MS malloc. Also, it could be that it has an interface
into the operating system that allows it to work faster from
knowledge gleaned from inside the OS. I don't think GNU
malloc does that, even on Linux.
Well the LEA allocator, and UNIX-allocators in general, assume that you
have a sbrk(), which Windows doesn't have. So presumably sbrk() is
emulated somehow (perhaps slowly). However, this only affect
allocations that are made when expanding the programs' maximal foot
print -- my benchmarks lean more heavily on the quiescent state (i.e.,
doing mallocs after lots of frees have happened and therefore should be
obtained from free lists not sbrk())
Anyway, a "large machine" malloc would probably want to
use some devious way to avoid fragmentation, which would
probably make it unlikely to be O(1). A malloc for a small
machine might not need to so thoroughly though.
Ok here's the thing -- there is no way to make a malloc that
deterministically "avoids fragmentation". Think of the following
scenario:

for (i=0; ; i++) {
if (NULL == (p[i] = malloc (16364))) break;
}
/* What should the mallocs be returning to make sure the heap
is not defragmented at this point? There doesn't seem to
be anything better than packed stack-like allocations. */

for (j=0; j < i; j+=2) free (p[i]);
/* Presumably we've recovered half our memory, but what does the
heap look like now? Now matter what we got back from the
calls above, one way or another there is a way to force the
heap to fragment horrendously with some sequence of frees
such as this, even with lots of memory recovered. */

p[0] = malloc (32768);
/* Well there should be plenty of memory available, but is this
even possible now? */

This of course ignores "memory mapping tricks" -- but even then, one is
still limited by address ranges anyway.

Practical defragmentation (i.e., just worried about the most common
cases) is more straightforward -- just make sure allocations are as
packed together as possible. This will naturally increase the size of
your larger contiguous blocks. One way to try to effect this is, when
given a choice, try to return allocations with the "lowest" address.
You can, of course, use heuristics, like assuming that programs will
tend to allocate many of the same sized blocks -- and thus, upon
detection, try to return these as pieces broken off from larger
contiguous blocks.

Since context independent malloc/free designs cannot deterministically
avoid degenerate scenarios that cause arbitrary amounts of
fragmentation anyways, I don't think trying to use non-O(1) algorithms
are the right way to go. Heuristics do fairly well in combating
fragmentation in practice while leaving good algorithms as O(1)
anyways. So personally, I would never recommend an algorithm that went
over the top in trying to combat fragmentation at the expense of
algorithmic complexity.
(I'm aware I'm telling somes to Paul Hsieh about speed, which
is rather like talking to Gengis Khan about conquering Asia.
Don't flame me too much for any of the above that is wrong.)

Lol! Nah, we're all somewhere on the journey to better understanding
of this stuff, and certainly you seem a lot further than most.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 14 '05 #27

websnarf

Minti wrote:

we******@gmail.com wrote:
va******@rediffmail.com wrote:
Might be off topic but I don't know where to post this
question. Hope some body clears my doubt.

The coding standard of the project which I am working on say's
not to use malloc.When I asked my lead(I have just started
working) he said we should not use dynamic allocation in real
time systems, the code will not run in predictable time
period.Can anybody tell what does he mean? Why the execution
time becomes unpredictable?
The implementation of malloc used for System V Unix (which was
used by compilers beyond System V Unix) gets slower and slower
after repeated usage. In fact I have seen it actually eventually
lose memory allocations. Some implementations of malloc/free are
just really poor.

Just for my reference what algorithm did it use?

I wasn't able to analyze it deeply enough to truly understand
everything about it. But what I know for sure is that in its inner
loop it performed some sort of linear search for new blocks when it did
mallocs (presumably looking for a block that was big enough). So as
the number of blocks increases (due to fragmentation I presume) the
performance obviously just goes down. Basically, it doesn't use any
form of binning.

Basically the C language standard does not mandate anything about
performance and real implementations of memory heaps *really do*
execute in unpredictable amounts of time. That said, there
certainly do exist O(1) implementations of malloc/free (and
realloc, if we ignore the memcpy overhead.) So malloc() is only
slow on *some* systems -- but it may be hard to know which such
systems. As far as I know, gcc, because its open source and "Lea"
malloc implementation is reasonably documented, we know that its
memory allocation scheme is fairly reasonable. OTOH, I've
measured MSVC's memory allocation speed, and its *MUCH* faster
than gcc, but since its closed source, I can't sit here and tell
you that its even O(1) in all cases.

This O(1) is I believe only guaranteed as long as the memory
requests are satified using Memory pools, once your memory pools
are exhausted, all bets are off.

Uhh ... that's possible, but usually that just means you have to fetch
from sbrk() or VirtualAlloc() or some other system call. Technically,
yes that could take an unknown amount of time, but it still should be
reducible to O(ln(system-bits)) < O(1) time.
[...] One more thing, I came across a comment by James
Gosling that in many cases on parallel machines, Java's "new" was
much faster than C's "malloc". I believe that this _only_ because
that the C's implementation was not "parallelized", any other
suspicions?
Well actually I would be rather surprised if "new" wasn't *always*
faster than malloc() hands down. Garbage collection has the advantage
that it can always procure its memory from bigger blocks (and thus be
very stack-like), since it knows that there will not be common
interleaving frees and it doesn't have to support realloc. The real
complexity is supposed to be at garbage collection time when the free
blocks have to be recoallesced into blocks ready for use by "new".
Note to OP: Try to get to know about your compiler and Operating
System specifially how it handles memory allocations. What sort of
memory pools are kept? If the memory pools don't meet your
requirement, allocate your own memory pools, and handle requests
using your own memory pool.
Indeed. The gcc LEA allocator, we know is O(1), as is the BSD
allocator however it may reject allocations that should be possible,
due to excessive prebinning.) And as I said, if you are willing to
trust Microsoft, their allocator *seems* fast according to my
measurements. If the algorithm doesn't use some sort of "binning" I
would be highly suspicious of it -- in fact I believe its not possible
to be O(1) and still return with non-NULL allocations when it should be
possible without something like binning or an equivalent strategy.
There are also companies that *sell* high performance memory allocators
such as Microquill: http://www.microquill.com/ (many CPU companies use
this heap when testing their system on the Spec CPU benchmark.)
[...] There are several algorithms. I am not sure if CBFalconer
is offering source code along. But if he is good news.

I would be highly suspicious of any code CBFalconer offers (his idea of
a "general hash table" is one that can hold at most 8.5 million
entries). Certainly you should heavily audit the code before use.

---
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Nov 14 '05 #28

CBFalconer

we******@gmail.com wrote:

Minti wrote:
.... snip ...
[...] There are several algorithms. I am not sure if CBFalconer
is offering source code along. But if he is good news.

I would be highly suspicious of any code CBFalconer offers (his idea of
a "general hash table" is one that can hold at most 8.5 million
entries). Certainly you should heavily audit the code before use.

This comes from someone who can't recognize that a hash function
that requires a length parameter has to measure the length of an
input string, or who can't recognize that a 32 bit quantity won't
fit into an int and be portable. He is also a proponent of shift
operations on signed quantities. Similarly he seems unable to
recognize that the 8.5 million limit he mentioned is on entries,
each of which requires data storage, and thus represents something
between 5 and 100 up times that in storage bytes. Besides which,
if the user wishes, he simply adds an entry to a table to increase
the maximum. Meanwhile, all smaller quantities are handled with
minimum wastage (about 50% max).

<Aside to Hsieh - hash tables are intended to be in-memory storage,
and tend to be less efficient when they cause thrashing in a
virtual memory system. This puts an inherent maximum on the item
count usable on a given class of systems. If this is not clear to
you I suggest you use e-mail to get an explanation in smaller words
- this is not the appropriate forum>

<For everyone: The following is a quote from the hashlib source,
and is the area that creates this horrendous 8 mega-entries
limitation. I think how to adjust the upper limit should be fairly
obvious to any reasonably skilled C programmer.>

/* table of k where 2**n - k is prime, for n=8 up. 0 ends */
/* These numbers are chosen so that memory allocation will */
/* usually allow space for system overhead in a 2**n block */
/*http://www.utm.edu/research/primes/l...mall/0bit.html */
#define FIRSTN 8
static int primetbl[] = {45, 45, 41, 45, 45, 45, 45, 49,
57, 49, 41, 45, 59, 55, 57, 61, 0};
/* So the prime of interest, vs index i into above table, */
/* is ( 2**(FIRSTN + i) ) - primetbl[i] */
/* The above table suffices for about 8,000,000 entries. */

Hsieh also seems unable to recognize that the operation of the
malloc package is independant of the compiler, nor that
cryptographic qualities of a hash function have nothing to do with
its suitability for a hash table. There seems to be a lesson here,
that mathematical ability and knowledge have very little to do with
the ability to create clean and trouble free code. However he is
making some progress, in that he has recognized that the action of
sbrk (or an equivalent system function) can affect the timing of a
malloc call. Further hint: This is the one of the reasons malloc
is part of the system and is generally not portable. Alignment
also raises its head here.

At any rate, you (the public) may judge for yourself. My code is
generally available at:

<http://cbfalconer.home.att.net/download/>

and I recommend the hashlib and nmalloc packages for this
controversy, together with id2id-20 and hshftst. The latter arose
out of Hsiehs earlier erroneous claims.

I concede that I am basing my low opinion of his code on his
published hash function. It is possible that this is an inadequate
sample. At any rate, I will not get excited and irrational if told
that there is a problem with my code, and I welcome any such
reports.

--
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare

Nov 14 '05 #29

Flash Gordon

Walter Roberson wrote:

In article <1i************@brenda.flash-gordon.me.uk>,
Flash Gordon <sp**@flash-gordon.me.uk> wrote:
Regardless of whether having a fixed upper bound makes it deterministic,
there are times when you need the execution time to be as consistent as
possible. I know this because I've worked on such systems.

Would you be able to give us an example without compromising NDAs?

I can describe an algorithm where this is important without going in to
the specific application.

The SW is controlling a motor which is moving something back and forth
in a saw-tooth wave, so the output motion looks like:
/| /| /|
/ | / | / |
/ |/ |/ |/
etc.

The "flyback" time (i.e. the down stroke) is very fast in this instance
but obviously not instantaneous!

Every x ms you do the following:
1) Measure the position
2) Calculate the current error
3) Calculate the required drive to apply to the motor (see 5 below)
4) Apply drive
5) Calculate a correction to apply earlier in the next mechanical
cycle to avoid getting the error in the first place

This is obviously triggered off an interrupt.

The motor is a very basic job, basically an electromagnet pulling
against a spring, so the drive is the force applied against the spring.
So if you apply the force late then it will start slowing, if you apply
it early it will speed up. So if there is a variation in time between
steps 1 and 4 you will introduce a random error which will be fed in to
the next mechanical cycle when the timing variation might be in the
opposite direction leading to more error. So basically you loose
mechanical accuracy if you have any time variation.

Obviously, in this instance I did not have any reason to even think of
using dynamic memory allocation. However, control loops of this form
where stability of timing is important are not *that* uncommon where you
are trying for accurate control.

Another instance where accurate timing, rather than timing with a known
maximum, is important is when they are giving your heart a shock to
stabilise it's rhythms (when you have arythmia as opposed to being flat
lined). If you give the shock too late OR too early it can damage your
heart. This is only hear say though, where as the previous example was
one I actually worked on myself.
I can think of one example: information about security systems can
be gleaned by careful monitoring of the processor work -- cycle
counts, even which -section- of a microprocessor is being used.
On highly secure devices, you want to avoid having your opponents
do the equivilent of putting the stethoscope to the lock and deducing
the numbers one at a time by hearing the lock workload difference
when the key is varying degrees of correct.

I'm not an expert on such systems, so can't really comment about them.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.

Nov 14 '05 #30

Chris Croughton

On Thu, 31 Mar 2005 15:21:29 -0500, REH
<bo***@nowhere.net> wrote:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"REH" <bo***@nowhere.net> writes:
> "Stephen Sprunk" <st*****@sprunk.org> wrote in message
> news:1112291657.2c150949c1527017fcfc6778c3e4ea67@t eranews...

If the time required has a fixed upper bound, but the time for any
instance depends on the phase of the moon divided by a random number,
I'd call that non-deterministic (with a fixed upper bound).

If the word "deterministic" has some other meaning in some specific
field, it's not one I'm familiar with.

Well, then we'll have to agree to disagree. To me, and everyone I have ever
worked with in the real-time embedded world, non-deterministic means it
cannot be bounded. Whether or not you can predict the time of an individual
run will take, the fact that it still has a fixed upper bound, make it
deterministic.

Not anywhere I've been in the RT embedded world in the last 25+ years!
Deterministic means "can be determined in advance" (and in particular
can be reproduced with the same input conditions). If the time can be
bounded, we call it 'bounded' (duh!). There are some functions where
the bound is so small that it doesn't matter much so the behaviour of
the system is 'nearly' deterministic, but it still isn't deterministic
if the behaviour changes at all.

And there's no reason that dynamic memory allocation has an upper bound
on its time, it could sit there forever shuffling memory to try to find
some that's free. It's not even deterministic whether it succeeds or
do...

Chris C

Nov 14 '05 #31

Stephen Sprunk

"REH" <bo***@nowhere.net> wrote in message
news:d2*********@cui1.lmms.lmco.com...

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
Non-determinism isn't necessarily bad as long as you can prove
the _maximum_ time required for all the functions you call (plus
your own function) doesn't exceed the deadline.

If you can prove there is a max. time, then it *is* deterministic.

That depends on if "deterministic" is defined as fixed time or bounded time.
I've always understood it to mean the former.

S

--
Stephen Sprunk "Those people who think they know everything
CCIE #3723 are a great annoyance to those of us who do."
K5SSS --Isaac Asimov

Nov 14 '05 #32

REH

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112377387.ed4aba15b26977118e5a872681fdc3d7@t eranews...

"REH" <bo***@nowhere.net> wrote in message
news:d2*********@cui1.lmms.lmco.com...
"Stephen Sprunk" <st*****@sprunk.org> wrote in message
Non-determinism isn't necessarily bad as long as you can prove
the _maximum_ time required for all the functions you call (plus
your own function) doesn't exceed the deadline.
If you can prove there is a max. time, then it *is* deterministic.

That depends on if "deterministic" is defined as fixed time or bounded

time. I've always understood it to mean the former.

In the company I work for, it was always the later. No big whoop, I guess.

REH

Nov 14 '05 #33

Rob Thorpe

we******@gmail.com wrote in message news:<11**********************@f14g2000cwb.googleg roups.com>...

Rob Thorpe wrote:
we******@gmail.com wrote:
The implementation of malloc used for System V Unix (which was
used by compilers beyond System V Unix) gets slower and slower
after repeated usage. In fact I have seen it actually eventually
lose memory allocations. Some implementations of malloc/free are
just really poor.
I hope they don't still use SysV malloc.

Right. Very few platforms do. But it clearly has stuck in the
memories/experiences of some people. This is why embedded developers
very often stay away from malloc. I was just diagnosing why people
might feel this way (apparently SCO Unix still uses this malloc.)

And the language standard is of no use here. Compare this to C++'s STL
vector sorting. The C++ language standard actually requires that the
algorithm have O(n*ln(n)) performance and that it performs a "stable
sort" and so on. Something as simple as "malloc and free must have
O(1) performance" would be a extremely valuable.

I think it would be for people doing embedded software. It's not so
important as the average speed for the rest of us. Maybe it should be
in an embedded C standard.

[...] The time spent can be split up between malloc and free
in different ways. For example you could split it up so it
works like this:-

malloc 1
------
m1 Look at size asked for, find the nearest bin
m2 Get a pointer to that bin from the free list
m3 attach it to the used list (? maybe if there is a used list)
m4 store some data just before the pointer location
m5 give it to the program

free 1
----
f1 Go to the ptr location look at the data you put in just before
f2 Add it back onto the relevant free list
f3 Do some merging (? maybe)
f4 Update the used list (? maybe)

Here, if there is no used list everything malloc does is fast,
but free must traverse the free list. It may also have to merge.
Both malloc and free can use "binning". My own testing indicates that
merging is necessary to avoid "lost allocations" or a failure to
allocate even when memory is available. I.e., merging is not optional.
There are also other interesting ideas/tricks that are worth
consideration.

I was giving this as an example of a malloc/free implementation to
show how the slow bits could be put in different places. I hope that
people don't write malloc routines without merging these days.

That said, it depends on how insistent the programmer is on writing
good quality software. Some malloc's are very wasteful and people
don't seem to mind much, that said waste is a little less of a crime
than prematurely running out of memory.

But the algorithm can be done the other way around, so in
step 1 malloc traverses one of the free lists and takes an
element from the end. Some of the other data structures
you could use would have the same problem (but I'm sure
there'd be one that would be reasonably fast at both ends
at the expense of space).

It could be that GNU malloc splits the time up differently
to MS malloc. Also, it could be that it has an interface
into the operating system that allows it to work faster from
knowledge gleaned from inside the OS. I don't think GNU
malloc does that, even on Linux.
Well the LEA allocator, and UNIX-allocators in general, assume that you
have a sbrk(), which Windows doesn't have. So presumably sbrk() is
emulated somehow (perhaps slowly). However, this only affect
allocations that are made when expanding the programs' maximal foot
print -- my benchmarks lean more heavily on the quiescent state (i.e.,
doing mallocs after lots of frees have happened and therefore should be
obtained from free lists not sbrk())

You mentioned you were using GCC, but not which library. Cygwin,
DJGPP and GLibc use different Mallocs. I think maybe the Linux kernel
has a different Malloc again. All these versions are somewhat
similar.

It sounds like you did a fairly thorough test of your version. Given
that, it probably is just a slow implementation of malloc/free.

Anyway, a "large machine" malloc would probably want to
use some devious way to avoid fragmentation, which would
probably make it unlikely to be O(1). A malloc for a small
machine might not need to so thoroughly though.
Ok here's the thing -- there is no way to make a malloc that
deterministically "avoids fragmentation". Think of the following
scenario:

for (i=0; ; i++) {
if (NULL == (p[i] = malloc (16364))) break;
}
/* What should the mallocs be returning to make sure the heap
is not defragmented at this point? There doesn't seem to
be anything better than packed stack-like allocations. */

for (j=0; j < i; j+=2) free (p[i]);
/* Presumably we've recovered half our memory, but what does the
heap look like now? Now matter what we got back from the
calls above, one way or another there is a way to force the
heap to fragment horrendously with some sequence of frees
such as this, even with lots of memory recovered. */

p[0] = malloc (32768);
/* Well there should be plenty of memory available, but is this
even possible now? */

I see what you mean, the user of malloc controls the pattern of the
frees, and therefore the fragmentation of memory, so there's no total
solution, only some approximations for common program behaviour.

This of course ignores "memory mapping tricks" -- but even then, one is
still limited by address ranges anyway.

Practical defragmentation (i.e., just worried about the most common
cases) is more straightforward -- just make sure allocations are as
packed together as possible. This will naturally increase the size of
your larger contiguous blocks. One way to try to effect this is, when
given a choice, try to return allocations with the "lowest" address.
You can, of course, use heuristics, like assuming that programs will
tend to allocate many of the same sized blocks -- and thus, upon
detection, try to return these as pieces broken off from larger
contiguous blocks.
Doug Lea's malloc also uses "deferred coalescing". Say a program
mallocs a number of pieces the same size, then frees some of them.
Instead of merging the pieces left after the frees straight away it
waits to see if the program is going to allocate more chunks that
size, which is likely.

Also, if a program uses one particular size often then malloc/free can
create a special bin for that size, there can be a set of such bins.

Since context independent malloc/free designs cannot deterministically
avoid degenerate scenarios that cause arbitrary amounts of
fragmentation anyways, I don't think trying to use non-O(1) algorithms
are the right way to go. Heuristics do fairly well in combating
fragmentation in practice while leaving good algorithms as O(1)
anyways. So personally, I would never recommend an algorithm that went
over the top in trying to combat fragmentation at the expense of
algorithmic complexity.
Fragmentation brings up interesting problems, and interesting
solutions. But, a little done about it goes a long way. On
reflection, you're probably right, a good, solid malloc/free
implementation could probably could be written that is completely
O(1). For modern PCs and other big machines though people who write
malloc routines probably don't consider it that important so long as
it's fast most of the time.

For small embedded machines I expect people drop all the frills, take
slightly worse fragmentation and try to keep malloc and free O(1). A
bunch of Smart-card programmers work in the office next to mine, I'll
ask them.

(I'm aware I'm telling somes to Paul Hsieh about speed, which
is rather like talking to Gengis Khan about conquering Asia.
Don't flame me too much for any of the above that is wrong.)

Lol! Nah, we're all somewhere on the journey to better understanding
of this stuff, and certainly you seem a lot further than most.

Thank you.

Speed is certainly curious stuff to understand. One of the most
interesting attitudes to it I've seen is the garbage collector in GCC,
it only kicks in when the compiler uses several megs of memory. This
means that on most compiles it never kick in at all, so in many cases
no complex memory management even happens. The problems here are
firstly running many instances of the same program and secondly the
point where it does kick in.

Nov 14 '05 #34

Richard Bos

ro******@ibd.nrc-cnrc.gc.ca (Walter Roberson) wrote:

In article <42***************@news.individual.net>,
Richard Bos <rl*@hoekstra-uitgeverij.nl> wrote:
Surely only a very unsmart implementation would ever need more than two
merges for a single free()? Once with the block before it; once with the
block after it. You only need more if you save up merges, which seems to
me to be a decidedly suboptimal solution for more reasons than this.
I don't know if there is a formal name like "Buddy System"
for what I was thinking of. Consider, though, a system which
is written to only merge together blocks to produce larger
superblocks with magic size and alignment properties.

This sort of merging is trivial if handled by a bitmap, but
if for some reason this kind of merging was being done -and-
there was the traditional "each free block points to the
next free block" chain, then a number of pointer fixups could
end up being done.

Sure. But I'd call that a very unsmart implementation. Buddy system with
bitmap, OK. But a chimaera of two systems is hardly ever a good idea.

Richard

Nov 14 '05 #35

Kevin D. Quitt

On Fri, 1 Apr 2005 13:13:24 -0500, "REH" <bo***@nowhere.net> wrote:

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112377387.ed4aba15b26977118e5a872681fdc3d7@ teranews...
"REH" <bo***@nowhere.net> wrote in message
news:d2*********@cui1.lmms.lmco.com...
> "Stephen Sprunk" <st*****@sprunk.org> wrote in message
> > Non-determinism isn't necessarily bad as long as you can prove
> > the _maximum_ time required for all the functions you call (plus
> > your own function) doesn't exceed the deadline.
>
> If you can prove there is a max. time, then it *is* deterministic.

That depends on if "deterministic" is defined as fixed time or bounded

time.
I've always understood it to mean the former.

In the company I work for, it was always the later. No big whoop, I guess.

I doubt there are many instances of sophisticated embedded systems that
are fixed-time deterministic. If you have interrupts, you cannot be
fixed-time deterministic.

A system is deterministic iff you are *guaranteed* that every task will
complete its function within the time allowed or before its required
completion time.

--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Per the FCA, this address may not be added to any commercial mail list

Nov 14 '05 #36

REH

"Kevin D. Quitt" <KQ****@IEEInc.com> wrote in message
news:5o********************************@4ax.com...

On Fri, 1 Apr 2005 13:13:24 -0500, "REH" <bo***@nowhere.net> wrote:

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112377387.ed4aba15b26977118e5a872681fdc3d7@ teranews...
"REH" <bo***@nowhere.net> wrote in message
news:d2*********@cui1.lmms.lmco.com...
> "Stephen Sprunk" <st*****@sprunk.org> wrote in message
> > Non-determinism isn't necessarily bad as long as you can prove
> > the _maximum_ time required for all the functions you call (plus
> > your own function) doesn't exceed the deadline.
>
> If you can prove there is a max. time, then it *is* deterministic.

That depends on if "deterministic" is defined as fixed time or boundedtime.
I've always understood it to mean the former.

In the company I work for, it was always the later. No big whoop, I

guess.
I doubt there are many instances of sophisticated embedded systems that
are fixed-time deterministic. If you have interrupts, you cannot be
fixed-time deterministic.

A system is deterministic iff you are *guaranteed* that every task will
complete its function within the time allowed or before its required
completion time.

Agreed, but I gave up trying to argue the point here.

Nov 14 '05 #37

Flash Gordon

Kevin D. Quitt wrote:

On Fri, 1 Apr 2005 13:13:24 -0500, "REH" <bo***@nowhere.net> wrote:

"Stephen Sprunk" <st*****@sprunk.org> wrote in message
news:1112377387.ed4aba15b26977118e5a872681fdc3d7 @teranews...
"REH" <bo***@nowhere.net> wrote in message
news:d2*********@cui1.lmms.lmco.com...

"Stephen Sprunk" <st*****@sprunk.org> wrote in message

>Non-determinism isn't necessarily bad as long as you can prove
>the _maximum_ time required for all the functions you call (plus
>your own function) doesn't exceed the deadline.

If you can prove there is a max. time, then it *is* deterministic.

That depends on if "deterministic" is defined as fixed time or bounded
time.
I've always understood it to mean the former.

In the company I work for, it was always the later. No big whoop, I guess.

I doubt there are many instances of sophisticated embedded systems that
are fixed-time deterministic. If you have interrupts, you cannot be
fixed-time deterministic.

You can still be fixed-time deterministic for sections of the
application which, in my experience, is all that is required. You just
have interrupts disables for the critical periods which is in my
experience easy to arrange since the critical periods are often
triggered by an interrupt.
A system is deterministic iff you are *guaranteed* that every task will
complete its function within the time allowed or before its required
completion time.

This can be analysed and proved to be correct. On the systems I worked
on we always made timing estimates as part of the requirements analysis
and refined them during the design phase.
--
Flash Gordon
Living in interesting times.
Although my email address says spam, it is real and I read it.

Nov 14 '05 #38

REH

"Flash Gordon" <sp**@flash-gordon.me.uk> wrote in message
news:7i************@brenda.flash-gordon.me.uk...

You can still be fixed-time deterministic for sections of the
application which, in my experience, is all that is required. You just
have interrupts disables for the critical periods which is in my
experience easy to arrange since the critical periods are often
triggered by an interrupt.

Have you ever work on a large scale real-time system? The above is simply
unacceptable. You have to (al least in my job) guarantee that EVERY
dead-line will be met. Plus, there are a lot of events that simply cannot
be tied to in interrupt. Even for the ones that are, locking interrupts is
a bad thing. You introduce drift, increase interrupt latency, and inhibit
the scheduler. The periodicity of (most) every task must be shown to be
achievable.

Nov 14 '05 #39

execution time becomes unpredictable?!

Similar topics