On Fri, 10 Oct 2008 12:20:42 +0100, David Given <dg@cowlark.com>
wrote:
>Richard Harter wrote:
[...]
>As others have said, you need to look at groups specific to your
platform. There is no way to portably write a storage allocator
in standard C only - to get chunks of memory you have to have
access to the OS memory management facilities.
I think I haven't made myself clear.
I don't want anything platform specific. What I want is an allocator for
an *abstract linear resource*, which could be virtual memory, or time,
or space, or anything like that --- I want to be able to allocate chunks
from a contiguous linear range of numbers. Not only can this be done in
portable C, I *want* it in portable C, because I have to run it on at
least two highly dissimilar platforms.
i.e. an interface something like this:
typedef unsigned int domain_t;
void init_allocator(domain_t start, domain_t length);
domain_t allocrange(domain_t length);
void freerange(domain_t start);
domain_t resizerange(domain_t start, domain_t newlength);
domain_t is utterly abstract and does not need to represent any physical
resource. I only mentioned virtual memory because that's the specific
problem I need to solve, but I'm actually looking for a general solution.
As I said, this is a very similar problem to the one that's solved by
malloc(), but most malloc() implementations make assumptions about what
they're allocating that allows them to take shortcuts, which don't apply
in my case. Which means I can't use them.
As Eric remarks, this really is a general programming question;
however we are here and now and I see no point in not answering
your question.
If I undestand your question correctly, getspace and many other
allocators actually actually have what you want; unfortunately
they also necessarily have a lot of stuff that you don't need.
Your thought about allocators making taking shortcuts is a red
herring - the core code in getspace doesn't do that. (It does
put guard characters surrounding allocated space but that is a
refinement but that has nothing to with the underlying
algorithms.) What it is managing is an abstract resource
(interger range) that is correlated with physical memory. Robust
allocators do this so that metadata and resource are not
comingled.
Side comment: Your initializer allocates a fixed domain size -
what if you need a larger domain?
Second side comment: What are your assumptions about resource
usage and desired efficiency. This will affect what kind of
allocation strategy you use. N.b. You probably should read up on
storage allocation strategies. Knuth has some material (not
enough) on the issues.
Third side comment: Will your usage let you change "start" under
the hood? If it does, you can effectively use compaction
techniques.
An abstract allocator has two basic problems to solve; finding a
block of the requisite size when allocrange is called and finding
the metadata for "start" when freerange is called. Resizing is
trivial if you've solved those problems.
The two general strategies that I will mention are buddy system
allocators and best fit, immediate merge allocators with metadata
units for each block. I don't propose to describe how to
implement buddy allocators here. They are good and they are
simple to implement. IIRC the overhead is two bitmaps of the
domain range. The one drawback is that they are relatively
inefficient in their allocation - about 1/3 of the available
resource will be unused. Best fit allocators are (in some
environments) more efficient in their domain usage; however they
have significant overhead per block.
Getspace uses the latter; however there is a lot of complexity
that has been added for performance reasons. Let's look at the
core functionality. Our resource is broken up into a list of
contiguous blocks (ranges in your notation) that are either free
(available for allocation) or in use (allocated). We have a
block representation node that looks like this:
struct allocation_node {
struct allocation_node * domain_left;
struct allocation_node * domain_right;
int index;
int usage_flag;
/* more stuff */
};
Left and right are smaller and larger indices respectively.
Domain_left and domain_right are the links in a doubly linked
list. Index is the start of the represented block. And, of
course, usage_flag says whether the block is allocated or not.
Now we need two data structures. One holds all of the nodes with
their size as a key - the size is available by subtracting the
domain_right index from the current index. The other holds all
of the allocated nodes with their index as a key.
When we allocate we search the first data structure for the
smallest block that is larger than the desired size. If it is
the right size we mark it as in use and return the index. If it
is too large we create a new node that is linked between the node
and its domain right neighbour, and use it to hold the remnant.
The resized node is moved to its new location in the data
structure and the remnant node is added. The resized node is
added to the second data structure.
When we free a block we search the second data structure to find
its node. We remove the node from the second data structure.
Now we do the merges. If left and right neighbours are in use we
mark the node free and add it to the first data structure. If it
has free neighbours they are removed from the first data
structure and are delinked. The neighbour blocks are
incorporated in the node and the resized block is marked free and
is added to the first data structure.
Getspace uses some "interesting" data structures to gain
performance but that is the core of the package.
Richard Harter,
cr*@tiac.net http://home.tiac.net/~cri,
http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.