473,326 Members | 2,104 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,326 software developers and data experts.

A class scope allocator using the Curiously Recursive Template Pattern

Hey C++ folks,

I created this today, just for fun.
You can make object allocation for any class around 6 times faster,
simply by doing the following.

class MyClass : public TXpQAlloc<MyClass,N>

N is a chunking factor (defaults to 10).

No other code change needed to apply this to your own classes.
It seemed pretty cool to me, so I'm posting it here.
It's written on Windows, but the same approach should work anywhere.
Hope it's useful to someone. Let me know if it is.

....AndrewD...

//
// CRTP.cpp : A test program for am efficient class scope allocator
scheme.
// making use of the Curiously Recursive Template Pattern
(CRTP)..
//
#include "stdafx.h"
#include <iostream>

#define WIN32_LEAN_AND_MEAN
#include "windows.h"
using namespace std;

//
// TXpQAlloc - An efficient Class Allocator.
//
// Problems with heap allocation/deallocation.
// 1. Processing overhead per allocation/deallocation.
// 2. Space overhead per allocation, especially for many small
allocations.
// 3. Microsoft small heap allocation strategy will not free memory
back to
// O/S when many small allocations are deallocated. Try allocating a
million
// 8 byte classes, then freeing them all. Quite apart from the
processing and
// space allocation overheads during the processing, you will now
find that
// you have several megabytes of space assigned to your process that
will not
// be returned to the O/S for use by other apps.
// If these are problems for your design, then TXpQAlloc may help.
//
// TXpQAlloc allocates heap space in chunks large enough to contain
many instances
// of your class. Individual allocations of your class will be assigned
space within
// a heap allocated chunk. When a chunk is full, a new chunk is
allocated on the heap.
//
// On delete(), the space is hooked onto a chain of free space, to be
recycled.
// On new(), the free space chain is checked first.
// Recycling occurs in order of the Most Recently Used space, to
improve utilisation
// of processor cache.
//
// Proper alignment of object instances is guaranteed.
//
// Usage: As per "Curiously Recursive Template Pattern".
//
// class MyClass : public TXpQAlloc<MyClass,N>
//
// - N is the chunking factor (defaults to 10)..
// Instances of MyClass per heap allocated chunk.
// If you know ahead of time that your program will allocate a
particular number
// of objects of your class, then use that number as the
chunking factor.
// If you just know there will be a large number of
allocations, then choose a
// chunking factor N, such that the chunk size (N *
sizeof(MyClass)) is 1KB,
// ensuring use of MS large allocation strategy. Larger may be
marginally better,
// but increases the overhead of allocated but unused space in
the latest chunk.
//
// Then use new/delete as usual.
// When all instances of the class are delete()'d, all chunks will
be free()'d from the heap.
// Until then, all allocated chunks will remain, because TXpQAlloc
can not tell when an
// individual chunk is unused.
//
// Overheads:
//
// 32 bytes in class scope.
// 0 bytes per object instance.
// 16 bytes per chunk. For a chunk factor of 10, this equates to 1.6
bytes per object instance.
// Whatever unused space there may be in the latest chunk.
//
// Performance appears too be around 6 times better than regular
new/delete.
//
typedef unsigned char XByte;
typedef __int64 XLongLong;

#pragma warning(disable : 4355) // warning C4355: 'this' : used in
base member initializer list
#pragma warning(disable : 4291) // warning C4291: no matching operator
delete found;
class CXpQAllocChunk
{
public:
CXpQAllocChunk(CXpQAllocChunk *poPrevChunk, size_t siChunk)
: poPrevChunk_m (poPrevChunk)
, pyEndOfChunk_m ((XByte *)(this + 1) + siChunk) {}
~CXpQAllocChunk() { delete
poPrevChunk_m; }
void *operator new (size_t siThis, size_t siChunk) { return
malloc(siThis + siChunk); }
void operator delete(void *pvMem) { free(pvMem);
}
XByte *GetEndOfChunk () { return
pyEndOfChunk_m; }
private:
CXpQAllocChunk *poPrevChunk_m;
XByte *pyEndOfChunk_m;
};

template<class T, int N=10class TXpQAlloc
{
public:
TXpQAlloc() {}
TXpQAlloc(const TXpQAlloc<T>&) {}
~TXpQAlloc() {}

void *operator new(size_t /*siThis*/)
{
void *pvAlloc = pvFreeMRU_sm;
if (pvAlloc)
pvFreeMRU_sm = *(void **)pvAlloc;
else
{
if ((pvAlloc = (void *)pyNextFreeByte_sm) == NULL)
{
poLastChunk_sm = new(N*sizeof(T))
CXpQAllocChunk(poLastChunk_sm, N*sizeof(T));
pyNextFreeByte_sm = (XByte *)(poLastChunk_sm + 1); // Space
after chunk header.
pvAlloc = (void *)pyNextFreeByte_sm;
}
pyNextFreeByte_sm += sizeof(T);
if (pyNextFreeByte_sm >= poLastChunk_sm->GetEndOfChunk()) //
Chunk is full now.
pyNextFreeByte_sm = NULL; // Don't allocate new chunk
until needed.
}

++iAllocCount_sm;
return pvAlloc;
}

void operator delete(void *pvMem)
{
if (pvMem) // delete NULL is no-op.
{
// Hook deleted object space into MRU free list.
*(void **)pvMem = pvFreeMRU_sm;
pvFreeMRU_sm = pvMem;

if (--iAllocCount_sm == 0)
{
// Cascading destruction of all chunks when zero instances
left.
delete poLastChunk_sm;
poLastChunk_sm = NULL;
pyNextFreeByte_sm = NULL;
pvFreeMRU_sm = NULL;
}
}
}

private:
// Trail of chunk allocations per class.
static CXpQAllocChunk *poLastChunk_sm; // Last chunk allocated
which has link to one before etc.
static XByte *pyNextFreeByte_sm; // Position in last chunk
for next allocation.
static int iAllocCount_sm; // Count of object
allocated.
static void *pvFreeMRU_sm; // Most recently used free
slot.
};
template<class T,int NCXpQAllocChunk *TXpQAlloc<T>::poLastChunk_sm
= NULL;
template<class T,int NXByte *TXpQAlloc<T>::pyNextFreeByte_sm
= NULL;
template<class T,int Nint TXpQAlloc<T>::iAllocCount_sm
= 0;
template<class T,int Nvoid *TXpQAlloc<T>::pvFreeMRU_sm
= NULL;
class CXpBlah //: public TXpQAlloc<CXpBlah, 10000>
{
public:
CXpBlah(int iBlah1, int iBlah2)
: iBlah1_m (iBlah1)
, iBlah2_m (iBlah2) {}
~CXpBlah() {}

private:
int iBlah1_m;
int iBlah2_m;
};
typedef CXpBlah *PCXpBlah;

#define NumAlloc_L 1000000
int main(int argc, char* argv[])
{
PCXpBlah *papoBlah = new PCXpBlah[NumAlloc_L];
int i;

XLongLong llStartCounter;
QueryPerformanceCounter((LARGE_INTEGER *)&llStartCounter);

// Allocate all.
for (i=0; i<NumAlloc_L; i++)
{
*(papoBlah+i) = new CXpBlah(i, i);
}

// Deallocate every second one.
for (i=0; i<NumAlloc_L; i+=2)
{
delete *(papoBlah+i);
}

// Allocate every second one.again
for (i=0; i<NumAlloc_L; i++)
{
*(papoBlah+i) = new CXpBlah(i, i);
}

// Deallocate all.
for (i=0; i<NumAlloc_L; i++)
{
delete *(papoBlah+i);
}

XLongLong llFinalCounter;
QueryPerformanceCounter((LARGE_INTEGER *)&llFinalCounter);

XLongLong llCounterFrequency;
QueryPerformanceFrequency((LARGE_INTEGER *)&llCounterFrequency);
double dMsPerCount = (double)1000000.00 /
(double)llCounterFrequency;
XLongLong llDeltaMs = (XLongLong)(dMsPerCount *
(double)(llFinalCounter - llStartCounter));
printf("DeltaTime = %I64d.\n", llDeltaMs);

return 0;
}

Jan 26 '07 #1
4 2555
AndrewD wrote:
Hey C++ folks,

I created this today, just for fun.
You can make object allocation for any class around 6 times faster,
simply by doing the following.

class MyClass : public TXpQAlloc<MyClass,N>

N is a chunking factor (defaults to 10).

No other code change needed to apply this to your own classes.
It seemed pretty cool to me, so I'm posting it here.
It's written on Windows, but the same approach should work anywhere.
Hope it's useful to someone. Let me know if it is.
Nice, but based on a quick look at the code I don't think it will work
if you do this

class MyClass : public TXpQAlloc<MyClass,N>
{
};

class MyDerivedClass : public MyClass
{
};

because TXpQAlloc<MyClass,Nassumes that it will only ever allocate
MyClass sized objects.

john
Jan 26 '07 #2
On Jan 26, 1:46 pm, "AndrewD" <andrew.down...@gmail.comwrote:
Hey C++ folks,

I created this today, just for fun.
You can make object allocation for any class around 6 times faster,
simply by doing the following.

class MyClass : public TXpQAlloc<MyClass,N>

N is a chunking factor (defaults to 10).

No other code change needed to apply this to your own classes.
It seemed pretty cool to me, so I'm posting it here.
It's written on Windows, but the same approach should work anywhere.
Hope it's useful to someone. Let me know if it is.

Haven't read all the code but it looks kind of like a slab allocator to
me, but if that's the case then why would one need the CRTP? And could
you not make all objects (regardless of type) of the same size share a
chunk?

--
Erik Wikström

Jan 26 '07 #3
Erik,

On Jan 27, 12:51 am, "Erik Wikström" <eri...@student.chalmers.se>
wrote:
Haven't read all the code but it looks kind of like a slab allocator to
me, but if that's the case then why would one need the CRTP? And could
you not make all objects (regardless of type) of the same size share a
chunk?
I was really trying to deal with the situation where there's a very
large number
of light weight objects. By light weight, I mean I don't want it to be
having
virtual methods and preferably, I'd like all methods to be inline-able.
CRTP allows me to have a base class with knowledge of the concrete
class
and to have inline methods, no virtual functions, no vtable and yet be
type-safe
in the allocations.
You could make a ::new operator override that managed pools (or patio's
if
we follow the slab simile) of same size objects.
It's been done. It was not my goal.

....AndrewD...

Jan 26 '07 #4
On Jan 27, 12:09 am, John Harrison <john_androni...@hotmail.comwrote:
Nice, but based on a quick look at the code I don't think it will work
if you do this

class MyClass : public TXpQAlloc<MyClass,N>
{

};class MyDerivedClass : public MyClass
{

};because TXpQAlloc<MyClass,Nassumes that it will only ever allocate
MyClass sized objects.

john

I was really trying to deal with the situation where there's a very
large number
of light weight objects. By light weight, I mean I don't want it to be
having
virtual methods and preferably, I'd like all methods to be inline-able.
CRTP allows me to have a base class with knowledge of the concrete
class
and to have inline methods, no virtual functions, no vtable and yet be
type-safe
in the allocations.
I was aware of the limitation that it wouldn't work for further derived
objects, but
really didn't care, given my design goal of light weight objects.
i.e. I was explicitly never going to derive anything from them anyhow.

On the other hand, I'm pretty sure it would work for:

class MyClass {};
class MyDerivedClass : public MyClass, public TXpQAlloc<MyDerivedClass
,N{};

So this could be used to make a fast allocatable variant of any
existing
class without changing the existing class.

Did you notice that there's actually zero bytes overhead per object
instance
and yet it recycles the individual delete()'d object instances into
subsequent
new()'s on a most recently used basis for cache efficiency..

....AndrewD...

Jan 26 '07 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

12
by: Tim Clacy | last post by:
Your expertise will be appreciated... Here's a general templatised class; all specialisations of this class should have a pointer to a specialisation of the same, templatised type: ...
5
by: papi1976 | last post by:
Hello to everyone I heard about a strange thing about wich i have'nt find anything on the net or books, someone called this "Curiously recursive pattern" or a strange trick to play with...
0
by: Brian Riis | last post by:
This all pertains to the above mentioned article. Anyone want to read it (it's rather neat, actually), here's the address: http://www.cuj.com/documents/s=8205/cujweb0305meynard/ Anyway, I was...
2
by: gg | last post by:
I am facing some problems with following program. I am using aCC version 03.27 on HP-UX11. The command line I use to compile is - aCC -AA -I. TestTempMethods.C Can anybody pls suggest how to...
21
by: Jon Slaughter | last post by:
I have a class that is basicaly duplicated throughout several files with only members names changing according to the class name yet with virtually the exact same coding going on. e.g. class...
3
by: Chris | last post by:
I am having a very strange problem involving virtual functions in template classes. First of all, here is an extremely simplified structure of the two classes I am having problems with. ...
7
by: Alden Pierre | last post by:
Hello, I'm trying to create my own user define container, but I'm having a little hard time figuring out why is my class considered undefined by my compiler. Here is the following code. //...
9
by: Mirko Puhic | last post by:
Is there a way to properly do this? struct Derived; struct Base{ Derived der; };
6
by: rep_movsd | last post by:
Hi folks I was on topcoder and came across an interesting problem... It involved dynamic programming ( storing function results in a map to avoid repeated computation ) and I ended up having...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.