
Re: How to write your own allocator?

Juha Nieminen <no****@thanks.invalid> wrote in news:47f91b3c$0$8161$4*******@news.tdc.fi:
> I performed some tests about creating memory allocators. It's actually
> quite incredible how much, for example, std::list speeds up by using
> your own efficient allocator instead of relying on the one in libc.
>
> I actually made a library and a webpage on this subject:
>
> http://warp.povusers.org/FSBAllocator/

I would guess that most of the speedup comes from the fact that the general
allocator is thread-safe and yours is not (according to the web page), so
you are not quite comparing like with like. Thread synchronization is
quite expensive (and keeps getting relatively more expensive as the number
of processor cores grows). I think that in the future one will routinely
need at least three different allocators: multi-threaded, single-threaded
and single-thread-favoured (deallocating a block in another thread works,
albeit more slowly; IIRC Intel's TBB library contains some allocators of
the latter sort). I'm not sure what C++0x has to say about this issue, though
it deserves attention IMHO.
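
For readers who want to see what is being compared, here is a minimal sketch of the kind of single-threaded fixed-size block allocator under discussion, plugged into std::list. It is not the FSBAllocator code from the page above: the names PoolAllocator, refill and pool_demo are made up for illustration, the pool is deliberately not thread-safe, and it never returns its chunks to the system.

```cpp
#include <cstddef>
#include <list>
#include <new>

// Hypothetical single-threaded pool allocator (illustration only).
// Single-object requests are served from an intrusive free list that is
// shared by all PoolAllocator<T> instances; no locking is done anywhere.
template <typename T>
class PoolAllocator {
public:
    using value_type = T;

    PoolAllocator() = default;
    template <typename U>
    PoolAllocator(const PoolAllocator<U>&) noexcept {}  // needed for rebinding

    T* allocate(std::size_t n) {
        if (n != 1)  // arrays bypass the pool
            return static_cast<T*>(::operator new(n * sizeof(T)));
        if (!free_list()) refill();
        Node* node = free_list();
        free_list() = node->next;
        return reinterpret_cast<T*>(node);
    }

    void deallocate(T* p, std::size_t n) noexcept {
        if (n != 1) { ::operator delete(p); return; }
        Node* node = reinterpret_cast<Node*>(p);
        node->next = free_list();  // push the block back for reuse
        free_list() = node;
    }

private:
    union Node {
        Node* next;                                   // used while the block is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while it is live
    };

    static Node*& free_list() {
        static Node* head = nullptr;
        return head;
    }

    // Grab one chunk from the general allocator and thread it onto the list.
    // Simplification: chunks are never released.
    static void refill() {
        constexpr std::size_t kBlocks = 256;
        Node* chunk = static_cast<Node*>(::operator new(kBlocks * sizeof(Node)));
        for (std::size_t i = 0; i < kBlocks; ++i) {
            chunk[i].next = free_list();
            free_list() = &chunk[i];
        }
    }
};

template <typename T, typename U>
bool operator==(const PoolAllocator<T>&, const PoolAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const PoolAllocator<T>&, const PoolAllocator<U>&) { return false; }

// Usage: the list rebinds the allocator to its internal node type.
int pool_demo() {
    std::list<int, PoolAllocator<int>> values;
    for (int i = 0; i < 1000; ++i) values.push_back(i);
    values.pop_front();       // returns one node to the pool
    values.push_back(1000);   // can be served from the pool without calling new
    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}
```

Timing this against the default allocator in a single-threaded program measures both the cheaper bookkeeping and the absence of locking, which is the caveat raised above.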

Regards
Paavo
Jun 27 '08 #1
"Paavo Helde" <no****@ebi.ee> wrote in message
news:Xn*************************@216.196.97.131...
> Juha Nieminen <no****@thanks.invalid> wrote in news:47f91b3c$0$8161$4*******@news.tdc.fi:
>> I performed some tests about creating memory allocators. It's actually
>> quite incredible how much, for example, std::list speeds up by using
>> your own efficient allocator instead of relying on the one in libc.
>>
>> I actually made a library and a webpage on this subject:
>>
>> http://warp.povusers.org/FSBAllocator/
>
> I would guess that most speedup comes from the fact that the general
> allocator is thread-safe and yours is not (according to the web page), so
> you are comparing not quite the same things. Thread synchronization is
> quite expensive (and becomes relatively more expensive all the time with
> the number of processor cores growing).
[...]

Smart multi-threaded allocators can usually grant allocation and
deallocation requests without using any synchronization whatsoever.
State-of-the-art allocators are based on per-thread heaps. The only time
sync is needed is when a thread tries to free memory that it did not
allocate itself. Take a look at Hoard and StreamFlow.
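
A rough sketch of the per-thread heap idea, for illustration only. It is not how Hoard or StreamFlow are implemented; ThreadHeap, Block, heap_alloc and heap_free are made-up names, blocks have one fixed payload size, and heaps are intentionally leaked so that a block can outlive the thread that allocated it. The local path touches no shared state; only a free from a foreign thread does an atomic push.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

struct Block;

// One heap per thread (hypothetical layout).
struct ThreadHeap {
    Block* local_free = nullptr;               // touched only by the owning thread
    std::atomic<Block*> remote_free{nullptr};  // other threads push freed blocks here
};

struct Block {
    ThreadHeap* owner;  // heap of the thread that allocated the block
    Block* next;        // link used while the block sits on a free list
    alignas(std::max_align_t) unsigned char payload[64];
};

ThreadHeap& my_heap() {
    // Simplification: never destroyed, so owner pointers stay valid.
    thread_local ThreadHeap* heap = new ThreadHeap;
    return *heap;
}

void* heap_alloc() {
    ThreadHeap& h = my_heap();
    if (!h.local_free)  // slow path: take everything other threads gave back
        h.local_free = h.remote_free.exchange(nullptr, std::memory_order_acquire);
    Block* b = h.local_free;
    if (b) {
        h.local_free = b->next;
    } else {
        b = new Block;
        b->owner = &h;
    }
    return b->payload;
}

void heap_free(void* p) {
    Block* b = reinterpret_cast<Block*>(
        static_cast<unsigned char*>(p) - offsetof(Block, payload));
    ThreadHeap& h = my_heap();
    if (b->owner == &h) {  // local free: no synchronization at all
        b->next = h.local_free;
        h.local_free = b;
        return;
    }
    // Remote free: lock-free push onto the owner's list.
    std::atomic<Block*>& head = b->owner->remote_free;
    Block* old_head = head.load(std::memory_order_relaxed);
    do {
        b->next = old_head;
    } while (!head.compare_exchange_weak(old_head, b,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}

// Usage: allocate here, free in another thread, then get the block back.
bool remote_free_demo() {
    void* p = heap_alloc();
    std::thread t([p] { heap_free(p); });
    t.join();
    void* q = heap_alloc();  // local list is empty, so the remote list is drained
    bool reused = (p == q);
    heap_free(q);
    return reused;
}
```

Because the owner only ever takes the whole remote list with a single exchange, the push loop does not run into the ABA problem that a lock-free pop would have to deal with.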

Jun 27 '08 #2
"Paavo Helde" <no****@ebi.ee> wrote in message
news:Xn*************************@216.196.97.131...
> Juha Nieminen <no****@thanks.invalid> wrote in news:47f91b3c$0$8161$4*******@news.tdc.fi:
>> I performed some tests about creating memory allocators. It's actually
>> quite incredible how much, for example, std::list speeds up by using
>> your own efficient allocator instead of relying on the one in libc.
>>
>> I actually made a library and a webpage on this subject:
>>
>> http://warp.povusers.org/FSBAllocator/
>
> I would guess that most speedup comes from the fact that the general
> allocator is thread-safe and yours is not (according to the web page), so
> you are comparing not quite the same things. Thread synchronization is
> quite expensive (and becomes relatively more expensive all the time with
> the number of processor cores growing). I think in the future one routinely
> needs at least three different allocators: multi-threaded, single-threaded
> and single-thread-favoured (deallocating a block in another thread works,
> albeit being slower; IIRC Intel's TBB library contains some allocators of
> the latter sort). Not sure what C++0x has to say about this issue, though
> it would deserve attention IMHO.

Here is a simple single-threaded region allocator:

http://groups.google.com/group/comp....bdc76f9792de1f

Here is how to distribute it over multiple threads:

http://groups.google.com/group/comp....a2d64acc3e3272

The code should be finished in 9-10 days. It will be released under the GPL.
This allocator will get single-threaded performance, yet can be used in a
multi-threaded environment. Here is another commercial slab allocator I
created:

http://groups.google.com/group/comp....c40d42a04ee855

These are very fast.
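
For readers unfamiliar with the term, a region allocator in its simplest form is a bump pointer over one buffer that is released all at once. The sketch below is a generic illustration under that assumption, not the code behind the links above; the class name Region and its interface are invented here.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical single-threaded region (arena) allocator.
// Individual blocks are never freed; reset() releases everything at once.
class Region {
public:
    explicit Region(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity) {}

    // Returns nullptr when the region is exhausted.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t space = capacity_ - offset_;
        void* p = buffer_.get() + offset_;
        // std::align bumps p up to the requested alignment if size still fits.
        if (!std::align(align, size, p, space)) return nullptr;
        offset_ = static_cast<std::size_t>(
                      static_cast<unsigned char*>(p) - buffer_.get()) + size;
        return p;
    }

    void reset() noexcept { offset_ = 0; }  // invalidates all earlier blocks
    std::size_t used() const noexcept { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_ = 0;
};
```

Allocation is a pointer adjustment and a comparison, which is where the single-threaded speed comes from; the price is that memory can only be reclaimed for the region as a whole.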

Jun 27 '08 #3

"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:gb******************************@comcast.com...
> "Paavo Helde" <no****@ebi.ee> wrote in message
> news:Xn*************************@216.196.97.131...
>> Juha Nieminen <no****@thanks.invalid> wrote in news:47f91b3c$0$8161$4*******@news.tdc.fi:
>>> I performed some tests about creating memory allocators. It's actually
>>> quite incredible how much, for example, std::list speeds up by using
>>> your own efficient allocator instead of relying on the one in libc.
>>>
>>> I actually made a library and a webpage on this subject:
>>>
>>> http://warp.povusers.org/FSBAllocator/
>>
>> I would guess that most speedup comes from the fact that the general
>> allocator is thread-safe and yours is not (according to the web page), so
>> you are comparing not quite the same things. Thread synchronization is
>> quite expensive (and becomes relatively more expensive all the time with
>> the number of processor cores growing).
>
> [...]
>
> Smart multi-threaded allocators can usually grant allocation and
> deallocation requests without using any synchronization whatsoever.
> State-of-the-art allocators are based on per-thread heaps. The only time
> sync is needed is when a thread tries to free memory that it did not
> allocate itself. Take a look at Hoard and StreamFlow.
Sometimes synchronization is not needed when a thread deallocates blocks
that it did not allocate itself. You can post over on
'comp.programming.threads' for more information.

Jun 27 '08 #4
On Apr 24, 1:21 pm, Paavo Helde <nob...@ebi.ee> wrote:
> Juha Nieminen <nos...@thanks.invalid> wrote in news:47f91b3c$0$8161$4f793...@news.tdc.fi:
> I would guess that most speedup comes from the fact that the general
> allocator is thread-safe and yours is not (according to the web page),

Do compilers in general have a flag to tell them "SINGLE THREAD HERE!"
so that they can optimize away any mutexes, etc.?

Thanks
Jun 27 '08 #5
joseph cook <jo*****@gmail.com> wrote in news:e71d762f-2c24-4eec-937f-42**********@m36g2000hse.googlegroups.com:
> On Apr 24, 1:21 pm, Paavo Helde <nob...@ebi.ee> wrote:
>> Juha Nieminen <nos...@thanks.invalid> wrote in news:47f91b3c$0$8161$4f793...@news.tdc.fi:
>> I would guess that most speedup comes from the fact that the general
>> allocator is thread-safe and yours is not (according to the web page),
>
> Do compilers in general have a flag to tell it "SINGLE THREADS HERE!"
> to optimize away and mutexes, etc ?

As mutexes are not in the C++ standard, compilers in principle are
not supposed to know about them, and thus they can't have any general
flags related to multithreading. Any such flags are compiler-specific
extensions. For example, gcc has the -pthread flag, but it has the
opposite effect of making the code more thread-safe.

For conditional compilation with or without mutexes one could use
preprocessor defines. Some libraries (e.g. PHP) come in different versions
for single-threaded and multi-threaded use.
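
A minimal sketch of the preprocessor approach. The macro names ALLOC_THREAD_SAFE and ALLOC_GUARD and the counted_alloc/counted_free wrappers are invented for this example, and std::mutex postdates this thread; in 2008 one would have used a pthread or Boost mutex in its place.

```cpp
#include <cstddef>
#include <cstdlib>

// Build with -DALLOC_THREAD_SAFE to get the locking version; without it,
// the guard expands to nothing and no mutex exists in the binary.
#ifdef ALLOC_THREAD_SAFE
#include <mutex>
namespace { std::mutex g_alloc_mutex; }
#define ALLOC_GUARD() std::lock_guard<std::mutex> alloc_guard(g_alloc_mutex)
#else
#define ALLOC_GUARD() ((void)0)
#endif

namespace { std::size_t g_live_blocks = 0; }  // shared state that needs protecting

void* counted_alloc(std::size_t size) {
    ALLOC_GUARD();
    void* p = std::malloc(size);
    if (p) ++g_live_blocks;
    return p;
}

void counted_free(void* p) {
    if (!p) return;
    ALLOC_GUARD();
    std::free(p);
    --g_live_blocks;
}

std::size_t live_blocks() {
    ALLOC_GUARD();
    return g_live_blocks;
}
```

The drawback is the one implied above: the choice is made per build, so a library shipped this way needs two binaries.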

In principle, one could also use template specializations to produce
single-threaded and multi-threaded versions of code, but I've not seen
that. A lot of applications nowadays tend to be multi-threaded, so any
library had better support multi-threading. And in the case of a final
executable one typically knows ahead of time whether it is going to be
single-threaded or multi-threaded, and can thus code accordingly with or
without mutexes.
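
One way the template idea could look, as a sketch: the mutex type becomes a template parameter, and a do-nothing mutex yields the single-threaded version. NullMutex and BlockPool are hypothetical names, and this uses a policy parameter rather than explicit specializations.

```cpp
#include <cstddef>
#include <mutex>
#include <new>

// Satisfies the lock()/unlock() interface but does nothing, so the
// compiler can optimize the guard away entirely.
struct NullMutex {
    void lock() noexcept {}
    void unlock() noexcept {}
};

// Hypothetical fixed-size block pool; the locking policy is chosen at
// compile time through the Mutex parameter.
template <typename Mutex, std::size_t BlockSize = 64>
class BlockPool {
public:
    BlockPool() = default;
    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;

    // Releases only the blocks that were handed back to the pool.
    ~BlockPool() {
        while (head_) {
            Node* n = head_;
            head_ = n->next;
            ::operator delete(n);
        }
    }

    void* allocate() {
        std::lock_guard<Mutex> guard(mutex_);
        if (!head_) return ::operator new(sizeof(Node));
        Node* n = head_;
        head_ = n->next;
        return n;
    }

    void deallocate(void* p) {
        std::lock_guard<Mutex> guard(mutex_);
        Node* n = static_cast<Node*>(p);
        n->next = head_;
        head_ = n;
    }

private:
    union Node {
        Node* next;
        alignas(std::max_align_t) unsigned char bytes[BlockSize];
    };
    Mutex mutex_;
    Node* head_ = nullptr;
};

using SingleThreadedPool = BlockPool<NullMutex>;
using MultiThreadedPool  = BlockPool<std::mutex>;
```

Both variants can coexist in one program, which a preprocessor switch cannot offer.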

Regards
Paavo
Jun 27 '08 #6
"Chris Thomasson" <cr*****@comcast.net> wrote in
news:-L******************************@comcast.com:
> "Paavo Helde" <no****@ebi.ee> wrote in message
> news:Xn*************************@216.196.97.131...
>> Juha Nieminen <no****@thanks.invalid> wrote in news:47f91b3c$0$8161$4*******@news.tdc.fi:
>>> I performed some tests about creating memory allocators. It's actually
>>> quite incredible how much, for example, std::list speeds up by using
>>> your own efficient allocator instead of relying on the one in libc.
>>>
>>> I actually made a library and a webpage on this subject:
>>>
>>> http://warp.povusers.org/FSBAllocator/
>>
>> I would guess that most speedup comes from the fact that the general
>> allocator is thread-safe and yours is not (according to the web
>> page), so you are comparing not quite the same things. Thread
>> synchronization is quite expensive (and becomes relatively more
>> expensive all the time with the number of processor cores growing). I
>> think in the future one routinely
>> needs at least three different allocators: multi-threaded,
>> single-threaded and single-thread-favoured (deallocating a block in
>> another thread works, albeit being slower; IIRC Intel's TBB library
>> contains some allocators of the latter sort). Not sure what C++0x has
>> to say about this issue, though it would deserve attention IMHO.
>
> Here is simple single-threaded region allocator:
>
> http://groups.google.com/group/comp....hread/68bdc76f9792de1f
>
> Here is how to distribute it over multi-threads it:
>
> http://groups.google.com/group/comp....rowse_frm/thread/12a2d64acc3e3272
>
> The code should be finished in 9-10 days. It will be released under
> GPL. This allocator will get single threaded performance, yet can be
> used in multi-threaded environment. Here is another commercial slab
> allocator I created:
>
> http://groups.google.com/group/comp....d/24c40d42a04ee855
>
> These are very fast.

Thanks for the pointers! Unfortunately we cannot use the code in our
production software because of the GPL license.

Best regards
Paavo
Jun 27 '08 #7
