Bytes | Developer Community
Do you use a garbage collector?

I followed a link to James Kanze's web site in another thread and was
surprised to read this comment by a link to a GC:

"I can't imagine writing C++ without it"

How many of you c.l.c++'ers use one, and in what percentage of your
projects is one used? I have never used one in personal or professional
C++ programming. Am I a holdover to days gone by?
Apr 10 '08
"Lew" <le*@lewscanon.comwrote in message
news:Td******************************@comcast.com. ..
"Lew" wrote
>>However, Roedy did not mention a "global pointer". You introduced that
into the conversation, and have yet to explain what you mean by that.

Chris Thomasson wrote:
>I mean atomically incrementing a global pointer off a common base of
memory. This is basic memory allocator implementation 101. You know this.
Perhaps Roedy was talking about a distributed model. What say you?

I'll repeat what I said, but in different words, in the hope of being clearer.

What Roedy meant is that Java allocators simply increment the pointer to
available memory by the size of the allocation. Details of *which* memory
pointer, and any synchronization needed if any, are left to the JVM. This
process is built into the semantics of the 'new' operator. One does not
explicitly manage any of that. I could not tell you if the operation
involves a global pointer or not without checking the specifics of a
particular JVM implementation, of which there are several from different
vendors.
>Java can use both, and all methods in between. You don't necessarily want
to send atomic mutations to a common location. Perhaps Roedy meant N
counts off the bases of multiple memory pools. I don't know. I was
speculating. I hope I was wrong. I thought he meant count off a single
location. That's not going to scale very well... You can break big

It scales just fine. The American IRS accepts millions of
electronically-filed tax returns in just a few days through such a system.
My first point, wrt sending atomic mutations to a COMMON location:

Applying "frequent" atomic mutations to a _single_ location does not scale
well at all. This fairly common mistake ends up generating large amounts of
unneeded cache-coherency traffic. You're going to ping-pong cache lines and
simply saturate the FSB. Post over to comp.programming.threads if you want to
know why this does not scale. Try doing that on a NUMA system... It will
work, but it's going to give very bad performance.

I know some things about distributed programming... Why would you want
anything that calls new to atomically increment a global singular pointer?
What if ten threads running on ten separate CPUs all contend for a single
location in memory? That is not good at all. I have designed distributed
locking schemes, and can tell you that contention on a per-thread/CPU mutex
is MUCH better than contention on a single mutex. This is directly analogous
to multiple CPUs firing atomic updates at a SINGLE location. Again, post over
on c.p.t so we can give you much more detailed information.

Java can use many different techniques. I don't see why it would ever need
to focus contention on a common location. IMVHO, Java benefits from
distributed memory allocator algorithms.

[...]
I am still thinking about the rest of your post, I just don't have time
right now to make a full response.

Jun 27 '08 #201
"Roedy Green" <se*********@mindprod.com.invalid> wrote in message news:lk********************************@4ax.com...
On Sat, 12 Apr 2008 10:16:24 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote, quoted or indirectly quoted someone who said:
>>
How does research hurt me, or anybody else?

It is not what you are saying, but HOW you are saying it. You come
across like a prosecutor crossed with a bratty five year old.
:^(

Jun 27 '08 #202
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:UP******************************@comcast.com...
[...]
I know some things about distributed programming... Why would you want
anything that calls new to atomically increment a global singular pointer?
What if then threads
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

'then' == 'ten'
ARGH. I am late for a very informal ad-hoc meeting. Need to get my a$s out
of the door if I want to make it to the restaurant on time!

:^o
which are running on ten separate CPUS to all contend for a single
location in memory?
That is not good at all. I have designed distributed locking schemes, and
can tell you that contention on a per-thread/cpu mutex is MUCH better than
contention on a single mutex. This is directly analogous to multiple CPUS
firing atomic updates at a SINGLE location. Again, post over on c.p.t so
we can give you much more detailed information.
[...]

Jun 27 '08 #203
On Sun, 13 Apr 2008 15:10:38 +0200, "Bo Persson" <bo*@gmb.dk> wrote:
>I can have a std::vector<item> or a std::deque<item> and don't have to
use new in my code. The container can grow as needed.
How does std::vector keep growing? Doesn't it allocate larger and
larger blocks of memory? Doesn't vector call the allocate method of
an allocator instance to get hold of memory? You didn't remove 'new'
from your code; the 'new' and 'delete' are in the vector source code.


Jun 27 '08 #204
In article <uc*****************@newsb.telia.net>, Erik-wi******@telia.com says...
On 2008-04-13 12:25, Razii wrote:
On Sun, 13 Apr 2008 09:33:02 GMT, Erik Wikström
<Erik-wi******@telia.com> wrote:
>You would find that fewer people would regard you as a troll if you did
not post things like this.
No problems. I have a thick skin ..
>Anyone with at least a little bit of knowledge of OSes and memory allocation
knows that this is wrong.
As I said, I read this on a web site. What about VirtualAlloc? When is
that called?
Not quite sure, seems to be something like mmap, perhaps you can use it
to allocate memory for a heap, but there are other functions available
to manage heaps in Windows.
VirtualAlloc is (nearly) the lowest level memory allocation routine
available to normal user-mode programs in Win32. Nearly everything else
(e.g. HeapAlloc) is built on top of VirtualAlloc.

It's NOT much like mmap -- the Windows analog of mmap would be
MapViewOfFile (along with CreateFileMapping and OpenFileMapping).

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #205

"Jerry Coffin" <jc*****@taeus.com> wrote in message news:MP************************@news.sunsite.dk...
In article <nm****************@newssvr13.news.prodigy.net>,
ms*************@hotmail.com says...
>>
1. Plus the fact that if you have enough information to do GC, you
have more than enough to make compaction work.

Not really. A typical gc in C++ is added on after the fact, so it does
conservative collection -- since it doesn't know what is or isn't a
pointer, it treats everything as if it were a pointer, and assumes that
whatever it would point at, if it were a pointer, is live memory. Of
course, some values couldn't be valid pointers and are eliminated.

It does NOT, however, know with any certainty that a particular value IS
a pointer -- some definitely aren't (valid) pointers, but others might
or might not be. Since it doesn't know for sure which are pointers and
which are just integers (or whatever) that hold values that could be
pointers, it can't modify any of them. It has enough information to do
garbage collection, but NOT enough to support compacting the heap.
You make a good point. I overlooked conservative collectors.
Jun 27 '08 #206
On 2008-04-13 21:57, Jerry Coffin wrote:
In article <uc*****************@newsb.telia.net>, Erik-wi******@telia.com says...
>On 2008-04-13 12:25, Razii wrote:
On Sun, 13 Apr 2008 09:33:02 GMT, Erik Wikström
<Erik-wi******@telia.com> wrote:

You would find that fewer people would regard you as a troll if you did
not post things like this.

No problems. I have a thick skin ..

Anyone with at least a little bit of knowledge of OSes and memory allocation
knows that this is wrong.

As I said, I read this on a web site. What about VirtualAlloc? When is
that called?

Not quite sure, seems to be something like mmap, perhaps you can use it
to allocate memory for a heap, but there are other functions available
to manage heaps in Windows.

VirtualAlloc is (nearly) the lowest level memory allocation routine
available to normal user-mode programs in Win32. Nearly everything else
(e.g. HeapAlloc) is built on top of VirtualAlloc.

It's NOT much like mmap -- the Windows analog of mmap would be
MapViewOfFile (along with CreateFileMapping and OpenFileMapping).
I was thinking of using mmap to map anonymous memory (mapping physical
RAM (or swap-backed memory) into the virtual address space), but I see
now that it is a platform-specific extension.

--
Erik Wikström
Jun 27 '08 #207
Razii wrote:
On Sun, 13 Apr 2008 15:10:38 +0200, "Bo Persson" <bo*@gmb.dk> wrote:
>I can have a std::vector<item> or a std::deque<item> and don't
have to use new in my code. The container can grow as needed.

How does std::vector keep growing? Doesn't it allocate larger and
larger blocks of memory? Doesn't vector call the allocate method of
an allocator instance to get hold of memory? You didn't remove 'new'
from your code; the 'new' and 'delete' are in the vector source code.
Isn't that what I said? The new and delete operators are not used very
often in the code. Perhaps only once each in std::allocator, where
they don't allocate int-sized objects, and never 10 million times in a
loop.

That's why the benchmark is silly - you would never do anything like
that in real C++ code.
Bo Persson
Jun 27 '08 #208
On Sun, 13 Apr 2008 23:41:29 +0200, "Bo Persson" <bo*@gmb.dk> wrote:
>Isn't that what I said? The new and delete operators are not used very
often in the code. Perhaps only once each in std::allocator, where
they don't allocate int-sized objects, and never 10 million times in a
loop.
std::vector has new and delete. They are still there and must be there
every time memory is allocated dynamically.
>That's why the benchmark is silly - you would never do anything like
that in real C++ code.
It's not silly. It's a benchmark that tests dynamic memory allocation
and GC performance.

Jun 27 '08 #209
On Sun, 13 Apr 2008 18:08:08 -0500, Razii <DO*************@hotmail.com> wrote:
>>they don't allocate int-sized objects, and never 10 million times in a
loop.
There are no ints here. What are you talking about?

http://pastebin.com/f6bfa4d78 (C++ version)

Jun 27 '08 #210
Razii wrote:
On Sun, 13 Apr 2008 23:41:29 +0200, "Bo Persson" <bo*@gmb.dk> wrote:
>Isn't that what I said? The new and delete operators are not used very
often in the code. Perhaps only once each in std::allocator, where
they don't allocate int-sized objects, and never 10 million times in a
loop.

std::vector has new and delete. They are still there and must be there
every time memory is allocated dynamically.
Didn't you read the above? std::vector is very unlikely to allocate
objects one at a time. If the programmer knows he or she is going to
place a large number of objects he or she will pre-allocate them.
>That's why the benchmark is silly - you would never do anything like
that in real C++ code.

It's not silly. It's a benchmark that tests dynamic memory allocation
and GC performance.
It's silly because it tests something you will never see in C++ code.

--
Ian Collins.
Jun 27 '08 #211
Razii wrote:
On Sun, 13 Apr 2008 18:08:08 -0500, Razii <DO*************@hotmail.com> wrote:
>>they don't allocate int-sized objects, and never 10 million times in a
loop.

There are no ints here. What are you talking about?
Your test class had one data member, an int.

--
Ian Collins.
Jun 27 '08 #212
On Mon, 14 Apr 2008 11:23:10 +1200, Ian Collins <ia******@hotmail.com>
wrote:
>Your test class had one data member, an int.
Not this one. http://pastebin.com/f6bfa4d78

Jun 27 '08 #213
"Razii" <DO*************@hotmail.com> wrote in message news:r8********************************@4ax.com...
On Sun, 13 Apr 2008 23:41:29 +0200, "Bo Persson" <bo*@gmb.dk> wrote:
>>Isn't that what I said? The new and delete operators are not used very
often in the code. Perhaps only once each in std::allocator, where
they don't allocate int-sized objects, and never 10 million times in a
loop.

std::vector has new and delete. They are still there and must be there
every time memory is allocated dynamically.
>>That's why the benchmark is silly - you would never do anything like
that in real C++ code.

It's not silly. It's a benchmark that tests dynamic memory allocation
and GC performance.
Let me go ahead and try to quickly augment your test using a VERY simple
caching allocator:

<this should compile...>
__________________________________________________ __________________
#include <new>
#include <cstddef>
template<typename T>
class cache_allocator {
union node_type {
T m_obj;
node_type* m_next;
};

node_type* m_head;
std::size_t m_depth;
std::size_t const m_max_depth;

public:
cache_allocator(
std::size_t const max_depth = 1024
): m_head(NULL),
m_depth(0),
m_max_depth(max_depth) {
}

~cache_allocator() throw() {
node_type* node = m_head;
while (node) {
node_type* const next = node->m_next;
::operator delete(node);
node = next;
}
}

void sys_destroy(node_type* const node) throw() {
if (m_depth < m_max_depth) {
node->m_next = m_head;
m_head = node;
++m_depth;
} else {
::operator delete(node);
}
}

public:
T* create_ctor() {
node_type* node = m_head;
if (! node) {
node = reinterpret_cast<node_type*>(::operator new(sizeof(*node)));
// placement-construct T on the cache-miss path as well
return new (&node->m_obj) T;
}
m_head = node->m_next;
--m_depth;
return new (&node->m_obj) T;
}

void destroy_dtor(T* const obj) {
node_type* node = reinterpret_cast<node_type*>(obj);
try {
obj->~T();
sys_destroy(node);
} catch(...) {
sys_destroy(node);
throw;
}
}

public:
T* create_raw() {
node_type* node = m_head;
if (! node) {
node = new (::operator new(sizeof(*node))) node_type;
return &node->m_obj;
}
m_head = node->m_next;
--m_depth;
return &node->m_obj;
}

void destroy_raw(T* const obj) throw() {
sys_destroy(reinterpret_cast<node_type*>(obj));
}
};


// Your Test...
#include <ctime>
#include <iostream>
#define TREE_CREATE_CACHE_CTOR g_tree_malloc.create_ctor
#define TREE_CREATE_CACHE_RAW g_tree_malloc.create_raw
#define TREE_CREATE_NEW new Tree
#define TREE_DESTROY_CACHE_CTOR(ptr) g_tree_malloc.destroy_dtor(ptr)
#define TREE_DESTROY_CACHE_RAW(ptr) g_tree_malloc.destroy_raw(ptr)
#define TREE_DESTROY_NEW(ptr) delete ptr
#define TREE_CREATE TREE_CREATE_CACHE_RAW
#define TREE_DESTROY TREE_DESTROY_CACHE_RAW
struct Tree
{
Tree *left;
Tree *right;
};

static cache_allocator<Tree> g_tree_malloc;

Tree *CreateTree(int n)
{
if(n <= 0) return NULL;
Tree *t = TREE_CREATE();
t->left = CreateTree(n - 1);
t->right = CreateTree(n - 1);
return t;
}

void DeleteTree(Tree *t)
{
if(t)
{
TREE_DESTROY(t->left);
TREE_DESTROY(t->right);
TREE_DESTROY(t);
}
}
int main(int argc, char *argv[])
{
clock_t start=clock();
for(int i = 0;i < 15;i++) DeleteTree(CreateTree(22));
clock_t endt=clock();
std::cout <<"Time: " <<
double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
return 0;
}

__________________________________________________ __________________

What numbers do you get? Does that improve anything?

Jun 27 '08 #214

"Chris Thomasson" <cr*****@comcast.net> wrote in message news:Rr******************************@comcast.com...
"Razii" <DO*************@hotmail.com> wrote in message news:r8********************************@4ax.com...
>On Sun, 13 Apr 2008 23:41:29 +0200, "Bo Persson" <bo*@gmb.dk> wrote:
>>>Isn't that what I said? The new and delete operators are not used very
often in the code. Perhaps only once each in std::allocator, where
they don't allocate int-sized objects, and never 10 million times in a
loop.

std::vector has new and delete. They are still there and must be there
every time memory is allocated dynamically.
>>>That's why the benchmark is silly - you would never do anything like
that in real C++ code.

It's not silly. It's a benchmark that tests dynamic memory allocation
and GC performance.
[...]

Your test is silly because it leaks a whole lot of memory! WTF? This means
that you're not taking advantage of the cache allocator, or of any caching in
the new/delete implementation for that matter!

YIKES!

I will try and fix it...

Jun 27 '08 #215
Chris Thomasson wrote:
>
I will try and fix it...
Don't bother!

--
Ian Collins.
Jun 27 '08 #216
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:TM******************************@comcast.com...
>
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:Rr******************************@comcast.com...
>"Razii" <DO*************@hotmail.com> wrote in message news:r8********************************@4ax.com...
>>On Sun, 13 Apr 2008 23:41:29 +0200, "Bo Persson" <bo*@gmb.dk> wrote:

Isn't that what I said? The new and delete operators are not used very
often in the code. Perhaps only once each in std::allocator, where
they don't allocate int-sized objects, and never 10 million times in a
loop.

std::vector has new and delete. They are still there and must be there
every time memory is allocated dynamically.

That's why the benchmark is silly - you would never do anything like
that in real C++ code.

It's not silly. It's a benchmark that tests dynamic memory allocation
and GC performance.
[...]

Your test is silly because it leaks a whole lot of memory! WTF? This means
that you're not taking advantage of the cache allocator, or of any caching in
the new/delete implementation for that matter!

YIKES!

I will try and fix it...
Okay. The fix was simple enough:
__________________________________________________ __________________
#include <new>
#include <cstddef>
#include <cassert>
#if ! defined(NDEBUG)
# include <cstdio>
# define DBG_PRINTF(mp_exp) std::printf mp_exp
#else
# define DBG_PRINTF(mp_exp)
#endif
template<typename T>
class cache_allocator {
union node_type {
T m_obj;
node_type* m_next;
};

node_type* m_head;
std::size_t m_depth;
std::size_t const m_max_depth;

public:
cache_allocator(
std::size_t const prime = 128,
std::size_t const max_depth = 1024
): m_head(NULL),
m_depth(0),
m_max_depth(max_depth) {
for (std::size_t i = 0; i < prime && i < max_depth; ++i) {
node_type* const node = reinterpret_cast<node_type*>
(::operator new(sizeof(*node)));
node->m_next = m_head;
m_head = node;
++m_depth;
DBG_PRINTF(("(%p/%d)-Cache Prime\n", (void*)node, m_depth));
}
assert(m_depth <= max_depth);
}

~cache_allocator() throw() {
node_type* node = m_head;
while (node) {
node_type* const next = node->m_next;
--m_depth;
DBG_PRINTF(("(%p/%d)-Cache Teardown\n", (void*)node, m_depth));
::operator delete(node);
node = next;
}
assert(! m_depth);
}

void sys_destroy(node_type* const node) throw() {
if (m_depth < m_max_depth) {
node->m_next = m_head;
m_head = node;
++m_depth;
DBG_PRINTF(("(%p/%d)-Cache Store\n", (void*)node, m_depth));
} else {
DBG_PRINTF(("(%p/%d)-Cache Overflow\n", (void*)node, m_depth));
::operator delete(node);
}
}

public:
T* create_ctor() {
node_type* node = m_head;
if (! node) {
node = reinterpret_cast<node_type*>
(::operator new(sizeof(*node)));
DBG_PRINTF(("(%p/%d)-Cache Miss!\n", (void*)node, m_depth));
// placement-construct T on the cache-miss path as well
return new (&node->m_obj) T;
}
m_head = node->m_next;
--m_depth;
DBG_PRINTF(("(%p/%d)-Cache Hit!\n", (void*)node, m_depth));
return new (&node->m_obj) T;
}

void destroy_dtor(T* const obj) {
node_type* node = reinterpret_cast<node_type*>(obj);
try {
obj->~T();
sys_destroy(node);
} catch(...) {
sys_destroy(node);
throw;
}
}

public:
T* create_raw() {
node_type* node = m_head;
if (! node) {
node = reinterpret_cast<node_type*>
(::operator new(sizeof(*node)));
DBG_PRINTF(("(%p/%d)-Cache Miss!\n", (void*)node, m_depth));
return &node->m_obj;
}
m_head = node->m_next;
--m_depth;
DBG_PRINTF(("(%p/%d)-Cache Hit!\n", (void*)node, m_depth));
return &node->m_obj;
}

void destroy_raw(T* const obj) throw() {
sys_destroy(reinterpret_cast<node_type*>(obj));
}
};


// Your Test...
#include <ctime>
#include <iostream>
#define TREE_CREATE_CACHE_CTOR g_tree_malloc.create_ctor
#define TREE_CREATE_CACHE_RAW g_tree_malloc.create_raw
#define TREE_CREATE_NEW new Tree
#define TREE_DESTROY_CACHE_CTOR(ptr) g_tree_malloc.destroy_dtor(ptr)
#define TREE_DESTROY_CACHE_RAW(ptr) g_tree_malloc.destroy_raw(ptr)
#define TREE_DESTROY_DELETE(ptr) delete ptr
#define TREE_CREATE TREE_CREATE_CACHE_RAW
#define TREE_DESTROY TREE_DESTROY_CACHE_RAW
// #define RAZII_MEMORY_LEAK_VERSION
struct Tree
{
Tree *left;
Tree *right;
};

static cache_allocator<Tree> g_tree_malloc(100000, 250000);
static int g_allocs = 0;
static int g_frees = 0;

Tree *CreateTree(int n)
{
if(n <= 0) return NULL;
Tree *t = TREE_CREATE();
++g_allocs;
t->left = CreateTree(n - 1);
t->right = CreateTree(n - 1);
return t;
}
#if defined(RAZII_MEMORY_LEAK_VERSION)

void DeleteTree(Tree *t)
{
if(t)
{
TREE_DESTROY(t->left);
TREE_DESTROY(t->right);
TREE_DESTROY(t);
g_frees += 3;
}
}

#else

void DeleteTree(Tree *t) {
if (t) {
DeleteTree(t->left);
DeleteTree(t->right);
TREE_DESTROY(t);
++g_frees;
}
}

#endif
int main(int argc, char *argv[])
{
clock_t start=clock();
for(int i = 0;i < 15;i++) DeleteTree(CreateTree(22));
clock_t endt=clock();
std::cout <<"Time: " <<
double(endt-start)/CLOCKS_PER_SEC * 1000 << " ms\n";
if (g_allocs != g_frees) {
std::cout << "YOU HAVE LEAKED " << g_allocs - g_frees << " ITEMS!\n";
}
return 0;
}

__________________________________________________ __________________

All you have to do is define RAZII_MEMORY_LEAK_VERSION to see just how many
objects you were leaking! It was a shi%load. Anyway, try this out, and
report the numbers.

Jun 27 '08 #217
"Ian Collins" <ia******@hotmail.com> wrote in message news:66*************@mid.individual.net...
Chris Thomasson wrote:
>>
I will try and fix it...

Don't bother!
Humm... Well, it's too late. The fix was trivial. I was just trying to show
Razii that you can sometimes drastically reduce the number of calls to
new/delete by using the simplest techniques. The trivial little
cache_allocator object I whipped up very quickly is a simple example:
__________________________________________________ _____________________
#include <new>
#include <cstddef>
#include <cassert>
#if ! defined(NDEBUG)
# include <cstdio>
# define DBG_PRINTF(mp_exp) std::printf mp_exp
#else
# define DBG_PRINTF(mp_exp)
#endif
template<typename T>
class cache_allocator {
union node_type {
T m_obj;
node_type* m_next;
};

node_type* m_head;
std::size_t m_depth;
std::size_t const m_max_depth;
public:
cache_allocator(
std::size_t const prime = 128,
std::size_t const max_depth = 1024
): m_head(NULL),
m_depth(0),
m_max_depth(max_depth) {
for (std::size_t i = 0; i < prime && i < max_depth; ++i) {
node_type* const node = reinterpret_cast<node_type*>
(::operator new(sizeof(*node)));
node->m_next = m_head;
m_head = node;
++m_depth;
DBG_PRINTF(("(%p/%d)-Cache Prime\n", (void*)node, m_depth));
}
assert(m_depth <= max_depth);
}

~cache_allocator() throw() {
node_type* node = m_head;
while (node) {
node_type* const next = node->m_next;
--m_depth;
DBG_PRINTF(("(%p/%d)-Cache Teardown\n", (void*)node, m_depth));
::operator delete(node);
node = next;
}
assert(! m_depth);
}

void sys_destroy(node_type* const node) throw() {
if (m_depth < m_max_depth) {
node->m_next = m_head;
m_head = node;
++m_depth;
DBG_PRINTF(("(%p/%d)-Cache Store\n", (void*)node, m_depth));
} else {
DBG_PRINTF(("(%p/%d)-Cache Overflow\n", (void*)node, m_depth));
::operator delete(node);
}
}
public:
T* create_ctor() {
node_type* node = m_head;
if (! node) {
node = reinterpret_cast<node_type*>
(::operator new(sizeof(*node)));
DBG_PRINTF(("(%p/%d)-Cache Miss!\n", (void*)node, m_depth));
// placement-construct T on the cache-miss path as well
return new (&node->m_obj) T;
}
m_head = node->m_next;
--m_depth;
DBG_PRINTF(("(%p/%d)-Cache Hit!\n", (void*)node, m_depth));
return new (&node->m_obj) T;
}

void destroy_dtor(T* const obj) {
node_type* node = reinterpret_cast<node_type*>(obj);
try {
obj->~T();
sys_destroy(node);
} catch(...) {
sys_destroy(node);
throw;
}
}
public:
T* create_raw() {
node_type* node = m_head;
if (! node) {
node = reinterpret_cast<node_type*>
(::operator new(sizeof(*node)));
DBG_PRINTF(("(%p/%d)-Cache Miss!\n", (void*)node, m_depth));
return &node->m_obj;
}
m_head = node->m_next;
--m_depth;
DBG_PRINTF(("(%p/%d)-Cache Hit!\n", (void*)node, m_depth));
return &node->m_obj;
}

void destroy_raw(T* const obj) throw() {
sys_destroy(reinterpret_cast<node_type*>(obj));
}
};
__________________________________________________ _____________________
I have not really looked at it for any issues, but AFAICT it just might be
useful to others. I am wondering about undefined behavior wrt the way I am
using the cache_allocator<T>::node_type union... Can you notice any "major"
issues? I am not a C++ expert! There must be something I overlooked...

;^)

I think I should rewrite the cache_allocator<T>::destroy_dtor() function
like:

void destroy_dtor(T* const obj) {
node_type* node = reinterpret_cast<node_type*>(obj);
try {
obj->~T();
} catch(...) {
sys_destroy(node);
throw;
}
sys_destroy(node);
}
I don't need sys_destroy in the try-block because it will never throw any
exceptions.

Jun 27 '08 #218
On Sun, 13 Apr 2008 18:11:06 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>What numbers do you get? Does that improve anything?
No. 26891 ms

It's worse since the memory peak was 700 MB! (the original was only
around 65 MB peak).

Here is java version...

http://pastebin.com/f3f559ae2 (java )

(run it with these flags)
java -server -Xmx86m -Xms86m -Xmn85m Test

it's around 1800ms with 73 MB peak memory , exactly 15% faster.
Jun 27 '08 #219
"Razii" <DO*************@hotmail.com> wrote in message news:fj********************************@4ax.com...
On Sun, 13 Apr 2008 18:11:06 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>>What numbers do you get? Does that improve anything?

No. 26891 ms

It's worse since the memory peak was 700 MB! (the original was only
around 65 MB peak).

Here is java version...

http://pastebin.com/f3f559ae2 (java )

(run it with these flags)
java -server -Xmx86m -Xms86m -Xmn85m Test

it's around 1800ms with 73 MB peak memory , exactly 15% faster.
Your original C++ version was leaking a shi%load of memory. Here is how to
"fix" your major bug:
void DeleteTree(Tree *t) {
if (t) {
DeleteTree(t->left);
DeleteTree(t->right);
delete t;
}
}


I have posted a "fixed" version along with my tweaks here:

http://groups.google.com/group/comp....9e0f6f1d0ed775
Jun 27 '08 #220
On Sun, 13 Apr 2008 18:38:27 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>Your test is silly because it leaks a whole lot of memory! WTF?
My version was leaking memory (it's actually the version by Mirek
Fidler)? Absolutely not. The memory peak in that old version was 65
MB. Yours was leaking with 700 MB memory!
Jun 27 '08 #221
On Mon, 14 Apr 2008 12:38:41 +1200, Ian Collins <ia******@hotmail.com>
wrote:
>Don't bother!
No, he should, since his version was leaking memory.

Jun 27 '08 #222

"Razii" <DO*************@hotmail.com> wrote in message news:1f********************************@4ax.com...
On Sun, 13 Apr 2008 18:38:27 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>>Your test is silly because it leaks a whole lot of memory! WTF?

My version was leaking memory (it's actually the version by Mirek
Fidler)? Absolutely not. The memory peak in that old version was 65
MB. Yours was leaking with 700 MB memory!
The code linked to right here leaks memory:

http://pastebin.com/f6bfa4d78

If you did not write that, then I am sorry.

Jun 27 '08 #223
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:Cv******************************@comcast.com...
>
"Razii" <DO*************@hotmail.com> wrote in message news:1f********************************@4ax.com...
>On Sun, 13 Apr 2008 18:38:27 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>>>Your test is silly because it leaks a whole lot of memory! WTF?

My version was leaking memory (it's actually the version by Mirek
Fidler)? Absolutely not. The memory peak in that old version was 65
MB. Yours was leaking with 700 MB memory!

The code linked to right here leaks memory:

http://pastebin.com/f6bfa4d78

If you did not write that, then I am sorry.
WOAH! I am seeing things! There is NO memory leak. I don't know why, but the
first code I downloaded had the DeleteTree function written as:

void DeleteTree(Tree *t)
{
if(t)
{
delete t->left;
delete t->right;
delete t;
}
}
WTF! Sorry!

Jun 27 '08 #224
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:p-******************************@comcast.com...
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:TM******************************@comcast.com...
>>
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:Rr******************************@comcast.com...
>>"Razii" <DO*************@hotmail.com> wrote in message news:r8********************************@4ax.com...
On Sun, 13 Apr 2008 23:41:29 +0200, "Bo Persson" <bo*@gmb.dk> wrote:

>Isn't that what I said? The new and delete operators are not used very
>often in the code. Perhaps only once each in std::allocator, where
>they don't allocate int-sized objects, and never 10 million times in a
>loop.

std::vector has new and delete. They are still there and must be there
every time memory is allocated dynamically.

>That's why the benchmark is silly - you would never do anything like
>that in real C++ code.

It's not silly. It's a benchmark that tests dynamic memory allocation
and GC performance.
[...]

Your test is silly because it leaks a whole lot of memory! WTF? This
means that your not taking advantage of the cache allocator. Or any
caching in the new/delete implementation for that matter!

YIKES!

I will try and fix it...

Okay. The fix was simple enough:
__________________________________________________ __________________
[...]
>
__________________________________________________ __________________

All you have to do is define RAZII_MEMORY_LEAK_VERSION to see just how
many objects you were leaking! It was a shi%load. Anyway, try this out,
and report the numbers.
I don't know WTF I was thinking. The code on the site:

http://pastebin.com/f6bfa4d78

is fine. I wonder what happened. Anyway, I apologize.

Jun 27 '08 #225
"Razii" <DO*************@hotmail.com> wrote in message news:fj********************************@4ax.com...
On Sun, 13 Apr 2008 18:11:06 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>>What numbers do you get? Does that improve anything?

No. 26891 ms

It's worse since the memory peak was 700 MB! (the original was only
around 65 MB peak).

Here is java version...

http://pastebin.com/f3f559ae2 (java )

(run it with these flags)
java -server -Xmx86m -Xms86m -Xmn85m Test

it's around 1800ms with 73 MB peak memory , exactly 15% faster.
I need to download the JDK for the computer I am using right now.

Jun 27 '08 #226
"Chris Thomasson" <cr*****@comcast.net> wrote in message news:bJ******************************@comcast.com...
"Razii" <DO*************@hotmail.com> wrote in message news:fj********************************@4ax.com...
>On Sun, 13 Apr 2008 18:11:06 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>>>What numbers do you get? Does that improve anything?

No. 26891 ms

It's worse since the memory peak was 700 MB! (the original was only
around 65 MB peak).

Here is java version...

http://pastebin.com/f3f559ae2 (java )

(run it with these flags)
java -server -Xmx86m -Xms86m -Xmn85m Test

it's around 1800ms with 73 MB peak memory , exactly 15% faster.

Your original C++ version was leaking a shi%load of memory. Here is how to
"fix" your major bug:
False alarm.

Jun 27 '08 #227
On Sun, 13 Apr 2008 19:21:44 -0700, "Chris Thomasson" <cr*****@comcast.net> wrote:
>is fine. I wonder what happened. Anyway, I apologize.
Ok, that's fine. I am now confused with all the versions posted.

Can you post the clean working version on the pastebin site

http://pastebin.com/

and post the link here.
Jun 27 '08 #228

"Razii" <DO*************@hotmail.comwrote in message
news:an********************************@4ax.com...
On Sun, 13 Apr 2008 19:21:44 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>>is fine. I wonder what happened. Anyway, I apologize.

Ok, that's fine. I am now confused with all the versions posted.

Can you post clean the working version on pastebin site

http://pastebin.com/

and post the link here.

Here ya go:

http://pastebin.com/m2493f289
This probably won't beat the Java version, but it should be better than
using new/delete directly.

Jun 27 '08 #229
On Sun, 13 Apr 2008 19:27:40 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>I need to download the JDK for on the computer I am using right now.
Here is the bytecode version: Test.class

http://www.yousendit.com/transfer.ph...7BD69B5598B446

All it needs is the JRE. The flags to run it (for best performance in this
case) should be:

java -server -Xmx86m -Xms86m -Xmn85m Test
Jun 27 '08 #230
On Sun, 13 Apr 2008 19:51:10 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>
"Razii" <DO*************@hotmail.com> wrote in message
news:an********************************@4ax.com...
>On Sun, 13 Apr 2008 19:21:44 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>>>is fine. I wonder what happened. Anyway, I apologize.

Ok, that's fine. I am now confused with all the versions posted.

Can you post clean the working version on pastebin site

http://pastebin.com/

and post the link here.


Here ya go:

http://pastebin.com/m2493f289
It's not working...When I run it, I get

(00473118/19862)-Cache Prime
(00473128/19863)-Cache Prime
(snip)
(00FCF9B8/250000)-Cache Overflow
(00FCF9F8/250000)-Cache Overflow
(snip)
(01B31518/0)-Cache Miss!
(01B31528/0)-Cache Miss!
(01B31538/0)-Cache Miss!
(01B31548/0)-Cache Miss!
(01B31558/0)-Cache Miss!
(snip)
Jun 27 '08 #231
On Sun, 13 Apr 2008 21:02:43 -0500, Razii
<DO*************@hotmail.com> wrote:
>It's not working...When I run it, I get

(00473118/19862)-Cache Prime
(00473128/19863)-Cache Prime
(snip)
(00FCF9B8/250000)-Cache Overflow
(00FCF9F8/250000)-Cache Overflow
(snip}
(01B31518/0)-Cache Miss!
(01B31528/0)-Cache Miss!
(01B31538/0)-Cache Miss!
(01B31548/0)-Cache Miss!
(01B31558/0)-Cache Miss!
(snip)
Never mind. Why leave all printfs in the benchmark version? Anyway..

Time: 26875 ms

Not much improvement.

peak memory was 60 MB.

Jun 27 '08 #232
On Sun, 13 Apr 2008 20:15:44 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>Are you sure that you have been compiling the C++ code in non-debug mode?
Yes, I am sure..

g++ -O2 -fomit-frame-pointer -finline-functions "new.cpp" -o "new.exe"

and VC9

cl /O2 /GL new.cpp /link /ltcg

on VC9 it was 24500 ms

Jun 27 '08 #233
On Sun, 13 Apr 2008 20:20:28 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>VC++ IDE should automatically define NDEBUG for release builds.
I don't use the IDE. For VC++, I use the command line:

cl /O2 /GL test.cpp /link /ltcg

and for gcc,

g++ -O2 -fomit-frame-pointer -finline-functions "new.cpp" -o "new.exe"

The macro outputs were printed anyway. In any case, I commented them
out.
Jun 27 '08 #234
"Razii" <DO*************@hotmail.comwrote in message
news:se********************************@4ax.com...
On Sun, 13 Apr 2008 21:02:43 -0500, Razii
<DO*************@hotmail.comwrote:
>>It's not working...When I run it, I get

(00473118/19862)-Cache Prime
(00473128/19863)-Cache Prime
(snip)
(00FCF9B8/250000)-Cache Overflow
(00FCF9F8/250000)-Cache Overflow
(snip}
(01B31518/0)-Cache Miss!
(01B31528/0)-Cache Miss!
(01B31538/0)-Cache Miss!
(01B31548/0)-Cache Miss!
(01B31558/0)-Cache Miss!
(snip)

Never mind. Why leave all printfs in the benchmark version? Anyway..

Time: 26875 ms

Not much improvemnt.

peak memory was 60 MB.
Okay. That's what I expected. I am going to play around with this and see if
I can get somewhere near the Java version. Thanks for your patience.

;^)

Jun 27 '08 #235

"Chris Thomasson" <cr*****@comcast.netwrote in message
news:yo******************************@comcast.com. ..
"Razii" <DO*************@hotmail.comwrote in message
news:se********************************@4ax.com...
>On Sun, 13 Apr 2008 21:02:43 -0500, Razii
<DO*************@hotmail.comwrote:
>>>It's not working...When I run it, I get

(00473118/19862)-Cache Prime
(00473128/19863)-Cache Prime
(snip)
(00FCF9B8/250000)-Cache Overflow
(00FCF9F8/250000)-Cache Overflow
(snip}
(01B31518/0)-Cache Miss!
(01B31528/0)-Cache Miss!
(01B31538/0)-Cache Miss!
(01B31548/0)-Cache Miss!
(01B31558/0)-Cache Miss!
(snip)

Never mind. Why leave all printfs in the benchmark version? Anyway..

Time: 26875 ms

Not much improvemnt.

peak memory was 60 MB.

Okay. That's what I expected. I am going to play around with this and see
if I can get somewhere near the Java version. Thanks for your patience.

;^)

One comment on 'System.gc()':

http://java.sun.com/j2se/1.4.2/docs/...ystem.html#gc()

This is basically only a "strong suggestion" that the JVM run a collection.
Java can basically say, "nah, I don't need to run a collection cycle at this
time"; it knows better than the programmer most of the time. Is there any way
to know for sure if Java actually runs a GC cycle?

Jun 27 '08 #236
On Sun, 13 Apr 2008 20:45:02 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>
Okay. That's what I expected. I am going to play around with this and see if
I can get somewhere near the Java version. Thanks for your patience.
You haven't posted the time for java version and what flags you used.
The flags are important and in this case must be

Java -server -Xms86m -Xmx86m -Xmn85m Test

either use that or....

Java -server -Xms170m -Xmx170m -XX:NewRatio=1 Test

Jun 27 '08 #237
On Sun, 13 Apr 2008 20:49:13 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>This is basically only a "strong suggestion" that the JVM runs a collection.
Java could basically say, na, I don't need to run a collection cycle at this
time; it knows better that the programmer most of the time. Is there any way
to know for sure if Java actually runs a gc cycle?
Adding the flag

-verbose:gc

shows exactly when GC runs.

In this case, it runs every time System.gc() is called (probably
because the Eden memory defined by the -Xmn flag is full after each loop).

I can get around the same time (~1800ms) even if I remove System.gc(),
but the flags then must be
Java -server -Xms1024m -Xmx1024m -XX:NewRatio=1 Test

However, memory usage can then get very high. I am sure there are other
flags/ways to get the best result without explicitly calling
System.gc() and without high memory usage.
Jun 27 '08 #238
ldv
Give the following GC benchmark a shot guys, you'll be surprised:

http://www.experimentalstuff.com/Technologies/GCold/

LDV
Jun 27 '08 #239
On Apr 14, 4:23 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Sun, 13 Apr 2008 20:15:44 -0700, "Chris Thomasson"

<cris...@comcast.net> wrote:
Are you sure that you have been compiling the C++ code in non-debug mode?

Yes, I am sure..

g++ -O2 -fomit-frame-pointer -finline-functions "new.cpp" -o "new.exe"
No, you are not. Add -DNDEBUG. Also did you measure with -O3? Did you
try to tune -march for your architecture (this can make a *lot* of
difference - or not, depending of the program)?

--
gpd
Jun 27 '08 #240

"Razii" <DO*************@hotmail.comwrote in message
news:se********************************@4ax.com...
On Sun, 13 Apr 2008 21:02:43 -0500, Razii
<DO*************@hotmail.comwrote:
>>It's not working...When I run it, I get

(00473118/19862)-Cache Prime
(00473128/19863)-Cache Prime
(snip)
(00FCF9B8/250000)-Cache Overflow
(00FCF9F8/250000)-Cache Overflow
(snip}
(01B31518/0)-Cache Miss!
(01B31528/0)-Cache Miss!
(01B31538/0)-Cache Miss!
(01B31548/0)-Cache Miss!
(01B31558/0)-Cache Miss!
(snip)

Never mind. Why leave all printfs in the benchmark version? Anyway..

Time: 26875 ms

Not much improvemnt.

peak memory was 60 MB.
Please check this one out:

http://pastebin.com/m3a18a8e1
I believe that it will dramatically improve the times for sure. Please post
your results!

Thanks.

WARNING!

The slab_allocator template is NOT built for general-purpose use. I very quickly
created it for this benchmark only! Also, the code compiles with G++ and
VC++, but on Comeau. This is because of the dlist API.

Jun 27 '08 #241

"Razii" <DO*************@hotmail.comwrote in message
news:3o********************************@4ax.com...
On Sun, 13 Apr 2008 20:49:13 -0700, "Chris Thomasson"
<cr*****@comcast.netwrote:
>>This is basically only a "strong suggestion" that the JVM runs a
collection.
Java could basically say, na, I don't need to run a collection cycle at
this
time; it knows better that the programmer most of the time. Is there any
way
to know for sure if Java actually runs a gc cycle?

Adding the flag

-verbose:gc

shows exactly when GC runs.

In this case, it runs everytime when System.gc() is called (probably
since Eden memory defined by -Xmn flag is full after each loop).

I can get around the same time (~1800ms), even if I remove System.gc()
but the flags then must be
Java -server -Xms1024m -Xmx1024m -XX:NewRatio=1 Test

However memory usage then can get very high. I am sure there are other
flags/ways to get the best result without explicitly calling
System.gc() and without high memory usage.
Thanks.

Jun 27 '08 #242
On Apr 14, 4:48 pm, gpderetta <gpdere...@gmail.com> wrote:
On Apr 14, 4:23 am, Razii <DONTwhatever...@hotmail.com> wrote:
On Sun, 13 Apr 2008 20:15:44 -0700, "Chris Thomasson"
<cris...@comcast.net> wrote:
>Are you sure that you have been compiling the C++ code in non-debug mode?
Yes, I am sure..
g++ -O2 -fomit-frame-pointer -finline-functions "new.cpp" -o "new.exe"

No, you are not. Add -DNDEBUG. Also did you measure with -O3? Did you
try to tune -march for your architecture (this can make a *lot* of
difference - or not, depending of the program)?
BTW, could you benchmark this version:

http://pastebin.com/m16980424

This is nothing a decent C++ programmer would ever write [1], but then
neither is the benchmark itself.
Be sure to use -O3 (on my machine, with gcc it is two time faster than
-O2).

[1] my version uses a very simple region allocator, which in some
extreme cases (HPC, embedded devices, benchmarks :) ) might actually
make sense.

--
gpd

Jun 27 '08 #243
On Mon, 14 Apr 2008 07:48:12 -0700 (PDT), gpderetta
<gp*******@gmail.com> wrote:
>No, you are not. Add -DNDEBUG. Also did you measure with -O3?
Yes I tried -O3. There was no difference.
>Did you
try to tune -march for your architecture (this can make a *lot* of
difference - or not, depending of the program)?
Why? Commercial C++ software has to target the least-common-denominator
processor, so the flags we use must target it too. In any case, I added

-march=athlon-xp

there was no change.

Time: 26656 ms

Java version (with the flags I suggested) is at

Time: 1789 ms

14 (or is that 15?) times faster.
Jun 27 '08 #244

"gpderetta" <gp*******@gmail.comwrote in message
news:1e**********************************@c65g2000 hsa.googlegroups.com...
On Apr 14, 4:48 pm, gpderetta <gpdere...@gmail.comwrote:
>On Apr 14, 4:23 am, Razii <DONTwhatever...@hotmail.comwrote:
On Sun, 13 Apr 2008 20:15:44 -0700, "Chris Thomasson"
<cris...@comcast.netwrote:
Are you sure that you have been compiling the C++ code in non-debug
mode?
Yes, I am sure..
g++ -O2 -fomit-frame-pointer -finline-functions "new.cpp" -o "new.exe"

No, you are not. Add -DNDEBUG. Also did you measure with -O3? Did you
try to tune -march for your architecture (this can make a *lot* of
difference - or not, depending of the program)?

BTW, could you benchmark this version:

http://pastebin.com/m16980424

This is nothing a decent C++ programmer would ever write [1], but so
it is the benchmark itself.
Be sure to use -O3 (on my machine, with gcc it is two time faster than
-O2).

[1] my version uses a very simple region allocator, which in some
extreme cases (HPC, embedded devices, benchmarks :) ) might actually
make sense.
On my old machine (P4 3.06 HyperThread)

This version <http://pastebin.com/m16980424> outputs:

I literally waited for about two minutes, and finally hit Ctrl-C... There is
something wrong. I have not studied your code yet.

And my newest version <http://pastebin.com/m3a18a8e1> outputs:

Time: 7015 ms

What times are you getting?

Jun 27 '08 #245
On Apr 13, 7:42 pm, "Chris Thomasson" <cris...@comcast.netwrote:

OFFTOPIC: Chris, I have tried to send you an email concerning AppCore.
Have you got it?

Mirek
Jun 27 '08 #246

"Chris Thomasson" <cr*****@comcast.netwrote in message
news:V8******************************@comcast.com. ..
>
"Razii" <DO*************@hotmail.comwrote in message
news:se********************************@4ax.com...
>On Sun, 13 Apr 2008 21:02:43 -0500, Razii
<DO*************@hotmail.comwrote:
>>>It's not working...When I run it, I get

(00473118/19862)-Cache Prime
(00473128/19863)-Cache Prime
(snip)
(00FCF9B8/250000)-Cache Overflow
(00FCF9F8/250000)-Cache Overflow
(snip}
(01B31518/0)-Cache Miss!
(01B31528/0)-Cache Miss!
(01B31538/0)-Cache Miss!
(01B31548/0)-Cache Miss!
(01B31558/0)-Cache Miss!
(snip)

Never mind. Why leave all printfs in the benchmark version? Anyway..

Time: 26875 ms

Not much improvemnt.

peak memory was 60 MB.

Please check this one out:

http://pastebin.com/m3a18a8e1
I believe that it will dramatically improve the times for sure. Please
post your results!
You should play around with the cache settings. The posted code as-is uses
the following configuration:
static slab_allocator<Tree, 16384> g_tree_malloc(2, 4);

which means:

Objects Per-Slab: 16384
Cache Prime: 2 slabs
Cache Threshold: 4 slabs


>
Thanks.

WARNING!

The slab_allocator template is NOT build for general purpose. I very
quickly created it for this benchmark only! Also, the code compiles with
G++ and VC++, but on Comeau. This is because of the dlist API.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Also, the code compiles with G++ and VC++, but NOT on Comeau. This is
because of the dlist API
I could improve on this oh so much more, but I am not sure it's worth all the
effort. I know that if I worked some more on it I could probably beat, or
come within several milliseconds of, the Java times. I can use less memory as
well. Two possible improvements I can think of off hand:

- automatic slab detection using pointer range. This would eliminate the
extra word (e.g., the slab pointer) in the user_type.
- align on page boundary. This would improve cache performance.

Jun 27 '08 #247
Razii wrote:
On Mon, 14 Apr 2008 07:48:12 -0700 (PDT), gpderetta
<gp*******@gmail.com> wrote:
>No, you are not. Add -DNDEBUG. Also did you measure with -O3?

Yes I tried -O3. There was no difference.
>Did you
try to tune -march for your architecture (this can make a *lot* of
difference - or not, depending of the program)?

Why? At least commercial C++ software will have to target the
least-common-denominator processor. The flags we use must target the
least-common-denominator processor.
Of course not; now you are being silly again.

If the program runs fast enough on mid-sized hardware, optimize for
that. It will run even better on the big iron.

If you need the lastest hardware, optimize for that. Don't bother with
the rest.
In any case, I added

-march=athlon-xp
You're kidding! :-)
>
there was no change.

Time: 26656 ms

Java version (with the flags I suggested) is at

Time: 1789 ms

14 (or is that 15?) times faster.
Ok, so for Java you must optimize for the actual test machine? :-))
Bo Persson
Jun 27 '08 #248
On Mon, 14 Apr 2008 09:37:29 -0700, "Chris Thomasson"
<cr*****@comcast.net> wrote:
>This version <http://pastebin.com/m16980424> outputs:

I literally waited for about two minutes, and finally hit Ctrl-C... There is
something wrong. I have not studied your code yet.
I also ended it after getting bored to death.

Jun 27 '08 #249
"Razii" <DO*************@hotmail.comwrote in message
news:96********************************@4ax.com...
On Mon, 14 Apr 2008 07:48:12 -0700 (PDT), gpderetta
<gp*******@gmail.comwrote:
>>No, you are not. Add -DNDEBUG. Also did you measure with -O3?

Yes I tried -O3. There was no difference.
>>Did you
try to tune -march for your architecture (this can make a *lot* of
difference - or not, depending of the program)?

Why? At least commercial C++ software will have to target the
least-common-denominator processor. The flags we use must target the
least-common-denominator processor. In any case, I added

-march=athlon-xp

there was no change.

Time: 26656 ms

Java version (with the flags I suggested) is at

Time: 1789 ms

14 (or is that 15?) times faster.
Here are my results for the various tests posted here...


Compile flags:

G++: -O3 -fomit-frame-pointer -finline-functions -pedantic -Wall -DNDEBUG
Java: -server -Xms86m -Xmx86m -Xmn85m


The times I get on my old machine (P4 3.06ghz HyperThread):


- My first try, cache_allocator <http://pastebin.com/m2493f289>:

Time: 47704 ms


- My second try, slab_allocator <http://pastebin.com/m3a18a8e1>:

Time: 6969 ms


- The Java version by Razii <http://pastebin.com/f3f559ae2>:

Time: 2895 ms

Can anybody else post their timings please?

Thanks.

Jun 27 '08 #250

This discussion thread is closed

Replies have been disabled for this discussion.
