gcc 3.4.3 performance problem illustrated

I have noticed significantly worse performance in some of my C++ programs when compiled with
gcc 3.4.3 as compared to gcc 3.3.4. I have boiled it down to one relatively short program that
illustrates the problem. It seems to be an issue of excessive cache misses in certain pointer
lookup operations in gcc 3.4.3 binaries. BTW, are there any tools to actually count cache misses?

If anyone has a few minutes to compile and run the following code, I would be interested in
knowing whether you see the same problem. I'm running an AMD64 Athlon 3200 with 1024 KB of cache.
I compiled with

g++ -O3 -Wall -march=k8

Compiled with gcc 3.3.4 average run time: 2.0 seconds
Compiled with gcc 3.4.3 average run time: 2.9 seconds

I've noticed even more dramatic differences in larger programs that actually do something.

I would be interested in answers to the following questions:

1) Is this observed only on AMD64, or also on x86?
2) How does gcc 4.0.0 do?
3) Are there compiler options that would improve performance? (None that I've tried did.)
4) What changed between gcc 3.3 and 3.4 to cause this?

If you have any spare time, I think this is an interesting example, and worth the effort for
someone to figure out. I'm afraid my compiler expertise is not sufficient, so I am asking for
some help. Thanks.

Code:

// Run time is anywhere from 33 to 50% longer when compiled with gcc 3.4.3 compared to 3.3.4.
// Compiled with g++ -O3 -Wall -march=k8 (same performance lag observed with -O2).
//
// Objects are created in a hierarchy of classes.
// When they are referenced, it seems that the pointer lookups
// cause more cache misses in gcc 3.4.3 binaries.

#include <stdio.h>
#include <time.h>   // clock(), clock_t, CLOCKS_PER_SEC
#include <vector>

class mytype_A {
public:
    int id;
    mytype_A():id(0) {}
};

class mytype_B {
public:
    mytype_A* A;
    mytype_B(mytype_A* p):A(p) {}
};

class mytype_C {
public:
    mytype_B* B;
    mytype_C(mytype_B* p):B(p) {}
};

class mytype_D {
public:
    // mytype_C* C[2]; // less performance difference if we use simple arrays
    std::vector<mytype_C*> C;
    int junk[3]; // affects performance (must cause cache misses)

public:
    // note: a1 is never used; both entries of C end up pointing at a0
    mytype_D(mytype_A* a0, mytype_A* a1) {
        // C[0] = new mytype_C(new mytype_B(a0));
        // C[1] = new mytype_C(new mytype_B(a0));
        C.push_back(new mytype_C(new mytype_B(a0)));
        C.push_back(new mytype_C(new mytype_B(a0)));
    }
};

int main() {
    const int k = 5000; // run-time not linear in k
    mytype_A* A[k+1];   // k+1 slots: indices 0..k are all used below
    mytype_D* D[k];
    for (int i=0;i<=k;i++)
        A[i] = new mytype_A();
    for (int i=0;i<k;i++)
        D[i] = new mytype_D(A[i],A[k-i]); // intentionally make some pointers farther apart

    clock_t before = clock();

    int k0 = 0;
    for (int i=0;i<k;i++) {
        k0 = 0;
        for (int j=0;j<k;j++) { // run through list of D's, and reference pointers
            mytype_D* d = D[j];
            if (d->C[0]->B->A->id) k0++;
            if (d->C[1]->B->A->id) k0++;
        }
    }
    printf("%d\n",k0); // don't allow compiler to optimize away k0

    printf("time: %f\n",(double)(clock()-before)/CLOCKS_PER_SEC);

    return 0;
}

--
Kenneth Massey
http://www.masseyratings.com
Jul 23 '05 #1
2 Replies


Kenneth Massey wrote:
I was noticing significantly worse performance in some of my C++
codes compiled with gcc 3.4.3 as compared to gcc 3.3.4. I have boiled
it down [...]

I would be interested in answering the following questions:

1) is this observed only on AMD64, or also x86 ?
2) how does gcc 4.0.0 do?
3) are there compiler options that would improve performance (none
that I've tried did)
4) what changed between gcc 3.3 and 3.4 to cause this?

If you have any spare time, I think this is an interesting example,
and worth the effort for someone to figure out. I'm afraid my
compiler expertise is not sufficient, so I am asking for some help.
[...]


Please re-post this to gnu.g++.help. This is all very compiler-specific
and as such not a C++ *language* issue but rather a compiler issue. You
should be able to get much better help in the newsgroup for your compiler.

Thanks.

V
Jul 23 '05 #2

Kenneth Massey wrote:
I was noticing significantly worse performance in some of my C++ codes
compiled with gcc 3.4.3 as compared to gcc 3.3.4. I have boiled it down
into one relatively short code that illustrates. It seems to be an issue
of excessive cache misses in certain pointer lookup operations in gcc
3.4.3 binaries. BTW, are there any tools to actually count cache misses?

If anyone has a few minutes to compile and run the following code, I would
be interested in knowing if you experience the same problems. I'm running
AMD64 athlon 3200 with 1024KB cache. I compiled with

g++ -O3 -Wall -march=k8

Compiled with gcc 3.3.4 average run time: 2.0 seconds
Compiled with gcc 3.4.3 average run time: 2.9 seconds

[snip]

My results:
~/Projects/stl_string> ./mytest
0
time: 5.210000

Compiled as:
g++ -O3 -Wall -march=athlon -o mytest main.cpp

My specs:
AMD Athlon 1800+
1GB PC2700 DDR
SuSE 9.1 Pro

~/Projects/stl_string> g++ -v
Reading specs from /usr/lib/gcc-lib/i586-suse-linux/3.3.3/specs
Configured with: ../configure --enable-threads=posix --prefix=/usr
--with-local-prefix=/usr/local --infodir=/usr/share/info
--mandir=/usr/share/man --enable-languages=c,c++,f77,objc,java,ada
--disable-checking --libdir=/usr/lib --enable-libgcj
--with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib
--with-system-zlib --enable-shared --enable-__cxa_atexit i586-suse-linux
Thread model: posix
gcc version 3.3.3 (SuSE Linux)

Hope this helps.

Alvin

Jul 23 '05 #3
