469,927 Members | 1,880 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,927 developers. It's quick & easy.

ANSI C problem on P4 under Linux & Windows

VNG
I have an ANSI C program that was compiled under Windows MSVC++ 6.0 (SP6) and
under Linux gnu, and ran under P3, P4 and AMD.

It runs fine on P3 and AMD under both Windows and Linux, but under P4 it has
problems. Under Windows 3GHz P4 runs twice slower than 800MHz P3... and under
Linux not only that it runs slower (while AMD is 40 times faster), but it also
produces wrong numerical results...

Any suggestion what can be the problem?

How to fix the P4 speed under MSVC++ (SP6)?
How to fix P4's speed and numerical result under Linux?

Here's some more details about the compilation:
GNU:
CFLAGS=-O6 -fexpensive-optimizations -ffast-math -fno-strength-reduce
-funroll-loops -fomit-frame-pointer -Wno-long-long -Wno-unused
Basically one of the most intensive loops (that we suspect in but aren't sure if
it causes the problem) looks like this:

static long loop_order;

void functionname ()
{
register float *iPtr, *itPtr, *iPtr1, *cPtr, acc;
register long j;
:
{
register float c1, c2;
j = loop_order;
while (j--)
{
acc = *itPtr-- * c1;
acc += *itPtr-- * c2;
acc += *itPtr++ * c3;
*cPtr++ += *iPtr1++ * acc;
}
}
:
}

We have tried to eliminate the use of the word "register" and redefined "j" as
volatile, no change.
Thanks,
-- VNG


Jul 22 '05 #1
1 1648
VNG wrote:
I have an ANSI C program that was compiled under Windows MSVC++ 6.0
(SP6) and
under Linux gnu, and ran under P3, P4 and AMD.

It runs fine on P3 and AMD under both Windows and Linux, but under P4 it
has
problems. Under Windows 3GHz P4 runs twice slower than 800MHz P3...
and under
Linux not only that it runs slower (while AMD is 40 times faster), but
it also
produces wrong numerical results...

Any suggestion what can be the problem?

How to fix the P4 speed under MSVC++ (SP6)?
How to fix P4's speed and numerical result under Linux?

Here's some more details about the compilation:
GNU:
CFLAGS=-O6 -fexpensive-optimizations -ffast-math -fno-strength-reduce
-funroll-loops -fomit-frame-pointer -Wno-long-long -Wno-unused
Basically one of the most intensive loops (that we suspect in but aren't
sure if
it causes the problem) looks like this:

static long loop_order;

void functionname ()
{
register float *iPtr, *itPtr, *iPtr1, *cPtr, acc;
register long j;
:
{
register float c1, c2;
j = loop_order;
while (j--)
{
acc = *itPtr-- * c1;
acc += *itPtr-- * c2;
acc += *itPtr++ * c3;
*cPtr++ += *iPtr1++ * acc;
}
}
:
}

We have tried to eliminate the use of the word "register" and redefined
"j" as
volatile, no change.

Why volatile? Also -ffast-math sounds like lower floating pointprecision
than normal.
The command line parameters I use for C90 programs:
-std=iso9899:199409 -pedantic-errors -Wall -fexpensive-optimizations -O3
-ffloat-store -mcpu=pentiumpro


Try this, and do not use volatile and register unless needed.


Regards,

Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #2

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

reply views Thread by Eric Myers | last post: by
30 posts views Thread by Christopher Kurtis Koeber | last post: by
48 posts views Thread by Daniele C. | last post: by
83 posts views Thread by sunny | last post: by
65 posts views Thread by Leslie Kis-Adam | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.