ANSI C problem on P4 under Linux & Windows

VNG
I have an ANSI C program that was compiled with MSVC++ 6.0 (SP6) under Windows and
with the GNU compiler under Linux, and run on P3, P4 and AMD machines.

It runs fine on the P3 and the AMD under both Windows and Linux, but on the P4 it has
problems. Under Windows, a 3GHz P4 takes twice as long as an 800MHz P3, and under
Linux the P4 not only runs slower (the AMD is 40 times faster), but it also
produces wrong numerical results.

Any suggestions as to what the problem could be?

How can we fix the P4's speed under MSVC++ 6.0 (SP6)?
How can we fix the P4's speed and numerical results under Linux?

Here's some more details about the compilation:
GNU:
CFLAGS=-O6 -fexpensive-optimizations -ffast-math -fno-strength-reduce
-funroll-loops -fomit-frame-pointer -Wno-long-long -Wno-unused
One of the most intensive loops (we suspect it, but aren't sure that
it causes the problem) looks like this:

static long loop_order;

void functionname ()
{
    register float *iPtr, *itPtr, *iPtr1, *cPtr, acc;
    register long j;
    :
    {
        register float c1, c2;
        j = loop_order;
        while (j--)
        {
            acc = *itPtr-- * c1;
            acc += *itPtr-- * c2;
            acc += *itPtr++ * c3;
            *cPtr++ += *iPtr1++ * acc;
        }
    }
    :
}

We have tried removing the "register" keyword and redeclaring "j" as
volatile, but nothing changed.
Thanks,
-- VNG


Jul 22 '05 #1


Why volatile? Also, -ffast-math gives up strict IEEE floating-point behaviour
for speed, so the results can be less accurate than normal.
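
To illustrate why -ffast-math can change numerical results: among other things it lets GCC reorder floating-point operations as if addition were associative, which it is not. Below is a minimal sketch of the effect. The function names (sum_left_to_right, sum_right_to_left, print_reassociation_demo) are hypothetical and not from the original program, and the volatile temporaries are only there to force each intermediate to be rounded to float, so the demo behaves the same regardless of the flags in use.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; not part of the original program. */

/* Computes (a + b) + c, rounding the intermediate sum to float. */
float sum_left_to_right(float a, float b, float c)
{
    volatile float t = a + b;   /* volatile store forces rounding to float */
    volatile float r = t + c;
    return r;
}

/* Computes a + (b + c), rounding the intermediate sum to float. */
float sum_right_to_left(float a, float b, float c)
{
    volatile float t = b + c;   /* 1.0f is lost here when b is around -1e8 */
    volatile float r = a + t;
    return r;
}

/* Prints both orderings for one set of inputs where they disagree. */
void print_reassociation_demo(void)
{
    float a = 1.0e8f, b = -1.0e8f, c = 1.0f;

    printf("(a + b) + c = %g\n", sum_left_to_right(a, b, c));
    printf("a + (b + c) = %g\n", sum_right_to_left(a, b, c));
}
```

Calling print_reassociation_demo() from main shows 1 for the first ordering and 0 for the second: floats near 1e8 are spaced 8 apart, so adding 1.0f to -1e8f rounds straight back to -1e8f. A compiler that is allowed to pick either ordering can therefore legitimately produce either answer.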
The command line parameters I use for C90 programs:
-std=iso9899:199409 -pedantic-errors -Wall -fexpensive-optimizations -O3
-ffloat-store -mcpu=pentiumpro


Try these, and do not use volatile or register unless you need them.
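
One way to narrow down the wrong results under Linux might be to pull the suspect loop into a small harness and compare it against a double-precision reference, on each machine and with each set of flags. What follows is only a sketch under assumptions: the function names (kernel_float, kernel_double, max_abs_diff), the array-index form, the buffer layout, and passing c3 as a parameter are all made up for illustration, since the excerpt in the question elides the surrounding code.

```c
/* Hypothetical stand-in for the loop in the question (names and layout are
   assumptions). it_last must point at the LAST element of a buffer holding
   at least order + 2 floats: each pass reads three consecutive elements
   going backwards and then steps back by one, as the original pointer
   arithmetic (two decrements, one increment) does. */
void kernel_float(const float *it_last, const float *in1, float *c,
                  long order, float c1, float c2, float c3)
{
    long j;
    for (j = 0; j < order; j++) {
        const float *p = it_last - j;
        float acc = p[0] * c1;
        acc += p[-1] * c2;
        acc += p[-2] * c3;
        c[j] += in1[j] * acc;
    }
}

/* Same computation carried out in double, used as a reference result. */
void kernel_double(const float *it_last, const float *in1, double *c,
                   long order, double c1, double c2, double c3)
{
    long j;
    for (j = 0; j < order; j++) {
        const float *p = it_last - j;
        double acc = (double)p[0] * c1;
        acc += (double)p[-1] * c2;
        acc += (double)p[-2] * c3;
        c[j] += (double)in1[j] * acc;
    }
}

/* Largest absolute difference between the float output and the reference. */
double max_abs_diff(const float *got, const double *want, long n)
{
    double worst = 0.0;
    long k;
    for (k = 0; k < n; k++) {
        double d = (double)got[k] - want[k];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Feeding both kernels the same captured input and printing max_abs_diff for each build should show whether the discrepancy appears only with certain flags (for example with -ffast-math but not with -ffloat-store) or only on the P4.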


Regards,

Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #2
