I have an ANSI C program that was compiled with MSVC++ 6.0 (SP6) under Windows
and with the GNU compiler under Linux, and was run on P3, P4 and AMD machines.
It runs fine on the P3 and AMD under both Windows and Linux, but on the P4 it
has problems. Under Windows, a 3GHz P4 runs it at half the speed of an 800MHz
P3... and under Linux it not only runs slower (while the AMD is 40 times
faster), but also produces wrong numerical results...
Any suggestions as to what the problem might be?
How can I fix the P4 speed under MSVC++ (SP6)?
How can I fix the P4's speed and numerical results under Linux?
Here are some more details about the compilation:
GNU:
CFLAGS=-O6 -fexpensive-optimizations -ffast-math -fno-strength-reduce
-funroll-loops -fomit-frame-pointer -Wno-long-long -Wno-unused
Basically, one of the most intensive loops (which we suspect, but aren't sure,
is causing the problem) looks like this:
static long loop_order;

void functionname ()
{
    register float *iPtr, *itPtr, *iPtr1, *cPtr, acc;
    register long j;
    :
    {
        register float c1, c2;
        j = loop_order;
        while (j--)
        {
            acc = *itPtr-- * c1;
            acc += *itPtr-- * c2;
            acc += *itPtr++ * c3;
            *cPtr++ += *iPtr1++ * acc;
        }
    }
    :
}
We have tried removing the "register" keyword and redeclaring "j" as
volatile, with no change.
Thanks,
-- VNG
VNG wrote: [...] We have tried to eliminate the use of the word "register" and redefined "j" as volatile, no change.
Why volatile? Also, -ffast-math sounds like lower floating point precision
than normal.
The command line parameters I use for C90 programs:
-std=iso9899:199409 -pedantic-errors -Wall -fexpensive-optimizations -O3
-ffloat-store -mcpu=pentiumpro
Try this, and do not use volatile and register unless needed.
Regards,
Ioannis Vranos http://www23.brinkster.com/noicys