sqrt() double trouble

copx

I am writing a program which uses sqrt() a lot. Historically this was a very
slow function, but I have read that on CPUs with SSE support it is actually
fast. The problem: C's sqrt() (unlike C++'s sqrt()) is defined to work on
doubles only, and early versions of SSE did not support double precision. The
CPU requirement "something with at least first-generation SSE support"
(everything from P3/Athlon XP and up) is acceptable for me. However, requiring
a CPU which supports double-precision SSE is too much.
Is there any workaround for this problem (other than switching to C++)?

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

Jun 27 '08 #1


Walter Roberson

In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.
--
"The Romans believed that every man had his Genius, and every
woman her Juno." -- Thomas Bulfinch

Jun 27 '08 #2

copx

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...

In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.

Is this a new function of C99 or is it also part of C89?

Jun 27 '08 #3

copx

"copx" <co**@gazeta.pl> wrote in message
news:g2**********@inews.gazeta.pl...

>
"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...
>In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>>>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities
of
the target CPUs.

If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.

Is this a new function of C99 or is it also part of C89?

I have looked this up in 3 different C standard library references and the
result suggests that this is a C99-only thing. Well, at least it seems
someone on the standard committee was aware of the issue.

Jun 27 '08 #4

Tim Prince

copx wrote:

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...
>In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.
If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.

Is this a new function of C99 or is it also part of C89?

C89 reserved the name for this usage, but made its presence optional. For
some versions of MSVC you had to set an option to make it available. In
C99 there's also <tgmath.h>. Apparently, you are among those who don't
consider C99 to be C (do you take K&R 1 as your standard?), but you are a
bit late to be asking for a different way of doing things.

Jun 27 '08 #5

copx

"Tim Prince" <tp*****@nospamcomputer.org> wrote in message
news:cT**************@flpi150.ffdc.sbc.com...
[snip]

C89 reserved the name for this usage, but made its presence optional. For
some versions of MSVC you had to set an option to make it available. In
C99 there's also <tgmath.h>. Apparently, you are among those who don't
consider C99 to be C (do you take K&R 1 as your standard?), but you are a
bit late to be asking for a different way of doing things.

It's a question of portability. C99 is a trainwreck of a standard - at least
on the desktop - because those who matter don't support it. MS owns 90% of
the desktop market, their compiler is the standard for almost all desktop
development - and they do not support C99 nor seem to have the intention to
ever add support for it. Borland (now rather irrelevant I know) does not
support C99 either I think. You are stuck with a hobbyist port of GCC
(MinGW) if you want to develop non-trivial C99 apps on Windows. All the
other alternatives are even worse.
I wish C99 were better supported so that it could actually replace C89,
but so far fully conforming C99 compilers are exotic beasts while C89
compilers are available almost everywhere. That's why C99 is not a
replacement for C89 - unfortunately.
It seems that the language is considered a dead/legacy support thing by most
of the industry. I mean, it is not like C99 is such a complex standard to
implement, right? MS could probably update their compiler to C99 in a month
or Apple/Redhat/IBM etc. could pay some people to fix GCC's last C99
conformance issues in a similar time frame, but there seems to be not enough
interest.

Jun 27 '08 #6

Keith Thompson

"copx" <co**@gazeta.pl> writes:

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...
>In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>>>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.

Is this a new function of C99 or is it also part of C89?

C90 only has sqrt(double). C99 adds sqrtf(float) and sqrtl(long
double) (as well as the type-generic sqrt() macro in <tgmath.h>).

If your implementation doesn't provide sqrtf(), you can probably find
an open source implementation of it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Jun 27 '08 #7

jacob navia

copx wrote:

"copx" <co**@gazeta.pl> wrote in message
news:g2**********@inews.gazeta.pl...
>"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...
>>In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities
of
the target CPUs.
If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.
Is this a new function of C99 or is it also part of C89?

I have looked this up in 3 different C standard library references and the
result suggests that this is a C99-only thing. Well, at least it seems
someone on the standard committee was aware of the issue.

/* sqrtf.c
*
* Square root
*
*
*
* SYNOPSIS:
*
* float x, y, sqrtf();
*
* y = sqrtf( x );
*
*
*
* DESCRIPTION:
*
* Returns the square root of x.
*
* Range reduction involves isolating the power of two of the
* argument and using a polynomial approximation to obtain
* a rough value for the square root. Then Heron's iteration
* is used three times to converge to an accurate value.
*
*
*
* ACCURACY:
*
*
* Relative error:
* arithmetic   domain      # trials      peak       rms
*    IEEE      0,1.e38      100000       8.7e-8     2.9e-8
*
*
* ERROR MESSAGES:
*
*   message         condition      value returned
* sqrtf domain        x < 0            0.0
*
*/

/*
Cephes Math Library Release 2.2: June, 1992
Copyright 1984, 1987, 1988, 1992 by Stephen L. Moshier
Direct inquiries to 30 Frost Street, Cambridge, MA 02140
*/

/* Single precision square root
* test interval: [sqrt(2)/2, sqrt(2)]
* trials: 30000
* peak relative error: 8.8e-8
* rms relative error: 3.3e-8
*
* test interval: [0.01, 100.0]
* trials: 50000
* peak relative error: 8.7e-8
* rms relative error: 3.3e-8
*
* Copyright (C) 1989 by Stephen L. Moshier. All rights reserved.
*/
#include "mconf.h"

#ifdef ANSIC
float frexpf( float, int * );
float ldexpf( float, int );
#else
float frexpf(), ldexpf();
#endif

float sqrtf(float xx)
{
float f, x, y;
int e;

f = xx;
if( f <= 0.0 ) {
if( f < 0.0 )
//mtherr( "sqrtf", DOMAIN );
return( 0.0 );
}

x = frexpf( f, &e ); /* f = x * 2**e, 0.5 <= x < 1.0 */
/* If power of 2 is odd, double x and decrement the power of 2. */
if( e & 1 ) {
x = x + x;
e -= 1;
}

e >>= 1; /* The power of 2 of the square root. */

if( x > 1.41421356237 ) {
/* x is between sqrt(2) and 2. */
x = x - 2.0;
y =
((((( -9.8843065718E-4 * x
+ 7.9479950957E-4) * x
- 3.5890535377E-3) * x
+ 1.1028809744E-2) * x
- 4.4195203560E-2) * x
+ 3.5355338194E-1) * x
+ 1.41421356237E0;
goto sqdon;
}

if( x > 0.707106781187 ) {
/* x is between sqrt(2)/2 and sqrt(2). */
x = x - 1.0;
y =
((((( 1.35199291026E-2 * x
- 2.26657767832E-2) * x
+ 2.78720776889E-2) * x
- 3.89582788321E-2) * x
+ 6.24811144548E-2) * x
- 1.25001503933E-1) * x * x
+ 0.5 * x
+ 1.0;
goto sqdon;
}

/* x is between 0.5 and sqrt(2)/2. */
x = x - 0.5;
y =
((((( -3.9495006054E-1 * x
+ 5.1743034569E-1) * x
- 4.3214437330E-1) * x
+ 3.5310730460E-1) * x
- 3.5354581892E-1) * x
+ 7.0710676017E-1) * x
+ 7.07106781187E-1;

sqdon:
y = ldexpf( y, e ); /* y = y * 2**e */
return( y);
}
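
The header comment above mentions range reduction followed by Heron's iteration. The following is only a sketch of that general idea, not the Cephes code: it uses a crude linear starting guess instead of the polynomial approximations, and the name heron_sqrtf and the iteration count are choices made for this illustration. It assumes C99 frexpf() and ldexpf().

```c
#include <math.h>

/* Sketch: square root by range reduction plus Heron's (Newton's) iteration.
   Returns 0 for zero or negative input, like the code above intends. */
float heron_sqrtf(float v)
{
    int e, i;
    float m, y;

    if (v <= 0.0f) {
        return 0.0f;
    }

    m = frexpf(v, &e);          /* v = m * 2**e, 0.5 <= m < 1.0 */
    if (e & 1) {                /* make the exponent even */
        m = m + m;              /* now 1.0 <= m < 2.0 */
        e -= 1;
    }

    y = 0.5f * (1.0f + m);      /* rough first guess for sqrt(m) */
    for (i = 0; i < 4; i++) {
        y = 0.5f * (y + m / y); /* Heron's step: average y and m/y */
    }

    return ldexpf(y, e / 2);    /* sqrt(v) = sqrt(m) * 2**(e/2) */
}
```

Each Heron step roughly squares the relative error, which is why a handful of iterations is enough at float precision once the argument has been reduced to a narrow range.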
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32

Jun 27 '08 #8

Keith Thompson

Tim Prince <tp*****@nospamcomputer.org> writes:

copx wrote:
>"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in
message news:g2**********@canopus.cc.umanitoba.ca...
>>In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.
If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

You will find this in section 7.12.7.5 of the C99 standard.
Is this a new function of C99 or is it also part of C89?

C89 reserved the name for this usage, but made its presence optional.

Interesting. You're right (C90 7.13.4), but I hadn't realized that.

I wonder why the committee reserved the names but didn't require the
functions to be implemented.

For some versions of MSVC you had to set an option to make it
available. In C99 there's also <tgmath.h>. Apparently, you are
among those who don't consider C99 to be C (do you take K&R 1 as your
standard?), but you are a bit late to be asking for a different way of
doing things.

How does asking whether something is C99-specific imply anything more
than wanting to know whether it's C99-specific?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Jun 27 '08 #9

Martin Ambuhl

copx wrote:

I am writing a program which uses sqrt() a lot. A historically very slow
function but I read on CPUs with SSE support this is actually fast. Problem:
C's sqrt() (unlike C++ sqrt()) is defined to work on doubles only, and early
versions of SSE did not support double precision. The CPU requirement
"something with at least first generation SSE support" (everything from
P3/Athlon XP and up) is acceptable for me. However, requiring a CPU which
supports double precision SSE is too much.
Is there any work around for this problem? (except for switching to C++)

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

In no way does C++'s alleged support for long double arguments to sqrt()
have anything to do with your problem. C++'s long double sqrt() cannot
magically turn your CPU into one with SSE support for long double
sqrt(). All C++ can do for you is substitute a call to some
sqrt_for_long_double() for sqrt() behind your back. The same
functionality is available in C if you get a version of the math library
supporting C99's sqrtl() function, or any of the packages of math
routines for C with long double functionality, such as the Cephes one
<http://www.netlib.org/cephes/ldouble.tgz>.

Jun 27 '08 #10

santosh

copx wrote:

>
"Tim Prince" <tp*****@nospamcomputer.org> wrote in message
news:cT**************@flpi150.ffdc.sbc.com...
[snip]
>C89 reserved the name for this usage, but made its presence optional.
For
some versions of MSVC you had to set an option to make it available.
In
C99 there's also <tgmath.h>. Apparently, you are among those who
don't consider C99 to be C (do you take K&R 1 as your standard?), but
you are a bit late to be asking for a different way of doing things.

It's a question of portability. C99 is a trainwreck of a standard - at
least on the desktop - because those who matter don't support it. MS
owns 90% of the desktop market, their compiler is the standard for
almost all desktop development - and they do not support C99 nor seem
to have the intention to ever add support for it. Borland (now rather
irrelevant I know) does not support C99 either I think. You are stuck
with a hobbyist port of GCC (MinGW) if you want to develop non-trivial
C99 apps on Windows. All the other alternatives are even worse.

What about Intel's compiler? It's a commercial product and seems to
support the vast majority of C99 features, even though their
documentation tells you otherwise.

<snip>

Jun 27 '08 #11

Dan

"copx" <co**@gazeta.pl> wrote in message
news:g2**********@inews.gazeta.pl...

>I am writing a program which uses sqrt() a lot. A historically very slow
function but I read on CPUs with SSE support this is actually fast.
Problem: C's sqrt() (unlike C++ sqrt()) is defined to work on doubles only,
and early versions of SSE did not support double precision. The CPU
requirement "something with at least first generation SSE support"
(everything from P3/Athlon XP and up) is acceptable for me. However,
requiring a CPU which supports double precision SSE is too much.
Is there any work around for this problem? (except for switching to C++)

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

Does anyone have the code for the routine handy? Is it that hard to change
it to float?

Jun 27 '08 #12

Chris Dollin

copx wrote:

I am writing a program which uses sqrt() a lot. A historically very slow
function but I read on CPUs with SSE support this is actually fast. Problem:
C's sqrt() (unlike C++ sqrt()) is defined to work on doubles only, and early
versions of SSE did not support double precision. The CPU requirement
"something with at least first generation SSE support" (everything from
P3/Athlon XP and up) is acceptable for me. However, requiring a CPU which
supports double precision SSE is too much.
Is there any work around for this problem? (except for switching to C++)

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

Doesn't it?

Suppose the compiler noticed the expression `(float) sqrt( aFloat )` (or
moral equivalents) and implemented it with a machine-specific sqrt-float
instruction or millicode. Would that violate the Standard?

If not, then the Standard isn't preventing you from exploiting the
capabilities of the target CPUs -- it's the implementations.

(I'm assuming you have independent measurements that show that sqrt-speed
really is an important performance bottleneck for your application.)

--
"The letter was not unproductive." /Mansfield Park/

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

Jun 27 '08 #13

Ben Bacarisse

jacob navia <ja***@nospam.com> writes:

<snip discussion>

/* sqrtf.c

<snip>

* SYNOPSIS:
*
* float x, y, sqrtf();
*
* y = sqrtf( x );

<snip>

*/

/*
Cephes Math Library Release 2.2: June, 1992
Copyright 1984, 1987, 1988, 1992 by Stephen L. Moshier
Direct inquiries to 30 Frost Street, Cambridge, MA 02140
*/

Modified by you I think.

<snip>

float sqrtf(float xx)
{
float f, x, y;
int e;

f = xx;
if( f <= 0.0 ) {
if( f < 0.0 )
//mtherr( "sqrtf", DOMAIN );
return( 0.0 );
}

Your C99 comment introduces a bug. I hope this is not the code used
in your library. I leave the rest so you can see it assumes that f !=
0.0f from here on down.

x = frexpf( f, &e ); /* f = x * 2**e, 0.5 <= x < 1.0 */
/* If power of 2 is odd, double x and decrement the power of 2. */
if( e & 1 ) {
x = x + x;
e -= 1;
}

e >>= 1; /* The power of 2 of the square root. */

if( x > 1.41421356237 ) {
/* x is between sqrt(2) and 2. */
x = x - 2.0;
y =
((((( -9.8843065718E-4 * x
+ 7.9479950957E-4) * x
- 3.5890535377E-3) * x
+ 1.1028809744E-2) * x
- 4.4195203560E-2) * x
+ 3.5355338194E-1) * x
+ 1.41421356237E0;
goto sqdon;
}

if( x > 0.707106781187 ) {
/* x is between sqrt(2)/2 and sqrt(2). */
x = x - 1.0;
y =
((((( 1.35199291026E-2 * x
- 2.26657767832E-2) * x
+ 2.78720776889E-2) * x
- 3.89582788321E-2) * x
+ 6.24811144548E-2) * x
- 1.25001503933E-1) * x * x
+ 0.5 * x
+ 1.0;
goto sqdon;
}

/* x is between 0.5 and sqrt(2)/2. */
x = x - 0.5;
y =
((((( -3.9495006054E-1 * x
+ 5.1743034569E-1) * x
- 4.3214437330E-1) * x
+ 3.5310730460E-1) * x
- 3.5354581892E-1) * x
+ 7.0710676017E-1) * x
+ 7.07106781187E-1;

sqdon:
y = ldexpf( y, e ); /* y = y * 2**e */
return( y);
}

--
Ben.

Jun 27 '08 #14

Bartc

"jacob navia" <ja***@nospam.com> wrote in message
news:g2**********@aioe.org...

<snip lots of code>

If floating point hardware is available, then using the standard sqrt()
function, even with conversion from and to float, is going to be faster.

--
Bartc

Jun 27 '08 #15

jacob navia

Ben Bacarisse wrote:
[snip]

> if( f < 0.0 )
//mtherr( "sqrtf", DOMAIN );
return( 0.0 );

You are right!!!

Excuse me. Just erase the test and the call to mtherr.

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32

Jun 27 '08 #16

Tim Prince

santosh wrote:

copx wrote:

>"Tim Prince" <tp*****@nospamcomputer.org> wrote in message
news:cT**************@flpi150.ffdc.sbc.com...
[snip]
>>C89 reserved the name for this usage, but made its presence optional.
For
some versions of MSVC you had to set an option to make it available.
In
C99 there's also <tgmath.h>. Apparently, you are among those who
don't consider C99 to be C (do you take K&R 1 as your standard?), but
you are a bit late to be asking for a different way of doing things.
It's a question of portability. C99 is a trainwreck of a standard - at
least on the desktop - because those who matter don't support it. MS
owns 90% of the desktop market, their compiler is the standard for
almost all desktop development - and they do not support C99 nor seem
to have the intention to ever add support for it. Borland (now rather
irrelevant I know) does not support C99 either I think. You are stuck
with a hobbyist port of GCC (MinGW) if you want to develop non-trivial
C99 apps on Windows. All the other alternatives are even worse.

What about Intel's compiler? It's a commercial product and seems to
support the vast majority of C99 features, even though their
documentation tells you otherwise.

<snip>

OP wants to invent his own variation of C, as if that would solve the
problems of not all compiler vendors supporting current C. How would he
persuade MS and Borland to support his personal extensions?
Intel C supports all C99 features which are supported by gcc or MSVC. If
you have use for additional C99 features, you can submit a request and
example of the usefulness to the support site.
There are compilers available which support more or all of C99.
I wouldn't characterize all the good work being done on gcc for Windows as
hobbyist in nature.

Jun 27 '08 #17

Richard Heathfield

Ben Bacarisse said:

jacob navia <ja***@nospam.com> writes:

<snip>

>if( f <= 0.0 ) {
if( f < 0.0 )
//mtherr( "sqrtf", DOMAIN );
return( 0.0 );
}

Your C99 comment introduces a bug.

Those who habitually lay their control constructs out like this:

<control-keyword>(control-expression(s))
{
body;
}

even for single-line bodies, are nowhere near as susceptible to the
accidental introduction of such a bug as those who don't.
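
For readers who did not spot it, the sketch below isolates the effect under discussion. Both functions are made up for this illustration (they are not the posted sqrtf): each returns 1 when the early exit is taken and 0 when control falls through. In the unbraced version, commenting out the call leaves the `return` as the body of the inner `if`.

```c
/* Unbraced inner if: once the call is commented out, the return
   statement becomes the body of "if (f < 0.0f)". */
int early_exit_unbraced(float f)
{
    if (f <= 0.0f) {
        if (f < 0.0f)
            /* report_error();   (hypothetical call, commented out) */
            return 1;
    }
    return 0;   /* f == 0.0f now ends up here */
}

/* Braced inner if: commenting out the call leaves an empty body,
   and the return still covers every f <= 0.0f. */
int early_exit_braced(float f)
{
    if (f <= 0.0f) {
        if (f < 0.0f) {
            /* report_error();   (hypothetical call, commented out) */
        }
        return 1;
    }
    return 0;
}
```

With an argument of 0.0f the unbraced version returns 0 and the braced version returns 1, which mirrors how the posted code falls through to frexpf() for a zero argument.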

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Jun 27 '08 #18

Richard

Richard Heathfield <rj*@see.sig.invalid> writes:

Ben Bacarisse said:

>jacob navia <ja***@nospam.com> writes:

<snip>

>>if( f <= 0.0 ) {
if( f < 0.0 )
//mtherr( "sqrtf", DOMAIN );
return( 0.0 );
}

Your C99 comment introduces a bug.

Those who habitually lay their control constructs out like this:

<control-keyword>(control-expression(s))
{
body;
}

even for single-line bodies, are nowhere near as susceptible to the
accidental introduction of such a bug as those who don't.

I do not believe that for a minute. Especially in the day and age of
language sensitive editors with bracket matching etc.

Jun 27 '08 #19

rio

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...

In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

float sqrtf(float x){float r; double b=x; r = (float) sqrt(b); return r;}

You will find this in section 7.12.7.5 of the C99 standard.
--
"The Romans believed that every man had his Genius, and every
woman her Juno." -- Thomas Bulfinch

Jun 27 '08 #20

user923005

On Jun 5, 10:59 pm, "copx" <c...@gazeta.pl> wrote:

I am writing a program which uses sqrt() a lot. A historically very slow
function but I read on CPUs with SSE support this is actually fast. Problem:
C's sqrt() (unlike C++ sqrt()) is defined to work on doubles only, and early
versions of SSE did not support double precision. The CPU requirement
"something with at least first generation SSE support" (everything from
P3/Athlon XP and up) is acceptable for me. However, requiring a CPU which
supports double precision SSE is too much.
Is there any work around for this problem? (except for switching to C++)

For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

What compiler are you using?
For AMD/INTEL (and many other popular platforms) sqrt() is a single
assembly instruction {given suitable compiler options} that is about
the same speed as a single floating point divide.

Another option is to remove the square root calls. For instance, if
you are minimizing distances, you can simply minimize the square of
the distance.
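
As a sketch of that second suggestion: because the square root is monotonic, comparing squared distances picks the same nearest point as comparing distances, so the search needs no sqrt() call at all. The type point2f and the function names below are invented for this example.

```c
#include <stddef.h>

typedef struct {
    float x, y;
} point2f;

/* Squared Euclidean distance: no square root needed. */
static float dist_sq(point2f a, point2f b)
{
    float dx = a.x - b.x;
    float dy = a.y - b.y;
    return dx * dx + dy * dy;
}

/* Returns the index of the point in pts[0..n-1] closest to q.
   Assumes n >= 1. */
size_t nearest_index(const point2f *pts, size_t n, point2f q)
{
    size_t i, best = 0;
    float best_d = dist_sq(pts[0], q);

    for (i = 1; i < n; i++) {
        float d = dist_sq(pts[i], q);
        if (d < best_d) {       /* same ordering as comparing sqrt(d) */
            best_d = d;
            best = i;
        }
    }
    return best;
}
```

If the actual distance is needed afterwards, a single square root of the winning squared distance is enough.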

Jun 27 '08 #21

Keith Thompson

"rio" <a@b.c> writes:

"Walter Roberson" <ro******@ibd.nrc-cnrc.gc.ca> wrote in message
news:g2**********@canopus.cc.umanitoba.ca...
>In article <g2**********@inews.gazeta.pl>, copx <co**@gazeta.pl> wrote:

>>>For those who will certainly complain that discussing the use of CPU
specific instructions is OT here - I say this question is on topic here
because my problem is caused by the C standard which only features double
precision sqrt() and thus does not allow me to exploit the capabilities of
the target CPUs.

If you have a problem with the C standard and want it changed, then
take the issue up in comp.std.c .

Personally, though, I wouldn't bother doing that: I'd just use
the standard-mandated float square root function,

#include <math.h>

float sqrtf(float x)

float sqrtf(float x){float r; double b=x; r = (float) sqrt(b); return r;}

>You will find this in section 7.12.7.5 of the C99 standard.

You'll find the declaration of sqrtf in that section. You certainly
won't find that implementation.

A simpler implementation would be:

float sqrtf(float x) { return sqrt(x); }

The temporaries and the cast are entirely unnecessary.

However, the standard doesn't mandate any particular implementation
for sqrtf(). On some systems, the above might be best. On others, it
might be faster to do the calculations entirely in type float, without
converting to or from double at all.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Jun 27 '08 #22

Flash Gordon

Richard wrote, On 06/06/08 16:23:

Richard Heathfield <rj*@see.sig.invalid> writes:

>Ben Bacarisse said:

<snip>

>Those who habitually lay their control constructs out like this:

<control-keyword>(control-expression(s))
{
body;
}

even for single-line bodies, are nowhere near as susceptible to the
accidental introduction of such a bug as those who don't.

I do not believe that for a minute. Especially in the day and age of
language sensitive editors with bracket matching etc.

I've seen that kind of error by someone who uses a language sensitive
editor with brace matching etc.
--
Flash Gordon

Jun 27 '08 #23

copx

"user923005" <dc*****@connx.com> wrote in message
news:78**********************************@r66g2000hsg.googlegroups.com...

>What compiler are you using?

Multiple ones. I am only doing open source hobby development, and as a user
I find it incredibly annoying if a project requires a specific compiler to
be built, so I try to avoid that situation in my own work.

>For AMD/INTEL (and many other popular platforms) sqrt() is a single
assembly instruction {given suitable compiler options} that is about
the same speed as a single floating point divide.

Not on all AMD/Intel CPUs. I even described the issue in my OP. sqrt() only
became a simple/fast instruction - when done on floats! - with the
introduction of SSE1 (supported by Athlon XP/P3 or better). Fast sqrt() on
doubles requires at least SSE2 (Pentium 4 or better).
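
For compilers that expose the SSE intrinsics, the single-precision instruction can also be requested explicitly. The sketch below is compiler-specific rather than standard C: it assumes the <xmmintrin.h> header and the __SSE__ macro as provided by GCC-style compilers on x86 (other compilers signal SSE support differently), and it falls back to sqrtf() elsewhere. The function name sse_sqrtf is invented for this example.

```c
#include <math.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE1 intrinsics */
#endif

/* Single-precision square root using the SSE1 scalar instruction
   when available, otherwise the C99 library function. */
float sse_sqrtf(float x)
{
#if defined(__SSE__)
    /* Load x into the low lane, take its square root, read it back. */
    return _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x)));
#else
    return sqrtf(x);
#endif
}
```

This ties the build to particular compilers, which is exactly the trade-off described above for a project meant to build with several of them.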

Jun 27 '08 #24

Bartc

"copx" <co**@gazeta.pl> wrote in message
news:g2**********@inews.gazeta.pl...

>
"user923005" <dc*****@connx.com> wrote in message
news:78**********************************@r66g2000hsg.googlegroups.com...
>>What compiler are you using?

Multiple ones. I am only doing open source hobby development, and as a
user I find it incredibly annoying if a project requires a specific
compiler to be built, so I try to avoid that situation in my own work.

>>For AMD/INTEL (and many other popular platforms) sqrt() is a single
assembly instruction {given suitable compiler options} that is about
the same speed as a single floating point divide.

Not on all AMD/Intel CPUs. I even described the issue in my OP. sqrt()
only became a simple/fast instruction - when done on floats! - with the
introduction of SSE1 (supported by Athlon XP/P3 or better). Fast sqrt() on
doubles requires at least SSE2 (Pentium 4 or better).

What's the problem with the fsqrt instruction? Your sqrt() call will
typically inline to this instruction, together with instructions to load a
float /or/ double.

--
Bartc

Jun 27 '08 #25

copx

"Tim Prince" <tp*****@nospamcomputer.org> wrote in message
news:Zd**************@nlpi069.nbdc.sbc.com...

OP wants to invent his own variation of C

No, I don't. Nor does anything I wrote suggest that.

How would he persuade MS and Borland to support his personal extensions?

Personal extensions?! Square root on floats is part of the current C standard.
Unfortunately that standard is anything but standard in the real world.

Intel C supports all C99 features which are supported by gcc or MSVC. If
you have use for additional C99 features, you can submit a request and
example of the usefulness to the support site.

Yeah, as if they would care. MS does not add a special feature for you
unless you are a Fortune 500 company.
The paid GCC developers only work on what their employers want them to work
on. The volunteers work on what they want. Obviously, neither camp considers
full C99 support to be important.

I wouldn't characterize all the good work being done on gcc for Windows as
hobbyist in nature.

But it is. Nobody who works on the Windows port of GCC gets paid, it's a
pure hobbyist thing.

Jun 27 '08 #26

lawrence.jones

Keith Thompson <ks***@mib.org> wrote:

>I wonder why the committee reserved the names but didn't require the
functions to be implemented.

We wanted to reserve the names since they were already in common usage,
but we didn't think they were sufficiently useful to require every
implementation to provide them, especially since they had no real
benefit on many (if not most) existing platforms. In 1999, they *did*
have real benefit on many existing platforms, so we decided to go ahead
and require them.

-- Larry Jones

I've got to start listening to those quiet, nagging doubts. -- Calvin

Jun 27 '08 #27

user923005

On Jun 6, 5:34 pm, "copx" <c...@gazeta.pl> wrote:

"Tim Prince" <tpri...@nospamcomputer.org> wrote in message news:Zdc2k..40********@nlpi069.nbdc.sbc.com...

OP wants to invent his own variation of C

No, I don't. Nor does anything I wrote suggest that.

How would he persuade MS and Borland to support his personal extensions?

Personal extensions?! Square root on floats is part of the current C standard.
Unfortunately that standard is anything but standard in the real world.

Intel C supports all C99 features which are supported by gcc or MSVC. If
you have use for additional C99 features, you can submit a request and
example of the usefulness to the support site.

Yeah, as if they would care. MS does not add a special feature for you
unless you are a Fortune 500 company.

They added one for me. I asked for line lengths longer than the
standard dictates and they did it.

The paid GCC developers only work on what their employers want them to work
on. The volunteers work on what they want. Obviously, neither camp considers
full C99 support to be important.

I wouldn't characterize all the good work being done on gcc for Windows as
hobbyist in nature.

But it is. Nobody who works on the Windows port of GCC gets paid, it's a
pure hobbyist thing.

I guess that the intrinsic fsqrt instruction is going to be pretty
hard to beat.

#include <stdio.h>
#include <stdlib.h> /* atof() */
#include <math.h>   /* sqrt() */
char string[256];

int main(void)
{
    double d;
    float f;
    puts("Enter a number");
    fgets(string, sizeof string, stdin);
    d = atof(string);
    f = (float) sqrt(d);
    printf("The square root of %f is %f\n", d, f);
    return 0;
}
/* Here is the assembly that was generated:

; Listing generated by Microsoft (R) Optimizing Compiler Version 14.00.50727.762

TITLE c:\tmp\foo.c
.686P
.XMM
include listing.inc
.model flat

INCLUDELIB OLDNAMES

EXTRN __imp____iob_func:PROC
EXTRN __imp__fgets:PROC
EXTRN __imp__printf:PROC
EXTRN __imp__puts:PROC
EXTRN __imp__atof:PROC
COMM _string:BYTE:0100H
$SG-4 DB 'Enter a number', 00H
ORG $+1
$SG-5 DB 'The square root of %f is %f', 0aH, 00H
PUBLIC _main
EXTRN __fltused:DWORD
; Function compile flags: /Ogtpy
; File c:\tmp\foo.c
_TEXT SEGMENT
_main PROC

; 7 : {

00000 55 push ebp
00001 8b ec mov ebp, esp
00003 83 e4 c0 and esp, -64 ; ffffffc0H

; 8 : double d;
; 9 : float f;
; 10 : puts("Enter a number");

00006 68 00 00 00 00 push OFFSET $SG-4
0000b ff 15 00 00 00
00 call DWORD PTR __imp__puts

; 11 : fgets(string, sizeof string, stdin);

00011 ff 15 00 00 00
00 call DWORD PTR __imp____iob_func
00017 50 push eax
00018 68 00 01 00 00 push 256 ; 00000100H
0001d 68 00 00 00 00 push OFFSET _string
00022 ff 15 00 00 00
00 call DWORD PTR __imp__fgets

; 12 : d = atof(string);

00028 68 00 00 00 00 push OFFSET _string
0002d ff 15 00 00 00
00 call DWORD PTR __imp__atof

; 13 : f = (float) sqrt(d);

00033 d9 c0 fld ST(0)
00035 d9 fa fsqrt

; 14 : printf("The square root of %f is %f\n", d, f);

00037 83 c4 04 add esp, 4
0003a dd 5c 24 08 fstp QWORD PTR [esp+8]
0003e dd 1c 24 fstp QWORD PTR [esp]
00041 68 00 00 00 00 push OFFSET $SG-5
00046 ff 15 00 00 00
00 call DWORD PTR __imp__printf
0004c 83 c4 14 add esp, 20 ; 00000014H

; 15 : return 0;

0004f 33 c0 xor eax, eax

; 16 : }

00051 8b e5 mov esp, ebp
00053 5d pop ebp
00054 c3 ret 0
_main ENDP
_TEXT ENDS
END
*/

Jun 27 '08 #28

Dann Corbit

"copx" <co**@gazeta.pl> wrote in message
news:g2**********@inews.gazeta.pl...

>"user923005" <dc*****@connx.com> wrote in message
news:78**********************************@r66g2000hsg.googlegroups.com...
>>What compiler are you using?

Multiple ones. I am only doing open source hobby development, and as a
user I find it incredibly annoying if a project requires a specific
compiler to be built, so I try to avoid that situation in my own work.

>>For AMD/INTEL (and many other popular platforms) sqrt() is a single
assembly instruction {given suitable compiler options} that is about
the same speed as a single floating point divide.

Not on all AMD/Intel CPUs. I even described the issue in my OP. sqrt()
only became a simple/fast instruction - when done on floats! - with the
introduction of SSE1 (supported by Athlon XP/P3 or better). Fast sqrt() on
doubles requires at least SSE2 (Pentium 4 or better).

If you are using the Microsoft compiler then you want /Oi (enable
intrinsics). It works on all CPUs.
If you are using the Intel compiler then you want /O2 (enable intrinsics,
among other things).
I believe that GCC does intrinsics, but I don't remember the flags to use.

What is the calculation you are trying to accelerate? I bet there is an
easy way to do it.

Jun 27 '08 #29

Mark McIntyre

copx wrote:

"user923005" <dc*****@connx.com> wrote in message
news:78**********************************@r66g2000hsg.googlegroups.com...

>For AMD/INTEL (and many other popular platforms) sqrt() is a single
assembly instruction {given suitable compiler options} that is about
the same speed as a single floating point divide.

Not on all AMD/Intel CPUs. I even described the issue in my OP. sqrt() only
became a simple/fast instruction - when done on floats! - with the
introduction of SSE1 (supported by Athlon XP/P3 or better). Fast sqrt() on
doubles requires at least SSE2 (Pentium 4 or better).

Blame your compiler and/or OS then - the ability to execute the fast
version isn't C's fault.

And believe me, if you're micro-optimising to this level, you've
forgotten the three rules. :-)

--
Mark McIntyre

CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Jun 27 '08 #30

Mark McIntyre

copx wrote:

>>I wouldn't characterize all the good work being done on gcc for Windows as
hobbyist in nature.

But it is. Nobody who works on the Windows port of GCC gets paid, it's a
pure hobbyist thing.

You've misunderstood the fundamental difference between hobbyist and
unpaid. I do unpaid work for a charity and also maintain computers for
friends & family. Neither is a hobby.

--
Mark McIntyre

CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Jun 27 '08 #31
