Re: long double precision

James Kanze <ja*********@gmail.comwrites:

On Nov 16, 10:55 pm, "BobR" <removeBadB...@worldnet.att.netwrote:
>Markus Moll wrote in message...
What does MSVC++ say about sizeof(long double) vs sizeof(double)?

>I asked My Second Virtual Cousin (twice added), and he said nothing! <G>

>In Assembler, I used to use eight-byte(dd) and ten-byte(dt) types. That's
not even close to "twice the size" (If we're talking number of bits).
[ assembler == a386 ]

And g++ on a PC, at least in some configurations, uses 8 and 12
bytes.

At the hardware level, there are 10 bytes of information in an
Intel long double.

True, with some additional considerations. The commonly used IEEE 754
floating point formats are

single precision: 32 bits including 1 sign bit, 23 significand bits
(with an implicit leading 1, for 24 total), and 8 exponent bits

double precision: 64 bits including 1 sign bit, 52 significand bits
(with an implicit leading 1, for 24 total), and 11 exponent bits

double extended precision: 80 bits including 1 sign bit, 64 significand
bits (no implicit leading 1), and 15 exponent bits.

The native format of the x87 FPU is "double extended". We run into this
from time to time when floating point computations compile with more
optimization give slightly different results. The issue is that one
optimiziation is to hold intermediate results in 80-bit FPU registers
instead of rounding them down to fit in 64-bit memory locations. gcc
offers the -ffloat-store to suppress this optimization for that very
reason.

In addition, the x87 control register allows one to select between
extended (the default), double and single precision. Try this (on
Linux/gcc/glibc), for example:

#include <fpu_control.h>

void set_double(void)
{
unsigned short cw;
_FPU_GETCW(cw);
cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;
_FPU_SETCW(cw);
}

and similarly for "set_extended". Note that this will make all kinds of
trouble for you because libm depends on having the FPU in extended
precision mode.

Now comes the real kicker: the SSE/SSE2/SSE3 vector co-processors do not
support double extended precision; only single and double precision.
gcc and icc are both using these co-processors pretty extensively now,
so you're not guaranteed to get even intermediate results done in
double-extended arithmetic.

Going back on-topic, if you want to peek at the binary representations
of IEEE floating-point numbers, you might enjoy the template included
below. I supply typedefs for single and double precision, writing the
typedef for double extended precision is left as an exercise to the
reader.

// IEEE Floating-Point template
// Copyright (C) 2007 Charles M. "Chip" Coldwell <co******@gmail.com>

// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.

#ifndef IEEEFLOAT_HH
#define IEEEFLOAT_HH

template
<typename _float_t, typename _sint_t, typename _uint_t, int _mbits, int _ebits>
class ieee_float {
public:
typedef _uint_t uint_t;
typedef _sint_t sint_t;
typedef _float_t float_t;
enum { mbits = _mbits, ebits = _ebits };

#ifdef __BIG_ENDIAN__
uint_t s:1;
uint_t e:ebits;
uint_t m:mbits;
#else
uint_t m:mbits;
uint_t e:ebits;
uint_t s:1;
#endif

static const uint_t mdenom = ((uint_t)1 << mbits);
static const uint_t ebias = ((uint_t)1 << (ebits - 1)) - 1;

sint_t sign(void) const { return 1 - 2*s; }
sint_t exponent(void) const { return e ? e - ebias : -(ebias - 1); }
uint_t mantissa(void) const { return ((uint_t)(!!e) << mbits) | m; }
bool infinity(void) const { return (e == ((1 << ebits) - 1)) && (m == 0); }
bool nan(void) const { return (e == ((1 << ebits) - 1)) && (m != 0); }
bool denormal(void) const { return (e == 0) && (m != 0); }

ieee_float(float_t f) { *reinterpret_cast<float_t *>(this) = f; }
ieee_float(uint_t u) { *reinterpret_cast<uint_t *>(this) = u; }

operator float_t() const { return *reinterpret_cast<const float_t *>(this); }
operator uint_t() const { return *reinterpret_cast<const uint_t *>(this); }
};

typedef ieee_float<float, int, unsigned, 23, 8>
single_precision;
typedef ieee_float<double, long long, unsigned long long, 52, 11>
double_precision;

#endif

Chip

--
Charles M. "Chip" Coldwell
"Turn on, log in, tune out"
GPG Key ID: 852E052F
GPG Key Fingerprint: 77E5 2B51 4907 F08A 7E92 DE80 AFA9 9A8F 852E 052F

Jun 27 '08 #1

Subscribe Post Reply

2532

Similar topics

floats doubles long doubles

by: dan | last post by:

Can someone tell me the difference between these three data types? Thanks Daniel

C / C++

how long is double

by: f | last post by:

I have this double sum, a, b, c; sum = a + b + c; printf("%.20f = %.20f, %.20f, %.20f", sum, a, b, c); I found that the debug version and release version of the same code give me different...

C / C++

storing a long in a float with same precision on Solaris 64bit OS

by: Nishant Deshpande | last post by:

On Solaris 64-bit: sizeof(long) = 8 sizeof(double) = 8 !!! So i figure i can't store a long with the same precision in a double. But, when i create a long and assign its largest value to it,...

C / C++

Scientific Calculation with long double

by: Miguel Morales | last post by:

Hello: Currently, I am trying to solve a problem that requires the calculation of Airy functions for big arguments. Because Airy functions diverge rapidly for positive values, the results of the...

C / C++

Typecast long double->double seems to go wrong

by: Michael Mair | last post by:

Hi there, actually, I have posted the same question in g.g.help. As there were no answers, I am still not sure whether this is a bug or only something open to the compiler that is seemingly...

C / C++

Wintel compilers with: C99 tgmath.h? 16-byte "long double"?

by: bq | last post by:

Hello, Two questions related to floating point support: What C compilers for the wintel (MS Windows + x86) platform are C99 compliant as far as <math.h> and <tgmath.h> are concerned? What...

C / C++

How to use long double floating?

by: Bryan Parkoff | last post by:

The guideline says to use %f in printf() function using the keyword float and double. For example float a = 1.2345; double b = 5.166666667; printf("%.2f\n %f\n", a, b);

C / C++

long double in gcc implementations

by: lcw1964 | last post by:

This may be in the category of bush-league rudimentary, but I am quite perplexed on this and diligent Googling has not provided me with a clear straight answer--perhaps I don't know how to ask the...

C / C++

Long double in Python

by: Charles Vejnar | last post by:

Hi, I have a C library using "long double" numbers. I would like to be able to keep this precision in Python (even if it's not portable) : for the moment I have to cast the "long double" numbers...

Python

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA