single double precision question ....... more

ma740988

I've got an unpacker that unpacks a 32-bit word into three 10-bit samples.
Bits 0 and 1 are don't-cares. For the purposes of performing an FFT and
an inverse FFT, I cast the 10-bit values to doubles. I'm told:

"Floats and doubles have bits for mantissa and exponent. (I knew that.)
If you subtract bits that partly belong to the mantissa and partly
belong to the exponent, that hardly makes sense. Anyway, looking at the bits
only, a float has 32 bits and doesn't behave differently from an integer."

My advisor went on to say:
If I unpacked the samples into, say, an array of floats, it's quite
possible that I could get a left-side value whose bits were
*very* different from the right-hand integer *and* most likely wouldn't
match the integer exactly.

So now:
float[0] = 0x2AA;
or better:
float f = (float)0x2AA;
cout << f << endl;

would show perhaps
682.00000123

Makes absolutely no sense to me. All machines I ran the source on
produced 682.00000000000 (depending, of course, on the precision).

Am I being misled here?

Sep 30 '05 #1

David White

ma******@gmail.com wrote:

> I've got an unpacker that unpacks a 32-bit word into three 10-bit
> samples. Bits 0 and 1 are don't-cares. For the purposes of performing
> an FFT and an inverse FFT, I cast the 10-bit values to doubles.
> I'm told:
>
> "Floats and doubles have bits for mantissa and exponent. (I knew
> that.) If you subtract bits that partly belong to the mantissa and
> partly belong to the exponent, that hardly makes sense. Anyway, looking
> at the bits only, a float has 32 bits and doesn't behave differently
> from an integer."
>
> My advisor went on to say:
> If I unpacked the samples into, say, an array of floats, it's quite
> possible that I could get a left-side value whose bits were
> *very* different from the right-hand integer *and* most likely wouldn't
> match the integer exactly.

I don't know what you mean here by "left side" and "right side".

> So now:
> float[0] = 0x2AA;
> or better:
> float f = (float)0x2AA;
> cout << f << endl;
>
> would show perhaps
> 682.00000123
>
> Makes absolutely no sense to me. All machines I ran the source on
> produced 682.00000000000 (depending, of course, on the precision).

All you are doing is a standard conversion from 0x2AA (682) to a float.
Floats are capable of representing integer values like this one exactly,
and that's what's happening. There's no reason to expect a slight error.

> Am I being misled here?

I don't know, because I'm not sure what you are getting at. If you are
talking about reinterpreting the bit pattern of an integer as a float, then
you could end up with any nonsense value, but if you are just doing a
standard conversion, in which the compiler takes care of correctly mapping
the integer bit pattern to the float bit pattern of the equivalent value,
then you got the expected result.

DW

Sep 30 '05 #2

Jack Klein

On 29 Sep 2005 18:44:51 -0700, ma******@gmail.com wrote in
comp.lang.c++:

> I've got an unpacker that unpacks a 32-bit word into three 10-bit samples.
> Bits 0 and 1 are don't-cares. For the purposes of performing an FFT and
> an inverse FFT, I cast the 10-bit values to doubles. I'm told:

Why are you casting? Values with accessible bits are integer types,
and you can just assign them to doubles. A cast is redundant and has
no effect at all in this case.

> "Floats and doubles have bits for mantissa and exponent. (I knew that.)
> If you subtract bits that partly belong to the mantissa and partly
> belong to the exponent, that hardly makes sense. Anyway, looking at the bits
> only, a float has 32 bits and doesn't behave differently from an integer."

C++ does not say how many bits a float has. It may have 32, it may
have 64, it can't have as few as 6, on a conforming implementation.

> My advisor went on to say:
> If I unpacked the samples into, say, an array of floats, it's quite
> possible that I could get a left-side value whose bits were
> *very* different from the right-hand integer *and* most likely wouldn't
> match the integer exactly.

There is no way to "unpack" bits into an array of floats defined by
the C++ language. You can extract some of the bits from an integer
type into another suitably sized integer type object. Or you can
extract some of the bits and assign them to a floating point type; that
is defined.

> So now:
> float[0] = 0x2AA;

This is a syntax error; you can't have an array with a name identical
to a keyword. So let's assume that there is an array of floats named
'my_float'.

Now given:

my_float[0] = 0x2AA;

...then the integer literal '0x2AA' has type int and the value 682.
You assign this value to a float, which causes the compiler to convert
the value 682 into the float representation for 682.0 and assign it to
the float.

> or better:
> float f = (float)0x2AA;

This is no better; it is worse, because the cast is redundant and it
shows a lack of understanding.

> cout << f << endl;
>
> would show perhaps
> 682.00000123

No it wouldn't, not with the code you posted.

> Makes absolutely no sense to me. All machines I ran the source on
> produced 682.00000000000 (depending, of course, on the precision).
>
> Am I being misled here?

There are two things you are missing here. The first is that your
advisor is discussing ideas that you are not yet ready for, and may
never need.

But the main thing that you are missing here is that in C++
conversions are based on value, not on differences in the internal
bitwise implementation of the data.

Your advisor is talking about the differences in the internal bitwise
implementation of the int value 682 and the float value 682.0. If
programs are designed correctly, very, very few of them will ever need
to be concerned about the difference.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Sep 30 '05 #3

ma740988

Thanks, gents. I have a hard time expressing my ideas sometimes, but
Jack, you pointed out what I was alluding to: internal representation.

|| Your advisor is talking about the differences in the internal bitwise
|| implementation of the int value 682 and the float value 682.0. If
|| programs are designed correctly, very, very few of them will ever need
|| to be concerned about the difference.

Now can you highlight a case where I would be concerned about the
difference? That was my 'real' question, because for some reason I
can't see it.

You see, in my mind and for my case:

my_float[0] = 0x2AA;

The value 682 converted from int to float representation would amount
to 682.0. I can't see how it will be 682.00000000000023 or ...

Sep 30 '05 #4

Christian Meier

<ma******@gmail.com> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...

> Thanks, gents. I have a hard time expressing my ideas sometimes, but
> Jack, you pointed out what I was alluding to: internal representation.
>
> || Your advisor is talking about the differences in the internal bitwise
> || implementation of the int value 682 and the float value 682.0. If
> || programs are designed correctly, very, very few of them will ever need
> || to be concerned about the difference.
>
> Now can you highlight a case where I would be concerned about the
> difference? That was my 'real' question, because for some reason I
> can't see it.
>
> You see, in my mind and for my case:
>
> my_float[0] = 0x2AA;
>
> The value 682 converted from int to float representation would amount
> to 682.0. I can't see how it will be 682.00000000000023 or ...

floating point datatypes have a deviation....

Sep 30 '05 #5

ma740988

Uhm, David and Jack just told me it doesn't. I'll always get the correct
result, i.e. 682.0000000000000000

What deviation are you referring to?

Sep 30 '05 #6

Kai-Uwe Bux

ma******@gmail.com wrote:

> Thanks, gents. I have a hard time expressing my ideas sometimes, but
> Jack, you pointed out what I was alluding to: internal representation.
>
> || Your advisor is talking about the differences in the internal bitwise
> || implementation of the int value 682 and the float value 682.0. If
> || programs are designed correctly, very, very few of them will ever need
> || to be concerned about the difference.
>
> Now can you highlight a case where I would be concerned about the
> difference? That was my 'real' question, because for some reason I
> can't see it.
>
> You see, in my mind and for my case:
>
> my_float[0] = 0x2AA;
>
> The value 682 converted from int to float representation would amount
> to 682.0. I can't see how it will be 682.00000000000023 or ...

It will not. Small integer values have exact representations in float or
double on any decent C++ implementation. (It is true that this is a quality
of implementation issue, as the standard is remarkably shy about giving any
guarantees about floating point arithmetic.) Floating point arithmetic
represents numbers by sign, mantissa, and exponent. Since a float or a
double uses only finitely many bits, only finitely many real numbers are
representable as a float. Usually, we deal with the missing reals by
considering a nearby float approximating them as their representation.
However, as long as the bit length of the mantissa can host your integer, it
can be represented as a float without being just approximated.

Now, as for the bit patterns, they will generally look vastly different.
However, that should not be your concern. The compiler will generate the
code taking care of all necessary bit-shuffling when converting an int to a
float.

Best

Kai-Uwe Bux

Sep 30 '05 #7

David White

<ma******@gmail.com> wrote in message
news:11**********************@g14g2000cwa.googlegroups.com...

> Thanks, gents. I have a hard time expressing my ideas sometimes, but
> Jack, you pointed out what I was alluding to: internal representation.
>
> || Your advisor is talking about the differences in the internal bitwise
> || implementation of the int value 682 and the float value 682.0. If
> || programs are designed correctly, very, very few of them will ever need
> || to be concerned about the difference.
>
> Now can you highlight a case where I would be concerned about the
> difference? That was my 'real' question, because for some reason I
> can't see it.
>
> You see, in my mind and for my case:
>
> my_float[0] = 0x2AA;
>
> The value 682 converted from int to float representation would amount
> to 682.0. I can't see how it will be 682.00000000000023 or ...

I can't highlight a case concerning a direct conversion from integer 682, or
a similar-sized value, to a float, because on any implementation you are
likely to come across you'll get an exact conversion. I can, however,
describe a real case I had recently. I was reading the speed of a vacuum
pump that has a maximum speed of 1500 Hz. I needed to display it to the user
as a percentage of maximum speed using a Number object (our own class) that
was created with limits 0 to 100. I read the Hz value and converted it to a
percentage like this:

userDisplayParam.setValue(speedInHz / 15.f);

The problem was that if you started the program with the pump already at
full speed, it would sometimes take minutes to show anything but the default
zero value for the speed, even though it was being read every second, and it
strangely would never show 100. The highest you ever saw was 99.9.

It turned out that the "division" in the expression above was turned into a
multiplication by the compiler, i.e., instead of speed / 15.f it was doing
speed * (1/15.f), where 1/15.f was pre-calculated by the compiler. This
makes sense because multiplications can be done faster than divisions.
Unfortunately, 1/15.f cannot be represented exactly in a binary float value,
so there was a slight error in the result even when the speed in Hz was
exactly divisible by 15. My assumed exact 100% result for 1500/15 turned out
to be something like 100.0001, which exceeded the limit of the user-display
Number, so it never showed 100% for the speed.

DW

Oct 1 '05 #8
