I'm baffled. I have a column in a SQL Server Express database called
"Longitude," which is a float. When I view the table in a DataGridView,
some of the numbers, which only have two decimal places in the database
show up with *15* decimal places and are ever so slightly off (in the
example in the subject line, by about 2E-15).
I'm not doing any operations on this column. It's just running a stored
procedure which performs a pretty basic SELECT on the table. If I run
the stored procedure in Management Studio Express, all numbers show up
fine (just two decimal places).
What's going on here?
Thanks,
-Dan
Jan 9 '07
130 6157
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
How
you use something doesn't, to my mind, define what it *is*.
By that logic, a computer that hosts applications on a network shouldn't be
called a "server" and a computer that connects to it shouldn't be called a
"client", since they're all "computers" anyway and how you use them is
irrelevant to what they are? Some people might claim that how you use
something might be very closely tied to how we classify it.
32-bit integers I use in .NET will never actually have numbers above
1000 in them, but that doesn't mean that the type itself can't store
anything above 1000 - what it can and can't store is precisely defined
regardless of how I choose to use it.
However when you store 989 in it, you are certain to get back 989. Not 988
or 990 "approximations" of 989.
My concern is that float is being arbitrarily labeled as "approximate"
just because it can't store all values of one particular base, despite
the fact that decimal can't store all values of other bases.
Base 10 is what we use in SQL for floating point numbers, most likely
because no one has introduced a better system. Perhaps you should recommend
Base 3 to ANSI, or recommend that people work out their own binary
representations of floating point values and store them directly using the
BINARY data type?
Yes, base
10 is obviously the most widely used base, but it *is* just a base.
We're lucky that it's a multiple of 2 - otherwise exact float values
wouldn't be exactly representable in decimal...
I thought your argument was that Base was irrelevant? Now you have a bias
for Base 2? What about Base 3?
Any float value is exact in and of itself. If it's an approximation to
some other value it was originally converted from, so be it - that
doesn't, to my mind, make the type itself "approximate".
And in the case of 28.08, the float that is stored is *exactly* the wrong
number.
The value in the float is only approximate equal to the decimal value
which was originally converted, but in itself it is an exact number.
Given a floating point value, there is a well-defined, precise number
that value represents.
And in SQL we tend to concern ourselves with little things like data
integrity; i.e., am I going to get out what I put in? Or am I going to get
back something different?
Frankly, how many *decimals* a float stores is irrelevant to me. What
I'm interested in is whether the value stored in a float can be
regarded as "exact" or merely "approximate". Given a float
representing, say, 1.0101, that is an exact value.
The fact that you can't store many values that are coincident to the data
type exactly without data loss or introducing error (i.e., "approximation")
makes them "approximate".
"Shuurai" <Sh*******@hotmail.comwrote in message
news:11**********************@51g2000cwl.googlegro ups.com...
So your argument is that they store SOME numbers exactly?
From his posts so far his argument seems to boil down to "all data types are
stored as bits which are either *exactly* on or *exactly* off".
Of course by that reasoning you could create a data type that completely
disregards anything you put in and stores something entirely different, but
it would be just as *exact* as any other data type. It would be interesting
(though admittedly pointless) to have a SURPRISE data type where you could
put in "28.08" and retrieve "A suffusion of yellow", "999999.9283746", "c^2
= a^2 + b^2", "3.141592", or "9i + 12"; depending on what the computer
decided to store in it at any given moment. But since bits are exactly
either on or off, this would be another "exact" data type.
Another one of those "It's technically accurate, but completely useless"
type things.
Erland Sommarskog <es****@sommarskog.sewrote:
Jon Skeet [C# MVP] (sk***@pobox.com) writes:
My concern is that float is being arbitrarily labeled as "approximate"
No, it is not arbitrary. It relates to what humans would like to be
able to store, but which is not always possible. The fact that the floating
number actually belongs to the set of decimal numbers is completely
rrelevant when it comes to usage.
In many cases, it's what humans want to store - but not always. What
about if I'm storing some data in the database when it's been computed
(or measured) using binary floating point to start with? At that point,
assuming you've got the right precision, you will lose no data at all.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mike C# <xy*@xyz.comwrote:
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Well, the only way I was able to get the *exact* value out of a float
so that I could then convert it (in C#) into a double was to use
binary. I suspect if I'd used C# to retrieve the value in the first
place (rather than just Query Analyzer) I would have got the right
answer that way - it's just a shame that you can't (as far as I can
tell) get SQL Server itself to tell you the exact value of the float.
Yet even with C# you can't always retrieve the *exact* value you originally
assigned. You retrieve an approximation of that value... You put 28.08
into the bucket and pull out 28.079999923706055, which is not the "right
answer".
You can retrieve the exact value which was actually assigned, but not
the "pre-conversion" value. That's only to be expected with a lossy
conversion.
I wouldn't choose to use the words "exact" or "approximate" in the
first place.
<Message Author="Jon Skeet [C# MVP]">
"No, by my definition pretty much all datatypes are exact."
</Message>
I'm still waiting to hear your definition, but only because you indicated
that you actually have one.
If I were forced into one, I'd say your definition of exact (which I
believe includes float) is pretty reasonable.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Shuurai <Sh*******@hotmail.comwrote:
Either way you render the definition useless. The way it's used in SQL
explains the behavior that can be expected.
... at the cost of making it less clear what's actually going on, IMO.
While making it more clear what can be expected as a result. Those of
us who know how it's working don't need the definition as much as
someone who's just looking to get the same number out that they put in.
Well, I don't think it makes it any clearer, but that's clearly a
matter of opinion.
When you convert 28.02 into an integer you are storing it as 28. Not
an approximation, but exactly 28. The definition describes the
behavior of the data type.
When you convert 28.08 into a float, you are storing it as
28.0799999999999982946974341757595539093017578125. Not an
approximation, but exactly
28.0799999999999982946974341757595539093017578125
And you'll note that 28.0 !=
28.0799999999999982946974341757595539093017578125
(I assume you meant 28.08 rather than 28.0?)
Yes, just as 28 != 28.08 in your integer example. Just as with integer
you are storing exactly 28 (i.e. the closest integer to 28.08 with
appropriate rounding behaviour), with float you're storing exactly the
closest float to 28.08 with the specified rounding behaviour.
Now, that's certainly less intuitive than 28.08 -28, but it's still
an exact number.
Yes, it's an exact number. It's just not the exact number that we
stored.
Well, it's the exact number that you *stored*, just not the exact
number which was the source of the conversion.
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
Exactly - but the definitions are there to describe to a user (like our
OP) who wants to know which data type will give him the same number he
intended to store.
Unfortunately, by doing so it loses sight of the fact that floats store
numbers exactly - they just can't store all decimal numbers exactly.
So your argument is that they store SOME numbers exactly?
All types available only store *some* numbers exactly. Decimal doesn't
store all numbers exactly: "a third" is a number, but it can't be
stored exactly. Be careful to distinguish between "decimal numbers" and
"numbers".
I believe that definitions should be *accurate*, and in my view (I'm
afraid no-one's changed my mind, and I don't expect I've changed anyone
else's mind) the definition *isn't* accurate.
Definitions need to have a balance between usefulness and accuracy.
Your definition, even if technically accurate (which I don't believe it
is) just isn't useful.
Unsurprisingly, I disagree. I don't expect either of us to persuade the
other.
We could have told him that without giving the impression that the
float type is "approximate" though:
"When a decimal number is converted into a FLOAT, information can be
lost. It's a lossy conversion because not all decimal numbers can be
exactly represented as binary floating point numbers. Use the DECIMAL
type instead to keep the exact decimal value."
Which is absolutely correct. The definition is used to describe what
can be expected of the data type. It doesn't address (or need to
address) anything that is going on in the background. The point of the
definition is to tell the user which types will store 28.08 as 28.08.
It's not clear to me what you're regarding as "absolutely correct". The
bit in quotes in my post was my suggested alternative to just saying
that float is an approximate type. If you agree that my text is
correct, and tells the user which type will store 28.08 as 28.08,
what's the disadvantage of using my text?
Or, we could give him the nitty-gritty details of converting base2
numbers to base10, which while certainly interesting isn't all that
helpful.
Does the above go into nitty gritty? Nope - but it doesn't give a
misleading impression either, IMO.
Not misleading; just not particularly useful.
What does it *not* address that would be useful?
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
>Yet even with C# you can't always retrieve the *exact* value you
>originally assigned. You retrieve an approximation of that value... You put 28.08 into the bucket and pull out 28.079999923706055, which is not the "right answer".
You can retrieve the exact value which was actually assigned,
BUT not the value that was assigned *by the user.* And that's the only
value *the user* tends to care about.
A
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
Maybe it's the surprise factor.
Maybe it's that we are used to being able to supply *less* than the number
of acceptable significant digits, and not having the values change on us
after the fact.
Maybe it's because integers are simpler, and the rounding of 28.08 -28 is
much more intuitive, clear, expected, and predictable. This is stuff we all
learned in grade school. I think it takes a much better understanding of
mathematics to draw the same conclusions from the approximation of 28.08 ->
28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
A
Mike C# <xy*@xyz.comwrote:
>
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP***********************@msnews.microsoft.co m...
Frankly, I don't have the energy. I don't know whether you genuinely
don't believe it's possible or whether you really want me to waste my
time on it. There are plenty of descriptions of how floats are stored,
and it's far from difficult to reverse engineer that to work out what
*can* be stored.
Why would it possibly be worth my time writing it up?
<Message Author = "Jon Skeet [C# MVP]">
"Well, in that case I could (somewhat painstakingly) write a similar
description in terms of binary strings which are representable as floats."
</Message>
...
<Message Author = "Jon Skeet [C# MVP]">
"No, by my definition pretty much all datatypes are exact. "
</Message>
...
<Message Author = "Jon Skeet [C# MVP]">
"I'll do that (publish a complete list of valid values for the FLOAT and
REAL types) if you publish a complete list of valid values for
NUMERIC(38,20)."
</Message>
I'm only interested in your definitions and descriptions because you
indicate that you either have them on hand, or are able to produce them. If
you don't have any definitions and/or can't produce what you said you can,
then I'd say the whole issue is fairly well resolved.
Don't be absurd. It would take time to write down all the rules for
which binary strings are exactly representable as float due to
denormalisation etc, but I *hope* that from descriptions of the format
(eg in http://www.yoda.arachsys.com/csharp/floatingpoint.html) that
it's obvious that such a rule is *possible* to write down with effort.
If you accept that it's possible - that there is inherently such a
rule, even though it would take a while to write it out accurately -
then you should accept that float falls within your definition of an
exact type. What good would having the actual rule do?
Now, do you actually believe it's not possible to write down the rule
which says which values are allowed? (Heck, I've given you a program
which would generate them all given long enough - so the theoretical
rule could be "if it appears in this list"; the list clearly exists,
even if it takes a long time to produce.)
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mike C# <xy*@xyz.comwrote:
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
How
you use something doesn't, to my mind, define what it *is*.
By that logic, a computer that hosts applications on a network shouldn't be
called a "server" and a computer that connects to it shouldn't be called a
"client", since they're all "computers" anyway and how you use them is
irrelevant to what they are? Some people might claim that how you use
something might be very closely tied to how we classify it.
That's a fair point, although I'd argue there is a difference here when
you talk about a type *being* approximate.
32-bit integers I use in .NET will never actually have numbers above
1000 in them, but that doesn't mean that the type itself can't store
anything above 1000 - what it can and can't store is precisely defined
regardless of how I choose to use it.
However when you store 989 in it, you are certain to get back 989. Not 988
or 990 "approximations" of 989.
But if you store 989.1 in it, you won't get 989.1 back. No different to
float in that respect.
My concern is that float is being arbitrarily labeled as "approximate"
just because it can't store all values of one particular base, despite
the fact that decimal can't store all values of other bases.
Base 10 is what we use in SQL for floating point numbers, most likely
because no one has introduced a better system. Perhaps you should recommend
Base 3 to ANSI, or recommend that people work out their own binary
representations of floating point values and store them directly using the
BINARY data type?
Well, communicating with SQL server from another program, if I were
storing or retrieving a float I almost certainly *wouldn't* use base 10
- I'd use base 2, as that's the "native" base of float within SQL
server and would almost certainly be the "native" base of my storage
format within the other program, too.
It's only SQL *literals* (and values of type decimal and numeric, of
course) which are explicitly base 10, isn't it?
Yes, base
10 is obviously the most widely used base, but it *is* just a base.
We're lucky that it's a multiple of 2 - otherwise exact float values
wouldn't be exactly representable in decimal...
I thought your argument was that Base was irrelevant? Now you have a bias
for Base 2? What about Base 3?
No, I don't have a bias for base 2 - I'm just saying that the fact that
we *can* represent all floats accurately as decimals comes from the
fact that 10 is divisible by 2. It's always nice when conversions *can*
be lossless.
Any float value is exact in and of itself. If it's an approximation to
some other value it was originally converted from, so be it - that
doesn't, to my mind, make the type itself "approximate".
And in the case of 28.08, the float that is stored is *exactly* the wrong
number.
Yes. Just as if I try to store 28.08 as an integer, I'll be storing
exactly the wrong number (28), and just as if I try to store 1.234567
in a numeric (2,3) I'll be storing exactly the wrong number.
As we've already agreed, you can only expect a type to store numbers
within a particular set of available values. That's true for *all* the
types involved.
The value in the float is only approximate equal to the decimal value
which was originally converted, but in itself it is an exact number.
Given a floating point value, there is a well-defined, precise number
that value represents.
And in SQL we tend to concern ourselves with little things like data
integrity; i.e., am I going to get out what I put in? Or am I going to get
back something different?
Which is why it's important for people to understand the difference
between float and decimal - but they can understand that without
believing that float doesn't store its values exactly. It just has a
different (but still well-defined) set of valid values.
Frankly, how many *decimals* a float stores is irrelevant to me. What
I'm interested in is whether the value stored in a float can be
regarded as "exact" or merely "approximate". Given a float
representing, say, 1.0101, that is an exact value.
The fact that you can't store many values that are coincident to the data
type exactly without data loss or introducing error (i.e., "approximation")
makes them "approximate".
No, you can store *all* values that are coincident to the data type.
You just can't store *any* values which *aren't* coincident to the data
type - which is the same for all other types as well.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraawrote:
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
Maybe it's the surprise factor.
Maybe it's that we are used to being able to supply *less* than the number
of acceptable significant digits, and not having the values change on us
after the fact.
FWIW, I've always thought that describing floating binary point types
as being accurate to a certain number of *decimal* digits is asking for
trouble.
Maybe it's because integers are simpler, and the rounding of 28.08 -28 is
much more intuitive, clear, expected, and predictable. This is stuff we all
learned in grade school. I think it takes a much better understanding of
mathematics to draw the same conclusions from the approximation of 28.08 ->
28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Yes, I dare say it takes a bit more understanding if you want to know
exactly why it stores the values it does.
I would even have been *somewhat* happy with an original answer (when I
started this whole business) of "yes, we know that float is exact
really, but explaining it as approximate is easier than bringing
conversions into the conversation". I would still have argued against
that, *but* the fact is that everyone's been arguing that float really
*is* inherently approximate, and from a mathematical standpoint which
doesn't have a base 10 bias I just can't accept that as being true.
Just because a conversion takes a bit more understanding doesn't mean
the value stored isn't exact.
All the types we've talked about (integer, decimal, float) have the
following in common:
1) They have a well-defined set of values they can exactly represent.
2) If you ask them to store a value which isn't in that set, they will
store (exactly) an approximation to that value. The rules for that
approximation are also well-defined.
3) If you ask them to store a value which *is* in that set, they will
store that value exactly.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
And you'll note that 28.0 !=
28.0799999999999982946974341757595539093017578125
(I assume you meant 28.08 rather than 28.0?)
Yes.
Yes, just as 28 != 28.08 in your integer example. Just as with integer
you are storing exactly 28 (i.e. the closest integer to 28.08 with
appropriate rounding behaviour), with float you're storing exactly the
closest float to 28.08 with the specified rounding behaviour.
I don't want to store the closest anything to 28.08 -- I want to store
28.08. That is why DECIMAL is called an "exact" data type and FLOAT is
called an approximate data type. Listen to what you are saying above.
You spell it out yourself -- you're storing the *closest* thing to what
you intended to store.
As for the rest of your statement about INT, that is the intended
behavior. When you put 28.08 into an INT, you know it's going to be
converted to INT. You know that it's going to be exactly 28 when you
read it, not just "the closest thing" to 28.
Now, that's certainly less intuitive than 28.08 -28, but it's still
an exact number.
Yes, it's an exact number. It's just not the exact number that we
stored.
Well, it's the exact number that you *stored*, just not the exact
number which was the source of the conversion.
You have to understand that we database types are kind of biased
towards knowing what we're storing and how to get it back. Part of
that is knowing that we've got to built the right container for what
we're storing.
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
The difference is that with an INT, I know that if I store a valid
number I'll get that exact same number back, not just the closest
number to it. And when I store 28.08 as an INT, I know that it's going
to be converted to the nearest integer - that's by design.
Exactly - but the definitions are there to describe to a user (like our
OP) who wants to know which data type will give him the same number he
intended to store.
>
Unfortunately, by doing so it loses sight of the fact that floats store
numbers exactly - they just can't store all decimal numbers exactly.
So your argument is that they store SOME numbers exactly?
All types available only store *some* numbers exactly. Decimal doesn't
store all numbers exactly: "a third" is a number, but it can't be
stored exactly. Be careful to distinguish between "decimal numbers" and
"numbers".
DECIMAL will store all decimals within the range you specify, and will
store them exactly as entered. FLOAT will not. In other words, if I
define a column as DECIMAL(4,2) I know that when I store 28.08 it will
remain 28.08, and not just something close. I know that when I store
28.082 as DECIMAL(4,2) it will also be stored as 28.08 -- that is by
design.
I believe that definitions should be *accurate*, and in my view (I'm
afraid no-one's changed my mind, and I don't expect I've changed anyone
else's mind) the definition *isn't* accurate.
Definitions need to have a balance between usefulness and accuracy.
Your definition, even if technically accurate (which I don't believe it
is) just isn't useful.
Unsurprisingly, I disagree. I don't expect either of us to persuade the
other.
What good is an accurate definition if it isn't useful?
We could have told him that without giving the impression that the
float type is "approximate" though:
>
"When a decimal number is converted into a FLOAT, information can be
lost. It's a lossy conversion because not all decimal numbers can be
exactly represented as binary floating point numbers. Use the DECIMAL
type instead to keep the exact decimal value."
Which is absolutely correct. The definition is used to describe what
can be expected of the data type. It doesn't address (or need to
address) anything that is going on in the background. The point of the
definition is to tell the user which types will store 28.08 as 28.08.
It's not clear to me what you're regarding as "absolutely correct". The
bit in quotes in my post was my suggested alternative to just saying
that float is an approximate type. If you agree that my text is
correct, and tells the user which type will store 28.08 as 28.08,
what's the disadvantage of using my text?
Again, read what you wrote... "Use the DECIMAL type instead to keep
the *exact* decimal value."
Or, we could give him the nitty-gritty details of converting base2
numbers to base10, which while certainly interesting isn't all that
helpful.
>
Does the above go into nitty gritty? Nope - but it doesn't give a
misleading impression either, IMO.
Not misleading; just not particularly useful.
What does it *not* address that would be useful?
When the average user ought to use which data type.
Mike C# <xy*@xyz.comwrote:
<snip>
I'm only interested in your definitions and descriptions because you
indicate that you either have them on hand, or are able to produce them. If
you don't have any definitions and/or can't produce what you said you can,
then I'd say the whole issue is fairly well resolved.
As I suspect we'll keep going for a very long time if I don't do this,
here's a first pass at a definition of what binary strings are exactly
representable in a float(53). There *may* be some off-by-one errors -
I've tried a "mock-up" on paper with a floating point type with a 2-bit
mantissa and a 3-bit exponent, so *hopefully* it's correct. There'd be
a similar definition for a 32-bit floating binary point value, but I at
least hope you will be satisfied with one specific type:
A binary number is representable in float(53) if either:
1) (Normal numbers) It has 53 or fewer significant digits, *and* it has
at most 1024 binary digits after the binary point when trailing
insignificant 0s after the binary point are discarded, and it has at
most 1024 binary digits before the binary point when leading
insignificant 0s are discarded.
or
2) (Subnormal numbers) It is of the form 0.xy where x is 1023 zeroes,
and y is up to 52 binary digits (any sequence is acceptable)
3) (Zero) It is 0.
I'm not certain of SQL Server's behaviour with regards to NaN and
infinity - we'd have to decide on whether they count as "numbers" in
the first place to work out whether or not to include them in the
rules. It's easy to do if you want to though.
Right, are you satisfied that float(53) is an exact type by your
definition then?
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Daniel Manes (da******@cox.net) writes:
Thanks, everyone, I think :) At least I can't complain that no one
responded to my question! Anyway, I definitely need to rethink when I
use float/double and when I use decimal. As for when to use numeric,
SQL Server docs says "numeric is functionally equivalent to decimal" so
I'm thinking...never?
I believe the ANSI standards makes a distinction between numeric and
decimal, although it's a bit of a hair-splitting thing. In SQL Server
numeric and decimal are equvivalent.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se
Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx
Shuurai <Sh*******@hotmail.comwrote:
Yes, just as 28 != 28.08 in your integer example. Just as with integer
you are storing exactly 28 (i.e. the closest integer to 28.08 with
appropriate rounding behaviour), with float you're storing exactly the
closest float to 28.08 with the specified rounding behaviour.
I don't want to store the closest anything to 28.08 -- I want to store
28.08. That is why DECIMAL is called an "exact" data type and FLOAT is
called an approximate data type. Listen to what you are saying above.
You spell it out yourself -- you're storing the *closest* thing to what
you intended to store.
As for the rest of your statement about INT, that is the intended
behavior. When you put 28.08 into an INT, you know it's going to be
converted to INT.
If you put 28.08 into a FLOAT, you should jolly well be aware that it's
going to be converted to FLOAT too!
You know that it's going to be exactly 28 when you read it, not just
"the closest thing" to 28.
It's the closest thing to 28.08 though, and 28.08 is the number you
were trying to store, supposedly.
Really, other than awareness of how the conversion takes place (and
being able to spot by eye what value will actually be stored), what's
the difference here?
Now, that's certainly less intuitive than 28.08 -28, but it's still
an exact number.
>
Yes, it's an exact number. It's just not the exact number that we
stored.
Well, it's the exact number that you *stored*, just not the exact
number which was the source of the conversion.
You have to understand that we database types are kind of biased
towards knowing what we're storing and how to get it back. Part of
that is knowing that we've got to built the right container for what
we're storing.
Which is why if you're trying to store exact decimal numbers, you
should indeed use the decimal type. I've never argued against that.
I've argued against the idea that float is approximate, just because
when it's asked to store something it can't store exactly, it stores
the closest thing to it (or whatever the rule for the type is - I'm
thinking of the different rounding mechanisms for int) - just as
decimal would if you could ask it to store "a third" and just as
integer does when you ask it to store 28.08.
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
The difference is that with an INT, I know that if I store a valid
number I'll get that exact same number back, not just the closest
number to it.
That's exactly the same with float though: your idea of "a valid
number" is presumably "an integer", i.e. one of the sets of exact
values that the type can hold. The same is true with float. "28.08"
isn't a valid int, and it isn't a valid float either.
And when I store 28.08 as an INT, I know that it's going
to be converted to the nearest integer - that's by design.
And so you should know that when you store 28.08 as a float, it's going
to be converted to the neareset float - that's by design too.
All types available only store *some* numbers exactly. Decimal doesn't
store all numbers exactly: "a third" is a number, but it can't be
stored exactly. Be careful to distinguish between "decimal numbers" and
"numbers".
DECIMAL will store all decimals within the range you specify, and will
store them exactly as entered. FLOAT will not.
Indeed - the key thing being "all decimals". That's not the same as
"all numbers".
In other words, if I define a column as DECIMAL(4,2) I know that when
I store 28.08 it will remain 28.08, and not just something close. I
know that when I store 28.082 as DECIMAL(4,2) it will also be stored
as 28.08 -- that is by design.
Indeed, just as it's by design that when you ask FLOAT to store a value
which it can't cope with exactly, it will take the nearest value it
*can* cope with.
I believe that definitions should be *accurate*, and in my view (I'm
afraid no-one's changed my mind, and I don't expect I've changed anyone
else's mind) the definition *isn't* accurate.
>
Definitions need to have a balance between usefulness and accuracy.
Your definition, even if technically accurate (which I don't believe it
is) just isn't useful.
Unsurprisingly, I disagree. I don't expect either of us to persuade the
other.
What good is an accurate definition if it isn't useful?
Perhaps I didn't make myself clear: I believe my definition is both
technically accurate *and* useful.
We could have told him that without giving the impression that the
float type is "approximate" though:
"When a decimal number is converted into a FLOAT, information can be
lost. It's a lossy conversion because not all decimal numbers can be
exactly represented as binary floating point numbers. Use the DECIMAL
type instead to keep the exact decimal value."
>
Which is absolutely correct. The definition is used to describe what
can be expected of the data type. It doesn't address (or need to
address) anything that is going on in the background. The point of the
definition is to tell the user which types will store 28.08 as 28.08.
It's not clear to me what you're regarding as "absolutely correct". The
bit in quotes in my post was my suggested alternative to just saying
that float is an approximate type. If you agree that my text is
correct, and tells the user which type will store 28.08 as 28.08,
what's the disadvantage of using my text?
Again, read what you wrote... "Use the DECIMAL type instead to keep
the *exact* decimal value."
Yes - what's wrong with that? It gives practical advice based on the
previous two sentences. It doesn't go against anything I've said
anywhere in the thread.
Or, we could give him the nitty-gritty details of converting base2
numbers to base10, which while certainly interesting isn't all that
helpful.
Does the above go into nitty gritty? Nope - but it doesn't give a
misleading impression either, IMO.
>
Not misleading; just not particularly useful.
What does it *not* address that would be useful?
When the average user ought to use which data type.
Okay, so we could expand slightly it to say when to use FLOAT. It
already says when to use DECIMAL (i.e. when you want to preserve an
exact decimal value).
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP] (sk***@pobox.com) writes:
In many cases, it's what humans want to store - but not always. What
about if I'm storing some data in the database when it's been computed
(or measured) using binary floating point to start with? At that point,
assuming you've got the right precision, you will lose no data at all.
That does not change anything. The float data type is designed to fit
a broad set of approxamite values. In the case of measurements of
physical phenomens, the measurements themselves are often approxamite.
If you are able to find a fringe case where you can actually use float
to represent something exact, that does not change anything.
As a parallel, consider pieces of paper with a number and some complex
pattern on it, and often with a small metal ribbon in it. We call such
paper "money". But in fact, it's perfectly possible to use that paper
to clean your rear parts when you have been to the loo. Nevertheless,
no one get the idea to refer to it as "toilet paper".
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se
Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraawrote:
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
Maybe it's the surprise factor.
Maybe it's that we are used to being able to supply *less* than the number of acceptable significant digits, and not having the values change on us after the fact.
FWIW, I've always thought that describing floating binary point types
as being accurate to a certain number of *decimal* digits is asking for
trouble.
>Maybe it's because integers are simpler, and the rounding of 28.08 -28 is much more intuitive, clear, expected, and predictable. This is stuff we all learned in grade school. I think it takes a much better understanding of mathematics to draw the same conclusions from the approximation of 28.08 -> 28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Yes, I dare say it takes a bit more understanding if you want to know
exactly why it stores the values it does.
I would even have been *somewhat* happy with an original answer (when I
started this whole business) of "yes, we know that float is exact
really, but explaining it as approximate is easier than bringing
conversions into the conversation". I would still have argued against
that, *but* the fact is that everyone's been arguing that float really
*is* inherently approximate, and from a mathematical standpoint which
doesn't have a base 10 bias I just can't accept that as being true.
Just because a conversion takes a bit more understanding doesn't mean
the value stored isn't exact.
All the types we've talked about (integer, decimal, float) have the
following in common:
1) They have a well-defined set of values they can exactly represent.
2) If you ask them to store a value which isn't in that set, they will
store (exactly) an approximation to that value. The rules for that
approximation are also well-defined.
3) If you ask them to store a value which *is* in that set, they will
store that value exactly.
And the difference is: If you store a value in an INTEGER or DECIMAL that
is coincident with that type you do not risk introducing error or data loss.
With REAL or FLOAT you risk introducing error or losing data when storing
values that are coincident with the type.
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP***********************@msnews.microsoft.co m...
>Yet even with C# you can't always retrieve the *exact* value you originally assigned. You retrieve an approximation of that value... You put 28.08 into the bucket and pull out 28.079999923706055, which is not the "right answer".
You can retrieve the exact value which was actually assigned, but not
the "pre-conversion" value. That's only to be expected with a lossy
conversion.
LOFL. Now you want to tell me that when I type in "f = 28.08" in C# or "SET
@f = 28.08" that it is not an "assignment statement"? Are you serious?
Please do tell what you call it.
><Message Author="Jon Skeet [C# MVP]"> "No, by my definition pretty much all datatypes are exact." </Message>
I'm still waiting to hear your definition, but only because you indicated that you actually have one.
If I were forced into one, I'd say your definition of exact (which I
believe includes float) is pretty reasonable.
I have to say I'm pleasantly surprised that you feel my definition is
reasonable! Unfortunately by my definition not all data types are "exact".
It doesn't include FLOAT and REAL, as I'll show you here:
First we'll consider my definition of "exact" types, which you agree is
"pretty reasonable":
"Exact data types are those data types which can store and retrieve single
values coincident with their type without loss of data, and without the
introduction of error."
To further expand, let's consider a definition for "coincident". To
coincide with a data type, a singleton value should meet these requirements:
1) The value must be composed of digits/characters that are defined as
valid, in the proper format defined for the data type,
2) For numeric types, the scale (digits after the decimal point) of the
value needs to be less than or equal to the scale of the data type,
3) For numeric types, the difference between precision (total number of
digits) and scale of the value needs to be less than or equal to the
difference between the precision and scale of the data type,
4) For numeric types, the value must be greater than or equal to the lower
bound defined for the data type, and less than or equal to the upper bound
defined for the data type,
5) For non-numeric types, the length of the value must be less than or
equal to the length of the data type
I think we can agree that by this definition, the singleton value 28.08
coincides with the following data types:
* DECIMAL (4, 2) since it consists of valid characters and the precision and
scale are less than or equal to that defined for the data type,
* FLOAT since it consists of valid characters and the precision and scale
are less than that defined for the data type,
* REAL since it consists of valid characters and the precision and scale are
less than that defined for the data type
Use of the DECIMAL(4, 2) will not cause any loss of data or introduction of
error when storing values that coincide with the type, therefore by my
definition it is an "exact" type. Use of the FLOAT and REAL types can cause
lost data and introduction of error when attempting to store values that
coincide with the type, so by this definition they are not "exact".
Additionally by this definition 28.08 does not coincide with the following
data types:
* TINYINT, SMALLINT, INT, BIGINT since the scale of 2 is larger than that
defined for these data types (0)
* DECIMAL (3, 1) since the scale of 2 for the value is larger than the scale
of 1 defined for this type
Because 28.08 does not coincide with these data types, rounding errors and
data loss are to be expected and have no bearing on the "exactness" of the
data type.
Additionally, the result of the expression 1/3, by this definition, does not
coincide with any data types since it's scale is infinite and the scale for
all of these data types is finite. Therefore rounding errors and/or data
loss from trying to assign the result of this expression (or rather the
finite truncated/rounded result of this expression) to any of these data
types has no bearing on the "exactness" of the data type.
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
><Message Author = "Jon Skeet [C# MVP]"> "Well, in that case I could (somewhat painstakingly) write a similar description in terms of binary strings which are representable as floats." </Message> ... <Message Author = "Jon Skeet [C# MVP]"> "No, by my definition pretty much all datatypes are exact. " </Message> ... <Message Author = "Jon Skeet [C# MVP]"> "I'll do that (publish a complete list of valid values for the FLOAT and REAL types) if you publish a complete list of valid values for NUMERIC(38,20)." </Message>
I'm only interested in your definitions and descriptions because you indicate that you either have them on hand, or are able to produce them. If you don't have any definitions and/or can't produce what you said you can, then I'd say the whole issue is fairly well resolved.
Don't be absurd. It would take time to write down all the rules for
which binary strings are exactly representable as float due to
denormalisation etc, but I *hope* that from descriptions of the format
(eg in http://www.yoda.arachsys.com/csharp/floatingpoint.html) that
it's obvious that such a rule is *possible* to write down with effort.
What I see as absurd is your promise to do exactly what you rail against
above when you never intended to do it; apparently under the assumption that
I either couldn't or wouldn't keep up my end of the bargain (which I did, by
the way). They have a saying down south - don't let your mouth write checks
your @$$ can't cash.
If you accept that it's possible - that there is inherently such a
rule, even though it would take a while to write it out accurately -
then you should accept that float falls within your definition of an
exact type. What good would having the actual rule do?
I'm sure that a stickler like yourself, with your demands for stringent
proofs and strict definitions, will fully understand that I'm under no
obligation to make any assumptions about what you can or cannot do, or what
is or is not possible at all. Just as you are not willing to take anyone
else's arguments at face value without proof, you shouldn't expect others to
take your word as Gospel.
Now, do you actually believe it's not possible to write down the rule
which says which values are allowed? (Heck, I've given you a program
which would generate them all given long enough - so the theoretical
rule could be "if it appears in this list"; the list clearly exists,
even if it takes a long time to produce.)
Like you, I have no reason to believe *anything* until I see it.
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP**********************@msnews.microsoft.com ...
Mike C# <xy*@xyz.comwrote:
<snip>
>I'm only interested in your definitions and descriptions because you indicate that you either have them on hand, or are able to produce them. If you don't have any definitions and/or can't produce what you said you can, then I'd say the whole issue is fairly well resolved.
As I suspect we'll keep going for a very long time if I don't do this,
here's a first pass at a definition of what binary strings are exactly
representable in a float(53). There *may* be some off-by-one errors -
I've tried a "mock-up" on paper with a floating point type with a 2-bit
mantissa and a 3-bit exponent, so *hopefully* it's correct. There'd be
a similar definition for a 32-bit floating binary point value, but I at
least hope you will be satisfied with one specific type:
A binary number is representable in float(53) if either:
1) (Normal numbers) It has 53 or fewer significant digits, *and* it has
at most 1024 binary digits after the binary point when trailing
insignificant 0s after the binary point are discarded, and it has at
most 1024 binary digits before the binary point when leading
insignificant 0s are discarded.
or
2) (Subnormal numbers) It is of the form 0.xy where x is 1023 zeroes,
and y is up to 52 binary digits (any sequence is acceptable)
3) (Zero) It is 0.
I'm not certain of SQL Server's behaviour with regards to NaN and
infinity - we'd have to decide on whether they count as "numbers" in
the first place to work out whether or not to include them in the
rules. It's easy to do if you want to though.
Right, are you satisfied that float(53) is an exact type by your
definition then?
No, because it cannot store some values that coincide with the FLOAT type
without loss of data or introduction of error.
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
>By that logic, a computer that hosts applications on a network shouldn't be called a "server" and a computer that connects to it shouldn't be called a "client", since they're all "computers" anyway and how you use them is irrelevant to what they are? Some people might claim that how you use something might be very closely tied to how we classify it.
That's a fair point, although I'd argue there is a difference here when
you talk about a type *being* approximate.
LOL My first response included something about a knife being classified as a
"murder weapon", but decided to use the computer analogy instead. Figured I
might scare the locals :)
32-bit integers I use in .NET will never actually have numbers above
1000 in them, but that doesn't mean that the type itself can't store
anything above 1000 - what it can and can't store is precisely defined
regardless of how I choose to use it.
However when you store 989 in it, you are certain to get back 989. Not 988 or 990 "approximations" of 989.
But if you store 989.1 in it, you won't get 989.1 back. No different to
float in that respect.
However, 989.1 does not coincide with the INTEGER data type since it's scale
is 1, and the INTEGER data type's scale is 0. 989.1 is well within the
precision and scale limitations of FLOAT, so it does coincide with that
type.
>Base 10 is what we use in SQL for floating point numbers, most likely because no one has introduced a better system. Perhaps you should recommend Base 3 to ANSI, or recommend that people work out their own binary representations of floating point values and store them directly using the BINARY data type?
Well, communicating with SQL server from another program, if I were
storing or retrieving a float I almost certainly *wouldn't* use base 10
- I'd use base 2, as that's the "native" base of float within SQL
server and would almost certainly be the "native" base of my storage
format within the other program, too.
And when storing these numbers in Base-2, are you going to manually convert
them as well? Again what's the point of a FLOAT data type if you have to
create your own operators for it, and do your own conversions to and from
decimal?
It's only SQL *literals* (and values of type decimal and numeric, of
course) which are explicitly base 10, isn't it?
Yes, SQL *literals*, which just happen to be used to represent values of
type DECIMAL, NUMERIC, INT, FLOAT, REAL, etc.
>I thought your argument was that Base was irrelevant? Now you have a bias for Base 2? What about Base 3?
No, I don't have a bias for base 2 - I'm just saying that the fact that
we *can* represent all floats accurately as decimals comes from the
fact that 10 is divisible by 2. It's always nice when conversions *can*
be lossless.
But you can't represent all floats accurately as decimals. Take, for
instance, 28.08...
>And in the case of 28.08, the float that is stored is *exactly* the wrong number.
Yes. Just as if I try to store 28.08 as an integer, I'll be storing
exactly the wrong number (28), and just as if I try to store 1.234567
in a numeric (2,3) I'll be storing exactly the wrong number.
Except that 28.08 has a larger scale (2) than the INTEGER data type (0), so
that assigning 28.08 to it has no bearing on the "exactness" of the data
type. NUMERIC(2, 3) is an invalid language construct, so you'll never be
able to create an column or variable of that type, much less assign a value
to it anyway. 28.08 and 1.234567 are both well within the precision and
scale limits for the FLOAT data type, but can FLOAT accurately store those
values or can it store only the exact wrong number?
As we've already agreed, you can only expect a type to store numbers
within a particular set of available values. That's true for *all* the
types involved.
You can expect exact types to store the set of all values that coincide with
the type. You can expect approximate types like FLOAT to not be able to
store some values from the set of values that coincide with the type.
>And in SQL we tend to concern ourselves with little things like data integrity; i.e., am I going to get out what I put in? Or am I going to get back something different?
Which is why it's important for people to understand the difference
between float and decimal - but they can understand that without
believing that float doesn't store its values exactly. It just has a
different (but still well-defined) set of valid values.
And it can only store approximations of many values that are coincident with
the type.
>The fact that you can't store many values that are coincident to the data type exactly without data loss or introducing error (i.e., "approximation") makes them "approximate".
No, you can store *all* values that are coincident to the data type.
You just can't store *any* values which *aren't* coincident to the data
type - which is the same for all other types as well.
28.08 is coincident with the data type FLOAT, but it cannot be stored in a
FLOAT; only an "approximation" can be stored in it making the data type
"approximate".
Mike C# <xy*@xyz.comwrote:
<snip>
Right, are you satisfied that float(53) is an exact type by your
definition then?
No, because it cannot store some values that coincide with the FLOAT type
without loss of data or introduction of error.
You=3Fve said this a number of times without a shred of proof.
Please provide a value which is conicendent with the float type which
can=3Ft be stored without introduction of error or loss of data.
Hint: 28.08 is *not* coincident with the float type, and there are ways
of storing and retrieving data other than using decimal
representations.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mike C# <xy*@xyz.comwrote:
Yet even with C# you can't always retrieve the *exact* value you
originally
assigned. You retrieve an approximation of that value... You put 28.08
into the bucket and pull out 28.079999923706055, which is not the "right
answer".
You can retrieve the exact value which was actually assigned, but not
the "pre-conversion" value. That's only to be expected with a lossy
conversion.
LOFL. Now you want to tell me that when I type in "f = 28.08" in C# or "SET
@f = 28.08" that it is not an "assignment statement"? Are you serious?
Please do tell what you call it.
The statement contains an assignment and an implicit conversion. The
conversion converts 28.08 into 28.07999(etc), and that=3Fs the value
which ends up being assigned.
If I were forced into one, I'd say your definition of exact (which I
believe includes float) is pretty reasonable.
I have to say I'm pleasantly surprised that you feel my definition is
reasonable! Unfortunately by my definition not all data types are "exact".
It doesn't include FLOAT and REAL, as I'll show you here:
First we'll consider my definition of "exact" types, which you agree is
"pretty reasonable":
"Exact data types are those data types which can store and retrieve single
values coincident with their type without loss of data, and without the
introduction of error."
To further expand, let's consider a definition for "coincident". To
coincide with a data type, a singleton value should meet these requirements:
1) The value must be composed of digits/characters that are defined as
valid, in the proper format defined for the data type,
2) For numeric types, the scale (digits after the decimal point) of the
value needs to be less than or equal to the scale of the data type,
3) For numeric types, the difference between precision (total number of
digits) and scale of the value needs to be less than or equal to the
difference between the precision and scale of the data type,
4) For numeric types, the value must be greater than or equal to the lower
bound defined for the data type, and less than or equal to the upper bound
defined for the data type,
5) For non-numeric types, the length of the value must be less than or
equal to the length of the data type
I think we can agree that by this definition, the singleton value 28.08
coincides with the following data types:
* DECIMAL (4, 2) since it consists of valid characters and the precision and
scale are less than or equal to that defined for the data type,
* FLOAT since it consists of valid characters and the precision and scale
are less than that defined for the data type,
Nope, because they=3Fre not in the proper format defined for the type.
The definition I=3Fve provided for valid values of float (as you
requested) is in terms of a binary string. If you=3Fre going to insist on
decimal representations then *obviously* decimal will =3Fwin=3F.
<snip>
Additionally, the result of the expression 1/3, by this definition, does not
coincide with any data types since it's scale is infinite and the scale for
all of these data types is finite.
It doesn=3Ft coincide with any of the types we=3Fve talked about. 0.1 (base
3) *would* be coicident with a base 3 type.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mike C# <xy*@xyz.comwrote:
<snip>
All the types we've talked about (integer, decimal, float) have the
following in common:
1) They have a well-defined set of values they can exactly represent.
2) If you ask them to store a value which isn't in that set, they will
store (exactly) an approximation to that value. The rules for that
approximation are also well-defined.
3) If you ask them to store a value which *is* in that set, they will
store that value exactly.
And the difference is: If you store a value in an INTEGER or DECIMAL that
is coincident with that type you do not risk introducing error or data loss.
With REAL or FLOAT you risk introducing error or losing data when storing
values that are coincident with the type.
See other replies. By defining =3Fcoincident with the type=3F to require
evaluation from a string of *decimal* digits, you=3Fre shifting the
goalposts from where I at least interpreted them to be.
I had originally understood your =3Fvalues coincident with the type=3F to
be =3Fvalues which can be exactly represented by that type=3F, and that=3Fs
the definition I was considering when I agreed with your definition of
an exact data type. If you look up =3Fcoincident=3F with a Google search of
=3Fdefine: coincident=3F then the definitions given there are a lot closer
to =3Fvalues which can be exactly represented by that type=3F than =3Fworks
with a given length of decimal digits=3F (a simplification of what you
said, but the main thrust).
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Really, other than awareness of how the conversion takes place (and
being able to spot by eye what value will actually be stored), what's
the difference here?
The main difference I see is that when you convert 28.08 to an INT, you are
*losing* data merely by rounding or truncating the input. IMHO, this is a
very easy concept to grasp, since we've been doing it since middle school.
When you convert 28.08 to a FLOAT, a data type most database users don't
fully comprehend, the conversion actually *adds* data. This is a much
tougher concept to grasp, and that is why you see 1800 conversations on
Google about FLOAT conversions, and very few questions about why 28.08 got
changed to 28.
When I get home from work, and I am asked what I did today, do you think I
go on about how I created a partition function against one of our OLTP
tables, which column I chose for the range and why, and the exact deltas in
i/o and throughput that were gained/lost against various workloads? Or do
you think I just say, "I made our database faster."? Since using your
definition of a FLOAT type (which I still don't agree with, but that's
another story), and adding information about base 3 and other base values
that the user couldn't care less about and probably doesn't understand, we
tend to do the same thing here: simplify, without a drawn-out lesson in
mathematical theory. So, I'm going to keep calling FLOAT and REAL
approximate data types, okay?
A
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Mike C# <xy*@xyz.comwrote:
>Yet even with C# you can't always retrieve the *exact* value you originally assigned. You retrieve an approximation of that value... You put 28.08 into the bucket and pull out 28.079999923706055, which is not the "right answer".
You can retrieve the exact value which was actually assigned, but not
the "pre-conversion" value. That's only to be expected with a lossy
conversion.
LOFL. Now you want to tell me that when I type in "f = 28.08" in C# or "SET @f = 28.08" that it is not an "assignment statement"? Are you serious? Please do tell what you call it.
The statement contains an assignment and an implicit conversion. The
conversion converts 28.08 into 28.07999(etc), and that's the value
which ends up being assigned.
As mentioned before, *EVERY* numeric assignment statement contains an
implicit conversion regardless of the type. So what in the world do you
feel is so special about pointing it out for FLOAT as opposed to pointing it
out for DECIMAL or INTEGER? When you assign the INTEGER value 28 to an
INTEGER variable, as in the following example: "i = 28" or "SET @i = 28",
there is an implicit conversion as well.
I'd like to know what's special about the conversion of the literal 28.08 to
a FLOAT as compared to the conversion of the literal 28 to an INTEGER?
What's the difference?
If I were forced into one, I'd say your definition of exact (which I
believe includes float) is pretty reasonable.
I have to say I'm pleasantly surprised that you feel my definition is reasonable! Unfortunately by my definition not all data types are "exact". It doesn't include FLOAT and REAL, as I'll show you here:
First we'll consider my definition of "exact" types, which you agree is "pretty reasonable":
"Exact data types are those data types which can store and retrieve single values coincident with their type without loss of data, and without the introduction of error."
To further expand, let's consider a definition for "coincident". To coincide with a data type, a singleton value should meet these requirements:
1) The value must be composed of digits/characters that are defined as valid, in the proper format defined for the data type, 2) For numeric types, the scale (digits after the decimal point) of the value needs to be less than or equal to the scale of the data type, 3) For numeric types, the difference between precision (total number of digits) and scale of the value needs to be less than or equal to the difference between the precision and scale of the data type, 4) For numeric types, the value must be greater than or equal to the lower bound defined for the data type, and less than or equal to the upper bound defined for the data type, 5) For non-numeric types, the length of the value must be less than or equal to the length of the data type
I think we can agree that by this definition, the singleton value 28.08 coincides with the following data types:
* DECIMAL (4, 2) since it consists of valid characters and the precision and scale are less than or equal to that defined for the data type,
* FLOAT since it consists of valid characters and the precision and scale are less than that defined for the data type,
Nope, because they're not in the proper format defined for the type.
The definition I've provided for valid values of float (as you
requested) is in terms of a binary string. If you're going to insist on
decimal representations then *obviously* decimal will "win".
The SQL Server range for FLOAT and REAL (as defined by BOL at http://msdn2.microsoft.com/en-us/lib...s173773.aspx):
FLOAT: -1.79E+308 to -2.23E-308, 0 and 2.23E-308 to 1.79E+308
REAL: -3.40E + 38 to -1.18E - 38, 0 and 1.18E - 38 to 3.40E + 38
The values for the range are given in decimal numbers.
MSDN (here http://msdn2.microsoft.com/en-us/lib...za(VS.80).aspx)
defines the C# float type as having this range:
float: +/-1.5 x 10 ^ -45 to +/- 3.4 x 10 ^ 38
Values are given by the standard in Base 10.
Further, as I'm sure you're aware, the C# Language Specification (right
here: http://download.microsoft.com/downlo...on%20v1.0.doc).
The SQL specification also has a specific definition for real literals which
is based on decimal digits. The C# 1.0 Language Specification pp. 47 - 48
states the following:
"Real literals are used to write values of types float, double, and
decimal."
The C# 1.0 Language specification provides the following format for Real
literals:
"real-literal:
decimal-digits . decimal-digits exponent-partopt real-type-suffixopt
.. decimal-digits exponent-partopt real-type-suffixopt
decimal-digits exponent-part real-type-suffixopt
decimal-digits real-type-suffix
exponent-part:
e signopt decimal-digits
E signopt decimal-digits
sign: one of
+ -
real-type-suffix: one of
F f D d M m"
This is sufficient proof that the intended format for FLOAT and REAL
literals is decimal; therefore I am insisting on "decimal representations"
since that is what the standards and specifications "insist on". Therefore,
to use your words, "*obviously* decimal will always win" basically because
the standards bodies define floating point literals in terms of decimal
digits.
If you want people to stop using decimal representations for FLOAT and REAL
literals, then lobby Microsoft, ANSI, IEEE and the other relevant standards
bodies. Until then, I think I've provided as precise and accurate a
real-world definition as you're likely to locate.
If you do happen to find a better real-world definition of an "exact" data
type, please feel free to share it with us.
Until then, feel free to try and keep your arguments within the bounds of
the standards and specifications for the tools with which we are concerning
ourselves here.
>Additionally, the result of the expression 1/3, by this definition, does not coincide with any data types since it's scale is infinite and the scale for all of these data types is finite.
It doesn't coincide with any of the types we've talked about. 0.1 (base
3) *would* be coicident with a base 3 type.
As long as you agree that it doesn't coincide with any of the data types
we've talked about. SQL does not contain a Base 3 data type (neither does
C#, C++, VB, yadda yadda), so this is really a non-issue. But when you
finally convince ANSI to add a Base 3 data type to SQL or talk Microsoft
into adding one to their languages, my definition will still cover it. And
1/3 will coincide with that data type *once it is implemented*. But as I
said, back in the real world, 1/3 does not coincide with any existing
numeric data types.
Aaron Bertrand [SQL Server MVP] (te*****@dnartreb.noraa) writes:
This is a much tougher concept to grasp, and that is why you see 1800
conversations on Google about FLOAT conversions, and very few questions
about why 28.08 got changed to 28.
Nah, those questions are just as common, although for a different reason.
And, OK, may not just with the numbers 28.08 and 28. A typical result people
expect is 0.333333.... when they get 0.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se
Books Online for SQL Server 2005 at http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinf...ons/books.mspx
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Mike C# <xy*@xyz.comwrote:
<snip>
Right, are you satisfied that float(53) is an exact type by your
definition then?
No, because it cannot store some values that coincide with the FLOAT type without loss of data or introduction of error.
You've said this a number of times without a shred of proof.
28.08 is the proof.
Please provide a value which is conicendent with the float type which
can't be stored without introduction of error or loss of data.
Hint: 28.08 is *not* coincident with the float type, and there are ways
of storing and retrieving data other than using decimal
representations.
28.08, by my definition which you accepted, does coincide with the FLOAT
data type. By way of proof:
* It contains all valid digits for a FLOAT type literal;
* It is in the proper format for a FLOAT type literal as defined by the
standard;
* It's precision and scale are within the bounds for a FLOAT type literal;
* It is between the lower and upper bounds for a FLOAT type literal
At this point I have proven that 28.08 does coincide with the definition for
a FLOAT literal, including the definition of a floating point literal taken
directly from the C# language specifications (with which I'm sure you're
familiar). You have yet to prove that 28.08 does not coincide with the
definition for a FLOAT type literal; and please provide your references as I
have done.
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Mike C# <xy*@xyz.comwrote:
See other replies. By defining "coincident with the type" to require
evaluation from a string of *decimal* digits, you're shifting the
goalposts from where I at least interpreted them to be.
To use your analogy, the "goalposts" were put in place by the standards
bodies. See my reply which includes the C# definition of a float type
literal. Your interpretation of where the "goalposts" are located is
incorrect according to the standards for the tools with which we are
concerning ourselves.
I had originally understood your "values coincident with the type" to
be "values which can be exactly represented by that type", and that's
the definition I was considering when I agreed with your definition of
If I had meant "values which can be exactly represented by that type", I
would have used those words. If you want the "main thrust" of what I said,
then we can replace the word "coincident" with my definition, so that the
full definition is the following:
"Exact data types are those data types which can store and retrieve single
values which conform to the following rules without loss of data, and
without the
introduction of error:
1) The value must be composed of digits/characters that are defined as
valid, in the proper format defined for the data type,
2) For numeric types, the scale (digits after the decimal point) of the
value needs to be less than or equal to the scale of the data type,
3) For numeric types, the difference between precision (total number of
digits) and scale of the value needs to be less than or equal to the
difference between the precision and scale of the data type,
4) For numeric types, the value must be greater than or equal to the lower
bound defined for the data type, and less than or equal to the upper bound
defined for the data type,
5) For non-numeric types, the length of the value must be less than or
equal to the length of the data type"
an exact data type. If you look up =3Fcoincident=3F with a Google search
of
=3Fdefine: coincident=3F then the definitions given there are a lot closer
to =3Fvalues which can be exactly represented by that type=3F than
=3Fworks
with a given length of decimal digits=3F (a simplification of what you
said, but the main thrust).
In fact, an oversimplification of what I said and not the main thrust at
all. If you want a simplification of what I said with the main thrust, see
above.
As I said, if you can provide a better definition, feel free. Otherwise, I
think this definition is the best you are going to achieve considering where
the standards bodies placed the "goalposts".
<snip>
I'm replying to the original post as this is going to be a summary
post. I've started repeating myself, and I'm sure I'm not the only one.
I don't see any sign that anyone's convincing anyone of anything, and
as I effectively started this mess, I think it makes sense for me to be
the one to end it (if others want it ended). At least, I expect when I
stop posting the thread will probably die fairly soon - if it doesn't,
of course, that's fine. My previous attempt to leave the thread clearly
failed miserably so this time I've set my newsreader to ignore the
thread. In a week I'll "unignore" it to read any extra responses, but
by then I'm sure my sense of netiquette will overwhelm my desire to
respond :) In the unlikely case that anyone *really* wants to hear any
more from me, please mail me.
So, I thought I'd just summarise my position for the sake of anyone
who's interested in the future. Here we go:
1) I believe the correct answer to the OP's question of why 28.08 shows
up as 28.08000(etc)2 is that the conversion from a decimal number to a
binary number (e.g. a float) is one which can lose information. The
result of the conversion is stored exactly, but that result isn't the
same as the original number.
2) I agree with everyone else in terms of what the OP should do: use a
decimal based type.
3) I suspect we all at least mostly agree on the actual technical
aspects of what's happening, but we come from very different points of
view. My point of view tries to be "base-agnostic", viewing a number as
a number regardless of base, whereas the point of view which treats
float as "approximate" places a greater importance on decimal numbers
than other numbers.
4) I believe that the float type can accurately store and retrieve
every value it can accurately represent. This set of values is *not*
every number within the range of the minimum to the maximum values of
the type, just as integer types can't represent every number between
their minimum and maximum values - they can only represent every
*integer* between those values.
5) I believe that the business of which base literals are usually
expressed in is irrelevant to whether or not the values that a type
stores are exact or not. There will always be a conversion involved
from a decimal type to a binary type, and that conversion will always
be "lossy". I believe people should be aware of that, as well as being
aware that the value actually stored is an exact value - just not
necessarily the value they would expect. I also believe it's pretty
much as easy to explain that as to just say that float is an
approximate type. Of course, it's definitely worth pointing out that if
you want a conversion from a decimal literal which is lossless (within
the appropriate range/scale/precision) you should use a decimal type.
6) I agree that the examples I've given about base 3 values are
irrelevant to what will actually happen in real life, but I believe
they lead to a useful thought experiment which shows that float is just
as "exact" as decimal. Every type we've been talking about (float,
integer, decimal) has a set of values it can represent, and when
presented with a value from a type which can represent values not in
the set, the original value may not be representable. (It's not
*necessarily* an information-losing conversion - converting from uint
to int and back in C#, for instance, you can avoid losing data but you
can't always represent the original value.)
I dare say I've missed some points I've made throughout the rest of the
discussion, but never mind. I hope others have found the discussion as
interesting as I have.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
See other replies. By defining =3Fcoincident with the type=3F to require
evaluation from a string of *decimal* digits, you=3Fre shifting the
goalposts from where I at least interpreted them to be.
By your last two posts, the real issue here is more obvious. The real issue
has nothing to do with the definition of "exact" or "approximate" data
types. The real issue is your misinterpretation of where the "goalposts"
are. Fortunately this issue is easily corrected by simply familiarizing
yourself with the standards. Once you understand that C#, SQL, et al., do
not allow Base 3 floating point type literals, etc., it is much easier to
determine where the "goalposts" are.
As you said, "if you're going to insist on decimal representations then
*obviously* decimal will win." As you read the standards, you will learn
that the standards define floating point literals in terms of the decimal
representations. So obviously decimal will win; because that's where the
standard sets that "goalpost".
I've noticed throughout this discussion that you tend to embrace that which
you feel supports your argument - even partial sentences - while completely
ignoring (as opposed to disproving) that which refutes your argument; even
to the point of cutting a sentence in half to remove the proper context! If
that is how you also read official standards, then it is no wonder that your
interpretation of where the "goalposts" are located is inconsistent with
those published standards.
I think Mr. Skeet has an excellent idea in wrapping up this thread. As
such, I'd like to provide an opposing view to the question which Mr. Skeet
implied early on; in a nutshell: "Why are FLOAT and REAL called
'approximate' types while INT, DECIMAL, etc., types called 'exact'?"
I believe that the following simple definition follows the intent of the
standards put forth by ANSI (SQL), Microsoft (C#, VB, etc.) and other
standards bodies in referring to data types based on the IEEE floating
standards as "approximate" and other data types as "exact":
"Exact data types are those data types which can store and retrieve single
values which conform to the following rules without loss of data, and
without the introduction of error:
1) The value must be composed of digits/characters that are defined as
valid, in the proper format defined for the data type,
2) For numeric types, the scale (digits after the decimal point) of the
value needs to be less than or equal to the scale of the data type,
3) For numeric types, the difference between precision (total number of
digits) and scale of the value needs to be less than or equal to the
difference between the precision and scale of the data type,
4) For numeric types, the value must be greater than or equal to the lower
bound defined for the data type, and less than or equal to the upper bound
defined for the data type,
5) For non-numeric types, the length of the value must be less than or
equal to the length of the data type"
This definition follows the intent of the standards bodies, including their
specification of floating point literals as Base 10. This definition also
accounts for truncation and rounding errors in irrational numbers and
infinitely repeating decimal values since their scale is infinitely beyond
what can be accurately represented by any built-in finite SQL, C#, or VB
types.
While Mr. Skeet also expressed an issue with calling FLOAT and REAL data
types "approximate", these labels are entirely accurate, as they succinctly
express the position that when you try to store a numeric literal that
follows the above rules, an "approximation" of that value may be stored
instead of the "exact" value.
Like Mr. Skeet, I too feel this thread has run its course, and also feel
that arguments about items that have no real world applications or relevance
(quite possibly why they were left outside the scope of the adopted
standards: i.e., "Base 3", etc.) has no real bearing on the issues at hand.
This has been an interesting discussion despite having to deal with side
issues that have no basis in reality, and the ignoring of accepted standards
as well as the intent of the standards bodies in defining these types and in
labeling them as "approximate" or "exact". ### This discussion thread is closed Replies have been disabled for this discussion. ### Similar topics
79 posts
views
Thread by Daniel Manes |
last post: by
| | | | | | | | | | |