I'm baffled. I have a column in a SQL Server Express database called
"Longitude," which is a float. When I view the table in a DataGridView,
some of the numbers, which only have two decimal places in the database,
show up with *15* decimal places and are ever so slightly off (in the
example in the subject line, by about 2E-15).
I'm not doing any operations on this column. It's just running a stored
procedure which performs a pretty basic SELECT on the table. If I run
the stored procedure in Management Studio Express, all numbers show up
fine (just two decimal places).
What's going on here?
Thanks,
-Dan
FLOAT is an approximate data type. If you want precision, then use an
appropriate DECIMAL instead of FLOAT.
"Daniel Manes" <da******@cox.netwrote in message
news:11**********************@k58g2000hse.googlegr oups.com...
I'm baffled. I have a column in a SQL Server Express database called
"Longitude," which is a float. When I view the table in a DataGridView,
some of the numbers, which only have two decimal places in the database
show up with *15* decimal places and are ever so slightly off (in the
example in the subject line, by about 2E-15).
I'm not doing any operations on this column. It's just running a stored
procedure which performs a pretty basic SELECT on the table. If I run
the stored procedure in Management Studio Express, all numbers show up
fine (just two decimal places).
What's going on here?
Thanks,
-Dan
Daniel,
Float is an approximate datatype. Look up approximate numeric data in BOL.
-- Bill
"Daniel Manes" <da******@cox.netwrote in message
news:11**********************@k58g2000hse.googlegr oups.com...
I'm baffled. I have a column in a SQL Server Express database called
"Longitude," which is a float. When I view the table in a DataGridView,
some of the numbers, which only have two decimal places in the database
show up with *15* decimal places and are ever so slightly off (in the
example in the subject line, by about 2E-15).
I'm not doing any operations on this column. It's just running a stored
procedure which performs a pretty basic SELECT on the table. If I run
the stored procedure in Management Studio Express, all numbers show up
fine (just two decimal places).
What's going on here?
Thanks,
-Dan
Daniel Manes <da******@cox.net> wrote:
See http://www.pobox.com/~skeet/csharp/floatingpoint.html
For reference, the closest .NET double to 28.08 is exactly
28.0799999999999982946974341757595539093017578125
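That expansion is easy to check yourself. A quick sketch in Python, whose float is the same IEEE 754 double as SQL Server's FLOAT and .NET's System.Double (the Decimal constructor converts the stored binary value with no rounding, so it prints exactly what the double holds):

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no
# rounding, so it reveals the double nearest to the literal 28.08.
exact = Decimal(28.08)
print(exact)
# 28.0799999999999982946974341757595539093017578125
```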
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
FLOAT is an approximate data type. If you want precision, then use an
appropriate DECIMAL instead of FLOAT.
Decimal is as "approximate" as float, in that neither can represent
every possible rational number exactly. They just have different bases
- decimal will represent numbers like 0.1234 exactly, but will be
inaccurate with 1/3 in the same way that float is.
Both are floating point types - float is a floating *binary* point
type, and decimal is a floating *decimal* point type.
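A sketch of that symmetry, using Python's decimal module as a stand-in for a base-10 type (the behaviour, not the SQL syntax, is the point):

```python
from decimal import Decimal

# Base 10 handles 0.1234 exactly...
assert Decimal('0.1234') * 10000 == 1234

# ...but 1/3 has no finite base-10 expansion, so decimal arithmetic
# must round it to the context precision (28 significant digits here).
third = Decimal(1) / Decimal(3)
print(third)  # 0.3333333333333333333333333333
assert third * 3 != 1  # the rounding is observable, just as with float
</antml>```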
Take a look at this discussion:
(Why 9.09 == 9.0899999999999999)
http://www.nnseek.com/e/microsoft.pu...9_0899999999999999_19543198t.html
Regards
--
Cholo Lennon
Bs.As.
ARG
"Daniel Manes" <da******@cox.netwrote in message
news:11**********************@k58g2000hse.googlegr oups.com...
I'm baffled. I have a column in a SQL Server Express database called
"Longitude," which is a float. When I view the table in a DataGridView,
some of the numbers, which only have two decimal places in the database
show up with *15* decimal places and are ever so slightly off (in the
example in the subject line, by about 2E-15).
I'm not doing any operations on this column. It's just running a stored
procedure which performs a pretty basic SELECT on the table. If I run
the stored procedure in Management Studio Express, all numbers show up
fine (just two decimal places).
What's going on here?
Thanks,
-Dan
Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
Decimal is as "approximate" as float, in that neither can represent
every possible rational number exactly. They just have different bases
- decimal will represent numbers like 0.1234 exactly, but will be
inaccurate with 1/3 in the same way that float is.
Both are floating point types - float is a floating *binary* point
type, and decimal is a floating *decimal* point type.
Apologies - some clarification is required here. The above is certainly
true for the C# float/decimal types. After a bit of digging (I don't
have SQL Server help available at the minute) I believe that, for any
particular column with a defined precision and scale, a DECIMAL column
in SQL Server is effectively "fixed point" (i.e. the value itself
doesn't specify where the decimal point is; the column does).
That doesn't mean it's "precise" in a way that FLOAT isn't, it just
alters the storage (and therefore the range and precision available).
The above probably isn't terribly clear, but the bottom line is that a
floating point number is a very precise number - it has an exact value
- but not every number can be exactly represented as a floating point
number (given the base/storage size etc). That's true for fixed point
numbers as well, and it's not the "fixedness" that makes DECIMAL more
appropriate for business calculations, but the fact that it uses base
10 instead of base 2.
What I meant by "not approximate" is that if you put 28.08 in a
decimal(5,2), you are never going to get 28.08000000000000002
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
Jon Skeet [C# MVP] <sk***@pobox.comwrote:
>Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraawrote:
FLOAT is an approximate data type. If you want precision, then use an
appropriate DECIMAL instead of FLOAT.
Decimal is as "approximate" as float, in that neither can represent every possible rational number exactly. They just have different bases - decimal will represent numbers like 0.1234 exactly, but will be inaccurate with 1/3 in the same way that float is.
Both are floating point types - float is a floating *binary* point type, and decimal is a floating *decimal* point type.
Apologies - some clarification is required here. The above is certainly
true for the C# float/decimal types. After a bit of digging (I don't
have SQL Server help available at the minute) I believe that any
particular column with a defined precision and scale, a DECIMAL column
in SQL Server is effectively "fixed point" (i.e. the value itself
doesn't specify where the decimal point is, the column does).
That doesn't mean it's "precise" in a way that FLOAT isn't, it just
alters the storage (and therefore the range and precision available).
The above probably isn't terribly clear, but the bottom line is that a
floating point number is a very precise number - it has an exact value
- but not every number can be exactly represented as a floating point
number (given the base/storage size etc). That's true for fixed point
numbers as well, and it's not the "fixedness" that makes DECIMAL more
appropriate for business calculations, but the fact that it uses base
10 instead of base 2.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
What I meant by "not approximate" is that if you put 28.08 in a
decimal(5,2), you are never going to get 28.08000000000000002
True. If you ask SQL Server to divide 1 by 3, however, you certainly
*won't* get an exact answer in a DECIMAL (or FLOAT, admittedly).
The reason I'm bothering to make the distinction is that there's a
widespread myth that decimal types are "exact" in a way that floating
binary point types aren't - it's just down to people having a natural
bias to base 10. If you consider numbers in base 3 (or 7, or 11, etc)
instead, DECIMAL is just as bad as FLOAT.
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
>What I meant by "not approximate" is that if you put 28.08 in a decimal(5,2), you are never going to get 28.08000000000000002
True. If you ask SQL server to divide 1 by 3, however, you certainly
*won't* get an exact answer in a DECIMAL though (or FLOAT, admittedly).
The reason I'm bothering to make the distinction is that there's a
widespread myth that decimal types are "exact" in a way that floating
binary point types aren't -
I think that most of us recognize the difference between SET @foo = 0.33 and
SET @foo = 1.0/3 ... in the former, we're choosing to limit the "exact"
nature of the result, and if we choose to store it in a DECIMAL as opposed
to a FLOAT, we know we're going to get 0.33 every time...
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
I think that most of us recognize the difference between SET @foo = 0.33 and
SET @foo = 1.0/3 ... in the former, we're choosing to limit the "exact"
nature of the result, and if we choose to store it in a DECIMAL as opposed
to a FLOAT, we know we're going to get 0.33 every time...
But if you were to express floating binary point numbers in a binary
format (e.g. 0.01101) then you'd get the exact same result back when you
reformatted it as binary. The nature of the literal formats available
doesn't alter whether or not a data type is essentially approximate or
not.
The "approximate" nature of binary floating point isn't because it's
floating point, and isn't changed by using a decimal format. It's just
the inability to exactly represent all *decimal* numbers, which isn't
the same thing as all numbers.
I personally believe that's *not* well understood - hence people (e.g.
3 different people in this thread) referring to floating binary point
numbers as "approximate" when they're not at all - they're exact
numbers which may only be approximately equal to the number you
originally tried to set them to.
As an example of what I mean, consider integer types. Are they
"approximate"? No - they represent exact numbers. However, if you do
(in C#):
int x = (int) 10.3;
The value of x won't be 10.3, it will be 10. 10 is an exact number, but
the conversion from 10.3 was an approximation. The same is true if you
have a float and do
SET @my_float = 0.33 (SQL)
or
float myFloat = 0.33f; (C#)
my_float represents an exact number, but the conversion from the
literal "0.33" to floating binary point is an approximating one. The
differences between the floating binary point conversion and the
conversion to an integer earlier are:
1) The integer conversion is explicit, highlighting that data may be
lost
2) When reformatted as a decimal value, an integer is always "tidy"
whereas a float often isn't (as evidenced by the subject line of this
thread)
I don't see either of those as reasons to describe floating binary
point numbers as "approximate" when integers aren't described in that
way.
Given a floating point value, I can tell you *exactly* what that number
is as a decimal expansion. In what sense is the value (or the type)
"approximate"? It's important (IMO) not to be blinded by the bias of
the fact that humans tend to represent numbers as decimals.
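That "exact value" claim is easy to demonstrate. A sketch in Python, whose float is an IEEE 754 double like the C# double under discussion (the variable names here just echo the examples above):

```python
from decimal import Decimal

# The conversion of the literal "0.33" to a double is the approximating
# step; the stored result is an exact rational with a power-of-two
# denominator, and its decimal expansion can be written out in full.
my_float = 0.33
num, den = my_float.as_integer_ratio()  # the exact fraction the double holds
exact = Decimal(my_float)
print(exact)  # starts 0.33000000000000001...
assert num / den == my_float  # the fraction round-trips exactly
```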
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP***********************@msnews.microsoft.co m...
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraawrote:
>What I meant by "not approximate" is that if you put 28.08 in a decimal(5,2), you are never going to get 28.08000000000000002
True. If you ask SQL server to divide 1 by 3, however, you certainly
*won't* get an exact answer in a DECIMAL though (or FLOAT, admittedly).
That's a shortcoming of numbers in general, not of SQL Server's
representation of numeric values. T-SQL defines real and float as
"approximate" data types because using the IEEE standard binary
representation not all numbers in the range of valid values can be stored in
an exact representation. Or, as Aaron pointed out, if you put in "28.08"
you can get back "28.08000000000000002".
http://msdn2.microsoft.com/en-us/library/ms173773.aspx
Decimal and numeric are "exact" data types because they can represent all
the numbers in their range of valid values in an exact representation.
I.e., "28.08" is stored exactly as 28.08.
The reason I'm bothering to make the distinction is that there's a
widespread myth that decimal types are "exact" in a way that floating
binary point types aren't - it's just down to people having a natural
bias to base 10. If you consider numbers in base 3 (or 7, or 11, etc)
instead, DECIMAL is just as bad as FLOAT.
As pointed out, decimal types are "exact" in that they can "exactly"
represent every single number in their range of valid values. IEEE standard
floating point binary types are "approximate" because they cannot.
Mike C# <xy*@xyz.com> wrote:
True. If you ask SQL server to divide 1 by 3, however, you certainly
*won't* get an exact answer in a DECIMAL though (or FLOAT, admittedly).
That's a shortcoming of numbers in general, not of SQL Server's
representation of numeric values.
No, it's a shortcoming of storing numbers as sequences of digits,
whatever the base. You could represent all rationals as one integer
divided by another integer, and always have the exact value of the
rational (assuming arbitrary length integers) - until you started to
perform operations like square roots, of course.
T-SQL defines real and float as
"approximate" data types because using the IEEE standard binary
representation not all numbers in the range of valid values can be stored in
an exact representation.
The same is true for decimal, however: not all numbers in the range of
valid values can be stored in an exact representation. All *decimal*
values can.
Or, as Aaron pointed out, if you put in "28.08"
you can get back "28.08000000000000002".
Just as if you had a way of representing base 3 literals and you put in
".1" for a decimal you wouldn't (necessarily, at least) get ".1" back
when you reformatted to base 3 - because information would have been
lost.
http://msdn2.microsoft.com/en-us/library/ms173773.aspx
I disagree with MSDN as well. It's not the first time, including for
floating point types - I had to battle with people at MSDN for them to
correct the description of the .NET Decimal type, which was originally
documented as a fixed point type, despite it clearly not being so.
Fortunately that's now fixed...
Decimal and numeric are "exact" data types because they can represent all
the numbers in their range of valid values in an exact representation.
I.e., "28.08" is stored exactly as 28.08.
No, they can't, any more than binary floating point types can. They can
only store numbers to a certain *decimal* precision, in the same way
that floating binary point types can only store numbers to a certain
*binary* precision.
As an example, the real (and even rational) number obtained by dividing
1 by 3 is within the range of decimal (as expressed in the MSDN page
for decimal: "valid values are from - 10^38 +1 through 10^38 - 1"), but
*cannot* be stored exactly.
The reason I'm bothering to make the distinction is that there's a
widespread myth that decimal types are "exact" in a way that floating
binary point types aren't - it's just down to people having a natural
bias to base 10. If you consider numbers in base 3 (or 7, or 11, etc)
instead, DECIMAL is just as bad as FLOAT.
As pointed out, decimal types are "exact" in that they can "exactly"
represent every single number in their range of valid values. IEEE standard
floating point binary types are "approximate" because they cannot.
You're showing a bias to decimal numbers. There are perfectly valid
reasons to have that bias and to have that decimal type, but you should
acknowledge that the bias is there.
1/3 is just as "real" a number as 1/10 - and yet you ignore the fact
that it can't be exactly represented by a decimal, even though it's in
the range. Why should the failure to represent 1/10 accurately make a
data type be regarded as "approximate" when the failure to represent
1/3 doesn't? From a mathematical point of view, there's very little
difference.
From a *business* point of view, there's a large difference in terms of
how useful the types are and for what operations, but that doesn't
affect whether or not a data type should be regarded as fundamentally
"exact" or not, IMO.
Look at it from the raw data point of view: both types essentially
represent a long sequence of digits and a point somewhere within the
sequence. You could choose *any* base - it so happens that the common
ones used are 2 and 10.
Now, from that definition, and disregarding the fact that humans have
10 fingers, what's the difference between a sequence of bits with a
point in and a sequence of decimal digits with a point in that makes
you say that the first is approximate but the second is exact?
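One way to make that question concrete: a fraction has a finite expansion in base b exactly when its reduced denominator's prime factors all divide b. A sketch (the helper below is mine, not from the thread):

```python
from fractions import Fraction
from math import gcd

def has_finite_expansion(q: Fraction, base: int) -> bool:
    """True if q needs only finitely many digits after the point in the
    given base, i.e. the reduced denominator divides a power of the base."""
    d = q.denominator
    while (g := gcd(d, base)) > 1:
        while d % g == 0:
            d //= g
    return d == 1

# 1/10 is "tidy" in base 10 but not in base 2 - the root of this thread.
assert has_finite_expansion(Fraction(1, 10), 10)
assert not has_finite_expansion(Fraction(1, 10), 2)
# 1/3 is not tidy in base 10, but is in base 3: neither base is special.
assert not has_finite_expansion(Fraction(1, 3), 10)
assert has_finite_expansion(Fraction(1, 3), 3)
```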
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
The "approximate" nature of binary floating point isn't because it's
floating point, and isn't changed by using a decimal format. It's just
the inability to exactly represent all *decimal* numbers, which isn't
the same thing as all numbers.
That seems like hair-splitting, for real.
I personally believe that's *not* well understood - hence people (e.g.
3 different people in this thread) referring to floating binary point
numbers as "approximate" when they're not at all - they're exact
numbers which may only be approximately equal to the number you
originally tried to set them to.
People in this newsgroup refer to binary floating point numbers as
"approximate" because that's what the ANSI SQL standard calls them.
"The data types NUMERIC, DECIMAL, INTEGER, and SMALLINT are collectively
referred to as exact numeric types. The data types FLOAT, REAL, and DOUBLE
PRECISION are collectively referred to as approximate numeric types. Exact
numeric types and approximate numeric types are collectively referred to as
numeric types. Values of numeric type are referred to as numbers." -- ANSI
SQL-92
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.c om...
No, it's a shortcoming of storing numbers as sequences of digits,
whatever the base. You could represent all rationals as one integer
divided by another integer, and always have the exact value of the
rational (assuming arbitrary length integers) - until you started to
perform operations like square roots, of course.
And what about irrational numbers?
The same is true for decimal, however: not all numbers in the range of
valid values can be stored in an exact representation. All *decimal*
values can
The SQL Decimal and Numeric data types are designed to store every single
last number in the range of valid values in an *EXACT* representation.
>Or, as Aaron pointed out, if you put in "28.08" you can get back "28.08000000000000002".
Just as if you had a way of representing base 3 literals and you put in
".1" for a decimal you wouldn't (necessarily, at least) get ".1" back
when you reformatted to base 3 - because information would have been
lost.
Are we talking about information loss from the simple act of converting from
one base to another, which is what the Decimal and Numeric types address; or
are you talking about information loss from performing operations on data
which goes back to your 1/3 example earlier?
>http://msdn2.microsoft.com/en-us/library/ms173773.aspx
I disagree with MSDN as well. It's not the first time, including for
floating point types - I had to battle with people at MSDN for them to
correct the description of the .NET Decimal type, which was originally
documented as a fixed point type, despite it clearly not being so.
Fortunately that's now fixed...
You need to present your arguments about this to ANSI, since these are the
definitions they put forth.
>Decimal and numeric are "exact" data types because they can represent all the numbers in their range of valid values in an exact representation. I.e., "28.08" is stored exactly as 28.08.
No, they can't, any more than binary floating point types can. They can
only store numbers to a certain *decimal* precision, in the same way
that floating binary point types can only store numbers to a certain
*binary* precision.
They can store *every single valid value in their defined set of values* in
an exact representation. With Numeric and Decimal types, the precision and
scale that you specify define the valid set of values.
As an example, the real (and even rational) number obtained by dividing
1 by 3 is within the range of decimal (as expressed in the MSDN page
for decimal: "valid values are from - 10^38 +1 through 10^38 - 1"), but
*cannot* stored exactly.
You might want to re-read that, because you seem to have picked out just
enough of a single sentence to support your argument, but completely ignored
the part that doesn't. Here's the full sentence for you (the emphasis is
mine):
"WHEN MAXIMUM PRECISION IS USED, valid values are from - 10^38 +1 through
10^38 - 1."
That also defines a default scale of "0". If, however, you define a
different precision and/or scale, the valid set of values changes.
>As pointed out, decimal types are "exact" in that they can "exactly" represent every single number in their range of valid values. IEEE standard floating point binary types are "approximate" because they cannot.
You're showing a bias to decimal numbers. There are perfectly valid
reasons to have that bias and to have that decimal type, but you should
acknowledge that the bias is there.
Of course I'm biased to decimal numbers, as are most of the people here who
were born with 10 fingers and 10 toes. Now let's talk about your bias
against irrational numbers, since you seem to be tip-toeing all around them
here.
1/3 is just as "real" a number as 1/10 - and yet you ignore the fact
that it can't be exactly represented by a decimal, even though it's in
the range. Why should the failure to represent 1/10 accurately make a
data type be regarded as "approximate" when the failure to represent
1/3 doesn't? From a mathematical point of view, there's very little
difference.
There is no "fraction" data type in SQL, which you might consider a
shortcoming of the language. There is also no "fraction" data type built
into C#, C++, or most other languages. The reason for that might be a huge
bias or conspiracy against vulgar fractions, or it might be the lack of a
useful vinculum in the original ASCII character set, or it might just be
that most people don't find it all that useful.
Regardless of the reason, 1/10 is not a repeating decimal in decimal (yes,
I'm showing my "bias" for decimal again). It has a finite representation in
decimal. 1/3 is a repeating decimal in decimal. Since its decimal
representation repeats infinitely, it's impossible to store it exactly
anywhere. However, with regards to Decimal and Numeric data types in SQL,
"0.33333" is not a member of the set of valid values for a DECIMAL (10, 2)
or a NUMERIC (10, 3) [for instance].
From a *business* point of view, there's a large difference in terms of
how useful the types are and for what operations, but that doesn't
affect whether or not a data type should be regarded as fundamentally
"exact" or not, IMO.
The definitions of SQL DECIMAL and NUMERIC specifically allow you to define
the precision and scale. By doing so you define the number of total digits
and the number of digits after the decimal point. Any numbers that are in
the set of valid values for the precision and scale that you define will be
stored *exactly*. Any numbers that are not in the set of valid values for
that precision and scale will not be stored exactly. SQL REAL and FLOAT
types do not come with this guarantee.
Look at it from the raw data point of view: both types essentially
represent a long sequence of digits and a point somewhere within the
sequence. You could choose *any* base - it so happens that the common
ones used are 2 and 10.
And there happen to be very good reasons for that...
Now, from that definition, and disregarding the fact that humans have
10 fingers, what's the difference between a sequence of bits with a
point in and a sequence of decimal digits with a point in that makes
you say that the first is approximate but the second is exact?
Quite simply because with NUMERIC and DECIMAL types I can define the
precision and scale, and every valid value in that range will be stored
*EXACTLY*. REAL and FLOAT come with pre-defined ranges, but have no
guarantees concerning the exactness with which your numeric data will be
stored. As we saw, even a number as simple as 28.08 is stored as an
approximation of 28.08 (or 2808/100 if you prefer your fractions), whereas
with a NUMERIC (10, 2) it would be stored as -- well, "28.08" *EXACTLY*.
You're showing a bias to decimal numbers. There are perfectly valid
reasons to have that bias and to have that decimal type, but you should
acknowledge that the bias is there.
Well, yeah, Sherlock.
The OP passed 28.08 into the database. When he asked for the value back, he
got a different answer. He asked how to fix it, so I guess, yeah, our bias
in this case is toward using DECIMAL. Where, regardless of what fantasy
world you're converting to base 3 et al., when you input 28.08 and then ask
for the value back, you're still going to get 28.08. This isn't about math
theory or MSDN or base 3 or 1/3 or some religious war about whether you
think the standards' definition of "approximate" is acceptable to you. It's
about storing a known value (not an expression, like 1/3) with a fixed
number of decimal places, and getting the same number back, every time, in a
deterministic fashion.
A
Mike C# <xy*@xyz.com> wrote:
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote:
The "approximate" nature of binary floating point isn't because it's
floating point, and isn't changed by using a decimal format. It's just
the inability to exactly represent all *decimal* numbers, which isn't
the same thing as all numbers.
That seems like hair-splitting, for real.
I see nothing hair-splitting about recognising that 1/3 is a number.
I personally believe that's *not* well understood - hence people (e.g.
3 different people in this thread) referring to floating binary point
numbers as "approximate" when they're not at all - they're exact
numbers which may only be approximately equal to the number you
originally tried to set them to.
People in this newsgroup refer to binary floating point numbers as
"approximate" because that's what the ANSI SQL standard calls them.
You may notice that there are multiple newsgroups involved here...
"The data types NUMERIC, DECIMAL, INTEGER, and SMALLINT are collectively
referred to as exact numeric types. The data types FLOAT, REAL, and DOUBLE
PRECISION are collectively referred to as approximate numeric types. Exact
numeric types and approximate numeric types are collectively referred to as
numeric types. Values of numeric type are referred to as numbers." -- ANSI
SQL-92
Just because ANSI says something doesn't mean it's right, IMO.
Mike C# <xy*@xyz.com> wrote:
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote:
No, it's a shortcoming of storing numbers as sequences of digits,
whatever the base. You could represent all rationals as one integer
divided by another integer, and always have the exact value of the
rational (assuming arbitrary length integers) - until you started to
perform operations like square roots, of course.
And what about irrational numbers?
That's why I said you couldn't cope when square roots are involved.
Basically we don't *have* a good way of exactly representing numbers,
unless you want to keep everything at a symbolic level.
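Python's fractions module is one off-the-shelf example of that integer-over-integer representation; a sketch showing that exactness survives rational arithmetic but not irrational operations:

```python
from fractions import Fraction
import math

# Stored as one integer over another, 1/3 is held exactly - no base,
# no rounding.
third = Fraction(1, 3)
assert third * 3 == 1  # exact, unlike FLOAT or DECIMAL arithmetic

# Rational arithmetic stays exact with arbitrary-length integers...
assert Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)

# ...until an operation like square root forces an approximation.
root = Fraction(math.sqrt(2))  # nearest double to sqrt(2), as a fraction
assert root * root != 2        # the exactness is gone
```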
The same is true for decimal, however: not all numbers in the range of
valid values can be stored in an exact representation. All *decimal*
values can
The SQL Decimal and Numeric data types are designed to store every single
last number in the range of valid values in an *EXACT* representation.
1/3 is in the range, but can't be stored exactly. Which part of that
statement do you disagree with?
Or, as Aaron pointed out, if you put in "28.08"
you can get back "28.08000000000000002".
Just as if you had a way of representing base 3 literals and you put in
".1" for a decimal you wouldn't (necessarily, at least) get ".1" back
when you reformatted to base 3 - because information would have been
lost.
Are we talking about information loss from the simple act of converting from
one base to another, which is what the Decimal and Numeric types address; or
are you talking about information loss from performing operations on data
which goes back to your 1/3 example earlier?
They're the same thing. Converting 0.1 base 3 (i.e. 1/3) into a decimal
*is* a conversion from one base to another, and the Decimal and Numeric
types *do not* store that number exactly.
http://msdn2.microsoft.com/en-us/library/ms173773.aspx
I disagree with MSDN as well. It's not the first time, including for
floating point types - I had to battle with people at MSDN for them to
correct the description of the .NET Decimal type, which was originally
documented as a fixed point type, despite it clearly not being so.
Fortunately that's now fixed...
You need to present your arguments about this to ANSI, since these are the
definitions they put forth.
I'm happy to disagree with any number of people, if what they're saying
doesn't make sense.
Decimal and numeric are "exact" data types because they can represent all
the numbers in their range of valid values in an exact representation.
I.e., "28.08" is stored exactly as 28.08.
No, they can't, any more than binary floating point types can. They can
only store numbers to a certain *decimal* precision, in the same way
that floating binary point types can only store numbers to a certain
*binary* precision.
They can store *every single valid value in their defined set of values* in
an exact representation. With Numeric and Decimal types, the precision and
scale that you specify define the valid set of values.
In that case the same can be said for Float. Every Float value is an
exact value - you can write down the *exact* binary string it
represents, and even convert it into an exact decimal representation.
As an example, the real (and even rational) number obtained by dividing
1 by 3 is within the range of decimal (as expressed in the MSDN page
for decimal: "valid values are from - 10^38 +1 through 10^38 - 1"), but
*cannot* be stored exactly.
You might want to re-read that, because you seem to have picked out just
enough of a single sentence to support your argument, but completely ignored
the part that doesn't. Here's the full sentence for you (the emphasis is
mine):
"WHEN MAXIMUM PRECISION IS USED, valid values are from - 10^38 +1 through
10^38 - 1."
That also defines a default scale of "0". If, however, you define a
different precision and/or scale, the valid set of values changes.
I don't think that makes any difference - when maximum precision is
used, the range of valid values is given in terms of a minimum and a
maximum, and not every value in that range is exactly representable. If
you're going to treat it as "every value that can be exactly
represented can be exactly represented" then as I say, the same is true
for Float.
As pointed out, decimal types are "exact" in that they can "exactly"
represent every single number in their range of valid values. IEEE
standard
floating point binary types are "approximate" because they cannot.
You're showing a bias to decimal numbers. There are perfectly valid
reasons to have that bias and to have that decimal type, but you should
acknowledge that the bias is there.
Of course I'm biased to decimal numbers, as are most of the people here who
were born with 10 fingers and 10 toes. Now let's talk about your bias
against irrational numbers, since you seem to be tip-toeing all around them
here.
No, I'm recognising the mathematical reality that the existence (and
value) of a number doesn't depend on what base you represent it in.
1/3 is just as "real" a number as 1/10 - and yet you ignore the fact
that it can't be exactly represented by a decimal, even though it's in
the range. Why should the failure to represent 1/10 accurately make a
data type be regarded as "approximate" when the failure to represent
1/3 doesn't? From a mathematical point of view, there's very little
difference.
There is no "fraction" data type in SQL, which you might consider a
shortcoming of the language. There is also no "fraction" data type built
into C#, C++, or most other languages. The reason for that might be a huge
bias or conspiracy against vulgar fractions, or it might be the lack of a
useful vinculum in the original ASCII character set, or it might just be
that most people don't find it all that useful.
I wasn't talking about usefulness, I was talking about existence.
Suggesting that 1/3 isn't a number is silly, IMO.
Regardless of the reason, 1/10 is not a repeating decimal in decimal (yes,
I'm showing my "bias" for decimal again). It has a finite representation in
decimal. 1/3 is a repeating decimal in decimal. Since its decimal
representation repeats infinitely, it's impossible to store it exactly
anywhere. However, with regards to Decimal and Numeric data types in SQL,
"0.33333" is not a member of the set of valid values for a DECIMAL (10, 2)
or a NUMERIC (10, 3) [for instance].
Indeed.
From a *business* point of view, there's a large difference in terms of
how useful the types are and for what operations, but that doesn't
affect whether or not a data type should be regarded as fundamentally
"exact" or not, IMO.
The definitions of SQL DECIMAL and NUMERIC specifically allow you to define
the precision and scale. By doing so you define the number of total digits
and the number of digits after the decimal point. Any numbers that are in
the set of valid values for the precision and scale that you define will be
stored *exactly*. Any numbers that are not in the set of valid values for
that precision and scale will not be stored exactly. SQL REAL and FLOAT
types do not come with this guarantee.
There's no reason why they shouldn't - why would it not be possible to
guarantee that the number is essentially reproducible, which is all I
can understand we're talking about? The binary string "0.010101"
represents an exact number and no others.
Look at it from the raw data point of view: both types essentially
represent a long sequence of digits and a point somewhere within the
sequence. You could choose *any* base - it so happens that the common
ones used are 2 and 10.
And there happen to be very good reasons for that...
Of course, which I've never denied.
Now, from that definition, and disregarding the fact that humans have
10 fingers, what's the difference between a sequence of bits with a
point in and a sequence of decimal digits with a point in that makes
you say that the first is approximate but the second is exact?
Quite simply because with NUMERIC and DECIMAL types I can define the
precision and scale, and every valid value in that range will be stored
*EXACTLY*. REAL and FLOAT come with pre-defined ranges, but have no
guarantees concerning how the exactness with which your numeric data will be
stored. As we saw, even a number as simple as 28.08 is stored as an
approximation of 28.08 (or 2808/100 if you prefer your fractions). Whereas
with a NUMERIC (10, 2) it would be stored as -- well, "28.08" *EXACTLY*.
Do you regard 28.08 as in the "range" for FLOAT? It seems unfair to
demand that it is while saying that 1/3 isn't in the "RANGE" for
NUMERIC, but that's what you've basically got to do in order to claim
that FLOAT is "approximate" but DECIMAL is "exact".
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
You're showing a bias to decimal numbers. There are perfectly valid
reasons to have that bias and to have that decimal type, but you should
acknowledge that the bias is there.
Well, yeah, Sherlock.
It seems to be a fact which is blinding everyone to the existence of
1/3 as a number.
The OP passed 28.08 into the database. When he asked for the value back, he
got a different answer. He asked how to fix it, so I guess, yeah, our bias
in this case is toward using DECIMAL.
And I never said that wasn't a good idea.
Where, regardless of what fantasy world you're converting to base 3
et. al., when you input 28.08 and then ask for the value back, you're
still going to get 28.08.
No, if you input a number in base 3, convert it to base 10, and then
later convert it back to base 3, you will *not* (always) get the same
number back, for exactly the same reason that 28.08 doesn't come back
as 28.08. Information is lost.
This isn't about math theory or MSDN or base 3 or 1/3 or some
religious war about whether you think the standards' definition of
"approximate" is acceptable to you.
That's precisely what this part of the thread is about. I never
quibbled with the solution to the OP's problem, merely the terminology
used.
It's about storing a known value (not an expression, like 1/3)
1/3 is a known value - just think of it as the literal "0.1" in base 3.
with a fixed number of decimal places, and getting the same number
back, every time, in a deterministic fashion.
You get determinism with floating binary point too - just not lossless
conversion from base 10, that's all.
Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
<snip>
Having failed to sleep properly due to nuances about this kind of thing
floating (no pun intended) round my head, I think it's probably best
for everyone if I sign off from the thread at this point. That's not in
any way meant to try to tell people not to reply to the posts I made a
short while ago - I'm not trying to get the last word.
I've enjoyed the debate, and I hope you all have too. Back to bed for a
couple of hours, hopefully to get some rest...
Hi guys,
Very interesting, but I don't see a single question in this long thread from
a VB.Net guy/girl; they probably know this already. So next time, will you
please drop the newsgroups that are no longer relevant once you recognise
this?
Thanks in advance,
Cor
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP**********************@msnews.microsoft.com...
Mike C# <xy*@xyz.com> wrote:
>"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
>The SQL Decimal and Numeric data types are designed to store every single
>last number in the range of valid values in an *EXACT* representation.
1/3 is in the range, but can't be stored exactly. Which part of that
statement do you disagree with?
Just as an example: Consider the SQL NUMERIC(2, 1) data type. 1/3 is
*absolutely not* a member of the set of valid values for this type. The
complete set of valid values for this type is:
{ -9.9, -9.8, -9.7, -9.6, -9.5, -9.4, -9.3, -9.2, -9.1, -9.0, -8.9, -8.8, -8.7,
-8.6, -8.5, -8.4, -8.3, -8.2, -8.1, -8.0, -7.9, -7.8, -7.7, -7.6, -7.5, -7.4,
-7.3, -7.2, -7.1, -7.0, -6.9, -6.8, -6.7, -6.6, -6.5, -6.4, -6.3, -6.2, -6.1,
-6.0, -5.9, -5.8, -5.7, -5.6, -5.5, -5.4, -5.3, -5.2, -5.1, -5.0, -4.9, -4.8,
-4.7, -4.6, -4.5, -4.4, -4.3, -4.2, -4.1, -4.0, -3.9, -3.8, -3.7, -3.6, -3.5,
-3.4, -3.3, -3.2, -3.1, -3.0, -2.9, -2.8, -2.7, -2.6, -2.5, -2.4, -2.3, -2.2,
-2.1, -2.0, -1.9, -1.8, -1.7, -1.6, -1.5, -1.4, -1.3, -1.2, -1.1, -1.0, -0.9,
-0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.4,
0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9,
2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4,
3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9,
5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6.0, 6.1, 6.2, 6.3, 6.4,
6.5, 6.6, 6.7, 6.8, 6.9, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9,
8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9.0, 9.1, 9.2, 9.3, 9.4,
9.5, 9.6, 9.7, 9.8, 9.9 }
Likewise with NUMERIC (10, 2), DECIMAL (24, 18), or any other valid
precision and scale settings for the NUMERIC and DECIMAL types.
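The finite set enumerated above can be generated mechanically. A small Python sketch (the `numeric_values` helper is made up for illustration, not part of any SQL library) makes the "defined set of valid values" idea concrete:

```python
from decimal import Decimal

def numeric_values(precision: int, scale: int):
    """All values representable by a SQL NUMERIC(precision, scale):
    integers of at most `precision` digits, scaled by 10**-scale."""
    limit = 10 ** precision           # precision=2 -> coefficients -99 .. 99
    step = Decimal(1).scaleb(-scale)  # 10**-scale as an exact Decimal
    return [i * step for i in range(-limit + 1, limit)]

vals = numeric_values(2, 1)
assert len(vals) == 199                        # -9.9 .. 9.9 in steps of 0.1
assert vals[0] == Decimal("-9.9")
assert vals[-1] == Decimal("9.9")
assert Decimal("2.8") in vals                  # every member is stored exactly
assert Decimal("0.33333") not in vals          # 1/3-like values are not members
```

Every element of `vals` is an exact Decimal; values outside the set (such as 1/3) simply have no representative until you cast them to a member.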
The fact that you cannot accurately store a value that is not a member of
the set of valid values for a type is of no significance. Since by
definition 1/3 is not a member of the set of valid values for the DECIMAL or
NUMERIC types, why do you find it surprising that 1/3 cannot be stored
exactly using these types?
>Are we talking about information loss from the simple act of converting
>from one base to another, which is what the Decimal and Numeric types
>address; or are you talking about information loss from performing
>operations on data which goes back to your 1/3 example earlier?
They're the same thing. Converting 0.1 base 3 (i.e. 1/3) into a decimal
*is* a conversion from one base to another, and the Decimal and Numeric
types *do not* store that number exactly.
No, they are two different arguments. As explained, 0.1 is a non-repeating
decimal, whereas 1/3 is an infinitely repeating decimal which cannot be
represented exactly in decimal form, period. The fact that an infinitely
repeating decimal cannot be represented *exactly* using a DECIMAL or NUMERIC
type should come as no surprise to you or anyone else for that matter. The
fact that a non-repeating decimal with a scale of 1 or 2 (just one or two
digits after the decimal point) cannot be stored *exactly* using FLOAT and
REAL data types comes as a surprise to many people.
>You need to present your arguments about this to ANSI, since these are the definitions they put forth.
I'm happy to disagree with any number of people, if what they're saying
doesn't make sense.
That's perfectly fine; however, as long as the standard calls INT, DECIMAL,
NUMERIC, etc., "exact" numeric data types and it calls FLOAT and REAL
"approximate" numeric data types, you may as well prepare yourself to
disagree with every SQL database programmer and designer in the world.
>They can store *every single valid value in their defined set of values*
>in an exact representation. With Numeric and Decimal types precision and
>scale that you specify define the valid set of values.
In that case the same can be said for Float. Every Float value is an
exact value - you can write down the *exact* binary string it
represents, and even convert it into an exact decimal representation.
Sounds great! Prove it: Store 28.08 as a FLOAT value and write down the
*exact* binary string it represents. Then convert it into an exact decimal
representation. Here's something to get you started:
DECLARE @r FLOAT
SELECT @r = 28.08
SELECT @r
--Result in QA: 28.079999999999998
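Taking that challenge literally is straightforward in any environment that exposes IEEE 754 doubles. Here is a Python sketch (Python's `float` is the same 64-bit binary format as SQL Server's FLOAT; the language choice is just for illustration) that writes down both the exact binary string and the exact decimal expansion Jon describes:

```python
from decimal import Decimal

x = 28.08  # parsed to the nearest IEEE 754 double, as SQL's FLOAT does

# The exact binary representation (each hex mantissa digit is 4 bits):
print(x.hex())            # a finite, exact string such as '0x1.c147ae147ae14p+4'

# The exact decimal value of that stored double -- finite, since 2 divides 10:
print(Decimal(x))         # 28.0799999999999982946974341757595539093017578125

# Rounded to 17 significant digits, matching the Query Analyzer display:
print(format(x, ".17g"))  # 28.079999999999998
```

So both sides of the dispute are visible at once: the stored double has a perfectly exact value, but it is not the decimal 28.08 that was typed in.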
>You might want to re-read that, because you seem to have picked out just
>enough of a single sentence to support your argument, but completely
>ignored the part that doesn't. Here's the full sentence for you (the
>emphasis is mine):
>"WHEN MAXIMUM PRECISION IS USED, valid values are from - 10^38 +1 through
>10^38 - 1."
>That also defines a default scale of "0". If, however, you define a
>different precision and/or scale, the valid set of values changes.
I don't think that makes any difference - when maximum precision is
used, the range of valid values is given in terms of a minimum and a
maximum, and not every value in that range is exactly representable. If
you're going to treat it as "every value that can be exactly
represented can be exactly represented" then as I say, the same is true
for Float.
Of course you don't think it makes any difference, or you would have quoted
the full quote instead of just the last part that appears to substantiate
your argument. The point is that when the precision and scale are turned
down the set of valid values changes for a given NUMERIC or DECIMAL data
type declaration. See the example above for NUMERIC (2, 1) for an example.
The range of valid values for NUMERIC (2, 1) is nowhere near -10^38 + 1 nor
+10^38 - 1. The minimum and maximum values in this case are -9.9 and +9.9,
and the set of valid values includes only decimal numbers that consist of
two digits where one of those digits is after the decimal point. Your
example of 1/3 does not fit in that set since it has more than 1 digit after
the decimal point (in fact, it has infinitely more digits after the decimal
point).
>Of course I'm biased to decimal numbers, as are most of the people here
>who were born with 10 fingers and 10 toes. Now let's talk about your bias
>against irrational numbers, since you seem to be tip-toeing all around
>them here.
No, I'm recognising the mathematical reality that the existence (and
value) of a number doesn't depend on what base you represent it in.
The mathematical reality is that very few people through the course of human
history have needed to use Base 3. The reality is that NUMERIC and DECIMAL
types do not represent every single number "in existence", nor every single
number between the minimum and maximum values specified by the precision and
scale set by the user. The real reality is that the user defines the set of
*valid values* as well as implicit upper and lower bounds when they declare
a DECIMAL or NUMERIC type. The reality is that every single *valid value*
in this set can be stored *exactly*; that is, you get out of it exactly what
you put into it.
>There is no "fraction" data type in SQL, which you might consider a
>shortcoming of the language. There is also no "fraction" data type built
>into C#, C++, or most other languages. The reason for that might be a
>huge bias or conspiracy against vulgar fractions, or it might be the lack
>of a useful vinculum in the original ASCII character set, or it might
>just be that most people just don't find it all that useful.
I wasn't talking about usefulness, I was talking about existence.
Suggesting that 1/3 isn't a number is silly, IMO.
1/3 is a number, and I for one did not say otherwise. Now let me tell you
what is really silly (IMO):
- Suggesting that because you can't store an exact representation of the
infinitely repeating fraction 1/3 in a NUMERIC or DECIMAL (or any data type
for that matter) makes it inexact is ludicrous. As mentioned above, 1/3 is
not defined as a *valid value* for any NUMERIC or DECIMAL precision and
scale.
- Saying that trying to store the infinitely repeating decimal number 1/3
and getting less than an infinitely repeating decimal number is *the same
thing* as storing the non-repeating decimal number 1/10 and getting
something other than 0.1 back as a result is absolutely ridiculous.
Maybe a couple of examples will help, but let's get away from numbers and
bases for a minute and talk about character data (so we don't have to worry
about my decimal bias, or your love for "Base 3"):
0. Consider the VARCHAR data type on SQL 2000. It maxes out at 8,000
characters. You absolutely cannot store an infinite number of 'A'
characters in a VARCHAR(8000); therefore it is not exact by your reasoning.
By your reasoning, the only "exact" representation would be one in which you
could store every possible combination of characters ever, in an infinitely
long character string; even though a character string with 9,000 characters
is not a member of the set of valid values for a VARCHAR(8000). Likewise
you cannot store an infinitely repeating decimal (like 1/3) which, by your
reasoning, makes the NUMERIC and DECIMAL data types "inexact" as well;
despite the fact that 1/3 is not a member of the set of *valid values*
defined for the type.
1. Consider an imaginary data type similar to VARCHAR. We'll call it
APPROXIMATE VARCHAR. The APPROXIMATE VARCHAR will store basically the
strings we pass it; however, it may lose some accuracy in the least
significant characters (right-hand side of the string) due to the way it is
stored. If you were to store 8000 'A' characters in an APPROXIMATE
VARCHAR(8000), but then select the value back you might get 7,999 'A'
characters and one 'B' character. This might be no big deal if you really
only ever need to use the first 10 characters anyway, which appears to be
IEEE's rationale behind their representation of floating point numbers. The
analogy is, of course, storing 28.08 in a FLOAT and retrieving 28.07999...
or 28.080000002 or some such number that *approximates* the exact value you
stored originally.
By your reasoning, both of these situations are the same: i.e., trying to
store an infinite number of characters in a data type that has a
well-defined limitation on the number of characters that can be stored *is
the same thing* as storing an approximation of the character string you pass
in. This is pretty silly, IMO.
>Regardless of the reason, 1/10 is not a repeating decimal in decimal
>(yes, I'm showing my "bias" for decimal again). It has a finite
>representation in decimal. 1/3 is a repeating decimal in decimal. Since
>its decimal representation repeats infinitely, it's impossible to store
>it exactly anywhere. However, with regards to Decimal and Numeric data
>types in SQL, "0.33333" is not a member of the set of valid values for a
>DECIMAL (10, 2) or a NUMERIC (10, 3) [for instance].
Indeed.
Glad we agree on something.
>The definitions of SQL DECIMAL and NUMERIC specifically allow you to
>define the precision and scale. By doing so you define the number of
>total digits and the number of digits after the decimal point. Any
>numbers that are in the set of valid values for the precision and scale
>that you define will be stored *exactly*. Any numbers that are not in the
>set of valid values for that precision and scale will not be stored
>exactly. SQL REAL and FLOAT types do not come with this guarantee.
There's no reason why they shouldn't - why would it not be possible to
guarantee that the number is essentially reproducible, which is all I
can understand we're talking about? The binary string "0.010101"
represents an exact number and no others.
SQL REAL and FLOAT types are based on the IEEE standard which guarantees
that it will store an approximation of a value you feed to it. Since we're
dealing with binary computers (as opposed to analog computers) we can expect
that once our exact value is converted to an approximation the internal
representation *of that approximation* will be exact (1's and 0's).
However, the exact value you feed it is still converted to an approximation,
as shown by the OP. NUMERIC and DECIMAL types do not suffer from this
approximation. They store only values from the set of *valid values*
defined for their precision and scale, and they store them exactly. In your
example of 1/3, storing it as a NUMERIC (2, 1) results in "0.3" being
stored, and that rounding occurs during casting, not during the storage
operation. Basically it is attempting to cast 1/3, which is not a member of
the valid set of numbers for NUMERIC(2, 1), to a valid member of that set.
It has nothing to do with the exactness with which "0.3" is actually stored;
whereas that is the entire issue with FLOAT and REAL.
>Quite simply because with NUMERIC and DECIMAL types I can define the
>precision and scale, and every valid value in that range will be stored
>*EXACTLY*. REAL and FLOAT come with pre-defined ranges, but have no
>guarantees concerning the exactness with which your numeric data will be
>stored. As we saw, even a number as simple as 28.08 is stored as an
>approximation of 28.08 (or 2808/100 if you prefer your fractions).
>Whereas with a NUMERIC (10, 2) it would be stored as -- well, "28.08"
>*EXACTLY*.
Do you regard 28.08 as in the "range" for FLOAT? It seems unfair to
demand that it is while saying that 1/3 isn't in the "RANGE" for
NUMERIC, but that's what you've basically got to do in order to claim
that FLOAT is "approximate" but DECIMAL is "exact".
Life is not fair; however, your argument here is lacking.
FLOAT is a floating point type and 28.08 is a non-repeating decimal. The
fact that FLOAT cannot accurately store 28.08 exactly but DECIMAL and
NUMERIC can proves the point very well.
DECIMAL and NUMERIC are fixed-point types with a guarantee of storage
"exactness". As mentioned, you define the precision (total number of
digits) and scale (number of digits after the decimal point) for DECIMAL and
NUMERIC, and in the process you implicitly define the upper and lower
bounds, as well as the set of *valid values* for the DECIMAL and NUMERIC
types. The guarantee is that any value that is a member of the set of
*valid values* for those types will be stored exactly. If you want to store
1/3 as DECIMAL or NUMERIC you have to cast it (implicitly or explicitly) to
a member of the set of *valid values* defined for the precision and scale
you have chosen. 1/3 is not a *valid value* from the set of *valid values*
for any NUMERIC or DECIMAL type (since they all have a finite scale).
Consider a DECIMAL (5, 4). You can store only 4 digits after the decimal
point; therefore you cannot store an infinite number of 3's after the
decimal point. Therefore 1/3 is not in the set of valid values for that
type (see previous for example).
Personally I don't find your argument about not being able to store an exact
representation of an infinitely repeating decimal on machines with a finite
amount of memory to be all that compelling a reason to call DECIMAL and
NUMERIC data types "inexact".
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Mike C# <xy*@xyz.com> wrote:
>"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
>news:MP************************@msnews.microsoft.com...
The "approximate" nature of binary floating point isn't because it's
floating point, and isn't changed by using a decimal format. It's just
the inability to exactly represent all *decimal* numbers, which isn't
the same thing as all numbers.
That seems like hair-splitting, for real.
I see nothing hair-splitting about recognising that 1/3 is a number.
Listen to some of your arguments:
Argument A
========
0. "All numeric data types are *approximate* and none are *exact* because
you can't represent every single combination, including infinitely repeating
combinations, using them."
By extension (and we may need to enter the realm of philosophy here), your
argument says there is no such thing as an exact data type of *any kind*
since you can't represent every combination of whatever their base type is
using them; including infinitely repeating series. VARCHAR(8000) is now
*approximate* because I can't store 8,001 'Z' characters in it. INT is
*approximate* because I can't store 999999999999999999999999999999999999999
in it. Personally I find that silly, IMO. An exact data type is not
*exact* based on being able to store every infinite combination; it is exact
because the valid values it does store are stored using *exact*
representations instead of approximations.
Argument B
========
0. "I type in the FLOAT value 10.0,"
1. "The computer stores it as an approximate value of 9.999999,"
2. "The binary representation of 9.999999 is *exact*,"
3. "Therefore FLOAT is an *exact* data type."
Really...
>People in this newsgroup refer to binary floating point numbers as
>"approximate" because that's what the ANSI SQL standard calls them.
You may notice that there are multiple newsgroups involved here...
Yes I did notice, and I'm responding from the sqlserver newsgroup, just FYI.
SQL folks call floating point numbers "approximate" because that's what the
standard says.
>"The data types NUMERIC, DECIMAL, INTEGER, and SMALLINT are collectively
>referred to as exact numeric types. The data types FLOAT, REAL, and
>DOUBLE PRECISION are collectively referred to as approximate numeric
>types. Exact numeric types and approximate numeric types are collectively
>referred to as numeric types. Values of numeric type are referred to as
>numbers." -- ANSI SQL-92
Just because ANSI says something doesn't mean it's right, IMO.
I agree with that. However, they did just fine with this one. Unless you
can find a better argument than "You can't store an infinitely repeating
decimal exactly in a fixed-decimal point type, on a machine with a finite
amount of memory, therefore the fixed-decimal point type is not an *exact*
type", you're not going to convince ANSI or just about anyone else of the
justness of your cause.
Mike C# wrote:
[...] The fact that an infinitely
repeating decimal cannot be represented *exactly* using a DECIMAL or NUMERIC
type should come as no surprise to you or anyone else for that matter. The
fact that a non-repeating decimal with a scale of 1 or 2 (just one or two
digits after the decimal point) cannot be stored *exactly* using FLOAT and
REAL data types comes as a surprise to many people. [...]
I agree with both statements above. But you have to consider that the
number 0.2, for example (as well as 28.08), is an "infinitely repeating
binary number" (although it is not an "infinitely repeating decimal").
The representation of 0.2 in base 2 is:
0.0011001100110011...
For easier reading, the same thing in base-16 (hexadecimal), would be:
0.3333...
The representation of 0.08 in base 2 is:
0.00010100011110101110 00010100011110101110 00010100011110101110 ...
For easier reading, in hex that would be:
0.147AE147AE147AE...
This may come as surprise to many people, but if you really think about
it, it is normal that some numbers which can be represented in base-10
with a finite number of digits after the decimal point, will be
represented in base-2 with an infinite number of digits. Obviously,
these numbers cannot be accurately stored in SQL using a float or real
data type (which uses base 2).
To avoid problems when displaying such a number, in the conversion from
base-2 back to base-10, the last bits are sometimes ignored (because
they cannot form an entire decimal digit anyway) and the result
sometimes happens to be the expected result (e.g. 0.2), but not always.
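Razvan's expansions are easy to verify. A short Python check (again, Python only as an illustration of IEEE 754 doubles) shows both the repeating pattern and the exact value the double actually holds:

```python
from decimal import Decimal

# 0.2 repeats forever in binary (0.0011 0011 0011 ...), so the 53-bit double
# must cut the pattern off and round. The hex form makes the repetition
# visible: hex digit 9 is binary 1001, and the final 'a' is the rounding.
print((0.2).hex())       # 0x1.999999999999ap-3

# The exact decimal value of the stored double -- close to, but not, 0.2:
print(Decimal(0.2))      # 0.2000000000000000111022302462515654...

# The round-trip back to the shortest decimal is still deterministic,
# which is why displays often show "0.2" even though 0.2 isn't stored.
assert float(Decimal(0.2)) == 0.2
```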
Razvan
"Razvan Socol" <rs****@gmail.com> wrote in message
news:11**********************@p59g2000hsd.googlegroups.com...
Mike C# wrote:
I agree with both statements above. But you have to consider that the
number 0.2, for example (as well as 28.08), is an "infinitely repeating
binary number" (although is not an "infinitely repeating decimal").
Absolutely.
This may come as surprise to many people, but if you really think about
it, it is normal that some numbers which can be represented in base-10
with a finite number of digits after the decimal point, will be
represented in base-2 with an infinite number of digits. Obviously,
these numbers cannot be accurately stored in SQL using a float or real
data type (which uses base 2).
That's exactly the point I was discussing with Mr. Skeet. The
"approximation" comes into play precisely because of the conversion from
base-10 to base-2 in the FLOAT and REAL types, whereas the DECIMAL and
NUMERIC types presumably use some form of Binary Coded Decimal (possibly
packed BCD or a similar scheme) to represent the number, which prevents
data loss during the storage process.
Mike C# <xy*@xyz.com> wrote:
<snip>
Just because ANSI says something doesn't mean it's right, IMO.
I agree with that. However, they did just fine with this one. Unless you
can find a better argument than "You can't store an infinitely repeating
decimal exactly in a fixed-decimal point type, on a machine with a finite
amount of memory, therefore the fixed-decimal point type is not an *exact*
type", you're not going to convince ANSI or just about anyone else of the
justness of your cause.
Just *one* point I can't resist here: I've never claimed that decimal
isn't exact. I've claimed that float is exact too.
Once you've got a float value, you can always convert it to an exact
decimal string too, and indeed I've got C# code to do it, referenced
from http://www.pobox.com/~skeet/csharp/floatingpoint.html
There's also an online version where you can type in the decimal value
you want to find the closest double to, and then show the exact value
of the double. (These are C# floats and doubles, but the principle is
the same). For example, the closest double to 28.08 is *exactly*
28.0799999999999982946974341757595539093017578125
My point (which I thought I'd made again and again, but clearly not
enough) is that it's the conversion from a decimal literal to a
floating point value which is an "approximate" conversion, *not* the
float value itself. The point about base 3 is that the same would be
true - convert a base 3 number to a decimal and you'd end up with an
exact value which is the closest decimal to that number, but that may
not be exactly the same as the original value. (eg 1/3 becoming 0.333).
In both cases the conversion is "lossy" but the resultant value is
exact.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Just *one* point I can't resist here: I've never claimed that decimal
isn't exact. I've claimed that float is exact too.
The problem is, if I hard-code a value like 28.08, and store it in a FLOAT,
when I ask for that value back, I may not get that EXACT value. It doesn't
matter to me if that approximation occurred during conversion, or
translation, or storage, or divine intervention. The value I thought I had
sent to the database has been changed on me.
A
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
Just *one* point I can't resist here: I've never claimed that decimal
isn't exact. I've claimed that float is exact too.
The problem is, if I hard-code a value like 28.08, and store it in a FLOAT,
when I ask for that value back, I may not get that EXACT value.
Indeed - and that's why you want to use DECIMAL in such a situation.
I've never denied the usefulness of DECIMAL in the slightest.
It doesn't matter to me if that approximation occurred during
conversion, or translation, or storage, or divine intervention. The
value I thought I had sent to the database has been changed on me.
It may not matter to you, but accuracy in description of types matters
to me. It's a bit like people claiming that "objects are passed by
reference by default" in C#, or that "value types live on the stack and
reference types live on the heap" - both statements sound okay on first
hearing, but dig a little deeper and they're inaccurate. I like to be
accurate from the start (where I can, at least). Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Mike C# <xy*@xyz.com> wrote:
<snip>
Just because ANSI says something doesn't mean it's right, IMO.
I agree with that. However, they did just fine with this one. Unless you
can find a better argument than "You can't store an infinitely repeating
decimal exactly in a fixed-decimal point type, on a machine with a finite
amount of memory, therefore the fixed-decimal point type is not an *exact*
type", you're not going to convince ANSI or just about anyone else of the
justness of your cause.
Just *one* point I can't resist here: I've never claimed that decimal
isn't exact. I've claimed that float is exact too.
I pointed this out in Argument B above: You enter 10.0, the computer
converts it to an approximate FLOAT base-2 representation of "9.999999" and
stores that value. Since "9.999999" can be stored "exactly" using the IEEE
floating point standard, it follows that the FLOAT data type must be exact
too.
Hogwash. The reason it is "approximate" is because it is converted to a
Base-2 approximate representation since the exact number can't be
represented in Base-2. The reason the other types are exact is because they
don't convert to Base-2 but instead store the decimal digits in a packed
format so no conversion to a different base needs to take place.
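The round-trip behaviour is easy to illustrate. Python's decimal module keeps base-10 digits in the same spirit (though not necessarily as packed BCD), and a value stored that way comes back out unchanged:

```python
from decimal import Decimal

# The digits "2", "8", "0", "8" are stored as decimal digits; no base
# conversion occurs, so the text representation round-trips exactly.
d = Decimal('28.08')
print(str(d) == '28.08')  # True
```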
Once you've got a float value, you can always convert it to an exact
decimal string too, and indeed I've got C# code to do it, referenced
from http://www.pobox.com/~skeet/csharp/floatingpoint.html
There's also an online version where you can type in the decimal value
you want to find the closest double to, and then show the exact value
of the double. (These are C# floats and doubles, but the principle is
the same). For example, the closest double to 28.08 is *exactly*
28.0799999999999982946974341757595539093017578125
But 28.0799999999999982946974341757595539093017578125 is not *EXACTLY*
28.08. It is an approximation. Hence the word "approximate".
Perhaps not surprisingly, the closest NUMERIC(4, 2) to 28.08 is *EXACTLY*
28.08! And the *EXACT* string representation of the NUMERIC(4, 2) value
28.08 is *EXACTLY* 28.08.
My point (which I thought I'd made again and again, but clearly not
enough) is that it's the conversion from a decimal literal to a
floating point value which is an "approximate" conversion, *not* the
float value itself. The point about base 3 is that the same would be
true - convert a base 3 number to a decimal and you'd end up with an
exact value which is the closest decimal to that number, but that may
not be exactly the same as the original value. (eg 1/3 becoming 0.333).
In both cases the conversion is "lossy" but the resultant value is
exact.
"You'd end up with an exact value which is the closest decimal to that
number". Again you're talking about base conversions here, which is good as
it happens to be the root of the problem with FLOAT. However, DECIMAL and
NUMERIC types were introduced because sometimes - that is to say every once
in a while - you actually want the EXACT value back that you originally fed
into the computer.
Every once in a while when you deposit $100 in the bank, you might want to
later take $100 out. You obviously can't do that if the bank "approximates"
the value of your deposit down to $92 every time you deposit. It would be
interesting to watch a bank manager explain to his customer that $92 is the
EXACT value of their deposit since it is an EXACT number, even though it's
not EXACTLY the amount they deposited. After all 92 is an EXACT number, it
also happens to be EXACTLY *not* 100...
The loss of information from conversion to a different base is what ANSI had
in mind when they applied the terms "approximate" and "exact". The loss of
information from truncating an infinitely repeating decimal is neither here
nor there no matter what media you are storing data on. You can expect
truncation under any circumstances in which you have an infinitely long
repeating decimal and a finite amount of storage - whether it's computer
memory or a college-ruled notebook. The fact that FLOAT can't represent
many simple non-repeating decimals without corruption is what makes it
"approximate". I can't say it any more plainly than that.
Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
Because in SQL Server there is no plausible distinction. I pass in 28.08, I
get back a lot more decimals; it really is irrelevant whether it's the type,
or the conversion, or something else.
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
Because some MVP will jump in and claim that the conversion cannot
technically be called "approximate". After all, the conversion is a
hard-coded set of instructions that perform the same function every time and
produce the same value given the same input every time. Sounds fairly exact
to me.
Mike C# <xy*@xyz.com> wrote:
Just *one* point I can't resist here: I've never claimed that decimal
isn't exact. I've claimed that float is exact too.
I pointed this out in Argument B above: You enter 10.0, the computer
converts it to an approximate FLOAT base-2 representation of "9.999999" and
stores that value. Since "9.999999" can be stored "exactly" using the IEEE
floating point standard, it follows that the FLOAT data type must be exact
too.
Indeed. The value is 9.999999, and that value is stored exactly.
Hogwash. The reason it is "approximate" is because it is converted to a
Base-2 approximate representation since the exact number can't be
represented in Base-2. The reason the other types are exact is because they
don't convert to Base-2 but instead store the decimal digits in a packed
format so no conversion to a different base needs to take place.
And that's great, so long as the numbers are originally specified in
base 10 - you don't need a conversion.
I distinguish between an approximate *conversion* and whether or not a
*type* is exact.
Once you've got a float value, you can always convert it to an exact
decimal string too, and indeed I've got C# code to do it, referenced
from http://www.pobox.com/~skeet/csharp/floatingpoint.html
There's also an online version where you can type in the decimal value
you want to find the closest double to, and then show the exact value
of the double. (These are C# floats and doubles, but the principle is
the same). For example, the closest double to 28.08 is *exactly*
28.0799999999999982946974341757595539093017578125
But 28.0799999999999982946974341757595539093017578125 is not *EXACTLY*
28.08. It is an approximation. Hence the word "approximate".
The conversion is an approximation, the value is exact.
Perhaps not surprisingly, the closest NUMERIC(4, 2) to 28.08 is *EXACTLY*
28.08! And the *EXACT* string representation of the NUMERIC(4, 2) value
28.08 is *EXACTLY* 28.08.
Yup, and if we didn't use base 10, we'd have (potentially) exactly the
same loss with DECIMAL. DECIMAL itself isn't inherently more "exact"
than FLOAT, it's only the fact that there's no lossy conversion
required when an initial value is specified in base 10 that leads to
this (IMO) invalid distinction.
My point (which I thought I'd made again and again, but clearly not
enough) is that it's the conversion from a decimal literal to a
floating point value which is an "approximate" conversion, *not* the
float value itself. The point about base 3 is that the same would be
true - convert a base 3 number to a decimal and you'd end up with an
exact value which is the closest decimal to that number, but that may
not be exactly the same as the original value. (eg 1/3 becoming 0.333).
In both cases the conversion is "lossy" but the resultant value is
exact.
"You'd end up with an exact value which is the closest decimal to that
number". Again you're talking about base conversions here, which is good as
it happens to be the root of the problem with FLOAT. However, DECIMAL and
NUMERIC types were introduced because sometimes - that is to say every once
in a while - you actually want the EXACT value back that you originally fed
into the computer.
How many times do I have to say that I have nothing against DECIMAL,
that I can quite easily see why it exists, and that it is indeed the
most appropriate answer to the OP's dilemma?
The only issue I have is the claim that FLOAT is approximate but
DECIMAL is exact. The reality as I see it is that both FLOAT and
DECIMAL are exact, but conversions from base 10 to FLOAT are
approximate.
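One way to make that distinction concrete, sketched here in Python: compare the exact rational value of the parsed float with the exact rational value of the literal.

```python
from fractions import Fraction

# Fraction(float) captures the float's exact value as a rational;
# Fraction('28.08') captures the literal's exact value, 702/25.
value_of_float = Fraction(28.08)
value_of_literal = Fraction('28.08')

# The parse was approximate, so the two exact values differ.
print(value_of_float == value_of_literal)  # False
print(value_of_literal)                    # 702/25
```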
<snip>
The fact that FLOAT can't represent
many simple non-repeating decimals without corruption is what makes it
"approximate". I can't say it any more plainly than that.
Again, that's a base-10-biased view of the world, rather than a
maths/computer science view, IMO. I like to look at the inherent
properties of a type, viewing conversions which may be used as separate
from the types themselves. Maybe it's because I'm a mathematician and
computer scientist by training rather than a banker, but it certainly
feels like a more deeply accurate view of the types involved than one
which is effectively based on how many fingers we have.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
Because in SQL Server there is no plausible distinction. I pass in 28.08, I
get back a lot more decimals; it really is irrelevant whether it's the type,
or the conversion, or something else.
You don't like to understand what's going on beneath the surface? Why
there's a loss, and why there *wouldn't* be a loss if you pass in a
value which is already in binary, and why (if you were able to do so)
you'd still lose information if you passed a base 3 value into a
DECIMAL?
<shrug> I guess I'm just inquisitive.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mike C# <xy*@xyz.com> wrote:
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
Because some MVP will jump in and claim that the conversion cannot
technically be called "approximate". After all, the conversion is a
hard-coded set of instructions that perform the same function every time and
produce the same value given the same input every time. Sounds fairly exact
to me.
That means it's deterministic, not that it's exact. It certainly *is*
an approximation, and it's losing information. I'd certainly be
perfectly happy to argue the "approximate" side of that debate :)
Do either you or Aaron actually disagree that when you separate the
idea of "type" from "conversion" it's the *conversion* which is
approximate rather than the FLOAT type itself? If you do, it's probably
worth continuing to debate (despite my 5am post!) - if not, it's
presumably just a matter of how much we all do/don't care about making
the distinction between types and conversions. That's a matter of
opinion and probably not worth arguing about.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
It doesn't matter to me if that approximation occurred during
conversion, or translation, or storage, or divine intervention. The
value I thought I had sent to the database has been changed on me.
It may not matter to you, but accuracy in description of types matters
to me. It's a bit like people claiming that "objects are passed by
reference by default" in C#, or that "value types live on the stack and
reference types live on the heap" - both statements sound okay on first
hearing, but dig a little deeper and they're inaccurate. I like to be
accurate from the start (where I can, at least). Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
The benefit is that you're left with a useful definition. By your
reasoning, no datatype can ever be "exact" so the term is rendered
essentially useless. We can't ever store the "exact" value of 1/3 in
any meaningful way.
In this case the terminology is more or less describing the way the
datatype can be expected to perform. When storing a valid value
(28.08) in a DECIMAL column you can expect to get exactly the same
value back again. Store the same value in FLOAT and you get an
approximation.
Shuurai wrote:
We can't ever store the "exact" value of 1/3 in
any meaningful way.
Hate to add more noise to this thread, but I disagree.
You can store 1/3 exactly, just not in base 10.
If you use, for example, base 24 (who doesn't use base 24 on a day-to-day
basis...), you can store 1/3 of a day exactly:
8 hours.
Shuurai <Sh*******@hotmail.com> wrote:
It may not matter to you, but accuracy in description of types matters
to me. It's a bit like people claiming that "objects are passed by
reference by default" in C#, or that "value types live on the stack and
reference types live on the heap" - both statements sound okay on first
hearing, but dig a little deeper and they're inaccurate. I like to be
accurate from the start (where I can, at least). Why claim that it's
the type which is approximate rather than the conversion? Where's the
benefit?
The benefit is that you're left with a useful definition. By your
reasoning, no datatype can ever be "exact" so the term is rendered
essentially useless. We can't ever store the "exact" value of 1/3 in
any meaningful way.
No, by my definition pretty much all datatypes are exact. (I can't
think of any that aren't, offhand.) Conversions between datatypes (at
least numeric ones) tend to be lossy.
Let's look at another example - INTEGER. Is INTEGER approximate? I
wouldn't say so - but if you tried to convert 28.08 into an INTEGER,
you'd certainly lose information, because that conversion is lossy.
Is INTEGER described as an "approximate" type by ANSI? I doubt it
somehow...
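The INTEGER analogy in code, if it helps (a Python sketch):

```python
# Converting 28.08 to an integer discards the fraction; the loss is in
# the conversion, yet nobody calls the integer type "approximate".
n = int(28.08)
print(n)  # 28
```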
In this case the terminology is more or less describing the way the
datatype can be expected to perform. When storing a valid value
(28.08) in a DECIMAL column you can expect to get exactly the same
value back again. Store the same value in FLOAT and you get an
approximation.
So when there's no lossy conversion involved, there's no loss of
information, but when there *is* a lossy conversion involved,
information is lost. That's not exactly a surprise, and has nothing to
do with the types themselves, as far as I can see.
I don't see why people believe it's so difficult to explain/understand
that it's the conversion which is lossy, not the type.
Note that if you pass in floating binary point data rather than decimal
data, you *won't* lose information - because hey, again, there's no
conversion involved. That isn't clear if you regard the type itself as
being lossy, but it's absolutely obvious when you separate conversions
from types.
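A sketch of that last point in Python: shuttling a double around as its raw IEEE-754 bytes involves no base conversion, and loses nothing.

```python
import struct

# Pack the double into its 8 IEEE-754 bytes and unpack it again:
# no decimal text is involved, so the round trip is exact.
x = 28.08
(restored,) = struct.unpack('>d', struct.pack('>d', x))
print(restored == x)  # True
```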
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
>Because in SQL Server there is no plausible distinction. I pass in
>28.08, I get back a lot more decimals; it really is irrelevant whether
>it's the type, or the conversion, or something else.
You don't like to understand what's going on beneath the surface?
It's interesting, sure. But I don't have a problem with the standards
calling it an "approximate data type" if it's meant to describe how data
will potentially be stored differently from how it is provided to the
database. As I've said multiple times, it really isn't relevant in this
case whether the approximation occurred during conversion or during storage
or during retrieval. And again, base 3 is about as useful as fool's gold
here, because base 3 does not come into play.
Unlike INT or DECIMAL, where "approximation" can occur because you used a
value that doesn't fit into the constraints you've defined, with FLOAT there
is much less control over how this happens, and it is much more surprising
to most users I have come across who chose FLOAT for reasons other than the
fact that they knew the nature of its conversion and storage. Do you know
how many people I have seen choose FLOAT for storing monetary values like
prices, or even quantities? And do you know how surprised they are to find
out that SQL Server changed 28.08 to 28.0800000000002? SQL Server
approximated the data that they passed in, and they don't really care how it
happened beneath the surface.
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
You don't like to understand what's going on beneath the surface?
It's interesting, sure. But I don't have a problem with the standards
calling it an "approximate data type" if it's meant to describe how data
will potentially be stored differently from how it is provided to the
database. As I've said multiple times, it really isn't relevant in this
case whether the approximation occurred during conversion or during storage
or during retrieval. And again, base 3 is about as useful as fool's gold
here, because base 3 does not come into play.
Well, I think it's useful for demonstrating that it's not the type
that's exact, merely the conversion from data which is already in
decimal form. I never claimed real world usefulness.
Unlike INT or DECIMAL, where "approximation" can occur because you used a
value that doesn't fit into the constraints you've defined, with FLOAT there
is much less control over how this happens, and it is much more surprising
to most users I have come across who chose FLOAT for reasons other than the
fact that they knew the nature of its conversion and storage. Do you know
how many people I have seen choose FLOAT for storing monetary values like
prices, or even quantities? And do you know how surprised they are to find
out that SQL Server changed 28.08 to 28.0800000000002? SQL Server
approximated the data that they passed in, and they don't really care how it
happened beneath the surface.
Do you think it would be hard to explain that it's the *conversion*
which loses the data, rather than giving the impression that FLOAT is
inherently inexact?
I don't doubt that people use FLOAT inappropriately, and that they are
indeed surprised by the lossy conversion, but I don't see how
describing the type as inaccurate is any clearer than saying
(accurately) that it's the conversion which loses the information.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
I have a fair idea. And they're the same people who get a surprise when they
light the barbecue after pouring petrol (gasoline) on the charcoal.
It's the best argument that there is for not allowing users to muck about
with things that they don't understand.
"Aaron Bertrand [SQL Server MVP]" <te*****@dnartreb.noraa> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...
<snip>
fact that they knew the nature of its conversion and storage. Do you know
how many people I have seen choose FLOAT for storing monetary values like
prices, or even quantities? And do you know how surprised they are to
find out that SQL Server changed 28.08 to 28.0800000000002? SQL Server
</snip>
Do you think it would be hard to explain that it's the *conversion*
which loses the data, rather than giving the impression that FLOAT is
inherently inexact?
I don't think it would be too hard, but I think it would be useless. As I
said before, most of the people don't care why it happened, they just want
to know how to fix it. When you get your oil changed, do you want to know
exactly how impure your oil was and how this affected your engine and gas
mileage down to the nano-details? Or do you just want new oil?
I've had an article about this since 2003, and you're the first person I've
ever heard even raise a peep about it. http://classicasp.aspfaq.com/general...000000002.html
And I still argue that it is valid to call the data type approximate,
because frankly, conversion is a part of the data type, as far as I'm
concerned. YMMV.
A
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
>Hogwash. The reason it is "approximate" is because it is converted to a
>Base-2 approximate representation since the exact number can't be
>represented in Base-2. The reason the other types are exact is because they
>don't convert to Base-2 but instead store the decimal digits in a packed
>format so no conversion to a different base needs to take place.
And that's great, so long as the numbers are originally specified in
base 10 - you don't need a conversion.
I distinguish between an approximate *conversion* and whether or not a
*type* is exact.
There's absolutely a conversion. Every time you feed information into the
computer and store it as one of these data types it is converted from the
input format to the internal representation. Every single time. Being a CS
guy you should know that :) It just so happens that with the NUMERIC(4, 2)
type, for instance, there is a single lossless conversion from the character
string "28.08" to the internal representation [shown in Base-16 here]:
"0x04020001F80A0000". For REAL and FLOAT there is a lossy conversion from
the character string to an approximate Base-2 representation of
"0x403C147AE147AE14", which represents a number that is exactly not "28.08".
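That FLOAT bit pattern can be checked in a couple of lines of Python:

```python
import struct

# The big-endian IEEE-754 bytes of the double nearest 28.08 match the
# representation quoted above (0x403C147AE147AE14).
print(struct.pack('>d', 28.08).hex())  # 403c147ae147ae14
```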
As for distinguishing between the "type" and the "conversion", do you
specifically target the "conversion" in C# when you write a statement like
this?
float f = 28.08f;
Technically this is a conversion of the character string "28.08" to an
internal IEEE-standard floating-point format and subsequent assignment of
the result of that conversion to the variable "f". However, since there is
*always* an implicit conversion most people take it for granted. It might
make a good trivia question, but beyond that it's not important to specify
this particular implicit conversion every time you reference a variable
assignment. Same goes for SQL.
To say that the fault lies completely with the conversion process and the
storage format bears no blame, is that an implication that the conversion
process can be fixed so that this information loss does not occur? I tend
to think that the conversion process is necessitated by the limitations of
the storage format itself and not the other way around.
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
Do you think it would be hard to explain that it's the *conversion*
which loses the data, rather than giving the impression that FLOAT is
inherently inexact?
I don't think it would be too hard, but I think it would be useless. As I
said before, most of the people don't care why it happened, they just want
to know how to fix it. When you get your oil changed, do you want to know
exactly how impure your oil was and how this affected your engine and gas
mileage down to the nano-details? Or do you just want new oil?
I guess it depends on what kind of person you are. If I get results I
don't expect, I want to get a good, *accurate* description of what's
going on - in this case, where I'm losing information. The fact that
it's the conversion which is lossy would be important to me.
I've had an article about this since 2003, and you're the first person I've
ever heard even raise a peep about it. http://classicasp.aspfaq.com/general...000000002.html
I have no problems with that article.
And I still argue that it is valid to call the data type approximate,
because frankly, conversion is a part of the data type, as far as I'm
concerned. YMMV.
The conversion from decimal is part of what you can do with the type,
but I don't think that's enough to deem the type itself as approximate.
To go back to the base 3 case, if there *were* a conversion from base 3
to decimal, would that make the decimal type in some way approximate in
a way that it isn't at the moment? It would still be able to represent
the same data it can today, so how would it have become less exact?
(No, I'm still not saying that a base 3 type will ever come about -
it's merely a thought experiment to examine how reasonable the "exact"
vs "approximate" distinction is.)
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
To say that the fault lies completely with the conversion process and the
storage format bears no blame, is that an implication that the conversion
process can be fixed so that this information loss does not occur? I tend
to think that the conversion process is necessitated by the limitations of
the storage format itself and not the other way around.
YES, I agree!
Next topic, please...
The conversion from decimal is part of what you can do with the type,
but I don't think that's enough to deem the type itself as approximate.
It is if the limitations of the type affect how certain data is stored
there. I'm not going to keep beating a dead horse here (and once again, I
don't really care about base 3, if I hadn't already beaten that dead horse
enough).
A
[Finally remembered to remove the VB newsgroup as requested by Cor.]
Mike C# <xy*@xyz.com> wrote:
And that's great, so long as the numbers are originally specified in
base 10 - you don't need a conversion.
I distinguish between an approximate *conversion* and whether or not a
*type* is exact.
There's absolutely a conversion. Every time you feed information into the
computer and store it as one of these data types it is converted from the
input format to the internal representation. Every single time. Being a CS
guy you should know that :) It just so happens that with the NUMERIC(4, 2)
type, for instance, there is a single lossless conversion from the character
string "28.08" to the internal representation [shown in Base-16 here]:
"0x04020001F80A0000".
Okay, agreed - apologies. I suppose I was considering the conversion
(or lack thereof) from one base to another. I should have been more
precise.
For REAL and FLOAT there is a lossy conversion from
the character string to an approximate Base-2 representation of
"0x403C147AE147AE14", which represents a number that is exactly not "28.08".
Yup.
As for distinguishing between the "type" and the "conversion", do you
specifically target the "conversion" in C# when you write a statement like
this?
float f = 28.08;
Technically this is a conversion of the character string "28.08" to an
internal IEEE-standard floating-point format and subsequent assignment of
the result of that conversion to the variable "f". However, since there is
*always* an implicit conversion most people take it for granted. It might
make a good trivia question, but beyond that it's not important to specify
this particular implicit conversion every time you reference a variable
assignment. Same goes for SQL.
Yes, I would regard there as being a compile-time conversion going on -
and it's defined by the spec:
<quote>
The value of a real literal having type float or double is determined
by using the IEC 60559 "round to nearest" mode.
</quote>
(The spec also talks about the rounding conversion used, if any is
required, in converting a literal into a decimal.)
To say that the fault lies completely with the conversion process and the
storage format bears no blame, is that an implication that the conversion
process can be fixed so that this information loss does not occur?
Not at all, just as I wouldn't say that the conversion from "28.08" to
an INTEGER can be "fixed" so that the information loss doesn't occur.
I tend to think that the conversion process is necessitated by the
limitations of the storage format itself and not the other way
around.
The storage format for FLOAT has a set of values it can represent, just
the same as INTEGER and DECIMAL. I still don't understand why it seems
to be "forgiveable" for INTEGER and DECIMAL (in that it doesn't stop
them from being labelled "exact") but not for FLOAT.
You wrote elsewhere:
<quote>
The fact that you cannot accurately store a value that is not a member
of the set of valid values for a type is of no significance.
</quote>
The set of valid values for FLOAT doesn't include 28.08 in the same way
that the set of valid values for INTEGER doesn't include 28.08 and in
the same way that the set of valid values for NUMERIC(2,1) doesn't
include 28.08. I see those three situations as being the same, but
you're highlighting FLOAT as being "different". Why?
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
[Removed VB group]
Aaron Bertrand [SQL Server MVP] <te*****@dnartreb.noraa> wrote:
The conversion from decimal is part of what you can do with the type,
but I don't think that's enough to deem the type itself as approximate.
It is if the limitations of the type affect how certain data is stored
there. I'm not going to keep beating a dead horse here (and once again, I
don't really care about base 3, if I hadn't already beaten that dead horse
enough).
I think we can work towards a compromise then. How about the following
statements:
<proposal>
FLOAT is an approximate type with respect to some values which are
exactly representable in base 10. It is an exact type with respect to
values which are exactly representable in base 2.
DECIMAL is an approximate type with respect to some values which are
exactly representable in base 3. It is an exact type with respect to
values which are exactly representable in base 10.
</proposal>
Is there anything in there that you'd actually disagree with (rather
than just not caring about)?
The difference between those statements and the simplistic statement
that "FLOAT is approximate, DECIMAL is exact" is that the above make it
perfectly clear that it's only the base involved which determines the
"exactness", not anything else about the type.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Do you think it would be hard to explain that it's the *conversion*
which loses the data, rather than giving the impression that FLOAT is
inherently inexact?
SQL is an abstraction, as are C#, C++, VB, yadda yadda. Part of that
abstraction is designed to pull people away from the underbelly of the
machine and give them tools they can use to get a particular job done.
While it's useful from a CS perspective to understand the IEEE floating
point standard, how chip designers and manufacturers implement FPUs, the
differences between binary floating point/binary coded decimal/packed binary
coded decimal, the assembly-language instructions happening under the hood,
etc., it's just not *necessary* to have this intimate knowledge to do
99.9999999999999999% of what most SQL developers, DBAs, etc., do. Just
like it's not necessary to have Ph.D.s in mechanical engineering, civil
engineering, physics, mathematics, and electrical engineering to drive a
car. It might be nice to have all that knowledge, but it's not really
necessary.
I don't doubt that people use FLOAT inappropriately, and that they are
indeed surprised by the lossy conversion, but I don't see how
describing the type as inaccurate is any clearer than saying
(accurately) that it's the conversion which loses the information.
Speaking of "inaccurate", the FLOAT type is "approximate", not
"inaccurate". I don't recall seeing anyone call the FLOAT type "inaccurate"
until this post.
As for why it's described as "approximate"; because SQL is an abstraction,
and in SQL (which is not object-oriented) you cannot override operators like
the assignment operator. The operators are closely tied to the data types,
and to almost all users (and the vast majority of developers) it's enough to
know that if you assign a FLOAT variable a value and later retrieve it, it
might not be the same value you assigned. That's the level of abstraction
brought by SQL - it's several layers of abstraction higher than assembly
language, C++, or even C#.
DECIMAL is an approximate type with respect to some values which are
exactly representable in base 3. It is an exact type with respect to
values which are exactly representable in base 10.
</proposal>
Is there anything in there that you'd actually disagree with (rather
than just not caring about)?
I don't think there's any reason to point out that decimal can't store base
3 precisely, since this is completely irrelevant in SQL Server. If you
really like those definitions, then propose them to the standards bodies and
to Microsoft (connect.microsoft.com/sql). I don't feel very inclined to
adopt them.
The difference between those statements and the simplistic statement
that "FLOAT is approximate, DECIMAL is exact" is that the above make it
perfectly clear that it's only the base involved which determines the
"exactness", not anything else about the type.
Irrelevant in the context of this discussion, imho, since you cannot alter
the base for any of the numeric data types in SQL Server.