By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
437,586 Members | 2,418 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 437,586 IT Pros & Developers. It's quick & easy.

SUM(double_value) is not deterministic?

P: n/a
Hi,

I'm using predefined SUM(() function on DOUBLE column. Surpisingly the
sum can give different results depending on the order of values in my
table.

Is that correct?

I can explain it, as adding two double values can entail rouding the
result, so the order of adding influence the final result.

Actually I cannot change my select statement to cast double column to
decimal, or to change the type in the database. I'm looking for a new
SUM(double) aggregate function, which internaly will be adding values
using decimal, f.e. decimal(31,15) and which finally return casted
double value.
Here is a code to replicate the issue:
---------------------------------------------------------------
create table table1 (value double)
@
create procedure load()
begin
declare c int default 0;
while (c < 100000)
do

set c = c + 1;
insert into table1 values ( cast(rand() as double) );

end while;
end
@
call load()
@
create table table2 like table1
@
insert into table2 select * from table1 order by value
@
select sum(value) from table1
@
select sum(value) from table2
@
select double(sum(cast(value as decimal(31,15)))) from table1
@
results:
+4,99359445173494E+004
+4,99359445173499E+004

The correct one in my case is:
+4,99359445173498E+004

Regards,
Artur Wronski

Apr 25 '06 #1
Share this Question
Share on Google+
8 Replies


P: n/a
Artur wrote:
Hi,

I'm using predefined SUM(() function on DOUBLE column. Surpisingly the
sum can give different results depending on the order of values in my
table.

Is that correct?

I can explain it, as adding two double values can entail rouding the
result, so the order of adding influence the final result.

Actually I cannot change my select statement to cast double column to
decimal, or to change the type in the database. I'm looking for a new
SUM(double) aggregate function, which internaly will be adding values
using decimal, f.e. decimal(31,15) and which finally return casted
double value.
Here is a code to replicate the issue:
---------------------------------------------------------------
create table table1 (value double)
@
create procedure load()
begin
declare c int default 0;
while (c < 100000)
do

set c = c + 1;
insert into table1 values ( cast(rand() as double) );

end while;
end
@
call load()
@
create table table2 like table1
@
insert into table2 select * from table1 order by value
@
select sum(value) from table1
@
select sum(value) from table2
@
select double(sum(cast(value as decimal(31,15)))) from table1
@
results:
+4,99359445173494E+004
+4,99359445173499E+004

The correct one in my case is:
+4,99359445173498E+004

I'm no expert in the matter, but I can't say that I'm terribly surprised.

Cheers
Serge

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab
Apr 26 '06 #2

P: n/a
First, It is well known that the result of sum of floating point
numbers would be different by sequence of summation of numbers, in
mathematical area.
To get more accurate result, usually summed from small numbers to large
numbers.

That might be the reason of the different results of SUM on table1 and
table2.

But, I don't know this means SUM is not deterministic (It might be
depending on DB2 specification(definition of deterministic)).

Second, You tried
select double(sum(cast(value as decimal(31,15)))) from table1
I thought that an error(drop some lowest digits) may be introduced at
the time of converting from float to decimal.
So, this way may not usefull to get correct result.

Last, my question is how do you get correct
one(+4,99359445173498E+004)?

Apr 26 '06 #3

P: n/a
The problem is that having two different sessions, one from CLP, one
from CLI application I get different results of the sum running the
same query on the same static table.

Let's say that I have only 3 values, and in my case let's assume the
precision is only 3 digits:
1.14
2.23
7.99

I can do the sum in two ways, that give different final result:

1.14 + 2.23 -> 3.37 + 7.99 -> 11.4
2.23 + 7.99 -> 10.2 + 1.14 ->11.3

And in my case the correct sum is:
1.14 + 2.23 -> 3.37 + 7.99 -> 11.36 -> 11.4
2.23 + 7.99 -> 10.22 + 1.14 -> 11.36 -> 11.4

Correct in my definition means not order depended. I know that in
general it can be difficult to achieve, but the idea is to add values
internally using different type (which can hold the intermediate result
without truncation), or even with predefined, each time the same order
(fe. not to start adding the highest with lowest value).

I wish to have a way to writh my own sum(double) aggregate function to
replace the standard one (my customer is having badly coded report
which stores data in double field, and have no access to the source
code).

-- Artur Wronski

Apr 26 '06 #4

P: n/a
> I can do the sum in two ways, that give different final result:

1.14 + 2.23 -> 3.37 + 7.99 -> 11.4
2.23 + 7.99 -> 10.2 + 1.14 ->11.3

How did you calculate these expressions?
Especially my question is that why interim result(10.2) in second
expression has only one decimal digit .

Here is an example of calculation of second expression by SQL. I got
correct interim result and final sum.
SELECT interim
, interim + 1.14 fimnal
FROM (VALUES 2.23 + 7.99 ) Q (interim);
--------------------------------------------------------------------

INTERIM FIMNAL
------- -------
10.22 11.36

1 record(s) selected.

Apr 27 '06 #5

P: n/a
I forgot you wrote "precision is only 3 digits".
But, how do you know scale of interim result without knowing data
values boforehand, if you want to make scale of interim result also 3?
If calculated by SQL and you didn't specify precision and scale of
interim result, precisions and scales will be followings.
1.14 + 2.23 -> 3.37 + 7.99 [ -> 11.36] -> 11.3
DEC(3,2) + DEC(3,2) -> DEC(4,2) + DEC(3,2) [ -> DEC(6,2) ] -> DEC(3,1)
2.23 + 7.99 -> 10.22 + 1.14 [ -> 11.36] ->11.3
DEC(3,2) + DEC(3,2) -> DEC(4,2) + DEC(3,2) [ -> DEC(6,2) ] -> DEC(3,1)
To get correct answer with 3 digits, it would be better to use ROUND
for final result.

SELECT interim
, DEC(ROUND( (interim + DEC(7.99, 3 , 2)), 1), 3, 1) fimnal
FROM (VALUES DEC(1.14, 3,2) + DEC(2.23, 3 , 2) ) Q (interim);
--------------------------------------------------------------------

INTERIM FIMNAL
------- ------
3.37 11.4

1 record(s) selected.
------------------------- Commands Entered -------------------------
SELECT interim
, DEC(ROUND( (interim + DEC(1.14, 3 , 2)), 1), 3, 1) fimnal
FROM (VALUES DEC(2.23, 3,2) + DEC(7.99, 3 , 2) ) Q (interim);
--------------------------------------------------------------------

INTERIM FIMNAL
------- ------
10.22 11.4

1 record(s) selected.

Apr 27 '06 #6

P: n/a
Artur,

When you add the smallest double 0.000000.......000001 to all your
double numbers and then sum up and round, it is giving the right total.
I tried this

select round(sum(myfield+0.0000000000000000000001), 2) from mytable

You can replace 2 with any number you want.

regards,

Mehmet

Apr 27 '06 #7

P: n/a
Artur wrote:
Hi,

I'm using predefined SUM(() function on DOUBLE column. Surpisingly the
sum can give different results depending on the order of values in my
table.

Is that correct?


There is a very nicely written summary on floating point numbers here:

http://docs.sun.com/source/806-3568/ncg_goldberg.html

--
Knut Stolze
DB2 Information Integration Development
IBM Germany
Apr 28 '06 #8

P: n/a
Thank you guys for you prompt response.

I'm convinced to analyze value ranges in the database and suggest the
database owner to change double to some more precise type.

-- Artur Wronski

Apr 28 '06 #9

This discussion thread is closed

Replies have been disabled for this discussion.