Precision errors when aggregating data

James Foreman

I'm using DB2 UDB 8.1.2 on Suse linux. Try this:

CREATE TABLE testsums (cost_of_sale decimal(7,2), price decimal(9,
2));
INSERT INTO testsums VALUES (10000, 30000);
SELECT cost_of_sale/price AS division1 FROM testsums;
SELECT sum(cost_of_sale)/sum(price) AS division2 FROM testsums;

The first select gives 0.333333333333333333333333
The second gives 0.

But it gets worse; if you then try

INSERT INTO testsums VALUES (1000, 2000);

and attempt the same selects again, you get:

DIVISION1
---------------------------------
0.333333333333333333333333
0.500000000000000000000000

and

DIVISION2
---------------------------------
0.

Does this come as a surprise to anybody? I don't understand why it's
not maintaining the decimal(7,2) in the results. If we have large
sums then you might expect the total of cost_of_sale to exceed the
space in a decimal(7,2), but it isn't clear why it should produce in
the first division results with (I lose count, but I think) about 24
decimal places.

What can we do to rectify this?

Nov 12 '05 #1

Subscribe Post Reply

8613

Knut Stolze

James Foreman <ja***********@abcmail.co.uk> wrote:

I'm using DB2 UDB 8.1.2 on Suse linux. Try this:

CREATE TABLE testsums (cost_of_sale decimal(7,2), price decimal(9,
2));
INSERT INTO testsums VALUES (10000, 30000);
SELECT cost_of_sale/price AS division1 FROM testsums;
SELECT sum(cost_of_sale)/sum(price) AS division2 FROM testsums;

The first select gives 0.333333333333333333333333
The second gives 0.

But it gets worse; if you then try

INSERT INTO testsums VALUES (1000, 2000);

and attempt the same selects again, you get:

DIVISION1
---------------------------------
0.333333333333333333333333
0.500000000000000000000000

and

DIVISION2
---------------------------------
0.

Does this come as a surprise to anybody? I don't understand why it's
not maintaining the decimal(7,2) in the results. If we have large
sums then you might expect the total of cost_of_sale to exceed the
space in a decimal(7,2), but it isn't clear why it should produce in
the first division results with (I lose count, but I think) about 24
decimal places.

What can we do to rectify this?

When I do a describe on your statement, I see that the resulting data type
of the division is DECIMAL(31, 0). So there are no digits after the
decimal point. Given that, your example above acts as one would expect.

$ db2 "describe SELECT sum(cost_of_sale) / sum(price) AS division2 FROM
testsums"

sqltype sqllen
-------------------- ------
485 DECIMAL 31, 0
I can't reproduce the result "0.5" that you got with 1000 and 2000. What
exactly did you do there?

To address this, you might want to adjust your data types like this (or
something similar):

$ db2 "SELECT decimal(sum(cost_of_sale), 7, 2) / decimal(sum(price), 9, 2)
FROM testsums"

1
---------------------------------
0.333333333333333333333333

1 record(s) selected.
--
Knut Stolze
Information Integration
IBM Germany / University of Jena

Nov 12 '05 #2

James Foreman

When I do a describe on your statement, I see that the resulting data type
of the division is DECIMAL(31, 0). So there are no digits after the
decimal point. Given that, your example above acts as one would expect.

$ db2 "describe SELECT sum(cost_of_sale) / sum(price) AS division2 FROM
testsums"

sqltype sqllen
-------------------- ------
485 DECIMAL 31, 0
I can't reproduce the result "0.5" that you got with 1000 and 2000. What
exactly did you do there?

To address this, you might want to adjust your data types like this (or
something similar):

$ db2 "SELECT decimal(sum(cost_of_sale), 7, 2) / decimal(sum(price), 9, 2)
FROM testsums"

1
---------------------------------
0.333333333333333333333333

1 record(s) selected.

Thanks Knut. That does fix the rounding error, but it does still
confuse me why the resulting data type is decimal (31, 0) - I'd have
expected DB2 to preserve the level of accuracy from the base data
type. (It will quickly become tiresome to do this every time we have
to carry out such a division!)

To get the "0.5" result I do
select cost_of_sale, price, decimal((cost_of_sale/price), 7, 2),
cost_of_sale/price
FROM testsums;

COST_OF_SALE PRICE 3 4
------------ ----------- --------- ---------------------------------
10000.00 30000.00 0.33 0.333333333333333333333333
1000.00 2000.00 0.50 0.500000000000000000000000

which now gives me the result I want in column 3; although with my
previous comment, I'm confused why I'd have to specify something I'd
expect to be the default behaviour - I may have to take this up with
IBM, unless there's a configuration change that will produce this for
me.

Nov 12 '05 #3

Ian

James Foreman wrote:

When I do a describe on your statement, I see that the resulting data type
of the division is DECIMAL(31, 0). So there are no digits after the
decimal point. Given that, your example above acts as one would expect.

$ db2 "describe SELECT sum(cost_of_sale) / sum(price) AS division2 FROM
testsums"

sqltype sqllen
-------------------- ------
485 DECIMAL 31, 0
I can't reproduce the result "0.5" that you got with 1000 and 2000. What
exactly did you do there?

To address this, you might want to adjust your data types like this (or
something similar):

$ db2 "SELECT decimal(sum(cost_of_sale), 7, 2) / decimal(sum(price), 9, 2)
FROM testsums"

1
---------------------------------
0.333333333333333333333333

1 record(s) selected.

Thanks Knut. That does fix the rounding error, but it does still
confuse me why the resulting data type is decimal (31, 0) - I'd have
expected DB2 to preserve the level of accuracy from the base data
type. (It will quickly become tiresome to do this every time we have
to carry out such a division!)

There are factors influencing this:

1) When you do a SUM on a decimal(p,s) column, your result will
always be DECIMAL(31,s).

This happens in order to avoid overflows in the datatype, and
also explains why Knut's suggestion of casting the sum as
decimal(7,2) happens to work with your example data.
2) The reason that you get a result of 0. is a function of how DB2
does decimal division.

From the manual (Messages Reference, SQL0419N):

The formula used internally to calculate the scale of the result
for decimal division is:

Scale of result = 31 - np + ns - ds

where np is the precision of the numerator, ns is the scale of
the numerator, and ds is the scale of the denominator.

Dividing 2 decimal(31,2) results yields a result that's decimal(31,0)
(as you experienced).
To solve this: The better solution is to cast your SUM() results into
the correct precision. Of course, this could be somewhat time consuming,
as you've found.

There is also another solution:

The database configuration parameter MIN_DEC_DIV_3 can be set to YES,
which ensures that all decimal division operations will have a minimum
scale of 3. This parameter is not shown in the output of 'get db cfg'.
If you set this, you'll get a result (from your original query) of
0.500.

There are some potential complications with doing this, so check the
manual for the caveats if you go this route.
Good luck,

-----= Posted via Newsfeeds.Com, Uncensored Usenet News =-----
http://www.newsfeeds.com - The #1 Newsgroup Service in the World!
-----== Over 100,000 Newsgroups - 19 Different Servers! =-----

Nov 12 '05 #4

Similar topics

Precision issue

by: Ladvánszky Károly | last post by:

Entering 3.4 in Python yields 3.3999999999999999. I know it is due to the fact that 3.4 can not be precisely expressed by the powers of 2. Can the float handling rules of the underlying layers be...

Python

Fixed precision floating point and locale facets

by: Roger Leigh | last post by:

Hello, I'm writing a fixed-precision floating point class, based on the ideas in the example fixed_pt class in the "Practical C++ Programming" book by Steve Oualline (O' Reilly). This uses a...

C / C++

When (32-bit) double precision isn't precise enough

by: DAVID SCHULMAN | last post by:

I've been trying to perform a calculation that has been running into an underflow (insufficient precision) problem in Microsoft Excel, which calculates using at most 15 significant digits. For this...

C / C++

Floating Point Precision

by: michael.mcgarry | last post by:

Hi, I have a question about floating point precision in C. What is the minimum distinguishable difference between 2 floating point numbers? Does this differ for various computers? Is this...

C / C++

Float/Double arithmetic precision error

by: Madan | last post by:

Hi all, I had problem regarding float/double arithmetic only with + and - operations, which gives inaccurate precisions. I would like to know how the arithmetic operations are internally handled...

C# / C Sharp

Arbitrary precision modulo operation

by: Chadwick Boggs | last post by:

I need to perform modulo operations on extremely large numbers. The % operator is giving me number out of range errors and the mod(x, y) function simply seems to return the wrong results. Also,...

PostgreSQL Database

Date Precision Problem

by: Gugale at Lincoln | last post by:

Hi, I am working on an application which uses date as a primary key. All my records are at least a few millisecond apart and are in the form "20070630T12:50:24.207". SQL Server has a precision...

C# / C Sharp

Changing precision level of floating point variables

by: Matthew | last post by:

Hi, I want to change the precision level of floating point variables and calculations which is done in php.ini. However the server I rent for my domain does not give me access to php.ini,...

PHP

Heap overflow/corruption problem in an arbitrary precision class

by: Martin the Third | last post by:

Hi, I need some help! I'm writing an infinite-precision floating point library called ipfloat (I know infinite is a misnomer - but arbitrary was taken). A quick overview: I'm storing numbers as...

C / C++

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice