Variables in SP do not compare as equal when both are NULL

Mark A

DB2 8.2 for Linux, FP 10 (also performs the same on DB2 8.2 for Windoes, FP
11).

Using the SAMPLE database, tables EMP and EMLOYEE.

In the followng stored procedure, 2 NULL columns (COMM) are selected into 2
different SP variables and compared for equal. They are both NULL, but do
not compare as equal. When the Not NULL columns (SALARY) are compared, they
do compare as equal.

Is there a reason for this?

Statement terminator is @.

-- SET THE COMMISION TO NULL ON TWO SAMPLE TABLES FOR EMPNO '000010'

UPDATE EMP SET COMM = NULL WHERE EMPNO = '000010'@
UPDATE EMPLOYEE SET COMM = NULL WHERE EMPNO = '000010'@

--------------------------------------------------
DROP PROCEDURE TEST_SP@

CREATE PROCEDURE TEST_SP
(
OUT EMP_COMM DECIMAL(9,2),
OUT EMP_SALARY DECIMAL(9,2),
OUT EMPLOYEE_COMM DECIMAL(9,2),
OUT EMPLOYEE_SALARY DECIMAL(9,2),
OUT v_ERRMSG_1 VARCHAR(500),
OUT v_ERRMSG_2 VARCHAR(500)
)
SPECIFIC TEST_SP
RESULT SETS 0
MODIFIES SQL DATA
LANGUAGE SQL
BEGIN

---------------------------------------------
-- Declare variables
---------------------------------------------

DECLARE not_found CONDITION FOR SQLSTATE '02000';
DECLARE at_end SMALLINT DEFAULT 0;

DECLARE A_EMPNO CHAR(6);
DECLARE A_EMP_COMM DECIMAL(9,2);
DECLARE A_EMP_SALARY DECIMAL(9,2);
DECLARE B_EMPNO CHAR(6);
DECLARE B_EMPLOYEE_COMM DECIMAL(9,2);
DECLARE B_EMPLOYEE_SALARY DECIMAL(9,2);

---------------------------------------------
-- Declare exit handlers
---------------------------------------------

DECLARE EXIT HANDLER FOR SQLEXCEPTION
GET DIAGNOSTICS EXCEPTION 1 v_ERRMSG_1 = MESSAGE_TEXT;

DECLARE CONTINUE HANDLER FOR not_found
SET at_end = 1;

DECLARE EXIT HANDLER FOR SQLWARNING
GET DIAGNOSTICS EXCEPTION 1 v_ERRMSG_1 = MESSAGE_TEXT;

---------------------------------------------------------------
-- Begin processing logic
---------------------------------------------------------------

SELECT EMPNO, COMM, SALARY INTO A_EMPNO, A_EMP_COMM, A_EMP_SALARY
FROM DB2INST1.EMP
WHERE EMPNO = '000010';

SELECT EMPNO, COMM, SALARY INTO B_EMPNO, B_EMPLOYEE_COMM, B_EMPLOYEE_SALARY
FROM DB2INST1.EMPLOYEE
WHERE EMPNO = A_EMPNO;

IF A_EMP_COMM = B_EMPLOYEE_COMM THEN
SET V_ERRMSG_1 = 'NULLS COMPARE AS EQUAL';
ELSE
SET V_ERRMSG_1 = 'NULLS DO NOT COMPARE AS EQUAL';
END IF;

IF A_EMP_SALARY = B_EMPLOYEE_SALARY THEN
SET V_ERRMSG_2 = 'NON-NULLS COMPARE AS EQUAL';
ELSE
SET V_ERRMSG_2 = 'NON-NULLS DO NOT COMPARE AS EQUAL';
END IF;

SET EMP_COMM = A_EMP_COMM;
SET EMP_SALARY = A_EMP_SALARY;

SET EMPLOYEE_COMM = B_EMPLOYEE_COMM;
SET EMPLOYEE_SALARY = B_EMPLOYEE_SALARY;

END@

CALL TEST_SP (?,?,?,?,?,?)@

Value of output parameters
--------------------------
Parameter Name : EMP_COMM
Parameter Value : -

Parameter Name : EMP_SALARY
Parameter Value : 52750.00

Parameter Name : EMPLOYEE_COMM
Parameter Value : -

Parameter Name : EMPLOYEE_SALARY
Parameter Value : 52750.00

Parameter Name : V_ERRMSG_1
Parameter Value : NULLS DO NOT COMPARE AS EQUAL

Parameter Name : V_ERRMSG_2
Parameter Value : NON-NULLS COMPARE AS EQUAL

Return Status = 0

Jul 10 '06 #1

Subscribe Reply

4494

mike

In the followng stored procedure, 2 NULL columns (COMM) are selected into 2

different SP variables and compared for equal. They are both NULL, but do
not compare as equal. When the Not NULL columns (SALARY) are compared, they
do compare as equal.

Is there a reason for this?

Yep, that's the way it's supposed to be.

db2 "values (case when (nullif(0,0) = nullif(0,0)) then 1 else 0 end)"

1
-----------
0

1 record(s) selected.

db2 "values (case when (nullif(0,0) is null AND nullif(0,0) IS NULL)
then>

1
-----------
1

1 record(s) selected.

Jul 10 '06 #2

Mark A

"mike" <_l*****@yahoo.comwrote in message
news:11*********************@h48g2000cwc.googlegro ups.com...

Yep, that's the way it's supposed to be.

db2 "values (case when (nullif(0,0) = nullif(0,0)) then 1 else 0 end)"

1
-----------
0

1 record(s) selected.

db2 "values (case when (nullif(0,0) is null AND nullif(0,0) IS NULL)
then>

1
-----------
1

1 record(s) selected.

Yes, I figured out that is how it works. Is there is reason for this?

Jul 10 '06 #3

Jeroen van den Broek

"Mark A" <no****@nowhere.comschreef in bericht
news:f_******************************@comcast.com. ..

"mike" <_l*****@yahoo.comwrote in message
news:11*********************@h48g2000cwc.googlegro ups.com...
>Yep, that's the way it's supposed to be.

db2 "values (case when (nullif(0,0) = nullif(0,0)) then 1 else 0 end)"

1
-----------
0

1 record(s) selected.

db2 "values (case when (nullif(0,0) is null AND nullif(0,0) IS NULL)
then>

1
-----------
1

1 record(s) selected.

Yes, I figured out that is how it works. Is there is reason for this?

Yes, there is.
It is in the definition of NULL's.
A NULL means the specific value is unknown, NULL in itself isn't a value
that can be matched against any other value.

--
Jeroen

Jul 10 '06 #4

mike

Yes, I figured out that is how it works. Is there is reason for this?

It is because in SQL, NULL is not a value, but the absence of one.
The equality predicate compares two values, not absence of them,
and if either of the operands to the equality predicate are NULL
then the result is unknown.

Jul 10 '06 #5

Mark A

"mike" <_l*****@yahoo.comwrote in message
news:11**********************@75g2000cwc.googlegro ups.com...

>
It is because in SQL, NULL is not a value, but the absence of one.
The equality predicate compares two values, not absence of them,
and if either of the operands to the equality predicate are NULL
then the result is unknown.

That sounds like a circular argument to me (as does the other one posted by
Jeroene).

You claim that null is not a value, but the phrase "null value" turns up
frequently, including in DB2 documentation. And there are exceptions, such
as when using the DISTINCT predicate, where "null values are considered
equal. So (COL1 is NOT DISTINCT from COL2) will be true if both columns
contain an equal non-null value and also when both columns are the null
value." [quote from iSeries DB2 Infocenter].

Obviously, someone thought that relational theory dictated that two nulls
are not equal, but I don't quite understand under what conditions that it
would be useful. It just seems we have to do a lot of extra coding to get
around this "rule".

Jul 10 '06 #6

mike

... It just seems we have to do a lot of extra coding to get

around this "rule".

Consider using COALESCE with predicates that involve nullable columns
to avoid explicitly coding the extra null tests.

Jul 10 '06 #7

Brian Tkatch

Mark A wrote:

"mike" <_l*****@yahoo.comwrote in message
news:11**********************@75g2000cwc.googlegro ups.com...

It is because in SQL, NULL is not a value, but the absence of one.
The equality predicate compares two values, not absence of them,
and if either of the operands to the equality predicate are NULL
then the result is unknown.

That sounds like a circular argument to me (as does the other one posted by
Jeroene).

You claim that null is not a value, but the phrase "null value" turns up
frequently, including in DB2 documentation. And there are exceptions, such
as when using the DISTINCT predicate, where "null values are considered
equal. So (COL1 is NOT DISTINCT from COL2) will be true if both columns
contain an equal non-null value and also when both columns are the null
value." [quote from iSeries DB2 Infocenter].

Obviously, someone thought that relational theory dictated that two nulls
are not equal, but I don't quite understand under what conditions that it
would be useful. It just seems we have to do a lot of extra coding to get
around this "rule".

It is a misunderstanding of what NULL is. NULL is on a different
"level" than other values, as it talks about the value. All values are
either known or unknown. That is, if it is known, the actual value can
be returned, if it is unknown, only a big question-mark can be
returned.

The second thing is, that a WHERE clause tests for a boolean value, and
so, a truth-table can be used to discern what it means. For example,
the AND truth-table:

(T)RUE/(F)ALSE

T+T=T
T+F=F
F+T=F
F+F=F

If NULL is added to any equation, we cannot answer any questions,
because since we don't know the value, its truth cannot be discerned.
Being neither TRUE nor FALSE, we must say we don't know, which is
database terminology is called NULL. Comparing NULL to NULL is
comparing to completely unknown values. There is no way anyone could
figure out how to compare them, so the result must be NULL as well.

Ultimately, to deal with NULLs, the special keyword IS is used.

As for why the documentation refers to it as a NULL value, it is
because it is differentiation between types of value, not of the values
themselves.

As for the INDEX, INDEXes are not part of the data, as much as they are
part of making the database work. So, implementation of NULL values on
an INDEX is up to the developers, and database implementations are not
consistent.

If you find NULL being more trouble than it is worth, perhaps NULL is
not appropriate for your scenario. Because if you use =, you are saying
you know what the value is. If there is no value, it should be a
zero-length string ('') not a NULL-value.

B.

Jul 10 '06 #8

Serge Rielau

Mark A wrote:

"mike" <_l*****@yahoo.comwrote in message
news:11**********************@75g2000cwc.googlegro ups.com...
>It is because in SQL, NULL is not a value, but the absence of one.
The equality predicate compares two values, not absence of them,
and if either of the operands to the equality predicate are NULL
then the result is unknown.

That sounds like a circular argument to me (as does the other one posted by
Jeroene).

To make a long story short: ANSI SQL
There are ton's of papers on how NULL should or shouldn't behave.

You claim that null is not a value, but the phrase "null value" turns up
frequently, including in DB2 documentation. And there are exceptions, such
as when using the DISTINCT predicate, where "null values are considered
equal. So (COL1 is NOT DISTINCT from COL2) will be true if both columns
contain an equal non-null value and also when both columns are the null
value." [quote from iSeries DB2 Infocenter].

Correct. It's so special they had to talk about it. :-)
There is an alternative predicate in the works for the SQL Standard
which will treat NULL as equal because there are valid arguments for
both sides.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

IOD Conference
http://www.ibm.com/software/data/ond...ness/conf2006/

Jul 10 '06 #9

Mark A

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message at
conditions that it

>
If NULL is added to any equation, we cannot answer any questions,
because since we don't know the value, its truth cannot be discerned.
Being neither TRUE nor FALSE, we must say we don't know, which is
database terminology is called NULL. Comparing NULL to NULL is
comparing to completely unknown values. There is no way anyone could
figure out how to compare them, so the result must be NULL as well.

If COMM on table EMP is null, and COMM on table EMPLOYEE is null (both for
the same EMPNO= '000010', which is the PK), if I say:

IF EMP.COMM = EMPLOYEE.COMM

then I "CAN" compare them and IMO, they should be equal even if both values
are null. I understand that you disagree, but that is your opinion.

Ultimately, to deal with NULLs, the special keyword IS is used.

Not exactly. What I can do is:

IF COALESCE(EMP.COMM,0) = COALESCE(EMPLOYEE.COMM,0)

or

IF ((EMP.COMM = EMPLOYEE.COMM) OR (EMP.COMM IS NULL AND EMPLOYEE.COMM IS
NULL))

It seems to me that those who actually have a need for IF EMP.COMM =
EMPLOYEE.COMM to test false (assuming that both columns are null) are in the
small minority (it would be hard for me to imagine what that would be used
for) and that any special syntax for it test true (as Serge mentioned may be
coming) should have been reserved for the rare exceptions and not the
majority case where most people would expect it to test true.

As for why the documentation refers to it as a NULL value, it is
because it is differentiation between types of value, not of the values
themselves.

This sounds like gobblygook to me. If the column is null, then it contains
null values. The value is null.

Jul 10 '06 #10

Brian Tkatch

Mark A wrote:

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message at
conditions that it

If NULL is added to any equation, we cannot answer any questions,
because since we don't know the value, its truth cannot be discerned.
Being neither TRUE nor FALSE, we must say we don't know, which is
database terminology is called NULL. Comparing NULL to NULL is
comparing to completely unknown values. There is no way anyone could
figure out how to compare them, so the result must be NULL as well.

If COMM on table EMP is null, and COMM on table EMPLOYEE is null (both for
the same EMPNO= '000010', which is the PK), if I say:

IF EMP.COMM = EMPLOYEE.COMM

then I "CAN" compare them and IMO, they should be equal even if both values
are null. I understand that you disagree, but that is your opinion.

Imagine you are an investigator, and you want to find out if the
perpetrator of crime A is also the perpetrator of Crime B. So, you look
at the files, and find that the perpetrator in both cases in unknown.
According to your definition of NULL (AKA unknown) you conclude that
both perpetrators are indeed the same person.

You seem to be making the mistake of thinking that NULL is a value.
NULL is not a value. It is a state of unknown.

Ultimately, to deal with NULLs, the special keyword IS is used.

Not exactly. What I can do is:

IF COALESCE(EMP.COMM,0) = COALESCE(EMPLOYEE.COMM,0)

And what do you think COALESCE() does anyway?

It tests for IS NULL!

or

IF ((EMP.COMM = EMPLOYEE.COMM) OR (EMP.COMM IS NULL AND EMPLOYEE.COMM IS
NULL))

Correct. Each case of IS NULL returns either TRUE or FALSE.

It seems to me that those who actually have a need for IF EMP.COMM =
EMPLOYEE.COMM to test false (assuming that both columns are null) are in the
small minority

Noone wants it to test FALSE. They want it to test NULL.

(it would be hard for me to imagine what that would be used for)

Guess Willy Wanka would be dissapoiinted with you, eh? :)

Imagine i have a database that tracks all my lottery ticket purchases
and winning numbers i can run reports of my accuracy. At the beginning
of the week i purchase a ticket, and put in my chosen numbers, leaving
the field for the winning numbers NULL, because they are unknown. After
the winning number is chosen i UPDATE the field with the winning
numbers to the now known entry.

When FTPing a file, many times the file size is unknown until after the
stream is done. If a database tracked downloads, the size field would
be NULL at first.

When creating a new part at a factory, before a number has been
assigned to the part, the name would be NULL until the paperwork goes
through.

And so on. The cases abound, and happen every day.

>and that any special syntax for it test true (as Serge mentioned may be
coming) should have been reserved for the rare exceptions and not the
majority case where most people would expect it to test true.

I see your assertion that most people want it to test true, and i match
it with the assertion that most people want it to test NULL. And, i
raise the pot with the assertion that if someone wanted it to be true
they should actually be using a zero or a zero-length string.

>
As for why the documentation refers to it as a NULL value, it is
because it is differentiation between types of value, not of the values
themselves.

This sounds like gobblygook to me. If the column is null, then it contains
null values. The value is null.

I'm sorry it sounds like that to you as i believe it to make a lot of
sense. However, you are certainly wlecome to your own opinion, and can
submit it to the ANSI comittee on SQL to force it on others. Either
way, you may use a database anyway you please, as it is your
application. Note though, that DB2 is a database implementation that
adheres to SQL standards (as set by ANSI), and NULL is very much
handled correctly.

B.

Jul 10 '06 #11

Jeroen van den Broek

"Mark A" <no****@nowhere.comschreef in bericht
news:iv******************************@comcast.com. ..

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message at
conditions that it
>>
If NULL is added to any equation, we cannot answer any questions,
because since we don't know the value, its truth cannot be discerned.
Being neither TRUE nor FALSE, we must say we don't know, which is
database terminology is called NULL. Comparing NULL to NULL is
comparing to completely unknown values. There is no way anyone could
figure out how to compare them, so the result must be NULL as well.

If COMM on table EMP is null, and COMM on table EMPLOYEE is null (both for
the same EMPNO= '000010', which is the PK), if I say:

IF EMP.COMM = EMPLOYEE.COMM

then I "CAN" compare them and IMO, they should be equal even if both
values are null. I understand that you disagree, but that is your opinion.

>Ultimately, to deal with NULLs, the special keyword IS is used.

Not exactly. What I can do is:

IF COALESCE(EMP.COMM,0) = COALESCE(EMPLOYEE.COMM,0)

or

IF ((EMP.COMM = EMPLOYEE.COMM) OR (EMP.COMM IS NULL AND EMPLOYEE.COMM IS
NULL))

It seems to me that those who actually have a need for IF EMP.COMM =
EMPLOYEE.COMM to test false (assuming that both columns are null) are in
the small minority (it would be hard for me to imagine what that would be
used for) and that any special syntax for it test true (as Serge mentioned
may be coming) should have been reserved for the rare exceptions and not
the majority case where most people would expect it to test true.

>As for why the documentation refers to it as a NULL value, it is
because it is differentiation between types of value, not of the values
themselves.

This sounds like gobblygook to me. If the column is null, then it contains
null values. The value is null.

Mark,
please read this blog (the entry for April 5th 2006).
It is not specific for DB2 - in fact it uses SQL Server syntax -, but it
pretty much explains the pitfalls when using NULL's (watch the examples for
AVG and SUM, for instance).
When using NULL's, you must use 3-valued logic, not the ordinary TRUE/FALSE
binary logic.
Also have a look here:
http://www.reviewnet.net/newsletter/...s/a3_feb02.htm
which handles the same 'problem' area, from an Oracle perspective.

--
Jeroen

Jul 10 '06 #12

Mark A

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message >

Imagine you are an investigator, and you want to find out if the
perpetrator of crime A is also the perpetrator of Crime B. So, you look
at the files, and find that the perpetrator in both cases in unknown.
According to your definition of NULL (AKA unknown) you conclude that
both perpetrators are indeed the same person.

You seem to be making the mistake of thinking that NULL is a value.
NULL is not a value. It is a state of unknown.

So, if I run the following query to find out how many crimes each
perpetrator has committed, this is what I get (for simplicity, assume there
is only one perpetrator per criminal case):

SELECT PERPETRATOR, COUNT(*) AS CRIMES
FROM CRIMINAL_CASE
GROUP BY PERPETRATOR

PERPETRATOR CRIMES
----------------------- -----------

ANDERSON, JOHN 1
ANSI BOARD 348
JONES, BOB 3
SMITH, SAM 2
- 20

Now from the above report, it shows that for 20 crimes, no perpetrator has
been identified. It does not mean that the same person committed all 20
unsolved crimes (only an imbecile would conclude that).

So the ANSI implementation of COL1 = COL2 (if both all NULL) is inconsistent
with the ANSI specification of a GROUP BY query. The GROUP BY considers them
to be the same, but the "equal" test considers them to be different.

>
No one wants it to test FALSE. They want it to test NULL.

Not so. If I have an "IF" statement in a procedure comparing the two
variables, then I want to test for True/False, not for NULL.

I'm sorry it sounds like that to you as i believe it to make a lot of
sense. However, you are certainly wlecome to your own opinion, and can
submit it to the ANSI comittee on SQL to force it on others. Either
way, you may use a database anyway you please, as it is your
application. Note though, that DB2 is a database implementation that
adheres to SQL standards (as set by ANSI), and NULL is very much
handled correctly.

B.

Yes, it is clear that DB2 follows the ANSI standard, but it is not clear
that the ANSI standard is consistent (as noted above) or is the preferred
interpretation, and there are several databases that apparently use
different interpretations (unless you set them to be ANSI compliant). As
noted by Serge there are valid arguments on both sides, and DB2 is looking
at an alternative solution via SQL syntax.

Jul 10 '06 #13

wombat53

A NULL value is not equal to anything, period (including another NULL
value), in the the three valued logic of IBM's implementation of SQL.
Hence the predicate syntax "IS/IS NOT NULL", as opposed to " = NULL".
Since its value is not known, one cannot say that it is equal (or not
equal) to something else. It is unknown. It may be known at a later
time, or it may be forever unknowable (i.e. inapplicable; which Dr.
Codd later differentiated as two different types of NULL values....) .

There are some anomalies associated with IBM's implementation of NULL
values over the years, and I can't get into them here. One example,
hoever, is that in a unique index, if you have two NULL values, they
are duplicates!!! So, in that case they sort the same (i.e. equal).
However I think IBM implemented a workaround to this, which I cannot
quote off the top of my head. I am sure you can find it in the doco.
wombat53
Mark A wrote:

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message >
Imagine you are an investigator, and you want to find out if the
perpetrator of crime A is also the perpetrator of Crime B. So, you look
at the files, and find that the perpetrator in both cases in unknown.
According to your definition of NULL (AKA unknown) you conclude that
both perpetrators are indeed the same person.

You seem to be making the mistake of thinking that NULL is a value.
NULL is not a value. It is a state of unknown.

So, if I run the following query to find out how many crimes each
perpetrator has committed, this is what I get (for simplicity, assume there
is only one perpetrator per criminal case):

SELECT PERPETRATOR, COUNT(*) AS CRIMES
FROM CRIMINAL_CASE
GROUP BY PERPETRATOR

PERPETRATOR CRIMES
----------------------- -----------

ANDERSON, JOHN 1
ANSI BOARD 348
JONES, BOB 3
SMITH, SAM 2
- 20

Now from the above report, it shows that for 20 crimes, no perpetrator has
been identified. It does not mean that the same person committed all 20
unsolved crimes (only an imbecile would conclude that).

So the ANSI implementation of COL1 = COL2 (if both all NULL) is inconsistent
with the ANSI specification of a GROUP BY query. The GROUP BY considers them
to be the same, but the "equal" test considers them to be different.

No one wants it to test FALSE. They want it to test NULL.

Not so. If I have an "IF" statement in a procedure comparing the two
variables, then I want to test for True/False, not for NULL.

I'm sorry it sounds like that to you as i believe it to make a lot of
sense. However, you are certainly wlecome to your own opinion, and can
submit it to the ANSI comittee on SQL to force it on others. Either
way, you may use a database anyway you please, as it is your
application. Note though, that DB2 is a database implementation that
adheres to SQL standards (as set by ANSI), and NULL is very much
handled correctly.

B.

Yes, it is clear that DB2 follows the ANSI standard, but it is not clear
that the ANSI standard is consistent (as noted above) or is the preferred
interpretation, and there are several databases that apparently use
different interpretations (unless you set them to be ANSI compliant). As
noted by Serge there are valid arguments on both sides, and DB2 is looking
at an alternative solution via SQL syntax.

Jul 11 '06 #14

Brian Tkatch

Mark A wrote:

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message >
Imagine you are an investigator, and you want to find out if the
perpetrator of crime A is also the perpetrator of Crime B. So, you look
at the files, and find that the perpetrator in both cases in unknown.
According to your definition of NULL (AKA unknown) you conclude that
both perpetrators are indeed the same person.

You seem to be making the mistake of thinking that NULL is a value.
NULL is not a value. It is a state of unknown.

So, if I run the following query to find out how many crimes each
perpetrator has committed, this is what I get (for simplicity, assume there
is only one perpetrator per criminal case):

SELECT PERPETRATOR, COUNT(*) AS CRIMES
FROM CRIMINAL_CASE
GROUP BY PERPETRATOR

PERPETRATOR CRIMES
----------------------- -----------

ANDERSON, JOHN 1
ANSI BOARD 348
JONES, BOB 3
SMITH, SAM 2
- 20

Now from the above report, it shows that for 20 crimes, no perpetrator has
been identified. It does not mean that the same person committed all 20
unsolved crimes (only an imbecile would conclude that).

So the ANSI implementation of COL1 = COL2 (if both all NULL) is inconsistent
with the ANSI specification of a GROUP BY query. The GROUP BY considers them
to be the same, but the "equal" test considers them to be different.

No, GROUP BY does not consider them to be equal. GROUP BY splits a
TABLE into separate groups for use with aggregate FUNCTIONs. This
includes unknown values all put in one group.

Aggregate FUNCTIONs, however, only deal with known values, and as such
ignore NULL. COUNT(*), however, deals with records, not fields, and as
such even if an individual field IS NULL, the record itself is still
counted.

No one wants it to test FALSE. They want it to test NULL.

Not so. If I have an "IF" statement in a procedure comparing the two
variables, then I want to test for True/False, not for NULL.

I want to ask you a question, because it is my contention that you
haven't the slightest idea when to use NULL, and when to use a zero or
a zero-length string instead.

Why are there NULL fields in you database? That is, why are some fields
NULL, instead of using a zero-length string for character data or zero
for numeric data?

I'm sorry it sounds like that to you as i believe it to make a lot of
sense. However, you are certainly wlecome to your own opinion, and can
submit it to the ANSI comittee on SQL to force it on others. Either
way, you may use a database anyway you please, as it is your
application. Note though, that DB2 is a database implementation that
adheres to SQL standards (as set by ANSI), and NULL is very much
handled correctly.

B.

Yes, it is clear that DB2 follows the ANSI standard, but it is not clear
that the ANSI standard is consistent (as noted above)

As above is not a challenge to its consistency. Please provide a real
problem with it before challenging its consistency.

>or is the preferred interpretation

It *is* the preferred interpretation. That is why basically every
database implementation out there uses that with NULL. Only
database-esque programs that use SQL for it's concise syntax, but not
for the defintion of data would compare NULLs otherwise. MySQL being a
good example of that genre.

>, and there are several databases that apparently use
different interpretations (unless you set them to be ANSI compliant).

I'm calling your bluff. I want the names of databases that do that.
Perhaps they are out there, but i don't know of them (yet).

As noted by Serge there are valid arguments on both sides, and DB2 is looking
at an alternative solution via SQL syntax.

Basically, that would be adding another opeator that handles NULLs
otherwise. You could add a FUNCTION yourself that does that.

B.

Jul 11 '06 #15

Mark A

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message

No, GROUP BY does not consider them to be equal. GROUP BY splits a
TABLE into separate groups for use with aggregate FUNCTIONs. This
includes unknown values all put in one group.

Aggregate FUNCTIONs, however, only deal with known values, and as such
ignore NULL. COUNT(*), however, deals with records, not fields, and as
such even if an individual field IS NULL, the record itself is still
counted.

That is a gigantic cop-out.

The GROUP BY puts them together because it assumes them to be the same for
the purposes of the query. That obviously does not mean that the perpetrator
for all the cases with no perpetrator were committed by the same
perpetrator.

I want to ask you a question, because it is my contention that you
haven't the slightest idea when to use NULL, and when to use a zero or
a zero-length string instead.

Why are there NULL fields in you database? That is, why are some fields
NULL, instead of using a zero-length string for character data or zero
for numeric data?

You are way off-base buddy. I know when and when not to use NULL. The
original question referred to variables in DB2 SQL/PL and when one is
determining if two variables with the exact same "values" in the specified
are equal. So the question does not directly relate to database design, it
relates to programming variables. It just so happens that in SQL/PL,
variables are SQL data types, but this is not necessarily true in other
languages.

As above is not a challenge to its consistency. Please provide a real
problem with it before challenging its consistency.

>>or is the preferred interpretation

It *is* the preferred interpretation. That is why basically every
database implementation out there uses that with NULL. Only
database-esque programs that use SQL for it's concise syntax, but not
for the defintion of data would compare NULLs otherwise. MySQL being a
good example of that genre.

>>, and there are several databases that apparently use
different interpretations (unless you set them to be ANSI compliant).

I'm calling your bluff. I want the names of databases that do that.
Perhaps they are out there, but i don't know of them (yet).

I don't claim to be an expert in SQL Server (I perssume that he means MS SQL
Server, but nor sure), or Sybase, but my statement was based on the
following quote. I don't know if it still applies to the most current
releases of these products:

"...in Oracle, Null is neither equal to nor not equal to Null. SQL Server,
by default, does not do it that way: in SQL Server and Sybase, Null is equal
to Null. Neither Oracle's, Sybase nor SQL Server's SQL processing is wrong -
they are just different. Both databases are in fact ANSI compliant databases
but they still work differently. There are ambiguities, backward
compatibility issues, and so on, to be overcome. For example, SQL Server
supports the ANSI method of Null comparison, just not by default (it would
break thousands of existing legacy applications built on that database)."

http://www.apress.com/betabooks/bb_c..._ch01_BETA.pdf

Jul 12 '06 #16

Brian Tkatch

Mark A wrote:

"Brian Tkatch" <Ma***********@ThePentagon.comwrote in message
No, GROUP BY does not consider them to be equal. GROUP BY splits a
TABLE into separate groups for use with aggregate FUNCTIONs. This
includes unknown values all put in one group.

Aggregate FUNCTIONs, however, only deal with known values, and as such
ignore NULL. COUNT(*), however, deals with records, not fields, and as
such even if an individual field IS NULL, the record itself is still
counted.

That is a gigantic cop-out.

The GROUP BY puts them together because it assumes them to be the same for
the purposes of the query. That obviously does not mean that the perpetrator
for all the cases with no perpetrator were committed by the same
perpetrator.

And let's say it didn't consider them the same. Should it then return
one record for each NULL value? If NULL is unknown, it can neither say
that they are the same, or that they are different. So, it just doesn't
do anything.

By default, the entire TABLE is one group. GROUP BY separates based on
values. In a sense, it leave NULLs alone, but because all values get
their own group, the NULLs are left alone, and together.

I want to ask you a question, because it is my contention that you
haven't the slightest idea when to use NULL, and when to use a zero or
a zero-length string instead.

Why are there NULL fields in you database? That is, why are some fields
NULL, instead of using a zero-length string for character data or zero
for numeric data?

You are way off-base buddy. I know when and when not to use NULL.

I challenge that assertion. Because your questions on using them for
equality show a basic ignorance of its nature.

The original question referred to variables in DB2 SQL/PL and when one is
determining if two variables with the exact same "values" in the specified
are equal. So the question does not directly relate to database design, it
relates to programming variables. It just so happens that in SQL/PL,
variables are SQL data types, but this is not necessarily true in other
languages.

SQL/PL processes data. Data is defined by the database. Therefore,
SQL/PL must itself be designed specifically to database requirements.

Database requirements have a specific nature of NULL. SQL/PL must
follow through and support that definition of NULL.

So, if SQL/PL was to change the way of handling NULLs, it would first
need to be changed in the database.

That is why i interpret your question on handling NULLs in SQL/PL as a
question on the database itself.

As above is not a challenge to its consistency. Please provide a real
problem with it before challenging its consistency.

>or is the preferred interpretation
It *is* the preferred interpretation. That is why basically every
database implementation out there uses that with NULL. Only
database-esque programs that use SQL for it's concise syntax, but not
for the defintion of data would compare NULLs otherwise. MySQL being a
good example of that genre.

>, and there are several databases that apparently use
different interpretations (unless you set them to be ANSI compliant).
I'm calling your bluff. I want the names of databases that do that.
Perhaps they are out there, but i don't know of them (yet).

I don't claim to be an expert in SQL Server (I perssume that he means MS SQL
Server, but nor sure), or Sybase, but my statement was based on the
following quote. I don't know if it still applies to the most current
releases of these products:

"...in Oracle, Null is neither equal to nor not equal to Null. SQL Server,
by default, does not do it that way: in SQL Server and Sybase, Null is equal
to Null. Neither Oracle's, Sybase nor SQL Server's SQL processing is wrong -
they are just different. Both databases are in fact ANSI compliant databases
but they still work differently. There are ambiguities, backward
compatibility issues, and so on, to be overcome. For example, SQL Server
supports the ANSI method of Null comparison, just not by default (it would
break thousands of existing legacy applications built on that database)."

http://www.apress.com/betabooks/bb_c..._ch01_BETA.pdf

OK, i just checked a Sybase implementation (which mostly SQL Server
acts the same way, being the same base) and indeed it supports the
equality operator with NULL.

So, that is one (two) major implementations that do it.

(I would just like to add, that i despise SQL Server (and Sybase). They
break nearly every rule. (And when i am asked for help, i usually try
things i know are not valid SQL to see if they work, because there's no
telling what it'll actually do). If only because they allow SELECT
statements without a FROM clause, it would be enough to make anyone
question it calling itself a database. Unfortunately, most people do
consider it a database, so i must concede the point.)

B.

Jul 12 '06 #17

--CELKO--

>Yes, it is clear that DB2 follows the ANSI standard, but it is not clear that the ANSI standard is consistent (as noted above) or is the preferred interpretation, and there are several databases that apparently use different interpretations (unless you set them to be ANSI compliant).<<

GROUPING is not the same as testing for equality. Grouping is done
with groups and equality is for scalars. We debated this in the old
ANSI X3H2 committee decades ago when it was still an issue.

If you want to see if some columns are all NULL, use "COALESCE (c1, c2,
,,cn) IS NULL" on your scalars.

Jul 13 '06 #18

Variables in SP do not compare as equal when both are NULL

Similar topics