Bryan Zash (bz***@hotmail.com) writes:
For example, my customer table has 1 million unique records, so the
results of the following query are as such:
select count(customer_nbr) from customer = 1,000,000
There is bit field in the customer table that denotes whether a
customer has placed an order with us called. That flag is called
order_flag
If I run the following query:
select count(customer_nbr) from customer where order_flag = 1
The result is 3,000,000 records.
There is no logical way that this is possible, as my table only
contains 1,000,000 unique records and the number of customers with an
order should be a subset of this.
If a run the above query with a distinct before customer number, I get
the results I want:
select count(distinct customer_nbr) from customer where order_flag = 1
600,000 records.
So while I can get to the answer I want, I have no idea why I am
returning incorrect values if I don't select distinct.
Something does not seem quite right here, but let's first observe a few
things:
CREATE TABLE #h (a int NULL, b bit NOT NULL)
go
INSERT #h (a, b) VALUES (12, 1)
INSERT #h (a, b) VALUES (12, 1)
INSERT #h (a, b) VALUES (112, 1)
INSERT #h (a, b) VALUES (NULL, 1)
INSERT #h (a, b) VALUES (15, 1)
INSERT #h (a, b) VALUES (12, 0)
INSERT #h (a, b) VALUES (12, 0)
INSERT #h (a, b) VALUES (112, 0)
INSERT #h (a, b) VALUES (NULL, 0)
INSERT #h (a, b) VALUES (15, 0)
go
CREATE INDEX bitix ON #h(b)
go
SELECT COUNT(*) FROM #h -- 10
SELECT COUNT(a) FROM #h -- 8
SELECT COUNT(DISTINCT a) FROM #h -- 3
SELECT COUNT(*) FROM #h WHERE b = 1 -- 5
SELECT COUNT(a) FROM #h WHERE b = 1 -- 4
SELECT COUNT(DISTINCT a) FROM #h WHERE b = 1 -- 3
go
DROP TABLE #h
COUNT(*) counts all rows. COUNT(a) only counts rows where A has a non-NULL
value. COUNT(DISTINCT a) counts the number of distinct values. I'm a
little uncertain from your post whether you are aware of this, which
is why I brought it up.
But of course, SELECT COUNT(a) should give a higher number if you
apply a filter, so it does smell like a bug here. Unfortunately, though,
there is not much I can do without a reproducible case. And that
may be difficult to achieve. I have a vague recollection of that I've
might have seen something like this before. First thing is to check
SELECT @@version and see what it says. If says something lower than
8.00.760 you don't have SP3, and you should apply that service pack.
If you already have SP3, and need to pursue the issue more in detail,
your best bet may be to open a case with Microsoft, since it does not
seem likely that we will be able to resolve over the newsgroups. Note
that if it is indeed a bug in SQL Server, any expenses you have for
the case should be refunded.
Of course, if you are able to reproduce the problem with a much smaller
data set, so that you could post a script, then I'm interested.
--
Erland Sommarskog, SQL Server MVP,
es****@sommarskog.se
Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp