Connecting Tech Pros Worldwide Forums | Help | Site Map

Subquery with count problem

Newbie
 
Join Date: Nov 2008
Location: Stavanger, Norway
Posts: 2
#1: Nov 20 '08
Hi

I have a table KDOCUMENT with the columns
[DocID],
[Title],
[CreatedByUser],
[CreateDate],
[LastModifiedByUserName],
[LastModificationDate],
[Comment]

On this table I want to find the test documents so that they can be deleted, hence I have created the query to find documents with the words "test" and "delete" in [Title] and [Comment] looking like this:
SELECT [DocID], [Title],[CreatedByUser], [CreateDate],[LastModifiedByUserName], [LastModificationDate], [Comment]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [Title] LIKE '%test%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%delete%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0.
This one is working fine returning 5070 rows with documents containing test and delete.

Problem is that we only want to list the documents belonging to users that have X documents or more (for instance 20), because we have to contact the people and ask them which documents actually are test documents. If someone only have 1 document we don't want to bother them, it is more important to clean up the "real heavy" user's documents.
So, I created a query

SELECT COUNT([CreatedByUser]) AS 'Number of docs', [CreatedByUser]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [Title] LIKE '%test%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%delete%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0
GROUP BY [CreatedByUser]
HAVING COUNT([CreatedByUser]) > 20
ORDER BY COUNT([CreatedByUser]) DESC

returning correctly the number of documents for each user and the user name. It also limits itself to listing only users with a count of 20 or more.

Now, I want to combine these two, listing all documents with DocID and everything where a user has 20 documents or more. But when I do this:

SELECT [DocID], [Title],[CreatedByUser], [CreateDate], [LastModifiedByUserName], [LastModificationDate], [Comment]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [Title] LIKE '%delete%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%test%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0
AND [CreatedByUser] IN (
SELECT [CreatedByUser]
FROM [DMSDB].[dbo].[KDOCUMENT]
GROUP BY [CreatedByUser]
HAVING COUNT([CreatedByUser]) > 20)

ORDER BY [CreatedByUser]

I lists all the documents as in the first query, returning 5070 rows while it should return about 4715 (I know this by adding up the numbers in the count query).

It has been a long time since I've been working with SQL, and I am struggling a bit here. Can someone help me? I get the same result using EXISTS, and when using = I of course get the error that it is returning more than one row.

ck9663's Avatar
Expert
 
Join Date: Jun 2007
Posts: 1,925
#2: Nov 21 '08

re: Subquery with count problem


Quote:

Originally Posted by Hasse1982

Hi

I have a table KDOCUMENT with the columns
[DocID],
[Title],
[CreatedByUser],
[CreateDate],
[LastModifiedByUserName],
[LastModificationDate],
[Comment]

On this table I want to find the test documents so that they can be deleted, hence I have created the query to find documents with the words "test" and "delete" in [Title] and [Comment] looking like this:
SELECT [DocID], [Title],[CreatedByUser], [CreateDate],[LastModifiedByUserName], [LastModificationDate], [Comment]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [Title] LIKE '%test%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%delete%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0.
This one is working fine returning 5070 rows with documents containing test and delete.

Problem is that we only want to list the documents belonging to users that have X documents or more (for instance 20), because we have to contact the people and ask them which documents actually are test documents. If someone only have 1 document we don't want to bother them, it is more important to clean up the "real heavy" user's documents.
So, I created a query

SELECT COUNT([CreatedByUser]) AS 'Number of docs', [CreatedByUser]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [Title] LIKE '%test%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%delete%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0
GROUP BY [CreatedByUser]
HAVING COUNT([CreatedByUser]) > 20
ORDER BY COUNT([CreatedByUser]) DESC

returning correctly the number of documents for each user and the user name. It also limits itself to listing only users with a count of 20 or more.

Now, I want to combine these two, listing all documents with DocID and everything where a user has 20 documents or more. But when I do this:

SELECT [DocID], [Title],[CreatedByUser], [CreateDate], [LastModifiedByUserName], [LastModificationDate], [Comment]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [Title] LIKE '%delete%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%test%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0
AND [CreatedByUser] IN (
SELECT [CreatedByUser]
FROM [DMSDB].[dbo].[KDOCUMENT]
GROUP BY [CreatedByUser]
HAVING COUNT([CreatedByUser]) > 20)

ORDER BY [CreatedByUser]

I lists all the documents as in the first query, returning 5070 rows while it should return about 4715 (I know this by adding up the numbers in the count query).

It has been a long time since I've been working with SQL, and I am struggling a bit here. Can someone help me? I get the same result using EXISTS, and when using = I of course get the error that it is returning more than one row.

Use parenthesis to group your conditions
Expand|Select|Wrap|Line Numbers
  1.  
  2. WHERE ([Title] LIKE '%delete%'                                                
  3.     OR [Comment] LIKE '%test%'                                                
  4.     OR [Title] LIKE '%test%'                                                
  5.     OR [Comment] LIKE '%delete%'                                            
  6.     OR [OnlineVersionsSize] = 0)
  7.    AND [CreatedByUser] IN (
  8.                SELECT [CreatedByUser]
  9.                              FROM [DMSDB].[dbo].[KDOCUMENT]
  10.                             GROUP BY [CreatedByUser]
  11.                            HAVING COUNT([CreatedByUser]) > 20)
  12.  
  13.  
Conditions are evaluated left to right. If one of those ORs is true, your whole condition is true.

Similar to math formula:

2 + 3 * 5 = 25

2 + (3 * 5) = 17

Happy coding!

-- CK
Newbie
 
Join Date: Nov 2008
Location: Stavanger, Norway
Posts: 2
#3: Nov 21 '08

re: Subquery with count problem


Quote:

Originally Posted by ck9663

Use parenthesis to group your conditions

Expand|Select|Wrap|Line Numbers
  1.  
  2. WHERE ([Title] LIKE '%delete%'                                                
  3.     OR [Comment] LIKE '%test%'                                                
  4.     OR [Title] LIKE '%test%'                                                
  5.     OR [Comment] LIKE '%delete%'                                            
  6.     OR [OnlineVersionsSize] = 0)
  7.    AND [CreatedByUser] IN (
  8.                SELECT [CreatedByUser]
  9.                              FROM [DMSDB].[dbo].[KDOCUMENT]
  10.                             GROUP BY [CreatedByUser]
  11.                            HAVING COUNT([CreatedByUser]) > 20)
  12.  
  13.  
Conditions are evaluated left to right. If one of those ORs is true, your whole condition is true.

Similar to math formula:

2 + 3 * 5 = 25

2 + (3 * 5) = 17

Happy coding!

-- CK

Wow, thank you!

It works like a charm, and the best part is that I was not completely off, it was just some paranthesis missing.
I now just discovered one thing you might be able to help me with:
This query: SELECT [CreatedByUser]
FROM [DMSDB].[dbo].[KDOCUMENT]
GROUP BY [CreatedByUser]
HAVING COUNT([CreatedByUser]) > 20)
counts ALL users in the entire table, but I only want it to count the ones with the criteria listed in the first query (Title and Comment equals "test" and/or "delete". When I try to put:

WHERE ([Title] LIKE '%delete%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%test%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0

into the second query like this:

SELECT [DocID], [Title],[CreatedByUser], [CreateDate], [LastModifiedByUserName], [LastModificationDate], [Comment], [OnlineVersionsSize]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE [CreatedByUser] IN (
SELECT [CreatedByUser]
FROM [DMSDB].[dbo].[KDOCUMENT]
WHERE ([Title] LIKE '%delete%'
OR [Comment] LIKE '%test%'
OR [Title] LIKE '%test%'
OR [Comment] LIKE '%delete%'
OR [OnlineVersionsSize] = 0)
GROUP BY [CreatedByUser]
HAVING COUNT([CreatedByUser]) > 20)

it shows 0 results. Any ideas?
There might be another way to do this, so I am very open for suggestions :)
ck9663's Avatar
Expert
 
Join Date: Jun 2007
Posts: 1,925
#4: Nov 21 '08

re: Subquery with count problem


Do you have NULL in CreatedByUser ?

try :

Expand|Select|Wrap|Line Numbers
  1.  
  2. SELECT DocID, Title,CreatedByUser, CreateDate, LastModifiedByUserName, 
  3.    LastModificationDate, Comment, OnlineVersionsSize
  4. FROM DMSDB.dbo.KDOCUMENT
  5. WHERE CreatedByUser IN 
  6.    (
  7.       SELECT CreatedByUser
  8.       FROM DMSDB.dbo.KDOCUMENT
  9.       WHERE (Title LIKE '%delete%' OR Comment LIKE '%test%' OR Title LIKE '%test%' OR Comment LIKE '%delete%' OR OnlineVersionsSize = 0)
  10.          and CreatedByUser IS NOT NULL
  11.       GROUP BY CreatedByUser
  12.       HAVING COUNT(*) > 20
  13.    )
  14. WHERE CreatedByUser IS NOT NULL
  15.  
  16.  

You should be careful with IN. Read more about it here

-- CK
Reply


Similar Microsoft SQL Server bytes