how to group/list top 3 of each category w/o using Union?

Hello,

So my table contains say 100,000 records, and I need to group the
categories in fld1 by the highest count of subcategories. Say fld1
contains categories A, B, C, D, E.

All of these categories contain subcategories AA, AB, AC, AD,...AJ, BA,
BB...BJ, CA, CB, CC...CJ, etc in fld2.

I am counting how many subcategories are listed for each category. Like
A may contain 5 of AA, 7 of AB, 3 of AC, 11 of AD...1 for the rest and
20 of AJ. B may contain 2 of BA, 11 of BB, 7 of BC, and 1 for the rest.
I want to pick up the top 3 subcategory counts for each category. Would
look like this:

Cat SubCat Count
A AJ 20
A AD 11
A AB 7
B BB 11
B BC 7
B BA 2

So even though each category contains 10 subcategories, I only want to
list the top 3 subcategories with the highest counts as above. If I just
do a group by and sort I can get this:

Cat SubCat Count
A ... ...
A
A
A
A
A
A
...
B ... ...
B
B
B
B
B
...

But I just want the top 3 of each category. The only way I can think of
to do this is to query each category individually and Select Top 3, and
then Union these guys into one query. The problem is that I have to
hardcode each category in the Union query. There may be new categories
that I miss. Is there a way to achieve what I want without using Union?

Thanks,
Rich

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #1
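
For illustration, the hardcoded approach described in the question might look like the sketch below. It is written in Python against the built-in sqlite3 module instead of SQL Server so that it can be run anywhere; the table name counts, its columns, the sample rows, and the use of LIMIT 3 in place of TOP 3 are assumptions made for the example.

```python
import sqlite3

# Hypothetical pre-aggregated data: (category, subcategory, count).
rows = [
    ("A", "AJ", 20), ("A", "AD", 11), ("A", "AB", 7), ("A", "AA", 5),
    ("B", "BB", 11), ("B", "BC", 7), ("B", "BA", 2), ("B", "BD", 1),
    ("C", "CA", 9), ("C", "CB", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counts (cat TEXT, subcat TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO counts VALUES (?, ?, ?)", rows)

# One hardcoded branch per category; LIMIT 3 plays the role of TOP 3.
union_sql = """
SELECT * FROM (SELECT cat, subcat, cnt FROM counts
               WHERE cat = 'A' ORDER BY cnt DESC LIMIT 3)
UNION ALL
SELECT * FROM (SELECT cat, subcat, cnt FROM counts
               WHERE cat = 'B' ORDER BY cnt DESC LIMIT 3)
"""
union_result = conn.execute(union_sql).fetchall()
for row in union_result:
    print(row)
```

Category C is in the table but never appears in the result, because no branch was written for it. That is the maintenance problem the question wants to avoid.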
[posted and mailed, please reply in news]

Here is a pragmatic solution:

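CREATE TABLE #temp (id   int IDENTITY,
                    fld1 char(2) NOT NULL,
                    fld2 int NOT NULL,
                    cnt  int NOT NULL)

-- Load the counts ordered by category, then by count descending, so that
-- each category's rows get consecutive identity values, largest count first.
INSERT #temp (fld1, fld2, cnt)
   SELECT xtype, id % 10, COUNT(*)
   FROM   sysobjects
   GROUP  BY xtype, id % 10
   ORDER  BY 1, 3 DESC
   OPTION (MAXDOP 1)

-- Keep the first three ids of each category. Matching on fld1 as well stops
-- a category with fewer than three rows from picking up its neighbour's rows.
SELECT DISTINCT t.*
FROM   #temp t
JOIN   (SELECT fld1, id = MIN(id)
        FROM   #temp
        GROUP  BY fld1) AS s ON t.fld1 = s.fld1
                            AND t.id BETWEEN s.id AND s.id + 2
ORDER  BY t.fld1, t.cnt DESC
go
drop table #temp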

Here I have replaced your table with a funky categorization of a system
table. (Hint: post CREATE TABLE statements for your tables, INSERT
statements with sample data, and the desired output, and you get a
tested solution that fits your situation.)

The solution is not fool-proof. There is not really any guarantee that
the identity values are assigned in the order specified in the SELECT
statement. But it usually works, particularly with the query hint that
turns off parallelism.
--
Erland Sommarskog, SQL Server MVP, so****@algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
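
The id-range idea can also be sketched with the row numbers assigned explicitly, which sidesteps the caveat about identity order. This is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server; the table temp_counts, the column first_id, and the sample rows are hypothetical. The join also matches on category, so a category with fewer than three rows (B here) does not pull in rows from the next one.

```python
import sqlite3

# Hypothetical pre-aggregated data: (category, subcategory, count).
rows = [
    ("A", "AA", 5), ("A", "AB", 7), ("A", "AC", 3), ("A", "AD", 11),
    ("B", "BA", 2), ("B", "BB", 11),  # B has only two rows
    ("C", "CA", 9), ("C", "CB", 4), ("C", "CC", 6), ("C", "CD", 1),
]

# Sort by category, then count descending, and number the rows ourselves,
# so the ids do not depend on how the engine assigns identity values.
ordered = sorted(rows, key=lambda r: (r[0], -r[2], r[1]))
numbered = [(i, *r) for i, r in enumerate(ordered, start=1)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_counts "
    "(id INTEGER PRIMARY KEY, cat TEXT, subcat TEXT, cnt INTEGER)"
)
conn.executemany("INSERT INTO temp_counts VALUES (?, ?, ?, ?)", numbered)

# For each category, keep the rows whose id lies within the first three ids.
top3_sql = """
SELECT t.cat, t.subcat, t.cnt
FROM temp_counts AS t
JOIN (SELECT cat, MIN(id) AS first_id
      FROM temp_counts
      GROUP BY cat) AS s
  ON t.cat = s.cat
 AND t.id BETWEEN s.first_id AND s.first_id + 2
ORDER BY t.cat, t.cnt DESC
"""
top3 = conn.execute(top3_sql).fetchall()
for row in top3:
    print(row)
```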
Jul 20 '05 #2
Here's a UDF that will return the top N for each category where
N is provided as an argument to the function.

CREATE TABLE CategoryCounts
(
   category    CHAR(1) NOT NULL,
   subcategory CHAR(2) NOT NULL PRIMARY KEY,
   cnt         INT NOT NULL
)

-- Sample data
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('A', 'AA', 5)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('A', 'AB', 7)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('A', 'AC', 3)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('A', 'AD', 11)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('A', 'AE', 1)

INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('B', 'BA', 2)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('B', 'BB', 11)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('B', 'BC', 7)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('B', 'BD', 1)
INSERT INTO CategoryCounts (category, subcategory, cnt)
VALUES ('B', 'BE', 1)

GO

-- CREATE FUNCTION must be the only statement in its batch.
CREATE FUNCTION TopNFromEachCategory
   (@n INT)
RETURNS TABLE
AS
RETURN(
   -- Keep a row when at most @n distinct counts in its category
   -- are greater than or equal to its own count.
   SELECT C1.category, C1.subcategory, C1.cnt
   FROM   CategoryCounts AS C1
   WHERE  @n >= (SELECT COUNT(DISTINCT C2.cnt)
                 FROM   CategoryCounts AS C2
                 WHERE  C1.category = C2.category AND
                        C2.cnt >= C1.cnt)
)
GO

SELECT *
FROM   TopNFromEachCategory(3)
ORDER  BY category ASC, cnt DESC, subcategory ASC

category subcategory cnt
A        AD          11
A        AB          7
A        AA          5
B        BB          11
B        BC          7
B        BA          2

Regards,
jag
Jul 20 '05 #3
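
One property of the COUNT(DISTINCT ...) filter is worth spelling out: it ranks by distinct count values, so subcategories that tie on cnt are all kept and a category can return more than N rows. The sketch below shows this next to a tie-broken variant that returns at most N rows. It uses Python's built-in sqlite3 module rather than SQL Server, and the table counts, the sample rows, and the tie-break on subcategory name are assumptions made for the example.

```python
import sqlite3

# Hypothetical data with a tie: AB and AC both have a count of 7.
rows = [("A", "AA", 5), ("A", "AB", 7), ("A", "AC", 7),
        ("A", "AD", 11), ("A", "AE", 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counts (cat TEXT, subcat TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO counts VALUES (?, ?, ?)", rows)

# Same filter as the UDF: rank by the number of distinct counts >= this one.
dense_sql = """
SELECT c1.cat, c1.subcat, c1.cnt
FROM counts AS c1
WHERE ? >= (SELECT COUNT(DISTINCT c2.cnt)
            FROM counts AS c2
            WHERE c2.cat = c1.cat AND c2.cnt >= c1.cnt)
ORDER BY c1.cat, c1.cnt DESC, c1.subcat
"""

# Variant: count the rows strictly ahead of this one, breaking ties by name.
strict_sql = """
SELECT c1.cat, c1.subcat, c1.cnt
FROM counts AS c1
WHERE ? > (SELECT COUNT(*)
           FROM counts AS c2
           WHERE c2.cat = c1.cat
             AND (c2.cnt > c1.cnt
                  OR (c2.cnt = c1.cnt AND c2.subcat < c1.subcat)))
ORDER BY c1.cat, c1.cnt DESC, c1.subcat
"""

dense = conn.execute(dense_sql, (3,)).fetchall()
strict = conn.execute(strict_sql, (3,)).fetchall()
print(dense)
print(strict)
```

With these rows the first query returns four subcategories (AD, AB, AC, AA) and the second returns three (AD, AB, AC). Which behaviour is wanted depends on how ties should be treated.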
Just a quick update on what I tried with the given solutions:

I created a static temp table which included an ID field, a Category
field, subCategory, and cnt. Run in Query Analyzer against my actual
data, the first select statement here produced the desired results:
only the top 3 subcategories (with the highest cnt's) for each
category. Note that I loaded the temp table ordered by Category ASC
and cnt DESC.

SELECT DISTINCT t.*
FROM   Temp t
JOIN   (SELECT Category, id = MIN(id)
        FROM   Temp
        GROUP  BY Category) AS s ON t.id BETWEEN s.id AND s.id + 2
ORDER  BY t.Category, t.cnt DESC

I also created a temp table without the ID field but all the rest of the
fields as above. The following select statement did not limit the
subcategories to the top 3. It did remove a few of the subcategories,
but removed the subcategories with the highest count. Maybe I missed
something or maybe it needs an ID field. I would be interested to know
what it would take to make this second procedure work as desired - since
it is so similar to the first one, except doesn't use ID field, and
doesn't use Min function, but uses Count instead.

SELECT C1.category, C1.subcategory, C1.cnt
FROM   Temp AS C1
WHERE  3 >= (SELECT COUNT(DISTINCT C2.cnt)
             FROM   Temp AS C2
             WHERE  C1.category = C2.category AND
                    C2.cnt >= C1.cnt)

Again, I thank you all for helping me out.
Rich

Jul 20 '05 #4
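
The thread does not establish why the second query dropped the highest counts, and the query itself needs neither an ID field nor a particular row order. One thing that may be worth ruling out is the data type of cnt: if the counts were stored as text, the >= comparison would be lexical, so '7' would rank above '20'. This is only a guess, not something the posts confirm. The sketch below reproduces that effect with Python's built-in sqlite3 module rather than SQL Server; the table temp_counts, the text-typed cnt column, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical rows; cnt is deliberately stored as text to show the effect.
rows = [("A", "AA", "5"), ("A", "AB", "7"), ("A", "AC", "3"),
        ("A", "AD", "11"), ("A", "AJ", "20")]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_counts (category TEXT, subcategory TEXT, cnt TEXT)"
)
conn.executemany("INSERT INTO temp_counts VALUES (?, ?, ?)", rows)

# Same shape as the query in the post: cnt values are compared as strings.
as_text_sql = """
SELECT c1.subcategory
FROM temp_counts AS c1
WHERE 3 >= (SELECT COUNT(DISTINCT c2.cnt)
            FROM temp_counts AS c2
            WHERE c1.category = c2.category AND c2.cnt >= c1.cnt)
ORDER BY c1.subcategory
"""

# Casting to a number before comparing restores the numeric ranking.
as_number_sql = """
SELECT c1.subcategory
FROM temp_counts AS c1
WHERE 3 >= (SELECT COUNT(DISTINCT CAST(c2.cnt AS INTEGER))
            FROM temp_counts AS c2
            WHERE c1.category = c2.category
              AND CAST(c2.cnt AS INTEGER) >= CAST(c1.cnt AS INTEGER))
ORDER BY c1.subcategory
"""

as_text = [r[0] for r in conn.execute(as_text_sql)]
as_number = [r[0] for r in conn.execute(as_number_sql)]
print(as_text)
print(as_number)
```

Under the text comparison the subcategories with counts 20 and 11 are the ones dropped; with the cast, they come back as the top of the list.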
