By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
424,491 Members | 858 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 424,491 IT Pros & Developers. It's quick & easy.

Top 2 per category

P: 6
I need to get the largest and 2nd largest amounts in a column for each postcode.
I then want to add these two figures to a table which has other information for each postcode (e.g. counts, totals, etc).
I can get largest fine:
Expand|Select|Wrap|Line Numbers
  1. SELECT [4_digit_postcode],  max([Total_SI]) AS [LargestRiskSI]
  2.    INTO tLargestRiskPerPostcode
  3.    FROM [Location file_DO]
  4.    GROUP BY [4_digit_postcode];
And update the table where the postcodes match.

And I get get the top 2 overall:
Expand|Select|Wrap|Line Numbers
  1. SELECT  top 2 Total_SI
  2.   FROM [Location file_DO]
  3.   ORDER BY Total_SI DESC;
And the top 2 for a specific postcode, but its way too slow to run this code for every single postcode in Excel (I tried).
For each postcode:
Expand|Select|Wrap|Line Numbers
  1. SELECT  top 2 TotalSI
  2.   FROM [Location file_DO]
  3.   WHERE  Postcode = **variablePostcode**
  4.   ORDER BY TotalSI DESC
I'm having no luck with google trying to get second largest by postcode. The tables are in Access, so it's Access SQL. Would you be able to tell me what my code should look like? It needs to be run regularly from an Excel VBA macro, with the results being put into an Excel sheet.
If there is a tie at 1, I'd like the same number in largest and 2nd largest.
May 12 '10 #1

✓ answered by NeoPa

Diablette: So am I understanding correctly that I can't even get a table like this using SQL (i.e. with all but the top 2 values per group/postcode excluded):
Absolutely, you are. There is no way in SQL that I'm aware of.

At this point I should add an update.

I set up a test bed database and actually managed to get something to work on lines very similar to what you're after.

I started by setting up a table with values :
Table Name=[tblGroup]
Expand|Select|Wrap|Line Numbers
  1. ID  Group  Value
  2.  1    A       1
  3.  2    A       2
  4.  3    A       3
  5.  4    A       4
  6.  5    B       2
  7.  6    B       4
  8.  7    B       6
  9.  8    C       5
  10.  9    D       4
  11. 10    D       8
I then put some SQL together :
Expand|Select|Wrap|Line Numbers
  1. SELECT   tGo.ID
  2.        , tGo.Group
  3.        , tGo.Value
  4.  
  5. FROM     tblGroup AS tGo
  6.  
  7. WHERE    tGo.ID IN(
  8.     SELECT   TOP 2 tGi.ID
  9.     FROM     tblGroup AS tGi
  10.     WHERE    tGi.Group=tGo.Group
  11.     ORDER BY tGi.Value DESC
  12.     )
  13.  
  14. ORDER BY tGo.Group
  15.        , tGo.Value DESC
This produced these results :
Expand|Select|Wrap|Line Numbers
  1. ID  Group  Value
  2.  4    A       4
  3.  3    A       3
  4.  7    B       6
  5.  6    B       4
  6.  8    C       5
  7. 10    D       8
  8.  9    D       4
You will notice that it cannot create a second value for the Group C for you, but otherwise is fundamentally what you want I think.

Share this Question
Share on Google+
11 Replies


P: 6
Any help with this will be greatly appreciated.
I have to get it working for work, and I'm really stuck.
Thank you
May 13 '10 #2

Jim Doherty
Expert 100+
P: 897
@Diablette
What is your data row size here are you working in excel or Access?
May 13 '10 #3

P: 6
Hi Jim,
I trust you will be able to help me. Thank you for taking the time to look at my query.

The original table that I am working with today has 100,000 rows.
It has 25,000 unique postcode values.
In the data I am working with today the postcodes are numbers, but other tables will have alphanumeric postcodes. There are no memo or very long string fields in the table.
(I hope I understand your question on data row size)

The table is in Access.
I am writing code in Excel which runs various queries on the tables in Access and puts summary information into Excel sheets.
May 13 '10 #4

Jim Doherty
Expert 100+
P: 897
@Diablette
OK well I have an appointment very shortly but I will come back to you on this. In the meantime if you have anything there ie: table field names and structures to post back maybe I can replicate something when i come back to test things out and help you
May 13 '10 #5

P: 6
Hi Jim,
The sort of table I have (ignoring the rest of the data):
Expand|Select|Wrap|Line Numbers
  1. Postcode  Name     TotalSI
  2. 2600    Place A      500
  3. 2600    Place A      400
  4. 3800    Place B      600
  5. 3800    Place B      500
  6. 3800    Place B      400
  7. 3800    Place B      300
  8. 4460    Place C      300
  9. 4460    Place C      300
  10. 4460    Place C      200
  11. 1100    Place D      100
What I need to end up with somehow:
Expand|Select|Wrap|Line Numbers
  1. Postcode  Largest 2nd largest
  2. 2600        500    400
  3. 3800        600    500
  4. 4460        300    300
  5. 1100        100    null (or 0)
I hope this makes my question clearer.
May 13 '10 #6

NeoPa
Expert Mod 15k+
P: 31,186
This won't be done simply using SQL. You'll need some sort of public VBA function. I would suggest a function that takes two parameters. One to identify the grouping (in this case [Postcode]) and one as the value.

The function would return the top two values in a string. The SQL would need to get the Max() of the returned values within any group, and separate out the two values into separate fields.

It (the function) would need to maintain static data of the grouping last used and the highest two values to date. If would need to determine in each iteration of the function whether or not the newly passed value should effect either of the top two positions. The Max() of all the results in any grouping should always be the correct result to use.
May 13 '10 #7

P: 6
Hi NeoPa,

Thank you for your reply.

So am I understanding correctly that I can't even get a table like this using SQL (i.e. with all but the top 2 values per group/postcode excluded):
Expand|Select|Wrap|Line Numbers
  1. Postcode  Value 
  2. 2600    500
  3. 2600    400 
  4. 3800    600
  5. 3800    500 
  6. 4460    300
  7. 4460    300 
  8. 1100    100
The only way is doing this sql within a function:
Expand|Select|Wrap|Line Numbers
  1. SELECT  top 2 TotalSI 
  2.   FROM [Location file_DO] 
  3.   WHERE  Postcode = **variablePostcode** 
  4.   ORDER BY TotalSI DESC 
and calling it 25,000 times?
(This is the code I wrote in my first post.)

Apologies, but I don't understand this part of your reply:
It (the function) would need to maintain static data of the grouping last used and the highest two values to date. If would need to determine in each iteration of the function whether or not the newly passed value should effect either of the top two positions.

I also don't understand what the value parameter is that you refer to the function taking in. I only understand putting in one variable (the postcode) and getting out the top two values.
May 14 '10 #8

NeoPa
Expert Mod 15k+
P: 31,186
Diablette: So am I understanding correctly that I can't even get a table like this using SQL (i.e. with all but the top 2 values per group/postcode excluded):
Absolutely, you are. There is no way in SQL that I'm aware of.

At this point I should add an update.

I set up a test bed database and actually managed to get something to work on lines very similar to what you're after.

I started by setting up a table with values :
Table Name=[tblGroup]
Expand|Select|Wrap|Line Numbers
  1. ID  Group  Value
  2.  1    A       1
  3.  2    A       2
  4.  3    A       3
  5.  4    A       4
  6.  5    B       2
  7.  6    B       4
  8.  7    B       6
  9.  8    C       5
  10.  9    D       4
  11. 10    D       8
I then put some SQL together :
Expand|Select|Wrap|Line Numbers
  1. SELECT   tGo.ID
  2.        , tGo.Group
  3.        , tGo.Value
  4.  
  5. FROM     tblGroup AS tGo
  6.  
  7. WHERE    tGo.ID IN(
  8.     SELECT   TOP 2 tGi.ID
  9.     FROM     tblGroup AS tGi
  10.     WHERE    tGi.Group=tGo.Group
  11.     ORDER BY tGi.Value DESC
  12.     )
  13.  
  14. ORDER BY tGo.Group
  15.        , tGo.Value DESC
This produced these results :
Expand|Select|Wrap|Line Numbers
  1. ID  Group  Value
  2.  4    A       4
  3.  3    A       3
  4.  7    B       6
  5.  6    B       4
  6.  8    C       5
  7. 10    D       8
  8.  9    D       4
You will notice that it cannot create a second value for the Group C for you, but otherwise is fundamentally what you want I think.
May 14 '10 #9

NeoPa
Expert Mod 15k+
P: 31,186
Diablette: Apologies, but I don't understand this part of your reply:
It (the function) would need to maintain static data of the grouping last used and the highest two values to date. If would need to determine in each iteration of the function whether or not the newly passed value should effect either of the top two positions.

I also don't understand what the value parameter is that you refer to the function taking in. I only understand putting in one variable (the postcode) and getting out the top two values.
I would just like to add that it's always pleasant to get an intelligent response from a member when they are confused by something I've written. The most common responses by far are :
  1. Ignoring it completely.
  2. Explaining why I must be wrong.
In this case it is so much easier to deal with :)
First, the reference to the function. You need to invert your understanding somewhat to get the picture. The SQL is not processed from within the function. Rather, the function is called by the SQL.

Second, for the function to be able to do its job it would need both the data that is grouped by, and the value. It is called for every record in the table (but it's important that the data come in already sorted by the grouped field). The function can remember previous matching records so can determine the highest two values it has so far seen. Until all the data for the group has been processed there will be many return values of the function that are simply wrong, but the Max() of these should always be correct.

With the SQL now proving possible, there is no need to worry about these details at all of course.
May 14 '10 #10

P: 6
Thank you very, very much.
I got your SQL code to work.
And bonus it doesn't take too long.
I think I will create a table from this code, another table just using max to get the largest with their ids and then a unmatched query on id to get the 2nd largest values, where they exist.
Your help is greatly appreciated.
And thank you for the compliment :)
May 14 '10 #11

NeoPa
Expert Mod 15k+
P: 31,186
It was a pleasure, and Welcome to Bytes!
May 14 '10 #12

Post your reply

Sign in to post your reply or Sign up for a free account.