How to get most frequent and least frequent values in a column?

I'm a noob SQL user, crossing over from SAS. I have a table with about
200k rows and one of the columns is empssn, which holds the employee
social security number. The same empssn may appear in lots of different
rows. I want to get a list of the 40 top empssns, sorted by the number
of times they appear in the table. I also want a list of the very rarest
empssns (ones that only appear once or twice).

Can anyone help me with this? BTW, this isn't a homework problem.

TIA.

Matt

Nov 23 '05 #1
On Mon, Sep 20, 2004 at 02:27:41PM +0000, Matthew Wilson wrote:
> I'm a noob SQL user, crossing over from SAS. I have a table with about
> 200k rows and one of the columns is empssn, which holds the employee
> social security number. The same empssn may appear in lots of different
> rows. I want to get a list of the 40 top empssns, sorted by the number
> of times they appear in the table. I also want a list of the very rarest
> empssns (ones that only appear once or twice).
>
> Can anyone help me with this? BTW, this isn't a homework problem.


select empssn, count(*) from your_table  -- substitute the real table name
group by empssn
order by count(*) desc limit 40;

and

select empssn, count(*) from your_table  -- substitute the real table name
group by empssn
having count(*) < 3;

may be close to what you're looking for.

Cheers,
Steve
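
For anyone who wants to try the two queries above, here is a minimal self-contained sketch. The table name emp_records and the sample values are made up for illustration (the thread never names the real table); only the empssn column comes from the question. Note that a table literally named "table" would have to be double-quoted, because TABLE is a reserved word in PostgreSQL.

```sql
-- Hypothetical demo table; only the empssn column comes from the thread.
CREATE TEMP TABLE emp_records (empssn text);

-- Made-up sample values: counts of 4, 3, 2 and 1.
INSERT INTO emp_records (empssn) VALUES
    ('111-11-1111'), ('111-11-1111'), ('111-11-1111'), ('111-11-1111'),
    ('222-22-2222'), ('222-22-2222'), ('222-22-2222'),
    ('333-33-3333'), ('333-33-3333'),
    ('444-44-4444');

-- Most frequent values first (LIMIT 40 as in the question).
SELECT empssn, count(*) AS n
FROM emp_records
GROUP BY empssn
ORDER BY n DESC
LIMIT 40;

-- Values that appear only once or twice.
SELECT empssn, count(*) AS n
FROM emp_records
GROUP BY empssn
HAVING count(*) < 3;
```

With this sample data the first query lists all four values with counts 4, 3, 2 and 1, and the second returns only the two values with counts 2 and 1 (in no guaranteed order, since it has no ORDER BY).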


Nov 23 '05 #2
To get the 10 rarest empssns, it should be fine to do:

SELECT empssn, count(*) FROM your_table  -- substitute the real table name
GROUP BY empssn
ORDER BY count(*) ASC LIMIT 10;

The only thing that changes from the previous post is the sort order:
ASC (ascending) instead of DESC (descending).

The HAVING count(*) < 3 query is good, but it only returns rows if some
empssn occurs fewer than 3 times in the table. With 200,000 records there
is a chance that no empssn occurs fewer than 3 times; in that case the
query with the HAVING clause returns zero rows, whereas the LIMIT 10
query still gives the 10 least frequent.
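
A small sketch of the difference described above, using a hypothetical table emp_demo in which every value appears at least three times. The table name and data are invented for illustration. The extra empssn in the ORDER BY is an addition that is not in the posts above: without a tie-breaker, which of several equally rare values end up under the LIMIT is not defined.

```sql
-- Hypothetical demo table; every value occurs 3 or more times.
CREATE TEMP TABLE emp_demo (empssn text);

INSERT INTO emp_demo (empssn) VALUES
    ('111-11-1111'), ('111-11-1111'), ('111-11-1111'),
    ('222-22-2222'), ('222-22-2222'), ('222-22-2222'), ('222-22-2222'),
    ('333-33-3333'), ('333-33-3333'), ('333-33-3333'), ('333-33-3333'),
    ('333-33-3333');

-- No value occurs fewer than 3 times, so this returns zero rows.
SELECT empssn, count(*) AS n
FROM emp_demo
GROUP BY empssn
HAVING count(*) < 3;

-- Still returns the least frequent values, rarest first;
-- empssn breaks ties so the result order is repeatable.
SELECT empssn, count(*) AS n
FROM emp_demo
GROUP BY empssn
ORDER BY n ASC, empssn ASC
LIMIT 10;
```

Here the second query returns all three values with counts 3, 4 and 5, because the table holds fewer than 10 distinct values.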
Nov 23 '05 #3
