By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
425,534 Members | 1,807 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 425,534 IT Pros & Developers. It's quick & easy.

The Median Function in SQL Server

P: n/a
I read the follow query about calculating median posted by Daivd Porta
on 10/8/03.

CREATE TABLE SomeValues (keyx CHAR(1) PRIMARY KEY, valuex INTEGER NOT
NULL)

INSERT INTO SomeValues VALUES ('A',1)
INSERT INTO SomeValues VALUES ('B',2)
INSERT INTO SomeValues VALUES ('C',3)
INSERT INTO SomeValues VALUES ('D',4)
INSERT INTO SomeValues VALUES ('E',5)

SELECT S1.valuex AS median
FROM SomeValues AS S1, SomeValues AS S2
GROUP BY S1.valuex
HAVING SUM(CASE WHEN S2.valuex <= S1.valuex
THEN 1 ELSE 0 END)
= ((COUNT(*) + 1) / 2) AND SUM(CASE WHEN S2.valuex >= S1.valuex
THEN 1 ELSE 0 END)= (COUNT(*)/2 + 1)


I have difficulty to understand the having clause. If S1 and S2 are
the same table, what it means by S2.valuex >= S1.valuex? Could some
give me a help?

Also, if I have a table structured as:

classID field1 field2 field3
c1 1 2 3
c1 4 5 6
c1 7 8 9
c2 9 8 7
c2 6 5 4
c2 3 2 1

Is there a way to create a user-defined function that can get the
median for each field as easy as the average function. Such as

select distinct classID,
median(field1),
median(field2),
median(field3)
from [tablename]
group by classID
Thanks in advance
Jul 20 '05 #1
Share this Question
Share on Google+
2 Replies


P: n/a
It is not very simple to write UDFs for all the aggregate requirements. Not
the best way, but here is one using the data & logic you have posted, though
there is no real need for a table variable:

CREATE FUNCTION dbo.Median ( @cid CHAR( 2 ), @fd INT )
RETURNS INT
AS
BEGIN
DECLARE @t TABLE ( c INT )
INSERT @t SELECT CASE @fd WHEN 1 THEN field1
WHEN 2 THEN field2
WHEN 3 THEN field3
END
FROM tbl
WHERE classID = @cid ;
RETURN ( SELECT c
FROM ( SELECT t1.c,
SUM( CASE WHEN t2.c <= t1.c
THEN 1 ELSE 0 END ),
SUM( CASE WHEN t2.c >= t1.c
THEN 1 ELSE 0 END ),
( COUNT( * ) + 1 ) / 2,
( COUNT( * )/2 + 1 )
FROM @t t1
CROSS JOIN @t t2
GROUP BY t1.c ) D ( c, s1, s2, c1, c2)
WHERE s1 >= c1 AND s2 >= c2 )
END

Now you can have your query with the above UDF like:

SELECT classID,
dbo.Median(classID, 1),
dbo.Median(classID, 2),
dbo.Median(classID, 3)
FROM tbl
GROUP BY classID ;

--
- Anith
( Please reply to newsgroups only )
Jul 20 '05 #2

P: n/a
> the same table, what it means by S2.valuex >= S1.valuex? Could some
give me a help?


The query is a Cross Join, which is difficult to avoid when calculating the
Median. The two SUM(CASE...) expressions compare every S1 value to every S2
value and the HAVING clause says that the count of values where S2 <= S1 and
the count of values where S2 >= S1 must both be >= COUNT(*)/2 (= the
middle).

I didn't devise this solution but it's one of a number of alternative median
queries that you'll find if you search the archives of the SQLServer groups.

--
David Portas
------------
Please reply only to the newsgroup
--
Jul 20 '05 #3

This discussion thread is closed

Replies have been disabled for this discussion.