458,182 Members | 1,548 Online Need help? Post your question and get tips & solutions from a community of 458,182 IT Pros & Developers. It's quick & easy.

# Median Statistical function

 P: 9 hi everyone! i was wondering if anyone had any ideas on how to go about coding a method that would determine the median of a list of numbers just to clarify- a median is the 'middle value' when all of the numbers are placed in increasing order; for example- the median of 3 6 12 40 41 57 58 90 is 40.5 i was thinking about using a method that sorts the list in increasing order, then determining if there is an even or odd number of items, (if the number is even, there will be 2 middle values that have to be averaged) then 'cutting off' the first and last numbers until we reach the median value i was wondering if anyone had any suggestions that may help with this endeavor, or even any guidance that may help me program the method! im just not certain that this may be the easiest way, and wanted to get some feedback on what other people think.. thanks to everyone! Apr 22 '08 #1
7 Replies

 Expert 2.5K+ P: 3,112 hi everyone! i was wondering if anyone had any ideas on how to go about coding a method that would determine the median of a list of numbers just to clarify- a median is the 'middle value' when all of the numbers are placed in increasing order; for example- the median of 3 6 12 40 41 57 58 90 is 40.5 i was thinking about using a method that sorts the list in increasing order, then determining if there is an even or odd number of items, (if the number is even, there will be 2 middle values that have to be averaged) then 'cutting off' the first and last numbers until we reach the median value i was wondering if anyone had any suggestions that may help with this endeavor, or even any guidance that may help me program the method! im just not certain that this may be the easiest way, and wanted to get some feedback on what other people think.. thanks to everyone! Hi tomshorts07! Just to clarify, if the list were3 6 12 40 42 57 58 90the median would be 41, correct? The method you suggested is possible, however you don't have to sort the values. As sorting a collection can take quite some time, I would probably solve your problem something like this: Say, you have an unsorted array of numbers like this: Expand|Select|Wrap|Line Numbers int[] numbers = {58, 41, 6, 40, 12, 90, 57, 3}; then you have a look at, say, the first element: 58. Now, go through the array and compare each number to that 58. (Constant time!) You'll find: there is 1 number bigger than 58 in the array. Expand|Select|Wrap|Line Numbers int bigger = 1; As the array has the length 8, you know: there are (8 - 1 - bigger) = 6 numbers smaller than 58 in the array. Expand|Select|Wrap|Line Numbers int smaller = numbers.length() - 1 - bigger // 58 itself is neither bigger nor smaller than 58 Now, Expand|Select|Wrap|Line Numbers (bigger - smaller) / 2 = (1 - 6) / 2 = -5 / 2 = -2.5 (divide by two, as you want to find the number in the middle) tells you, that there are two numbers relevant for the median: the second and the third smaller number than 58. Find those numbers, add them up and divide by two - there's your median. Depending on the size of your array, that may save a lot of time, if you have a good way of finding the second and third (or whatever your values are) smaller number. Hint: here a sorted collection like a TreeMap or a TreeSet or maybe the Collection.sort() method might be a good idea. (Think about how many elements you need...) Greetings, Nepomuk Apr 22 '08 #2

 Expert 100+ P: 1,216 Apr 23 '08 #3

 Expert 2.5K+ P: 3,112 http://en.wikipedia.org/wiki/Selection_algorithm I guess, you mean the "Median of Medians algorithm"? It looks like it runs a bit faster than mine, but does it actually do the right thing? The description is a bit strange... If I understand it correctly, it does this:We have a Collection {13, 7, 29, 2, 97, 61, 5, 17, 31, 101} This Collection is split into two Collections {13, 7, 29, 2, 97} and {61, 5, 17, 31, 101} Then it calculates the medians of each group: 13 and 17 The median of these numbers is calculated: 15 The median of medians splits the original Collection somewhere between 30% / 70% and 70% / 30%: 2, 5, 7, 13 < 15 < 17, 29, 31, 61, 97, 101 => 40% / 60% (but so far, it doesn't know that, does it?) And now? From there on, I'm a little confused. My algorithm find's out, how many numbers are greater and how many are smaller faster, but of course it's not necessarily so near to the actual median. Please explain, what happens now, to actually find the median? Greetings, Nepomuk Apr 23 '08 #4

 Expert 100+ P: 1,216 I don't understand your algorithm since your explanation ends with quite the hand wave. One simple median algorithm is to write essentially a modified quick sort: 1. Partition the data. 2. Instead of recursing on both partitions, select the one which would contain the median and recurse on only that one. This algorithm has an expected time of O(N). Apr 23 '08 #5

 Expert 2.5K+ P: 3,112 I don't understand your algorithm since your explanation ends with quite the hand wave. What I was thinking is this: You create a Collection (let's say an array, just for the sake of easiness) with as many elements as the furthest median value is away from the first number. (As calculated in the previous part of the algorithm.) Let's say, that first number is 58. So: Expand|Select|Wrap|Line Numbers int[] numbers = {58, 41, 6, 40, 12, 90, 57, 3}; double difference = -2.5; // calculated above int[] helpArray = new int[(int)(1 + abs(difference))]; // (int)(1 + 2.5) = (int) 3.5 = 3   So, say the relevant numbers for the median are those 2 and 3 numbers smaller than 58, then the array has the size of 3. Expand|Select|Wrap|Line Numbers // This is NOT supposed to be valid Java-code! helpArray.lenth() == 3 Now, go through the list of numbers and add the first 3 values, that are smaller than 58. Sort them, smallest first. (This is probably the part of the algorithm, that takes longest.) Expand|Select|Wrap|Line Numbers // Neither is this or most of the other following code helpArray == {41, 6, 40}; sort(helpArray); helpArray == {6, 40, 41};   Now, go through the rest of the list and when you find a value that's smaller than 58 and larger than the smallest in the list, replace that smallest number and sort the list again (simple bubble sort like algorithm). Expand|Select|Wrap|Line Numbers => helpArray == {12, 40, 41} => helpArray == {57, 40, 41} => helpArray == {40, 41, 57}   Continue until the end of the list. The smallest number in your list is now the 3rd smaller number from 58, the second smallest is the 2nd smaller than 58. Expand|Select|Wrap|Line Numbers int[] relevant = {40, 41}; int median = (relevant + relevant)/2; System.out.println("The median is " + median);   Determine the median. Done. Let me think, it should have a worst case runtime of something like(n-1)*(n/2) + (n/2) + (n/2 - 1) + ... + 1Not ideal I guess, but better than sorting the whole thing. ^^ Greetings, Nepomuk Apr 23 '08 #6

 P: 9 What I was thinking is this: You create a Collection (let's say an array, just for the sake of easiness) with as many elements as the furthest median value is away from the first number. (As calculated in the previous part of the algorithm.) Let's say, that first number is 58. So: Expand|Select|Wrap|Line Numbers int[] numbers = {58, 41, 6, 40, 12, 90, 57, 3}; double difference = -2.5; // calculated above int[] helpArray = new int[(int)(1 + abs(difference))]; // (int)(1 + 2.5) = (int) 3.5 = 3   So, say the relevant numbers for the median are those 2 and 3 numbers smaller than 58, then the array has the size of 3. Expand|Select|Wrap|Line Numbers // This is NOT supposed to be valid Java-code! helpArray.lenth() == 3 Now, go through the list of numbers and add the first 3 values, that are smaller than 58. Sort them, smallest first. (This is probably the part of the algorithm, that takes longest.) Expand|Select|Wrap|Line Numbers // Neither is this or most of the other following code helpArray == {41, 6, 40}; sort(helpArray); helpArray == {6, 40, 41};   Now, go through the rest of the list and when you find a value that's smaller than 58 and larger than the smallest in the list, replace that smallest number and sort the list again (simple bubble sort like algorithm). Expand|Select|Wrap|Line Numbers => helpArray == {12, 40, 41} => helpArray == {57, 40, 41} => helpArray == {40, 41, 57}   Continue until the end of the list. The smallest number in your list is now the 3rd smaller number from 58, the second smallest is the 2nd smaller than 58. Expand|Select|Wrap|Line Numbers int[] relevant = {40, 41}; int median = (relevant + relevant)/2; System.out.println("The median is " + median);   Determine the median. Done. Let me think, it should have a worst case runtime of something like(n-1)*(n/2) + (n/2) + (n/2 - 1) + ... + 1Not ideal I guess, but better than sorting the whole thing. ^^ Greetings, Nepomuk thanks nepomuk, your explanation is really helpful, and i really understand how to implement the method a lot better now. theres just one thing about your explanation that i dont understand- your 'helparray' section. i was just hoping you could explain this part a little bit more in detail. for example, why is this array only 3 items deep? wouldnt it be easier to just use 2 values, allowing you to calculate the final median without transferring it to the 'relevant' array? another thing about the array, it seems like using the bubble sort like method would waste a lot of processing time? is there some other way that may be quicker? May 1 '08 #7 