472,805 Members | 2,194 Online

# Binary Search Algorithm

176 Expert 100+
Why?:
One commonly used algorithm is the binary search.
Very helpful. Saves much CPU. Reduces computations exponentially.

When searching though a lot of information to find a match, the first idea that comes to mind is the linear search. Loop through all the values looking for a match. If you find it, return the location, or value and end the search.
However this becomes highly inefficient when the values to be searched become very large. It's a O(N) algorithm. At worse case you may have to search ALL the values to find a match, or even worse find that it's not there!

How?:
If the list of items can be sorted numerically or ASCII. Use a binary search!

First, sort the list.Then compare the value in the very center (round up or down in case of uneven list elements) to your search term. If it is "less than" your term, you know that it must be lower than that point. So you can eliminate the upper range. If the term has a greater value than your center point value, eliminate the lower range.

Thus reducing the amount of items to be searched by half.

Apply this technique to the remaining items continually and reduce the search by half each iteration.

Thus running at the much faster O(log N) algorithm.

Example:

Warning: uses "recursion", remember to consider function call loads on CPU. In some languages this can actually be slower than linear. See non recursive below.
(taken from Wikipedia)

Expand|Select|Wrap|Line Numbers
1. BinarySearch(A[0..N-1], value, low, high) {
2.        if (high < low)
3.            return -1 // not found
4.        mid = low + ((high - low) / 2)  // Note: not (low + high) / 2 !!
5.        if (A[mid] > value)
6.            return BinarySearch(A, value, low, mid-1)
7.        else if (A[mid] < value)
8.            return BinarySearch(A, value, mid+1, high)
9.        else
10.            return mid // found
11.    }
12.
There is an optimization here.
Sometimes when dealing with large numbers, or floating point high precision.
mid = (low + high)/2
can overflow and hog memory, resulting in unreliable results.
mid = low + ((high - low)/2)
fixes it. :)
See here...

Non Recursive:
Expand|Select|Wrap|Line Numbers
1. low = 0
2.        high = N
3.        while (low < high) {
4.            mid = low + ((high - low) / 2)  // Note: not (low + high) / 2 !!
5.            if (A[mid] < value)
6.                low = mid + 1;
7.            else
8.                 //can't be high = mid-1: here A[mid] >= value,
9.                 //so high can't be < mid if A[mid] == value
10.                 high = mid;
11.        }
12.        // high == low, using high or low depends on taste
13.        if ((low < N) && (A[low] == value))
14.            return low // found
15.        else
16.            return -1 // not found
17.

There are MANY sources for this most elementary algorithm. Just type "binary search" into Google.

Happy coding....
Mar 31 '09 #1
6 11993
Markus
6,050 Expert 4TB
I'd never thought of doing it like that.

Thanks a bunch :D
Apr 9 '09 #2
JosAH
11,448 Expert 8TB
Also see Fibonacci Search for a slightly better search than the binary search method.

kind regards,

Jos
Jun 16 '09 #3
Markus
6,050 Expert 4TB
While we're talking about search algorithms, the boyer-moore algorithm is interesting.
Jun 16 '09 #4
Kelicula
176 Expert 100+
Nice, I had never heard of the Fibonacci search myself.

Very interesting...
Jun 16 '09 #5
in researching the Fibonacci search code i also noticed the golden ratio search.
Sep 11 '09 #6
jkmyoung
2,057 Expert 2GB
? The golden ratio search appears to be a mathematical search based on unknown secondary values. I don't see how that applies in terms of searching for a particular item within an array.

The Fibonacci search reminds me of a priority heap.

--
Re the original binary search:
We should note that if we're ever only going to search once or twice through this array, sorting is not that useful. It becomes much more useful when we are repeatedly searching through this array.

To be honest, I find the low + ((high - low) / 2) to be a waste in this case, requiring an extra operation. You're dealing with integers, and unless the number of elements reaches INT_MAX/2, you don't really need to use it. As stated before, this is more of a floating point addition concern. Further, if concerned about speed you should use >> 1 instead of / 2.
If you get to the point where you are actually reaching extremely large numbers of elements, you may want to consider a Interpolation search (sometimes called a dictionary search), until you reach a certain smaller range, or provide subkeys for ranges.
Mar 22 '10 #7