>I have a database of zipcodes with latitude and longitude. I also have the
method of calculating the distance between two zipcodes. What I want to
know is if there is an efficient algorithm for obtaining the zip codes
within a specified distance of the first zipcode without having to retrieve
and calculate for every record in the database.
It depends on what you are doing, and how accurate you need to be.
Is that distance supposed to be DRIVING distance or distance as
the missile flies?
Precomputing some of this can be used to advantage.
(1) You could precalculate the distance between every zip code and
every other zip code. This takes a lot of storage but it only has
to be computed once. You didn't say whether a zip code can be
characterized by its center or whether you have to take into account
whether any part of it is within the distance given.
(2) If the objective is to find the nearest store, or all stores
within N miles, where N has a small number of choices on a web page
(like 5, 10, 20, 50 miles), you can precalculate the distance between
each zip code WITH A STORE IN IT (and you could use the exact
lat/long of the store, not that of the zip code it's in, if you
know it) and each zip code where customers might be.
Make a table (store number, zip code (of CUSTOMER), distance) where
you drop any entries where the distance is unreasonably large or
much better alternatives exist. For example, you might leave in
an entry for a store 200 miles away if it's the closest one, but
drop it if there are 3 others within 10 miles of the customer. This
is a decision needed when constructing the table but you only have
to re-do it when you add/delete/move stores or if zip codes move
(they have been known to split, so the center moves).
You can also hand-tune recommendations (e.g. drop any that are close
but require the customer to travel through war zones, swim across
rivers where there aren't bridges, or over military artillery
practice areas. This requires a lot more detailed info than what
you've got, but someone familiar with the area around the store
might know this). Select on zip code matching and distance < the
radius the customer specified.
(3) You can try to simplify the calculation by converting the
coordinates to a linear grid (E-W and N-S from some point in the
middle of the area in question). This is terrible if you are trying
to do the entire USA (including Alaska and Hawaii) since the world
really isn't flat, but not too bad if you're trying to do a 100-mile
radius around some city, where the world is a reasonable approximation
to flat. Use a bounding box to limit the calculation. If you want
zip codes within 20 miles of (x,y), anything not within (x-21 ..
x+21, y-21 .. y+21) is DEFINITELY out, and you can use the more
precise calculation on the remaining candidates. Indexes on the
coordinates might help.
Gordon L. Burditt