I've been handed a mess of related data in the form of multiple spead
sheets. Each set of data contains part or parts of the what is needed
in the final database but there is no common usable column that is
usable for selecting data from all sources and no two sets contain all
the rows of any other. By that, I mean that one list will have member
names as (last, first, middle) while another omits the middle name and
a third contains only the full name, including suffix. One is
formatted mixed case, another is all upper case, a third is a mix of
both. Each set contains a subset of names from one master list, but
not necessarily the same names as any other set. The case issue is
simple - a word processor and a quick macro on the .del file solves
that problem in a few minutes but I'm still left with the problem of
reconciling all these disparate sources. The ambiguities (how many
David Smiths can there be in 80k entries?) can be partially resolved
with address matches, etc but will still have to be flagged.
Anybody got a decent plan for attacking a mess like this? Ideally,
the end result will be something like 3 or 4 tables with a common data
column to use for linkages. So far, I've come up with several schemes
but they all appear to invoke cartesian searches and run forever so I
need some expert advice or a pointer to a tutorial source somewhere.
--
Will Honea