Sudonim1234,
We dealt with something similar at a former employer of mine. We had a list
of standardized values for all the variants we were aware of. What we did
was maintain a table of these standardized values and join it to our
received table. Before each run we would check the incoming data to see
if any new variants had appeared, add these to our table of standardized
values, have a human validate the new rows, and then build a new table with
the standardized values substituted for the variants.
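To make that concrete, here is a minimal sketch of the mapping-table idea, using Python's built-in sqlite3 in place of Access. The table and column names (ImportsUnprocessed, Consignee, ConsigneeMap) are just illustrative assumptions; the point is that one UPDATE driven by a lookup table replaces hundreds of hand-written queries:

```python
# Sketch of the mapping-table approach using sqlite3 instead of Access.
# Table/column names are assumptions for illustration only.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Raw data with inconsistent spellings.
cur.execute("CREATE TABLE ImportsUnprocessed (Consignee TEXT)")
cur.executemany("INSERT INTO ImportsUnprocessed VALUES (?)",
                [("SAMSUNG ELECTRONICS",), ("Samsng Corp",), ("ACME LTD",)])

# One row per known variant pattern, instead of one UPDATE query each.
cur.execute("CREATE TABLE ConsigneeMap (Pattern TEXT, Standard TEXT)")
cur.executemany("INSERT INTO ConsigneeMap VALUES (?, ?)",
                [("%samsung%", "SAMSUNG"), ("%samsng%", "SAMSUNG")])

# A single UPDATE driven by the mapping table (SQLite's LIKE is
# case-insensitive for ASCII, much like Access's Like with "*...*").
cur.execute("""
    UPDATE ImportsUnprocessed
    SET Consignee = (SELECT Standard FROM ConsigneeMap
                     WHERE ImportsUnprocessed.Consignee LIKE Pattern)
    WHERE EXISTS (SELECT 1 FROM ConsigneeMap
                  WHERE ImportsUnprocessed.Consignee LIKE Pattern)
""")
con.commit()

for (name,) in cur.execute("SELECT Consignee FROM ImportsUnprocessed"):
    print(name)
```

Adding a new variant then means inserting one row into the mapping table, not writing a new query. Note that Access SQL syntax differs slightly (wildcards, joins in UPDATE), so treat this as the shape of the solution rather than paste-ready code.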
You want to avoid changing the source data if at all possible. You also
want a means by which the current batch run can be rolled back, so that
if the transform step screws up there is a way to restore the data to the
way it was before your import process began. It is also helpful to be
able to show the client what changes were made, so your process can be
audited and vetted.
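A rough sketch of that "never touch the raw data" idea, again in sqlite3 for illustration: the cleaning runs against a working copy, the raw table doubles as the rollback point, and a simple diff between the two gives you the audit trail. Table names here are assumptions, not anything from the original post:

```python
# Keep the raw batch untouched; clean a copy; diff the two for auditing.
# Table names (ImportsRaw, ImportsClean) are illustrative assumptions.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE ImportsRaw (Consignee TEXT)")
cur.execute("INSERT INTO ImportsRaw VALUES ('SAMSUNG ELECTRONICS')")

# Work on a copy; ImportsRaw is never modified by the transform.
cur.execute("CREATE TABLE ImportsClean AS SELECT * FROM ImportsRaw")
cur.execute("UPDATE ImportsClean SET Consignee = 'SAMSUNG' "
            "WHERE Consignee LIKE '%samsung%'")

# Audit trail: show the client exactly what changed, row by row.
changes = cur.execute("""
    SELECT r.Consignee, c.Consignee
    FROM ImportsRaw r JOIN ImportsClean c ON r.rowid = c.rowid
    WHERE r.Consignee <> c.Consignee
""").fetchall()
for before, after in changes:
    print(before, "->", after)

# Rollback is trivial: drop the working copy and start over from raw.
cur.execute("DROP TABLE ImportsClean")
```

In Access the equivalent would be a make-table query from the raw import table before running your updates.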
"sudonim1234" <su*****@gmail.com> wrote in message
news:10*********************@k26g2000oda.googlegro ups.com...
My company receives a CD every month with US Gov. information about
imports and exports from the US. One of the things we have to do with
this before it is usable is to clean the data up. There are thousands
of typos and inconsistencies in the raw data we get. At the moment I
probably have 300 or so individual update queries that take a form
similar to the one below:
UPDATE ImportsUnprocessed SET ImportsUnprocessed.Consignee = "SAMSUNG"
WHERE (((ImportsUnprocessed.Consignee) Like "*samsung*"));
We have about one field that needs to be updated in each of these
different cases. Is there a way to compress this so I can run a
function with case statements, or have a list of search criteria and
what to update each one to?
There has to be an easier way!!! Please help