Bytes IT Community

tools for cleaning name and address data?

P: n/a
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?
Jul 20 '05 #1
10 Replies


P: n/a
We have customers in Spanish-speaking countries, so if that is a
specific issue I guess this one wouldn't work.

Thanks for the info. :)

On Wed, 21 Apr 2004 15:39:23 -0700, "GL"
<GL@noSpam.ReplyToNewsgroup.com> wrote:
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?


Jul 20 '05 #2

P: n/a
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)

On 22 Apr 2004 00:58:57 -0700, ry********@hotmail.com (Ryan) wrote:
I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
solutions in Delphi 5. Their example code is in C++ but pretty easy to
convert (pseudo code provided). You can either write your own code
using their APIs or use one of their utilities to cleanse the data
for you. The backend is SQL 7, but this could be anything.

QAS Batch allows you to check batches of data and is very good.
QAS Pro lets you, the user, choose (and drill into) the data
matched to a search string.

With this (and some info from Experian) we have been able to get 98%
of our customer data accurate.

We have a couple of applications we use for cleansing data in batches.
The API stuff works really well and is quick. I have been very
impressed so far. We can tweak the output very easily and have
consistent formatting of the results. In QAS Batch you get a result
code telling you the results of checking each part of the address and
where it matched or failed. You can determine which ones pass/fail
your checks.

With Pro, there is fuzzy logic with the checking. This is impressive
and accurate (especially on Welsh addresses). The helpdesk were really
good and willing to check over the Delphi code (not supported but
someone there knew Delphi) and give a few pointers when I got stuck.

Ryan
"GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
> What tools has everyone used for cleaning name and address data
> (including identifying not-immediately-obvious duplicates) in
> connection with a CRM project or the Customer dimension of a data
> warehouse? What did you like/dislike about the tool you used? How
> customizable was the tool you used?


Jul 20 '05 #3

P: n/a
There were examples in VB as well, if I remember correctly. Besides, QAS
were really helpful on the coding side. You can download example
(full?) copies of the code, which may be worth a look. I built a VB version
first using the examples and then converted to Delphi.

Alternatively you could look at their own solutions as they are really
good. You could even send them the data for them to cleanse, or have
them turn up and cleanse it for you on site. You can possibly
integrate their apps into your app, or use them to cleanse the data
separately. I wrote my own partly for the experience with APIs, but
also as we have additional information we need to add to the file
based on the results of the address so it was easiest to do this all
together.

To give you an idea of performance (as best I can), with a simple
Delphi app, SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we can
batch cleanse 100,000 records (2 passes of address) in 25-30
hours. This is with all sorts of other additions to the code (which is
pretty quick anyway). It should be able to fly through any checking.
We run this every 3 months (ish). We could probably knock a third off
that if we just did the addresses only.
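For a sense of scale, here is a rough sketch of what those figures work out to. It uses only the numbers quoted above (100,000 records, 25-30 hours, "a third off"); the function and variable names are made up for illustration.

```python
# Back-of-the-envelope throughput for the batch run described above.
# All inputs are the figures quoted in the post; nothing here is measured.
RECORDS = 100_000
HOURS_LOW, HOURS_HIGH = 25, 30


def records_per_second(records: int, hours: float) -> float:
    """Average throughput over the whole run."""
    return records / (hours * 3600)


fastest = records_per_second(RECORDS, HOURS_LOW)
slowest = records_per_second(RECORDS, HOURS_HIGH)

# "Knock a third off" for an addresses-only run: two thirds of the time.
addr_only_low = HOURS_LOW * 2 / 3
addr_only_high = HOURS_HIGH * 2 / 3

print(f"Full run: {slowest:.2f} to {fastest:.2f} records/second")
print(f"Addresses only: about {addr_only_low:.1f} to {addr_only_high:.1f} hours")
```

That is roughly one record per second with the extra processing included, so the address lookup itself is probably not the bottleneck.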

Hope that helps.

Ryan

Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)

Jul 20 '05 #4

P: n/a
GL
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).
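If a tool trips over PO boxes and PMB lines, one workaround is to flag them yourself before handing the address to the parser. The sketch below is only an illustration: the two patterns cover a few common spellings, are not exhaustive, and have nothing to do with how the MelissaData components work internally. The function name is hypothetical.

```python
import re

# Assumed patterns for common spellings only ("PO Box 12", "P.O. Box 12",
# "PMB 200", "PMB #200"). Real data will need more variants.
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*BOX\s+(\w+)", re.IGNORECASE)
PMB = re.compile(r"\bPMB\s*#?\s*(\w+)", re.IGNORECASE)


def flag_special_delivery(line: str) -> dict:
    """Return the PO box and PMB numbers found in an address line, if any."""
    po = PO_BOX.search(line)
    pmb = PMB.search(line)
    return {
        "po_box": po.group(1) if po else None,
        "pmb": pmb.group(1) if pmb else None,
    }


for sample in ["P.O. Box 45", "123 Main St PMB #200", "Calle Ocho 12"]:
    print(sample, "->", flag_special_delivery(sample))
```

Lines that come back with a value can be routed around the street parser or handled by a separate rule.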

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?

Jul 20 '05 #5

P: n/a
I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
solutions in Delphi 5. Their example code is in C++ but pretty easy to
convert (pseudo code provided). You can either write your own code
using their APIs or use one of their utilities to cleanse the data
for you. The backend is SQL 7, but this could be anything.

QAS Batch allows you to check batches of data and is very good.
QAS Pro lets you, the user, choose (and drill into) the data
matched to a search string.

With this (and some info from Experian) we have been able to get 98%
of our customer data accurate.

We have a couple of applications we use for cleansing data in batches.
The API stuff works really well and is quick. I have been very
impressed so far. We can tweak the output very easily and have
consistent formatting of the results. In QAS Batch you get a result
code telling you the results of checking each part of the address and
where it matched or failed. You can determine which ones pass/fail
your checks.
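To illustrate the pass/fail idea, here is a minimal sketch of applying your own acceptance rule to a per-component result. The status letters, the AddressResult class and the component names are all hypothetical; the actual QAS Batch result code format is not shown in this thread and will differ.

```python
from dataclasses import dataclass

# Hypothetical per-component statuses, NOT the real QAS Batch codes.
MATCHED, CORRECTED, FAILED = "M", "C", "F"


@dataclass
class AddressResult:
    record_id: int
    components: dict  # e.g. {"street": "M", "town": "C", "postcode": "F"}


def passes(result, required=("street", "postcode"), accept=(MATCHED, CORRECTED)):
    """A record passes if every required component matched or was corrected."""
    return all(result.components.get(name, FAILED) in accept for name in required)


def split_batch(results):
    """Separate a batch into accepted and rejected records."""
    accepted = [r for r in results if passes(r)]
    rejected = [r for r in results if not passes(r)]
    return accepted, rejected


batch = [
    AddressResult(1, {"street": "M", "town": "M", "postcode": "C"}),
    AddressResult(2, {"street": "M", "town": "M", "postcode": "F"}),
]
accepted, rejected = split_batch(batch)
print([r.record_id for r in accepted], [r.record_id for r in rejected])
```

Keeping the rule in one function makes it easy to tighten or relax which components must match without touching the rest of the batch code.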

With Pro, there is fuzzy logic with the checking. This is impressive
and accurate (especially on Welsh addresses). The helpdesk were really
good and willing to check over the Delphi code (not supported but
someone there knew Delphi) and give a few pointers when I got stuck.
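For the "not-immediately-obvious duplicates" part of the original question, the general idea of fuzzy matching can be sketched with Python's standard library. This is not QAS Pro's algorithm, which is not described here; it swaps in difflib.SequenceMatcher as a stand-in, and the 0.85 threshold and sample records are assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(text: str) -> str:
    """Lower-case, drop commas and full stops, collapse whitespace."""
    return " ".join(text.lower().replace(",", " ").replace(".", " ").split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 on the normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return (index, index, score) for every pair at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


records = [
    "J. Smith, 12 High St., Cardiff",
    "J Smith 12 High St Cardiff",
    "A. Jones, 4 Mill Lane, Swansea",
]
print(likely_duplicates(records))
```

Comparing every pair is quadratic, so on a real customer table you would first group records by something cheap such as postcode and only compare within each group.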

Ryan
"GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?

Jul 20 '05 #6

P: n/a
Actually (small world or something) it turns out we already HAVE
QuickAddress, which populates addresses for us in our call center app.
What I need for my data warehouse is more than that. So far the
interactive and API products from peoplesmith are looking pretty
interesting...

On 23 Apr 2004 01:40:33 -0700, ry********@hotmail.com (Ryan) wrote:
There were examples in VB as well, if I remember correctly. Besides, QAS
were really helpful on the coding side. You can download example
(full?) copies of the code, which may be worth a look. I built a VB version
first using the examples and then converted to Delphi.

Alternatively you could look at their own solutions as they are really
good. You could even send them the data for them to cleanse, or have
them turn up and cleanse it for you on site. You can possibly
integrate their apps into your app, or use them to cleanse the data
separately. I wrote my own partly for the experience with APIs, but
also as we have additional information we need to add to the file
based on the results of the address so it was easiest to do this all
together.

To give you an idea of performance (as best I can), with a simple
Delphi app, SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we can
batch cleanse 100,000 records (2 passes of address) in 25-30
hours. This is with all sorts of other additions to the code (which is
pretty quick anyway). It should be able to fly through any checking.
We run this every 3 months (ish). We could probably knock a third off
that if we just did the addresses only.

Hope that helps.

Ryan

Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)


Jul 20 '05 #7

