Bytes | Software Development & Data Engineering Community
tools for cleaning name and address data?

What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?
Jul 20 '05 #1
We have customers in Spanish-speaking countries, so if that is a
specific issue I guess this one wouldn't work.

Thanks for the info. :)

On Wed, 21 Apr 2004 15:39:23 -0700, "GL"
<GL@noSpam.ReplyToNewsgroup.com> wrote:
> I have used components from www.MelissaData.com. They mostly worked for me
> for address data, however they tended to have issues parsing Spanish street
> names, PO boxes and PMB (Personal Mail Box).
>
> GL
>
> "Ellen K." <72************************@compuserve.com> wrote in message
> news:ks********************************@4ax.com...
> > What tools has everyone used for cleaning name and address data
> > (including identifying not-immediately-obvious duplicates) in
> > connection with a CRM project or the Customer dimension of a data
> > warehouse? What did you like/dislike about the tool you used? How
> > customizable was the tool you used?


Jul 20 '05 #2
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)

On 22 Apr 2004 00:58:57 -0700, ry********@hotmail.com (Ryan) wrote:
> I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
> solutions in Delphi 5. Their example code is in C++ but pretty easy to
> convert (pseudo code provided). You can either write your own code
> using their APIs or use one of their utilities to cleanse the data
> for you. The backend is SQL 7, but this could be anything.
>
> QAS Batch allows you to check batches of data and is very good.
> QAS Pro allows you, the user, to choose (and drill in to) the data
> matched to a search string.
>
> With this (and some info from Experian) we have been able to get 98%
> of our customer data accurate.
>
> We have a couple of applications we use for cleansing data in batches.
> The API stuff works really well and is quick. I have been very
> impressed so far. We can tweak the output very easily and have
> consistent formatting of the results. In QAS Batch you get a result
> code telling you the results of checking each part of the address and
> where it matched or failed. You can determine which ones pass/fail
> your checks.
>
> With Pro, there is fuzzy logic with the checking. This is impressive
> and accurate (especially on Welsh addresses). The helpdesk were really
> good and willing to check over the Delphi code (not supported but
> someone there knew Delphi) and give a few pointers when I got stuck.
>
> Ryan
> "GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
> > I have used components from www.MelissaData.com. They mostly worked for me
> > for address data, however they tended to have issues parsing Spanish street
> > names, PO boxes and PMB (Personal Mail Box).
> >
> > GL
> >
> > "Ellen K." <72************************@compuserve.com> wrote in message
> > news:ks********************************@4ax.com...
> > > What tools has everyone used for cleaning name and address data
> > > (including identifying not-immediately-obvious duplicates) in
> > > connection with a CRM project or the Customer dimension of a data
> > > warehouse? What did you like/dislike about the tool you used? How
> > > customizable was the tool you used?


Jul 20 '05 #3
There were examples in VB as well, if I remember correctly. Besides, QAS
were really helpful on the coding side. You can download example (possibly
full) copies of the code, which may be worth a look. I built a VB version
first using the examples and then converted it to Delphi.

Alternatively you could look at their own solutions, as they are really
good. You could even send them the data for them to cleanse, or have
them turn up and cleanse it for you on site. You can possibly
integrate their apps into your app, or use them to cleanse the data
separately. I wrote my own partly for the experience with APIs, but
also because we have additional information we need to add to the file
based on the results of the address check, so it was easiest to do this
all together.

To give you an idea of performance (as best I can), with a simple
Delphi app, a SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we
can batch cleanse 100,000 records (2 passes of address) in 25 to 30
hours. This is with all sorts of other additions to the code (which is
pretty quick anyway). It should be able to fly through any checking.
We run this every 3 months (ish). We could probably knock a third off
that if we did the addresses only.
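
To put those figures in per-second terms, here is a rough back-of-envelope calculation in Python. It uses only the numbers quoted above (100,000 records, 2 address passes, 25 to 30 hours, roughly a third saved for addresses only). The constant and function names are made up for illustration, and it assumes the run time is spread evenly over the records.

```python
# Back-of-envelope throughput from the figures quoted in the post.
RECORDS = 100_000               # records cleansed per batch run
PASSES = 2                      # address passes per record
HOURS_LOW, HOURS_HIGH = 25, 30  # reported run time range, in hours


def per_second(count, hours):
    """Average items processed per second over a run of `hours` hours."""
    return count / (hours * 3600)


def address_only_hours(hours, saved_fraction=1 / 3):
    """Estimated run time if roughly a third is saved by doing addresses only."""
    return hours * (1 - saved_fraction)


if __name__ == "__main__":
    lookups = RECORDS * PASSES
    print(f"records/s: {per_second(RECORDS, HOURS_HIGH):.2f} to "
          f"{per_second(RECORDS, HOURS_LOW):.2f}")
    print(f"address lookups/s: {per_second(lookups, HOURS_HIGH):.2f} to "
          f"{per_second(lookups, HOURS_LOW):.2f}")
    print(f"addresses only: about {address_only_hours(HOURS_LOW):.1f} to "
          f"{address_only_hours(HOURS_HIGH):.1f} hours")
```

So the whole run averages roughly one record per second, or about two address lookups per second, including the extra processing mentioned above.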

Hope that helps.

Ryan

Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
> This sounds like it's worth pursuing, I will check it out. We are
> currently a VB shop but I could certainly convert from C++ examples,
> although we were hoping not to have to code much of anything, i.e. the
> idea of a purchased solution was to eliminate that.
>
> Thanks very much. :)

Jul 20 '05 #4
GL
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).
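
One way to cope with a parser that struggles with these cases is to flag them up front and route them for manual review. Below is a minimal Python sketch of that idea. The regular expressions and the function name `flag_for_review` are assumptions made for illustration, not part of any MelissaData component, and the list of Spanish street types is deliberately short.

```python
import re

# Illustrative patterns only; real address data will need more variants.
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*Box\s+\w+", re.IGNORECASE)
PMB = re.compile(r"\bPMB\b\s*#?\s*\w+", re.IGNORECASE)
SPANISH_STREET = re.compile(
    r"^\s*(calle|avenida|avda|paseo|plaza|camino)\b", re.IGNORECASE
)


def flag_for_review(address):
    """Return the reasons an address line may need manual handling."""
    reasons = []
    if PO_BOX.search(address):
        reasons.append("po_box")
    if PMB.search(address):
        reasons.append("pmb")
    if SPANISH_STREET.search(address):
        reasons.append("spanish_street")
    return reasons


if __name__ == "__main__":
    for line in ["P.O. Box 123, Austin TX",
                 "1234 Main St PMB 45",
                 "Calle Mayor 4, Madrid",
                 "12 High Street"]:
        print(line, "->", flag_for_review(line))
```

Anything that comes back with a non-empty list can be held out of the automated pass and checked by hand or with a different tool.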

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
> What tools has everyone used for cleaning name and address data
> (including identifying not-immediately-obvious duplicates) in
> connection with a CRM project or the Customer dimension of a data
> warehouse? What did you like/dislike about the tool you used? How
> customizable was the tool you used?

Jul 20 '05 #5
I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
solutions in Delphi 5. Their example code is in C++ but pretty easy to
convert (pseudo code provided). You can either write your own code
using their APIs or use one of their utilities to cleanse the data
for you. The backend is SQL 7, but this could be anything.

QAS Batch allows you to check batches of data and is very good.
QAS Pro allows you, the user, to choose (and drill in to) the data
matched to a search string.

With this (and some info from Experian) we have been able to get 98%
of our customer data accurate.

We have a couple of applications we use for cleansing data in batches.
The API stuff works really well and is quick. I have been very
impressed so far. We can tweak the output very easily and have
consistent formatting of the results. In QAS Batch you get a result
code telling you the results of checking each part of the address and
where it matched or failed. You can determine which ones pass/fail
your checks.
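
As a sketch of the pass/fail idea above: suppose each checked address comes back with a status per address component. The status letters, component names and the `CheckedAddress` type below are hypothetical, made up for illustration. They are not the actual QAS Batch result-code format, so check the vendor documentation for the real layout.

```python
from dataclasses import dataclass, field

# Hypothetical per-component statuses (NOT the real QAS Batch codes).
MATCHED, PARTIAL, FAILED = "M", "P", "F"


@dataclass
class CheckedAddress:
    record_id: int
    # Maps a component name to its status, e.g. {"street": "M", "postcode": "F"}.
    component_status: dict = field(default_factory=dict)


def passes(checked, required=("street", "town", "postcode"), allow_partial=()):
    """Apply your own acceptance rule to one checked address.

    Every required component must be MATCHED, except that components
    listed in `allow_partial` may also be PARTIAL. Missing components
    count as FAILED.
    """
    for component in required:
        status = checked.component_status.get(component, FAILED)
        if status == MATCHED:
            continue
        if status == PARTIAL and component in allow_partial:
            continue
        return False
    return True


if __name__ == "__main__":
    good = CheckedAddress(1, {"street": "M", "town": "M", "postcode": "M"})
    fuzzy = CheckedAddress(2, {"street": "P", "town": "M", "postcode": "M"})
    print(passes(good), passes(fuzzy), passes(fuzzy, allow_partial=("street",)))
```

Keeping the acceptance rule in one small function makes it easy to tighten or relax which components must match without touching the rest of the batch job.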

With Pro, there is fuzzy logic with the checking. This is impressive
and accurate (especially on Welsh addresses). The helpdesk were really
good and willing to check over the Delphi code (not supported but
someone there knew Delphi) and give a few pointers when I got stuck.
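
For anyone without such a tool, the general idea of fuzzy matching can be sketched with Python's standard-library `difflib`. This is not how QAS Pro works internally (nothing here describes that). It is a generic string-similarity approach, and the 0.85 threshold and the sample records are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 for two name/address strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold.

    `records` maps a record id to its combined name and address string.
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(records.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


if __name__ == "__main__":
    sample = {
        1: "Jon Smith, 12 High Street, Cardiff",
        2: "John Smith, 12 High St., Cardiff",
        3: "Maria Lopez, 4 Calle Mayor, Madrid",
    }
    for id_a, id_b, score in likely_duplicates(sample):
        print(id_a, id_b, f"{score:.2f}")
```

Comparing every pair is quadratic in the number of records, so on a large customer table you would normally compare only within groups that share something cheap to compute, such as the same postcode.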

Ryan
"GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
> I have used components from www.MelissaData.com. They mostly worked for me
> for address data, however they tended to have issues parsing Spanish street
> names, PO boxes and PMB (Personal Mail Box).
>
> GL
>
> "Ellen K." <72************************@compuserve.com> wrote in message
> news:ks********************************@4ax.com...
> > What tools has everyone used for cleaning name and address data
> > (including identifying not-immediately-obvious duplicates) in
> > connection with a CRM project or the Customer dimension of a data
> > warehouse? What did you like/dislike about the tool you used? How
> > customizable was the tool you used?

Jul 20 '05 #6
Actually (small world or something) it turns out we already HAVE
QuickAddress, which populates addresses for us in our call center app.
What I need for my data warehouse is more than that. So far the
interactive and API products from peoplesmith are looking pretty
interesting...

On 23 Apr 2004 01:40:33 -0700, ry********@hotmail.com (Ryan) wrote:
> There were examples in VB as well, if I remember correctly. Besides, QAS
> were really helpful on the coding side. You can download example (possibly
> full) copies of the code, which may be worth a look. I built a VB version
> first using the examples and then converted it to Delphi.
>
> Alternatively you could look at their own solutions, as they are really
> good. You could even send them the data for them to cleanse, or have
> them turn up and cleanse it for you on site. You can possibly
> integrate their apps into your app, or use them to cleanse the data
> separately. I wrote my own partly for the experience with APIs, but
> also because we have additional information we need to add to the file
> based on the results of the address check, so it was easiest to do this
> all together.
>
> To give you an idea of performance (as best I can), with a simple
> Delphi app, a SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we
> can batch cleanse 100,000 records (2 passes of address) in 25 to 30
> hours. This is with all sorts of other additions to the code (which is
> pretty quick anyway). It should be able to fly through any checking.
> We run this every 3 months (ish). We could probably knock a third off
> that if we did the addresses only.
>
> Hope that helps.
>
> Ryan
>
> Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
> > This sounds like it's worth pursuing, I will check it out. We are
> > currently a VB shop but I could certainly convert from C++ examples,
> > although we were hoping not to have to code much of anything, i.e. the
> > idea of a purchased solution was to eliminate that.
> >
> > Thanks very much. :)


Jul 20 '05 #7
