tools for cleaning name and address data?

What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?
Jul 20 '05 #1
We have customers in Spanish-speaking countries, so if that is a
specific issue I guess this one wouldn't work.

Thanks for the info. :)

On Wed, 21 Apr 2004 15:39:23 -0700, "GL"
<GL@noSpam.ReplyToNewsgroup.com> wrote:
> I have used components from www.MelissaData.com. They mostly worked for me
> for address data, however they tended to have issues parsing Spanish street
> names, PO boxes and PMB (Personal Mail Box).
>
> GL
>
> "Ellen K." <72************************@compuserve.com> wrote in message
> news:ks********************************@4ax.com...
> > What tools has everyone used for cleaning name and address data
> > (including identifying not-immediately-obvious duplicates) in
> > connection with a CRM project or the Customer dimension of a data
> > warehouse? What did you like/dislike about the tool you used? How
> > customizable was the tool you used?


Jul 20 '05 #2
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)

On 22 Apr 2004 00:58:57 -0700, ry********@hotmail.com (Ryan) wrote:
> I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
> solutions in Delphi 5. Their example code is in C++ but pretty easy to
> convert (pseudo code provided). You can either write your own code
> using their APIs or use one of their utilities to cleanse the data
> for you. The backend is SQL 7, but this could be anything.
>
> QAS Batch allows you to check batches of data and is very good.
> QAS Pro allows you, the user, to choose (and drill into) the data
> matched to a search string.
>
> With this (and some info from Experian) we have been able to get 98%
> of our customer data accurate.
>
> We have a couple of applications we use for cleansing data in batches.
> The API stuff works really well and is quick. I have been very
> impressed so far. We can tweak the output very easily and have
> consistent formatting of the results. In QAS Batch you get a result
> code telling you the results of checking each part of the address and
> where it matched or failed. You can determine which ones pass/fail
> your checks.
>
> With Pro, there is fuzzy logic with the checking. This is impressive
> and accurate (especially on Welsh addresses). The helpdesk were really
> good and willing to check over the Delphi code (not supported but
> someone there knew Delphi) and give a few pointers when I got stuck.
>
> Ryan
>
> "GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
> > I have used components from www.MelissaData.com. They mostly worked for me
> > for address data, however they tended to have issues parsing Spanish street
> > names, PO boxes and PMB (Personal Mail Box).
> >
> > GL
> >
> > "Ellen K." <72************************@compuserve.com> wrote in message
> > news:ks********************************@4ax.com...
> > > What tools has everyone used for cleaning name and address data
> > > (including identifying not-immediately-obvious duplicates) in
> > > connection with a CRM project or the Customer dimension of a data
> > > warehouse? What did you like/dislike about the tool you used? How
> > > customizable was the tool you used?


Jul 20 '05 #3
There were examples in VB as well, if I remember correctly. Besides, QAS
were really helpful on the coding side. You can download example (full?)
copies of the code, which may be worth a look. I built a VB version
first using the examples and then converted to Delphi.

Alternatively you could look at their own solutions as they are really
good. You could even send them the data for them to cleanse, or have
them turn up and cleanse it for you on site. You can possibly
integrate their apps into your app, or use them to cleanse the data
separately. I wrote my own partly for the experience with APIs, but
also as we have additional information we need to add to the file
based on the results of the address so it was easiest to do this all
together.

To give you an idea of performance (as best I can), with a simple
Delphi app, SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we can
batch cleanse 100,000 records (2 passes of address) in 25-30
hours. This is with all sorts of other additions to the code (which is
pretty quick anyway). It should be able to fly through any checking.
We run this every 3 months (ish). We could probably knock a third off
that if we just did the addresses only.
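
For a rough sense of scale, the figures above can be turned into a throughput estimate. The sketch below is in Python and uses only the numbers quoted in this post (100,000 records, 25-30 hours, and roughly a third off for an address-only run); the function names are made up for illustration.

```python
def throughput(records: int, hours: float) -> float:
    """Records processed per second for a batch run of the given length."""
    return records / (hours * 3600)


def address_only_hours(hours: float) -> float:
    """Estimated run time after knocking a third off the full run."""
    return hours * 2 / 3


if __name__ == "__main__":
    for hours in (25, 30):
        rate = throughput(100_000, hours)
        short = address_only_hours(hours)
        print(f"{hours} h full run: {rate:.2f} records/s, "
              f"address-only run about {short:.0f} h")
```

On those figures the run works out at roughly one record per second, so most of the time is presumably spent in the extra per-record work rather than in the address check itself.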

Hope that helps.

Ryan

Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
> This sounds like it's worth pursuing, I will check it out. We are
> currently a VB shop but I could certainly convert from C++ examples,
> although we were hoping not to have to code much of anything, i.e. the
> idea of a purchased solution was to eliminate that.
>
> Thanks very much. :)

Jul 20 '05 #4
GL
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).
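
To make the PO box and PMB problem concrete, here is a minimal sketch in Python of spotting those two forms in a single address line before handing it to a parser. The patterns and the `classify_line` helper are assumptions for illustration only; they are not how MelissaData works, and real postal data has many more variants.

```python
import re

# Hypothetical patterns, not taken from any vendor's product.
PO_BOX = re.compile(r"\b(?:P\.?\s*O\.?|POST\s+OFFICE)\s*BOX\s+(\w+)", re.IGNORECASE)
PMB = re.compile(r"\bPMB\s*#?\s*(\w+)", re.IGNORECASE)


def classify_line(line: str) -> dict:
    """Return the PO box and PMB identifiers found in one address line, if any."""
    po = PO_BOX.search(line)
    pmb = PMB.search(line)
    return {
        "po_box": po.group(1) if po else None,
        "pmb": pmb.group(1) if pmb else None,
    }


if __name__ == "__main__":
    for sample in ("P.O. Box 123", "123 Main St PMB #7", "Calle de la Paz 12"):
        print(sample, "->", classify_line(sample))
```

Lines that match neither pattern, such as the Spanish street name in the last sample, fall through untouched and still need a parser that understands the local format.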

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
> What tools has everyone used for cleaning name and address data
> (including identifying not-immediately-obvious duplicates) in
> connection with a CRM project or the Customer dimension of a data
> warehouse? What did you like/dislike about the tool you used? How
> customizable was the tool you used?

Jul 20 '05 #5
I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
solutions in Delphi 5. Their example code is in C++ but pretty easy to
convert (pseudo code provided). You can either write your own code
using their APIs or use one of their utilities to cleanse the data
for you. The backend is SQL 7, but this could be anything.

QAS Batch allows you to check batches of data and is very good.
QAS Pro allows you, the user, to choose (and drill into) the data
matched to a search string.

With this (and some info from Experian) we have been able to get 98%
of our customer data accurate.

We have a couple of applications we use for cleansing data in batches.
The API stuff works really well and is quick. I have been very
impressed so far. We can tweak the output very easily and have
consistent formatting of the results. In QAS Batch you get a result
code telling you the results of checking each part of the address and
where it matched or failed. You can determine which ones pass/fail
your checks.
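
As an illustration of applying your own pass/fail rule on top of a per-part result, here is a minimal Python sketch. The status letters, the `CleansedAddress` record and the `passes` function are all hypothetical; the actual QAS Batch result code format is not described in this thread and will differ.

```python
from dataclasses import dataclass

# Hypothetical per-part statuses, invented for this example.
MATCHED, PARTIAL, FAILED = "M", "P", "F"


@dataclass
class CleansedAddress:
    record_id: int
    part_status: dict  # e.g. {"street": "M", "town": "M", "postcode": "P"}


def passes(addr, required=("street", "postcode"), allow_partial=()):
    """Accept a record only if every required part matched.

    Parts listed in allow_partial may also be accepted on a partial match.
    A part missing from the result is treated as failed.
    """
    for part in required:
        status = addr.part_status.get(part, FAILED)
        if status == MATCHED:
            continue
        if status == PARTIAL and part in allow_partial:
            continue
        return False
    return True


if __name__ == "__main__":
    batch = [
        CleansedAddress(1, {"street": "M", "postcode": "M"}),
        CleansedAddress(2, {"street": "M", "postcode": "P"}),
        CleansedAddress(3, {"street": "F", "postcode": "M"}),
    ]
    accepted = [a.record_id for a in batch if passes(a)]
    print("accepted:", accepted)
```

Keeping the rule in one function makes it easy to tighten or relax the checks for different runs without touching the cleansing step.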

With Pro, there is fuzzy logic with the checking. This is impressive
and accurate (especially on Welsh addresses). The helpdesk were really
good and willing to check over the Delphi code (not supported but
someone there knew Delphi) and give a few pointers when I got stuck.
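
For the "not-immediately-obvious duplicates" part of the original question, a very rough idea of fuzzy comparison can be sketched with Python's standard library. This is a stand-in technique (normalise the text, then compare with `difflib.SequenceMatcher`), not the fuzzy logic used by QAS Pro; the helper names and the 0.85 threshold are assumptions picked for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 on the normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Yield index pairs whose similarity meets the threshold.

    Compares every pair, so it is only practical for small batches.
    """
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                yield i, j


if __name__ == "__main__":
    customers = [
        "John Smith, 1 Main Street",
        "Jon Smith, 1 Main Street",
        "Maria Lopez, Calle Mayor 5",
    ]
    print(list(likely_duplicates(customers)))
```

A real tool would normally block records by postcode or surname first so that only plausible pairs are compared, and would treat names and address parts separately rather than as one string.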

Ryan
"GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
> I have used components from www.MelissaData.com. They mostly worked for me
> for address data, however they tended to have issues parsing Spanish street
> names, PO boxes and PMB (Personal Mail Box).
>
> GL
>
> "Ellen K." <72************************@compuserve.com> wrote in message
> news:ks********************************@4ax.com...
> > What tools has everyone used for cleaning name and address data
> > (including identifying not-immediately-obvious duplicates) in
> > connection with a CRM project or the Customer dimension of a data
> > warehouse? What did you like/dislike about the tool you used? How
> > customizable was the tool you used?

Jul 20 '05 #6
Actually (small world or something) it turns out we already HAVE
QuickAddress, which populates addresses for us in our call center app.
What I need for my data warehouse is more than that. So far the
interactive and API products from peoplesmith are looking pretty
interesting...

On 23 Apr 2004 01:40:33 -0700, ry********@hotmail.com (Ryan) wrote:
> There were examples in VB as well, if I remember correctly. Besides, QAS
> were really helpful on the coding side. You can download example (full?)
> copies of the code, which may be worth a look. I built a VB version
> first using the examples and then converted to Delphi.
>
> Alternatively you could look at their own solutions as they are really
> good. You could even send them the data for them to cleanse, or have
> them turn up and cleanse it for you on site. You can possibly
> integrate their apps into your app, or use them to cleanse the data
> separately. I wrote my own partly for the experience with APIs, but
> also as we have additional information we need to add to the file
> based on the results of the address so it was easiest to do this all
> together.
>
> To give you an idea of performance (as best I can), with a simple
> Delphi app, SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we can
> batch cleanse 100,000 records (2 passes of address) in 25-30
> hours. This is with all sorts of other additions to the code (which is
> pretty quick anyway). It should be able to fly through any checking.
> We run this every 3 months (ish). We could probably knock a third off
> that if we just did the addresses only.
>
> Hope that helps.
>
> Ryan
>
> Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
> > This sounds like it's worth pursuing, I will check it out. We are
> > currently a VB shop but I could certainly convert from C++ examples,
> > although we were hoping not to have to code much of anything, i.e. the
> > idea of a purchased solution was to eliminate that.
> >
> > Thanks very much. :)


Jul 20 '05 #7
