tools for cleaning name and address data?

What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?
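
For anyone unsure what "not-immediately-obvious duplicates" means in practice, here is a minimal sketch using only Python's standard library. It is not how any particular commercial tool works; the abbreviation table, the 0.85 threshold, and the sample records are all assumptions made up for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real tool would use a much larger one.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "suite": "ste"}

def normalize(text):
    """Lowercase, strip punctuation, and collapse common abbreviations."""
    words = re.sub(r"[^\w\s]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def likely_duplicates(records, threshold=0.85):
    """Return index pairs whose normalized text is at least `threshold` similar."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs

# Made-up sample records: the first two differ only in spelling and abbreviation.
customers = [
    "Jon Smith, 12 Main Street, Springfield",
    "John Smith, 12 Main St., Springfield",
    "Maria Lopez, 400 Oak Avenue, Riverton",
]
print(likely_duplicates(customers))
```

Comparing every pair is quadratic in the number of records, so a sketch like this only suits small samples.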
Jul 20 '05 #1
We have customers in Spanish-speaking countries, so if that is a
specific issue I guess this one wouldn't work.

Thanks for the info. :)

On Wed, 21 Apr 2004 15:39:23 -0700, "GL"
<GL@noSpam.ReplyToNewsgroup.com> wrote:
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?


Jul 20 '05 #2
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)

On 22 Apr 2004 00:58:57 -0700, ry********@hotmail.com (Ryan) wrote:
I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
solutions in Delphi 5. Their example code is in C++ but pretty easy to
convert (pseudo code provided). You can either write your own code
using their APIs or use one of their utilities to cleanse the data
for you. The backend is SQL 7, but this could be anything.

QAS Batch allows you to check batches of data and is very good.
QAS Pro lets you, the user, choose (and drill into) the data
matched to a search string.

With this (and some info from Experian) we have been able to get 98%
of our customer data accurate.

We have a couple of applications we use for cleansing data in batches.
The API stuff works really well and is quick. I have been very
impressed so far. We can tweak the output very easily and have
consistent formatting of the results. In QAS Batch you get a result
code telling you the results of checking each part of the address and
where it matched or failed. You can determine which ones pass/fail
your checks.

With Pro, there is fuzzy logic with the checking. This is impressive
and accurate (especially on Welsh addresses). The helpdesk were really
good and willing to check over the Delphi code (not supported but
someone there knew Delphi) and give a few pointers when I got stuck.

Ryan
"GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
> What tools has everyone used for cleaning name and address data
> (including identifying not-immediately-obvious duplicates) in
> connection with a CRM project or the Customer dimension of a data
> warehouse? What did you like/dislike about the tool you used? How
> customizable was the tool you used?


Jul 20 '05 #3
There were examples in VB as well, if I remember correctly. Besides, QAS
were really helpful on the coding side. You can download example (full?)
copies of the code, which may be worth a look. I built a VB version
first using the examples and then converted to Delphi.

Alternatively you could look at their own solutions as they are really
good. You could even send them the data for them to cleanse, or have
them turn up and cleanse it for you on site. You can possibly
integrate their apps into your app, or use them to cleanse the data
separately. I wrote my own partly for the experience with APIs, but
also as we have additional information we need to add to the file
based on the results of the address so it was easiest to do this all
together.

To give you an idea of performance (as best I can), with a simple
Delphi app, SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we can
batch cleanse 100,000 records (2 passes of address) in 25 to 30
hours. This is with all sorts of other additions to the code (which is
pretty quick anyway). It should be able to fly through any checking.
We run this every 3 months (ish). We could probably knock a third off
that if we just did the addresses only.
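
The rough arithmetic behind those figures, as a small Python sketch. It only restates the numbers above (100,000 records, 2 address passes each, 25 to 30 hours, and the guess of a third less time for addresses only); nothing here was measured separately.

```python
RECORDS = 100_000
PASSES_PER_RECORD = 2  # "2 passes of address" per record

def throughput(hours):
    """Records per second for a run of the given length in hours."""
    return RECORDS / (hours * 3600)

for hours in (25, 30):
    # "Knock a third off" the run time if only addresses are checked.
    reduced = hours * (1 - 1 / 3)
    print(f"{hours} h: {throughput(hours):.2f} records/s, "
          f"{throughput(hours) * PASSES_PER_RECORD:.2f} address passes/s, "
          f"addresses only: about {reduced:.1f} h")
```

That works out to roughly one record per second, or about two address passes per second.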

Hope that helps.

Ryan

Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)

Jul 20 '05 #4
GL
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).
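
One way to see how much of a file is affected before choosing a tool is to flag these elements up front. The sketch below is only an illustration using Python's `re` module; the patterns (including the Spanish "Apartado" / "Apdo." forms) and the function name `special_parts` are assumptions for illustration, not anything taken from the MelissaData components.

```python
import re

# Illustrative patterns only; real address data has many more variants.
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*Box\s+(\w+)", re.IGNORECASE)
PMB = re.compile(r"\bPMB\s*#?\s*(\w+)", re.IGNORECASE)
# "Apartado" (abbreviated "Apdo.") is a common Spanish term for a PO box.
APARTADO = re.compile(
    r"\b(?:Apartado|Apdo\.?)(?:\s+(?:Postal|de\s+Correos))?\s+(\w+)",
    re.IGNORECASE,
)

def special_parts(address):
    """Return (kind, identifier) pairs for PO box / PMB style elements found."""
    found = []
    for kind, pattern in (("po_box", PO_BOX), ("pmb", PMB), ("apartado", APARTADO)):
        match = pattern.search(address)
        if match:
            found.append((kind, match.group(1)))
    return found

# Made-up sample addresses.
for line in ("P.O. Box 123, Austin TX",
             "123 Main St PMB 45",
             "Apdo. Postal 77, Monterrey",
             "Calle Luna 5"):
    print(line, "->", special_parts(line))
```

Records that come back with a non-empty list could be routed to manual review or a second tool rather than trusted to a parser known to struggle with them.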

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?

Jul 20 '05 #5
I use the 32-bit APIs from QAS ( www.qas.com ) and build up my own
solutions in Delphi 5. Their example code is in C++ but pretty easy to
convert (pseudo code provided). You can either write your own code
using their APIs or use one of their utilities to cleanse the data
for you. The backend is SQL 7, but this could be anything.

QAS Batch allows you to check batches of data and is very good.
QAS Pro lets you, the user, choose (and drill into) the data
matched to a search string.

With this (and some info from Experian) we have been able to get 98%
of our customer data accurate.

We have a couple of applications we use for cleansing data in batches.
The API stuff works really well and is quick. I have been very
impressed so far. We can tweak the output very easily and have
consistent formatting of the results. In QAS Batch you get a result
code telling you the results of checking each part of the address and
where it matched or failed. You can determine which ones pass/fail
your checks.
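
As a rough idea of what applying your own pass/fail checks to per-part results can look like, here is a sketch in Python. The status letters, the `AddressResult` structure, and the acceptance rule are all hypothetical; the actual QAS Batch result code format is not reproduced here and will differ.

```python
from dataclasses import dataclass

# Hypothetical per-part outcomes; NOT the real QAS Batch result code format.
MATCHED, PARTIAL, FAILED = "M", "P", "F"

@dataclass
class AddressResult:
    record_id: int
    parts: dict  # e.g. {"street": "M", "town": "M", "postcode": "P"}

def passes(result, required=("street", "town", "postcode"), allow_partial=("street",)):
    """Apply our own acceptance rule to the per-part outcomes."""
    for part in required:
        status = result.parts.get(part, FAILED)  # a missing part counts as failed
        if status == MATCHED:
            continue
        if status == PARTIAL and part in allow_partial:
            continue
        return False
    return True

def split_batch(results):
    """Partition results into (accepted, needs_review) lists of record ids."""
    accepted, review = [], []
    for r in results:
        (accepted if passes(r) else review).append(r.record_id)
    return accepted, review

# Made-up batch: record 2 has a partial street match, 3 a failed postcode,
# and 4 is missing its postcode result entirely.
batch = [
    AddressResult(1, {"street": "M", "town": "M", "postcode": "M"}),
    AddressResult(2, {"street": "P", "town": "M", "postcode": "M"}),
    AddressResult(3, {"street": "M", "town": "M", "postcode": "F"}),
    AddressResult(4, {"street": "M", "town": "M"}),
]
print(split_batch(batch))
```

Keeping the rule in one function like `passes` makes it easy to tighten or relax which parts must match without touching the batch loop.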

With Pro, there is fuzzy logic with the checking. This is impressive
and accurate (especially on Welsh addresses). The helpdesk were really
good and willing to check over the Delphi code (not supported but
someone there knew Delphi) and give a few pointers when I got stuck.

Ryan
"GL" <GL@noSpam.ReplyToNewsgroup.com> wrote in message news:<10*************@news.supernews.com>...
I have used components from www.MelissaData.com. They mostly worked for me
for address data, however they tended to have issues parsing Spanish street
names, PO boxes and PMB (Personal Mail Box).

GL

"Ellen K." <72************************@compuserve.com> wrote in message
news:ks********************************@4ax.com...
What tools has everyone used for cleaning name and address data
(including identifying not-immediately-obvious duplicates) in
connection with a CRM project or the Customer dimension of a data
warehouse? What did you like/dislike about the tool you used? How
customizable was the tool you used?

Jul 20 '05 #6
Actually (small world or something) it turns out we already HAVE
QuickAddress, which populates addresses for us in our call center app.
What I need for my data warehouse is more than that. So far the
interactive and API products from peoplesmith are looking pretty
interesting...

On 23 Apr 2004 01:40:33 -0700, ry********@hotmail.com (Ryan) wrote:
There were examples in VB as well, if I remember correctly. Besides, QAS
were really helpful on the coding side. You can download example (full?)
copies of the code, which may be worth a look. I built a VB version
first using the examples and then converted to Delphi.

Alternatively you could look at their own solutions as they are really
good. You could even send them the data for them to cleanse, or have
them turn up and cleanse it for you on site. You can possibly
integrate their apps into your app, or use them to cleanse the data
separately. I wrote my own partly for the experience with APIs, but
also as we have additional information we need to add to the file
based on the results of the address so it was easiest to do this all
together.

To give you an idea of performance (as best I can), with a simple
Delphi app, SQL 7 backend (2 x 2.4 GHz Xeon server, 2 GB) and QAS, we can
batch cleanse 100,000 records (2 passes of address) in 25 to 30
hours. This is with all sorts of other additions to the code (which is
pretty quick anyway). It should be able to fly through any checking.
We run this every 3 months (ish). We could probably knock a third off
that if we just did the addresses only.

Hope that helps.

Ryan

Ellen K. <72************************@compuserve.com> wrote in message news:<6o********************************@4ax.com>...
This sounds like it's worth pursuing, I will check it out. We are
currently a VB shop but I could certainly convert from C++ examples,
although we were hoping not to have to code much of anything, i.e. the
idea of a purchased solution was to eliminate that.

Thanks very much. :)


Jul 20 '05 #7