Bytes | Software Development & Data Engineering Community

How to develop/acquire effective duplicate matching routine?

I'm not sure if I am posting this message in the right place. Please let me know if I should redirect this question elsewhere.

I'm currently working on a large database cleanup and consolidation project where I own pretty much everything to do with the project. The database represents a healthcare network.

Part of the data cleanup is merging duplicate doctors. The data is split into a master table plus additional tables for addresses, IDs, etc. The current database contains data drawn from three different sources over time, so the range of duplicates is pretty vast.

I have created a semi-effective duplicate-finding routine in SQL Server T-SQL, but it is slow and not very flexible. Ideally I'd like to use the same duplicate finding/matching routine for several purposes: cleaning up current data, validating new data entered by users (checking whether a provider already exists), and properly matching doctors from external data sources to doctors in our internal database.

Having highly accurate and efficient duplicate control is a crucial part of this project, and what I've created so far doesn't work as well as I would like. There are so many variations of names, IDs, etc. that I'm not sure how to improve it. I imagine there's some kind of confidence-level technique I could use, or something similar.
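To make the "confidence level" idea a bit more concrete, here is a rough sketch of what I have in mind. It is not my current routine, and it is in Python only for illustration. The field names, weights, and thresholds are all made up and would need tuning against real data; the string comparison uses `difflib.SequenceMatcher` from the standard library as a stand-in for whatever fuzzy comparison turns out to work best.

```python
from difflib import SequenceMatcher

# Hypothetical field weights (they sum to 1.0); real values would need tuning.
WEIGHTS = {"last_name": 0.35, "first_name": 0.25, "license_id": 0.40}


def normalize(value):
    """Lower-case and keep only letters and digits, e.g. 'A-12345' -> 'a12345'."""
    return "".join(ch for ch in (value or "").lower() if ch.isalnum())


def similarity(a, b):
    """Return a 0..1 similarity ratio; a missing value contributes no evidence."""
    a, b = normalize(a), normalize(b)
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_confidence(rec_a, rec_b):
    """Weighted sum of per-field similarities, between 0 and 1."""
    return sum(
        weight * similarity(rec_a.get(field), rec_b.get(field))
        for field, weight in WEIGHTS.items()
    )


def classify(score, auto_merge=0.90, review=0.70):
    """Map a score to an action; both thresholds are assumed, not derived."""
    if score >= auto_merge:
        return "auto-merge"
    if score >= review:
        return "manual review"
    return "distinct"


if __name__ == "__main__":
    internal = {"last_name": "Smith", "first_name": "John", "license_id": "A-12345"}
    external = {"last_name": "SMITH", "first_name": "Jon", "license_id": "a12345"}
    score = match_confidence(internal, external)
    print(round(score, 3), classify(score))
```

The appeal of something like this is that the same scoring function could serve all three uses (cleanup, validating new entries, matching external sources), with only the thresholds changing per use.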

I've tried searching online for tips but can't seem to find anything specific to my situation. Does anyone have ideas or comments on how they would proceed? I can currently develop in SQL Server (T-SQL) and VB.NET, and I'm familiar with creating CLR routines (coupling VB.NET with SQL Server). I'm also considering outsourcing this component, but I don't know where to go for that either.

Thanks,

Zach
Sep 18 '10 #1

