
Fuzzy Duplicates

  #1  
June 30th, 2009, 10:53 PM
Newbie
 
Join Date: Jun 2009
Posts: 1
Hello all,

Sorry for being new here. I have read many threads on this site and searched Google for a long time. I've tried many examples and just cannot do what I need to do.


Here's the setup.


I have a table with two columns: a unique ID and a product description. Product descriptions are added by end users. These users sometimes misspell or mistype a description and add a new record. I need to find those duplicates and report on them.

Sample data:

1 Bytes Website
2 Bytes Website - New
3 Book Store
4 Bytes Website - Expert Help - Website
5 Ice Cream Shop
6 Cat Food
7 Cat Food - CatFood.com - NEW YORK

In the above sample I would like to see only the three "Bytes Website" records and the two "Cat Food" records, leaving the other products out of the mix.

I'll then go in and manually shift the data around. I just need a duplicate report.

Thank you.

*Sorry if I'm bad at explaining what I need.*
  #2  
July 4th, 2009, 01:20 AM
Member
 
Join Date: Aug 2007
Posts: 119

re: Fuzzy Duplicates


That's a common problem with all databases and it would be great to prevent users from entering duplicates. However, in the real world, they're going to happen just as you described.

You could write a script to do what you want: go through the entire table one record at a time, store its description in a variable, and compare it to all of the other descriptions with the LIKE operator.
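As a rough illustration of that idea, here is a minimal sketch in Python using the standard sqlite3 module. The table name `products`, the column names `id` and `description`, and both function names are assumptions made for this example, not anything from the original poster's database. Instead of looping in script code, it lets a self-join do the record-by-record comparison and treats a row as a likely duplicate when its description starts with another row's description. That rule catches the sample data from the first post, but it will not catch genuine misspellings.

```python
import sqlite3

# Sample rows copied from the first post in this thread.
SAMPLE = [
    (1, "Bytes Website"),
    (2, "Bytes Website - New"),
    (3, "Book Store"),
    (4, "Bytes Website - Expert Help - Website"),
    (5, "Ice Cream Shop"),
    (6, "Cat Food"),
    (7, "Cat Food - CatFood.com - NEW YORK"),
]


def make_sample_db():
    """Build an in-memory database with a hypothetical 'products' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?)", SAMPLE)
    return conn


def find_prefix_duplicates(conn):
    """Map each base row id to the ids whose description starts with it."""
    rows = conn.execute(
        """
        SELECT a.id, b.id
        FROM products AS a
        JOIN products AS b
          ON a.id <> b.id
         AND b.description LIKE a.description || '%'
        ORDER BY a.id, b.id
        """
    ).fetchall()
    groups = {}
    for base_id, dup_id in rows:
        groups.setdefault(base_id, []).append(dup_id)
    return groups


if __name__ == "__main__":
    for base_id, dup_ids in find_prefix_duplicates(make_sample_db()).items():
        print(base_id, "is probably duplicated by", dup_ids)
```

Two caveats: a description that itself contains `%` or `_` would be read as a LIKE wildcard, and the self-join compares every row with every other row, so it gets slow on large tables.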

There is probably a better way. Hopefully someone here will be able to help you better than I can.

My preference would be to force the user to search the database before entering a new description, thereby cutting down on duplicates. Write the code so the user enters the description, but instead of updating the database right away, search for similar descriptions with LIKE and display those to the user first. Then you can give them the option to accept one of the descriptions found, add the new description as typed, or cancel the transaction.
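A minimal sketch of that search-before-insert flow, again in Python with sqlite3. The `products` table, its columns, and the function names `find_similar` and `add_description` are hypothetical, and the `confirmed` flag stands in for whatever "add as typed" button the real entry form would have.

```python
import sqlite3


def find_similar(conn, text):
    """Return existing rows whose description contains the new text,
    or whose description is contained in the new text."""
    return conn.execute(
        """
        SELECT id, description
        FROM products
        WHERE description LIKE '%' || ? || '%'
           OR ? LIKE '%' || description || '%'
        ORDER BY id
        """,
        (text, text),
    ).fetchall()


def add_description(conn, text, confirmed=False):
    """Insert the description only if nothing similar exists, or if the
    user has already confirmed. Returns (new_id, []) after an insert, or
    (None, matches) when the caller should show the matches first."""
    text = text.strip()
    if not text:
        raise ValueError("description must not be empty")
    matches = find_similar(conn, text)
    if matches and not confirmed:
        return None, matches
    cur = conn.execute(
        "INSERT INTO products (description) VALUES (?)", (text,)
    )
    conn.commit()
    return cur.lastrowid, []
```

The calling code would show the returned matches to the user and then either reuse one of them, call `add_description` again with `confirmed=True`, or drop the entry.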