By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,091 Members | 1,917 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,091 IT Pros & Developers. It's quick & easy.

find almost matching data

100+
P: 139
Hi there,

I have two tables with site information. There's a field in each for the site name, with sites from a - z. Unfortunately the site name isn't quite an exact match between tables:

Common_List.EquipmentDesc

ADMIRALTY SR, TORPOINT
ALLERS WPS, TIVERTON
BARLEY LANE SR, EXETER
...

SList.SiteDesc
ADMIRALTY SR
ALLERS WPS TIVERTON
BARLEY LANE SR (EXETER)
...

All the strings are of different lengths, have various numbers of words etc. Is there a way to get Access to compare say, the first word of one table with another, and output the matches?

Thanks for you help!
Feb 8 '12 #1
Share this Question
Share on Google+
7 Replies


TheSmileyCoder
Expert Mod 100+
P: 2,321
There are various ways this can be done. The simplest involves checking the first X charecters of each string to each other. This can be done by writing your query in the query design window, swithcing to SQL view and look at the join statement.

It will look something like this:
Expand|Select|Wrap|Line Numbers
  1. ON Common_List.EquipmentDesc=SList.SiteDesc
Modify it to read like this:
Expand|Select|Wrap|Line Numbers
  1. ON Left(Common_List.EquipmentDesc,4)=Left(SList.SiteDesc,4)
Replace 4 with however many charecters you want to use. You may need to replace , with ;, depending on your computers regional settings.

A more advanced way would be to split based on the first space.
Expand|Select|Wrap|Line Numbers
  1. ON Split(Common_List.EquipmentDesc," ")(0)=Split(SList.SiteDesc," ")(0)

With all that said and done to answer your immediate question, I must strongly emphazise that unless this is data coming in from sources over which you have no control (Such as external Database, or excel spreadsheets) you need to prevent this thing from ever happening in the first place.

You need to define the acceptable list of Descriptions and put them in a seperate table, and include a combobox in your forms, referencing that table. That way you will not get these kind of erros where even a simple type can mess up your results. For more information try reading: Database normalisation and table structures
Feb 8 '12 #2

NeoPa
Expert Mod 15k+
P: 31,494
As Smiley says, this is a situation to avoid. Solving all the problems are likely to take you many days of work, and that's assuming you have a brain capable of working out a way to handle it. No disrespect to your brain - I know nothing about that after all - but very very few can handle creating the logic to compare such data reliably. I know, as I managed a decent approximation once and it was extraordinarily complex.
Feb 8 '12 #3

100+
P: 759
A very long time ago I design a program to do exactly what you need now (for my hobby).
Since my program had worked on a Spectrum computer (HC) with a Z80 processor (is anyone who remember this toys ? ) I can't give you more than an idea:

As long as I remember my approach was:

Expand|Select|Wrap|Line Numbers
  1. For Each STR_1 in FirstString
  2.     'I define a variable to store the muched percent
  3.      'and I initialize it with ZERO
  4.     MatchedPercent = 0
  5.     'and a string variable to store the AlmostMatchedString
  6.     'and I initialize it with "" (zero length)
  7.     AlmostMatchedString = ""
  8.     'then:
  9.     For Each STR_2 in SecondStrings
  10.         'FullSTR_2=STR_2 'Store the value for further use
  11.         'I store the total length of strings
  12.         FirstTotalLength = LEN(Str_1) + LEN(Str_2)
  13.         'Then I eliminate all characters that not matched
  14.          between strings.
  15.         Call EliminateNotMatchedChars(Str_1, Str_2)
  16.         'I calculate the new total length:
  17.         NewTotalLength = LEN(Str_1) + LEN(Str_2)
  18.         'and the new matched percent
  19.         NewMatchedPercent = NewTotalLength / FirstTotalLength
  20.         'I compare the new matched percent with
  21.         'the preview one
  22.         IF NewMatchedPercent > MatchedPercent THEN
  23.             'Store the higher percent
  24.             MatchedPercent = NewMatchedPercent
  25.             'Store the string with higher matched percent
  26.             AlmostMatchedString = FullSTR_2
  27.          END IF
  28.     NEXT STR_2
  29.     'Here you have the STR_2 with the top matched percent
  30.     'with STR_1
  31.     Debug.Print STR_1, FullSTR_2
  32. NEXT STR_1
Of course you can't be 100% sure that the program find exactly you need.

I think that is not a problem for you to design the EliminateNotMatchedChars subroutine.

Good luck !
Feb 9 '12 #4

100+
P: 332
Assuming you do some data clean up to ease the job by stripping out any coma, period, parenthesis etc, you could adapt the following.
Here it runs an two arrays, but the process is similar for two recordsets.
You need Microsoft VBScript Regular Expression Reference enabled.

Expand|Select|Wrap|Line Numbers
  1. Function findPartialMatch()
  2.     Dim re As New RegExp
  3.     Dim pattern As String
  4.     Dim arr1, a1 As Variant
  5.     Dim arr2, a2 As Variant
  6.     re.Global = True
  7.     arr1 = Array("ADMIRALTY SR TORPOINT", "ALLERS WPS TIVERTON", "BARLEY LANE SR EXETER")
  8.     arr2 = Array("ADMIRALTY SR", "ALLERS WPS TIVERTON", "BARLEY LANE SR EXETER")
  9.     Debug.Print "---Search for elements of array arr1 in array arr2---"
  10.     For Each a1 In arr1
  11.         re.pattern = a1
  12.         For Each a2 In arr2
  13.             If (re.test(a2) = True) Then
  14.                 Debug.Print a1 & " matched in " & a2
  15.             End If
  16.         Next a2
  17.     Next a1
  18.     Debug.Print "---Search for elements of array arr2 in array arr1---"
  19.     For Each a2 In arr2
  20.        re.pattern = a2
  21.        For Each a1 In arr1
  22.            If (re.test(a1) = True) Then
  23.                Debug.Print a2 & " matched  " & a1
  24.            End If
  25.        Next a1
  26.     Next a2
  27.  
  28. End Function
  29.  
Output:
Expand|Select|Wrap|Line Numbers
  1. ---Search for elements of array arr1 in array arr2---
  2. ALLERS WPS TIVERTON matched in ALLERS WPS TIVERTON
  3. BARLEY LANE SR EXETER matched in BARLEY LANE SR EXETER
  4. ---Search for elements of array arr2 in array arr1---
  5. ADMIRALTY SR matched  ADMIRALTY SR TORPOINT
  6. ALLERS WPS TIVERTON matched  ALLERS WPS TIVERTON
  7. BARLEY LANE SR EXETER matched  BARLEY LANE SR EXETER
  8.  
Feb 9 '12 #5

100+
P: 759
@Mariostg
I think that your code can be translated to use regular function InStr(). So no need to add new library to refer to.
This was a comment not the purpose for this post

From what I see from your results the code find a perfect match between one string and a part (piece) from the other one.
What happen if a perfect match can't be find ?

Example:
Str_1 = ALLERS WPS TIVERTON
Str_2 = TIVERTON WPS ALLERS
Feb 10 '12 #6

100+
P: 332
I always forget about the InStr() function. Regular expressions is what pop up in my head first when I saw the post.
Indeed, with the two strings you proposed that won't work. I just based my function on the few liners given by the OP. To expend what I wrote, each of a1 and a2 would probably need to have each of their words sorted. i.e. a1 ->split -> sort array -> join array back into string then compare after doing the same with a2.

If only Access had some of Python abilities. Just dreaming. As Neopa said, there is lots of brain work involved there. I would strongly consider the closing part of Smiley's post.
Feb 10 '12 #7

P: 1
Check this out......

ChangeValueTableName: Put the name of table where record has to be compared and changed.

ChangeValueTableFieldName: Name of the field in the ChangeValueTableName where record has to be compared and changed.

CorrectTableName: Put the name of table to which record has to be compared.

CorrectTableFieldName: Name of the field in the CorrectTableFieldName to which record is to be compared.

SensiivityFactor: Value between 1 to 99. Increase the sensitivity as per your need.


Expand|Select|Wrap|Line Numbers
  1. Public Function PercentRelation(ByVal ChangeValueTableName As String, _
  2. ByVal ChangeValueTableFieldName As String, _
  3. ByVal CorrectTableName As String, _
  4. ByVal CorrectTableFieldName As String, _
  5. ByVal SensitivityFactor As Double)
  6.  
  7. Dim TString As String
  8. Dim OString As String
  9. Dim TCount As Double
  10. Dim OCount As Double
  11. Dim XT As Double
  12. Dim XO As Double
  13. Dim TEach As String
  14. Dim OEach As String
  15. Dim RPercent As Double
  16. Dim TPercent As Double
  17. Dim PMatch As Double
  18. Dim TMatch As Double
  19. Dim FMatch As Double
  20. Dim NewMatchFound As Double
  21. Dim Eval As String
  22. Dim XSense As Double
  23. Dim CRecords As Double
  24. Dim TRecords As Double
  25.  
  26. XSense = SensitivityFactor
  27. CRecords = 0
  28. Dim dbs As Database
  29. Dim rstUserPwd As Recordset
  30.  
  31. Set dbs = CurrentDb
  32. Set rstUserPwd = dbs.OpenRecordset(ChangeValueTableName)
  33.  
  34. TRecords = dbs.OpenRecordset(ChangeValueTableName).RecordCount
  35.  
  36. If rstUserPwd.RecordCount > 0 Then
  37.    rstUserPwd.MoveFirst
  38.  
  39. Do While rstUserPwd.EOF = False
  40.  
  41. NewMatchFound = 0
  42. FMatch = 0
  43.  
  44. '*******************************************************************************
  45. Dim dbsF As Database
  46. Dim rstUserPwdF As Recordset
  47.  
  48. Set dbsF = CurrentDb
  49. Set rstUserPwdF = dbsF.OpenRecordset(CorrectTableName)
  50.  
  51. If rstUserPwdF.RecordCount > 0 Then
  52.    rstUserPwdF.MoveFirst
  53.  
  54. Do While rstUserPwdF.EOF = False
  55. '*******************************************************************************
  56. Dim TPlace As String
  57. Dim OPlace As String
  58.  
  59. TPlace = rstUserPwd(ChangeValueTableFieldName)
  60. OPlace = rstUserPwdF(CorrectTableFieldName)
  61.  
  62. If Left$(TPlace, 1) <> Left$(OPlace, 1) Then
  63. GoTo jump
  64. End If
  65.  
  66. '*******************************************************************************
  67. TString = TPlace
  68. OString = OPlace
  69.  
  70. TCount = Len(TString)
  71. OCount = Len(OString)
  72.  
  73. RPercent = 100 / OCount
  74. TPercent = 100 / TCount
  75. PMatch = 0
  76. TMatch = 0
  77.  
  78. XT = 1
  79.  
  80. Do Until XT > TCount
  81.  
  82. XO = 1
  83.  
  84. TEach = Mid$(TString, XT, 1)
  85.  
  86. Do Until XO > OCount
  87.  
  88. OEach = Mid$(OString, XO, 1)
  89.  
  90. If OEach = TEach Then
  91. PMatch = PMatch + RPercent
  92. TMatch = TMatch + TPercent
  93.  
  94. If XO = 1 Then
  95. OString = Right(OString, Len(OString) - 1)
  96. Else
  97. OString = Left$(OString, XO - 1) & Right$(OString, Len(OString) - XO)
  98. End If
  99.  
  100. XO = 1
  101. OCount = Len(OString)
  102. Exit Do
  103. Else
  104. XO = XO + 1
  105. End If
  106.  
  107. Loop
  108.  
  109. XT = XT + 1
  110. Loop
  111.  
  112. FMatch = Round((PMatch + TMatch) * 0.5, 1)
  113.  
  114. '*******************************************************************************
  115. If NewMatchFound < FMatch Then
  116. NewMatchFound = FMatch
  117. Eval = OPlace
  118. End If
  119. jump:
  120. rstUserPwdF.MoveNext
  121. Loop
  122. End If
  123.  
  124.  
  125. '*******************************************************************************
  126.  
  127. If NewMatchFound > XSense Then
  128. rstUserPwd.Edit
  129. rstUserPwd(ChangeValueTableFieldName) = Eval
  130. rstUserPwd.Update
  131. CRecords = CRecords + 1
  132. End If
  133.  
  134. rstUserPwd.MoveNext
  135. Loop
  136. End If
  137. MsgBox CRecords & " records matched out of " & TRecords & " records."
  138.  
Jul 7 '14 #8

Post your reply

Sign in to post your reply or Sign up for a free account.