473,395 Members | 2,010 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

find almost matching data

139 100+
Hi there,

I have two tables with site information. There's a field in each for the site name, with sites from a - z. Unfortunately the site name isn't quite an exact match between tables:

Common_List.EquipmentDesc

ADMIRALTY SR, TORPOINT
ALLERS WPS, TIVERTON
BARLEY LANE SR, EXETER
...

SList.SiteDesc
ADMIRALTY SR
ALLERS WPS TIVERTON
BARLEY LANE SR (EXETER)
...

All the strings are of different lengths, have various numbers of words etc. Is there a way to get Access to compare say, the first word of one table with another, and output the matches?

Thanks for you help!
Feb 8 '12 #1
7 6061
TheSmileyCoder
2,322 Expert Mod 2GB
There are various ways this can be done. The simplest involves checking the first X charecters of each string to each other. This can be done by writing your query in the query design window, swithcing to SQL view and look at the join statement.

It will look something like this:
Expand|Select|Wrap|Line Numbers
  1. ON Common_List.EquipmentDesc=SList.SiteDesc
Modify it to read like this:
Expand|Select|Wrap|Line Numbers
  1. ON Left(Common_List.EquipmentDesc,4)=Left(SList.SiteDesc,4)
Replace 4 with however many charecters you want to use. You may need to replace , with ;, depending on your computers regional settings.

A more advanced way would be to split based on the first space.
Expand|Select|Wrap|Line Numbers
  1. ON Split(Common_List.EquipmentDesc," ")(0)=Split(SList.SiteDesc," ")(0)

With all that said and done to answer your immediate question, I must strongly emphazise that unless this is data coming in from sources over which you have no control (Such as external Database, or excel spreadsheets) you need to prevent this thing from ever happening in the first place.

You need to define the acceptable list of Descriptions and put them in a seperate table, and include a combobox in your forms, referencing that table. That way you will not get these kind of erros where even a simple type can mess up your results. For more information try reading: Database normalisation and table structures
Feb 8 '12 #2
NeoPa
32,556 Expert Mod 16PB
As Smiley says, this is a situation to avoid. Solving all the problems are likely to take you many days of work, and that's assuming you have a brain capable of working out a way to handle it. No disrespect to your brain - I know nothing about that after all - but very very few can handle creating the logic to compare such data reliably. I know, as I managed a decent approximation once and it was extraordinarily complex.
Feb 8 '12 #3
Mihail
759 512MB
A very long time ago I design a program to do exactly what you need now (for my hobby).
Since my program had worked on a Spectrum computer (HC) with a Z80 processor (is anyone who remember this toys ? ) I can't give you more than an idea:

As long as I remember my approach was:

Expand|Select|Wrap|Line Numbers
  1. For Each STR_1 in FirstString
  2.     'I define a variable to store the muched percent
  3.      'and I initialize it with ZERO
  4.     MatchedPercent = 0
  5.     'and a string variable to store the AlmostMatchedString
  6.     'and I initialize it with "" (zero length)
  7.     AlmostMatchedString = ""
  8.     'then:
  9.     For Each STR_2 in SecondStrings
  10.         'FullSTR_2=STR_2 'Store the value for further use
  11.         'I store the total length of strings
  12.         FirstTotalLength = LEN(Str_1) + LEN(Str_2)
  13.         'Then I eliminate all characters that not matched
  14.          between strings.
  15.         Call EliminateNotMatchedChars(Str_1, Str_2)
  16.         'I calculate the new total length:
  17.         NewTotalLength = LEN(Str_1) + LEN(Str_2)
  18.         'and the new matched percent
  19.         NewMatchedPercent = NewTotalLength / FirstTotalLength
  20.         'I compare the new matched percent with
  21.         'the preview one
  22.         IF NewMatchedPercent > MatchedPercent THEN
  23.             'Store the higher percent
  24.             MatchedPercent = NewMatchedPercent
  25.             'Store the string with higher matched percent
  26.             AlmostMatchedString = FullSTR_2
  27.          END IF
  28.     NEXT STR_2
  29.     'Here you have the STR_2 with the top matched percent
  30.     'with STR_1
  31.     Debug.Print STR_1, FullSTR_2
  32. NEXT STR_1
Of course you can't be 100% sure that the program find exactly you need.

I think that is not a problem for you to design the EliminateNotMatchedChars subroutine.

Good luck !
Feb 9 '12 #4
Mariostg
332 100+
Assuming you do some data clean up to ease the job by stripping out any coma, period, parenthesis etc, you could adapt the following.
Here it runs an two arrays, but the process is similar for two recordsets.
You need Microsoft VBScript Regular Expression Reference enabled.

Expand|Select|Wrap|Line Numbers
  1. Function findPartialMatch()
  2.     Dim re As New RegExp
  3.     Dim pattern As String
  4.     Dim arr1, a1 As Variant
  5.     Dim arr2, a2 As Variant
  6.     re.Global = True
  7.     arr1 = Array("ADMIRALTY SR TORPOINT", "ALLERS WPS TIVERTON", "BARLEY LANE SR EXETER")
  8.     arr2 = Array("ADMIRALTY SR", "ALLERS WPS TIVERTON", "BARLEY LANE SR EXETER")
  9.     Debug.Print "---Search for elements of array arr1 in array arr2---"
  10.     For Each a1 In arr1
  11.         re.pattern = a1
  12.         For Each a2 In arr2
  13.             If (re.test(a2) = True) Then
  14.                 Debug.Print a1 & " matched in " & a2
  15.             End If
  16.         Next a2
  17.     Next a1
  18.     Debug.Print "---Search for elements of array arr2 in array arr1---"
  19.     For Each a2 In arr2
  20.        re.pattern = a2
  21.        For Each a1 In arr1
  22.            If (re.test(a1) = True) Then
  23.                Debug.Print a2 & " matched  " & a1
  24.            End If
  25.        Next a1
  26.     Next a2
  27.  
  28. End Function
  29.  
Output:
Expand|Select|Wrap|Line Numbers
  1. ---Search for elements of array arr1 in array arr2---
  2. ALLERS WPS TIVERTON matched in ALLERS WPS TIVERTON
  3. BARLEY LANE SR EXETER matched in BARLEY LANE SR EXETER
  4. ---Search for elements of array arr2 in array arr1---
  5. ADMIRALTY SR matched  ADMIRALTY SR TORPOINT
  6. ALLERS WPS TIVERTON matched  ALLERS WPS TIVERTON
  7. BARLEY LANE SR EXETER matched  BARLEY LANE SR EXETER
  8.  
Feb 9 '12 #5
Mihail
759 512MB
@Mariostg
I think that your code can be translated to use regular function InStr(). So no need to add new library to refer to.
This was a comment not the purpose for this post

From what I see from your results the code find a perfect match between one string and a part (piece) from the other one.
What happen if a perfect match can't be find ?

Example:
Str_1 = ALLERS WPS TIVERTON
Str_2 = TIVERTON WPS ALLERS
Feb 10 '12 #6
Mariostg
332 100+
I always forget about the InStr() function. Regular expressions is what pop up in my head first when I saw the post.
Indeed, with the two strings you proposed that won't work. I just based my function on the few liners given by the OP. To expend what I wrote, each of a1 and a2 would probably need to have each of their words sorted. i.e. a1 ->split -> sort array -> join array back into string then compare after doing the same with a2.

If only Access had some of Python abilities. Just dreaming. As Neopa said, there is lots of brain work involved there. I would strongly consider the closing part of Smiley's post.
Feb 10 '12 #7
Check this out......

ChangeValueTableName: Put the name of table where record has to be compared and changed.

ChangeValueTableFieldName: Name of the field in the ChangeValueTableName where record has to be compared and changed.

CorrectTableName: Put the name of table to which record has to be compared.

CorrectTableFieldName: Name of the field in the CorrectTableFieldName to which record is to be compared.

SensiivityFactor: Value between 1 to 99. Increase the sensitivity as per your need.


Expand|Select|Wrap|Line Numbers
  1. Public Function PercentRelation(ByVal ChangeValueTableName As String, _
  2. ByVal ChangeValueTableFieldName As String, _
  3. ByVal CorrectTableName As String, _
  4. ByVal CorrectTableFieldName As String, _
  5. ByVal SensitivityFactor As Double)
  6.  
  7. Dim TString As String
  8. Dim OString As String
  9. Dim TCount As Double
  10. Dim OCount As Double
  11. Dim XT As Double
  12. Dim XO As Double
  13. Dim TEach As String
  14. Dim OEach As String
  15. Dim RPercent As Double
  16. Dim TPercent As Double
  17. Dim PMatch As Double
  18. Dim TMatch As Double
  19. Dim FMatch As Double
  20. Dim NewMatchFound As Double
  21. Dim Eval As String
  22. Dim XSense As Double
  23. Dim CRecords As Double
  24. Dim TRecords As Double
  25.  
  26. XSense = SensitivityFactor
  27. CRecords = 0
  28. Dim dbs As Database
  29. Dim rstUserPwd As Recordset
  30.  
  31. Set dbs = CurrentDb
  32. Set rstUserPwd = dbs.OpenRecordset(ChangeValueTableName)
  33.  
  34. TRecords = dbs.OpenRecordset(ChangeValueTableName).RecordCount
  35.  
  36. If rstUserPwd.RecordCount > 0 Then
  37.    rstUserPwd.MoveFirst
  38.  
  39. Do While rstUserPwd.EOF = False
  40.  
  41. NewMatchFound = 0
  42. FMatch = 0
  43.  
  44. '*******************************************************************************
  45. Dim dbsF As Database
  46. Dim rstUserPwdF As Recordset
  47.  
  48. Set dbsF = CurrentDb
  49. Set rstUserPwdF = dbsF.OpenRecordset(CorrectTableName)
  50.  
  51. If rstUserPwdF.RecordCount > 0 Then
  52.    rstUserPwdF.MoveFirst
  53.  
  54. Do While rstUserPwdF.EOF = False
  55. '*******************************************************************************
  56. Dim TPlace As String
  57. Dim OPlace As String
  58.  
  59. TPlace = rstUserPwd(ChangeValueTableFieldName)
  60. OPlace = rstUserPwdF(CorrectTableFieldName)
  61.  
  62. If Left$(TPlace, 1) <> Left$(OPlace, 1) Then
  63. GoTo jump
  64. End If
  65.  
  66. '*******************************************************************************
  67. TString = TPlace
  68. OString = OPlace
  69.  
  70. TCount = Len(TString)
  71. OCount = Len(OString)
  72.  
  73. RPercent = 100 / OCount
  74. TPercent = 100 / TCount
  75. PMatch = 0
  76. TMatch = 0
  77.  
  78. XT = 1
  79.  
  80. Do Until XT > TCount
  81.  
  82. XO = 1
  83.  
  84. TEach = Mid$(TString, XT, 1)
  85.  
  86. Do Until XO > OCount
  87.  
  88. OEach = Mid$(OString, XO, 1)
  89.  
  90. If OEach = TEach Then
  91. PMatch = PMatch + RPercent
  92. TMatch = TMatch + TPercent
  93.  
  94. If XO = 1 Then
  95. OString = Right(OString, Len(OString) - 1)
  96. Else
  97. OString = Left$(OString, XO - 1) & Right$(OString, Len(OString) - XO)
  98. End If
  99.  
  100. XO = 1
  101. OCount = Len(OString)
  102. Exit Do
  103. Else
  104. XO = XO + 1
  105. End If
  106.  
  107. Loop
  108.  
  109. XT = XT + 1
  110. Loop
  111.  
  112. FMatch = Round((PMatch + TMatch) * 0.5, 1)
  113.  
  114. '*******************************************************************************
  115. If NewMatchFound < FMatch Then
  116. NewMatchFound = FMatch
  117. Eval = OPlace
  118. End If
  119. jump:
  120. rstUserPwdF.MoveNext
  121. Loop
  122. End If
  123.  
  124.  
  125. '*******************************************************************************
  126.  
  127. If NewMatchFound > XSense Then
  128. rstUserPwd.Edit
  129. rstUserPwd(ChangeValueTableFieldName) = Eval
  130. rstUserPwd.Update
  131. CRecords = CRecords + 1
  132. End If
  133.  
  134. rstUserPwd.MoveNext
  135. Loop
  136. End If
  137. MsgBox CRecords & " records matched out of " & TRecords & " records."
  138.  
Jul 7 '14 #8

Sign in to post your reply or Sign up for a free account.

Similar topics

0
by: Psybar Phreak | last post by:
hi all i have an array of Process objects, each elements has an id (process.id), and arrival time (process.at), processing time (process.pt) and a boolean indicating whether the process has...
0
by: Petterson Mikael | last post by:
Hi, I have managed to get the name of the enum e.g. TxDeviceGroup_BbBusState in the first part of the xml in the class element. Now I need to find the minimum value of the enum called...
2
by: Joe | last post by:
Hello all! I need to display a list of names similar to a spell checker where the list is the closest match to the name entered. Not too sure where to begin with this - any suggestions? Thanks,...
9
by: Chi Man Wong | last post by:
I have a problem using VB which I'll try to explain: In sheet A I have a table with : column 1: dates (e.g. 01-01-2005) over a range of 3 years column 2 till 8: numbers In sheet B I have a...
3
by: Daine | last post by:
I would appreciate some help in a problem I've been having. We use two packages one for accounting and one to keep track of sales made. I have two tables imported into access from each of the...
1
by: cypriot | last post by:
Hi. I want to insert data to ms access but I think I have a prob with data types. This is my code. savebutton.addActionListener( new ActionListener() { public void...
3
by: boots | last post by:
I have a file with two worksheets. Column C of Sheet1 has FILE IDs, as does Column D of Sheet2. I would like to find any matches and then copy all of the data from the matching row in Sheet1 to the...
4
by: Thomas Mlynarczyk | last post by:
Hello, I have two arrays like this: $aSearch = array( 'A', 'B', 'C.D', 'E', 'F' ); $aSubject = array( 'A' =0, 'A.B' =1, 'X' =2, 'C.D.E' =3, 'A.B.C' => 4 ); Now I want to search $aSubject...
2
by: prashantdixit | last post by:
Hi all, I am new to Excel/VBA and would require your help. I have stuck again somewhere and will be highly obliged if you can help me. I have two worksheet 1. Import File Utility (Sheet A)...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.