By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
455,490 Members | 1,789 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 455,490 IT Pros & Developers. It's quick & easy.

Compare address

P: 39
hello all,

I'm trying to figure out a good way to compare address across two tables. Example: Table 1 has the address of 12345 E Main St and table 2 has 12345 Main.

I need to just look at the numbers and actual street. If they match than I'm good. I've done the below
Expand|Select|Wrap|Line Numbers
  1. Expr2: Left([table1]![address],InStr([table1]![address]," ")-1)
  2.  
to split out the numbers, and I can keep going deeper with something like this, but wanted to know if anyone else has a better query or use of functions or anything that will allow me to use a query to look at this rather than manually looking at the records.

Any ideas, or need more info?

Thanks in advance.
Will
May 4 '09 #1
Share this Question
Share on Google+
11 Replies


JustJim
Expert 100+
P: 407
@timber910
Hi Will,

I'd like to know first of all, why you have what appears to be essentially the same information in two tables? You may have some data normalisation problems here. Have a quick glance at Mary's famous article on normalisation. It's fairly hefty if you're unused to it, but take the time and it will help a lot.

Let us know about the reason for addresses in two tables and we can go from there.

Jim
May 4 '09 #2

P: 39
LOL! No, no.. Getting customers from another company. So I'm getting their data. One database has their billing information (lets say this is table1) and the second database is the customer info. Both of these will stay it two different systems after they are converted. Trust me, I would love to get rid of one of these systems.

I just need to be able to match address between the two in order to know if I'm missing anything or need to send address back to the seller becuase they are too different. The directions or street, drive or what not isn't important. Just need to match the number and the actual street. I'll get the information in two files and I'll add them as two tables so i can then do my matching, etc. between name, address, city, state, zip, etc. but the last three are the easy ones.

Heres my current query so to speak

Expand|Select|Wrap|Line Numbers
  1. SELECT Account.NOTES AS ACOUNT_NOTES, 
  2. [Billing Data].Address1, 
  3. Account.site_addr1, 
  4. Left([Billing Data]![Address1],InStr([Billing Data]![Address1]," ")-1) AS Expr1, 
  5. Left([Account]![site_addr1],InStr([Account]![site_addr1]," ")-1) AS Expr2,
  6.  [Billing Data].City, Account.city_name, 
  7. [Billing Data].State, 
  8. Account.state_id, 
  9. [Billing Data].Zip, 
  10. Account.zip_code
  11. FROM Account INNER JOIN [Billing Data] ON Account.cs_no = [Billing Data].Monitoring_number
  12. WHERE ((([Billing Data].Address1)<>[Account]![site_addr1]));
  13.  
May 4 '09 #3

ADezii
Expert 5K+
P: 8,679
@timber910
You can extract the Numeric Component of the [Address] Field in Table1 for every Record, then see if it is contained within any of the [Address] Fields in Table2. You would then output the Critical Field values to a File, Table, etc. This process would greatly reduce the amount of Records to manually search for partial Matches. I've completed part of the puzzle for you, but if you need further help, just let me or one of the other Experts/Moderators/Admin know.
Expand|Select|Wrap|Line Numbers
  1. Public Function fCompareNumbersFromAddress()
  2. Dim MyDB As DAO.Database
  3. Dim rstTblOne As DAO.Recordset
  4. Dim varSplit As Variant
  5. Dim intIndex As Integer
  6.  
  7. Set MyDB = CurrentDb
  8. Set rstTblOne = MyDB.OpenRecordset("Table1", dbOpenForwardOnly)
  9.  
  10. With rstTblOne
  11.   Do While Not .EOF
  12.     If Len(![Address]) > 0 Then
  13.       varSplit = Split(![Address], " ")
  14.  
  15.       For intIndex = LBound(varSplit) To UBound(varSplit)
  16.         If IsNumeric(varSplit(intIndex)) Then
  17.           'See if the Numeric Value appears in the [Address[ Field in any
  18.           'of the Records in Table 2, noting Primary Key and Field Values
  19.           'comparison here ==>  varSplit(intIndex)
  20.         End If
  21.       Next
  22.     End If
  23.     .MoveNext
  24.   Loop
  25. End With
  26.  
  27. rstTblOne.Close
  28. Set rstTblOne = Nothing
  29. End Function
May 4 '09 #4

JustJim
Expert 100+
P: 407
Nice one Adezii. That'll do nicely!

Jim
May 4 '09 #5

P: 39
WOW! Way better then what I was hoping for. Now to work it into what I need. OK. I'm following the varSplit to put the field into an array. To make sure I'm understanding the FOR section.

Expand|Select|Wrap|Line Numbers
  1. For intIndex = LBound(varSplit) To UBound(varSplit)
  2.  
intIndex is being set to the lower llimit of the varSplit array to the upper limit of the varSplit array. So it is grabing the first part of the array which should be numbers if the field was "1234 main st" Correct?

Then i use the IsNumeric statement to look for a match in my other table. I think I'm going to need help on that part to. I'm still learning everything as I go.

Thanks again for all the help.
May 5 '09 #6

100+
P: 675
I am one for planning before coding, so my comments are often ignored, but I will comment anyway.

I don't know the size of these tables, the ratio of exact matches to non-matches, or the frequency that this compare is run. Also, is there more than one other company, where there is a different set of records to compare?

What can change from compare-run to compare-run. Some businesses keep records by address, and the customer name changes. Other businesses keep records by name, and the address might change.


Before I would write any code, I would consider the steps necessary, such as -
1) Write a 'Sync' program that would identify any changes since the last run
2) Attempt to re-establish a match after identified changes since prior sync
3) Multi-step match algorithms for non-sync, new, and deleted (or missing) records, asking for operator to confirm if necessary.

My Sync routine would have a new table, perhaps tblCustMatch, which stored the keys of both tables where match had been established, and probably the address fields of each table so I could detect changes. I might need the names also, if the name can change but not the address, for either table.


Some thoughts:
Some addresses will never match exactly. Just from personal local knowledge here in USA, there is a Boulevard Way, an East Street, an Avenue of the Giants, and 2 parallel streets, Park Avenue & Park Street, where numeric addresses and zip codes are the same. New York City has an Avenue A as a street, Los Angeles has an Avenue 32.

The numeric portion may have 2 or 3 fields when split, as in 123 1/2 Marlin Drive West. Here the street name is in the middle. This address might refer to a home behind another home, or above a garage. In rural California, and elsewhere, roads were simply named Road 1, Road 2, etc. each being 1 mile apart. Now, occasionally new roads are named with 1/2. Therefore 12 1/2 Road 12 1/2 could exist as an address.

Compass references can cause duplication of numbers. In Pittsburgh, PA, Marlin Drive East has the same numeric addresses, same zip code, and is parallel to and one block east of Marlin Drive West.
South Park Drive may be south of Park Drive or of North Park Drive, or it may run through South Park, or be in the town of South Park, therefore not really being a compass reference which might otherwise be ignored. East 1st Street is the same street as West 1st Street, the designation changing as it crosses Main St or 1st Avenue. Numbers probably duplicate also. Many cities have both numeric Streets and numeric Avenues. These may intersect, and the address numbers may duplicate.
Salt Lake City uses a numeric grid system so 400 E 400 S would be an address. But the streets are also known as 4th, 5th, etc., so the same address is 400 E 4th S. To further compond the issue, this street is also named University Blvd. so 400 E University Blvd is the same location.

There are addresses with hiway numbers, 50 US 50 #50, Eureka, Nevada, or street names with multiple words, El Camino Real, with or without the 'suffix' (Street, Avenue, etc.) The suffix may be spelled out in full, or abreviated. Therefore, 'Avenue' = 'Ave' = 'Ave.' = 'Av' = 'Av.' Suffix may be in language other than American English, so a table of these would be extensive, and will be ambiguous.

In USA, Zip+4 will greatly limit the number of possibilities. Are these in both tables?

Are apartment/suite/unit number included in either table?

What happens to non-matching records? Can they exist? Are they handled differently if missing from customer table than if missing from own table?

How are non-sync records to be displayed?

Very clever algorithms can be written, such as have already been proposed. You are starting with comparing 2 arrays generated from the Split function, and deciding a match if some number of fields match. If 3 fields match, is it a match? Is "44 East Rd. #33" a match with "44 East 33rd"?

All thoughts so far assume no errors in data. But of course we can assume the data to be error-free. Right!?
May 5 '09 #7

ADezii
Expert 5K+
P: 8,679
@timber910
OldBirdman makes some excellent points as he always does. All my Algorithm does is point to prospective Address Matches in the [Address] Fields of Table1 and Table2, on the Numeric Component only, and assumes that no interrelationships exist between them. The Output will look something like:
Expand|Select|Wrap|Line Numbers
  1. Table1: [ID]=234 {12345 East Main Street} ==> Table2: [ID]=972 {12345 E. Main St.}
Now, it will be up to you to manually determine if these Records in the Tables match. Their Unique IDs will specifically point to them.

P.S. - I'll finish the coding and Post it as soon as I get a chance. You decide if it is usable or not.
May 5 '09 #8

ADezii
Expert 5K+
P: 8,679
The following code will write potential Matches based solely on the Numeric Component of [Address] Fields in Tables Table1 and Table2 to a Text File named Matches.txt in the same Directory as the Database, but first a couple of very simple Assumptions:
  1. The 2 Table Names are Table1 and Table2 respectfully.
  2. Both Tables contain an [Address] {TEXT} Field as well as an [ID] {AUTONUMBER/LONG} Field.
  3. The Data Type of the [ID] Field is consistent across both Tables.
Expand|Select|Wrap|Line Numbers
  1. Public Function fCompareNumbersFromAddress()
  2. Dim MyDB As DAO.Database
  3. Dim rstTblOne As DAO.Recordset
  4. Dim rstTblTwo As DAO.Recordset
  5. Dim varSplit As Variant
  6. Dim intIndex As Integer
  7.  
  8. Set MyDB = CurrentDb
  9. Set rstTblOne = MyDB.OpenRecordset("Table1", dbOpenForwardOnly)
  10. Set rstTblTwo = MyDB.OpenRecordset("Table2", dbOpenSnapshot)
  11.  
  12. Open CurrentProject.Path & "\Matches.txt" For Output As #1
  13.  
  14. With rstTblOne
  15.   Do While Not .EOF
  16.     If Len(![Address]) > 0 Then
  17.       varSplit = Split(![Address], " ")
  18.  
  19.       For intIndex = LBound(varSplit) To UBound(varSplit)
  20.         If IsNumeric(varSplit(intIndex)) Then
  21.           Do While Not rstTblTwo.EOF
  22.             If InStr(rstTblTwo![Address], varSplit(intIndex)) > 0 Then
  23.               Print #1, "Table1: [ID]=" & Format$(![Id], "00000") & " {" & ![Address] & "} ==> Table2: [ID]=" & _
  24.                          Format$(rstTblTwo![Id], "00000") & " {" & rstTblTwo![Address] & "}"
  25.  
  26.             End If
  27.               rstTblTwo.MoveNext
  28.           Loop
  29.             rstTblTwo.MoveFirst
  30.         End If
  31.       Next
  32.     End If
  33.     .MoveNext
  34.   Loop
  35. End With
  36.  
  37. Close #1
  38.  
  39. rstTblOne.Close
  40. rstTblTwo.Close
  41. Set rstTblOne = Nothing
  42. Set rstTblTwo = Nothing
  43. End Function
SAMPLE OUTPUT based on SAMPLE DATA:
Expand|Select|Wrap|Line Numbers
  1. Table1: [ID]=00001 {13846 East Main Street} ==> Table2: [ID]=00003 {13846 E. Main St.}
  2. Table1: [ID]=00002 {West 2222 East Main} ==> Table2: [ID]=00011 {West 2222 East Main}
  3. Table1: [ID]=00005 {7789 East 43rd Avenue} ==> Table2: [ID]=00010 {7789 E. 43rd Ave.}
  4. Table1: [ID]=00009 {555 Nordick Drive} ==> Table2: [ID]=00006 {555 East Nordick Dr.}
  5. Table1: [ID]=00011 {11 West 9th Street} ==> Table2: [ID]=00001 {1118 East Main Street}
  6. Table1: [ID]=00020 {2917 S. 16th Street} ==> Table2: [ID]=00012 {2917 South Sixteenth St.}
P.S. - This is not a 'Cure All' but only a mechanism to list 'potential' matches based solely on a Numeric Address Component. You will still have to do the leg work, hopefully it will make it a little easier.
May 5 '09 #9

P: 39
Excellent points about the differences in address. Thanks for the code. It will make it that much easier to at least get a jumping off point in searching these. I get a different database each time I do this. So i get to start the process again and I'm just trying to find a more efficent way to compaire and see what my work will call a match. (address number and street) And what doesn't I get to send back for the other company to figure out which is right. Ideally what I get back from them should be exactly the same from both databases. But you know how that goes! LOL!

While my table names aren't table 1 and two. I should be able to adapt the code to match the tables I get. (those also differ each time I get data)..
May 5 '09 #10

ADezii
Expert 5K+
P: 8,679
@timber910
You can modify the Function to accept 4 Arguments, the Names of the 2 Tables, and the Names of the 2 Primary Key Fields to make the code portable.
May 5 '09 #11

ADezii
Expert 5K+
P: 8,679
Just as a side note timber10, you can populate a Table with the results. This Table can consist of 4 Fields such as: [PK_1], [Name_1], [PK_2], [Name_2]. Now within each Record, you can test and see if there is another (at least 1) common element between the Addresses, and if so consider them a Match.
Expand|Select|Wrap|Line Numbers
  1. 1569 East 34th Street <==> 1569 E. 34th St
.
would be considered a Match since they have 2 Common Elements, namely 1569 and 34th. This Logic, however, would be far from foolproof. If possible, I would like to see some of the Data from the 2 Tables to test this Logic.
May 5 '09 #12

Post your reply

Sign in to post your reply or Sign up for a free account.