473,407 Members | 2,320 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,407 software developers and data experts.

Comparing data

1
Need some tips to help me approach this problem. I would consider myself advanced in Access and VBA, so just looking for conceptual pointers.

In essence, I have a number of text files, each of which contains data that is parsed into various database tables. The data thus stored in the tables contains 4 fields of information. I have this part of the system working 100%.

The 4 fields are: ParseID, lngTime (times are derived in thousands of a second, so I store all times in a long format), strDataSource and strMsg

Some other relevant info:
-Each text file will contain approx 70,000 number of records and takes about 10 seconds to parse. So from 2 datasources, I will get about 140,000 records
-Each record in the text file contains a time stamp (lngTime); however, the clocks between each recordsource are not syncronised. These time differences vary from recordsource to recordsource, but are always less than 30 seconds.

What I need to do:
I am looking for records where the strMsg field from one datasource is exactly the same as from the other datasource, but the difference between the lngTime field is less than 30 seconds. i.e. I want to identify "identical" records from different datasources (taking into account the time lag).

I was originally hoping that I could do a SELECT DISTINCT on the records, but the one major flaw is that the time offset from the different datasources varies every so slightly. For example, it may be 22.789 seconds in one case, and then 23.015 in the next. [Question: is there an opportunity to use rounding to make things faster?]

What I have got to so far is to create an array and loop through it, but as expected this take a bit of time. (note that the following code doesnt delete any records yet, but it does identify where the matching records are in the Array)
Expand|Select|Wrap|Line Numbers
  1. Sub CombineRecordsets()
  2.  
  3. Dim i As Long
  4. Dim j As Long
  5.  
  6. Dim cnn As ADODB.Connection
  7. Dim rs1 As ADODB.Recordset
  8. Dim strSql1 As String
  9. Dim lngRecordCount1 As Long
  10. Dim lngParseID As Long
  11. Dim strMsg As String
  12. Dim lngTime1 As Long
  13. Dim strDataSource1 As String
  14. Dim strDataSource2 As String
  15. Dim lngCurrRecord As Long
  16.  
  17. Dim varArray
  18.  
  19. 'Recordset to retrieve the combat log
  20.     Set cnn = CurrentProject.Connection
  21.     Set rs1 = New ADODB.Recordset
  22.  
  23.     strSql1 = "Select ParseID, lngTime, strDataSource, strMsg, lngParseID from tblParsedData ORDER BY lngTime ASC"
  24.  
  25.     rs1.Open strSql1, cnn, adOpenKeyset, adLockOptimistic
  26.  
  27.  
  28.     varArray = rs1.GetRows
  29.     lngRecordCount1 = rs1.RecordCount
  30.  
  31.     lngRecordCount1 = rs1.RecordCount - 1
  32.  
  33.     strDataSource1 = "Mike"
  34.     strDataSource2 = "Steve"
  35.  
  36.     For i = 0 To lngRecordCount1
  37.             If i = 10000 Then
  38.             Stop
  39.             End If
  40.  
  41.         If Not IsNull(varArray(4, i)) Then 'Looks to see if we have matched it yet
  42.             'Do nothing as its aleady been matched, so move onto next i
  43.         Else
  44.             If varArray(2, i) = strDataSource1 Then
  45.             strMsg = varArray(3, i)
  46.             lngParseID = varArray(0, i)
  47.             lngTime1 = varArray(1, i)
  48.                 For j = lngCurrRecord To lngRecordCount1
  49.                     If (varArray(1, j) - lngTime1) > 30000 Then '>30 seconds, stop looking for a match
  50.                         Exit For
  51.                     Else
  52.                         If IsNull(varArray(4, j)) Then 'Looks to see if we have matched it yet
  53.                             If varArray(2, j) = strDataSource2 Then
  54.                                 If varArray(3, j) = strMsg Then
  55.                                     varArray(4, j) = lngParseID
  56.                                     varArray(4, i) = varArray(0, j)
  57.                                     lngCurrRecord = j
  58.                                     Exit For
  59.                                 End If
  60.                             End If
  61.                         End If
  62.                     End If
  63.                 Next
  64.             End If
  65.  
  66.         End If
  67.  
  68.  
  69.     Next i
  70.  
  71.  
  72. End Sub
  73.  
Sooooo, I would appreciate it if you could let me know if there are any faster methods or if I should be approaching this from a completely different angle.

PS - Sorry for the formatting, but this forum doesnt seem to recognise [tab] spaces.
Apr 24 '07 #1
2 1765
MMcCarthy
14,534 Expert Mod 8TB
Try this ...
Expand|Select|Wrap|Line Numbers
  1. SELECT Table1.ParseID, Table2.ParseID
  2. FROM Table1, Table2
  3. WHERE Table1.strMsg = Table2.strMsg
  4. AND ((Table1.lngTime - Table2.lngTime) < 30000
  5.    OR (Table2.lngTime - Table1.lngTime) < 30000)
  6.  
Mary
Apr 25 '07 #2
MMcCarthy
14,534 Expert Mod 8TB
BTW, I added code tags to your original post. That's how you get the code to display correctly.
Apr 25 '07 #3

Sign in to post your reply or Sign up for a free account.

Similar topics

11
by: Dan Stromberg | last post by:
We will soon have 3 copies, for testing purposes, of what should be about 4.5 terrabytes of data. Rather than cmp'ing twice, to verify data integrity, I was thinking we could speed up the...
11
by: Peter | last post by:
Hi how can I compare two byte arrays in VB.NET Thank Peter
41
by: Odd-R. | last post by:
I have to lists, A and B, that may, or may not be equal. If they are not identical, I want the output to be three new lists, X,Y and Z where X has all the elements that are in A, but not in B, and...
2
by: Manny Chohan | last post by:
Hi, i have two datetime values in format 11/22/04 9:00 AM and 11/22/04 9:30 AM. How can i compare dates .net c# or if there is any other way such as Javascript. Thanks Manny
19
by: Dennis | last post by:
I have a public variable in a class of type color declared as follows: public mycolor as color = color.Empty I want to check to see if the user has specified a color like; if mycolor =...
0
by: richardkreidl | last post by:
I have the following hash script that I use to compare two text files. 'Class Public Class FileComparison Public Class FileComparisonException Public Enum ExceptionType U 'Unknown A 'Add...
1
by: Will Chamberlain | last post by:
I just created an application that displays data from 2 sources in a datagrid. There are 2 repeaters nested in the datagrid (probably not the best idea). The whole purpose of this application is to...
4
by: Frank | last post by:
Hello, Developing an app where the user fills out a sometimes quite lengthy form of chkboxes, txtboxes, radbtns, etc. User responses are saved to a mySql db, which the user can later edit. When...
5
by: ma740988 | last post by:
There's a need for me to move around at specified offsets within memory. As as a result - long story short - unsigned char* is the type of choice. At issue: Consider the case ( test code ) where...
1
by: Patrick C | last post by:
hey everyone, i'm going to be comparing data in a list. Just basic >= or <= type stuff. However, the list i have somtimes starts with 'n/a'. That is somtimes it starts like this: data = or...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.