By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,444 Members | 3,182 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,444 IT Pros & Developers. It's quick & easy.

Comparing successive records in large access file - VB help

P: 3
I am using Access 2003 on Windows XP. I am looking to analyse web traffic files using this - I appreciate there are bespoke applications which will do the same task. Each month there are roughly 500,000 records. What I want to do is compare successive records, to calculate new fields at a visitor and visit level. Such as page number this visit, page number all visits by this visitor, time spent on current page, etc.

My table (shown below) only has 3 columns, although I've manually added the other 3 for illustration purposes in case I wasn't making myself clear.

Expand|Select|Wrap|Line Numbers
  1. ID        HitDate      HitTime     VisitorPageNo VisitNo VisitPageNo
  2. 10004-055052   03/02/2007   17:50:52   1             1       1
  3. 10011-172249   03/02/2007   17:22:48   1             1       1
  4. 10011-172249   03/02/2007   17:23:02   2             1       2
  5. 10011-172249   03/02/2007   19:23:28   3             2       1
  6. 10011-172249   03/02/2007   19:24:21   4             2       2
  7. 10019-093023   03/02/2007   09:30:23   1             1       1
  8. 10023-215744   03/02/2007   21:57:44   1             1       1
  9. 10023-215744   03/02/2007   21:57:49   2             1       2
  10. 10023-215744   03/02/2007   21:58:30   3             1       3
  11. 10033-202533   03/02/2007   19:25:32   1             1       1
  12. 10033-202533   03/02/2007   20:25:45   2             2       1
  13. 10033-202533   03/02/2007   20:26:00   3             2       2
  14. 10033-202533   03/02/2007   21:26:09   4             3       1
  15. 10033-202533   03/02/2007   22:31:40   5             4       1
I have written a series of queries (8 at the last count) which calculate most of the fields I require. I'm sure I could include a couple more to get the job completed. It runs reasonably quickly, but it just seems a long winded and inelegant way of solving the problem.

I'm sure the same result could be achieved in a simpler and more efficient way using VB. I have experience of Basic and other programming languages, but I'm a novice at VB. I've searched on recordsets and so on, but have not been able to find something similar I can adapt.

I've attempted to write some logical instructions that should solve the problem. I don't know if this is of any help!

Expand|Select|Wrap|Line Numbers
  1. dim x as integer
  2. dim VisitPageNo, VisitNo, VisitPageNo, TimeOnPage as recordset
  3.  
  4. x = 1
  5. VisitorPageNo(x) = 1
  6. VisitNo(x) = 1
  7. VisitPageNo(x) = 1
  8.  
  9. While not EOF
  10. x = x+1
  11.   If ID(x) = ID(x-1) then
  12.     If HitDate&HitTime(x) < HitDate&HitTime(x-1) + 1 hour then
  13.       TimeOnPage(x-1) = HitDate&HitTime(x) - HitDate&HitTime(x-1)
  14.       VisitorPageNo(x) = VisitorPageNo(x-1) + 1
  15.       VisitNo(x) = VisitNo(x-1)
  16.       VisitPageNo(x) = VisitPageNo(x-1) + 1
  17.     Else
  18.       VisitorPageNo(x) = VisitorPageNo(x-1) + 1
  19.       VisitNo(x) = VisitNo(x-1) + 1
  20.       VisitPageNo(x) = 1
  21.     End
  22.   Else
  23.     VisitorPageNo(x) = 1
  24.     VisitNo(x) = 1
  25.     VisitPageNo(x) = 1
  26.   End if
  27. Wend
Assistance in solving this problem would be greatly appreciated. Ideally I'd like a new query which adds the additional fields to the existing ones.

Thanks
Feb 21 '07 #1
Share this Question
Share on Google+
5 Replies


ADezii
Expert 5K+
P: 8,638
I am using Access 2003 on Windows XP. I am looking to analyse web traffic files using this - I appreciate there are bespoke applications which will do the same task. Each month there are roughly 500,000 records. What I want to do is compare successive records, to calculate new fields at a visitor and visit level. Such as page number this visit, page number all visits by this visitor, time spent on current page, etc.

My table (shown below) only has 3 columns, although I've manually added the other 3 for illustration purposes in case I wasn't making myself clear.

ID Hit Date HitTime VisitorPageNo Visit No VisitPageNo
10004-055052 03/02/2007 17:50:52 1 1 1
10011-172249 03/02/2007 17:22:48 1 1 1
10011-172249 03/02/2007 17:23:02 2 1 2
10011-172249 03/02/2007 19:23:28 3 2 1
10011-172249 03/02/2007 19:24:21 4 2 2
10019-093023 03/02/2007 09:30:23 1 1 1
10023-215744 03/02/2007 21:57:44 1 1 1
10023-215744 03/02/2007 21:57:49 2 1 2
10023-215744 03/02/2007 21:58:30 3 1 3
10033-202533 03/02/2007 19:25:32 1 1 1
10033-202533 03/02/2007 20:25:45 2 2 1
10033-202533 03/02/2007 20:26:00 3 2 2
10033-202533 03/02/2007 21:26:09 4 3 1
10033-202533 03/02/2007 22:31:40 5 4 1

I have written a series of queries (8 at the last count) which calculate most of the fields I require. I'm sure I could include a couple more to get the job completed. It runs reasonably quickly, but it just seems a long winded and inelegant way of solving the problem.

I'm sure the same result could be achieved in a simpler and more efficient way using VB. I have experience of Basic and other programming languages, but I'm a novice at VB. I've searched on recordsets and so on, but have not been able to find something similar I can adapt.

I've attempted to write some logical instructions that should solve the problem. I don't know if this is of any help!

dim x as integer
dim VisitPageNo, Visit No, VisitPageNo, Tomoenage as recordset

x = 1
VisitorPageNo(x) = 1
Visit(x) = 1
VisitPageNo(x) = 1

While not EOF
x = x+1
If ID(x) = ID(x-1) then
If Hit Date&HitTime(x) < HitDate&HitTime(x-1) + 1 hour then
Tomoenage(x-1) = Hit Date&HitTime(x) - HitDate&HitTime(x-1)
VisitorPageNo(x) = VisitorPageNo(x-1) + 1
Visit(x) = Visit(x-1)
VisitPageNo(x) = VisitPageNo(x-1) + 1
Else
VisitorPageNo(x) = VisitorPageNo(x-1) + 1
Visit(x) = Visit(x-1) + 1
VisitPageNo(x) = 1
End
Else
VisitorPageNo(x) = 1
Visit(x) = 1
VisitPageNo(x) = 1
End if
Wend

Assistance in solving this problem would be greatly appreciated. Ideally I'd like a new query which adds the additional fields to the existing ones.

Thanks
I was kind of pressed for time, but I think I solved 1/3 of your problem (VisitorPageNo) - the rest you probably can deduce from the code.
1. Create a Query, for now Query1.
2. Add [ID], [HitDate], and [HitTime] to the Query Grid.
3. The [ID] Field must be set to Ascending or Descending Sort (critical).
4. Create a Calculated Field called Visitor Page No.
5. This Calculated Field will call the fCalculateVisitorPageNo([ID]) Function and pass it the value of the [ID] Field as a Parameter. The Calculated Field will look like this:
Expand|Select|Wrap|Line Numbers
  1. Visitor Page No: fCalculateVisitorPageNo([ID])
The actual Function code is listed below:
Expand|Select|Wrap|Line Numbers
  1. Public Function fCalculateVisitorPageNo(WebID As String) As Integer
  2. Dim intTotalNoOfPagesVisited As Integer
  3. Static intPageNoVisited As Integer
  4.  
  5. intTotalNoOfPagesVisited = DCount("*", "Table1", "[ID] = '" & WebID & "'")
  6.  
  7. If intTotalNoOfPagesVisited = 1 Then
  8.   'Reset the Page#
  9.   intPageNoVisited = 0
  10.   fCalculateVisitorPageNo = 1   'Only 1 Page
  11. Else    '> 1 Page visited
  12.   intPageNoVisited = intPageNoVisited + 1
  13.   If intPageNoVisited >= intTotalNoOfPagesVisited Then   'Last Page visited
  14.     fCalculateVisitorPageNo = intTotalNoOfPagesVisited
  15.     intPageNoVisited = 0
  16.   Else
  17.     fCalculateVisitorPageNo = intPageNoVisited
  18.   End If
  19. End If
  20. End Function
Typical Output would be:
Expand|Select|Wrap|Line Numbers
  1. ID                  Visitor Page No
  2. 10004-055052          1
  3. 10011-172249          1
  4. 10011-172249          2
  5. 10011-172249          3
  6. 10011-172249          4
  7. 10019-093023          1
  8. 10023-215744          1
  9. 10023-215744          2
  10. 10023-215744          3
  11. 10033-202533          1
  12. 10033-202533          2
  13. 10033-202533          3
  14. 10033-202533          4
  15. 10033-202533          5
  16. 10033-202539          1
  17. 10033-202539          2
  18. 10033-202539          3
The SQL of it would be:
Expand|Select|Wrap|Line Numbers
  1. SELECT Table1.ID, fCalculateVisitorPageNo([ID]) AS [Visitor Page No]
  2. FROM Table1
  3. ORDER BY Table1.ID DESC;
It probably can be done from a purely SQL point of view but I am not as proficient as some of the other Moderators/Experts in SQL. Maybe they'll provide a neater solution.
Feb 22 '07 #2

MMcCarthy
Expert Mod 10K+
P: 14,534
Try this ...

Create a new table (tblStats) to hold all fields:

Expand|Select|Wrap|Line Numbers
  1. Private Function updatetblStats()
  2. Dim VisitorPageNo, PageNo, VisitPageNo as Integers
  3. Dim tmpID As Long ' assumed datatype of the ID field
  4. Dim tmpHitDate As Date ' assumed datatype of the HitDate field
  5. Dim tmpHitTime As Date ' assumed datatype of the HitTime field
  6. Dim tmpHitHour As Integer
  7. Dim db as Database
  8. Dim rs1 as Recordset
  9. Dim rs2 as Recordset
  10.  
  11.   Set db = CurrentDB
  12.   Set rs1 = db.OpenRecordset("SELECT ID, HitDate, HitTime, " & _
  13.                     "Hour([HitTime]) As HitHour " & _
  14.                     "FROM TableName " & _
  15.                     "ORDER BY ID, HitDate, HitTime")
  16.   Set rs1 = db.OpenRecordset("tblStats")
  17.  
  18.   VisitorPageNo = 1
  19.   PageNo = 1
  20.   VisitPageNo = 1
  21.   tmpID = rs1!ID
  22.   tmpHitDate = rs1!HitDate
  23.   tmpHitTime = rs1!HitTime
  24.   tmpHitHour = rs1!HitHour
  25.  
  26.   rs1.MoveFirst
  27.   Do Until rs1.EOF
  28.  
  29.     rs2.AddNew
  30.     rs2!ID = tmpID
  31.     rs2!HitDate = tmpHitDate
  32.     rs2!HitTime = tmpHitTime
  33.     rs2!VisitorPageNo = VisitorPageNo
  34.     rs2!PageNo = PageNo
  35.     rs2!VisitPageNo = VisitPageNo 
  36.     rs2.Update
  37.  
  38.     rs1.MoveNext
  39.  
  40.     If rs1!ID = tmpID Then
  41.       VisitorPageNo = VisitorPageNo + 1
  42.       If rs1!HitDate = tmpHitDate And If rs1!HitHour = tmpHitHour Then
  43.         PageNo = PageNo
  44.         VisitPageNo = VisitPageNo + 1
  45.       Else 
  46.         PageNo = PageNo +1
  47.         VisitPageNo = VisitPageNo
  48.         tmpHitDate = rs1!HitDate
  49.         tmpHitHour = rs1!HitHour
  50.       End If
  51.     Else
  52.       VisitorPageNo = 1
  53.       PageNo = 1
  54.       VisitPageNo = 1
  55.       tmpID = rs1!ID
  56.       tmpHitDate = rs1!HitDate
  57.       tmpHitTime = rs1!HitTime
  58.       tmpHitHour = rs1!HitHour
  59.     EndIf
  60.   Loop
  61.  
  62.   rs1.close
  63.   rs2.Close
  64.   Set rs1=Nothing
  65.   Set rs2=Nothing
  66.   Set db=Nothing
  67.  
  68. End Sub
  69.  
Mary
Feb 22 '07 #3

P: 3
Hi ADezii

Thanks very much for your reply. This was the first time I've posted a question on here and I was shocked that 2 people responded so quickly. As the 2nd reply answered all of my questions - and meant less work for me to do - I have used that one. I hope I didn't take up too much of your time.

Thanks again
Feb 22 '07 #4

P: 3
Hi Mary

Thank you very much for taking the time to answer my question. Your program works wonderfully. You have saved me many hours of fumbling my way around VB.

Thanks again

Jason
Feb 22 '07 #5

MMcCarthy
Expert Mod 10K+
P: 14,534
You're welcome Jason. I don't normally post code to this extent but it was a puzzle of logic and it was easier to write it than explain it. Do spend some time trying to follow what I did though for your own sake and I'll be happy to answer any questions you have.

Thank you as well for reponding to all experts. Some posters forget to thank the experts other than the one who posts the accepted answer.

Glad it's working for you.

Mary
Feb 22 '07 #6

Post your reply

Sign in to post your reply or Sign up for a free account.