473,494 Members | 2,027 Online
Bytes | Software Development & Data Engineering Community
Create Post

Home Posts Topics Members FAQ

Comparing successive records in large access file - VB help

3 New Member
I am using Access 2003 on Windows XP. I am looking to analyse web traffic files using this - I appreciate there are bespoke applications which will do the same task. Each month there are roughly 500,000 records. What I want to do is compare successive records, to calculate new fields at a visitor and visit level. Such as page number this visit, page number all visits by this visitor, time spent on current page, etc.

My table (shown below) only has 3 columns, although I've manually added the other 3 for illustration purposes in case I wasn't making myself clear.

Expand|Select|Wrap|Line Numbers
  1. ID        HitDate      HitTime     VisitorPageNo VisitNo VisitPageNo
  2. 10004-055052   03/02/2007   17:50:52   1             1       1
  3. 10011-172249   03/02/2007   17:22:48   1             1       1
  4. 10011-172249   03/02/2007   17:23:02   2             1       2
  5. 10011-172249   03/02/2007   19:23:28   3             2       1
  6. 10011-172249   03/02/2007   19:24:21   4             2       2
  7. 10019-093023   03/02/2007   09:30:23   1             1       1
  8. 10023-215744   03/02/2007   21:57:44   1             1       1
  9. 10023-215744   03/02/2007   21:57:49   2             1       2
  10. 10023-215744   03/02/2007   21:58:30   3             1       3
  11. 10033-202533   03/02/2007   19:25:32   1             1       1
  12. 10033-202533   03/02/2007   20:25:45   2             2       1
  13. 10033-202533   03/02/2007   20:26:00   3             2       2
  14. 10033-202533   03/02/2007   21:26:09   4             3       1
  15. 10033-202533   03/02/2007   22:31:40   5             4       1
I have written a series of queries (8 at the last count) which calculate most of the fields I require. I'm sure I could include a couple more to get the job completed. It runs reasonably quickly, but it just seems a long winded and inelegant way of solving the problem.

I'm sure the same result could be achieved in a simpler and more efficient way using VB. I have experience of Basic and other programming languages, but I'm a novice at VB. I've searched on recordsets and so on, but have not been able to find something similar I can adapt.

I've attempted to write some logical instructions that should solve the problem. I don't know if this is of any help!

Expand|Select|Wrap|Line Numbers
  1. dim x as integer
  2. dim VisitPageNo, VisitNo, VisitPageNo, TimeOnPage as recordset
  3.  
  4. x = 1
  5. VisitorPageNo(x) = 1
  6. VisitNo(x) = 1
  7. VisitPageNo(x) = 1
  8.  
  9. While not EOF
  10. x = x+1
  11.   If ID(x) = ID(x-1) then
  12.     If HitDate&HitTime(x) < HitDate&HitTime(x-1) + 1 hour then
  13.       TimeOnPage(x-1) = HitDate&HitTime(x) - HitDate&HitTime(x-1)
  14.       VisitorPageNo(x) = VisitorPageNo(x-1) + 1
  15.       VisitNo(x) = VisitNo(x-1)
  16.       VisitPageNo(x) = VisitPageNo(x-1) + 1
  17.     Else
  18.       VisitorPageNo(x) = VisitorPageNo(x-1) + 1
  19.       VisitNo(x) = VisitNo(x-1) + 1
  20.       VisitPageNo(x) = 1
  21.     End
  22.   Else
  23.     VisitorPageNo(x) = 1
  24.     VisitNo(x) = 1
  25.     VisitPageNo(x) = 1
  26.   End if
  27. Wend
Assistance in solving this problem would be greatly appreciated. Ideally I'd like a new query which adds the additional fields to the existing ones.

Thanks
Feb 21 '07 #1
5 1811
ADezii
8,834 Recognized Expert Expert
I am using Access 2003 on Windows XP. I am looking to analyse web traffic files using this - I appreciate there are bespoke applications which will do the same task. Each month there are roughly 500,000 records. What I want to do is compare successive records, to calculate new fields at a visitor and visit level. Such as page number this visit, page number all visits by this visitor, time spent on current page, etc.

My table (shown below) only has 3 columns, although I've manually added the other 3 for illustration purposes in case I wasn't making myself clear.

ID Hit Date HitTime VisitorPageNo Visit No VisitPageNo
10004-055052 03/02/2007 17:50:52 1 1 1
10011-172249 03/02/2007 17:22:48 1 1 1
10011-172249 03/02/2007 17:23:02 2 1 2
10011-172249 03/02/2007 19:23:28 3 2 1
10011-172249 03/02/2007 19:24:21 4 2 2
10019-093023 03/02/2007 09:30:23 1 1 1
10023-215744 03/02/2007 21:57:44 1 1 1
10023-215744 03/02/2007 21:57:49 2 1 2
10023-215744 03/02/2007 21:58:30 3 1 3
10033-202533 03/02/2007 19:25:32 1 1 1
10033-202533 03/02/2007 20:25:45 2 2 1
10033-202533 03/02/2007 20:26:00 3 2 2
10033-202533 03/02/2007 21:26:09 4 3 1
10033-202533 03/02/2007 22:31:40 5 4 1

I have written a series of queries (8 at the last count) which calculate most of the fields I require. I'm sure I could include a couple more to get the job completed. It runs reasonably quickly, but it just seems a long winded and inelegant way of solving the problem.

I'm sure the same result could be achieved in a simpler and more efficient way using VB. I have experience of Basic and other programming languages, but I'm a novice at VB. I've searched on recordsets and so on, but have not been able to find something similar I can adapt.

I've attempted to write some logical instructions that should solve the problem. I don't know if this is of any help!

dim x as integer
dim VisitPageNo, Visit No, VisitPageNo, Tomoenage as recordset

x = 1
VisitorPageNo(x) = 1
Visit(x) = 1
VisitPageNo(x) = 1

While not EOF
x = x+1
If ID(x) = ID(x-1) then
If Hit Date&HitTime(x) < HitDate&HitTime(x-1) + 1 hour then
Tomoenage(x-1) = Hit Date&HitTime(x) - HitDate&HitTime(x-1)
VisitorPageNo(x) = VisitorPageNo(x-1) + 1
Visit(x) = Visit(x-1)
VisitPageNo(x) = VisitPageNo(x-1) + 1
Else
VisitorPageNo(x) = VisitorPageNo(x-1) + 1
Visit(x) = Visit(x-1) + 1
VisitPageNo(x) = 1
End
Else
VisitorPageNo(x) = 1
Visit(x) = 1
VisitPageNo(x) = 1
End if
Wend

Assistance in solving this problem would be greatly appreciated. Ideally I'd like a new query which adds the additional fields to the existing ones.

Thanks
I was kind of pressed for time, but I think I solved 1/3 of your problem (VisitorPageNo) - the rest you probably can deduce from the code.
1. Create a Query, for now Query1.
2. Add [ID], [HitDate], and [HitTime] to the Query Grid.
3. The [ID] Field must be set to Ascending or Descending Sort (critical).
4. Create a Calculated Field called Visitor Page No.
5. This Calculated Field will call the fCalculateVisitorPageNo([ID]) Function and pass it the value of the [ID] Field as a Parameter. The Calculated Field will look like this:
Expand|Select|Wrap|Line Numbers
  1. Visitor Page No: fCalculateVisitorPageNo([ID])
The actual Function code is listed below:
Expand|Select|Wrap|Line Numbers
  1. Public Function fCalculateVisitorPageNo(WebID As String) As Integer
  2. Dim intTotalNoOfPagesVisited As Integer
  3. Static intPageNoVisited As Integer
  4.  
  5. intTotalNoOfPagesVisited = DCount("*", "Table1", "[ID] = '" & WebID & "'")
  6.  
  7. If intTotalNoOfPagesVisited = 1 Then
  8.   'Reset the Page#
  9.   intPageNoVisited = 0
  10.   fCalculateVisitorPageNo = 1   'Only 1 Page
  11. Else    '> 1 Page visited
  12.   intPageNoVisited = intPageNoVisited + 1
  13.   If intPageNoVisited >= intTotalNoOfPagesVisited Then   'Last Page visited
  14.     fCalculateVisitorPageNo = intTotalNoOfPagesVisited
  15.     intPageNoVisited = 0
  16.   Else
  17.     fCalculateVisitorPageNo = intPageNoVisited
  18.   End If
  19. End If
  20. End Function
Typical Output would be:
Expand|Select|Wrap|Line Numbers
  1. ID                  Visitor Page No
  2. 10004-055052          1
  3. 10011-172249          1
  4. 10011-172249          2
  5. 10011-172249          3
  6. 10011-172249          4
  7. 10019-093023          1
  8. 10023-215744          1
  9. 10023-215744          2
  10. 10023-215744          3
  11. 10033-202533          1
  12. 10033-202533          2
  13. 10033-202533          3
  14. 10033-202533          4
  15. 10033-202533          5
  16. 10033-202539          1
  17. 10033-202539          2
  18. 10033-202539          3
The SQL of it would be:
Expand|Select|Wrap|Line Numbers
  1. SELECT Table1.ID, fCalculateVisitorPageNo([ID]) AS [Visitor Page No]
  2. FROM Table1
  3. ORDER BY Table1.ID DESC;
It probably can be done from a purely SQL point of view but I am not as proficient as some of the other Moderators/Experts in SQL. Maybe they'll provide a neater solution.
Feb 22 '07 #2
MMcCarthy
14,534 Recognized Expert Moderator MVP
Try this ...

Create a new table (tblStats) to hold all fields:

Expand|Select|Wrap|Line Numbers
  1. Private Function updatetblStats()
  2. Dim VisitorPageNo, PageNo, VisitPageNo as Integers
  3. Dim tmpID As Long ' assumed datatype of the ID field
  4. Dim tmpHitDate As Date ' assumed datatype of the HitDate field
  5. Dim tmpHitTime As Date ' assumed datatype of the HitTime field
  6. Dim tmpHitHour As Integer
  7. Dim db as Database
  8. Dim rs1 as Recordset
  9. Dim rs2 as Recordset
  10.  
  11.   Set db = CurrentDB
  12.   Set rs1 = db.OpenRecordset("SELECT ID, HitDate, HitTime, " & _
  13.                     "Hour([HitTime]) As HitHour " & _
  14.                     "FROM TableName " & _
  15.                     "ORDER BY ID, HitDate, HitTime")
  16.   Set rs1 = db.OpenRecordset("tblStats")
  17.  
  18.   VisitorPageNo = 1
  19.   PageNo = 1
  20.   VisitPageNo = 1
  21.   tmpID = rs1!ID
  22.   tmpHitDate = rs1!HitDate
  23.   tmpHitTime = rs1!HitTime
  24.   tmpHitHour = rs1!HitHour
  25.  
  26.   rs1.MoveFirst
  27.   Do Until rs1.EOF
  28.  
  29.     rs2.AddNew
  30.     rs2!ID = tmpID
  31.     rs2!HitDate = tmpHitDate
  32.     rs2!HitTime = tmpHitTime
  33.     rs2!VisitorPageNo = VisitorPageNo
  34.     rs2!PageNo = PageNo
  35.     rs2!VisitPageNo = VisitPageNo 
  36.     rs2.Update
  37.  
  38.     rs1.MoveNext
  39.  
  40.     If rs1!ID = tmpID Then
  41.       VisitorPageNo = VisitorPageNo + 1
  42.       If rs1!HitDate = tmpHitDate And If rs1!HitHour = tmpHitHour Then
  43.         PageNo = PageNo
  44.         VisitPageNo = VisitPageNo + 1
  45.       Else 
  46.         PageNo = PageNo +1
  47.         VisitPageNo = VisitPageNo
  48.         tmpHitDate = rs1!HitDate
  49.         tmpHitHour = rs1!HitHour
  50.       End If
  51.     Else
  52.       VisitorPageNo = 1
  53.       PageNo = 1
  54.       VisitPageNo = 1
  55.       tmpID = rs1!ID
  56.       tmpHitDate = rs1!HitDate
  57.       tmpHitTime = rs1!HitTime
  58.       tmpHitHour = rs1!HitHour
  59.     EndIf
  60.   Loop
  61.  
  62.   rs1.close
  63.   rs2.Close
  64.   Set rs1=Nothing
  65.   Set rs2=Nothing
  66.   Set db=Nothing
  67.  
  68. End Sub
  69.  
Mary
Feb 22 '07 #3
JasonP
3 New Member
Hi ADezii

Thanks very much for your reply. This was the first time I've posted a question on here and I was shocked that 2 people responded so quickly. As the 2nd reply answered all of my questions - and meant less work for me to do - I have used that one. I hope I didn't take up too much of your time.

Thanks again
Feb 22 '07 #4
JasonP
3 New Member
Hi Mary

Thank you very much for taking the time to answer my question. Your program works wonderfully. You have saved me many hours of fumbling my way around VB.

Thanks again

Jason
Feb 22 '07 #5
MMcCarthy
14,534 Recognized Expert Moderator MVP
You're welcome Jason. I don't normally post code to this extent but it was a puzzle of logic and it was easier to write it than explain it. Do spend some time trying to follow what I did though for your own sake and I'll be happy to answer any questions you have.

Thank you as well for reponding to all experts. Some posters forget to thank the experts other than the one who posts the accepted answer.

Glad it's working for you.

Mary
Feb 22 '07 #6

Sign in to post your reply or Sign up for a free account.

Similar topics

26
3432
by: Chris Lasher | last post by:
Hello, I have a rather large (100+ MB) FASTA file from which I need to access records in a random order. The FASTA format is a standard format for storing molecular biological sequences. Each...
5
2407
by: Curtis Gilchrist | last post by:
I am required to read in records from a file and store them in descending order by an customer number, which is a c-style string of length 5. I am storing these records in a linked list. My...
9
6481
by: Noesis Strategy | last post by:
My firm is creating a database that profiles (about 100) companies in various sectors. Each profile is about 10 pages long with perhaps 150 parameters. Some of the parameters will contain a half...
8
1860
by: Riegnman | last post by:
Hey guys, I'm in need of a little help. I am very new to access but have been trying to learn. My problem is as follows. . . We have time clocks that dump the badge punches into a .log file on...
3
2296
by: John Duchowski | last post by:
I've got a fairly large Access file with 4899 records. The ID field runs sequentially but misses two rows: it skips 3727 and 3740. Consequently, the dbase indicates that I have 4901 rows where in...
3
771
by: Michael Thomas | last post by:
Hi everyone Quick question about updating tables. I have a table containing a list of products and I've named the table Products. I got the table into Access 2002 by importing it from a CSV...
35
3178
by: deko | last post by:
Do I get more scalability if I split my database? The way I calculate things now, I'll be lucky to get 100,000 records in my Access 2003 mdb. Here some math: Max mdb/mde size = 2000 x 1024 =...
9
2858
by: MLH | last post by:
I have a database (datatrek.mdb) with a table named DATA. The table has a date/time field with default value = Now(). It has 100 records in it entered over a 50-minute period. I would like the...
2
1769
by: JonoB | last post by:
Need some tips to help me approach this problem. I would consider myself advanced in Access and VBA, so just looking for conceptual pointers. In essence, I have a number of text files, each of...
0
7157
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
7195
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
1
6873
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
4579
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
3088
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
3078
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
1400
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
1
644
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
285
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.